Koha OAI server has been done in one unique .pl file because there
wasn't any object model or rules in the Koha project when it has been
coded. This patch modularized existing classes, putting each class in a
separate file in Koha::OAI::Server namespace. UT begining.
Add new dependency: Capture::Tiny
Signed-off-by: Hector Castro <hector.hecaxmmx@gmail.com>
OAI server moduralized succefully. Works for Debian Jessie and
Wheezy. Test pass successfully
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
To test:
- activate OAI-PMH with the inclusion of items as explained on bug 12252
- set the OAI-PMH:MaxCount to a low number, 50 for instance
- go to the OAI-PMH page to get the records : [your koha
catalogue]/cgi-bin/koha/oai.pl?verb=ListRecords&metadataPrefix=marcxml
- check that item data is included
- get the resumptionToken at the end of the xml
- got to the next page of records [your koha
catalogue]/cgi-bin/koha/oai.pl?verb=ListRecords&resumptionToken=[your
resumption token]
- check that item data is now missing
Apply the patch, and repeat previous steps: item data is back.
Signed-off-by: Gaetan Boisson <gaetan.boisson@biblibre.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>
Previous patches attached to this bug have been refactored to merge bug
3206 and bug 13568 features. So OAI server must be carrefully tested to
ensure that there is no regression in this area: deleted records and
resumption token.
This last patch fixed the way items are returned. They are returned only
if OAI server operates in extended mode, and specifically for format
having the parameter include_item set to 1 (true). For example this
configuration file set via OAI-PMH:ConfFile syspref will return items:
Signed-off-by: Signed-off-by: Gaetan Boisson <gaetan.boisson@biblibre.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>
Same in spirit to the other patch, this also includes the item detail in
ListRecords.
Test plan:
* Fetch a URL like:
http://koha/cgi-bin/koha/oai.pl?verb=ListRecords&metadataPrefix=marcxml
* Verify that there are 952 entries in the returned records where
appropriate.
Signed-off-by: Frederic Demians <f.demians@tamil.fr>
ListRecords OAI verb returns a list of records including items in 952/995 which
are not hidden based on OpacHiddenItems syspref.
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Signed-off-by: Gaetan Boisson <gaetan.boisson@biblibre.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>
GetRecord for OAI-PMH was pulling the MARCXML directly from the
database. Now it uses GetMarcBiblio and includes the item data with it,
making it more generally useful.
Test plan:
* Run an OAI-PMH query, for example:
http://koha/cgi-bin/koha/oai.pl?verb=GetRecord&identifier=KOHA-OAI-TEST:52&metadataPrefix=marcxml
to fetch biblionumber 52
* Note that it doesn't include the 952 data
* Apply the patch
* Do the same thing, but this time see that the 952 data is at the
bottom of the MARCXML.
Note:
* This patch also includes a small tidy-up in C4::Biblios to group
things semantically a bit better, so I don't spend ages looking for a
function that was staring me in the face all along again.
Signed-off-by: David Cook <dcook@prosentient.com.au>
Works as described. Simple yet useful patch.
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Frederic Demians <f.demians@tamil.fr>
952/995 item fields are back in response to GetRecord OAI verb.
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Signed-off-by: Gaetan Boisson <gaetan.boisson@biblibre.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>
When getting records from OAI-PMH, an error must be returned if there is no results.
See : http://www.openarchives.org/OAI/openarchivesprotocol.html#ErrorConditions
Test plan :
- Enable OAI webservice
- Perform a query that will return no results. ie : /cgi-bin/koha/oai.pl?verb=ListRecords&metadataPrefix=marcxml&from=2099-12-30&until=2099-12-31
=> Without patch you get a response with :
<ListRecords/>
=> With patch you get a response with error code :
<error code="noRecordsMatch">No records match the given criteria</error>
- Check a good query returns still results
- Same test with ListIdentifiers verb
Signed-off-by: Mirko Tietgen <mirko@abunchofthings.net>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
- Fix QA.
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
New pref in right order, new option 'no' on syspref, other
fixes following comment #12
All seems to work
No errors
Signed-off-by: Jonathan Druart <jonathan.druart@koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
This patch allows Koha OAI repository to support deleted records.
The OAI-PMH:DeletedRecord syspref is introduced and can be set to:
- persistent (in case Koha's deletedbiblio table will never be emptied
or truncated)
- transient (in case Koha's deletedbiblio table might be emptied or
truncated at some point)
Test plan:
- After applying the patch, test that:
- Deleted records appear in ListRecords and ListIdentifiers requests.
- Filter parameters (from, until, set and resumptionToken) still work
and are applied to ListRecords and ListIdentifiers requests.
- Identify request shows if the repository is considered persistent
or transient, according to the OAI-PMH:DeletedRecord syspref.
- Deleted records that used to belong to a set are still displayed in
those sets and marked as deleted.
- GetRecord requests work on deleted records, which are marked as deleted.
Requests examples:
/cgi-bin/koha/oai.pl?verb=ListRecords&metadataPrefix=oai_dc
/cgi-bin/koha/oai.pl?verb=ListRecords&metadataPrefix=oai_dc&from=2015-02-20T11:08:33Z
/cgi-bin/koha/oai.pl?verb=ListRecords&metadataPrefix=oai_dc&set=new_specSet1
/cgi-bin/koha/oai.pl?verb=GetRecord&identifier=KOHA-OAI-TEST:2&metadataPrefix=oai_dc
/cgi-bin/koha/oai.pl?verb=Identify
Signed-off-by: Frederic Demians <f.demians@tamil.fr>
It works in all situations described in the test plan. Great addition.
Thanks.
Signed-off-by: Jonathan Druart <jonathan.druart@koha-community.org>
Signed-off-by: Jonathan Druart <jonathan.druart@koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
This patch removes the DATE() function from a query on timestamp, and adds a
sub that strips the UTC designators "T" and "Z" from incoming "from" and
"until" arguments in OAI-PMH requests so that they're more compliant with
MySQL (and probably other databases as well). This means that the date
and time for the 'from' and 'until' arguments will be matched correctly
in the database.
This patch also adds 'T00:00:00Z' to 'from' arguments and 'T23:59:59Z' to
until arguments, when only dates are provided via the OAI parameters.
The zero time isn't necessary, since MySQL treats '2013-09-30' as
'2013-09-30 00:00:00' by default. However, the near midnight time
is needed for 'until'. Otherwise, you'll never be able to retrieve
a record with a date/time matching the 'until' argument.
In summary, this patch adds handling for times as well as dates, which
is necessary so that Koha is closer to meeting the actual OAI-PMH spec.
TEST PLAN:
0) Note down a selection of timestamps from your biblio table
1) Enable your OAI-PMH server through the global system preferences
Web services tab.
2) Craft and submit a similar request to the following in your browser:
KOHAINSTANCE/cgi-bin/koha/oai.pl?verb=ListRecords&metadataPrefix=oai_dc&
from=2013-09-02T13:44:33Z&until=2013-09-05T13:44:33Z
Change the exact dates to accord with your timestamps, but keep the
YYYY-MM-DDTHH:MM:SSZ format.
3) Note the unexpected behaviour. A "from" argument with the timestamp
2013-09-02T13:44:33Z will show records from 2013-09-03 but not records
from 2013-09-02 even though the timestamp in the database will say
"2013-09-02 13:44:33".
Also note that records with a timestamp later than 13:44:33 will show
up for the day 2013-09-05, even though they shouldn't.
4) APPLY THE PATCH
5) Resubmit the links you tried above
6) Note that the applicable records now appear (or do not appear) in
accordance with the precise date/time ranges!
--
Developer Note: We could've not stripped the UTC designators and used
DATE() around the parameters in the SQL queries, but that would have
lost the whole purpose of using times in the "from" arguments, since
they would've been generalized to just the dates.
I think this is probably the best solution. Admittedly, creating
"form_arg" and "until_arg" hashrefs in the ResumptionToken object
might not be ideal, but I preferred that to copying the
_strip_UTC_designator subroutine into two other objects. Perhaps this
sub could go somewhere else and be imported into those other two objects
but this seemed to be the most sensible decision. I'm open to other
opinions though.
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Works, find results with correct timestamp
No koha-qa errors
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
Signed-off-by: Chris Nighswonger <cnighswonger@foundations.edu>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
Signed-off-by: Katrin Fischer <katrin.fischer@bsz-bw.de>
http://bugs.koha-community.org/show_bug.cgi?id=9987
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
When responding to ListRecords and ListIdentifiers verbs, OAI server doesn't
return proper resumption token. At the end of a result set, OAI server
generates a resumption token even if there isn't anymore records. Consequently,
OAI harverster will send a new request, based on this invalid resumption,
token. OAI Server responds with an empty resultset, which is considered as an
invalid response by most of the harvesters.
TO TEST:
- Find in your DB, a day where a few biblio records have been created. The
number of created biblios must inferior to OAI-PMH:MaxCount.
- Let say this day is 2014-01-09. Send an OAI-PMH request to Koha OAI Server:
/cgi-bin/koha/oai.pl?verb=ListRecords&metadataPrefix=marcxml&from=2014-01-09&until=2014-01-09
- At the end of the result, you will see a resumption token which looks like that:
<resumptionToken cursor="47">marcxml/47/2014-01-09/2014-01-09/</resumptionToken>
This is wrong. No resumptiion token should be sent since there isn't anymore
records to harvest.
- Apply the patch.
- Resend the OAI-PMH request. There is no resumption token at the end of the
result.
- You could test also with ListIdenfiers verb in place of ListRecord.
Signed-off-by: Christophe Brocquet <christophe.brocquet@obspm.fr>
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
This patch changes the value separator in OAI-PMH resumption tokens
from colons to slashes, so that the token string isn't split incorrectly
when a time is included.
TEST PLAN:
1) Turn on the OAI-PMH server syspref in Koha
2) Send a ListRecords request using 'from' and 'until' arguments that
include times (Best to use very far apart times so that you retrieve
more than 50 records which will likely be the trigger for a resumptionToken).
Here is an example:
http://KOHAINSTANCE/cgi-bin/koha/oai.pl?verb=ListRecords&
metadataPrefix=oai_dc&from=2012-09-05T13:44:33Z&until=2014-09-05T13:44:33Z
N.B. Replace KOHAINSTANCE with the URL of your Koha instance.
3) Scroll down to the bottom of the page until you find the resumptionToken.
It will look similar to this:
<resumptionToken cursor="50">
oai_dc:50:2012-09-05T13:44:33Z:2014-09-05T13:44:33Z:
</resumptionToken>
4) Copy that resumption token and send a request with it like so:
http://KOHAINSTANCE/cgi-bin/koha/oai.pl?verb=ListRecords&
resumptionToken=oai_dc:50:2012-09-05T13:44:33Z:2014-09-05T13:44:33Z:
5) The page should (incorrectly) show no records.
6) APPLY PATCH
7) Repeat steps 2, 3, and 4
8) Note that the resumptionToken now uses slashes (e.g. /) instead of
colons.
Note also that now the second request will show records!!!
N.B. This will only happen if Koha has enough records to serve to you.
If your Koha has less than 50 records, try lowering the number provided
in the "OAI-PMH:MaxCount" system preference.
Signed-off-by: Petter Goksoyr Asen <boutrosboutrosboutros@gmail.com>
I understand; I can now confirm the behaviour described in the test plan.
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Passes test plan, all tests and QA script.
Resumption Token works correctly after applying the patch.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
New sql tables:
- oai_sets: contains the list of sets, described by a spec and a name
- oai_sets_descriptions: contains a list of descriptions for each set
- oai_sets_mappings: conditions on marc fields to match for biblio to be
in a set
- oai_sets_biblios: list of biblionumbers for each set
New admin page: allow to configure sets:
- Creation, deletion, modification of spec, name and descriptions
- Define mappings which will be used for building oai sets
Implements OAI Sets in opac/oai.pl:
- ListSets, ListIdentifiers, ListRecords, GetRecord
New script misc/migration_tools/build_oai_sets.pl:
- Retrieve marcxml from all biblios and test if they belong to defined
sets. The oai_sets_biblios table is then updated accordingly
New system preference OAI-PMH:AutoUpdateSets. If on, update sets
automatically when a biblio is created or updated.
Use OPACBaseURL in oai_dc xslt
use encoding(UTF-8) rather than utf-8 for stricter
encoding
Marking output as ':utf8' only flags the data as utf8
using :encoding(UTF-8) also checks it as valid utf-8
see binmode in perlfunc for more details
In accordance with the robustness principle input
filehandles have not been changed as code may make
the undocumented assumption that invalid utf-8 is present
in the imput
Fixes errors reported by t/00-testcritic.t
Where feasable some filehandles have been made lexical rather than
reusing global filehandle vars
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Based on patch by Tomás Cohen Arazi <tomascohen@gmail.com>,
revised to work regardless of the installation mode.
Signed-off-by: Galen Charlton <gmcharlt@gmail.com>
Signed-off-by: Magnus Enger <magnus@enger.priv.no>
Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>
- Add preference OAI-PMH:ConfFile. I just add it in web-services.pref
and not in DB. It's enough. It's not an end-user preference. Without
this pref, OAI server operates as previously. And preferences editor
allow to add a new value to the DB if necessary.
- Fix response to ListMetadataFormats which was empty in extended mode.
Signed-off-by: Galen Charlton <gmcharlt@gmail.com>
Currently Koha OAI server returns records in two formats: marcxml and
oai_dc (Dublin Core). This patch adds a new mode of operation where as
many as necessary metadata formats can be added via XSLT.
Documentation: See the end of oai.pl file to have an explanation of
how it works.
Signed-off-by: Galen Charlton <gmcharlt@gmail.com>
The optional description element of an Identify response
can't just be a string. Identify.description is a container
for one or more elements; see http://www.openarchives.org/OAI/2.0/guidelines.htm
For now, simply commenting it out.
Signed-off-by: Galen Charlton <galen.charlton@liblime.com>
When retrieving a record via the OAI-PMH interface, if one of the fields used
to prepare the DC metadata is not defined in the MARC framework (e.g.,
biblioitems.publicationyear in the default MARC21 framework), an OAI GetRecord
can fail with the following error:
> Can't call method "as_string" on an undefined value at
> /usr/share/koha/opac/cgi-bin/opac/oai.pl line 59.
Signed-off-by: Galen Charlton <galen.charlton@liblime.com>