Only in MARC21 is possible to use ind2 of tag 245 to skip articles.
This patch is based on inserting a special template in
koha-indexdefs-to-zebra.xsl With this patch you must not insert index
Title:s in biblio-koha-indexdefs.xml, it is defined in
koha-indexdefs-to-zebra.xsl. It is not the best setup, but I find very
difficult to use biblio-koha-indexdefs.xml.
To test it in a english MARC21 setup:
Insert same records with titles and correct values in ind2 of 245.
If you have articles not in the skiping list of sort-string-utf.chr (The|the|a|A|an|An)
you can see that the sort by articles use also articles.
Insert the patch
Rebuilt indexes from scratch
Now all articles of titles are skipped
TO TEST WITHOUT INDEXING:
1. Go to etc/zebradb/marc_defs/marc21/biblios directory.
2. Put the sample MARCXML file in this directory.
3. Transform the file into Zebra indexes:
xsltproc biblio-zebra-indexdefs.xsl record.xml
Observe that the Title:s index contains:
01 Business and Technologies
4. Apply the patch.
5. Repeat:
xsltproc biblio-zebra-indexdefs.xsl record.xml
Observe that the Title:s index contains:
Business and Technologies
Signed-off-by: Frederic Demians <f.demians@tamil.fr>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Verified working using yaz-client (as in
http://wiki.koha-community.org/wiki/Understanding_Zebra_indexing#Examine_Zebra_index,
though note that the `elem zebra::index` seems to be unneeded).
Signed-off-by: Brendan A Gallagher <brendan@bywatersolutions.com>
Test plan:
Without this patch, doing a zebra reindex on a fedora-based install will cause errors like this:
15:10:47-01/05 zebraidx(16108) [warn] No such record type: dom./etc/koha/zebradb/biblios/etc/dom-config.xml
With this patch, reindexing should just work.
Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>
I have tested this doesn't break on debian/ubuntu systems, someone with a non
debian system will need to test it on that
Signed-off-by: Bob Ewart bob-ewart@bobsown.com
It works on openSUSE Leap 42.1
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Just noting that the debian zebra files already contain much more paths here.
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
This patch adds the Zebra special attribute 14 to ccl.properties and
opac-search.pl, so that we can turn on OpacSuppression and still return
results even if there are no records in Zebra for the Suppress index.
_TEST PLAN_
Before applying:
1) Make sure that you have no suppressed records indexed in Zebra
2) Turn on OpacSuppression system preference
3) Search using a keyword which should bring up records
4) Note that no records are returned in the results
5) Change UseQueryParser system preference to "Try"
6) Repeat steps 3-4
Apply the patch.
After applying:
7) Repeat step 3 (ie search using a keyword which should bring up records)
8) Confirm that records are appearing in the results!
9) Change UseQueryParser system preference to "Do not try"
10) Repeat step 3
11) Confirm that records are appearing in the results!
Signed-off-by: Hector Castro <hector.hecaxmmx@gmail.com>
Works as advertised. No more, won't need to have at least one record with the
value "1" in the field mapped with this index. All records in OPAC returned.
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
If an item is not checked out when a checkin via SIP2 is attempted,
Koha's SIP server sends back an "ok" of 0, and the AF message "Item
not checked out". I am not entirely sure this is good and correct
behavior by the SIP2 protocol.
In particular, this will cause SIP2 book sorting devices to fail on
all items that are not checked out, causing them all to be put into
the "special handling" been that should be reserved for things like
items checked in at the wrong library and items on hold.
Test Plan:
1) Apply the patch for bug 13159 so you can use the new enhanced
SIP2 command line emulator
2) Use a command similar to the following to check in an item:
sip_cli_emulator.pl -a localhost -su <sip user> -sp <sip password> -l <instituation id> --item <barcode> -m checkin
3) Note the 3rd character is 0, and there is an AF field saying the item is not checked out
4) Apply this patch
5) Restart the SIP server
6) Repeat steps 2-3, note that nothing has changed
7) In the SIP config file, Add the parameter checked_in_ok="1" to the SIP account you are using.
8) Restart the SIP server
9) Repeat steps 2-3, note that this time the 3rd character is 1, and you do not recieve the item not checked out message.
Signed-off-by: Benjamin Rokseth <benjamin.rokseth@kul.oslo.kommune.no>
Signed-off-by: Brendan A Gallagher <brendan@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Single quotes in common language (not in programming) are usually ', but
there is also the form known as ’ in HTML. See
https://fr.wikipedia.org/wiki/Apostrophe_%28typographie%29
This bug proposes to transliterate all forms into a space.
Test plan :
(I'll use the code ’ instead of the unicode character)
- Without the patch
- Create a record with title : L’avion d’argile
- Index this record
- Search for "L’avion d’argile" => You find the record
- Search for "L'avion d'argile" => You do not find the record
- Apply patch
- Search for "L’avion d’argile" => You find the record
- Search for "L'avion d'argile" => You find the record
- Search for "L avion d argile" => You find the record
Signed-off-by: Frederic Demians <f.demians@tamil.fr>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
This avoid getting a debug page when URL is wrong.
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Actual routes are:
/borrowers
Return a list of all borrowers in Koha
/borrowers/{borrowernumber}
Return the borrower identified by {borrowernumber}
(eg. /borrowers/1)
There is a test file you can run with:
$ prove t/db_dependent/rest/borrowers.t
All API stuff is in /api/v1 (except Perl modules)
So we have:
/api/v1/script.cgi CGI script
/api/v1/swagger.json Swagger specification
Change both OPAC and Intranet VirtualHosts to access the API,
so we have:
http://OPAC/api/v1/swagger.json Swagger specification
http://OPAC/api/v1/{path} API endpoint
http://INTRANET/api/v1/swagger.json Swagger specification
http://INTRANET/api/v1/{path} API endpoint
Add a (disabled) virtual host in Apache configuration api.HOSTNAME,
so we have:
http://api.HOSTNAME/api/v1/swagger.json Swagger specification
http://api.HOSTNAME/api/v1/{path} API endpoint
Add 'unblessed' subroutines to both Koha::Objects and Koha::Object to be
able to pass it to Mojolicious
Test plan:
1/ Install Perl modules Mojolicious and Swagger2
2/ perl Makefile.PL
3/ make && make install
4/ Change etc/koha-httpd.conf and copy it to the right place if needed
5/ Reload Apache
6/ Check that http://(OPAC|INTRANET)/api/v1/borrowers and
http://(OPAC|INTRANET)/api/v1/borrowers/{borrowernumber} works
Optionally, you could verify that http://(OPAC|INTRANET)/vX/borrowers
(where X is an integer greater than 1) returns a 404 error
Signed-off-by: Alex Arnaud <alex.arnaud@biblibre.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
This patch add zebra indexes to RDA 264 field.
The new Provider index is added too.
QA comments corrected.
To test:
1) Download RDA records with 264 fields from this attachment <http://bugs.koha-community.org/bugzilla3/attachment.cgi?id=36825>. Import the file and re-index/rebuild zebra. These records contain 260 and 264 fields per record.
2) Do a search with pb:Bethany two records will appear with title The guardian. Search with pl:Minneapolis too, the two records will appear.
3) Select one record of both records and delete the 260 field keeping the 264 field and save, rebuild your zebra.
4) Search again with pb:Bethany and just one record will appear. Thats mean 264 is not indexed.
5) Apply patches.
6) Rebuild your zebra but this time all biblio records.
7) Search again with pv:Bethany or Provider:Bethany, this time will appear the two records, 264 is indexed. Note that if you search again with pb only one record appear. This is because the suggestion of LOC.
10) Search with copydate:2013 only launch records with 260 fields and pv:2013 show both fields, i.e., 260 and 264.
11) Apply QA Test Tools
Sponsored-by: Universidad de El Salvador
Signed-off-by: Nick Clemens <nick@quecheelibrary.org>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Currently, Norwegian vowels are not sorted correctly, even when
you have chosen "nb" as the ZEBRA_LANGUAGE during installation.
To test:
- Make sure you have three records with titles that begin with ÆØÅ
respectively
- Do a search that turns up those three records and some others, and
sort the results by title, bot ascending and descending.
- Verify that ÆØÅ is shown in some weird order.
- Edit your sort-string-utf.chr* so it is in line with the current
patch. It should include these two lines:
lowercase {0-9}{a-z}æøå
uppercase {0-9}{A-Z}ÆØÅ
- Restart Zebra and reindex all the records, e.g.:
$ sudo koha-restart-zebra <instancename>
$ sudo koha-rebuild-zebra -f -v <instancename>
- Do the search again, make sure you order by title and check that
ÆØÅ are sorted in the order of 1. Æ 2. Ø 3. Å, and after all other
characters
* = If you are on a gitified install, you need to edit this file:
/etc/koha/zebradb/lang_defs/<your ZEBRA_LANGUAGE>/sort-string-utf.chr
NOT the file in your git clone (yeah, i wasted some time there...)
Signed-off-by: Mirko Tietgen <mirko@abunchofthings.net>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
In DOM config file :
etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml, the 608$9 is
defined a second time instead of 610$9. Just a type I think.
Test plan :
- Apply patch
- Install a UNIMARC + DOM instance
- Define in a framework 610 using a thesaurus
- Create a new biblio
- Create a new authority (same type as the thesaurus defined above)
- Index : rebuild_zebra.pl -a -b -x -z
- Link the field 610 to the new authority
- Index : rebuild_zebra.pl -a -b -x -z
- In authorities search, search for the new authority
=> You see Use in 1 Records(s)
Signed-off-by: Frederic Demians <f.demians@tamil.fr>
I confirm the typo.
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
_Test plan_
Prerequisites: Make sure that you have an item with a valid dateaccessioned,
and that the bib is indexed in zebra.
For the purposes of explanation, I'm going to use the date '2011-09-07'
1) Go to advanced search in the staff client and choose 'Acquisition date (yyyy-mm-dd)'
2) Enter 2011-09-07 (or the date of your choice).
3) Click the search button - you should get your item in the search results.
4) Return to the advanced search screen and select Acquisition date again.
5) Enter a start and end date in the text field separated by ' - '.
For example:
2011-09-01 - 2011-09-30
6) Click the search button -- this will return no results.
7) Apply the patch and copy etc/zebradb/ccl.properties to whatever directory is specified
by the koha-conf.xml referenced by $KOHA_CONF.
8) Try the search again -- this will return the expected results
Signed-off-by: Barton Chittenden <barton@bywatersolutions.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
1) Import MARC21 bibs containing
- ISBN in 020$z
- ISSN in 022$y
- ISSN in 022$z
2) Make sure that bibs are indexed
3) Search by ISBN and ISSN above -- bibs should not show in search.
4) Apply patch, re-index
5) Search again; ISBN in 020$z and ISSN in 022$y and 022$z should return
results.
Signed-off-by: kholten@switchinc.org
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
This patch introduces an extension to the current syntax for DOM index definition.
Specifically, it extends the 'index_subfields' tag to allow adding a 'condition'
attribute that is used as a condition ofr applying the specified index.
This (exotic) example is self-explanatory:
The previous syntax (which is keeped by this patch) took this snippet from biblio-koha-indexdefs.xml
<index_subfields tag="100" subfields="acbd">
<target_index>Encuadernador:w</target_index>
</index_subfields>
and generated an XSLT snippet in the DOM indexing XSLT that looks like this:
<xslo:for-each select="marc:subfield">
<xslo:if test="contains('acbd', @code)">
<z:index name="Encuadernador:w">
<xslo:value-of select="."/>
</z:index>
</xslo:if>
</xslo:for-each>
This patch introduces this syntax change (note the 'condition' attribute:
<index_subfields tag="100" subfields="acbd" condition="@ind2='7'">
<target_index>Encuadernador:w</target_index>
</index_subfields>
which yields to this XSLT snippet in the DOM indexing XSLT:
<xslo:if test="@ind2='7'">
<xslo:for-each select="marc:subfield">
<xslo:if test="contains('acbd', @code)">
<z:index name="Encuadernador:w">
<xslo:value-of select="."/>
</z:index>
</xslo:if>
</xslo:for-each>
</xslo:if>
To test:
- Verify that the shipped XSLT files are current regarding the shipped index definitions:
$ for i in marc21 normarc unimarc; do
xsltproc etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl \
etc/zebradb/marc_defs/$i/biblios/biblio-koha-indexdefs.xml \
> etc/zebradb/marc_defs/$i/biblios/biblio-zebra-indexdefs.xsl
done
$ git status
(repeat for authorities, skip normarc which doesn't have authorities)
- Apply the patch
- Re-run the previous commands
=> SUCCESS: no changes
- Add a condition to an index_subfields tag (for example, condition="@ind2='7'" in the Author's index
- Regenerate the specific XSLT
=> SUCCESS: doing a diff shows the only change is the code has been wrapped inside an xslo:if using the condition for the test
- Apply the generated xsl to a MARCXML record that has a field matching the condition like this:
$ xsltproc .../biblio-zebra-indexdefs.xsl sample_record.xml
=> SUCCESS: There's an index on the result, containing the configured field/subfields, that matches the criteria.
- Sign off and feel really happy :-D
Note: the attached sample record includes a 100 field, with ind2=7 and $a=Tomasito
Edit: This patch was squashed once I figured it got too complex and Jonathan required a followup
to avoid code duplication.
This avoids code duplication, with the same results.
Sponsored-by: Orex Digital
Signed-off-by: Barton Chittenden <barton@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
This patch changes the "itemnumber" alias so that it acts like
"itemnumber,st-numeric". That is, it always does a numeric search.
_TEST PLAN_
The best way to test this patch is to apply the patch and then run
"make upgrade", I suspect. As this will refresh your "ccl.properties".
However, this patch is actually really small, so you can just apply
it manually to an existing "ccl.properties" if you rather save time.
Basically, you just need to do the following steps:
0) Do a search for "itemnumber:<insert real indexed itemnumber here>"
1) Note that you can't retrieve any results
2) Change your ccl.properties to say "itemnumber 1=8010 4=109"
3) Repeat the search for "itemnumber:<X>"
4) Note that you now retrieve your result
Signed-off-by: Magnus Enger <magnus@libriotech.no>
Tested on a gitified package install. Made the change to
/etc/koha/zebradb/ccl.properties manually. After this change
I can successfully search for "itemnumber:1".
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>
* Renames uploadPath to upload_path to follow the standard naming
conventions in koha-conf which use underscores rather than camel case
* Remove reference to intranet-tmpl and replace with [% interface %]
required to pass qa
Signed-off-by: Mark Tompsett <mtompset@hotmail.com>
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Mark Tompsett <mtompset@hotmail.com>
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>
Koha needs a better logger, and it seems like the best solution would be
to take advantage of Log4perl which is already a fully featured logger.
We use Log4perl to selectively decide what statements should be logged,
and where they should go!
Test plan:
0) Install Log::Log4perl via packages or cpan
1) Apply this patch and the example renewal patch
2) Copy etc/log4perl.conf to your koha conf directory, edit the paths
to match your current error logs
3) Edit your koha-conf file and add the
<log4perl_conf>/path/to/log4perl.conf</log4perl_conf> line
4) Watch your intranet and opac error logs
5) Perform a renewal via the staff interface, note there is nothing new
in the log file
7) Update the log4perl.conf, change the log level from WARN to TRACE
for both the staff and opac sides
8) Perform a renewal via the staff interface, note the logged lines
9) Perform a renewal via the opac, note the logged lines
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Amended this patch: Moved the renewal stuff to a separate example patch.
And upgraded the DEBUG level to WARN in the log4perl config file.
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>
Bug 11202 introduced a new index 'dissertation-information' for
UNIMARC. This patch adds the index also for MARC21 installations.
http://www.loc.gov/marc/bibliographic/bd502.html
To test:
- Apply patch
- Copy files in etc/zebradb changed by this patch to your
corresponding directory (koha-dev..)
- Make sure you have records with 502
- Reindex
- Verify you can search the field contents with
dissertation-information= and
diss=
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Can find by dissertation-information,
No errors
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>
Make the shipped XSLTs for authorities (MARC21 and UNIMARC) the same as the generated version
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
In authority-koha-indexdefs.xml, all tags use the namespace "kohaidx" except the tag "id".
When re-generating authority-zebra-indexdefs.xsl, the line :
<xslo:variable name="idfield" select="normalize-space(marc:controlfield[@tag='001'])"/>
is modified :
<xslo:variable name="idfield" select="normalize-space()"/>
This is an error.
This patch adds kohaidx namespace to correct.
Test plan :
- Without patch
- go to etc/zebradb/marc_defs/marc21/authorities/
- run : xslproc xsltproc ../../../xsl/koha-indexdefs-to-zebra.xsl authority-koha-indexdefs.xml > authority-zebra-indexdefs.xsl
- read authority-zebra-indexdefs.xsl
=> the line has changed : <xslo:variable name="idfield" select="normalize-space()"/>
- Apply patch
- go to etc/zebradb/marc_defs/marc21/authorities/
- run : xslproc xsltproc ../../../xsl/koha-indexdefs-to-zebra.xsl authority-koha-indexdefs.xml > authority-zebra-indexdefs.xsl
- read authority-zebra-indexdefs.xsl
=> the line has not changed
(same for unimarc flavor)
Signed-off-by: Mirko Tietgen <mirko@abunchofthings.net>
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>
As Mirko mentioned, the xslt's now generate the facet-processing templates in
the authority xslt's too. They are harmless because we don't define facets
for authority records. If we did, it would be harmless too.
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
All of them were found and fixed using codespell.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Signed-off-by: Jonathan Druart <jonathan.druart@koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2 lines in the Zebra configuration files prevent an exact search for C.,
while all other [A-Z]. searches work correctly.
After taking a look at the /etc/zebradb/etc/word-phrase-utf.chr
those 2 lines cause the problem:
map (^c\.) @
map (^C\.) @
I propose to remove them.
To test:
- Catalog a record with an item with callnumber: C.
- Catalog a record with an item with callnumber: B.
- Try seaching for the second using callnum,ext:B. (exact field search)
- Verify search works.
- Try searching for the other with callnum,ext:C.
- Verify no result.
- Apply the patch - copy the zebra config file if necessary into the right spot
- Reindex
- Repeat searches - both should not bring up the correct record.
Signed-off-by: Indranil Das Gupta (L2C2 Technologies) <indradg@gmail.com>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>
NOTE : I use HTML codes for special characters to avoir encoding issues in patch file.
In ICU configuration, add a transliterate rule for
œ = oe
æ = ae
Test plan :
- Without patch
- Create a record R1 with title containing for example "cœur"
- Create a record R2 with title containing for example "coeur"
- Index those records
- Search for "cœur"
=> You only find R1
- Search for "coeur"
=> You only find R2
- Apply patch
- Restart zebra
- Index R1 and R2
- Search for "cœur"
=> You find R1 and R2
- Search for "coeur"
=> You find R1 and R2
(Same test plan for ae)
------
Tested with all variants of Ae ae Oe oe. Search worked as expected.
Note: The words with special characters were not highlighted, but I think this can be done in an other bug.
Signed-off-by: Marc Veron <veron@veron.ch>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
This patch adds a mapping for the lower case ð character to word-phrase-utf.chr ( Ð was already mapped to d)
To test:
1. Add a record with the ð character (Arnaldur Indriðason is an example author)
2. Rebuild zebra
3. Search for your record using d instead of ð and verify it is not found
4. Apply patch and copy word-phrase-utf.chr to the appropriate folder
5. Restart and rebuild zebra
6. Search for your record using d instead of ð and verify it is found
Signed-off-by: Josef Moravec <josef.moravec@gmail.com>
works as expected
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
This patch is for MARC21. To test:
1)Setup a site with
MARC21
2)Insert 2 record, one lang A in 041 and 008 pos
35-37 an other with lang A in 041 and lang B in 008 pos
35-37
3)Index them
4)Search in advanced search with filter
'languare' for lan A. You will see 2 records
5)Search in
advanced search with filter 'languare' for lan B. You will
see 0 records
6)Apply the patch
7)Full reindex
8)Search in advanced search
with filter 'languare' for lan B. You will see 1 records
http://bugs.koha-community.org/show_bug.cgi?id=12948
Signed-off-by: Magnus Enger <magnus@enger.priv.no>
I have *not* actually tested this, but the changes are identical to the ones
done for NORMARC, which I have tested, so I think it is safe to sign off. If
anyone disagrees, please reset the bug to "Needs signoff".
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
This patch is for Normarc
Same test plan as patch for MARC21, except you need a setup with Normarc.
http://bugs.koha-community.org/show_bug.cgi?id=12948
Signed-off-by: Magnus Enger <magnus@enger.priv.no>
- Added a record with "bul" in 008pos35-37
- Verified that this did not turn up in an advanced search with language =
Bulgarian
- Applied the patch
- I was testing on a gitified install, so I had to copy the patched index file
to the right location with this command:
sudo cp etc/zebradb/marc_defs/normarc/biblios/biblio-zebra-indexdefs.xsl \
/etc/koha/zebradb/marc_defs/normarc/biblios/biblio-zebra-indexdefs.xsl
- Did a full reindex
- Verified that the record *did* turn up in an advanced search with language =
Bulgarian
- Signing off! Thanks Zeno!
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
Small error in word-phrase-utf.chr.
It generates this logs :
17:03:25-21/01 zebraidx(10636) [warn] Map: 'ς' has no mapping
17:03:25-21/01 zebraidx(10636) [warn] duplicate entry for charmap from 'Σ'
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
Small fixes :
more space characters : ¡¿
uppercase AE missing in equivalent
some trailling spaces
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
Add greek support in word-phrase-utf.chr for searching in a Greek catalog (it can also contain latin records).
Developped in collaboration with Giannis Kourmoulis <ikourmou@lib.auth.gr>
Test plan :
- Install using CHR zebra indexing
- Index a greek catalog
- Look for results with mixed uppercase, lowercase and diacritics in title
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
Add the sort-string-utf.chr for sorting Greek catalog (it can also contain latin records).
Developped in collaboration with Giannis Kourmoulis <ikourmou@lib.auth.gr>
Test plan :
- Install using "gr" in "Primary language for Zebra indexing"
- Index a greek catalog
- Sort by title and check sorting is correct
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
Issue existed in koha-conf.xml of /etc.
The following lines were removed from the file:
<!-- uncomment these lines and comment out the above if running on MSWin32 -->
<!--
<listen id="biblioserver" >tcp:localhost:9998/bibliosocket</listen>
<listen id="authorityserver" >tcp:localhost:9999/authoritysocket</listen>
-->
This section was located on lines 9, 10, 11, 12 and 13 of the koha-conf.xml file.
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
Fix a typo. Not test plan required, just a look at default UNIMARC framework.
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
The ICU configuration files contains a rule to remove control characters :
<transform rule="[:Control:] Any-Remove"/>
This rule is before tokenization.
The problem is that "[:Control:]" regex contains line feed, carriage return and tab. See http://www.regular-expressions.info/posixbrackets.html.
So when several lines are indexed, last word of line is joined with first line of next line. Thoses words are then not searchable.
For example :
First line
Second line
This will become "First lineSecond line", tokenized as "First", "lineSecond" and "line".
Test plan :
- Use ICU in Zebra configuration
- Choose an indexed field, like 300$a
- Create a new record
- Enter several lines in choosen field, like :
First line
Second line
- Index this record
=> Without patch the search on "Second" does not return the record
=> With patch the search on "Second" returns the record
- Same tests with tab and carriage return instead of line feed
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
This patch fixes the biblio-koha-indexdefs.xml for NORMARC, so
it includes the <id> element.
Because of how our DOM files work, the resulting biblio-zebra-indexdefs.xsl
for NORMARC picked the whole MARC record as ID, so every time the record
was edited, the id wouldn't match and a new record was created.
To test:
- Have a MARCXML record
- run:
$ xsltproc etc/zebradb/marc_defs/normarc/biblios/biblio-zebra-indexdefs.xsl the_record | less
=> FAIL: verify the z:id property on the <z:record> line contains all subfields concatenated
- Apply the patch
- re-run the xsltproc line
=> SUCCESS: z:id contains the 999$c number
- Sign off :-D
Regards
Signed-off-by: Frederic Demians <f.demians@tamil.fr>
Known bug with DOM: Without <z:id> indexing biblionumber Zebra hasn't it record
unique ID, and so fails to identify existing records. Works as described. 999$c
is linked to biblionumber in default Normarc framework.
Signed-off-by: Magnus Enger <magnus@enger.priv.no>
I have applied the patch to my production server, and at least one customer has
confirmed that it fixes the problem with multiple copies of records in search
results.
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Passes tests and QA script, fix matches what we have for the other MARC flavours.
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
We should add the ability to apply a regular expression to screen
messages for the SIP2 server. This would allow libraries to not only
customize the screen messages the patron sees, but can also allow screen
messages to be translated.
Test Plan:
1) Apply this patch
2) Inspect etc/SIPconfig.xml, note the new screen_msg_regex tags
that can be nested inside a given login tag.
3) Add one or more screen_msg_regex tags to your own SIP config
Recommendation: s/Greetings from Koha./Welcome to your library!/g
4) Restart your SIP2 server
5) Test with a SIP2 machine, or use /misc/sip_cli_emulator.pl
6) Note your new AF fields!
Signed-off-by: Jason Burds <jburds@dubuque.lib.ia.us>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
This followup
- changes some indexes in Queryparser configuration file
- supresses some clearly useless 6XX$9 in biblio-koha-indexdefs.xml and adds 2 new ones, probably useless (not sure of that)
- change the name of index Subject-geographical to Subject-name-geographical in ccl.properties (to match bib1.att)
the xsl file zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl was generated with the following command:
xsltproc zebradb/xsl/koha-indexdefs-to-zebra.xsl zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml > zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl
To test :
1) Apply the 3 patches
2) copy the modified files from the source directory to the directory where you store the config files for Zebra and Queryparser
The files modified by the 3 patches and that need to be copied are:
etc/zebradb/biblios/etc/bib1.att
etc/zebradb/ccl.properties
etc/searchengine/queryparser.yaml
etc/zebradb/ccl.properties
.../unimarc/biblios/biblio-koha-indexdefs.xml
.../unimarc/biblios/biblio-zebra-indexdefs.xsl
3) Rebuild Zebra
4) Create a record A with some values in critical fields, for example:
- the string "test9828" in 600$c 600$f 600$p, 602$f, 616$c, 616$f, 606$2,600$2
- the string "subform" in 600$j
4) Create a record B with the string "subgeo" in 606$y
5) Create a record C with the string "subdate" in 606$z
WITHOUT QP activated in sysprefs ("Don't try to use QP"):
6) try to search "su:test9828". You should have no results
7) try to search "su-genre:subform". You should have 1 result : record A
8) try to search "su-geo:subgeo". You should have 1 result : record B
9) try to search "su-chrono:subdate". You should have 1 result : record C
10) on existing records, try su-ut, su-to, su-na, su-form, su-corp, su-geo indexes, and see it results are relevant
WITH QP activated in sysprefs:
Same tests
Signed-off-by: Nick Clemens <nick@quecheelibrary.org>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
Only cosmetic :
- the references to lines record.abs are now useless and outdated
- some comments added in record.abs could be usefull in biblio-koha-indexdefs.xml
No change expected, only comments
Signed-off-by: Nick Clemens <nick@quecheelibrary.org>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
[New commit on 18 Aug 2014 : rebased, and DOM indexing only]
Issues to fix :
Most of 6XX may contain a $2 that identifies the system used for indexing. It should not be indexed.
In French libraries, $2 contains "rameau". So searching books about the music composer "Rameau" retreive thousands of records!
For some 6XX fiels, other subfields should not be indexed, for example dates of persons and family, or adresses.
In Unimarc guide, 600$t,601$t,602$t are said to exist but to be "not used". I keep them indexed.
Additionnally, subject indexing could be improved by using specific indexes for each 6XX if possible :
In ccl.properties :
- su-to, su-geo and su-ut are defined as aliases of Subject.
- a specific index is defined, but not used in record.abs : Subject-name-personal, alias su-na
We can use these indexes and create new specific indexes by using existing bib1 attributes.
We could also index $j,$x,$y,$z subdivision in specific indexes.
This patch does the following changes :
1) For all 6XX : Not indexing $2 (LSCH, Rameau...), $3 and $5
2) Suppressing the indexing of some specific subfields, depending on the field:
600 : Personal name used as a subject // see Marc21 600
not indexing c (additional elements),f (dates),p (address/affiliation)
602 : Family name used as a subject // see Marc21 600 3X
not indexing f (dates)
616 : Trademark
not indexing c,f
3) For all 6XX : index $j,$x,$y,$z in several indexes in addition to the specfific index for their 6XX field:
4) Define in ccl.properties some specific indexes :
Subject-name-conference 1=1073 => alias su-conf
Subject-name-corporate 1=1074 => alias su-corp
Subject-genre-form 1=1075 => alias su-genre and su-form
Subject-geographical 1=1076 => alias su-geo
Subject-chronological 1=1077 => alias su-chrono
Subject-title 1=1078 => alias su-ut and su-ti
Subject-topical 1=1079 => alias su-to
5) Adding new aliases in Search.pm :
su-chrono, su-form, su-genre, su-corp, su-conf, su-ti
6) Using these new indexes in for
600 : Subject and Subject-Personal-Name ; all subfields except subdivisions in Personal-name
601 : Subject, Subject-name-conference and Subject-name-corporate and Subject-name-conf ; all subfields except subdivisions in Corporate-name and Conference-name
602 : same as 600 but could be improved later
604 : Subject and Subject-title ; $a in Subject-Personal-Name ; all subfields except subdivisions in Name-and-Title
605 : Subject and Subject-title
606 : Subject and Subject-topical
607 : Subject and Subject-geographical ; all subfields except subdivisions in Name-geographic
608 : Subject and Subject-genre-form
To test :
A. In a UNIMARC-DOM indexing environment
1) Apply the patch
2) Rebuild zebra
3) Create a record A with some values in critical fields, for example:
- the string "test9828" in 600$c 600$f 600$p, 602$f, 616$c, 616$f, 606$2,600$2
- the string "subform" in 600$j
4) Create a record B with the string "subgeo" in 606$y
5) Create a record C with the string "subdate" in 606$z
6) try to search "su:test9828". You should have no results
7) try to search "su-genre:subform". You should have 1 result : record A
8) try to search "su-geo:subgeo". You should have 1 result : record B
9) try to search "su-chrono:subdate". You should have 1 result : record C
10) on existing records, try su-ut, su-to, su-na, su-form, su-corp, su-geo indexes, and see it results are relevant
Indexing of subjects could maybe be improved later
Signed-off-by: Nick Clemens <nick@quecheelibrary.org>
All seems to work as expected, I am not super-familiar with UNIMARC but I wonder if in su-corp and su-conf the subdivisions might be useful (e.g. France-Gendarmie / Staatsbibliothek-Berlin)
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
[1] Routine add_cron_job was added in 2007 but has not been defined.
Parameter Recurring is not used.
[2] Made some changes to koha-conf.xml. Instead of an example to edit,
I replaced it by the SCRIPT_NONDEV_DIR install variable.
[3] SCRIPT_NONDEV_DIR had to be included in rewrite-config.pl and the
path had to be corrected for dev installs in Makefile.PL
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Tested single and dev install for supportdir change.
Compared installations with and without the patches for this report.
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
By adding a supportdir, this allows for configuring use in a
non-package install environment, such as git.
Seeing as I only tested git, I clearly had this defined.
Further testing should include packaging up an installation, and
installing a package version without setting the supportdir
configuration value.
Signed-off-by: Mark Tompsett <mtompset@hotmail.com>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
This file should have been removed as part of Bug 12538.
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
To test...
- apply patch
- build and install a new Koha .deb from patched codebase
- create a new Koha instance
- add some authority records to instance
- do a full zebra reindex
- do an authorities search, and get some results
note: this patch does not fix existing Koha instances, just new ones
Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
This patch updates the Zebra configuration for unimarc.
995$d and 995$j should not be indexed.
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>