Koha-community/Koha - Koha: The world's first free and open source library system

Author	SHA1	Message	Date
Tomas Cohen Arazi	9c3efeaab7	Bug 14106: (RM followup) sick of failing tests in Jessie This patch adds the original fix for source installs too... Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>	2015-10-23 22:58:49 -03:00
Fridolin Somers	aaf3ff3fec	Bug 14154: 608$9 defined twice in UNIMARC biblio-koha-indexdefs.xml In DOM config file : etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml, the 608$9 is defined a second time instead of 610$9. Just a typo I think. Test plan : - Apply patch - Install a UNIMARC + DOM instance - Define in a framework 610 using a thesaurus - Create a new biblio - Create a new authority (same type as the thesaurus defined above) - Index : rebuild_zebra.pl -a -b -x -z - Link the field 610 to the new authority - Index : rebuild_zebra.pl -a -b -x -z - In authorities search, search for the new authority => You see Use in 1 Records(s) Signed-off-by: Frederic Demians <f.demians@tamil.fr> I confirm the typo. Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org> Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>	2015-10-21 13:43:12 -03:00
David Cook	5a9b8d9359	Bug 14861: Accession date comparison does not work in advanced search _Test plan_ Prerequisites: Make sure that you have an item with a valid dateaccessioned, and that the bib is indexed in zebra. For the purposes of explanation, I'm going to use the date '2011-09-07' 1) Go to advanced search in the staff client and choose 'Acquisition date (yyyy-mm-dd)' 2) Enter 2011-09-07 (or the date of your choice). 3) Click the search button - you should get your item in the search results. 4) Return to the advanced search screen and select Acquisition date again. 5) Enter a start and end date in the text field separated by ' - '. For example: 2011-09-01 - 2011-09-30 6) Click the search button -- this will return no results. 7) Apply the patch and copy etc/zebradb/ccl.properties to whatever directory is specified by the koha-conf.xml referenced by $KOHA_CONF. 8) Try the search again -- this will return the expected results Signed-off-by: Barton Chittenden <barton@bywatersolutions.com> Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org> Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>	2015-10-19 11:52:44 -03:00
Barton Chittenden	cab481dbb2	Bug 14617: Add fields to ISBN and ISSN indexes: 020$z, 022$y, 022$z 1) Import MARC21 bibs containing - ISBN in 020$z - ISSN in 022$y - ISSN in 022$z 2) Make sure that bibs are indexed 3) Search by ISBN and ISSN above -- bibs should not show in search. 4) Apply patch, re-index 5) Search again; ISBN in 020$z and ISSN in 022$y and 022$z should return results. Signed-off-by: kholten@switchinc.org Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org> Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>	2015-10-19 10:12:04 -03:00
Tomas Cohen Arazi	5d46dbf3e9	Bug 14217: Add 'condition' attribute for DOM index definition This patch introduces an extension to the current syntax for DOM index definition. Specifically, it extends the 'index_subfields' tag to allow adding a 'condition' attribute that is used as a condition for applying the specified index. This (exotic) example is self-explanatory: The previous syntax (which is kept by this patch) took this snippet from biblio-koha-indexdefs.xml <index_subfields tag="100" subfields="acbd"> <target_index>Encuadernador:w</target_index> </index_subfields> and generated an XSLT snippet in the DOM indexing XSLT that looks like this: <xslo:for-each select="marc:subfield"> <xslo:if test="contains('acbd', @code)"> <z:index name="Encuadernador:w"> <xslo:value-of select="."/> </z:index> </xslo:if> </xslo:for-each> This patch introduces this syntax change (note the 'condition' attribute): <index_subfields tag="100" subfields="acbd" condition="@ind2='7'"> <target_index>Encuadernador:w</target_index> </index_subfields> which yields this XSLT snippet in the DOM indexing XSLT: <xslo:if test="@ind2='7'"> <xslo:for-each select="marc:subfield"> <xslo:if test="contains('acbd', @code)"> <z:index name="Encuadernador:w"> <xslo:value-of select="."/> </z:index> </xslo:if> </xslo:for-each> </xslo:if> To test: - Verify that the shipped XSLT files are current regarding the shipped index definitions: $ for i in marc21 normarc unimarc; do xsltproc etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl \ etc/zebradb/marc_defs/$i/biblios/biblio-koha-indexdefs.xml \ > etc/zebradb/marc_defs/$i/biblios/biblio-zebra-indexdefs.xsl done $ git status (repeat for authorities, skip normarc which doesn't have authorities) - Apply the patch - Re-run the previous commands => SUCCESS: no changes - Add a condition to an index_subfields tag (for example, condition="@ind2='7'" in the Author's index) - Regenerate the specific XSLT => SUCCESS: doing a diff shows the only change is the code has been wrapped inside an xslo:if using the condition for the test - Apply the generated xsl to a MARCXML record that has a field matching the condition like this: $ xsltproc .../biblio-zebra-indexdefs.xsl sample_record.xml => SUCCESS: There's an index on the result, containing the configured field/subfields, that matches the criteria. - Sign off and feel really happy :-D Note: the attached sample record includes a 100 field, with ind2=7 and $a=Tomasito Edit: This patch was squashed once I figured it got too complex and Jonathan required a followup to avoid code duplication. This avoids code duplication, with the same results. Sponsored-by: Orex Digital Signed-off-by: Barton Chittenden <barton@bywatersolutions.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io> Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org> Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>	2015-09-25 11:53:24 -03:00
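The effect of the 'condition' attribute described in Bug 14217 can be sketched in a few lines of Python. This is only an illustration of the wrapping behaviour, not the project's koha-indexdefs-to-zebra.xsl stylesheet; the function name index_subfields_to_xslt and its parameters are hypothetical.

```python
from typing import Optional
from xml.sax.saxutils import quoteattr


def index_subfields_to_xslt(subfields: str, target_index: str,
                            condition: Optional[str] = None) -> str:
    """Build the XSLT fragment for one index_subfields definition.

    Hypothetical helper: it mimics the snippets quoted in the commit
    message, it is not how Koha generates them.
    """
    subfield_test = quoteattr(f"contains('{subfields}', @code)")
    lines = [
        '<xslo:for-each select="marc:subfield">',
        f'  <xslo:if test={subfield_test}>',
        f'    <z:index name={quoteattr(target_index)}>',
        '      <xslo:value-of select="."/>',
        '    </z:index>',
        '  </xslo:if>',
        '</xslo:for-each>',
    ]
    if condition is not None:
        # The only difference: the same block is wrapped in an outer xslo:if.
        lines = ([f'<xslo:if test={quoteattr(condition)}>']
                 + ['  ' + line for line in lines]
                 + ['</xslo:if>'])
    return '\n'.join(lines)


plain = index_subfields_to_xslt('acbd', 'Encuadernador:w')
wrapped = index_subfields_to_xslt('acbd', 'Encuadernador:w',
                                  condition="@ind2='7'")
print(wrapped)
```

Comparing plain and wrapped shows the same property the test plan checks with a diff: the inner block is unchanged and only the outer xslo:if is new.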
David Cook	9eee8b90d7	Bug 14031: Itemnumber should be a numeric search in ccl.properties This patch changes the "itemnumber" alias so that it acts like "itemnumber,st-numeric". That is, it always does a numeric search. _TEST PLAN_ The best way to test this patch is to apply the patch and then run "make upgrade", I suspect, as this will refresh your "ccl.properties". However, this patch is actually really small, so you can just apply it manually to an existing "ccl.properties" if you would rather save time. Basically, you just need to do the following steps: 0) Do a search for "itemnumber:<insert real indexed itemnumber here>" 1) Note that you can't retrieve any results 2) Change your ccl.properties to say "itemnumber 1=8010 4=109" 3) Repeat the search for "itemnumber:<X>" 4) Note that you now retrieve your result Signed-off-by: Magnus Enger <magnus@libriotech.no> Tested on a gitified package install. Made the change to /etc/koha/zebradb/ccl.properties manually. After this change I can successfully search for "itemnumber:1". Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org> Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>	2015-08-21 10:05:55 -03:00
Katrin Fischer	e799e1cbc3	Bug 11620: Add dissertation-information index for MARC21 (502) Bug 11202 introduced a new index 'dissertation-information' for UNIMARC. This patch adds the index also for MARC21 installations. http://www.loc.gov/marc/bibliographic/bd502.html To test: - Apply patch - Copy files in etc/zebradb changed by this patch to your corresponding directory (koha-dev..) - Make sure you have records with 502 - Reindex - Verify you can search the field contents with dissertation-information= and diss= Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com> Can find by dissertation-information, No errors Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>	2015-07-20 10:31:06 -03:00
Mirko Tietgen	fbe25b1d8e	Bug 14453: (followup) Fix shipped XSLT files Make the shipped XSLTs for authorities (MARC21 and UNIMARC) the same as the generated version Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar> Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>	2015-07-08 14:39:04 -03:00
Fridolin Somers	2365537eea	Bug 14453: kohaidx is missing for id in authority-koha-indexdefs.xml In authority-koha-indexdefs.xml, all tags use the namespace "kohaidx" except the tag "id". When re-generating authority-zebra-indexdefs.xsl, the line : <xslo:variable name="idfield" select="normalize-space(marc:controlfield[@tag='001'])"/> is modified : <xslo:variable name="idfield" select="normalize-space()"/> This is an error. This patch adds the kohaidx namespace to correct it. Test plan : - Without patch - go to etc/zebradb/marc_defs/marc21/authorities/ - run : xsltproc ../../../xsl/koha-indexdefs-to-zebra.xsl authority-koha-indexdefs.xml > authority-zebra-indexdefs.xsl - read authority-zebra-indexdefs.xsl => the line has changed : <xslo:variable name="idfield" select="normalize-space()"/> - Apply patch - go to etc/zebradb/marc_defs/marc21/authorities/ - run : xsltproc ../../../xsl/koha-indexdefs-to-zebra.xsl authority-koha-indexdefs.xml > authority-zebra-indexdefs.xsl - read authority-zebra-indexdefs.xsl => the line has not changed (same for unimarc flavor) Signed-off-by: Mirko Tietgen <mirko@abunchofthings.net> Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar> As Mirko mentioned, the xslt's now generate the facet-processing templates in the authority xslt's too. They are harmless because we don't define facets for authority records. If we did, it would be harmless too. Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>	2015-07-08 14:39:04 -03:00
Stefan Weil	f6aec46dda	Bug 14383: etc/zebradb: Fix some typos in documentation and Bib-1 attribute set All of them were found and fixed using codespell. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com> Signed-off-by: Jonathan Druart <jonathan.druart@koha-community.org> Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>	2015-06-22 17:34:46 -03:00
Katrin Fischer	f86743d893	Bug 14401: Zebra index configuration doesn't allow exact search for C. 2 lines in the Zebra configuration files prevent an exact search for C., while all other [A-Z]. searches work correctly. After taking a look at the /etc/zebradb/etc/word-phrase-utf.chr those 2 lines cause the problem: map (^c\.) @ map (^C\.) @ I propose to remove them. To test: - Catalog a record with an item with callnumber: C. - Catalog a record with an item with callnumber: B. - Try searching for the second using callnum,ext:B. (exact field search) - Verify search works. - Try searching for the other with callnum,ext:C. - Verify no result. - Apply the patch - copy the zebra config file if necessary into the right spot - Reindex - Repeat searches - both should now bring up the correct record. Signed-off-by: Indranil Das Gupta (L2C2 Technologies) <indradg@gmail.com> Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>	2015-06-22 11:26:13 -03:00
Jonathan Druart	b5e9691060	Bug 8992: Add 7..$3 to the Identifier-standard index Signed-off-by: valerie bertrand <valerie.bertrand@univ-lyon3.fr> Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>	2015-04-28 15:47:40 -03:00
Fridolin Somers	9dbedfd854	Bug 13981 - Transliterate rule for oe and ae NOTE : I use HTML codes for special characters to avoid encoding issues in patch file. In ICU configuration, add a transliterate rule for &oelig; = oe æ = ae Test plan : - Without patch - Create a record R1 with title containing for example "c&oelig;ur" - Create a record R2 with title containing for example "coeur" - Index those records - Search for "c&oelig;ur" => You only find R1 - Search for "coeur" => You only find R2 - Apply patch - Restart zebra - Index R1 and R2 - Search for "c&oelig;ur" => You find R1 and R2 - Search for "coeur" => You find R1 and R2 (Same test plan for ae) ------ Tested with all variants of Ae ae Oe oe. Search worked as expected. Note: The words with special characters were not highlighted, but I think this can be done in another bug. Signed-off-by: Marc Veron <veron@veron.ch> Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2015-04-20 10:03:41 -03:00
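The behaviour targeted by Bug 13981 (a search for either spelling finds both records) can be illustrated with a small Python sketch. It assumes a plain character-to-string mapping applied at both index and query time; the actual ICU rule in Koha's configuration is not reproduced here, and the names TRANSLITERATION and normalize are hypothetical.

```python
# Hypothetical mapping that mirrors the idea of the ICU transliterate rule:
# ligatures are expanded to two letters before terms are compared.
TRANSLITERATION = str.maketrans({
    "\u0153": "oe",  # small ligature oe
    "\u0152": "Oe",  # capital ligature OE
    "\u00e6": "ae",  # small ae
    "\u00c6": "Ae",  # capital AE
})


def normalize(term: str) -> str:
    """Expand ligatures, then lowercase, as a stand-in for index normalization."""
    return term.translate(TRANSLITERATION).lower()


r1_title = "c\u0153ur"   # record R1 from the test plan
r2_title = "coeur"       # record R2 from the test plan

# With the mapping, both titles normalize to the same searchable term.
print(normalize(r1_title), normalize(r2_title))
```

Because the same normalization is applied to indexed text and to the query, it does not matter which spelling the searcher types.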
Nick Clemens	3e136b4f9f	Bug 13800 - Diacritics not mapped This patch adds a mapping for the lower case ð character to word-phrase-utf.chr ( Ð was already mapped to d) To test: 1. Add a record with the ð character (Arnaldur Indriðason is an example author) 2. Rebuild zebra 3. Search for your record using d instead of ð and verify it is not found 4. Apply patch and copy word-phrase-utf.chr to the appropriate folder 5. Restart and rebuild zebra 6. Search for your record using d instead of ð and verify it is found Signed-off-by: Josef Moravec <josef.moravec@gmail.com> works as expected Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2015-04-08 15:13:46 -03:00
Zeno Tajoli	c29a53ea20	Bug 12948: Use word indexing for language (MARC21) This patch is for MARC21. To test: 1) Set up a site with MARC21 2) Insert 2 records, one with lang A in 041 and 008 pos 35-37, another with lang A in 041 and lang B in 008 pos 35-37 3) Index them 4) Search in advanced search with filter 'language' for lang A. You will see 2 records 5) Search in advanced search with filter 'language' for lang B. You will see 0 records 6) Apply the patch 7) Full reindex 8) Search in advanced search with filter 'language' for lang B. You will see 1 record http://bugs.koha-community.org/show_bug.cgi?id=12948 Signed-off-by: Magnus Enger <magnus@enger.priv.no> I have not actually tested this, but the changes are identical to the ones done for NORMARC, which I have tested, so I think it is safe to sign off. If anyone disagrees, please reset the bug to "Needs signoff". Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com> Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2015-02-20 11:51:59 -03:00
Zeno Tajoli	bf89e306a8	Bug 12948: Use word indexing for language (NORMARC) This patch is for Normarc Same test plan as patch for MARC21, except you need a setup with Normarc. http://bugs.koha-community.org/show_bug.cgi?id=12948 Signed-off-by: Magnus Enger <magnus@enger.priv.no> - Added a record with "bul" in 008pos35-37 - Verified that this did not turn up in an advanced search with language = Bulgarian - Applied the patch - I was testing on a gitified install, so I had to copy the patched index file to the right location with this command: sudo cp etc/zebradb/marc_defs/normarc/biblios/biblio-zebra-indexdefs.xsl \ /etc/koha/zebradb/marc_defs/normarc/biblios/biblio-zebra-indexdefs.xsl - Did a full reindex - Verified that the record did turn up in an advanced search with language = Bulgarian - Signing off! Thanks Zeno! Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com> Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2015-02-20 11:51:50 -03:00
Fridolin Somers	a73c464f6e	Bug 11927 - Add greek support to CHR (followup) Small error in word-phrase-utf.chr. It generates these logs : 17:03:25-21/01 zebraidx(10636) [warn] Map: 'ς' has no mapping 17:03:25-21/01 zebraidx(10636) [warn] duplicate entry for charmap from 'Σ' Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2015-01-22 18:22:04 -03:00
Fridolin Somers	6f5f8b112f	Bug 11927 - Small corrections on word-phrase-utf.chr Small fixes : more space characters : ¡¿ uppercase AE missing in equivalent some trailing spaces Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2015-01-21 10:59:17 -03:00
Fridolin Somers	5ac46633ef	Bug 11927 - Add greek to word-phrase-utf.chr Add greek support in word-phrase-utf.chr for searching in a Greek catalog (it can also contain latin records). Developed in collaboration with Giannis Kourmoulis <ikourmou@lib.auth.gr> Test plan : - Install using CHR zebra indexing - Index a greek catalog - Look for results with mixed uppercase, lowercase and diacritics in title Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2015-01-21 10:59:11 -03:00
Fridolin Somers	d0d150a0c5	Bug 11927 - Add greek chr lang_def file Add the sort-string-utf.chr for sorting Greek catalog (it can also contain latin records). Developed in collaboration with Giannis Kourmoulis <ikourmou@lib.auth.gr> Test plan : - Install using "gr" in "Primary language for Zebra indexing" - Index a greek catalog - Sort by title and check sorting is correct Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2015-01-21 10:59:04 -03:00
Tomas Cohen Arazi	532b41934c	Bug 13157: (QA followup) homebranch is 995$b on UNIMARC frameworks Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com> Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>	2014-11-25 15:27:12 -03:00
Frédéric Demians	9ebb6ba5d1	Bug 13157: UNIMARC holdingbranch facet is 995$c not 995$b Fix a typo. No test plan required, just a look at default UNIMARC framework. Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com> Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>	2014-11-25 15:27:05 -03:00
Fridolin Somers	9220482cd3	Bug 13064 - Indexing problem with ICU on control characters The ICU configuration files contain a rule to remove control characters : <transform rule="[:Control:] Any-Remove"/> This rule is before tokenization. The problem is that "[:Control:]" regex contains line feed, carriage return and tab. See http://www.regular-expressions.info/posixbrackets.html. So when several lines are indexed, the last word of a line is joined with the first word of the next line. Those words are then not searchable. For example : First line Second line This will become "First lineSecond line", tokenized as "First", "lineSecond" and "line". Test plan : - Use ICU in Zebra configuration - Choose an indexed field, like 300$a - Create a new record - Enter several lines in chosen field, like : First line Second line - Index this record => Without patch the search on "Second" does not return the record => With patch the search on "Second" returns the record - Same tests with tab and carriage return instead of line feed Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-11-14 12:03:12 -03:00
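The tokenization problem described in Bug 13064 can be reproduced with a short Python sketch. It uses Unicode general category Cc as a stand-in for "[:Control:]" and whitespace splitting as a stand-in for the ICU tokenizer. The second function, which turns control characters into spaces, is only one possible remedy shown for contrast; the commit message does not say which change the patch actually made.

```python
import unicodedata


def remove_controls(text: str) -> str:
    """Drop every control character (category Cc), like the rule in the report."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cc")


def controls_to_spaces(text: str) -> str:
    """Assumed alternative: replace control characters with a space instead."""
    return "".join(" " if unicodedata.category(ch) == "Cc" else ch
                   for ch in text)


def tokenize(text: str) -> list:
    """Very rough stand-in for tokenization: split on whitespace."""
    return text.split()


field = "First line\nSecond line"   # the two-line example from the commit

print(tokenize(remove_controls(field)))      # words on either side of \n merge
print(tokenize(controls_to_spaces(field)))   # "Second" stays a separate token
```

The same merge happens with "\t" and "\r", since both are also in category Cc, which matches the last step of the test plan.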
Tomas Cohen Arazi	d4a7fa8580	Bug 13163: NORMARC DOM config missing <id> entry This patch fixes the biblio-koha-indexdefs.xml for NORMARC, so it includes the <id> element. Because of how our DOM files work, the resulting biblio-zebra-indexdefs.xsl for NORMARC picked the whole MARC record as ID, so every time the record was edited, the id wouldn't match and a new record was created. To test: - Have a MARCXML record - run: $ xsltproc etc/zebradb/marc_defs/normarc/biblios/biblio-zebra-indexdefs.xsl the_record | less => FAIL: verify the z:id property on the <z:record> line contains all subfields concatenated - Apply the patch - re-run the xsltproc line => SUCCESS: z:id contains the 999$c number - Sign off :-D Regards Signed-off-by: Frederic Demians <f.demians@tamil.fr> Known bug with DOM: Without <z:id> indexing biblionumber, Zebra doesn't have its unique record ID, and so fails to identify existing records. Works as described. 999$c is linked to biblionumber in default Normarc framework. Signed-off-by: Magnus Enger <magnus@enger.priv.no> I have applied the patch to my production server, and at least one customer has confirmed that it fixes the problem with multiple copies of records in search results. Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> Passes tests and QA script, fix matches what we have for the other MARC flavours. Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-10-31 16:45:04 -03:00
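The failure mode in Bug 13163 (the whole record used as ID instead of 999$c) can be sketched in Python with the standard library's ElementTree. The sample record and the helper names record_id and whole_record_text are hypothetical; the real ID extraction happens in the generated biblio-zebra-indexdefs.xsl, not in Python.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical minimal MARCXML record with a biblionumber in 999$c.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Sample title</subfield>
  </datafield>
  <datafield tag="999" ind1=" " ind2=" ">
    <subfield code="c">42</subfield>
  </datafield>
</record>"""


def record_id(marcxml: str) -> str:
    """Intended behaviour: a stable ID taken from 999$c only."""
    record = ET.fromstring(marcxml)
    node = record.find(
        "marc:datafield[@tag='999']/marc:subfield[@code='c']", MARC_NS)
    return (node.text or "").strip() if node is not None else ""


def whole_record_text(marcxml: str) -> str:
    """Buggy behaviour: every subfield concatenated, whitespace-normalized."""
    record = ET.fromstring(marcxml)
    return " ".join("".join(record.itertext()).split())


edited = SAMPLE.replace("Sample title", "Corrected title")

print(record_id(SAMPLE), record_id(edited))                   # same ID
print(whole_record_text(SAMPLE), whole_record_text(edited))   # IDs differ
```

With the whole-record text as ID, any edit changes the ID, which is why each edit produced a new copy of the record in the index.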
Tomas Cohen Arazi	c217b2c418	Revert "Bug 9828: More specific indexing of UNIMARC 6XX fields" This reverts commit `0dd1ac40a0`.	2014-10-28 12:02:34 -03:00
Tomas Cohen Arazi	e43f012af6	Revert "Bug 9828 : Add and fix comments in UNIMARC biblio-koha-indexdefs.xml" This reverts commit `5bbe42932e`.	2014-10-28 12:02:22 -03:00
Tomas Cohen Arazi	b108a111f6	Revert "Bug 9828 : Followup for Queryparser and deletion of useless 6XX$9" This reverts commit `49788987b2`.	2014-10-28 12:02:09 -03:00
Mathieu Saby	49788987b2	Bug 9828 : Followup for Queryparser and deletion of useless 6XX$9 This followup - changes some indexes in Queryparser configuration file - suppresses some clearly useless 6XX$9 in biblio-koha-indexdefs.xml and adds 2 new ones, probably useless (not sure of that) - changes the name of index Subject-geographical to Subject-name-geographical in ccl.properties (to match bib1.att) the xsl file zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl was generated with the following command: xsltproc zebradb/xsl/koha-indexdefs-to-zebra.xsl zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml > zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl To test : 1) Apply the 3 patches 2) copy the modified files from the source directory to the directory where you store the config files for Zebra and Queryparser The files modified by the 3 patches and that need to be copied are: etc/zebradb/biblios/etc/bib1.att etc/zebradb/ccl.properties etc/searchengine/queryparser.yaml etc/zebradb/ccl.properties .../unimarc/biblios/biblio-koha-indexdefs.xml .../unimarc/biblios/biblio-zebra-indexdefs.xsl 3) Rebuild Zebra 4) Create a record A with some values in critical fields, for example: - the string "test9828" in 600$c 600$f 600$p, 602$f, 616$c, 616$f, 606$2,600$2 - the string "subform" in 600$j 4) Create a record B with the string "subgeo" in 606$y 5) Create a record C with the string "subdate" in 606$z WITHOUT QP activated in sysprefs ("Don't try to use QP"): 6) try to search "su:test9828". You should have no results 7) try to search "su-genre:subform". You should have 1 result : record A 8) try to search "su-geo:subgeo". You should have 1 result : record B 9) try to search "su-chrono:subdate". You should have 1 result : record C 10) on existing records, try su-ut, su-to, su-na, su-form, su-corp, su-geo indexes, and see if results are relevant WITH QP activated in sysprefs: Same tests Signed-off-by: Nick Clemens <nick@quecheelibrary.org> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-10-27 12:46:47 -03:00
Mathieu Saby	5bbe42932e	Bug 9828 : Add and fix comments in UNIMARC biblio-koha-indexdefs.xml Only cosmetic : - the references to lines in record.abs are now useless and outdated - some comments added in record.abs could be useful in biblio-koha-indexdefs.xml No change expected, only comments Signed-off-by: Nick Clemens <nick@quecheelibrary.org> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-10-27 12:46:44 -03:00
Mathieu Saby	0dd1ac40a0	Bug 9828: More specific indexing of UNIMARC 6XX fields [New commit on 18 Aug 2014 : rebased, and DOM indexing only] Issues to fix : Most 6XX fields may contain a $2 that identifies the system used for indexing. It should not be indexed. In French libraries, $2 contains "rameau". So searching for books about the music composer "Rameau" retrieves thousands of records! For some 6XX fields, other subfields should not be indexed, for example dates of persons and family, or addresses. In Unimarc guide, 600$t,601$t,602$t are said to exist but to be "not used". I keep them indexed. Additionally, subject indexing could be improved by using specific indexes for each 6XX if possible : In ccl.properties : - su-to, su-geo and su-ut are defined as aliases of Subject. - a specific index is defined, but not used in record.abs : Subject-name-personal, alias su-na We can use these indexes and create new specific indexes by using existing bib1 attributes. We could also index $j,$x,$y,$z subdivisions in specific indexes. This patch does the following changes : 1) For all 6XX : Not indexing $2 (LCSH, Rameau...), $3 and $5 2) Suppressing the indexing of some specific subfields, depending on the field: 600 : Personal name used as a subject // see Marc21 600 not indexing c (additional elements),f (dates),p (address/affiliation) 602 : Family name used as a subject // see Marc21 600 3X not indexing f (dates) 616 : Trademark not indexing c,f 3) For all 6XX : index $j,$x,$y,$z in several indexes in addition to the specific index for their 6XX field: 4) Define in ccl.properties some specific indexes : Subject-name-conference 1=1073 => alias su-conf Subject-name-corporate 1=1074 => alias su-corp Subject-genre-form 1=1075 => alias su-genre and su-form Subject-geographical 1=1076 => alias su-geo Subject-chronological 1=1077 => alias su-chrono Subject-title 1=1078 => alias su-ut and su-ti Subject-topical 1=1079 => alias su-to 5) Adding new aliases in Search.pm : su-chrono, su-form, su-genre, su-corp, su-conf, su-ti 6) Using these new indexes: for 600 : Subject and Subject-Personal-Name ; all subfields except subdivisions in Personal-name 601 : Subject, Subject-name-conference and Subject-name-corporate and Subject-name-conf ; all subfields except subdivisions in Corporate-name and Conference-name 602 : same as 600 but could be improved later 604 : Subject and Subject-title ; $a in Subject-Personal-Name ; all subfields except subdivisions in Name-and-Title 605 : Subject and Subject-title 606 : Subject and Subject-topical 607 : Subject and Subject-geographical ; all subfields except subdivisions in Name-geographic 608 : Subject and Subject-genre-form To test : A. In a UNIMARC-DOM indexing environment 1) Apply the patch 2) Rebuild zebra 3) Create a record A with some values in critical fields, for example: - the string "test9828" in 600$c 600$f 600$p, 602$f, 616$c, 616$f, 606$2,600$2 - the string "subform" in 600$j 4) Create a record B with the string "subgeo" in 606$y 5) Create a record C with the string "subdate" in 606$z 6) try to search "su:test9828". You should have no results 7) try to search "su-genre:subform". You should have 1 result : record A 8) try to search "su-geo:subgeo". You should have 1 result : record B 9) try to search "su-chrono:subdate". You should have 1 result : record C 10) on existing records, try su-ut, su-to, su-na, su-form, su-corp, su-geo indexes, and see if results are relevant Indexing of subjects could maybe be improved later Signed-off-by: Nick Clemens <nick@quecheelibrary.org> All seems to work as expected, I am not super-familiar with UNIMARC but I wonder if in su-corp and su-conf the subdivisions might be useful (e.g. France-Gendarmie / Staatsbibliothek-Berlin) Signed-off-by: Paul Poulain <paul.poulain@biblibre.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-10-27 12:46:42 -03:00
Mason James	a657589b1f	Bug 11362 - increase zebra AUTH register sizes, from 4G to 20G To test... - apply patch - build and install a new Koha .deb from patched codebase - create a new Koha instance - add some authority records to instance - do a full zebra reindex - do an authorities search, and get some results note: this patch does not fix existing Koha instances, just new ones Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz> Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-10-24 09:41:04 -03:00
Jonathan Druart	b3acefc319	Bug 11586: Better default framework for UNIMARC - zebra conf This patch updates the Zebra configuration for unimarc. 995$d and 995$j should not be indexed. Signed-off-by: Paul Poulain <paul.poulain@biblibre.com> Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-10-23 10:52:03 -03:00
Tomas Cohen Arazi	ca17512a8e	Bug 11232: (qa followup) empty ID due to namespace mistake Note: NORMARC is missing the id field. Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com> Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de> This patch makes t/db_dependent/Search.t pass again. NORMARC is currently not tested. I checked the results before and after applying the patch and the facets are now looking the same as before. Passes all tests and QA script. Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-10-15 12:55:52 -03:00
Tomas Cohen Arazi	ccf7ae56f6	Bug 11232: (qa followup) Add missing fields/subfields to the item types facet The itype facet was missing 952$y for both MARC21 and NORMARC. This patch adds that. And also modifies the zebra-biblios-dom.cfg file (also the debian/ version) so facetNumRecs is set to 1000 for zebra. It is the amount of records that are taken into account. The more records, the more exact the facets for the result set. 1000 was chosen as it changed the time to reindex 1000 records from 18s to 19s. Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com> Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-10-15 12:55:47 -03:00
Tomas Cohen Arazi	e95cd1b126	Bug 11232: (followup) remove unnecessary namespace definition from all XML elements The previous patches for facet extraction from Zebra indexes set a default namespace on the following files: etc/zebradb/marc_defs/marc21/biblios/biblio-koha-indexdefs.xml etc/zebradb/marc_defs/normarc/biblios/biblio-koha-indexdefs.xml etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml and hence the XML file index_subfields can be cleaned by removing the namespace. To test: - Apply this patch - Run $ for i in marc21 normarc unimarc do xsltproc etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl \ etc/zebradb/marc_defs/$i/biblios/biblio-koha-indexdefs.xml \ > etc/zebradb/marc_defs/$i/biblios/biblio-zebra-indexdefs.xsl done => SUCCESS: no errors reported - Run $ git diff => SUCCESS: no differences on the xsl files - Sign off :-D Sponsored-by: Universidad Nacional de Cordoba Signed-off-by: David Cook <dcook@prosentient.com.au> Seems to work with DOM and MARC21. Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-10-15 12:55:44 -03:00
Tomas Cohen Arazi	c1e384f250	Bug 11232: NORMARC facet definition and updated XSL file for DOM This patch adds the facets definitions to the biblio-koha-indexdefs.xml, based on what is hardcoded on C4::Koha::getFacets(). The biblio-zebra-indexdefs.xsl file for NORMARC is generated using the usual: xsltproc ...koha-indexdefs-to-zebra.xsl ...normarc/biblios/biblio-koha-indexdefs.xml > \ ...normarc/biblios/biblio-zebra-indexdefs.xsl Sponsored-by: Universidad Nacional de Cordoba Signed-off-by: David Cook <dcook@prosentient.com.au> Seems to work with DOM and MARC21. Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-10-15 12:55:40 -03:00
Tomas Cohen Arazi	eafeb34097	Bug 11232: UNIMARC facet definition and updated XSL file for DOM This patch adds the facets definitions to the biblio-koha-indexdefs.xml, based on what is hardcoded on C4::Koha::getFacets(). The biblio-zebra-indexdefs.xsl file for UNIMARC is generated using the usual: xsltproc ...koha-indexdefs-to-zebra.xsl ...unimarc/biblios/biblio-koha-indexdefs.xml > \ ...unimarc/biblios/biblio-zebra-indexdefs.xsl Sponsored-by: Universidad Nacional de Cordoba Signed-off-by: David Cook <dcook@prosentient.com.au> Seems to work with DOM and MARC21. Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-10-15 12:55:38 -03:00
Tomas Cohen Arazi	2cc293ecd6	Bug 11232: MARC21 facet definition and updated XSL file for DOM This patch adds the facets definitions to the biblio-koha-indexdefs.xml, based on what is hardcoded on C4::Koha::getFacets(). The biblio-zebra-indexdefs.xsl file for MARC21 is generated using the usual: xsltproc ...koha-indexdefs-to-zebra.xsl ...marc21/biblios/biblio-koha-indexdefs.xml > \ ...marc21/biblios/biblio-zebra-indexdefs.xsl Sponsored-by: Universidad Nacional de Cordoba Signed-off-by: David Cook <dcook@prosentient.com.au> Seems to work with DOM and MARC21. Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-10-15 12:55:36 -03:00
Tomas Cohen Arazi	ca074c9253	Bug 11232: Add new syntax for facets definition on koha-indexdefs-to-zebra.xsl This patch changes koha-indexdefs-to-zebra.xsl to correctly process a new syntax for defining facet indexes on the XML files. It also changes the retrieval file to allow access to Zebra's internal data from Zoom (i.e. access to zebra::facet:*). Sponsored-by: Universidad Nacional de Cordoba Signed-off-by: David Cook <dcook@prosentient.com.au> Seems to work with DOM and MARC21. Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-10-15 12:55:33 -03:00
Fridolin Somers	95adc7a1f4	Bug 12453 - Do not use by default Host-Item-Number in UNIMARC Currently, in a default UNIMARC install, 461$9 is indexed as Host-Item-Number, meaning it is used for analytical itemnumber. But most UNIMARC catalogs use the analytical relation using unimarc_field_4XX.pl plugin on 461$a. In fact, this plugin is defined in default UNIMARC frameworks. If Host-Item-Number is defined but 461$9 is used for something else, it will lead to odd bugs. For example, records containing analytical items cannot be deleted. This patch comments out the 461$9 indexing in UNIMARC zebra config. Test plan : - Create a fresh UNIMARC install - Create a record with 461$9 containing a value - Index the record - Perform a search on Host-Item-Number : ccl=Host-Item-Number,alwaysmatches='' => Without the patch you get a result => With the patch you get no result Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> Code is clean, commenting out all the indexing of 461$9. Trusting the author that this is the correct thing to do :) Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-08-24 12:32:30 -03:00
Marcel de Rooy	fb2345a302	Bug 9612: (follow-up) restore elementSetName in Context.pm Restore elementSetName to marcxml for DOM indexing in Zconn (Context.pm). This prevents the need of rebuilding the index after restarting Zebra server. Removes the now incorrect reference to marcxml as 'superfluous' in four dom config files. Test plan: [1] Do not yet apply this patch. [2] Rebuild zebra index with the zebra config of commit `036f2a50e1`. [3] (Go back to master.) Restart your zebra server (no config change). You will have results without details. Apply this patch: you see details. Reset to master: no details again. [4] Install new zebra config from master. Search again: you still see no details. Restart zebra server. Search: you see details. Apply this patch. Search: still details. Restart zebra server. Search: still details. Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl> Tested in a non-package environment (manual dev install). The package environment should work now too (results in step 4c might differ). Progress on bug 12012 would be appropriate to sync all changes. Tested the response of the SRU server too. Signed-off-by: Marc Veron <veron@veron.ch> I tested starting on a VM with Koha 3.15.00.019 installed. Did git pull -> Koha 3.15.00.051 Result: No details in search results. Applied patch. Result: Search results display fine. Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com> Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2014-05-19 16:46:57 +00:00
Marcel de Rooy	67abcc6443	Bug 9612: fix SRU response for DOM indexing This patch makes changes to koha-conf.xml by removing the fallback section from biblioserver and authserver. The information is in a include file on the same server (no need to fall back) and moreover, some information is not up-to-date and should be moved elsewhere. The patch also simplifies the DOM retrieval-info files for auth and bib. And eliminates superfluous F and usmarc from the dom-config files. (I felt the urge to remove marcxml too, but left it for now; see also the second patch.) For reference, look at the marcxml example files of Zebra. NOTE: This patch does not deal with the Debian package installs. In the same way koha-conf-site.xml.in, and -retrieval-info- could be adjusted. Test plan: [1] Run at least a dev install in order to copy the new files to your Zebra folders. Choose for DOM indexing. Enable the SRU server on port 9998 (small edit in koha-conf.xml). [2] Restart Zebra and reindex -a -b -x. [3] Verify if a search from Koha still functions as expected. Check the SRU output on port 9998. NOTE: If you do not pass recordSchema, you should get back a marc response now (instead of index schema). Bonus: Add your server as a Z3950 target to another Koha install. And perform a Z3950 search from the other server to your new install. Bonus: Check response from the auth and biblio socket via yaz-client. [4] Reindex again with -a -b but without -x. [5] Repeat Koha search, SRU response (Z3950, yaz-client). Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2014-05-05 20:28:04 +00:00
Colin Campbell	735381b371	Bug 10729: Add phrases configuration for ICU Add a separate phrases-icu.xml for phrase indexes The file is based on that distributed with zebra with a couple of additions to reflect Koha usage This patch adds a separate tokenizer variable for phrase indexes so that default.idx is correctly rewritten for sites using icu indexing Signed-off-by: Paola Rossi <paola.rossi@cineca.it> Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> - Applied patch - perl Makefile.PL --prev-install-log ../koha-dev/misc/koha-install-log - make upgrade - Restarted Zebra server - Did a full reindex of bibliographic and authorities - Checked various searches - Links records to authorities - Checked created links work correctly I couldn't find a regression with this patch. Passes all tests and QA script. Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2014-05-05 04:10:57 +00:00
Fridolin Somers	bd65c6e95b	Bug 11635: remove duplicate definition of 995$r in UNIMARC record.abs Test plan : - Create a fresh install UNIMARC flavor and GRS1 indexing for biblios - Re-index database - Perform a search with index "itemtype" (and then "itype") on an existing value of 995$r. For example : itemtype:BOOK => Check you get results Signed-off-by: Mark Tompsett <mtompset@hotmail.com> Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2014-05-05 02:25:20 +00:00
Mirko Tietgen	84bdb55549	Bug 9972: Add/change some zebra indexes (MARC21) This patch adds :w and :p versions to the index for »Lexile number« (it has only :n so far) and adds indexes for 653 (Index term uncontrolled), 655 (Index term Genre/Form), 041 (language-audio) and 041 (language-subtitle). It also adds the »curriculum«-index to Search.pm. Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz> Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2014-04-20 16:24:08 +00:00
Mathieu Saby	b6118db2f5	Bug 11202: Improve UNIMARC biblio indexing This patch makes the following changes to UNIMARC biblio indexing : A. Changes to UNIMARC conf files 1. add comments to biblio-koha-indexdefs.xml 2. make biblio-koha-indexdefs.xml more compact by grouping some declarations Ex : 200$f and 200$g => one declaration for 200$fg 3. suppress unneeded declarations (indexing of some 4XX fields and 6XX fields not in unimarc format) 4. unindex some (sub)fields unneeded by most users (318, 207,230,210a, 215, 4XXd) 5. change the way 308 field is indexed (no visible changes) 6. replace Title-host with Host-item -- see bug 11119 7. index 208 in Material-Type -- see bug 11119 8. index 100 pos 8-9 and 9-12 in pubdate:y and pubdate:n 9. index 100 pos 8-9 in pubdate:s instead of 210$d 10. Index all subfields of note 334 and 327 in note index 11. Index 304 and 327 in title index as well as note index 327 can contain a list of titles included in a work 304 can contain the title of the original work in case of a translation 12. Index 314 in author index as well as note index 314 can contain authors not mentioned in 200$f/g (the 4th, 5th etc. author) 13. Index 328 note in Dissertation-information as well as note 14. Index 328$t in Title B. Changes to ccl.properties : 1. add a new index Dissertation-information (1056) 2. fix EAN, pubdate and acqdate (they were not linked with bib1 attributes) C. Changes to Search.pm 1. add Dissertation-information and suppress Title-host and UPC D. Changes to QP config file queryparser.yaml 1. add Dissertation-information 2. fix EAN, pubdate and acqdate Test plan : If you cannot test in GRS1, test only in DOM, as GRS will be deprecated. 1. Apply the patch in a UNIMARC Koha running with DOM and ICU 2. copy src/etc/searchengine/queryparser.yaml into the main config directory of QP 3. copy src/etc/zebradb/ccl.properties into the main config directory of Zebra 4. 
copy src/etc/zebradb/marc_defs/unimarc/biblio/* into the main config directory of Zebra 5. reindex biblios (rebuild_zebra.pl -r -b -x -v) 6. test note index : make some searches on 334$b or 327$b 7. test author index : make some searches on 314 field 8. test title index : make some searches on 304 and 327 field, make a search on 328$t subfield 9. test dissertation-information index : make some searches on 328 field 10. In a record, put in the dates of 100 fields the values "1000" (1st date) and "1001" (2d date) ; try to search a book written in year 1000, you should find the record ; idem for year 1001 11. make some searches and sort by date. It should work better than before, especially if you have values like "c2009" or "impr. 2010" in 210 field 12. Regression test : make some searches on several indexes, like EAN, etc. It should work as before Test 10-12 with and without Queryparser activated. Be careful: with Queryparser activated, the index names (title, dissertation-information...) must be entered in lowercase only. Of course, to test search and sort by dates, you need to have full records, with dates in 100 field as well as 210 field. Signed-off-by: Paola Rossi <paola.rossi@cineca.it> Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2014-02-19 21:01:15 +00:00
Galen Charlton	aaff735269	Bug 10544: (follow-up) update MARC21 DOM index definitions This patch updates the MARC21 DOM index definitions to index the 952$i as 'Number-local-acquisition' rather than 'stocknumber'. To test (for a MARC21/DOM setup): [1] Copy the MARC21 biblio-zebra-indexdefs.xsl over to the active Zebra configuration directory. [2] Reindex the bib records. [3] Verify that 'stocknumber', 'inv', and 'number-local-acquisition' searches work. Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2014-02-19 20:41:37 +00:00
Fridolyn SOMERS	b0f39cee0d	Bug 10544: add Number-local-acquisition in known indexes Adding Number-local-acquisition in C4::Search known indexes allows searching without using "ccl=" prefix. Also corrects ccl.properties : inv must be an alias of Number-local-acquisition. Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl> Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2014-02-19 20:39:58 +00:00
Fridolyn SOMERS	10e1cbeb14	Bug 10544: ensure that stocknumber searches work for MARC21 Bug 6256 replaced in bib1.att stocknumber by Number-local-acquisition for number 1062. In this case, Number-local-acquisition must be used in record.abs and stocknumber can be an alias of it in ccl.properties. Test plan (for MARC21/GRS1): - drop zebra database (rebuild_zebra.pl -r ...) - reindex - test in simple search : ccl=Number-local-acquisition,alwaysmatches='' => you get all records with a stocknumber - test in simple search : ccl=stocknumber,alwaysmatches='' => you get the same results Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl> Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2014-02-19 20:39:10 +00:00
Chris Cormack	211acdd30b	Bug 11192: (follow-up) fix a little typo Test plan the same as the original patch Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> Tested according to test plan. Searches tested were: fic=e fiction=e Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2014-01-03 15:34:06 +00:00
Mathieu Saby	a032b0a5cd	Bug 11192: Fix lf and ff07-02 definition in ccl.properties ff7-02 1=87020 (position 2 of field 007 in MARC21) should be ff7-02 1=8702 lf 1=8833 lf fiction fic fiction should be lf 1=8833 fiction lf fic lf To test : 1. apply the patch 2. copy the modified ccl.properties into your active Zebra config directory 3. reindex zebra (rebuild_zebra.pl -b -x -r -v) 4. make some searches using the fixed indexes Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2014-01-03 15:33:04 +00:00
Jonathan Druart	a573ac1fa8	Bug 9940: (follow-up) FIX comment: language-original is 101$c, not $h Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-12-25 15:41:31 +00:00
Mathieu Saby	451f67c055	Bug 9940: Add a new index for the original language of a document It could be useful to index the original language of a document (i.e. "fre" for the English translation of a French novel). This patch renames the Bib-1 use attribute 1095 from Code-language-original to language-original and uses it to index: - MARC21 041$h subfield - UNIMARC 101$c subfield It adds "language-original" in the list of indexes in Search.pm. Test plan : A. in a MARC21 GRS1 environment 1. Copy Zebra config files (zebradb/biblios/etc/bib1.att, zebradb/ccl.properties, marc_defs/marc21/biblios/record.abs) from your source etc/ directory to your main koha etc/ directory 2. Reindex zebra 3. Make some searches, like "language-original:fre" B. in a MARC21 DOM environment 4. Copy Zebra config files (zebradb/biblios/etc/bib1.att, zebradb/ccl.properties, marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl) from your source etc/ directory to your main koha etc/ directory 5. Reindex zebra 6. Make some searches, like "language-original:fre" C. in a UNIMARC GRS1 environment 7. Copy Zebra config files (zebradb/biblios/etc/bib1.att, zebradb/ccl.properties, marc_defs/unimarc/biblios/record.abs) from your source etc/ directory to your main koha etc/ directory 8. Reindex zebra 9. Make some searches, like "language-original:fre" D. in a UNIMARC DOM environment 10. Copy Zebra config files (zebradb/biblios/etc/bib1.att, zebradb/ccl.properties, marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl) from your source etc/ directory to your main koha etc/ directory 11. Reindex zebra 12. Make some searches, like "language-original:fre" Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz> Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-12-25 15:37:14 +00:00
Mathieu Saby	c00131e0ff	Bug 9830: Fix some indexes in UNIMARC item indexing With this combination of sysprefs, and a UNIMARC configuration, it was impossible to search on location, barcode and ccode indexes : QueryWeightFields is activated QueryAutoTruncate only if * is added But in UNIMARC, location, barcode and ccode (995 $e, $f, $h) were indexed only as "words". They need to be indexed also as "phrase". Additionally, in UNIMARC, information about damaged and withdrawn status of items is not indexed, while it is done in MARC21. This patch - adds 2 new indexes for 995$1 (damaged) and 995$3 (withdrawn) - indexes location, barcode and ccode as "phrase" as well as "words" Indexing of items in UNIMARC could be improved later. So this patch also adds comments explaining the origin of Koha 995, I think it could be useful for further changes. To test, on a UNIMARC configuration : A. indexed with GRS-1 1) Set sysprefs QueryWeightFields as "activated" and QueryAutoTruncate as "only if * is added" 2) Select location index in advanced search and search for a value existing in your records in 995$e => 0 results 3) Apply patch 4) Rebuild zebra 5) Select location index in advanced search and search for a value existing in your records in 995$e => x results 6) Mark an item as withdrawn; search "withdrawn:1" => x results, and among them the biblio to which the item is attached 7) Mark an item as damaged ; search "damaged:1" => x results, and among them the biblio to which the item is attached B. indexed with DOM Do the same operations Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com> Work as described. No koha-qa errors Test Apply the patch Begin with GRS-1 Full reindex Search by location, no results cp files biblio-*-indexdefs.xml and record.abs to destination on etc/zebra Full reindex Search by location, got results Switch to DOM reset files Full reindex Search by location, no results cp files Full reindex Search by location, results ! 
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-10-21 15:38:49 +00:00
Mathieu Saby	4d8b1ec786	Bug 7421: support indexing UNIMARC authority records using the DOM Filter I took as a base the patch of F. Demians, but made a lot of changes, so I think it is more logical to create a new patch as the behavior is not the same as previous patch. I tried to define DOM config files as a "mirror" of record.abs, so that the behavior is the same. If it is OK, we will be able to improve indexing later, for example suppressing warnings, managing indicators or subdivisions, etc. I made some little changes to record.abs : - comments - 216 was indexed in Conference-name as well as Trademark. I suppose that "Conference-name" is an error, so I indexed only in Trademark - index 2 new notes : 340 / 356 The only difference between record.abs and DOM is that DOM config files do not index complete fields, but subfields. Ex : melm 200 ===> <kohaidx:index_subfields tag="200" subfields="abcdfgjxyz"> I took all the subfields from the UNIMARC Authorities manual. The only subfields not indexed are numeric subfields : $7, $8 for language of record, and $0,2,3,5,6 for 4XX/5XX/7XX To test : - index a set of bib and auth records with GRS-1 - make some searches on different kind of authorities - index the same records with DOM - make the same searches - You are not supposed to see differences Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> As I am not a UNIMARC user it's hard for me to test this, but while testing other authority related patches I noticed that I couldn't index the UNIMARC authorities of the sample base. The files are obviously missing and reindex_zebra.pl notes this. With this patch applied, indexing works and authorities are searchable in my installation. Signed-off-by: Vitor Fernandes <fvernandes@keep.pt> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-10-10 21:03:15 +00:00
Mathieu Saby	5298140c67	Bug 10037: fix item index in UNIMARC DOM indexing In UNIMARC DOM indexing, "item" index was working only for subfields of 995 field mapped with specific indexes, and also in index (ex : $a, $b...). It was not working for the other subfields (ex : $g), because a comment from record.abs was integrated in DOM config files. This patch removes the comment. To test, in a DOM UNIMARC environment : 1) In an item, write some value "Test10037" in 995$g 2) Search for this value in simple search, this way : item=Test10037 => you should have no results 3) Apply the patch. if necessary, copy the modified etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml and etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl into the /etc/... directory in your main Koha directory 4) Reindex Zebra biblios 5) Do the same search as 2) => you should have one result Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com> Work as described. No koha-qa errors. Test NOTE: default UNIMARC framework doesn't have 995g, so I must add it first. 1) Added test string to 995b on some record 2) Reindex and search as indicated, no results 3) cp files to destination 4) reindex 5) search and result ok ! Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-10-10 18:54:12 +00:00
Galen Charlton	8ea3462517	Bug 8252: (follow-up) standardize name of Identifier-publisher-for-music index To test: [1] When running t/db_dependent/Search.t, verify that no warnings like this are shown: 15:52:07-10/10 zebraidx(2006) [warn] Index 'Number-music-publisher' not found in attset(s) Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-10-10 16:05:33 +00:00
Galen Charlton	45d0365d12	Bug 8620: (follow-up) apply to NORMARC and MARC21 authorities This applies the fix for the Any index to NORMARC bib and MARC21 authority DOM Zebra indexes. Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-10-10 15:56:13 +00:00
Galen Charlton	5024e519ad	Bug 8252: (follow-up) tidy up long lines in bib1.att Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-10-10 15:56:13 +00:00
Mathieu Saby	475a9d19d1	Bug 8252: (follow-up) fix biblio-zebra-indexdefs.xsl This patch fixes biblio-zebra-indexdefs.xsl files. It was generated from biblio-koha-indexdefs.xml with the new koha-indexdefs-to-zebra.xsl amended by F. Démians's patch. To test : - Take a DOM UNIMARC Koha - Apply all the patches of bug 8252, including this one - Copy src/etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl to your etc/zebradb/marc_defs/unimarc/biblios/ located in your installation directory - Run rebuild_zebra -b -x -r -v - make advanced searches on staff interface and opac, on coded fields indexes (Audience, Literary genre, Biography, Illustration, Content, Video Types, Serial Type, Periodicity, Regularity, Picture) Signed-off-by: Frédéric Demians <f.demians@tamil.fr> Ok for me. This patch puts the XSL index definitions in sync with the authoritative XML definition. Subsequently, it won't be difficult to amend DOM UNIMARC index definitions if necessary. And, as it is, I don't see any regression, whereas I can see huge improvements. Thanks Mathieu! Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-10-10 15:21:56 +00:00
Frédéric Demians	f9addcc98b	Bug 8852: DOM XSL now handles subfield substring extraction This patch modify koha-indexdefs-to-zebra.xsl in order to add the ability to populate indexes with subfield substring. It's now possible to understand such construction as: <index_subfields xmlns="http://..." tag="100" subfields="a" offset="7" length="1"> <target_index>tpubdate:s</target_index> </index_subfields> Signed-off-by:Mathieu Saby <mathieu.saby@univ-rennes2.fr> I applied the patch and ran xsltproc koha-indexdefs-to-zebra.xsl ../marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml \ > ../marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl I looked at the generated file. It looks nice. Then I copied it file in my INSTALLDIR/etc/zebra.... and reindexed my records with rebuild_zebra.pl I made some searches on coded position index and non coded position indexes, everything works. http://bugs.koha-community.org/show_bug.cgi?id=8252 Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-10-10 15:19:20 +00:00
Mathieu Saby	43809d2835	Bug 8252: Followup for Date/time-last-modified and Music number This followup restores the original wording of "Date/time-last-modified" index, and change the name of "Music-number" index to "Number-music-publisher" To test : 1. In a UNIMARC Koha instance 2. Apply patchs #1, #2 and this followup 3. Copy from src/etc/zebradb directory to the etc/zebradb/ in your main Koha directory the following files: -- zebradb/biblios/etc/bib1.att -- zebradb/ccl.properties -- zebradb/marc_defs/unimarc/biblios/record.abs -- zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml -- zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl 4. Rebuild zebra with -b -x -v -r options 5. Write a value like "test071a" in 071$a field in a record 6. Check if you can find this record with this search: "ccl=Number-music-publisher:test071a" Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com> No koha-qa errors. Test Copy files reindex full Modify a couple of record to add 071a with test message Reindex -v -z -b -x Search test message as described and found modified records. Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-10-10 15:16:27 +00:00
Mathieu Saby	8034566027	Bug 8252: Fix indexing of UNIMARC 1xx for DOM This patch makes the same changes in UNIMARC DOM configuration as patch 1 made for GRS-1. positions of subfields are indexed that way : In biblio-koha-indexdefs.xml : tag="100" subfields="a" offset="17" length="1" In biblio-zebra-indexdefs.xsl : xslo:value-of select="substring(., 17, 1)" I had to edit biblio-zebra-indexdefs.xsl by hand, because etc/zebradb/xml/koha-indexdefs-to-zebra.xsl only supports "substring" in the handle-one-index-control-field template. It is good for MARC21, but not for UNIMARC : in MARC21, indexing substrings is needed for control fields (001-009, with no subfields) But in UNIMARC it is needed for subfields of 1XX fields. So if DOM indexing is working with these new files, we may need to change koha-indexdefs-to-zebra.xsl. Test plan (not possible in a sandbox) : 1) In a Koha instance using UNIMARC and DOM indexing 2) Apply Patch 1 and Patch 2 (this one) 3) Copy the following files from the etc/zebradb directory of your source into the etc/zebradb directory of your main Koha directory : -- etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml -- etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl -- etc/zebradb/ccl.properties -- etc/zebradb/biblios/etc/bib1.att 4) rebuild zebra with -x -b -r -v options 5) check if coded filters in advanced search are usable in OPAC and Staff interface Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com> Works. No koha-qa errors. Test for DOM Apply patches Don't forget to copy files reindex Search by coded fields works, also Country-publication Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-10-10 15:15:04 +00:00
Mathieu Saby	041e3603a1	Bug 8252: Fix indexing of UNIMARC 1xx for GRS-1 Before fixing UNIMARC DOM indexing, we must fix GRS-1 indexing 1) In advanced search, some Coded fields index are not working: Print, Illustration, Content 2) Country-heading index is not working 3) Some subfields are indexed in wrong indexes : 102$a should be in Country-publication instead of Country-heading (not defined in bib1.att) 106$a, filled only for printed works, should be in ff88-23 (form of item) instead of itype. (ff88-23 is made for Marc21 008 pos 23, which contains the same data as 106a) 200$b should be in Material-type instead of (or in addition to) itype and itemtype: (Material-type :"free-form string, ... that describes the material type of the item, e.g., cassette, kit, computer database, computer file.") 100$a pos 22-24 should not be indexed as "ln" : it is the language of the record, not the language of the resource 4) Index names are too long : if we index new positions of coded fields, with existing names it breaks Zebra indexing (there must be a limit in line length in record.abs?) 5) There are a lot of warnings when rebuilding zebra. 
This patch makes some changes in bib1.att (could be used later to improve search) : - fixing wording for att 51 and 1012 - adding comments for attributes based on MARC21 008 field (8800-8841) - creating 8806 (tpubdate), 8838 (Modified-code), 8818 (ff8-18), 8840 (ff8-18-21), 8819 (ff8-19), 8821 (ff8-21), 8828 (ff8-28), 8830 (ff8-30), 8831 (ff8-31) - creating attributes specific to UNIMARC : 9701-9707 (Video-mt, Graphics-type, Graphics-support, Title-page-availability, Cumulative-index-availability, script-Title, char-encoding) - setting apart 3 blocks of attributes, so it could be easy to make further changes : -- common to Marc21 and UNIMARC : 8806, 8822, 8838 -- slightly different in Marc21 and UNIMARC (different meanings according to the type of the record => don't match a single UNIMARC field) -- specific to UNIMARC : 9701-9707 In ccl.properties : - creating a new index: Country-publication 1=1053 - suppressing some warnings by mapping with bib1 att: Date-time-last-modified, Name, rtype, Music-number - defining indexes using the 3 blocks attributes defined in bib1 (common to Marc21 and UNIMARC, slightly different, specific to UNIMARC) In record.abs : - renaming some index for 100-105-110 fields - correcting indexing of 102$a (country of publication) 106$a (ff88-23) 100$a pos 22-24 (language of record, no more indexed) 105$a pos. 
0-3 (illustration code) 200$b (for the moment, I keep it indexed in itype and itemtype, but also Material-Type) In C4/Search.pm : - adding "Country-publication" index In OPAC and staff interface template subtypes_unimarc.in : - renaming indexes to take into account the changes made to Zebra config files To test (this cannot be done with a sandbox) : 1) Apply the patch in a UNIMARC GRS-1 Koha instance 2) Copy the following files from the etc/zebradb of your source directory into the etc/zebradb of your main Koha directory: -- etc/zebradb/biblios/etc/bib1.att -- etc/zebradb/ccl.properties -- etc/zebradb/marc_defs/unimarc/biblios/record.abs 3) Reindex your data (rebuild_zebra -x -b -r -v) 4) Try to use those Coded fields indexes in Advanced search, in OPAC and Staff interface (available after clicking on "More options", then on "Coded information filters"): Audience, Print, Literary genre, Biography, Illustration, Content, Video Types, Serials, Serial Type, Periodicity, Regularity 5) Try to search "Country-publication=FR" in simple search Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com> No koha-qa errors. Tests for GRS-1 Followed test plan Search by coded fields works, but only on OPAC, on staff there are few options Search by Country-publication works after patch Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-10-10 15:06:10 +00:00
Tomas Cohen Arazi	82e1b22794	Bug 10431 - Redundant mappings removed There were 5 redundant mappings left in the file. They are removed. It's almost a QA followup as everything went fine anyway. Sponsored-by: Universidad Nacional de Córdoba Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-07-05 06:56:44 -07:00
Tomas Cohen Arazi	343c93ea43	Bug 10431 - Spanish Zebra character sorting file This patch provides a definition file for Spanish (es) character sorting in Zebra. It is based on the ideas from Hugo Agud <hagud@orex.es> and Pablo Bianchi <pablo.bianchi@gmail.com>. Makefile.PL is fixed to notice the existence of the 'es' language. The docs for koha-create are touched too. To test: Tarball ======= - Go through the install process, choosing 'es' for the Zebra's language step - Koha should work as usual. - Running this should show the lang definition is properly set. $ grep -R "lang_defs/es" /etc/koha/* (stuff like zebradb/zebra-biblios-dom.cfg:profilePath:...etc/koha/zebra/lang_defs/es... should show) - This file should be present: /etc/koha/zebradb/lang_defs/es/sort-string-utf.chr Packages ======== - Build your own package, it shouldn't break the packaging - Try the new package, using koha-create to set an instance using --lang 'es' Sponsored-by: Universidad Nacional de Córdoba Signed-off-by: Galen Charlton <gmc@esilibrary.com> Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-07-05 06:55:49 -07:00
Magnus Enger	6e4851fa40	Bug 9804 - Fix name for NORMARC biblio-koha-indexdefs.xml When I did bug 8805, I gave the biblio-koha-indexdefs.xml file the wrong name, and called it biblio-zebra-indexdefs.xml. This patch fixes that. To reproduce: - Check that etc/zebradb/marc_defs/normarc/biblios/biblio-zebra-indexdefs.xml exists To test: - Apply the patch and check that etc/zebradb/marc_defs/normarc/biblios/biblio-zebra-indexdefs.xml no longer exists, but that etc/zebradb/marc_defs/normarc/biblios/biblio-koha-indexdefs.xml does exist. Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz> Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>	2013-04-24 09:16:17 -04:00
Magnus Enger	3f7dd2730a	Bug 9213 - Implement analytics for NORMARC XSLT Problem: Links between analytics records were not being displayed for NORMARC setups. What this patch does: 1. Add indexing for 773 subfield a, w and 9; both for GRS-1 and DOM indexing (The DOM indexing config was generated from the GRS-1 record.abs) 2. Add "analytics links" to NORMARC XSLT files, both for OPAC and intranet To test: - Make sure you have a NORMARC installation - Set UseControlNumber = Use - Create a parent record with LDR/07=c. Leave 001 empty. - In the "Normal" view, do New > New child record and create another record. Do this twice (so you get a list of hits when you click on the "Show analytics" links later on). - Do the following steps both in the OPAC and the Intranet: - Search for the parent record in such a way that you can see the record in a result list - Check that the "Show analytics" link is displayed, and uses the title of the parent record for linking: ?q=Host-item:<Title of parent record> - Click on the "Show analytics" link and check that you get a result list with the two child records you created earlier - Go back to the result list and click on the parent record, so you get the detail view - Check that the "Show analytics" link is displayed, and uses the title of the parent record for linking: ?q=Host-item:<Title of parent record> - Click on the "Show analytics" link and check that you get a result list with the two child records you created earlier - Search for one or both of the child records in such a way that you can see the record(s) in a result list - Check that the "In: <Title of parent record>" link is displayed, and that it uses the biblionumber of the parent record for linking: ?q=Control-number:<biblionumber of parent record> - Click on the "In: <Title of parent record>" link, and check that the parent record is displayed - Go back to the result list and click on the child record, so you get the detail view - Check that the "In: <Title of parent 
record>" link is displayed, and that it uses the biblionumber of the parent record for linking: ?q=Control-number:<biblionumber of parent record> - Click on the "In: <Title of parent record>" link, and check that the parent record is displayed - Now edit the parent record and put its biblionumber in 001. Repeat the steps above, and check that everything still works, but that the links are different: - The "Show analytics" link on the parent record should look like this: ?q=rcn:<biblionumber of parent record>+and+(bib-level:a+or+bib-level:b) - The "In: <Title of parent record>" link on the child records should be the same as it was earlier - Now set UseControlNumber = "Don't use" and repeat all of the steps above - All of the links should still be displayed and work, of course - The "In: <Title of parent record>" link on the child records should look like this: ?q=ti,phr:<Title of parent record> - The "Show analytics" link on the parent record should look like this: ?q=Host-item:<Title of parent record> - Change LDR/07 to "s" and repeat all of the steps above - Do all of this both for GRS-1 indexing and for DOM indexing... Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com> Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>	2013-03-20 14:40:47 -04:00
Magnus Enger	be69176982	Bug 9256 - Fix search for the packages See the bug for a description of the problem. This patch tries to restore searching for marcflavour != MARC21 as well as allowing instances with different marcflavors to co-exist on the same server. To test: - Do a package install with e.g. the official squeeze-dev packages and create at least two instances, with different marcflavours, e.g.: sudo koha-create --create-db --marcflavor marc21 test1 sudo koha-create --create-db --marcflavor normarc test2 - Run through the web installers for both instances and add a couple of records to each. Wait for the records to be indexed or run indexing manually with sudo koha-rebuild-zebra -f test1 sudo koha-rebuild-zebra -f test2 - Try searching for the records you added. It should work in test1 but not in test2. - Apply the patch and build packages with the build-git-snapshot script - Install the new koha-common package - Create two instances (because of Bug 9754 it is probably best to give the instances different names than the ones you created above, or to do this on a fresh VM or similar) and add records, as described above. Searching should now work equally well for both instances. Please note: Because of Bug 9752 you will have to set marcflavour = NORMARC by hand before you do the searching, if you choose NORMARC as the marc flavour on one of the instances you create. Please note too: I am not confident that this is the perfect solution, so merciless and thorough testing is necessary! ;-) Signed-off-by: Mirko Tietgen <mirko@abunchofthings.net> Works for me for GRS-1 (package installation out of the box). Could not figure out how to set up DOM indexing and eventually stopped caring about it. Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> Build packages with the patch and checked that creating instances and search within them works for both MARC21 and NORMARC. All tests and QA script pass. 
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>	2013-03-19 19:34:12 -04:00
Jared Camins-Esakov	e56a0a0e62	Bug 8620: Any index in DOM mode sensitive to -x flag of rebuild_zebra.pl The definition of the Any index was sensitive to whether spaces were present between (say) subfield elements in the MARCXML representation of the bib being indexed. When using the -x option to rebuild_zebra.pl, spaces would be present because of how MARC::File::XML emits MARCXML. When not using the -x option, spaces would not be present and the contents of a field would be run together, potentially as one big token. The visible behavior was that doing a keyword search by item barcode would sometimes not work. To test: 0) Make sure Zebra is using DOM mode 1) Create an item record. 2) Reindex using rebuild_zebra.pl -b -z, without -x 3) Do a keyword search by the barcode of the item just added; the search shouldn't work 4) Apply patch. 5) Update the following two files: etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl 6) Reindex 7) Do a search that was previously failing. Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> Fixes the problem for me - formerly not working callnumbers and barcodes are now found in keyword (any) searches. Signed-off-by: Galen Charlton <gmc@esilibrary.com> (revised commit description to better explain why it fixes the problem) Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com> Passes all my tests, happy to sign off Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>	2013-03-07 09:19:43 -05:00
David Cook	eb4ebab07c	Bug 9552 - BIB1 Relation "Greater Than" Attribute Not Mapped Properly in CCL.Properties Currently, you can use "lt,le,eq,ge" in your CCL query to handle "less than, less than or equal to, equal to, greater than or equal to" relationships. The only one missing is "gt" (Bib1 2=5). The mappings are also off for "ne, phonetic, stem", but those are Bib1 attributes that Zebra doesn't support, so that's not really relevant. To test: [1] Before applying the patch, try the following query in the OPAC: pubdate,gt:2006 You should get "no results found". [2] After applying the patch (and note that ccl.properties will usually need to be installed in the run-time Zebra configuration directory), try the same search. This time, you should get back the titles whose publication date is after 2006. Signed-off-by: Galen Charlton <gmc@esilibrary.com> Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>	2013-02-07 00:20:42 -05:00
Mathieu Saby	e86e3c24b8	Bug 8984: make Zebra more UNIMARC compliant This patch makes the following changes to record.abs, biblio-koha-indexdefs.xml and biblio-zebra-indexdefs.xsl : - adding new (sub)fields to Identifier-standard index : 011f/g ; 012a ; 013a/z ; 014a/z ; 015a/z ; 016a/z ; 017a/z, 040a/z, 071z, 072z, 073z - adding 1 new subfield to Publisher index : 071b (may contain the name of a music publisher) - adding new (sub)fields to Author and Identifier-standard index (for the $9) : 716, 72X, 730 - adding new (sub)fields to Note : 334$a (award note) - correcting 207 and 208 - suppressing 308a and 328a in Note (useless as complete fields are indexed in same index) - adding (sub)fields to Title index : 411t, 421-425t, 433-437t, 442-444t, 446-456t, 462-463t, 470-488t, 560 - adding (sub)fields to Subject and Identifier-standard index (for the $9) : 608, 615, 616, 617, 620, 621 - adding some classifications index : 670, 675, 686 - adding some comments (to make easier further modifications and to identify non unimarc fields : 414-420, 603, 630-636, 646) To test : - take a record and fill some of the missing fields (e.g 488t, 608, 720, 012a) with some data as "field488", "field608" etc - try to find the record => not possible - apply the patch, copy the new record.abs in etc/zebradb/biblios/etc and rebuild zebra - try to find the record => should be ok - check nothing else is broken... - same test with DOM indexing activated http://bugs.koha-community.org/show_bug.cgi?id=8984 Signed-off-by: Zeno Tajoli <tajoli@cilea.it> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com> Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>	2013-01-04 08:39:56 -05:00
Fridolyn SOMERS	6e62f58015	Bug 9123: Authorities search ordered by authid does not work Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl> Tested with Zebra, marc21, grs1. Discovered that paging through auth search results does no longer work, but that is not related to these changes. Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> Tested with Zebra, marc21, dom. All tests pass. Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>	2012-12-02 16:28:12 -05:00
Tomas Cohen Arazi	2624a45386	Bug 8750 - Chronological terms authorities not correctly indexed (trivial fix) Patch re-done so it applies, had that double-utf8 problem There was no entry in authority's record.abs for indexing chronological terms. They couldn't be searched and (obviously) linked. I've added those entries using the index names defined in authorities/etc/bib1.att Regards To+ Sponsored-by: Universidad Nacional de Córdoba Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Passed-QA-by: Paul Poulain <paul.poulain@biblibre.com> Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>	2012-11-20 07:08:50 -05:00
Magnus Enger	9270d84c93	Bug 8805 - Add a biblio-zebra-indexdefs.xsl for NORMARC This is required in order for Koha to support DOM indexing of the NORMARC dialect, cf Bug "Bug 7818 - support DOM mode for Zebra indexing of bibliographic records". The two files in this patch were generated from the NORMARC record.abs by doing the steps suggested at the bottom here: http://wiki.koha-community.org/wiki/Switching_to_dom_indexing No manual editing was involved. To test: - Do a fresh install, choosing NORMARC as the MARC dialect - Run rebuild_zebra.pl and check it does not complain about missing files or other things - Check that search works as expected. Using MARC21 records for the testing should be OK. 2012-10-31: New patch after an update to Bug 8665 Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Passed-QA-by: Paul Poulain <paul.poulain@biblibre.com> Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>	2012-11-08 12:35:49 -05:00
Jared Camins-Esakov	510a2397fb	Bug 8665 follow-up: add missing line to XSLT The DOM transformer was missing a line from a previous development, resulting in the MARC21 authorities DOM indexing stylesheet being regenerated with a missing line. This patch readds the missing line to the transformer, and provides the corrected authority-zebra-indexdefs. Signed-off-by: Elliott Davis <elliott@bywatersolutions.com> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-10-29 19:12:41 +01:00
Jared Camins-Esakov	7d9b4d58e3	Bug 8665: DOM indexing fails to index some bib records Use a user-specified field for z:id. This patch also fixes an excess space before the index in the MARC21 biblio index definitions, which someone fixed in the generated file but not in the source file it should have been fixed in. Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Elliott Davis <elliott@bywatersolutions.com>	2012-10-29 19:12:38 +01:00
Frédéric Demians	084367f6cb	Bug 3087 Fix Z39.50 server to return the correct record syntax Modify Makefile.PL and Zebra configuration files in order to parametrize the biblio record type returned by the Zebra Z39.50 server. How to test: - Test with a MARC21 and a UNIMARC DB - Do a new installation - Search from OPAC - Search from a Z39.50 client like yaz-client: syntax = MARC21/UNIMARC must be chosen - It was working for MARC21: it continues to work - It wasn't working for UNIMARC: it works now, both in OPAC and from a Z39.50 client Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl> Works fine for MARC21. Frederic looked at UNIMARC. Magnus looked at NORMARC. GRS1 works okay for me. I still have issues with DOM, but they are not directly related to changes in this patch. A followup is still needed for packaging (debian/templates). Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-10-22 14:12:22 +02:00
Jared Camins-Esakov	91be607586	Bug 7475: Update configuration In order to make matching rules more useful for MARC21 authorities, this patch adds special indexes on previous see-from headings and LCCN. This patch does not change UNIMARC authority configuration in any way. Also modifies the Koha schema in preparation for adding authority import and matching to the Staging tools. To install: 1. Run installer/data/mysql/atomicupdate/importauthorities.pl 2. Update the following four files in your koha-dev: etc/zebradb/authorities/etc/bib1.att etc/zebradb/marc_defs/marc21/authorities/authority-koha-indexdefs.xml etc/zebradb/marc_defs/marc21/authorities/authority-zebra-indexdefs.xsl etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl 3. Reindex your authorities: misc/migration_tools/rebuild_zebra.pl -a -r -v NOTE TO RM: this patch adds an atomicupdate file that needs to be incorporated into updatedatabase.pl if bug 7167 is not pushed. http://bugs.koha-community.org/show_bug.cgi?id=2060 Signed-off-by: Elliott Davis <elliott@bywatersolutions.com> Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Rebased on master 1 August 2012 Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Rebased on master 11 September 2012	2012-09-19 17:15:25 +02:00
Colin Campbell	1e8423167e	Bug 8653 remove erroneous whitespace blocking indexing The superfluous whitespace after the definition of subject tag $9s is causing an error when carried over into DOM config files, so that the authority links fail to index. Also removed the (harmless) trailing space in the equivalent UNIMARC files. A good editor and git can help in not creating excess whitespace. Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-09-14 17:20:34 +02:00
Jared Camins-Esakov	66cee2f590	Bug 8206 follow-up: Add Match index to MARC21 record.abs Although the Match index was correctly configured for UNIMARC authorities and MARC21 authorities indexed with DOM, the Match index was inadvertently removed from the record.abs file for MARC21 authorities at some point. Since the Match index is required to make best use of the new search options, this patch adds it back in. Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>	2012-09-07 15:16:45 +02:00
Marc Veron	9c492a7fae	Bug 7586 - Search: Language restriction does NOT show expected results (no items shown) modified: etc/zebradb/marc_defs/marc21/biblios/record.abs Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-06-10 11:00:14 +02:00
Frédéric Demians	38b375b32c	Bug 7818 Add UNIMARC biblio records zebra DOM def files Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> I tested two UNIMARC Koha installations using the sample UNIMARC data from the BibLibre sandbox, comparing the results with DOM and with GRS-1 indexing. The results are very similar, though there are some differences. Most noticeable: * relevance and facets seem to be more accurate with DOM enabled * the GRS-1 configuration returns approximately 10% more results with random single keywords like "petit," but the DOM results contain the most relevant items, and any lacks in the configuration can easily be corrected as UNIMARC users identify fields that should be indexed but aren't * authority-controlled searches match exactly * author and topic facets do not work with the out-of-the-box GRS-1 indexing configuration (?!?) (adding second sign-off line below because all that probably looks like a commit message and not a sign off) Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-06-09 11:44:16 +02:00
Galen Charlton	1f88669152	Bug 7818: add warning about not editing record.abs when using DOM filter This commit also updates the authority and biblio DOM indexing definition XSL to include updated header comments. Signed-off-by: Galen Charlton <gmc@esilibrary.com> Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-06-09 11:44:14 +02:00
Galen Charlton	79c0158aab	Bug 7818: update comment to clarify availability of DOM index mode DOM indexing is now available for both bibs and authorities. Signed-off-by: Galen Charlton <gmc@esilibrary.com> Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-06-09 11:44:12 +02:00
Galen Charlton	daca5edc52	Bug 7818: -x option of rebuild_zebra.pl now works with DOM filter One consequence is that the -x and -a options are no longer mutually exclusive. Also, because of the way that the GRS-1 SGML filter works, if you're indexing multiple documents, you can't just wrap them in a document element, but the DOM filter requires it. Consequently, two new config settings in koha-conf.xml are added to indicate the Zebra filter in use so that the -x option of rebuild_zebra.pl knows whether to wrap the exported records or not: - bib_index_mode (defaults to 'grs1' if not specified) - auth_index_mode (defaults to 'dom') Signed-off-by: Galen Charlton <gmc@esilibrary.com> Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-06-09 11:44:09 +02:00
Galen Charlton	64680c18b3	Bug 7818: Zebra DOM filter index definitions for MARC21 bibs The file biblio-zebra-indexdefs.xsl, which is the stylesheet that is used by the Zebra DOM filter to convert an incoming MARC21 bib to its indexed form, was generated by the following two steps: misc/maintenance/make_zebra_dom_cfg_from_record_abs \ --input etc/zebradb/marc_defs/marc21/biblios/record.abs \ --output etc/zebradb/marc_defs/marc21/biblios/biblio-koha-indexdefs.xml xsltproc etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl \ etc/zebradb/marc_defs/marc21/biblios/biblio-koha-indexdefs.xml \ > etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl Records indexed using this XSLT should behave similarly to records indexed using the GRS-1 filter and the old record.abs definition, with the following big exception (and improvement): indexed phrases now span subfield boundaries if a specific subfield wasn't specified in the index definition. For example, the GRS-1 filter index definition melm 245 Title would allow 245 $a Cats on boxes : $b cardboard fantasies to be searched as the phrases "cats on boxes" or "cardboard fantasies", but a title phrase search of "cats on boxes cardboard fantasies" wouldn't work. The DOM filter equivalent, <index_data_field xmlns="http://www.koha-community.org/schemas/index-defs" tag="245"> <target_index>Title:w</target_index> <target_index>Title:p</target_index> </index_data_field> does allow phrase searches to span subfield boundaries. Signed-off-by: Galen Charlton <gmc@esilibrary.com> Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-06-09 11:44:06 +02:00
Galen Charlton	76378ed202	Bug 7818: add index_data_field option to DOM indexing repertoire Adds a new kohaidx:index_data_field index definition type which indexes all of the subfields of a MARC data field as a single phrase, separating the contents of each with a space. Signed-off-by: Galen Charlton <gmc@esilibrary.com> Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-06-09 11:44:04 +02:00
Galen Charlton	e660c70b82	Bug 7818: move koha-indexdefs-to-zebra.xsl Since the koha-indexdefs-to-zebra.xsl stylesheet will be used by both bib and authority indexing, put in a central location. Signed-off-by: Galen Charlton <gmc@esilibrary.com> Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-06-09 11:44:03 +02:00
Galen Charlton	f50d433781	Bug 7818: update installer for biblio DOM indexing Adds the necessary bits to enable DOM indexing for bib records as an option during installation from source. Signed-off-by: Galen Charlton <gmc@esilibrary.com> Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-06-09 11:43:56 +02:00
Serhij Dubyk {Сергій Дубик}	aec4ba8985	Bug 7838 - Add sort-string-utf.chr for Ukrainian and Russian Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-04-12 17:23:53 +02:00
Jared Camins-Esakov	937480abe0	Bug 7617: Sort authority results by authid Add the option of sorting authority search results by authid, and instruct the FirstMatch and LastMatch linkers to use that sort order rather than the default search order. To test: 1. Install new Zebra authorities config etc/zebradb/marc_defs/marc21/authorities/authority-koha-indexdefs.xml, etc/zebradb/marc_defs/marc21/authorities/authority-zebra-indexdefs.xsl, etc/zebradb/marc_defs/marc21/authorities/record.abs, and etc/zebradb/marc_defs/unimarc/authorities/record.abs 2. Reindex authorities in Zebra 3. Set LinkerModule to FirstMatch or LastMatch 4. Add two identical authority records, and a bib record with a heading that matches them 5. Run misc/link_bibs_to_authorities.pl on that record 6. Confirm that the authid that's been inserted into subfield $9 of that heading is the first, if you selected FirstMatch, or last if you selected LastMatch Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> I followed the test plan and checked that for "Last match" and "First match" the correct authority was selected and linked to the record. Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-03-29 11:04:58 +02:00
Magnus Enger	448dbe2df5	Bug 7537 - Implement TraceCompleteSubfields, TraceSubjectSubdivisions and UseICU for NORMARC XSLT IMPORTANT! This patch relies on the patch for Bug 7092, which is now pushed to master. As the title says, this patch implements TraceCompleteSubfields, TraceSubjectSubdivisions and UseICU for NORMARC XSLT, both for the OPAC and the Intranet. This affects how clickable subject-links are constructed. To make this work the indexing of MARC fields in the 600 range is changed to include "Subject:p" in several new places. To test: Find a record with a "complex" subject, like "Internet -- Law and legislation". MARC21 and NORMARC are very similar in how they handle subjects, so testing on a MARC21 database should be OK. (Changes in indexing reflect changes already made to the MARC21 indexing.) Make sure you have these syspref settings: - marcflavour = NORMARC - XSLTDetailsDisplay = using XSLT stylesheets - OPACXSLTDetailsDisplay = using XSLT stylesheets (Ideally, testing should be done on a real NORMARC setup, but since the changes to indexing only reflect how it's already done in MARC21, I think testing on a MARC21 installation with marcflavour = NORMARC should be OK.) Now try the different combinations of TraceCompleteSubfields, TraceSubjectSubdivisions and UseICU, and check the format of the clickable links, both in the OPAC and staff client. Here's what you should be seeing: 1. TraceCompleteSubfields = Don't force TraceSubjectSubdivisions = Don't include UseICU = Not using opac-search.pl?q=su:"Internet" UseICU = Using opac-search.pl?q=su:{Internet} 2. TraceCompleteSubfields = Force TraceSubjectSubdivisions = Don't include UseICU = Not using opac-search.pl?q=su,complete-subfield:"Internet" UseICU = Using opac-search.pl?q=su,complete-subfield:{Internet} 3. 
TraceCompleteSubfields = Don't force TraceSubjectSubdivisions = Include UseICU = Not using opac-search.pl?q=(su:"Internet") AND (su:"Law and legislation.") UseICU = Using opac-search.pl?q=(su:{Internet}) AND (su:{Law and legislation.}) 4. TraceCompleteSubfields = Force TraceSubjectSubdivisions = Include UseICU = Not using opac-search.pl?q=(su,complete-subfield:"Internet") AND (su,complete-subfield:"Law and legislation.") UseICU = Using opac-search.pl?q=(su,complete-subfield:{Internet}) AND (su,complete-subfield:{Law and legislation.}) UPDATE 2012-03-23 - Change the syspref TracingQuotes to UseICU, see bug 7092 - Change boolean operator from "and" to "AND", see bug 7695 Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> Note: UseControlnumber must be turned off. 1) Works. 2) Works. 3) Works. 4) Works. Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-03-29 11:00:32 +02:00
Frédéric Demians	a0316d4d27	Bug 7698: Add CHR/ICU Zebra tokenization choice to installation Word search with multi-part facets works properly only with Zebra ICU tokenization. This patch adds a new question to the Koha command line installer: Zebra has two methods to perform records tokenization and characters normalization: CHR and ICU. ICU is recommended for catalogs containing non-Latin characters. (chr, icu) [chr] How to test: - perl ./Makefile.PL - Try each possible value for the new parameter - Take a look at the zebradb/etc/default.idx file. Depending on the parameter you get this line: icuchain words-icu.xml or this one: charmap word-phrase-utf.chr Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> (Note: This patch was previously associated with bug 3216; I moved it to a separate bug because including ICU is a good idea independent of the fix for the particular issue described in bug 3216) Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-03-13 16:08:04 +01:00
Jared Camins-Esakov	7343023a6d	Bug 7284: Improve UNIMARC Zebra configuration Add the Match-heading and Match-heading-see-from indexes to the UNIMARC Zebra configuration. Signed-off-by: Paul Poulain <paul.poulain@biblibre.com> Tested with an UNIMARC setup that things work fine. They do	2012-03-07 22:20:58 +01:00
Jared Camins-Esakov	5207699f98	signed off Bug 7284: Authority matching improvements Squashed patch incorporating all previous patches (there is no functional change compared to the previous version of this patch, this patch merely squashes the original patch and follow-up, and rebases on latest master). === TL;DR VERSION === * Installation * 1. Run installer/data/mysql/atomicupdate/bug_7284_authority_linking_pt1 and installer/data/mysql/atomicupdate/bug_7284_authority_linking_pt2 2. Make sure you copy the following files from kohaclone to koha-dev: etc/zebradb/authorities/etc/bib1.att, etc/zebradb/marc_defs/marc21/authorities/authority-koha-indexdefs.xml, etc/zebradb/marc_defs/marc21/authorities/authority-zebra-indexdefs.xsl, etc/zebradb/marc_defs/marc21/authorities/koha-indexdefs-to-zebra.xsl, and etc/zebradb/marc_defs/unimarc/authorities/record.abs 3. Run misc/migration_tools/rebuild_zebra.pl -a -r * New sysprefs * * AutoCreateAuthorities * CatalogModuleRelink * LinkerModule * LinkerOptions * LinkerRelink * LinkerKeepStale * Important notes * You must have rebuild_zebra processing the zebraqueue for bibs when testing this patch. === DESCRIPTION === * Cataloging module * * Added an additional box to the authority finder plugin for "Heading match," which consults not just the main entry but also See-from and See-also-from headings. * With this patch, the automatic authority linking will actually work properly in the cataloging module. As Owen pointed out while testing the patch, though, longtime users of Koha will not be expecting that. In keeping with the principles of least surprise and maximum configurability, a new syspref, CatalogModuleRelink makes it possible to disable authority relinking in the cataloging module only (i.e. leaving it enabled for future runs of link_bibs_to_authorities.pl). Note that though the default behavior matches the current behavior of Koha, it does not match the intended behavior. 
Libraries that want the intended behavior rather than the current behavior will need to adjust the CatalogModuleRelink syspref. * misc/link_bibs_to_authorities.pl * Added the following options to the misc/link_bibs_to_authorities.pl script: --auth-limit Only process those headings that match the authorities matching the user-specified WHERE clause. --bib-limit Only process those bib records that match the user-specified WHERE clause. --commit Commit the results to the database after every N records are processed. --link-report Display a report of all the headings that were processed. Converted misc/link_bibs_to_authorities.pl to use POD. Added a detailed report of headings that linked, did not link, and linked in a "fuzzy" fashion (the exact semantics of fuzzy are up to the individual linker modules) during the run. * C4::Linker * Implemented new C4::Linker functionality to make it possible to easily add custom authority linker algorithms. Currently available linker options are: * Default: retains the current behavior of only creating links when there is an exact match to one and only one authority record; if the 'broader_headings' option is enabled, it will try to link to headings to authority records for broader headings by removing subfields from the end of the heading (NOTE: test the results before enabling broader_headings in a production system because its usefulness is very much dependent on individual sites' authority files) * First Match: based on Default, creates a link to the first authority record that matches a given heading, even if there is more than one authority record that matches * Last Match: based on Default, creates a link to the last authority record that matches a given heading, even if there is more than one record that matches The API for linker modules is very simple. 
All modules should implement the following two functions: <get_link ($field)> - return the authid for the authority that should be linked to the provided MARC::Field object, and a boolean to indicate whether the match is "fuzzy" (the semantics of "fuzzy" are up to the individual plugin). In order to handle authority limits, get_link should always end with: return $self->SUPER::_handle_auth_limit($authid), $fuzzy; <flip_heading ($field)> - return a MARC::Field object with the heading flipped to the preferred form. At present this routine is not used, and can be a stub. Made the linking functionality use the SearchAuthorities in C4::AuthoritiesMarc rather than SimpleSearch in C4::Search. Once C4::Search has been refactored, SearchAuthorities should be rewritten to simply call into C4::Search. However, at this time C4::Search cannot handle authority searching. Also fixed numerous performance issues in SearchAuthorities and the Linker script: * Correctly destroy ZOOM recordsets in SearchAuthorities when finished. If left undestroyed, efficiency appears to approach O(log n^n) * Add an optional $skipmetadata flag to SearchAuthorities that can be used to avoid additional calls into Zebra when all that is wanted are authority records and not statistics about their use * New sysprefs * * AutoCreateAuthorities - When this and BiblioAddsAuthorities are both turned on, automatically create authority records for headings that don't have any authority link when cataloging. When BiblioAddsAuthorities is on and AutoCreateAuthorities is turned off, do not automatically generate authority records, but allow the user to enter headings that don't match an existing authority. When BiblioAddsAuthorities is off, this has no effect. * CatalogModuleRelink - when turned on, the automatic linker will relink headings when a record is saved in the cataloging module when LinkerRelink is turned on, even if the headings were manually linked to a different authority by the cataloger. 
When turned off (the default), the automatic linker will not relink any headings that have already been linked when a record is saved. * LinkerModule - Chooses which linker module to use for matching headings (current options are as described above in the section on linker options: "Default," "FirstMatch," and "LastMatch") * LinkerOptions - A pipe-separated list of options to set for the authority linker (at the moment, the only option available is "broader_headings," which is described below) * LinkerRelink - When turned on, the linker will confirm the links for headings that have previously been linked to an authority record when it runs. When turned off, any heading with an existing link will be ignored. * LinkerKeepStale - When turned on, the linker will never delete a link to an authority record, though, depending on the value of LinkerRelink, it may change the link. * Other changes * * Cleaned up authorities code by removing unused functions and adding unimplemented functions and added some unit tests. * This patch also modifies the authority indexing to remove trailing punctuation from Match indexes. * Replace the old BiblioAddAuthorities subroutines with calls into the new C4::Linker routines. * Add a simple implementation for C4::Heading::UNIMARC. (With thanks to F. Demians, 2011.01.09) Correct C4::Heading::UNIMARC class loading. Create biblio tag to authority types data structure at initialization rather than querying DB. * Ran perltidy on all changed code. * Linker Options * Enter "broader_headings" in LinkerOptions. With this option, the linker will try to match the following heading as follows: =600 10$aCamins-Esakov, Jared$xCoin collections$vCatalogs$vEarly works to 1800. First: Camins-Esakov, Jared--Coin collections--Catalogs--Early works to 1800 Next: Camins-Esakov, Jared--Coin collections--Catalogs Next: Camins-Esakov, Jared--Coin collections Next: Camins-Esakov, Jared (matches! 
if a previous attempt had matched, it would not have tried this) This is probably relevant only to MARC21 and LCSH, but could potentially be of great use to libraries that make heavy use of floating subdivisions. === TESTING PLAN === Note: all of these tests require that you have some authority records, preferably for headings that actually appear in your bibliographic data. At least one authority record must contain a "see from" reference (remember which one contains this, as you'll need it for some of the tests). The number shown in the "Used in" column in the authority module is populated using Zebra searches of the bibliographic database, so you must have rebuild_zebra.pl -b -z [-x] running in cron, or manually run it after running the linker. * Testing the Heading match in the cataloging plugin * 1. Create a new record, and open the cataloging plugin for an authority-controlled field. 2. Search for an authority by entering the "see from" term in the Heading Match box 3. Confirm that the appropriate heading shows up 4. Search for an authority by entering the preferred heading into the Main entry or Main entry ($a only) box (i.e., repeat the procedure you usually use for cataloging, whatever that may be) 5. Confirm that the appropriate heading shows up * Testing the cataloging interface * 6. Turn off BiblioAddsAuthorities 7. Confirm that you cannot enter text directly in an authority-controlled field 8. Confirm that if you search for a heading using the authority control plugin the heading is inserted (note, however, that this patch does not AND IS NOT INTENDED TO fix the bugs in the authority plugin with duplicate subfields; those are wholly out of scope- this check is for regressions) 9. Turn on BiblioAddsAuthorities and AutoCreateAuthorities 10. 
Confirm that you can enter text directly into an authority-controlled field, and if you enter a heading that doesn't currently have an authority record, an authority record stub is automatically created, and the heading you entered linked 11. Confirm that if you enter a heading with only a subfield $a that fully matches an existing heading (i.e. the existing heading has only subfield $a populated), the authid for that heading is inserted into subfield $9 12. Confirm that if you enter a heading with multiple subfields that matches an existing heading, the authid for that heading is inserted into subfield $9 13. Turn on BiblioAddsAuthorities and turn off AutoCreateAuthorities 14. Confirm that you can enter text directly into an authority-controlled field, and if you enter a heading that doesn't currently have an authority record, an authority record stub is not created 15. Confirm that if you enter a heading with only a subfield $a that matches an existing heading, the authid for that heading is inserted into subfield $9 16. Confirm that if you enter a heading with multiple subfields that matches an existing heading, the authid for that heading is inserted into subfield $9 17. Create a record and link an authority record to an authorized field using the authority plugin. 18. Save the record. Ensure that the heading is linked to the appropriate authority. 19. Open the record. Change the heading manually to something else, leaving the link. Save the record. 20. Ensure that the heading remains linked to that same authority. 21. Change CatalogModuleRelink to "on." 22. Open the record. Use the authority plugin to link that heading to the same authority record you did earlier. 23. Save the record. Ensure that the heading is linked to the appropriate authority. 24. Open the record. Change the heading manually to something else, leaving the link. Save the record. 25. Ensure that the heading is no longer linked to the old authority record. 
* Testing link_bibs_to_authorities.pl * 26. Set LinkerModule to "Default," turn on LinkerRelink and BiblioAddsAuthorities, and turn AutoCreateAuthorities and LinkerKeepStale off 27. Edit one bib record so that an authority controlled field that has already been linked (i.e. has data in $9) has a heading that does not match any authority record in your database 28. Run misc/link_bibs_to_authorities.pl --link-report --verbose --test (you may want to pipe the output into less or a file, as the result is quite a lot of information) 29. Look over the report to see if the headings that you have authority records for report being matched, that the heading you modified in step 27 is reported as "unlinked," and confirm that no changes were actually made to the database (to check this, look at the bib record you edited earlier, and check that the authid in the field you edited hasn't changed) 30. Run misc/link_bibs_to_authorities.pl --link-report --verbose (you may want to pipe the output into less or a file, as the result is quite a lot of information) 31. Check that the heading you modified has been unlinked 32. Change the modified heading back to whatever it was, but don't use the authority control plugin to populate $9 33. Run misc/link_bibs_to_authorities.pl --link-report --verbose --bib-limit="biblionumber=${BIB}" (replacing ${BIB} with the biblionumber of the record you've been editing) 34. Confirm that the heading has been linked to the correct authority record 35. Turn LinkerKeepStale on 36. Change that heading to something else 37. Run misc/link_bibs_to_authorities.pl --link-report --verbose --bib-limit="biblionumber=${BIB}" (replacing ${BIB} with the biblionumber of the record you've been editing) 38. Confirm that the $9 has not changed 39. Turn LinkerKeepStale off 40. Create two authorities with the same heading 41. Run misc/migration_tools/rebuild_zebra.pl -a -z 42. Enter that heading into the bibliographic record you are working with 43. 
Run misc/link_bibs_to_authorities.pl --link-report --verbose --bib-limit="biblionumber=${BIB}" (replacing ${BIB} with the biblionumber of the record you've been editing) 44. Confirm that the heading has not been linked 45. Change LinkerModule to "FirstMatch" 46. Run misc/link_bibs_to_authorities.pl --link-report --verbose --bib-limit="biblionumber=${BIB}" (replacing ${BIB} with the biblionumber of the record you've been editing) 47. Confirm that the heading has been linked to the first authority record it matches 48. Change LinkerModule to "LastMatch" 49. Run misc/link_bibs_to_authorities.pl --link-report --verbose --bib-limit="biblionumber=${BIB}" (replacing ${BIB} with the biblionumber of the record you've been editing) 50. Confirm that the heading has been linked to the second authority record it matches 51. Run misc/link_bibs_to_authorities.pl --link-report --verbose --auth-limit="authid=${AUTH}" (replacing ${AUTH} with an authid) 52. Confirm that only that heading is displayed in the report, and only those bibs with that heading have been changed If all those things worked, good news! You're ready to sign off on the patch for bug 7284. Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Rebased on latest master and squashed follow-up, 16 February 2012 Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Rebased on latest master, 21 February 2012 Signed-off-by: schuster <dschust1@gmail.com>	2012-03-07 17:34:11 +01:00
Magnus Enger	5f2e1ba7b1	Bug 7552 - Remove wrong line endings in NORMARC record.abs Line endings contain erroneous \r characters. Also remove a useless comment at the top of the file. This patch was produced by doing the following operations: git config --global core.autocrlf true git rm --cached -r etc/zebradb/marc_defs/normarc/biblios/record.abs git diff --cached --name-only -z | xargs -0 git add as recommended here: http://help.github.com/line-endings/ First version of this file resulted in whitespace errors. Trying to fix that now. To test: - Open etc/zebradb/marc_defs/normarc/biblios/record.abs in a file editor that will let you search for \r - gedit seems to work nicely for this. Check that there are occurrences of \r in the file - Apply the patch - Open etc/zebradb/marc_defs/normarc/biblios/record.abs in the editor and check that it can not find any \r Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Still a few \r But only on comments, safe to push Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-02-21 17:37:35 +01:00
Janusz Kaczmarek	c1e47b9359	Error in records.abs for marc21/biblios http://bugs.koha-community.org/show_bug.cgi?id=7502 Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-02-06 21:51:48 +01:00
MJ Ray	d4b132136c	Bug 7476 Remove executable bit from files that probably should not be executed Signed-off-by: Aleksa Vujicic <aleksa@catalyst.net.nz> Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl> Amended to replace some copy-and-paste comments only with consent of MJR. Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-02-03 14:22:15 +01:00
Marc Balmer	c9c6bbdea8	Bug 7356 - Fix various typos and mis-spellings Fix typos: the the -> the, wether -> whether, developper -> developer. http://bugs.koha-community.org/show_bug.cgi?id=7356 Signed-off-by: Owen Leonard <oleonard@myacpl.org> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-01-13 11:51:26 +01:00
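The bug 7284 commit message above describes three linker behaviors (Default, FirstMatch, LastMatch) and the "broader_headings" fallback, which retries a heading after dropping subfields from the end. The following is a minimal Python sketch of that logic, not Koha's Perl implementation. The names `heading_forms`, `pick`, `get_link`, and `search`, the in-memory `AUTHORITIES` table with its authids, and the choice to flag a broader match as "fuzzy" are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Subfield = Tuple[str, str]  # (code, value), e.g. ("a", "Camins-Esakov, Jared")


def heading_forms(subfields: Sequence[Subfield]) -> List[str]:
    """Return the full heading, then broader forms made by dropping
    subfields from the end. Trailing periods are stripped (a simplification)."""
    values = [value.rstrip(" .") for _code, value in subfields]
    return ["--".join(values[:n]) for n in range(len(values), 0, -1)]


def pick(matches: Sequence[int], policy: str) -> Optional[int]:
    """Choose an authid from the search results according to the linker policy."""
    if policy == "Default":
        # Link only when exactly one authority record matches.
        return matches[0] if len(matches) == 1 else None
    if policy == "FirstMatch":
        return matches[0] if matches else None
    if policy == "LastMatch":
        return matches[-1] if matches else None
    raise ValueError(f"unknown linker policy: {policy}")


def get_link(
    subfields: Sequence[Subfield],
    search: Callable[[str], Sequence[int]],
    policy: str = "Default",
    broader_headings: bool = False,
) -> Tuple[Optional[int], bool]:
    """Return (authid, fuzzy). authid is None when no link is made.
    Assumption: a match on a broader form counts as fuzzy."""
    forms = heading_forms(subfields)
    if not broader_headings:
        forms = forms[:1]  # only try the heading exactly as entered
    for depth, form in enumerate(forms):
        authid = pick(search(form), policy)
        if authid is not None:
            return authid, depth > 0
    return None, False


# Hypothetical authority file: heading text -> list of authids.
AUTHORITIES = {
    "Camins-Esakov, Jared": [101],
    "Coin collections": [202, 203],  # two authorities with the same heading
}


def search(form: str) -> Sequence[int]:
    return AUTHORITIES.get(form, [])


heading = [
    ("a", "Camins-Esakov, Jared"),
    ("x", "Coin collections"),
    ("v", "Catalogs"),
    ("v", "Early works to 1800."),
]

for form in heading_forms(heading):
    print(form)
print(get_link(heading, search, broader_headings=True))
```

With this sample data, the heading from the commit message links only once it has been shortened to "Camins-Esakov, Jared". The duplicated "Coin collections" heading stays unlinked under Default and resolves to the first or last authid under FirstMatch or LastMatch, which mirrors steps 40 to 50 of the testing plan.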

310 commits