Koha-community/Koha - Koha: The world's first free and open source library system

Author	SHA1	Message	Date
Tomas Cohen Arazi	e95cd1b126	Bug 11232: (followup) remove unnecessary namespace definition from all XML elements The previous patches for facet extraction from Zebra indexes set a default namespace on the following files: etc/zebradb/marc_defs/marc21/biblios/biblio-koha-indexdefs.xml etc/zebradb/marc_defs/normarc/biblios/biblio-koha-indexdefs.xml etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml and hence the XML file index_subfields can be cleaned by removing the namespace. To test: - Apply this patch - Run $ for i in marc21 normarc unimarc; do xsltproc etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl etc/zebradb/marc_defs/$i/biblios/biblio-koha-indexdefs.xml > etc/zebradb/marc_defs/$i/biblios/biblio-zebra-indexdefs.xsl; done => SUCCESS: no errors reported - Run $ git diff => SUCCESS: no differences on the xsl files - Sign off :-D Sponsored-by: Universidad Nacional de Cordoba Signed-off-by: David Cook <dcook@prosentient.com.au> Seems to work with DOM and MARC21. Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-10-15 12:55:44 -03:00
Tomas Cohen Arazi	eafeb34097	Bug 11232: UNIMARC facet definition and updated XSL file for DOM This patch adds the facets definitions to the biblio-koha-indexdefs.xml, based on what is hardcoded on C4::Koha::getFacets(). The biblio-zebra-indexdefs.xsl file for UNIMARC is generated using the usual: xsltproc ...koha-indexdefs-to-zebra.xsl ...unimarc/biblios/biblio-koha-indexdefs.xml > \ ...unimarc/biblios/biblio-zebra-indexdefs.xsl Sponsored-by: Universidad Nacional de Cordoba Signed-off-by: David Cook <dcook@prosentient.com.au> Seems to work with DOM and MARC21. Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-10-15 12:55:38 -03:00
Fridolin Somers	95adc7a1f4	Bug 12453 - Do not use by default Host-Item-Number in UNIMARC Currently, in a default UNIMARC install, 461$9 is indexed as Host-Item-Number, meaning it is used for the analytical itemnumber. But most UNIMARC catalogs build the analytical relation with the unimarc_field_4XX.pl plugin on 461$a. In fact, this plugin is defined in the default UNIMARC frameworks. If Host-Item-Number is defined but 461$9 is used for something else, it will lead to odd bugs. For example, records containing analytical items cannot be deleted. This patch comments out the 461$9 indexing in the UNIMARC zebra config. Test plan : - Create a fresh UNIMARC install - Create a record with 461$9 containing a value - Index the record - Perform a search on Host-Item-Number : ccl=Host-Item-Number,alwaysmatches='' => Without the patch you get a result => With the patch you get no result Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> Code is clean, commenting out all the indexing of 461$9. Trusting the author that this is the correct thing to do :) Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-08-24 12:32:30 -03:00
Mathieu Saby	b6118db2f5	Bug 11202: Improve UNIMARC biblio indexing This patch makes the following changes to UNIMARC biblio indexing : A. Changes to UNIMARC conf files 1. add comments to biblio-koha-indexdefs.xml 2. make biblio-koha-indexdefs.xml more compact by grouping some declarations Ex : 200$f and 200$g => one declaration for 200$fg 3. suppress unneeded declarations (indexing of some 4XX fields and 6XX fields not in unimarc format) 4. unindex some (sub)fields unneeded by most users (318, 207,230,210a, 215, 4XXd) 5. change the way 308 field is indexed (no visible changes) 6. replace Title-host with Host-item -- see bug 11119 7. index 208 in Material-Type -- see bug 11119 8. index 100 pos 8-9 and 9-12 in pubdate:y and pubdate:n 9. index 100 pos 8-9 in pubdate:s instead of 210$d 10. Index all subfields of note 334 and 327 in note index 11. Index 304 and 327 in title index as well as note index: 327 can contain a list of titles included in a work; 304 can contain the title of the original work in case of a translation 12. Index 314 in author index as well as note index: 314 can contain authors not mentioned in 200$f/g (the 4th, 5th etc. author) 13. Index 328 note in Dissertation-information as well as note 14. Index 328$t in Title B. Changes to ccl.properties : 1. add a new index Dissertation-information (1056) 2. fix EAN, pubdate and acqdate (they were not linked with bib1 attributes) C. Changes to Search.pm 1. add Dissertation-information and suppress Title-host and UPC D. Changes to QP config file queryparser.yaml 1. add Dissertation-information 2. fix EAN, pubdate and acqdate Test plan : If you cannot test in GRS1, test only in DOM, as GRS will be deprecated. 1. Apply the patch in a UNIMARC Koha running with DOM and ICU 2. copy src/etc/searchengine/queryparser.yaml into the main config directory of QP 3. copy src/etc/zebradb/ccl.properties into the main config directory of Zebra 4. copy src/etc/zebradb/marc_defs/unimarc/biblio/* into the main config directory of Zebra 5. reindex biblios (rebuild_zebra.pl -r -b -x -v) 6. test note index : make some searches on 334$b or 327$b 7. test author index : make some searches on 314 field 8. test title index : make some searches on 304 and 327 field, make a search on 328$t subfield 9. test dissertation-information index : make some searches on 328 field 10. In a record, put in the dates of 100 fields the values "1000" (1st date) and "1001" (2nd date) ; try to search a book written in year 1000, you should find the record ; idem for year 1001 11. make some searches and sort by date. It should work better than before, especially if you have values like "c2009" or "impr. 2010" in 210 field 12. Regression test : make some searches on several indexes, like EAN, etc. It should work as before Test 10-12 with and without Queryparser activated. Be careful: with Queryparser activated, the index names (title, dissertation-information...) must be entered in lowercase only. Of course, to test search and sort by dates, you need to have full records, with dates in 100 field as well as 210 field. Signed-off-by: Paola Rossi <paola.rossi@cineca.it> Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2014-02-19 21:01:15 +00:00
Jonathan Druart	a573ac1fa8	Bug 9940: (follow-up) FIX comment: language-original is 101$c, not $h Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-12-25 15:41:31 +00:00
Mathieu Saby	451f67c055	Bug 9940: Add a new index for the original language of a document It could be useful to index the original language of a document (e.g. "fre" for the English translation of a French novel). This patch renames the Bib-1 use attribute 1095 from Code-language-original to language-original and uses it to index: - MARC21 041$h subfield - UNIMARC 101$c subfield It adds "language-original" to the list of indexes in Search.pm. Test plan : A. in a MARC21 GRS1 environment 1. Copy Zebra config files (zebradb/biblios/etc/bib1.att, zebradb/ccl.properties, marc_defs/marc21/biblios/record.abs) from your source etc/ directory to your main koha etc/ directory 2. Reindex zebra 3. Make some searches, like "language-original:fre" B. in a MARC21 DOM environment 4. Copy Zebra config files (zebradb/biblios/etc/bib1.att, zebradb/ccl.properties, marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl) from your source etc/ directory to your main koha etc/ directory 5. Reindex zebra 6. Make some searches, like "language-original:fre" C. in a UNIMARC GRS1 environment 7. Copy Zebra config files (zebradb/biblios/etc/bib1.att, zebradb/ccl.properties, marc_defs/unimarc/biblios/record.abs) from your source etc/ directory to your main koha etc/ directory 8. Reindex zebra 9. Make some searches, like "language-original:fre" D. in a UNIMARC DOM environment 10. Copy Zebra config files (zebradb/biblios/etc/bib1.att, zebradb/ccl.properties, marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl) from your source etc/ directory to your main koha etc/ directory 11. Reindex zebra 12. Make some searches, like "language-original:fre" Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz> Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-12-25 15:37:14 +00:00
Mathieu Saby	c00131e0ff	Bug 9830: Fix some indexes in UNIMARC item indexing With this combination of sysprefs, and a UNIMARC configuration, it was impossible to search on location, barcode and ccode indexes : QueryWeightFields is activated QueryAutoTruncate only if * is added But in UNIMARC, location, barcode and ccode (995 $e,$f,$h) were indexed only as "words". They need to be indexed also as "phrase". Additionally, in UNIMARC, information about damaged and withdrawn status of items is not indexed, while it is done in MARC21. This patch - adds 2 new indexes for 995$1 (damaged) and 995$3 (withdrawn) - indexes location, barcode and ccode as "phrase" as well as "words" Indexing of items in UNIMARC could be improved later. So this patch also adds comments explaining the origin of Koha 995, I think it could be useful for further changes. To test, on a UNIMARC configuration : A. indexed with GRS-1 1) Set sysprefs QueryWeightFields as "activated" and QueryAutoTruncate as "only if * is added" 2) Select location index in advanced search and search for a value existing in your records in 995$e => 0 results 3) Apply patch 4) Rebuild zebra 5) Select location index in advanced search and search for a value existing in your records in 995$e => x results 6) Mark an item as withdrawn; search "withdrawn:1" => x results, and among them the biblio to which the item is attached 7) Mark an item as damaged ; search "damaged:1" => x results, and among them the biblio to which the item is attached B. indexed with DOM Do the same operations Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com> Work as described. No koha-qa errors Test Apply the patch Begin with GRS-1 Full reindex Search by location, no results cp files biblio-*-indexdefs.xml and record.abs to destination on etc/zebra Full reindex Search by location, got results Switch to DOM reset files Full reindex Search by location, no results cp files Full reindex Search by location, results ! Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-10-21 15:38:49 +00:00
Mathieu Saby	5298140c67	Bug 10037: fix item index in UNIMARC DOM indexing In UNIMARC DOM indexing, "item" index was working only for subfields of 995 field mapped with specific indexes, and also in index (ex : $a, $b...). It was not working for the other subfields (ex : $g), because a comment from record.abs was integrated in DOM config files. This patch removes the comment. To test, in a DOM UNIMARC environment : 1) In an item, write some value "Test10037" in 995$g 2) Search for this value in simple search, this way : item=Test10037 => you should have no results 3) Apply the patch. If necessary, copy the modified etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml and etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl into the /etc/... directory in your main Koha directory 4) Reindex Zebra biblios 5) Do the same search as 2) => you should have one result Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com> Work as described. No koha-qa errors. Test NOTE: default UNIMARC framework doesn't have 995g, so I must add it first. 1) Added test string to 995b on some record 2) Reindex and search as indicated, no results 3) cp files to destination 4) reindex 5) search and result ok ! Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-10-10 18:54:12 +00:00
Mathieu Saby	43809d2835	Bug 8252: Followup for Date/time-last-modified and Music number This followup restores the original wording of "Date/time-last-modified" index, and changes the name of "Music-number" index to "Number-music-publisher" To test : 1. In a UNIMARC Koha instance 2. Apply patches #1, #2 and this followup 3. Copy from src/etc/zebradb directory to the etc/zebradb/ in your main Koha directory the following files: -- zebradb/biblios/etc/bib1.att -- zebradb/ccl.properties -- zebradb/marc_defs/unimarc/biblios/record.abs -- zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml -- zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl 4. Rebuild zebra with -b -x -v -r options 5. Write a value like "test071a" in 071$a field in a record 6. Check if you can find this record with this search: "ccl=Number-music-publisher:test071a" Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com> No koha-qa errors. Test Copy files reindex full Modify a couple of record to add 071a with test message Reindex -v -z -b -x Search test message as described and found modified records. Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-10-10 15:16:27 +00:00
Mathieu Saby	8034566027	Bug 8252: Fix indexing of UNIMARC 1xx for DOM This patch makes the same changes in UNIMARC DOM configuration as patch 1 made for GRS-1. Positions of subfields are indexed that way : In biblio-koha-indexdefs.xml : tag="100" subfields="a" offset="17" length="1" In biblio-zebra-indexdefs.xsl : xslo:value-of select="substring(., 17, 1)" I had to edit biblio-zebra-indexdefs.xsl by hand, because etc/zebradb/xml/koha-indexdefs-to-zebra.xsl only supports "substring" in the handle-one-index-control-field template. It is good for MARC21, but not for UNIMARC : in MARC21, indexing substrings is needed for control fields (001-009, with no subfields) But in UNIMARC it is needed for subfields of 1XX fields. So if DOM indexing is working with these new files, we may need to change koha-indexdefs-to-zebra.xsl. Test plan (not possible in a sandbox) : 1) In a Koha instance using UNIMARC and DOM indexing 2) Apply Patch 1 and Patch 2 (this one) 3) Copy the following files from the etc/zebradb directory of your source into the etc/zebradb directory of your main Koha directory : -- etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml -- etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl -- etc/zebradb/ccl.properties -- etc/zebradb/biblios/etc/bib1.att 4) rebuild zebra with -x -b -r -v options 5) check if coded filters in advanced search are usable in OPAC and Staff interface Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com> Works. No koha-qa errors. Test for DOM Apply patches Don't forget to copy files reindex Search by coded fields works, also Country-publication Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-10-10 15:15:04 +00:00
Mathieu Saby	e86e3c24b8	Bug 8984: make Zebra more UNIMARC compliant This patch makes the following changes to record.abs, biblio-koha-indexdefs.xml and biblio-zebra-indexdefs.xsl : - adding new (sub)fields to Identifier-standard index : 011f/g ; 012a ; 013a/z ; 014a/z ; 015a/z ; 016a/z ; 017a/z, 040a/z, 071z, 072z, 073z - adding 1 new subfield to Publisher index : 071b (may contain the name of a music publisher) - adding new (sub)fields to Author and Identifier-standard index (for the $9) : 716, 72X, 730 - adding new (sub)fields to Note : 334$a (award note) - correcting 207 and 208 - suppressing 308a and 328a in Note (useless as complete fields are indexed in same index) - adding (sub)fields to Title index : 411t, 421-425t, 433-437t, 442-444t, 446-456t, 462-463t, 470-488t, 560 - adding (sub)fields to Subject and Identifier-standard index (for the $9) : 608, 615, 616, 617, 620, 621 - adding some classifications index : 670, 675, 686 - adding some comments (to make easier further modifications and to identify non unimarc fields : 414-420, 603, 630-636, 646) To test : - take a record and fill some of the missing fields (e.g 488t, 608, 720, 012a) with some data as "field488", "field608" etc - try to find the record => not possible - apply the patch, copy the new record.abs in etc/zebradb/biblios/etc and rebuild zebra - try to find the record => should be ok - check nothing else is broken... - same test with DOM indexing activated http://bugs.koha-community.org/show_bug.cgi?id=8984 Signed-off-by: Zeno Tajoli <tajoli@cilea.it> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com> Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>	2013-01-04 08:39:56 -05:00
Jared Camins-Esakov	7d9b4d58e3	Bug 8665: DOM indexing fails to index some bib records Use a user-specified field for z:id. This patch also fixes an excess space before the index in the MARC21 biblio index definitions, which someone fixed in the generated file but not in the source file it should have been fixed in. Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Elliott Davis <elliott@bywatersolutions.com>	2012-10-29 19:12:38 +01:00
Colin Campbell	1e8423167e	Bug 8653 remove erroneous whitespace blocking indexing The superfluous whitespace after the definition of subject tag $9s is causing an error when carried over into dom config files so that the authority links fail to index Also removed the (harmless) trailing space in the equivalent Unimarc files A good editor and git can help in not creating excess whitespace Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-09-14 17:20:34 +02:00
Frédéric Demians	38b375b32c	Bug 7818 Add UNIMARC biblio records zebra DOM def files Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> I tested two UNIMARC Koha installations using the sample UNIMARC data from the BibLibre sandbox, comparing the results with DOM and with GRS-1 indexing. The results are very similar, though there are some differences. Most noticeable: * relevance and facets seem to be more accurate with DOM enabled * the GRS-1 configuration returns approximately 10% more results with random single keywords like "petit," but the DOM results contain the most relevant items, and any lacks in the configuration can easily be corrected as UNIMARC users identify fields that should be indexed but aren't * authority-controlled searches match exactly * author and topic facets do not work with the out-of-the-box GRS-1 indexing configuration (?!?) (adding second sign-off line below because all that probably looks like a commit message and not a sign off) Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-06-09 11:44:16 +02:00

14 commits
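
The regeneration step quoted in the Bug 11232 test plan can be sketched as a small shell script. This is only a sketch: it assumes it is run from the root of a Koha source checkout with xsltproc installed, and it uses the paths exactly as they appear in that commit message. The function name `regenerate_indexdefs` and the `DRY_RUN` switch are hypothetical additions for illustration and are not part of Koha.

```shell
#!/bin/sh
# Sketch: regenerate biblio-zebra-indexdefs.xsl for each MARC flavour,
# following the loop in the Bug 11232 test plan.
# DRY_RUN is a hypothetical switch: by default (DRY_RUN=1) the commands
# are only printed; set DRY_RUN=0 to actually invoke xsltproc.

regenerate_indexdefs() {
    for i in marc21 normarc unimarc; do
        dir="etc/zebradb/marc_defs/$i/biblios"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Print the command that would be run for this flavour.
            echo "xsltproc etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl $dir/biblio-koha-indexdefs.xml > $dir/biblio-zebra-indexdefs.xsl"
        else
            # Transform the Koha index definitions into the Zebra stylesheet;
            # stop at the first failure.
            xsltproc etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl \
                "$dir/biblio-koha-indexdefs.xml" \
                > "$dir/biblio-zebra-indexdefs.xsl" || return 1
        fi
    done
}

regenerate_indexdefs
```

Per the same test plan, a run with `DRY_RUN=0` followed by `git diff` should report no differences on the xsl files if the checked-in stylesheets are already in sync with their sources.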