Commit graph

30 commits

Author SHA1 Message Date
fbf1c32a6f
Bug 30879: Handle index/sorting for UNIMARC
Same as before, but test with UNIMARC setup

Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2022-08-02 09:33:13 -03:00
5a4ac1577b
Bug 15142: (follow-up) remove from Zebra UNIMARC config files
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
2020-01-10 16:16:41 +00:00
17a354f5e8 Bug 18322: Add a facet for ccode fields to Zebra
This patch adds the index definitions for zebra faceting of ccode in
koha for marc21, normarc and unimarc.

We also add lines to the templates to expose the new facet and enable
non-zebra faceting for ccode too.

Signed-off-by: David Cook <dcook@prosentient.com.au>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>

Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
2018-07-18 17:42:48 +00:00
5ef1b6710e Bug 18098: Add an index with the count of not onloan items
This patch adds a numeric index 'not-onloan-count' containing the value
of 999$x. This subfield is filled by 'rebuild_zebra.pl' by making use of
(bug's 18208) 'EmbedItemsAvailability' filter.

bib1.att and indexes definitions are updated accordingly.

To test:
- Apply the patch
- Pick the right biblio-zebra-indexdefs.xsl file for your setup and
  replace the one your Zebra uses [1]
- Replace your bib1.att
- Replace your ccl.properties
- Have at least one record with more than one item, checkout some
  item(s) from that record(s).
- Rebuild zebra's indexes:
  $ sudo koha-shell kohadev
 k$ cd kohaclone
 k$ misc/migration_tools/rebuild_zebra.pl -r -b -v -k
 (notice the dump directory is kept, you can try the XSLT yourself
  running:
    $ xsltproc \
       etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl \
       /tmp/the_dump_dir/biblios/exported_records | less
=> SUCCESS: There are records with the not-onloan-count index, and the
            value is correct!
- Check Zebra yourself:
  $ yaz-client unix:/var/run/koha/kohadev/bibliosocket
 Z> base biblios
 Z> find @attr 1=9013 @attr 2=5 @attr 4=109 0
=> SUCCESS: The search matches the amount of records with not-onloan
            items.
 Z> s 1+1
=> SUCCESS: Records with 999$x having a value higher than 0 are rendered
- Sign off :-D

Note: While this work is complete on its purpose, it is part of an
attempt to create a better way of filtering by availability.

Sponsored-by: ByWater Solutions

 [1] In kohadevbox this would be
/etc/koha/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl

Edit: Added the missing XSLT changes for UNIMARC and NORMARC

Signed-off-by: Josef Moravec <josef.moravec@gmail.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
2017-05-08 09:21:41 -04:00
aaf3ff3fec Bug 14154: 608$9 defined twice in UNIMARC biblio-koha-indexdefs.xml
In DOM config file :
etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml, the 608$9 is
defined a second time instead of 610$9.  Just a type I think.

Test plan :
- Apply patch
- Install a UNIMARC + DOM instance
- Define in a framework 610 using a thesaurus
- Create a new biblio
- Create a new authority (same type as the thesaurus defined above)
- Index : rebuild_zebra.pl -a -b -x -z
- Link the field 610 to the new authority
- Index : rebuild_zebra.pl -a -b -x -z
- In authorities search, search for the new authority
=> You see Use in 1 Records(s)

Signed-off-by: Frederic Demians <f.demians@tamil.fr>
  I confirm the typo.

Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2015-10-21 13:43:12 -03:00
Stefan Weil
f6aec46dda Bug 14383: etc/zebradb: Fix some typos in documentation and Bib-1 attribute set
All of them were found and fixed using codespell.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>

Signed-off-by: Jonathan Druart <jonathan.druart@koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2015-06-22 17:34:46 -03:00
Jonathan Druart
b5e9691060 Bug 8992: Add 7..$3 to the Indentifier-standard index
Signed-off-by: valerie bertrand <valerie.bertrand@univ-lyon3.fr>

Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
2015-04-28 15:47:40 -03:00
532b41934c Bug 13157: (QA followup) homebranch is 995$b on UNIMARC frameworks
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
2014-11-25 15:27:12 -03:00
9ebb6ba5d1 Bug 13157: UNIMARC holdingbranch facet is 995$c not 995$b
Fix a typo. Not test plan required, just a look at default UNIMARC framework.

Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
2014-11-25 15:27:05 -03:00
c217b2c418 Revert "Bug 9828: More specific indexing of UNIMARC 6XX fields"
This reverts commit 0dd1ac40a0.
2014-10-28 12:02:34 -03:00
e43f012af6 Revert "ug 9828 : Add and fix comments in UNIMARC biblio-koha-indexdefs.xml"
This reverts commit 5bbe42932e.
2014-10-28 12:02:22 -03:00
b108a111f6 Revert "Bug 9828 : Followup for Queryparser and deletion of useless 6XX$9"
This reverts commit 49788987b2.
2014-10-28 12:02:09 -03:00
Mathieu Saby
49788987b2 Bug 9828 : Followup for Queryparser and deletion of useless 6XX$9
This followup
- changes some indexes in Queryparser configuration file
- supresses some clearly useless 6XX$9 in biblio-koha-indexdefs.xml and adds 2 new ones, probably useless (not sure of that)
- change the name of index Subject-geographical to Subject-name-geographical in ccl.properties (to match bib1.att)
the xsl file zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl was generated with the following command:
xsltproc zebradb/xsl/koha-indexdefs-to-zebra.xsl zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml > zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl

To test :
1) Apply the 3 patches
2) copy the modified files from the source directory to the directory where you store the config files for Zebra and Queryparser
The files modified by the 3 patches and that need to be copied are:
etc/zebradb/biblios/etc/bib1.att
etc/zebradb/ccl.properties
etc/searchengine/queryparser.yaml
etc/zebradb/ccl.properties
.../unimarc/biblios/biblio-koha-indexdefs.xml
.../unimarc/biblios/biblio-zebra-indexdefs.xsl
3) Rebuild Zebra
4) Create a record A with some values in critical fields, for example:
- the string "test9828" in 600$c 600$f 600$p, 602$f, 616$c, 616$f, 606$2,600$2
- the string "subform" in 600$j
4) Create a record B with the string "subgeo" in 606$y
5) Create a record C with the string "subdate" in 606$z
WITHOUT QP activated in sysprefs ("Don't try to use QP"):
6) try to search "su:test9828". You should have no results
7) try to search "su-genre:subform". You should have 1 result : record A
8) try to search "su-geo:subgeo". You should have 1 result : record B
9) try to search "su-chrono:subdate". You should have 1 result : record C
10) on existing records, try su-ut, su-to, su-na, su-form, su-corp, su-geo indexes, and see it results are relevant
WITH QP activated in sysprefs:
Same tests

Signed-off-by: Nick Clemens <nick@quecheelibrary.org>

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-27 12:46:47 -03:00
Mathieu Saby
5bbe42932e ug 9828 : Add and fix comments in UNIMARC biblio-koha-indexdefs.xml
Only cosmetic :
- the references to lines record.abs are now useless and outdated
- some comments added in record.abs could be usefull in biblio-koha-indexdefs.xml

No change expected, only comments

Signed-off-by: Nick Clemens <nick@quecheelibrary.org>

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-27 12:46:44 -03:00
Mathieu Saby
0dd1ac40a0 Bug 9828: More specific indexing of UNIMARC 6XX fields
[New commit on 18 Aug 2014 : rebased, and DOM indexing only]

Issues to fix :
Most of 6XX may contain a $2 that identifies the system used for indexing. It should not be indexed.
In French libraries, $2 contains "rameau". So searching books about the music composer "Rameau" retreive thousands of records!
For some 6XX fiels, other subfields should not be indexed, for example dates of persons and family, or adresses.
In Unimarc guide, 600$t,601$t,602$t are said to exist but to be "not used". I keep them indexed.

Additionnally, subject indexing could be improved by using specific indexes for each 6XX if possible :
In ccl.properties :
- su-to, su-geo and su-ut are defined as aliases of Subject.
- a specific index is defined, but not used in record.abs : Subject-name-personal, alias su-na
We can use these indexes and create new specific indexes by using existing bib1 attributes.

We could also index $j,$x,$y,$z subdivision in specific indexes.

This patch does the following changes :
1) For all 6XX : Not indexing $2 (LSCH, Rameau...), $3 and $5
2) Suppressing the indexing of some specific subfields, depending on the field:
600 : Personal name used as a subject // see Marc21 600
not indexing c (additional elements),f (dates),p (address/affiliation)
602 : Family name used as a subject // see Marc21 600 3X
not indexing f (dates)
616 : Trademark
not indexing c,f
3) For all 6XX : index $j,$x,$y,$z in several indexes in addition to the specfific index for their 6XX field:
4) Define in ccl.properties some specific indexes :
Subject-name-conference 1=1073 => alias su-conf
Subject-name-corporate 1=1074 => alias su-corp
Subject-genre-form 1=1075 => alias su-genre and su-form
Subject-geographical 1=1076 => alias su-geo
Subject-chronological 1=1077 => alias su-chrono
Subject-title 1=1078 => alias su-ut and su-ti
Subject-topical 1=1079 => alias su-to
5) Adding new aliases in Search.pm :
su-chrono, su-form, su-genre, su-corp, su-conf, su-ti
6) Using these new indexes in for
600 : Subject and Subject-Personal-Name ; all subfields except subdivisions in Personal-name
601 : Subject, Subject-name-conference and Subject-name-corporate and Subject-name-conf ; all subfields except subdivisions in Corporate-name and Conference-name
602 : same as 600 but could be improved later
604 : Subject and Subject-title ; $a in Subject-Personal-Name ; all subfields except subdivisions in Name-and-Title
605 : Subject and Subject-title
606 : Subject and Subject-topical
607 : Subject and Subject-geographical ; all subfields except subdivisions in Name-geographic
608 : Subject and Subject-genre-form

To test :

A. In a UNIMARC-DOM indexing environment
1) Apply the patch
2) Rebuild zebra
3) Create a record A with some values in critical fields, for example:
- the string "test9828" in 600$c 600$f 600$p, 602$f, 616$c, 616$f, 606$2,600$2
- the string "subform" in 600$j
4) Create a record B with the string "subgeo" in 606$y
5) Create a record C with the string "subdate" in 606$z
6) try to search "su:test9828". You should have no results
7) try to search "su-genre:subform". You should have 1 result : record A
8) try to search "su-geo:subgeo". You should have 1 result : record B
9) try to search "su-chrono:subdate". You should have 1 result : record C
10) on existing records, try su-ut, su-to, su-na, su-form, su-corp, su-geo indexes, and see it results are relevant

Indexing of subjects could maybe be improved later

Signed-off-by: Nick Clemens <nick@quecheelibrary.org>

All seems to work as expected, I am not super-familiar with UNIMARC but I wonder if in su-corp and su-conf the subdivisions might be useful (e.g. France-Gendarmie / Staatsbibliothek-Berlin)

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-27 12:46:42 -03:00
Jonathan Druart
b3acefc319 Bug 11586: Better default framework for UNIMARC - zebra conf
This patch updates the Zebra configuration for unimarc.

995$d and 995$j should not be indexed.

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-23 10:52:03 -03:00
e95cd1b126 Bug 11232: (followup) remove unnecesary namespace definition from all XML elements
The previous patches for facet extraction from Zebra indexes set a default
namespace on the following files:

etc/zebradb/marc_defs/marc21/biblios/biblio-koha-indexdefs.xml
etc/zebradb/marc_defs/normarc/biblios/biblio-koha-indexdefs.xml
etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml

and hence the XML file index_subfields can be cleaned by removing the namespace.

To test:
- Apply this patch
- Run

$ for i in marc21 normarc unimarc
  do xsltproc etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl \
              etc/zebradb/marc_defs/$i/biblios/biblio-koha-indexdefs.xml \
              > etc/zebradb/marc_defs/$i/biblios/biblio-zebra-indexdefs.xsl
  done

=> SUCCESS: no errors reported

- Run
$ git diff
=> SUCCESS: no differences on the xsl files

- Sign off :-D

Sponsored-by: Universidad Nacional de Cordoba
Signed-off-by: David Cook <dcook@prosentient.com.au>

Seems to work with DOM and MARC21.

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-15 12:55:44 -03:00
eafeb34097 Bug 11232: UNIMARC facet definition and updated XSL file for DOM
This patch adds the facets definitions to the biblio-koha-indexdefs.xml, based
on what is hardcoded on C4::Koha::getFacets().

The biblio-zebra-indexdefs.xsl file for UNIMARC is generated using the usual:

xsltproc ...koha-indexdefs-to-zebra.xsl ...unimarc/biblios/biblio-koha-indexdefs.xml > \
    ...unimarc/biblios/biblio-zebra-indexdefs.xsl

Sponsored-by: Universidad Nacional de Cordoba
Signed-off-by: David Cook <dcook@prosentient.com.au>

Seems to work with DOM and MARC21.

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-15 12:55:38 -03:00
95adc7a1f4 Bug 12453 - Do not use by default Host-Item-Number in UNIMARC
Actually, in default UNIMARC install, 461$9 is indexed as Host-Item-Number, meaning it is used for analytical itemnumber.

But most UNIMARC catalog use the analytical relation using unimarc_field_4XX.pl plugin on 461$a. In fact, this plugin is defined in default UNIMARC frameworks.

If Host-Item-Number is defined but 461$9 is used for something else, it will lead to odd bugs. For example, records containing analytical items can not be deleted.

This patch comments the 461$9 indexing in UNIMARC zebra config.

Test plan :
- Create a fresh UNIMARC install
- Create a record with 461$9 containing a value
- Index the record
- Perform a search on Host-Item-Number : ccl=Host-Item-Number,alwaysmatches=''
=> Without the patch you get a result
=> With the patch you get no result

Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Code is clean, commenting out all the indexing of 461$9.
Trusting the author that this is the correct thing to do :)

Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-08-24 12:32:30 -03:00
Mathieu Saby
b6118db2f5 Bug 11202: Improve UNIMARC biblio indexing
This patch makes the following changes to UNIMARC biblio indexing :
A. Changes to UNIMARC conf files
1. add comments to biblio-koha-indexdefs.xml
2. make biblio-koha-indexdefs.xml more compact by grouping some
   declarations
   Ex : 200$f and 200$g => one declaration for 200$fg
3. suppress unneeded declarations (indexing of some 4XX fields and 6XX
   fields not in unimarc format)
4. unindex some (sub)fields unneeded by most users (318, 207,230,210a,
   215, 4XXd)
5. change the way 308 field is indexed (no visible changes)
6. replace Title-host with Host-item -- see bug 11119
7. index 208 in Material-Type -- see bug 11119
8. index 100 pos 8-9 and 9-12 in pubdate:y and pubdate:n
9. index 100 pos 8-9 in pubdate:s instead of 210$d
10. Index all subfields of note 334 and 327 in note index
11. Index 304 and 327 in title index as well as note index
    327 can contain a list of titles included in a work
    304 can contain the title of the original work in case of a
    translation
12. Index 314 in author index as well as note index
    314 can contain authors not mentionned in 200$f/g (the 4th, 5th etc.
    author)
13. Index 328 note in Dissertation-information as well as note
14. Index 328$t in Title

B. Changes to ccl.properties :
1. add a new index Dissertation-information (1056)
2. fix EAN, pubdate and acqdate (they were not linked with bib1 attributes)

C. Changes to Search.pm
1. add Dissertation-information and suppress Title-host and UPC

D. Changes to QP config file queryparser.yaml
1. add Dissertation-information
2 fix EAN, pubdate and acqdate

Test plan :
If you cannot test in GRS1, test only in DOM, as GRS will be deprecated.

1. Apply the patch in a UNIMARC Koha running with DOM and ICU
2. copy src/etc/searchengine/queryparser.yaml into the main config
   directory of QP
3. copy src/etc/zebradb/ccl.properties into the main config directory
   of Zebra
4. copy src/etc/zebradb/marc_defs/unimarc/biblio/* into the main config
   directory of Zebra
5. reindex biblios (rebuild_zebra.pl -r -b -x -v)
6. test note index : make some searches on 334$b or 327$b
7. test author index : make some searches on 314 field
8. test title index : make some searches on 304 and 327 field, make a
   search on 328$t subfield
9. test dissertation-information index : make some searches on 328 field
10. In a record, put in the dates of 100 fields the values "1000" (1st
    date) and "1001" (2d date) ; try to search a book written in year
    1000, you should find the record ; idem for year 1001
11. make some searches and sort by date. It should work better as before,
    especially if you have values like "c2009" or "impr. 2010" in 210
    field
12. Regression test : make some searches on several indexes, like EAN,
    etc. It should work as before

Test 10-12 with and without Queryparser activated.
Be careful: with Queryparser activated, the index names (title,
dissertation-information...) must be entered in lowercase only.
Of course, to test search and sort by dates, you need to have full
records, with dates in 100 field as well as 210 field.

Signed-off-by: Paola Rossi <paola.rossi@cineca.it>
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-02-19 21:01:15 +00:00
Jonathan Druart
a573ac1fa8 Bug 9940: (follow-up) FIX comment: language-original is 101$c, not $h
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-12-25 15:41:31 +00:00
Mathieu Saby
451f67c055 Bug 9940: Add a new index for the original language of a document
It could be useful to index the original language of a document (i.e.
"fre" for the English translation of a French novel).

This patch renames the Bib-1 use attribute 1095 from
Code-language-original to language-original and uses it to index:

- MARC21 041$h subfield
- UNIMARC 101$c subfield

It adds "language-original" in the list of index in Search.pm.

Test plan :
A. in a MARC21 GRS1 environment
1. Copy Zebra config files (zebradb/biblios/etc/bib1.att,
   zebradb/ccl.properties, marc_defs/marc21/biblios/record.abs) from
   your source etc/ directory to your main koha etc/ directory
2. Reindex zebra
3. Make some searches, like "language-original:fre"
B. in a MARC21 DOM environment
4. Copy Zebra config files (zebradb/biblios/etc/bib1.att, zebradb/ccl.properties,
   marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl) from your source etc/
   directory to your main koha etc/ directory
5. Reindex zebra
6. Make some searches, like "language-original:fre"
C. in a UNIMARC GRS1 environment
7. Copy Zebra config files (zebradb/biblios/etc/bib1.att,
   zebradb/ccl.properties, marc_defs/unimarc/biblios/record.abs) from
   your source etc/ directory to your main koha etc/ directory
8. Reindex zebra
9. Make some searches, like "language-original:fre"
A. in a UNIMARC DOM environment
10. Copy Zebra config files (zebradb/biblios/etc/bib1.att,
    zebradb/ccl.properties, marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl)
    from your source etc/ directory to your main koha etc/ directory
11. Reindex zebra
12. Make some searches, like "language-original:fre"

Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-12-25 15:37:14 +00:00
Mathieu Saby
c00131e0ff Bug 9830: Fix some indexes in UNIMARC item indexing
With this combination of sysprefs, and a UNIMARC configuration, it was
impossible to search on location, barcode and ccode indexes :

QueryWeightFields          is activated
QueryAutoTruncate          only if * is added

But in UNIMARC, location, barcode and ccode (995 $e,$f,h) were indexed
only as "words". They need to be indexed also as "phrase".

Additionnaly, in UNIMARC, information about damaged and withdrawn status
of items is not indexed, while it is done in MARC21.

This patch
- add 2 new indexes for 995$1 (damaged) and 995$3 (withdrawn)
- index location, barcode and ccode as "phrase" as well as "words"

Indexing of items in UNIMARC could be improved later. So this patch also
add comments explaining the origin of Koha 995, I think it could be
useful for further changes.

To test, on a UNIMARC configuration :
A. indexed with GRS-1
1) Set sysprefs QueryWeightFields as "activated" and QueryAutoTruncate
   as "only if * is added"
2) Select location index in advanced search and search for a value
   existing in your records in 995$e => 0 results
3) Apply patch
4) Rebuild zebra
5) Select location index in advanced search and search for a value
   existing in your records in 995$e => x results
6) Mark an item as withdrawn; search "withdrawn:1" => x results, and
   among them the biblio to which the item is attached
7) Mark an item as damaged ; search "damaged:1" => x results, and among
   them the biblio to which the item is attached

B. indexed with DOM
Do the same operations

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Work as described. No koha-qa errors

Test
Apply the patch
Begin with GRS-1
Full reindex
Search by location, no results
cp files biblio-*-indexdefs.xml and record.abs to destination on etc/zebra
Full reindex
Search by location, got results

Switch to DOM
reset files
Full reindex
Search by location, no results
cp files
Full reindex
Search by location, results !

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-10-21 15:38:49 +00:00
Mathieu Saby
5298140c67 Bug 10037: fix item index in UNIMARC DOM indexing
In UNIMARC DOM indexing, "item" index was working only for subfields
of 995 field mapped with specific indexes, and also in index (ex :
$a, $b...). It was not working for the other subfields (ex : $g),
because a comment from record.abs was integrated in DOM config files.

This patch removes the comment.

To test, in a DOM UNIMARC environment :
1) In a item, write some value "Test10037" in 995$g
2) Search for this value in simple search, this way : item=Test10037
   => you should have no results
3) Apply the patch. if necessary, copy the modified
   etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml and
   etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl into
   the /etc/... directory in your main Koha directory
4) Reindex Zebra biblios
5) Do the same search as 2) => you should have one result

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Work as described. No koha-qa errors.

Test

NOTE: default UNIMARC framework don't have 995g,
so I must add it first.

1) Added test string to 995b on some record
2) Reindex and search as indicated, no results
3) cp files to destination
4) reindex
5) search and result ok !

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-10-10 18:54:12 +00:00
Mathieu Saby
43809d2835 Bug 8252: Followup for Date/time-last-modified and Music number
This followup restores the original wording of "Date/time-last-modified"
index, and change the name of "Music-number" index to
"Number-music-publisher"

To test :
1. In a UNIMARC Koha instance
2. Apply patchs #1, #2 and this followup
3. Copy from src/etc/zebradb directory to the etc/zebradb/ in your main
   Koha directory the following files:
-- zebradb/biblios/etc/bib1.att
-- zebradb/ccl.properties
-- zebradb/marc_defs/unimarc/biblios/record.abs
-- zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml
-- zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl
4. Rebuild zebra with -b -x -v -r options
5. Write a value like "test071a" in 071$a field in a record
6. Check if you can find this record with this search:
   "ccl=Number-music-publisher:test071a"

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
No koha-qa errors.

Test
Copy files
reindex full
Modify a couple of record to add 071a with test message
Reindex -v -z -b -x
Search test message as described and found modified records.

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-10-10 15:16:27 +00:00
Mathieu Saby
8034566027 Bug 8252: Fix indexing of UNIMARC 1xx for DOM
This patch makes the same changes in UNIMARC DOM configuration as patch
1 made for GRS-1.

positions of subfields are indexed that way :
In biblio-koha-indexdefs.xml :
tag="100" subfields="a" offset="17" length="1"
In biblio-zebra-indexdefs.xsl :
xslo:value-of select="substring(., 17, 1)"

I had to edit biblio-zebra-indexdefs.xsl by hand, because
etc/zebdradb/xml/koha-indexdefs-to-zebra.xsl does only support
"subtring" in handle-one-index-control-field template.

It is good for MARC21, but not for UNIMARC : in MARC21, indexing
subtrings is needed for controled field (001-009, with no subfields)
But in UNIMARC it is needed for subfields of 1XX fields.
So if DOM indexing is working with these new files, we may need to
change koha-indexdefs-to-zebra.xsl.

Test plan (not possible in a sandbox) :
1) In a Koha instance using UNIMARC and DOM indexing
2) Apply Patch 1 and Patch 2 (this one)
3) Copy the following files from the etc/zebradb directory of your
   source into the etc/zebradb directory of your main Koha directory :
-- etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml
-- etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl
-- etc/zebradb/ccl.properties
-- etc/zebradb/biblios/etc/bib1.att
4) rebuild zebra with -x -b -r -v options
5) check if coded filters in advanced search are usable in OPAC and
   Staff interface

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Works. No koha-qa errors.

Test for DOM
Apply patches
Don't forget to copy files
reindex
Search by coded fields works, also Country-publication

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-10-10 15:15:04 +00:00
Mathieu Saby
e86e3c24b8 Bug 8984: make Zebra more UNIMARC compliant
This patch makes the following changes to record.abs, biblio-koha-indexdefs.xml and biblio-zebra-indexdefs.xsl :
- adding new (sub)fields to Identifier-standard index : 011f/g ; 012a ; 013a/z ; 014a/z ; 015a/z ; 016a/z ; 017a/z, 040a/z, 071z, 072z, 073z
- adding 1 new subfield to Publisher index : 071b (may contain the name of a music publisher)
- adding new (sub)fields to Author and  Identifier-standard index (for the $9) : 716, 72X, 730 - adding new (sub)fields to Note : 334$a (award note)
- correcting 207 and 208
- suppressing 308a and 328a in Note (useless as complete fields are indexed in same index)
- adding (sub)fields to Title index : 411t, 421-425t, 433-437t, 442-444t, 446-456t, 462-463t, 470-488t, 560
- adding (sub)fields to Subject and  Identifier-standard index (for the $9) : 608, 615, 616, 617, 620, 621
- adding some classifications index : 670, 675, 686 - adding some comments (to make easier further modifications and to identify non unimarc fields : 414-420, 603, 630-636, 646)

To test :
- take a record and fill some of the missing fields (e.g 488t, 608, 720, 012a) with some data as "field488", "field608" etc
- try to find the record => not possible
- apply the patch, copy the new record.abs in etc/zebradb/biblios/etc and rebuild zebra
- try to find the record => should be ok
- check nothing else is broken...
- same test with DOM indexing activated

http://bugs.koha-community.org/show_bug.cgi?id=8984
Signed-off-by: Zeno Tajoli <tajoli@cilea.it>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-01-04 08:39:56 -05:00
Jared Camins-Esakov
7d9b4d58e3 Bug 8665: DOM indexing fails to index some bib records
Use a user-specified field for z:id.

This patch also fixes an excess space before the index in the MARC21
biblio index definitions, which someone fixed in the generated file
but not in the source file it should have been fixed in.

Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Elliott Davis <elliott@bywatersolutions.com>
2012-10-29 19:12:38 +01:00
Colin Campbell
1e8423167e Bug 8653 remove erroneous whitespace blocking indexing
The superfluous whitespace after the definition of subject
tag $9s is causing an error when carried over into dom config
files so that the authority links fail to index

Also removed the (harmless) trailing space in the equivalent
Unimarc files

A good editor and git can help in not creating excess whitespace

Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
2012-09-14 17:20:34 +02:00
38b375b32c Bug 7818 Add UNIMARC biblio records zebra DOM def files
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
I tested two UNIMARC Koha installations using the sample UNIMARC
data from the BibLibre sandbox, comparing the results with DOM
and with GRS-1 indexing. The results are very similar, though there
are some differences. Most noticeable:
* relevance and facets seem to be more accurate with DOM enabled
* the GRS-1 configuration returns approximately 10% more results with
  random single keywords like "petit," but the DOM results contain
  the most relevant items, and any lacks in the configuration can
  easily be corrected as UNIMARC users identify fields that should be
  indexed but aren't
* authority-controlled searches match exactly
* author and topic facets do not work with the out-of-the-box GRS-1
  indexing configuration (?!?)
(adding second sign-off line below because all that probably looks like
a commit message and not a sign off)

Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
2012-06-09 11:44:16 +02:00