Commit graph

180 commits

Author SHA1 Message Date
57ea65e725
Bug 15048: Index all possible searched subfields for index-term-genre
Currently we only index a - but we can setup the system such that avxyz are searched

To test:
 1 - define both a 655$a *and* 655$x value in a bib, save, reindex
 2 - Set system preferences:
      TraceSubjectSubdivisions: Include
      TraceCompleteSubfields: Force
 3 - View the record edited above in the opac
 4 - Click on the subject heading
 5 - No results found
 6 - Copy zebra files:
  sudo cp ./etc/zebradb/marc_defs/marc21/biblios/biblio-koha-indexdefs.xml \
  /etc/koha/zebradb/marc_defs/marc21/biblios/biblio-koha-indexdefs.xml
  sudo cp etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl \
  /etc/koha/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl
 7 - restart all and reindex
 8 - Click on the subject heading in OPAC
 9 - Sucess!
10 - Repeat with other fields (vyz)
11 - Repeat under ES, reindexing and resetting mappings

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>

Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2022-10-24 14:39:38 -03:00
2c673944ba
Bug 31532: Add preprocessor to Zebra indexing pipeline to index 880 as linked field
This patch adds a preprocessor XSLT to the Zebra indexing pipeline,
so that 880 fields get indexed as the fields they're linked to. For example,
a "880 $6 245" field would be indexed as a "245" field.

However, because the preprocessor only occurs in the indexing part of the pipeline,
it does not affect the retrieval of MARCXML from Zebra. That MARCXML will be
the same MARCXML that was sent to Zebra from Koha.

Test plan:

0. Revert bug 15187, and apply patch for 31532
1. cp ./etc/zebradb/biblios/etc/dom-config.xml /etc/koha/zebradb/biblios/etc/dom-config.xml
2a. cp etc/zebradb/marc_defs/marc21/biblios/preprocess_marcxml.xsl /etc/koha/zebradb/marc_defs/marc21/biblios/.
2b. cp etc/zebradb/marc_defs/normarc/biblios/preprocess_marcxml.xsl /etc/koha/zebradb/marc_defs/normarc/biblios/.
2c. cp etc/zebradb/marc_defs/unimarc/biblios/preprocess_marcxml.xsl /etc/koha/zebradb/marc_defs/unimarc/biblios/.
3. koha-rebuild-zebra -b -f -v kohadev
4. Note that in search results the 880$6245$a data appears before the 245$a data
5. Note that you can do a title index search on the 880$6245$a data

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2022-10-24 14:07:46 -03:00
fbf1c32a6f
Bug 30879: Handle index/sorting for UNIMARC
Same as before, but test with UNIMARC setup

Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2022-08-02 09:33:13 -03:00
230bae0b02
Bug 30879: Add biblionumber as a sorting option in MARC21
This patch updates the Local-Number indexing by adding a zeropad option
to Zebra indexing and adding this to the mapping files

It also updates C4/Search.pm to allow biblionumber as an option

To test:
1 - Apply patches
2 - copy etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl to /etc/koha/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl
3 - Restart all, reindex zebra
4 - Browse to: http://localhost:8081/cgi-bin/koha/catalogue/search.pl?idx=kw&q=a&sort_by=biblionumber_dsc&count=20
5 - Confirm records sorted correctly
6 - Browse to http://localhost:8081/cgi-bin/koha/catalogue/search.pl?idx=kw&q=a&sort_by=biblionumber_asc&count=20
7 - Confirm records sorted correctly

Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2022-08-02 09:33:12 -03:00
ae91a0b5db Bug 18540: Update MARC21 biblio XSL file
I ran the xsltproc on both MARC21 and UNIMARC files (biblios and
authorities). With my follow-up the only changed one is this one.

I skipped NORMARC as it is supposed to be removed by now (so unused in
Norway).

Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Fridolin Somers <fridolin.somers@biblibre.com>
2022-01-18 21:15:04 -10:00
78d403ca3d Bug 18540: Handle indexing sort title only when needed
This patch moves the code that generates the xsl for MARC21 biblio sorting
to it's own template that is only called when specified in the xml

To test:
 1 - xsltproc etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl etc/zebradb/marc_defs/marc21/authorities/authority-koha-indexdefs.xml > etc/zebradb/marc_defs/marc21/authorities/authority-zebra-indexdefs.xsl
 2 - git diff
 3 - Note that authority-zebra-indexdefs.xsl now has 245 Title:s info
 4 - Apply patch
 5 - xsltproc etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl etc/zebradb/marc_defs/marc21/authorities/authority-koha-indexdefs.xml > etc/zebradb/marc_defs/marc21/authorities/authority-zebra-indexdefs.xsl
 6 - git diff
 7 - There are lines added about title sort, but no 245 block
 8 - xsltproc etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl etc/zebradb/marc_defs/marc21/biblios/biblio-koha-indexdefs.xml > etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl
 9 - git diff
10 - Note lines changes to ...title_sort
11 - 245 block does not change

Signed-off-by: Hayley Pelham <hayleypelham@catalyst.net.nz>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Fridolin Somers <fridolin.somers@biblibre.com>
2022-01-18 21:15:04 -10:00
Hayley Pelham
45fee768e9 Bug 20463: Adding index for Leader 19 Multipart resource record level
Sponsored-by: Bibliotheksservice-Zentrum Baden-Wuerttemberg

Signed-off-by: Katrin Fischer <katrin.fischer@bsz-bw.de>

Signed-off-by: Nick Clemens <nick@bywatersolutions.com>

Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
2021-10-25 14:08:06 +02:00
3a30e2d66c Bug 28337: Add System-control-number index to Zebra MARC21 files
To test:
1 - Apply patch
2 - Copy zebra files to destination:
    cp /kohadevbox/koha/etc/zebradb/marc_defs/marc21/authorities/authority-koha-indexdefs.xml /etc/koha/zebradb/marc_defs/marc21/authorities/authority-koha-indexdefs.xml
    cp /kohadevbox/koha/etc/zebradb/marc_defs/marc21/authorities/authority-zebra-indexdefs.xsl /etc/koha/zebradb/marc_defs/marc21/authorities/authority-zebra-indexdefs.xsl
3 - Reindex authorities
4 - Edit an authority and add 035$aExpialodocious
5 - Export the authority
6 - Create a new record matchign rule:
    Matching rule code: OCN
    Description: Other control number
    Match threshhold: 1000
    Record type: Authority record
    Search-index: Other-control-number
    Score: 1000
    Tag: 035
    Subfields: a
7 - Stage the record and use the new matchign rule
8 - Match found!

Signed-off-by: Andrew Fuerste-Henry <andrew@bywatersolutions.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>

Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
2021-10-15 11:34:26 +02:00
deece7b76b Bug 28830: Add cni index for 003
This patch adds the cni/Control-number-identifier index to enable
searches to use the 003 field.

Test plan
1/ Apply patch
2/ Re-index using updated configurations
3/ Confirm cni:number searches yield the expected results
4/ Signoff

Split-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Pasi Kallinen <pasi.kallinen@koha-suomi.fi>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>

Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
2021-08-30 17:02:07 +02:00
e961e45daa Bug 21286: Add Corporate-name as phrase to zebra indexes
When using Zebra for searching, Koha performs a number of searches in order
to improve relevancy. This means that even for 'wordlist' search, we perform a phrase search.

When selecting 'Corporate-name' as an index, this expansion of the search causes errors and fails
the search

We can fix this for 'Corporate-name' searches by adding a phrase index

To test:
 1 - Edit koha-conf.xml and uncomment the zebra debug line and add 'request' to the list
 2 - Restart all
 3 - tail -f /var/log/koha/kohadev/zebra-output.log
 4 - Edit a record to add a 110 field e.g. 'House plants'
 5 - Enable syspref IntranetCatalogSearchPulldown
 6 - Search for 'Corporate name' and term 'House plants'
 7 - No results
 8 - View the log, see 'ERROR' and full search terms listed
 9 - Apply patch
10 - copy the zebra files to the production instance:
    cp etc/zebradb/marc_defs/marc21/biblios/biblio-koha-indexdefs.xml /etc/koha/zebradb/marc_defs/marc21/biblios/biblio-koha-indexdefs.xml
    cp etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl /etc/koha/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl
11 - restart all
12 - rebuild: sudo koha-rebuild-zebra -v -f kohadev
13 - Repeat search
14 - Success!

Signed-off-by: David Nind <david@davidnind.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>

Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
2021-06-21 12:04:17 +02:00
bb83da9b1d Bug 18017: Use index_heading and index_match_heading in UNIMARC authorities zebra configuration
For a good management of autorities linking to biblio records,
MARC21 uses index_heading and index_match_heading in authorities zebra configuration.
UNIMARC configuration must use the same.

This patch adds in UNIMARC authorities zebra configuration index_heading and index_match_heading to earch heading
in order to be maximum close to MARC21 authorities zebra configuration.
See changes made in MARC21 :
32cf2af700

It fixes some indexes names : Personal-name-see => Personal-name-see-from

Removes useless Term-geographic index, a duplicate of Name-geographic.

Sometimes parallel 7xx form whas only on $a, it must contains same subfields
has the main heading.

Test plan :
===========
1.0) Use a UNIMARC install without patch
1.1) Set sysprefs
     BiblioAddsAuthorities = ON
     AutoCreateAuthorities = ON
     LinkerModule = First Match
1.2) Replace authorities zebra configuration files
     cp $KOHA_CLONE/etc/zebradb/marc_defs/unimarc/authorities/authority-koha-indexdefs.xml $KOHA_CONF_DIR/zebradb/marc_defs/unimarc/authorities/authority-koha-indexdefs.xml
     cp $KOHA_CLONE/etc/zebradb/marc_defs/unimarc/authorities/authority-zebra-indexdefs.xsl $KOHA_CONF_DIR/zebradb/marc_defs/unimarc/authorities/authority-zebra-indexdefs.xsl
1.3) Restart zebra server and indexer services
1.4) Reindex authorities
     ./misc/migration_tools/rebuild_zebra.pl -r -a -v
1.5) Search in Z3950 a record with complex heading (with subdivisions),
     for example ISBN 2877620115 "Facteurs culturels et sociaux de la santé en Afrique de l'Oues"
1.6) Import this record and save it : authorities are created
     go to staff:/cgi-bin/koha/cataloguing/addbooks.pl
1.7) Reimport the same record (when asked, say that it's not a duplicate)
1.8) The authority should have been duplicated :
     different url and different $9 value
2.0) Apply this patch
2.1) Replace again the authorities zebra configuration files
2.2) Restart zebra server and indexer services
2.3) Reindex authorities
2.4) Reimport the same record
2.5) The authority should have not been duplicated. Compare with both
       existing records to see which the 3rd has been matched against.
3.0) Play with authorities search to check every mode :
     Search main heading ($a only)
     Search main heading
     Search all headings
     Search entire record

Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Victor Grousset/tuxayo <victor@tuxayo.net>

Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
2021-04-16 15:12:45 +02:00
cb0f279390 Bug 18017: Add subdivisions to UNIMARC authorities zebra configuration
Like for MARC21, UNIMARC authorities has subdivisions form, general,
chronological and geographic.

In C4::Heading::UNIMARC, use subdivisions in _get_search_heading like in C4::Heading::MARC21.

Adds subdivisions variables into UNIMARC authorities zebra configuration.

Note that unlike MARC21 geographic is subfield $y and chronological is subfield $z.
See https://www.ifla.org/publications/unimarc-formats-and-related-documentation

Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Victor Grousset/tuxayo <victor@tuxayo.net>

Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
2021-04-16 15:12:45 +02:00
Katrin Fischer
6bba5a35c0
Bug 15142: (follow-up) Remove facet configuration from .xsl file
Removes last remaining bit of configuration for the Titles facet
configuration.

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
2020-01-10 16:16:49 +00:00
5a4ac1577b
Bug 15142: (follow-up) remove from Zebra UNIMARC config files
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
2020-01-10 16:16:41 +00:00
3599e57bf6
Bug 15704: Split up Zebra indexing of RDA 264 information
To test:
1 - Add a record with a unique publisher "Supercalifragilistic" in the
264 b field
2 - Search for the value
3 - Record not found
4 - Apply patch (may need ot copy the .xml file into koha install)
5 - Reindex all the things
6 - Search for the value
7 - Success!

Signed-off-by: Felicia Martin <felicia.martin@dncr.nh.gov>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
2019-06-24 14:53:18 +01:00
ef0c31f735
Bug 23053: [follow-up] Same changes for UNIMARC
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
2019-06-13 14:38:04 +01:00
2e8a6a4b09
Bug 23053: Add phrase index to authority Local-Number
To test:
 1 - Define a matching rule for authorities on field 001 index Local-Number
 2 - In koha-conf.xml raise the zebra_loglevels
     <zebra_loglevels>none,fatal,warn,request,info</zebra_loglevels>
 3 - Export some authorities using the tools->export data
 4 - Import those authorities
 5 - Note no matches found
 6 - View the zebra output log, you should see lots of error 114
 7 - Apply patch
 8 - Copy the indexdefs files to the installed versions
 9 - Reapply matchign rules to staged files
10 - Matches should now be found
11 - Logs should not have errors

Signed-off-by: Claire Gravely <claire.gravely@bsz-bw.de>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
2019-06-11 08:37:40 +01:00
cc87e0a458 Bug 14302: Remove GRS1 related files
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>

Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
2018-08-31 11:24:19 +00:00
14166c4a8c Bug 18322: Update xslt for NORMARC and UNIMARC
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>

Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
2018-07-18 17:42:49 +00:00
f37cac60e0 Bug 18322: (follow-up) Add generated xsl
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>

Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
2018-07-18 17:42:49 +00:00
17a354f5e8 Bug 18322: Add a facet for ccode fields to Zebra
This patch adds the index definitions for zebra faceting of ccode in
koha for marc21, normarc and unimarc.

We also add lines to the templates to expose the new facet and enable
non-zebra faceting for ccode too.

Signed-off-by: David Cook <dcook@prosentient.com.au>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>

Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
2018-07-18 17:42:48 +00:00
5ef1b6710e Bug 18098: Add an index with the count of not onloan items
This patch adds a numeric index 'not-onloan-count' containing the value
of 999$x. This subfield is filled by 'rebuild_zebra.pl' by making use of
(bug's 18208) 'EmbedItemsAvailability' filter.

bib1.att and indexes definitions are updated accordingly.

To test:
- Apply the patch
- Pick the right biblio-zebra-indexdefs.xsl file for your setup and
  replace the one your Zebra uses [1]
- Replace your bib1.att
- Replace your ccl.properties
- Have at least one record with more than one item, checkout some
  item(s) from that record(s).
- Rebuild zebra's indexes:
  $ sudo koha-shell kohadev
 k$ cd kohaclone
 k$ misc/migration_tools/rebuild_zebra.pl -r -b -v -k
 (notice the dump directory is kept, you can try the XSLT yourself
  running:
    $ xsltproc \
       etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl \
       /tmp/the_dump_dir/biblios/exported_records | less
=> SUCCESS: There are records with the not-onloan-count index, and the
            value is correct!
- Check Zebra yourself:
  $ yaz-client unix:/var/run/koha/kohadev/bibliosocket
 Z> base biblios
 Z> find @attr 1=9013 @attr 2=5 @attr 4=109 0
=> SUCCESS: The search matches the amount of records with not-onloan
            items.
 Z> s 1+1
=> SUCCESS: Records with 999$x having a value higher than 0 are rendered
- Sign off :-D

Note: While this work is complete on its purpose, it is part of an
attempt to create a better way of filtering by availability.

Sponsored-by: ByWater Solutions

 [1] In kohadevbox this would be
/etc/koha/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl

Edit: Added the missing XSLT changes for UNIMARC and NORMARC

Signed-off-by: Josef Moravec <josef.moravec@gmail.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
2017-05-08 09:21:41 -04:00
21d63c8fb0 Bug 17788: (MARC21) Add $9 fields to Koha-Auth-Number:w index
Looking at the default framework's fields that are linked to authority
records, there's a divergence with the Zebra index definitions.

This yields to authority usage count be incorrect for users searching
for authority records.

MariaDB [koha_kohadev]> SELECT tagfield,tagsubfield,authtypecode FROM
marc_subfield_structure WHERE authtypecode IS NOT NULL AND
authtypecode<>'' AND frameworkcode='' GROUP BY
tagfield,tagsubfield,authtypecode ;
+----------+-------------+--------------+
| tagfield | tagsubfield | authtypecode |
+----------+-------------+--------------+
| 100      | a           | PERSO_NAME   |
| 110      | a           | CORPO_NAME   |
| 111      | a           | MEETI_NAME   |
| 130      | a           | UNIF_TITLE   |
| 440      | a           | UNIF_TITLE   |
| 600      | a           | PERSO_NAME   |
| 610      | a           | CORPO_NAME   |
| 611      | a           | MEETI_NAME   |
| 630      | a           | UNIF_TITLE   |
| 648      | a           | CHRON_TERM   |
| 650      | a           | TOPIC_TERM   |
| 651      | a           | GEOGR_NAME   |
| 654      | a           | TOPIC_TERM   |
| 655      | a           | GENRE/FORM   |
| 656      | a           | TOPIC_TERM   |
| 657      | a           | TOPIC_TERM   |
| 658      | a           | TOPIC_TERM   |
| 662      | a           | GEOGR_NAME   |
| 690      | a           | TOPIC_TERM   |
| 691      | a           | GEOGR_NAME   |
| 696      | a           | PERSO_NAME   |
| 697      | a           | CORPO_NAME   |
| 698      | a           | MEETI_NAME   |
| 699      | a           | UNIF_TITLE   |
| 700      | a           | PERSO_NAME   |
| 710      | a           | CORPO_NAME   |
| 711      | a           | MEETI_NAME   |
| 730      | a           | UNIF_TITLE   |
| 796      | a           | PERSO_NAME   |
| 797      | a           | CORPO_NAME   |
| 798      | a           | MEETI_NAME   |
| 799      | a           | UNIF_TITLE   |
| 800      | a           | PERSO_NAME   |
| 810      | a           | CORPO_NAME   |
| 811      | a           | MEETI_NAME   |
| 830      | a           | UNIF_TITLE   |
| 896      | a           | PERSO_NAME   |
| 897      | a           | CORPO_NAME   |
| 898      | a           | MEETI_NAME   |
| 899      | a           | UNIF_TITLE   |
+----------+-------------+--------------+

This patch adds the missing ones to the authority number index as it is
done for the rest of the fields.

To test:
- Verify that
etc/zebradb/marc_defs/marc21/biblios/biblio-koha-indexdefs.xml
contains intries pointing the $9 subfield of all the fields in the
'tagfield' column above, to the Koha-Auth-Number:w index.
- Sign off :-D

Signed-off-by: Hugo Agud <hagud@orex.es>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
2017-02-14 13:58:23 +00:00
cb3f899826 Bug 17547: (MARC21|NORMARC) have Chronological term field $9 indexed
This patch makes Zebra index the 648$9 link for chronological terms on
bibliographic records. This way an authority search on chronological terms
will show the right number in 'Used in X records' message.

To test:
- Have a record with a 648 field, linked to an authority record (i.e. with an authid on 648$9).
- Search for the record, notice it is indexed.
- Perform an authority search for the chronological term
=> FAIL: the term is linked to our record, but koha shows '0' count.
- Apply the patch
- Run:
  $ cd kohaclone
  $ xsltproc etc/zebra/xsl/koha-indexdefs-to-zebra.xsl \
       etc/zebradb/marc_defs/marc21/biblios/biblio-koha-indexdefs.xml \
     > etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl
  $ git diff
=> SUCCESS: Notice the shipped etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl
   is up-to-date
- Run:
  $ sudo cp etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl \
            /etc/koha/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl
  $ sudo koha-restart-zebra kohadev
  $ sudo koha-rebuild-zebra -f -b -v kohadev
- Search for the record, notice it is indexed.
- Perform an authority search for the chronological term
=> SUCCESS: the term is linked to our record, usage count is 1
- Sign off :-D

I assume NORMARC is similar on this regard. Feel free to fail it if the NORMARC part of the
patch is wrong.

Sponsored-by: Universidad Nacional de Cordoba

Signed-off-by: Hugo Agud <hagud@orex.es>

Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
2016-12-16 11:27:18 +00:00
5b4259be9c Bug 6499: [QA Follow-up] Trivial adjustments
Removes commented line from bib1.att.
Adjust OCLC-number to Other-control-number in comment of ccl properties.
No need to explicitly add 035$a and $z if you index 035 completely in
record.abs as well as biblio-koha-indexdefs.xml.
Rerun koha-indexdefs-to-zebra.xsl on index defs.

Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
2016-08-09 10:13:11 +00:00
Barton Chittenden
84f51549c9 Bug 6499: Add Zebra index "Other-control-number" covering MARC21 035$a, 035$z and 035 (entire tag)
1) Apply patch
2) Make sure that you have a bib that has MARC21 035$a (and possibly also 035$z) populated.

pre 3) Replace all modified zebra files and restart zebra server

3) Rebuild zebra: misc/migration_tools/rebuild_zebra.pl -x -b -z
4) Add the following to the intranetuserjs syspref:

$(document).ready(function(){
    // Add Other Control Number to advanced search
    if (window.location.href.indexOf("catalogue/search.pl") > -1) {
        $(".advsearch").append('<option value="Other-control-number">Other Control Number</option>');
    }
});

5) Do an advanced search, select "Other Control Number" from the search menu, then add the Other Control Number in 035$a for the bib specified in step 1.

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Works, no koha-qa errors

Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
2016-08-09 10:13:10 +00:00
ff6523765d Bug 15555: Index 024$a into Identifier-other:u url register when source $2 is uri
This patch indexes 024$a into the "phrase" index type, and the "url" index type,
if the 024$2 equals "uri".

TEST PLAN

1) Apply the patch.
1b) If you're using a gitified Koha or a git install,
you'll need to upgrade your instance or copy your zebradb files
over to /etc/koha/zebradb or your "kohadev" directory.
2) Add a 024$a with a URL like http://libris.kb.se/resource/bib/219553
 to a bibliographic record
3) Re-index Zebra
4) Type "id-other,st-urx,fuzzy=http://libris.kb.se/resource/bib/219553"
 into the "Search the catalog" box in the Staff Client and search
5) Note that you retrieve your record

NOTE: The fuzzy is required because Koha's query "parsing" functions change
http:// to http=// which won't correctly match the value in the "Identifier-other:u" index.
NOTE: Alternatively, you could do the following search instead:
"id-other,phr=http libris kb se resource bib 219553".
 It would work as well by using the "Identifier-other:p" index.

Advanced tester version:
4) In a terminal window, find the "koha-conf.xml" file in your "etc" directory.
5) Open "koha-conf.xml" and find <listen id="biblioserver">.
Copy the URI you find there. (e.g. unix:/home/dcook/koha-dev/var/run/zebradb/bibliosocket).
6) Type "yaz-client unix:/home/dcook/koha-dev/var/run/zebradb/bibliosocket"
7) After it connects, type "base biblios" and press enter
8) Type "format xml" and press enter
9) Type "elements zebra::index" and press enter
10) Type "f id-other,st-urx=http://libris.kb.se/resource/bib/219553" and press enter
11) Note that you should have at least one result
12) Type "show 1"
13) If you scroll through the results, you should find something like the following:

<index name="Identifier-other" type="w" seq="28">@^</index>
<index name="Identifier-other" type="w" seq="1"></index>
<index name="Identifier-other" type="w" seq="29">http</index>
<index name="Identifier-other" type="w" seq="30">libris</index>
<index name="Identifier-other" type="w" seq="31">kb</index>
<index name="Identifier-other" type="w" seq="32">se</index>
<index name="Identifier-other" type="w" seq="33">resource</index>
<index name="Identifier-other" type="w" seq="34">bib</index>
<index name="Identifier-other" type="w" seq="35">219553</index>
<index name="Identifier-other" type="p" seq="28">http libris kb se resource bib 219553</index>
<index name="Identifier-other" type="u" seq="36">http://libris.kb.se/resource/bib/219553</index>

Signed-off-by: Hector Castro <hector.hecaxmmx@gmail.com>
Works as advertised the record is retrieved

Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Some of the statements in the commit message do not work for me.
A search like "id-other,phr=http libris kb se resource bib 219553" does not
have results. Searching for "id-other,phr=libris.kb.se resource" does.
The steps in the advanced tester version do not work for me too.
I verified the following in yaz-client:
[1] Z> f @attr 1=9012 @attr 4=104 http://libris.kb.se/resource/bib/219553
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 1, setno 16
[2] First removed $2 and reindexed. Then searched again:
Z> f @attr 1=9012 @attr 4=104 http://libris.kb.se/resource/bib/219553
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 0, setno 1

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
2016-04-29 13:19:28 +00:00
Barton Chittenden
2a8c68936d Bug 14277: (QA followup) Silent GRS-1 tests
changed ocurrences of 'lex' to 'lexile-number' in record.abs

Edits were made to the deprecated file record.abs *solely* to quiet
warnings in tests -- this makes sense until GRS-1 code is removed
from Koha.

Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>

Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
2016-04-07 10:35:18 -06:00
Barton Chittenden
690ab60da2 Bug 14277: add zebra indexes for lexile that respect 521 indicator 1.
Added the following indexes:

Interest-age-level | 591$a ind1=1
Interest-grade-level | 591$a ind1=2
lexile-number | 591$a ind1=8
Reading-grade-level | 591$a ind1=0

Moved 'lex' from a zebra index to a ccl alias to lexile-number.

Changed the handling of st-numeric in C4/Search.pm to allow for search ranges.

Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Hector Castro <hector.hecaxmmx@gmail.com>
Works as advertised
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>

Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
2016-04-07 10:35:18 -06:00
Zeno Tajoli
cc878aee77 Bug 14332: Skip articles in MARC21 using indicator n.2 of field 245
Only in MARC21 is possible to use ind2 of tag 245 to skip articles.
This patch is based on inserting a special template in
koha-indexdefs-to-zebra.xsl With this patch you must not insert index
Title:s in biblio-koha-indexdefs.xml, it is defined in
koha-indexdefs-to-zebra.xsl.  It is not the best setup, but I find very
difficult  to use  biblio-koha-indexdefs.xml.

To test it in a english MARC21 setup:

Insert same records with titles and correct values in ind2 of 245.
If you have articles not in the skiping list of sort-string-utf.chr (The|the|a|A|an|An)
you can see that the sort by articles use also articles.

Insert the patch
Rebuilt indexes from scratch

Now all articles of titles are skipped

TO TEST WITHOUT INDEXING:

1. Go to etc/zebradb/marc_defs/marc21/biblios directory.

2. Put the sample MARCXML file in this directory.

3. Transform the file into Zebra indexes:
   xsltproc biblio-zebra-indexdefs.xsl record.xml
   Observe that the Title:s index contains:
   01 Business and Technologies

4. Apply the patch.

5. Repeat:
   xsltproc biblio-zebra-indexdefs.xsl record.xml
   Observe that the Title:s index contains:
   Business and Technologies

Signed-off-by: Frederic Demians <f.demians@tamil.fr>

Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Verified working using yaz-client (as in
http://wiki.koha-community.org/wiki/Understanding_Zebra_indexing#Examine_Zebra_index,
though note that the `elem zebra::index` seems to be unneeded).

Signed-off-by: Brendan A Gallagher <brendan@bywatersolutions.com>
2016-01-27 06:17:16 +00:00
Hector Eduardo Castro Avalos
de272e5b41 Bug 14198: RDA: Indexing 264 field (Zebra)
This patch add zebra indexes to RDA 264 field.
The new Provider index is added too.
QA comments corrected.

To test:
1) Download RDA records with 264 fields from this attachment <http://bugs.koha-community.org/bugzilla3/attachment.cgi?id=36825>. Import the file and re-index/rebuild zebra. These records contain 260 and 264 fields per record.
2) Do a search with pb:Bethany two records will appear with title The guardian. Search with pl:Minneapolis too, the two records will appear.
3) Select one record of both records and delete the 260 field keeping the 264 field and save, rebuild your zebra.
4) Search again with pb:Bethany and just one record will appear. Thats mean 264 is not indexed.
5) Apply patches.
6) Rebuild your zebra but this time all biblio records.
7) Search again with pv:Bethany or Provider:Bethany, this time will appear the two records, 264 is indexed. Note that if you search again with pb only one record appear. This is because the suggestion of LOC.
10) Search with copydate:2013 only launch records with 260 fields and pv:2013 show both fields, i.e., 260 and 264.
11) Apply QA Test Tools

Sponsored-by: Universidad de El Salvador
Signed-off-by: Nick Clemens <nick@quecheelibrary.org>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2015-11-02 11:41:36 -03:00
aaf3ff3fec Bug 14154: 608$9 defined twice in UNIMARC biblio-koha-indexdefs.xml
In DOM config file :
etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml, the 608$9 is
defined a second time instead of 610$9.  Just a type I think.

Test plan :
- Apply patch
- Install a UNIMARC + DOM instance
- Define in a framework 610 using a thesaurus
- Create a new biblio
- Create a new authority (same type as the thesaurus defined above)
- Index : rebuild_zebra.pl -a -b -x -z
- Link the field 610 to the new authority
- Index : rebuild_zebra.pl -a -b -x -z
- In authorities search, search for the new authority
=> You see Use in 1 Records(s)

Signed-off-by: Frederic Demians <f.demians@tamil.fr>
  I confirm the typo.

Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2015-10-21 13:43:12 -03:00
Barton Chittenden
cab481dbb2 Bug 14617: Add fields to ISBN and ISSN indexes: 020$z, 022$y, 022$z
1) Import MARC21 bibs containing

- ISBN in 020$z
- ISSN in 022$y
- ISSN in 022$z

2) Make sure that bibs are indexed

3) Search by ISBN and ISSN above -- bibs should not show in search.

4) Apply patch, re-index

5) Search again; ISBN in 020$z and ISSN in 022$y and 022$z should return
results.

Signed-off-by: kholten@switchinc.org
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2015-10-19 10:12:04 -03:00
Katrin Fischer
e799e1cbc3 Bug 11620: Add dissertation-information index for MARC21 (502)
Bug 11202 introduced a new index 'dissertation-information' for
UNIMARC. This patch adds the index also for MARC21 installations.

http://www.loc.gov/marc/bibliographic/bd502.html

To test:
- Apply patch
- Copy files in etc/zebradb changed by this patch to your
  corresponding directory (koha-dev..)
- Make sure you have records with 502
- Reindex
- Verify you can search the field contents with
  dissertation-information= and
  diss=

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Can find by dissertation-information,
No errors

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>
2015-07-20 10:31:06 -03:00
Mirko Tietgen
fbe25b1d8e Bug 14453: (followup) Fix shipped XSLT files
Make the shipped XSLTs for authorities (MARC21 and UNIMARC) the same as the generated version

Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2015-07-08 14:39:04 -03:00
2365537eea Bug 14453: kohaidx is missing for id in authority-koha-indexdefs.xml
In authority-koha-indexdefs.xml, all tags use the namespace "kohaidx" except the tag "id".

When re-generating authority-zebra-indexdefs.xsl, the line :
  <xslo:variable name="idfield" select="normalize-space(marc:controlfield[@tag='001'])"/>
is modified :
  <xslo:variable name="idfield" select="normalize-space()"/>
This is an error.

This patch adds kohaidx namespace to correct.

Test plan :
- Without patch
- go to etc/zebradb/marc_defs/marc21/authorities/
- run : xslproc xsltproc ../../../xsl/koha-indexdefs-to-zebra.xsl authority-koha-indexdefs.xml > authority-zebra-indexdefs.xsl
- read authority-zebra-indexdefs.xsl
=> the line has changed : <xslo:variable name="idfield" select="normalize-space()"/>
- Apply patch
- go to etc/zebradb/marc_defs/marc21/authorities/
- run : xslproc xsltproc ../../../xsl/koha-indexdefs-to-zebra.xsl authority-koha-indexdefs.xml > authority-zebra-indexdefs.xsl
- read authority-zebra-indexdefs.xsl
=> the line has not changed
(same for unimarc flavor)

Signed-off-by: Mirko Tietgen <mirko@abunchofthings.net>
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>
As Mirko mentioned, the xslt's now generate the facet-processing templates in
the authority xslt's too. They are harmless because we don't define facets
for authority records. If we did, it would be harmless too.

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2015-07-08 14:39:04 -03:00
Stefan Weil
f6aec46dda Bug 14383: etc/zebradb: Fix some typos in documentation and Bib-1 attribute set
All of them were found and fixed using codespell.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>

Signed-off-by: Jonathan Druart <jonathan.druart@koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2015-06-22 17:34:46 -03:00
Jonathan Druart
b5e9691060 Bug 8992: Add 7..$3 to the Indentifier-standard index
Signed-off-by: valerie bertrand <valerie.bertrand@univ-lyon3.fr>

Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
2015-04-28 15:47:40 -03:00
Zeno Tajoli
c29a53ea20 Bug 12948: Use word indexing for language (MARC21)
This patch is for MARC21. To test:
1)Setup a site with
 MARC21
2)Insert 2 record, one lang A in 041 and 008 pos
 35-37 an other with lang A in 041 and lang B in 008 pos
 35-37
3)Index them
4)Search in advanced search with filter
 'languare' for lan A. You will see 2 records
5)Search in
 advanced search with filter 'languare' for lan B. You will
 see 0 records
6)Apply the patch
7)Full reindex
8)Search in advanced search
 with filter 'languare' for lan B. You will see 1 records

http://bugs.koha-community.org/show_bug.cgi?id=12948

Signed-off-by: Magnus Enger <magnus@enger.priv.no>
I have *not* actually tested this, but the changes are identical to the ones
done for NORMARC, which I have tested, so I think it is safe to sign off. If
anyone disagrees, please reset the bug to "Needs signoff".

Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2015-02-20 11:51:59 -03:00
Zeno Tajoli
bf89e306a8 Bug 12948: Use word indexing for language (NORMARC)
This patch is for Normarc
Same test plan as patch for MARC21, except you need a setup with Normarc.

http://bugs.koha-community.org/show_bug.cgi?id=12948
Signed-off-by: Magnus Enger <magnus@enger.priv.no>

- Added a record with "bul" in 008pos35-37
- Verified that this did not turn up in an advanced search with language =
  Bulgarian
- Applied the patch
- I was testing on a gitified install, so I had to copy the patched index file
  to the right location with this command:

sudo cp etc/zebradb/marc_defs/normarc/biblios/biblio-zebra-indexdefs.xsl \
/etc/koha/zebradb/marc_defs/normarc/biblios/biblio-zebra-indexdefs.xsl

- Did a full reindex
- Verified that the record *did* turn up in an advanced search with language =
  Bulgarian
- Signing off! Thanks Zeno!

Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2015-02-20 11:51:50 -03:00
532b41934c Bug 13157: (QA followup) homebranch is 995$b on UNIMARC frameworks
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
2014-11-25 15:27:12 -03:00
9ebb6ba5d1 Bug 13157: UNIMARC holdingbranch facet is 995$c not 995$b
Fix a typo. Not test plan required, just a look at default UNIMARC framework.

Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
2014-11-25 15:27:05 -03:00
d4a7fa8580 Bug 13163: NORMARC DOM config missing <id> entry
This patch fixes the biblio-koha-indexdefs.xml for NORMARC, so
it includes the <id> element.

Because of how our DOM files work, the resulting biblio-zebra-indexdefs.xsl
for NORMARC picked the whole MARC record as ID, so every time the record
was edited, the id wouldn't match and a new record was created.

To test:
- Have a MARCXML record
- run:
  $ xsltproc etc/zebradb/marc_defs/normarc/biblios/biblio-zebra-indexdefs.xsl the_record | less
=> FAIL: verify the z:id property on the <z:record> line contains all subfields concatenated
- Apply the patch
- re-run the xsltproc line
=> SUCCESS: z:id contains the 999$c number
- Sign off :-D

Regards

Signed-off-by: Frederic Demians <f.demians@tamil.fr>

Known bug with DOM: Without <z:id> indexing biblionumber Zebra hasn't it record
unique ID, and so fails to identify existing records. Works as described. 999$c
is linked to biblionumber in default Normarc framework.

Signed-off-by: Magnus Enger <magnus@enger.priv.no>

I have applied the patch to my production server, and at least one customer has
confirmed that it fixes the problem with multiple copies of records in search
results.

Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Passes tests and QA script, fix matches what we have for the other MARC flavours.

Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-31 16:45:04 -03:00
c217b2c418 Revert "Bug 9828: More specific indexing of UNIMARC 6XX fields"
This reverts commit 0dd1ac40a0.
2014-10-28 12:02:34 -03:00
e43f012af6 Revert "ug 9828 : Add and fix comments in UNIMARC biblio-koha-indexdefs.xml"
This reverts commit 5bbe42932e.
2014-10-28 12:02:22 -03:00
b108a111f6 Revert "Bug 9828 : Followup for Queryparser and deletion of useless 6XX$9"
This reverts commit 49788987b2.
2014-10-28 12:02:09 -03:00
Mathieu Saby
49788987b2 Bug 9828 : Followup for Queryparser and deletion of useless 6XX$9
This followup
- changes some indexes in Queryparser configuration file
- supresses some clearly useless 6XX$9 in biblio-koha-indexdefs.xml and adds 2 new ones, probably useless (not sure of that)
- change the name of index Subject-geographical to Subject-name-geographical in ccl.properties (to match bib1.att)
the xsl file zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl was generated with the following command:
xsltproc zebradb/xsl/koha-indexdefs-to-zebra.xsl zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml > zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl

To test :
1) Apply the 3 patches
2) copy the modified files from the source directory to the directory where you store the config files for Zebra and Queryparser
The files modified by the 3 patches and that need to be copied are:
etc/zebradb/biblios/etc/bib1.att
etc/zebradb/ccl.properties
etc/searchengine/queryparser.yaml
etc/zebradb/ccl.properties
.../unimarc/biblios/biblio-koha-indexdefs.xml
.../unimarc/biblios/biblio-zebra-indexdefs.xsl
3) Rebuild Zebra
4) Create a record A with some values in critical fields, for example:
- the string "test9828" in 600$c 600$f 600$p, 602$f, 616$c, 616$f, 606$2,600$2
- the string "subform" in 600$j
4) Create a record B with the string "subgeo" in 606$y
5) Create a record C with the string "subdate" in 606$z
WITHOUT QP activated in sysprefs ("Don't try to use QP"):
6) try to search "su:test9828". You should have no results
7) try to search "su-genre:subform". You should have 1 result : record A
8) try to search "su-geo:subgeo". You should have 1 result : record B
9) try to search "su-chrono:subdate". You should have 1 result : record C
10) on existing records, try su-ut, su-to, su-na, su-form, su-corp, su-geo indexes, and see it results are relevant
WITH QP activated in sysprefs:
Same tests

Signed-off-by: Nick Clemens <nick@quecheelibrary.org>

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-27 12:46:47 -03:00
Mathieu Saby
5bbe42932e ug 9828 : Add and fix comments in UNIMARC biblio-koha-indexdefs.xml
Only cosmetic :
- the references to lines record.abs are now useless and outdated
- some comments added in record.abs could be usefull in biblio-koha-indexdefs.xml

No change expected, only comments

Signed-off-by: Nick Clemens <nick@quecheelibrary.org>

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-27 12:46:44 -03:00
Mathieu Saby
0dd1ac40a0 Bug 9828: More specific indexing of UNIMARC 6XX fields
[New commit on 18 Aug 2014 : rebased, and DOM indexing only]

Issues to fix :
Most of 6XX may contain a $2 that identifies the system used for indexing. It should not be indexed.
In French libraries, $2 contains "rameau". So searching books about the music composer "Rameau" retreive thousands of records!
For some 6XX fiels, other subfields should not be indexed, for example dates of persons and family, or adresses.
In Unimarc guide, 600$t,601$t,602$t are said to exist but to be "not used". I keep them indexed.

Additionnally, subject indexing could be improved by using specific indexes for each 6XX if possible :
In ccl.properties :
- su-to, su-geo and su-ut are defined as aliases of Subject.
- a specific index is defined, but not used in record.abs : Subject-name-personal, alias su-na
We can use these indexes and create new specific indexes by using existing bib1 attributes.

We could also index $j,$x,$y,$z subdivision in specific indexes.

This patch does the following changes :
1) For all 6XX : Not indexing $2 (LSCH, Rameau...), $3 and $5
2) Suppressing the indexing of some specific subfields, depending on the field:
600 : Personal name used as a subject // see Marc21 600
not indexing c (additional elements),f (dates),p (address/affiliation)
602 : Family name used as a subject // see Marc21 600 3X
not indexing f (dates)
616 : Trademark
not indexing c,f
3) For all 6XX : index $j,$x,$y,$z in several indexes in addition to the specfific index for their 6XX field:
4) Define in ccl.properties some specific indexes :
Subject-name-conference 1=1073 => alias su-conf
Subject-name-corporate 1=1074 => alias su-corp
Subject-genre-form 1=1075 => alias su-genre and su-form
Subject-geographical 1=1076 => alias su-geo
Subject-chronological 1=1077 => alias su-chrono
Subject-title 1=1078 => alias su-ut and su-ti
Subject-topical 1=1079 => alias su-to
5) Adding new aliases in Search.pm :
su-chrono, su-form, su-genre, su-corp, su-conf, su-ti
6) Using these new indexes in for
600 : Subject and Subject-Personal-Name ; all subfields except subdivisions in Personal-name
601 : Subject, Subject-name-conference and Subject-name-corporate and Subject-name-conf ; all subfields except subdivisions in Corporate-name and Conference-name
602 : same as 600 but could be improved later
604 : Subject and Subject-title ; $a in Subject-Personal-Name ; all subfields except subdivisions in Name-and-Title
605 : Subject and Subject-title
606 : Subject and Subject-topical
607 : Subject and Subject-geographical ; all subfields except subdivisions in Name-geographic
608 : Subject and Subject-genre-form

To test :

A. In a UNIMARC-DOM indexing environment
1) Apply the patch
2) Rebuild zebra
3) Create a record A with some values in critical fields, for example:
- the string "test9828" in 600$c 600$f 600$p, 602$f, 616$c, 616$f, 606$2,600$2
- the string "subform" in 600$j
4) Create a record B with the string "subgeo" in 606$y
5) Create a record C with the string "subdate" in 606$z
6) try to search "su:test9828". You should have no results
7) try to search "su-genre:subform". You should have 1 result : record A
8) try to search "su-geo:subgeo". You should have 1 result : record B
9) try to search "su-chrono:subdate". You should have 1 result : record C
10) on existing records, try su-ut, su-to, su-na, su-form, su-corp, su-geo indexes, and see it results are relevant

Indexing of subjects could maybe be improved later

Signed-off-by: Nick Clemens <nick@quecheelibrary.org>

All seems to work as expected, I am not super-familiar with UNIMARC but I wonder if in su-corp and su-conf the subdivisions might be useful (e.g. France-Gendarmie / Staatsbibliothek-Berlin)

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-27 12:46:42 -03:00
Jonathan Druart
b3acefc319 Bug 11586: Better default framework for UNIMARC - zebra conf
This patch updates the Zebra configuration for unimarc.

995$d and 995$j should not be indexed.

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-23 10:52:03 -03:00