Commit graph

157 commits

Author SHA1 Message Date
cb3f899826 Bug 17547: (MARC21|NORMARC) have Chronological term field $9 indexed
This patch makes Zebra index the 648$9 link for chronological terms on
bibliographic records. This way an authority search on chronological terms
will show the right number in 'Used in X records' message.

To test:
- Have a record with a 648 field, linked to an authority record (i.e. with an authid on 648$9).
- Search for the record, notice it is indexed.
- Perform an authority search for the chronological term
=> FAIL: the term is linked to our record, but koha shows '0' count.
- Apply the patch
- Run:
  $ cd kohaclone
  $ xsltproc etc/zebra/xsl/koha-indexdefs-to-zebra.xsl \
       etc/zebradb/marc_defs/marc21/biblios/biblio-koha-indexdefs.xml \
     > etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl
  $ git diff
=> SUCCESS: Notice the shipped etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl
   is up-to-date
- Run:
  $ sudo cp etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl \
            /etc/koha/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl
  $ sudo koha-restart-zebra kohadev
  $ sudo koha-rebuild-zebra -f -b -v kohadev
- Search for the record, notice it is indexed.
- Perform an authority search for the chronological term
=> SUCCESS: the term is linked to our record, usage count is 1
- Sign off :-D
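
For readers who want to see which value the new index definition picks up, here is a minimal sketch that reads the $9 link from a 648 field of a MARCXML record. The function name and the sample record are hypothetical; the real extraction is done by the generated biblio-zebra-indexdefs.xsl, not by Python.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML ("slim") namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def linked_authids(marcxml: str, tag: str = "648") -> list[str]:
    """Return every $9 value found on the given tag of a MARCXML record."""
    record = ET.fromstring(marcxml)
    xpath = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='9']"
    return [sf.text.strip() for sf in record.findall(xpath, MARC_NS) if sf.text]


# Hypothetical record: a chronological term linked to authority 42.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="648" ind1=" " ind2="7">
    <subfield code="a">1900-1999</subfield>
    <subfield code="9">42</subfield>
  </datafield>
</record>"""

print(linked_authids(SAMPLE))
```

If the list is empty for a record you expect to be linked, the authority usage count cannot include that record, whatever the index configuration is.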

I assume NORMARC is similar in this regard. Feel free to fail it if the NORMARC part of the
patch is wrong.

Sponsored-by: Universidad Nacional de Cordoba

Signed-off-by: Hugo Agud <hagud@orex.es>

Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
2016-12-16 11:27:18 +00:00
5b4259be9c Bug 6499: [QA Follow-up] Trivial adjustments
Removes commented line from bib1.att.
Adjust OCLC-number to Other-control-number in comment of ccl properties.
No need to explicitly add 035$a and $z if you index 035 completely in
record.abs as well as biblio-koha-indexdefs.xml.
Rerun koha-indexdefs-to-zebra.xsl on index defs.

Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
2016-08-09 10:13:11 +00:00
Barton Chittenden
84f51549c9 Bug 6499: Add Zebra index "Other-control-number" covering MARC21 035$a, 035$z and 035 (entire tag)
1) Apply patch
2) Make sure that you have a bib that has MARC21 035$a (and possibly also 035$z) populated.

pre 3) Replace all modified zebra files and restart zebra server

3) Rebuild zebra: misc/migration_tools/rebuild_zebra.pl -x -b -z
4) Add the following to the intranetuserjs syspref:

$(document).ready(function(){
    // Add Other Control Number to advanced search
    if (window.location.href.indexOf("catalogue/search.pl") > -1) {
        $(".advsearch").append('<option value="Other-control-number">Other Control Number</option>');
    }
});

5) Do an advanced search, select "Other Control Number" from the search menu, then add the Other Control Number in 035$a for the bib specified in step 1.

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Works, no koha-qa errors

Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
2016-08-09 10:13:10 +00:00
ff6523765d Bug 15555: Index 024$a into Identifier-other:u url register when source $2 is uri
This patch indexes 024$a into the "phrase" index type, and the "url" index type,
if the 024$2 equals "uri".
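
The condition described above can be sketched as follows. This only models the entries the patch is said to add; the function name and the tuple layout are assumptions, and Zebra's own normalisation of phrase values is not reproduced.

```python
def index_024(subfields: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return (index, register, value) entries added for one 024 field.

    Entries are produced only when $2 is exactly "uri", as the commit
    message describes. "p" is the phrase register, "u" the url register.
    """
    value = subfields.get("a")
    if not value or subfields.get("2") != "uri":
        return []
    return [
        ("Identifier-other", "p", value),
        ("Identifier-other", "u", value),
    ]


url = "http://libris.kb.se/resource/bib/219553"
print(index_024({"a": url, "2": "uri"}))
print(index_024({"a": url}))
```

The second call returns nothing, which matches the later yaz-client check where removing $2 and reindexing made the url search return 0 hits.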

TEST PLAN

1) Apply the patch.
1b) If you're using a gitified Koha or a git install,
you'll need to upgrade your instance or copy your zebradb files
over to /etc/koha/zebradb or your "kohadev" directory.
2) Add a 024$a with a URL like http://libris.kb.se/resource/bib/219553
 to a bibliographic record
3) Re-index Zebra
4) Type "id-other,st-urx,fuzzy=http://libris.kb.se/resource/bib/219553"
 into the "Search the catalog" box in the Staff Client and search
5) Note that you retrieve your record

NOTE: The fuzzy is required because Koha's query "parsing" functions change
http:// to http=// which won't correctly match the value in the "Identifier-other:u" index.
NOTE: Alternatively, you could do the following search instead:
"id-other,phr=http libris kb se resource bib 219553".
 It would work as well by using the "Identifier-other:p" index.

Advanced tester version:
4) In a terminal window, find the "koha-conf.xml" file in your "etc" directory.
5) Open "koha-conf.xml" and find <listen id="biblioserver">.
Copy the URI you find there. (e.g. unix:/home/dcook/koha-dev/var/run/zebradb/bibliosocket).
6) Type "yaz-client unix:/home/dcook/koha-dev/var/run/zebradb/bibliosocket"
7) After it connects, type "base biblios" and press enter
8) Type "format xml" and press enter
9) Type "elements zebra::index" and press enter
10) Type "f id-other,st-urx=http://libris.kb.se/resource/bib/219553" and press enter
11) Note that you should have at least one result
12) Type "show 1"
13) If you scroll through the results, you should find something like the following:

<index name="Identifier-other" type="w" seq="28">@^</index>
<index name="Identifier-other" type="w" seq="1"></index>
<index name="Identifier-other" type="w" seq="29">http</index>
<index name="Identifier-other" type="w" seq="30">libris</index>
<index name="Identifier-other" type="w" seq="31">kb</index>
<index name="Identifier-other" type="w" seq="32">se</index>
<index name="Identifier-other" type="w" seq="33">resource</index>
<index name="Identifier-other" type="w" seq="34">bib</index>
<index name="Identifier-other" type="w" seq="35">219553</index>
<index name="Identifier-other" type="p" seq="28">http libris kb se resource bib 219553</index>
<index name="Identifier-other" type="u" seq="36">http://libris.kb.se/resource/bib/219553</index>

Signed-off-by: Hector Castro <hector.hecaxmmx@gmail.com>
Works as advertised the record is retrieved

Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Some of the statements in the commit message do not work for me.
A search like "id-other,phr=http libris kb se resource bib 219553" does not
have results. Searching for "id-other,phr=libris.kb.se resource" does.
The steps in the advanced tester version do not work for me either.
I verified the following in yaz-client:
[1] Z> f @attr 1=9012 @attr 4=104 http://libris.kb.se/resource/bib/219553
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 1, setno 16
[2] First removed $2 and reindexed. Then searched again:
Z> f @attr 1=9012 @attr 4=104 http://libris.kb.se/resource/bib/219553
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 0, setno 1

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
2016-04-29 13:19:28 +00:00
Barton Chittenden
2a8c68936d Bug 14277: (QA followup) Silent GRS-1 tests
Changed occurrences of 'lex' to 'lexile-number' in record.abs

Edits were made to the deprecated file record.abs *solely* to quiet
warnings in tests -- this makes sense until GRS-1 code is removed
from Koha.

Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>

Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
2016-04-07 10:35:18 -06:00
Barton Chittenden
690ab60da2 Bug 14277: add zebra indexes for lexile that respect 521 indicator 1.
Added the following indexes:

Interest-age-level | 521$a ind1=1
Interest-grade-level | 521$a ind1=2
lexile-number | 521$a ind1=8
Reading-grade-level | 521$a ind1=0

Moved 'lex' from a zebra index to a ccl alias to lexile-number.

Changed the handling of st-numeric in C4/Search.pm to allow for search ranges.

Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Hector Castro <hector.hecaxmmx@gmail.com>
Works as advertised
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>

Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
2016-04-07 10:35:18 -06:00
Zeno Tajoli
cc878aee77 Bug 14332: Skip articles in MARC21 using indicator n.2 of field 245
In MARC21 only, ind2 of tag 245 can be used to skip articles.
This patch is based on inserting a special template in
koha-indexdefs-to-zebra.xsl. With this patch you must not define the index
Title:s in biblio-koha-indexdefs.xml; it is defined in
koha-indexdefs-to-zebra.xsl. It is not the best setup, but I find it very
difficult to use biblio-koha-indexdefs.xml for this.

To test it in an English MARC21 setup:

Insert some records with titles and correct values in ind2 of 245.
If you have articles not in the skipping list of sort-string-utf.chr (The|the|a|A|an|An),
you can see that sorting by title also takes those articles into account.

Apply the patch
Rebuild indexes from scratch

Now all articles of titles are skipped

TO TEST WITHOUT INDEXING:

1. Go to etc/zebradb/marc_defs/marc21/biblios directory.

2. Put the sample MARCXML file in this directory.

3. Transform the file into Zebra indexes:
   xsltproc biblio-zebra-indexdefs.xsl record.xml
   Observe that the Title:s index contains:
   01 Business and Technologies

4. Apply the patch.

5. Repeat:
   xsltproc biblio-zebra-indexdefs.xsl record.xml
   Observe that the Title:s index contains:
   Business and Technologies
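
The idea behind the 245 second indicator can be shown in a few lines. This is a sketch of the general MARC21 non-filing convention, not the XSLT template from the patch; the function name is made up for the example.

```python
def sort_title(title: str, ind2: str) -> str:
    """Drop the number of leading characters given by 245 ind2.

    A blank or non-numeric indicator means nothing is skipped.
    """
    skip = int(ind2) if ind2.isdigit() else 0
    return title[skip:]


print(sort_title("The Business and Technologies", "4"))
print(sort_title("L'arbre", "2"))
print(sort_title("Business and Technologies", " "))
```

Because the count comes from the record itself, this works for articles in any language, which a fixed list such as (The|the|a|A|an|An) cannot do.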

Signed-off-by: Frederic Demians <f.demians@tamil.fr>

Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Verified working using yaz-client (as in
http://wiki.koha-community.org/wiki/Understanding_Zebra_indexing#Examine_Zebra_index,
though note that the `elem zebra::index` seems to be unneeded).

Signed-off-by: Brendan A Gallagher <brendan@bywatersolutions.com>
2016-01-27 06:17:16 +00:00
Hector Eduardo Castro Avalos
de272e5b41 Bug 14198: RDA: Indexing 264 field (Zebra)
This patch adds Zebra indexes for the RDA 264 field.
The new Provider index is added too.
QA comments corrected.

To test:
1) Download RDA records with 264 fields from this attachment <http://bugs.koha-community.org/bugzilla3/attachment.cgi?id=36825>. Import the file and re-index/rebuild zebra. These records contain 260 and 264 fields per record.
2) Search with pb:Bethany; two records titled The guardian will appear. Search with pl:Minneapolis too; the same two records will appear.
3) Pick one of the two records, delete its 260 field while keeping the 264 field, save, and rebuild Zebra.
4) Search again with pb:Bethany; only one record will appear. That means 264 is not indexed.
5) Apply patches.
6) Rebuild Zebra, this time for all biblio records.
7) Search again with pv:Bethany or Provider:Bethany; this time both records will appear, so 264 is indexed. Note that if you search again with pb, only one record appears. This is because of the LOC suggestion.
8) Search with copydate:2013; it only returns records with 260 fields, while pv:2013 matches both fields, i.e., 260 and 264.
9) Apply QA Test Tools

Sponsored-by: Universidad de El Salvador
Signed-off-by: Nick Clemens <nick@quecheelibrary.org>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2015-11-02 11:41:36 -03:00
aaf3ff3fec Bug 14154: 608$9 defined twice in UNIMARC biblio-koha-indexdefs.xml
In DOM config file :
etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml, the 608$9 is
defined a second time instead of 610$9.  Just a typo, I think.
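
A duplicate like this can be spotted mechanically. The sketch below counts (tag, subfields) pairs on index_subfields elements; the sample XML is a simplified, hypothetical stand-in for biblio-koha-indexdefs.xml, and it assumes each pair should be defined only once.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_definitions(indexdefs_xml: str) -> list[tuple[str, str]]:
    """Return (tag, subfields) pairs defined more than once."""
    root = ET.fromstring(indexdefs_xml)
    seen: Counter = Counter()
    for elem in root.iter():
        # Ignore any namespace part of the element name.
        name = elem.tag.rsplit("}", 1)[-1]
        if name == "index_subfields":
            seen[(elem.get("tag", ""), elem.get("subfields", ""))] += 1
    return sorted(key for key, count in seen.items() if count > 1)


# Simplified sample: 608$9 appears twice, 610$9 is missing.
SAMPLE = """<index_defs>
  <index_subfields tag="606" subfields="9"/>
  <index_subfields tag="608" subfields="9"/>
  <index_subfields tag="608" subfields="9"/>
</index_defs>"""

print(duplicate_definitions(SAMPLE))
```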

Test plan :
- Apply patch
- Install a UNIMARC + DOM instance
- Define in a framework 610 using a thesaurus
- Create a new biblio
- Create a new authority (same type as the thesaurus defined above)
- Index : rebuild_zebra.pl -a -b -x -z
- Link the field 610 to the new authority
- Index : rebuild_zebra.pl -a -b -x -z
- In authorities search, search for the new authority
=> You see Use in 1 Records(s)

Signed-off-by: Frederic Demians <f.demians@tamil.fr>
  I confirm the typo.

Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2015-10-21 13:43:12 -03:00
Barton Chittenden
cab481dbb2 Bug 14617: Add fields to ISBN and ISSN indexes: 020$z, 022$y, 022$z
1) Import MARC21 bibs containing

- ISBN in 020$z
- ISSN in 022$y
- ISSN in 022$z

2) Make sure that bibs are indexed

3) Search by ISBN and ISSN above -- bibs should not show in search.

4) Apply patch, re-index

5) Search again; ISBN in 020$z and ISSN in 022$y and 022$z should return
results.

Signed-off-by: kholten@switchinc.org
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2015-10-19 10:12:04 -03:00
Katrin Fischer
e799e1cbc3 Bug 11620: Add dissertation-information index for MARC21 (502)
Bug 11202 introduced a new index 'dissertation-information' for
UNIMARC. This patch adds the index also for MARC21 installations.

http://www.loc.gov/marc/bibliographic/bd502.html

To test:
- Apply patch
- Copy files in etc/zebradb changed by this patch to your
  corresponding directory (koha-dev..)
- Make sure you have records with 502
- Reindex
- Verify you can search the field contents with
  dissertation-information= and
  diss=

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Can find by dissertation-information,
No errors

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>
2015-07-20 10:31:06 -03:00
Mirko Tietgen
fbe25b1d8e Bug 14453: (followup) Fix shipped XSLT files
Make the shipped XSLTs for authorities (MARC21 and UNIMARC) the same as the generated version

Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2015-07-08 14:39:04 -03:00
2365537eea Bug 14453: kohaidx is missing for id in authority-koha-indexdefs.xml
In authority-koha-indexdefs.xml, all tags use the namespace "kohaidx" except the tag "id".

When re-generating authority-zebra-indexdefs.xsl, the line :
  <xslo:variable name="idfield" select="normalize-space(marc:controlfield[@tag='001'])"/>
is modified :
  <xslo:variable name="idfield" select="normalize-space()"/>
This is an error.

This patch adds the kohaidx namespace to the "id" tag to correct this.

Test plan :
- Without patch
- go to etc/zebradb/marc_defs/marc21/authorities/
- run : xsltproc ../../../xsl/koha-indexdefs-to-zebra.xsl authority-koha-indexdefs.xml > authority-zebra-indexdefs.xsl
- read authority-zebra-indexdefs.xsl
=> the line has changed : <xslo:variable name="idfield" select="normalize-space()"/>
- Apply patch
- go to etc/zebradb/marc_defs/marc21/authorities/
- run : xsltproc ../../../xsl/koha-indexdefs-to-zebra.xsl authority-koha-indexdefs.xml > authority-zebra-indexdefs.xsl
- read authority-zebra-indexdefs.xsl
=> the line has not changed
(same for unimarc flavor)
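
Why a missing prefix produces an empty select can be reproduced outside XSLT. In this sketch a namespace-aware lookup fails to find an unprefixed id element; the namespace URI and the sample documents are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI, used only for this illustration.
KOHAIDX = "http://www.koha-community.org/schemas/index-defs"

# "id" carries no prefix, so it is in no namespace at all.
BROKEN = f"""<kohaidx:index_defs xmlns:kohaidx="{KOHAIDX}">
  <id xpath="marc:controlfield[@tag='001']"/>
</kohaidx:index_defs>"""

# Same document with the prefix added, as the patch does.
FIXED = BROKEN.replace("<id ", "<kohaidx:id ")


def id_xpath(indexdefs_xml: str) -> str:
    """Return the xpath of the namespaced id element, or "" if absent."""
    root = ET.fromstring(indexdefs_xml)
    node = root.find(f"{{{KOHAIDX}}}id")
    return node.get("xpath") if node is not None else ""


print(repr(id_xpath(BROKEN)))
print(repr(id_xpath(FIXED)))
```

The empty string returned for the broken document corresponds to the empty normalize-space() call seen in the regenerated stylesheet.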

Signed-off-by: Mirko Tietgen <mirko@abunchofthings.net>
Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar>
As Mirko mentioned, the xslt's now generate the facet-processing templates in
the authority xslt's too. They are harmless because we don't define facets
for authority records. If we did, it would be harmless too.

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2015-07-08 14:39:04 -03:00
Stefan Weil
f6aec46dda Bug 14383: etc/zebradb: Fix some typos in documentation and Bib-1 attribute set
All of them were found and fixed using codespell.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>

Signed-off-by: Jonathan Druart <jonathan.druart@koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2015-06-22 17:34:46 -03:00
Jonathan Druart
b5e9691060 Bug 8992: Add 7..$3 to the Identifier-standard index
Signed-off-by: valerie bertrand <valerie.bertrand@univ-lyon3.fr>

Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
2015-04-28 15:47:40 -03:00
Zeno Tajoli
c29a53ea20 Bug 12948: Use word indexing for language (MARC21)
This patch is for MARC21. To test:
1) Set up a site with MARC21
2) Insert 2 records: one with lang A in 041 and in 008 pos 35-37, another with lang A in 041 and lang B in 008 pos 35-37
3) Index them
4) Search in advanced search with filter 'language' for lang A. You will see 2 records
5) Search in advanced search with filter 'language' for lang B. You will see 0 records
6) Apply the patch
7) Full reindex
8) Search in advanced search with filter 'language' for lang B. You will see 1 record
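
The expected result of the test plan can be sketched as a set of searchable language terms per record. This assumes that, after the patch, both 041$a and 008 positions 35-37 contribute separate words to the language index; how Zebra stores them internally is not modelled, and the function name is hypothetical.

```python
def language_terms(field_008: str, field_041_a: list[str]) -> set[str]:
    """Collect language codes from 041$a and from 008 positions 35-37."""
    terms = set(field_041_a)
    code = field_008[35:38].strip()
    if code:
        terms.add(code)
    return terms


# Hypothetical 40-character 008 with "bul" in positions 35-37.
f008 = "0" * 35 + "bul" + "  "
print(sorted(language_terms(f008, ["eng"])))
```

With word indexing, a search for either code finds the record, which is what step 8 checks for lang B.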

http://bugs.koha-community.org/show_bug.cgi?id=12948

Signed-off-by: Magnus Enger <magnus@enger.priv.no>
I have *not* actually tested this, but the changes are identical to the ones
done for NORMARC, which I have tested, so I think it is safe to sign off. If
anyone disagrees, please reset the bug to "Needs signoff".

Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2015-02-20 11:51:59 -03:00
Zeno Tajoli
bf89e306a8 Bug 12948: Use word indexing for language (NORMARC)
This patch is for Normarc
Same test plan as patch for MARC21, except you need a setup with Normarc.

http://bugs.koha-community.org/show_bug.cgi?id=12948
Signed-off-by: Magnus Enger <magnus@enger.priv.no>

- Added a record with "bul" in 008pos35-37
- Verified that this did not turn up in an advanced search with language =
  Bulgarian
- Applied the patch
- I was testing on a gitified install, so I had to copy the patched index file
  to the right location with this command:

sudo cp etc/zebradb/marc_defs/normarc/biblios/biblio-zebra-indexdefs.xsl \
/etc/koha/zebradb/marc_defs/normarc/biblios/biblio-zebra-indexdefs.xsl

- Did a full reindex
- Verified that the record *did* turn up in an advanced search with language =
  Bulgarian
- Signing off! Thanks Zeno!

Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2015-02-20 11:51:50 -03:00
532b41934c Bug 13157: (QA followup) homebranch is 995$b on UNIMARC frameworks
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
2014-11-25 15:27:12 -03:00
9ebb6ba5d1 Bug 13157: UNIMARC holdingbranch facet is 995$c not 995$b
Fix a typo. No test plan required, just a look at the default UNIMARC framework.

Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
2014-11-25 15:27:05 -03:00
d4a7fa8580 Bug 13163: NORMARC DOM config missing <id> entry
This patch fixes the biblio-koha-indexdefs.xml for NORMARC, so
it includes the <id> element.

Because of how our DOM files work, the resulting biblio-zebra-indexdefs.xsl
for NORMARC picked the whole MARC record as ID, so every time the record
was edited, the id wouldn't match and a new record was created.

To test:
- Have a MARCXML record
- run:
  $ xsltproc etc/zebradb/marc_defs/normarc/biblios/biblio-zebra-indexdefs.xsl the_record | less
=> FAIL: verify the z:id property on the <z:record> line contains all subfields concatenated
- Apply the patch
- re-run the xsltproc line
=> SUCCESS: z:id contains the 999$c number
- Sign off :-D
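
The effect described above, an ID built from the whole record versus one taken from 999$c, can be imitated with a short sketch. The helper names and the sample record are hypothetical; the real ID is computed by the generated XSLT.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def whole_record_id(marcxml: str) -> str:
    """Mimic an ID built from all text in the record (the faulty case)."""
    text = "".join(ET.fromstring(marcxml).itertext())
    return " ".join(text.split())


def biblionumber_id(marcxml: str) -> str:
    """Take the ID from 999$c only (the intended case)."""
    node = ET.fromstring(marcxml).find(
        "marc:datafield[@tag='999']/marc:subfield[@code='c']", MARC_NS
    )
    return node.text.strip() if node is not None and node.text else ""


def make_record(title: str) -> str:
    """Build a tiny MARCXML record with biblionumber 7."""
    return f"""<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="0" ind2="0"><subfield code="a">{title}</subfield></datafield>
  <datafield tag="999" ind1=" " ind2=" "><subfield code="c">7</subfield></datafield>
</record>"""


before, after = make_record("First title"), make_record("Second title")
print(whole_record_id(before) == whole_record_id(after))
print(biblionumber_id(before) == biblionumber_id(after))
```

Only the second comparison holds after an edit, which is why the faulty configuration created a new Zebra record each time a biblio was saved.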

Regards

Signed-off-by: Frederic Demians <f.demians@tamil.fr>

Known bug with DOM: without <z:id> indexing the biblionumber, Zebra doesn't have
its unique record ID, and so fails to identify existing records. Works as described.
999$c is linked to biblionumber in the default NORMARC framework.

Signed-off-by: Magnus Enger <magnus@enger.priv.no>

I have applied the patch to my production server, and at least one customer has
confirmed that it fixes the problem with multiple copies of records in search
results.

Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Passes tests and QA script, fix matches what we have for the other MARC flavours.

Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-31 16:45:04 -03:00
c217b2c418 Revert "Bug 9828: More specific indexing of UNIMARC 6XX fields"
This reverts commit 0dd1ac40a0.
2014-10-28 12:02:34 -03:00
e43f012af6 Revert "Bug 9828 : Add and fix comments in UNIMARC biblio-koha-indexdefs.xml"
This reverts commit 5bbe42932e.
2014-10-28 12:02:22 -03:00
b108a111f6 Revert "Bug 9828 : Followup for Queryparser and deletion of useless 6XX$9"
This reverts commit 49788987b2.
2014-10-28 12:02:09 -03:00
Mathieu Saby
49788987b2 Bug 9828 : Followup for Queryparser and deletion of useless 6XX$9
This followup
- changes some indexes in the Queryparser configuration file
- suppresses some clearly useless 6XX$9 in biblio-koha-indexdefs.xml and adds 2 new ones, probably useless (not sure of that)
- changes the name of index Subject-geographical to Subject-name-geographical in ccl.properties (to match bib1.att)
The xsl file zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl was generated with the following command:
xsltproc zebradb/xsl/koha-indexdefs-to-zebra.xsl zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml > zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl

To test :
1) Apply the 3 patches
2) copy the modified files from the source directory to the directory where you store the config files for Zebra and Queryparser
The files modified by the 3 patches and that need to be copied are:
etc/zebradb/biblios/etc/bib1.att
etc/zebradb/ccl.properties
etc/searchengine/queryparser.yaml
.../unimarc/biblios/biblio-koha-indexdefs.xml
.../unimarc/biblios/biblio-zebra-indexdefs.xsl
3) Rebuild Zebra
4) Create a record A with some values in critical fields, for example:
- the string "test9828" in 600$c 600$f 600$p, 602$f, 616$c, 616$f, 606$2,600$2
- the string "subform" in 600$j
5) Create a record B with the string "subgeo" in 606$y
6) Create a record C with the string "subdate" in 606$z
WITHOUT QP activated in sysprefs ("Don't try to use QP"):
7) try to search "su:test9828". You should have no results
8) try to search "su-genre:subform". You should have 1 result : record A
9) try to search "su-geo:subgeo". You should have 1 result : record B
10) try to search "su-chrono:subdate". You should have 1 result : record C
11) on existing records, try su-ut, su-to, su-na, su-form, su-corp, su-geo indexes, and see if results are relevant
WITH QP activated in sysprefs:
Same tests

Signed-off-by: Nick Clemens <nick@quecheelibrary.org>

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-27 12:46:47 -03:00
Mathieu Saby
5bbe42932e Bug 9828 : Add and fix comments in UNIMARC biblio-koha-indexdefs.xml
Only cosmetic :
- the references to record.abs lines are now useless and outdated
- some comments added in record.abs could be useful in biblio-koha-indexdefs.xml

No change expected, only comments

Signed-off-by: Nick Clemens <nick@quecheelibrary.org>

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-27 12:46:44 -03:00
Mathieu Saby
0dd1ac40a0 Bug 9828: More specific indexing of UNIMARC 6XX fields
[New commit on 18 Aug 2014 : rebased, and DOM indexing only]

Issues to fix :
Most 6XX fields may contain a $2 that identifies the system used for indexing. It should not be indexed.
In French libraries, $2 contains "rameau". So searching for books about the music composer "Rameau" retrieves thousands of records!
For some 6XX fields, other subfields should not be indexed, for example dates of persons and families, or addresses.
In the UNIMARC guide, 600$t, 601$t and 602$t are said to exist but to be "not used". I keep them indexed.

Additionally, subject indexing could be improved by using specific indexes for each 6XX if possible :
In ccl.properties :
- su-to, su-geo and su-ut are defined as aliases of Subject.
- a specific index is defined, but not used in record.abs : Subject-name-personal, alias su-na
We can use these indexes and create new specific indexes by using existing bib1 attributes.

We could also index $j,$x,$y,$z subdivision in specific indexes.

This patch does the following changes :
1) For all 6XX : Not indexing $2 (LCSH, Rameau...), $3 and $5
2) Suppressing the indexing of some specific subfields, depending on the field:
600 : Personal name used as a subject // see Marc21 600
not indexing c (additional elements),f (dates),p (address/affiliation)
602 : Family name used as a subject // see Marc21 600 3X
not indexing f (dates)
616 : Trademark
not indexing c,f
3) For all 6XX : index $j,$x,$y,$z in several indexes in addition to the specific index for their 6XX field
4) Define in ccl.properties some specific indexes :
Subject-name-conference 1=1073 => alias su-conf
Subject-name-corporate 1=1074 => alias su-corp
Subject-genre-form 1=1075 => alias su-genre and su-form
Subject-geographical 1=1076 => alias su-geo
Subject-chronological 1=1077 => alias su-chrono
Subject-title 1=1078 => alias su-ut and su-ti
Subject-topical 1=1079 => alias su-to
5) Adding new aliases in Search.pm :
su-chrono, su-form, su-genre, su-corp, su-conf, su-ti
6) Using these new indexes for
600 : Subject and Subject-Personal-Name ; all subfields except subdivisions in Personal-name
601 : Subject, Subject-name-conference and Subject-name-corporate and Subject-name-conf ; all subfields except subdivisions in Corporate-name and Conference-name
602 : same as 600 but could be improved later
604 : Subject and Subject-title ; $a in Subject-Personal-Name ; all subfields except subdivisions in Name-and-Title
605 : Subject and Subject-title
606 : Subject and Subject-topical
607 : Subject and Subject-geographical ; all subfields except subdivisions in Name-geographic
608 : Subject and Subject-genre-form

To test :

A. In a UNIMARC-DOM indexing environment
1) Apply the patch
2) Rebuild zebra
3) Create a record A with some values in critical fields, for example:
- the string "test9828" in 600$c 600$f 600$p, 602$f, 616$c, 616$f, 606$2,600$2
- the string "subform" in 600$j
4) Create a record B with the string "subgeo" in 606$y
5) Create a record C with the string "subdate" in 606$z
6) try to search "su:test9828". You should have no results
7) try to search "su-genre:subform". You should have 1 result : record A
8) try to search "su-geo:subgeo". You should have 1 result : record B
9) try to search "su-chrono:subdate". You should have 1 result : record C
10) on existing records, try su-ut, su-to, su-na, su-form, su-corp, su-geo indexes, and see if results are relevant

Indexing of subjects could maybe be improved later

Signed-off-by: Nick Clemens <nick@quecheelibrary.org>

All seems to work as expected, I am not super-familiar with UNIMARC but I wonder if in su-corp and su-conf the subdivisions might be useful (e.g. France-Gendarmie / Staatsbibliothek-Berlin)

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-27 12:46:42 -03:00
Jonathan Druart
b3acefc319 Bug 11586: Better default framework for UNIMARC - zebra conf
This patch updates the Zebra configuration for unimarc.

995$d and 995$j should not be indexed.

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-23 10:52:03 -03:00
ca17512a8e Bug 11232: (qa followup) empty ID due to namespace mistake
Note: NORMARC is missing the id field.

Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
This patch makes t/db_dependent/Search.t pass again.
NORMARC is currently not tested.

I checked the results before and after applying the patch
and the facets are now looking the same as before.
Passes all tests and QA script.

Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-15 12:55:52 -03:00
ccf7ae56f6 Bug 11232: (qa followup) Add missing fields/subfields to the item types facets
The itype facet was missing 952$y for both MARC21 and NORMARC.
This patch adds that. And also modifies the zebra-biblios-dom.cfg file
(also the debian/ version) so facetNumRecs is set to 1000 for zebra.

It is the number of records that are taken into account. The more records,
the more exact the facets for the result set. 1000 was chosen as it changed
the time to reindex 1000 records from 18s to 19s.
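
To make the trade-off concrete, here is a sketch of facet counting that only looks at the first N records of a result set. It illustrates the effect of a cap such as facetNumRecs and is not Zebra's implementation; the record layout and function name are invented for the example.

```python
from collections import Counter


def facet_counts(result_set: list[dict], field: str, facet_num_recs: int = 1000) -> Counter:
    """Count facet values over at most facet_num_recs records."""
    counts: Counter = Counter()
    for record in result_set[:facet_num_recs]:
        counts.update(record.get(field, []))
    return counts


results = [
    {"itype": ["BK"]},
    {"itype": ["BK"]},
    {"itype": ["BK"]},
    {"itype": ["DVD"]},
]
print(facet_counts(results, "itype", facet_num_recs=2))
print(facet_counts(results, "itype"))
```

With a cap of 2 the DVD value never shows up as a facet, while the default cap covers the whole sample set.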

Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-15 12:55:47 -03:00
e95cd1b126 Bug 11232: (followup) remove unnecessary namespace definition from all XML elements
The previous patches for facet extraction from Zebra indexes set a default
namespace on the following files:

etc/zebradb/marc_defs/marc21/biblios/biblio-koha-indexdefs.xml
etc/zebradb/marc_defs/normarc/biblios/biblio-koha-indexdefs.xml
etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml

and hence the XML file index_subfields can be cleaned by removing the namespace.

To test:
- Apply this patch
- Run

$ for i in marc21 normarc unimarc
  do xsltproc etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl \
              etc/zebradb/marc_defs/$i/biblios/biblio-koha-indexdefs.xml \
              > etc/zebradb/marc_defs/$i/biblios/biblio-zebra-indexdefs.xsl
  done

=> SUCCESS: no errors reported

- Run
$ git diff
=> SUCCESS: no differences on the xsl files

- Sign off :-D

Sponsored-by: Universidad Nacional de Cordoba
Signed-off-by: David Cook <dcook@prosentient.com.au>

Seems to work with DOM and MARC21.

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-15 12:55:44 -03:00
c1e384f250 Bug 11232: NORMARC facet definition and updated XSL file for DOM
This patch adds the facets definitions to the biblio-koha-indexdefs.xml, based
on what is hardcoded on C4::Koha::getFacets().

The biblio-zebra-indexdefs.xsl file for NORMARC is generated using the usual:

xsltproc ...koha-indexdefs-to-zebra.xsl ...normarc/biblios/biblio-koha-indexdefs.xml > \
    ...normarc/biblios/biblio-zebra-indexdefs.xsl

Sponsored-by: Universidad Nacional de Cordoba
Signed-off-by: David Cook <dcook@prosentient.com.au>

Seems to work with DOM and MARC21.

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-15 12:55:40 -03:00
eafeb34097 Bug 11232: UNIMARC facet definition and updated XSL file for DOM
This patch adds the facets definitions to the biblio-koha-indexdefs.xml, based
on what is hardcoded on C4::Koha::getFacets().

The biblio-zebra-indexdefs.xsl file for UNIMARC is generated using the usual:

xsltproc ...koha-indexdefs-to-zebra.xsl ...unimarc/biblios/biblio-koha-indexdefs.xml > \
    ...unimarc/biblios/biblio-zebra-indexdefs.xsl

Sponsored-by: Universidad Nacional de Cordoba
Signed-off-by: David Cook <dcook@prosentient.com.au>

Seems to work with DOM and MARC21.

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-15 12:55:38 -03:00
2cc293ecd6 Bug 11232: MARC21 facet definition and updated XSL file for DOM
This patch adds the facets definitions to the biblio-koha-indexdefs.xml, based
on what is hardcoded on C4::Koha::getFacets().

The biblio-zebra-indexdefs.xsl file for MARC21 is generated using the usual:

xsltproc ...koha-indexdefs-to-zebra.xsl ...marc21/biblios/biblio-koha-indexdefs.xml > \
    ...marc21/biblios/biblio-zebra-indexdefs.xsl

Sponsored-by: Universidad Nacional de Cordoba
Signed-off-by: David Cook <dcook@prosentient.com.au>

Seems to work with DOM and MARC21.

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-15 12:55:36 -03:00
95adc7a1f4 Bug 12453 - Do not use by default Host-Item-Number in UNIMARC
Currently, in a default UNIMARC install, 461$9 is indexed as Host-Item-Number, meaning it is used for analytical itemnumber.

But most UNIMARC catalogs use the analytical relation through the unimarc_field_4XX.pl plugin on 461$a. In fact, this plugin is defined in the default UNIMARC frameworks.

If Host-Item-Number is defined but 461$9 is used for something else, it will lead to odd bugs. For example, records containing analytical items can not be deleted.

This patch comments the 461$9 indexing in UNIMARC zebra config.

Test plan :
- Create a fresh UNIMARC install
- Create a record with 461$9 containing a value
- Index the record
- Perform a search on Host-Item-Number : ccl=Host-Item-Number,alwaysmatches=''
=> Without the patch you get a result
=> With the patch you get no result

Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Code is clean, commenting out all the indexing of 461$9.
Trusting the author that this is the correct thing to do :)

Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-08-24 12:32:30 -03:00
bd65c6e95b Bug 11635: remove duplicate definition of 995$r in UNIMARC record.abs
Test plan :
- Create a fresh install with the UNIMARC flavor and GRS1 indexing for biblios
- Reindex the database
- Perform a search with index "itemtype" (and then "itype") on an
  existing value of 995$r. For example : itemtype:BOOK
=> Check you get results

Signed-off-by: Mark Tompsett <mtompset@hotmail.com>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-05-05 02:25:20 +00:00
Mirko Tietgen
84bdb55549 Bug 9972: Add/change some zebra indexes (MARC21)
This patch adds :w and :p versions to the index for »Lexile number«
(it has only :n so far) and adds indexes for 653 (Index term
uncontrolled), 655 (Index term Genre/Form), 041 (language-audio) and
041 (language-subtitle). It also adds the »curriculum«-index to
Search.pm.

Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-04-20 16:24:08 +00:00
Mathieu Saby
b6118db2f5 Bug 11202: Improve UNIMARC biblio indexing
This patch makes the following changes to UNIMARC biblio indexing :
A. Changes to UNIMARC conf files
1. add comments to biblio-koha-indexdefs.xml
2. make biblio-koha-indexdefs.xml more compact by grouping some
   declarations
   Ex : 200$f and 200$g => one declaration for 200$fg
3. suppress unneeded declarations (indexing of some 4XX fields and 6XX
   fields not in unimarc format)
4. unindex some (sub)fields unneeded by most users (318, 207,230,210a,
   215, 4XXd)
5. change the way 308 field is indexed (no visible changes)
6. replace Title-host with Host-item -- see bug 11119
7. index 208 in Material-Type -- see bug 11119
8. index 100 pos 8-9 and 9-12 in pubdate:y and pubdate:n
9. index 100 pos 8-9 in pubdate:s instead of 210$d
10. Index all subfields of note 334 and 327 in note index
11. Index 304 and 327 in title index as well as note index
    327 can contain a list of titles included in a work
    304 can contain the title of the original work in case of a
    translation
12. Index 314 in author index as well as note index
    314 can contain authors not mentioned in 200$f/g (the 4th, 5th etc.
    author)
13. Index 328 note in Dissertation-information as well as note
14. Index 328$t in Title

B. Changes to ccl.properties :
1. add a new index Dissertation-information (1056)
2. fix EAN, pubdate and acqdate (they were not linked with bib1 attributes)

C. Changes to Search.pm
1. add Dissertation-information and suppress Title-host and UPC

D. Changes to QP config file queryparser.yaml
1. add Dissertation-information
2. fix EAN, pubdate and acqdate

Test plan :
If you cannot test in GRS1, test only in DOM, as GRS will be deprecated.

1. Apply the patch in a UNIMARC Koha running with DOM and ICU
2. copy src/etc/searchengine/queryparser.yaml into the main config
   directory of QP
3. copy src/etc/zebradb/ccl.properties into the main config directory
   of Zebra
4. copy src/etc/zebradb/marc_defs/unimarc/biblio/* into the main config
   directory of Zebra
5. reindex biblios (rebuild_zebra.pl -r -b -x -v)
6. test note index : make some searches on 334$b or 327$b
7. test author index : make some searches on 314 field
8. test title index : make some searches on 304 and 327 field, make a
   search on 328$t subfield
9. test dissertation-information index : make some searches on 328 field
10. In a record, set the dates in the 100 field to "1000" (1st
    date) and "1001" (2nd date); search for a book written in year
    1000 and you should find the record; likewise for year 1001
11. make some searches and sort by date. It should work better than before,
    especially if you have values like "c2009" or "impr. 2010" in the 210
    field
12. Regression test : make some searches on several indexes, like EAN,
    etc. It should work as before

Test 10-12 with and without Queryparser activated.
Be careful: with Queryparser activated, the index names (title,
dissertation-information...) must be entered in lowercase only.
Of course, to test search and sort by dates, you need to have full
records, with dates in 100 field as well as 210 field.
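
To illustrate why step 11 expects better sorting, here is a small sketch. It
assumes sort keys are compared as plain strings, which is a simplification and
not Zebra's actual sort code; the sample values are invented. Free-text dates
such as those in the 210 field sort by their leading characters, while
four-digit coded dates such as those in the 100 field sort chronologically.

```python
# Invented publication dates as they might appear in free text (210 field).
free_text = ["c2009", "2010", "impr. 1995"]
# The same three years as four-digit coded values (100 field).
coded = ["2009", "2010", "1995"]

# Plain string comparison: the prefixes "c" and "impr." decide the order.
print(sorted(free_text))
# Four-digit strings of equal length sort in chronological order.
print(sorted(coded))
```

In the first list the 1995 record ends up last; in the second it comes first.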

Signed-off-by: Paola Rossi <paola.rossi@cineca.it>
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-02-19 21:01:15 +00:00
Galen Charlton
aaff735269 Bug 10544: (follow-up) update MARC21 DOM index definitions
This patch updates the MARC21 DOM index definitions to
index the 952$i as 'Number-local-acquisition' rather than
'stocknumber'.

To test (for a MARC21/DOM setup):

[1] Copy the MARC21 biblio-zebra-indexdefs.xsl over to the
    active Zebra configuration directory.
[2] Reindex the bib records.
[3] Verify that 'stocknumber', 'inv', and 'number-local-acquisition'
    searches work.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-02-19 20:41:37 +00:00
Fridolyn SOMERS
10e1cbeb14 Bug 10544: ensure that stocknumber searches work for MARC21
Bug 6256 replaced in bib1.att stocknumber by Number-local-acquisition
for number 1062.

In this case, Number-local-acquisition must be used in record.abs and
stocknumber can be an alias of it in ccl.properties.
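
The aliasing idea can be modeled in a few lines. This is a toy model only: the
dictionaries and the resolve() helper are invented for illustration and do not
reproduce the ccl.properties syntax. Only the attribute number 1062 and the two
index names come from the message above.

```python
# Index name mapped to its Bib-1 use attribute (1062 is quoted in the message).
ATTRIBUTES = {"Number-local-acquisition": 1062}
# Alias pointing at the real index name.
ALIASES = {"stocknumber": "Number-local-acquisition"}


def resolve(name: str) -> int:
    """Follow an alias, if there is one, and return the use attribute number."""
    return ATTRIBUTES[ALIASES.get(name, name)]


print(resolve("stocknumber"), resolve("Number-local-acquisition"))
```

Both names resolve to the same attribute, which is why the two searches in the
test plan are expected to return the same records.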

Test plan (for MARC21/GRS1):
- drop zebra database (rebuild_zebra.pl -r ...)
- reindex
- test in simple search : ccl=Number-local-acquisition,alwaysmatches=''
=> you get all records with a stocknumber
- test in simple search : ccl=stocknumber,alwaysmatches=''
=> you get the same results

Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-02-19 20:39:10 +00:00
Jonathan Druart
a573ac1fa8 Bug 9940: (follow-up) FIX comment: language-original is 101$c, not $h
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-12-25 15:41:31 +00:00
Mathieu Saby
451f67c055 Bug 9940: Add a new index for the original language of a document
It could be useful to index the original language of a document (i.e.
"fre" for the English translation of a French novel).

This patch renames the Bib-1 use attribute 1095 from
Code-language-original to language-original and uses it to index:

- MARC21 041$h subfield
- UNIMARC 101$c subfield

It adds "language-original" to the list of indexes in Search.pm.

Test plan :
A. in a MARC21 GRS1 environment
1. Copy Zebra config files (zebradb/biblios/etc/bib1.att,
   zebradb/ccl.properties, marc_defs/marc21/biblios/record.abs) from
   your source etc/ directory to your main koha etc/ directory
2. Reindex zebra
3. Make some searches, like "language-original:fre"
B. in a MARC21 DOM environment
4. Copy Zebra config files (zebradb/biblios/etc/bib1.att, zebradb/ccl.properties,
   marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl) from your source etc/
   directory to your main koha etc/ directory
5. Reindex zebra
6. Make some searches, like "language-original:fre"
C. in a UNIMARC GRS1 environment
7. Copy Zebra config files (zebradb/biblios/etc/bib1.att,
   zebradb/ccl.properties, marc_defs/unimarc/biblios/record.abs) from
   your source etc/ directory to your main koha etc/ directory
8. Reindex zebra
9. Make some searches, like "language-original:fre"
D. in a UNIMARC DOM environment
10. Copy Zebra config files (zebradb/biblios/etc/bib1.att,
    zebradb/ccl.properties, marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl)
    from your source etc/ directory to your main koha etc/ directory
11. Reindex zebra
12. Make some searches, like "language-original:fre"

Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-12-25 15:37:14 +00:00
Mathieu Saby
c00131e0ff Bug 9830: Fix some indexes in UNIMARC item indexing
With this combination of sysprefs, and a UNIMARC configuration, it was
impossible to search on location, barcode and ccode indexes :

QueryWeightFields          is activated
QueryAutoTruncate          only if * is added

But in UNIMARC, location, barcode and ccode (995 $e, $f, $h) were indexed
only as "words". They also need to be indexed as "phrase".

Additionally, in UNIMARC, information about the damaged and withdrawn status
of items is not indexed, while it is in MARC21.

This patch:
- adds 2 new indexes for 995$1 (damaged) and 995$3 (withdrawn)
- indexes location, barcode and ccode as "phrase" as well as "words"

Indexing of items in UNIMARC could be improved later, so this patch also
adds comments explaining the origin of the Koha 995 field. I think they could
be useful for further changes.

To test, on a UNIMARC configuration :
A. indexed with GRS-1
1) Set sysprefs QueryWeightFields as "activated" and QueryAutoTruncate
   as "only if * is added"
2) Select location index in advanced search and search for a value
   existing in your records in 995$e => 0 results
3) Apply patch
4) Rebuild zebra
5) Select location index in advanced search and search for a value
   existing in your records in 995$e => x results
6) Mark an item as withdrawn; search "withdrawn:1" => x results, and
   among them the biblio to which the item is attached
7) Mark an item as damaged ; search "damaged:1" => x results, and among
   them the biblio to which the item is attached

B. indexed with DOM
Do the same operations

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Work as described. No koha-qa errors

Test
Apply the patch
Begin with GRS-1
Full reindex
Search by location, no results
cp files biblio-*-indexdefs.xml and record.abs to destination on etc/zebra
Full reindex
Search by location, got results

Switch to DOM
reset files
Full reindex
Search by location, no results
cp files
Full reindex
Search by location, results !

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-10-21 15:38:49 +00:00
Mathieu Saby
4d8b1ec786 Bug 7421: support indexing UNIMARC authority records using the DOM Filter
I took as a base the patch of F. Demians, but made a lot of changes,
so I think it is more logical to create a new patch as the behavior is
not the same as previous patch.

I tried to define the DOM config files as a "mirror" of record.abs, so that
the behavior is the same.

If it is OK, we will be able to improve indexing later, for example by
suppressing warnings, managing indicators or subdivisions, etc.

I made some little changes to record.abs :
- comments
- 216 was indexed in Conference-name as well as Trademark. I suppose
  that "Conference-name" is an error, so I indexed only in Trademark
- index 2 new notes : 340 / 356

The only difference between record.abs and DOM is that the DOM config files
do not index complete fields, but subfields.

Ex :
melm 200 ===> <kohaidx:index_subfields tag="200" subfields="abcdfgjxyz">
I took all the subfields from the UNIMARC Authorities manual. The only
subfields not indexed are numeric subfields : $7, $8 for language of
record, and $0,2,3,5,6 for 4XX/5XX/7XX
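
A toy model of that difference, assuming a field is represented as a list of
(subfield code, value) pairs. The function name and the sample field are
hypothetical, and this is not how Zebra or Koha implement indexing; it only
shows the effect of listing subfield codes instead of taking the whole field.

```python
def indexed_values(subfields, codes):
    """Return the values of the subfields whose code appears in `codes`."""
    return [value for code, value in subfields if code in codes]


# Hypothetical UNIMARC authority 200 field, including numeric subfields.
field_200 = [
    ("7", "ba0yba0y"),
    ("8", "frefre"),
    ("a", "Hugo"),
    ("b", "Victor"),
    ("f", "1802-1885"),
]

# Same subfield list as in the index_subfields example above.
print(indexed_values(field_200, "abcdfgjxyz"))
```

The numeric subfields $7 and $8 are left out, as described above.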

To test :
- index a set of bib and auth records with GRS-1
- make some searches on different kind of authorities
- index the same records with DOM
- make the same searches
- You are not supposed to see differences

Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
As I am not a UNIMARC user it's hard for me to test this, but
while testing other authority related patches I noticed that I couldn't
index the UNIMARC authorities of the sample base. The files are obviously
missing and reindex_zebra.pl notes this. With this patch applied,
indexing works and authorities are searchable in my installation.

Signed-off-by: Vitor Fernandes <fvernandes@keep.pt>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-10-10 21:03:15 +00:00
Mathieu Saby
5298140c67 Bug 10037: fix item index in UNIMARC DOM indexing
In UNIMARC DOM indexing, the "item" index worked only for the subfields
of the 995 field that are mapped to a specific index as well as to the item
index (e.g. $a, $b...). It did not work for the other subfields (e.g. $g),
because a comment from record.abs had been carried over into the DOM config files.

This patch removes the comment.

To test, in a DOM UNIMARC environment :
1) In an item, write some value "Test10037" in 995$g
2) Search for this value in simple search, this way : item=Test10037
   => you should have no results
3) Apply the patch. If necessary, copy the modified
   etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml and
   etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl into
   the /etc/... directory in your main Koha directory
4) Reindex Zebra biblios
5) Do the same search as 2) => you should have one result

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Work as described. No koha-qa errors.

Test

NOTE: the default UNIMARC framework doesn't have 995g,
so I had to add it first.

1) Added test string to 995b on some record
2) Reindex and search as indicated, no results
3) cp files to destination
4) reindex
5) search and result ok !

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-10-10 18:54:12 +00:00
Galen Charlton
8ea3462517 Bug 8252: (follow-up) standardize name of Identifier-publisher-for-music index
To test:

[1] When running t/db_dependent/Search.t, verify that no warnings like
    this are shown:

15:52:07-10/10 zebraidx(2006) [warn] Index 'Number-music-publisher' not found in attset(s)

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-10-10 16:05:33 +00:00
Galen Charlton
45d0365d12 Bug 8620: (follow-up) apply to NORMARC and MARC21 authorities
This applies the fix for the Any index to NORMARC bib
and MARC21 authority DOM Zebra indexes.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-10-10 15:56:13 +00:00
Mathieu Saby
475a9d19d1 Bug 8252: (follow-up) fix biblio-zebra-indexdefs.xsl
This patch fixes the biblio-zebra-indexdefs.xsl files.
They were generated from biblio-koha-indexdefs.xml with the new
koha-indexdefs-to-zebra.xsl amended by F. Démians's patch.

To test :
- Take a DOM UNIMARC Koha
- Apply all the patches of bug 8252, including this one
- Copy src/etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl
  to your etc/zebradb/marc_defs/unimarc/biblios/ located in your
  installation directory
- Run rebuild_zebra -b -x -r -v
- make advanced searches on staff interface and opac, on coded fields
  indexes (Audience, Literary genre, Biography, Illustration, Content,
  Video Types, Serial Type, Periodicity, Regularity, Picture)

Signed-off-by: Frédéric Demians <f.demians@tamil.fr>

OK for me. This patch puts the XSL index definitions in sync with the
authoritative XML definitions. Subsequently, it won't be difficult to
amend the DOM UNIMARC index definitions if necessary. And, as it is, I don't
see any regression, whereas I can see huge improvements. Thanks Mathieu!

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-10-10 15:21:56 +00:00
Mathieu Saby
43809d2835 Bug 8252: Followup for Date/time-last-modified and Music number
This followup restores the original wording of the "Date/time-last-modified"
index, and changes the name of the "Music-number" index to
"Number-music-publisher"

To test :
1. In a UNIMARC Koha instance
2. Apply patches #1, #2 and this followup
3. Copy from src/etc/zebradb directory to the etc/zebradb/ in your main
   Koha directory the following files:
-- zebradb/biblios/etc/bib1.att
-- zebradb/ccl.properties
-- zebradb/marc_defs/unimarc/biblios/record.abs
-- zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml
-- zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl
4. Rebuild zebra with -b -x -v -r options
5. Write a value like "test071a" in 071$a field in a record
6. Check if you can find this record with this search:
   "ccl=Number-music-publisher:test071a"

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
No koha-qa errors.

Test
Copy files
reindex full
Modify a couple of records to add 071a with test message
Reindex -v -z -b -x
Search test message as described and found modified records.

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-10-10 15:16:27 +00:00
Mathieu Saby
8034566027 Bug 8252: Fix indexing of UNIMARC 1xx for DOM
This patch makes the same changes in UNIMARC DOM configuration as patch
1 made for GRS-1.

Positions within subfields are indexed this way:
In biblio-koha-indexdefs.xml :
tag="100" subfields="a" offset="17" length="1"
In biblio-zebra-indexdefs.xsl :
xslo:value-of select="substring(., 17, 1)"

I had to edit biblio-zebra-indexdefs.xsl by hand, because
etc/zebradb/xml/koha-indexdefs-to-zebra.xsl only supports
"substring" in the handle-one-index-control-field template.

That is fine for MARC21, but not for UNIMARC: in MARC21, indexing
substrings is needed for control fields (001-009, which have no subfields),
but in UNIMARC it is needed for subfields of 1XX fields.
So if DOM indexing works with these new files, we may need to
change koha-indexdefs-to-zebra.xsl.
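
As a rough illustration of the mapping quoted above (this sketch is not part of
the patch): XPath 1.0 substring(string, start, length) counts characters from
1, so substring(., 17, 1) selects the 17th character of the subfield value.
The helper name xpath_substring and the sample value are made up for the
example, and the helper only covers the simple case of integer arguments with
start >= 1.

```python
def xpath_substring(value: str, start: int, length: int) -> str:
    """Mimic XPath 1.0 substring() for integer arguments with start >= 1."""
    # XPath positions are 1-based, Python slices are 0-based.
    begin = start - 1
    return value[begin:begin + length]


# Hypothetical coded value: 16 filler characters, then "X" at position 17.
sample = "0123456789ABCDEF" + "X" + "HIJKLMNOPQRSTUVWXYZ"

print(xpath_substring(sample, 17, 1))
```

In Python terms this is the slice value[16:17].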

Test plan (not possible in a sandbox) :
1) In a Koha instance using UNIMARC and DOM indexing
2) Apply Patch 1 and Patch 2 (this one)
3) Copy the following files from the etc/zebradb directory of your
   source into the etc/zebradb directory of your main Koha directory :
-- etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml
-- etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl
-- etc/zebradb/ccl.properties
-- etc/zebradb/biblios/etc/bib1.att
4) rebuild zebra with -x -b -r -v options
5) check if coded filters in advanced search are usable in OPAC and
   Staff interface

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Works. No koha-qa errors.

Test for DOM
Apply patches
Don't forget to copy files
reindex
Search by coded fields works, also Country-publication

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-10-10 15:15:04 +00:00
Mathieu Saby
041e3603a1 Bug 8252: Fix indexing of UNIMARC 1xx for GRS-1
Before fixing UNIMARC DOM indexing, we must fix GRS-1 indexing

1) In advanced search, some coded-field indexes are not working: Print,
   Illustration, Content
2) Country-heading index is not working
3) Some subfields are indexed in wrong indexes :

  102$a should be in Country-publication instead of Country-heading
        (not defined in bib1.att)
  106$a, filled only for printed works, should be in ff88-23 (form of
         item) instead of itype.  (ff88-23 is made for Marc21 008 pos
         23, which contains the same data as 106a)
  200$b should be in Material-type instead of (or in addition to) itype
        and itemtype: (Material-type :"free-form string, ... that
        describes the material type of the item, e.g., cassette, kit,
        computer database, computer file.")
  100$a pos 22-24 should not be indexed as "ln" : it is the language of
        the record, not the language of the resource

4) Index names are too long: if we index new positions of coded fields
   with the existing names, it breaks Zebra indexing (there must be a limit
   on line length in record.abs?)
5) There are a lot of warnings when rebuilding Zebra.

This patch makes some changes in bib1.att (they could be used later to improve
search) :

- fixing wording for att 51 and 1012
- adding comments for attributes based on MARC21 008 field (8800-8841)
- creating 8806 (tpubdate), 8838 (Modified-code), 8818 (ff8-18), 8840
  (ff8-18-21), 8819 (ff8-19), 8821 (ff8-21), 8828 (ff8-28), 8830
  (ff8-30), 8831 (ff8-31)
- creating attributes specific to UNIMARC : 9701-9707 (Video-mt,
  Graphics-type, Graphics-support, Title-page-availability,
  Cumulative-index-availability, script-Title, char-encoding)
- setting apart 3 blocks of attributes, so it could be easy to make
  further changes :
-- common to Marc21 and UNIMARC : 8806, 8822, 8838
-- slightly different in Marc21 and UNIMARC (different meanings
   according to the type of the record => don't match a single
   UNIMARC field)
-- specific to UNIMARC : 9701-9707

In ccl.properties :
- creating a new index: Country-publication 1=1053
- suppressing some warnings by mapping with bib1 att:
  Date-time-last-modified, Name, rtype, Music-number
- defining indexes using the 3 blocks attributes defined in bib1
  (common to Marc21 and UNIMARC, slightly different, specific to UNIMARC)

In record.abs :
- renaming some index for 100-105-110 fields
- correcting indexing of 102$a (country of publication)
                         106$a (ff88-23)
                         100$a pos 22-24 (language of record, no more
                               indexed)
                         105$a pos. 0-3 (illustration code)
                         200$b (for the moment, I keep it indexed in
                               itype and itemtype, but also Material-Type)

In C4/Search.pm :
- adding "Country-publication" index

In OPAC and staff interface template subtypes_unimarc.in :
- renaming indexes to take into account the changes made to Zebra
  config files

To test (this cannot be done with a sandbox) :
1) Apply the patch in a UNIMARC GRS-1 Koha instance
2) Copy the following files from the etc/zebradb of your source
   directory into the etc/zebradb of your main Koha directory:
-- etc/zebradb/biblios/etc/bib1.att
-- etc/zebradb/ccl.properties
-- etc/zebradb/marc_defs/unimarc/biblios/record.abs
3) Reindex your data (rebuild_zebra -x -b -r -v)
4) Try to use those Coded fields indexes in Advanced search, in OPAC
   and Staff interface (available after clicking on "More options",
   then on "Coded information filters"):
   Audience, Print, Literary genre, Biography, Illustration, Content,
   Video Types, Serials, Serial Type, Periodicity, Regularity
5) Try to search "Country-publication=FR" in simple search

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
No koha-qa errors.

Tests for GRS-1
Followed test plan
Search by coded fields works, but only on OPAC,
on staff there are few options
Search by Country-publication works after patch

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-10-10 15:06:10 +00:00