This patch makes the following changes to UNIMARC biblio indexing :
A. Changes to UNIMARC conf files
1. add comments to biblio-koha-indexdefs.xml
2. make biblio-koha-indexdefs.xml more compact by grouping some
declarations
Ex : 200$f and 200$g => one declaration for 200$fg
3. suppress unneeded declarations (indexing of some 4XX fields and 6XX
fields not in unimarc format)
4. unindex some (sub)fields unneeded by most users (318, 207,230,210a,
215, 4XXd)
5. change the way 308 field is indexed (no visible changes)
6. replace Title-host with Host-item -- see bug 11119
7. index 208 in Material-Type -- see bug 11119
8. index 100 pos 8-9 and 9-12 in pubdate:y and pubdate:n
9. index 100 pos 8-9 in pubdate:s instead of 210$d
10. Index all subfields of note 334 and 327 in note index
11. Index 304 and 327 in title index as well as note index
327 can contain a list of titles included in a work
304 can contain the title of the original work in case of a
translation
12. Index 314 in author index as well as note index
314 can contain authors not mentionned in 200$f/g (the 4th, 5th etc.
author)
13. Index 328 note in Dissertation-information as well as note
14. Index 328$t in Title
B. Changes to ccl.properties :
1. add a new index Dissertation-information (1056)
2. fix EAN, pubdate and acqdate (they were not linked with bib1 attributes)
C. Changes to Search.pm
1. add Dissertation-information and suppress Title-host and UPC
D. Changes to QP config file queryparser.yaml
1. add Dissertation-information
2 fix EAN, pubdate and acqdate
Test plan :
If you cannot test in GRS1, test only in DOM, as GRS will be deprecated.
1. Apply the patch in a UNIMARC Koha running with DOM and ICU
2. copy src/etc/searchengine/queryparser.yaml into the main config
directory of QP
3. copy src/etc/zebradb/ccl.properties into the main config directory
of Zebra
4. copy src/etc/zebradb/marc_defs/unimarc/biblio/* into the main config
directory of Zebra
5. reindex biblios (rebuild_zebra.pl -r -b -x -v)
6. test note index : make some searches on 334$b or 327$b
7. test author index : make some searches on 314 field
8. test title index : make some searches on 304 and 327 field, make a
search on 328$t subfield
9. test dissertation-information index : make some searches on 328 field
10. In a record, put in the dates of 100 fields the values "1000" (1st
date) and "1001" (2d date) ; try to search a book written in year
1000, you should find the record ; idem for year 1001
11. make some searches and sort by date. It should work better as before,
especially if you have values like "c2009" or "impr. 2010" in 210
field
12. Regression test : make some searches on several indexes, like EAN,
etc. It should work as before
Test 10-12 with and without Queryparser activated.
Be careful: with Queryparser activated, the index names (title,
dissertation-information...) must be entered in lowercase only.
Of course, to test search and sort by dates, you need to have full
records, with dates in 100 field as well as 210 field.
Signed-off-by: Paola Rossi <paola.rossi@cineca.it>
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
This patch updates the MARC21 DOM index definitions to
index the 952$i as 'Number-local-acquisition' rather than
'stocknumber'.
To test (for a MARC21/DOM setup):
[1] Copy the MARC21 biblio-zebra-indexdefs.xsl over to the
active Zebra configuration directory.
[2] Reindex the bib records.
[3] Verify that 'stocknumber', 'inv', and 'number-local-acquisition'
searches work.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Adding Number-local-acquisition in C4::Search known indexes allows to
search without using "ccl=" prefix.
Also corrects in ccl.properties : inv must be an alias of
Number-local-acquisition.
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Bug 6256 replaced in bib1.att stocknumber by Number-local-acquisition
for number 1062.
In this case, Number-local-acquisition must be used in record.abs and
stocknumber can be an alias of it in ccl.properties.
Test plan (for MARC21/GRS1):
- drop zebra database (rebuild_zebra.pl -r ...)
- reindex
- test in simple search : ccl=Number-local-acquisition,alwaysmatches=''
=> you get all records with a stocknumber
- test in simple search : ccl=stocknumber,alwaysmatches=''
=> you get the same results
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Test plan the same as the original patch
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Tested according to test plan. Searches tested were:
fic=e
fiction=e
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
ff7-02 1=87020 (position 2 of field 007 in MARC21) should be
ff7-02 1=8702
lf 1=8833
lf fiction
fic fiction
should be
lf 1=8833
fiction lf
fic lf
To test :
1. apply the patch
2. copy the modified ccl.properties into your active Zebra config
directory
3. reindex zebra (rebuild_zebra.pl -b -x -r -v)
4. make some searches using the fixed indexes
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
It could be useful to index the original language of a document (i.e.
"fre" for the English translation of a French novel).
This patch renames the Bib-1 use attribute 1095 from
Code-language-original to language-original and uses it to index:
- MARC21 041$h subfield
- UNIMARC 101$c subfield
It adds "language-original" in the list of index in Search.pm.
Test plan :
A. in a MARC21 GRS1 environment
1. Copy Zebra config files (zebradb/biblios/etc/bib1.att,
zebradb/ccl.properties, marc_defs/marc21/biblios/record.abs) from
your source etc/ directory to your main koha etc/ directory
2. Reindex zebra
3. Make some searches, like "language-original:fre"
B. in a MARC21 DOM environment
4. Copy Zebra config files (zebradb/biblios/etc/bib1.att, zebradb/ccl.properties,
marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl) from your source etc/
directory to your main koha etc/ directory
5. Reindex zebra
6. Make some searches, like "language-original:fre"
C. in a UNIMARC GRS1 environment
7. Copy Zebra config files (zebradb/biblios/etc/bib1.att,
zebradb/ccl.properties, marc_defs/unimarc/biblios/record.abs) from
your source etc/ directory to your main koha etc/ directory
8. Reindex zebra
9. Make some searches, like "language-original:fre"
A. in a UNIMARC DOM environment
10. Copy Zebra config files (zebradb/biblios/etc/bib1.att,
zebradb/ccl.properties, marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl)
from your source etc/ directory to your main koha etc/ directory
11. Reindex zebra
12. Make some searches, like "language-original:fre"
Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
With this combination of sysprefs, and a UNIMARC configuration, it was
impossible to search on location, barcode and ccode indexes :
QueryWeightFields is activated
QueryAutoTruncate only if * is added
But in UNIMARC, location, barcode and ccode (995 $e,$f,h) were indexed
only as "words". They need to be indexed also as "phrase".
Additionnaly, in UNIMARC, information about damaged and withdrawn status
of items is not indexed, while it is done in MARC21.
This patch
- add 2 new indexes for 995$1 (damaged) and 995$3 (withdrawn)
- index location, barcode and ccode as "phrase" as well as "words"
Indexing of items in UNIMARC could be improved later. So this patch also
add comments explaining the origin of Koha 995, I think it could be
useful for further changes.
To test, on a UNIMARC configuration :
A. indexed with GRS-1
1) Set sysprefs QueryWeightFields as "activated" and QueryAutoTruncate
as "only if * is added"
2) Select location index in advanced search and search for a value
existing in your records in 995$e => 0 results
3) Apply patch
4) Rebuild zebra
5) Select location index in advanced search and search for a value
existing in your records in 995$e => x results
6) Mark an item as withdrawn; search "withdrawn:1" => x results, and
among them the biblio to which the item is attached
7) Mark an item as damaged ; search "damaged:1" => x results, and among
them the biblio to which the item is attached
B. indexed with DOM
Do the same operations
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Work as described. No koha-qa errors
Test
Apply the patch
Begin with GRS-1
Full reindex
Search by location, no results
cp files biblio-*-indexdefs.xml and record.abs to destination on etc/zebra
Full reindex
Search by location, got results
Switch to DOM
reset files
Full reindex
Search by location, no results
cp files
Full reindex
Search by location, results !
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
I took as a base the patch of F. Demians, but made a lot of changes,
so I think it is more logical to create a new patch as the behavior is
not the same as previous patch.
I tried to define DOM config files as a "miror" of record.abs, so the
behavior be the same.
If it is OK, we will be able to improve indexing later, for example
suppressing warns, managing indicators or subdivisions, etc.
I made some little changes to record.abs :
- comments
- 216 was indexed in Conference-name as well as Trademark. I suppose
that "Conference-name" is an error, so I indexed only in Trademark
- index 2 new notes : 340 / 356
The only difference between record.abs and DOM is that DOM config files
does not index complete fields, but subfields.
Ex :
melm 200 ===> <kohaidx:index_subfields tag="200" subfields="abcdfgjxyz">
I took all the subfields from the UNIMARC Authorities manual. The only
subfields not indexed are numeric subfields : $7, $8 for language of
record, and $0,2,3,5,6 for 4XX/5XX/7XX
To test :
- index a set of bib and auth records with GRS-1
- make some searches on different kind of authorities
- index the same records with DOM
- make the same searches
- You are not supposed to see differences
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
As I am not a UNIMARC user it's hard for me to test this, but
while testing other authority related patches I noticed that I couldn't
index the UNIMARC authorities of the sample base. The files are obviously
missing and reindex_zebra.pl notes this. With this patch applied,
indexing works and authorities are searchable in my installation.
Signed-off-by: Vitor Fernandes <fvernandes@keep.pt>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
In UNIMARC DOM indexing, "item" index was working only for subfields
of 995 field mapped with specific indexes, and also in index (ex :
$a, $b...). It was not working for the other subfields (ex : $g),
because a comment from record.abs was integrated in DOM config files.
This patch removes the comment.
To test, in a DOM UNIMARC environment :
1) In a item, write some value "Test10037" in 995$g
2) Search for this value in simple search, this way : item=Test10037
=> you should have no results
3) Apply the patch. if necessary, copy the modified
etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml and
etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl into
the /etc/... directory in your main Koha directory
4) Reindex Zebra biblios
5) Do the same search as 2) => you should have one result
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Work as described. No koha-qa errors.
Test
NOTE: default UNIMARC framework don't have 995g,
so I must add it first.
1) Added test string to 995b on some record
2) Reindex and search as indicated, no results
3) cp files to destination
4) reindex
5) search and result ok !
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
To test:
[1] When running t/db_dependent/Search.t, veify that no warnings like
this are shown:
15:52:07-10/10 zebraidx(2006) [warn] Index 'Number-music-publisher' not found in attset(s)
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
This patch fixes biblio-zebra-indexdefs.xsl files.
It was generated from biblio-koha-indexdefs.xsm with the new
koha-indexdefs-to-zebra.xsl amended by F. Démians's patch.
To test :
- Take a DOM UNIMARC Koha
- Apply all the patchs of 8252 bug, including this one
- Copy src/etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl
to your etc/zebradb/marc_defs/unimarc/biblios/ located in your
installation directory
- Run rebuid_zebra -b -x -r -v
- make advanced searches on staff interface and opac, on coded fields
indexes (Audience, Literary genre, Biography, Illustration, Content,
Video Types, Serial Type, Periodicity, Regularity, Picture)
Signed-off-by: Frédéric Demians <f.demians@tamil.fr>
Ok for me. This patch put in sync indexes XSL definition with
authoritative XML definition. Subsequently, it won't be difficult to
amend DOM UNIMARC indexes defintion if necessary. And, as it is, I don't
see any regression, whereas I can see huge improvements. Thanks Mathieu!
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
This patch modify koha-indexdefs-to-zebra.xsl in order to add the
ability to populate indexes with subfield substring.
It's now possible to understand such construction as:
<index_subfields xmlns="http://..." tag="100" subfields="a" offset="7" length="1">
<target_index>tpubdate:s</target_index>
</index_subfields>
Signed-off-by:Mathieu Saby <mathieu.saby@univ-rennes2.fr>
I applied the patch and ran
xsltproc koha-indexdefs-to-zebra.xsl ../marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml \
> ../marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl
I looked at the generated file. It looks nice.
Then I copied it file in my INSTALLDIR/etc/zebra.... and reindexed my
records with rebuild_zebra.pl
I made some searches on coded position index and non coded position
indexes, everything works.
http://bugs.koha-community.org/show_bug.cgi?id=8252
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
This followup restores the original wording of "Date/time-last-modified"
index, and change the name of "Music-number" index to
"Number-music-publisher"
To test :
1. In a UNIMARC Koha instance
2. Apply patchs #1, #2 and this followup
3. Copy from src/etc/zebradb directory to the etc/zebradb/ in your main
Koha directory the following files:
-- zebradb/biblios/etc/bib1.att
-- zebradb/ccl.properties
-- zebradb/marc_defs/unimarc/biblios/record.abs
-- zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml
-- zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl
4. Rebuild zebra with -b -x -v -r options
5. Write a value like "test071a" in 071$a field in a record
6. Check if you can find this record with this search:
"ccl=Number-music-publisher:test071a"
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
No koha-qa errors.
Test
Copy files
reindex full
Modify a couple of record to add 071a with test message
Reindex -v -z -b -x
Search test message as described and found modified records.
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
This patch makes the same changes in UNIMARC DOM configuration as patch
1 made for GRS-1.
positions of subfields are indexed that way :
In biblio-koha-indexdefs.xml :
tag="100" subfields="a" offset="17" length="1"
In biblio-zebra-indexdefs.xsl :
xslo:value-of select="substring(., 17, 1)"
I had to edit biblio-zebra-indexdefs.xsl by hand, because
etc/zebdradb/xml/koha-indexdefs-to-zebra.xsl does only support
"subtring" in handle-one-index-control-field template.
It is good for MARC21, but not for UNIMARC : in MARC21, indexing
subtrings is needed for controled field (001-009, with no subfields)
But in UNIMARC it is needed for subfields of 1XX fields.
So if DOM indexing is working with these new files, we may need to
change koha-indexdefs-to-zebra.xsl.
Test plan (not possible in a sandbox) :
1) In a Koha instance using UNIMARC and DOM indexing
2) Apply Patch 1 and Patch 2 (this one)
3) Copy the following files from the etc/zebradb directory of your
source into the etc/zebradb directory of your main Koha directory :
-- etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml
-- etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl
-- etc/zebradb/ccl.properties
-- etc/zebradb/biblios/etc/bib1.att
4) rebuild zebra with -x -b -r -v options
5) check if coded filters in advanced search are usable in OPAC and
Staff interface
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Works. No koha-qa errors.
Test for DOM
Apply patches
Don't forget to copy files
reindex
Search by coded fields works, also Country-publication
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Before fixing UNIMARC DOM indexing, we must fix GRS-1 indexing
1) In advanced search, some Coded fields index are not working: Print,
Illustration, Content
2) Country-heading index is not working
3) Some subfields are indexed in wrong indexes :
102$a should be in Country-publication instead of Country-heading
(non defined in bib1.att)
106$a, filled only for printed works, should be in ff88-23 (form of
item) instead of itype. (ff88-23 is made for Marc21 008 pos
23, which contains the same data as 106a)
200$b should be in Material-type instead of (or in addition to) itype
and itemtype: (Material-type :"free-form string, ... that
describes the material type of the item, e.g., cassette, kit,
computer database, computer file.")
100$a pos 22-24 should not be indexed as "ln" : it is the language of
the record, not the language of the ressource
4) Index names are too long : if we index new positions of coded fields,
with existing names it breaks Zebra indexing (there must be a limit
in line lenghth in record.abs?)
5) There are a lot of warns when rebuiding zebra.
This patch make some changes in bib1.att (could be used later to improve
search) :
- fixing wording for att 51 and 1012
- adding comments for attributes based on MARC21 008 field (8800-8841)
- creating 8806 (tpubdate), 8838 (Modified-code), 8818 (ff8-18), 8840
(ff8-18-21), 8819 (ff8-19), 8821 (ff8-21), 8828 (ff8-28), 8830
(ff8-30), 8831 (ff8-31)
- creating attributes specific to UNIMARC : 9701-9707 (Video-mt,
Graphics-type, Graphics-support, Title-page-availability,
Cumulative-index-availability, script-Title, char-encoding)
- setting apart 3 blocks of attributes, so it could be easy to make
further changes :
-- common to Marc21 and UNIMARC : 8806, 8822, 8838
-- slightly different in Marc21 and UNIMARC (different meanings
according to the type of the record => don't match a single
UNIMARC field)
-- specific to UNIMARC : 9701-9707
In ccl.properties :
- creating a new index: Country-publication 1=1053
- suppressing some warns by mapping with bib1 att:
Date-time-last-modified, Name, rtype, Music-number
- defining indexes using the 3 blocks attributes defined in bib1
(common to Marc21 and UNIMARC, slightly different, specific to UNIMARC)
In record.abs :
- renaming some index for 100-105-110 fields
- correcting indexing of 102$a (country of publication)
106$a (ff88-23)
100$a pos 22-24 (language of record, no more
indexed)
105$a pos. 0-3 (illustration code)
200$b (for the moment, I keep it indexed in
itype and itemtype, but also Material-Type)
In C4/Search.pm :
- adding "Country-publication" index
In OPAC and staff interface template subtypes_unimarc.in :
- renaming indexes to take into account the changes made to Zebra
config files
To test (this cannot be done with a sandbox) :
1) Apply the patch in a UNIMARC GRS-1 Koha instance
2) Copy the following files from the etc/zebradb of your source
directory into the etc/zebradb of your main Koha directory:
-- etc/zebradb/biblios/etc/bib1.att
-- etc/zebradb/ccl.properties
-- etc/zebradb/marc_defs/unimarc/biblios/record.abs
3) Reindex your data (rebuild_zebra -x -b -r -v)
4) Try to use those Coded fields indexes in Advanced search, in OPAC
and Staff interface (available after clicking on "More options",
then on "Coded information filters"):
Audience, Print, Literary genre, Biography, Illustration, Content,
Video Types, Serials, Serial Type, Periodicity, Regularity
5) Try to search "Country-publication=FR" in simple search
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
No koha-qa errors.
Tests for GRS-1
Followed test plan
Search by coded fields works, but only on OPAC,
on staff there are few options
Search by Country-publication works after patch
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
There were 5 redundant mappings left in the file. They are removed.
Its almost a QA followup as everything went fine anyway.
Sponsored-by: Universidad Nacional de Córdoba
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
This patch provides a definition file for Spanish (es) character
sorting in Zebra. It is based on the ideas from Hugo Agud <hagud@orex.es>
and Pablo Bianchi <pablo.bianchi@gmail.com>.
Makefile.PL is fixed to notice the existence of the 'es' language. The
docs for koha-create are touched too.
To test:
Tarball
=======
- Go through the install process, choosing 'es' for the Zebra's language step
- Koha should work as usual.
- Running this should show the lang definition is properly set.
$ grep -R "lang_defs/es" /etc/koha/*
(stuff like zebradb/zebra-biblios-dom.cfg:profilePath:...etc/koha/zebra/lang_defs/es... should show)
- This file should be present:
/etc/koha/zebradb/lang_defs/es/sort-string-utf.chr
Packages
========
- Build your own package, it shouldn't break the packaging
- Try the new package, using koha-create to set an instance using --lang 'es'
Sponsored-by: Universidad Nacional de Córdoba
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
When i did bug 8805, I gave the biblio-koha-indexdefs.xml file the
wrong name, and called it biblio-zebra-indexdefs.xml. This patch
fixes that.
To reproduce:
- Check that etc/zebradb/marc_defs/normarc/biblios/biblio-zebra-
indexdefs.xml exists
To test:
- Apply the patch and check that etc/zebradb/marc_defs/normarc/
biblios/biblio-zebra-indexdefs.xml no longer exists, but that
etc/zebradb/marc_defs/normarc/biblios/biblio-koha-indexdefs.xml
does exist.
Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Problem:
Links between anaytics records were not being displayed for NORMARC setups.
What this patch does:
1. Add indexing for 773 subfield a, w and 9; both for GRS-1 and DOM indexing
(The DOM indexing config was generated from the GRS-1 record.abs)
2. Add "analytics links" to NORMARC XSLT files, both for OPAC and intranet
To test:
- Make sure you have a NORMARC installation
- Set UseControlNumber = Use
- Create a parent record with LDR/07=c. Leave 001 empty.
- In the "Normal" view, do New > New child record and create another record. Do
this twice (so you get a list of hits when you click on the "Show anaytics"
links later on).
- Do the following steps both in the OPAC and the Intranet:
- Search for the parent record in such a way that you can see the record in a
*result list*
- Check that the "Show analytics" link is displayed, and uses the title of the
parent record for linking: ?q=Host-item:<Title of parent record>
- Clik on the "Show analytics" link and check that you get a result list with
the two child records you created earlier
- Go back to the result list and click on the parent record, so you get the
*detail view*
- Check that the "Show analytics" link is displayed, and uses the title of the
parent record for linking: ?q=Host-item:<Title of parent record>
- Clik on the "Show analytics" link and check that you get a result list with
the two child records you created earlier
- Search for one or both of the child records in such a way that you can see
the record(s) in a *result list*
- Check that the "In: <Title of parent record>" link is displayed, and that it
uses the biblionumber of the parent record for linking:
?q=Control-number:<biblionumber of parent record>
- Click on the "In: <Title of parent record>" link, and check that the parent
record is displayed
- Go back to the result list and click on the child record, so you get the
*detail view*
- Check that the "In: <Title of parent record>" link is displayed, and that it
uses the biblionumber of the parent record for linking:
?q=Control-number:<biblionumber of parent record>
- Click on the "In: <Title of parent record>" link, and check that the parent
record is displayed
- Now edit the parent record and put it's biblionumber in 001. Repeat the steps
above, and check that everything still works, but that the links are different:
- The "Show analytics" link on the parent record should look like this:
?q=rcn:<biblionumber of parent record>+and+(bib-level:a+or+bib-level:b)
- The "In: <Title of parent record>" link on the child records should be the
same as it was earlier
- Now set UseControlNumber = "Don't use" and repeat all of the steps above
- All of the links should still be displayed and work, of course
- The "In: <Title of parent record>" link on the child records should look
like this: ?q=ti,phr:<Title of parent record>
- The "Show analytics" link on the parent record should look like this:
?q=Host-item:<Title of parent record>
- Change LDR/07 to "s" and repeat all of the steps above
- Do all of this both for GRS-1 indexing and for DOM indexing...
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
See the bug for a description of the problem.
This patch tries to restore searching for marcflavour != MARC21 as well as
allowing instances with different marcflavors to co-exist on the same server.
To test:
- Do a package install with e.g. the official squeeze-dev packages and create at
least two instances, with different marcflavours, e.g.:
sudo koha-create --create-db --marcflavor marc21 test1
sudo koha-create --create-db --marcflavor normarc test2
- Run through the web installers for both instances and add a couple of
records to each. Wait for the records to be indexed or run indexing manually
with
sudo koha-rebuild-zebra -f test1
sudo koha-rebuild-zebra -f test2
- Try searching for the records you added. It should work in test1 but not in
test2.
- Apply the patch and build packages with the build-git-snapshot script
- Install the new koha-common package
- Create two instances (because of Bug 9754 it is probably best to give the
instances different names than the ones you created above, or to do this on
a fresh VM or similar) and add records, as described above. Searching should
now work equally well for both instances.
Please note: Because of Bug 9752 you will have to set marcflavour = NORMARC
by hand before you do the searching, if you choose NORMARC as the marc flavour
on one of the instances you create.
Please note too: I am not confident that this is the perfect solution, so
merciless and thorough testing is necessary! ;-)
Signed-off-by: Mirko Tietgen <mirko@abunchofthings.net>
Works for me for GRS-1 (package installation out of the box). Could not figure out how to set up DOM indexing and eventually stopped caring about it.
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Build packages with the patch and checked that creating
instances and search within them works for both MARC21 and NORMARC.
All tests and QA script pass.
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
The definition of the Any index was sensitive to whether
spaces were present between (say) subfield elements in the
MARCXML representation of the bib being indexed. When using
the -x option to rebuild_zebra.pl, spaces would be present
because of how MARC::File::XML emits MARCXML.
When not using the -x option, spaces would not be present
and the contents of a field would be run together, potentially
as one big token.
The visible behavior was that doing a keyword search by
item barcode would sometimes not work.
To test:
0) Make sure Zebra is using DOM mode
1) Create an item record.
2) Reindex using rebuild_zebra.pl -b -z, *without* -x
3) Do a keyword search by the barcode of the item just
added; the search shouldn't work
4) Apply patch.
5) Update the following two files:
etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl
etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl
6) Reindex
7) Do a search that was previously failing.
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Fixes the problem for me - formerly not working callnumbers
and barcodes are now found in keyword (any) searches.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
(revised commit description to better explain why it fixes the problem)
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Passes all my tests, happy to sign off
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Currently, you can use "lt,le,eq,ge" in your CCL query to handle
"lesser than, lesser or equal to, equal to, greater than or equal
to" relationships.
The only one missing is "gt" (Bib1 2=5).
The mappings are also off "ne, phonetic, stem", but those are Bib1
attributes that Zebra doesn't support, so that's not really relevant.
To test:
[1] Before applying the patch, try the following query in the OPAC:
pubdate,gt:2006
You should get "no results found".
[2] After applying the patch (and note that ccl.properties will usually
need to be installed in the run-time Zebra configuration directory), try
the same search. This time, you could get back the titles whose
publication date is after 2006.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
This patch makes the following changes to record.abs, biblio-koha-indexdefs.xml and biblio-zebra-indexdefs.xsl :
- adding new (sub)fields to Identifier-standard index : 011f/g ; 012a ; 013a/z ; 014a/z ; 015a/z ; 016a/z ; 017a/z, 040a/z, 071z, 072z, 073z
- adding 1 new subfield to Publisher index : 071b (may contain the name of a music publisher)
- adding new (sub)fields to Author and Identifier-standard index (for the $9) : 716, 72X, 730 - adding new (sub)fields to Note : 334$a (award note)
- correcting 207 and 208
- suppressing 308a and 328a in Note (useless as complete fields are indexed in same index)
- adding (sub)fields to Title index : 411t, 421-425t, 433-437t, 442-444t, 446-456t, 462-463t, 470-488t, 560
- adding (sub)fields to Subject and Identifier-standard index (for the $9) : 608, 615, 616, 617, 620, 621
- adding some classifications index : 670, 675, 686 - adding some comments (to make easier further modifications and to identify non unimarc fields : 414-420, 603, 630-636, 646)
To test :
- take a record and fill some of the missing fields (e.g 488t, 608, 720, 012a) with some data as "field488", "field608" etc
- try to find the record => not possible
- apply the patch, copy the new record.abs in etc/zebradb/biblios/etc and rebuild zebra
- try to find the record => should be ok
- check nothing else is broken...
- same test with DOM indexing activated
http://bugs.koha-community.org/show_bug.cgi?id=8984
Signed-off-by: Zeno Tajoli <tajoli@cilea.it>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Tested with Zebra, marc21, grs1.
Discovered that paging through auth search results does no longer work, but that is not related to these changes.
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Tested with Zebra, marc21, dom.
All tests pass.
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Patch re-done so it applies, had that double-utf8 problem
There was no entry in authority's record.abs for indexing chronological
terms. They couldn't be searched and (obviously) linked.
I've added those entries using the index names defined in
authorities/etc/bib1.att
Regards
To+
Sponsored-by: Universidad Nacional de Córdoba
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Passed-QA-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
This is required in order for Koha to support DOM indexing of the
NORMARC dialect, cf Bug "Bug 7818 - support DOM mode for Zebra
indexing of bibliographic records".
The two files in this patch were generated from the NORMARC
record.abs by doing the steps suggested at the bottom here:
http://wiki.koha-community.org/wiki/Switching_to_dom_indexing
No manual editing was involved.
To test:
- Do a fresh install, choosing NORMARC as the MARC dialect
- Run rebuild_zebra.pl and check it does not complain about missing
files or other things
- Check that search works as expected. Using MARC21 records for
the testing should be OK.
2012-10-31: New patch after an update to Bug 8665
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Passed-QA-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
The DOM transformer was missing a line from a previous development,
resulting in the MARC21 authorities DOM indexing stylesheet being
regenerated with a missing line. This patch readds the missing line
to the transformer, and provides the corrected authority-zebra-indexdefs.
Signed-off-by: Elliott Davis <elliott@bywatersolutions.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Use a user-specified field for z:id.
This patch also fixes an excess space before the index in the MARC21
biblio index definitions, which someone fixed in the generated file
but not in the source file it should have been fixed in.
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Elliott Davis <elliott@bywatersolutions.com>
Modify Makefile.PL and Zebra configuration files in order to parametrized
biblio record type returned by Zebra Z39.50 server.
How to test:
- Test with a MARC21 and a UNIMARC DB
- Do a new installation
- Search from OPAC
- Search from a Z39.50 client like yaz-client: syntax = MARC21/UNIMARC must be
choosed
- It was working for MARC21: it continues to work
- It wasn't working for UNIMARC: it works now, both in OPAC and from a Z39.50
client
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Works fine for MARC21. Frederic looked at UNIMARC. Magnus looked at NORMARC.
GRS1 works okay for me. I still have issues with DOM, but they are not directly related to changes in this patch.
A followup is still needed for packaging (debian/templates).
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
In order to make matching rules more useful for MARC21 authorities,
this patch adds special indexes on previous see-from headings and
LCCN. This patch does not change UNIMARC authority configuration in
any way. Also modifies the Koha schema in preparation for adding
authority import and matching to the Staging tools.
To install:
1. Run installer/data/mysql/atomicupdate/importauthorities.pl
2. Update the following four files in your koha-dev:
etc/zebradb/authorities/etc/bib1.att
etc/zebradb/marc_defs/marc21/authorities/authority-koha-indexdefs.xml
etc/zebradb/marc_defs/marc21/authorities/authority-zebra-indexdefs.xsl
etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl
3. Reindex your authorities:
misc/migration_tools/rebuild_zebra.pl -a -r -v
NOTE TO RM: this patch adds an atomicupdate file that needs to be
incorporated into updatedatabase.pl if bug 7167 is not pushed.
http://bugs.koha-community.org/show_bug.cgi?id=2060
Signed-off-by: Elliott Davis <elliott@bywatersolutions.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Rebased on master 1 August 2012
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Rebased on master 11 September 2012
The superfluous whitespace after the definition of subject
tag $9s is causing an error when carried over into dom config
files so that the authority links fail to index
Also removed the (harmless) trailing space in the equivalent
Unimarc files
A good editor and git can help in not creating excess whitespace
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Although the Match index was correctly configured for UNIMARC
authorities and MARC21 authorities indexed with DOM, the Match
index was inadvertantly removed from the record.abs file for
MARC21 authorities at some point. Since the Match index is required
to make best use of the new search options, this patch adds it
back in.
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
modified: etc/zebradb/marc_defs/marc21/biblios/record.abs
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
I tested two UNIMARC Koha installations using the sample UNIMARC
data from the BibLibre sandbox, comparing the results with DOM
and with GRS-1 indexing. The results are very similar, though there
are some differences. Most noticeable:
* relevance and facets seem to be more accurate with DOM enabled
* the GRS-1 configuration returns approximately 10% more results with
random single keywords like "petit," but the DOM results contain
the most relevant items, and any lacks in the configuration can
easily be corrected as UNIMARC users identify fields that should be
indexed but aren't
* authority-controlled searches match exactly
* author and topic facets do not work with the out-of-the-box GRS-1
indexing configuration (?!?)
(adding second sign-off line below because all that probably looks like
a commit message and not a sign off)
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
This commit also updates the authority and biblio DOM indexing definition
XSL to include updated header comments.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
DOM indexing is now available for both bibs and authorities.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
One consequence is that the -x and -a options are no longer
mutually exclusive.
Also, because of the way that the GRS-1 SGML filter works, if you're
indexing multiple documents, you can't just wrap them in a document
element, but the DOM filter *requires* it. Consequently, two
new config settings in koha-conf.xml are added to indicate the
Zebra filter in use so that the -x option of rebuild_zebra.pl
knows whether to wrap the exported records or not:
- bib_index_mode (defaults to 'grs1' if not specified)
- auth_index_mode (defaults to 'dom')
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
The file biblio-zebra-indexdefs.xsl, which is the stylesheet that
is used by the Zebra DOM filter to convert an incoming MARC21 bib
to its indexed form, was generated by the following two steps:
misc/maintenance/make_zebra_dom_cfg_from_record_abs \
--input etc/zebradb/marc_defs/marc21/biblios/record.abs \
--output etc/zebradb/marc_defs/marc21/biblios/biblio-koha-indexdefs.xml
xsltproc etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl \
etc/zebradb/marc_defs/marc21/biblios/biblio-koha-indexdefs.xml \
> etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl
Records indexed using this XSLTshould behave similarly to records
indexed using the GRS-1 filter and the old record.abs definition, with
the following big exception (and improvemwent): indexed phrases now
span subfield boundaries if a specific subfield wasn't specified in the
index definition. For example, the GRS-1 filter index definition
melm 245 Title
would allow 245 $a Cats on boxes : $b cardboard fantasies
to be searched as the phrases "cats on boxes" or "cardboard fantasies",
but a title phrase seach of "cats on boxes cardboard fantasises"
wouldn't work. The DOM filter equivalent,
<index_data_field xmlns="http://www.koha-community.org/schemas/index-defs" tag="245">
<target_index>Title:w</target_index>
<target_index>Title:p</target_index>
</index_data_field>
*does* allow phrase searches to span subfield boundaries.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Adds a new kohaidx:index_data_field index definition type which
indexes all of the subfields of a MARC data field as a single
phrase, separating the contents of each with a space.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Since the koha-indexdefs-to-zebra.xsl stylesheet will be used
by both bib and authority indexing, put in a central location.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Adds the necessary bits to enable DOM indexing for bib
records as an option during installation from source.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Add the option of sorting authority search results by authid, and instruct the
FirstMatch and LastMatch linkers to use that sort order rather than the default
search order.
To test:
1. Install new Zebra authorities config
etc/zebradb/marc_defs/marc21/authorities/authority-koha-indexdefs.xml,
etc/zebradb/marc_defs/marc21/authorities/authority-zebra-indexdefs.xsl,
etc/zebradb/marc_defs/marc21/authorities/record.abs, and
etc/zebradb/marc_defs/unimarc/authorities/record.abs
2. Reindex authorities in Zebra
3. Set LinkerModule to FirstMatch or LastMatch
4. Add two identical authority records, and a bib record with a heading that
matches them
5. Run misc/link_bibs_to_authorities.pl on that record
6. Confirm that the authid that's been inserted into subfield $9 of that
heading is the first, if you selected FirstMatch, or last if you selected
LastMatch
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
I followed the test plan and checked that for "Last match" and "First match"
the correct authority was selected and linked to the record.
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
IMPORTANT! This patch relies on the patch for Bug 7092, which is now pushed to
master.
As the title says, this patch implements TraceCompleteSubfields,
TraceSubjectSubdivisions and UseICU for NORMARC XSLT, both for the OPAC
and the Intranet. This affects how clickable subject-links are constructed.
To make this work the indexing of MARC fields in the 600 range is changed
to include "Subject:p" in several new places.
To test:
Find a record with a "complex" subject, like "Internet -- Law and legislation".
MARC21 and NORMARC are very similar in how they handle subjects, so testing
on a MARC21 database should be OK. (Changes in indexing reflect changes already
made to the MARC21 indexing.)
Make sure you have these syspref settings:
- marcflavour = NORMARC
- XSLTDetailsDisplay = using XSLT stylesheets
- OPACXSLTDetailsDisplay = using XSLT stylesheets
(Ideally, testing should be done on a real NORMARC setup, but since the changes
to indexing only reflect how it's already done in MARC21, I think testing
on a MARC21 installation with marcflavour = NORMARC should be OK.)
Now try the different combinations of TraceCompleteSubfields,
TraceSubjectSubdivisions and UseICU, and check the format of the
clickable links, both in the OPAC and staff client. Here's what you should
be seeing:
1.
TraceCompleteSubfields = Don't force
TraceSubjectSubdivisions = Don't include
UseICU = Not using
opac-search.pl?q=su:"Internet"
UseICU = Using
opac-search.pl?q=su:{Internet}
2.
TraceCompleteSubfields = Force
TraceSubjectSubdivisions = Don't include
UseICU = Not using
opac-search.pl?q=su,complete-subfield:"Internet"
UseICU = Using
opac-search.pl?q=su,complete-subfield:{Internet}
3.
TraceCompleteSubfields = Don't force
TraceSubjectSubdivisions = Include
UseICU = Not using
opac-search.pl?q=(su:"Internet") AND (su:"Law and legislation.")
UseICU = Using
opac-search.pl?q=(su:{Internet}) AND (su:{Law and legislation.})
4.
TraceCompleteSubfields = Force
TraceSubjectSubdivisions = Include
UseICU = Not using
opac-search.pl?q=(su,complete-subfield:"Internet") AND (su,complete-subfield:"Law and legislation.")
UseICU = Using
opac-search.pl?q=(su,complete-subfield:{Internet}) AND (su,complete-subfield:{Law and legislation.})
UPDATE 2012-03-23
- Change the syspref TracingQuotes to UseICU, see bug 7092
- Change boolean operator from "and" to "AND", see bug 7695
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Note: UseControlnumber must be turned off.
1) Works.
2) Works.
3) Works.
4) Works.
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Word search with multi-part facets works properly only with Zebra ICU
tokenization. This patch add a new question to Koha command line
installer:
Zebra has two methods to perform records tokenization
and characters normalization: CHR and ICU. ICU is
recommended for catalogs containing non-Latin
characters. (chr, icu) [chr]
How to test:
- perl ./Makefile.PL
- Try each possible value for new parameter
- Take a look at zebradb/etc/default.idx file.
Depending of the parameter you get this line:
icuchain words-icu.xml
or this one:
charmap word-phrase-utf.chr
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
(Note: This patch was previously associated with bug 3216; I moved it to a
separate bug because including ICU is a good idea independent of the fix for
the particular issue described in bug 3216)
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Add the Match-heading and Match-heading-see-from indexes to the UNIMARC Zebra
configuration.
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Tested with an UNIMARC setup that things work fine. They do