From 0dd1ac40a0df984bee6583f13f80744b539b8f37 Mon Sep 17 00:00:00 2001
From: Mathieu Saby
Date: Sat, 16 Mar 2013 19:47:20 +0100
Subject: [PATCH] Bug 9828: More specific indexing of UNIMARC 6XX fields

[New commit on 18 Aug 2014 : rebased, and DOM indexing only]

Issues to fix :
Most 6XX fields may contain a $2 that identifies the system used for indexing.
It should not be indexed. In French libraries, $2 contains "rameau", so
searching for books about the composer "Rameau" retrieves thousands of records!
For some 6XX fields, other subfields should not be indexed either, for example
dates of persons and families, or addresses.
In the UNIMARC guide, 600$t, 601$t and 602$t are said to exist but to be
"not used". I keep them indexed.

Additionally, subject indexing could be improved by using specific indexes
for each 6XX where possible :
In ccl.properties :
- su-to, su-geo and su-ut are defined as aliases of Subject.
- a specific index is defined, but not used in record.abs :
  Subject-name-personal, alias su-na
We can use these indexes and create new specific indexes by using existing
bib1 attributes.
We could also index the $j, $x, $y, $z subdivisions in specific indexes.
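The $2 problem described above can be illustrated with a toy model. This is
only a sketch, not how Zebra works internally: the record, the
subject_terms helper and the SKIPPED set are made up for illustration, and
real indexing is driven by the DOM index definitions changed in this patch.

```python
# Toy model only: real indexing is done by Zebra from the index definitions.
# A UNIMARC 606 field as (subfield code, value) pairs; the record is made up.
FIELD_606 = [
    ("a", "Opera"),
    ("y", "France"),
    ("z", "18th century"),
    ("2", "rameau"),  # source of the heading, not part of the subject
]

# Control subfields assumed to be left out of the Subject index.
SKIPPED = frozenset({"2", "3", "5"})


def subject_terms(field, skipped=frozenset()):
    """Return the lower-cased words that would land in a Subject word index."""
    terms = set()
    for code, value in field:
        if code in skipped:
            continue  # this subfield is not indexed at all
        terms.update(value.lower().split())
    return terms


before = subject_terms(FIELD_606)           # every subfield indexed
after = subject_terms(FIELD_606, SKIPPED)   # $2, $3 and $5 left out

print("rameau" in before)  # True: a book on opera matches a search on Rameau
print("rameau" in after)   # False: the thesaurus code no longer matches
```

With every subfield indexed, any record catalogued with the Rameau
thesaurus matches a subject search on "rameau", whatever its topic.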
This patch makes the following changes :
1) For all 6XX : not indexing $2 (LCSH, Rameau...), $3 and $5
2) Suppressing the indexing of some specific subfields, depending on the field:
600 : Personal name used as a subject // see Marc21 600
    not indexing c (additional elements), f (dates), p (address/affiliation)
602 : Family name used as a subject // see Marc21 600 3X
    not indexing f (dates)
616 : Trademark
    not indexing c, f
3) For all 6XX : indexing $j, $x, $y, $z in several indexes in addition to
the specific index for their 6XX field
4) Defining some specific indexes in ccl.properties :
Subject-name-conference 1=1073 => alias su-conf
Subject-name-corporate 1=1074 => alias su-corp
Subject-genre-form 1=1075 => alias su-genre and su-form
Subject-geographical 1=1076 => alias su-geo
Subject-chronological 1=1077 => alias su-chrono
Subject-title 1=1078 => alias su-ut and su-ti
Subject-topical 1=1079 => alias su-to
5) Adding new aliases in Search.pm : su-chrono, su-form, su-genre, su-corp,
su-conf, su-ti
6) Using these new indexes for
600 : Subject and Subject-name-personal ; all subfields except subdivisions
    in Personal-name
601 : Subject, Subject-name-corporate and Subject-name-conference ; all
    subfields except subdivisions in Corporate-name and Conference-name
602 : same as 600 but could be improved later
604 : Subject and Subject-title ; $a in Subject-name-personal ; all subfields
    except subdivisions in Name-and-title
605 : Subject and Subject-title
606 : Subject and Subject-topical
607 : Subject and Subject-geographical ; all subfields except subdivisions
    in Name-geographic
608 : Subject and Subject-genre-form

To test :
A.
In a UNIMARC-DOM indexing environment
1) Apply the patch
2) Rebuild zebra
3) Create a record A with some values in critical fields, for example:
- the string "test9828" in 600$c 600$f 600$p, 602$f, 616$c, 616$f, 606$2, 600$2
- the string "subform" in 600$j
4) Create a record B with the string "subgeo" in 606$y
5) Create a record C with the string "subdate" in 606$z
6) Try to search "su:test9828". You should have no results
7) Try to search "su-genre:subform". You should have 1 result : record A
8) Try to search "su-geo:subgeo". You should have 1 result : record B
9) Try to search "su-chrono:subdate". You should have 1 result : record C
10) On existing records, try the su-ut, su-to, su-na, su-form, su-corp and
su-geo indexes, and see if the results are relevant

Indexing of subjects could maybe be improved later

Signed-off-by: Nick Clemens

All seems to work as expected, I am not super-familiar with UNIMARC but
I wonder if in su-corp and su-conf the subdivisions might be useful
(e.g. France-Gendarmerie / Staatsbibliothek-Berlin)

Signed-off-by: Paul Poulain
Signed-off-by: Tomas Cohen Arazi
---
 C4/Search.pm | 6 +
 etc/zebradb/biblios/etc/bib1.att | 4 +-
 etc/zebradb/ccl.properties | 43 +-
 .../unimarc/biblios/biblio-koha-indexdefs.xml | 711 +++++++++++++++---
 .../biblios/biblio-zebra-indexdefs.xsl | 632 +++++++++++-----
 5 files changed, 1105 insertions(+), 291 deletions(-)

diff --git a/C4/Search.pm b/C4/Search.pm
index 1326f692bb..4de0931fe0 100644
--- a/C4/Search.pm
+++ b/C4/Search.pm
@@ -1215,9 +1215,15 @@ sub getIndexes{
     'Subject-subdivision',
     'Summary',
     'Suppress',
+    'su-chrono',
+    'su-corp',
+    'su-conf',
     'su-geo',
+    'su-form',
+    'su-genre',
     'su-na',
     'su-to',
+    'su-ti',
     'su-ut',
     'ut',
     'Term-genre-form',
diff --git a/etc/zebradb/biblios/etc/bib1.att b/etc/zebradb/biblios/etc/bib1.att
index ed8c64b38a..bf344cebd5 100644
--- a/etc/zebradb/biblios/etc/bib1.att
+++ b/etc/zebradb/biblios/etc/bib1.att
@@ -142,8 +142,8 @@ att 1071 Section-heading
 att 1072 Subject-GOO
 att 1073 
Subject-name-conference att 1074 Subject-name-corporate -att 1075 Subject-genre/form -att 1076 Subject-name-gerographical +att 1075 Subject-genre-form +att 1076 Subject-name-geographical att 1077 Subject-chronological att 1078 Subject-title att 1079 Subject-topical diff --git a/etc/zebradb/ccl.properties b/etc/zebradb/ccl.properties index 99e3498e4b..85611d3562 100644 --- a/etc/zebradb/ccl.properties +++ b/etc/zebradb/ccl.properties @@ -625,9 +625,7 @@ rcn Record-control-number # 655, 656, 657, 69X Subject 1=21 su Subject -su-to Subject -su-geo Subject -su-ut Subject + #Subject-BDI 23 Subject headings from # Bibliotek Dokumentasjon # Informasjon -- a controlled @@ -676,7 +674,7 @@ su-ut Subject # appears in a subject heading. Subject-name-personal 1=1009 su-na 1=1009 -#Subject-name-personal + #Subject-PA 26 Subject headings from 600i2, 610i2, # Thesaurus of Psychological 611i2, 630i2, # Index Terms -- maintained 650i2, 651i2 @@ -722,11 +720,43 @@ su-na 1=1009 #Subject-subdivision 47 An extension to a subject 6XX$x, 6XX$y, # heading indicating the form, 6XX$z -# place, period of time treated, +# place, period of time treated, UNIMARC 6XX$j # or aspect of the subject # treated. 
Subject-subdivision 1=47 +#Subject-name-conference 1073 MARC21 611 ; UNIMARC 601 +Subject-name-conference 1=1073 +su-conf Subject-name-conference + +#Subject-name-corporate 1074 MARC21 610 ; UNIMARC 601 +Subject-name-corporate 1=1074 +su-corp Subject-name-corporate + +#Subject-genre-form 1075 MARC21 610 ; UNIMARC 608 +# UNIMARC 6XX$j +Subject-genre-form 1=1075 +su-genre Subject-genre-form +su-form Subject-genre-form + +#Subject-geographical 1076 MARC21 651 ; UNIMARC 607 +# MARC21 AND UNIMARC 6XX$y +Subject-geographical 1=1076 +su-geo Subject-geographical + +#Subject-chronological 1077 MARC21 and UNIMARC 6XX$z +Subject-chronological 1=1077 +su-chrono Subject-chronological + +#Subject-title 1078 MARC21 630 ; UNIMARC 605 +Subject-title 1=1078 +su-ut Subject-title +su-ti Subject-title + +#Subject-topical 1079 MARC21 650 ; UNIMARC 606 +Subject-topical 1=1079 +su-to Subject-topical + #Title 4 A word, phrase, character, 130, 21X-24X, 440, # or group of characters, 490, 730, 740, 830, # normally appearing in an item, 840, subfield $t @@ -1218,9 +1248,6 @@ sort3 7=3 #corporateName 1=2 #conferenceName 1=3 #uniformTitle 1=6 -#geographicName 1=58 -#topicalSubject 1=1079 -#genreForm 1=1075 ################################################### # Rules for a few GILS fields diff --git a/etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml b/etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml index 10dc41ec55..81fac14731 100644 --- a/etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml +++ b/etc/zebradb/marc_defs/unimarc/biblios/biblio-koha-indexdefs.xml @@ -1156,148 +1156,683 @@ Title:p - - + + + + + + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + 
Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + Koha-Auth-Number:w + Koha-Auth-Number:n + + + + + + + Personal-name:w Personal-name:p + Subject-name-personal:w + Subject-name-personal:p Subject:w Subject:p - - Koha-Auth-Number:w - Koha-Auth-Number:n + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-genre-form:w + Subject-genre-form:p + Subject-name-personal:w + Subject-name-personal:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-name-personal:w + Subject-name-personal:p - + + Subject:w Subject:p - - - + Subject-subdivision:w + Subject-subdivision:p + Subject-name-geographical:w + Subject-name-geographical:p + Subject-name-personal:w + Subject-name-personal:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-chronological:w + Subject-chronological:p + Subject-name-personal:w + Subject-name-personal:p + + + + + + + + + + Corporate-name:w - Conference-name:w Corporate-name:p + Conference-name:w Conference-name:p + Subject-name-conference:w + Subject-name-conference:p + Subject-name-corporate:w + Subject-name-corporate:p + Subject:w + Subject:p - - Koha-Auth-Number:w - Koha-Auth-Number:n + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-genre-form:w + Subject-genre-form:p + Subject-name-conference:w + 
Subject-name-conference:p + Subject-name-corporate:w + Subject-name-corporate:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-name-conference:w + Subject-name-conference:p + Subject-name-corporate:w + Subject-name-corporate:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-name-geographical:w + Subject-name-geographical:p + Subject-name-conference:w + Subject-name-conference:p + Subject-name-corporate:w + Subject-name-corporate:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-chronological:w + Subject-chronological:p + Subject-name-conference:w + Subject-name-conference:p + Subject-name-corporate:w + Subject-name-corporate:p + + + + + + + + + Personal-name:w + Personal-name:p + Subject-name-personal:w + Subject-name-personal:p + Subject:w + Subject:p - + + Subject:w Subject:p - - - + Subject-subdivision:w + Subject-subdivision:p + Subject-genre-form:w + Subject-genre-form:p + Subject-name-personal:w + Subject-name-personal:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-name-personal:w + Subject-name-personal:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-name-geographical:w + Subject-name-geographical:p + Subject-name-personal:w + Subject-name-personal:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-chronological:w + Subject-chronological:p + Subject-name-personal:w + Subject-name-personal:p + + + + + + + + Name-and-title:w + Name-and-title:p + Subject-title:w + Subject-title:p Personal-name:w Personal-name:p + Subject-name-personal:w + Subject-name-personal:p + Subject:w + Subject:p - - Koha-Auth-Number:w - Koha-Auth-Number:n + + + Name-and-title:w + Name-and-title:p + Subject-title:w + Subject-title:p + Subject-name-personal:w + Subject-name-personal:p + Subject:w + Subject:p - + + Subject:w Subject:p - - - - 
Koha-Auth-Number:w - Koha-Auth-Number:n + Subject-subdivision:w + Subject-subdivision:p + Subject-genre-form:w + Subject-genre-form:p + Subject-title:w + Subject-title:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-title:w + Subject-title:p - + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-name-geographical:w + Subject-name-geographical:p + Subject-title:w + Subject-title:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-chronological:w + Subject-chronological:p + Subject-title:w + Subject-title:p + + + + + + + Subject-title:w + Subject-title:p Subject:w Subject:p - - - - Koha-Auth-Number:w - Koha-Auth-Number:n - + + Subject:w Subject:p - - - - Koha-Auth-Number:w - Koha-Auth-Number:n + Subject-subdivision:w + Subject-subdivision:p + Subject-genre-form:w + Subject-genre-form:p + Subject-title:w + Subject-title:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-title:w + Subject-title:p - + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-name-geographical:w + Subject-name-geographical:p + Subject-title:w + Subject-title:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-chronological:w + Subject-chronological:p + Subject-title:w + Subject-title:p + + + + + + + Subject-topical:w + Subject-topical:p Subject:w Subject:p - - - - Koha-Auth-Number:w - Koha-Auth-Number:n - + + Subject:w Subject:p - - - - Koha-Auth-Number:w - Koha-Auth-Number:n + Subject-subdivision:w + Subject-subdivision:p + Subject-genre-form:w + Subject-genre-form:p + Subject-topical:w + Subject-topical:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-topical:w + Subject-topical:p - + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-name-geographical:w + Subject-name-geographical:p + 
Subject-topical:w + Subject-topical:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-chronological:w + Subject-chronological:p + Subject-topical:w + Subject-topical:p + + + + + + + Name-geographic:w + Name-geographic:p + Subject-name-geographical:w + Subject-name-geographical:p Subject:w Subject:p - - - - Koha-Auth-Number:w - Koha-Auth-Number:n - + + Subject:w Subject:p - - - - Koha-Auth-Number:w - Koha-Auth-Number:n + Subject-subdivision:w + Subject-subdivision:p + Subject-genre-form:w + Subject-genre-form:p + Subject-name-geographical:w + Subject-name-geographical:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-name-geographical:w + Subject-name-geographical:p - + + Subject:w Subject:p - - - - Koha-Auth-Number:w - Koha-Auth-Number:n + Subject-subdivision:w + Subject-subdivision:p + Subject-name-geographical:w + Subject-name-geographical:p - + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-chronological:w + Subject-chronological:p + Subject-name-geographical:w + Subject-name-geographical:p + + + + + + + Subject-genre-form:w + Subject-genre-form:p Subject:w Subject:p - - - - Koha-Auth-Number:w - Koha-Auth-Number:n - + + Subject:w Subject:p - - - - Koha-Auth-Number:w - Koha-Auth-Number:n + Subject-subdivision:w + Subject-subdivision:p + Subject-genre-form:w + Subject-genre-form:p - + + Subject:w Subject:p - - - - Koha-Auth-Number:w - Koha-Auth-Number:n + Subject-subdivision:w + Subject-subdivision:p + Subject-genre-form:w + Subject-genre-form:p - + + Subject:w Subject:p - + Subject-subdivision:w + Subject-subdivision:p + Subject-name-geographical:w + Subject-name-geographical:p + Subject-genre-form:w + Subject-genre-form:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-chronological:w + Subject-chronological:p + Subject-genre-form:w + Subject-genre-form:p + + + + + + Subject:w + Subject:p + + + + + + + + 
Subject:w + Subject:p + + + + + + + + + Subject:w + Subject:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-genre-form:w + Subject-genre-form:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-name-geographical:w + Subject-name-geographical:p + + + + Subject:w + Subject:p + Subject-subdivision:w + Subject-subdivision:p + Subject-chronological:w + Subject-chronological:w + + + + + + + + Subject:w + Subject:p + + + + + + + + + Subject:w + Subject:p + + + + + + + + Subject:w + Subject:p + + + + + + + + Subject:w + Subject:p + + + + + + Subject:w + Subject:p + diff --git a/etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl b/etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl index a8cde51614..ef2ff9e3d1 100644 --- a/etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl +++ b/etc/zebradb/marc_defs/unimarc/biblios/biblio-zebra-indexdefs.xsl @@ -1657,28 +1657,49 @@ definition file (probably something like {biblio,authority}-koha-indexdefs.xml) - - + + - - + + - - - - + + + + + + + + + + + + + + + + + + + + + + + + + @@ -1686,15 +1707,43 @@ definition file (probably something like {biblio,authority}-koha-indexdefs.xml) - - - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -1702,6 +1751,41 @@ definition file (probably something like {biblio,authority}-koha-indexdefs.xml) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -1711,6 +1795,48 @@ definition file (probably something like {biblio,authority}-koha-indexdefs.xml) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -1720,6 +1846,41 @@ definition file (probably something like {biblio,authority}-koha-indexdefs.xml) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -1729,6 +1890,41 @@ definition file (probably something like 
{biblio,authority}-koha-indexdefs.xml) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -1738,6 +1934,41 @@ definition file (probably something like {biblio,authority}-koha-indexdefs.xml) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -1747,6 +1978,43 @@ definition file (probably something like {biblio,authority}-koha-indexdefs.xml) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -1754,6 +2022,13 @@ definition file (probably something like {biblio,authority}-koha-indexdefs.xml) + + + + + + + @@ -1763,6 +2038,13 @@ definition file (probably something like {biblio,authority}-koha-indexdefs.xml) + + + + + + + @@ -1772,6 +2054,41 @@ definition file (probably something like {biblio,authority}-koha-indexdefs.xml) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -1781,6 +2098,13 @@ definition file (probably something like {biblio,authority}-koha-indexdefs.xml) + + + + + + + @@ -1790,6 +2114,13 @@ definition file (probably something like {biblio,authority}-koha-indexdefs.xml) + + + + + + + @@ -1799,6 +2130,103 @@ definition file (probably something like {biblio,authority}-koha-indexdefs.xml) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -2252,188 +2680,6 @@ definition file (probably something like {biblio,authority}-koha-indexdefs.xml) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -- 2.39.5