From cf8c3a84cad8b0d5f8c3702a9c1e0ab233e5fa54 Mon Sep 17 00:00:00 2001
From: Galen Charlton
Date: Thu, 31 Jan 2008 13:43:15 -0600
Subject: [PATCH] authorities: start of work on reindexing

Currently, MARC authorities are indexed (assuming Zebra is used) with
Zebra's GRS-1 module. However, it does not appear to be possible to
index phrases that cross subfield boundaries using the melm, elm, and
xelm directives of the GRS-1 module's record.abs config file.

Since it is necessary to be able to efficiently search an entire
authority heading (e.g., to see if a given bib heading is authorized),
I'm proposing a switch to Zebra's DOM XML filter module, which uses
XSLT to generate the words and phrases to be indexed from the original
MARC XML (or ISO2709) record.

The file authority-zebra-indexdefs.xml is an XSLT stylesheet that
implements the new indexing regime. It is based on the MARC21
authority record.abs with the following changes:

* addition of 148/448/548
* changed name of "see" indexes to "see-from"
* changed name of "see-also" indexes to "see-also-from"
* added index on the subject thesaurus based on the 008/11 and 040$f
* added indexes on the full heading

authority-zebra-indexdefs.xml was generated from
authority-koha-indexdefs.xml via the XSL transform
koha-indexdefs-to-zebra.xsl. authority-koha-indexdefs.xml is the
actual master version of the indexing definitions, and was created to
provide a much more compact syntax than the raw XSLT that is to be
passed to Zebra.

An experimental schema for Koha indexing definitions is under way; my
aim is to propose a simple format that can be readily worked with, and
perhaps even generated as a serialization of indexing definitions that
are set up via administration settings in the Koha database itself.

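To illustrate what "indexes on the full heading" means in practice, here is a minimal Python sketch of the idea. It is not the XSLT from this patch. It joins the subfields of one MARCXML datafield into a single phrase, so that a search can match a heading across subfield boundaries. The function name, the sample record, and the choice of `vxyz` as subdivision subfields joined with `--` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Assumption for this sketch: these subfield codes are subdivisions and are
# joined to the preceding text with "--"; all other subfields use a space.
SUBDIVISION_CODES = "vxyz"

# Hypothetical sample authority record (topical term heading in field 150).
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="150" ind1=" " ind2=" ">
    <subfield code="a">Cooking</subfield>
    <subfield code="x">History</subfield>
    <subfield code="y">19th century</subfield>
  </datafield>
</record>"""


def heading_phrases(record_xml, tag, codes):
    """Return one full-heading phrase per datafield with the given tag.

    Only subfields whose code appears in `codes` are used, in record order.
    """
    root = ET.fromstring(record_xml)
    phrases = []
    for field in root.iterfind(f".//marc:datafield[@tag='{tag}']", MARC_NS):
        parts = []
        for sub in field.iterfind("marc:subfield", MARC_NS):
            code = sub.get("code", "")
            text = (sub.text or "").strip()
            if not code or code not in codes or not text:
                continue
            if parts:
                # Separator depends on the subfield being appended.
                parts.append("--" if code in SUBDIVISION_CODES else " ")
            parts.append(text)
        if parts:
            phrases.append("".join(parts))
    return phrases


if __name__ == "__main__":
    print(heading_phrases(SAMPLE, "150", "abvxyz"))
```

A phrase built this way is the kind of value that could be fed to a single phrase index such as Heading:p, which per-subfield indexing cannot produce.
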
Signed-off-by: Chris Cormack
Signed-off-by: Joshua Ferraro
---
 .../authorities/authority-koha-indexdefs.xml  | 409 ++++++++
 .../authorities/authority-zebra-indexdefs.xml | 983 ++++++++++++++++++
 .../authorities/koha-indexdefs-to-zebra.xsl   | 281 +++++
 3 files changed, 1673 insertions(+)
 create mode 100644 etc/zebradb/marc_defs/marc21/authorities/authority-koha-indexdefs.xml
 create mode 100644 etc/zebradb/marc_defs/marc21/authorities/authority-zebra-indexdefs.xml
 create mode 100644 etc/zebradb/marc_defs/marc21/authorities/koha-indexdefs-to-zebra.xsl

diff --git a/etc/zebradb/marc_defs/marc21/authorities/authority-koha-indexdefs.xml b/etc/zebradb/marc_defs/marc21/authorities/authority-koha-indexdefs.xml
new file mode 100644
index 0000000000..6ba802840d
--- /dev/null
+++ b/etc/zebradb/marc_defs/marc21/authorities/authority-koha-indexdefs.xml
@@ -0,0 +1,409 @@
+ + + + + Record-status + + + Encoding-level + + + + + Local-Number + + + + + Kind-of-record + + + Descriptive-cataloging-rules + + + Subject-heading-thesaurus + + + Heading-use-main-or-added-entry + + + Heading-use-subject-added-entry + + + Heading-use-series-added-entry + + + + + Personal-name:w + Personal-name:p + Personal-name:s + + + Personal-name-heading:w + Personal-name-heading:p + Personal-name-heading:s + Heading:w + Heading:p + Heading:s + + + + Personal-name-see-from:w + Personal-name-see-from:p + Personal-name-see-from:s + See-from:w + See-from:p + See-from:s + + + + Personal-name-see-also-from:w + Personal-name-see-also-from:p + Personal-name-see-also-from:s + See-also-from:w + See-also-from:p + See-also-from:s + + + + + Corporate-name:w + Corporate-name:p + + + Corporate-name-heading:w + Corporate-name-heading:p + Corporate-name-heading:s + Heading:w + Heading:p + Heading:s + + + + Corporate-name-see-from:w + Corporate-name-see-from:p + Corporate-name-see-from:s + See-from:w + See-from:p + See-from:s + + + + Corporate-name-see-also-from:w + Corporate-name-see-also-from:p +
Corporate-name-see-also-from:s + See-also-from:w + See-also-from:p + See-also-from:s + + + + + Meeting-name:w + Meeting-name:p + + + Meeting-name-heading:w + Meeting-name-heading:p + Meeting-name-heading:s + Heading:w + Heading:p + Heading:s + + + + Meeting-name-see-from:w + Meeting-name-see-from:p + Meeting-name-see-from:s + See-from:w + See-from:p + See-from:s + + + + Meeting-name-see-also-from:w + Meeting-name-see-also-from:p + Meeting-name-see-also-from:s + See-also-from:w + See-also-from:p + See-also-from:s + + + + + Title-uniform:w + Title-uniform:p + + + Title-uniform-heading:w + Title-uniform-heading:p + Title-uniform-heading:s + Heading:w + Heading:p + Heading:s + + + + Title-uniform-see-from:w + Title-uniform-see-from:p + Title-uniform-see-from:s + See-from:w + See-from:p + See-from:s + + + + Title-uniform-see-also-from:w + Title-uniform-see-also-from:p + Title-uniform-see-also-from:s + See-also-from:w + See-also-from:p + See-also-from:s + + + + + Chronological-term:w + Chronological-term:p + + + Chronological-term-heading:w + Chronological-term-heading:p + Chronological-term-heading:s + Heading:w + Heading:p + Heading:s + + + + Chronological-term-see-from:w + Chronological-term-see-from:p + Chronological-term-see-from:s + See-from:w + See-from:p + See-from:s + + + + Chronological-term-see-also-from:w + Chronological-term-see-also-from:p + Chronological-term-see-also-from:s + See-also-from:w + See-also-from:p + See-also-from:s + + + + + + Subject-topical:w + Subject-topical:p + + + Subject-topical-heading:w + Subject-topical-heading:p + Subject-topical-heading:s + Heading:w + Heading:p + Heading:s + + + + Subject-topical-see-from:w + Subject-topical-see-from:p + Subject-topical-see-from:s + See-from:w + See-from:p + See-from:s + + + + Subject-topical-see-also-from:w + Subject-topical-see-also-from:p + Subject-topical-see-also-from:s + See-also-from:w + See-also-from:p + See-also-from:s + + + + + Name-geographic:w + Name-geographic:p + + + 
Name-geographic-heading:w + Name-geographic-heading:p + Name-geographic-heading:s + Heading:w + Heading:p + Heading:s + + + + Name-geographic-see-from:w + Name-geographic-see-from:p + Name-geographic-see-from:s + See-from:w + See-from:p + See-from:s + + + + Name-geographic-see-also-from:w + Name-geographic-see-also-from:p + Name-geographic-see-also-from:s + See-also-from:w + See-also-from:p + See-also-from:s + + + + + Term-genre-form:w + Term-genre-form:p + + + Term-genre-form-heading:w + Term-genre-form-heading:p + Term-genre-form-heading:s + Heading:w + Heading:p + Heading:s + + + + Term-genre-form-see-from:w + Term-genre-form-see-from:p + Term-genre-form-see-from:s + See-from:w + See-from:p + See-from:s + + + + Term-genre-form-see-also-from:w + Term-genre-form-see-also-from:p + Term-genre-form-see-also-from:s + See-also-from:w + See-also-from:p + See-also-from:s + + + + + General-subdivision:w + General-subdivision:p + General-subdivision:s + Subdivision:w + Subdivision:p + Subdivision:s + + + + General-subdivision-see-from:w + General-subdivision-see-from:p + General-subdivision-see-from:s + Subdivision-see-from:w + Subdivision-see-from:p + Subdivision-see-from:s + + + + General-subdivision-see-also-from:w + General-subdivision-see-also-from:p + General-subdivision-see-also-from:s + Subdivision-see-also-from:w + Subdivision-see-also-from:p + Subdivision-see-also-from:s + + + + + Geographic-subdivision:w + Geographic-subdivision:p + Geographic-subdivision:s + Subdivision:w + Subdivision:p + Subdivision:s + + + + Geographic-subdivision-see-from:w + Geographic-subdivision-see-from:p + Geographic-subdivision-see-from:s + Subdivision-see-from:w + Subdivision-see-from:p + Subdivision-see-from:s + + + + Geographic-subdivision-see-also-from:w + Geographic-subdivision-see-also-from:p + Geographic-subdivision-see-also-from:s + Subdivision-see-also-from:w + Subdivision-see-also-from:p + Subdivision-see-also-from:s + + + + + Chronological-subdivision:w + 
Chronological-subdivision:p + Chronological-subdivision:s + Subdivision:w + Subdivision:p + Subdivision:s + + + + Chronological-subdivision-see-from:w + Chronological-subdivision-see-from:p + Chronological-subdivision-see-from:s + Subdivision-see-from:w + Subdivision-see-from:p + Subdivision-see-from:s + + + + Chronological-subdivision-see-also-from:w + Chronological-subdivision-see-also-from:p + Chronological-subdivision-see-also-from:s + Subdivision-see-also-from:w + Subdivision-see-also-from:p + Subdivision-see-also-from:s + + + + + Form-subdivision:w + Form-subdivision:p + Form-subdivision:s + Subdivision:w + Subdivision:p + Subdivision:s + + + + Form-subdivision-see-from:w + Form-subdivision-see-from:p + Form-subdivision-see-from:s + Subdivision-see-from:w + Subdivision-see-from:p + Subdivision-see-from:s + + + + Form-subdivision-see-also-from:w + Form-subdivision-see-also-from:p + Form-subdivision-see-also-from:s + Subdivision-see-also-from:w + Subdivision-see-also-from:p + Subdivision-see-also-from:s + + + + authtype + + diff --git a/etc/zebradb/marc_defs/marc21/authorities/authority-zebra-indexdefs.xml b/etc/zebradb/marc_defs/marc21/authorities/authority-zebra-indexdefs.xml new file mode 100644 index 0000000000..f1c4dc75eb --- /dev/null +++ b/etc/zebradb/marc_defs/marc21/authorities/authority-zebra-indexdefs.xml @@ -0,0 +1,983 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + 
+ + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + + + + -- + + + + + + + + + + + + + + + + + + + lcsh + + + lcac + + + mesh + + + nal + + + cash + + + notapplicable + + + aat + + + sears + + + rvm + + + + + + + + notdefined + + + + + notdefined + + + + + + + + diff --git a/etc/zebradb/marc_defs/marc21/authorities/koha-indexdefs-to-zebra.xsl b/etc/zebradb/marc_defs/marc21/authorities/koha-indexdefs-to-zebra.xsl new file mode 100644 index 0000000000..f1b16d28bc --- /dev/null +++ b/etc/zebradb/marc_defs/marc21/authorities/koha-indexdefs-to-zebra.xsl @@ -0,0 +1,281 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + marc:controlfield[@tag=' + + '] + + + + substring(., + + 
, + + ) + + + + //marc:datafield[@tag=' + + ']/marc:subfield[@code=' + + '] + + + + lcsh + lcac + mesh + nal + cash + notapplicable + aat + sears + rvm + + + + + + + + + notdefined + + + notdefined + + + + + + + + + + + + + + + + + + + + + + + + + + + + substring(., + + , + + ) + + + + + + + + + + marc:controlfield[@tag=' + + '] + + + + + + + + + + + + + + + + + + + + + substring(., + + , + + ) + + + . + + + + + + + + + + + + marc:datafield[@tag=' + + '] + + + + + + + + + + + + + + + + contains(' + + ', @code) + + + + + + + + + + + + + + marc:datafield[@tag=' + + '] + + + + + + + + + + + + + + + + + + + contains(' + + ', @code) + + + + + + + contains(' + + ', @code) + + -- + + + + + + + + + + + + + + + + + + + + -- 2.39.5