Stop words in the default zebra config were being defined as
initial strings, not as words, causing them to truncate legitimate
headings.
This patch corrects that behaviour. It does not address the
question of what should be in the default file.
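As an illustration only (this is not the Zebra configuration syntax), the difference between the two behaviours can be sketched in Python. The function names and the stop word list below are hypothetical.

```python
# Hypothetical stop word list, for illustration only.
STOP_WORDS = ("the", "a", "an")


def strip_initial_string(heading, stop_words=STOP_WORDS):
    """Old behaviour: a stop word matches any heading that merely starts with it."""
    lowered = heading.lower()
    for stop in stop_words:
        if lowered.startswith(stop):
            return heading[len(stop):].lstrip()
    return heading


def strip_initial_word(heading, stop_words=STOP_WORDS):
    """Corrected behaviour: a stop word matches only a complete leading word."""
    first, sep, rest = heading.partition(" ")
    if sep and first.lower() in stop_words:
        return rest.lstrip()
    return heading


print(strip_initial_string("Theology of hope"))  # legitimate heading is truncated
print(strip_initial_word("Theology of hope"))    # heading is left intact
print(strip_initial_word("The hobbit"))          # real leading stop word is removed
```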
Signed-off-by: Galen Charlton <gmcharlt@gmail.com>
Note: to completely apply this change, ensure that the working
copy of record.abs is updated and rebuild the bib indexes
using rebuild_zebra.pl -b -x -r
Signed-off-by: Galen Charlton <galen.charlton@liblime.com>
This patch just adds an option to zebra-biblios.cfg that allows right-truncated requests to be made on a huge request.
Signed-off-by: Galen Charlton <galen.charlton@liblime.com>
Adding some fields to index.
Also adding some indexes in order to be able to query specific fields.
Signed-off-by: Galen Charlton <galen.charlton@liblime.com>
Adding Heading-Main as a new index code in order to search only on Heading-Main when $a is selected.
Signed-off-by: Galen Charlton <galen.charlton@liblime.com>
These changes tidy up ISBN and ISSN indexing, per Michele Maenpaa. It's being
set up manually on many new installations, and probably ought to become part of the default
Koha installation.
Signed-off-by: Galen Charlton <galen.charlton@liblime.com>
Title-cover was not defined in record.abs,
so the relevance ranking was broken.
This patch corrects that.
UNIMARC users, please reindex.
Signed-off-by: Galen Charlton <galen.charlton@liblime.com>
Advanced search limit by shelving location doesn't work due to
missing ccl definition in default installation. Once updated,
the zebradb will need to be reindexed.
Signed-off-by: Galen Charlton <galen.charlton@liblime.com>
Following a similar patch for UNIMARC, tweak the
authtype index for MARC21 authorities if the GRS-1
Zebra filter is in use.
Note that it is recommended that *DOM* mode indexing
be used for MARC21 authorities; if you're using DOM mode,
it is not necessary to rebuild the index. However, if
you're using the GRS-1 definitions (record.abs), it will
be necessary to reindex the authority records using
misc/migration_tools/rebuild_zebra.pl -a -r
Signed-off-by: Galen Charlton <galen.charlton@liblime.com>
On the authorities-home.pl page, a search returns no
results. The log file shows a Zebra error:
Unsupported Use attribute (114) authtype Bib-1
This patch modifies the UNIMARC record.abs definition.
The same may have to be done for the MARC21 record.abs.
Signed-off-by: Galen Charlton <galen.charlton@liblime.com>
Changing the record.abs file to add the management of acquisition date,
modification date and lost status.
Signed-off-by: Galen Charlton <galen.charlton@liblime.com>
You've been warned :-). This patch contains a more
complete mapping of UTF-8 to ASCII. The mappings are
based on those compiled by Richard Mahoney on the
Zebra list: http://lists.indexdata.dk/pipermail/zebralist/2007-August/001707.html
Note to documentation team: we need an area in the
documentation that discusses how Koha handles searches
and indexing for words that contain diacritics, such
as E-ACUTE (vs E without an acute). If you can paste
this list of mappings from this patch directly into
the docs and it preserves the encoding that would be
great.
NOTE: I don't think this patch addresses issues of
combining vs non-combining forms, and may require
a refactor to address that.
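To illustrate the combining vs non-combining issue, here is a minimal Python sketch. It uses Unicode normalization from the standard library rather than Zebra's character map files, so it shows the problem, not how this patch or Zebra handles it. The function name is hypothetical.

```python
import unicodedata


def strip_diacritics(text):
    """Decompose to NFD, then drop combining marks (illustration only)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


precomposed = "caf\u00e9"   # E-ACUTE as a single code point
combining = "cafe\u0301"    # plain e followed by COMBINING ACUTE ACCENT

# The two forms look identical but are different strings, so a mapping
# that lists only the precomposed character would miss the second form.
print(precomposed == combining)
print(strip_diacritics(precomposed), strip_diacritics(combining))
```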
Josh
The problem was that the 'mc-' was removed from the checkboxes a while back and
that's what triggers the automatic application of OR boolean searching. I've
added it back to the templates and modified the ccl.properties file to include
mappings for itype, itemtype and ccode.
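A rough sketch of the idea, in Python rather than Koha's Perl: limits whose name starts with 'mc-' are grouped with OR, and the group is then ANDed with any other limits. The function name, the limit strings and the output query format are assumptions for illustration and are not taken from the Koha search code.

```python
def combine_limits(limits):
    """Group 'mc-' limits with OR, then AND the group with the remaining limits."""
    or_group = [limit for limit in limits if limit.startswith("mc-")]
    and_group = [limit for limit in limits if not limit.startswith("mc-")]

    clauses = []
    if or_group:
        clauses.append("(" + " or ".join(or_group) + ")")
    clauses.extend(and_group)
    return " and ".join(clauses)


# Hypothetical limit values; only the 'mc-' prefix comes from the commit message.
print(combine_limits(["mc-itype:BK", "mc-itype:DVD", "other-limit:X"]))
```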
Note: currently only zebraqueue_daemon.pl is known
to use the extended services that require the
Zebra r/w password.
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
The following error messages in the Zebra
log should no longer appear:
06:10:25-04/03 zebrasrv(1) [warn] Failed to read character table urx.chr
06:10:25-04/03 zebrasrv(1) [warn] urx.chr [No such file or directory]
To fully install this patch, do a
'make update_zebra_conf'.
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
To fully install this patch, the following steps
are required:
1. perl Makefile.PL
2. make
3. make update_zebra_conf
4. restart zebrasrv
5. reindex authorities using rebuild_zebra.pl -a -r
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
To fully install this patch, the following steps are
necessary:
1. perl Makefile.PL
2. make
3. make update_zebra_conf (or make upgrade)
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
For DOM indexing, added index_matching_heading option
to create indexes for matching an entire authority
heading -- the index works by indexing a heading
such as
150 $aCars$xElectric$zEngland$vScience fiction
as something like
"cars generalsubdiv electric geographicsubdiv england
formsubdiv science fiction"
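The transformation in the example above can be sketched in Python as follows. The real index is built by the DOM XSLT, not by this code; the function name is hypothetical, and the label table covers only the three subdivision subfields shown in the example.

```python
# Labels taken from the example above; other subfields are not covered here.
SUBDIVISION_LABELS = {
    "x": "generalsubdiv",
    "z": "geographicsubdiv",
    "v": "formsubdiv",
}


def matching_heading(subfields):
    """Build one lowercase phrase from (code, value) pairs of a heading field."""
    parts = []
    for code, value in subfields:
        label = SUBDIVISION_LABELS.get(code)
        if label:
            parts.append(label)
        parts.append(value.lower())
    return " ".join(parts)


heading_150 = [("a", "Cars"), ("x", "Electric"), ("z", "England"), ("v", "Science fiction")]
print(matching_heading(heading_150))
```

Because the whole heading becomes a single phrase, a bib heading can be checked against it with one exact-phrase search.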
Also started adjusting the names of some indexes to conform
to the language used in the MARC21 and UNIMARC standards, e.g.,
"See" => "See-from"
"See-also" => "See-also-from"
"Conference-name-heading" => "Meeting-name-heading"
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
All new authority features will be based on the DOM indexing.
To update an existing installation, do the following:
[1] run perl Makefile.PL
[2] make
[3] make update_zebra_conf
[4] copy the new koha-conf.xml to $KOHA_CONF
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
Currently, MARC authorities are indexed (assuming Zebra
is used) with Zebra's GRS-1 module. However, it does
not appear to be possible to index phrases that cross
subfield boundaries using the melm, elm, and xelm directives
of the GRS-1 module's record.abs config file.
Since it is necessary to be able to efficiently search
an entire authority heading (e.g., to see if a given
bib heading is authorized), I'm proposing a switch
to Zebra's DOM XML filter module, which uses XSLT
to generate the words and phrases to be indexed from the
original MARC XML (or ISO2709) record.
The file authority-zebra-indexdefs.xml is an XSLT stylesheet
to implement the new indexing regime. It is based on the
MARC21 authority record.abs with the following changes:
* addition of 148/448/548
* changed name of "see" indexes to "see-from"
* changed name of "see-also" indexes to "see-also-from"
* added index on the subject thesaurus based on
the 008/11 and 040$f
* added indexes on the full heading
authority-zebra-indexdefs.xml was generated from
authority-koha-indexdefs.xml via the XSL transform
koha-indexdefs-to-zebra.xsl. authority-koha-indexdefs.xml
is the actual master version of the indexing definitions,
and was created to provide a much more compact syntax
over the raw XSLT that is to be passed to Zebra.
An experimental schema for Koha indexing definitions is
under way; my aim is to propose a simple format that can
be readily worked with, and perhaps even generated as
a serialization of indexing definitions that are set up
via administration settings in the Koha database itself.
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
* Defined local field 942$a to store the authority type
for MARC21 instead of 152$b
* Added 942$b to MARC21 authority framework.
* Added auth_header.authid and auth_header.authtypecode
to appropriate subfields in MARC21 authority framework.
* Started work on two new modules:
C4::AuthoritiesMarc::MARC21
C4::AuthoritiesMarc::UNIMARC
These modules will be used to extract MARC-format-specific
behavior out of C4::AuthoritiesMarc
* Updated Zebra config for MARC21 to use only the 942$a
for the authority type.
* For MARC21, added logic to move 152$b to 942$a for
existing authority records. Specifically, AddAuthority
now does this move when a record is saved, while
GetAuthority and GetAuthorityXML do this when
extracting a record for other use. This logic
is temporary, and can hopefully be removed later, once
use of 152$b in MARC21 authorities is confirmed to be
absent for Koha users. I will also create a batch
job to do this update in one fell swoop.
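A minimal sketch of the 152$b to 942$a move, written in Python with a plain dictionary standing in for a MARC record. This is not the Perl code in AddAuthority or GetAuthority. The function name, the record representation, the choice to keep an existing 942$a, and the removal of an emptied 152 field are all assumptions.

```python
def move_authtype(record):
    """Return a copy of the record with 152$b moved to 942$a.

    The record is modelled as {tag: {subfield_code: value}} (an assumption).
    """
    updated = {tag: dict(subfields) for tag, subfields in record.items()}
    old_field = updated.get("152", {})
    if "b" in old_field:
        authtype = old_field.pop("b")
        # Assumption: an existing 942$a wins over the legacy 152$b value.
        updated.setdefault("942", {}).setdefault("a", authtype)
        # Assumption: drop the 152 field once it has no subfields left.
        if not old_field:
            del updated["152"]
    return updated


legacy = {"150": {"a": "Cars"}, "152": {"b": "TOPIC_TERM"}}
print(move_authtype(legacy))
```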
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
This commit is a partial fix to authority control for MARC21,
and better authority handling in general (for UNIMARC too).
Before this patch, authority searching, editing, and saving were
either not functioning or extremely buggy.
WARNING: You will need to re-index your authority database after
applying this commit.
The following changes have been made:
* Normalizing record.abs index names (in both MARC21 and UNIMARC)
* Synching authorities/bib1.att, ccl.properties, AuthoritiesMarc.pm
with new indexes (UNIMARC too)
* Clean up biblios/bib1.att (remove duplicate att defs)
* Clean up authorities-* templates to conform to new styles
* Fixed search failure when using Default framework (now searches
All)
Also included are several fixes to the built-in SRU server for
Authority and Biblio. It's recommended that you update your
koha-conf.xml file:
* adding explain-authorities.xml and explain-biblios.xml
* adding necessary info to koha-conf.xml to enable SRU/W
* adding several example XSLT stylesheets, that can be used
for SRU on-the-fly transformations (to MODS, DC, RDF, etc.)
Still remaining for 3.0 are the following tasks:
* update MARC21 frameworks (authority and cross-reference bib)
* update display code/templates in authority results list
* update search code/templates to utilize index points
* implement 'grouping' of authtypes for searching (Name, Title, Subject)
* repair utility to import auths and perform matching
* repair bibliographic import to match auths and warn if no match
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
* prior to this commit, virtual shelves did not function in
the OPAC! Now they do, except for deletion from virtual shelves
in list form
* I've renamed 'Virtual Shelves' to 'Lists' as per our agreed-upon
convention
* while vshelves aren't perfect yet, they're in enough of a working
state for the RC1 now
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
Summary of Koha 3.0 date indexing for MARC21:
Index                 Expected format  Notes
--------------------------------------------------------------------------------
date-entered-on-file  [yymmdd]         008/0-5, indexed in word and sort indexes
copydate              [yyyy]           260$c, indexed in word and sort indexes
acqdate               [yyyy-mm-dd]     952$d, indexed in date, word, sort indexes
pubdate               [yyyy]           008/7-10, indexed in year, word, sort indexes
Template Search Parameters Tested:
limit-yr (either yyyy or yyyy-yyyy) (added processing for ge le, structure attribute st-numeric, etc.)
yr pubdate (yyyy)
acqdate,st-date-normalized (yyyy-mm-dd)
Template Sort Parameters Tested:
pubdate_dsc
pubdate_asc
acqdate_dsc
acqdate_asc
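The two limit-yr forms listed above (yyyy or yyyy-yyyy) can be turned into a lower and upper bound, which is what the ge/le processing needs. This is a Python sketch under that assumption, not the Koha implementation; the function name is hypothetical and no actual Zebra query is produced.

```python
import re


def parse_limit_yr(value):
    """Parse 'yyyy' or 'yyyy-yyyy' into an inclusive (first, last) year pair."""
    match = re.fullmatch(r"(\d{4})(?:-(\d{4}))?", value.strip())
    if match is None:
        raise ValueError(f"expected yyyy or yyyy-yyyy, got {value!r}")
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else first
    if last < first:
        raise ValueError(f"year range is reversed: {value!r}")
    return first, last


print(parse_limit_yr("1999"))
print(parse_limit_yr("1990-2000"))
```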
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
All Zebra config files are now installed by default. The
ones specific to a MARC format or language are selected
by appropriate values in profilePath in zebra-biblios.cfg
and zebra-authorities.cfg. Changing the MARC format
or indexing language can now be done by editing
profilePath.
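As a sketch of how such a profilePath value might be assembled: Zebra's profilePath is a colon-separated list of directories, so picking a MARC format or language amounts to picking which directories go on the list. The directory layout below (marc_defs/<format>/<server> and lang_defs/<language> under a base directory) is an assumption based on the directory names mentioned in this log. The function name and the example base path are hypothetical.

```python
def build_profile_path(base_dir, server, marc_format, language):
    """Join the assumed config directories into a colon-separated profilePath."""
    directories = [
        f"{base_dir}/{server}",                          # assumed per-server directory
        f"{base_dir}/marc_defs/{marc_format}/{server}",  # assumed MARC-specific directory
        f"{base_dir}/lang_defs/{language}",              # assumed language-specific directory
    ]
    return ":".join(directories)


# Hypothetical base directory and values, for illustration only.
print("profilePath:" + build_profile_path("/example/zebradb", "biblios", "marc21", "en"))
```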
skel directory is for the installer only; contains
a directory structure and dummy READMEs used for
setting up the Zebra runtime and data directories.
Moved non-config files from etc/zebradb/* to
appropriate places under skel.
* plain 'make' now stages everything to blib, leaving
actual installation to 'make install'
* adjusted rewrite-config.PL and config files
for new substitution variables
* added default SetEnv Perl5Lib to
koha-httpd.conf
Inspired by work by Chris Cormack, move several base Zebra
configuration files to two new directories under etc/zebradb:
lang_defs - language-specific settings (e.g., French and English)
marc_defs - MARC format-specific settings (e.g., MARC21 & UNIMARC)
Installer will query user for language and MARC format and
copy the initial Zebra configs accordingly.
instead of 200$f
+ some tab added to have something easier to read
Some libraries don't use authorities (700$a),
but they are usually small libraries, so they won't be using Zebra!
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
+ Physical-detail changed to Extent
+ Thesis-note removed, as it's not standard UNIMARC (it's specific to one of our libraries, in2p3)
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
+ removing link, which is not in bib1.att (it results in a lot of warnings in the zebra log)
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
* bringing back facets
* bringing back stemming (syspref controlled)
* bringing back field weighting (syspref controlled)
* bringing back language limits
* bringing back year limits
* fixing 'expanded view'
* improvements to template
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>