Commit graph

74 commits

Author SHA1 Message Date
Joshua Ferraro
6a5b9194d5 fixes to indexing process for deleted records
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-02-12 17:38:07 -06:00
Galen Charlton
70dccacee5 FRBR: configure PazPar2 during installation
Also added koha-pazpar2-ctl.sh to start and stop
PazPar2.

Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-02-11 16:35:18 -06:00
Galen Charlton
2a6507a27c FRBR: added work-author to PazPar2 search defs
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-02-11 16:35:16 -06:00
Galen Charlton
9384c49d33 PazPar2 FRBRize - adjusted UT and author keys
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-02-11 16:35:15 -06:00
Galen Charlton
35249f48e4 more experimental work on grouping with pazpar2
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-02-11 16:35:14 -06:00
Galen Charlton
d92eb0373e experiment: use PazPar2 to group related works
The approach is to use PazPar2 to search just one
target, the biblio Zebra database.  The results
of each set are merged by PazPar2 to generate a
hitlist that combines related bibs together; as an
example, if a library has the first Harry Potter
book in three languages and an audiobook format,
the hitlist should ideally return one result
for the work that includes links to the individual
bibs.

The new module C4::Search::PazPar2 implements a
simple client for PazPar2's XML-over-HTTP API.  It is
designed to be generic, and thus may end up getting
moved out of Koha to become a stand-alone CPAN module.
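
A minimal Python sketch of what a client for PazPar2's XML-over-HTTP API does, for illustration only; it is not the C4::Search::PazPar2 code. The base URL, port, and helper names are assumptions, and the init response is a hand-written sample rather than captured output.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint; the real host, port, and path depend on the install.
BASE_URL = "http://localhost:8004/search.pz2"


def pazpar2_url(base_url, command, **params):
    """Build the URL for one PazPar2 command (init, search, show, ...)."""
    query = {"command": command}
    query.update(params)
    return base_url + "?" + urlencode(query)


def parse_session(init_xml):
    """Extract the session id from the XML returned by command=init."""
    root = ET.fromstring(init_xml)
    if root.findtext("status") != "OK":
        raise ValueError("PazPar2 init did not return status OK")
    return root.findtext("session")


# Hand-written sample of an init response (assumed shape).
sample_init = "<init><status>OK</status><session>12345</session></init>"
session = parse_session(sample_init)
search_url = pazpar2_url(BASE_URL, "search", session=session, query="harry potter")
print(search_url)
```

A real client would fetch each URL over HTTP and then poll command=show to read the merged hitlist.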

Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-02-08 06:01:39 -06:00
Galen Charlton
741c10d911 authorities -- added CCL indexes for heading matching
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-02-08 05:48:56 -06:00
Galen Charlton
6a26bcf517 authorities indexing: qualify indexes with ":w"
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-02-08 05:48:54 -06:00
Galen Charlton
32cf2af700 authorities indexing - MAJOR changes
For DOM indexing, added index_matching_heading option
to create indexes for matching an entire authority
heading -- the index works by indexing a heading
such

150 $aCars$xElectric$zEngland$vScience fiction

as something like

"cars generalsubdiv electric geographicsubdiv england
formsubdiv science fiction"
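
The mapping above can be sketched in Python as follows. This only illustrates the idea; the actual indexing is done by the XSLT passed to Zebra. The function name and the input representation (a list of subfield code/value pairs) are assumptions, and only the three subdivision labels shown in the example are covered.

```python
# Labels taken from the example above; other subfield codes are not handled.
SUBDIV_LABELS = {
    "x": "generalsubdiv",
    "z": "geographicsubdiv",
    "v": "formsubdiv",
}


def heading_match_string(subfields):
    """Flatten a heading's subfields into one lowercase matching string."""
    parts = []
    for code, value in subfields:
        label = SUBDIV_LABELS.get(code)
        if label is not None:
            parts.append(label)  # mark the subfield boundary
        parts.append(value.lower())
    return " ".join(parts)


heading_150 = [("a", "Cars"), ("x", "Electric"), ("z", "England"), ("v", "Science fiction")]
print(heading_match_string(heading_150))
# prints: cars generalsubdiv electric geographicsubdiv england formsubdiv science fiction
```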

Also started adjusting the names of some indexes to conform
to the language used in the MARC21 and UNIMARC standards, e.g.,

"See" => "See-from"
"See-also" => "See-also-from"
"Conference-name-heading" => "Meeting-name-heading"

Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-02-08 05:48:52 -06:00
Galen Charlton
1c0401e867 authorities - enabled DOM indexing
All new authority features will be based on the DOM indexing.

To update an existing installation, do the following:

[1] run perl Makefile.PL
[2] make
[3] make update_zebra_conf
[4] copy the new koha-conf.xml to $KOHA_CONF

Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-02-08 05:48:51 -06:00
Galen Charlton
f9f246cb1e authorities: changed extension of authority-zebra-indexdefs.xml
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-02-03 07:22:12 -06:00
Galen Charlton
cf8c3a84ca authorities: start of work on reindexing
Currently, MARC authorities are indexed (assuming Zebra
is used) with Zebra's GRS-1 module.  However, it does
not appear to be possible to index phrases that cross
subfield boundaries using the GRS-1 module's records.abs
config file's melm, elm, and xelm directives.

Since it is necessary to be able to efficiently search
an entire authority heading (e.g., to see if a given
bib heading is authorized), I'm proposing a switch
to Zebra's DOM XML filter module, which uses XSLT
to generate the words and phrases to be indexed from the
original MARC XML (or ISO2709) record.

The file authority-zebra-indexdefs.xml is an XSLT stylesheet
to implement the new indexing regime.  It is based on the
MARC21 authority record.abs with the following changes:

  * addition of 148/448/548
  * changed name of "see" indexes to "see-from"
  * changed name of "see-also" indexes to "see-also-from"
  * added index on the subject thesaurus based on
    the 008/11 and 040$f
  * added indexes on the full heading

authority-zebra-indexdefs.xml was generated from
authority-koha-indexdefs.xml via the XSL transform
koha-indexdefs-to-zebra.xsl.  authority-koha-indexdefs.xml
is the actual master version of the indexing definitions,
and was created to provide a much more compact syntax
over the raw XSLT that is to be passed to Zebra.

An experimental schema for Koha indexing definitions is
under way; my aim is to propose a simple format that can
be readily worked with, and perhaps even generated as
a serialization of indexing definitions that are set up
via administration settings in the Koha database itself.

Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-02-03 07:22:06 -06:00
Galen Charlton
8340c478fa start of big MARC21 authorities work
* Defined local field 942$a to store the authority type
  for MARC21 instead of 152$b
* Added 942$b to MARC21 authority framework.
* Added auth_header.authid and auth_header.authtypecode
  to appropriate subfields in MARC21 authority framework.
* Started work on two new modules:
    C4::AuthoritiesMarc::MARC21
    C4::AuthoritiesMarc::UNIMARC
  These modules will be used to extract MARC-format-specific
  behavior out of C4::AuthoritiesMarc
* Updated Zebra config for MARC21 to use only the 942$a
  for the authority type.
* For MARC21, added logic to move 152$b to 942$a for
  existing authority records.  Specifically, AddAuthority
  now does this move when a record is saved, while
  GetAuthority and GetAuthorityXML do this when
  extracting a record for other use.  This logic
  is temporary, and can hopefully be removed later, once
  use of 152$b in MARC21 authorities is confirmed to be
  absent for Koha users.  I will also create a batch
  job to do this update in one fell swoop.
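
The 152$b to 942$a move can be sketched in Python as below. This is an illustration under assumptions, not the logic in AddAuthority or GetAuthority: the record is modelled as a list of (tag, subfields) pairs, the function name is hypothetical, and an existing 942$a is assumed to take precedence over the value from 152$b.

```python
def move_auth_type(fields):
    """Return a copy of the record with 152$b moved to 942$a.

    fields is a list of (tag, [(code, value), ...]) pairs.
    """
    auth_type = None
    result = []
    for tag, subfields in fields:
        if tag == "152":
            kept = []
            for code, value in subfields:
                if code == "b" and auth_type is None:
                    auth_type = value  # remember the authority type
                else:
                    kept.append((code, value))
            if kept:  # drop the 152 entirely if nothing else is left in it
                result.append((tag, kept))
        else:
            result.append((tag, list(subfields)))

    if auth_type is not None:
        for tag, subfields in result:
            if tag == "942":
                # Assumption: an existing 942$a wins over the old 152$b.
                if not any(code == "a" for code, _ in subfields):
                    subfields.insert(0, ("a", auth_type))
                break
        else:
            result.append(("942", [("a", auth_type)]))
    return result


record = [("150", [("a", "Cars")]), ("152", [("b", "TOPIC_TERM")])]
migrated = move_auth_type(record)
print(migrated)
```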

Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-01-04 18:42:40 -06:00
Joshua Ferraro
bbd043f155 adding three new variables for installation:
  'ZEBRA_SRU_HOST'             => 'localhost',
  'ZEBRA_SRU_BIBLIOS_PORT'     => '9998',
  'ZEBRA_SRU_AUTHORITIES_PORT' => '9999',
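
As a rough illustration of how these values might be used, the Python sketch below builds a standard SRU 1.1 searchRetrieve URL from them. The database path segment ("biblios") and the helper name are assumptions; only the three variable names and their defaults come from this commit.

```python
from urllib.parse import urlencode

install_vars = {
    "ZEBRA_SRU_HOST": "localhost",
    "ZEBRA_SRU_BIBLIOS_PORT": "9998",
    "ZEBRA_SRU_AUTHORITIES_PORT": "9999",
}


def sru_search_url(host, port, database, cql_query, max_records=10):
    """Build an SRU 1.1 searchRetrieve URL (database name is hypothetical)."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": cql_query,
        "maximumRecords": max_records,
    }
    return f"http://{host}:{port}/{database}?{urlencode(params)}"


biblio_url = sru_search_url(
    install_vars["ZEBRA_SRU_HOST"],
    install_vars["ZEBRA_SRU_BIBLIOS_PORT"],
    "biblios",
    "potter",
)
print(biblio_url)
```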

Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-01-03 21:28:27 -06:00
Joshua Ferraro
030fbd2e80 Microformat support:
Needed to restore OpenSearch capabilities, and did the following while
I was at it:

  * add support for unAPI: http://unapi.info/
  * add basic support for COinS and OpenURL:
    http://ocoins.info;
    http://www.niso.org/committees/committee_ax.html
  * ^^ Gives us Zotero Support!
  * adding some XSLT stylesheets for handling additional transformations
    NOTE: English and MARC21 specific unfortunately
  * adding back opensearch/rss feed <link>s for autodiscovery
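
For readers unfamiliar with COinS, the Python sketch below shows the general shape of the markup: an empty span with class Z3988 whose title attribute carries an OpenURL ContextObject in key/encoded-value form. It is not the code added in this commit; the function name and the choice of book fields are assumptions.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title, author, isbn):
    """Return a COinS span describing a book (illustrative field set)."""
    kev = urlencode([
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),
        ("rft.btitle", title),
        ("rft.au", author),
        ("rft.isbn", isbn),
    ])
    # The ampersands between key/value pairs must be HTML-escaped
    # inside the title attribute.
    return f'<span class="Z3988" title="{escape(kev)}"></span>'


span = coins_span("Harry Potter", "Rowling, J. K.", "0747532699")
print(span)
```

Tools such as Zotero look for these spans on a page to pick up citation data.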

TODO: after the installation, to get the Zebra system running on an external
port it's necessary to hand-edit the configs. I'm looking into Virtual Hosts
which could solve that problem (run on both the socket and a port).

Need to add better error handling to the unapi and opensearch scripts

Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-01-03 18:00:16 -06:00
Joshua Ferraro
c6c82fb2a5 Fix Genre-form and Subject-topical for MARC21
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-01-03 08:28:04 -06:00
Joshua Ferraro
6ba5ddd76e fixing a couple mappings for SRU CQL server
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-01-03 03:01:14 -06:00
Joshua Ferraro
6d924e69ab s/__DB_HOST__/__WEBSERVER_HOST__/
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-01-03 02:10:13 -06:00
Paul POULAIN
5dc5967801 synch'ing marc21 and unimarc where applicable
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-01-03 00:55:11 -06:00
Joshua Ferraro
3c0b7eee62 small fix to koha-conf.xml
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-01-03 00:40:14 -06:00
Joshua Ferraro
aabea3417b WARNING: Updates to Index Defs for Authorities
This commit is a partial fix to authority control for MARC21,
and better authority handling in general (for UNIMARC too).
Before this patch, authority searching, editing, and saving were
either not functioning or extremely buggy.

WARNING: You will need to re-index your authority database after
applying this commit.

The following changes have been made:

  * Normalizing record.abs index names (in both MARC21 and UNIMARC)
  * Synching authorities/bib1.att, ccl.properties, AuthoritiesMarc.pm
    with new indexes (UNIMARC too)
  * Clean up biblios/bib1.att (remove duplicate att defs)
  * Clean up authorities-* templates to conform to new styles
  * Fixed search failure when using Default framework (now searches
    All)

Also included are several fixes to the built-in SRU server for
Authority and Biblio; it's recommended that you update your
koha-conf.xml file:

  * adding explain-authorities.xml and explain-biblios.xml
  * adding necessary info to koha-conf.xml to enable SRU/W
  * adding several example XSLT stylesheets, that can be used
    for SRU on-the-fly transformations (to MODS, DC, RDF, etc.)

Still remaining for 3.0 are the following tasks:

  * update MARC21 frameworks (authority and cross-reference bib)
  * update display code/templates in authority results list
  * update search code/templates to utilize index points
  * implement 'grouping' of authtypes for searching (Name, Title, Subject)
  * repair utility to import auths and perform matching
  * repair bibliographic import to match auths and warn if no match

Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2008-01-03 00:28:40 -06:00
Galen Charlton
10c82bd9d7 authority zebra config: include gils.att
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-12-29 06:53:54 -06:00
Galen Charlton
7ad86c3a8e added logdir to koha-conf.xml
This parameter, initialized from LOG_DIR during installation,
allows scripts to specify a common directory for logs.

Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-12-28 12:40:11 -06:00
Joshua Ferraro
da8a4ca991 BIG COMMIT: minimal fix to authorities search
This is a minimal fix -- pname authorities work properly, but nothing
else has been tested yet

Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-12-26 20:23:16 -06:00
Joshua Ferraro
85092daa56 Warning: Big Commit. Fixing Virtual Shelves
  * prior to this commit, virtual shelves did not function in
    the OPAC! Now they do, except for deletion from virtual shelves
    in list form
  * I've re-named 'Virtual Shelves' to 'Lists' as per our agreed
    upon convention
  * while vshelves aren't perfect yet, they're in enough of a working
    state for the RC1 now

Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-12-23 14:31:14 -06:00
Joshua Ferraro
3edaba0cc0 fixes to search results list, ccl.properties tweak
patch updateitem.pl (was failing ... missing 'my')
update OPAC results
fix limit by availability

Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-12-20 17:35:46 -06:00
Galen Charlton
d579648df1 Merge git://git.koha.org/pub/scm/koha
2007-12-18 17:46:54 -06:00
Joshua Ferraro
fcc3986cfd Updates to date indexing and search processing
Summary of Koha 3.0 date indexing for MARC21:

Index                   Expected format         Notes
-----------------------------------------------------
date-entered-on-file    [yymmdd]                (008/0-5, indexed in word and sort indexes)
copydate                [yyyy]                  (260$c, indexed in word and sort indexes)
acqdate                 [yyyy-mm-dd]            (952$d, indexed in date,word,sort indexes)
pubdate                 [yyyy]                  (008/7-10, indexed in year,word,sort indexes)

Template Search Parameters Tested:
        limit-yr (either yyyy or yyyy-yyyy) (added processing for ge le, structure attribute st-numeric, etc.)
        yr pubdate (yyyy)
        acqdate,st-date-normalized (yyyy-mm-dd)

Template Sort Parameters Tested:
        pubdate_dsc
        pubdate_asc
        acqdate_dsc
        acqdate_asc
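
The limit-yr formats listed above (yyyy or yyyy-yyyy) can be parsed along these lines. This Python sketch is an assumption-laden illustration, not the search code changed in this commit; the function name is hypothetical. A range would then be applied as a "ge" bound on the first year and an "le" bound on the second.

```python
import re


def parse_limit_yr(value):
    """Parse 'yyyy' or 'yyyy-yyyy' into an inclusive (start, end) year pair."""
    match = re.fullmatch(r"(\d{4})(?:-(\d{4}))?", value.strip())
    if match is None:
        raise ValueError(f"unsupported limit-yr value: {value!r}")
    start = int(match.group(1))
    # A single year is treated as a one-year range.
    end = int(match.group(2)) if match.group(2) else start
    if end < start:
        raise ValueError("end year precedes start year")
    return start, end


print(parse_limit_yr("1999"))
print(parse_limit_yr("1999-2005"))
```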

Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-12-17 12:00:30 -06:00
Galen Charlton
8b75eae887 installer: fixed problem in MARC21 biblio record.abs
uri is defined in bib1.att, but url is not
2007-12-17 09:13:53 -06:00
Galen Charlton
5ccd6098ab installer: commented out line in MARC21 records.abs
This was causing Zebra to fail to index bib records.
NOTE: this is not a permanent fix.
2007-12-17 09:13:53 -06:00
Galen Charlton
ce3605da2a installer: more moving of Zebra config files
* must use record.abs
* sort-string.cfg => sort-string-utf.chr
2007-12-17 09:13:53 -06:00
Galen Charlton
0f5fa1bf2d installer: further moves of zebra configuration files
All Zebra config files are now installed by default.  The
ones specific to a MARC format or language are selected
by appropriate values in profilePath in zebra-biblios.cfg
and zebra-authorities.cfg.  Changing the MARC format
or indexing language can now be done by editing
profilePath.
2007-12-17 09:13:52 -06:00
Galen Charlton
190a7f404a installer: created skel directory
skel directory is for the installer only; contains
a directory structure and dummy READMEs used for
setting up the Zebra runtime and data directories.

Moved non-config files from etc/zebradb/* to
appropriate places under skel.
2007-12-17 09:13:52 -06:00
Galen Charlton
56622d5428 added trailing / to cgi-bin directory
2007-12-17 09:13:52 -06:00
Galen Charlton
5befdd2cd3 installer (part 2): more work
* plain 'make' now stages everything to blib, leaving
  actual installation to 'make install'
* adjusted rewrite-config.PL and config files
  for new substitution variables
* added default SetEnv Perl5Lib to
  koha-httpd.conf
2007-12-17 09:13:52 -06:00
Chris Cormack
5baba50aed Shifted the opac out of koha so it's now /usr/lib/cgi-bin/opac and /usr/lib/cgi-bin/koha by default; rewrite-config.PL and koha-httpd.conf updated
Signed-off-by: Galen Charlton <galen.charlton@liblime.com>
2007-12-17 09:13:52 -06:00
Galen Charlton
3f08fd0131 installer: move base zebra config files
Inspired by work by Chris Cormack, move several base Zebra
configuration files to two new directories under etc/zebradb:

lang_defs - language-specific settings (e.g., French and English)
marc_defs - MARC format-specific settings (e.g., MARC21 & UNIMARC)

Installer will query user for language and MARC format and
copy the initial Zebra configs accordingly.
2007-12-17 09:13:51 -06:00
Paul POULAIN
1cc21be002 unimarc.abs change : defaulting author sorting to 700$a
instead of 200$f
+ some tabs added to make the file easier to read
Some libraries don't use authorities (700$a),
but they are usually small libraries, so they won't be using Zebra!

Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-12-13 17:56:51 -06:00
Paul POULAIN
c22aebbf09 Unimarc record.abs fixes : fixed fields and some lc added
+ Physical-detail changed to Extent
+ Thesis-note removed, as it's not standard UNIMARC (it's specific to one of our libraries, in2p3)

Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-12-06 11:34:02 -06:00
Paul POULAIN
971976efc0 fix for itemtype in unimarc.abs
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-12-06 11:33:46 -06:00
Paul POULAIN
74fedc8be4 fix for ISBN search in unimarc
+ removing link, which is not in bib1.att (results in a lot of warnings in the Zebra log)

Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-12-06 11:33:40 -06:00
Paul POULAIN
f14c9b7f6d Fix for pub date in unimarc
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-12-06 11:33:26 -06:00
Paul POULAIN
979614022b Fix for Place-publication in unimarc
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-12-06 11:33:19 -06:00
Paul POULAIN
1b2ebba7b1 Fix for subject in unimarc.abs
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-12-06 11:33:12 -06:00
Paul POULAIN
736986e031 fix to unimarc.abs for languages
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-12-06 11:33:03 -06:00
Paul POULAIN
37d712fad2 Some changes to unimarc record.abs
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-12-02 14:59:26 -06:00
Paul POULAIN
e0246785e3 UNIMARC specific : itemtype is now known as itype
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-12-02 14:56:51 -06:00
Paul POULAIN
0381f47994 unimarc zebra config files moved to etc/zebradbs directory
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-11-25 17:15:52 -06:00
Joshua Ferraro
39786ad6b3 adding Suppress in OPAC to record.abs
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-11-25 16:28:41 -06:00
Joshua Ferraro
937bb38e6a fix for bug 1208, exact barcode search
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
2007-11-25 16:28:04 -06:00