[1] Increase sleep interval between checks of zebraqueue
from 0.01 seconds to 0.50.
[2] Batch up commits of changes to the zebraqueue table
[3] If the same record appears multiple times in the queue,
handle only once.
[4] Properly postpone failures to process record deletes to
avoid spinning.
[5] Correct how queue entries are marked done - avoid skipping
an authority record update, e.g., if it has the same
ID number as a bib that was updated.
[6] Added a FIXME about a possible later enhancement to
batch up updates so that Zebra isn't told to commit
after each record.
No documentation changes.
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
from circulation dashboard, creates new sysprefs, assigns
the sysprefs to the proper tab in sysprefs (Circulation),
updatedatabase changes to do the previous, and fixing one
redundent limit in the query for build_holds_queue.pl
Note: still need to address item-level holds
the tmp_holdsqueue table. This is an alternative holds
targeting workflow that is more suitable for multi-location
libraries than the default holds picklist report.
Note to documentation writers: this summary should be
added to any holds documentation as an overview of
the avaialable methods for holds fulfillment.
This alternative holds workflow assumes an
expectation that the system should target a specific
item for a given hold request, attempt to fulfill the
hold with that item, and if unable to fulfill, select
an available item at another location to fulfill the
hold.
This is quite different than the default Koha behavior
which uses a 'broadcast' method of hold fulfillment.
How it works:
This script weights available locations for holds based
on options specified in two system preferences:
StaticHoldsQueueWeight
Allows the library to specify a list of library
location codes -- if used alone, it will rank the
list statically, selecting the top-ranking available
location to be added to the picklist.
RandomizeHoldsQueueWeight
If RandomizeHoldsQueueWeight and StaticHoldsQueueWeight
are set, the list of library codes in the
StaticHoldsQueueWeight syspref are randomized rather
than statically ranked. If RandomizeHoldsQueueWeight
alone is set, the list of all available library codes
is used to randomize the weight.
If neither syspref is set, the list is statically
ranked according to how they are pulled out of the system
database.
NOTE: This has not yet been tested with item-level holds
this script had quite serious issues :
- it would not use mindays and maxdays variables
- It would send latin1 where utf8 was expected
- It would send data without text delimiters (; was chosen if title contains ; it would have been a problem " used as delimiters now)
- It would write a file when it was not asked
Now stores the results in a string before printing it.
New option added to store result into a file : -o filename
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
Enhanced the ability of catalogers to specify how
bib and item records should be added, replaced, or
ignored during a staging import.
When an import batch of bib records is staged and commit,
the user can now explicitly specify what should occur
when an incoming bib record has, or does not have, a match
with a record already in the database. The options are:
if match found (overlay_action):
create_new (just add the incoming record)
replace (replace the matched record with the incoming one)
use_template (option not implemented)
ignore (do nothing with the incoming bib; however, the
items attached to it may still be processed
based on the item action)
if no match is found (nomatch_action):
create_new (just add the incoming record)
ignore (do nothing with the incoming bib; in this
case, any items attached to it will be
ignored since there will be nothing to
attach them to)
The following options for handling items embedded in the
bib record are now available:
always_add (add the items to the new or replaced bib)
add_only_if_match (add the items only if the incoming bib
matches an existing bib)
add_only_if_add (add the items only if the incoming bib
does *not* match an existing bib)
ignore (ignore the items entirely)
With these changes, it is now possible to support the following use cases:
[1] A library joining an existing Koha database wishes to add their
items to existing bib records if they match, but does not want
to overlay the bib records themselves.
[2] A library wants to load a file of records, but only handle
the new ones, not ones that are already in the database.
[3] A library wants to load a file of records, but only
handle the ones that match existing records (e.g., if
the records are coming back from an authority control vendor).
Documentation changes:
* See description above; also, screenshots of the 'stage MARC records
for import' and 'manage staged MARC records' should be updated.
Test cases:
* Added test cases to exercise staging and committing import batches.
UI changes:
* The pages for staging and managing import batches now have
controls for setting the overlay action, action if no match,
and item action separately.
* in the manage import batch tool, user is notified when they
change overlay action, no-match action, and item action
* HTML for manage import batch tool now uses fieldsets
Database changes (DB rev 076):
* added import_batches.item_action
* added import_batches.nomatch_action
* added 'ignore' as a valid value for import_batches.overlay_action
* added 'ignored' as a valid value for import_records.status
* added 'status' as a valid value for import_items.status
API changes:
* new accessor routines for C4::ImportBatch
GetImportBatchNoMatchAction
SetImportBatchNoMatchAction
GetImportBatchItemAction
SetImportBatchItemAction
* new internal functions for C4::ImportBatch to
determine how a given bib and item are to be
processed, based on overlay_action, nomatch_action,
and item_action:
_get_commit_action
_get_revert_action
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
If running Zebra, try to locate zebrasrv and zebraidx
so that koha-zebra-ctl.sh can point to a functioning
zebrasrv.
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
note 995 for items is hardcoded, so it's really for UNIMARC only. The script exit if you're not UNIMARCflavour
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
The daemon control scripts (koha-zebra-ctl.sh, koha-zebraqueue-ctl.sh,
and koha-pazpar2-ctl.sh) are now copied and installed in a
runnable fashion for a 'dev'-mode install. By default
they are installed in the bin subdirectory of the runtime
directory.
Also:
* the control scripts now work if the EUID is other
than root (as would be expected for a 'dev' or 'single'
install).
* Split the SCRIPT_DIR installation target into
SCRIPT_DIR (scripts to copy regardless of install mode)
and SCRIPT_NONDEV_DIR (scripts to copy to SCRIPT_DIR
unless the install mode is 'dev').
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
rebuild_zebra.pl will now mark all zebraqueue entries
of the affected record type(s) done when run in
normal mode to index all records (as opposed to running
it with -z to just process the zebraqueue). This prevents
any running zebraqueue_daemon processes from attempting
to reindex the same records, redundantly.
The new -y swtich overrides this new behavior; in other words, if
running rebuild_zebra.pl without -z, you can specify
-y to *not* mark zebraqueue done.
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
* Add a new parameter -o to begin importing input file after skiping
n records.
* Enclose input file reading in an eval directive to avoid abording
import if few records are corrupted: they are now skipped.
* Help formating.
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
Add a explanation on DBD::mysql installation without test suite.
Add /misc/translator/install-code.pl script that creates templates
for specified language codes.
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
Accidentally introducing a circular reference in a
MARC::Record object does not lead to goodness, particularly
if you export lots and lots of them.
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
The -z option, when used in conjunction with -a and/or -b,
selects the records to reindex from the zebraqueue table.
Both record updates and record deletes are handled.
-z is cannot be used with -s or -r: the updated records
must always be freshly exported, and if zebraqueue
is to be processed, it's assumed that you don't want
to drop the Zebra index first.
This means that rebuild_zebra.pl -b -a -x can be
used as a cronjob to update the indexes periodically; it
is believed that this will offer much better indexing
performance on some setups as compared to zebraqueue_daemon.pl,
which uses Z39.50 extended services to send record updates
to Zebra.
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
At moment using both -a (index authorities) and
-x (export records as MARC XML) is not allowed -
if the Zebra authority database is using the DOM
filter, zebraidx will not be able to process the
exported records correctly.
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
1. Logic to fix up record IDs, UNIMARC 100 field,
and record leader now in separate functions.
2. Removed (incorrect) logic to save corrected record
in database.
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
The version of MARC::Batch->new() distributed with version
2.0.0 of MARC::Record, if given a file name, will
open it using the ':utf8' layer. This results in an
incorrect character conversion when processing records
in the MARC-8 character encoding.
To avoid this, batch jobs that use MARC::Batch now
open the file themselves, then pass the file handle
to MARC::Batch->new().
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
The new tables have the same structure and constraints
as the tables they archive with the following exceptions:
* borrowernumber and biblionumber in old_reserves can be
NULL
* the FK constraints (e.g., for itemnumber) on old_reserves
set the child column to NULL if the parent row is deleted
instead of deleting the child row.
* there is no FK constraint on old_issues.branchcode, allowing
a branch to be deleted without changing archived requests.
Some miscellaneous cleanup was done as part of this patch:
* GetMemberIssuesAndFines (C4::Members) now uses bind variables
* fixed POD for GetMemberIssuesAndFines
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
INTRANET_TMPL_DIR
INTRANET_WWW_DIR
OPAC_TMPL_DIR
OPAC_WWW_DIR
KOHA_CONF_DIR
OPAC_TMPL_DIR
OPAC_WWW_DIR
PAZPAR2_CONF_DIR
ZEBRA_CONF_DIR
For each directory on the list, if an existing file
installed file is different from the version
coming from the new package, it is copied to a backup file
whose name will have the suffix "_koha_<old_version_number>"
or "_upgrade_backup", depending on whether
the --prev-install-log option was used or not
during 'perl Makefile.PL'.
The directories whose files were backed up were
chosen because they contain configuration and
HTML files that a non-programmer Koha user
is likely to change. Consequently, the backup
files produced by 'make upgrade' are not a substitute
for doing a complete backup of one's Koha
installation before upgrading.
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
* IsStringUTF8ish - determine if scalar contains a string in UTF8
* MarcToUTF8Record - convert MARC blob or MARC::Record to UTF8
* SetMarcUnicodeFlag - set appropriate MARC21 or UNIMARC field to
indicate that record is in UTF-8.
Design points of this module include:
* No dependencies on other C4 modules, making it easier to add
more test cases
* All character conversion code in one place
* Single entry point for doing a character conversion on a
MARC record
* Capture of errors and warnings produced by Text::Iconv
and MARC::Charset
* Start of support for guessing the source character set of
a MARC record.
Several functions were moved from other scripts
or modules to C4::Charset:
* C4::Koha->FixEncoding (expanded and renamed
MarcToUTF8Record)
* C4::Koha->char_decode5426
* fMARC8ToUTF8 from bulkmarcimport.pl (renamed
_marc_marc8_to_utf8)
Several batch jobs were adjusted to use MarcToUTF8Record instead of
FixEncoding.
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
Fixed syntax errors preventing compilation; however,
unsure whether this is a dead utility that should be
removed outright.
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
Needed to restore OpenSearch capabilities, and did the following while
I was at it:
* add support for unAPI: http://unapi.info/
* add basic support for COinS and OpenURL:
http://ocoins.info;
http://www.niso.org/committees/committee_ax.html
* ^^ Gives us Zotero Support!
* adding some XSLT stylesheets for handling additional transformations
NOTE: English and MARC21 specific unfortunately
* adding back opensearch/rss feed <link>s for autodiscovery
TODO: after the installation, to get the Zebra system running on an external
port it's necessary to hand-edit the configs. I'm looking into Virtual Hosts
which could solve that problem (run on both the socket and a port).
Need to add better error handling to the unapi and opensearch scripts
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
Replace C4::Biblio::AddBiblioAndItems with two
things:
* An option to C4::Biblio::AddBiblio to defer writing
biblioitems.marc and biblioitems.marcxml. This
option was created to give a significant
speed boost to bulkmarcimport.pl, but is *not*
recommended for general use.
* C4::Items::AddItemBatchFromMarc
This refactoring removes the need to have functions
in C4::Biblio and C4::Items that call each other's
private functions.
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
* Move CheckItemPreSave to C4::Items (from C4::Biblio)
* Modified C4::Biblio::AddBiblioAndItems to use appropriate
internal routines from C4::Items
* Moved GetItemnumberFromBarcode to C4::Items
* Removed duplicate C4::Biblio::_koha_new_items
* Removed disused C4::Biblio::MARCitemchange
Currently AddBiblioAndItems is a special routine that
uses private subs from both C4::Biblio and C4::Items.
This needs to be refactored.
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
Prior to this fix, the status fields had three 'off' values, NULL, "",
and 0. I've reduced it to two in the db, removing the option for NULL, and
setting the default value to 0, however, we need to verify that we don't ever
write out as "" as this needlessly complicates the indexing process,
critical for searching or limiting by status (e.g., availability). Also,
queries that attempt to write a NULL value to one of these fields will fail
(based on my tests).
This patch includes the following changes:
* Updated the database definition for notforloan, damaged, itemlost, and
wthdrawn in kohastructure.sql to forbid NULL and default to 0; MySQL
can't forbid other values (such as empty ""), so this has to be handled
at the application layer and REQUIRES further patching.
* Fixed the 'limit by availability' query node in Search.pm to use a
much less confusing definition of 'available'
* Added code to set values to 0 where they are NULL or empty ( "" ) for
notforloan, damaged, itemlost or wthdrawn in both the MARC and the items
table:
* Biblio.pm -> AddBiblioAndItems
* catalogue/updateitem.pl
* SEE NOTE BELOW, REQUIRES UPDATE TO THE REST OF KOHA'S ITEM MGT!
* Removed code in bulkmarcimport.pl that sets notforloan status depending
on item-level or bib-level itemtype -- that flag is designed to be set
only to override the notforloan setting for the item's (or bib's,
depending on the syspref) assigned itemtype (it doesn't need to override
to 'for loan', only to 'not for loan').
added $dbh->do("truncate zebraqueue"); when operation is 'delete'
* I updated some notes in catalogue/updateitem.pl as to why ModItem can't be
used -- we don't have _a_ place where we can change the item and marc :/
I've tested the following:
bulkmarcimport.pl..........................MARC/items OK
Staged Records Import......................NOT OK
updateitem.pl (via moredetail.pl)..........MARC/items OK
circulation.pl.............................NOT OK
returns.pl.................................NOT OK
addbiblio.pl...............................NOT OK
additem.pl.................................NOT OK
Basically, there isn't a single place to apply this patch that will
update both item data and MARC data in one place ... a future patch
needs to address this issue.
Signed-off-by: Galen Charlton <galen.charlton@liblime.com>
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
the unimarc stuff has been moved to marc_defs directory and the
lang specific is in lang_defs
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
Enabled automatic conversion of MARC-8 records to
UTF-8. Record is converted if its Leader/09 contains
a blank and the -s (skip) option hasn't been supplied
on the command-line. Any record that cannot be converted
to UTF-8 is skipped.
Also now use Unicode Normalization Form C (NFC) for
records converted from MARC-8.
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
Introduced new C4::Biblio function CheckItemPreSave,
which checks for duplicate barcodes and invalid
branch codes. Not yet sure whether this function
needs to be exported or whether it will just be
used internally to C4::Bibli.
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
Changes to improve speed of MARC bib and item
imports:
[1] Turn off autocommit and commit database
transactions in larger batches.
[2] Introduce a new C4::Biblio function (AddBiblioAndItems)
to combine AddBiblio and AddItems -- this is faster
because we are not parsing the MARC XML of the biblio
every time we add an item.
[3] Introduce FasterTransformMarcToKoha, which is much
faster than TransformMarcToKoha. The new version,
which will replace the old one once it has been
fully tested, scans through each field in the
MARC record just once, instead of potentially
dozens of times.
[4] Remove code in bulkmarcexport that moved the
item tags to separate MARC::Record objects.
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
* adding zebra daemons for managing server and queue processes
* improvements to the README.debian file
* Fixes to Search.pm since last series of commits broke zebra-based
searching (again)
* moving some files to new misc/bin and misc/cronjobs
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
[1] Use File::Temp to create and manage
export directory if -d is not specified.
[2] Added usage message.
[3] Code that attempts to fix up Zebra
configuration files changed so that it
is invoked only if --munge-config option
is supplied; this code will ultimately
either be removed or moved to a separate
script -- the sorts of errors that it
tries to fix should no longer be appearing
in a standard install.
[4] Fixed Win32 portability problem when removing
temporary directory.
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
* rewrite-config.PL now puts in installed location
of koha-conf.xml in C4/Context.pm so that
correct config can be found even when
KOHA_CONF is not set. Note that setting KOHA_CONF
will still override path set by installer.
* changed references from koha.xml to koha-conf.xml
The correct field is 'items.wthdrawn', not 'items.withdrawn'.
Spelling may be corrected in post-3.0 version.
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
There are only 2 UNIMARC specific files (.abs and .chr), they have been moved to etc/zebradb
The rebuild_zebra.pl takes all config file from this location now.
the misc/zebra/ can be removed (and will be soon)
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
The authoritative data for the items is the items table, not the MARC data.
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
this time it's when a biblio don't have biblionumber, has a 100$a field, and it's invalid.
1 biblio in my 300 000 DB (and it was biblio 294 359, of course !)
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
the API of TransformMarcToKoha has changed.
+
For a reason I don't understand (& appear a conception bug), the biblioitemnumber is not shipped in the hash
and it's needed by the UPDATE. So it has to be reintroduced manually.
That's how it's done in Biblio.pm/Sub ModItem.
I think 3.2 will see the end of biblioitems (& it's replacement by something else)
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
now, adding both on the fly when needed.
(had 2 biblios like that in a 290 000 DB, but was enought to have M::F::X complaining & diing !)
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
* now properly deletes deleted authorities from index
* now actually ignores duplicate work
* standardized on recordDelete instead of delete_record
for bibs and recordDelete for authorities
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
As part of this, modified two routines in C4::ImportBatch
to support a callback for monitor progress of import
processing.
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
(authorities that don't have a 001 field containing authid)
also comment some code when exporting biblios (NOT tested, hdl,pls confirm this commit)
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
The zebraque_start scripts works, but it not friendly when you try to track what he did
as records are deleted after done.
This commit changes the behaviour : 2 columns are added done & timestamp.
done is set to 1 when a line have been done. And the timestamp contains the timestamp of the last action (either line creation if done=0 or operation if done=1)
should be helpfull to track problem.
the table will grow, but i'll add soon a DELETE FROM zebraqueue WHERE timestamp > 30 days or something like that
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
- the quotemeta was wrong (and introduced some bugs in diacritics)
- fixing some bugs that appear only sometimes : the union was done including weight, which is wrong & resulted in missing some results (when various weighting)
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
This Bug came out of the fact that indexes were used a wrong way in record.abs
Please consider correcting record_usmarc.abs
Indeed indexes were used :
melm NNN$X Myattribute !:w,!:p
This prooved not to work on indexes.
It took only default (w) index for Myattribute.
So Please, correct your record.abs, reindex, and sorting by publication date will be fine.
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
If 100$a repeated, the scripts failed to handle that correctly
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
(just spelling, but very important, without them, authorities won't work)
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
- reindenting
- upcasing SQL
- the script at least compiles...
... but it does seem not work yet
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
- pay.pl fixed, the librarian can see patron fines & mark them paid
- fines2.pl fixed, the script now calculate the fines correctly from the finerules :
* itemtype / patron category if it exist
* itemtype / default if needed
- renamed misc/fines.pl as fines-sanop.pl
- added 2 systempreferences:
* MaxFine: the max amount a patron can be charged for an item
* NoReturnSetLost: how many days of late before a non returned item is marked lost & the patron charged for the full cost of replacement
(those values where hardcoded in fines2.pl
- C4/Circulation/Fines.pm has been removed (unused duplicate of C4/Overdues)
Note that SANOP feature about notify levels have NOT been ported here. I think they are too specific now, and the code is poorly written
I've renammed fines.pl to fines-sanop.pl to point that it is specific.
Thus, all notify_id related features are not used by anything (and always 0)
Can be interesting to reintroduce them, but that will probably be a large work...
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
nobody probably ever enter "." on search term, but just in case...
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
This option uses the iso2709 version of the MARC record instead of the XML one
(biblioitems.marc vs biblioitems.marcxml)
No change if the parameter is not set.
Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
- adding browser and nozebra table definition to kohastructure & updatedatabase
- bumping to 3.00.00.005
Signed-off-by: Chris Cormack <crc@liblime.com>
- use biblio instead of marc_biblio,
- better check that biblionumber is correctly stored
- fix an buggy API call when ModMarcBiblio
Signed-off-by: Chris Cormack <crc@liblime.com>
local LC call number as used at many North American academic libraries.
Correcting authoritytypecode values where my typo used the number '0' as
part of the value in one table and the letter 'O' in another table.
Improved seealso column indexing values.
Signed-off-by: Chris Cormack <crc@liblime.com>
part of the value in one table and the letter 'O' in another table.
Improved seealso column indexing values.
Signed-off-by: Chris Cormack <crc@liblime.com>
This script allows you to update your database/records after a bulkmarckimport integration.
Currectly it reformats biblioitems.isbn and bibliotitems.marcxml on deleting any '-' on isbn.
- updating templates to have tmpl_process3.pl running without any errors
- adding a drupal-like css for prog templates (with 3 small images)
- fixing some bugs in circulation & other scripts
- updating french translation
- fixing some typos in templates
- support for authorities
- some bugfixes in ordering and "CCL" parsing
- support for authorities <=> biblios walking
Seems I can do what I want now, so I consider its done, except for bugfixes that will be needed i m sure !
* adding 3 subs in Biblio.pm
- GetNoZebraIndexes, that get the index structure in a new systempreference (added with this commit)
- _DelBiblioNoZebra, that retrieve all index entries for a biblio and remove in a variable the biblio reference
- _AddBiblioNoZebra, that add index entries for a biblio.
Note that the 2 _Add and _Del subs work only in a hash variable, to speed up things in case of a modif (ie : delete+add). The effective SQL update is done in the ModZebra sub (that existed before, and dealed with zebra index).
I think the code has to be more deeply tested, but it works at least partially.
- changing nozebra table to have biblionumber,title-ranking; (; is the entry separator. Now, if a value is several times in an index, it is stored only once, with a higher ranking (the ranking is the number of times the word appeard for this index)
- improving search to have ranking value (default order). The ranking is the sum of ranking of all terms. The list is ordered by ranking+title, from most to lower
- add nozebra table management on biblio editing
- the index table content is hardcoded. I still have to add some specific systempref to let the library update it
- manage pagination (next/previous)
- manage facets
WHAT works :
- NZgetRecords : has exactly the same API & returns as zebra getQuery, except that some parameters are unused
- search & sort works quite good
- CQL parser is better that what I thought I could do : title="harry and sally" and publicationyear>2000 not itemtype=LIVR should work fine
Abiding by Name Convention.
Using Members wherever it should be used.
Borrower is only used for borrower Categories.
+ GetBorrowersWhoHaveNeverBorrowed
and lists like that.
- checkaccount and getborraccountno => GetBorrowerAcctRecord
Many changes in names,
some changes in function signature.
Will be detailed in a mail to kohadevel.
All subs have be cleaned :
- removed useless
- merged some
- reordering Biblio.pm completly
- using only naming conventions
Seems to have broken nothing, but it still has to be heavily tested.
Note that Biblio.pm is now much more efficient than previously & probably more reliable as well.
== Biblio.pm cleaning (useless) ==
* some sub declaration dropped
* removed modbiblio sub
* removed moditem sub
* removed newitems. It was used only in finishrecieve. Replaced by a Koha2Marc+AddItem, that is better.
* removed MARCkoha2marcItem
* removed MARCdelsubfield declaration
* removed MARCkoha2marcBiblio
== Biblio.pm cleaning (naming conventions) ==
* MARCgettagslib renamed to GetMarcStructure
* MARCgetitems renamed to GetMarcItem
* MARCfind_frameworkcode renamed to GetFrameworkCode
* MARCmarc2koha renamed to TransformMarcToKoha
* MARChtml2marc renamed to TransformHtmlToMarc
* MARChtml2xml renamed to TranformeHtmlToXml
* zebraop renamed to ModZebra
== MARC=OFF ==
* removing MARC=OFF related scripts (in cataloguing directory)
* removed checkitems (function related to MARC=off feature, that is completly broken in head. If someone want to reintroduce it, hard work coming...)
* removed getitemsbybiblioitem (used only by MARC=OFF scripts, that is removed as well)
cvs -z3 -d:ext:kados@cvs.savannah.nongnu.org:/sources/koha co -P koha
find koha.precrash -type d -name "CVS" -exec rm -v {} \;
cp -r koha.precrash/* koha/
cd koha/
cvs commit
This should in theory put us right back where we were before the crash
subfields in the 09o or 999 placeholer for 090 Locally Assigned LC-Type Call
Call Number found in copy catalogued records. The locally assinged original
090 call number remains visible in tab 0 for informational purposes as often
the only place to find what an LC Call number for the material might be.
This file now combines koha configuration and zebra configuration. Old koha.conf variable names are retained. They go between the tags <config>. Do not change the tag names for zebra part. Use zebrasrv -f /etc/koha.xml to start your server. The virtual server configuration details can be found in YAZ documentation
only allows numbers up to 127 - needs to allow bigger numbers if the sub you're making
is for a daily subscription like a newspaper where the number of issues in a year
will be greater than 127
head repo (ie, copies over the right files, creates some symlinks,
etc.). It's not finished yet, but getting there (I'll be testing it
extensively today and tomorrow).
It now responds to:
-n : the number of records to import.
-commit : the number of records to wait before performing a 'commit' operation
ALSO: IMPORTANT: I took out the char_encoding as this should be handled by
MARC::File::XML now, unless I'm mistaken.
recordId: (bib1,Identifier-standard) just after the comma. Adam agreed it was a bug, and it should be solved soon. But now we are aware, we can avoid putting the space !
In this commit you have all what is needed to setup a working zebra DB in Unimarc :
* collection.abs is UNIMARC specific and must be rewritten for MARC21, in marc21 directory
* pdf.properties is to be copied unmodified in the marc21 directory (can also be put somewhere else)
* rebuild_zebra.pl is SLOW, but 1 step reindexing tool, using ZOOM
* rebuild_zebra_idx is FAST, but 2 step reindexing tool, and does not use zebra. run it, it will create all biblios XML files in /zebra/biblios directory, then zebraidx update biblios in your zebra directory
* zebra.cfg is the zebra config file ;-)
* test_cql2rpn.pl is a script that will query the database and show the results. Works for me, just change the query at the beginning to get answers you expect.
What has to be done :
* benchmarking : it seems the zebraidx update is faster than lightning (400biblios/sec : 10 000biblios in 25seconds), while ZOOM indexing is slow (something like 25biblios/second) More benchmarking could be done.
* completing collection.abs for UNIMARC. I'll take care of it.
* modifying Biblio.pm to use ZOOM instead of the "zebraidx through exec" running actually. I'll take care of it also.
* modify the search API & tools & screens. I'll let the ball to someone else (chris ?) for this. I agree SearchMarc.pm can be dropped and replaced by something else (maybe a new-and-clean Search.pm package)
Seems not to break too many things, but i'm probably wrong here.
at least, new features/bugfixes from 2.2.5 are here (tested on some features on my head local copy)
- removing useless directories (koha-html and koha-plucene)
* go to koha cvs home directory
* in misc/zebra there is a unimarc directory. I suggest that marc21 libraries create a marc21 directory
* put your zebra.cfg files here & create your database.
* from koha cvs home directory, ln -s misc/zebra/marc21 zebra (I mean create a symbolic link to YOUR zebra directory)
* now, everytime you add/modify a biblio/item your zebra DB is updated correctly.
NOTE :
* this uses a system call in perl. CPU consumming, but we are waiting for indexdata Perl/zoom
* deletion still not work
* UNIMARC zebra config files are provided in misc/zebra/unimarc directory. The most important line being :
in zebra.cfg :
recordId: (bib1,Local-number)
storeKeys:1
in .abs file :
elm 090 Local-number -
elm 090/? Local-number -
elm 090/?/9 Local-number !:w
(090$9 being the field mapped to biblio.biblionumber in Koha)
how it works :
* create the table marc_Tword with the following structure :
CREATE TABLE `marc_Tword` (
`word` varchar(80) NOT NULL default '',
`usedin` text NOT NULL,
`tagsubfield` varchar(4) NOT NULL default '',
PRIMARY KEY (`word`,`tagsubfield`)
) TYPE=MyISAM;
* open a console & type export PERL5LIB & export KOHA_CONF as usual.
* fill this table with misc/build_marc_Tword.pl. Warning, this script uses a very very consumming but very fast method to fill the table : it does everything in memory, then write everything. Another method is provided (& commented), but it's 100x times slower (really !)
* open opac-search.pl and replace use C4::SearchMarc; by use C4::SearchMarcTest; as the API hasn't changed, it will work immediatly.
* go to opac-search (advanced search) & search whatever you want. Should work fine.
LIMITS :
* build_marc_Tword has problem with extended chars (accented ones mainly). So don't be afraid if you get sql errors. They are not a problem for a POC
* search works always order by title, whatever you choose.
* search works only search WORDA and WOARDB, not yet WORDA or WORDB or WORDA except WORDB.
Make the generated pot file (i.e., result of "create") look more "real",
but using msgmerge to reformat the output
Script failed to create intermediate directories if the directory of the
target does not exist and the parent of that directory does not exist
either. This should fix that.