Koha-community/Koha - Koha: The world's first free and open source library system

Author	SHA1	Message	Date
Henri-Damien LAURENT	ce3adab2ee	bulkmarcimport.pl Bug Fix matching biblios enhanced matching biblios is now also getting biblioitemnumber so that Items management can be performed	2009-11-23 21:40:13 +01:00
Paul Poulain	7cc1115cba	adding error details	2009-11-17 16:27:12 +01:00
Henri-Damien LAURENT	663eb1edd6	Adding some error proof on GetMarcRecord Signed-off-by: Galen Charlton <gmcharlt@gmail.com>	2009-09-30 11:29:24 +02:00
Henri-Damien LAURENT	7eca37db4f	Authorities bulkmarcimport Adding some new options to bulkmarcimport : -k idtagsubfield in order to store the id of the file record into another field -match tagsubfield,index -a to import authorities -l logfilename to store logs Bug Fixing : C4/Charset.pm Charset was incorrect for UNIMARC Authorities Signed-off-by: Galen Charlton <gmcharlt@gmail.com>	2009-09-30 11:22:21 +02:00
Ricardo Dias Marques	f8ff5879a5	Bug 3582: Missing usage information for -h / --help switch for rebuild_nozebra.pl Fix for Bug 3582: Missing usage information for -h / --help switch for rebuild_nozebra.pl http://bugs.koha.org/cgi-bin/bugzilla3/show_bug.cgi?id=3582 Signed-off-by: Galen Charlton <gmcharlt@gmail.com>	2009-09-06 12:48:35 -04:00
Sébastien Hinderer	2a8df0bc2f	Get rid of a few warnings in the bulkmarcimport script: C4/Biblio.pm, hunks #1 , #2 : a warning occurring in NoZebra configurations. C4/Biblio.pm hunk #3 : warning occurring in Unimarc MARC flavour. misc/migration_tools/bulkmarcimport.pl hunk #1 : warning occurring when no default format is specified on command-line with -m switch. Signed-off-by: Galen Charlton <gmcharlt@gmail.com>	2009-09-06 09:54:11 -04:00
Colin Campbell	3199d032e5	Avoid numeric comparisons with leading zeroes Numbers in perl with leading zeros are interpreted in octal Ensure that comparisons are done using string operators or where appropriate use the MARC::Field method Signed-off-by: Galen Charlton <gmcharlt@gmail.com>	2009-08-20 21:01:52 -04:00
Henri-Damien LAURENT	731b82f764	3519 : mergeauthority and authority edition were not synched mergeauthority and ModAuthority were working on two separate directories. So that no authority would ever be merged via cronjob or commandline script when MergeAuthoritiesOnUpdate is disabled Signed-off-by: Galen Charlton <gmcharlt@gmail.com>	2009-08-11 19:30:14 -04:00
Galen Charlton	3caec55fd1	removed redundant license statement The standard license statement in the header is fine; please don't confuse things by doing anything different. Signed-off-by: Galen Charlton <gmcharlt@gmail.com>	2009-08-01 08:17:52 -04:00
Paul Poulain	6b1df98ddf	script to remove authorities without biblio attached Signed-off-by: Galen Charlton <gmcharlt@gmail.com>	2009-08-01 08:10:01 -04:00
Frédéric Demians	459d732180	Bug 3301 - Speed up rebuild_zebra script With this patch, rebuild_zebra can re-index a whole Koha DB quickly: rebuild_zebra -r -b -nosanitize Biblio (authority) records are dumped directly into a file from the marcxml field without being transformed into MARC::Record objects and corrected. DOCUMENTATION: rebuild_zebra.pl new parameter: -nosanitize export biblio/authority records directly from DB marcxml field without sanitizing records. It speeds up the dump process but could fail if the DB contains badly encoded records. Works now only with -x and -b Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2009-06-29 07:52:46 -05:00
Brian Harrington	6a2d9ffcf2	Bug 3313, bulkauthimport.pl skips MARC21 subdivision records. This patch adds the MARC21 subdivision record tags (18x) to the block which recognizes and assigns authtypecodes to imported authority records. Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2009-06-08 17:03:03 -05:00
Galen Charlton	da51de184c	bug 2926: fix staging import hang Fixes a hang of the staging import tool when it attempts to process a MARC21 record that claims that it's UTF-8 when it is not. The staging import will now attempt to fix the character encoding of such records. Also added a FIXME to bulkmarcimport.pl, which because of its use of MARC::Batch will skip over such records - better than the original hang of the staging import, but worse than the staging import's new ability to fix such records. Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2009-06-07 13:17:06 -05:00
Galen Charlton	3f4641bf30	bug 3201: missing090field.pl - skip bad bibs Patch courtesy of G. Henry <henry@cmi.univ-mrs.fr> Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2009-06-07 13:17:01 -05:00
J. David Bavousett	a7d1ab0041	Changes to bulkmarcimport.pl Adds three new switches: -idmap <filename> - optional output file of map of source record ID numbers to Koha biblionumber -x - if idmap is supplied, MARC tag to get source record ID from -y - if idmap is supplied, MARC subfield to get source record ID from Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2009-04-03 19:18:29 -05:00
Henri-Damien LAURENT	911fddab4a	merge_authority : Bug fixing Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2009-03-06 14:14:34 -06:00
Mason James	e9599f973c	Fixes command-line 'number' arg in bulkauthimport.pl. for HEAD and 3.0.x Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2009-03-04 10:43:49 -06:00
Brian Harrington	25cd35b3a1	bug 2924 fixed rebuild_zebra.pl to work when export is skipped reindexing now occurs if there are $num_records_exported or if $skip_export is set Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2009-03-04 08:28:22 -06:00
Galen Charlton	8f07521a2d	bug 2955: fix remaining calls to GetMarcFromKohaField This includes part of a patch from Henri-Damien Laurent that could not be applied because Chris and Joe patches happened to win the race. Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2009-02-12 16:29:19 -06:00
Joe Atzberger	11b90be284	Cleanup and perltidy. Add "use warnings", remove unused variables and unnecessary finish/disconnect at the end. This script could be improved to run only on tables that need to be altered instead of touching all of them. It should also probably contain warnings to the effect that it does not rescue your DATA that was forced into whatever encoding the table used previously. Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2009-01-28 17:29:43 -06:00
Michael Hafen	086b3ccf9a	bug in rebuild_zebra verbose logging - found another print I didn't want to see all the time Add the phrase 'if ( $verbose_logging )' to the two print statements concerning the skipping of biblio or authority records. I recently had to split biblio and authority index updating in my cron script ( had some really big records so had to add the -x switch which should only be used on biblios according to the help ). So I noticed that rebuild_zebra.pl printed messages that it was skipping biblios or authorities. This patch is to conditionalize those prints based on the verbose logging switch. Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2008-12-11 09:23:28 -06:00
Michael Hafen	62a590a954	Reduce logging from rebuild_zebra.pl with a command line option This reduces the output of the script and zebraidx, and creates a -v command line switch which will increase the logging to their former states. Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2008-10-01 13:05:20 -05:00
Henri-Damien LAURENT	ca8d24546e	Bug Fixing merge_authority.pl merge works on the fly now. But for an obscure reason, merge_authority.pl fails to update database when launched on command line. Adding one table to LOCK for noZebra UPDATE in Biblio.pm You should remove C4::Search from merge_authority.pl Signed-off-by: Galen Charlton <galen.charlton@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-08-09 11:05:53 -05:00
Galen Charlton	df1f46f9da	bug 2253: improve rebuild_zebra's handling of zebraqueue Prior to this patch, rebuild_zebra.pl -z was effectively hanging on to a lock on the zebraqueue table, preventing other scripts from inserting new entries into the table. This had the effect of causing circulation operations to time out. Refactored by having rebuild_zebra.pl pull the active queue into memory, then mark entries done by zebraqueue.id. Consequently, rebuild_zebra.pl should no longer block adding new entries into zebraqueue. Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-06-19 09:49:06 -05:00
Paul POULAIN	feae120738	BUGFIX : script to fix & fill onloan field in items table. Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-05-12 09:24:43 -05:00
Galen Charlton	a78b115d35	kohabug 2076 - make biblioitems.marc longblob during upgrade Change to match 3.0 definition of that column. Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-05-11 05:37:18 -05:00
Paul POULAIN	8e1844d495	missing ) Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-05-05 05:39:13 -05:00
Paul POULAIN	e7209ed02a	UNIMARC specific rebuild items correctly note 995 for items is hardcoded, so it's really for UNIMARC only. The script exit if you're not UNIMARCflavour Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-04-22 17:34:41 -05:00
Galen Charlton	3109d5820e	rebuild_zebra.pl - add -y option rebuild_zebra.pl will now mark all zebraqueue entries of the affected record type(s) done when run in normal mode to index all records (as opposed to running it with -z to just process the zebraqueue). This prevents any running zebraqueue_daemon processes from attempting to reindex the same records, redundantly. The new -y switch overrides this new behavior; in other words, if running rebuild_zebra.pl without -z, you can specify -y to not mark zebraqueue done. Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-04-21 11:17:29 -05:00
Frederic Demians	004524584b	Tweak bulkmarcimport.pl * Add a new parameter -o to begin importing input file after skipping n records. * Enclose input file reading in an eval directive to avoid aborting import if a few records are corrupted: they are now skipped. * Help formatting. Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-04-17 05:52:53 -05:00
Galen Charlton	e2c1f11715	fixed memory leak I introduced Accidentally introducing a circular reference in a MARC::Record object does not lead to goodness, particularly if you export lots and lots of them. Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-04-01 06:46:05 -05:00
Galen Charlton	4f001186b6	still more rebuild_zebra refactoring Merged duplicate code for indexing bibs and authorities into a single index_records() function. Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-03-25 07:58:03 -05:00
Galen Charlton	a5576b8dfe	IMPORTANT: added -z option to rebuild_zebra.pl The -z option, when used in conjunction with -a and/or -b, selects the records to reindex from the zebraqueue table. Both record updates and record deletes are handled. -z cannot be used with -s or -r: the updated records must always be freshly exported, and if zebraqueue is to be processed, it's assumed that you don't want to drop the Zebra index first. This means that rebuild_zebra.pl -b -a -x can be used as a cronjob to update the indexes periodically; it is believed that this will offer much better indexing performance on some setups as compared to zebraqueue_daemon.pl, which uses Z39.50 extended services to send record updates to Zebra. Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-03-25 07:58:01 -05:00
Galen Charlton	57d128f727	rebuild_zebra: exit if both -a and -x specified At moment using both -a (index authorities) and -x (export records as MARC XML) is not allowed - if the Zebra authority database is using the DOM filter, zebraidx will not be able to process the exported records correctly. Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-03-25 07:57:44 -05:00
Galen Charlton	f0d5da7448	more rebuild_zebra.pl refactoring 1. Logic to fix up record IDs, UNIMARC 100 field, and record leader now in separate functions. 2. Removed (incorrect) logic to save corrected record in database. Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-03-25 07:57:43 -05:00
Galen Charlton	f98c27a8bc	refactor rebuild_zebra: new routine for invoking zebraidx Created a routine for calling zebraidx, replacing separate invocations for bibs and authorities. Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-03-25 07:57:42 -05:00
Galen Charlton	ae8a76dacc	rebuild_zebra.pl: removed disused $limit option Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-03-25 07:57:41 -05:00
Galen Charlton	b4f39e5c58	do not let MARC::Batch open MARC files The version of MARC::Batch->new() distributed with version 2.0.0 of MARC::Record, if given a file name, will open it using the ':utf8' layer. This results in an incorrect character conversion when processing records in the MARC-8 character encoding. To avoid this, batch jobs that use MARC::Batch now open the file themselves, then pass the file handle to MARC::Batch->new(). Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-03-21 21:46:39 -05:00
Galen Charlton	ad0639e548	remove some unneeded use statements Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-03-21 21:46:29 -05:00
Galen Charlton	4e95689287	bulkmarcimport.pl: XML input option documented Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-03-03 13:01:00 -06:00
Galen Charlton	d49873cc2f	bulkauthimport.pl - various improvements Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-03-03 13:00:59 -06:00
Mason James	5057e74914	more \Q...\E wrapping on regexes, to handle occasionally problematic strings. Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-02-28 07:58:45 -06:00
Mason James	d451f072f9	corrections to host-item, shelf_loc and collection-code indexes Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-02-28 07:58:43 -06:00
Mason James	be14507658	oops, removing un-needed $dbh->commit() calls Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-02-20 20:16:42 -06:00
Mason James	b57c146b26	setting $dbh->{AutoCommit} = 0, and adding a new --commit arg. Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-02-20 20:16:41 -06:00
Paul POULAIN	0e2b065219	NoZebra : removing . and : before indexing Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-02-19 20:27:36 -06:00
Paul POULAIN	bcf36122a6	speeding a lot rebuild_nozebra by using autocommit OFF feature Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-02-19 20:27:24 -06:00
Mason James	bdd2afc747	a little speed tweak here, setting "SET FOREIGN_KEY_CHECKS = 0" before clearing bib/bi/items tables. Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-02-18 22:07:09 -06:00
Ryan Higgins	71dd69d5ac	add option to export and index xml to rebuild_zebra Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-02-15 08:25:46 -06:00
Mason James	3cb4ea7ecf	added 440* and 490* 'series' indexes Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-02-11 16:14:54 -06:00
Galen Charlton	60a98d258a	IMPORTANT - refactor MARC character set handling * IsStringUTF8ish - determine if scalar contains a string in UTF8 * MarcToUTF8Record - convert MARC blob or MARC::Record to UTF8 * SetMarcUnicodeFlag - set appropriate MARC21 or UNIMARC field to indicate that record is in UTF-8. Design points of this module include: * No dependencies on other C4 modules, making it easier to add more test cases * All character conversion code in one place * Single entry point for doing a character conversion on a MARC record * Capture of errors and warnings produced by Text::Iconv and MARC::Charset * Start of support for guessing the source character set of a MARC record. Several functions were moved from other scripts or modules to C4::Charset: * C4::Koha->FixEncoding (expanded and renamed MarcToUTF8Record) * C4::Koha->char_decode5426 * fMARC8ToUTF8 from bulkmarcimport.pl (renamed _marc_marc8_to_utf8) Several batch jobs were adjusted to use MarcToUTF8Record instead of FixEncoding. Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-02-03 07:23:56 -06:00
Daniel Bünzli	78f3e56e2c	bulkauthimport fix Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Galen Charlton <galen.charlton@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-01-22 07:20:28 -06:00
Joshua Ferraro	2a37c19dac	Rudimentary import of MARC21 authorities Also adding support for ingesting format MARCXML in bulkmarcimport and bulkauthimport Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-01-04 21:30:17 -06:00
Joshua Ferraro	9c25d6368a	improvements to INSTALL.debian, adding Symbols for currencies adding \n to make bulkmarcimport.pl prettier Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-01-03 21:28:37 -06:00
Galen Charlton	8c60e82605	fixed variable masking warnings found by perl -w Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-01-03 20:23:59 -06:00
Galen Charlton	c2a0ed8077	item rework: replaced AddBiblioAndItems Replace C4::Biblio::AddBiblioAndItems with two things: * An option to C4::Biblio::AddBiblio to defer writing biblioitems.marc and biblioitems.marcxml. This option was created to give a significant speed boost to bulkmarcimport.pl, but is not recommended for general use. * C4::Items::AddItemBatchFromMarc This refactoring removes the need to have functions in C4::Biblio and C4::Items that call each other's private functions. Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-01-03 16:26:16 -06:00
Galen Charlton	9d4d8897b2	item rework: various changes * Move CheckItemPreSave to C4::Items (from C4::Biblio) * Modified C4::Biblio::AddBiblioAndItems to use appropriate internal routines from C4::Items * Moved GetItemnumberFromBarcode to C4::Items * Removed duplicate C4::Biblio::_koha_new_items * Removed disused C4::Biblio::MARCitemchange Currently AddBiblioAndItems is a special routine that uses private subs from both C4::Biblio and C4::Items. This needs to be refactored. Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-01-03 16:25:42 -06:00
Joshua Ferraro	5c23369af2	Fixing Database Definitions for Statuses PARTIAL Prior to this fix, the status fields had three 'off' values, NULL, "", and 0. I've reduced it to two in the db, removing the option for NULL, and setting the default value to 0, however, we need to verify that we don't ever write out as "" as this needlessly complicates the indexing process, critical for searching or limiting by status (e.g., availability). Also, queries that attempt to write a NULL value to one of these fields will fail (based on my tests). This patch includes the following changes: * Updated the database definition for notforloan, damaged, itemlost, and wthdrawn in kohastructure.sql to forbid NULL and default to 0; MySQL can't forbid other values (such as empty ""), so this has to be handled at the application layer and REQUIRES further patching. * Fixed the 'limit by availability' query node in Search.pm to use a much less confusing definition of 'available' * Added code to set values to 0 where they are NULL or empty ( "" ) for notforloan, damaged, itemlost or wthdrawn in both the MARC and the items table: * Biblio.pm -> AddBiblioAndItems * catalogue/updateitem.pl * SEE NOTE BELOW, REQUIRES UPDATE TO THE REST OF KOHA'S ITEM MGT! * Removed code in bulkmarcimport.pl that sets notforloan status depending on item-level or bib-level itemtype -- that flag is designed to be set only to override the notforloan setting for the item's (or bib's, depending on the syspref) assigned itemtype (it doesn't need to override to 'for loan', only to 'not for loan'). added $dbh->do("truncate zebraqueue"); when operation is 'delete' * I updated some notes in catalogue/updateitem.pl as to why ModItem can't be used -- we don't have _a_ place where we can change the item and marc :/ I've tested the following: bulkmarcimport.pl..........................MARC/items OK Staged Records Import......................NOT OK updateitem.pl (via moredetail.pl)..........MARC/items OK circulation.pl.............................NOT OK returns.pl.................................NOT OK addbiblio.pl...............................NOT OK additem.pl.................................NOT OK Basically, there isn't a single place to apply this patch that will update both item data and MARC data in one place ... a future patch needs to address this issue. Signed-off-by: Galen Charlton <galen.charlton@liblime.com> Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-01-03 16:23:04 -06:00
Chris Cormack	c7215e7a93	Escaping the $title in the regexes with \Q and \E to handle nested quantifiers Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-01-03 01:20:40 -06:00
Paul POULAIN	319a32b16e	rebuild_zebra : directories updated the unimarc stuff has been moved to marc_defs directory and the lang specific is in lang_defs Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-01-03 00:55:12 -06:00
Joshua Ferraro	554bbe1bda	s/Waited/Expected/ for serials statuses reformatting rebuild_nozebra.pl indexes Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-01-01 12:59:28 -06:00
Joshua Ferraro	dd3f557f53	fixing nomenclature on files in misc/, adding a few new utilities Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-12-30 12:13:34 -06:00
Joshua Ferraro	c6ddddad98	adding a new option, -w, which disables shadow indexing for the current batch (faster indexing of large sets where ACID isn't critical) Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-12-30 12:13:27 -06:00
Galen Charlton	3508933c66	bulkmarcimport: enable MARC-8 to UTF-8 conversion Enabled automatic conversion of MARC-8 records to UTF-8. Record is converted if its Leader/09 contains a blank and the -s (skip) option hasn't been supplied on the command-line. Any record that cannot be converted to UTF-8 is skipped. Also now use Unicode Normalization Form C (NFC) for records converted from MARC-8. Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-12-25 09:08:38 -06:00
Galen Charlton	d426a91d0e	removed extraneous comments Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-12-25 09:08:35 -06:00
Galen Charlton	cb6cf680bc	improved error detection in AddBiblioAndItems Introduced new C4::Biblio function CheckItemPreSave, which checks for duplicate barcodes and invalid branch codes. Not yet sure whether this function needs to be exported or whether it will just be used internally to C4::Biblio. Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-12-25 09:08:34 -06:00
Galen Charlton	6b49df4c3f	removed superfluous comments Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-12-25 09:08:31 -06:00
Galen Charlton	7d47666f7e	bulk MARC record import - speed improved Changes to improve speed of MARC bib and item imports: [1] Turn off autocommit and commit database transactions in larger batches. [2] Introduce a new C4::Biblio function (AddBiblioAndItems) to combine AddBiblio and AddItems -- this is faster because we are not parsing the MARC XML of the biblio every time we add an item. [3] Introduce FasterTransformMarcToKoha, which is much faster than TransformMarcToKoha. The new version, which will replace the old one once it has been fully tested, scans through each field in the MARC record just once, instead of potentially dozens of times. [4] Remove code in bulkmarcexport that moved the item tags to separate MARC::Record objects. Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-12-25 09:08:28 -06:00
Galen Charlton	4609608ccc	allow use of older version of File::Temp Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-12-22 22:58:12 -06:00
Galen Charlton	93beb943c0	bug 1661: rebuild_zebra.pl changes [1] Use File::Temp to create and manage export directory if -d is not specified. [2] Added usage message. [3] Code that attempts to fix up Zebra configuration files changed so that it is invoked only if --munge-config option is supplied; this code will ultimately either be removed or moved to a separate script -- the sorts of errors that it tries to fix should no longer be appearing in a standard install. [4] Fixed Win32 portability problem when removing temporary directory. Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-12-20 19:19:43 -06:00
Henri-Damien LAURENT	bdade9bc9d	Adding rebuild field 100$a for Unimarc Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-12-20 18:40:43 -06:00
Henri-Damien LAURENT	c9fb20928b	Generating index for authorities on AUTHtypecode from table auth_header Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-12-20 18:30:50 -06:00
Galen Charlton	b8a58c4934	installer: command-line scripts improve finding C4 modules Command-line scripts now use a new SCRIPT_DIR/kohalib.pl to put installed location of Koha's Perl modules into @INC.	2007-12-17 09:13:54 -06:00
Mason James	d9a9e06556	updated MARC21 indexes, with authorities too. v2 Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-12-14 07:43:44 -06:00
Joe Atzberger	377db43117	C4 and misc: permissions fixes Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-12-13 19:00:34 -06:00
Galen Charlton	ad4e02f91d	warn on attempts to add duplicate item barcodes during batch import Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-12-02 15:06:24 -06:00
Paul POULAIN	262a6e2a9a	Updating rebuild_zebra.pl : now uses etc config files There are only 2 UNIMARC specific files (.abs and .chr), they have been moved to etc/zebradb The rebuild_zebra.pl takes all config file from this location now. the misc/zebra/ can be removed (and will be soon) Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-11-25 17:07:46 -06:00
Joshua Ferraro	cee40a741d	adding $DEBUG warnings to nozebra Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-11-24 09:07:44 -06:00
Paul POULAIN	f38b7598fc	still handling better dirty MARC records this time it's when a biblio don't have biblionumber, has a 100$a field, and it's invalid. 1 biblio in my 300 000 DB (and it was biblio 294 359, of course !) Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-11-20 16:20:50 -06:00
Mason James	78abbe94d3	little SQL typo fix, now builds 'NoZebraIndexes' index mapping correctly. Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-11-20 16:19:05 -06:00
Paul POULAIN	f1bca9ba50	missing biblionumber AND missing unimarc 100 was not properly handled now, adding both on the fly when needed. (had 2 biblios like that in a 290 000 DB, but was enough to have M::F::X complaining & dying !) Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-11-17 11:25:07 -06:00
Paul POULAIN	ef1ac56857	handling wrong MARC record better Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-11-12 17:13:00 -06:00
Mason James	c846ed00db	utf8 handling fixes 'Wide character in print at' encoding errors. Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-11-12 17:10:17 -06:00
Mason James	a51118833c	wrapping AddBiblio(), and AddItem() in evals{} to protect import from failure due to bad records. Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-11-11 18:44:13 -06:00
Mason James	f6b17c1de9	wrapping write to *.iso file in eval{}, to handle failure, caused by bad record. Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-11-11 18:44:12 -06:00
Paul POULAIN	9149a711fb	bugfixes to config files for zebra 2.0.18 those 2 lines are invalid Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-11-08 17:50:00 -06:00
Paul POULAIN	b7eb9e1b5c	rebuild_zebra now handle correctly improper authorities records (missing 100 field are automatically added) Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-11-07 08:18:24 -06:00
Paul POULAIN	bb5cea8e56	deal with wrong authorities when exporting for zebra (authorities that don't have a 001 field containing authid) also comment some code when exporting biblios (NOT tested, hdl,pls confirm this commit) Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-11-07 08:18:19 -06:00
Paul POULAIN	89b9e8f8c1	skip empty records (new GetMarcRecord behaviour that returns empty string and not empty MARC::Record) Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-10-31 19:41:49 -05:00
Paul POULAIN	1cd11f4d54	fixes in NoZebra search & indexing - the quotemeta was wrong (and introduced some bugs in diacritics) - fixing some bugs that appear only sometimes : the union was done including weight, which is wrong & resulted in missing some results (when various weighting) Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-10-31 05:53:36 -05:00
Paul POULAIN	fa26bcc037	rebuild_unimarc_100 : better handling of unusual cases If 100$a repeated, the scripts failed to handle that correctly Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-10-24 17:08:56 -05:00
Paul POULAIN	cd8a565a6a	temp Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-10-24 17:08:40 -05:00
Paul POULAIN	837e5c5e94	less verbose Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-10-24 17:06:36 -05:00
Joshua Ferraro	9d29ce5d58	improvements to zebra configuration files Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-10-21 19:14:12 -05:00
Paul POULAIN	1ac38782a1	#1474 : Bulkmarcimport croaks when Log is ON set to 0 and restore at the end of the import Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-10-11 14:53:59 -05:00
Paul POULAIN	057d654a5b	skipping wrong XMLs when rebuilding nozebra indexes Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-10-09 19:11:47 -05:00
Paul POULAIN	49ef1df969	Adding a new option to rebuildzebra : noxml This option uses the iso2709 version of the MARC record instead of the XML one (biblioitems.marc vs biblioitems.marcxml) No change if the parameter is not set. Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-10-09 19:07:36 -05:00
Joshua Ferraro	827d27111f	adding barcode index Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2007-10-06 21:46:02 -05:00
Paul POULAIN	375d2f1158	(minor) updating doc & removing warn Signed-off-by: Chris Cormack <crc@liblime.com>	2007-10-03 14:57:12 -05:00
Chris Catalfo	502487e2ba	Added basic MARC21 index definitions. Signed-off-by: Chris Cormack <crc@liblime.com>	2007-10-02 15:38:32 -05:00
Paul POULAIN	6f7efca7e1	BUGFIX for browser and nozebra tables - adding browser and nozebra table definition to kohastructure & updatedatabase - bumping to 3.00.00.005 Signed-off-by: Chris Cormack <crc@liblime.com>	2007-10-02 04:35:49 -05:00
Joshua Ferraro	ae34e8f45a	changing the name of the zebra password file to passwd Signed-off-by: Chris Cormack <crc@liblime.com>	2007-10-01 23:14:47 -05:00
Joshua Ferraro	b87d4924b9	commenting out set_service_options, but also removes commit op Signed-off-by: Chris Cormack <crc@liblime.com>	2007-10-01 17:40:31 -05:00
Ryan Higgins	c44efe7b84	fix bad call to GetMarcFromKohaField in bulkmarcimport, and add -fk param, allowing disabling of fk constraints during import. Signed-off-by: Chris Cormack <crc@liblime.com>	2007-09-30 21:16:50 -05:00
Paul POULAIN	0d7a4aafd0	BUGFIX : NoZebra indexing was wrong for accented words Signed-off-by: Chris Cormack <crc@liblime.com>	2007-09-26 05:28:37 -05:00
Paul POULAIN	623ac80330	BUGFIXES : 3 (marc_biblio, check biblionumber, ModMarcBiblio API) - use biblio instead of marc_biblio, - better check that biblionumber is correctly stored - fix a buggy API call when ModMarcBiblio Signed-off-by: Chris Cormack <crc@liblime.com>	2007-09-13 17:18:50 -05:00
Paul POULAIN	ec7bd0b2ff	(unimarc specific) BUGFIX : if 100$a exists but is not 35 chars long, MARC::File::XML may fail. So, add blanks if needed... Signed-off-by: Chris Cormack <crc@liblime.com>	2007-09-13 17:17:56 -05:00
tipaul	1399945a75	eval() on getAuthority & getBiblio to avoid a script failure	2007-08-01 09:20:03 +00:00
toins	5e7b171686	adding an eval so the script doesn't die if an error occurs	2007-07-19 09:48:22 +00:00
tipaul	23427c51b9	some fixes (and only fixes)	2007-06-15 13:44:44 +00:00
toins	6dfb0dca36	next if there is an error getting the biblio.	2007-06-11 15:22:59 +00:00
toins	4728830e34	it's faster to 'truncate' instead of using 'delete from'...	2007-06-08 09:41:14 +00:00
tipaul	5dd3f0229a	bugfixes (various), handling utf-8 without guessencoding (as suggested by joshua), fixing some zebra config files -for french but should be interesting for other languages-	2007-06-06 13:08:35 +00:00
btoumi	68bcf35387	delete space at beginning of the script to allow script launch	2007-05-25 10:00:54 +00:00
tipaul	0569dccd5f	some changes to default zebra config for better searches	2007-05-25 09:34:30 +00:00
tipaul	651b075197	small script to check XML parser. Remember that PurePerl Parser is buggy and can't handle utf8 correctly	2007-05-25 09:33:58 +00:00
tipaul	5ff7fcffa4	Bugfixes & improvements (various and minor) : - updating templates to have tmpl_process3.pl running without any errors - adding a drupal-like css for prog templates (with 3 small images) - fixing some bugs in circulation & other scripts - updating french translation - fixing some typos in templates	2007-05-22 09:13:54 +00:00
tipaul	ca201e36af	Koha NoZebra : - support for authorities - some bugfixes in ordering and "CCL" parsing - support for authorities <=> biblios walking Seems I can do what I want now, so I consider it's done, except for bugfixes that will be needed, I'm sure!	2007-05-10 14:45:15 +00:00
tipaul	e1d907c688	various bugfixes on parameters modules + adding default NoZebraIndexes systempreference if it's empty	2007-05-04 16:24:08 +00:00
tipaul	3e85c9e97f	NoZebra SQL index management : * adding 3 subs in Biblio.pm - GetNoZebraIndexes, that gets the index structure from a new systempreference (added with this commit) - _DelBiblioNoZebra, that retrieves all index entries for a biblio and removes the biblio reference in a variable - _AddBiblioNoZebra, that adds index entries for a biblio. Note that the 2 _Add and _Del subs work only on a hash variable, to speed things up in case of a modif (ie : delete+add). The effective SQL update is done in the ModZebra sub (that existed before, and dealt with the zebra index). I think the code has to be more deeply tested, but it works at least partially.	2007-05-02 16:44:31 +00:00
tipaul	4213b6ec98	improving NOzebra search : - changing nozebra table to have biblionumber,title-ranking; (; is the entry separator). Now, if a value appears several times in an index, it is stored only once, with a higher ranking (the ranking is the number of times the word appeared for this index) - improving search to have ranking value (default order). The ranking is the sum of the rankings of all terms. The list is ordered by ranking+title, from highest to lowest	2007-05-02 11:57:11 +00:00
hdl	097fef712a	Removing $dbh from GetMarcFromKohaField (dbh is not used in this function.)	2007-04-27 14:00:48 +00:00
tipaul	b53be9cdaf	Koha 3.0 nozebra 1st commit : the script misc/migration_tools/rebuild_nozebra.pl builds the nozebra table, and, if you set NoZebra to Yes, queries will be done through zebra. TODO : - add nozebra table management on biblio editing - the index table content is hardcoded. I still have to add some specific systempref to let the library update it - manage pagination (next/previous) - manage facets WHAT works : - NZgetRecords : has exactly the same API & returns as zebra getQuery, except that some parameters are unused - search & sort work quite well - CQL parser is better than what I thought I could do : title="harry and sally" and publicationyear>2000 not itemtype=LIVR should work fine	2007-04-25 16:26:42 +00:00
tipaul	6b201757c1	some bugfixes for this script that automatically build zebra DB from default config files	2007-04-17 08:50:33 +00:00
tipaul	eba2552086	Code cleaning of Biblio.pm (continued) All subs have been cleaned : - removed useless - merged some - reordering Biblio.pm completely - using only naming conventions Seems to have broken nothing, but it still has to be heavily tested. Note that Biblio.pm is now much more efficient than previously & probably more reliable as well.	2007-03-29 16:45:53 +00:00
tipaul	a481fad4b7	Code cleaning : == Biblio.pm cleaning (useless) == * some sub declaration dropped * removed modbiblio sub * removed moditem sub * removed newitems. It was used only in finishrecieve. Replaced by a Koha2Marc+AddItem, that is better. * removed MARCkoha2marcItem * removed MARCdelsubfield declaration * removed MARCkoha2marcBiblio == Biblio.pm cleaning (naming conventions) == * MARCgettagslib renamed to GetMarcStructure * MARCgetitems renamed to GetMarcItem * MARCfind_frameworkcode renamed to GetFrameworkCode * MARCmarc2koha renamed to TransformMarcToKoha * MARChtml2marc renamed to TransformHtmlToMarc * MARChtml2xml renamed to TranformeHtmlToXml * zebraop renamed to ModZebra == MARC=OFF == * removing MARC=OFF related scripts (in cataloguing directory) * removed checkitems (function related to MARC=off feature, that is completely broken in head. If someone wants to reintroduce it, hard work coming...) * removed getitemsbybiblioitem (used only by MARC=OFF scripts, that is removed as well)	2007-03-29 13:30:31 +00:00
tipaul	f8e9fb6445	rel_3_0 moved to HEAD (introducing new files)	2007-03-09 15:34:17 +00:00
tipaul	a3999812e6	rel_3_0 moved to HEAD	2007-03-09 14:52:58 +00:00
thd	ad657e71eb	For MARC 21, instead of deleting the whole subfield when a character does not translate properly from MARC8 into UTF-8, only the problem characters are deleted.	2006-09-01 17:11:53 +00:00
toins	eac83ccd45	Head & rel_2_2 merged	2006-07-04 15:02:42 +00:00
rangi	10b2315eb3	Fixing the problem that all items were getting biblioitem=1 set	2006-04-01 22:10:50 +00:00
kados	44b4d37b54	removed Zconns, no need for them anymore with new Context.pm setup	2006-02-27 01:06:30 +00:00
kados	fafe0896d6	minor bugfix with 'commit' option	2006-02-25 23:40:59 +00:00
kados	77abbe2caf	A bulkmarcimport.pl that is based on the new Biblio.pm Zebra routines. It now responds to: -n : the number of records to import. -commit : the number of records to wait before performing a 'commit' operation ALSO: IMPORTANT: I took out the char_encoding as this should be handled by MARC::File::XML now, unless I'm mistaken.	2006-02-25 21:53:48 +00:00
tipaul	f74823bf1b	OK, this time it seems to work. The last blocking problem was... a space in recordId: (bib1,Identifier-standard) just after the comma. Adam agreed it was a bug, and it should be solved soon. But now we are aware, we can avoid putting the space ! In this commit you have all that is needed to set up a working zebra DB in Unimarc : * collection.abs is UNIMARC specific and must be rewritten for MARC21, in marc21 directory * pdf.properties is to be copied unmodified in the marc21 directory (can also be put somewhere else) * rebuild_zebra.pl is SLOW, but 1 step reindexing tool, using ZOOM * rebuild_zebra_idx is FAST, but 2 step reindexing tool, and does not use zebra. run it, it will create all biblios XML files in /zebra/biblios directory, then zebraidx update biblios in your zebra directory * zebra.cfg is the zebra config file ;-) * test_cql2rpn.pl is a script that will query the database and show the results. Works for me, just change the query at the beginning to get answers you expect. What has to be done : * benchmarking : it seems the zebraidx update is faster than lightning (400biblios/sec : 10 000biblios in 25seconds), while ZOOM indexing is slow (something like 25biblios/second) More benchmarking could be done. * completing collection.abs for UNIMARC. I'll take care of it. * modifying Biblio.pm to use ZOOM instead of the "zebraidx through exec" currently in use. I'll take care of it also. * modify the search API & tools & screens. I'll leave the ball to someone else (chris ?) for this. I agree SearchMarc.pm can be dropped and replaced by something else (maybe a new-and-clean Search.pm package)	2006-02-09 10:59:34 +00:00
tipaul	369ee65d94	new version of rebuild_zebra. Should work with Perl-ZOOM, but DOES NOT WORK for me. I get : ZOOM error 10002 "Encoding failed" from diag-set 'ZOOM' help expected from indexdata...	2006-01-10 17:03:32 +00:00
tipaul	d5938493d7	synch'ing head and rel_2_2 (from 2.2.5, including npl templates) Seems not to break too many things, but i'm probably wrong here. at least, new features/bugfixes from 2.2.5 are here (tested on some features on my head local copy) - removing useless directories (koha-html and koha-plucene)	2006-01-06 16:39:37 +00:00
tipaul	dba37f38e7	This script can be used to rebuild the zebra DB. It stores all koha MARC records in iso2709, in the biblios directory. After that, you just have to "zebraidx update biblios" I tried on a 9900 DB, here are the results : [paul@bureau migration_tools]$ ./rebuild_zebra.pl -c 9900 9903 MARC record done in 37.9104120731354 seconds [paul@bureau zebra]$ zebraidx update biblios <snip> 18:31:24-11/08 zebraidx(20348) [log] Iterations . . . 144575 18:31:24-11/08 zebraidx(20348) [log] Distinct words . 39891 18:31:24-11/08 zebraidx(20348) [log] Updates. . . . . 46 18:31:24-11/08 zebraidx(20348) [log] Deletions. . . . 2 18:31:24-11/08 zebraidx(20348) [log] Insertions . . . 39843 18:31:24-11/08 zebraidx(20348) [log] zebra_register_close p=0x8104cf8 18:31:25-11/08 zebraidx(20348) [log] Records: 9887 i/u/d 9881/6/0 18:31:25-11/08 zebraidx(20348) [log] user/system: 531/145 18:31:25-11/08 zebraidx(20348) [log] zebra_stop 18:31:25-11/08 zebraidx(20348) [log] zebraidx times: 11.33 5.31 1.45	2005-08-11 16:35:54 +00:00
tipaul	c52e5b61dd	synch'ing 2.2 and head	2005-08-04 14:10:52 +00:00
tipaul	64cd740d2b	synch'ing 2.2 and head	2005-05-04 08:58:30 +00:00
tipaul	93ff09d081	merging 2.2 branch with head. Sorry for not making it before, many many commits done here	2005-03-01 13:40:35 +00:00
tipaul	51e204fa23	moving bulkmarcimport script to migration_tools directory	2005-01-03 15:25:50 +00:00
tipaul	cd6f87a689	Auto-build LANG authorized values	2005-01-03 12:59:49 +00:00

343 commits