Commit graph

226 commits

Author SHA1 Message Date
0a53d5e6b6 Bug 12651: DOM indexing is the default
On the 23 July development meeting it was decided to formally deprecate
GRS-1 indexing mode for Zebra. This patch makes code fallback to DOM
on the remaining places. No behaviour change should be noticed, as DOM
has been the default for a while.

Regards

Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Passes tests and QA script.
Also checked running Makefile.PL

Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-27 12:35:44 -03:00
Jonathan Druart
cf2eb49448 Bug 12538: Remove Solr without breaking anything else
Since nobody is currently working on the zebra layer introduced by bug
8233, Solr won't never work.
Some code has been introduced in 3.10 to prove several search engines
can cohabit into Koha but no help/fund has been found to go ahead.
It is useless to keep this code and to maintain an ambiguous situation.

I think the indexes configuration page could be restore later if someone
else introduces a new search engine into Koha.

Test plan:
Look at the code introduced by bug 8233 and verify all is removed.

Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-10-11 16:59:04 -03:00
a6c278f8e0 Bug 12720: (QA followup) use API instead of plain SQL
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-08-24 13:19:01 -03:00
11446fcb0e Bug 12720 - Turn off Authority logging when running "bulkmarcimport.pl"
This patch turns off the AuthoritiesLogging syspref when running the
bulkmarcimport.pl script.

It also temporarily disables the syspref caching which will have
been making the CataloguingLogging handling ineffectual. (That is,
updating the CataloguingLogging syspref in the script wouldn't
have an effect as the original cached value would be used anyway.)

_TEST PLAN_

0) Turn on "AuthoritiesLogging" syspref
1) Load an authority record using bulkmarcimport.pl
2) Note a new Authorities entry in action_logs

3) Apply the patch

4) Repeat Step 1
5) Note that no new entry is made in action_logs

(Bonus points: Do the same thing with CataloguingLogging and a
bibliographic record.)

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Tested with biblio and auth imports.
Work as described, no koha-qa errors.

Note: If you begin to load a big file and get impatient and hit ^C,
seems that current syspref value is lost...

Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Passes tests and QA script.
Patch copies what was already done for the CatalougingLog, no problems found.

Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-08-24 12:49:49 -03:00
Jonathan Druart
b2ba10b40b Bug 11278: (follow-up) Return an exit value (1) if the module is not found.
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-05-05 00:59:33 +00:00
9f45310d76 Bug 11278: Followup for customize command line parameter
The initial patch for this bug did not include a specific command line
option for customization. If a module LocalChanges.pm existed, it would
be used without asking.
This patch adds a command line option enabling the customization option
and offering the extra possibility of using another module name. If no file
name is passed, we default to LocalChanges.
Without the -custom option, behavior is as it was.
Also some POD lines are added to document the feature.

Test plan:
[1] Make a LocalChanges.pm in migration_tools. Verify that it is not used,
    if you do not enable the -cust parameter.
[2] Run the script again with -cust. Verify that it is called now.
[3] Copy LocalChanges.pm to Whatever.pm. Make some change. Run with
    -cust Whatever and verify that the new module is used.
[4] Copy Whatever.pm to another dir, make some change. Run with -cust and the
    full name. Verify that the latest change was used.
[5] Run without any option. Check the pod documentation.

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-05-05 00:59:32 +00:00
8480570197 Bug 11278: Adjusting bulkmarcimport.pl for customization routine and verbose printing
This patch makes two adjustments:
[1] For the verbose option, verbose level 2 now means print the
formatted version of each record.
[2] If a module LocalChanges.pm is found in misc/migration_tools, the
routine "customize" in this module is called for each marc record.
This allows you to make local changes to these marc records before
importing them.

Test plan:
[1] Test the verbose option: a single -v for medium verbosity and two
-v to dump a human-readable version of the record to standard output.
(Do not yet copy LocalChanges.pm in the folder.)
You may used the attached example file on Bugzilla:
perl misc/migration_tools/bulkmarcimport.pl -file zztest01.xml -v -v -b -m XML -t | more
Note the option t for test; no records will be imported.
[2] Copy LocalChanges.pm in the migration_tools folder. You may use the
example provided on Bugzilla (in a patch). If you use the example module,
check the contents of 001, 005 and 590 fields. (The -v -v option allows
you to easily check that.)

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-05-05 00:58:54 +00:00
Galen Charlton
f4633cc5e5 Bug 11441: (follow-up) improve utility help text
This patch expands and reformats the help text displayed
when running remove_unused_authorities.pl -h.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-04-11 15:25:41 +00:00
Juan Romay Sieira
0f6652d62b Bug 11441: enhance remove_unused_authorities.pl ability to select records
remove_unused_authorities.pl previously required that --aut be supplied
to specify one or more authority types to check for unlinked authority
records.  If --aut was omitted, it would default to search for
records of authority type NC, which is not present in many (or any?)
Koha databases.

Now, if --aut is omitted, unlinked authority records of any type
are removed.

To test it:
	Parse only PERSO_NAME authorities:
		misc/migration_tools/remove_unused_authorities.pl -aut PERSO_NAME

	Parse all authorities:
		misc/migration_tools/remove_unused_authorities.pl

Signed-off-by: Nicolas Legrand <nicolas.legrand@bulac.fr>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-04-11 15:17:28 +00:00
Matthias Meusburger
5935654bac Bug 11850: Add -append option to bulkmarcimport.pl to append to logfile
Signed-off-by: Magnus Enger <digitalutvikling@gmail.com>
Keeps current behaviour as default.
The -append option is described in the POD and works as expected.

Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Works as described.
Adding a date/time to the output might
be good, to make it easier to find the entry you were looking for.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-04-07 15:32:01 +00:00
Galen Charlton
03338b70e4 Bug 10955: (follow-up) improve usage information
This patch improves rebuild_zebra.pl's usage help
by explaining when --skip-deletes should be considered
and noting that it should be used in conjunction with
a cronjob to process deletions after hours.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-03-10 18:46:28 +00:00
b0870311e1 Bug 10955 - Add ability to skip deletions in zebraqueue
It seems that record deletions can cause extreme slowdowns for Koha
installations with extremely large numbers of records. It would be
helpful to be able to skip record deletions when processing the
zebraqueue with rebuild_zebra.pl so the deletions can be processed with
a lower frequency.

Test Plan:
1) Disable any zebra indexing cronjobs you may have
2) Delete a record
3) Note the operation recordDelete in the zebraqueue table having done = 0
4) Run misc/migration_tools/rebuild_zebra.pl -b -z --skip-deletes
5) Note the delete still has done = 0
6) Run misc/migration_tools/rebuild_zebra.pl -b -z
7) Note the delete now has done = 1

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Passes all tests and QA script.
Also tested for authorities, no problems found.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>

RM note: this is at best a work-around, and I will emphasize that
--skip-deletes should be used only when absolutely necessary.

I hope that --skip-deletes can go away at some point soon, but
that may depend on changes to Zebra.
2014-03-10 18:44:10 +00:00
Galen Charlton
160c44d4e9 Bug 11078: (follow-up) tidy code
- fix a couple typos in comments
- make replace a "$i" with a more descriptive variable name
- style some of the new code

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-02-28 22:24:28 +00:00
07de37f0e5 Bug 11078: QA Follow-up for missing file permissions on lockfile
The original patch creates a lockfile in the ZEBRA_LOCKDIR.
It can fall back to /var/lock or even /tmp.
If the create fails, it dies. This can be considered as very
exceptional.

This followup adjusts the fallback location in /var/lock or /tmp
slightly.  It appends the database name to the folder in order to
prevent interfering between multiple Koha instances. Creation of the
lockfile has been moved to a subroutine extending directory and file
creation testing.

In the very unlikely case that we cannot create the lockfile (after
three separate tries), this follow-up allows you to continue instead
of die.  This is just as we did before we had file locking here. Every
time skipping a reindex could cause more harm than continuing and
having the race condition once in a while.

Test plan:
Test adding and removing lockdir from your koha-conf.xml. Check fallback.
Note that fallback in /var/lock or /tmp must contain database name.
Remove the lockdir config line and remove permissions from fallback. In
this case the reindex should continue but with a warning.

Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Tested with daemon and one-off invocation simultaneously.
Tested new wait parameter.
Tried all variations of lock directory (changing permissions etc.)

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-02-28 22:22:47 +00:00
Doug Kingston
88e7faf860 Bug 11078: Add locking to rebuild_zebra
This patch adds locking to rebuild_zebra.pl to ensure that simultaneous
changes are prevented (as one is likely to overwrite the other).
Incremental updates in daemon mode will skipped if the lock is busy
and they will be picked up on the next pass.  Non-daemon mode
invocations will also exit immediately if they cannot get the lock
unless the new flag -wait-for-lock is specified, in which case they
will wait until the get the lock and then proceed.

Supporting changes made to Makefile.PL and templates for the new
locking directory (paralleling the other zebra lock directories).
We stash the zebra_lockdir in koha-conf.xml so rebuild_zebra.pl
can find it.

To address earlier QA concerns we:
1. added code to check if flock is available and ignore locking if
it's missing (from M. de Rooy)

2. changed default for adhoc invocations to abort if they cannot
obtain the lock.  Added option -wait-for-lock if the user prefers
to wait until the lock is free, and then continue processing.

3. added missing entry to t/db_dependent/zebra_config.pl

4. added a fallback locking directory of /tmp

Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Doug merged the original patch with the QA changes.
Just for the record, noting here that the original patch was tested
extensively too by Martin Renvoize.
I have added a followup for some exceptional cases.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-02-28 22:21:41 +00:00
daf2ebc4f5 Bug 11096: support the retrieval of large MARCXML records
This patch makes Koha <-> Zebra use MARCXML for the serialization when
using DOM, and USMARC for GRS-1.

* The following functions are modified to set the Zebra record syntax
according to the current sysprefs and configuration:

- C4::Context->Zconn
- C4::Context-_new_Zconn

* A new function 'new_record_from_zebra' is introduced, which checks the
context we are in, and creates the MARC::Record object using the right
constructor.

The following packages get touched to make use of the new function:
- C4::Search
- C4::AuthoritiesMarc

and the same happens to the UI scripts that make use of them (both in
the OPAC and STAFF interfaces).

* Calls to the unsafe ZOOM::Record->render()[1] method are removed.

Due to this last change the code for building facets was rewritten. And
for performance on the facets creation I pushed higher version
dependencies for MARC::File::XML and MARC::Record (we rely on
MARC::Field->as_string).

* Calls to MARC::Record->new_from_xml and MARC::Record->new_from_usmarc
are wrapped with eval for catching problems [2].

* As of bug 3087, UNIMARC uses the 'unimarc' record syntax. this case is
  correctly handled.
* As of bug 7818 misc/migration_tools/rebuild_zebra.pl behaves like:

- bib_index_mode (defaults to 'grs1' if not specified)
- auth_index_mode (defaults to 'dom')

here we do exactly the same.

To test:
 - prove t/db_dependent/Search.t should pass.
 - Searching should remain functional.
 - Indexing and searching for a big record should work (that's what the
   unit tests do).
 - Test an index scan search (on the staff interface):
    Search > More options > Check "Scan indexes".
 - Enable 'itemBarcodeFallbackSearch' and try to circulate any word, it
   shouldn't break.
 - Searching for a biblio in a new subscription shouldn't break.
 - Running bulkmarcimport.pl shouldn't break.
 - And so on... for the rest of the .pl files.

[1] http://search.cpan.org/~mirk/Net-Z3950-ZOOM/lib/ZOOM.pod#render()
[2] a record that cannot be parsed by MARC::Record is simply skipped (bug 10684)

Sponsored-by: Universidad Nacional de Cordoba
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2014-02-28 19:50:09 +00:00
Matthias Meusburger
28d97e3228 Bug 11412: fix potential bulkmarcimport crash when searching for duplicates in authorities
bulkmarcimport.pl can crash when searching for duplicates if the 005
field from the incoming or local record is not defined. This patch
fixes it.

Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>

Test plan
1/ Create a record with no 005 field
2/ Try to import it checking for duplicates, notice it crashes
3/ Try with a record with a 005 field, but the one in Koha missing
one, still crashes
4/ Apply patch
5/ No more crash

Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Passes all tests and QA script.
Patch fixes the problem described for importing authorities
with the bulkmarcimport.pl when trying to match with existing
records.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-12-26 15:52:02 +00:00
Galen Charlton
b26870e53d Bug 11252: remove deprecated -munge-config switch from rebuild_zebra.pl
The -munge-config switch has been deprecated for years, and
trying to use it would either not work at all or, if it did "work",
almost certainly damage one's Zebra configuration for Koha.

This patch removes this switch.

To test:

[1] Run rebuild_zebra.pl and verify that no mention is made
    of -munge-config.
[2] Run rebuild_zebra.pl to index records in one's test database
    and verify that there are no regressions.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>
Removing a really dangerous option

Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Passes all tests and QA script.
Ran rebuild_zebra.pl with various options and confirmed
that data was reindexed successfully.
No regressions found.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-12-26 15:24:41 +00:00
Gaetan Boisson
6657860010 Bug 11417: make sure remove_unused_authorities.pl accepts --test
This patches adds support for the --test option, as well as a
short message telling the user the script is running in test mode.

Test plan :
- Launch the script with -h to see the help
- Launch the script with --test and --aut with an authtypecode
  that is used in your instance
- Make sure it does the same thing as launching it with -t
- Launch the script for real and make sure it still works as
  expected, deleting unused authorities.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-12-19 15:09:18 +00:00
Galen Charlton
b25de3e7cf Bug 6435: (follow-up) make -daemon really imply -a and -b
This patch follows up on the previous patch by moving the
check for whether authority and/or biblio indexing have been
specified so that -daemon has a chance to set those modes.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-11-24 18:20:56 +00:00
Doug Kingston
00240d6970 Bug 6435: (follow-up) rebuild_zebra -daemon option now smarter
Based on feedback, make daemon mode imply -z -a -b and abort
on startup if flags incompatible with an incremental update daemon
are used.  Update documentation to match.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-11-24 18:15:23 +00:00
Doug Kingston
1b0992e8d5 Bug 6435: Add daemon mode to rebuild_zebra.pl
This change adds code to check the zebraqueue table with a cheap SQL query
and a daemon loop that checks for new entries and processes them incrementally
before sleeping for a controllable number of seconds.  The default is 5 seconds
which provides a near realtime search index update.  This is desirable particularly
for libraries that are doing active catalogue updating.  The query is adjusted
based on whether -a, -b, or -a -b are specified.

Help text updated.  Tested against a live 3.12 system.

Note that this fix will benefit from the fix to lack of locking (bug 11078)

Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-11-24 18:12:21 +00:00
Janusz Kaczmarek
9d02840967 Bug 10326: bulkmarcimport.pl doesn't restore value of CataloguingLog syspref
To test:

0) Don't apply the patch yet.
1) Have the CataloguingLog system preference set to 'Log'.
2) Import a file of bibliographic records with bulkmarcimport.pl.
3) Check the state of CataloguingLog system preference -- it will be
   set to 'Don't log'.
4) Apply the patch.
5) Repeat steps 1-3.  The CataloguingLog system preference
   will be 'Log'.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-05-31 07:30:25 -07:00
2eefd1f3a5 Bug 8745: General whitespace and tab tidy
http://bugs.koha-community.org/show_bug.cgi?id=8745
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
1) Runs not with root.
2) Runs with root and -run-as-root.
3) Runs using the normal koha user.

Note: Maybe the message should be clear about why
running as root is bad and which user you should
be running the script with?
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-04-21 09:41:34 -04:00
Barry Cannon
ef86a77801 Bug 8745 - Disallow rebuild_zebra.pl from executing, when run by root user.
Added a check to warn users of execution as root user.
Added a 'runas-root' switch to allow users to force execution as root user.

Signed-off-by: Mason James <mtj@kohaaloha.com>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-04-21 09:41:34 -04:00
Julian Maurice
357cf7fdd8 Bug 8746 [Follow-up] Replace == by eq in string comparison
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
All tests and QA script pass.
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-04-21 09:23:23 -04:00
Julian Maurice
a4e8443691 Bug 8746: Fix indexation in DOM index mode
When in DOM index mode, files exported by `rebuild_zebra.pl -x` are
wrapped by '<collection></collection>' tag.
This is a problem because splitting files produces invalid files.
This is fixed by adding the missing <collection> tags in each generated
file.
Another problem was that the wrong zebra configuration file was used.
The script now uses C4::Context->zebraconfig($server)->{config} to know
which configuration file has to be used.

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-04-21 09:23:22 -04:00
Julian Maurice
eef4b3f23c Bug 8746: rebuild_zebra_sliced.sh now export/index records as MARCXML
This avoid indexing failures due to "bad offset" or "bad length" error
with ISO2709 format

+ minor improvements:
  -  --length parameter is optional. If not given, it will execute the
     right sql query to find the number of records to index
  -  new parameter --reset-index. If set, index is reset before indexing

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Comment: Work as described. No errors.

Test: Edit record to make it longer than 9999. Without patch rebuild_sliced
fails. With patches works.

Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-04-21 09:23:22 -04:00
f9c8f39c02 Bug 9609: Rebuilding zebra reports double number of exported records.
Test plan:
Clear the zebra queue (run rebuild). Update one biblio.
Rebuild zebra (again) with -z. Check zebra log: note 2 exported records.
Now apply patch, and repeat: You will see 1 exported record.

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Works as described.
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-04-02 08:41:40 -04:00
Galen Charlton
151e22070a bug 9496: improve error checking in rebuild_zebra.pl
When using rebuild_zebra to index all records, skip over
bibliographic or authority records that don't come out
as valid XML.  Also, strip extraneous XML declarations when
using --nosanitize.

Test plans
----------
Note that both plans assume that DOM indexing is turned on.

Test plan #1
============

[1] Run rebuild_zebra.pl with the -x -nosanitize options.  Without
    the patch, zebraidx should terminate early and complain
    about invalid XML.
[2] With the patch, the rebuild_zebra.pl should work without
    error.

Test plan #2
============
[1] Intentionally make a MARCXML record invalid, e.g, by running
    the following SQL:

    UPDATE bilbioitems SET marcxml = CONCATENATE(marcxml, 'junk')
    WHERE biblionumber = 123;

[2] Run rebuild_zebra.pl -b -x -r
[3] Without the patch, only part of the database will be indexed.
[4] With the patch, rebuild_zebra.pl will not export the bad
    record and will give an error message saying so, but will
    successfully index the rest of the records.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Larry Baerveldt <larry@bywatersolutions.com>
Signed-off-by: Mason James <mtj@kohaaloha.com>

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-03-21 22:25:03 -04:00
Paul Poulain
be82a5f942 bug 5608 follow-up: exit immediately if UNIMARC
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-03-21 22:15:56 -04:00
Michael Hafen
b395ac0805 Add license statement to switch_series_info CLI script
License and copyright statement added.
Thanks to Bernardo Gonzalez Kriegel for reminding me about this.

http://bugs.koha-community.org/show_bug.cgi?id=5608
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>

Comment: add license information. No errors.

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-03-21 22:15:56 -04:00
Michael Hafen
8cb3533ae6 Bug 5608 - command-line tool to switch information in 440 and 490 tags
With the MARC21 standard moving from the 440 tag to the 490, this tool is
to help libraries make the move.  It switches any information in 440 tags to
490 tags, and any information in 490 tags to 440 tags.  That seemed like the
best way to go to me.  There is also an option to create 830 tags for any 44
information, like authorities, that can't be represented in the 490 tag.

To Test:
locate some biblios with 440 or 490 tags filled.
run bin/migration_tools/switch_marc21_series_info.pl -c
observe that the information in the biblios has switched 4xx tags.

http://bugs.koha-community.org/show_bug.cgi?id=5608
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>

Comment: Work as described. No errors.

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-03-21 22:15:56 -04:00
Michael Hafen
eea7b9d20d Bug 5608 - command-line tool to switch information in 440 and 490 tags
With the MARC21 standard moving from the 440 tag to the 490, this tool is
to help libraries make the move.  It switches any information in 440 tags to
490 tags, and any information in 490 tags to 440 tags.  That seemed like the
best way to go to me.

To Test:
locate some biblios with 440 or 490 tags filled.
run bin/migration_tools/switch_marc21_series_info.pl -c
observe that the information in the biblios has switched 4xx tags.

http://bugs.koha-community.org/show_bug.cgi?id=5608
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>

Comment: Works as described. No errors.

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-03-21 22:15:56 -04:00
Stéphane Delaune
cefa7c21e2 Bug 5635: bulkmarcimport new parameters & features
See the script's documentation for more details

New parameters are:
 - authtypes
 - filter
 - insert
 - update
 - all

Signed-off-by: Pascale Nalon <pascale.nalon@gmail.com>
This patch is live in Mines ParisTech since 2012-07-24.
Signing off

Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
 - Moved the sign-off from bugzilla to the commit message.
 - All tests and QA script pass.
 - Amended commit message to list new parameters.
 - Verified this patch works on a UNIMARC installation.
 - Verified normal import still works correct on a MARC21
   installation.
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-03-21 20:21:54 -04:00
4dcee58a4d Bug 7440 - Remove NoZebra vestiges
Removed NoZebra vestiges. This comprises several code blocks that depend on the NoZebra syspref and NZ related functions/methods.

C4::Biblio->
 GetNoZebraIndexes
 _DelBiblioNoZebra
 _AddBiblioNoZebra

C4::Search->
 NZgetRecords
 NZanalyse
 NZoperatorAND
 NZoperatorOR
 NZoperatorNOT
 NZorder

C4::Installer->
 set_indexing_engine

Sponsored-by: Universidad Nacional de Córdoba
Signed-off-by: Julian Maurice <julian.maurice@biblibre.com>

Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-03-19 21:17:04 -04:00
Jared Camins-Esakov
144c7f4e4e Bug 9239: Allow the use of QueryParser for all queries
With the inclusion of this patch, all searches will (try) to use
QueryParser for handling queries for both the bibliographic and authority
databases if UseQueryParser is enabled. If QueryParser is unavailable,
UseQueryParser is disabled, or the search uses CCL indexes, the old
search code will be used.

To test:
1) Apply patch.
2) Run the unit test with `prove t/QueryParser.t`
3) Enable the UseQueryParser syspref.
4) Try searches that should return results in the following places:
   * OPAC (simple search)
   * OPAC (advanced search)
   * OPAC (authorities)
   * Staff client (header search)
   * Staff client (advanced search)
   * Staff client (cataloging search)
   * Staff client (authorities)
   * Staff client (importing a batch using a match point)
   * Staff client (searching for an item for adding to a label)
   * Staff client (acquisitions)
   * Staff client (searching for a record to create a serial)
   * ANYWHERE ELSE I HAVE FORGOTTEN
5) Disable the UseQueryParser syspref. Repeat at least some of the
   searches you did above.
6) If all searches worked, sign off.

Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Elliott Davis <elliott@bywatersolions.com>
Searching still works as expected for variuos places.
QueryParser syspref seemed to be enabled by default

Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-03-16 21:32:32 -04:00
Robin Sheat
788e51adb1 Bug 9035 - delete bulkauthimport.pl
<dcook> Then bulkauthimport.pl?
<jcamins> bulkauthimport should not be used ever.
<eythian> it probably should be deleted
<jcamins> It should be.

Signed-off-by: David Cook <dcook@prosentient.com.au>

I've poked around in bulkmarcimport.pl and it certainly seems to have the functionality that Mason (and Jared and Robin) mention.

Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-03-13 08:50:09 -04:00
Vitor FERNANDES
33e95ea3b9 Bug 9144 - bulkmarcimport.pl - Problem identifying errors
Replace \r with \n for newline in output for bulkmarcimport.pl

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2012-12-17 11:53:01 -05:00
Jared Camins-Esakov
49cadcf7c1 Bug 9049: Don't use shadow with rebuild_zebra -r
Due to a limitation of Zebra, the register must be cleared *before*
doing shadow indexing if you want to reset the indexes. In light of
that, it does not make sense to do shadow indexing at all when
rebuild_zebra.pl is run with the -r switch. This patch makes -r (reset)
imply -n (no shadow).

To test:
1) Run `rebuild_zebra.pl -b -r -v -v -v`
2) Note that the script never runs the merge phase

Without the patch I see log lines refering to the shadow cache (enabling shadow spec=/home/koha/koha-dev/var/lib/zebradb/biblios/shadow:20G)
With the patch I don't see anything in the logs about shadow.  I do however see lines about merging.  I think it could just be a misunderstanding of the logs

Signed-off-by: wajasu <matted-34813@mypacks.net>
Signed-off-by: Elliott Davis <elliott@bywatersolutions.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2012-12-08 09:46:30 -05:00
Robin Sheat
5d0bdbce59 Bug 9012 - --framework option for bulkmarcimport
This allows the --framework option to be specified when running
bulkmarkimport. This option allows a framework code to be specified for
the records being imported.

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
All tests pass, perlcritic fails before and after.

Tested
- imported records with -framework FA, FA framework is used
- imported records without -framework, default framework is used
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2012-12-03 07:14:58 -05:00
Jared Camins-Esakov
deeeb068d9 Bug 9050: Use safer adelete when deleting records from Zebra index
Previously we used the "delete" command in zebraidx, which fails when
you try to delete a record that doesn't exist in the index. By changing
to the "adelete" command, we can reduce the likelihood of a failed
delete causing ghost records. A symptom of this problem is the warning
message occasionally encountered when indexing from the zebraqueue,
"[warn] cannot delete record above (seems new)."

To test:
1) Add a recordDelete action for a record that does not exist to
   zebraqueue in MySQL:
   INSERT INTO zebraqueue (biblio_auth_number, operation, server) \
       VALUES (999999999, 'recordDelete', 'biblioserver');
2) Run `rebuild_zebra.pl -b -z -v [-x]`.
3) Note that you do not get the message "[warn] cannot delete record
   above (seems new)".

Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Passed-QA-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2012-11-12 18:53:49 -05:00
Colin Campbell
722701d596 Bug 8727 Minor stylistic change to help text
indexing not indexation
some minor grammatical changes

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
2012-09-17 18:47:40 +02:00
Jared Camins-Esakov
bc05b5d163 Bug 7417: Include see from references in bibliographic searches
This patch adds the Koha::Indexer::RecordNormalizer and
Koha::Indexer::MARC::RecordNormalizer::EmbedSeeFromHeadings packages
to enable the inclusion of alternate forms of headings in bibliographic
searches. When the new syspref IncludeSeeFromInSearches is turned on
(default is off) rebuild_zebra.pl will insert see from headings from
authority records into bibliographic records when indexing, so that a
search on an obsolete term will turn up relevant records.

To test:
1) Enable IncludeSeeFromInSearches
2) Add a heading that has an alternate form to a record (for example,
   "Cooking" has the alternate form "Cookery," if you have authority
   records from LC)
3) Index the zebraqueue (or reindex if you haven't indexed your system
   yet)
4) Confirm that if you search for "Cookery" you get the record you
   just modified

Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Rebased on master 5 August 2012
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Rebased on master 11 September 2012

Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>

Also checked:
- Verified database update works correctly
- Checked system preference and its description
- Checked staff/opac detail pages with feature on/off
- Checked staff/opac search facets
- Downloaded and tested records in various formats
- Tried different searches for 'see from' entries of authorities
- Ran all unit tests

No problems found.
2012-09-13 14:19:28 +02:00
Jared Camins-Esakov
3616eee996 Bug 8384: Some Perl scripts do not compile
Fix syntax errors preventing the scripts misc/translator/text-extract2.pl
and misc/cronjobs/thirdparty/TalkingTech_itiva_inbound.pl from compiling.

Remove misc/migration_tools/build6xx.pl entirely since it refers to
columns that no longer exist in the Koha database, and has seemingly
had broken encoding since Koha switched from CVS to git (or before!).

Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
2012-07-10 10:50:58 +02:00
Christophe Croullebois
665136f8a0 Bug 6566 Checking if DB's records are properly indexed
Small script that checks if each bibliorecord in the DB is properly indexed
use -h to learn more
(MT #6389)

Signed-off-by: Robin Sheat <robin@catalyst.net.nz>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
2012-07-06 17:11:39 +02:00
Jonathan Druart
623f3a2c84 Bug 8233 : SearchEngine: Add a Koha::SearchEngine module
First draft introducing solr into Koha :-)

List of files :
  $ tree t/searchengine/
  t/searchengine
  |-- 000_conn
  |   `-- conn.t
  |-- 001_search
  |   `-- search_base.t
  |-- 002_index
  |   `-- index_base.t
  |-- 003_query
  |   `-- buildquery.t
  |-- 004_config
  |   `-- load_config.t
  `-- indexes.yaml
  just do `prove -r t/searchengine/**/*.t`

  t/lib
  |-- Mocks
  |   `-- Context.pm
  `-- Mocks.pm
  provide a mock to SearchEngine syspref (set_zebra and set_solr).

  $ tree Koha/SearchEngine
  Koha/SearchEngine
  |-- Config.pm
  |-- ConfigRole.pm
  |-- FacetsBuilder.pm
  |-- FacetsBuilderRole.pm
  |-- Index.pm
  |-- IndexRole.pm
  |-- QueryBuilder.pm
  |-- QueryBuilderRole.pm
  |-- Search.pm
  |-- SearchRole.pm
  |-- Solr
  |   |-- Config.pm
  |   |-- FacetsBuilder.pm
  |   |-- Index.pm
  |   |-- QueryBuilder.pm
  |   `-- Search.pm
  |-- Solr.pm
  |-- Zebra
  |   |-- QueryBuilder.pm
  |   `-- Search.pm
  `-- Zebra.pm

How to install and configure Solr ?
  See the wiki page: http://wiki.koha-community.org/wiki/SearchEngine_Layer_RFC

http://bugs.koha-community.org/show_bug.cgi?id=8233
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>
2012-07-06 16:51:58 +02:00
Julian Maurice
57424a9fdc Bug 7286: rebuild_zebra_sliced for biblios and authorities
Complete rewrite of rebuild_zebra_sliced.zsh (renamed to .sh). Main
improvements are:
  - both biblio and authority records are handled
  - records are exported only once

It also add an option --skip-index to rebuild_zebra.pl that permit to
use rebuild_zebra.pl as an 'export only' script.

Description:
Index Koha records by chunks. It is useful when some record causes
errors and stop the indexation process. With this script, if indexation
of one chunk fails, chunk is splitted in 2 (or 3) chunks, and
indexation continue on these chunks.
rebuild_zebra.pl is called only once to export records.
Splitting and indexing is handled by this script (using yaz-marcdump and
zebraidx).

Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
2012-07-06 15:06:40 +02:00
christophe croullebois
082bb5049d Bug 8136 Changes the expected lenght of 100$a in rebuild_zebra.pl
In rebuild_zebra.pl, if we are in "unimarc" ("marcflavour" syspref), the sub "fix_unimarc_100" is called and checks if 100$a lenght is equal to 35.
If it is not the case, the sub inserts the localtime and more, so we loose the datas in reindexing.
The standart lenght is 36.
I have just changed 35 to 36.

Signed-off-by: Sophie Meynieux <sophie.meynieux@biblibre.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
2012-06-20 09:39:27 +02:00
Galen Charlton
daca5edc52 Bug 7818: -x option of rebuild_zebra.pl now works with DOM filter
One consequence is that the -x and -a options are no longer
mutually exclusive.

Also, because of the way that the GRS-1 SGML filter works, if you're
indexing multiple documents, you can't just wrap them in a document
element, but the DOM filter *requires* it.  Consequently, two
new config settings in koha-conf.xml are added to indicate the
Zebra filter in use so that the -x option of rebuild_zebra.pl
knows whether to wrap the exported records or not:

- bib_index_mode (defaults to 'grs1' if not specified)
- auth_index_mode (defaults to 'dom')

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
2012-06-09 11:44:09 +02:00