Commit graph

6 commits

Author SHA1 Message Date
Olli-Antti Kivilahti
09e330aa24 Bug 13660: Exclude export phase and use existing exported MARCXML - rebuild_zebra_sliced.sh
When looking for a bad MARC Record using the rebuild_zebra_sliced.sh, it is
useful to skip the complete MARCXML exporting from Koha and reuse the exported
files for Zebra indexing.

This patch adds a new parameter:
    -x | --exclude-export Do not export Biblios from Koha, but use the existing
                          export-dir

Which depends on the:
     -d | --export-dir     Where rebuild_zebra.pl will export data
                           Default: $EXPORTDIR

 !---------!
! TEST PLAN !
 !---------!

1. Run
     "./rebuild_zebra_sliced.sh --length 1000"
   to export 1000 MARC Records
   and slice them to one big 1000-Record chunk.
2. Realize that you get an imaginary "stack smashing detected"-error crashing
   your indexing at some Record you dont know of and can't make out from the
   indexing logging.
3. Start looking for the bad Record by running:
     "./rebuild_zebra_sliced.sh --exlude-export --chunk-size 10"
   To skip Biblios export from Koha which takes ~2h and get straight into
   splitting your exported biblios to chunks of 10, and indexing them. You
   know which chunk fails so it is much easier to find the issue there.

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>

Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>

Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
2018-01-09 17:23:50 -03:00
Julian Maurice
357cf7fdd8 Bug 8746 [Follow-up] Replace == by eq in string comparison
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
All tests and QA script pass.
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-04-21 09:23:23 -04:00
Julian Maurice
a4e8443691 Bug 8746: Fix indexation in DOM index mode
When in DOM index mode, files exported by `rebuild_zebra.pl -x` are
wrapped by '<collection></collection>' tag.
This is a problem because splitting files produces invalid files.
This is fixed by adding the missing <collection> tags in each generated
file.
Another problem was that the wrong zebra configuration file was used.
The script now uses C4::Context->zebraconfig($server)->{config} to know
which configuration file has to be used.

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-04-21 09:23:22 -04:00
Julian Maurice
eef4b3f23c Bug 8746: rebuild_zebra_sliced.sh now export/index records as MARCXML
This avoid indexing failures due to "bad offset" or "bad length" error
with ISO2709 format

+ minor improvements:
  -  --length parameter is optional. If not given, it will execute the
     right sql query to find the number of records to index
  -  new parameter --reset-index. If set, index is reset before indexing

Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Comment: Work as described. No errors.

Test: Edit record to make it longer than 9999. Without patch rebuild_sliced
fails. With patches works.

Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
2013-04-21 09:23:22 -04:00
Colin Campbell
722701d596 Bug 8727 Minor stylistic change to help text
indexing not indexation
some minor grammatical changes

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
2012-09-17 18:47:40 +02:00
Julian Maurice
57424a9fdc Bug 7286: rebuild_zebra_sliced for biblios and authorities
Complete rewrite of rebuild_zebra_sliced.zsh (renamed to .sh). Main
improvements are:
  - both biblio and authority records are handled
  - records are exported only once

It also add an option --skip-index to rebuild_zebra.pl that permit to
use rebuild_zebra.pl as an 'export only' script.

Description:
Index Koha records by chunks. It is useful when some record causes
errors and stop the indexation process. With this script, if indexation
of one chunk fails, chunk is splitted in 2 (or 3) chunks, and
indexation continue on these chunks.
rebuild_zebra.pl is called only once to export records.
Splitting and indexing is handled by this script (using yaz-marcdump and
zebraidx).

Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
2012-07-06 15:06:40 +02:00