git.koha-community.org Git - koha.git/commit

author	Galen Charlton <gmc@esilibrary.com>
	Tue, 8 Jan 2013 00:12:57 +0000 (19:12 -0500)
committer	Chris Cormack <chris@bigballofwax.co.nz>
	Fri, 29 Mar 2013 07:19:29 +0000 (20:19 +1300)
commit	9cf40a72ba2f8b206717e19e703506970cedfef2
tree	4b1bf250640f814a7905b188fbd91781f1b1517b
parent	595687e213587e690d4bc66752e582fdac787043

bug 9496: improve error checking in rebuild_zebra.pl

When using rebuild_zebra to index all records, skip over
bibliographic or authority records that don't come out
as valid XML.  Also, strip extraneous XML declarations when
using --nosanitize.
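
The behaviour described above can be sketched roughly as follows.
This is an illustration in Python, not the Perl code from the patch:
the function names, the regular expression, and the
(record_id, marcxml) input shape are all assumptions made for the
example.  It presumes the declaration is stripped because many
records end up concatenated into one export file, where a repeated
<?xml ...?> line would itself be invalid.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern: a single leading XML declaration, e.g.
# <?xml version="1.0" encoding="UTF-8"?>
XML_DECL = re.compile(r'^\s*<\?xml[^>]*\?>\s*')


def strip_xml_declaration(marcxml):
    """Remove a leading XML declaration, if one is present."""
    return XML_DECL.sub('', marcxml, count=1)


def export_records(records):
    """Split records into exportable XML bodies and skipped IDs.

    `records` is an iterable of (record_id, marcxml) pairs; this
    shape is assumed for the sketch only.
    """
    exported = []
    skipped = []
    for record_id, marcxml in records:
        body = strip_xml_declaration(marcxml)
        try:
            # Well-formedness check only; no MARCXML schema validation.
            ET.fromstring(body)
        except ET.ParseError:
            # Report and skip the bad record instead of aborting the run.
            print("skipping record %s: invalid XML" % record_id)
            skipped.append(record_id)
            continue
        exported.append(body)
    return exported, skipped
```

A record with trailing junk after its root element, like the one
produced in test plan #2, fails the parse step and is skipped while
the remaining records are still exported.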

Test plans
----------
Note that both plans assume that DOM indexing is turned on.

Test plan #1
============

[1] Run rebuild_zebra.pl with the -x and --nosanitize options.  Without
    the patch, zebraidx should terminate early and complain
    about invalid XML.
[2] With the patch, rebuild_zebra.pl should work without
    error.

Test plan #2
============
[1] Intentionally make a MARCXML record invalid, e.g., by running
    the following SQL:

    UPDATE biblioitems SET marcxml = CONCAT(marcxml, 'junk')
    WHERE biblionumber = 123;

[2] Run rebuild_zebra.pl -b -x -r
[3] Without the patch, only part of the database will be indexed.
[4] With the patch, rebuild_zebra.pl will not export the bad
    record and will give an error message saying so, but will
    successfully index the rest of the records.

Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Larry Baerveldt <larry@bywatersolutions.com>
Signed-off-by: Mason James <mtj@kohaaloha.com>
Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>
Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>
Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>