Bug 34549: Strip non-XML chars during TransformHtmlToMarc
author: David Cook <dcook@prosentient.com.au>
author date: Thu, 17 Aug 2023 04:28:29 +0000
committer: Tomas Cohen Arazi <tomascohen@theke.io>
commit date: Mon, 9 Oct 2023 14:41:32 +0000 (11:41 -0300)
commit: 3e1d32f9caaab56acd8f4b338a859eb599955634
tree: ebc2aa8ba1811f7b6828ea71d42a6ba46ce58357
parent: 6b6534a22a60f8797c678c8ea5d66eefdb474141

This patch strips characters that are not valid in XML (for example,
control characters such as ESC, char value 27) from inputs during
TransformHtmlToMarc.
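
As a rough illustration of what stripping non-XML characters means, the sketch below removes every code point outside the XML 1.0 `Char` production. It is written in Python for brevity and is not the code this patch adds to C4/Biblio.pm (which is Perl); the helper name `strip_non_xml_chars` is hypothetical.

```python
import re

# Code points allowed by the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# The negated character class below matches everything else.
_NON_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_non_xml_chars(value: str) -> str:
    """Return value with characters that XML 1.0 forbids removed.

    Hypothetical helper for illustration only; not part of Koha.
    """
    return _NON_XML_CHARS.sub("", value)


# ESC (decimal 27) is the character named in the parser error
# quoted in the test plan.
title = "A title with an escape\x1b character"
print(strip_non_xml_chars(title))
```

Tab, line feed and carriage return are kept because XML 1.0 permits them; the other C0 control characters are dropped.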

To test:
0. Apply patch
1. koha-plack --restart kohadev
2. Go to http://localhost:8081/cgi-bin/koha/cataloguing/addbiblio.pl
3. Fill out the record and use the text from "Text file containing control
   characters" as the title
4. Click Save
5. Note that your record displays without any warnings like the following:
     Error: invalid data, cannot decode metadata object
     parser error : PCDATA invalid Char value 27

Signed-off-by: David Nind <david@davidnind.com>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
[EDIT] Squashed the tidy patch. Still needed a few spaces to satisfy qa tools.
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>

Files changed:
C4/Biblio.pm
t/db_dependent/Biblio/TransformHtmlToMarc.t