Bug 34549: Strip non-XML chars during TransformHtmlToMarc
author David Cook <dcook@prosentient.com.au>
Thu, 17 Aug 2023 04:28:29 +0000 (04:28 +0000)
committer Fridolin Somers <fridolin.somers@biblibre.com>
Mon, 9 Oct 2023 19:28:05 +0000 (09:28 -1000)
commit 871d6eaa3fbb5eba14e17c3ebc6a46db708e5483
tree 8877086c440ca96557379f6b2bb494fc0fed47ad
parent 70af8ac5643346ced39d5826faeb38cbcc066d87
Bug 34549: Strip non-XML chars during TransformHtmlToMarc

This patch strips non-XML characters from inputs during
TransformHtmlToMarc.
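
As a rough illustration of what "stripping non-XML characters" means, the sketch below removes every code point outside the XML 1.0 Char production (tab, LF, CR, and the ranges U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF). It is written in Python for brevity and is not the Perl code this patch adds to C4/Biblio.pm; the function name strip_non_xml_chars and the regex are assumptions made for the example.

```python
import re

# Matches any character that is NOT allowed by the XML 1.0 "Char" production.
# Allowed: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_NON_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_non_xml_chars(value: str) -> str:
    """Return value with characters that XML 1.0 cannot represent removed.

    Hypothetical helper for illustration only.
    """
    return _NON_XML_CHARS.sub("", value)


if __name__ == "__main__":
    # ESC (decimal 27) is the character named in the parser error
    # quoted in the test plan.
    title = "A title\x1b with an escape character"
    print(repr(strip_non_xml_chars(title)))
```

Applying a filter like this to each submitted field value before the record is serialized keeps control characters such as ESC (decimal 27) out of the stored MARCXML.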

To test:
0. Apply patch
1. koha-plack --restart kohadev
2. Go to http://localhost:8081/cgi-bin/koha/cataloguing/addbiblio.pl
3. Fill out the record and use the text from "Text file containing control characters" as the title
4. Click Save
5. Note that your record displays without any warnings like the following:
Error: invalid data, cannot decode metadata object
parser error : PCDATA invalid Char value 27

Signed-off-by: David Nind <david@davidnind.com>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
[EDIT] Squashed the tidy patch. Still needed a few spaces to satisfy qa tools.
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
(cherry picked from commit 3e1d32f9caaab56acd8f4b338a859eb599955634)
Signed-off-by: Fridolin Somers <fridolin.somers@biblibre.com>
C4/Biblio.pm
t/db_dependent/Biblio/TransformHtmlToMarc.t