Bug 20540: Fix TransformHtmlToXml if last tag is empty
authorJonathan Druart <jonathan.druart@bugs.koha-community.org>
Fri, 6 Apr 2018 20:16:30 +0000 (17:16 -0300)
committerJonathan Druart <jonathan.druart@bugs.koha-community.org>
Wed, 11 Apr 2018 19:45:19 +0000 (16:45 -0300)
commitcc8f3dc1cc046ecd7d53349091fbba864dfb91c9
treefd788dee0a06e90f674ca5ff73153365b9e4d103
parentd8b6fde9b7f59ba32f4726f55acb6f803c7816d6
Bug 20540: Fix TransformHtmlToXml if last tag is empty

This bug has been found during testing bug 19289.

In some conditions C4::Biblio::TransformHtmlToXml will generate a malformed XML structure. The last </datafield> can be duplicated.

For instance, if a call like:
my $xml = TransformHtmlToXml( \@tags, \@subfields, \@field_values );

with the last value of @field_values is empty, it will return:

<?xml version="1.0" encoding="UTF-8"?>
<collection
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd"
  xmlns="http://www.loc.gov/MARC21/slim">
<record>
<datafield2 tag="020" ind1=" " ind2=" ">
<subfield code="a">l</subfield>
</datafield>
<datafield1 tag="100" ind1=" " ind2=" ">
<subfield code="a">k</subfield>
</datafield>
<datafield1 tag="245" ind1=" " ind2=" ">
<subfield code="a">k</subfield>
</datafield>
<datafield1 tag="250" ind1=" " ind2=" ">
<subfield code="a">k</subfield>
</datafield>
<datafield1 tag="260" ind1=" " ind2=" ">
<subfield code="b">k</subfield>
<subfield code="c">k</subfield>
</datafield>
</datafield>
</record>
</collection>

Which will result later in the following error:
:23: parser error : Opening and ending tag mismatch: record line 6 and datafield
</datafield>
            ^
:24: parser error : Opening and ending tag mismatch: collection line 2 and record
</record>
         ^
:25: parser error : Extra content at the end of the document
</collection>

Test plan:
You can test it along with bug 19289 and confirm that it fixes the problem
raised on bug 19289 comment 30

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
C4/Biblio.pm
t/Biblio/TransformHtmlToXml.t