Bug 35659: (follow-up) Better handling of accented characters

If you try to harvest bibliographic records from a UNIMARC OAI
repository (using oai_dc data format) in a MARC21 Koha instance
and run the OAI harvester script in verbose mode, you may get
lines similar to the following in the output:

no mapping found for [0xC9] at position 0 in Économie politique g0=ASCII_DEFAULT g1=EXTENDED_LATIN at /usr/share/perl5/MARC/Charset.pm line 308.
no mapping found for [0xC9] at position 0 in Église et société g0=ASCII_DEFAULT g1=EXTENDED_LATIN at /usr/share/perl5/MARC/Charset.pm line 308.

When looking at the imported records' biblio details page in
the OPAC, most words containing accented characters will not
appear correctly.

The fix is to apply Franck Theeten's solution from Bug 16488
(https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=16488#c24)
and set the 10th character of the MARC leader (position 09, the
character coding scheme) to 'a', which marks the record as
UCS/Unicode, in the XSLT that transforms the UNIMARC OAI records
into MARC21 XML. With that flag set, the accented characters are
imported properly and the records appear correctly in the OPAC.
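
For illustration only (this is not part of the patch, which makes
the change in XSLT), the effect on the leader can be sketched in
Python. In MARC21, Leader/09 is the character coding scheme: a
blank means MARC-8 and 'a' means UCS/Unicode. The sample leader
and the function names below are hypothetical.

```python
# Illustrative sketch; the actual fix lives in the XSLT stylesheet.
LEADER_LENGTH = 24       # a MARC21 leader is always 24 characters
CODING_SCHEME_POS = 9    # Leader/09, i.e. the 10th character


def mark_leader_as_unicode(leader: str) -> str:
    """Return a copy of the leader with Leader/09 set to 'a' (UCS/Unicode)."""
    if len(leader) != LEADER_LENGTH:
        raise ValueError(
            f"a MARC21 leader has {LEADER_LENGTH} characters, got {len(leader)}"
        )
    return leader[:CODING_SCHEME_POS] + "a" + leader[CODING_SCHEME_POS + 1:]


def is_unicode_leader(leader: str) -> bool:
    """True if the leader declares UCS/Unicode as its coding scheme."""
    return len(leader) == LEADER_LENGTH and leader[CODING_SCHEME_POS] == "a"


# Hypothetical leader with a blank (MARC-8) coding scheme at position 09.
sample = "00000nam  2200000 u 4500"
fixed = mark_leader_as_unicode(sample)
```

A record whose leader is left with a blank at position 09 is
treated as MARC-8, which is presumably why MARC::Charset was
asked to convert text that was already UTF-8.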

Test plan:

0) Without this patch, running the OAI harvesting script in
   verbose mode produces many warnings, and garbled characters
   appear in the OPAC biblio details page wherever accented
   characters are in use.

1) Apply this patch.

2) Re-run the OAI harvesting script in verbose + force mode
   (force mode is required to ignore record datestamps from
   previous runs):

   misc/cronjobs/harvest_oai.pl -v -r <OAI_REPO_ID> -f

   This time there should be no warnings printed on your
   screen, and any characters with accents in the updated
   records should look OK in the OPAC.

Thanks-to: Franck Theeten <franck.theeten@africamuseum.be>
Signed-off-by: Michal Denar <black23@gmail.com>
Signed-off-by: Julian Maurice <julian.maurice@biblibre.com>
Signed-off-by: Pedro Amorim <pedro.amorim@ptfs-europe.com>
Signed-off-by: Victor Grousset/tuxayo <victor@tuxayo.net>
Signed-off-by: David Cook <dcook@prosentient.com.au>
Sponsored-by: Association KohaLa - https://koha-fr.org/
Signed-off-by: Katrin Fischer <katrin.fischer@bsz-bw.de>
This commit is contained in:
Andreas Roussos 2024-02-28 07:06:33 +00:00 committed by Katrin Fischer
parent d2367657b2
commit 389a7de850
Signed by: kfischer
GPG key ID: 0EF6E2C03357A834

@@ -53,7 +53,7 @@
 <xsl:otherwise>m</xsl:otherwise>
 </xsl:choose>
 </xsl:variable>
-<xsl:value-of select="concat(' ',$leader06,$leader07,' 3u ')"/>
+<xsl:value-of select="concat(' ',$leader06,$leader07,' a 3u ')"/>
 </xsl:element>
 <datafield tag="042" ind1=" " ind2=" ">