From 19d9ba176dea6b7816a33b014a5f9e309af53dc0 Mon Sep 17 00:00:00 2001
From: Fridolin Somers
Date: Mon, 29 Jun 2020 15:07:10 +0200
Subject: [PATCH] Bug 23542: Fix SRU import encoding

When importing records from an SRU server, the diacritics are badly
encoded.
I reproduced this with the BNF server, so it may be a UNIMARC issue.

Tests show that the difference between a Z39.50 server and SRU is that
the leader contains 'a' at position 9.
Looking at MARC::Record->encoding() shows that the encoding depends on
the leader even for UNIMARC.

So this patch adds a call to MARC::Record->encoding('UTF-8') in the case
of an SRU server in C4::Breeding.
The same call exists in
Koha::MetadataRecord::Authority::get_from_breeding().

In the case of import via Z3950, MarcToUTF8Record() is called, which
calls SetMarcUnicodeFlag(), which calls MARC::Record->encoding('UTF-8').

Test plan:
1) Use a UNIMARC database
2) Configure a connection to a UNIMARC SRU server, for example BNF, see
   https://doc.biblibre.com/koha/autour_de_koha/serveurs_z3950_sru#serveur_de_la_bnf
3) Go to the cataloguing module
4) Click on 'New from Z39.50/SRU'
5) Choose only the SRU target
6) Search for ISBN 2266072889
7) Confirm you see good encoding: diacritic on 'a' of title
   'Strate-a-gemmes'
8) Click on 'Marc preview'
9) Confirm you see good encoding
10) Click import
11) Confirm you see good encoding
12) Also check Authorities import via SRU
13) Also check SRU imports on a MARC21 database

Signed-off-by: Marcel de Rooy
Amended: Removed the change to the new_from_xml call. We should respect
syntax. But the added MARC::Record encoding call does the trick! This is
implicit for Z3950 targets, where MarcToUTF8Record does the same.
Signed-off-by: Tomas Cohen Arazi
Signed-off-by: Jonathan Druart
---
 C4/Breeding.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/C4/Breeding.pm b/C4/Breeding.pm
index 109cda8950..8287cf4b6f 100644
--- a/C4/Breeding.pm
+++ b/C4/Breeding.pm
@@ -298,6 +298,7 @@ sub _handle_one_result {
     if( $servhref->{servertype} eq 'sru' ) {
         $marcrecord= MARC::Record->new_from_xml( $raw, 'UTF-8', $servhref->{syntax} );
+        $marcrecord->encoding('UTF-8');
     } else {
         ($marcrecord) = MarcToUTF8Record($raw, C4::Context->preference('marcflavour'), $servhref->{encoding} // "iso-5426" ); #ignores charset return values
     }
@@ -610,6 +611,7 @@ sub Z3950SearchAuth {
             my ($charset_result, $charset_errors);
             if( $servers[$k]->{servertype} eq 'sru' ) {
                 $marcrecord = MARC::Record->new_from_xml( $marcdata, 'UTF-8', $servers[$k]->{syntax} );
+                $marcrecord->encoding('UTF-8');
             } else {
                 ( $marcrecord, $charset_result, $charset_errors ) = MarcToUTF8Record( $marcdata, $marc_type, $encoding[$k] );
             }
-- 
2.39.5
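
Illustration of the leader dependency described in the commit message. This is a minimal sketch, not part of the patch: it assumes the MARC::Record module is installed, and the leader value is made up for the example. It shows how MARC::Record->encoding() reads and writes leader position 9.

```perl
use strict;
use warnings;
use MARC::Record;

# Hypothetical record for illustration; this leader value is invented.
my $record = MARC::Record->new();
$record->leader('00000nam  2200000   4500');    # leader position 9 is blank

# Without 'a' at position 9, encoding() does not report UTF-8.
my $before = $record->encoding();

# The call added by the patch: it flags the record as UTF-8 by writing
# 'a' at leader position 9. It does not convert the field data itself.
$record->encoding('UTF-8');

my $after = $record->encoding();
my $flag  = substr( $record->leader(), 9, 1 );

print "before=$before after=$after flag='$flag'\n";
```

Since the call only sets a flag, it suits the SRU branch, where new_from_xml() is already given 'UTF-8' as the source encoding.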