Main Koha release repository https://koha-community.org
Julian Maurice 76e980bb1a Bug 29333: Fix encoding of imported UNIMARC authorities
MARC::Record and MARC::File::* modules sometimes use the position 09 of
the leader to detect encoding. A blank character means 'MARC-8' while an
'a' means 'UTF-8'.

In a UNIMARC authority this position is used to store the authority type
(see https://www.transition-bibliographique.fr/wp-content/uploads/2021/02/AIntroLabel-2004.pdf [FR]).
In this case, 'a' means 'Personal Name'.

The result is that the import will succeed for a Personal Name
authority, but it will fail for all other authority types.
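
The ambiguity described above can be illustrated with a minimal Python sketch. This is not the MARC::Record code: guess_encoding and make_leader are hypothetical helpers, the leader values are made up, and treating every value other than 'a' as MARC-8 is a simplifying assumption.

```python
def guess_encoding(leader: str) -> str:
    """Guess the encoding from leader position 09, MARC 21 style.

    Assumption for illustration: anything other than 'a' is treated as MARC-8.
    """
    return "UTF-8" if leader[9] == "a" else "MARC-8"


def make_leader(pos09: str) -> str:
    """Build a made-up 24-character authority leader with the given position 09."""
    return "00000nx  " + pos09 + "2200000   45  "


# In a UNIMARC authority, position 09 holds the authority type, not the encoding.
personal_name = make_leader("a")  # 'a' = Personal Name
other_type = make_leader("c")     # any other authority type

print(guess_encoding(personal_name))  # UTF-8 (correct only by coincidence)
print(guess_encoding(other_type))     # MARC-8 (wrong: the data is UTF-8)
```

Both records are UTF-8 in this scenario, but only the Personal Name record is guessed correctly, which matches the symptom described above.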

Steps to reproduce:
0. Be sure to have a Koha UNIMARC instance.
1. Download the MARCXML for "Honoré de Balzac"
   curl -o balzac.marcxml https://www.idref.fr/02670305X.xml
2. Verify that it's encoded in UTF-8
   file balzac.marcxml
   (should output "balzac.marcxml: XML 1.0 document, UTF-8 Unicode
   text")
3. Go to Tools » Stage MARC for import and import balzac.marcxml with
   the following settings:
   Record type: Authority
   Character encoding: UTF-8
   Format: MARCXML
   Do not touch the other settings
4. Once imported, go to the staged MARC management tool and find your
   batch. Click on the authority title "Balzac Honoré de 1799-1850" to
   show the MARC inside a modal window. There should be no encoding
   issue.
5. Write down the imported record id (the number in column '#') and go
   to the MARC authority editor. Replace all URL parameters with
   'breedingid=THE_ID_YOU_WROTE_DOWN'
   The URL should look like this:
   /cgi-bin/koha/authorities/authorities.pl?breedingid=198
   You should see no encoding issues. Do not save the record.
6. Import the batch into the catalog. Verify that the authority record
   has no encoding issue.
7. Now download the MARCXML for "Athènes (Grèce)"
   curl -o athènes.marcxml https://www.idref.fr/027290530.xml
8. Repeat steps 2 to 6 using the athènes.marcxml file. At steps 4 and 5
   you should see encoding issues, and position 09 of the leader will
   have been rewritten from 'c' to 'a'. Strangely, importing this batch
   fixes the encoding issue, but the information in position 09 of the
   leader is still lost.
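
To check position 09 of the leader in a downloaded file without going through the staff interface, a small script along these lines may help. It is only a sketch: leader_position_09 is a hypothetical helper, the sample record is made up, and tags are matched by local name because the file may or may not declare the MARCXML namespace.

```python
import xml.etree.ElementTree as ET


def leader_position_09(marcxml: str) -> str:
    """Return the character at position 09 of the first <leader> element."""
    root = ET.fromstring(marcxml)
    for element in root.iter():
        # Drop a "{namespace}" prefix, if any, to get the local tag name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "leader":
            leader = element.text or ""
            if len(leader) < 10:
                raise ValueError("leader is shorter than 10 characters")
            return leader[9]
    raise ValueError("no <leader> element found")


# Made-up record for illustration; a real file would be read from disk.
sample = "<record><leader>00000nx  c2200000   45  </leader></record>"
print(leader_position_09(sample))  # c
```

Running it on the original file and on the record exported after staging would show whether the value was rewritten.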

This patch uses the MARCXML representation of the record instead of the
ISO2709 representation because, unlike MARC::Record::new_from_usmarc,
MARC::Record::new_from_xml allows us to pass the encoding and the format
directly. This prevents data from being double-encoded when position 09
of the leader is anything other than 'a'.
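
The double-encoding effect can be reproduced outside Koha with a few lines of Python. This is a sketch under stated assumptions: decode_record is a hypothetical helper, and Latin-1 stands in for the wrongly guessed encoding because the Python standard library has no MARC-8 codec.

```python
def decode_record(raw: bytes, encoding: str) -> str:
    """Decode raw record bytes using the encoding the caller believes is right."""
    return raw.decode(encoding)


raw = "Athènes".encode("utf-8")  # the bytes as stored in the UTF-8 file

# Encoding passed explicitly, as the patch does: the text survives.
print(decode_record(raw, "utf-8"))  # Athènes

# Encoding guessed wrongly (Latin-1 as a stand-in): each UTF-8 byte becomes
# its own character, and saving the result as UTF-8 encodes it a second time.
garbled = decode_record(raw, "latin-1")
print(garbled)  # AthÃ¨nes
print(len(raw), len(garbled.encode("utf-8")))  # 8 10
```

Passing the encoding explicitly removes the guess, which is the point of switching to the MARCXML representation.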

Test plan:
- Follow the "steps to reproduce" above and verify that you have no
  encoding issues.

Signed-off-by: David Nind <david@davidnind.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
(cherry picked from commit 01d78e1ec7)

Signed-off-by: Lucas Gass <lucas@bywatersolutions.com>
2022-08-23 15:30:11 +00:00
acqui Bug 31001: Fix "CGI::param called in list context" warning in basket.pl 2022-08-12 15:47:32 +00:00
admin Bug 30848: Add an ExpandCodedFields RecordProcessor filter 2022-07-29 17:15:30 +00:00
api Bug 30903: Fix POST /quote 2022-08-12 15:39:33 +00:00
authorities Bug 29333: Fix encoding of imported UNIMARC authorities 2022-08-23 15:30:11 +00:00
basket Bug 29871: Remove marcflavour param in Koha::Biblio->get_marc_notes 2022-07-12 15:54:27 +00:00
bin Bug 20582: Turn Koha into a Mojolicious application 2020-10-06 12:00:04 +02:00
C4 Bug 29333: Fix encoding of imported UNIMARC authorities 2022-08-23 15:30:11 +00:00
catalogue Bug 29333: Fix encoding of imported UNIMARC authorities 2022-08-23 15:30:11 +00:00
cataloguing Bug 31179: Don't copy invisible subfields when duplicating items 2022-08-12 14:29:34 +00:00
circ Bug 30409: barcodedecode() should always trim barcode 2022-07-13 19:44:18 +00:00
clubs Bug 29859: Use iterator instead of as_list 2022-02-09 15:36:23 -10:00
course_reserves Bug 30409: barcodedecode() should always trim barcode 2022-07-13 19:44:18 +00:00
debian Bug 25622: Use special chars in DB password (koha-create) 2022-08-09 22:05:25 +00:00
docs Bug 30808: Add the 22.05 release team. 2022-05-25 23:56:12 -10:00
errors Bug 29420: HTTP status code incorrect when calling error pages directly under Plack/PSGI 2022-04-20 09:03:39 -10:00
etc Bug 29936: Add holds_get_captured option to sip config 2022-05-05 11:17:37 -10:00
ill Bug 29844: Fix ->search occurrences 2022-02-09 15:36:23 -10:00
installer Increment version for 22.05.004 release 2022-08-22 21:12:22 +00:00
Koha Bug 29333: Fix encoding of imported UNIMARC authorities 2022-08-23 15:30:11 +00:00
koha-tmpl Bug 29051: Enable seen renewal in the staff client 2022-08-23 15:18:35 +00:00
labels Bug 29821: Add interface for generating barcodes using svc/barcode 2022-04-08 15:49:17 +02:00
lib/CGI/Session/Serialize Bug 17600: Standardize our EXPORT_OK 2021-07-16 08:58:47 +02:00
members Bug 30807: Migrate to patron-title in pay and paycollect 2022-07-12 19:16:23 +00:00
misc Update release notes for 22.05.04 release 2022-08-22 21:36:39 +00:00
offline_circ Bug 30525: Items batch modification broken 2022-04-21 13:41:36 -10:00
opac Bug 30918: Allow passing filtered record to get_marc_notes 2022-07-29 17:19:15 +00:00
patron_lists Bug 16446: Add ability to add patrons to list by borrowernumber 2021-10-21 12:24:04 +02:00
patroncards Bug 24001: Fix patron card template edition 2022-04-28 10:49:20 -10:00
plugins Bug 29787: Add plugin version to plugin search results 2022-04-08 15:49:15 +02:00
pos Bug 28481: (RM follow-up) formatting 2021-12-16 12:13:51 -10:00
recalls Bug 30924: Add missing branchtransfers.reason value for recall cancellation 2022-07-13 19:13:33 +00:00
reports Bug 30551: Make cash register report take branchcode from cash register 2022-05-06 10:33:10 -10:00
reserve Bug 30960: Fix JS error message when no pick-up location is selected when placing a hold 2022-06-24 15:51:57 +00:00
reviews Bug 17600: Standardize our EXPORT_OK 2021-07-16 08:58:47 +02:00
rotating_collections Bug 17600: Standardize our EXPORT_OK 2021-07-16 08:58:47 +02:00
serials Bug 23352: Set default collection code when creating subscription 2022-05-10 15:17:17 -10:00
services Bug 17600: Standardize our EXPORT_OK 2021-07-16 08:58:47 +02:00
skel
suggestion Bug 30127: By default show pending suggestions tab 2022-05-10 23:09:09 -10:00
svc Bug 29051: Update svc api to allow seen renewals 2022-08-23 15:18:50 +00:00
t Bug 29333: Fix encoding of imported UNIMARC authorities 2022-08-23 15:30:11 +00:00
tags Bug 29469: (bug 17600 follow-up) Fix tag approval/rejection from staff 2021-11-16 15:49:22 +01:00
tmp/modified_authorities
tools Bug 29333: Fix encoding of imported UNIMARC authorities 2022-08-23 15:30:11 +00:00
virtualshelves Bug 26346: Add option to make public lists editable by all staff 2022-04-12 17:13:02 +02:00
xt Bug 27619: (QA follow-up) Remove xt/sample_notices.t 2022-05-11 11:28:48 +01:00
.editorconfig Bug 27375: Set YAML file settings in .editorconfig 2021-11-03 15:40:52 +01:00
.eslintrc.json
.gitignore
.htaccess
.mailmap 22.05.00: Update mailmap 2022-05-25 23:56:12 -10:00
.perlcriticrc Bug 25898: Prohibit indirect object notation 2020-10-15 12:56:30 +02:00
.proverc.dist Bug 19821: Install sample data, ES mappings and Version syspref 2021-10-25 11:27:40 +02:00
.scss-lint.yml Bug 21237: Clean up staff client SCSS 2018-08-24 16:23:25 +00:00
about.pl Bug 28998: (follow-up) Add warning on about for missing key 2022-05-04 05:18:31 -10:00
app.psgi Bug 20582: Fix PSGI file when behind a reverse proxy 2020-10-06 12:00:04 +02:00
changelanguage.pl Bug 25898: Prohibit indirect object notation 2020-10-15 12:56:30 +02:00
cpanfile Bug 25669: (follow-up) Minor fixes 2022-07-29 15:28:00 +00:00
fix-perl-path.PL Bug 28606: Remove $DEBUG and $ENV{DEBUG} 2021-06-24 11:53:44 +02:00
gulpfile.js Bug 30373: Enable translation of UNIMARC frameworks 2022-04-21 13:41:35 -10:00
help.pl Bug 17600: Standardize our EXPORT_OK 2021-07-16 08:58:47 +02:00
INSTALL Bug 26617: Update INSTALL file to include koha-testing-docker and Gitlab links 2020-10-15 12:56:30 +02:00
Koha.pm Increment version for 22.05.004 release 2022-08-22 21:12:22 +00:00
koha_perl_deps.pl Bug 17600: Standardize our EXPORT_OK 2021-07-16 08:58:47 +02:00
kohaversion.pl Bug 26384: Fix executable flags 2020-09-11 09:56:56 +02:00
LICENSE
mainpage.pl Bug 29020: Add link on the mainpage for users without admin access 2021-10-19 09:29:09 +02:00
Makefile.PL Bug 19532: Database and installer stuff 2022-03-14 22:45:50 -10:00
MANIFEST.SKIP
package.json Bug 27939: Update yarn.lock file 2021-03-16 12:04:06 +01:00
README
README.md Bug 27092: Remove note about "synced repo" from README.md 2020-11-25 16:31:58 +01:00
README.robots Bug 6411 add another example to README.robots 2011-07-05 14:48:05 +12:00
rewrite-config.PL Bug 28519: Put CGI::Session::Serialize::yamlxs in lib directory 2021-06-17 10:07:36 +02:00
yarn.lock Bug 27939: Update yarn.lock file 2021-03-16 12:04:06 +01:00

Koha is a free software integrated library system (ILS).

Koha is distributed under the GNU GPL version 3 or later.

Note: Koha does not accept pull requests from git hosting sites.

Note: This project has its own bug tracker. To report a bug or submit a patch, visit http://bugs.koha-community.org.

For guidelines on submitting patches for Koha, please visit https://wiki.koha-community.org/wiki/SubmitingAPatch

The developer handbook can be found at https://wiki.koha-community.org/wiki/Developer_handbook

http://koha-community.org/
