60a98d258a
* IsStringUTF8ish - determine if scalar contains a string in UTF8
* MarcToUTF8Record - convert MARC blob or MARC::Record to UTF8
* SetMarcUnicodeFlag - set appropriate MARC21 or UNIMARC field to indicate that record is in UTF-8.

Design points of this module include:

* No dependencies on other C4 modules, making it easier to add more test cases
* All character conversion code in one place
* Single entry point for doing a character conversion on a MARC record
* Capture of errors and warnings produced by Text::Iconv and MARC::Charset
* Start of support for guessing the source character set of a MARC record.

Several functions were moved from other scripts or modules to C4::Charset:

* C4::Koha->FixEncoding (expanded and renamed MarcToUTF8Record)
* C4::Koha->char_decode5426
* fMARC8ToUTF8 from bulkmarcimport.pl (renamed _marc_marc8_to_utf8)

Several batch jobs were adjusted to use MarcToUTF8Record instead of FixEncoding.

Signed-off-by: Chris Cormack <crc@liblime.com>
Signed-off-by: Joshua Ferraro <jmf@liblime.com>
20 lines
604 B
Perl
Executable file
#!/usr/bin/perl

use strict;
use warnings;

use Test::More tests => 6;
BEGIN {
    use_ok('C4::Charset');
}

my $octets = "abc";
ok(IsStringUTF8ish($octets), "verify octets are valid UTF-8 (ASCII)");

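# "\xc3\xa9" is the two-octet UTF-8 encoding of U+00E9; the \x prefix is required for hex escapes
$octets = "flamb\xc3\xa9";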
ok(!utf8::is_utf8($octets), "verify that string does not have Perl UTF-8 flag on");
ok(IsStringUTF8ish($octets), "verify octets are valid UTF-8 (LATIN SMALL LETTER E WITH ACUTE)");
ok(!utf8::is_utf8($octets), "verify that IsStringUTF8ish does not magically turn Perl UTF-8 flag on");

$octets = "a\xc2" . "c";
ok(!IsStringUTF8ish($octets), "verify octets are not valid UTF-8");
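
The tests above treat IsStringUTF8ish as a check on raw octets that must not modify its argument. As a rough illustration of that kind of check, and not the C4::Charset implementation, the sketch below uses the core Encode module to strictly decode a copy of the input. The helper name `looks_like_utf8` is hypothetical and chosen only for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper for illustration only; C4::Charset may work differently.
# Returns 1 if the octets form well-formed UTF-8, 0 otherwise.
sub looks_like_utf8 {
    my ($octets) = @_;
    return 0 unless defined $octets;

    # FB_CROAK makes decode() die on malformed input;
    # LEAVE_SRC keeps decode() from modifying the caller's data.
    my $ok = eval { decode('UTF-8', $octets, FB_CROAK | LEAVE_SRC); 1 };
    return $ok ? 1 : 0;
}

# The same three inputs the test script exercises.
for my $sample ("abc", "flamb\xc3\xa9", "a\xc2" . "c") {
    printf "%d octets: %s\n", length($sample),
        looks_like_utf8($sample) ? "valid" : "invalid";
}
```

Because the strict `UTF-8` decoder rejects a lead octet such as 0xC2 that is not followed by a continuation octet, the third sample is reported as invalid, while plain ASCII and the two-octet encoding of U+00E9 are accepted.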