Koha-community/Koha - Koha: The world's first free and open source library system

Author	SHA1	Message	Date
Josef Moravec	68eeefa07e	Bug 22721: Remove frameworkcode parameter in GetMarcFromKohaField calls Test plan: Run tests, at least: t/db_dependent/Biblio.t t/db_dependent/Biblio/TransformHtmlToMarc.t t/db_dependent/Charset.t t/db_dependent/Circulation/GetTopIssues.t t/db_dependent/Filter_MARC_ViewPolicy.t t/db_dependent/ImportBatch.t t/db_dependent/Items.t t/db_dependent/Items/AutomaticItemModificationByAge.t t/db_dependent/Items/GetItemsForInventory.t t/db_dependent/Koha/Filter/EmbedItemsAvailability.t t/db_dependent/Serials.t t/db_dependent/XISBN.t t/db_dependent/FrameworkPlugin.t Signed-off-by: Josef Moravec <josef.moravec@gmail.com> Signed-off-by: Michal Denar <black23@gmail.com> Signed-off-by: Bouzid Fergani <bouzid.fergani@inlibro.com> Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de> Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>	2019-07-15 11:28:08 +01:00
Jonathan Druart	8e0bb409ae	Bug 18910: Revert "Bug 18152 : fix unimarc label in SetMarcUnicodeFlag" This reverts commit `bf551a0722`. Signed-off-by: Fridolin Somers <fridolin.somers@biblibre.com> Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>	2017-07-26 14:01:51 -03:00
Stephane Delaune	bf551a0722	Bug 18152 : fix unimarc label in SetMarcUnicodeFlag The UNIMARC standard requires that the 9th character (counting from 0) of the label be blank (while it may be 'a' in MARC21). The problem is that C4::Charset::SetMarcUnicodeFlag (called in particular when we import a record) always puts 'a' in the 9th label position, whereas it should do so only for MARC21 and NORMARC (not for UNIMARC). C4::Charset::SetMarcUnicodeFlag sets 'a' in the 9th label character for MARC21 and NORMARC (as expected), but just before doing this it calls "$marc_record->encoding('UTF-8')", a MARC::Record function which, when called with the 'UTF-8' parameter, does only one thing: it sets 'a' in the 9th label character. This patch only removes this incorrect function call, so when we import a bib record in UNIMARC it no longer adds the erroneous character (this does not change anything for MARC21 and NORMARC because SetMarcUnicodeFlag explicitly sets 'a' in the 9th label character for them). Signed-off-by: Alex Buckley <alexbuckley@catalyst.net.nz> Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org> Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>	2017-05-08 08:36:57 -04:00
Jonathan Druart	9190300694	Bug 18432: Replace 2 'he or she' with 'they' Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org> Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>	2017-04-21 10:56:43 -04:00
Jonathan Druart	798d38e4c7	Bug 16011: $VERSION - Remove comments perl -p -i -e 's/^.*set the version for version checking.*\n//' **/*.pm + manual adjustments Signed-off-by: Josef Moravec <josef.moravec@gmail.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar> Signed-off-by: Brendan A Gallagher <brendan@bywatersolutions.com>	2016-03-24 17:20:29 +00:00
Jonathan Druart	017699c345	Bug 16011: $VERSION - Remove the $VERSION init Mainly a perl -p -i -e 's/^.*3.07.00.049.*\n//' **/*.pm Then some adjustments Signed-off-by: Josef Moravec <josef.moravec@gmail.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar> Signed-off-by: Brendan A Gallagher <brendan@bywatersolutions.com>	2016-03-24 17:20:28 +00:00
Jonathan Druart	3830d78d46	Bug 16011: $VERSION - remove use vars $VERSION perl -p -i -e 's/^(use vars .*)\$VERSION\s?(.*)/$1$2/' **/*.pm Signed-off-by: Josef Moravec <josef.moravec@gmail.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@unc.edu.ar> Signed-off-by: Brendan A Gallagher <brendan@bywatersolutions.com>	2016-03-24 17:20:26 +00:00
Fridolin Somers	636050f9be	Bug 14078: (followup) converting from ISO5426 is not complete Conversion of MARC from ISO5426 is defined in C4::Charset::char_decode5426(). Each character or combined characters conversion is defined in a map. This patch adds missing conversions. See http://www.gymel.com/charsets/MAB2.html Signed-off-by: Frederic Demians <f.demians@tamil.fr> Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org> Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>	2015-11-16 12:48:44 -03:00
Fridolin Somers	5e3882bcec	Bug 14078: converting from ISO5426 is not complete Conversion of MARC from ISO5426 is defined in C4::Charset::char_decode5426(). Each character or combined characters conversion is defined in a map. This patch changes some odd current conversions. In char_decode5426(), only characters between 0xC0 and 0xDF will be used for combining with the following character: ($char >= 0xC0 && $char <= 0xDF) So a conversion like "$chars{0x81d1}=0x00b0" will never be used. Rules for "h with breve below" use combining with 0xf9, but it looks like the correct character is 0xd5. See http://www.gymel.com/charsets/MAB2.html Signed-off-by: Frederic Demians <f.demians@tamil.fr> Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org> Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>	2015-11-16 12:48:33 -03:00
Jonathan Druart	34fe5c2416	Bug 11790: Remove dependency C4::Context from C4::Charset C4::Context is only used to retrieve a syspref value. This patch moves the use of C4::Context to a require. Test plan: Try to reach the SetMarcUnicodeFlag subroutine (batchmod, add/update a biblio, etc.) Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com> Tested on French UNIMARC install No errors adding/editing biblios No koha-qa errors Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2015-06-08 10:36:11 -03:00
Jonathan Druart	a6c9bd0eb5	Bug 9978: Replace license header with the correct license (GPLv3+) Signed-off-by: Chris Nighswonger <cnighswonger@foundations.edu> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com> Signed-off-by: Katrin Fischer <katrin.fischer@bsz-bw.de> http://bugs.koha-community.org/show_bug.cgi?id=9987 Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2015-04-20 09:59:38 -03:00
Tomas Cohen Arazi	2840c2fa75	Bug 11944: revert unneeded IsStringUTF8ish behaviour change Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2015-01-13 13:07:52 -03:00
Jonathan Druart	55107741a2	Bug 11944: replace use of utf8 with Encode See the wiki page for the explanation. Signed-off-by: Paola Rossi <paola.rossi@cineca.it> Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com> Signed-off-by: Dobrica Pavlinusic <dpavlin@rot13.org> Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2015-01-13 13:06:45 -03:00
Jonathan Druart	99521c37fa	Bug 11944: Remove all utf8 filter from templates This patch - removes all html_entity usages in tt file which hide utf8 bugs - removes all encode utf8 in tt plugins because we should get correctly marked data from DBIC and other sources directly (cf plugin EncodeUTF8 used in renew.tt) - adds some cleanup in C4::Templates::output: we now use perl utf8 file handler output so we don't need to decode tt variables manually. Signed-off-by: Paola Rossi <paola.rossi@cineca.it> Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com> Signed-off-by: Dobrica Pavlinusic <dpavlin@rot13.org> Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2015-01-13 13:06:42 -03:00
Mark Tompsett	878fa77c30	Bug 13075: Silence warnings and improve Charset testing. Calls to C4/Charset.pm's NormalizeString function with an undefined string were triggering warnings when running: prove -v t/db_dependent/Holds.t Sadly, t/Charset.t was also lacking calls to NormalizeString. TEST PLAN --------- 1) prove -v t/db_dependent/Holds.t -- This should generate the uninitialized string warnings. Make sure CPL and MPL are in your branches to save yourself from headaches due to expected data. 2) cat t/Charset.t -- note there are no function calls to NormalizeString. You can see other shortfalls in the tests beyond NormalizeString with: grep ^sub C4/Charset.pm 3) prove -v t/Charset.t 4) Apply patch 5) prove -v t/Charset.t -- Run as before with more tests. 6) cat t/Charset.t -- note there are now function calls to NormalizeString. 7) prove -v t/db_dependent/Holds.t -- Nice and clean run! :) 8) koha-qa.pl -v 2 -c 1 -- all should be Ok. Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com> Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-11-14 09:35:44 -03:00
Jonathan Druart	3443056e15	Bug 8218: qa followup This patch - rename _entity_clean as _clean_ampersand - rename the script to sanitize_records.pl - add a --fix-ampersand switch (the only one FOR NOW, enabled by default) so it is obvious what the script does. - make POD and usage reflect this changes. Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-11-11 15:39:05 -03:00
Jonathan Druart	299a8a6997	Bug 8218 : Add a maintenance script to sanitize biblio records This patch adds: - a new maintenance script batch_sanitize_records - a new subroutine C4::Charset::SanitizeRecord - new unit tests for the new subroutine Test plan: 1/ prove t/db_dependent/Charset.t 2/ Create a record containing "&amp;" (could be follow with as many 'amp;' as you want) in one of its fields and the same for the field linked to biblioitems.url. The url should not be sanitized, it may contain "&". 3/ Launch the maintenance script with the -h parameter to see how to use it. 4/ Launch the script using the different parameters: --filename=FILENAME --biblionumbers='XXX' --auto-search The auto-search permits to sanitize all records containing "&amp;" in the marcxml field. Use the verbose flag for testing. Without the --confirm flag, nothing is done. 5/ Use the --confirm flag and verify in the biblioitems.marcxml field that the record has been sanitized. 6/ Try the --reindex flag to reindex records which have been modified. Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl> Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-11-11 15:38:36 -03:00
Stéphane Delaune	f32b0e1243	Bug 9859: fix nsb_clean side effect Signed-off-by: Mathieu Saby <mathieu.saby@univ-rennes2.fr> This sub was causing 2 bugs : - tools/exports.pl --clean was removing Â - authority search plugin used in cataloging was removing Â in suggested authorities displayed dynamically (using ajax) After applying the patch, - NSB/NSE are still removed by nsb_clean - tools/exports.pl --clean no longer removes Â - authority search plugin no longer removes Â Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-10-22 14:06:14 -03:00
Tomas Cohen Arazi	0f8ca14324	Bug 12462: Fix some POD errors Bug 12041 made xt/author/podcorrectness.t consider files in the 'Koha' namespace. Some of them were failing. This patch fixes some of those POD problems. Best regards To+ Test: 1) run prove xt/author/podcorrectness.t it fails 2) apply patch 3) run again, now it's ok Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com> Before patch test fails. After it, it passes No koha-qa errors Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>	2014-06-22 19:56:37 -03:00
Stéphane Delaune	b9d2a832db	Bug 11730: ensure that C4::Charset loads C4::Context C4::Charset::SetMarcUnicodeFlag() fetches system preference values, so since it invokes routines in C4::Context, it should load the module. Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz> Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2014-02-18 21:52:21 +00:00
Jonathan Druart	0a176d4648	Bug 8015: (follow-up) trap exceptions thrown by SetUTF8Flag() Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Leila <koha.aixmarseille@gmail.com> Bug 8015: Catch error in the SetUTF8Flag routine The eval avoids the interface to run endlessly if an error occurred. Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Leila <koha.aixmarseille@gmail.com> Signed-off-by: Galen Charlton <gmc@esilibrary.com>	2013-10-31 22:48:59 +00:00
Vitor FERNANDES	93ba3b8560	Bug 8347 - Koha forces UNIMARC 100 field code language to 'fre' Changed Charset.pm to use defaultlanguage instead of 'fre'. Signed-off-by: Rolando Isodoro <rolando.isidoro@gmail.com> Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de> Passes all tests and QA script. 1) Check system preference was added correctly: UNIMARCField100Language 2) Change code in preference to be not 'fre'. 3) Catalog a bibliographic record. - check plugin shows new value - check empty field is filled with new value from the plugin - check you can still edit it to be something else Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com>	2013-02-20 09:06:57 -05:00
Chris Cormack	509d673f10	Bug 7941 : Fix version numbers in modules Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com> Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-06-11 17:29:38 +02:00
Matthias Meusburger	1e7437bbae	Bug 7400: Add auto-completion on auth_finder While typing an authority, will automatically propose authorities (similar to autocompletion for patron search if activated) Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Tested searching for authorities with and without autocomplete. Note that this is most useful when used in the "Main entry" box instead of the "Main entry ($a only)" box. Signed-off-by: Jared Camins-Esakov <jcamins@cpbibliography.com> Corrected tabs to spaces in auth-finder-search.inc while resolving merge conflict. Signed-off-by: Paul Poulain <paul.poulain@biblibre.com>	2012-03-19 18:20:30 +01:00
Matthias Meusburger	a3ff0bb5cb	bug 5579 : Fixes several exports to embed items - The following export pages used to embed items when exporting, this was no longer the case, so they were fixed : Intranet : - basket/downloadcart.pl, - virtualshelves/downloadshelf.pl - catalogue/export.pl Opac : - opac/opac-downloadcart.pl - opac/opac-downloadshelf.pl - opac/opac-export.pl - Notes : - GetMarcBiblio used to embed items data, this was no longer the case, so an optional parameter was added to choose if items should be embedded or not. This way, previous work on this bug is not broken, and this is a pretty useful feature, imho. - An optional parameter has been added to SetUTF8Flag, to be able to use NFD during normalization. This was required to make Unicode/UTF-8 export work again. Signed-off-by: Claire Hernandez <claire.hernandez@biblibre.com> Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>	2011-04-19 22:35:15 +12:00
Magnus Enger	edf8ad5d5d	Bug 3644 Add support for NORMARC - XSLT for the OPAC - Value_builders for leader, 007 and 008 - Default NORMARC framework - Reverse MARC logic of some subs, so MARC21 is default (and works for NORMARC) - Add NORMARC as an option to the syspref marcflavour - Add record.abs for NORMARC - Add NORMARC and nb as options to Makefile.PL - Add etc/zebradb/lang_defs/nb/sort-string-utf.chr - Copy MARC21slim2OAIDC.xsl to NORMARCslim2OAIDC.xsl Some things are still missing, e.g.: - XSLT for Intranet - More MARC21slim2*.xsl transformations	2011-03-30 10:13:37 +02:00
Paul Poulain	4117b293f6	NormalizeString POD Fixing and variable renaming POD was mistakenly telling that NFD was supposed to be the default encoding. In fact, it is not, it is NFC. So the variable $nfc to change to the not default encoding was misleading. Renaming it into $nfd (written by hdl) Refactored by Chris Cormack Signed-off-by: Davi <davi@gnu.org> Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>	2011-02-18 10:39:56 +13:00
Galen Charlton	7c0e441d50	replace references to defunct info email address Now links to Koha project website. Signed-off-by: Galen Charlton <gmcharlt@gmail.com>	2010-06-25 05:18:44 -04:00
Andrew Elwell	aa9b4d92cd	POD Cleanups Signed-off-by: Andrew Elwell <Andrew.Elwell@gmail.com> Signed-off-by: Galen Charlton <gmcharlt@gmail.com>	2010-06-09 08:38:59 -04:00
Nahuel ANGELINETTI	b41e6e0f73	(MT #2962 ) add converted chars from ISO5426 This adds 0xBE and 0xBD conversion to the char table. Signed-off-by: Galen Charlton <gmcharlt@gmail.com>	2010-03-16 21:30:43 -04:00
Nahuel ANGELINETTI	4b5483c61e	(MT #3075 ) fix oe char from iso5426 This adds oe conversion to the char table for iso5426. Signed-off-by: Galen Charlton <gmcharlt@gmail.com>	2010-03-16 21:30:42 -04:00
Lars Wirzenius	7279f55b60	Fix FSF address in directory C4/ Signed-off-by: Galen Charlton <gmcharlt@gmail.com>	2010-03-16 20:17:56 -04:00
Galen Charlton	afed49ccc8	fix POD errors reported by xt/author/podcorrectness.t Signed-off-by: Galen Charlton <gmcharlt@gmail.com>	2010-02-03 15:57:31 -05:00
Paul Poulain	9d1e7f43e1	(bug #4020 ) XSLT unimarc display When using XSLT Display, and UNIMARC, since marcFlavour is not used in encoding data, when data is true utf8, as_xml fails on some subfields. Moreover, because transformMARCXMLForXSLT edits some values in the marc record and the PERL UTF8 is not handled by MARC::File::USMARC, it ends up double-encoding the data. Sending a patch to fix both issues. This patch adds - two functions in C4/Charset.pm NormalizeString (uses Unicode::Normalize) SetUTF8Flag (This function in my opinion belongs to MARC::Record, or at least MARC::File::USMARC) - edits C4::XSLT in order to cope with the correct marcflavour - edits C4::Search searchResults to use setUTF8Flag	2010-01-28 15:11:51 +01:00
Henri-Damien LAURENT	d83ca6fedf	Adding a test in C4::Charset in UNIMARC_Encoding	2009-11-17 16:27:13 +01:00
Henri-Damien LAURENT	9d52c46dda	Better conformance for UNIMARC Authorities Encoding encoding is on 8th character and not 9th	2009-11-17 16:27:12 +01:00
Henri-Damien LAURENT	7177a6528a	Bug Fixing : C4::Charset source_encoding always set to iso-8859-1	2009-10-14 18:19:09 +02:00
Henri-Damien LAURENT	7eca37db4f	Authorities bulkmarcimport Adding some new options to bulkmarcimport : -k idtagsubfield in order to store the id of the file record into another field -match tagsubfield,index -a to import authorities -l logfilename to store logs Bug Fixing : C4/Charset.pm Charset was incorrect for UNIMARC Authorities Signed-off-by: Galen Charlton <gmcharlt@gmail.com>	2009-09-30 11:22:21 +02:00
J. David Bavousett	88ba183305	Handle null-or-empty to Charset::StripNonXmlChars When rebuild_zebra.pl is run from cron, there is an occasional error of the form: Use of uninitialized value $str in substitution (s///) at /home/ebpl/kohaclone/C4/Charset.pm line 304. This error occurs when the string that is fed to Charset::StripNonXmlChars is null or undefined, for some reason. This fix will handle the null-or-empty condition, and thus suppress the error. Signed-off-by: Galen Charlton <gmcharlt@gmail.com>	2009-09-12 08:42:11 -04:00
Galen Charlton	a244ff6671	bug 2505: turn on warnings in seven modules C4::XSLT C4::VirtualShelves C4::Review C4::Output C4::Boolean C4::Charset C4::Stats Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz> Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2009-06-07 20:09:16 -05:00
Galen Charlton	da51de184c	bug 2926: fix staging import hang Fixes a hang of the staging import tool when it attempts to process a MARC21 record that claims that it's UTF-8 when it is not. The staging import will now attempt to fix the character encoding of such records. Also added a FIXME to bulkmarcimport.pl, which because of its use of MARC::Batch will skip over such records - better than the original hang of the staging import, but worse than the staging import's new ability to fix such records. Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2009-06-07 13:17:06 -05:00
Nahuel ANGELINETTI	1a4e67c143	(bug #3183 ) fix the SetMarcUnicodeFlag function This patch fixes the function SetMarcUnicodeFlag for UNIMARC support, now the function will fix the length of the field, and set encoding as "50 " instead of "5050". Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2009-05-08 08:23:14 -05:00
Joe Atzberger	248e0392e2	Multi-bug fix - SetMarcUnicodeFlag for records coming from Koha This has bearing on bugs 2905, 2665, 2514 and other "wide character" crashes related to diacritics and Unicode. This should help open the door for reliable input of diacriticals via acquisitions. MARC21_utf8_flag_fix.pl diagnoses and fixes existing problems with MARC data affected by the bug. Adding SetMarcUnicodeFlag to TransformKohaToMarc prevents the bug from corrupting further data. Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2009-04-18 09:14:43 -05:00
Frederic Demians	0551f48150	Improve C4::Charset::MarcToUTF8Record performance A script like bulkmarcimport.pl spends most of its time in the C4::Charset::MarcToUTF8Record function, and specifically in C4::Charset::char_decode5426 just initializing a hash. This patch moves this hash outside the function to avoid initializing it each time the function is called. A test on a specific conversion script showed that run time improved from 23s to 8s. Signed-off-by: Galen Charlton <galen.charlton@liblime.com>	2008-11-06 15:53:29 -06:00
Galen Charlton	cfea172544	work around issue in MARC::Charset Because of a bug in MARC::Charset 0.98, if a string to convert from MARC-8 to UTF-8 has (a) one or more diacritics that (b) are only in character positions 128 to 255 inclusive, the resulting converted string is not in UTF-8, but the legacy 8-bit encoding (e.g., ISO-8859-1). As a result, when such a record is converted to XML using ->as_xml_record(), the resulting XML can be truncated at the offending character. An example of such a record is one that has a price in British pounds in the 260$c but no other diacritics. Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-04-01 06:46:04 -05:00
Galen Charlton	b549d7e1f1	added StripNonXmlChars to C4::Charset Added invocations of StripNonXmlChars to uses of new_from_xml() that involve records saved to Koha fields via MARC::Record->as_xml(); for batch jobs that work on MARC XML files coming from external sources, StripNonXmlChars should not necessarily be used, as it may be better to reject a file or record if it contains that kind of encoding error. Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-02-08 20:22:42 -06:00
Galen Charlton	c86f5df431	charset: fixed bug that prevented ISO-5426 conversion Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-02-03 07:24:45 -06:00
Galen Charlton	60a98d258a	IMPORTANT - refactor MARC character set handling * IsStringUTF8ish - determine if scalar contains a string in UTF8 * MarcToUTF8Record - convert MARC blob or MARC::Record to UTF8 * SetMarcUnicodeFlag - set appropriate MARC21 or UNIMARC field to indicate that record is in UTF-8. Design points of this module include: * No dependencies on other C4 modules, making it easier to add more test cases * All character conversion code in one place * Single entry point for doing a character conversion on a MARC record * Capture of errors and warnings produced by Text::Iconv and MARC::Charset * Start of support for guessing the source character set of a MARC record. Several functions were moved from other scripts or modules to C4::Charset: * C4::Koha->FixEncoding (expanded and renamed MarcToUTF8Record) * C4::Koha->char_decode5426 * fMARC8ToUTF8 from bulkmarcimport.pl (renamed _marc_marc8_to_utf8) Several batch jobs were adjusted to use MarcToUTF8Record instead of FixEncoding. Signed-off-by: Chris Cormack <crc@liblime.com> Signed-off-by: Joshua Ferraro <jmf@liblime.com>	2008-02-03 07:23:56 -06:00
acli	e9858a2910	Moved C4/Charset.pm to C4/Interface/CGI/Output.pm	2003-02-02 07:19:29 +00:00
acli	ea50c2acb6	Preliminary fix of the CGI.pm problem of always assuming that everything is in ISO-8859-1. A new C4::Charset module (tentative name) has been created to guess the charset of a piece of HTML markup. The CGI programs will be modified to use this module as they are encountered during translation.	2003-01-19 06:15:44 +00:00

50 commits