Currently the import_export_framework.pl script outputs data with
Perl's default encoding, ISO-8859. This patch sets the binmode of
STDOUT to UTF-8, using the PerlIO layer (":encoding(UTF-8)"), when
exporting SQL and CSV files.
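As an illustration only (this is a Python analogy, not the Perl code from the patch), the effect of putting an explicit encoding layer on the output stream can be sketched as follows. The function name write_export and the in-memory stream are assumptions made for the example.

```python
import io


def write_export(lines, encoding="utf-8"):
    """Write text lines through an explicit encoding layer and return the bytes.

    Hypothetical helper: it stands in for printing export rows to STDOUT
    after an encoding layer has been applied to the stream.
    """
    raw = io.BytesIO()
    # newline="" keeps "\n" as-is so the byte output is predictable.
    out = io.TextIOWrapper(raw, encoding=encoding, newline="")
    for line in lines:
        out.write(line + "\n")
    out.flush()
    return raw.getvalue()


# The same text yields different bytes depending on the layer in use.
utf8_bytes = write_export(["Córdoba"])
latin1_bytes = write_export(["Córdoba"], encoding="iso-8859-1")
```

With the UTF-8 layer the accented letter is written as two bytes; with ISO-8859-1 it is a single byte, which is what the `file` command detects in the test plan.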
To test:
Export step test
- Use some non-ASCII character(s), e.g. letters with diacritics, in some
field description in a chosen framework.
- Export the framework at Administration > MARC frameworks
- Run this to check the file is ISO-8859 encoded:
$ file export_XXX.csv
export_XXX.csv: ISO-8859 text, with very long lines
(Note: try SQL and other output formats too. But not ODS)
- Apply the patch
- Export the framework again (change the name), and test encoding:
$ file export_XXX_2.csv
export_XXX_2.csv: UTF-8 Unicode text
Import step test
I assume you have two files, export_XXX.csv (ISO-8859 encoded) and
export_XXX_2.csv (UTF-8 encoded). XXX will depend on your framework's code.
- Reset your testing branch to master
- Import export_XXX.csv
- The string with non-ASCII chars is truncated at the first non-ASCII
char's position (Note: this is the current behaviour).
- Import export_XXX_2.csv
- The non-ASCII chars are broken, and the logs show errors about
non-Unicode chars. (Note: even though UTF-8 is the expected encoding,
the file is treated as ISO-8859.)
- Apply the patch
- Import the good (UTF-8 as expected) file and check everything worked
as expected.
No double encoding should occur with either combination of formats.
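To make the double-encoding failure mode concrete, here is a small Python sketch (an analogy, not code from the patch). It shows what happens when UTF-8 bytes are misread as ISO-8859-1 and written out again, plus a heuristic check. The helper name looks_double_encoded is hypothetical, and the check is a heuristic rather than a guarantee.

```python
def looks_double_encoded(data: bytes) -> bool:
    """Heuristic: True if undoing one ISO-8859-1 misread gives different, valid UTF-8."""
    try:
        once = data.decode("utf-8")
        repaired = once.encode("iso-8859-1").decode("utf-8")
    except (UnicodeDecodeError, UnicodeEncodeError):
        # Either not UTF-8 at all, or already correct UTF-8.
        return False
    return repaired != once


text = "Universidad Nacional de Córdoba"
good = text.encode("utf-8")

# Treating the UTF-8 bytes as ISO-8859-1 produces mojibake ...
misread = good.decode("iso-8859-1")
# ... and writing that out as UTF-8 again yields double-encoded bytes.
double = misread.encode("utf-8")
```

Here looks_double_encoded(good) is false and looks_double_encoded(double) is true, which is the condition the test plan asks reviewers to rule out.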
Sponsored-by: Universidad Nacional de Cordoba
Signed-off-by: Magnus Enger <digitalutvikling@gmail.com>
I put some Norwegian and accented letters in a framework to test.
Before the patch, the exported CSV came out as ISO-8859; after the
patch it came out as UTF-8. ODS and XML (viewed in LibreOffice)
both looked good, before and after the patch.
Importing the ISO-8859 CSV cut off the strings at the first non-ASCII
char. Importing the UTF-8 CSV worked as expected.
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Works as expected, passes tests and QA script.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
The SQL option for MARC framework imports was subject to a bug whereby
somebody could use it to gain access to arbitrary information in the
database by uploading an SQL file containing unexpected statements.
As it is difficult to securely sanitize SQL, this patch removes the
option to use SQL as an import or export format.
To test:
[1] Verify that SQL no longer appears as an import or export option
for the MARC frameworks.
[2] Verify that exports and imports in CSV, Excel XML, and ODS formats
still work.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
Works as advertised. The UI doesn't offer exporting/importing in the SQL format.
Crafting the URL to export SQL falls back to a spreadsheet format (ODS).
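A minimal sketch of the allow-list behaviour described above, written in Python for illustration (the actual script is Perl). The format identifiers, the function name resolve_format, and the choice of fallback value are assumptions based on the observation that a crafted SQL request ends up as ODS.

```python
# Hypothetical identifiers for the formats that remain after SQL is removed.
ALLOWED_FORMATS = ("csv", "excel", "ods")


def resolve_format(requested, fallback="ods"):
    """Return the requested format if it is allowed, otherwise the fallback."""
    fmt = (requested or "").strip().lower()
    return fmt if fmt in ALLOWED_FORMATS else fallback


# A crafted request for SQL is not honoured.
crafted = resolve_format("sql")
normal = resolve_format("CSV")
```

Checking against a fixed list of known formats means an unknown or removed value can never select a code path, regardless of what the UI offers.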
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Works as described, passes all tests and QA script.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
This patch makes the MARC framework import/export script require
that the staff user be logged in with appropriate permissions for
managing the MARC frameworks.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
I can confirm the bug and the solution. After applying the patch
downloading the file without logging in first is no longer possible.
Also passes tests and QA script.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Module to Import/Export a Framework structure to CSV/Excel-xml/ODS/SQL in Intranet Administration - MARC Frameworks section.
There are two new links: "Export" to export to a format; and "Import" to import from a file.
The data exported/imported is that stored in the MySQL tables marc_tag_structure and marc_subfield_structure.
Exporting works as follows:
1) CSV: As this format only allows one worksheet, the data from the tables is split by a row of #-# cells or by a row with the
names of the fields of the next MySQL table. Each row has as many cells as the MySQL table has fields. The first row contains the
field names, the remaining rows hold the data.
2) Excel: Excel XML 2003 format. Each MySQL table has its own worksheet in the spreadsheet. Rows and cells are laid out as in CSV.
3) ODS: OpenDocument Spreadsheet compressed format. Creates a temporary directory to generate the files needed to create the zip file.
Each MySQL table has its own worksheet in the spreadsheet. Rows and cells are laid out as in CSV.
4) SQL: Text file, the first row for each table is a delete and the remaining are inserts.
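The single-worksheet CSV layout from item 1 can be sketched in Python as follows (illustrative only; the script itself is Perl). The function name, the sample column names, and the width of the separator row (one #-# cell per field of the following table) are assumptions, since the description does not pin them down.

```python
import csv
import io

SEPARATOR_CELL = "#-#"


def export_tables_csv(tables):
    """Serialize several tables into one CSV text.

    `tables` is a list of (field_names, rows) pairs. Each table starts with
    a header row of field names; tables after the first are preceded by a
    row of separator cells.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for index, (fields, rows) in enumerate(tables):
        if index > 0:
            writer.writerow([SEPARATOR_CELL] * len(fields))
        writer.writerow(fields)
        writer.writerows(rows)
    return buf.getvalue()


demo = export_tables_csv([
    (["tagfield", "liblibrarian"], [["100", "Main entry"]]),
    (["tagfield", "tagsubfield"], [["100", "a"]]),
])
```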
Importing reads the rows from the spreadsheet/text-file as follows:
1) CSV: Each row inserts or updates the associated MySQL table for this framework. At the end of the import for a MySQL table, the rows in the database that have no corresponding row in the spreadsheet are deleted.
2) Excel: Imports each worksheet to the associated MySQL table. Works as the CSV for each worksheet.
3) ODS: Creates a temporary directory to decompress and read the content.xml. This file has the data needed to import.
Works as the CSV for each worksheet.
4) SQL: Executes the SQL file.
If the imported file has a different frameworkcode than the framework being imported into, the frameworkcode is changed during the process.
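The insert-or-update, delete-the-rest, and frameworkcode rewrite steps can be sketched in Python with an in-memory SQLite database (an illustration under stated assumptions, not the module's Perl/MySQL code). The reduced three-column table, the composite primary key, and both function names are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory stand-in for marc_tag_structure (reduced columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE marc_tag_structure ("
        " frameworkcode TEXT, tagfield TEXT, liblibrarian TEXT,"
        " PRIMARY KEY (frameworkcode, tagfield))"
    )
    return conn


def import_rows(conn, framework_code, rows):
    """Upsert `rows` into the target framework, then delete rows absent from the file."""
    seen = set()
    for row in rows:
        # The frameworkcode stored in the file is ignored; the target one is used.
        conn.execute(
            "INSERT OR REPLACE INTO marc_tag_structure"
            " (frameworkcode, tagfield, liblibrarian) VALUES (?, ?, ?)",
            (framework_code, row["tagfield"], row["liblibrarian"]),
        )
        seen.add(row["tagfield"])
    existing = [
        r[0]
        for r in conn.execute(
            "SELECT tagfield FROM marc_tag_structure WHERE frameworkcode = ?",
            (framework_code,),
        )
    ]
    for tag in existing:
        if tag not in seen:
            conn.execute(
                "DELETE FROM marc_tag_structure"
                " WHERE frameworkcode = ? AND tagfield = ?",
                (framework_code, tag),
            )
    conn.commit()
```

For example, importing a file exported from framework BKS into framework ACQ would store every row under ACQ and remove any ACQ tags that the file does not mention.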
The CSV format will be the default.
It uses the Perl module Archive::Zip or the zip/unzip system commands to process ODS files.
To parse SQL files when importing, it uses SQL::Statement or homemade parsing.
Signed-off-by: Nicole C. Engard <nengard@bywatersolutions.com>
Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>