Bug 14722: Refactor the export tool

Why was a refactoring needed for this script?
The export tool (tools/export.pl) can be called from the command line as
well as from the web interface, and some parts of the code were
unnecessarily complicated (just look at the code, you will understand).

Worse still, the script does not provide the same options for both
interfaces. For instance, you cannot export records given a range of
biblionumbers, authids, callnumbers, etc. from the command line.

What does this patch do?
1/ Important: The script tools/export.pl does not work anymore if called from
the command line (should be in the release notes).
2/ The code used to generate a file (csv, iso2709 or xml) has been moved to a new
module (Koha::Exporter::Record) and tests have been provided.
3/ No change is made to the web interface.
4/ Some new options have been added to the command-line script
(misc/export.pl):
    - starting_authid
    - ending_authid
    - authtype
    - starting_biblionumber
    - ending_biblionumber
    - itemtype
    - starting_callnumber
    - ending_callnumber
    - start_accession
    - end_accession
5/ There is a change in the behavior if an error occurs. Warnings such as the
following are now displayed:
Can't call method "as_usmarc" on an undefined value at Koha/Exporter/Record.pm line 114.
record (number 5530) is invalid and therefore not exported because its reopening generates warnings above at Koha/Exporter/Record.pm line 117.

Before this patch, they were not displayed (using the command line).
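
As an illustration of point 5/, the skip-and-warn behavior can be sketched in plain Perl as follows. This is only a sketch under assumptions, not the code from Koha::Exporter::Record: My::StubRecord and export_valid_records are hypothetical names, and the stub models nothing but as_usmarc().

```perl
use strict;
use warnings;

# Hypothetical stand-in for a MARC record; only as_usmarc() is modelled.
package My::StubRecord {
    sub new { my ( $class, %args ) = @_; return bless {%args}, $class; }
    sub as_usmarc { my ($self) = @_; return $self->{raw}; }
}

# Hypothetical helper: serializes every record it can, and collects one
# message per skipped record id instead of dying on the first failure.
sub export_valid_records {
    my ($records_by_id) = @_;
    my ( @exported, @skipped );
    for my $record_id ( sort { $a <=> $b } keys %$records_by_id ) {
        my $record = $records_by_id->{$record_id};

        # Calling a method on undef dies; eval traps that so the loop goes on.
        my $raw = eval { $record->as_usmarc };
        if ( $@ or not defined $raw ) {
            push @skipped, "record (number $record_id) is invalid and therefore not exported";
            next;
        }
        push @exported, $raw;
    }
    return ( \@exported, \@skipped );
}

my ( $exported, $skipped ) = export_valid_records(
    {
        1 => My::StubRecord->new( raw => 'first' ),
        2 => undef,    # simulates a record that could not be loaded
        3 => My::StubRecord->new( raw => 'third' ),
    }
);
warn "$_\n" for @$skipped;      # diagnostics go to STDERR
print "$_\n" for @$exported;    # exported data goes to STDOUT
```

Keeping the diagnostics on STDERR matters here because the command-line script redirects STDOUT to the output file.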

What does this patch not do?
It does not provide the 'clean', 'timestamp' and 'deleted_barcodes' options to
the web interface (same as before).

What about performance?
With a DB with ~800 biblios (MARC21)
Before: perl tools/export.pl 14.79s user 0.83s system 71% cpu 21.905 total
After:  perl misc/export.pl  17.19s user 0.84s system 75% cpu 24.018 total

With a DB with ~6400 biblios (UNIMARC)
Before: perl tools/export.pl 26.55s user 0.76s system 76% cpu 35.498 total
After:  perl misc/export.pl  26.78s user 0.84s system 80% cpu 34.494 total

How to test this patch?
Test plan:
A. Web interface:
1/ On the current master, export some records, biblios and authorities (with
the 3 different exports) playing with the different filters (item type,
libraries, callnumber, accession date, don't export items, remove
non-local items, don't export fields, etc.).
2/ Apply this patch, export again the same records, and compare the
generated files. They must be identical!
3/ Confirm that the export features on the checkout list
(circ/circulation.pl) work as before this patch.

B. The command line
1/ On the current master, export some records, biblios and authorities (with
the 2 different exports) playing with the different options (date,
deleted_barcodes, clean).
2/ Apply this patch, export again the same records, and compare the
generated files. They must be identical!

Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>

Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
3.22.x
Jonathan Druart 9 years ago
committed by Tomas Cohen Arazi
parent
commit
38f92df4e5
  1. 219  Koha/Exporter/Record.pm
  2. 308  misc/export.pl
  3. 166  t/db_dependent/Exporter/Record.t
  4. 717  tools/export.pl

219
Koha/Exporter/Record.pm

@ -0,0 +1,219 @@
package Koha::Exporter::Record;
use Modern::Perl;
use MARC::File::XML;
use MARC::File::USMARC;
use C4::AuthoritiesMarc;
use C4::Biblio;
use C4::Record;
sub _get_record_for_export {
my ($params) = @_;
my $record_type = $params->{record_type};
my $record_id = $params->{record_id};
my $dont_export_fields = $params->{dont_export_fields};
my $clean = $params->{clean};
my $record;
if ( $record_type eq 'auths' ) {
$record = _get_authority_for_export( { %$params, authid => $record_id } );
} elsif ( $record_type eq 'bibs' ) {
$record = _get_biblio_for_export( { %$params, biblionumber => $record_id } );
} else {
# TODO log "record_type not supported"
return;
}
if ($dont_export_fields) {
for my $f ( split / /, $dont_export_fields ) {
if ( $f =~ m/^(\d{3})(.)?$/ ) {
my ( $field, $subfield ) = ( $1, $2 );
# skip if this record doesn't have this field
if ( defined $record->field($field) ) {
if ( defined $subfield ) {
my @tags = $record->field($field);
foreach my $t (@tags) {
$t->delete_subfields($subfield);
}
} else {
$record->delete_fields( $record->field($field) );
}
}
}
}
}
C4::Biblio::RemoveAllNsb($record) if $clean;
return $record;
}
sub _get_authority_for_export {
my ($params) = @_;
my $authid = $params->{authid} || return;
my $authority = Koha::Authority->get_from_authid($authid);
return unless $authority;
return $authority->record;
}
sub _get_biblio_for_export {
my ($params) = @_;
my $biblionumber = $params->{biblionumber};
my $itemnumbers = $params->{itemnumbers};
my $export_items = $params->{export_items} // 1;
my $only_export_items_for_branch = $params->{only_export_items_for_branch};
my $record = eval { C4::Biblio::GetMarcBiblio($biblionumber); };
return if $@ or not defined $record;
if ($export_items) {
C4::Biblio::EmbedItemsInMarcBiblio( $record, $biblionumber, $itemnumbers );
if ($only_export_items_for_branch) {
my ( $homebranchfield, $homebranchsubfield ) = GetMarcFromKohaField( 'items.homebranch', '' ); # Should be GetFrameworkCode( $biblionumber )?
for my $itemfield ( $record->field($homebranchfield) ) {
my $homebranch = $itemfield->subfield($homebranchsubfield);
if ( $only_export_items_for_branch ne $homebranch ) {
$record->delete_field($itemfield);
}
}
}
}
return $record;
}
sub export {
my ($params) = @_;
my $record_type = $params->{record_type};
my $record_ids = $params->{record_ids} || [];
my $format = $params->{format};
my $itemnumbers = $params->{itemnumbers} || []; # Does not make sense with record_type eq auths
my $export_items = $params->{export_items};
my $dont_export_fields = $params->{dont_export_fields};
my $csv_profile_id = $params->{csv_profile_id};
my $output_filepath = $params->{output_filepath};
return unless $record_type;
return unless @$record_ids;
my $fh;
if ( $output_filepath ) {
open $fh, '>', $output_filepath or die "Cannot open file $output_filepath ($!)";
select $fh;
binmode $fh, ':encoding(UTF-8)' unless $format eq 'csv';
} else {
binmode STDOUT, ':encoding(UTF-8)' unless $format eq 'csv';
}
if ( $format eq 'iso2709' ) {
for my $record_id (@$record_ids) {
my $record = _get_record_for_export( { %$params, record_id => $record_id } );
my $errorcount_on_decode = eval { scalar( MARC::File::USMARC->decode( $record->as_usmarc )->warnings() ) };
if ( $errorcount_on_decode or $@ ) {
warn $@ if $@;
warn "record (number $record_id) is invalid and therefore not exported because its reopening generates warnings above";
next;
}
print $record->as_usmarc();
}
} elsif ( $format eq 'xml' ) {
my $marcflavour = C4::Context->preference("marcflavour");
MARC::File::XML->default_record_format( ( $marcflavour eq 'UNIMARC' && $record_type eq 'auths' ) ? 'UNIMARCAUTH' : $marcflavour );
print MARC::File::XML::header();
print "\n";
for my $record_id (@$record_ids) {
my $record = _get_record_for_export( { %$params, record_id => $record_id } );
print MARC::File::XML::record($record);
print "\n";
}
print MARC::File::XML::footer();
print "\n";
} elsif ( $format eq 'csv' ) {
$csv_profile_id ||= C4::Csv::GetCsvProfileId( C4::Context->preference('ExportWithCsvProfile') );
print marc2csv( $record_ids, $csv_profile_id, $itemnumbers );
}
close $fh if $output_filepath;
}
1;
__END__
=head1 NAME
Koha::Exporter::Record - module to export records (biblios and authorities)
=head1 SYNOPSIS
This module provides a public subroutine to export records as xml, csv or iso2709.
=head2 FUNCTIONS
=head3 export
Koha::Exporter::Record::export($params);
$params is a hashref with the keys listed below.
The generated file is printed to STDOUT (or written to output_filepath, if given).
=over 4
=item record_type
Must be set to 'bibs' or 'auths'
=item record_ids
The list of the records to export (a list of biblionumber or authid)
=item format
The format must be 'csv', 'xml' or 'iso2709'.
=item itemnumbers
Generate the item infos only for these itemnumbers.
Must only be used with biblios.
=item export_items
If this flag is set, the items will be exported.
Default is ON.
=item dont_export_fields
List of fields not to export.
=item csv_profile_id
If the format is csv, a csv_profile_id can be provided to override the default value (syspref ExportWithCsvProfile).
=cut
=back
=head1 LICENSE
This file is part of Koha.
Copyright Koha Development Team
Koha is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
Koha is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with Koha; if not, see <http://www.gnu.org/licenses>.
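
The dont_export_fields parameter is documented tersely above. Judging from _get_record_for_export, it is a space-separated list of 3-digit tags, each optionally followed by one subfield code. A minimal sketch of that parsing, where parse_dont_export_fields is a hypothetical helper that is not part of the module:

```perl
use strict;
use warnings;

# Hypothetical helper: turns a spec such as "100 245a" into a list of
# { tag, subfield } hashrefs. Entries that do not look like a 3-digit tag
# with an optional single subfield code are silently ignored, which mirrors
# the regex used in _get_record_for_export.
sub parse_dont_export_fields {
    my ($spec) = @_;
    my @parsed;
    for my $f ( split / /, $spec // '' ) {
        next unless $f =~ m/^(\d{3})(.)?$/;
        push @parsed, { tag => $1, subfield => $2 };    # subfield is undef for a whole field
    }
    return @parsed;
}

my @fields = parse_dont_export_fields('100 245a bogus');
for my $f (@fields) {
    my $what = defined $f->{subfield} ? "subfield $f->{subfield} of $f->{tag}" : "whole field $f->{tag}";
    print "remove $what\n";
}
```

With a subfield code only that subfield is deleted from each matching field; without one the whole field is dropped.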

308
misc/export.pl

@ -0,0 +1,308 @@
#!/usr/bin/perl
#
# This file is part of Koha.
#
# Koha is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# Koha is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with Koha; if not, see <http://www.gnu.org/licenses>.
use Modern::Perl;
use MARC::File::XML;
use List::MoreUtils qw(uniq);
use Getopt::Long;
use Pod::Usage;
use C4::Auth;
use C4::Context;
use C4::Csv;
use C4::Record;
use Koha::Biblioitems;
use Koha::Database;
use Koha::Exporter::Record;
use Koha::DateUtils qw( dt_from_string output_pref );
my ( $output_format, $timestamp, $dont_export_items, $csv_profile_id, $deleted_barcodes, $clean, $filename, $record_type, $id_list_file, $starting_authid, $ending_authid, $authtype, $starting_biblionumber, $ending_biblionumber, $itemtype, $starting_callnumber, $ending_callnumber, $start_accession, $end_accession, $help );
GetOptions(
'format=s' => \$output_format,
'date=s' => \$timestamp,
'dont_export_items' => \$dont_export_items,
'csv_profile_id=s' => \$csv_profile_id,
'deleted_barcodes' => \$deleted_barcodes,
'clean' => \$clean,
'filename=s' => \$filename,
'record-type=s' => \$record_type,
'id_list_file=s' => \$id_list_file,
'starting_authid=s' => \$starting_authid,
'ending_authid=s' => \$ending_authid,
'authtype=s' => \$authtype,
'starting_biblionumber=s' => \$starting_biblionumber,
'ending_biblionumber=s' => \$ending_biblionumber,
'itemtype=s' => \$itemtype,
'starting_callnumber=s' => \$starting_callnumber,
'ending_callnumber=s' => \$ending_callnumber,
'start_accession=s' => \$start_accession,
'end_accession=s' => \$end_accession,
'h|help|?' => \$help
) || pod2usage(1);
if ($help) {
pod2usage(1);
}
$filename ||= 'koha.mrc';
$output_format ||= 'iso2709';
$record_type ||= 'bibs';
# Retrocompatibility for the format parameter
$output_format = 'iso2709' if $output_format eq 'marc';
if ( $timestamp and $record_type ne 'bibs' ) {
pod2usage(q|--date can only be used with biblios|);
}
if ( $record_type ne 'bibs' and $record_type ne 'auths' ) {
pod2usage(q|--record-type is not valid|);
}
if ( $deleted_barcodes and $record_type ne 'bibs' ) {
pod2usage(q|--deleted_barcodes can only be used with biblios|);
}
$start_accession = dt_from_string( $start_accession ) if $start_accession;
$end_accession = dt_from_string( $end_accession ) if $end_accession;
my $dbh = C4::Context->dbh;
# Redirect stdout
open STDOUT, '>', $filename if $filename;
my @record_ids;
$timestamp = ($timestamp) ? output_pref({ dt => dt_from_string($timestamp), dateformat => 'iso', dateonly => 1, }): '';
if ( $record_type eq 'bibs' ) {
if ( $timestamp ) {
push @record_ids, $_->{biblionumber} for @{
$dbh->selectall_arrayref(q| (
SELECT biblionumber
FROM biblioitems
LEFT JOIN items USING(biblionumber)
WHERE biblioitems.timestamp >= ?
OR items.timestamp >= ?
) UNION (
SELECT biblionumber
FROM biblioitems
LEFT JOIN deleteditems USING(biblionumber)
WHERE biblioitems.timestamp >= ?
OR deleteditems.timestamp >= ?
) |, { Slice => {} }, ( $timestamp ) x 4 );
};
} else {
my $conditions = {
( $starting_biblionumber or $ending_biblionumber )
? (
"me.biblionumber" => {
( $starting_biblionumber ? ( '>=' => $starting_biblionumber ) : () ),
( $ending_biblionumber ? ( '<=' => $ending_biblionumber ) : () ),
}
)
: (),
( $starting_callnumber or $ending_callnumber )
? (
callnumber => {
( $starting_callnumber ? ( '>=' => $starting_callnumber ) : () ),
( $ending_callnumber ? ( '<=' => $ending_callnumber ) : () ),
}
)
: (),
( $start_accession or $end_accession )
? (
dateaccessioned => {
( $start_accession ? ( '>=' => $start_accession ) : () ),
( $end_accession ? ( '<=' => $end_accession ) : () ),
}
)
: (),
( $itemtype
?
C4::Context->preference('item-level_itypes')
? ( 'items.itype' => $itemtype )
: ( 'biblioitems.itemtype' => $itemtype )
: ()
),
};
my $biblioitems = Koha::Biblioitems->search( $conditions, { join => 'items' } );
while ( my $biblioitem = $biblioitems->next ) {
push @record_ids, $biblioitem->biblionumber;
}
}
}
elsif ( $record_type eq 'auths' ) {
my $conditions = {
( $starting_authid or $ending_authid )
? (
authid => {
( $starting_authid ? ( '>=' => $starting_authid ) : () ),
( $ending_authid ? ( '<=' => $ending_authid ) : () ),
}
)
: (),
( $authtype ? ( authtypecode => $authtype ) : () ),
};
# Koha::Authority is not a Koha::Object...
my $authorities = Koha::Database->new->schema->resultset('AuthHeader')->search( $conditions );
@record_ids = map { $_->authid } $authorities->all;
}
@record_ids = uniq @record_ids;
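if ( @record_ids and $id_list_file ) {
# Read the filter ids from the file given with --id_list_file
open my $id_list_fh, '<', $id_list_file or die "Cannot open file $id_list_file ($!)";
my @filter_record_ids = <$id_list_fh>;
close $id_list_fh;
# Strip line endings and return the cleaned id (not the result of s///)
@filter_record_ids = map { my $id = $_; $id =~ s/[\r\n]*$//; $id } @filter_record_ids;
# intersection
my %record_ids = map { $_ => 1 } @record_ids;
@record_ids = grep $record_ids{$_}, @filter_record_ids;
}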
if ($deleted_barcodes) {
for my $record_id ( @record_ids ) {
my $barcode = $dbh->selectall_arrayref(q|
SELECT DISTINCT barcode
FROM deleteditems
WHERE deleteditems.biblionumber = ?
|, { Slice => {} }, $record_id );
say $_->{barcode} for @$barcode;
}
}
else {
Koha::Exporter::Record::export(
{ record_type => $record_type,
record_ids => \@record_ids,
format => $output_format,
csv_profile_id => ( $csv_profile_id || GetCsvProfileId( C4::Context->preference('ExportWithCsvProfile') ) || undef ),
export_items => (not $dont_export_items),
clean => $clean || 0,
}
);
}
exit;
=head1 NAME
export records - This script exports records (biblios or authorities)
=head1 SYNOPSIS
export.pl [-h|--help] [--format=format] [--date=date] [--record-type=TYPE] [--dont_export_items] [--deleted_barcodes] [--clean] [--id_list_file=PATH] --filename=outputfile
=head1 OPTIONS
=over
=item B<-h|--help>
Print a brief help message.
=item B<--format>
--format=FORMAT FORMAT is either 'xml', 'csv' or 'marc' (default).
=item B<--date>
--date=DATE DATE should be entered as the 'dateformat' syspref is
set (dd/mm/yyyy for metric, yyyy-mm-dd for iso,
mm/dd/yyyy for us) records exported are the ones that
have been modified since DATE.
=item B<--record-type>
--record-type=TYPE TYPE is 'bibs' or 'auths'.
=item B<--dont_export_items>
--dont_export_items If enabled, the item infos won't be exported.
=item B<--csv_profile_id>
--csv_profile_id=ID Generate a CSV file with the given CSV profile id (see tools/csv-profiles.pl)
Unless provided, the one defined in the system preference 'ExportWithCsvProfile' will be used.
=item B<--deleted_barcodes>
--deleted_barcodes If used, a list of barcodes of items deleted since DATE
is produced (or from all deleted items if no date is
specified). Used only if TYPE is 'bibs'.
=item B<--clean>
--clean removes NSE/NSB.
=item B<--id_list_file>
--id_list_file=PATH PATH is a path to a file containing a list of
IDs (biblionumber or authid) with one ID per line.
This list works as a filter; it is compatible with
other parameters for selecting records.
=item B<--filename>
--filename=FILENAME FILENAME used to export the data.
=item B<--starting_authid>
=item B<--ending_authid>
=item B<--authtype>
=item B<--starting_biblionumber>
=item B<--ending_biblionumber>
=item B<--itemtype>
=item B<--starting_callnumber>
=item B<--ending_callnumber>
=item B<--start_accession>
=item B<--end_accession>
=back
=head1 AUTHOR
Koha Development Team
=head1 COPYRIGHT
Copyright Koha Team
=head1 LICENSE
This file is part of Koha.
Koha is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software
Foundation; either version 3 of the License, or (at your option) any later version.
You should have received a copy of the GNU General Public License along
with Koha; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
=cut
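
The optional range filters in this script rely on a Perl idiom: a ternary that yields either a key/value pair or an empty list inside a hash constructor, so a bound that was not given leaves no trace in the condition. A standalone sketch, where build_range_conditions is a hypothetical helper (the script builds the hashref inline) and bounds are tested with defined rather than for truth:

```perl
use strict;
use warnings;

# Hypothetical helper: builds a condition hashref in the style accepted by
# DBIx::Class search(). A column only appears when at least one of its
# bounds was supplied.
sub build_range_conditions {
    my (%ranges) = @_;    # column => [ $start, $end ], either bound may be undef
    my %conditions;
    for my $column ( sort keys %ranges ) {
        my ( $start, $end ) = @{ $ranges{$column} };
        next unless defined $start or defined $end;
        $conditions{$column} = {
            ( defined $start ? ( '>=' => $start ) : () ),
            ( defined $end   ? ( '<=' => $end )   : () ),
        };
    }
    return \%conditions;
}

my $conditions = build_range_conditions(
    'me.biblionumber' => [ 100,   200 ],
    callnumber        => [ 'A',   undef ],
    dateaccessioned   => [ undef, undef ],
);

for my $column ( sort keys %$conditions ) {
    my $ops = $conditions->{$column};
    print "$column: ", join( ', ', map { "$_ $ops->{$_}" } sort keys %$ops ), "\n";
}
```

One difference is worth noting: the script's truth test treats a bound of 0 as absent, while this sketch keeps it.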

166
t/db_dependent/Exporter/Record.t

@ -0,0 +1,166 @@
use Modern::Perl;
use Test::More tests => 3;
use MARC::Record;
use MARC::File::USMARC;
use MARC::File::XML;# ( BinaryEncoding => 'utf-8' );
#use XML::Simple;
use MARC::Batch;
use t::lib::TestBuilder;
use File::Slurp;
#use utf8;
use Encode;
use C4::Biblio;
use C4::Context;
use Koha::Exporter::Record;
my $dbh = C4::Context->dbh;
#$dbh->{AutoCommit} = 0;
#$dbh->{RaiseError} = 1;
#$dbh->do(q|DELETE FROM issues|);
#$dbh->do(q|DELETE FROM reserves|);
#$dbh->do(q|DELETE FROM items|);
#$dbh->do(q|DELETE FROM biblio|);
#$dbh->do(q|DELETE FROM auth_header|);
my $biblio_1_title = 'Silence in the library';
#my $biblio_2_title = Encode::encode('UTF-8', 'The art of computer programming ກ ຂ ຄ ງ ຈ ຊ ຍ é');
my $biblio_2_title = 'The art of computer programming ກ ຂ ຄ ງ ຈ ຊ ຍ é';
my $biblio_1 = MARC::Record->new();
$biblio_1->leader('00266nam a22001097a 4500');
$biblio_1->append_fields(
MARC::Field->new('100', ' ', ' ', a => 'Moffat, Steven'),
MARC::Field->new('245', ' ', ' ', a => $biblio_1_title),
);
my ($biblionumber_1, $biblioitemnumber_1) = AddBiblio($biblio_1, '');
my $biblio_2 = MARC::Record->new();
$biblio_2->leader('00266nam a22001097a 4500');
$biblio_2->append_fields(
MARC::Field->new('100', ' ', ' ', a => 'Knuth, Donald Ervin'),
MARC::Field->new('245', ' ', ' ', a => $biblio_2_title),
);
my ($biblionumber_2, $biblioitemnumber_2) = AddBiblio($biblio_2, '');
my $builder = t::lib::TestBuilder->new;
my $item_1_1 = $builder->build({
source => 'Item',
value => {
biblionumber => $biblionumber_1,
more_subfields_xml => '',
}
});
my $item_1_2 = $builder->build({
source => 'Item',
value => {
biblionumber => $biblionumber_1,
more_subfields_xml => '',
}
});
my $item_2_1 = $builder->build({
source => 'Item',
value => {
biblionumber => $biblionumber_2,
more_subfields_xml => '',
}
});
subtest 'export csv' => sub {
plan tests => 2;
my $csv_content = q{Title=245$a|Barcode=952$p};
$dbh->do(q|INSERT INTO export_format(profile, description, content, csv_separator, field_separator, subfield_separator, encoding, type) VALUES (?, ?, ?, ?, ?, ?, ?, ?)|, {}, "TEST_PROFILE_Records.t", "my useless desc", $csv_content, '|', ';', ',', 'utf8', 'marc');
my $csv_profile_id = $dbh->last_insert_id( undef, undef, 'export_format', undef );
my $generated_csv_file = '/tmp/test_export_1.csv';
# Get all item infos
Koha::Exporter::Record::export(
{
record_type => 'bibs',
record_ids => [ $biblionumber_1, $biblionumber_2 ],
format => 'csv',
csv_profile_id => $csv_profile_id,
output_filepath => $generated_csv_file,
}
);
my $expected_csv = <<EOF;
Title|Barcode
"$biblio_1_title"|$item_1_1->{barcode},$item_1_2->{barcode}
"$biblio_2_title"|$item_2_1->{barcode}
EOF
my $generated_csv_content = read_file( $generated_csv_file );
is( $generated_csv_content, $expected_csv, "Export CSV: All item's infos should have been retrieved" );
$generated_csv_file = '/tmp/test_export.csv';
# Get only 1 item info
Koha::Exporter::Record::export(
{
record_type => 'bibs',
record_ids => [ $biblionumber_1, $biblionumber_2 ],
itemnumbers => [ $item_1_1->{itemnumber}, $item_2_1->{itemnumber} ],
format => 'csv',
csv_profile_id => $csv_profile_id,
output_filepath => $generated_csv_file,
}
);
$expected_csv = <<EOF;
Title|Barcode
"$biblio_1_title"|$item_1_1->{barcode}
"$biblio_2_title"|$item_2_1->{barcode}
EOF
$generated_csv_content = read_file( $generated_csv_file );
is( $generated_csv_content, $expected_csv, "Export CSV: Only 1 item info should have been retrieved" );
};
subtest 'export xml' => sub {
plan tests => 2;
my $generated_xml_file = '/tmp/test_export.xml';
Koha::Exporter::Record::export(
{
record_type => 'bibs',
record_ids => [ $biblionumber_1, $biblionumber_2 ],
format => 'xml',
output_filepath => $generated_xml_file,
}
);
my $generated_xml_content = read_file( $generated_xml_file );
$MARC::File::XML::_load_args{BinaryEncoding} = 'utf-8';
open my $fh, '<', $generated_xml_file;
my $records = MARC::Batch->new( 'XML', $fh );
my @records;
# The following statement produces
# Use of uninitialized value in concatenation (.) or string at /usr/share/perl5/MARC/File/XML.pm line 398, <$fh> chunk 5.
# Why?
while ( my $record = $records->next ) {
push @records, $record;
}
is( scalar( @records ), 2, 'Export XML: 2 records should have been exported' );
my $second_record = $records[1];
my $title = $second_record->subfield(245, 'a');
$title = Encode::encode('UTF-8', $title);
is( $title, $biblio_2_title, 'Export XML: The title is correctly encoded' );
};
subtest 'export iso2709' => sub {
plan tests => 2;
my $generated_mrc_file = '/tmp/test_export.mrc';
# Get all item infos
Koha::Exporter::Record::export(
{
record_type => 'bibs',
record_ids => [ $biblionumber_1, $biblionumber_2 ],
format => 'iso2709',
output_filepath => $generated_mrc_file,
}
);
my $records = MARC::File::USMARC->in( $generated_mrc_file );
my @records;
while ( my $record = $records->next ) {
push @records, $record;
}
is( scalar( @records ), 2, 'Export ISO2709: 2 records should have been exported' );
my $second_record = $records[1];
my $title = $second_record->subfield(245, 'a');
$title = Encode::encode('UTF-8', $title);
is( $title, $biblio_2_title, 'Export ISO2709: The title is correctly encoded' );
};

717
tools/export.pl

@ -17,104 +17,37 @@
# along with Koha; if not, see <http://www.gnu.org/licenses>.
use Modern::Perl;
use CGI qw ( -utf8 );
use MARC::File::XML;
use List::MoreUtils qw(uniq);
use Getopt::Long;
use CGI qw ( -utf8 );
use C4::Auth;
use C4::AuthoritiesMarc; # GetAuthority
use C4::Biblio; # GetMarcBiblio
use C4::Branch; # GetBranches
use C4::Csv;
use C4::Koha; # GetItemTypes
use C4::Output;
use C4::Record;
use Koha::DateUtils;
my $query = new CGI;
my $clean;
my $dont_export_items;
my $deleted_barcodes;
my $timestamp;
my $record_type;
my $id_list_file;
my $help;
my $op = $query->param("op") || '';
my $filename = $query->param("filename") || 'koha.mrc';
my $dbh = C4::Context->dbh;
my $marcflavour = C4::Context->preference("marcflavour");
my $output_format = $query->param("format") || $query->param("output_format") || 'iso2709';
# Checks if the script is called from commandline
my $commandline = not defined $ENV{GATEWAY_INTERFACE};
# @biblionumbers is only use for csv export from circulation.pl
my @biblionumbers = uniq $query->param("biblionumbers");
if ( $commandline ) {
# Getting parameters
$op = 'export';
GetOptions(
'format=s' => \$output_format,
'date=s' => \$timestamp,
'dont_export_items' => \$dont_export_items,
'deleted_barcodes' => \$deleted_barcodes,
'clean' => \$clean,
'filename=s' => \$filename,
'record-type=s' => \$record_type,
'id_list_file=s' => \$id_list_file,
'help|?' => \$help
);
if ($help) {
print <<_USAGE_;
export.pl [--format=format] [--date=date] [--record-type=TYPE] [--dont_export_items] [--deleted_barcodes] [--clean] [--id_list_file=PATH] --filename=outputfile
use Koha::Biblioitems;
use Koha::Database;
use Koha::DateUtils qw( dt_from_string output_pref );
use Koha::Exporter::Record;
--format=FORMAT FORMAT is either 'xml' or 'marc' (default)
--date=DATE DATE should be entered as the 'dateformat' syspref is
set (dd/mm/yyyy for metric, yyyy-mm-dd for iso,
mm/dd/yyyy for us) records exported are the ones that
have been modified since DATE
--record-type=TYPE TYPE is 'bibs' or 'auths'
--deleted_barcodes If used, a list of barcodes of items deleted since DATE
is produced (or from all deleted items if no date is
specified). Used only if TYPE is 'bibs'
--clean removes NSE/NSB
--id_list_file=PATH PATH is a path to a file containing a list of
IDs (biblionumber or authid) with one ID per line.
This list works as a filter; it is compatible with
other parameters for selecting records
_USAGE_
exit;
}
# Default parameters values :
$timestamp ||= '';
$dont_export_items ||= 0;
$deleted_barcodes ||= 0;
$clean ||= 0;
$record_type ||= "bibs";
$id_list_file ||= 0;
# Redirect stdout
open STDOUT, '>', $filename if $filename;
}
else {
$op = $query->param("op") || '';
$filename = $query->param("filename") || 'koha.mrc';
$filename =~ s/(\r|\n)//;
my $query = new CGI;
my $dont_export_items = $query->param("dont_export_item") || 0;
my $record_type = $query->param("record_type");
my $op = $query->param("op") || '';
my $output_format = $query->param("format") || $query->param("output_format") || 'iso2709';
my $backupdir = C4::Context->config('backupdir');
my $filename = $query->param("filename") || 'koha.mrc';
$filename =~ s/(\r|\n)//;
my $dbh = C4::Context->dbh;
my @record_ids;
# biblionumbers is sent from circulation.pl only
if ( $query->param("biblionumbers") ) {
$record_type = 'bibs';
@record_ids = $query->param("biblionumbers");
}
# Default value for output_format is 'iso2709'
@ -127,358 +60,198 @@ my ( $template, $loggedinuser, $cookie, $flags ) = get_template_and_user(
template_name => "tools/export.tt",
query => $query,
type => "intranet",
authnotrequired => $commandline,
authnotrequired => 0,
flagsrequired => { tools => 'export_catalog' },
debug => 1,
}
);
my $limit_ind_branch =
( C4::Context->preference('IndependentBranches')
&& C4::Context->userenv
&& !C4::Context->IsSuperLibrarian()
&& C4::Context->userenv->{branch} ) ? 1 : 0;
my @branch = $query->param("branch");
if ( C4::Context->preference("IndependentBranches")
&& C4::Context->userenv
&& !C4::Context->IsSuperLibrarian() )
{
my $only_my_branch;
# Limit to local branch if IndependentBranches and not superlibrarian
if (
(
C4::Context->preference('IndependentBranches')
&& C4::Context->userenv
&& !C4::Context->IsSuperLibrarian()
&& C4::Context->userenv->{branch}
)
# Limit result to local branch strip_nonlocal_items
or $query->param('strip_nonlocal_items')
) {
$only_my_branch = 1;
@branch = ( C4::Context->userenv->{'branch'} );
}
# if stripping nonlocal items, use loggedinuser's branch
my $localbranch = C4::Context->userenv ? C4::Context->userenv->{'branch'} : undef;
my %branchmap = map { $_ => 1 } @branch; # for quick lookups
my $backupdir = C4::Context->config('backupdir');
if ( $op eq "export" ) {
if (
$output_format eq "iso2709"
or $output_format eq "xml"
or (
$output_format eq 'csv'
and not @biblionumbers
)
) {
my $charset = 'utf-8';
my $mimetype = 'application/octet-stream';
binmode STDOUT, ':encoding(UTF-8)'
if $filename =~ m/\.gz$/
or $filename =~ m/\.bz2$/
or $output_format ne 'csv';
if ( $filename =~ m/\.gz$/ ) {
$mimetype = 'application/x-gzip';
$charset = '';
binmode STDOUT;
}
elsif ( $filename =~ m/\.bz2$/ ) {
$mimetype = 'application/x-bzip2';
binmode STDOUT;
$charset = '';
}
print $query->header(
-type => $mimetype,
-charset => $charset,
-attachment => $filename,
) unless ($commandline);
$record_type = $query->param("record_type") unless ($commandline);
my $export_remove_fields = $query->param("export_remove_fields");
my @biblionumbers = $query->param("biblionumbers");
my @itemnumbers = $query->param("itemnumbers");
my @sql_params;
my $sql_query;
my @recordids;
my $StartingBiblionumber = $query->param("StartingBiblionumber");
my $EndingBiblionumber = $query->param("EndingBiblionumber");
my $itemtype = $query->param("itemtype");
my $start_callnumber = $query->param("start_callnumber");
my $end_callnumber = $query->param("end_callnumber");
if ( $commandline ) {
$timestamp = eval { output_pref( { dt => dt_from_string( $timestamp ), dateonly => 1 }); };
$timestamp = '' unless ( $timestamp );
}
my $start_accession =
( $query->param("start_accession") )
? eval { output_pref( { dt => dt_from_string( $query->param("start_accession") ), dateonly => 1, dateformat => 'iso' } ); }
: '';
my $end_accession =
( $query->param("end_accession") )
? eval { output_pref( { dt => dt_from_string( $query->param("end_accession") ), dateonly => 1, dateformat => 'iso' } ); }
: '';
$dont_export_items = $query->param("dont_export_item")
unless ($commandline);
my $strip_nonlocal_items = $query->param("strip_nonlocal_items");
my $biblioitemstable =
( $commandline and $deleted_barcodes )
? 'deletedbiblioitems'
: 'biblioitems';
my $itemstable =
( $commandline and $deleted_barcodes )
? 'deleteditems'
: 'items';
my $starting_authid = $query->param('starting_authid');
my $ending_authid = $query->param('ending_authid');
my $authtype = $query->param('authtype');
my $filefh;
if ($commandline) {
open $filefh,"<", $id_list_file or die "cannot open $id_list_file: $!" if $id_list_file;
} else {
$filefh = $query->upload("id_list_file");
}
my %id_filter;
if ($filefh) {
while (my $number=<$filefh>){
$number=~s/[\r\n]*$//;
$id_filter{$number}=1 if $number=~/^\d+$/;
my $export_remove_fields = $query->param("export_remove_fields") || q||;
my @biblionumbers = $query->param("biblionumbers");
my @itemnumbers = $query->param("itemnumbers");
my @sql_params;
my $sql_query;
if ( $record_type eq 'bibs' or $record_type eq 'auths' ) {
# No need to retrieve the record_ids if we already get them
unless ( @record_ids ) {
if ( $record_type eq 'bibs' ) {
my $starting_biblionumber = $query->param("StartingBiblionumber");
my $ending_biblionumber = $query->param("EndingBiblionumber");
my $itemtype = $query->param("itemtype");
my $start_callnumber = $query->param("start_callnumber");
my $end_callnumber = $query->param("end_callnumber");
my $start_accession =
( $query->param("start_accession") )
? dt_from_string( $query->param("start_accession") )
: '';
my $end_accession =
( $query->param("end_accession") )
? dt_from_string( $query->param("end_accession") )
: '';
my $conditions = {
( $starting_biblionumber or $ending_biblionumber )
? (
"me.biblionumber" => {
( $starting_biblionumber ? ( '>=' => $starting_biblionumber ) : () ),
( $ending_biblionumber ? ( '<=' => $ending_biblionumber ) : () ),
}
)
: (),
( $start_callnumber or $end_callnumber )
? (
callnumber => {
( $start_callnumber ? ( '>=' => $start_callnumber ) : () ),
( $end_callnumber ? ( '<=' => $end_callnumber ) : () ),
}
)
: (),
( $start_accession or $end_accession )
? (
dateaccessioned => {
( $start_accession ? ( '>=' => $start_accession ) : () ),
( $end_accession ? ( '<=' => $end_accession ) : () ),
}
)
: (),
( @branch ? ( 'items.homebranch' => { in => \@branch } ) : () ),
( $itemtype
?
C4::Context->preference('item-level_itypes')
? ( 'items.itype' => $itemtype )
: ( 'biblioitems.itemtype' => $itemtype )
: ()
),
};
my $biblioitems = Koha::Biblioitems->search( $conditions, { join => 'items' } );
while ( my $biblioitem = $biblioitems->next ) {
push @record_ids, $biblioitem->biblionumber;
}
}
elsif ( $record_type eq 'auths' ) {
my $starting_authid = $query->param('starting_authid');
my $ending_authid = $query->param('ending_authid');
my $authtype = $query->param('authtype');
my $conditions = {
( $starting_authid or $ending_authid )
? (
authid => {
( $starting_authid ? ( '>=' => $starting_authid ) : () ),
( $ending_authid ? ( '<=' => $ending_authid ) : () ),
}
)
: (),
( $authtype ? ( authtypecode => $authtype ) : () ),
};
# Koha::Authority is not a Koha::Object...
my $authorities = Koha::Database->new->schema->resultset('AuthHeader')->search( $conditions );
@record_ids = map { $_->authid } $authorities->all;
}
}
if ( $record_type eq 'bibs' and not @biblionumbers ) {
if ($timestamp) {
# Specific query when timestamp is used
# Actually it's used only with CLI and so all previous filters
# are not used.
# If one day timestamp is used via the web interface, this part will
# certainly have to be rewritten
my ( $query, $params ) = construct_query(
{
recordtype => $record_type,
timestamp => $timestamp,
biblioitemstable => $biblioitemstable,
}
);
$sql_query = $query;
@sql_params = @$params;
@record_ids = uniq @record_ids;
if ( @record_ids and my $filefh = $query->upload("id_list_file") ) {
my @filter_record_ids = <$filefh>;
# Strip trailing line endings; the block must return $id, not the result of s///
@filter_record_ids = map { my $id = $_; $id =~ s/[\r\n]*$//; $id } @filter_record_ids;
# intersection
my %record_ids = map { $_ => 1 } @record_ids;
@record_ids = grep $record_ids{$_}, @filter_record_ids;
}
print CGI->new->header(
-type => 'application/octet-stream',
-charset => 'utf-8',
-attachment => $filename,
);
Koha::Exporter::Record::export(
{ record_type => $record_type,
record_ids => \@record_ids,
format => $output_format,
filename => $filename,
itemnumbers => \@itemnumbers,
dont_export_fields => $export_remove_fields,
csv_profile_id => ( $query->param('csv_profile_id') || GetCsvProfileId( C4::Context->preference('ExportWithCsvProfile') ) || undef ),
export_items => (not $dont_export_items),
}
else {
my ( $query, $params ) = construct_query(
{
recordtype => $record_type,
biblioitemstable => $biblioitemstable,
itemstable => $itemstable,
StartingBiblionumber => $StartingBiblionumber,
EndingBiblionumber => $EndingBiblionumber,
branch => \@branch,
start_callnumber => $start_callnumber,
end_callnumber => $end_callnumber,
start_accession => $start_accession,
end_accession => $end_accession,
itemtype => $itemtype,
}
);
$sql_query = $query;
@sql_params = @$params;
);
}
elsif ( $record_type eq 'db' or $record_type eq 'conf' ) {
my $successful_export;
if ( $flags->{superlibrarian}
and (
$record_type eq 'db' and C4::Context->config('backup_db_via_tools')
or
$record_type eq 'conf' and C4::Context->config('backup_conf_via_tools')
)
) {
binmode STDOUT, ':encoding(UTF-8)';
my $charset = 'utf-8';
my $mimetype = 'application/octet-stream';
if ( $filename =~ m/\.gz$/ ) {
$mimetype = 'application/x-gzip';
$charset = '';
binmode STDOUT;
}
}
elsif ( $record_type eq 'auths' ) {
my ( $query, $params ) = construct_query(
elsif ( $filename =~ m/\.bz2$/ ) {
$mimetype = 'application/x-bzip2';
binmode STDOUT;
$charset = '';
}
print $query->header(
-type => $mimetype,
-charset => $charset,
-attachment => $filename,
);
my $extension = $record_type eq 'db' ? 'sql' : 'tar';
$successful_export = download_backup(
{
recordtype => $record_type,
starting_authid => $starting_authid,
ending_authid => $ending_authid,
authtype => $authtype,
directory => $backupdir,
extension => $extension,
filename => $filename,
}
);
$sql_query = $query;
@sql_params = @$params;
}
elsif ( $record_type eq 'db' ) {
my $successful_export;
if ( $flags->{superlibrarian}
&& C4::Context->config('backup_db_via_tools') )
{
$successful_export = download_backup(
{
directory => "$backupdir",
extension => 'sql',
filename => "$filename"
}
);
}
unless ($successful_export) {
my $remotehost = $query->remote_host();
$remotehost =~ s/(\n|\r)//;
warn
"A suspicious attempt was made to download the db at '$filename' by someone at "
. $remotehost . "\n";
}
exit;
}
elsif ( $record_type eq 'conf' ) {
my $successful_export;
if ( $flags->{superlibrarian}
&& C4::Context->config('backup_conf_via_tools') )
{
$successful_export = download_backup(
{
directory => "$backupdir",
extension => 'tar',
filename => "$filename"
}
);
}
unless ($successful_export) {
my $remotehost = $query->remote_host();
$remotehost =~ s/(\n|\r)//;
warn
"A suspicious attempt was made to download the configuration at '$filename' by someone at "
"A suspicious attempt was made to download the " . ( $record_type eq 'db' ? 'db' : 'configuration' ) . " at '$filename' by someone at "
. $remotehost . "\n";
}
exit;
}
elsif (@biblionumbers) {
push @recordids, (@biblionumbers);
}
else {
# Someone is trying to mess us up
exit;
}
unless (@biblionumbers) {
my $sth = $dbh->prepare($sql_query);
$sth->execute(@sql_params);
push @recordids, map {
map { $$_[0] } $_
} @{ $sth->fetchall_arrayref };
@recordids = grep { exists($id_filter{$_}) } @recordids if scalar(%id_filter);
}
my $xml_header_written = 0;
for my $recordid ( uniq @recordids ) {
if ($deleted_barcodes) {
my $q = "
SELECT DISTINCT barcode
FROM deleteditems
WHERE deleteditems.biblionumber = ?
";
my $sth = $dbh->prepare($q);
$sth->execute($recordid);
while ( my $row = $sth->fetchrow_array ) {
print "$row\n";
}
}
else {
my $record;
if ( $record_type eq 'bibs' ) {
$record = eval { GetMarcBiblio($recordid); };
next if $@;
next if not defined $record;
C4::Biblio::EmbedItemsInMarcBiblio( $record, $recordid,
\@itemnumbers )
unless $dont_export_items;
if ( $strip_nonlocal_items
|| $limit_ind_branch
|| $dont_export_items )
{
my ( $homebranchfield, $homebranchsubfield ) =
GetMarcFromKohaField( 'items.homebranch', '' );
for my $itemfield ( $record->field($homebranchfield) ) {
$record->delete_field($itemfield)
if ( $dont_export_items
|| $localbranch ne $itemfield->subfield(
$homebranchsubfield) );
}
}
}
elsif ( $record_type eq 'auths' ) {
$record = C4::AuthoritiesMarc::GetAuthority($recordid);
next if not defined $record;
}
if ($export_remove_fields) {
for my $f ( split / /, $export_remove_fields ) {
if ( $f =~ m/^(\d{3})(.)?$/ ) {
my ( $field, $subfield ) = ( $1, $2 );
# skip if this record doesn't have this field
if ( defined $record->field($field) ) {
if ( defined $subfield ) {
my @tags = $record->field($field);
foreach my $t (@tags) {
$t->delete_subfields($subfield);
}
}
else {
$record->delete_fields($record->field($field));
}
}
}
}
}
RemoveAllNsb($record) if ($clean);
if ( $output_format eq "xml" ) {
unless ($xml_header_written) {
MARC::File::XML->default_record_format(
(
$marcflavour eq 'UNIMARC'
&& $record_type eq 'auths'
) ? 'UNIMARCAUTH' : $marcflavour
);
print MARC::File::XML::header();
print "\n";
$xml_header_written = 1;
}
print MARC::File::XML::record($record);
print "\n";
}
elsif ( $output_format eq 'iso2709' ) {
my $errorcount_on_decode = eval { scalar(MARC::File::USMARC->decode( $record->as_usmarc )->warnings()) };
if ($errorcount_on_decode or $@){
warn $@ if $@;
warn "record (number $recordid) is invalid and therefore not exported because its reopening generates warnings above";
next;
}
print $record->as_usmarc();
}
}
}
if ($xml_header_written) {
print MARC::File::XML::footer();
print "\n";
}
if ( $output_format eq 'csv' ) {
my $csv_profile_id = $query->param('csv_profile')
|| GetCsvProfileId( C4::Context->preference('ExportWithCsvProfile') );
my $output =
marc2csv( \@recordids,
$csv_profile_id );
print $output;
}
exit;
}
elsif ( $output_format eq "csv" ) {
my @biblionumbers = uniq $query->param("biblionumbers");
my @itemnumbers = $query->param("itemnumbers");
my $csv_profile_id = $query->param('csv_profile') || GetCsvProfileId( C4::Context->preference('ExportWithCsvProfile') );
my $output =
marc2csv( \@biblionumbers,
$csv_profile_id,
\@itemnumbers, );
print $query->header(
-type => 'application/octet-stream',
-'Content-Transfer-Encoding' => 'binary',
-attachment => "export.csv"
);
print $output;
exit;
}
} # if export
exit;
}
else {
@@ -491,7 +264,7 @@ else {
);
push @itemtypesloop, \%row;
}
my $branches = GetBranches($limit_ind_branch);
my $branches = GetBranches($only_my_branch);
my @branchloop;
for my $thisbranch (
sort { $branches->{$a}->{branchname} cmp $branches->{$b}->{branchname} }
@@ -548,128 +321,6 @@ else {
output_html_with_http_headers $query, $cookie, $template->output;
}
sub construct_query {
my ($params) = @_;
my ( $sql_query, @sql_params );
if ( $params->{recordtype} eq "bibs" ) {
if ( $params->{timestamp} ) {
my $biblioitemstable = $params->{biblioitemstable};
$sql_query = " (
SELECT biblionumber
FROM $biblioitemstable
LEFT JOIN items USING(biblionumber)
WHERE $biblioitemstable.timestamp >= ?
OR items.timestamp >= ?
) UNION (
SELECT biblionumber
FROM $biblioitemstable
LEFT JOIN deleteditems USING(biblionumber)
WHERE $biblioitemstable.timestamp >= ?
OR deleteditems.timestamp >= ?
) ";
my $ts = eval { output_pref( { dt => dt_from_string( $timestamp ), dateonly => 1, dateformat => 'iso' }); };
@sql_params = ( $ts, $ts, $ts, $ts );
}
else {
my $biblioitemstable = $params->{biblioitemstable};
my $itemstable = $params->{itemstable};
my $StartingBiblionumber = $params->{StartingBiblionumber};
my $EndingBiblionumber = $params->{EndingBiblionumber};
my @branch = @{ $params->{branch} };
my $start_callnumber = $params->{start_callnumber};
my $end_callnumber = $params->{end_callnumber};
my $start_accession = $params->{start_accession};
my $end_accession = $params->{end_accession};
my $itemtype = $params->{itemtype};
my $items_filter =
@branch
|| $start_callnumber
|| $end_callnumber
|| $start_accession
|| $end_accession
|| ( $itemtype && C4::Context->preference('item-level_itypes') );
$sql_query = $items_filter
? "SELECT DISTINCT $biblioitemstable.biblionumber
FROM $biblioitemstable JOIN $itemstable
USING (biblionumber) WHERE 1"
: "SELECT $biblioitemstable.biblionumber FROM $biblioitemstable WHERE biblionumber >0 ";
if ($StartingBiblionumber) {
$sql_query .= " AND $biblioitemstable.biblionumber >= ? ";
push @sql_params, $StartingBiblionumber;
}
if ($EndingBiblionumber) {
$sql_query .= " AND $biblioitemstable.biblionumber <= ? ";
push @sql_params, $EndingBiblionumber;
}
if (@branch) {
$sql_query .= " AND homebranch IN (".join(',',map({'?'} @branch)).")";
push @sql_params, @branch;
}
if ($start_callnumber) {
$sql_query .= " AND itemcallnumber >= ? ";
push @sql_params, $start_callnumber;
}
if ($end_callnumber) {
$sql_query .= " AND itemcallnumber <= ? ";
push @sql_params, $end_callnumber;
}
if ($start_accession) {
$sql_query .= " AND dateaccessioned >= ? ";
push @sql_params, $start_accession;
}
if ($end_accession) {
$sql_query .= " AND dateaccessioned <= ? ";
push @sql_params, $end_accession;
}
if ($itemtype) {
$sql_query .=
( C4::Context->preference('item-level_itypes') )
? " AND items.itype = ? "
: " AND biblioitems.itemtype = ?";
push @sql_params, $itemtype;
}
}
}
elsif ( $params->{recordtype} eq "auths" ) {
if ( $params->{timestamp} ) {
#TODO
}
else {
my $starting_authid = $params->{starting_authid};
my $ending_authid = $params->{ending_authid};
my $authtype = $params->{authtype};
$sql_query =
"SELECT DISTINCT auth_header.authid FROM auth_header WHERE 1";
if ($starting_authid) {
$sql_query .= " AND auth_header.authid >= ? ";
push @sql_params, $starting_authid;
}
if ($ending_authid) {
$sql_query .= " AND auth_header.authid <= ? ";
push @sql_params, $ending_authid;
}
if ($authtype) {
$sql_query .= " AND auth_header.authtypecode = ? ";
push @sql_params, $authtype;
}
}
}
return ( $sql_query, \@sql_params );
}
sub getbackupfilelist {
my $args = shift;
my $directory = $args->{directory};
