Koha/Koha/MetadataRecord.pm
Aleisha Amohia 39b17d0526
Bug 30358: Strip leading/trailing whitespace characters from input fields when cataloguing
This enhancement adds a system preference StripWhitespaceChars which,
when enabled, will strip leading and trailing whitespace characters from
all fields when cataloguing both bibliographic records and authority
records. Whitespace characters that will be stripped are:
- spaces
- newlines
- carriage returns
- tabs

To test:
1. Apply patch and install database updates
2. Go to Administration, system preferences, find the new
StripWhitespaceChars preference. It should be "Don't strip" by default.
Change it to "Strip".
3. Search for a biblio record and edit it. Put some leading or trailing
whitespace characters in input fields and textarea fields and save.
4. Confirm these characters are removed when you save the record.
5. Repeat steps 3 and 4 for authority records.
6. Confirm the tests in t/db_dependent/Biblio/ModBiblioMarc.t pass.

Sponsored-by: Educational Services Australia SCIS

Signed-off-by: David Nind <david@davidnind.com>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>

Bug 30358: (follow-up) Also strip inner newlines

This patch amends the StripWhitespaceChars system preference to also
strip inner newlines (line breaks and carriage returns) when enabled.

Signed-off-by: David Nind <david@davidnind.com>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>

Bug 30358: (follow-up) Inner newlines should be replaced with a space

Signed-off-by: David Nind <david@davidnind.com>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>

Bug 30358: (follow-up) Fixing tests and including for inner newlines

Signed-off-by: David Nind <david@davidnind.com>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>

Bug 30358: (follow-up) Clarify syspref wording about fields affected

Signed-off-by: David Nind <david@davidnind.com>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>

Bug 30358: (follow-up) Consider field has multiple subfields of same key

To test:

1) Click the clone subfield button to create multiple subfields with the
same key, e.g. 500$a$a$a
2) Save the record and confirm that the fields contain the correct data
after whitespace is stripped.
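
One way to see why repeated keys are safe with the append-then-delete approach used in stripWhitespaceChars: appending each cleaned subfield and dropping the first one rotates the list exactly once, so the order survives even for 500$a$a$a. The following is only a sketch under assumptions. It uses a plain array of [ key, value ] pairs in place of a MARC::Field, and the helper name clean_in_place is hypothetical, not part of Koha.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a MARC field: an array of [ key, value ] pairs.
sub clean_in_place {
    my ($subfields) = @_;

    # Iterate over a snapshot so that the push/shift below cannot disturb the loop.
    my @snapshot = @{$subfields};

    for my $subfield (@snapshot) {
        my ( $key, $value ) = @{$subfield};
        $value =~ s/[\n\r]+/ /g;     # inner newlines and carriage returns become a space
        $value =~ s/^\s+|\s+$//g;    # leading and trailing whitespace is removed
        push @{$subfields}, [ $key, $value ];    # append the cleaned copy
        shift @{$subfields};                     # drop the original at position 0
    }

    return $subfields;
}

my $field = [ [ a => " first " ], [ a => "second\n" ], [ a => "\tthird" ] ];
clean_in_place($field);
print join( '|', map { "$_->[0]=$_->[1]" } @{$field} ), "\n";    # a=first|a=second|a=third
```

Because the subfield removed is always the one at the head of the list, no lookup by key is needed, and so duplicate keys cannot be confused with one another.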

Signed-off-by: David Nind <david@davidnind.com>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>

Bug 30358: (follow-up) Put multiple subfields fix on auth side

Signed-off-by: David Nind <david@davidnind.com>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>

Bug 30358: (follow-up) stripWhitespaceChars subroutine and tests

To test:

Confirm test plan above still works as expected and tests pass in
t/Koha_MetadataRecord.t

Sponsored-by: Catalyst IT

Signed-off-by: David Nind <david@davidnind.com>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>

Bug 30358: (follow-up) Fixing ModBiblioMarc.t tests

Signed-off-by: David Nind <david@davidnind.com>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>

Bug 30358: (follow-up) Do not strip whitespace from control fields

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>

Bug 30358: (follow-up) Simplify regex

The regex does the following:
1. Replace newlines and carriage returns with a space
2. Replace leading and trailing whitespace with nothing (strip)
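
The two steps can be tried on a plain string outside Koha. This is a minimal sketch, assuming a hypothetical helper named strip_value. In Koha itself the same two substitutions run on each MARC subfield value inside stripWhitespaceChars.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only; not part of Koha.
sub strip_value {
    my ($value) = @_;
    $value =~ s/[\n\r]+/ /g;     # 1. each run of newlines/carriage returns becomes one space
    $value =~ s/^\s+|\s+$//g;    # 2. leading and trailing whitespace is stripped
    return $value;
}

print strip_value("  Title\r\nwith a break\t\n"), "\n";    # Title with a break
```

Running step 1 first means that a trailing newline is turned into a space and then removed by step 2, so the second substitution does not need the /m modifier.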

Signed-off-by: Hammat Wele <hammat.wele@inlibro.com>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
2023-05-16 15:17:26 -03:00

package Koha::MetadataRecord;
# Copyright 2013 C & P Bibliography Services
#
# This file is part of Koha.
#
# Koha is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# Koha is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with Koha; if not, see <http://www.gnu.org/licenses>.

=head1 NAME

Koha::MetadataRecord - base class for metadata records

=head1 SYNOPSIS

    my $record = Koha::MetadataRecord->new({ record => $record, schema => $schema });

=head1 DESCRIPTION

Object-oriented class that encapsulates all metadata (i.e. bibliographic
and authority) records in Koha.

=cut

use Modern::Perl;

use Carp qw( carp );

use Koha::Util::MARC;

use base qw(Class::Accessor);
__PACKAGE__->mk_accessors(qw( record schema format id ));

=head2 new

    my $metadata_record = Koha::MetadataRecord->new({
        record => $record,
        schema => $schema,
        format => $format,
        id     => $id
    });

Returns a Koha::MetadataRecord object encapsulating record metadata.
Both C<record> and C<schema> are required; if either is missing, the
constructor warns and returns nothing.

C<$record> is expected to be a deserialized object (for example
a MARC::Record or XML::LibXML::Document object or JSON).

C<$schema> is used to describe the metadata schema (for example
marc21, unimarc, dc, mods, etc).

C<$format> is used to specify the serialization format. It is important
for Koha::RecordProcessor because it will pick the right Koha::Filter
implementation based on this parameter. It defaults to MARC. Valid values are:

    MARC (for MARC::Record objects)
    XML  (for XML::LibXML::Document objects)
    JSON (for JSON objects)

(optional) C<$id> is used so the record carries its own id and Koha doesn't
need to look for it inside the record.

=cut

sub new {
    my $class = shift;
    my $params = shift;

    if (!defined $params->{ record }) {
        carp 'No record passed';
        return;
    }

    if (!defined $params->{ schema }) {
        carp 'No schema passed';
        return;
    }

    $params->{format} //= 'MARC';
    my $self = $class->SUPER::new($params);

    bless $self, $class;
    return $self;
}

=head2 createMergeHash

Create a hash for use when merging records. At the moment the only
metadata schema supported is MARC.

=cut

sub createMergeHash {
    my ($self, $tagslib) = @_;
    if ($self->schema =~ m/marc/) {
        return Koha::Util::MARC::createMergeHash($self->record, $tagslib);
    }
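}

=head2 stripWhitespaceChars

    $record = Koha::MetadataRecord::stripWhitespaceChars( $record );

Strip leading and trailing whitespace characters from every subfield of
the record, and replace inner newlines and carriage returns with a single
space. Control fields are left untouched.

=cut

sub stripWhitespaceChars {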
    my ( $record ) = @_;

    foreach my $field ( $record->fields ) {
        unless ( $field->is_control_field ) {
            foreach my $subfield ( $field->subfields ) {
                my $key = $subfield->[0];
                my $value = $subfield->[1];
                $value =~ s/[\n\r]+/ /g; # replace each run of newlines and carriage returns with a space
                $value =~ s/^\s+|\s+$//g; # strip leading and trailing whitespace
                $field->add_subfields( $key => $value ); # add subfield to the end of the subfield list
                $field->delete_subfield( pos => 0 ); # delete the subfield at the top of the subfield list
            }
        }
    }

    return $record;
}

1;