Koha/C4/ClassSortRoutine/Dewey.pm
Jason Etheridge dba36a7a12 Bug 9770: fix sorting of Dewey call numbers that contain prefixes
C4::ClassSortRoutine::Dewey can pad the wrong part of a call number internally.

The subroutine get_class_sort_key tokenizes a call number string (splitting on
periods and whitespace) and counts the number of tokens that solely contain
digits.  If there is only one such digit group, a comment in the code states
that it will pad said digit group.  However, the bug is that the code assumes
said digit group is the first token, when this may not be the case.

In practice, this can cause poor sorting when a call number is in the form
of PREFIX _space_ 3DIGITS.
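
The effect can be reproduced outside Koha with a small standalone sketch.
demo_sort_key below is a hypothetical helper, not part of the module; it
mirrors the tokenizing and padding steps of get_class_sort_key, and its
second argument selects between the patched behaviour and the pre-patch
behaviour. The pre-patch branch (always padding token 0) is an assumption
based on the description above, not a copy of the old code.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring get_class_sort_key. When $fixed is true the
# padding goes to the first digit group; otherwise it goes to token 0
# (the assumed pre-patch behaviour).
sub demo_sort_key {
    my ($call_number, $fixed) = @_;
    my $init = uc $call_number;
    $init =~ s/^\s+//;
    $init =~ s/\s+$//;
    $init =~ s!/!!g;
    $init =~ s/^([\p{IsAlpha}]+)/$1 /;
    my @tokens = split /\.|\s+/, $init;

    my ($count, $first_idx) = (0, undef);
    for my $i (0 .. $#tokens) {
        next unless $tokens[$i] =~ /^\d+$/;
        $count++;
        $first_idx = $i if $count == 1;
        if ($count == 2) {
            # Right-pad the second digit group to 15 characters with zeroes.
            $tokens[$i] = sprintf('%-15.15s', $tokens[$i]);
            $tokens[$i] =~ tr/ /0/;
        }
    }
    if ($count == 1) {
        my $idx = $fixed ? $first_idx : 0;
        $tokens[$idx] .= '_000000000000000';
    }
    my $key = join '_', @tokens;
    $key =~ s/[^\p{IsAlnum}_]//g;
    return $key;
}

for my $cn ('J DVD 700.1 ABC', 'J DVD 850 DEF') {
    print "$cn\n";
    print "  old: ", demo_sort_key($cn, 0), "\n";
    print "  new: ", demo_sort_key($cn, 1), "\n";
}
```

Under these assumptions the old key for "J DVD 850 DEF" is
J_000000000000000_DVD_850_DEF, which compares lower than
J_DVD_700_100000000000000_ABC because the padding lands right after the
prefix. With the padding attached to the digit group, the key becomes
J_DVD_850_000000000000000_DEF and sorts after the 700.1 item.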

To test:

[1] Create two item records whose class scheme is set to
    'ddc' (Dewey) and whose call numbers contain prefixes, e.g.,
    J DVD 700.1 ABC and J DVD 850 DEF.
[2] Use the inventory tool to produce a list of items that includes
    the two created in step 1.  Observe that the items are sorted
    in the incorrect order, with "J DVD 850 DEF" coming before
    "J DVD 700.1 ABC".  Alternatively, run the following SQL
    to see the incorrect sort order:

    SELECT cn_sort, itemcallnumber
    FROM items
    WHERE itemcallnumber LIKE 'J DVD%'
    ORDER BY cn_sort;

[3] Apply this patch.
[4] Run misc/maintenance/touch_all_items.pl to force cn_sort to be
    recalculated.
[5] Repeat step 2 and verify that the call numbers are now sorted
    correctly.

Signed-off-by: Jason Etheridge <jason@esilibrary.com>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
2013-07-15 16:12:47 +00:00

package C4::ClassSortRoutine::Dewey;
# Copyright (C) 2007 LibLime
#
# This file is part of Koha.
#
# Koha is free software; you can redistribute it and/or modify it under the
# terms of the GNU General Public License as published by the Free Software
# Foundation; either version 2 of the License, or (at your option) any later
# version.
#
# Koha is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
# A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with Koha; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
use strict;
use warnings;
use vars qw($VERSION);
# set the version for version checking
$VERSION = 3.07.00.049;
=head1 NAME
C4::ClassSortRoutine::Dewey - Dewey call number sorting key routine
=head1 SYNOPSIS
use C4::ClassSortRoutine;
my $cn_sort = GetClassSortKey('Dewey', $cn_class, $cn_item);
=head1 FUNCTIONS
=head2 get_class_sort_key
my $cn_sort = C4::ClassSortRoutine::Dewey::get_class_sort_key($cn_class, $cn_item);
Generates sorting key using the following rules:
* Concatenates class and item part.
* Converts to uppercase.
* Removes leading and trailing whitespace and '/'
* Separates alphabetic prefix from the rest of the call number
* Splits into tokens on whitespaces and periods.
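* Leaves first digit group as is when there is more than one digit group.
* Converts second digit group to 15-digit long group, padded on right with zeroes.
* If there is only one digit group, appends a group of 15 zeroes after it.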
* Converts each run of whitespace to an underscore.
* Removes any remaining non-alphabetical, non-numeric, non-underscore characters.
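
The right-padding rule can be illustrated in isolation. This is a minimal
sketch, not part of the module: pad_decimal_group is a hypothetical helper
name, and the sample keys are built by hand rather than by
get_class_sort_key. It compares keys for 700.1 and 700.15 with and without
padding, using Perl's default string comparison.

```perl
use strict;
use warnings;

# Hypothetical helper: right-pad a digit group to 15 characters with
# zeroes, truncating anything longer than 15 characters.
sub pad_decimal_group {
    my ($digits) = @_;
    my $padded = sprintf('%-15.15s', $digits);  # left-justify, pad with spaces
    $padded =~ tr/ /0/;                         # turn the padding into zeroes
    return $padded;
}

my (@raw, @padded);
for my $decimal (qw(1 15)) {
    push @raw,    "700_${decimal}_ABC";
    push @padded, '700_' . pad_decimal_group($decimal) . '_ABC';
}

print "unpadded: ", join(' < ', sort @raw),    "\n";
print "padded:   ", join(' < ', sort @padded), "\n";
```

Without padding, the key for 700.15 sorts before the key for 700.1, because
the digit '5' compares lower than the '_' separator. With padding, both
decimal parts have the same width and compare digit by digit.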
=cut
sub get_class_sort_key {
    my ($cn_class, $cn_item) = @_;

    $cn_class = '' unless defined $cn_class;
    $cn_item  = '' unless defined $cn_item;
    my $init = uc "$cn_class $cn_item";
    $init =~ s/^\s+//;
    $init =~ s/\s+$//;
    $init =~ s!/!!g;
    $init =~ s/^([\p{IsAlpha}]+)/$1 /;
    my @tokens = split /\.|\s+/, $init;
    my $digit_group_count = 0;
    my $first_digit_group_idx;
    for (my $i = 0; $i <= $#tokens; $i++) {
        if ($tokens[$i] =~ /^\d+$/) {
            $digit_group_count++;
            if (1 == $digit_group_count) {
                $first_digit_group_idx = $i;
            }
            if (2 == $digit_group_count) {
                $tokens[$i] = sprintf("%-15.15s", $tokens[$i]);
                $tokens[$i] =~ tr/ /0/;
            }
        }
    }
    # Pad the first digit_group if there was only one
    if (1 == $digit_group_count) {
        $tokens[$first_digit_group_idx] .= '_000000000000000';
    }
    my $key = join("_", @tokens);
    $key =~ s/[^\p{IsAlnum}_]//g;
    return $key;
}
1;
=head1 AUTHOR
Koha Development Team <http://koha-community.org/>
=cut