Jonathan Druart
65c4d8019e
Currently the call number splitting seems to be mostly implemented for DDC and LC classifications. Neither is very common in some countries. A lot of libraries use their own custom classification schemes, so call number splitting should be individually configurable.

This enhancement adds the ability to define custom splitting rules based on regular expressions.

How does it work so far?
In C4/Labels/Label.pm there are 3 different splitting methods defined, depending on items.cn_source:
* if cn_source is 'lcc' or 'nlm' we split using Library::CallNumber::LC
* if cn_source is 'ddc' we split using an in-house method
* finally there is a fallback method that splits on spaces
Nothing else is done for other cn_source values.

The idea of this patch is to mimic what was done for the "filing rules" and add the ability to define "splitting rules" that will be used by the "Classification sources".

A classification source will then have:
* a filing rule used to sort items by call number
* a splitting rule used to print labels

To achieve this goal, this enhancement makes the following modifications at DB level:
* New table class_split_rules
* New column class_sources.class_split_rule

Test plan:
* Execute the update database entry to create the new table and column.

I. UI Changes
a) Create/modify/delete a filing rule
b) Create/modify/delete a splitting rule
c) Create/modify/delete a classification source
=> A filing rule or splitting rule cannot be removed if it is used by a classification source

II. Splitting rule using regular expressions
a) Create a splitting rule using the "Splitting routine" "RegEx"
b) Define several regular expressions. They will be applied one after the other, in the order you define them.
Something like:
  s/\s/\n/g # Break on spaces
  s/(\s?=)/\n=/g # Break on = (unless it's done already)
  s/^(J|K)\n/$1 / # Remove the first break if callnumber starts with J or K
c) You can test the regular expressions by filling the textarea with a list of call numbers. Then click "Test" and confirm the call numbers are split as you expected.
d) Finally create a new classification source that uses this new splitting rule.

III. Print the label!
a) Create a layout. It should have the "Split call numbers" checkbox ticked, and display itemcallnumber
b) Use this layout to export labels, using items with different classification sources ('lcc', 'ddc', but also the new one you have created)
=> The call numbers should have been split according to the regexes you defined earlier!

Notes:
* The update database entry fills class_sources.class_split_rule with the value of class_sources.class_sort_rule. If default rules exist it will not work; we should add a note in the release notes (would that be enough?)
* C4::ClassSplitRoutine::* should be moved to Koha::ClassSplitRule, but it sounded better to keep the same pattern as ClassSortRoutines
* Should we not use a LONGTEXT for class_split_rules.split_regex instead of VARCHAR(255)?
* class_sources.sql should be filled for other languages before this is pushed to master!

IMPORTANT NOTES:
The regular expressions are stored as is, and eval is used to evaluate them (perlcritic raises a warning about it: Expression form of "eval"). This can lead to serious security issues (execution of arbitrary code on the server), especially if the modifier 'e' is used.
We could remedy the situation with one of the following options:
- Assume that this DB data is safe (we can add a new permission?)
- Assume that the data is not safe and deal with possible attacks
  Cons: how can we be sure we are exhaustive? Would making sure it matches ^s///[^e/]*$ be enough?
- Use Template Toolkit syntax instead (really safer?)
  [% callnumber.replace('\s', '\n').replace ... %]
- Cut the regex into parts: find, replace, modifiers, like we already do for MARC modification templates.
  Cons: we are going to have escaping problems, since the "find" and "replace" parts should not be handled the same way (think "\n", "\\n", "\1", "\s", etc.)
  I did not manage to implement this one easily.

Sponsored-by: Goethe-Institut

Signed-off-by: Christian Stelzenmüller <christian.stelzenmueller@bsz-bw.de>
Signed-off-by: Chris Cormack <chrisc@catalyst.net.nz>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
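As an illustration of the "assume the data is not safe" option above, here is a minimal Perl sketch of a pre-check that could run before a rule is handed to eval. The helper name looks_like_plain_substitution is hypothetical and not part of Koha. The checks are deliberately conservative and are not claimed to be exhaustive: besides the 'e' modifier, the sketch also rejects (?{ ... }) code blocks in the pattern and @{ ... } / ${ ... } interpolation, both of which can run code when the rule is compiled through a string eval.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Koha): returns 1 only for rules of the
# form s/find/replace/modifiers that pass a few conservative checks.
sub looks_like_plain_substitution {
    my ($rule) = @_;

    # Split the rule into its three parts. A "/" inside "find" or "replace"
    # is only accepted when it is escaped with a backslash.
    my ( $find, $replace, $modifiers ) =
      $rule =~ m{\A s / ((?:[^/\\]|\\.)*) / ((?:[^/\\]|\\.)*) / ([a-z]*) \z}xs
      or return 0;

    # Allow only modifiers that do not evaluate code (so no "e").
    return 0 if $modifiers =~ /[^gimsx]/;

    # Reject embedded code blocks in the pattern: (?{ ... }) and (??{ ... }).
    return 0 if $find =~ /\(\?\??\{/;

    # Reject interpolation tricks such as @{[ ... ]} or ${\ ... }.
    for my $part ( $find, $replace ) {
        return 0 if $part =~ /[\$\@]\s*\{/;
    }

    return 1;
}

# The three rules from the test plan pass, the "e" variant does not.
for my $rule ( 's/\s/\n/g', 's/(\s?=)/\n=/g', 's/^(J|K)\n/$1 /', 's/x/system("id")/e' ) {
    printf "%-22s %s\n", $rule, looks_like_plain_substitution($rule) ? 'accepted' : 'rejected';
}
```

A whitelist like this trades flexibility for safety: rules using other delimiters (for example s{...}{...}) would be rejected even when harmless.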
59 lines
1.3 KiB
Perl
package C4::ClassSplitRoutine::RegEx;

# Copyright 2018 Koha Development Team
#
# This file is part of Koha.
#
# Koha is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# Koha is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with Koha; if not, see <http://www.gnu.org/licenses>.

use Modern::Perl;

use C4::Debug;

=head1 NAME

C4::ClassSplitRoutine::RegEx - regex call number splitting routine

=head1 SYNOPSIS

    use C4::ClassSplitRoutine::RegEx;

    my @lines = C4::ClassSplitRoutine::RegEx::split_callnumber($cn_item, $regexs);

=head1 FUNCTIONS

=head2 split_callnumber

    my @lines = C4::ClassSplitRoutine::RegEx::split_callnumber($cn_item, $regexs);

Applies each substitution in the arrayref $regexs to $cn_item, in order,
then returns the resulting lines as a list.

=cut

sub split_callnumber {
    my ($cn_item, $regexs) = @_;

    for my $regex ( @$regexs ) {
        # Each rule is a complete substitution (e.g. 's/\s/\n/g') stored as
        # text, so it is applied with a string eval. A rule that fails to
        # compile is silently skipped. See the security notes in the commit
        # message: the rules must come from a trusted source.
        eval "\$cn_item =~ $regex";
    }

    # The rules insert newlines where the label should break.
    my @lines = split "\n", $cn_item;

    return @lines;
}

1;

=head1 AUTHOR

Koha Development Team <http://koha-community.org/>

=cut
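The routine above can be tried outside Koha with a small stand-alone sketch. The function split_callnumber_demo below is a local stand-in that mirrors split_callnumber so the example runs without the Koha modules; the sample call numbers are made up for illustration, and the three rules are the ones given in the test plan.

```perl
use strict;
use warnings;

# Local stand-in mirroring C4::ClassSplitRoutine::RegEx::split_callnumber,
# so that this sketch runs without Koha installed.
sub split_callnumber_demo {
    my ( $cn_item, $regexs ) = @_;

    for my $regex (@$regexs) {
        # Only safe with trusted rules: the rule text is compiled as Perl code.
        eval "\$cn_item =~ $regex";
    }
    return split "\n", $cn_item;
}

# Rules from the test plan, applied in this order.
my @rules = (
    's/\s/\n/g',          # Break on spaces
    's/(\s?=)/\n=/g',     # Break on = (unless it's done already)
    's/^(J|K)\n/$1 /',    # Remove the first break if callnumber starts with J or K
);

# Made-up call numbers, for illustration only.
for my $callnumber ( 'J 123.4 ABC', 'QA 76 =5' ) {
    my @lines = split_callnumber_demo( $callnumber, \@rules );
    print "$callnumber => ", join( ' | ', @lines ), "\n";
}
# J 123.4 ABC => J 123.4 | ABC
# QA 76 =5 => QA | 76 | =5
```

Order matters here: the third rule only works because the first one has already turned the space after "J" into a line break.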