From f3a8b7a0e1e1bf112628c6215105ab80f25ed94f Mon Sep 17 00:00:00 2001
From: Jacek Ablewicz
Date: Wed, 24 Jun 2015 19:43:05 +0200
Subject: [PATCH] Bug 14456: EmbedSeeFromHeadings record filter shouldn't process MARC holding fields

If the system preference IncludeSeeFromInSearches is enabled, records
exported for zebra indexing are additionally processed by the
EmbedSeeFromHeadings record filter (right now used only in the
rebuild_zebra.pl script). This filter embeds 'see from' fields
(extracted from authority records linked with the given biblio
via $9 subfields) into the target MARC record, which is then
indexed in zebra.

Currently all fields containing $9 get exactly the same treatment
from this filter. But at the export stage, when the filter is applied,
the MARC record being processed already has the holdings data fields
that were added in the previous stage (usually 952 / 995, depending
on the MARC format).

The problem is that holdings data fields usually have $9 subfields
in them as well (mapped to items.itemnumber by default). As a
consequence, some (a great many in a typical setup) records exported
for zebra indexing may have surplus "see from" fields added
erroneously in a semi-random fashion, so biblio searches often
return completely unexpected additional results.

The EmbedSeeFromHeadings record filter should not process holdings
fields when dealing with MARC records intended for zebra indexing.

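To illustrate the failure mode described above, here is a minimal,
self-contained sketch. It is not the filter's actual code: plain hashes
stand in for MARC::Field objects, the tag 952 and all record data are
made-up assumptions, and linked_authids() is a hypothetical helper that
only mirrors the loop's skip logic.

```perl
use strict;
use warnings;

# Simplified stand-ins for MARC fields (hypothetical data; the real
# filter works on MARC::Record / MARC::Field objects).
my @fields = (
    { tag => '001', subfields => {} },                                  # control field
    { tag => '100', subfields => { a => 'Twain, Mark', 9 => '1234' } }, # $9 = authority id
    { tag => '650', subfields => { a => 'Rivers',      9 => '5678' } }, # $9 = authority id
    { tag => '952', subfields => { p => '39999000001', 9 => '1234' } }, # $9 = itemnumber
);

# Collect the $9 values that would be treated as authority ids.
# Passing an empty $item_tag reproduces the behaviour without the skip.
sub linked_authids {
    my ( $item_tag, @fields ) = @_;
    my @authids;
    for my $field (@fields) {
        next if $field->{tag} lt '010';        # control fields have no subfields
        next if $field->{tag} eq $item_tag;    # holdings: $9 is an itemnumber
        my $authid = $field->{subfields}{9};
        next unless $authid;
        push @authids, $authid;
    }
    return @authids;
}

my @with_skip    = linked_authids( '952', @fields );
my @without_skip = linked_authids( '',    @fields );

print "with skip:    @with_skip\n";
print "without skip: @without_skip\n";
```

Without the skip, the itemnumber 1234 in the holdings field is picked up
a second time as if it were an authority id, which is how unrelated
'see from' headings can end up in the exported record.
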
To reproduce:

1) a database with as many sample or real-world biblio, item and
   authority records as possible is recommended for testing purposes
2) enable IncludeSeeFromInSearches
3) export a bunch of biblio records for zebra (e.g.:
   misc/migration_tools/rebuild_zebra.pl -I -b -x -k -length=1000),
   inspect the resulting XML records in the /tmp/ file; observe that
   at the end of many records, here and there, some extra "see from"
   (= 1st indicator: 'z') fields tend to appear, which shouldn't
   be there ;)

To test:

4) apply patch
5) redo 3)
6) compare results from 3) and 5) with diff

Signed-off-by: Tomas Cohen Arazi
I introduced a regression test for this. You should run the tests
without/with the patch and verify that the patch actually fixes the
problem.

Good job Jacek! I'm sure writing the regression test would take less
time than such a detailed commit message!

Signed-off-by: Katrin Fischer
Signed-off-by: Tomas Cohen Arazi
---
 Koha/Filter/MARC/EmbedSeeFromHeadings.pm | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Koha/Filter/MARC/EmbedSeeFromHeadings.pm b/Koha/Filter/MARC/EmbedSeeFromHeadings.pm
index 4e05af13e7..fc8d1ebe9f 100644
--- a/Koha/Filter/MARC/EmbedSeeFromHeadings.pm
+++ b/Koha/Filter/MARC/EmbedSeeFromHeadings.pm
@@ -34,6 +34,7 @@ use strict;
 use warnings;
 use Carp;
 use Koha::Authority;
+use C4::Biblio qw/GetMarcFromKohaField/;
 
 use base qw(Koha::RecordProcessor::Base);
 our $NAME = 'EmbedSeeFromHeadings';
@@ -72,8 +73,12 @@ sub filter {
 sub _processrecord {
     my $record = shift;
 
+    my ($item_tag) = GetMarcFromKohaField("items.itemnumber", '');
+    $item_tag ||= '';
+
     foreach my $field ( $record->fields() ) {
         next if $field->is_control_field();
+        next if $field->tag() eq $item_tag;
         my $authid = $field->subfield('9');
 
         next unless $authid;
-- 
2.39.5