Bug 34587: Improve title uniqueness rules

Improve the robustness of unique matching here, to make sure we always match the same title when that is the case.

Some report rows may correspond to the same title as the previous row but have an empty match field, or alternatively come with a filled match field (e.g. DOI or Print_ISSN in TR_J4). Because of this, we only verify a uniqueness match field if both the current row and the previous one have it non-empty; otherwise we keep checking the remaining uniqueness match fields.

Example of this use case, COUNTER report:
title    | publisher | platform  | Proprietary_ID | Print_ISSN | DOI     | YOP  | usages
examplet | examplep  | examplepl | 1              | 123        |         | 2020 | usages
examplet | examplep  | examplepl | 1              | 123        | someDOI | 2021 | usages

The above 2 rows are the same title, same publisher, same Proprietary_ID,
same Print_ISSN, etc. It just so happens that one was returned by SUSHI
with a DOI and the other wasn't.
These 2 rows correspond to different usage statistics by YOP, not 2
different usage_titles.
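
The rule described above can be sketched in isolation as follows. This is only a minimal illustration, not the code from the patch: `rows_match` and `@MATCH_FIELDS` are hypothetical names, and both rows are plain hashes here, whereas the real check compares a Koha object against a report row. Like the patch, it treats any false value as "empty".

```perl
use strict;
use warnings;

# Hypothetical list of uniqueness match fields, keyed by COUNTER column name.
my @MATCH_FIELDS = qw( Print_ISSN Online_ISSN Proprietary_ID Publisher Platform DOI Title );

# Hypothetical helper: returns 1 when the current row describes the same
# title as the previous row, 0 otherwise.
sub rows_match {
    my ( $previous, $current ) = @_;
    return 0 unless $previous;

    for my $field (@MATCH_FIELDS) {
        my ( $old, $new ) = ( $previous->{$field}, $current->{$field} );

        # Only compare a field when both rows have it non-empty; a missing
        # value on one side is not evidence of a different title.
        next unless $old && $new;

        return 0 unless $old eq $new;
    }
    return 1;
}

# The two rows from the example report: same title, DOI present only once.
my $row_2020 = {
    Title          => 'examplet',
    Publisher      => 'examplep',
    Platform       => 'examplepl',
    Proprietary_ID => 1,
    Print_ISSN     => 123,
    DOI            => '',
    YOP            => 2020,
};
my $row_2021 = { %$row_2020, DOI => 'someDOI', YOP => 2021 };

# A row that differs in a field both sides have filled in.
my $other_title = { %$row_2021, Print_ISSN => 456 };

print rows_match( $row_2020, $row_2021 )    ? "same title\n" : "different title\n";
print rows_match( $row_2021, $other_title ) ? "same title\n" : "different title\n";
```

The first call prints "same title" because the empty DOI is skipped and every other filled field agrees; the second prints "different title" because Print_ISSN is filled on both sides and differs. YOP is deliberately not a match field, since it distinguishes usage statistics rather than titles.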

Signed-off-by: Jessica Zairo <jzairo@bywatersolutions.com>
Signed-off-by: Michaela Sieber <michaela.sieber@kit.edu>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Pedro Amorim 2023-10-23 11:02:36 +00:00 committed by Tomas Cohen Arazi
parent 184602ce38
commit ee4db6d763
Signed by: tomascohen
GPG key ID: 0A272EA1B2F3C15F

@@ -444,7 +444,12 @@ sub _search_for_usage_object {
} elsif ( $self->type =~ /TR/i ) {
return Koha::ERM::EUsage::UsageTitles->search(
{
title_doi => $row->{DOI},
print_issn => $row->{Print_ISSN},
online_issn => $row->{Online_ISSN},
proprietary_id => $row->{Proprietary_ID},
publisher => $row->{Publisher},
platform => $row->{Platform},
title => $row->{Title},
usage_data_provider_id => $usage_data_provider->erm_usage_data_provider_id
}
)->last;
@@ -474,7 +479,38 @@ sub _is_same_usage_object {
&& $previous_object->item eq $row->{Item}
&& $previous_object->publisher eq $row->{Publisher};
} elsif ( $self->type =~ /TR/i ) {
return $previous_object && $previous_object->title_doi eq $row->{DOI};
return unless $previous_object;
if ( $previous_object->print_issn && $row->{Print_ISSN} ){
return unless $previous_object->print_issn eq $row->{Print_ISSN};
}
if ( $previous_object->online_issn && $row->{Online_ISSN} ){
return unless $previous_object->online_issn eq $row->{Online_ISSN};
}
if ( $previous_object->proprietary_id && $row->{Proprietary_ID} ){
return unless $previous_object->proprietary_id eq $row->{Proprietary_ID};
}
if ( $previous_object->publisher && $row->{Publisher} ){
return unless $previous_object->publisher eq $row->{Publisher};
}
if ( $previous_object->platform && $row->{Platform} ){
return unless $previous_object->platform eq $row->{Platform};
}
if ( $previous_object->title_doi && $row->{DOI} ){
return unless $previous_object->title_doi eq $row->{DOI};
}
if ( $previous_object->title && $row->{Title} ){
return unless $previous_object->title eq $row->{Title};
}
return 1;
}
return 0;