Bug 30571: (follow-up) Additional check at upgrade

Bad data occurs. We are removing null hosts here and update
empty syntax or encoding, informing the user.

Test plan:
Remove the NOT NULL condition if needed.
Copy db_revs/220600056.pl to atomicupdate and test dbrev.
You might see something like:
    Upgrade to 22.06.00.056  [11:48:22]: Bug 30571 - Table z3950servers: three cols NOT NULL
        Found bad data in table z3950servers: removed 1 records with host undefined
        Corrected empty syntax or encoding for the following hosts. Please check after upgrade.
        Updated hosts: lx2.loc.gov,lx2.loc.gov

Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
This commit is contained in:
Marcel de Rooy 2022-10-05 11:49:56 +00:00 committed by Tomas Cohen Arazi
parent 86a744cbfd
commit d71c20a997
Signed by: tomascohen
GPG key ID: 0A272EA1B2F3C15F

View file

@ -6,6 +6,24 @@ return {
up => sub {
my ($args) = @_;
my ($dbh, $out) = @$args{qw(dbh out)};
# Preliminary data checks
my $sql = "SELECT COUNT(*) FROM z3950servers WHERE host IS NULL";
my ($cnt) = $dbh->selectrow_array($sql);
if( $cnt ) { # No host is really bad data! Remove it.
$dbh->do( "DELETE FROM z3950servers WHERE host IS NULL" );
say $out "Found bad data in table z3950servers: removed $cnt records with host undefined";
}
$sql = "SELECT host FROM z3950servers WHERE syntax IS NULL OR encoding IS NULL";
my $hosts = $dbh->selectcol_arrayref($sql);
if( @$hosts ) { # This is bad data too. We choose a default here.
$dbh->do( q|UPDATE z3950servers SET syntax = COALESCE(syntax, 'USMARC'), encoding = COALESCE(encoding, 'utf8')
WHERE syntax IS NULL OR encoding IS NULL| );
say $out "Corrected empty syntax or encoding for the following hosts. Please check after upgrade.";
say $out "Updated hosts: ". (join ',', @$hosts);
}
# Actual dbrev
$dbh->do(q{
alter table z3950servers
change column `host` `host` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL COMMENT 'target''s host name',