'error' has special meaning in exceptions so naming the fields:
type, details
Rather than only dealing with a single exception type, we generically
get the ES exception info and pass it up.
I could not recreate timeout still, however, I simply restarted the
ES docker during commit stage to cause NoNodes exceptions
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
This catches a timeout response from the ES server, logs this, and continues the indexing
To test:
1 - perl misc/search_tools/rebuild_elasticsearch.pl
2 - Make the ES server timeout (I don't have good instruction yet)
3 - Watch the job crash
4 - Apply patches
5 - perl misc/search_tools/rebuild_elasticsearch.pl
6 - Make the server timeout
7 - Note the job reports failed commit, and continues
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Bug 26312: (follow-up) Reset buffers even if commit fails
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Bug 26312: (follow-up) Fix whitespace and missing semicolon
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Same as Load, but for Dump.
Test plan:
Edit ES mappings, replace withdrawn's label with "withdrawn ✔️❤️ ★"
Export the mappings
perl misc/search_tools/export_elasticsearch_mappings.pl > admin/searchengine/elasticsearch/mappings.yaml
Reset mappings from the UI
=> Notice that the label is correct
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
From tht YAML pod:
"""
This module has been released to CPAN as YAML::Old, and soon YAML.pm will be changed to just be a frontend interface module for all the various Perl YAML implementation modules, including YAML::Old.
If you want robust and fast YAML processing using the normal Dump/Load API, please consider switching to YAML::XS. It is by far the best Perl module for YAML at this time. It requires that you have a C compiler, since it is written in C.
"""
See also
https://gitlab.com/koha-community/qa-test-tools/-/merge_requests/35
Test plan:
Try some place where YAML::XS is not used and confirm that it works
correctly
QA note: This patch removes some uses of YAML that were not useful
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
When multithreaded indexing is used, the commit size for children are spread
out resulting in them being of type float. When records are processed and the
commit counter decreased it may then never reach *exactly* 0. This means records
are never commited. This patch makes sure the counter is an integer to avoid the
problem.
To test you must find a set of circumstances that causes the issue. For me:
1. Run: ./rebuild_elasticsearch -v -b -p 2 -c 400
2. Note that only one process is logging "Committing xxx records..."
3. Kill processes.
4. Apply patch.
5. Repeat 1
6. Note that both processes are logging "Committing xxx records..."
Sponsored-by: Lund University Library
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
While the ES index is incremental and provides results as it commits, we currently index from the oldest records to the newest.
This patch provides the option to go the other direction
To test:
1 - Have ES setup and running for Koha
2 - perl misc/search_tools/rebuild_elasticsearch.pl -v -v -b
3 - Note the biblios index from number 1 the end
4 - perl misc/search_tools/rebuild_elasticsearch.pl -v -v -a
5 - Notice the same
6 - Apply patch
7 - perl misc/search_tools/rebuild_elasticsearch.pl -v -v -b
8 - Still in ascending order
9 - perl misc/search_tools/rebuild_elasticsearch.pl -v -v -b --desc
10 - Now records index in descending order
11 - perl misc/search_tools/rebuild_elasticsearch.pl -v -v -a
12 - Still ascending
13 - perl misc/search_tools/rebuild_elasticsearch.pl -v -v -a --desc
14 - Now descending
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
JD amended patch: fix typo "inde" vs "index" and add commit body
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Script misc/search_tools/export_elasticsearch_mappings.pl allows to export current search engine configuration into a YAML file.
This export should use UTF-8 encoding, like other exports.
Test plan :
1) Go to Administration > Search engine configuration (Elasticsearch)
2) Edit a field label to use a diacrtic, for example local-number => Numéro
3) Save
4) Edit file etc/koha-conf.xml to enable 'elasticsearch_index_mappings'
5) Export mappings to file via misc/search_tools/export_elasticsearch_mappings.pl -t $MARCFLAVOUR
6) Reset memcached and plack
7) Back to Administration > Search engine configuration (Elasticsearch)
8) Click on 'Reset Mappings' and accept
9) Look at field 'local-number'
=> Without patch diacritic 'é' is broken
10) You may try with an emoji B-)
Signed-off-by: David Nind <david@davidnind.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Setup:
1 - Be using Elasticsearch
2 - Reload mappings from the db
Admin->Search configuration
Reset Mappings
3 - Reindex ES and confirm searching is working
To test:
1 - Apply patch
2 - Alter your mappings file for elastic (just change a description for a field)
3 - perl misc/search_tools/rebuild_elasticsearch.pl -r -v
Verbose not necessary, but good for letting you know things are progressing
4 - Confirm the mapping change shows in the interface
5 - Confirm reindex worked and searching is working
Signed-off-by: Andrew Fuerste-Henry <andrew@bywatersolutions.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
To test:
1 - Use the Koha sample data, or insert a blank 245$b into a record (easiest way is using advanced cataloging editor
2 - Reindex elasticsearch
3 - Check the ES count on the about page
4 - Check the count in the DB (SELECT count(*) FROM biblio)
5 - They don't match!
6 - perl misc/search_tools/rebuild_elastic_search.pl -v -v
7 - No errors indicated
8 - Apply patch
9 - perl misc/search_tools/rebuild_elastic_search.pl -v
10 - You should be notified of an error
11 - perl misc/search_tools/rebuild_elastic_search.pl -v -v
12 - You should be notified of the specific biblio with an error and a (somewhat) readable reason
13 - perl misc/search_tools/rebuild_elastic_search.pl
14 - No output
Signed-off-by: Ere Maijala <ere.maijala@helsinki.fi>
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Signed-off-by: Bouzid Fergani <bouzid.fergani@inlibro.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Test plan:
- launch perl misc/search_tools/export_elasticsearch_mappings.pl >
/path/to/my_mappings.yaml
- set koha-conf.elasticsearch_index_mappings to
/path/to/my_mappings.yaml,
- go to admin -> Search engine configuration,
- click on "Reset mappins",
- check that your search fields and mappings are as expected
Signed-off-by: Bouzid Fergani <bouzid.fergani@inlibro.com>
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Bug 9978 should have fixed them all, but some were missing.
We want all the license statements part of Koha to be identical, and
using the GPLv3 statement.
Signed-off-by: David Nind <david@davidnind.com>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
To test:
1 - perl misc/search_tools/rebuild_elastic_search.pl -h
2 - Note it indicates the bn option can be passed to index individual authids
3 - perl misc/search_tools/rebuild_elastic_search.pl -a -bn 92 -v
4 - Note the error
5 - Apply patch
6 - perl misc/search_tools/rebuild_elastic_search.pl -h
7 - Note new option ai|authid for indexing individual authids
8 - Note updated text for bn|biblionumber option
9 - perl misc/search_tools/rebuild_elastic_search.pl -a -bn 92 -v
10 - No errors, but no records indexed
11 - perl misc/search_tools/rebuild_elastic_search.pl -a -ai 92 -v
12 - 1 record indexed
13 - perl misc/search_tools/rebuild_elastic_search.pl -ai 92 -bn 92 -v
14 - 1 authority record and 1 biblio record indexed
Signed-off-by: Michal Denar <black23@gmail.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Use of uninitialized value $processes in numeric lt (<) at
misc/search_tools/rebuild_elasticsearch.pl line 199.
We want the number of processes to be set to 1 by default, and then
assign it to $slice_count
Test plan:
Run the script with and without --processes and confirm that the warning
went away.
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Josef Moravec <josef.moravec@gmail.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Change to zero based indexing for slice index to simplify some
conditions. Exit with error message if trying to combine processes
and biblio numbers arguments.
Signed-off-by: Josef Moravec <josef.moravec@gmail.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Test plan:
1. Time execution without -p parameter
2. Time execution with -p 2 or -p3 or -p 4 depending on CPU core count
Signed-off-by: Josef Moravec <josef.moravec@gmail.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
This patch change Koha::Cron to be a more generic Koha::Script class and
update all commanline driven scripts to use it.
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Josef Moravec <josef.moravec@gmail.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Add missing pods, remove obsolete syspref and add test for serialization format for records exceeding max record size
Sponsored-by: Gothenburg University Library
Signed-off-by: Ere Maijala <ere.maijala@helsinki.fi>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Add persistent per index "index status" state to provide useful
user feedback when update of Elasticsearch server mappings fails
Sponsored-by: Gothenburg University Library
Signed-off-by: Ere Maijala <ere.maijala@helsinki.fi>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Implement optimized indexing for Elasticsearch
How to test:
1) Time a full elasticsearch re-index without this patch by running the
rebuild_elastic_search.pl with the -d flag:
`koha-shell <instance_name> -c "time rebuild_elastic_search.pl -d"`.
2) Apply this patch.
3) Time a full re-index again, it should be about twice at fast (for a
couple of thousand biblios, with fewer biblios results may be more
unpredictable).
Sponsored-by: Gothenburg University Library
Signed-off-by: Ere Maijala <ere.maijala@helsinki.fi>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Improvements:
1) Mappings UI now has button that allows one to reset the mappings.
2) Mappings UI displays the items in alphabetical order.
3) Indexing script drops and recreates the index right away, which
helps prevent ES from autocreating a bad index if someone does something
while the first batch of records is being processed.
4) Indexing script has nicer output.
To test:
1) Change mappings.yaml file
2) Reset mappings in UI in mappings.pl
3) Verify the mappings have been changed in UI
4) The field order is alphabetical
5) Rebuild script has clean output
6) Run test t/db_dependent/Koha_Elasticsearch_Indexer.t
Signed-off-by: Bouzid Fergani <bouzid.fergani@inlibro.com>
Signed-off-by: Julian Maurice <julian.maurice@biblibre.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
What we currently have:
Koha/ElasticSearch.pm
Koha/ElasticSearch/Indexer.pm
Koha/SearchEngine/Elasticsearch/QueryBuilder.pm
Koha/SearchEngine/Elasticsearch/Search.pm
What we want:
Koha/SearchEngine/Elasticsearch.pm
Koha/SearchEngine/Elasticsearch/Indexer.pm
Koha/SearchEngine/Elasticsearch/QueryBuilder.pm
Koha/SearchEngine/Elasticsearch/Search.pm
Test plan:
% git grep -i Koha::ElasticSearch
% git grep ElasticSearch|grep -v Catmandu::Store::ElasticSearch
should not return any result
Do a full reindex and search for records
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
The changes made to Koha::Authority has not been correctly fixed.
The code of Koha::Authority has been moved bo
Koha::MetadataRecord::Authority by bug 15380.
Test plan:
perl misc/search_tools/rebuild_elastic_search.pl -a -v
should success
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
There were rebase conflicts that it was just easier to postpone until
afterwards.
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
% perl misc/search_tools/rebuild_elastic_search.pl -bn 42
Can't locate object method "idnumber" via package "MARC::Record" at
misc/search_tools/rebuild_elastic_search.pl line 171.
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
It will improve the indexing time.
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
(Not fetched yet though.)
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>