Koha/misc
Thomas Klausner 61c8fc97fc Bug 35345: Add --where option to rebuild_elasticsearch.pl
Sometimes we need to re-index only a subset of our bibliographic data or authorities. Currently this is only possible by enumerating all IDs (-bn or -ai), which does not work well when indexing e.g. 100,000 items of a 2,000,000-record DB. Re-indexing everything is also overkill.

This patch adds a `--where` flag to misc/search_tools/rebuild_elasticsearch.pl which takes arbitrary SQL (which of course has to match the respective tables) and adds it as an additional parameter to the resultset to index.

To test, start koha-testing-docker with ElasticSearch enabled, for example via `ktd --es7 up`.

Before applying the patch, rebuild_elasticsearch will index all data:

Biblios:
$ misc/search_tools/rebuild_elasticsearch.pl -b -v
[12387] Checking state of biblios index
[12387] Indexing biblios
[12387] Committing final records...
[12387] Total 435 records indexed
(there might be a warning regarding a broken biblio, which can be ignored)

Auth:
$ misc/search_tools/rebuild_elasticsearch.pl -a -v
[12546] Checking state of authorities index
[12546] Indexing authorities
[12546] 1000 records processed
[12546] Committing final records...
[12546] Total 1706 records indexed

Now apply the patch.

Biblio, limit by range of biblionumber:
$ misc/search_tools/rebuild_elasticsearch.pl -b -v --where "biblionumber between 100 and 150"
[12765] Checking state of biblios index
[12765] Indexing biblios
[12765] Committing final records...
[12765] Total 50 records indexed

Note that only 50 records were indexed (instead of the whole set of 435 records).

Auth, limit by authtypecode:
$ misc/search_tools/rebuild_elasticsearch.pl -a -v --where "authtypecode = 'GEOGR_NAME'"
[12848] Checking state of authorities index
[12848] Indexing authorities
[12848] Committing final records...
[12848] Total 142 records indexed

Again, only 142 records have been indexed (instead of the whole set of 1706 records).
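
The comparison in the test plan above can also be scripted. The following is a minimal sketch, not part of the patch: the helper name `indexed_total` is hypothetical, and it assumes the verbose output keeps the `[pid] Total N records indexed` format shown in the logs above. Whether the script writes these lines to stdout or stderr is not stated here, so the usage lines merge both streams.

```shell
#!/bin/sh
# Hypothetical helper (not part of Koha): read rebuild_elasticsearch.pl -v
# output on stdin and print N from the last "[pid] Total N records indexed" line.
indexed_total() {
    sed -n 's/^\[[0-9]*] Total \([0-9][0-9]*\) records indexed$/\1/p' | tail -n 1
}

# Usage sketch, assuming a koha-testing-docker shell in the Koha source root:
#   full=$(misc/search_tools/rebuild_elasticsearch.pl -b -v 2>&1 | indexed_total)
#   subset=$(misc/search_tools/rebuild_elasticsearch.pl -b -v \
#       --where "biblionumber between 100 and 150" 2>&1 | indexed_total)
#   echo "full=$full subset=$subset"
```

If the `--where` clause is applied, the subset count should be smaller than the full count; if no total line is found, the helper prints nothing.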

Sponsored-by: Steiermärkische Landesbibliothek
Sponsored-by: HKS3 / koha-support.eu

Signed-off-by: David Nind <david@davidnind.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Katrin Fischer <katrin.fischer@bsz-bw.de>
(cherry picked from commit 61e7aa374e)
Signed-off-by: Fridolin Somers <fridolin.somers@biblibre.com>
2024-05-23 16:49:25 +02:00
admin
bin
cronjobs Bug 23296: Set patron branchcode when preparing non-digest version 2024-05-23 11:36:46 +02:00
devel Bug 36517: Fix output from install_plugins.pl 2024-05-23 14:36:46 +02:00
interface_customization
maintenance
migration_tools
release_notes Update release notes for 23.11.05 release 2024-05-03 15:15:06 +02:00
search_tools Bug 35345: Add --where option to rebuild_elasticsearch.pl 2024-05-23 16:49:25 +02:00
translator Bug 36516: Fix useless warning from translation script 2024-05-23 13:46:27 +02:00
workers Bug 33898: Implement reaping for database polling 2024-03-18 09:56:02 +01:00
add_date_fields_to_marc_records.pl
add_statistics_borrowers_categorycode.pl
batchCompareMARCvsFrameworks.pl
batchdeletebiblios.pl
batchDeleteUnusedSubfields.pl
batchImportMARCWithBiblionumbers.pl
batchRebuildBiblioTables.pl
batchRebuildItemsTables.pl
batchRepairMissingBiblionumbers.pl
check_sysprefs.pl
commit_file.pl
export_borrowers.pl
export_records.pl Bug 31286: Embed see-from headings into bibliographic records export 2024-05-23 10:39:39 +02:00
exportauth.pl
import_patrons.pl Bug 34621: Tidy import_patrons.pl 2024-05-23 14:56:02 +02:00
koha-install-log
link_bibs_to_authorities.pl Bug 30024: Make link_bibs_to_authorities.pl rely on LinkerRelink 2023-11-03 14:22:45 -03:00
load_yaml.pl
mod_zebraqueue.pl
process_ill_updates.pl
recreateIssueStatistics.pl
sax_parser_print.pl
sax_parser_test.pl
sip_cli_emulator.pl
stage_file.pl
z3950_responder.pl