To test:
1 - Do some authority searches in Zebra
2 - Switch to ES and repeat, results will vary and some may fail
3 - Apply patch and dependencies
4 - Reindex ES
5 - Repeat searches, they should succeed and results should be similar to
Zebra
6 - Slight differences are okay, but results should (mostly) meet
expectations
A few notes:
We add a 'normalizer' to ensure we get a single token from the heading
indexes, this makes 'starts with' work as expected
We switch to 'AND' for fields searched from cataloging editor - this
matches Zebra results
We force the '__sort' fields for sorting - if sorting looks wrong try
reducing the heading field to a single subfield - this will need to be
addressed on a future bug (multiple subfields create an array, ES sorts
those randomly)
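To illustrate why a single-token heading matters for 'starts with', here
is a minimal Python sketch. It is not Koha or Elasticsearch code: the
function names and the lowercase/whitespace normalization rule are
assumptions made for the example.

```python
from typing import List


def normalize_heading(heading: str) -> str:
    """Collapse whitespace and lowercase, keeping the heading as ONE token."""
    return " ".join(heading.split()).lower()


def tokenize_heading(heading: str) -> List[str]:
    """Word-level tokens, roughly what an analyzed text field would hold."""
    return normalize_heading(heading).split(" ")


def starts_with(indexed_terms: List[str], prefix: str) -> bool:
    """A prefix search matches when one indexed term begins with the prefix."""
    wanted = normalize_heading(prefix)
    return any(term.startswith(wanted) for term in indexed_terms)


heading = "Shakespeare, William"
single_token = [normalize_heading(heading)]  # one term for the whole heading
word_tokens = tokenize_heading(heading)      # one term per word

# A prefix that spans a word boundary only matches the single-token form.
print(starts_with(single_token, "Shakespeare, W"))  # True
print(starts_with(word_tokens, "Shakespeare, W"))   # False
```

With word-level tokens, a prefix such as "will" would match in the middle
of the heading, which is not what 'starts with' is meant to do.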
Signed-off-by: Nicolas Legrand <nicolas.legrand@bulac.fr>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
We likely shouldn't pass through an unconverted sort order for now, but
it does allow us to look ahead to implementing the orders directly so
seems a good option to have.
Either this patch should be used, or the commented out tests should be
removed
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Currently the sort order is extracted from the sort condition by
splitting the field. Instead, use a regular expression to extract the
last part preceded by an underscore.
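A minimal Python sketch of that approach follows. The patch itself is
Perl; the function name and the sample values ('title_asc',
'call_number_dsc') are hypothetical and only show why matching on the
last underscore is more robust than a plain split.

```python
import re

# Greedy field part, then the final underscore, then an order with no underscore.
SORT_RE = re.compile(r"^(?P<field>.+)_(?P<order>[^_]+)$")


def split_sort_condition(condition):
    """Return (field, order); order is None when no trailing '_order' exists."""
    match = SORT_RE.match(condition)
    if match is None:
        return condition, None
    return match.group("field"), match.group("order")


print(split_sort_condition("title_asc"))        # ('title', 'asc')
print(split_sort_condition("call_number_dsc"))  # ('call_number', 'dsc')
print(split_sort_condition("relevance"))        # ('relevance', None)
```

A naive split on '_' would break 'call_number_dsc' into three parts,
which is the case the regular expression avoids.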
Signed-off-by: Nicolas Legrand <nicolas.legrand@bulac.fr>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
To test:
1 - Enable Zebra
2 - Perform an auth search
3 - note results
4 - Enable ES
5 - Repeat search, note (likely) diff results
6 - Open a record in cataloging and use the button to launch auth search
7 - Perform same search as above, note results match for either engine
selected
8 - NOTE: Disable sorting for ES search - this will be dealt with in
another report
Signed-off-by: David Bourgault <david.bourgault@inlibro.com>
Signed-off-by: Nicolas Legrand <nicolas.legrand@bulac.fr>
Signed-off-by: Alex Arnaud <alex.arnaud@biblibre.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
To test:
0 - Apply Unit test patch only
1 - prove t/db_dependent/Koha_SearchEngine_Elasticsearch_Search.t
2 - Should fail
3 - Apply this patch
4 - prove t/db_dependent/Koha_SearchEngine_Elasticsearch_Search.t
5 - should pass
6 - search for 'Local-number.raw:"4"' (or a valid biblionumber)
7 - should get expected result
Signed-off-by: David Bourgault <david.bourgault@inlibro.com>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Test plan assumes
QueryAutoTruncate = automatically
SearchEngine = Elasticsearch
To test:
0 - Apply Unit test patch only
1 - prove t/db_dependent/Koha_SearchEngine_Elasticsearch_Search.t
2 - Should fail
3 - Apply this patch
4 - prove t/db_dependent/Koha_SearchEngine_Elasticsearch_Search.t
5 - should pass
6 - search for 'Local-number:"4"' (or a valid biblionumber)
7 - should get expected result
Signed-off-by: David Bourgault <david.bourgault@inlibro.com>
Signed-off-by: Julian Maurice <julian.maurice@biblibre.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
To test:
1 - Enable suppression
2 - Suppress some records
3 - Apply all the patches
4 - Reindex ES
5 - Search and don't get suppressed records
6 - Disable suppression
7 - Search and get all the records
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
By using a different split regex, we can simplify the process of
appending '*' to every word of the query a bit
Signed-off-by: Julian Maurice <julian.maurice@biblibre.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
To test:
1 - prove t/db_dependent/Koha_SearchEngine_Elasticsearch_Search.t
2 - do some searches in staff client and test results
Signed-off-by: Julian Maurice <julian.maurice@biblibre.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
This patchset adds a subroutine '_truncate_terms' to the ES QueryParser.
If QueryAutoTruncate is enabled this function will be called for any
search to add wildcard '*' to all terms
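The idea can be sketched in Python as below. This is an illustration
only, not the Perl '_truncate_terms' from the patch: the split regex,
the treatment of quoted phrases, and the function name are assumptions,
and boolean operators are deliberately not handled.

```python
import re

# Split on quoted phrases (optionally prefixed with an index such as title:)
# and on whitespace. The capture group keeps the separators in the result.
SPLIT_RE = re.compile(r'((?:[\w.\-]+:)?"[^"]*"|\s+)')


def truncate_terms(query):
    """Append '*' to every bare word; leave phrases and whitespace untouched."""
    parts = [part for part in SPLIT_RE.split(query) if part]
    truncated = []
    for part in parts:
        if part.isspace() or part.endswith('"') or part.endswith("*"):
            truncated.append(part)  # whitespace, phrase, or already truncated
        else:
            truncated.append(part + "*")
    return "".join(truncated)


print(truncate_terms("donald duck"))                 # donald* duck*
print(truncate_terms('title:"donald duck" comics'))  # title:"donald duck" comics*
```

Splitting with a capturing group is what keeps the loop simple: the
separators come back in the list, so the query can be rebuilt with a
plain join.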
To test:
1 - Enable Elasticsearch and have some records indexed
2 - Search for partial terms
3 - Note they fail unless '*' is appended
4 - Apply patch, leave QueryAutoTruncate disabled
5 - Note partial term searches still fail
6 - Enable QueryAutoTruncate
7 - Note partial term searches succeed
8 - Do some regular and advanced searches to make sure results are as
expected
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Julian Maurice <julian.maurice@biblibre.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
It seems we have a syntax mismatch - any vs all
all seems to be the list we expect so updating code that way
To test:
1 - Enable Elasticsearch
2 - Index some authorities
3 - Perform a 'Search entire record' search
4 - Internal server error (
Invalid marclist field provided: all at
/usr/local/koha/Koha/SearchEngine/Elasticsearch/QueryBuilder.pm
line 433.
)
5 - Run:
$ sudo koha-shell kohadev
k$ cd kohaclone
k$ prove t/db_dependent/Koha/SearchEngine/Elasticsearch/QueryBuilder.t
=> FAIL: Tests fail because 'any' is used
6 - Apply patch
7 - Search should work
8 - Run:
k$ prove t/db_dependent/Koha/SearchEngine/Elasticsearch/QueryBuilder.t
=> SUCCESS: Tests pass!
9 - Sign off :-D
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Edited the test plan so it mentions the new tests
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
To test:
1 - Apply patch
2 - Backup your db
3 - Drop and create a new db to ensure your mappings are refreshed from
the patch
4 - add some titles with items with collection codes
5 - search and see collection code facets
6 - sign off
Work to be done:
1 - Replace codes with descriptions
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Brendan A Gallagher <brendan@bywatersolutions.com>
This patch makes the 'Locations' facet work as expected (i.e. having the
same behaviour it has for Zebra: picking the 952$c in MARC21 and 995e
for UNIMARC).
It also adds the code to handle holding and home library settings for
facets and makes the facets show the library name instead of the branch
code.
The mappings are updated so the labels match what facets.inc expect to
work properly.
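As a rough illustration of showing library names rather than branch
codes in facets, the Python sketch below relabels facet buckets. The
bucket layout, the names table, and the function name are assumptions
for the example, not Koha's actual data structures.

```python
def label_library_facets(buckets, library_names):
    """Attach a display label to each bucket, falling back to the raw code."""
    return [
        {
            "code": bucket["key"],
            "label": library_names.get(bucket["key"], bucket["key"]),
            "count": bucket["doc_count"],
        }
        for bucket in buckets
    ]


# Hypothetical branch codes and names.
names = {"CPL": "Centerville", "MPL": "Midway"}
buckets = [
    {"key": "CPL", "doc_count": 12},
    {"key": "XYZ", "doc_count": 1},  # unknown code keeps its code as label
]
for facet in label_library_facets(buckets, names):
    print(facet["label"], facet["count"])
```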
To test:
- On master, do a search that returns biblios with items having
homebranch set.
=> FAIL: Under the 'Locations' label on the facets you will notice
branchcodes are shown.
- Apply the patch
- Restart memcached and plack (just in case, it was tricky)
- Reset your mappings:
http://localhost:8081/cgi-bin/koha/admin/searchengine/elasticsearch/mappings.pl?op=reset&i_know_what_i_am_doing=1
- Restart memcached and plack (again, not sure if needed)
- Make sure these mappings are set:
homebranch => HomeLibrary
holdingbranch => HoldingLibrary
(Note: it might not be set due to the place the yaml file is being picked)
- Reindex your records:
$ sudo koha-shell kohadev
k$ cd kohaclone
k$ perl misc/search_tools/rebuild_elastic_search.pl -d -v
- Repeat the initial search
=> SUCCESS: 'Location' contains the right stuff, 'Home libraries' and
'Holding libraries' too.
- Run
k$ prove t/db_dependent/Koha_SearchEngine_Elasticsearch_Search.t
=> SUCCESS: Tests pass!
- Sign off :-D
Note: play with the 'DisplayLibraryFacets' syspref options.
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
-Changed deprecated facets to aggregations
-Fixed boolean datatypes not allowing analyzers to be specified
-Fixed deprecated '_id' to 'es_id'. Now the ES-index has the correct id==biblionumber
ZE TEST PLAN
1. Reset Zebra index since facets are hard coded to dynamic search_marc_mappings.
2. perl misc/search_tools/rebuild_elastic_search.pl
3. Fetch all indexed records and the facet for subject__facet
curl -XGET localhost:9200/koha_biblios/data/_search?pretty -d '{
"aggregations": {
"my_agg": {
"terms": {
"field": "subject__facet"
}
}
}
}'
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Elastic appears to be indexing onloan as a string, but our code assumes
it is a boolean.
Test Plan:
1) Ensure you are set up using Elastic as your search engine
2) Search only for available items from the advanced search
3) Note you get no results
4) Apply this patch
5) Re-run the search
6) You should now get results!
Signed-off-by: Jennifer Schmidt <jschmidt@switchinc.org>
Signed-off-by: Jonathan Druart <jonathan.druart@biblibre.com>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
What we currently have:
Koha/ElasticSearch.pm
Koha/ElasticSearch/Indexer.pm
Koha/SearchEngine/Elasticsearch/QueryBuilder.pm
Koha/SearchEngine/Elasticsearch/Search.pm
What we want:
Koha/SearchEngine/Elasticsearch.pm
Koha/SearchEngine/Elasticsearch/Indexer.pm
Koha/SearchEngine/Elasticsearch/QueryBuilder.pm
Koha/SearchEngine/Elasticsearch/Search.pm
Test plan:
% git grep -i Koha::ElasticSearch
% git grep ElasticSearch|grep -v Catmandu::Store::ElasticSearch
should not return any result
Do a full reindex and search for records
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
The system preference FacetMaxCount should work as expected with ES.
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
If no limit is passed, the url will contain '&limit=' anyway.
It is not necessary and can be avoided easily
Test plan:
1/ Search for a term in your catalogue
2/ Hover over a link in the facet area
3/ The link is
cgi-bin/koha/opac-search.pl?idx=kw&q=your_term&limit=&limit=[...]
With this patch, the empty limit parameter does not appear.
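The fix amounts to skipping empty limits when the facet link is built.
A minimal Python sketch, assuming a hypothetical build_facet_url helper
(the real code is Perl and structured differently):

```python
from urllib.parse import urlencode


def build_facet_url(base, idx, query, limits):
    """Build a search URL, leaving out any empty limit value."""
    params = [("idx", idx), ("q", query)]
    params.extend(("limit", limit) for limit in limits if limit)
    return f"{base}?{urlencode(params)}"


url = build_facet_url(
    "/cgi-bin/koha/opac-search.pl", "kw", "your_term", ["", "itype:BK"]
)
print(url)  # /cgi-bin/koha/opac-search.pl?idx=kw&q=your_term&limit=itype%3ABK
```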
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
Filter on "Ślez, Ts." => Can't escape \x{015A}, try uri_escape_utf8()
instead at
/home/koha/src/Koha/SearchEngine/Elasticsearch/QueryBuilder.pm line 221.
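The error comes from percent-escaping a character above U+00FF without
first encoding it as UTF-8. As an analogy only (the patch concerns
Perl's URI::Escape), Python's urllib.parse.quote encodes to UTF-8 by
default; the helper name below is an assumption.

```python
from urllib.parse import quote


def escape_filter_value(value):
    """Percent-escape a filter value, encoding non-ASCII text as UTF-8 first."""
    return quote(value, safe="")


# U+015A is encoded as the two UTF-8 bytes C5 9A before escaping.
print(escape_filter_value("Ślez, Ts."))  # %C5%9Alez%2C%20Ts.
```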
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
This allows sorting to be configured within a field. For example, while
many values are included for search on author, sorting should only be
done on the main entry values. This permits that by having a sort value,
which can be true, false, or null. true and null are pretty much the
same, but false means that a field isn't available for sorting on. By
default (null), fields can be sorted on.
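The three-state rule can be sketched in Python as follows. The mapping
layout, key names, and MARC field values are hypothetical; only the
true/false/null semantics come from the description above.

```python
def is_sortable(sort_flag):
    """None (null) and True both allow sorting; only an explicit False opts out."""
    return sort_flag is not False


def sortable_fields(mappings):
    """Return the MARC fields whose values should feed the sort field."""
    return [m["marc_field"] for m in mappings if is_sortable(m.get("sort"))]


# Hypothetical author mappings: main entry sorts, added entry does not.
author_mappings = [
    {"marc_field": "100a", "sort": True},
    {"marc_field": "700a", "sort": False},
    {"marc_field": "110a", "sort": None},
    {"marc_field": "111a"},  # no sort key behaves like null
]
print(sortable_fields(author_mappings))  # ['100a', '110a', '111a']
```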
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
It doesn't really help.
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
This means that things like itype get "OR"ed together, rather than
"AND"ed like other things.
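One way to picture the resulting behaviour is the Python sketch below:
limits on the same field are ORed together, and the groups are then
ANDed. The function name, the 'field:value' limit format, and the
sample codes are assumptions for illustration.

```python
def combine_limits(limits):
    """OR limits that share a field, then AND the per-field groups."""
    grouped = {}
    for limit in limits:
        field, _, _value = limit.partition(":")
        grouped.setdefault(field, []).append(limit)

    clauses = []
    for terms in grouped.values():
        if len(terms) == 1:
            clauses.append(terms[0])
        else:
            clauses.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(clauses)


print(combine_limits(["itype:BK", "itype:CD", "homebranch:CPL"]))
# (itype:BK OR itype:CD) AND homebranch:CPL
```

ANDing two item types would return nothing for most records, since an
item normally has a single type, which is why same-field limits are
grouped with OR.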
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
Queries are being built, but they seem to be wrong as no results are
returned.
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jesse Weaver <jweaver@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>