This patch adds the 'cross_fields' type to our searches:
https://www.elastic.co/guide/en/elasticsearch/reference/6.8/query-dsl-query-string-query.html#query-string-syntax
Without this patch the search terms seem to all require being in the same field when using Elasticsearch 6
To test:
0 - Set QueryAutoTruncate to 'only if * is added'
1 - Find a record with a title and publisher
2 - Search for a word form the title and confirm the record is returned
3 - Search for a work from the title and the publisher's name
4 - The record is not returned
5 - Apply patch
6 - Repeat #3
7 - The record is returned
Signed-off-by: Victor Grousset/tuxayo <victor@tuxayo.net>
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Without this patch we get
Use of uninitialized value $3 in concatenation (.) or string at /kohadevbox/koha/Koha/SearchEngine/Elasticsearch/QueryBuilder.pm line 943.
This converts the | OR operator to two different regexes so that the
capture group variables will be defined in every case.
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
This updates the regex used for removing colons to capture those with space on either side, and remove the colon
while preserving the space
To test:
1 - Have Koha using ES
2 - Search for:
ti:chess AND chess
3 - You should get a result in sample data, otherwise replace 'chess' with a title in your catalogue
4 - Search for:
ti:chess AND kw:chess
5 - No result
6 - Enable DumpTemplateVarsIntranet and DumpSearchQueryTemplate
7 - Repeate search and check page source
8 - search_query has:
title:chess ANDchess
9 - Apply patch
10 - Repeat
11 - Seaerch works!
12 - query is now:
title:chess AND chess
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Bug 24567: (follow-up) Use dollar sign to refer to captures
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
The match-heading field is a special field used only by the linker, not accessible
to staff or patrons via the interface. This field is used to store the constructed
'search form' used for matching bib headings to authority fields.
In bug 24269 I attempted to use the mappings defined in the inferface and also inject the search term.
This did not work as too many subfields were indexed on their own and leading to false matches.
In this bug we remove the mappings for this field, and create it ourselves during
the indexing process. The C4::Headings module is still used to generate the correct form,
however, the mappings are set based on the authority types in the system. This gives the user
the ability to add new typoes, but prevents mapping changes from breaking linker functionality
To test:
1 - Start form a sample database with ElasticSearch working
2 - Download via Z39.50 2 authorities, one of which is a narrower heading of the other, e.g.:
Waterworks
Waterworks - Costs
3 - Place a heading for the broader term in a record. e.g. Waterworks
In 650$a, without the cataloguing authority plugin. We don't want
the link created now.
You need syspref BiblioAddsAuthorities => allow
4 - Make sure linker is set to default
5 - Attempt to link the records
misc/link_bibs_to_authorities.pl
6 - Linking fails
7 - Apply patch
8 - refresh index settings (if using a custom file, remove 'match-heading')
You can reset mappings in the UI or run this:
misc/search_tools/rebuild_elasticsearch.pl -v -d -r
9 - Reindex ES
10 - Try to link again
11 - It succeeds!
12 - Run the tests
prove t/db_dependent/Koha/SearchEngine/Elasticsearch.t
Signed-off-by: Victor Grousset/tuxayo <victor@tuxayo.net>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Bug 25273: (follow-up)
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
This takes care of more occurences of staff client and changes it to
staff interface, including in code comments.
To test:
- I think in this case careful code review is what we look for.
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
The code in build_query_compat contained a 'my' in the assigning of limits for the search in a conditional
This meant the limits were being set correctly during the conditional, but we blanked when passed to the rest
of the code. The effect was that the searches worked, however, the template params to repeat the search were
incomplete.
Removing the my ensures the same limits are applied during search and on re-sorting
To test:
1 - Enable Elasticsearch
2 - On OPAC perform advanced search, selecting only an itype, ccode, or LOC limit
3 - Attempt to sort results
4 - You are returned to Advanced search
5 - Apply patch
6 - Repeat
7 - It sorts!
Signed-off-by: Lisette Scheer <lisetteslatah@gmail.com>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
We assume all limits from advanced search to be a phrase and quote them
when doing this we should remove the phrase marker to avoid doulbe quoting
To test:
1 - Have koha using ES
2 - Go to advanced search
3 - Limit by a single itemtype that exists
4 - Get some results
5 - Limit by a different itemtype that exists
6 - Get some results
7 - Limit by both itemtypes
8 - Get only the results for the second itemtype
9 - Enable DumpTemplateVarsIntranet and DumpSearchQueryTemplate
10 - Repeat search
11 - View page source and find 'search_query'
12 - See limit looks like itype:("("BK")" OR "("CR")")
13 - Apply patches
14 - Restart all the things
15 - Repeat search for both itemtypes
16 - Note results now include both types
Signed-off-by: Alex Arnaud <alex.arnaud@biblibre.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
To test:
0 - Set DisplayLibraryFacets to 'both' and FacetMaxCount to 20
1 - Have more than 10 branches
2 - Have items in each of those branches
3 - Enable ES, set system preference SearchEngine to Elasticsearch
4 - Search for '*'
5 - Expand homebranch/holdingbranch facets
6 - Note you only get 10
7 - Apply patch
8 - Repeat search
9 - Now you get all your facets
Signed-off-by: Michael Springer <mspringer@mylakelibrary.org>
Signed-off-by: Josef Moravec <josef.moravec@gmail.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
To recreate:
1 - Have Koha using ES5 and Elasticsearch as search engine
2 - Enable DumpTemplateVarsIntranet and DumpSearchQueryTemplate
3 - Do a search in authorities using 'Search entire record' (abduction if using sample db)
4 - Note no results
5 - View the page source and find 'search_query'
6 - Note the uppercased fields
7 - curl 'es:9200/koha_kohadev_authorities/data/417?pretty'
8 - Note all fields lower-cased
9 - Apply patch
10 - Repeat search
11 - It works!
Signed-off-by: Heather Hernandez <heather_hernandez@nps.gov>
Signed-off-by: Alex Arnaud <alex.arnaud@biblibre.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Victor Grousset/tuxayo <victor@tuxayo.net>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
In the advanced search form, you can enable several limits using syspref
AdvancedSearchTypes (namely itemtypes, shelving locations, collections)
When used, the resulting query parts end up being joined with OR, even
if the field is different. That means that if you pick "Book" under
itemtypes tab, and "Fiction" under collection tab, it will search
"itype:BOOK OR ccode:FIC". It should be AND.
For instance, if you select:
Itemtypes:
✓ Book
✓ DVD
Location:
✓ Child
✓ Adult
it should search:
itype:(Book OR DVD) AND location:(Child OR Adult)
Test plan:
0. Do not apply the patch yet
1. Enable elasticsearch
2. Set syspref AdvancedSearchTypes = 'itemtypes|loc|ccode'
3. Create a new itemtype and a new authorised value for categories LOC
and CCODE
4. Create a biblio with the new itemtype, another biblio with the new
location, another biblio with the new collection, and again another
biblio with the new itemtype, location and collection
5. Verify that you can find these new biblio records using only the
"advanced search types" in the advanced search form
6. In the advanced search form, pick all 3 limits (itemtype, location,
collection) and verify that it returns the 4 records.
7. Apply the patch
8. Repeat step 6, it should now return only the biblio that satisfies
all criteria
9. Verify that if you select more than one
{itemtype|location|collection} it still returns results that
satisfies any selected criteria
10. prove t/Koha/SearchEngine/ElasticSearch/QueryBuilder.t
Signed-off-by: Bernardo Gonzalez Kriegel <bgkriegel@gmail.com>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Not only should terms from facets/limits be grouped, but order should probably matter
We build a different facet for "Dillinger Girl" or "Girl Dillinger"
The parens are possibly overkill now, but I think it makes it very clear and does not hurt
To test:
1 - Follow original plan
2 - Create a new record with 'Girl Dillinger' as author
3 - Search for 'Dill*'
4 - You get all three records (and maybe others that match)
5 - Limit by 'Girl Dillinger' - you get two records
6 - Same for 'Dillinger Girl'
7 - Apply patch
8 - Limits/facets for 'Dillinger Girl' and 'Girl Dillinger' now match a isngle record
Signed-off-by: Ere Maijala <ere.maijala@helsinki.fi>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Test plan:
- Use Elasticsearch 6 (you'll need Bug 18969),
- create a biblio (#1) with "Dillinger Girl" in author and what you
want in title,
- create another biblio (#2) with the word "girl" in the title and
"Dillinger Escaplan" as author
- reindex
- search * and refine on "Dillinger Girl"
- Ko => Biblio #1 and #2 appear
- Apply this patch,
- search * and refine on "Dillinger Girl"
- Ok => anly biblio #1 appears
- use Elasticsearch 5 again
- check for no search regression
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Ere Maijala <ere.maijala@helsinki.fi>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
A search made on Shelving location tab in advanced search gives 0 result.
Test plan :
Without patch:
- in advanced search, choose a location in shelving location tab and start search (opac-search.pl?idx=kw&op=and&idx=kw&op=and&idx=kw&do=Search&limit=mc-loc)
- 0 result
- check that location exist by doing a simple search and filtering on the location facet
- have some results
With patch:
- in advanced search, choose a location in shelving location tab and start search (opac-search.pl?idx=kw&op=and&idx=kw&op=and&idx=kw&do=Search&limit=mc-loc)
- found results
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
To test:
1 - Enable ES
2 - Enable OpacSuppression
3 - Suppress a bib in staff client
Edit 942n to be 1
4 - Search the opac for anything
5 - Tail the plack logs and note a deprecation warning
6 - Apply patch
7 - Restart all the things
8 - Repeat search on opac
9 - No error/warn in logs
10 - Record is correctly suppressed
Signed-off-by: Myka Kennedy Stephens <mkstephens@lancasterseminary.edu>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
To test:
1 - In staff or OPAC using ES, search for "biblionumber:3" or any existing biblionumber
2 - No results
3 - Apply patch, restart all the things
4 - Search again
5 - You go to the biblionumber
Signed-off-by: Bob Bennhoff <bbennhoff@clicweb.org>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Note: I also remove warnings for undefined operation in this patch, is a trivial fix
To test:
prove -v t/db_dependent/Koha/SearchEngine/Elasticsearch/QueryBuilder.t
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
To test:
1 - Export your authorities via Tools->Export data
2 - Define a record matching rule in Admin->Record matchign rules
Use index: LC-card-number
field: 010$a
3 - Stage the exported records for import and use the rule created above for matching
4 - The process does not complete
5 - Check intranet error logs - exception on unknown marclist
6 - Apply patch
7 - Repeat
8 - Success!
9 - prove -v t/db_dependent/Koha/SearchEngine/Elasticsearch/QueryBuilder.t
Signed-off-by: Andrew Fuerste-Henry <andrew@bywatersolutions.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Adds support for using the "scan indexes" action in advanced search by using faceting with a prefix filter. Requires that the field be set as facetable for anything to be found.
Test plan:
1. Apply patch
2. Go to advanced search and click "More options"
3. Select author as the search field, enter a last name and check "Scan indexes"
4. Perform search and observe the result list resembling scan results
Signed-off-by: Michal Denar <black23@gmail.com>
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
As of bug 20589 we no longer analyze sort fields and so we no longer need to append ".phrase"
to our sort in searches.
Additionally, sort fields based on 'sum' should also use sum in building the value to sort on
To test:
0 - Be using ES
1 - Find the most circulated item in your collection
2 - Search for '*'
3 - Sort by popularity DESC
4 - Note that item is not first
5 - Try to sort by anything but relevancy, it fails
6 - Apply patch
7 - Redo searches and sorts
8 - Things should now work as expected
Signed-off-by: Ere Maijala <ere.maijala@helsinki.fi>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Test plan
- Set up a elasticsearch 6 instance to work with Koha,
- you may need to make koha works with ES 6 (see bug 20589),
- make a search and limit it to available items only,
=> no result
- Apply this patch,
- make a search and limit it to available items only,
- you should get some results
- Do the same with ES 5.x
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
During cataloguing of an existing biblio, on an heading field,
the use of tag editor fills authorities finder with existing value :
Search main heading ($a only)
Search main heading
Default operator beeing 'contains'.
Actually with Elasticsearch those search give no results.
Example with heading :
200
$a Casaubon
$b Isaac
$f 1559-1614
Call to Elasticsearch :
"query" : {
"bool" : {
"must" : [
{
"query_string" : {
"query" : "Casaubon*",
"default_field" : "heading-main",
}
},
{
"query_string" : {
"query" : "(Isaac*) AND (1559-1614*)",
"default_field" : "heading"
}
}
]
}
},
"sort" : [
{
"heading__sort.phrase" : "asc"
}
]
}
Patch adds to "query_string" :
analyze_wildcard : true.
Test plan :
1) Use Elasticsearch
2) Edit an existing biblio record
3) Use tag editor on a heading
4) Click search => You get correct results
5) Check also search in authorities-home.pl
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Add "QueryRegexEscapeOption" system preference to provide option to escape
Elasticsearch regexp delimiters (/) within queries, or alternativly to
unescape escaped slashes (\/) while escaping unescaped slashes, in
effect making "\/" the new regexp delimiter.
How to test:
1) Run tests in ./t/Koha/SearchEngine/ElasticSearch/QueryBuilder.t
2) All tests should succeed
Signed-off-by: Nicolas Legrand <nicolas.legrand@bulac.fr>
Signed-off-by: Maksim Sen <maksim.sen@inlibro.com>
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Alex Arnaud <alex.arnaud@biblibre.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Alex Arnaud <alex.arnaud@biblibre.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Generate a list of fields for the query_string query fields parameter,
with possible boosts, instead of using "_all"-field. Also add "search"
flag in search_marc_to_field table so that certain mappings can be
excluded from searches. Add option to include/exclude fields in
query_string "fields" parameter depending on searching in OPAC or staff
client. Refactor code to remove all other dependencies on "_all"-field.
How to test:
1) Reindex authorities and biblios.
2) Search biblios and try to verify that this works as expected.
3) Search authorities and try to verify that this works as expected.
4) Go to "Search engine configuration"
5) Change some "Boost", "Staff client", and "OPAC" settings and save.
6) Verify that those settings where saved accordingly.
7) Click the "Biblios" or "Authorities" tab and change one or more
"Searchable" settings
8) Verfiy that those settings where saved accordingly.
9) Try to verify that these settings has taken effect by peforming
some biblios and/or authorities searches.
Sponsorded-by: Gothenburg Univesity Library
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Alex Arnaud <alex.arnaud@biblibre.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
In cataloguing, the use of tag editor opens authorities finder with a limit on specific authorities type.
This limit is missing with Elasticsearch.
This patch adds in query a "filter" on "term" which is the most performant way because there must be no ranking computed on this part.
Test plan :
1) Use Elasticsearch
2) Create an autority of type author (NP in UNIMARC) with heading "Tolkien"
3) Create an autority of type subject-author (SAUT in UNIMARC) with heading "Tolkien"
4) Create a biblio record
5) Use tag editor on a author field (700 in UNIMARC)
6) Seach for "Tolkien" in $a
without patch : you see the 2 authorities in results
with patch : you see only the correct authority in results (NP in UNIMARC)
7) Check search in authorities-home.pl is still OK
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Avoid using slash in the field name since it would need to be escaped. Fix conversion of dtlm and any existing mapping.
Signed-off-by: Liz Rea <wizzyrea@gmail.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Also fixes the mappings.yaml to use correct field name (left over issue from bug 19575), so reset mappings and reindex before testing.
Test plan:
1. Reset mappings and reindex biblios.
2. Check that tests in t/db_dependent/Koha/SearchEngine/Elasticsearch/QueryBuilder.t pass.
3. Try that all of the following year range type work in publication date search and year limit in advanced search:
yyyy
yyyy-yyyy
-yyyy
yyyy-
Signed-off-by: Liz Rea <wizzyrea@gmail.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
This patch removes the constraint of only passing 5 facets to the template unless the list is expanded, in fact, it removes the 'expanded' attribute from Search.pm
Now that all facets are passed to page it adds a 'show more' link at the bottom of lists and allows user to expand or collapse any facet set without reloading page.
Updated tests included.
To test:
1 - Perform an OPAC search that returns more than 5 of any given facet type
2 - Click the "Show more" link on the facets and see that the search is reloaded
3 - Apply patch
4 - Repeat search
5 - Note that you can click "Show more" without reloading page
6 - Test that page load is not greatly affected
7 - Ensure that all facet links function normally
8 - Ensure that facets are the same a prior to patch
9 - Repeat for staff client
10 - Prove t/Search.t
NOTE: This patch makes it much easier to see that there is an existing issue with marking the "active" facet. Ending punctuation seems to confuse the matcher.
Signed-off-by: Michal Denar <black23@gmail.com>
Signed-off-by: Josef Moravec <josef.moravec@gmail.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
You get no results when searching with an hyphen + with * in query string (or with preference QueryAutoTruncate) :
ie /cgi-bin/koha/opac-search.pl?q=saints-anges*
Looks like query-string by default does not compute wildcards, see analyze_wildcard in :
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html
Test plan :
1) Use Elasticsearch
2) Create a record with "saints-anges"
3) Search for "saints-anges" => you get results
4) Search for "saints-anges*" => you get results
5) Search for "saints-ang*" => you get results
Signed-off-by: Michal Denar <black23@gmail.com>
Signed-off-by: Arthur Bousquet <arthur.bousquet@inlibro.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Test plan:
After each of the following steps, verify that the results and query description looks ok.
1. Do a simple keyword query
2. Change sort order
3. Click a facet
4. Do an advanced search with title and author
5. Change sort order
6. Click a facet
Signed-off-by: Josef Moravec <josef.moravec@gmail.com>
Signed-off-by: Marjorie <marjorie.barry-vila@collecto.ca>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Test plan:
1. Do an advanced search for
Title = new
AND
Title = york
2. Verify that the results match an advanced search for:
Title = new york
3. Verify that tests in t/db_dependent/Koha/SearchEngine/Elasticsearch still pass
Signed-off-by: Michal Denar <black23@gmail.com>
Signed-off-by: Josef Moravec <josef.moravec@gmail.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
This patch add language as a facet to ES results - it adds
a new template plugin for languages to get the appropriate
description given an iso 639-2 code
To test:
1 - Make sure you have records with differing languages (in the MARC21 008
field characters 35-37 or UNIMARC 101a)
2 - Apply patch
3 - Reload Elasticsearch settings:
http://localhost:8081/cgi-bin/koha/admin/searchengine/elasticsearch/mappings.pl?op=reset&i_know_what_i_am_doing=1
4 - Reindex your records
5 - Search for a phrase that will return results in several languages
6 - Verify you see factes correctly labelled for 'Language'
7 - Verify the facets work
8 - Verify both opac and staff results
9 - prove t/db_dependent/Languages.t
Signed-off-by: Nicolas Legrand <nicolas.legrand@bulac.fr>
Signed-off-by: Alex Arnaud <alex.arnaud@biblibre.com>
Signed-off-by: Josef Moravec <josef.moravec@gmail.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Escapes unquoted colons that have whitespace on either side. Removed unbalanced quotes.
Test plan:
1. Make sure the test case described in the bug works
2. Make sure tests pass:
prove t/Koha/SearchEngine/Elasticsearch/QueryBuilder.t
Signed-off-by: Björn Nylén <bjorn.nylen(at)ub.lu.se>
Signed-off-by: Josef Moravec <josef.moravec@gmail.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Keep authority-number as alias and change field name
from alias to real field in hard coded Elasticsearch query
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
This patch lower cases the sort by fields to normalize checking them and adjusts
some existing tests to meet the new expectations.
The regex for splitting terms has been moved into a subroutine so that adjustment was made
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Adjust elastic search mappings to more closely match Zebra equivalents
resolving serveral issues with coded Zebra searches in templates and
sorting of search results in UI. Also make field names in search strings
case insensitive to accept case variations in template links and user input.
Sponsored-by: Gothenburg University Library
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Ere Maijala <ere.maijala@helsinki.fi>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
- Makes token splitting work like in biblio search
- Only adds right hand truncation since that's what Zebra also does
- Only adds truncation if the token is not a phrase and ends in a word character
- Adds tests to existing and new QueryBuilder functions
Test plan:
1. Create an authority record for "Duck, Donald"
2. Try contains type authority searches with the following terms and make sure they find the record:
Duck, Donald
donald duck
don duck
"Duck, Donald"
3. Make sure the following search works but does not find anything:
"Duck, Don"
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Sponsored-by: National Library of Finland
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Also optimize it so it's actually usable.
Test plan:
1. To test it properly you need biblio and authority data. You might get away with enabling AutoCreateAuthorities and BiblioAddsAuthorities so that authorities are created in the process. Another option would be to import authorities first e.g. from LoC.
2. Make sure the authority index has been properly created with "misc/search_tools/rebuild_elastic_search.pl -a -d"
3. Run "misc/link_bibs_to_authorities.pl -v -l" twice and observe the results.
Sponsored-by: National Library of Finland
Signed-off-by: Brendan Gallagher <brendan@bywatersolutions.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
It is true by default on ES 5 and has been removed on ES 6
Signed-off-by: Ere Maijala <ere.maijala@helsinki.fi>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Rebased-by: Alex Arnaud <alex.arnaud@biblibre.com>
Signed-off-by: Ere Maijala <ere.maijala@helsinki.fi>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
The idea is the following: if some search field(s) are weighted in
search engine config page, Koha will query ES on all fields plus those with
the coresponding weights. Else, search is done on the entire record with
no weighting. The advanced search page is unaffected by these changes
Test plan (having Koha working with Elasticsearch):
- apply this patch
- have some weights defined for various fields
- try searches from the search bar and from the advanced search page
- confirm weighting affects the relevancy (in expected ways)
e.g.
1. search for 'a' from advanced search, note results
2. give 'title' a weight
3. search for 'a' using the simple search bar
4. results with 'a' in the title should now be more relevant
- confirm search results on advanced search page are unaffected
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Rebased-by: Alex Arnaud <alex.arnaud@biblibre.com>
Signed-off-by: Ere Maijala <ere.maijala@helsinki.fi>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Test plan:
- Try a simple search (OPAC or staff) like "title:foo",
- click on a facet: badaboum!
- apply this patch,
- retry
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
To test:
1 - Do some authority searches in Zebra
2 - Switch to ES and repeat, results will vary and some may fail
3 - Apply patch and dependencies
4 - Reindex ES
5 - Repeat searches, they should suceed and results should be similar to
Zebra
6 - Slight differences are okay, but results should (mostly) meet
expectations
A few notes:
We add a 'normalizer' to ensure we get a single token from the heading
indexes, this makes 'starts with' work as expcted
We switch to 'AND' for fields searched from cataloging editor - this
matches Zebra results
We force the '__sort' fields for sorting - if sorting looks wrong try
reducing the heading field to a single subfield - this will need to be
addressed on a future bug (multiple subfields create an array, ES sorts
those randomly)
Signed-off-by: Nicolas Legrand <nicolas.legrand@bulac.fr>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
We likely shoudln't pass through an uncoverted sort order for now, but
it does allow us to look ahead to implementing the orders directly so
seems a good option to have.
Either this patch should be used, or the commented out tests should be
removed
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>