Test plan:
1. Create an authority with at least a 1XX$a and a 4XX$a, for instance:
100 $a Foo
400 $a Bar
2. Create a biblio and add a link to this authority using the
cataloguing plugin
3. Disable syspref IncludeSeeFromInSearches
4. Reindex the biblio. You should be able to find it when searching
'Foo' but not when searching 'Bar'
5. Enable syspref IncludeSeeFromInSearches
6. Reindex the biblio. You should be able to find it when searching
'Foo' and also when searching 'Bar'
Without the patch, 'Bar' doesn't yield results. With it, it does.
7. prove t/db_dependent/Koha/SearchEngine/Elasticsearch.t
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Signed-off-by: Lucy Vaux-Harvey <lucy.vaux-harvey@ptfs-europe.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
This bug adds a system preference to control ordering of facets and
adds the control to both Zebra and Elasticsearch
To test:
1 - Have a Koha that can use both Zebra and ES
2 - Set 'displayFacetCount' to true
3 - Search in ES and Zebra
4 - Note facets in Zebra sorted alphabetically, ES by usage
5 - Apply patch, updatedatabase
6 - Search in ES and Zebra, facets are alphabetically sorted in both
7 - Find new syspref FacetOrder and set to 'by usage'
8 - Search in both engines, facets sorted by usage
Signed-off-by: David Nind <david@davidnind.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Test plan:
- Create a biblio with title like "osteuropa:" or "osteuropa!"
- Go to this biblio's detail page (cgi-bin/koha/catalogue/detail.pl)
=> Error
- Apply bug 28316 and this one
- test again
Signed-off-by: Alex Buckley <alexbuckley@catalyst.net.nz>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
With all the work that's gone into improving the internal
_clean_search_term method, I feel we should expose it publicly, as it's
going to be more widely helpful.
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
If QueryAutoTruncate is enabled, any special operators get ruined:
for example, "test [6 TO 7]" will be converted to "test* [6* TO* 7]",
so there is no reason to keep ranges when QueryAutoTruncate is set to "enabled".
1) Enable QueryAutoTruncate in your sysprefs.
2) Perform a search using a range, for example "[1999 TO 2020]";
it shouldn't work the way it's supposed to.
3) Apply the patch.
4) Perform the same range search and ensure that it works correctly.
Signed-off-by: Alex Buckley <alexbuckley@catalyst.net.nz>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
This patch ensures that searching with
QueryRegexEscapeOptions set to values other than
"Escape" still works as expected.
It does so by storing the contents of regexes
before escaping special characters and
then restoring them to how they were
before, so that searching with a regex remains possible.
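The store-and-restore idea can be sketched as follows. This is a minimal illustration, not the code from the patch: the helper name escape_outside_regexes, the NUL-delimited placeholder, and the assumption that a regex is any /.../ span are all made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: escape brackets everywhere except inside /regex/ spans.
sub escape_outside_regexes {
    my ($query) = @_;
    my @saved;

    # 1. Stash each /.../ span and leave a numbered placeholder behind.
    $query =~ s{(/[^/]+/)}{ push @saved, $1; "\x00" . $#saved . "\x00" }ge;

    # 2. Escape special characters in what remains.
    $query =~ s/([\[\]{}])/\\$1/g;

    # 3. Put the stashed regexes back untouched.
    $query =~ s/\x00(\d+)\x00/$saved[$1]/g;

    return $query;
}

print escape_outside_regexes('title:/gr[ae]y/ [draft]'), "\n";
# prints: title:/gr[ae]y/ \[draft\]
```

The brackets inside the regex survive, while the ones in plain text are escaped. A real implementation would also have to cope with slashes that are not regex delimiters.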
Signed-off-by: Alex Buckley <alexbuckley@catalyst.net.nz>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Currently, an exclamation mark at the end of the query makes the ES
search fail, and searching for a book that has an exclamation
mark in the title (something like "Words! words") won't show results
correctly, because ES tries to negate everything after the exclamation
mark, making it impossible to find books that have it in the title.
This patch escapes an exclamation mark if it is at the end of the query or
has a space after it, resolving both of the issues listed above.
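A minimal sketch of that rule, assuming a hypothetical helper name and a simple lookahead (the regex in the patch itself may differ):

```perl
use strict;
use warnings;

# Hypothetical helper: escape "!" only when it is followed by whitespace
# or by the end of the string, so a leading "!term" negation still works.
sub escape_trailing_exclamation {
    my ($term) = @_;
    $term =~ s/!(?=\s|\z)/\\!/g;
    return $term;
}

print escape_trailing_exclamation($_), "\n"
    for 'book!', 'exclamation! sign', '!negated';
# prints:
# book\!
# exclamation\! sign
# !negated
```

The sketch does not check whether the exclamation mark was already escaped by the user.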
To reproduce:
1) With ES enabled, search for a book whose title has an
exclamation mark at the end, like "book!"; this search should result
in an error.
2) Do another search, but this time find or prepare beforehand a book with a
title that has an exclamation mark followed by a space,
e.g. "exclamation! sign"; the search shouldn't find it, as ES treats everything
after the exclamation mark as a negation.
3) Apply the patch.
4) Perform the searches from steps one and two again.
The search from step one should no longer fail, and the search from step
two should find the book.
Signed-off-by: Victor Grousset/tuxayo <victor@tuxayo.net>
Signed-off-by: Alex Buckley <alexbuckley@catalyst.net.nz>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
This patch escapes square and curly brackets that have no special
query-language meaning.
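As a rough sketch, assuming that "special meaning" refers to range expressions such as [1999 TO 2020] (an assumption, as are the helper name and the range pattern), the escaping could look like this:

```perl
use strict;
use warnings;

# Hypothetical helper: backslash-escape brackets unless they delimit a
# range such as [1999 TO 2020] or {a TO z}.
sub escape_plain_brackets {
    my ($term) = @_;
    my $range = qr/[\[{]\S+ TO \S+[\]}]/;

    # split with a capture group keeps the ranges as separate pieces.
    my @parts = split /($range)/, $term;
    for my $part (@parts) {
        next if $part =~ /\A$range\z/;    # leave ranges untouched
        $part =~ s/([\[\]{}])/\\$1/g;     # escape everything else
    }
    return join '', @parts;
}

print escape_plain_brackets('book [second edition]'), "\n";
# prints: book \[second edition\]
```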
To reproduce:
1) Using ES, search for a book whose title contains
square and/or curly brackets, like "book [second edition]"; this will
result in an error.
2) Apply the patch.
3) Search for that book again and ensure that it works now.
Signed-off-by: Victor Grousset/tuxayo <victor@tuxayo.net>
Signed-off-by: Alex Buckley <alexbuckley@catalyst.net.nz>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Currently, searches like "book:", ":book" and "host-item:test:n"
cause internal server errors.
This patch adds regexes that remove colons at the start
and end of the query, and another regex that escapes every
colon after the first one, to avoid errors when searching for
"host-item:test:n".
To reproduce:
1) Using ES, search for a book whose title contains a
colon at the start or at the end of the line, separated with spaces;
this should cause an internal server error.
2) Try the same with something like "host-item:test:n"; it should
result in an error as well.
3) Apply the patch.
4) Repeat steps 1-2 and ensure that they work now.
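The colon handling described in this commit can be sketched as below. The helper name is hypothetical and the sketch uses index/substr for the last step where the patch uses a regex, so treat it as an illustration of the behaviour only.

```perl
use strict;
use warnings;

# Hypothetical helper: strip colons at either end of the query, then
# escape every colon after the first one.
sub clean_colons {
    my ($term) = @_;
    $term =~ s/^[:\s]+//;    # leading colons (and surrounding spaces)
    $term =~ s/[:\s]+$//;    # trailing colons (and surrounding spaces)

    my $first = index $term, ':';
    if ( $first >= 0 ) {
        my $head = substr $term, 0, $first + 1;    # up to and including the first colon
        my $tail = substr $term, $first + 1;
        $tail =~ s/:/\\:/g;                        # escape the remaining colons
        $term = $head . $tail;
    }
    return $term;
}

print clean_colons($_), "\n" for 'book:', ':book', 'host-item:test:n';
# prints:
# book
# book
# host-item:test\:n
```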
Signed-off-by: Victor Grousset/tuxayo <victor@tuxayo.net>
Signed-off-by: Alex Buckley <alexbuckley@catalyst.net.nz>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
This patch moves the code from the search scripts into the QueryBuilder
modules.
To test:
1 - Have a library group defined as a search group for both staff and opac
2 - Search on staff client and opac with that group limit and a single branch limit
3 - Note your results/counts
4 - Note the visuals of the search description
5 - Apply patch
6 - Repeat searches
7 - All should work as before
Signed-off-by: Andrew Fuerste-Henry <andrew@bywatersolutions.com>
Signed-off-by: Joonas Kylmälä <joonas.kylmala@iki.fi>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
There is an existing error message in the code stating:
"Unable to understand your search query, please rephrase and try again."
which fits perfectly, but because the code looks for "ParseException" in the
warning output, it doesn't show up on this page, as the output actually contains
"parse_exception".
This patch also checks whether "parse_exception" is present
in the warning output.
Side note:
"ParseException" reaction code was added here:
e0f6c4dc Bug 12478: improve error reporting a bit
Search::Elasticsearch seems to propagate the clean ES JSON answer,
and in the current ES version $@ contains the "parse_exception"
string in the dumped JSON answer ("'type' => 'parse_exception'").
The previously sought phrase "ParseException" could not be reproduced there; it only appears in the ES logs
("Caused by: org.apache.lucene.queryparser.classic.ParseException:
Cannot parse ..."). Checking for both phrases won't complicate future
changes, but this note is added for reference and code cleanup if needed.
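A minimal sketch of the check, assuming a hypothetical helper that receives the error text; the two user-facing messages are the ones quoted in this commit message:

```perl
use strict;
use warnings;

# Hypothetical helper: choose the user-facing message from the error text.
sub search_error_message {
    my ($error) = @_;

    # Accept both the old Java class name and the JSON "type" value.
    return 'Unable to understand your search query, please rephrase and try again.'
        if $error =~ /ParseException|parse_exception/;

    return 'Unable to perform your search. Please try again.';
}

print search_error_message(q{'type' => 'parse_exception'}), "\n";
print search_error_message('NoNodes'), "\n";
```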
To reproduce:
1) using ES search for something like "// ^ ! { } [ ] .. , <>" that
will for sure break the syntax of ES.
2) after the search query fails note that the error is
"Unable to perform your search. Please try again."
3) apply the patch
4) search for the same thing again
5) error message should be "Unable to understand your search query,
please rephrase and try again." now.
Signed-off-by: David Nind <david@davidnind.com>
Signed-off-by: Victor Grousset/tuxayo <victor@tuxayo.net>
Signed-off-by: Andrew Nugged <nugged@gmail.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
This patch adds the cni/Control-number-identifier index to enable
searches to use the 003 field.
Test plan
1/ Apply patch
2/ Re-index using updated configurations
3/ Confirm cni:number searches yield the expected results
4/ Signoff
Split-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Pasi Kallinen <pasi.kallinen@koha-suomi.fi>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: David Nind <david@davidnind.com>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
In advanced search with Elasticsearch, the year range limit actually uses copydate:
Koha/SearchEngine/Elasticsearch/QueryBuilder.pm, in _fix_limit_special_cases():
    if ( $l =~ /^yr,st-numeric,ge=/ ) {
        my ( $start, $end ) =
          ( $l =~ /^yr,st-numeric,ge=(.*) and yr,st-numeric,le=(.*)$/ );
        next unless defined($start) && defined($end);
        push @new_lim, "copydate:[$start TO $end]";
    }
Zebra uses date-of-publication instead, and so does the mapping in Koha/SearchEngine/Elasticsearch/QueryBuilder.pm:
    our %index_field_convert = (
        (...)
        'yr' => 'date-of-publication',
This patch uses %index_field_convert to build the 'yr' limit.
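The intended result can be sketched like this. It is not the patched method: the function name is hypothetical, the hash holds only the one entry quoted above, and only the range form of the limit is handled.

```perl
use strict;
use warnings;

# Only the 'yr' entry is known from this commit message.
my %index_field_convert = ( yr => 'date-of-publication' );

# Hypothetical helper: turn a year-range limit into an ES range query
# on whatever field 'yr' maps to.
sub convert_year_range_limit {
    my ($limit) = @_;
    my ( $start, $end ) =
      $limit =~ /^yr,st-numeric,ge=(.*) and yr,st-numeric,le=(.*)$/;
    return $limit unless defined $start && defined $end;
    return sprintf '%s:[%s TO %s]', $index_field_convert{yr}, $start, $end;
}

print convert_year_range_limit('yr,st-numeric,ge=1999 and yr,st-numeric,le=2020'), "\n";
# prints: date-of-publication:[1999 TO 2020]
```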
Test plan:
1) Apply patch
2) Use Elasticsearch searchengine
3) Go to advanced search with 'More options'
4) Perform a search with a year limit (value or range)
5) Check results are correct
Signed-off-by: David Nind <david@davidnind.com>
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
On bug 17591 we discovered that there was something weird going on with
the way we export and use subroutines/modules.
This patch tries to standardize our EXPORT to use EXPORT_OK only.
That way we will need to explicitly define the subroutine we want to
use from a module.
This patch is a squashed version of:
Bug 17600: After export.pl
Bug 17600: After perlimport
Bug 17600: Manual changes
Bug 17600: Other manual changes after second perlimports run
Bug 17600: Fix tests
And a lot of other manual changes.
export.pl is a dirty script that can be found on bug 17600.
"perlimport" is:
git clone https://github.com/oalders/App-perlimports.git
cd App-perlimports/
cpanm --installdeps .
export PERL5LIB="$PERL5LIB:/kohadevbox/koha/App-perlimports/lib"
find . \( -name "*.pl" -o -name "*.pm" \) -exec perl App-perlimports/script/perlimports --inplace-edit --no-preserve-unused --filename {} \;
The ideas of this patch are to:
* use EXPORT_OK instead of EXPORT
* perltidy the EXPORT_OK list
* remove '&' before the subroutine names
* remove some unneeded use statements
* explicitly import the subroutines we need within the controllers or
modules
Note that the private subroutines (starting with _) should not be
exported (and not used from outside of the module except from tests).
EXPORT vs EXPORT_OK (from
https://www.thegeekstuff.com/2010/06/perl-exporter-examples/)
"""
Export allows to export the functions and variables of modules to user’s namespace using the standard import method. This way, we don’t need to create the objects for the modules to access it’s members.
@EXPORT and @EXPORT_OK are the two main variables used during export operation.
@EXPORT contains list of symbols (subroutines and variables) of the module to be exported into the caller namespace.
@EXPORT_OK does export of symbols on demand basis.
"""
If this patch caused a conflict with a patch you wrote prior to its
push:
* Make sure you are not reintroducing a "use" statement that has been
removed
* "$subroutine" is not exported by the C4::$MODULE module
means that you need to add the subroutine to the @EXPORT_OK list
* Bareword "$subroutine" not allowed while "strict subs"
means that you didn't import the subroutine from the module:
- use $MODULE qw( $subroutine list );
You can also use the fully qualified namespace: C4::$MODULE::$subroutine
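For illustration, here is a minimal single-file sketch of the EXPORT_OK pattern. My::Example and its subroutines are hypothetical; a real module would live in its own .pm file, so the BEGIN block and the %INC line are only there to keep the example self-contained.

```perl
use strict;
use warnings;

# Hypothetical module, defined inline so the example is a single file.
BEGIN {
    package My::Example;
    use Exporter 'import';

    # Nothing is exported by default; callers must ask for each name.
    our @EXPORT_OK = qw( format_title format_author );

    sub format_title  { my ($title) = @_; return ucfirst lc $title; }
    sub format_author { my ($name)  = @_; return uc $name; }

    # Pretend the module was loaded from disk so "use" does not search @INC.
    $INC{'My/Example.pm'} = __FILE__;
}

# Explicit import: only format_title enters the caller's namespace.
use My::Example qw( format_title );

print format_title('KOHA'), "\n";                 # imported by name
print My::Example::format_author('druart'), "\n"; # fully qualified, not imported
# prints:
# Koha
# DRUART
```

Asking for a name that is not in @EXPORT_OK makes the use statement fail at compile time, which is the first error listed above.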
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
In Elasticsearch, the query for biblios uses lenient=true.
This is also needed for authority searches,
in case a search field is defined with a numeric type.
Test plan :
1) Use Elasticsearch searchengine
2) Define a search field 'local-number' as type 'Number'
3) Be sure to use 'local-number' in the authorities mapping
4) Rebuild authorities
5) Perform a search for authorities with 'Search entire record' and
'contains' with term '123'
=> Without patch you get error :
[query_shard_exception] Can only use prefix queries on keyword and text fields - not on [local-number] which is of type [integer]
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Signed-off-by: Victor Grousset/tuxayo <victor@tuxayo.net>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
'error' has a special meaning in exceptions, so the fields are named:
type, details
Rather than only dealing with a single exception type, we generically
get the ES exception info and pass it up.
I still could not recreate the timeout; however, I simply restarted the
ES docker during the commit stage to cause NoNodes exceptions.
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
This catches a timeout response from the ES server, logs this, and continues the indexing
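The catch, log and continue behaviour can be sketched with a plain eval. The helper below and its simulated server are hypothetical; the real indexer talks to Elasticsearch and uses Koha's own exception classes.

```perl
use strict;
use warnings;

# Hypothetical helper: $commit is a code ref that sends one batch to the
# search server and dies on failure (for example on a timeout).
sub index_in_batches {
    my ( $commit, @batches ) = @_;
    my @failed;
    for my $i ( 0 .. $#batches ) {
        # eval traps the exception so one bad commit cannot end the run.
        my $ok = eval { $commit->( $batches[$i] ); 1 };
        next if $ok;
        my $error = $@ || "unknown error\n";
        warn "Batch $i failed to commit: $error";
        push @failed, $i;
    }
    return @failed;
}

# Simulated server: the second batch times out.
my @indexed;
my @failed = index_in_batches(
    sub {
        my ($batch) = @_;
        die "timeout\n" if $batch->[0] == 3;
        push @indexed, @$batch;
    },
    [ 1, 2 ], [ 3, 4 ], [ 5, 6 ],
);
print "indexed: @indexed; failed batches: @failed\n";
# prints: indexed: 1 2 5 6; failed batches: 1
```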
To test:
1 - perl misc/search_tools/rebuild_elasticsearch.pl
2 - Make the ES server timeout (I don't have good instruction yet)
3 - Watch the job crash
4 - Apply patches
5 - perl misc/search_tools/rebuild_elasticsearch.pl
6 - Make the server timeout
7 - Note the job reports failed commit, and continues
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Bug 26312: (follow-up) Reset buffers even if commit fails
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Bug 26312: (follow-up) Fix whitespace and missing semicolon
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
If we are searching on kw there is a leading colon at the beginning of
the generated query:
kw:foo becomes :foo
Note that it only happens when there are no other terms before it.
Test plan:
0. Don't apply the patch
1. Search for kw:foo
2. Notice the error
Error: Unable to perform your search. Please try again.
and the logs say
Failed to parse query [(:foo*)]
3. Apply the patch
4. Repeat the search and notice that you now get:
"12 result(s) found for 'kw:foo'."
Signed-off-by: Fridolin Somers <fridolin.somers@biblibre.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
koha-elastic --rebuild was failing with
Unable to update mappings for index "koha_kohadev_biblios". Reason was: "Could not convert [marc_data.index] to boolean"
Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Julian Maurice <julian.maurice@biblibre.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Subdivision authorities are not used for linking, however, they are recognized by
C4::AuthoritiesMarc
While these records are not used for linking, they could provide reference and
should be allowed to exist in the catalog without breaking ES indexing
This patch simply skips the step of parsing the authorities into the linking form
if the type contains '_SUBD'
To test:
1 - Import a subdivision authority record via Z39 or use the one attached to this bug
2 - perl misc/search_tools/rebuild_elasticsearch.pl -v -d
3 - Authority indexing dies:
Use of uninitialized value $tag in hash element at /usr/share/perl5/MARC/Record.pm line 202.
Use of uninitialized value $tag in regexp compilation at /usr/share/perl5/MARC/Record.pm line 206.
Use of uninitialized value $tag in hash element at /usr/share/perl5/MARC/Record.pm line 207.
Can't call method "tag" on an undefined value at /kohadevbox/koha/C4/Heading.pm line 71.
4 - Apply patches
5 - reindex
6 - Success!
Signed-off-by: David Nind <david@davidnind.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
In zebra we sort by callnumber using 8007 cn-sort
We should do the same in elasticsearch
To test:
1 - Have Koha using Elasticsearch
2 - Perform a search
3 - Attempt to sort by callnumber
4 - Error in logs: No mapping found for [local-classification__sort]
5 - Apply patch
6 - Restart all
7 - Perform a search and sort by callnumber
8 - Success!
Signed-off-by: Fridolin Somers <fridolin.somers@biblibre.com>
Signed-off-by: David Nind <david@davidnind.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
It'd be great if the Search Engine Configuration page would display
the various aliases (shortcuts) available: ti for title, sn for local-number, etc.
This patch changes Koha/SearchEngine/Elasticsearch/QueryBuilder.pm to move the
hard-coded variables to the beginning and adds a method that provides access to %index_field_convert.
Test plan :
1) Use Elasticsearch
2) Go to Administration > Search engine configuration (Elasticsearch)
3) Check you see new column 'Aliases' with for example ti for title.
4) Perform a search 'ti:<title>' and check you get results
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
We convert 'keyword' to '' as an index since we want it to search all fields
When we are searching 'as phrase', however, we should not drop the search type
To test:
1 - Enable IntranetCatalogPullDown
2 - Set searchEngine to Elasticsearch
3 - Perform a search for 'Keyword as phrase' for a phrase that does appear in a record
4 - You get the result
5 - Reverse the order of words in the phrase
6 - You still get a result?
7 - Apply patch
8 - Restart all the things
9 - Reversed search does not return record
10 - Correct order and search, correct record returned
Signed-off-by: Andrew Fuerste-Henry <andrew@bywatersolutions.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
This patch prepares Koha to officially no longer support Elasticsearch 5.X
It adds a new system preference 'ElasticsearchCrossFields' to allow users to choose whether or not
to enable this feature
It updates the about page to add a deprecation warning if a site is running ES5
To test:
1 - Be running Koha with Elasticsearch 5.X
2 - Attempt to search
Error: Unable to perform your search. Please try again.
3 - Apply patch
4 - Update database
5 - Searching works
6 - Find syspref 'ElasticsearchCrossFields'
7 - Enable it
8 - Searching is now broken
9 - Check the about page
10 - you can now see the Elasticsearch version
11 - The system information tab has a deprecation warning
12 - Set SearchEngine preference to 'Zebra'
13 - View the about page - no warnings
14 - Test again with ES6 - searching should "work" with either pref setting
15 - There should be no warning on about pages
Signed-off-by: Victor Grousset/tuxayo <victor@tuxayo.net>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
This patch adds the 'cross_fields' type to our searches:
https://www.elastic.co/guide/en/elasticsearch/reference/6.8/query-dsl-query-string-query.html#query-string-syntax
Without this patch the search terms seem to all require being in the same field when using Elasticsearch 6
To test:
0 - Set QueryAutoTruncate to 'only if * is added'
1 - Find a record with a title and publisher
2 - Search for a word from the title and confirm the record is returned
3 - Search for a word from the title and the publisher's name
4 - The record is not returned
5 - Apply patch
6 - Repeat #3
7 - The record is returned
Signed-off-by: Victor Grousset/tuxayo <victor@tuxayo.net>
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
When both a record and record_id are passed to index_records the data should be passed through
to update_index. We missed copying over the record ids to the variable we use as a check.
To test:
1 - Set searchEngine system preference to Elasticsearch
2 - Reindex your db
3 - Search authorities
4 - Edit a record and add 'testwaffle' to the main heading
5 - Search authorities for 'testwaffle' - no results
6 - Apply patch
7 - Edit the record again, change 'testwaffle' to 'testpancake'
8 - Search authorities for 'testpancake' - result!
9 - Confirm imported authorities and authorities added via Z39 are correctly indexed
Signed-off-by: David Nind <david@davidnind.com>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Without this patch we get
Use of uninitialized value $3 in concatenation (.) or string at /kohadevbox/koha/Koha/SearchEngine/Elasticsearch/QueryBuilder.pm line 943.
This converts the | OR operator to two different regexes so that the
capture group variables will be defined in every case.
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
This updates the regex used for removing colons to capture those with space on either side, and remove the colon
while preserving the space
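A minimal sketch of the idea, using the query from the test plan below. The helper name is hypothetical, and the sketch uses two separate substitutions so that the capture group is always defined:

```perl
use strict;
use warnings;

# Hypothetical helper: drop a colon that has whitespace on one side,
# keeping the whitespace itself.
sub strip_loose_colons {
    my ($query) = @_;
    $query =~ s/(\s):/$1/g;    # " :chess" becomes " chess"
    $query =~ s/:(\s)/$1/g;    # "chess: " becomes "chess "
    return $query;
}

print strip_loose_colons('title:chess AND :chess'), "\n";
# prints: title:chess AND chess
```

The colon in "title:chess" has no whitespace on either side, so it is left alone.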
To test:
1 - Have Koha using ES
2 - Search for:
ti:chess AND chess
3 - You should get a result in sample data, otherwise replace 'chess' with a title in your catalogue
4 - Search for:
ti:chess AND kw:chess
5 - No result
6 - Enable DumpTemplateVarsIntranet and DumpSearchQueryTemplate
7 - Repeat the search and check the page source
8 - search_query has:
title:chess ANDchess
9 - Apply patch
10 - Repeat
11 - Search works!
12 - query is now:
title:chess AND chess
Signed-off-by: Séverine QUEUNE <severine.queune@bulac.fr>
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Bug 24567: (follow-up) Use dollar sign to refer to captures
Signed-off-by: Joonas Kylmälä <joonas.kylmala@helsinki.fi>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
To test:
1 - Apply patch
2 - ./installer/data/mysql/updatedatabase.pl
3 - Reset ES mapping: Administration->Search engine configuration , button at bottom of page
4 - 'issues' and 'title' mapping under 'search fields' should be mandatory and not editable
5 - On 'Bibliographic records' tab you should not be able to delete the single entry for issues
6 - You should be able to delete 'title' mappings, however, at the final one you should be stopped by javascript
7 - Bonus: force remove the last mapping from the page using developer tools - attempt to save and should be warned of missing mandatory mapping
Signed-off-by: Nicolas Legrand <nicolas.legrand@bulac.fr>
Signed-off-by: Bouzid Fergani <bouzid.fergani@inlibro.com>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
For debugging purposes we may wish to see the requests and responses made to
Elasticsearch
To test:
1 - prove -v t/Koha/SearchEngine/Elasticsearch.t
2 - Set <trace_to>Stderr</trace_to> in koha-conf
3 - Restart all
4 - perl misc/search_tools/rebuild_elasticsearch.pl
5 - Note requests are shown
6 - Set
<trace_to>File</trace_to>
<trace_to>/var/log/koha/kohadev/plack-error.log</trace_to>
in koha-conf
7 - Restart all
8 - perl misc/search_tools/rebuild_elasticsearch.pl
9 - Check the plack log and see the ES requests
Signed-off-by: Bob Bennhoff <bbennhoff@clicweb.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Currently if you combine subfields in the marc mappings the subfields are indexed in the order
listed in the mapping.
i.e. 650(avxyz) in mapping
and in record:
650 $aHeading $zGeosubdiv $vFormsubdiv
is indexed as:
Heading Formsubdiv Geosubdiv
We should preserve the order and index as:
Heading Geosubdiv Formsubdiv
We can use a built-in function in MARC::Field to achieve this
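The difference between the two orderings can be sketched without MARC::Field, using plain [code, value] pairs in record order (the shape that MARC::Field's subfields() method returns). Both helper names are hypothetical.

```perl
use strict;
use warnings;

# Subfields of one 650 field in record order, as [code, value] pairs.
my @subfields = ( [ a => 'Heading' ], [ z => 'Geosubdiv' ], [ v => 'Formsubdiv' ] );
my $mapping   = 'avxyz';

# Old behaviour: walk the mapping and pick matching subfields.
sub join_in_mapping_order {
    my ( $codes, @pairs ) = @_;
    my @values;
    for my $code ( split //, $codes ) {
        push @values, map { $_->[1] } grep { $_->[0] eq $code } @pairs;
    }
    return join ' ', @values;
}

# New behaviour: walk the record and keep subfields listed in the mapping.
sub join_in_record_order {
    my ( $codes, @pairs ) = @_;
    return join ' ',
      map { $_->[1] } grep { index( $codes, $_->[0] ) >= 0 } @pairs;
}

print join_in_mapping_order( $mapping, @subfields ), "\n";
print join_in_record_order( $mapping, @subfields ), "\n";
# prints:
# Heading Formsubdiv Geosubdiv
# Heading Geosubdiv Formsubdiv
```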
To test:
1 - It is easy to find examples of this using authorities
2 - Find or create a record with subfields order azv
e.g. 150$aActresses$zUnited states$vBiography
3 - Add or have a second authority
e.g. 150$aActresses$vPortraits
4 - Set an authorities mapping for 'Heading' to 150(abgvxyz)
find at:
Administration->Search engine configuration (Elasticsearch)->Authorities tab
5 - Index the records in Elasticsearch
perl misc/search_tools/rebuild_elasticsearch.pl -a -ai 1691 -ai 1692
6 - View the first record in the ES index
curl es:9200/koha_kohadev_authorities/data/1692?pretty
7 - Note 'Heading' field is ordered as in the mapping
8 - Search authorities for 'contains' "act"
9 - Note the records sort incorrectly
10 - Apply patches
11 - perl misc/search_tools/rebuild_elasticsearch.pl -a -ai 1692
12 - curl es:9200/koha_kohadev_authorities/data/1692?pretty
13 - Note the order is now preserved
14 - Search authorities for 'contains' "act"
15 - Note the records sort correctly
Signed-off-by: Heather Hernandez <heather_hernandez@nps.gov>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
JD amended patch: Fix
FAIL spelling
combind ==> combined
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Just formally needed. It is already loaded somewhere.
That is: Koha::SearchEngine::Elasticsearch.
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
To test:
1 - Load the sample DB or edit a record (using advanced cataloging editor) to have a blank subfield in a field that is indexed as suggestible
2 - For example 'author' / 100a
100 _ _ ‡a
3 - Index that record into Elasticsearch 5.X:
perl misc/search_tools/rebuild_elasticsearch.pl -v -bn 115 -b -d
4 - Note error 'value must have length > 0'
5 - Edit mappings to set author 100a not suggestible
6 - perl misc/search_tools/rebuild_elasticsearch.pl -v -bn 115 -b -d
7 - Success
8 - Set field to suggestible again
9 - Apply patch
10 - perl misc/search_tools/rebuild_elasticsearch.pl -v -bn 115 -b -d
11 - Success!
Signed-off-by: Bob Bennhoff <bbennhoff@clicweb.org>
Signed-off-by: Martin Renvoize <martin.renvoize@ptfs-europe.com>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Doing the same change as previously (renaming biblionumber), but at the
same time fixing the record fetch. If (theoretically) an authority were passed
without a record, it would have fetched a biblio record.
Test plan:
You need Elasticsearch here.
Replaced this line in AddAuthority:
$indexer->index_records( $authid, "specialUpdate", "authorityserver", $record );
by
$indexer->index_records( $authid, "specialUpdate", "authorityserver", undef );
And updated an authority record. Check if you can search for the change.
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
JD amended patch: remove trailing whitespace
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
ModZebra:
The name is very misleading: we can index authid's too here.
And yes, it should not be in C4/Biblio too ;) A first step..
Adding the same change here in Koha/SearchEngine/Zebra/Indexer.
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
This is analogous to 26522; we should skip records that cannot be retrieved for indexing
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Bob Bennhoff <bbennhoff@clicweb.org>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
When batch editing, 2 reindex calls are sent to ES/Zebra.
We can easily avoid that by reusing the skip_modzebra_update parameter (renamed skip_record_index)
Additionally we should only send one request for biblio, and we should
only do it if we succeed
As the whole batch mod is in a transaction it is possible to fail in which case
Zebra queue is reset, but ES indexes have already been set
In addition to the skip param this patchset moves Zebra and Elasticsearch calls to
Indexer modules and introduces a generic Koha::SearchEngine::Indexer so that we don't
need to check the engine when calling for index
The new index_records routine takes an array so that we can reduce the calls to
the ES server.
The index_records routine for Zebra loops over ModZebra to avoid affecting current behaviour
Test plan:
General tests, under both search engines:
1 - Add a biblio and confirm it is searchable
2 - Edit the biblio and confirm changes are searchable
3 - Add an item, confirm it is searchable
4 - Delete an item, confirm it is not searchable
5 - Delete a biblio, confirm it is not searchable
6 - Add an authority and confirm it is searchable
7 - Delete an authority and confirm it is not searchable
Batch mod tests, under both search engines
1 - Have a bib with several items, none marked 'not for loan'
2 - Do a staff search that returns this biblio
3 - Items show as available
4 - Click on title to go to details page
5 - Edit->Item in a batch
6 - Set the not for loan status for all items
7 - Repeat your search
8 - Items show as not for loan
9 - Test batch deleting items
a - Test with a list of items, not deleting bibs
b - Test with a list of items, deleting bibs if no items remain where all items are only item on a biblio:
SELECT MAX(barcode) FROM items GROUP BY biblionumber HAVING COUNT(barcode) IN (1)
c - Test with a list of items, deleting bibs if no items remain where some items are the only item on a biblio:
SELECT MAX(barcode) FROM items GROUP BY biblionumber HAVING COUNT(barcode) IN (1,2)
10 - Confirm records are updated/deleted as appropriate
Signed-off-by: Bob Bennhoff <bbennhoff@clicweb.org>
Signed-off-by: Marcel de Rooy <m.de.rooy@rijksmuseum.nl>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
To test:
1 - Have some records with uncertain dates in the 008
19uu, 195u, etc.
2 - Index them in Elasticsearch
3 - Do a search that will return them
4 - Sort results by publication/copyright date
5 - Note odd results
6 - Apply patch
7 - Reindex
8 - Sorting should be improved
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Katrin Fischer <katrin.fischer.83@web.de>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>