Bug 32937: Make Zebra ignore the copyright symbol in searches

It's common to catalog the publication year with the copyright symbol
attached to it, for example ©1951. Such a record is then found only
when the search includes the ©, not when searching for the year alone.
We can fix this by adding © to the list of characters Zebra ignores
in searches.
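
As a rough illustration of why this helps, here is a minimal Python sketch.
It is not Zebra's implementation: tokenize(), BREAKING_BEFORE and
BREAKING_AFTER are hypothetical names, and the character lists are a
shortened stand-in for the real "space" line in word-phrase-utf.chr.

```python
import re

# Assumption: a shortened stand-in for Zebra's "space" (breaking) characters.
# The real list lives in word-phrase-utf.chr and is much longer.
BREAKING_BEFORE = " !\"#$%&'()*+,-./:;<=>?@"
BREAKING_AFTER = BREAKING_BEFORE + "©"


def tokenize(text, breaking):
    """Split text into words, treating every character in `breaking` as a separator."""
    pattern = "[" + re.escape(breaking) + "]+"
    return [token for token in re.split(pattern, text) if token]


print(tokenize("©1951", BREAKING_BEFORE))  # ['©1951']
print(tokenize("©1951", BREAKING_AFTER))   # ['1951']
```

With © treated as a breaking character, the cataloged value and the search
term both reduce to 1951, so they match with or without the symbol.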

To test:
* Search for any existing publication year with ©, example: ©1951
* Verify your record is not found
* Add the copyright symbol to a record, verify it's now found with ©,
  but not without
* Apply patch
* On ktd:
  * sudo cp -i /kohadevbox/koha/etc/zebradb/etc/word-phrase-utf.chr /etc/koha/zebradb/etc/word-phrase-utf.chr
  * sudo koha-zebra --restart kohadev
  * sudo koha-mysql kohadev
  * DELETE FROM biblio WHERE biblionumber = 369;
  * The reindex would fail for me with this broken record present.
  * sudo koha-rebuild-zebra -f kohadev
* Repeat searches, the records should be found when searching
  with and without ©.
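
Before reindexing, it can be useful to confirm that the copied file really
contains the new character. The following is only a sketch under stated
assumptions: space_lines_with() is a hypothetical helper, and the sample
text is a shortened stand-in for the real file contents.

```python
def space_lines_with(chr_text, symbol="©"):
    """Return the 'space' directive lines of a .chr file's text that contain `symbol`."""
    return [
        line
        for line in chr_text.splitlines()
        if line.startswith("space ") and symbol in line
    ]


# Shortened, made-up sample; the real file has a much longer "space" line.
sample = "# Breaking characters\nspace {\\001-\\040}!@©\\[\n"
print(len(space_lines_with(sample)))  # 1
```

To check the installed file, read /etc/koha/zebradb/etc/word-phrase-utf.chr
with open(path, encoding="utf-8") and pass its contents to the helper; an
empty result would suggest the patched file was not copied.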

Signed-off-by: David Nind <david@davidnind.com>
Signed-off-by: Nick Clemens <nick@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
(cherry picked from commit c39d4ea0b2)
Signed-off-by: Matt Blenkinsop <matt.blenkinsop@ptfs-europe.com>
Katrin Fischer 2023-04-16 11:57:19 +00:00 committed by Matt Blenkinsop
parent 276bd9af4b
commit 42b8f60bf4

@@ -9,7 +9,7 @@ lowercase {0-9}{a-z}αβγδεζηθικλμνξοπρστυφχψω
uppercase {0-9}{A-Z}ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩ
# Breaking characters
-space {\001-\040}!"#$%&'\()*+,-./:;<=>?@\[\\]^_`\{|}~{\x88-\x89}{\x98-\x9C}¡¿«»
+space {\001-\040}!"#$%&'\()*+,-./:;<=>?@©\[\\]^_`\{|}~{\x88-\x89}{\x98-\x9C}¡¿«»
# Characters to be considered equivalent
equivalent aáàãåâăąȧǎȁȃ