Bug 14542: Transliterate rule for all single quote forms

In everyday text (as opposed to source code), a single quote is usually
typed as ', but the typographic form ’ (U+2019, &rsquo; in HTML) is also
common.  See
https://fr.wikipedia.org/wiki/Apostrophe_%28typographie%29

This patch transliterates all of these forms (', U+2019 and U+02BC) into a
space.

Test plan:
(I'll use the code &rsquo; instead of the Unicode character)
- Without the patch
- Create a record with the title: L&rsquo;avion d&rsquo;argile
- Index this record
- Search for "L&rsquo;avion d&rsquo;argile" => You find the record
- Search for "L'avion d'argile" => You do not find the record
- Apply the patch
- Search for "L&rsquo;avion d&rsquo;argile" => You find the record
- Search for "L'avion d'argile" => You find the record
- Search for "L avion d argile" => You find the record
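
The expected outcome of the test plan can be illustrated outside the indexer. The snippet below is only a rough sketch in Python, not the ICU chain itself: the `normalize` helper and the whitespace collapsing are assumptions made for illustration. It maps the three apostrophe forms to a space and checks that all three queries reduce to the same string as the title.

```python
# Illustrative sketch only: a hypothetical stand-in for the transliterate
# rules, not the tokenizer the indexer actually runs.
APOSTROPHE_FORMS = {
    "\u0027": " ",  # APOSTROPHE (')
    "\u2019": " ",  # RIGHT SINGLE QUOTATION MARK (&rsquo;)
    "\u02BC": " ",  # MODIFIER LETTER APOSTROPHE
}
_TABLE = str.maketrans(APOSTROPHE_FORMS)


def normalize(text: str) -> str:
    """Replace every apostrophe form with a space, then collapse whitespace.

    Collapsing whitespace is an assumption made here so that the results
    compare equal; it is not taken from the patch.
    """
    return " ".join(text.translate(_TABLE).split())


title = "L\u2019avion d\u2019argile"
queries = [
    "L\u2019avion d\u2019argile",  # typographic apostrophe
    "L'avion d'argile",            # ASCII apostrophe
    "L avion d argile",            # plain spaces
]

for query in queries:
    # Each query should normalize to the same string as the title.
    print(repr(query), normalize(query) == normalize(title))
```

With the three forms mapped to the same character, the title and every query end up as the same sequence of words, which is why all three searches in the test plan are expected to match once the patch is applied.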

Signed-off-by: Frederic Demians <f.demians@tamil.fr>
Signed-off-by: Jonathan Druart <jonathan.druart@bugs.koha-community.org>
Signed-off-by: Tomas Cohen Arazi <tomascohen@theke.io>
Fridolin Somers 2015-07-16 17:48:14 +02:00 committed by Tomas Cohen Arazi
parent f2c6bd2c61
commit b11eb03a4c


@@ -4,6 +4,8 @@
<transliterate rule="{ æ > ae "/>
<transliterate rule="{ Æ > ae "/>
<transliterate rule="\'>\ "/>
<transliterate rule="\u2019>\ "/>
<transliterate rule="\u02BC>\ "/>
<transliterate rule="[:Number:] { '-' > '' "/>
<!-- Remove control characters except \t\n\r -->
<transform rule="[\x00-\x08\x0B\x0C\x0E-\x1F\x7F] Any-Remove"/>