Koha/etc
Fridolin Somers 9220482cd3 Bug 13064 - Indexing problem with ICU on control characters
The ICU configuration files contain a rule to remove control characters :
  <transform rule="[:Control:] Any-Remove"/>
This rule is before tokenization.

The problem is that the "[:Control:]" character class includes line feed, carriage return and tab. See http://www.regular-expressions.info/posixbrackets.html.
So when several lines are indexed, the last word of a line is joined with the first word of the next line. Those words are then not searchable.

For example :
  First line
  Second line
This will become "First lineSecond line", tokenized as "First", "lineSecond" and "line".
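
The effect can be reproduced outside Zebra with a minimal Python sketch. It is only an approximation: it assumes that "[:Control:]" corresponds to the Unicode general category Cc, and it uses plain whitespace splitting in place of the ICU tokenizer. The helper names strip_control and tokenize are hypothetical and are not part of Koha or Zebra.

```python
import unicodedata


def strip_control(text):
    """Remove every character whose Unicode general category is Cc (control).

    Assumption: this mimics the "[:Control:] Any-Remove" transform rule.
    Line feed, carriage return and tab all belong to category Cc.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cc")


def tokenize(text):
    """Stand-in for the ICU tokenizer: split on runs of whitespace."""
    return text.split()


field = "First line\nSecond line"

# Removing control characters before tokenizing glues the two lines together.
print(tokenize(strip_control(field)))

# Tokenizing the untouched value keeps "Second" as a separate word.
print(tokenize(field))
```

In this sketch the first call yields "First", "lineSecond" and "line", so a search on "Second" has nothing to match, while the second call keeps all four words.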

Test plan :
- Use ICU in Zebra configuration
- Choose an indexed field, like 300$a
- Create a new record
- Enter several lines in chosen field, like :
  First line
  Second line
- Index this record
=> Without patch the search on "Second" does not return the record
=> With patch the search on "Second" returns the record
- Same tests with tab and carriage return instead of line feed

Signed-off-by: Chris Cormack <chris@bigballofwax.co.nz>

Signed-off-by: Kyle M Hall <kyle@bywatersolutions.com>
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>
2014-11-14 12:03:12 -03:00
pazpar2 FRBR: configure PazPar2 during installation 2008-02-11 16:35:18 -06:00
searchengine Revert "Bug 9828 : Followup for Queryparser and deletion of useless 6XX$9" 2014-10-28 12:02:09 -03:00
zebradb Bug 13064 - Indexing problem with ICU on control characters 2014-11-14 12:03:12 -03:00
koha-conf.xml Bug 12031: [QA Follow-up] Undefined routine and change to koha-conf.xml 2014-10-27 10:38:11 -03:00
koha-httpd.conf Bug 10325: (follow-up) fix typos and whitespace in httpd.conf example 2013-09-08 02:16:59 +00:00
README.txt Add configuration file helper to the installer 2007-09-06 17:14:40 -05:00
SIPconfig.xml Bug 12571 - Add ability to customize SIP2 screen messages 2014-10-28 09:26:47 -03:00

Koha Configuration Files:

The following files specify the base configuration for Koha ZOOM:

 * koha-httpd.conf  
On a Debian system, this Apache configuration file will be symlinked
from /etc/apache2/sites-enabled.
Specify Koha's IP address with NameVirtualHost.
Set ServerName, etc.

 * koha-production.xml  
 * koha-testing.xml 
These are the production and testing configurations for zebrasrv and for Koha.
The first part of each file specifies Zebra server names, indexing configuration files,
and query language configurations.  Koha configuration directives follow. 

 * zebra-authorities.cfg  
 * zebra-biblios.cfg