Bug 9579: fix truncation of facets containing multi-byte characters
We seem to be relying on whatever Zoom::Results->render returns, and
Perl doesn't treat it as Unicode character data unless it is explicitly
decoded. That's why CORE::substr (and probably CORE::length too) count
bytes and cut multi-byte characters in the middle.
This patch just decodes the UTF-8 data that render() returns, so Perl
treats it as characters rather than bytes and truncates correctly.
It uses Encode::decode_utf8 which is already a dependency for the current
stable Koha releases.
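For illustration only (not part of the patch), here is a minimal Perl
sketch of the byte-versus-character behaviour described above. The
sample bytes and the truncate_label helper are hypothetical stand-ins,
not Koha code; the only real API assumed is Encode's decode_utf8 and
encode_utf8.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# Hypothetical stand-in for the undecoded data render() hands back:
# the UTF-8 bytes of a six-character Arabic word (two bytes per character).
my $raw_bytes = "\xD9\x85\xD9\x83\xD8\xAA\xD8\xA8\xD8\xA7\xD8\xAA";

# Hypothetical helper mimicking a facet-label truncation with substr/length.
sub truncate_label {
    my ( $label, $max ) = @_;
    return length($label) > $max ? substr( $label, 0, $max ) : $label;
}

# Without decoding, Perl counts bytes: 12 of them, and cutting at 5
# leaves a dangling lead byte (\xD8) with no continuation byte.
my $broken = truncate_label( $raw_bytes, 5 );

# After decoding, Perl counts characters: 6 of them, and cutting at 5
# keeps five whole characters (10 bytes once re-encoded).
my $chars = decode_utf8($raw_bytes);
my $fixed = truncate_label( $chars, 5 );

printf "raw:     length=%d, truncated to %d bytes\n",
    length($raw_bytes), length($broken);
printf "decoded: length=%d, truncated to %d characters (%d bytes)\n",
    length($chars), length($fixed), length( encode_utf8($fixed) );
```

The dangling lead byte in the undecoded case is the kind of broken
sequence that browsers render as a replacement character.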
REVISED TEST PLAN
-----------------
1) Import the attached sample records.
2) Rebuild your indexes.
3) In the OPAC, search for يكيمكتبات : قبسي ، كرم
-- There will be ugly diamonds with question marks in the facets
4) Apply the patch.
5) Search again.
-- The names will be properly truncated.
NOTE: This test assumes FacetLabelTruncationLength = 20.
Sponsored-by: Universidad Nacional de Cordoba
Signed-off-by: Mark Tompsett <mtompset@hotmail.com>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83@web.de>
Passes all tests and QA script.
Works as described. Tested with several German and English records and
the Arabic test record. Arabic strings now display correctly and no
regression was found.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
I've reviewed it and approve its inclusion in 3.14.x and earlier. I
will use the patches for bug 11096, once they pass QA, for the master
branch.
Signed-off-by: Galen Charlton <gmc@esilibrary.com>
Signed-off-by: Fridolin Somers <fridolin.somers@biblibre.com>
(cherry picked from commit 171e2b47460c7afa489b16eb885a9862eef9d43a)
Signed-off-by: Tomas Cohen Arazi <tomascohen@gmail.com>