recordId: (bib1,Identifier-standard) just after the comma. Adam agreed it was a bug, and it should be solved soon. But now we are aware, we can avoid putting the space !
In this commit you have all what is needed to setup a working zebra DB in Unimarc :
* collection.abs is UNIMARC specific and must be rewritten for MARC21, in marc21 directory
* pdf.properties is to be copied unmodified in the marc21 directory (can also be put somewhere else)
* rebuild_zebra.pl is SLOW, but 1 step reindexing tool, using ZOOM
* rebuild_zebra_idx is FAST, but 2 step reindexing tool, and does not use zebra. run it, it will create all biblios XML files in /zebra/biblios directory, then zebraidx update biblios in your zebra directory
* zebra.cfg is the zebra config file ;-)
* test_cql2rpn.pl is a script that will query the database and show the results. Works for me, just change the query at the beginning to get answers you expect.
What has to be done :
* benchmarking : it seems the zebraidx update is faster than lightning (400biblios/sec : 10 000biblios in 25seconds), while ZOOM indexing is slow (something like 25biblios/second) More benchmarking could be done.
* completing collection.abs for UNIMARC. I'll take care of it.
* modifying Biblio.pm to use ZOOM instead of the "zebraidx through exec" running actually. I'll take care of it also.
* modify the search API & tools & screens. I'll let the ball to someone else (chris ?) for this. I agree SearchMarc.pm can be dropped and replaced by something else (maybe a new-and-clean Search.pm package)
* removing useless tables
* adding useful indexes
* altering some columns definitions
* The goal being to have updater working fine for foreign keys.
For me it's done, let me know if it works for you. You can see an updated schema of the DB (with constraints) on the wiki
Seems not to break too many things, but i'm probably wrong here.
at least, new features/bugfixes from 2.2.5 are here (tested on some features on my head local copy)
- removing useless directories (koha-html and koha-plucene)
some explanations :
- updater/updatedatabase => will transform all tables in innoDB (not related to utf8, just to warn you) AND collate them in utf8 / utf8_general_ci. The SQL command is : ALTER TABLE tablename DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci.
- *-top.inc will show the pages in utf8
- THE HARD THING : for me, mysql-client and mysql-server were set up to communicate in iso8859-1, whatever the mysql collation ! Thus, pages were improperly shown, as datas were transmitted in iso8859-1 format ! After a full day of investigation, someone on usenet pointed "set NAMES 'utf8'" to explain that I wanted utf8. I could put this in my.cnf, but if I do that, ALL databases will "speak" in utf8, that's not what we want. Thus, I added a line in Context.pm : everytime a DB handle is opened, the communication is set to utf8.
- using marcxml field and no more the iso2709 raw marc biblioitems.marc field.