Difference between revisions of "Rebuild nozebra.pl"
Revision as of 17:19, 14 November 2022
Name
rebuild_nozebra.pl: a script for reindexing bibliographic and authority MARC records in the Zebra database.
Use this batch job to reindex all bibliographic or authority records in your Koha database.
Description
Options:
-b index bibliographic records
-a index authority records
-daemon Run in daemon mode. The program will loop, checking for entries in the zebraqueue table, processing them incrementally if present, and then sleeping for a few seconds before repeating the process. Checking the zebraqueue table is done with a cheap SQL query. This allows for near-realtime updates of the Zebra search index with low system overhead. Use -sleep to control the checking interval.
Daemon mode implies -z, -a, -b. The program will refuse to start if options are present that do not make sense while running as an incremental update daemon (e.g. -r or -offset).
-sleep 10 Seconds to sleep between checks of the zebraqueue table in daemon mode. The default is 5 seconds.
-z select only updated and deleted records marked in the zebraqueue table. Cannot be used with -r or -s.
--skip-deletes only select record updates, not record deletions, to avoid potential excessive I/O when zebraidx processes deletions. If this option is used for normal indexing, a cronjob should be set up to run rebuild_zebra.pl -z without --skip-deletes during off hours. Only effective with -z.
-r clear Zebra index before adding records to index. Implies -w.
-d Temporary directory for indexing. If not specified, one is automatically created. The export directory is automatically deleted unless you supply the -k switch.
-k Do not delete export directory.
-s Skip export. Used if you have already exported the records in a previous run.
-nosanitize export biblio/authority records directly from the DB marcxml field without sanitizing records. It speeds up the dump process but could fail if the DB contains badly encoded records. Works only with -x.
-w skip shadow indexing for this batch
-y do NOT clear zebraqueue after indexing; normally, after doing batch indexing, zebraqueue should be marked done for the affected record type(s) so that a running zebraqueue_daemon doesn't try to reindex the same records - specify -y to override this. Cannot be used with -z.
-v increase the amount of logging. Normally only warnings and errors from the indexing are shown. Use log level 2 (-v -v) to include all Zebra logs.
--length 1234 how many biblios you want to export
--offset 1243 the offset you want to start from
example: --offset 500 --length=1000 will result in a LIMIT 500,1000 (exporting 1000 records, starting after the 500th one)
note that the numbers are NOT related to biblionumber; that is the intended behaviour.
--where lets you specify a WHERE clause, like itemtype='BOOK' or something like that
--run-as-root explicitly allow the script to run as the 'root' user
--wait-for-lock when not running in daemon mode, the default behavior is to abort a rebuild if the rebuild lock is busy. This option will cause the program to wait for the lock to be freed and then continue processing the rebuild request.
--table specify a table (can be items, biblioitems, biblio, biblio_metadata) to retrieve biblionumber to index. biblioitems is the default value.
--help or -h show this message.
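
Several options in the list above carry compatibility rules: -z cannot be used with -r or -s, -y cannot be used with -z, and daemon mode refuses options such as -r or -offset. The Python sketch below collects those rules into a small pre-flight check. It is only an illustration under stated assumptions: the RebuildOptions class and the check_options and to_argv helpers are hypothetical names that are not part of Koha, the script name rebuild_zebra.pl is taken from the --skip-deletes description, and treating daemon mode exactly like -z for the -s and -y rules is one reading of "implies". The sketch does not run the script; it only assembles an argument list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RebuildOptions:
    """Hypothetical holder for a subset of the switches listed above."""
    biblios: bool = False          # -b
    authorities: bool = False      # -a
    queue_only: bool = False       # -z
    reset_index: bool = False      # -r
    skip_export: bool = False      # -s
    keep_queue: bool = False       # -y
    skip_deletes: bool = False     # --skip-deletes
    daemon: bool = False           # -daemon
    offset: Optional[int] = None   # --offset


def check_options(opts: RebuildOptions) -> List[str]:
    """Return the conflicts implied by the option descriptions (empty if none)."""
    # Assumption: because daemon mode implies -z, it is treated like -z
    # for the -s and -y rules as well.
    uses_queue = opts.queue_only or opts.daemon
    problems = []
    if opts.daemon and opts.reset_index:
        problems.append("-daemon cannot be combined with -r")
    if opts.daemon and opts.offset is not None:
        problems.append("-daemon cannot be combined with --offset")
    if opts.queue_only and opts.reset_index:
        problems.append("-z cannot be used with -r")
    if uses_queue and opts.skip_export:
        problems.append("-z cannot be used with -s")
    if uses_queue and opts.keep_queue:
        problems.append("-y cannot be used with -z")
    if opts.skip_deletes and not uses_queue:
        problems.append("--skip-deletes is only effective with -z")
    return problems


def to_argv(opts: RebuildOptions, script: str = "rebuild_zebra.pl") -> List[str]:
    """Build an argument list, refusing combinations flagged by check_options."""
    problems = check_options(opts)
    if problems:
        raise ValueError("; ".join(problems))
    flags = [
        (opts.daemon, "-daemon"),
        (opts.biblios, "-b"),
        (opts.authorities, "-a"),
        (opts.queue_only, "-z"),
        (opts.reset_index, "-r"),
        (opts.skip_export, "-s"),
        (opts.keep_queue, "-y"),
        (opts.skip_deletes, "--skip-deletes"),
    ]
    argv = [script] + [flag for enabled, flag in flags if enabled]
    if opts.offset is not None:
        argv += ["--offset", str(opts.offset)]
    return argv


if __name__ == "__main__":
    # Full rebuild of both record types, clearing the index first.
    print(to_argv(RebuildOptions(biblios=True, authorities=True, reset_index=True)))
```

Returning a list of messages instead of failing on the first conflict lets a caller report every problem at once; the real script may check these combinations in a different order or with different wording.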