OCR Post-Correction Evaluation of Early Dutch Books Online - Revisited
Reynaert, Martin
Abstract
We present further work on evaluating the fully automatic post-correction of Early Dutch Books Online, a collection of 10,333 18th-century books. In prior work we evaluated the new implementation of Text-Induced Corpus Clean-up (TICCL) against a Gold Standard derived from a single book in this collection. In the current paper we revisit the same collection using a sizeable random sample of 1,020 OCR post-corrected strings drawn from the full collection. Both evaluations have their own stories to tell and lessons to teach.
Date
2016
Publisher
ELRA
Keywords
TICCL, OCR post-correction, evaluation, EDBO, Nederlab, CLARIAH
Citation
Reynaert, M 2016, OCR Post-Correction Evaluation of Early Dutch Books Online - Revisited. in Calzolari (ed.), Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016). ELRA, pp. 967-974, International Conference on Language Resources and Evaluation 2016, Portoroz, Slovenia, 23/05/16.
License
info:eu-repo/semantics/restrictedAccess
