Abstract: National audience; This paper tackles the task of NER applied to historical texts obtained from processing digital images of news papers using OCR techniques. The main challenge for this task is that the OCR process leads to misspellings and linguistic errors in the output text, which can impact the performance of the NER. We conduct a comparative evaluation on two historical datasets in German and French against previous state-of-the-art models, and we propose a model based ona hierarchical stack of Transformers to approach the NER task for hi...
(read more)