Abstract: We have developed a simple but powerful extension for two well-known line segmentation methods that makes them more robust on historical manuscripts with almost regular line spacing. Contrary to the intuitive impression that such manuscripts are easy to handle, existing methods and tools fail to segment some columns correctly, mainly because of empty or nearly empty lines. Since lines in historical documents frequently occur at regular intervals, it is advisable to take this knowledge into account. From a literature review, our ...
Topics: 
Artificial intelligence
Natural language processing
Information retrieval