2018 •
Corpus of Occitan Written Traditional Folktales Annotated with Part-Of-Speech (OWT-Tag)
Authors:
Vergez-Couret, Marianne
Abstract:
This resource contains 5 extracts of texts in Occitan which were manually annotated with lemmas and parts-of-speech, following the Grace standard. It was produced during the ExpressioNarration project, funded by a Marie Curie Individual Fellowship, in order to evaluate the performance of an Occitan Part-Of-Speech tagger, Talismane, to the specifities of the corpus of the project called Oral Occitan (OcOr), also available on https://zenodo.org/record/1451753#.W78FJWOYSpo. Each extract contains around 1500 words. They are extracted from 'Cont (...)
This resource contains 5 extracts of texts in Occitan which were manually annotated with lemmas and parts-of-speech, following the Grace standard. It was produced during the ExpressioNarration project, funded by a Marie Curie Individual Fellowship, in order to evaluate the performance of an Occitan Part-Of-Speech tagger, Talismane, to the specifities of the corpus of the project called Oral Occitan (OcOr), also available on https://zenodo.org/record/1451753#.W78FJWOYSpo. Each extract contains around 1500 words. They are extracted from 'Contes et proverbes populaires recueillis en armagnac et Contes populaires recueillis en agenais' de J.-F. Bladé, 'Coundes biarnés, couéilhuts aüs parsàas miéytadès dou péys dé Biarn' de J.-V. Lalanne, 'Contes populaires du Languedoc' de L. Lambert and 'Contes populaires recueillis dans la Grande-Lande' de F. Arnaudin. The annotation process is described in the following article available on https://www.openscience.fr/IMG/pdf/iste_modocv1n1_2.pdf. (Read More)
We have placed cookies on your device to help make this website and the services we offer better. By using this site, you agree to the use of cookies. Learn more