BIP! Finder - Data Augmentation for Low-Resource Neural Machine Translation

2017 • Data Augmentation for Low-Resource Neural Machine Translation

Authors: Marzieh Fadaee; Arianna Bisazza; Christof Monz

Venue: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Type: Publication

Abstract: The quality of a Neural Machine Translation system depends substantially on the availability of sizable parallel corpora. For low-resource language pairs this is not the case, resulting in poor translation quality. Inspired by work in computer vision, we propose a novel data augmentation approach that targets low-frequency words by generating new sentence pairs containing rare words in new, synthetically created contexts. Experimental results on simulated low-resource settings show that our method improves translation quality by up to 2.9 BLEU ...

Impact: 1.7386266E-7 3.3387476E-8 224 99 / Attention: 0 2

Topics: N/A

DOI: 10.48550/arxiv.1705.00440

