BIP! Finder - PDAugment: Data Augmentation by Pitch and Duration Adjustments for Automatic Lyrics Transcription

2021 • PDAugment: Data Augmentation by Pitch and Duration Adjustments for Automatic Lyrics Transcription

Authors: Chen Zhang, Jiaxing Yu, Luchin Chang, Xu Tan, Jiawei Chen, Tao Qin, Kejun Zhang

Venue: N/A

Type: Publication

Abstract: Automatic lyrics transcription (ALT), which can be regarded as automatic speech recognition (ASR) on singing voice, is an interesting and practical topic in academia and industry. ALT has not been well developed mainly due to the dearth of paired singing voice and lyrics datasets for model training. Considering that there is a large amount of ASR training data, a straightforward method is to leverage ASR data to enhance ALT training. However, the improvement is marginal when training the ALT system directly with ASR data, because of the gap bet... (read more)

Topics: Speech recognition

DOI: 10.5281/zenodo.7316698 (Found 3 versions)

BIP! social metrics: 0 1
External links: Crossref OpenAIRE

BibTex PDF

Topic-specific impact indicators

Popularity: This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
Influence: This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Citation Count: This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Impulse: This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.