BIP! Finder - Transformer-XL: Attentive Language Models beyond a Fixed-Length Context

2019 • Transformer-XL: Attentive Language Models beyond a Fixed-Length Context

Authors: Dai, Zihang; Yang, Zhilin; Yang, Yiming; Carbonell, Jaime; Le, Quoc V.; Salakhutdinov, Ruslan

Venue: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Type: Publication

Abstract: Transformers have a potential of learning longer-term dependency, but are limited by a fixed-length context in the setting of language modeling. We propose a novel neural architecture Transformer-XL that enables learning dependency beyond a fixed length without disrupting temporal coherence. It consists of a segment-level recurrence mechanism and a novel positional encoding scheme. Our method not only enables capturing longer-term dependency, but also resolves the context fragmentation problem. As a result, Transformer-XL learns dependency that...
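To make the segment-level recurrence idea in the abstract concrete, here is a minimal sketch of how a cache carried across segments widens the context each segment can see. It is an illustration under stated assumptions, not the authors' implementation: the function name `iterate_segments` and the parameters `seg_len` and `mem_len` are hypothetical, and plain token ids stand in for the hidden states a real model would cache.

```python
from typing import List, Sequence, Tuple


def iterate_segments(
    tokens: Sequence[int], seg_len: int, mem_len: int
) -> List[Tuple[List[int], List[int]]]:
    """Split `tokens` into segments and pair each with its cached memory.

    Assumption: token ids stand in for the hidden states that a real
    model would cache; a real model would also stop gradients from
    flowing into the cached states.
    """
    memory: List[int] = []
    pairs: List[Tuple[List[int], List[int]]] = []
    for start in range(0, len(tokens), seg_len):
        segment = list(tokens[start:start + seg_len])
        # The current segment can attend to the cached memory plus itself.
        pairs.append((list(memory), segment))
        # Keep only the most recent `mem_len` states for the next segment.
        memory = (memory + segment)[-mem_len:] if mem_len > 0 else []
    return pairs


if __name__ == "__main__":
    for mem, seg in iterate_segments(list(range(10)), seg_len=4, mem_len=4):
        print("memory:", mem, "segment:", seg)
```

Setting `mem_len=0` reproduces the fixed-length setting, in which every segment starts with no context from the preceding one; that is the situation the abstract calls context fragmentation.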

Topics: Artificial intelligence, Natural language processing

DOI: 10.18653/v1/p19-1285 (Found 2 versions)

External links: Crossref, OpenAIRE

Topic-specific impact indicators

Popularity: This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
Influence: This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Citation Count: This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Impulse: This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
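As a rough illustration of how the two count-based indicators can be derived from a citation network, the sketch below computes a citation count and an impulse-style count on toy data. Everything in it is an assumption made for illustration: the paper identifiers, the years, the function names, and in particular the `window_years` parameter, since the page does not say how long "directly after its publication" is. Popularity and Influence are not sketched because the page does not state how they are computed.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (citing paper, cited paper)


def citation_count(edges: List[Edge], paper: str) -> int:
    """Total number of citations `paper` receives in the network."""
    return sum(1 for _, cited in edges if cited == paper)


def impulse(
    edges: List[Edge],
    pub_year: Dict[str, int],
    paper: str,
    window_years: int = 3,  # assumed window; not specified by the page
) -> int:
    """Citations received within `window_years` of the paper's publication."""
    first = pub_year[paper]
    return sum(
        1
        for citing, cited in edges
        if cited == paper and 0 <= pub_year[citing] - first <= window_years
    )


# Toy citation network with hypothetical identifiers and years.
pub_year = {"P": 2019, "A": 2019, "B": 2020, "C": 2021, "D": 2024}
edges = [("A", "P"), ("B", "P"), ("C", "P"), ("D", "P"), ("D", "C")]

if __name__ == "__main__":
    print("citation count of P:", citation_count(edges, "P"))
    print("impulse of P:", impulse(edges, pub_year, "P"))
```

On this toy network the late citation from "D" adds to the citation count of "P" but not to its impulse, which is the difference between the two indicators.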