Sequence to Sequence Learning with Neural Networks

Authors: Ilya Sutskever, Oriol Vinyals, Quoc V. Le (2014)

arXiv: 1409.3215

TLDR (English)

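The foundational seq2seq (encoder-decoder) architecture paper. One multilayer LSTM compresses the input sequence into a fixed-dimensional vector, and a second LSTM generates the output sequence from that vector. This made it one of the first neural networks to map a variable-length input sequence to a variable-length output sequence end to end. The paper reports a BLEU score of 34.8 on the WMT'14 English-to-French translation task, and its encoder-decoder structure influenced the later Transformer design.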

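TLDR (translated from Chinese)

The foundational work on the Seq2Seq (encoder-decoder) architecture. With a "compress-then-generate" structure built from two LSTMs, it was among the first to let a neural network convert one variable-length sequence into another variable-length sequence. It made breakthrough progress on machine translation and directly inspired the encoder-decoder design of the later Transformer.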
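
To make the compress-then-generate idea concrete, here is a minimal sketch of an encoder-decoder built from two LSTM cells, in pure Python. It is not the paper's implementation: the weights are random and untrained, biases are omitted, each LSTM has a single layer, decoding is greedy rather than beam search, and every name (`LSTMCell`, `translate`, `w_out`, the vocabulary size, using the end-of-sequence id as the decoder's first input) is a hypothetical choice for illustration. The source is read in reverse order, a trick the paper reports as helpful.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_matrix(rows, cols, rng):
    # Small random weights; a real model would learn these by training.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def one_hot(index, size):
    return [1.0 if k == index else 0.0 for k in range(size)]


class LSTMCell:
    """Single-layer LSTM cell without biases (illustrative simplification)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One weight matrix per gate, applied to [input ; previous hidden state].
        self.w = {
            gate: make_matrix(hidden_size, input_size + hidden_size, rng)
            for gate in ("i", "f", "o", "g")
        }

    def step(self, x, h, c):
        z = x + h  # list concatenation: [input ; previous hidden state]
        i = [sigmoid(a) for a in matvec(self.w["i"], z)]    # input gate
        f = [sigmoid(a) for a in matvec(self.w["f"], z)]    # forget gate
        o = [sigmoid(a) for a in matvec(self.w["o"], z)]    # output gate
        g = [math.tanh(a) for a in matvec(self.w["g"], z)]  # candidate values
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def translate(src_ids, encoder, decoder, w_out, vocab_size, eos_id, max_len=10):
    # Compress: the encoder folds the whole (reversed) source into (h, c).
    h = [0.0] * encoder.hidden_size
    c = [0.0] * encoder.hidden_size
    for tok in reversed(src_ids):
        h, c = encoder.step(one_hot(tok, vocab_size), h, c)

    # Generate: the decoder starts from the encoder's final state and emits
    # one token per step until it produces EOS or reaches max_len.
    output = []
    prev = eos_id  # assumption: EOS doubles as the start-of-output symbol
    for _ in range(max_len):
        h, c = decoder.step(one_hot(prev, vocab_size), h, c)
        logits = matvec(w_out, h)
        prev = max(range(vocab_size), key=lambda k: logits[k])  # greedy pick
        if prev == eos_id:
            break
        output.append(prev)
    return output


rng = random.Random(0)
VOCAB, HIDDEN, EOS = 8, 16, 0
encoder = LSTMCell(VOCAB, HIDDEN, rng)
decoder = LSTMCell(VOCAB, HIDDEN, rng)
w_out = make_matrix(VOCAB, HIDDEN, rng)  # projects hidden state to vocabulary logits

print(translate([3, 5, 2, 7], encoder, decoder, w_out, VOCAB, EOS))
```

Because the weights are untrained, the printed token ids are meaningless. The point is the data flow: the only information passed from encoder to decoder is the fixed-size state `(h, c)`, and the output length is decided by the decoder rather than tied to the input length.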