ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

作者： Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning (2020)

arXiv： 2003.10555

领域

预训练

TLDR（中文）

用 replaced token detection 替代 MLM，让小模型也能拿到 BERT-large 级表现。是"预训练目标决定样本效率"这条线索的代表作。

TLDR (English)

Uses replaced token detection instead of MLM, allowing small models to achieve BERT-large level performance. Representative work on "pre-training objective determines sample efficiency".

出现在这些文章里

Embeddings：把离散符号放进连续空间
Embeddings: Putting Discrete Symbols into Continuous Space

同被引用

这些论文与本文出现在同一篇文章中

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

领域

TLDR（中文）

TLDR (English)

出现在这些文章里

同被引用

相关论文