Skip to content

clark2020-electra

arXiv: 2003.10555

TLDR (English)

Uses replaced token detection instead of MLM, allowing small models to achieve BERT-large level performance. Representative work on "pre-training objective determines sample efficiency".

TLDR(中文)

用 replaced token detection 替代 MLM,让小模型也能拿到 BERT-large 级表现。是"预训练目标决定样本效率"这条线索的代表作。