clark2020-electra
arXiv: 2003.10555
TLDR(中文)
用 replaced token detection 替代 MLM,让小模型也能拿到 BERT-large 级表现。是"预训练目标决定样本效率"这条线索的代表作。
TLDR (English)
Uses replaced token detection instead of MLM, allowing small models to achieve BERT-large level performance. Representative work on "pre-training objective determines sample efficiency".