Universal Language Model Fine-tuning for Text Classification

Authors: Jeremy Howard, Sebastian Ruder (2018)

Domains

Pretraining

TLDR (English)

The first paper to explicitly propose the "universal language model pre-training → task fine-tuning" pipeline, together with key techniques such as discriminative learning rates (a different learning rate per layer) and the slanted triangular learning rate schedule. Together with ELMo, it represents "the last mile before BERT".
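The two techniques named above can be sketched in a few lines. This is a minimal illustration, assuming the schedule formula and default hyperparameters reported in the paper (cut_frac = 0.1, ratio = 32, maximum rate 0.01, per-layer decay factor 2.6); the function names `slanted_triangular_lr` and `discriminative_lrs` are hypothetical and not taken from any library.

```python
import math


def slanted_triangular_lr(t, total_steps, eta_max=0.01, cut_frac=0.1, ratio=32):
    """Learning rate at step t: short linear warm-up, then long linear decay.

    Assumes total_steps * cut_frac >= 1 so that the warm-up has at least one step.
    """
    cut = math.floor(total_steps * cut_frac)  # step at which the rate peaks
    if t < cut:
        p = t / cut  # fraction of the warm-up completed
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))  # fraction of the decay remaining
    # The rate moves between eta_max / ratio (at p = 0) and eta_max (at p = 1).
    return eta_max * (1 + p * (ratio - 1)) / ratio


def discriminative_lrs(top_lr, num_layers, decay=2.6):
    """Per-layer learning rates, ordered from the lowest layer to the top layer.

    Each layer gets the rate of the layer above it divided by `decay`.
    """
    return [top_lr / decay ** (num_layers - 1 - i) for i in range(num_layers)]


if __name__ == "__main__":
    total = 1000
    for step in (0, 50, 100, 500, 1000):
        print(step, slanted_triangular_lr(step, total))
    print(discriminative_lrs(0.01, num_layers=4))
```

In practice the two would be combined: the schedule gives the top-layer rate at each step, and the per-layer division scales it down for the lower layers.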

