Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
arXiv: 2004.10964
TLDR (English)
Demonstrates that continuing pretraining on target-domain data (Domain-Adaptive Pretraining, DAPT) significantly improves task performance. Across biomedical, computer science, news, and reviews domains, DAPT improves over generic pretrained models by 4-8 percentage points on average.
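The summary above describes DAPT as continuing pretraining on target-domain text. A minimal sketch of the token-corruption step in a masked-language-model objective follows, assuming the continued pretraining reuses a BERT/RoBERTa-style masking scheme. The 15% masking rate, the 80/10/10 replacement split, the `mask_tokens` helper, and the sample sentence are all illustrative assumptions, not details taken from this summary.

```python
import random

MASK = "<mask>"  # placeholder mask symbol (assumed, RoBERTa-style)

def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Corrupt one token sequence for masked-LM training.

    Returns (corrupted, labels). labels[i] is the original token at
    positions the model must predict, and None everywhere else.
    """
    if rng is None:
        rng = random.Random()
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # this position contributes to the loss
            roll = rng.random()
            if roll < 0.8:
                corrupted.append(MASK)               # 80%: replace with mask
            elif roll < 0.9:
                corrupted.append(rng.choice(vocab))  # 10%: random token
            else:
                corrupted.append(tok)                # 10%: keep unchanged
        else:
            corrupted.append(tok)
            labels.append(None)  # position ignored by the loss
    return corrupted, labels

# Hypothetical in-domain (biomedical) sentence, whitespace-tokenized.
domain_tokens = "the kinase inhibitor reduced tumor growth in mice".split()
vocab = sorted(set(domain_tokens))
corrupted, labels = mask_tokens(domain_tokens, vocab, rng=random.Random(0))
```

In this sketch the only thing that changes between generic and domain-adaptive pretraining is the corpus that feeds `mask_tokens`; the objective stays the same, and task fine-tuning would follow as a separate step.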