跳转到内容

Language Models are Few-Shot Learners

作者: Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakanth, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei (2020)

arXiv: 2005.14165

TLDR(中文)

OpenAI 的 GPT-3 论文,展示了 1750 亿参数的语言模型通过 few-shot in-context learning 能在无需微调的情况下完成各种任务。这篇论文确立了"规模即能力"的范式,并开创了提示工程这个方向。

TLDR (English)

OpenAI's GPT-3 paper demonstrates that a 175B parameter language model can perform diverse tasks through few-shot in-context learning without fine-tuning. It established the paradigm that scale unlocks emergent capabilities and launched the era of prompt engineering.

出现在这些文章里