Self-Instruct: Aligning Language Models with Self-Generated Instructions

Authors: Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi (2022)

arXiv: 2212.10560

Field

Alignment

TLDR

Self-Instruct prompts GPT-3 to generate its own instruction-output pairs, then fine-tunes the same model on that data. Stanford Alpaca applies this recipe directly, and Vicuna follows the same broader approach of training on model-generated data. Together they opened the synthetic-data era of using large models to produce training data for smaller ones.
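The bootstrapping idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: `generate` is a hypothetical callable standing in for a call to a large model, `fake_generate` is a stub used only so the example runs offline, and the prompt wording is made up. The original work reports filtering near-duplicate instructions with ROUGE-L overlap; this sketch substitutes the standard library's `difflib.SequenceMatcher` as a simple stand-in.

```python
import random
from difflib import SequenceMatcher
from typing import Callable, List


def too_similar(candidate: str, pool: List[str], threshold: float = 0.7) -> bool:
    """Return True if the candidate closely matches any instruction in the pool.

    Assumption: difflib's ratio is used here in place of ROUGE-L.
    """
    return any(
        SequenceMatcher(None, candidate.lower(), existing.lower()).ratio() >= threshold
        for existing in pool
    )


def bootstrap_instructions(
    seed_instructions: List[str],
    generate: Callable[[str], List[str]],  # hypothetical model call
    rounds: int = 3,
    examples_per_prompt: int = 2,
    rng_seed: int = 0,
) -> List[str]:
    """Grow an instruction pool by prompting a model with samples from the pool."""
    rng = random.Random(rng_seed)
    pool = list(seed_instructions)
    for _ in range(rounds):
        # Sample a few existing instructions to use as in-context examples.
        demos = rng.sample(pool, k=min(examples_per_prompt, len(pool)))
        prompt = "Come up with new tasks:\n" + "\n".join(f"- {d}" for d in demos)
        for candidate in generate(prompt):
            candidate = candidate.strip()
            # Keep only non-empty candidates that are not near-duplicates.
            if candidate and not too_similar(candidate, pool):
                pool.append(candidate)
    return pool


def fake_generate(prompt: str) -> List[str]:
    """Offline stub standing in for a large model; always returns the same list."""
    return [
        "Translate the sentence into French.",
        "Summarize the paragraph in one sentence.",
        "Translate the sentence into French!",  # near-duplicate, should be filtered
    ]


if __name__ == "__main__":
    pool = bootstrap_instructions(["Write a haiku about autumn."], fake_generate, rounds=2)
    for instruction in pool:
        print(instruction)
```

In the full recipe, each kept instruction would also be paired with a model-generated output, and the resulting pairs would be used to fine-tune the model. Both steps are omitted here for brevity.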
