dettmers2023-qlora
arXiv: 2305.14314
TLDR (English)
4-bit NF4 quantization + LoRA + paged optimizers enable SFT of a 65B model on a single 48GB GPU. Nearly all open-source community fine-tuning of LLaMA-2/3 and Qwen uses this approach.
TLDR(中文)
4-bit NF4 + LoRA + paged optimizer,让 65B 在单张 48GB 显卡上 SFT。开源社区微调 LLaMA-2/3、Qwen 几乎 100% 用这套方案。
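
A rough back-of-the-envelope check of the 65B / 48GB claim, as a minimal sketch rather than the paper's own accounting. It counts only the frozen base weights and assumes exactly 65e9 parameters and 1 GB = 1e9 bytes. It ignores quantization constants, LoRA adapter weights, activations, and optimizer state, all of which add to the real footprint. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (assumes 1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 65e9  # "65B" from the TLDR, taken as exactly 65e9 for illustration
GPU_GB = 48      # single-GPU memory budget from the TLDR

fp16_gb = weight_memory_gb(N_PARAMS, 16)  # base weights stored in 16-bit
nf4_gb = weight_memory_gb(N_PARAMS, 4)    # base weights stored in 4-bit NF4

print(f"16-bit weights: {fp16_gb:.1f} GB, fits in {GPU_GB} GB: {fp16_gb < GPU_GB}")
print(f"4-bit weights:  {nf4_gb:.1f} GB, fits in {GPU_GB} GB: {nf4_gb < GPU_GB}")
```

Under these assumptions the 16-bit weights alone exceed the 48GB budget, while the 4-bit weights leave headroom for the adapters and activations. Only the small LoRA matrices are trained, so optimizer state stays small relative to the base model.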