The Llama 3 Herd of Models

Authors: Meta AI (2024)

Domains

Pretraining

TLDR (English)

Meta's Llama 3 technical report, covering dense Transformer models from 8B to 405B parameters. It details data processing (about 15T multilingual tokens), architecture changes (grouped-query attention (GQA) and RoPE with a larger base frequency for long contexts), post-training (SFT, rejection sampling, and DPO), and experiments that add image, video, and speech capabilities through a compositional approach. Llama 3 405B is one of the most capable openly available LLMs.
