Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Authors: Charlie Snell, Jaehoon Lee, Kelvin Xu, Aviral Kumar (2024)

Field

Reasoning, Inference

TLDR

Systematically presents a scaling law for spending more compute at inference time: under a fixed compute budget, adding inference-time search to a small model is often more cost-effective than training a larger model. It provides theoretical support for the o1/R1 era.
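As a rough illustration of what "inference-time search" can mean, the sketch below shows best-of-N sampling, one simple way to spend extra compute at test time: draw N candidate answers and keep the one a scorer rates highest. The names `best_of_n`, `toy_sample`, and `toy_score` are hypothetical stand-ins invented for this example; they are not the paper's models, verifier, or code.

```python
import random
from typing import Callable, List, Tuple


def best_of_n(
    sample: Callable[[random.Random], str],
    score: Callable[[str], float],
    n: int,
    seed: int = 0,
) -> Tuple[str, float]:
    """Draw n candidates and return the highest-scoring one with its score."""
    if n < 1:
        raise ValueError("n must be at least 1")
    rng = random.Random(seed)
    candidates: List[str] = [sample(rng) for _ in range(n)]
    best = max(candidates, key=score)
    return best, score(best)


# Hypothetical toy "model": answers 17 + 25 with some random error.
def toy_sample(rng: random.Random) -> str:
    return str(42 + rng.randint(-5, 5))


# Hypothetical toy "verifier": higher score for answers closer to 42.
def toy_score(answer: str) -> float:
    return -abs(int(answer) - 42)


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        answer, s = best_of_n(toy_sample, toy_score, n)
        print(f"N={n:>2}  best answer={answer}  score={s}")
```

With a fixed seed, a larger N always includes the candidates drawn for a smaller N, so the best score can only stay the same or improve as N grows. In practice the scorer is imperfect and each extra sample costs compute, which is why the trade-off against a larger model depends on the budget.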

Appears in these articles

Sampling and Decoding: From Probabilities to Text

Co-cited

These papers appear in the same article as this paper.
