peng2023-yarn

arXiv: 2309.00071

TLDR

YaRN modifies RoPE with "NTK-by-parts" interpolation plus an attention temperature correction, extending the context window to 64K-128K tokens with only a small amount of fine-tuning (the paper reports roughly 10x fewer tokens and 2.5x fewer training steps than earlier extension methods). Most open-source models today use YaRN or a variant of it for length extension.
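The two ingredients named in the TLDR can be illustrated with a minimal sketch. It assumes the formulation in the YaRN paper: each RoPE frequency is blended between its original value and its value divided by the scale factor, depending on how many full rotations that dimension completes inside the original context window, and attention logits are scaled by a factor of `0.1 * ln(s) + 1`. The function names, parameter names, and defaults (`orig_ctx=4096`, `alpha=1`, `beta=32`, the values the paper suggests for LLaMA-family models) are illustrative choices here, not any library's API.

```python
import math


def yarn_inv_freq(head_dim, base=10000.0, orig_ctx=4096, scale=16.0,
                  alpha=1.0, beta=32.0):
    """Per-pair RoPE frequencies after a YaRN-style blended interpolation.

    All parameter names are hypothetical; alpha/beta are the ramp bounds.
    """
    freqs = []
    for i in range(head_dim // 2):
        theta = base ** (-2.0 * i / head_dim)            # original frequency
        rotations = orig_ctx * theta / (2.0 * math.pi)   # context length / wavelength
        # Ramp: 0 means fully interpolate (low frequency), 1 means leave untouched.
        gamma = min(1.0, max(0.0, (rotations - alpha) / (beta - alpha)))
        freqs.append((1.0 - gamma) * theta / scale + gamma * theta)
    return freqs


def yarn_attention_factor(scale):
    """Factor applied to queries and keys, i.e. sqrt(1/t) = 0.1 * ln(s) + 1."""
    if scale <= 1.0:
        return 1.0
    return 0.1 * math.log(scale) + 1.0


if __name__ == "__main__":
    freqs = yarn_inv_freq(head_dim=128)
    print(freqs[0], freqs[-1], yarn_attention_factor(16.0))
```

In this sketch the highest-frequency dimensions keep their original values, the lowest-frequency ones are divided by the full scale factor, and the dimensions in between are mixed linearly. In practice the attention factor is often folded into the cached cosine and sine tables so that the attention kernel itself does not need to change.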