The Right Tool for the Job: Matching Model and Instance Complexities
arXiv: 2003.03618
TLDR (English)
Proposes adaptive computation: different input instances require different amounts of computation. A lightweight router is trained to assign simple samples to smaller models and complex samples to larger models, which reduces average inference cost by 2-3x with minimal accuracy loss.
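The routing idea in this summary can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Model` class, the `route` function, the word-count heuristic standing in for the trained router, the 0.5 threshold, and the relative costs of 1 and 4.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Model:
    name: str
    cost: float                    # relative inference cost per sample (made-up units)
    predict: Callable[[str], str]  # stand-in for a real model's forward pass


def route(
    samples: Sequence[str],
    router_score: Callable[[str], float],
    small: Model,
    large: Model,
    threshold: float = 0.5,
) -> Tuple[List[str], float]:
    """Send samples scored below `threshold` to `small`, the rest to `large`.

    Returns the predictions and the average cost per sample.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    predictions: List[str] = []
    total_cost = 0.0
    for sample in samples:
        model = small if router_score(sample) < threshold else large
        predictions.append(model.predict(sample))
        total_cost += model.cost
    return predictions, total_cost / len(samples)


def toy_difficulty(text: str) -> float:
    # Hypothetical difficulty proxy: longer inputs count as harder.
    # A trained router would replace this heuristic.
    return min(len(text.split()) / 10, 1.0)


small = Model("small", cost=1.0, predict=lambda s: "small")
large = Model("large", cost=4.0, predict=lambda s: "large")

samples = [
    "good movie",
    "bad",
    "loved it",
    "not bad at all but the ending was hardly a triumph either",
]
predictions, avg_cost = route(samples, toy_difficulty, small, large)
speedup = large.cost / avg_cost  # compared with sending every sample to `large`
```

In this toy setup three of the four samples go to the small model, so the average cost is 1.75 units against 4.0 for always using the large model. The actual saving in practice depends on how many inputs the router judges to be easy and on how accurate the smaller model is on them.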