小红花 · Digest

Sharpen your problem-solving skills the McKinsey way, with our weekly crossword. Each puzzle is created with the McKinsey audience in mind, and includes a subtle (and sometimes not-so-subtle)...

The McKinsey Crossword: Coat Check | No. 247

McKinsey Insights & Publications ·

A New Paradigm for FP8 Training: 40% Lower GPU Memory Usage and a 1.4x Training Speedup

机器之心 (Synced) ·

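This study proposes the Chain-of-Associated-Thoughts (CoAT) framework to strengthen the reasoning abilities of large language models (LLMs). By combining Monte Carlo tree search with a dynamic associative memory mechanism, CoAT markedly improves the accuracy, coherence, and diversity of reasoning, and has the potential to update its knowledge base in real time.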

CoAT: Chain-of-Associated-Thoughts Framework for Enhancing Reasoning in Large Language Models

BriefGPT - AI Paper Express ·

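This study addresses the shortcomings of existing FP8 training frameworks in optimizing memory usage. Using two techniques, dynamic range expansion and mixed-granularity activation quantization, COAT substantially reduces the memory footprint of large-model training while achieving nearly lossless performance across multiple tasks, offering a way to train large models efficiently on fewer GPUs.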

COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training

BriefGPT - AI Paper Express ·