BriefGPT - AI Paper Digest

OSTQuant: Optimizing Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting

Summary

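This paper addresses a problem in post-training quantization (PTQ) of large language models (LLMs): weights and activations are unevenly distributed and heavy-tailed, which makes them hard to quantize. It introduces a new metric, the Quantization Space Utilization Rate (QSUR), to assess how quantizable data is after a transformation. Building on this, the proposed method, OSTQuant, learns equivalent transformations composed of an orthogonal transformation and a scaling transformation. The study reports that OSTQuant performs strongly across a range of LLMs and benchmarks; in the W4A4KV4 configuration in particular, it narrows the performance gap by 32% relative to state-of-the-art methods.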
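
To make the idea of an "equivalent transformation" concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the helper `equivalent_transform`, the 45-degree rotation, the scaling factors, and the toy values are all assumptions chosen for illustration, whereas OSTQuant learns its transformations. The sketch only shows that applying an orthogonal matrix Q and a diagonal scaling S to the activations, and the inverse to the weights, leaves the layer output unchanged.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def equivalent_transform(X, W, Q, scales):
    """Hypothetical helper: return (X Q S, S^-1 Q^T W) for S = diag(scales).

    Because Q is orthogonal (Q Q^T = I) and S is invertible,
    (X Q S)(S^-1 Q^T W) equals X W up to rounding error.
    """
    n = len(scales)
    S = [[scales[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    S_inv = [[1.0 / scales[i] if i == j else 0.0 for j in range(n)]
             for i in range(n)]
    X_t = matmul(matmul(X, Q), S)
    W_t = matmul(matmul(S_inv, transpose(Q)), W)
    return X_t, W_t

# Illustrative values only (assumptions, not taken from the paper).
theta = math.pi / 4
c, s = math.cos(theta), math.sin(theta)
Q = [[c, -s], [s, c]]            # 2-D rotation, hence orthogonal
scales = [1.25, 0.8]             # arbitrary positive per-channel scales

X = [[8.0, 0.1]]                 # one activation row with an outlier channel
W = [[0.3, -1.2], [0.7, 0.4]]    # toy weight matrix

X_t, W_t = equivalent_transform(X, W, Q, scales)
Y_ref = matmul(X, W)             # output of the original layer
Y_t = matmul(X_t, W_t)           # output after the transformation
```

Here `Y_t` matches `Y_ref` up to floating-point rounding, so the transformation is free to reshape the distributions that the quantizer sees without changing what the layer computes.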
