Nicksxs's Blog ·

Using a GPU to speed up gpt-oss

💡 Original post in Chinese, about 700 characters, roughly a 2-minute read.

📝

Summary

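Running gpt-oss on a MacBook Pro is difficult because of memory limits. On a Windows laptop with 6 GB of VRAM, the gpt-oss 20b model was run in LM Studio with 8 layers loaded onto the GPU. Generation got faster, but VRAM remains the bottleneck, so a GPU with more than 16 GB of VRAM is recommended.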

🎯

Key points

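Running gpt-oss on a MacBook Pro is difficult because of memory limits.
The Windows laptop has a 3060 GPU with 6 GB of VRAM and was used to try running the gpt-oss 20b model in LM Studio.
The gpt-oss 20b model has 24 layers and an official VRAM requirement of 16 GB; 6 GB of VRAM can hold about 8 to 9 layers.
With LM Studio configured to load 8 layers onto the GPU, generation reaches about 4.x tokens per second, faster than on the Mac.
Raising the GPU offload setting to 9 layers barely works; the lack of VRAM prevents loading any more layers.
The model can be fine-tuned with unsloth; see the related Colab link.
VRAM usage was monitored with nvidia-smi and nvitop. Offloading some of the layers speeds up generation, but VRAM is still the limit, so a GPU with more than 16 GB of VRAM is recommended.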
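The "8 to 9 layers on 6 GB" figure can be roughly reproduced with simple proportional arithmetic. The sketch below is not from the original post: it assumes the 16 GB requirement is spread evenly over the 24 layers, which ignores the KV cache, context length, and quantization details. The function names and the `reserve_mib` parameter are hypothetical helpers introduced for illustration. The parser expects one line of output from `nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits`.

```python
import math


def parse_memory_line(line: str) -> tuple[int, int]:
    """Parse one 'used, total' line (values in MiB) as printed by:
    nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits
    """
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total


def estimate_offload_layers(
    free_mib: int,
    total_layers: int = 24,          # layer count of gpt-oss 20b, per the post
    full_model_mib: int = 16 * 1024, # official 16 GB VRAM requirement, per the post
    reserve_mib: int = 0,            # hypothetical headroom for desktop, cache, etc.
) -> int:
    """Rough estimate of how many layers fit, assuming every layer is the same size."""
    usable = max(free_mib - reserve_mib, 0)
    layers = math.floor(usable * total_layers / full_model_mib)
    return min(layers, total_layers)


if __name__ == "__main__":
    used, total = parse_memory_line("512, 6144")  # sample line for a 6 GB card
    print(estimate_offload_layers(total))                   # whole card available
    print(estimate_offload_layers(total, reserve_mib=used)) # minus what is already in use
```

With all 6144 MiB available the estimate is 9 layers, and with 512 MiB held back it drops to 8, which lines up with the range reported above. Treat the result as a starting value for the GPU offload setting in LM Studio and confirm it against actual VRAM usage.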

🏷️

Tags

Windows, gpt, gpt-oss, memory limits, VRAM, model
