BriefGPT - AI Paper Digest

Dynamic Deep Neural Networks and Runtime Management for Efficient Inference on Mobile/Embedded Devices


📝

Summary

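This paper proposes a runtime approach to managing performance trade-offs that combines algorithm and hardware techniques, using dynamic super-networks to meet application performance targets and hardware constraints in real time. On the GPU of a Jetson Xavier NX, it is 2.4x faster or 5.1% more accurate than state-of-the-art methods. The hierarchical runtime resource manager designed in the paper also significantly reduces both energy consumption and latency.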

🎯

Key Points

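Running deep neural network inference on mobile and embedded platforms offers advantages in latency, privacy, and always-on availability.
Deploying deep neural networks effectively is challenging because computing resources are limited.
The paper proposes a runtime approach to managing performance trade-offs that combines algorithm and hardware techniques.
Dynamic super-networks are used to meet application performance targets and hardware constraints in real time.
On the GPU of a Jetson Xavier NX, the approach is 2.4x faster or 5.1% more accurate than state-of-the-art methods.
A hierarchical runtime resource manager was designed; in a single-model deployment scenario it reduces energy by 19% and latency by 9%.
In a scenario with two concurrently deployed models, it reduces energy by 89% and latency by 23%.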
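
The key points describe a runtime manager that chooses how to run a dynamic super-network so that application targets and hardware constraints are both met. The sketch below illustrates that idea in Python under strong assumptions: it presumes a pre-profiled lookup table of sub-networks, and the names `SubNetProfile`, `select_operating_point`, the hardware-setting labels, and every number are hypothetical placeholders, not the paper's implementation or measurements.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SubNetProfile:
    """One profiled operating point (all values are hypothetical)."""
    name: str          # label of a sub-network inside the super-network
    hw_setting: str    # label of the hardware setting used when profiling
    latency_ms: float  # measured inference latency
    energy_mj: float   # measured energy per inference
    top1_acc: float    # measured accuracy, in [0, 1]


def select_operating_point(
    profiles: Sequence[SubNetProfile],
    max_latency_ms: float,
    max_energy_mj: Optional[float] = None,
) -> Optional[SubNetProfile]:
    """Return the most accurate profile that fits the given budgets.

    Ties on accuracy are broken in favour of lower energy.
    Returns None when no profile satisfies the constraints.
    """
    feasible = [
        p for p in profiles
        if p.latency_ms <= max_latency_ms
        and (max_energy_mj is None or p.energy_mj <= max_energy_mj)
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda p: (p.top1_acc, -p.energy_mj))


# Hypothetical profiling table; a real one would come from on-device measurement.
PROFILES = [
    SubNetProfile("small", "low_power", 8.0, 40.0, 0.70),
    SubNetProfile("medium", "balanced", 15.0, 90.0, 0.75),
    SubNetProfile("large", "max_perf", 30.0, 200.0, 0.79),
]

if __name__ == "__main__":
    choice = select_operating_point(PROFILES, max_latency_ms=20.0)
    print(choice.name if choice else "no feasible operating point")
```

A table lookup keeps the runtime decision cheap, which matters on a resource-limited device. A hierarchical manager like the one summarized above would presumably re-run such a selection whenever the targets or the available hardware resources change, for example when a second model starts running concurrently.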

🏷️

Tags

hardware, neural networks, algorithms, resource manager, super-network, runtime, performance
