Striking Performance: Large Language Models up to 4x Faster on RTX With TensorRT-LLM for Windows

Original article in English, about 1,000 words, roughly a 4-minute read.

GeForce RTX and NVIDIA RTX GPUs, which are packed with dedicated AI processors called Tensor Cores, are bringing the power of generative AI natively to more than 100 million Windows PCs and workstations.

NVIDIA's GeForce RTX and NVIDIA RTX GPUs bring the power of generative AI to more than 100 million Windows PCs and workstations. TensorRT-LLM for Windows is an open-source library that accelerates inference performance for the latest large language models, making generative AI on PCs up to 4x faster. TensorRT acceleration is also useful when integrating LLM capabilities with other technologies, such as retrieval-augmented generation (RAG). TensorRT doubles the speed of Stable Diffusion, while RTX Video Super Resolution (VSR) version 1.5 improves the quality of streamed video content by reducing or eliminating artifacts caused by video compression.
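The article itself contains no code. As a rough illustration of the kind of inference workflow TensorRT-LLM accelerates, the sketch below uses the TensorRT-LLM Python LLM API; the model name, prompt, and sampling values are placeholder assumptions, and the Windows-specific setup the article describes may use a different workflow (e.g. prebuilt engines).

```python
# Minimal sketch of LLM inference with the TensorRT-LLM Python LLM API.
# Assumptions: the tensorrt_llm package is installed and a compatible
# engine can be built/loaded for the chosen model; the model name,
# prompt, and sampling settings below are placeholders.
from tensorrt_llm import LLM, SamplingParams


def main():
    # Load the model and build an optimized TensorRT engine for the local RTX GPU.
    llm = LLM(model="meta-llama/Llama-2-7b-chat-hf")  # placeholder model

    # Illustrative sampling settings for generation.
    params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

    prompts = ["Explain what TensorRT-LLM does in one sentence."]
    for output in llm.generate(prompts, params):
        print(output.prompt, "->", output.outputs[0].text)


if __name__ == "__main__":
    main()
```

The same accelerated inference path is what an application would sit behind in a RAG setup: documents are retrieved first, prepended to the prompt, and the TensorRT-LLM engine handles the generation step.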
