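Summary
The NVIDIA GeForce RTX 4090 reports 128 compute units at a clock frequency of 2520 MHz. Its peak scalar results are 84761.26 GFLOPS in single precision, 1398.84 GFLOPS in double precision, and 44124.49 GIOPS for integer compute. Host-to-device transfer bandwidth (enqueueWriteBuffer) is 10.68 GBPS.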
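
As a rough sanity check on the single-precision figure, the sketch below compares it with a theoretical peak derived from the reported compute units and clock. The 128 FP32 lanes per compute unit and the convention of counting one fused multiply-add as 2 FLOPs are assumptions that the source does not state, and the function names are illustrative only.

```python
# Figures reported in the summary.
COMPUTE_UNITS = 128                # reported compute units
CLOCK_MHZ = 2520                   # reported clock frequency
MEASURED_FP32_GFLOPS = 84761.26    # measured scalar float result

# Assumptions (not stated in the source).
FP32_LANES_PER_UNIT = 128          # assumed FP32 lanes per compute unit
FLOPS_PER_FMA = 2                  # one fused multiply-add counted as 2 FLOPs


def theoretical_fp32_gflops(units, lanes_per_unit, clock_mhz,
                            flops_per_op=FLOPS_PER_FMA):
    """Peak GFLOPS = units * lanes * FLOPs per op * clock in GHz."""
    return units * lanes_per_unit * flops_per_op * clock_mhz / 1000.0


def implied_clock_mhz(measured_gflops, units, lanes_per_unit,
                      flops_per_op=FLOPS_PER_FMA):
    """Clock (MHz) at which the theoretical peak equals the measurement."""
    return measured_gflops * 1000.0 / (units * lanes_per_unit * flops_per_op)


peak = theoretical_fp32_gflops(COMPUTE_UNITS, FP32_LANES_PER_UNIT, CLOCK_MHZ)
clock = implied_clock_mhz(MEASURED_FP32_GFLOPS, COMPUTE_UNITS,
                          FP32_LANES_PER_UNIT)

print(f"theoretical peak at {CLOCK_MHZ} MHz: {peak:.2f} GFLOPS")
print(f"measured / theoretical: {MEASURED_FP32_GFLOPS / peak:.4f}")
print(f"implied effective clock: {clock:.0f} MHz")
```

Under these assumptions the peak at 2520 MHz is 82575.36 GFLOPS, roughly 2.6% below the measured scalar figure, which would correspond to an effective clock near 2587 MHz. One plausible reading is that the GPU boosted above the reported clock during the run, but the source does not confirm this.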
Key points
- Platform: NVIDIA CUDA
- Device: NVIDIA GeForce RTX 4090
- Driver version: 550.127.05 (Linux x64)
- Compute units: 128
- Clock frequency: 2520 MHz
- Global memory bandwidth (GBPS): float 873.20, float2 901.24, float4 917.89, float8 928.70, float16 938.94
- Single-precision compute (GFLOPS): float 84761.26, float2 80760.14, float4 80512.55, float8 79900.18, float16 79513.42
- Double-precision compute (GFLOPS): double 1398.84, double2 1397.85, double4 1394.48, double8 1387.83, double16 1374.64
- Integer compute (GIOPS): int 44124.49, int2 44080.14, int4 43970.14, int8 44089.10, int16 44104.19
- Integer compute, fast 24-bit (GIOPS): int 44067.89, int2 44081.56, int4 44038.71, int8 43851.83, int16 43369.82
- Integer char (8-bit) compute (GIOPS): char 38655.31, char2 38334.73, char4 37103.88, char8 30839.88, char16 28388.27
- Integer short (16-bit) compute (GIOPS): short 36869.31, short2 35287.81, short4 36894.71, short8 32896.40, short16 28145.07
- Transfer bandwidth (GBPS): enqueueWriteBuffer 10.68, enqueueReadBuffer 15.51, enqueueWriteBuffer non-blocking 10.08, enqueueReadBuffer non-blocking 13.46, enqueueMapBuffer (for read) 19.79
- Kernel launch latency: 4.06 µs
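
The per-type figures in the list above are easier to compare as ratios. The following sketch, which only reuses numbers from the list, computes how much throughput each type retains at vector width 16 relative to its scalar result, plus the single-to-double precision ratio. The dictionary layout and variable names are illustrative choices, not part of the benchmark output.

```python
# Compute throughput by vector width (1, 2, 4, 8, 16), copied from the list.
# float and double are in GFLOPS; int, char and short are in GIOPS.
RESULTS = {
    "float":  [84761.26, 80760.14, 80512.55, 79900.18, 79513.42],
    "double": [1398.84, 1397.85, 1394.48, 1387.83, 1374.64],
    "int":    [44124.49, 44080.14, 43970.14, 44089.10, 44104.19],
    "char":   [38655.31, 38334.73, 37103.88, 30839.88, 28388.27],
    "short":  [36869.31, 35287.81, 36894.71, 32896.40, 28145.07],
}

# Fraction of scalar throughput that remains at vector width 16.
retained_at_16 = {name: values[-1] / values[0]
                  for name, values in RESULTS.items()}

# Ratio of scalar single-precision to scalar double-precision throughput.
fp32_to_fp64 = RESULTS["float"][0] / RESULTS["double"][0]

for name, fraction in retained_at_16.items():
    print(f"{name:>6}: width 16 retains {fraction:.1%} of scalar throughput")
print(f"FP32 / FP64 (scalar): {fp32_to_fp64:.1f}x")
```

On these numbers, int and double stay within about 2% of their scalar results at width 16, float retains about 94%, and char and short fall to roughly 73% and 76%. Scalar single precision is about 60 times faster than scalar double precision.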