Gamingdeputy reported on February 10 that, according to Phoronix's evaluation of the GH100 (which contains a single Grace chip), Nvidia's Grace server CPU (72 cores, Arm architecture) appears very competitive with AMD's and Intel's products. In many tests it outperforms the top-of-the-line EPYC 9754 or Xeon Platinum 8592+ processors, although its overall performance still lags behind the x86 parts.
It is worth noting that Nvidia does not sell Grace chips separately, so the entry-level GH100 and the GH200 (a Hopper GPU plus a 72-core Grace CPU, with 480GB of LPDDR5X memory) are the only products on which Grace CPU performance can be tested.
With help from GPTshop.ai, Phoronix tested the GH100 remotely (running Ubuntu 23.10) and compared it with other CPUs. Gamingdeputy's summary of the results is as follows:
Benchmarks | GH200 | EPYC 9754 | Xeon 8592+ |
High Performance Conjugate Gradient | 41.69 | 25.89 | 35.42 |
Algebraic Multi-Grid Benchmark 1.2 | 1,997,929,111 | 2,291,049,667 | 1,839,912,667 |
LULESH 2.0.3 | 23,185.18 | 22,356.75 | 39,468.91 |
Xmrig 6.18.1 | 17,253 | 29,356.1 | 40,381.2 |
John The Ripper 2023.03.14 | 68,817 | 204,828 | 178,108 |
ACES DGEMM 1.0 | 17.94 | 43.68 | 29.14 |
GraphicsMagick 1.3.38 Sharpen | 1,363 | 924 | 749 |
GraphicsMagick 1.3.38 Enhance | 1,761 | 1,451 | 1,192 |
Graph500 3.0 Median | 1,239,790,000 | 1,147,090,000 | 1,238,670,000 |
Graph500 3.0 Max | 1,315,650,000 | 1,184,510,000 | 1,304,200,000 |
Stress-NG 0.16.04 Matrix | 512,759.08 | 552,067.04 | 301,894.53 |
Stress-NG 0.16.04 Matrix 3D | 17,483.02 | 8,009.21 | 13,854.38 |
Here are the GH200 CPU benchmark results (lower is better):
Benchmarks | GH200 | EPYC 9754 | Xeon 8592+ |
Rodinia 3.1 (Lower is better) | 30.31 | 25.15 | 39.89 |
NWChem 7.0.2 (Lower is better) | 1,403.5 | 1,700.8 | 1,850.8 |
Xompact3d Incompact3d (Lower is better) | 254.49 | 493.5 | 323.53 |
Xompact3d Incompact3d (Lower is better) | 9.81 | 9.03 | 10.18 |
Godot Compilation 4.0 (Lower is better) | 139.1 | 118.25 | 111.96 |
Primesieve 8.0 (Lower is better) | 35.49 | 21.76 | 49.06 |
Helsing 1.0-beta (Lower is better) | 67.61 | 48.95 | 84.95 |
DuckDB 0.9.1 IMDB (Lower is better) | 92.08 | 147.6 | 96.87 |
DuckDB 0.9.1 TPC-H Parquet (Lower is better) | 148.76 | 177.13 | 134.73 |
RawTherapee (Lower is better) | 46.72 | 66.13 | 45.53 |
Timed Gem 5 Compilation 23.0.1 (Lower is better) | 180.62 | 208.58 | 174.18 |
Overall Average Performance (Higher is better) | 2,175.03 | 2,459.11 | 2,242.9 |
The results show that the Grace chip beat Intel's Emerald Rapids in 15 tests and AMD's Bergamo and Genoa in 13.
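The win counts can be checked against the tabulated figures. The following is a minimal sketch, assuming the first table is higher-is-better (the article does not say so explicitly) and the second is lower-is-better, as stated. The names `HIGHER_IS_BETTER`, `LOWER_IS_BETTER` and `count_wins` are illustrative, and the two identically named Xompact3d Incompact3d rows are distinguished only by their order in the table.

```python
# Rows transcribed from the tables above as (name, GH200, EPYC 9754, Xeon 8592+).
# Assumption: higher scores win in the first table, lower scores in the second.
HIGHER_IS_BETTER = [
    ("High Performance Conjugate Gradient", 41.69, 25.89, 35.42),
    ("Algebraic Multi-Grid Benchmark 1.2", 1_997_929_111, 2_291_049_667, 1_839_912_667),
    ("LULESH 2.0.3", 23_185.18, 22_356.75, 39_468.91),
    ("Xmrig 6.18.1", 17_253, 29_356.1, 40_381.2),
    ("John The Ripper 2023.03.14", 68_817, 204_828, 178_108),
    ("ACES DGEMM 1.0", 17.94, 43.68, 29.14),
    ("GraphicsMagick 1.3.38 Sharpen", 1_363, 924, 749),
    ("GraphicsMagick 1.3.38 Enhance", 1_761, 1_451, 1_192),
    ("Graph500 3.0 Median", 1_239_790_000, 1_147_090_000, 1_238_670_000),
    ("Graph500 3.0 Max", 1_315_650_000, 1_184_510_000, 1_304_200_000),
    ("Stress-NG 0.16.04 Matrix", 512_759.08, 552_067.04, 301_894.53),
    ("Stress-NG 0.16.04 Matrix 3D", 17_483.02, 8_009.21, 13_854.38),
]

LOWER_IS_BETTER = [
    ("Rodinia 3.1", 30.31, 25.15, 39.89),
    ("NWChem 7.0.2", 1_403.5, 1_700.8, 1_850.8),
    ("Xompact3d Incompact3d (first row)", 254.49, 493.5, 323.53),
    ("Xompact3d Incompact3d (second row)", 9.81, 9.03, 10.18),
    ("Godot Compilation 4.0", 139.1, 118.25, 111.96),
    ("Primesieve 8.0", 35.49, 21.76, 49.06),
    ("Helsing 1.0-beta", 67.61, 48.95, 84.95),
    ("DuckDB 0.9.1 IMDB", 92.08, 147.6, 96.87),
    ("DuckDB 0.9.1 TPC-H Parquet", 148.76, 177.13, 134.73),
    ("RawTherapee", 46.72, 66.13, 45.53),
    ("Timed Gem 5 Compilation 23.0.1", 180.62, 208.58, 174.18),
]


def count_wins(higher, lower):
    """Return (wins vs EPYC 9754, wins vs Xeon 8592+) for the GH200 column."""
    wins_amd = 0
    wins_intel = 0
    for _name, gh200, amd, intel in higher:
        wins_amd += gh200 > amd      # a larger score is a win here
        wins_intel += gh200 > intel
    for _name, gh200, amd, intel in lower:
        wins_amd += gh200 < amd      # a smaller time is a win here
        wins_intel += gh200 < intel
    return wins_amd, wins_intel


if __name__ == "__main__":
    amd_wins, intel_wins = count_wins(HIGHER_IS_BETTER, LOWER_IS_BETTER)
    print(f"GH200 wins vs EPYC 9754: {amd_wins}")
    print(f"GH200 wins vs Xeon 8592+: {intel_wins}")
```

Under these assumptions, the tally over the 23 listed tests (excluding the overall average row) comes to 13 wins against the EPYC 9754 and 15 against the Xeon Platinum 8592+.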
On average, Grace performance is still 3% behind the Emerald Rapids series' Xeon Platinum 8592+, and 13% behind Bergamo's EPYC 9754 and Genoa's EPYC 9654.
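These percentages appear to follow from the Overall Average Performance row of the table. As a rough sketch, assuming that row is a higher-is-better aggregate score and that "behind" means how far each rival's score sits above Grace's (the function name `percent_ahead` is illustrative, not something from Phoronix):

```python
def percent_ahead(rival: float, baseline: float) -> float:
    """Percentage by which `rival` exceeds `baseline`."""
    return (rival / baseline - 1.0) * 100.0


# Overall Average Performance figures from the table above.
GRACE = 2175.03
EPYC_9754 = 2459.11
XEON_8592 = 2242.90

xeon_gap = percent_ahead(XEON_8592, GRACE)
epyc_gap = percent_ahead(EPYC_9754, GRACE)

print(f"Xeon Platinum 8592+ ahead of Grace by {xeon_gap:.1f}%")
print(f"EPYC 9754 ahead of Grace by {epyc_gap:.1f}%")
```

Rounded to whole numbers, this gives 3% and 13%. Measuring the gap relative to the rival's score instead would give a slightly smaller figure for the EPYC 9754, so the choice of baseline matters when quoting such numbers.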
According to Phoronix, some workloads are still not fully optimized for AArch64 (Arm), which is a key reason for Grace's significant disadvantage in some scenarios.