Benchmarks: Intel Xeon Scalable Processor vs. Nvidia V100 GPU
9th July 2018 / by Xcelerit /
Intel’s top-of-the-line Xeon Scalable Processor (Skylake architecture) features a massive increase in compute power compared to the previous Broadwell generation. It comes with AVX-512 SIMD and can issue two fused multiply-add (FMA) instructions per clock cycle, effectively performing up to 32 double-precision floating-point operations per core per clock tick. The highest-end Platinum range comes with […]
Benchmarks: Deep Learning Nvidia P100 vs. V100 GPU
27th November 2017 / by Xcelerit /
Artificial intelligence, and in particular deep learning, has become hugely popular in recent years. It has shown outstanding performance in solving a wide variety of tasks from almost all fields of science. The mainstream has primarily focused on applications for computer vision and language processing, but deep learning also shows great potential for a wider […]
Benchmarks: Intel Xeon Phi (KNL) vs. Broadwell CPU
28th June 2017 / by Xcelerit /
Intel’s Knights Landing processor is the latest generation of the Xeon Phi many-core processor family. It is a host processor and x86 binary compatible, so it can run any standard x86 operating system. It comes with 64-68 cores, high-performance stacked memory, and a 512-bit-wide vector unit, all designed for massively parallel workloads. The Xeon Broadwell […]
Benchmarks: Nvidia P100 vs K80 GPU
18th April 2017 / by Xcelerit /
Nvidia’s Pascal generation GPUs, in particular the flagship compute-grade P100, are said to be game-changers for compute-intensive applications. Compared to the Kepler generation flagship Tesla K80, the P100 provides 1.6x more double-precision GFLOPS. The P100’s stacked memory features 3x the memory bandwidth of the K80, an important factor for memory-intensive applications. In […]
HPC Benchmarks: What is a Fair Baseline?
17th April 2017 / by Xcelerit /
Many studies and publications claim large speedups on High Performance Computing (HPC) hardware. In reality, almost any speedup can be reported, depending on the baseline used… What does this mean in practice? How should one interpret these numbers? And – more importantly – what should be the baseline […]