HPC Challenge Benchmark

HPC Challenge is a benchmark suite that measures a range of memory access patterns. The benchmark's source code repository is available on the HPCC SourceForge and HPCC@BitBucket pages.

The HPC Challenge benchmark consists of seven tests:

  1. HPL – the Linpack TPP benchmark, which measures the floating point rate of execution for solving a linear system of equations.
  2. DGEMM – measures the floating point rate of execution of double precision real matrix-matrix multiplication.
  3. STREAM – a simple synthetic benchmark program that measures sustainable memory bandwidth (in GB/s) and the corresponding computation rate for simple vector kernels.
  4. PTRANS (parallel matrix transpose) – exercises the communications where pairs of processors communicate with each other simultaneously. It is a useful test of the total communications capacity of the network.
  5. RandomAccess – measures the rate of integer random updates of memory (GUPS).
  6. FFT – measures the floating point rate of execution of double precision complex one-dimensional Discrete Fourier Transform (DFT).
  7. Communication bandwidth and latency – a set of tests to measure latency and bandwidth of a number of simultaneous communication patterns; based on b_eff (effective bandwidth benchmark).
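
To make the units in the list above concrete, here is a minimal Python sketch of how a STREAM-style GB/s figure and a RandomAccess-style GUPS figure can be derived. It is not the HPCC source code. The function names, the seed, and the polynomial constant are assumptions chosen for illustration, and the random generator is only a simple shift-and-XOR scheme in the general style of such benchmarks. Because the loops run in the Python interpreter, the numbers reflect interpreter overhead rather than real memory performance; only the accounting is the point.

```python
import time
from array import array

MASK64 = (1 << 64) - 1   # keep values within 64 bits
POLY = 0x7               # assumed feedback constant, for illustration only


def stream_triad(n, scalar=3.0):
    """Run the triad kernel a[i] = b[i] + scalar * c[i] once.

    Returns (a, seconds, gb_per_s).
    """
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    a = array("d", [0.0]) * n
    start = time.perf_counter()
    for i in range(n):
        a[i] = b[i] + scalar * c[i]
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
    bytes_moved = 3 * 8 * n  # two reads and one write of 8-byte doubles
    return a, elapsed, bytes_moved / elapsed / 1e9


def next_random(ran):
    """Shift left by one bit; XOR in POLY if the top bit was set."""
    top_bit_set = ran >> 63
    ran = (ran << 1) & MASK64
    return ran ^ POLY if top_bit_set else ran


def random_access_updates(table, n_updates, seed=1):
    """XOR pseudo-random values into pseudo-random table slots, in place.

    The table length must be a power of two. Returns the update rate in
    giga-updates per second (GUPS).
    """
    size = len(table)
    if size == 0 or size & (size - 1):
        raise ValueError("table length must be a power of two")
    ran = seed
    start = time.perf_counter()
    for _ in range(n_updates):
        ran = next_random(ran)
        table[ran & (size - 1)] ^= ran  # low bits of ran pick the slot
    elapsed = max(time.perf_counter() - start, 1e-9)
    return n_updates / elapsed / 1e9


if __name__ == "__main__":
    _, seconds, gb_per_s = stream_triad(1_000_000)
    print(f"triad: {seconds:.3f} s, {gb_per_s:.3f} GB/s")
    table = list(range(1 << 16))
    print(f"random access: {random_access_updates(table, 4 * len(table)):.6f} GUPS")
```

Since XOR is its own inverse, replaying the same update stream a second time restores the table to its initial contents, which gives a simple way to check that the updates were applied correctly.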

Exploiting the Cores to Their Max

Parallelizing tasks is considered a major challenge for future extreme-scale computing systems. A sophisticated parallel computing system must resolve bus contention efficiently while sustaining a high computation rate. The main difficulty is achieving high CPU core usage through increased task parallelism while keeping bus bandwidth allocation moderate. To tackle these problems, a novel arbitration technique known as Parallel Adaptive Arbitration (PAA) has been proposed for masters that are designed according to the traffic behaviour of the data flow. These masters are implemented using a synthetic benchmark program that measures sustainable memory bandwidth and the corresponding computation rate. The proposed arbitration technique makes a strong case for fair bandwidth optimization and high CPU utilization: it achieves processor core utilization of up to 77% through a high degree of task parallelization and also reduces bandwidth fluctuation.
To find out more regarding PAA, follow the link below:
http://www.sciencedirect.com/science/article/pii/S0045790615002815

If you are unable to view the article, then you can contact me.