In High Performance Computing, data access has complex implications and requires concepts that are fundamentally different from those provided in the STL.
Iterators as we know them are simply not enough.
The proposed range concepts for the standard library are a significant improvement, but they are designed for the mental model of iterating and mapping values, not for hierarchical domain decomposition.
Even for a seemingly trivial array there are countless ways to partition and store its elements in distributed memory, and algorithms are required to behave and scale identically for all of them. It also does not help that most applications operate on multidimensional data structures where efficient access to neighborhood regions is crucial. Among HPC developers, it is therefore widely accepted that canonical iteration space and physical memory layout must be specified as separate concepts.
For this, we use views based on multidimensional index sets, inspired by the proposed range concepts.
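To make the separation of iteration space and memory layout concrete, here is a minimal sketch for a one-dimensional array. All names (LocalIndex, blocked, cyclic, count_owned_by) are hypothetical and chosen for illustration; they are not the API of any particular library. The algorithm only sees the canonical index space [0, n), while the distribution pattern decides which unit owns each element and where it sits in that unit's local memory.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration types; not taken from any real library.
struct LocalIndex {
    std::size_t unit;    // which process or core owns the element
    std::size_t offset;  // position within that unit's local memory
};

// Blocked distribution: each unit owns one contiguous chunk of
// ceil(n / units) elements (the last chunk may be shorter).
inline LocalIndex blocked(std::size_t global, std::size_t n, std::size_t units) {
    assert(units > 0 && global < n);
    const std::size_t block = (n + units - 1) / units;
    return LocalIndex{global / block, global % block};
}

// Cyclic distribution: elements are dealt out round-robin.
inline LocalIndex cyclic(std::size_t global, std::size_t units) {
    assert(units > 0);
    return LocalIndex{global % units, global / units};
}

// An algorithm written purely against the canonical index space [0, n).
// It is parameterized by the pattern and never hard-codes a layout.
template <typename Pattern>
std::size_t count_owned_by(std::size_t unit, std::size_t n, Pattern pattern) {
    std::size_t count = 0;
    for (std::size_t g = 0; g < n; ++g) {
        if (pattern(g).unit == unit) {
            ++count;
        }
    }
    return count;
}
```

Swapping blocked for cyclic changes where every element lives, yet count_owned_by stays untouched. That is the property the text asks for: the same algorithm must work for every distribution.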
In this session, we will explain the challenges of distributing container elements across thousands of cores and how modern C++ makes portable efficiency achievable.
As an HPC aficionado, you know you want this:
copy( matrix_a | local() | block({ 2,3 }), matrix_b | block({ 4,5 }) )
If this does not look familiar to you: we give a gentle introduction to High Performance Computing along the way.
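For readers new to the pipe syntax in the expression above, the following sketch shows one way view modifiers can compose on a two-dimensional index set. It does not reproduce the semantics of local() or block(), which this abstract does not define. IndexSet2D and Sub are hypothetical names, and Sub simply selects a rectangular sub-region relative to its parent.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 2-D index set: a rectangle in the global index space.
struct IndexSet2D {
    std::size_t row, col;    // global coordinates of the top-left corner
    std::size_t rows, cols;  // extent of the rectangle
    std::size_t size() const { return rows * cols; }
};

// Hypothetical view modifier: a sub-rectangle given relative to its parent.
struct Sub {
    std::size_t row, col;
    std::size_t rows, cols;
};

// Applying a modifier yields a new index set. No elements are copied;
// only the index mapping is narrowed.
inline IndexSet2D operator|(const IndexSet2D& parent, const Sub& m) {
    assert(m.row + m.rows <= parent.rows && m.col + m.cols <= parent.cols);
    return IndexSet2D{parent.row + m.row, parent.col + m.col, m.rows, m.cols};
}
```

Because each modifier returns another index set, modifiers chain naturally: IndexSet2D{0, 0, 8, 8} | Sub{2, 2, 4, 4} | Sub{1, 1, 2, 2} describes a 2-by-2 region whose top-left corner is at global position (3, 3).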