Memory alignment for multidimensional arrays
Multidimensional arrays need to be padded in the fastest-moving dimension so that array sections are aligned at the desired byte boundaries:
- Fortran: first array dimension
- C/C++: last array dimension
npadded = ((n + veclen - 1) / veclen) * veclen
- No alignment requested: veclen = 1
- 16-byte alignment (SSE): veclen = 4 (sp) or 2 (dp)
- 32-byte alignment (AVX2): veclen = 8 (sp) or 4 (dp)
- 64-byte alignment (AVX-512): veclen = 16 (sp) or 8 (dp)
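The padding rule above can be checked with a small C sketch. The helper names veclen_for and padded_length are hypothetical, chosen only for illustration. The sketch assumes that veclen is the alignment in bytes divided by the element size, which matches the sp/dp values in the list (4-byte and 8-byte reals).

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements that fit in one aligned block.
   An alignment smaller than the element size is treated as
   "no alignment requested", i.e. veclen = 1 (assumption). */
static size_t veclen_for(size_t align_bytes, size_t elem_bytes)
{
    assert(elem_bytes > 0);
    size_t v = align_bytes / elem_bytes;
    return v > 0 ? v : 1;
}

/* Round n up to the next multiple of veclen.
   The division is an integer division and truncates. */
static size_t padded_length(size_t n, size_t veclen)
{
    assert(veclen > 0);
    return ((n + veclen - 1) / veclen) * veclen;
}
```

For example, with n = 100 and 32-byte alignment in single precision, veclen is 8 and the padded length is 104. A length that is already a multiple of veclen, such as 96, is left unchanged.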
Example:
    real, allocatable :: a(:,:), b(:,:), c(:,:)
    !dir$ attributes align : 32 :: a, b, c
    ...
    allocate (a(npadded,n))
    allocate (b(npadded,n))
    allocate (c(npadded,n))
    ...
    do j = 1, n
       do k = 1, n
          !dir$ vector aligned
          do i = 1, npadded
             c(i,j) = c(i,j) + a(i,k) * b(k,j)
          end do
       end do
    end do
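For the C/C++ case, where the last dimension is the one to pad, the same loop nest might look like the sketch below. This is an illustration under stated assumptions, not a translation supplied by the source: alloc_padded and matmul_padded are hypothetical names, the arrays are stored as flat buffers with the padded index i running fastest, and the C11 function aligned_alloc stands in for the Fortran align attribute. The padding elements are zero-filled so that looping up to npadded never reads uninitialized memory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ALIGN_BYTES 32  /* 32-byte alignment: veclen = 8 for float */

/* Allocate rows x npadded floats, zero-filled and aligned to ALIGN_BYTES.
   Because npadded is a multiple of veclen, every row starts on an
   aligned address, not just the first one. */
static float *alloc_padded(size_t rows, size_t npadded)
{
    size_t bytes = rows * npadded * sizeof(float);
    assert(bytes % ALIGN_BYTES == 0);  /* size must be a multiple of the alignment */
    float *p = aligned_alloc(ALIGN_BYTES, bytes);
    if (p) memset(p, 0, bytes);
    return p;
}

/* c(i,j) += a(i,k) * b(k,j), stored so that x(i,j) is x[j * npadded + i]. */
static void matmul_padded(size_t n, size_t npadded,
                          float *c, const float *a, const float *b)
{
    for (size_t j = 0; j < n; ++j)
        for (size_t k = 0; k < n; ++k) {
            const float bkj = b[j * npadded + k];
            /* With an Intel compiler, "#pragma vector aligned" placed here
               would play the role of "!dir$ vector aligned". */
            for (size_t i = 0; i < npadded; ++i)
                c[j * npadded + i] += a[k * npadded + i] * bkj;
        }
}
```

The inner loop runs over the full padded length, as in the Fortran version, so its trip count is a multiple of veclen and each row of c and a begins at an aligned address.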