Most computations, such as element or contact forces, are performed in packets of MVSIZ. This parameter is adjustable to match the so-called vector length, and it also matters for cache locality. It is tuned according to hardware characteristics.
New treatments need to respect this programming model, which splits the loop over the number of elements or nodes into packets of MVSIZ. This ensures an optimal vector length and good cache usage as well as minimal local storage (local arrays of size MVSIZ instead of the number of elements or nodes).
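The packet model described above can be sketched as follows. This is a minimal illustration, not code from the actual source tree: the value of `mvsiz`, the element count `numel`, and the arrays `stiff`, `disp`, `force`, and `work` are all hypothetical names chosen for the example.

```fortran
program packet_loop
  implicit none
  integer, parameter :: mvsiz = 128    ! hypothetical packet size
  integer, parameter :: numel = 1000   ! hypothetical number of elements
  real :: stiff(numel), disp(numel), force(numel)
  real :: work(mvsiz)                  ! local storage sized by the packet, not by numel
  integer :: nf, nel, i

  stiff = 2.0
  disp  = 0.5
  force = 0.0

  ! Outer loop walks over the elements one packet at a time
  do nf = 1, numel, mvsiz
    ! The last packet may hold fewer than mvsiz elements
    nel = min(mvsiz, numel - nf + 1)

    ! Inner loops are short, contiguous, and easy to vectorize
    do i = 1, nel
      work(i) = stiff(nf + i - 1) * disp(nf + i - 1)
    end do
    do i = 1, nel
      force(nf + i - 1) = work(i)
    end do
  end do

  ! Self-check: every element should hold 2.0 * 0.5
  if (any(abs(force - 1.0) > 1.0e-6)) error stop 'packet loop failed'
  print *, 'processed', numel, 'elements in packets of', mvsiz
end program packet_loop
```

The point of the design is that `work` never grows with the model size, so the working set of each packet stays small enough to remain in cache.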
| Construct | Recommendation |
|---|---|
| IF/THEN/ELSE | It is recommended to minimize the use of IF/THEN/ELSE inside computational loops. Whenever a test does not depend on the loop index, perform it outside the loop. |
| GOTO | GOTO is strictly forbidden inside computational loops, as it inhibits vectorization and optimization. |
| EXIT/CYCLE | EXIT and CYCLE should be avoided in computational loops wherever possible. |
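As an illustration of moving a loop-invariant test out of a computational loop, here is a minimal sketch. The flag `use_scaling` and the arrays `a`, `a_ref`, and `b` are hypothetical names introduced only for this example.

```fortran
program hoist_test
  implicit none
  integer, parameter :: n = 8
  real :: a(n), a_ref(n), b(n)
  logical :: use_scaling   ! hypothetical option that does not depend on the loop index
  integer :: i

  b = 3.0
  use_scaling = .true.

  ! Discouraged: the same invariant test is evaluated at every iteration
  do i = 1, n
    if (use_scaling) then
      a_ref(i) = 2.0 * b(i)
    else
      a_ref(i) = b(i)
    end if
  end do

  ! Preferred: test once, then run a branch-free loop
  if (use_scaling) then
    do i = 1, n
      a(i) = 2.0 * b(i)
    end do
  else
    do i = 1, n
      a(i) = b(i)
    end do
  end if

  ! Self-check: both forms must produce identical results
  if (any(a /= a_ref)) error stop 'results differ'
  print *, 'both forms give the same result'
end program hoist_test
```

The second form duplicates the loop body, but each loop is free of branches and is therefore a simpler candidate for vectorization.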
Inside a loop, it is recommended to keep the number of instructions reasonable.
Calling a procedure inside a loop inhibits vectorization unless the compiler can inline the call.
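One way to avoid a call per element is to call the procedure once per packet and keep the loop inside it. The sketch below assumes a hypothetical routine `scale_packet` and hypothetical arrays `x` and `y`; it only illustrates the pattern.

```fortran
program packet_call
  implicit none
  integer, parameter :: mvsiz = 64   ! hypothetical packet size
  integer, parameter :: nel = 50     ! number of active entries in this packet
  real :: x(mvsiz), y(mvsiz)
  integer :: i

  x = 0.0
  y = 0.0
  x(1:nel) = [(real(i), i = 1, nel)]

  ! One call for the whole packet instead of one call per element
  call scale_packet(nel, x, y)

  ! Self-check on the active part of the packet
  if (any(abs(y(1:nel) - 2.0 * x(1:nel)) > 1.0e-6)) error stop 'packet call failed'
  print *, 'packet of', nel, 'entries processed with a single call'

contains

  ! Hypothetical packet-level routine: the loop lives inside the procedure,
  ! so the loop body contains no call and stays vectorizable
  subroutine scale_packet(n, xin, yout)
    integer, intent(in) :: n
    real, intent(in)    :: xin(n)
    real, intent(out)   :: yout(n)
    integer :: k
    do k = 1, n
      yout(k) = 2.0 * xin(k)
    end do
  end subroutine scale_packet

end program packet_call
```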
The loop below is not vectorized because of a possible dependence (the same value of INDEX(I) for different I):
DO I = 1, N
K = INDEX(I)
A(K) = B(K)
B(K) = 2*A(K)
END DO
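Whether the dependence is real comes down to whether the index array repeats a value. The following sketch shows one way to check this during development. The function `has_duplicates`, its arguments, and the sample index arrays are hypothetical and only serve the illustration; the check assumes every index lies between 1 and `nmax`.

```fortran
program check_index
  implicit none
  integer, parameter :: n = 5, m = 10
  integer :: idx_unique(n) = [3, 7, 1, 9, 4]   ! no repeated value: no true dependence
  integer :: idx_repeat(n) = [3, 7, 3, 9, 4]   ! value 3 appears twice: true dependence

  if (has_duplicates(idx_unique, n, m)) error stop 'unexpected duplicate'
  if (.not. has_duplicates(idx_repeat, n, m)) error stop 'duplicate not detected'
  print *, 'index check behaves as expected'

contains

  ! Returns .true. if any value appears more than once in idx(1:nidx).
  ! Assumes 1 <= idx(i) <= nmax for all i.
  logical function has_duplicates(idx, nidx, nmax)
    integer, intent(in) :: nidx, nmax
    integer, intent(in) :: idx(nidx)
    logical :: seen(nmax)
    integer :: i

    seen = .false.
    has_duplicates = .false.
    do i = 1, nidx
      if (seen(idx(i))) then
        has_duplicates = .true.
        return
      end if
      seen(idx(i)) = .true.
    end do
  end function has_duplicates

end program check_index
```

With a repeated index, the second visit to the same K reads the value of B(K) written by the first visit, so the iterations cannot safely run side by side.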
When there is no true dependence, vectorization needs to be forced by adding a compiler directive.
To keep the code portable across platforms and compilers, an architecture-specific include file named vectorize.inc manages the vectorization directives. The programmer only needs to add this include file immediately before the DO loop:
#include "vectorize.inc"
DO I = 1, N
K = INDEX(I)
A(K) = B(K)
B(K) = 2*A(K)
END DO
Note that there is another include file, simd.inc, which forces unconditional vectorization even if the compiler detects a true dependence. It is recommended to use only vectorize.inc, which is more conservative regarding correctness.