OpenRadioss Performance Aspects: Vectorization and Optimization

Introduction

This page covers vectorization and optimization of the OpenRadioss Fortran code. New contributors need to understand these fundamentals, and new functionality should be developed with the same level of care for performance as the existing code.

Vectorization

Vector Length

Most computations, such as element or contact forces, are performed in packets of MVSIZ items. This parameter can be adjusted to match the hardware's vector length. It also matters for cache locality, and it is tuned according to hardware characteristics.

New treatments need to respect this programming model, which splits the loop over elements or nodes into packets of MVSIZ. This ensures a good vector length and cache usage as well as minimal local storage (local arrays of size MVSIZ instead of the number of elements or nodes).
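The packet model described above can be sketched as follows. This is a minimal illustration in free-form Fortran, not OpenRadioss source: the module and routine names, the loop variables NF and NEL, the value 128 for MVSIZ, and the use of double precision in place of my_real are all assumptions made for the example.

```fortran
module packet_demo
  implicit none
  ! Assumed packet size; the real MVSIZ is tuned per platform.
  integer, parameter :: MVSIZ = 128
contains
  ! Hypothetical kernel: doubles each element value, one packet at a time.
  subroutine scale_by_packets(numel, val, res)
    integer, intent(in) :: numel
    double precision, intent(in)  :: val(numel)
    double precision, intent(out) :: res(numel)
    double precision :: work(MVSIZ)       ! local storage sized MVSIZ, not numel
    integer :: nf, nel, i

    do nf = 1, numel, MVSIZ               ! nf = first element of the packet
      nel = min(MVSIZ, numel - nf + 1)    ! the last packet may be shorter
      do i = 1, nel                       ! short loop over one packet
        work(i) = 2.0d0 * val(nf + i - 1)
      end do
      do i = 1, nel
        res(nf + i - 1) = work(i)
      end do
    end do
  end subroutine scale_by_packets
end module packet_demo
```

The inner loops never run over more than MVSIZ items, so the work array stays small enough to remain in cache regardless of the total number of elements.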

Loop Control

IF/THEN/ELSE

Minimize the use of IF/THEN/ELSE constructs inside computational loops.

Whenever a test does not depend on the loop index, perform it outside the loop.
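As an illustration of moving a loop-invariant test out of the loop, the sketch below tests a flag once and then runs one of two branch-free loops. The routine name, the flag IFLAG, and the arithmetic are hypothetical and chosen only to keep the example short.

```fortran
module hoist_demo
  implicit none
contains
  ! Hypothetical kernel: iflag does not depend on i,
  ! so it is tested once, outside the loops.
  subroutine apply_factor(n, iflag, a, b)
    integer, intent(in) :: n, iflag
    double precision, intent(in)  :: a(n)
    double precision, intent(out) :: b(n)
    integer :: i

    if (iflag == 1) then
      do i = 1, n              ! branch-free loop body
        b(i) = 2.0d0 * a(i)
      end do
    else
      do i = 1, n              ! branch-free loop body
        b(i) = a(i)
      end do
    end if
  end subroutine apply_factor
end module hoist_demo
```

The loop is duplicated, but each copy has a straight-line body that the compiler can vectorize without evaluating the test at every iteration.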

GOTO

GOTO is strictly forbidden inside computational loops, as it inhibits vectorization and optimization.

EXIT/CYCLE

EXIT and CYCLE should be avoided in computational loops, or at least kept to a minimum.
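One possible way to remove a CYCLE, shown here only as a sketch, is to compute a masked value for every item with the standard MERGE intrinsic. The routine and array names below (including OFF as an activation flag) are hypothetical. Note that MERGE may evaluate both alternatives, so this pattern only suits expressions that are safe to compute for every item.

```fortran
module mask_demo
  implicit none
contains
  ! Hypothetical kernel: force is stiff*disp for active items, zero otherwise.
  subroutine masked_force(nel, off, stiff, disp, force)
    integer, intent(in) :: nel
    double precision, intent(in)  :: off(nel), stiff(nel), disp(nel)
    double precision, intent(out) :: force(nel)
    integer :: i

    do i = 1, nel
      ! Replaces the pattern: if (off(i) == 0) cycle
      ! Inactive items get an explicit zero instead of being skipped.
      force(i) = merge(stiff(i) * disp(i), 0.0d0, off(i) /= 0.0d0)
    end do
  end subroutine masked_force
end module mask_demo
```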

Keep the number of instructions inside a loop reasonable.

Calling a procedure inside a loop generally inhibits vectorization, unless the compiler is able to inline the call.
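A common way to avoid a call inside the loop is to pass the whole packet to the procedure, so that the loop lives inside it. The sketch below assumes a hypothetical routine SPRING_FORCE and hypothetical array names; it is an illustration of the pattern, not code taken from OpenRadioss.

```fortran
module packet_call_demo
  implicit none
contains
  ! Hypothetical routine: receives the whole packet of nel items,
  ! so the caller makes one call per packet instead of one per item.
  subroutine spring_force(nel, stiff, disp, force)
    integer, intent(in) :: nel
    double precision, intent(in)  :: stiff(nel), disp(nel)
    double precision, intent(out) :: force(nel)
    integer :: i

    do i = 1, nel              ! no procedure call inside the loop
      force(i) = stiff(i) * disp(i)
    end do
  end subroutine spring_force
end module packet_call_demo
```

The caller then writes a single `call spring_force(nel, stiff, disp, force)` for each packet rather than looping over a scalar routine.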

Data Dependency

The loop below is not vectorized automatically, because the compiler must assume a possible dependence (the same value of INDEX(I) for different I):

      DO I = 1, N
        K = INDEX(I)
        A(K) = B(K)
        B(K) = 2*A(K)
      END DO

If there is no true dependence, vectorization needs to be forced with a compiler directive.

To keep the code portable across platforms and compilers, an architecture-specific include file named vectorize.inc manages the vectorization directives. The programmer only needs to include this file just before the DO loop:

#include "vectorize.inc"
      DO I = 1, N
        K = INDEX(I)
        A(K) = B(K)
        B(K) = 2*A(K)
      END DO

Note that there is another include file, simd.inc, which forces vectorization unconditionally, even if the compiler detects a true dependence. It is recommended to use only vectorize.inc, which is more conservative regarding correctness.

Arithmetic Functions

Power

Never use a real exponent for an integer power, because real power arithmetic is costly. Take care not to use the real variables defined in constant.inc when an integer is enough.

  • A**2: allowed

  • A**TWO: forbidden, as TWO is a my_real variable defined in constant.inc

  • A**THIRD: allowed, as there is no other choice here; real power arithmetic is required

Arrays

Fortran90 Array Operations

The use of Fortran90 array operations is encouraged, as long as code readability is preserved by always specifying the array bounds.

Example:

      INTEGER, DIMENSION(NUMNOD) :: A, B, C
      A = B + C                             ! confusion between variable and array operation
      A(:NUMNOD) = B(:NUMNOD) + C(:NUMNOD)  ! default lower bound: 1

Multidimensional Arrays

Data Locality

Large arrays sized by the number of nodes or elements are laid out to maximize data locality. Fortran stores arrays in column-major order, so the first index is contiguous in memory; these arrays therefore have the smallest dimension first, as in X(3,NUMNOD).

Rule of thumb for data locality of 2D arrays:

  • If the large dimension is >= MVSIZ or 128: it should be last (X(3,NUMNOD))

  • If the large dimension is <= MVSIZ or 128: it should be first (C(MVSIZ,5))
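The two rules of thumb can be illustrated with the sketch below. The routine names, the value 128 for MVSIZ, and the arithmetic are assumptions made for the example; only the array shapes X(3,NUMNOD) and C(MVSIZ,5) come from the rules above.

```fortran
module locality_demo
  implicit none
  integer, parameter :: MVSIZ = 128   ! assumed packet size
contains
  ! Large dimension last: the 3 coordinates of a node are adjacent in memory.
  subroutine node_norm2(numnod, x, r2)
    integer, intent(in) :: numnod
    double precision, intent(in)  :: x(3, numnod)
    double precision, intent(out) :: r2(numnod)
    integer :: n
    do n = 1, numnod
      r2(n) = x(1, n)**2 + x(2, n)**2 + x(3, n)**2
    end do
  end subroutine node_norm2

  ! Packet dimension first: the inner loop over i walks contiguous memory.
  ! Assumes nel <= MVSIZ.
  subroutine sum_columns(nel, c, s)
    integer, intent(in) :: nel
    double precision, intent(in)  :: c(MVSIZ, 5)
    double precision, intent(out) :: s(nel)
    integer :: i, j
    s(1:nel) = 0.0d0
    do j = 1, 5
      do i = 1, nel            ! unit stride along the first index
        s(i) = s(i) + c(i, j)
      end do
    end do
  end subroutine sum_columns
end module locality_demo
```

In both routines the innermost memory accesses touch neighboring addresses, which is what the ordering rules are meant to achieve.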

Structure Of Arrays

Use a structure of arrays (POINT%X(1:NBPOINTS)) rather than an array of structures (POINT(1:NBPOINTS)%X).
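A structure of arrays could look like the sketch below. The type name POINT_SET, its components, and the routine SHIFT_X are hypothetical; the point is only that each component is a contiguous array, so a loop over one component has unit stride.

```fortran
module soa_demo
  implicit none
  ! Structure of arrays: one object holding contiguous coordinate arrays.
  type point_set                       ! hypothetical type name
    double precision, allocatable :: x(:), y(:)
  end type point_set
contains
  ! Hypothetical kernel: shifts all x coordinates by dx.
  subroutine shift_x(point, dx)
    type(point_set), intent(inout) :: point
    double precision, intent(in)   :: dx
    integer :: i
    do i = 1, size(point%x)
      point%x(i) = point%x(i) + dx     ! consecutive x values are adjacent
    end do
  end subroutine shift_x
end module soa_demo
```

With an array of structures, consecutive X values would instead be separated in memory by the other components of each point, which gives strided access in the same loop.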

Object Oriented Programming

Object-oriented features are not recommended unless you can verify that they do not harm performance.

Related articles

OpenRadioss Coding Standards

OpenRadioss Coding Recommendations

OpenRadioss Reader (Radioss Block Format)

OpenRadioss HMPP Development Insights