- Created by Paul Sharp, last modified on Mar 31, 2022
You are viewing an old version of this page. View the current version.
Compare with Current View Page History
« Previous Version 6 Next »
Introduction
This page deals with vectorization and optimization of Radioss Fortran code. This is a fundamental aspect of the code that needs to be well understood and learned by new Radioss programmers. Breaking performance of current code is not allowed. Furthermore, new functionality should be developed taking into account the same level of care regarding performance.
Vectorization deals with the execution of computational loops. It allows a computer to compute several loop indexes during the same cycle leveraging vector registers. This concept was first introduced on vector supercomputers (CRAY, NEC, FUJITSU…)
Nonetheless, though there are no more vector supercomputers, vectorization is reintroduced on modern CPUs like Xeon processor with AVX and AVX512
For instance, AVX512 first introduced into Intel Xeon Phi Knights Landing and Xeon Skylake allows the handling of 8x 64-bit double precision real at the same time or 16x single precision
Vectorization
Vector Length
Most computations in Radioss, like element or contact forces, are performed by packets of MVSIZ
. This parameter is adjustable to match so-called vector length. This parameter is also important for cache locality. It is optimized by parallel development team according to hardware characteristics
New treatments need to respect this programming model which is to split the loop over number of elements or nodes by packets of MVSIZ
. This will ensure optimal vector length and cache size as well as minimal local storage (local arrays of size MVSIZ
instead of number of elements or nodes)
Loop Control
| It is recommended to minimize the use of Every time a test does not depend on loop index value, it is asked to perform it outside of such loop |
|
|
|
|
Loop Size
Inside a loop it is recommended to keep the number of instructions reasonable. 20 instructions or less is good. Very long loops should be split to keep cache efficiency
Most compilers will be able to fuse short loops, while they will probably fail to vectorize long complex loops
Data Dependency
The loop below is not vectorized due to possible dependence (same value of INDEX(I)
for different I
):
DO I = 1, N K = INDEX(I) A(K) = B(K) B(K) = 2*A(K) END DO
In case of no true dependence, vectorization needs to be forced by adding a compiler directive
To keep portability across different platforms and compilers, an architecture specific include file exists named vectorize.inc that manages vectorization directives. The programmer just needs to add this include file just before the DO
loop:
#include "vectorize.inc" DO I = 1, N K = INDEX(I) A(K) = B(K) B(K) = 2*A(K) END DO
Notice there is another include file named simd.inc which makes unconditional vectorization, even if a true dependence is detected by the compiler. It is recommended to only use vectorise.inc which is more conservative regarding correctness
For Intel compiler:
vectorize.inc corresponds to !DIR$ IVDEP
simd.inc corresponds to !DIR$ SIMD
Procedure CALL
Calling a procedure inside a loop inhibits vectorization therefore it is not authorized
Inside Radioss, there are vectorized versions of procedures, basically the loop is put inside the procedure rather than outside:
VALPVEC
to replaceJACOBIEW
VINTER
to replaceFINTER
Nested Loops
In practice, only the inner most loop will be vectorized. So the inner most loop needs to be the largest one.
For fixed size loops it is possible to unroll them by hand or to use Fortran90 enhancement. Then the compiler is able to vectorize the outer loop
Unroll by hand is not enough on AVX512 or SSE architectures
Only Fortran 90 syntax helps in this case
Example Nested Loop with NEL >> DIM
DO I = 1, NEL DO J=1, DIM A(I,J) = B(I,J) + C(I,J) END DO END DO
To be transformed to:
DO J=1, DIM DO I = 1, NEL A(I,J) = B(I,J) + C(I,J) END DO END DO
Or using Fortran90 notation:
DO I=1,NEL A(I,:DIM)=B(I,:DIM) + C(I,:DIM) END DO
Or for this simple case:
A(:NEL,:DIM)=B(:NEL,:DIM) + C(:NEL,:DIM)
Arithmetic Functions
Power
Never use real variable for integer power because of the cost of real power arithmetic. Take
care to not use real variable defined in constant.inc when integer is enough
| Allowed |
| Forbidden as DEUX is a |
| here there is no other choice as a real power arithmetic is required |
Div
For invariant, it is advised to multiply by invert instead of doing a division by a constant inside a loop
Header definition
Each procedure needs to have a conforming header, containing by order of appearance:
A minimal description of the routine
The list of caller subroutines/functions
The list of called subroutines/functions
The
SUBROUTINE/FUNCTION
declaration with the list of dummy arguments, 5 per line, matching the call declarationThe used modules
The
implicit_f.inc
(IMPLICIT NONE
)The optional remaining global parameters and commons if any (for compatibility with the past)
The list of dummy arguments, one per line with its type,
INTENT
attribute, explicit bounds in case of array, a short description of the variableThe local variables
The source code
Notes:
The list of callers/callees is automatically generated and its actual look and feel may change according to external documentation tool choice retained (Doxygen, FORD, …)
All dummy and local variables are declared using Fortran90 style
It is important to respect the order given above as some dummy or local variables may use parameters defined in modules (commons or include files)
implicit_f.inc
must always be included. It automatically includesIMPLICIT NONE
instruction. It also automatically includesmy_real.inc
which definesmy_real
as REAL*8 for double precision or REAL*4 for single precision andconstant.inc
which defines numerical constant variables like ZERO hereExplicit bounds of all arrays are mandatory; the use of “
*
" is now forbidden for clarity, error detection, and to ease compiler optimization of the code. The use of hardcoded constant size should be avoided and systematically replaced by a parameter defined in a module or passed as argument
Example:
INTEGER IPARG(60, *)
Forbidden
INTEGER IPARG(NPARG, NGROUP)
Authorized
Fortran routine generic example:
Cgw|============================================================ Cgw| PENCOMP src/interf/pencomp.F Cgw|------------------------------------------------------------ Cgw| Description: Cgw| Compute new penetration for contacts Cgw|------------------------------------------------------------ Cgw|-- called by ----------- Cgw| PRECOMP src/interf/precomp.F Cgw|-- calls --------------- Cgw| PENNEW src/inter/pennew.F Cgw|============================================================ SUBROUTINE PENCOMP( 1 ISIZE, IFORM, CAND_N, FACT, PEN0, 2 NORM , PNEW ) C----------------------------------------------- C M o d u l e s C----------------------------------------------- USE ELBUFDEF_MOD C----------------------------------------------- C I m p l i c i t T y p e s C----------------------------------------------- #include "implicit_f.inc" C----------------------------------------------- C G l o b a l P a r a m e t e r s C----------------------------------------------- #include "mvsiz_p.inc" C----------------------------------------------- C C o m m o n B l o c k s C----------------------------------------------- C----------------------------------------------- C D u m m y A r g u m e n t s C----------------------------------------------- INTEGER, INTENT (IN) :: ISIZE ! length of CAND_N INTEGER, INTENT (IN) :: IFORM ! formulation used : 1 == standard ! 2 == improved INTEGER, INTENT (IN) :: CAND_N(ISIZE) ! contact node candidates ! REAL ou REAL*8 my_real, INTENT (IN) :: PEN0(ISIZE) ! initial penetration my_real, INTENT (IN) :: NORM(ISIZE) ! master segment normal my_real, INTENT (IN) :: FACT ! initial scale factor my_real, INTENT (OUT) :: PNEW(ISIZE) ! new computed penetration C----------------------------------------------- C L o c a l V a r i a b l e s C----------------------------------------------- INTEGER :: I,J, K, LAST my_real :: PENEOLD(MVSIZ) C----------------------------------------------- C S o u r c e L i n e C----------------------------------------------- ! Fortran code of the routine respecting indentations K = 0 DO I = 1,ISIZE,MVSIZ ! do the loop by packet of MVSIZ LAST = MIN(I+MVSIZ,ISIZE) IF (IFORM > 0) THEN DO J = 1, LAST PENEOLD(J) = FACT*PEN0(CAND_N(I+J-1)) END DO ELSE PENEOLD(1:LAST) = ZERO ! F90 notation ok ! ZERO defined in constant.inc automatically included within implicit.inc END IF CALL PENNEW(I,LAST,K,PENEOLD,PNEW,CAND_N) K = K+MVSIZ END DO ! ... RETURN END SUBROUTINE PENCOMP
Comments
It is important to comment important algorithms, especially when non straightforward coding is used
Comments are written in English
Comments respect Fortran90 standard. The use of "!
" at beginning of line is preferred to "C
" or "*
"
Except for precompiler directives, the following characters '
,"
, \
, /*
, */
, #
are forbidden, even inside comments. Especially "\
" is dangerous as it is interpreted as a continuation line by the precompiler
Example:
! this is a legal comment A = A + B ! this one is also authorized in FORTRAN 90 C = A + C ! next instruction ignored cause of this char \ D = C + D
Modules
Module Format
Generic format of a Fortran90 module is as follows:
MODULE <module name> USE [other module list] #include "implicit_f.inc" <declaration section> CONTAINS <procedure definitions> END MODULE <module name>
Naming Convention
Module name is defined as follows: MODULENAME_MOD
With module file name: modulename_mod.F
Module file is placed at the same location as other files used by the option
Module Usage
3 types of usage:
Derived data types definition
Variable names declaration (in replacement of commons)
Procedure interface (data type & argument list control,…)
Good practice is to split type declaration from variable declaration into 2 different modules.
This way it is possible to pass variables defined in modules at upper level (resol) into calling tree at lower levels
Then, such derived data type variables passed as argument of procedures can be defined using the module which defines them, allowing traceability of such variables throughout the code
The procedure that uses such variables passed by argument needs to include the module that defines derived data types
Example of derived data types (comments compliant with Doxygen):
C==================================================================== C DUMMYDEF_MOD modules/dummydef_mod.F C------------------------------------------------------------------- C> Description: C> define DUMMY struture C------------------------------------------------------------------- C> called by: \n C> @ref DUMMY starter/src/test/dummy.F \n C> @ref DINIT starter/src/test/dinit.F \n C> @ref DSOLVE starter/src/test/dsolve.F \n C>\n calling: \n C==================================================================== MODULE DUMMYDEF_MOD C----------------------------------------------------------------------- C----------------------------------------------- C M o d u l e s C----------------------------------------------- C----------------------------------------------- C m y _ r e a l C----------------------------------------------- C----------------------------------------------- C I m p l i c i t T y p e s C----------------------------------------------- #include "implicit_f.inc" C================================================= TYPE DUMMY_STRUCT_ C================================================= INTEGER :: L_ITAB !< size of integer array ITAB INTEGER :: L_RTAB !< size of integer array RTAB INTEGER, DIMENSION(:) , POINTER :: ITAB !< integer array ITAB my_real, DIMENSION(:) , POINTER :: RTAB !< real array RTAB END TYPE DUMMY_STRUCT_ END MODULE DUMMYDEF_MOD C==================================================================== C DINIT starter/src/test/dinit.F C------------------------------------------------------------------- C> Description: \n C> Allocate and initialize variable MY_DUM of type DUMMY_STRUCT_ C------------------------------------------------------------------- C> called by: \n C> @ref DUMMY starter/src/test/dummy.F \n C>\n calling: \n C==================================================================== SUBROUTINE DINIT(MY_DUM, NITEMS) C----------------------------------------------- C M o d u l e s C----------------------------------------------- USE DUMMYDEF_MOD C----------------------------------------------- C I m p l i c i t T y p e s C----------------------------------------------- #include "implicit_f.inc" C----------------------------------------------- C----------------------------------------------- C G l o b a l P a r a m e t e r s C----------------------------------------------- C----------------------------------------------- C C o m m o n B l o c k s C----------------------------------------------- C----------------------------------------------- C D u m m y A r g u m e n t s C----------------------------------------------- INTEGER, INTENT(IN) :: NITEMS TYPE(DUMMY_STRUCT_), INTENT(INOUT) :: MY_DUM C----------------------------------------------- C L o c a l V a r i a b l e s C----------------------------------------------- INTEGER :: N, IERROR C----------------------------------------------- MY_DUM%L_ITAB = NITEMS MY_DUM%L_RTAB = NITEMS ALLOCATE(MY_DUM%ITAB(MY_DUM%L_ITAB),MY_DUM%RTAB(MY_DUM%L_RTAB), & STAT=ierror) IF (IERROR /= 0) THEN ! better to use MY_ALLOCATE macro instead print*,'error:',IERROR stop 123 END IF DO N = 1, NITEMS MY_DUM%ITAB(N) = N MY_DUM%RTAB(N) = N**2 END DO RETURN END SUBROUTINE DINIT C==================================================================== C DSOLVE starter/src/test/dsolve.F C------------------------------------------------------------------- C> Description: \n C> Solve some dummy problem C------------------------------------------------------------------- C> called by: \n C> @ref DUMMY starter/src/test/dummy.F \n C>\n calling: \n C|==================================================================== SUBROUTINE DSOLVE(MY_DUM,RES) C----------------------------------------------- C M o d u l e s C----------------------------------------------- USE DUMMYDEF_MOD C----6------------------------------------------ C I m p l i c i t T y p e s C----------------------------------------------- #include "implicit_f.inc" C----------------------------------------------- C G l o b a l P a r a m e t e r s C----------------------------------------------- C C o m m o n B l o c k s C----------------------------------------------- C----------------------------------------------- C D u m m y A r g u m e n t s C----------------------------------------------- TYPE(DUMMY_STRUCT_), INTENT(INOUT) :: MY_DUM my_real, INTENT(OUT) :: RES C----------------------------------------------- C L o c a l V a r i a b l e s C----------------------------------------------- INTEGER :: N, NITEMS,IERROR my_real :: VERIF C----------------------------------------------- NITEMS = MY_DUM%L_RTAB RES = 0. VERIF = 0. DO N = 1, NITEMS RES = RES + MY_DUM%RTAB(N) VERIF = VERIF + N**2 END DO print *,'res=',res,' verif=0?:',RES-VERIF DEALLOCATE(MY_DUM%ITAB,MY_DUM%RTAB,STAT=IERROR) print *,'solve terminated with error code:',IERROR RETURN END !==================================================================== ! DUMMY starter/src/test/dummy.F !==================================================================== !> Description: \n !> Main routine which initialize and use a variable !> of type DUMMY_STRUCT_ !==================================================================== !>\n called by: \n !> @ref DUMMYEXT starter/src/test/dummyext.F \n !>\n calling: \n !> @ref DINIT starter/src/test/dinit.F \n !> @ref DSOLVE starter/src/test/dsolve.F \n !==================================================================== SUBROUTINE DUMMY(NITEMS) C----------------------------------------------- C M o d u l e s C----------------------------------------------- USE DUMMYDEF_MOD C----------------------------------------------- C I m p l i c i t T y p e s C----------------------------------------------- #include "implicit_f.inc" C----------------------------------------------- C----------------------------------------------- C G l o b a l P a r a m e t e r s C----------------------------------------------- C----------------------------------------------- C C o m m o n B l o c k s C----------------------------------------------- C----------------------------------------------- C D u m m y A r g u m e n t s C----------------------------------------------- INTEGER, INTENT(IN) :: NITEMS ! number of NITEMS C----------------------------------------------- C L o c a l V a r i a b l e s C----------------------------------------------- TYPE(DUMMY_STRUCT_) MY_DUM ! dummy structure my_real RES ! result of dummy solve C----------------------------------------------- CALL DINIT(MY_DUM,NITEMS) CALL DSOLVE(MY_DUM,RES) RETURN END SUBROUTINE DUMMY !==================================================================== ! DUMMYEXT starter/src/test/dummy.F !==================================================================== !> Description: \n !> main program calling dummy \n !>\n called by: \n !>\n calling: \n !> @ref DUMMY starter/src/test/d.F \n !==================================================================== !> \warning program should not be put in the same file, just for !> convenience and for the aim to put some warning message blablabla !> \bug not good to call "dummy(10)", should be passed by address instead of a value C==================================================================== PROGRAM DUMMYEXT CALL DUMMY(10) C> @note this comment is ignored or would be applied to next subroutine C never do that with DOxygen !!! C all Doxygen comments need to be put before the routine definition END PROGRAM DUMMYEXT
Restart Variables
All the variables communicated between Starter and Engine are declared in module RESTART_MOD
, used by subroutine ARRALLOC
for their allocation, by RDRESB
for their reading from RESTART file, then by RESOL_HEAD
in which they are used as argument for RESOL
subroutine. All these argument variables are then passed by argument to procedures called from RESOL
, ...)
Interface definition
Fortran90 interface allows the compiler to do additional checks like coherency between argument types, attributes and number between calling and callee routines. It is required in some cases like when a procedure has a dummy argument with attribute ALLOCATABLE, POINTER, TARGET
In practice, it was introduced in few places of the code for routines which were called at several different places
Such coherency is automatically tested by QA static analysis tools (Forcheck).
And for pointer, the good practice is to use derived data types instead of pointer directly
Therefore, the remaining use of interface is regarding routine with optional arguments
This feature should be spread in the code instead of adding additional “dummy” arguments
Interface Example
Example of an interface for a routine called at several places in the code. The routine is put in a module to guarantee automatic update of interfaces and recompilations of all routines using this routine in case of change (dependence automatically found by compiler)
MODULE INITBUF_MOD CONTAINS Cgw|============================================================ Cgw| INITBUF src/resol/initbuf.F Cgw|------------------------------------------------------------ Cgw| Description : Cgw| Initialisation of vect01_c.inc variables Cgw|-- called by ----------- Cgw| FORINT src/resol/forint.F Cgw| FORINTS src/resol/forints.F Cgw| ALEMAIN priv/ale/alemain.F Cgw|============================================================ SUBROUTINE INITBUF (IPARG ,NG , 2 MTN ,LLT ,NFT ,IAD ,ITY , C … 6 IREP ,IINT ,IGTYP ,ISRAT ,ISROT , 7 ICSEN ,ISORTH ,ISORTHG ,IFAILURE) C----------------------------------------------- C I m p l i c i t T y p e s C----------------------------------------------- #include "implicit_f.inc" C----------------------------------------------- C C o m m o n B l o c k s C----------------------------------------------- #include "param_c.inc" C----------------------------------------------- C D u m m y A r g u m e n t s C----------------------------------------------- INTEGER, INTENT (IN) :: IPARG(NPARG,NGROUP),NG INTEGER, INTENT (OUT) :: MTN,LLT,NFT,IAD,ITY,NPT,JALE,ISMSTR, . JEUL,JTUR,JTHE,JLAG,NVAUX,JMULT,JHBE,JIVF,JPOR,JPLA,JCLOSE, . IREP,IINT,IGTYP,JCVT,ISROT,ISRAT,ISORTH,ISORTHG,ICSEN,IFAILURE C----------------------------------------------- C S o u r c e L i n e s C====================================================================== MTN = IPARG(1,NG) LLT = IPARG(2,NG) … JCLOSE = IPARG(33,NG) IREP = IPARG(35,NG) IINT = IPARG(36,NG) JCVT = IPARG(37,NG) IFAILURE = IPARG(43,NG) C---- RETURN END SUBROUTINE INITBUF END MODULE INITBUF_MOD Cgw|============================================================ Cgw| FORINT src/resol/forint.F Cgw|------------------------------------------------------------ Cgw|-- called by ----------- Cgw| RESOL src/resol/resol.F Cgw|-- valls --------------- Cgw| INITBUF src/resol/initbuf.F Cgw|============================================================ SUBROUTINE FORINT( 1 PM ,GEO ,X ,A ,AR , 2 V ,VR ,MS ,IN ,W , C… K MSNF ,IGEO ,IPM ,XSEC ,ITASK) C----------------------------------------------- C M o d u l e s C----------------------------------------------- USE INITBUF_MOD C----6---------------------------------------------------------------7---------8 C I m p l i c i t T y p e s C----------------------------------------------- #include "implicit_f.inc" #include "comlock.inc" C----------------------------------------------- C G l o b a l P a r a m e t e r s C----------------------------------------------- #include "mvsiz_p.inc" C----------------------------------------------- C C o m m o n B l o c k s C----------------------------------------------- #include "com01_c.inc" #include "com03_c.inc" C----------------------------------------------------------------- C D u m m y A r g u m e n t s C----------------------------------------------- INTEGER IXS(NIXS,*), . IXQ(NIXQ,*), IXT(NIXT,*), IXP(NIXP,*), . IXR(NIXR,*), IELVS(*), IGEO(NPROPGI,*), . IXS16(8,*),IADS16(8,*),ITASK C REAL ou REAL*8 my_real . X(3,*) ,D(3,*) ,V(3,*) ,VR(3,*), . MS(*) ,IN(*) ,PM(NPROPM,*),SKEW(9,*),GEO(NPROPG,*), . BUFMAT(*) ,W(3,*) ,VEUL(*),TF(*) ,FR_WAVE(*) C----------------------------------------------- C L o c a l V a r i a b l e s C----------------------------------------------- INTEGER INDXOF(MVSIZ) INTEGER I,II,J,N my_real . FX(MVSIZ,20),FY(MVSIZ,20),FZ(MVSIZ,20), . MX(MVSIZ,4),MY(MVSIZ,4),MZ(MVSIZ,4) C======================================================================| CALL INITBUF (IPARG ,NG , 2 MLW ,NEL ,NFT ,KAD ,ITY , 3 NPT ,JALE ,ISMSTR ,JEUL ,JTUR , 4 JTHE ,JLAG ,JMULT ,JHBE ,JIVF , 5 NVAUX ,JPOR ,JCVT ,JCLOSE ,IPLA , 6 IREP ,IINT ,IGTYP ,ISRAT ,ISROT , 7 ICSEN ,ISORTH ,ISORTHG ,IFAILURE) C ICNOD = IPARG(11,NG) KADDSA = 1+(KAD-1)*NDSAEXT NSG = IPARG(10,NG) C----------- RETURN END
Memory Allocation
Dynamic memory allocation mechanism
Dynamic memory allocation is directly done at the Fortran90 level using MY_ALLOCATE
macro
This macro allows automatic error checks encapsulating call to Fortran90 ALLOCATE
statement
Previously, allocation error check was done by hand by the:
CALL ALLOCATE(ITAB(NUMNOD),STAT=IERR) IF(IERR/=0)THEN WRITE(ISTDO,’(A)’) ‘ERROR IN MEMORY ALLOCATION’ WRITE(IOUT,’(A)’) ‘ERROR IN MEMORY ALLOCATION’ CALL ARRET(2) END IF C… CALL DEALLOCATE(ITAB)
In practice, error checking was missing in many places. Therefore, the idea to use a macro to automatically control allocation, handle error message and execution stop in case of failure was implemented.
Here is the macro detail:
#ifndef MY_ALLOCATE #define MY_ALLOCATE(ARRAY,LENGTH)\ ALLOCATE(ARRAY(LENGTH),STAT=MY_IERR);\ IF(MY_IERR/=0) THEN;\ CALL ANCMSG(MSGID=268,MSGTYPE=MSGERROR,ANMODE=ANSTOP,C1=#ARRAY);\ ENDIF #endif
The previous code becomes:
USE MESSAGE_MOD C… #include “my_allocate.inc” C… CALL MY_ALLOCATE(ITAB,NUMNOD) C… CALL DEALLOCATE(ITAB)
Developers are required to check the success of the allocation
The message printed by this macro in case of allocation failure is rather generic. For large arrays it is preferred to print a specific message, with advice for the user, or at least the option concerned by this failure
Global Memory
Memory allocation of global data structures, arrays and derived data types, should be done at the highest level, in LECTUR
for Starter, in RADIOSS2
or RESOL
for Engine
It is advised to use derived data type with structure of arrays. This way it is possible to declare the variable at the upper level, gather the allocation of array members in a dedicated subroutine, then use the variable in procedures called at lower level without losing traceability
Local Memory
In a procedure, local variable allocation method depends on its size:
The size is known and limited to a multiple of MVSIZ | Automatic allocation in the stack is ok |
The size is not known or larger than times MVSIZ | It is then too large to use automatic allocation in the stack. It is therefore needed to use dynamic allocation in the heap |
Automatic Allocated arrays go into Stack
ALLOCATED ARRAYS go into Heap
One should take care to reduce Stacksize usage to a reasonable size.
Stacksize is hardcoded under Windows
It is allowed to use MY_ALLOCATE
at the beginning of a routine provided matching DEALLOCATE
is done at the end of this routine
In case of multiple calls to MY_ALLOCATE
, a call to DEALLOCATE
must be applied to the variable used in the last previous call to MY_ALLOCATE
to avoid a memory hole and need for a garbage collector
Local Memory Example:
CALL MY_ALLOCATE(VAR1) CALL MY_ALLOCATE(VAR2) C… DEALLOCATE(VAR2) C… CALL MY_ALLOCATE(VAR3) C… DEALLOCATE(VAR3) C… DEALLOCATE(VAR1)
Shared Memory Programming (SMP) and memory allocation
For RADIOSS Engine, OpenMP programming model is used for second level parallelization
By default any memory allocation done outside of a parallel section is shared between threads
Most of the parallel sections are started from RESOL
. So for the need of a shared memory array, the simplest way is to declare it at the level of RESOL
. It will be shared inside the different parallel sections started from RESOL
, possibly passed by argument to routines called from RESOL
The same way, any variable defined in a common or module is shared by default
For pointer, notice that a single thread needs to allocate and deallocate it. The programmer has to manage synchronization in order to insure such a variable is allocated before being used by any other thread and no longer used before it is deallocated
The !$OMP THREADPRIVATE directive overrides default behavior by creating thread local storage variables
Array Aliasing
Description
Here we discuss different arguments of a procedure referencing the same memory locations. The compiler won’t be able to detect in the procedure that different argument variables reference one or more identical memory locations. Such a situation is particularly dangerous because of compiler optimization. Even if compilers are not forbidden it, if both variables are modified inside the procedure this could lead to unpredictable results. Potential conflicts or dependencies won’t be detected
Code Example:
PROGRAM OVERMAIN C------------------------------------------ IMPLICIT NONE C------------------------------------------ PARAMETER (MAXLEN = 40) INTEGER BUFFER(MAXLEN),I,I1,LEN C------------------------------------------ I1 = 10 LEN = 22 DO I = 1, LEN BUFFER(I1+I-1) = I ENDDO print *,’init : buffer =’,(BUFFER(I1+I-1),I=1,LEN) CALL SHIFTI1(BUFFER,BUFFER(I1),I1,LEN) ! potential overlap print *,’result: buffer =’,(BUFFER(I1+I-1),I=1,LEN) STOP END SUBROUTINE SHIFTI1(BUFFER,TAB,I1,LEN) C------------------------------------------ IMPLICIT NONE C------------------------------------------ INTEGER BUFFER(*),TAB(*),I1,LEN C------------------------------------------ INTEGER I,LEN0,I10 C------------------------------------------ LEN0 = 19 I10 = I1 - LEN + LEN0 DO I = 1, LEN BUFFER(I10+I-1) = TAB(I) ! true overlap ENDDO I1 = I10 RETURN END
Tested on SGI O200 IRIX 6.4: output:
skippy 61% f77 -O2 -c *.F overmain.F: shifti1.F: skippy 62% f77 -O2 -o overlaps *.o skippy 63% overlaps init : buffer = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 result: buffer = 1 2 6 4 5 6 10 8 9 10 14 12 13 14 18 16 17 18 22 20 21 22
Conclusion: Array aliasing is forbidden when the variable is modified inside the procedure
\uD83D\uDCCB Related articles
- No labels