OPTIMISING MEMORY STRUCTURES
• Memory Optimisation I
• Types of memory structures
Copyright By PowCoder代写 加微信 powcoder
• Memory Optimisation II
• Reducing memory accesses • Utilizing Caches
• Write optimisations
• Prefetching
• Pointer aliasing
Types of data structure
• Pointer arrays
• records/structures • Trees and lists
• Arrays are large blocks of memory indexed by integer index.
• Compiler can calculate the address of any element by multiplying the index by a stride value.
• In most languages stride value is the size of the underlying type. • In C pointers and (1D) arrays are essentially the same.
• Probably the most common data structure used in HPC codes
• Very efficient use of space as no overhead
Multi dimensional Arrays
• Multidimensionalarraysusemultipleindexes(shorthand)
REAL A(100,100,100) A (i,j,k) = 7.0
float A[100][100][100]; A [i][j][k] = 7.0
REAL A(1000000) A(i+100*j+10000*k) = 7.0
float A[1000000]; A(k+100*j+10000*i) = 7.0
• Address calculation requires computation but still relatively cheap.
• Compilers have better chance to optimise where dimension sizes are known at compile time.
• Good for representing regularly discretised versions of dense continuous data
𝑓𝑥,𝑦,𝑧 →𝐹𝑖 𝑗[𝑘]
• Many codes loop over array elements
• Data access pattern is regular and easy to predict
• Unless loop nest order and array index order match the access pattern may not be optimal for cache re- use.
• Compiler can often address these problems by transforming the loops.
• But sometimes can do a better job when provided with a more cache-friendly index order.
• Vectorisation may also need changes to index order
Dynamic sized arrays (Fortran)
• Not always possible/desirable to fix array sizes at compile time
• Fortran allows arrays to be dynamically sized based on subroutine arguments.
• Address calculation can still be optimised using CSA.
• Size of slowest moving index is not needed in address computation.
• Fortran actually allows this dimension to be unspecified in subroutine arguments (assumed size arrays)
Array sections and array descriptors
• Fortran allows you to take array sections
• B(2:N:4) – every 4th element of B up to N starting at 2 • Similar to a strided DO loop
• These can be passed as subroutine arguments
• Without a full interface compiler has to copy data to a contiguous temporary scratch array.
• With an interface it can pass an array descriptor
• Internal compiler structure containing • Baseaddress
• Offsets and strides for each dimension
Dynamic sized arrays (C)
• C requires array dimensions to be known at compile time.
• C99 introduced variable length arrays but C11 made this an optional feature
• However even traditional C can make slowest dimension variable with pointers and typedef (similar to the more commonly used array of struct but cleaner to read)
typedef float Mat[2][2];
Mat *data =(Mat *) malloc(n*sizeof(Mat)); for(i=0;i