KIT308/408 (Advanced) Multicore Architecture and Programming
Shared Memory and Synchronisation
Dr. Ian Lewis
Discipline of ICT, School of TED
University of Tasmania, Australia
1
Today we’ll look at some of the problems with sharing memory between cores (or just between the threads of a multithreaded program)
We’ll see simple approaches for restricting access to avoid these potential pitfalls
Often referred to as synchronisation
2
Programming Multicores
Shared Memory
3
Threads share the address space of their parent process
Simple to access shared data
Any global variables are global across all threads
Care needs to be taken when doing so
4
Refresher: Processes and Threads
Threads within a Process
Thread functions can be informally grouped into four major groups
Thread management
Routines that work directly on threads
Creating, exiting, waiting for other threads to finish, etc.
Mutexes
Routines that deal with synchronization through a mutex (an abbreviation for “mutual exclusion”)
Condition variables
Routines that address communications between threads that share a mutex
Synchronization
Routines that manage read/write locks and barriers
Today we’ll look at mutex and synchronization thread functions
Refresher: Windows Thread Functions
5
This is a different calculation than what we’ve been doing with Mandelbrot
One single result
Would this code work?
Unlikely, as all the threads are trying to read and write to sum at the same time
Lots and lots of reads and writes
Once compiled, sum += array[i] is actually three steps: read, add, write
What’s going on in main memory and cache?
This value would need to be shared between different cores
int array[LOTS_OF_VALUES];
int sum = 0;

void thread_func(int* array, int start, int end)
{
    for (int i = start; i < end; ++i)
    {
        sum += array[i];
    }
}

// initialise array with something
// make some threads that all run thread_func
// (with appropriate start and end indices)
// to calculate sum
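The read, add, write problem can be made concrete without real threads. The sketch below is only an illustration: FakeThread, interleaved_sum and sequential_sum are hypothetical names, and the "threads" are stepped by hand so the unlucky ordering happens every time instead of occasionally.

```cpp
#include <cassert>

// Hypothetical model of one thread executing `sum += value` as the
// three separate steps the compiled code performs.
struct FakeThread {
    int reg = 0;                                 // the thread's private register
    void read(const int& sum) { reg = sum; }     // step 1: load sum
    void add(int value) { reg += value; }        // step 2: add in the register
    void write(int& sum) const { sum = reg; }    // step 3: store back to sum
};

// Both threads finish their read before either one writes,
// so one of the two updates is lost.
int interleaved_sum() {
    int sum = 0;
    FakeThread a, b;
    a.read(sum);    // a sees 0
    b.read(sum);    // b also sees 0
    a.add(5);
    b.add(7);
    a.write(sum);   // sum is now 5
    b.write(sum);   // sum is now 7: a's contribution has been overwritten
    return sum;
}

// One thread completes all three steps before the other starts.
int sequential_sum() {
    int sum = 0;
    FakeThread a, b;
    a.read(sum); a.add(5); a.write(sum);
    b.read(sum); b.add(7); b.write(sum);
    return sum;
}
```

With these fixed orderings, interleaved_sum() gives 7 while sequential_sum() gives the expected 12. Real threads pick an ordering at random on every run, which is why the bug can hide during quick testing.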
6
Multithreaded Sum: Naïve Attempt 1
Would this code work?
Fewer reads/writes of the sum variable
But still potential for them to fight
Pretty unlucky, but not impossible
This is probably worse than the previous example, as this might work most of the time
Quick testing wouldn’t reveal problem
int array[LOTS_OF_VALUES];
int sum = 0;

void thread_func(int* array, int start, int end)
{
    int local_sum = 0;
    for (int i = start; i < end; ++i)
    {
        local_sum += array[i];
    }
    sum += local_sum;
}

// initialise array with something
// make some threads that all run thread_func
// (with appropriate start and end indices)
// to calculate sum
7
Multithreaded Sum: Naïve Attempt 2
The solution to this problem is to somehow lock the variable
So that only one thread can read from or write to it at a time
Then unlock it once the calculation has been done
int array[LOTS_OF_VALUES];
int sum = 0;

void thread_func(int* array, int start, int end)
{
    int local_sum = 0;
    for (int i = start; i < end; ++i)
    {
        local_sum += array[i];
    }
    GET_LOCK_FOR_SUM;
    // (no other thread can touch it)
    sum += local_sum;
    RELEASE_LOCK_FOR_SUM;
}

// initialise array with something
// make some threads that all run thread_func
// (with appropriate start and end indices)
// to calculate sum
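As a portable sketch of the GET_LOCK_FOR_SUM / RELEASE_LOCK_FOR_SUM idea, the version below uses the C++ standard library (std::mutex, std::lock_guard, std::thread) rather than any Windows call; that substitution is an assumption made so the example runs anywhere. The names sum_lock and parallel_sum are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

int sum = 0;
std::mutex sum_lock;  // hypothetical name: the lock that protects sum

void thread_func(const int* array, int start, int end) {
    int local_sum = 0;
    for (int i = start; i < end; ++i) {
        local_sum += array[i];
    }
    std::lock_guard<std::mutex> guard(sum_lock);  // plays the role of GET_LOCK_FOR_SUM
    sum += local_sum;                             // no other thread can touch sum here
}   // guard is destroyed here, which plays the role of RELEASE_LOCK_FOR_SUM

// Hypothetical driver: splits the values evenly over thread_count threads.
int parallel_sum(const std::vector<int>& values, int thread_count) {
    sum = 0;
    const int n = static_cast<int>(values.size());
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        const int start = n * t / thread_count;
        const int end = n * (t + 1) / thread_count;
        threads.emplace_back(thread_func, values.data(), start, end);
    }
    for (std::thread& th : threads) {
        th.join();  // wait for every worker before reading sum
    }
    return sum;
}
```

Calling parallel_sum on a vector of 1000 ones with 4 threads should return 1000 on every run, because the only write to the shared variable happens while the lock is held.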
8
Multithreaded Sum: Locking
Mutexes
9
A mutex is used to protect a shared resource from multiple simultaneous accesses
It enforces sequential access to the resource
Resources can be as simple as a single memory location
Or as complex as you like
10
Mutual Exclusion Locks (Mutexes)
To create a mutex, the CreateMutex function is used with the following parameters
LPSECURITY_ATTRIBUTES lpMutexAttributes
A pointer to a structure to define security features (NULL for the defaults)
BOOL bInitialOwner
A boolean specifying whether to create this mutex in its locked state or not
i.e. if this is TRUE, the thread calling CreateMutex will own the mutex
LPCTSTR lpName
The name of the mutex (NULL to default to no name)
This function returns a HANDLE
A valid (non-NULL) HANDLE is returned on success; if NULL is returned, you can call GetLastError to get an error code
11
Creating Windows Mutexes
The lock on a mutex is obtained by use of the WaitForSingleObject function
(Or WaitForMultipleObjects to get multiple mutexes, or one from a set)
The lock is released by using the ReleaseMutex function with the following parameters
HANDLE hMutex
The HANDLE to the mutex
12
Locking/Releasing Windows Mutexes
Using a single mutex allows us to do our sum calculation correctly
Why are we still bothering with the local_sum variable here?
To reduce the amount of locking that takes place
Locking is slow
#include <windows.h>

int array[LOTS_OF_VALUES];
int sum = 0;
HANDLE sum_mutex;

void thread_func(int* array, int start, int end)
{
    int local_sum = 0;
    for (int i = start; i < end; ++i)
    {
        local_sum += array[i];
    }
    // block until this thread owns the mutex
    WaitForSingleObject(sum_mutex, INFINITE);
    sum += local_sum;
    ReleaseMutex(sum_mutex);
}

int main(void)
{
    // unnamed mutex, not initially owned by this thread
    sum_mutex = CreateMutex(NULL, FALSE, NULL);
    // initialise array with something
    // make some threads that all run thread_func
    // (with appropriate start and end indices)
    // to calculate sum
    CloseHandle(sum_mutex);
    return 0;
}
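To see why local_sum reduces the amount of locking, the sketch below counts how often the lock is taken under each strategy. It uses std::mutex as a portable stand-in for the Windows mutex (an assumption), and Totals, Worker, run and the two worker functions are hypothetical names. It measures lock counts only, not time.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

struct Totals {
    std::mutex lock;
    long sum = 0;
    long lock_count = 0;  // how many times the lock was taken (protected by lock)
};

// Takes the lock once for every element.
void sum_lock_per_element(Totals& t, const int* array, int start, int end) {
    for (int i = start; i < end; ++i) {
        std::lock_guard<std::mutex> guard(t.lock);
        t.sum += array[i];
        ++t.lock_count;
    }
}

// Accumulates privately, then takes the lock once per thread.
void sum_lock_per_thread(Totals& t, const int* array, int start, int end) {
    long local_sum = 0;
    for (int i = start; i < end; ++i) {
        local_sum += array[i];
    }
    std::lock_guard<std::mutex> guard(t.lock);
    t.sum += local_sum;
    ++t.lock_count;
}

using Worker = void (*)(Totals&, const int*, int, int);

// Hypothetical driver: runs worker on thread_count equal slices of values.
void run(Worker worker, Totals& totals, const std::vector<int>& values, int thread_count) {
    const int n = static_cast<int>(values.size());
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        const int start = n * t / thread_count;
        const int end = n * (t + 1) / thread_count;
        threads.emplace_back(worker, std::ref(totals), values.data(), start, end);
    }
    for (std::thread& th : threads) {
        th.join();
    }
}
```

For 10000 values on 4 threads, both strategies produce the same sum, but the per-element version takes the lock 10000 times and the per-thread version takes it 4 times.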
13
Multithreaded Sum: Mutex
Semaphores are a related concept to mutexes
Except that they allow for more than one thread to access a resource at once
They allow a fixed number of threads access
Basically they maintain a counter: every time the semaphore is acquired, the counter is increased, and a request has to wait while the counter is at its limit
Increasing the counter is thread-safe (i.e. only one thread can do it at a time)
When the semaphore is released, the counter is decreased
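The counter description above can be sketched as a small class. This is a minimal illustration built from a standard mutex and condition variable, not the Windows semaphore API (which counts the remaining free slots instead, decreasing on acquire and increasing on release). CountingSemaphore and max_threads_inside are hypothetical names.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical semaphore: count_ is the number of threads currently holding it.
class CountingSemaphore {
public:
    explicit CountingSemaphore(int limit) : limit_(limit) {}

    void acquire() {
        std::unique_lock<std::mutex> guard(lock_);
        // Wait until fewer than limit_ threads hold the semaphore.
        available_.wait(guard, [this] { return count_ < limit_; });
        ++count_;  // increased on request, under the lock
    }

    void release() {
        {
            std::lock_guard<std::mutex> guard(lock_);
            --count_;  // decreased on release
        }
        available_.notify_one();  // wake one waiting thread, if any
    }

private:
    std::mutex lock_;
    std::condition_variable available_;
    int count_ = 0;
    const int limit_;
};

// Sends thread_count threads through a guarded region and returns the
// largest number of threads observed inside it at the same time.
int max_threads_inside(int thread_count, int limit) {
    CountingSemaphore semaphore(limit);
    std::mutex stats_lock;
    int inside = 0;
    int max_inside = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        threads.emplace_back([&] {
            semaphore.acquire();
            {
                std::lock_guard<std::mutex> guard(stats_lock);
                ++inside;
                if (inside > max_inside) max_inside = inside;
            }
            std::this_thread::yield();  // give other threads a chance to enter
            {
                std::lock_guard<std::mutex> guard(stats_lock);
                --inside;
            }
            semaphore.release();
        });
    }
    for (std::thread& th : threads) {
        th.join();
    }
    return max_inside;
}
```

With a limit of 1 the class behaves like a mutex. With a limit of 3 and 8 threads, the observed maximum never exceeds 3, though it may be lower on a given run.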
14
Aside: Semaphores
Specific Windows Functions
15
Windows has an extensive set of custom functions/classes for helping write threaded programs
Critical sections (CRITICAL_SECTION)
Basically the same as a mutex, but only usable between the threads of a single process
Can be more efficient, particularly when the lock is available
Interlocked API
A variety of functions for allowing simple thread-safe operations
16
Windows Thread Sharing Functions
InterlockedIncrement will thread-safely increase a 32-bit integer variable by 1
It’s equivalent to wrapping the increment of a variable in a mutex lock / unlock
There are also variants that work on 16- or 64-bit values
There are also similar functions for
Decrement
Xor, And, Or
Exchange
Assignment (with previous value of variable returned)
Add (with previous value returned)
The InterlockedIncrement function has the following parameters
LONG volatile *Addend
A pointer to the variable to be incremented
This function returns a LONG
The incremented value of the variable pointed to by Addend
The Addend pointer needs to be 32-bit aligned
We’ll discuss this concept more later on
And see how to force the compiler to do this
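For comparison, a portable analogue of an interlocked increment can be sketched with std::atomic from the C++ standard library; this is an assumption standing in for the Windows call, and counter, next_ticket and draw_tickets are hypothetical names. Pre-increment on a std::atomic returns the new value, matching the return value described above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

std::atomic<std::int32_t> counter{0};  // 32-bit counter shared by all threads

// Atomically adds 1 and returns the incremented value.
std::int32_t next_ticket() {
    return ++counter;
}

// Hypothetical driver: each of thread_count threads draws per_thread tickets.
std::int32_t draw_tickets(int thread_count, int per_thread) {
    counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        threads.emplace_back([per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                next_ticket();  // no lock needed: the increment is indivisible
            }
        });
    }
    for (std::thread& th : threads) {
        th.join();
    }
    return counter.load();
}
```

Because each increment is indivisible, no update is lost: 4 threads drawing 1000 tickets each leave the counter at exactly 4000.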
17
InterlockedIncrement
The Interlocked Windows functions provide shorthand versions of simple uses of a mutex
#include <windows.h>

int array[LOTS_OF_VALUES];
volatile LONG sum = 0;  // InterlockedExchangeAdd expects a LONG volatile *

void thread_func(int* array, int start, int end)
{
    int local_sum = 0;
    for (int i = start; i < end; ++i)
    {
        local_sum += array[i];
    }
    // guaranteed to perform correct add
    InterlockedExchangeAdd(&sum, local_sum);
}

// initialise array with something
// make some threads that all run thread_func
// (with appropriate start and end indices)
// to calculate sum
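The same lock-free pattern can be sketched portably with std::atomic, whose fetch_add also returns the value held before the addition. Using the C++ standard library here is an assumption made so the example runs outside Windows, and parallel_sum is a hypothetical driver.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<int> sum{0};

void thread_func(const int* array, int start, int end) {
    int local_sum = 0;
    for (int i = start; i < end; ++i) {
        local_sum += array[i];
    }
    // One indivisible read-add-write; the previous value is returned
    // but not needed here.
    sum.fetch_add(local_sum);
}

// Hypothetical driver: splits the values evenly over thread_count threads.
int parallel_sum(const std::vector<int>& values, int thread_count) {
    sum = 0;
    const int n = static_cast<int>(values.size());
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        const int start = n * t / thread_count;
        const int end = n * (t + 1) / thread_count;
        threads.emplace_back(thread_func, values.data(), start, end);
    }
    for (std::thread& th : threads) {
        th.join();
    }
    return sum.load();
}
```

No mutex appears anywhere, yet the total is correct on every run, because the single shared update per thread cannot be interleaved with another thread's update.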
18
Multithreaded Sum: InterlockedExchangeAdd