Parallel Programming in C with the Message Passing Interface
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
Parallel Programming in C with MPI and OpenMP
Michael J. Quinn
Tuesday, April 14, 15

Slides are modified from those found in Parallel Programming in C with MPI and OpenMP, Michael J. Quinn.
Algorithm design and basic algorithms
Outline
- Task/channel model
- Algorithm design methodology
- Case studies
Task/Channel Model
- Parallel computation = set of tasks
- Task
  - Program
  - Local memory
  - Collection of I/O ports
- Tasks interact by sending messages through channels

[Figure: a directed graph whose nodes are tasks and whose edges are channels]
Foster’s Design Methodology
- Partitioning
- Communication
- Agglomeration
- Mapping

[Figure: Problem → Partitioning → Communication → Agglomeration → Mapping]
Partitioning
- Dividing computation and data into pieces
- Domain decomposition
  - Divide data into pieces
  - Determine how to associate computations with the data
- Functional decomposition
  - Divide computation into pieces
  - Determine how to associate data with the computations
Example Domain Decompositions

[Figure: 1-D, 2-D, and 3-D decompositions of a cube of data]

The number of primitive tasks gives the scope, or order of magnitude, of the parallelism. In the example, the 1-D decomposition has n-way parallelism along the faces, the 2-D decomposition has n²-way parallelism, and the 3-D decomposition has n³-way parallelism.
Example Functional Decomposition

[Figure: functional decomposition of an image-guided system into tasks — track position of instruments, determine image location, register image, display image]
Types of parallelism
- Numerical algorithms often have data parallelism
- Non-numerical algorithms often have functional parallelism
- Many algorithms, especially complex numerical algorithms, have both, e.g., data parallelism within a function and many functions that can be done in parallel
- Functional parallelism often scales worse with increasing data size (concurrency-limited in isoefficiency terms)
Partitioning Checklist
- At least 10x more primitive tasks than processors in target computer
- Minimize redundant computations and redundant data storage
- Primitive tasks roughly the same size
- Number of tasks an increasing function of problem size
Communication
- Determine values passed among tasks
- Local communication
  - Task needs values from a small number of other tasks
  - Create channels illustrating data flow
- Global communication
  - Significant number of tasks contribute data to perform a computation
  - Don’t create channels for them early in design
Communication Checklist
- Communication operations balanced among tasks
- Each task communicates with only a small group of neighbors
- Tasks can perform communications concurrently
- Tasks can perform computations concurrently
Agglomeration
- Grouping tasks into larger tasks
- Goals
  - Improve performance
  - Maintain scalability of program
  - Simplify programming
- In MPI programming, goal often to create one agglomerated task per processor
Agglomeration Can Improve Performance
- Eliminate communication between primitive tasks agglomerated into a consolidated task
- Combine groups of sending and receiving tasks
Agglomeration Checklist
- Locality of parallel algorithm has increased
- Replicated computations take less time than communications they replace
- Data replication doesn’t affect scalability
- Agglomerated tasks have similar computational and communications costs
- Number of tasks increases with problem size
- Number of tasks suitable for likely target systems
- Tradeoff between agglomeration and code modification costs is reasonable
Mapping
- Process of assigning tasks to processors
- Shared-memory system: mapping done by operating system
- Distributed-memory system: mapping done by user
- Conflicting goals of mapping
  - Maximize processor utilization
  - Minimize interprocessor communication
Mapping Example

[Figure: a task graph mapped onto processors]

While this may reduce communication, load balance may be an issue.
Optimal Mapping
- Finding optimal mapping is NP-hard
- Must rely on heuristics
- Metis is a popular package for partitioning graphs
  - Minimizes the number of edges cut between partitions of the graph
  - Edges, for our purposes, can be thought of as communication
Mapping Decision Tree
- Static number of tasks
  - Structured communication
    - Constant computation time per task
      - Agglomerate tasks to minimize communication
      - Create one task per processor
    - Variable computation time per task
      - Cyclically map tasks to processors
      - GSS (guided self-scheduling)
  - Unstructured communication
    - Use a static load-balancing algorithm
- Dynamic number of tasks
Mapping Strategy
- Static number of tasks
- Dynamic number of tasks
  - Frequent communications between tasks
    - Use a dynamic load-balancing algorithm
  - Many short-lived tasks
    - Use a run-time task-scheduling algorithm
    - Cilk and Galois, discussed in the next couple of weeks, do this
Mapping Checklist
- Considered designs based on one task per processor and multiple tasks per processor
- Evaluated static and dynamic task allocation
- If dynamic task allocation chosen, task allocator is not a bottleneck to performance
- If static task allocation chosen, ratio of tasks to processors is at least 10:1
Case Studies
- Boundary value problem
- Finding the maximum
- The n-body problem
- Adding data input
Boundary Value Problem

[Figure: a rod, insulated along its length, with both ends in ice water]
Rod Cools as Time Progresses

[Figure: temperature profiles along the rod at successive times]

Want to use finite-difference method over multiple time steps

[Figure: a grid of nodes — position along one axis, time along the other]
- Each circle represents a computation
- Temperature at time t+1 for a position on the rod represented by a node depends on the temperature of its neighbors at time t
Partitioning
- One data item per grid point
- Associate one primitive task with each grid point
- Two-dimensional domain decomposition
Communication
- Identify communication pattern between primitive tasks
- Each interior primitive task has three incoming and three outgoing channels
Agglomeration and Mapping

[Figure: (a) primitive tasks, (b) after agglomeration, (c) after mapping to processors]
Sequential execution time
n χ – time to update element
n n – number of elements
n m – number of iterations
n Sequential execution time: m(n − 1)χ
Parallel Execution Time
n χ – time to update element
n n – number of elements
n m – number of iterations
n p – number of processors
n λ – message latency
n Parallel execution time: m(χ⌈(n − 1)/p⌉ + 2λ)
Finding the Maximum Error from measured data
[Figure: plot of measured data; maximum error shown is 6.25%]
Need to do a reduction.
Parallel Reduction Evolution
[Figure: panels (a)-(c); the task count shrinks from n − 1 to n/2 − 1 to n/4 − 1 as pairs of partial results are combined]
Binomial Trees
Subgraph of hypercube
Finding Global Sum
Start (16 values):
4 2 0 7
-3 5 -6 -3
8 1 2 3
-4 4 6 -1
After step 1 (8 values):
1 7 -6 4
4 5 8 2
After step 2 (4 values):
8 -2
9 10
After step 3 (2 values):
17 8
After step 4 (result):
25
Binomial Tree
Agglomeration leads to actual communication
[Figure: four agglomerated tasks each compute a partial sum and exchange the sums]
The n-body Problem
Partitioning
n Domain partitioning
n Assume one task per particle
n Task has particle’s position, velocity vector
n Iteration
u Get positions of all other particles
u Compute new position, velocity
Gather
All-gather
Complete Graph for All-gather — operations shown, no ordering required
[Figure: four nodes holding a, b, c, d exchange items pairwise until every node holds a, b, c, d]
Hypercube-based All-gather — ordering required
[Figure: nodes exchange along one hypercube dimension per step; after the first step pairs hold a, b and c, d, and after log p steps every node holds a, b, c, d]
Communication Time
Complete graph: (p − 1)(λ + n/(βp)) = (p − 1)λ + (p − 1)n/(βp)
Hypercube: Σ from i = 1 to log p of (λ + 2^(i−1) n/(βp)) = λ log p + (p − 1)n/(βp)
Adding Data Input
[Figure: four tasks (0-3) with Input and Output channels added to the task graph]
Scatter
Scatter in log p Steps
[Figure: scatter by recursive halving; each message arrives in a system buffer, then is copied to an application buffer]
Summary: Task/channel Model
n Parallel computation
u Set of tasks
u Interactions through channels
n Good designs
u Maximize local computations
u Minimize communications
u Scale up
Summary: Design Steps
n Partition computation
n Agglomerate tasks
n Map tasks to processors
n Goals
u Maximize processor utilization
u Minimize inter-processor communication
Summary: Fundamental Algorithms
n Reduction
n Gather and scatter
n All-gather