Problem Domain
Coursework taken from the field of Computational Fluid Dynamics (CFD)
1. Fluid dynamics is based on three fundamental principles: (i) mass is conserved; (ii) Newton’s second law; (iii) energy is conserved
2. Expressed as partial differential equations, showing how velocity and pressure are related, etc. (called governing equations).
3. The coordinates and time are independent variables while velocity and pressure are dependent variables
4. Computational Fluid Dynamics is the science of finding the numerical solution to the governing equations of fluid flow, over the discretized space or time
Governing Equations
The code in the coursework, called Karman, calculates the velocity and pressure of a 2D flow
The code writes the solution values into a binary file
Currently the code is sequential
The purpose of the coursework is to parallelize the code
Data and stencil
The area is represented as a 2D grid (discretized)
Calculate one point in each cell
Numerical method for solving the governing equations
Successive Over-Relaxation (SOR)
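As a hedged illustration of what an SOR update looks like, the formula below assumes the pressure equation is a Poisson equation discretized with the standard 5-point stencil on a uniform grid. The symbols (grid spacings δx and δy, relaxation factor ω, right-hand side rhs) are notation chosen here and may differ from the names used in the Karman source.

```latex
% Discrete Poisson equation at cell (i,j):
\frac{p_{i+1,j} - 2p_{i,j} + p_{i-1,j}}{\delta x^2}
+ \frac{p_{i,j+1} - 2p_{i,j} + p_{i,j-1}}{\delta y^2}
= \mathrm{rhs}_{i,j}

% SOR update with relaxation factor 0 < \omega < 2:
p_{i,j}^{\mathrm{new}} = (1-\omega)\, p_{i,j}
+ \frac{\omega}{2\left(\frac{1}{\delta x^2} + \frac{1}{\delta y^2}\right)}
\left(
  \frac{p_{i+1,j} + p_{i-1,j}}{\delta x^2}
+ \frac{p_{i,j+1} + p_{i,j-1}}{\delta y^2}
- \mathrm{rhs}_{i,j}
\right)
```

With ω = 1 this reduces to the Gauss-Seidel update. Each cell depends only on its four direct neighbours, which is the stencil the data decomposition has to respect.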
Decomposition of a grid of cells
• Make each process responsible for updating a block of cells
• A process must send the cells at the edge of its domain to its neighbours, and receive a copy of the edge cells from its neighbours
• 1D decomposition vs. 2D decomposition
The main loop
for (t = 0.0; t < t_end; t = t + delt) {
1. Calculate an approximate time-step size by seeing how much movement occurred in the last time-step. The discrete approximation is only stable when the maximum motion is less than one cell per time-step.
2. For each cell, compute a tentative new velocity (f, g) based on the previous (u, v) values. It takes as input the u, v and flag matrices, and updates the f and g matrices.
3. For each cell, calculate the right-hand side (RHS) of the pressure equation (Poisson equation). This uses two f and g values. It takes as input the f, g and flag matrices, and updates the RHS matrix.
4. For the entire pressure matrix, use Red/Black SOR to solve the Poisson equation. This takes a large number of iterations of the Red/Black process, as shown in the slides. It takes as input the current pressure matrix, the flag matrix and the RHS matrix, and outputs a new pressure matrix.
5. For each cell, update the real (u, v) values based on the pressure matrix and the tentative velocity values (f, g). It takes as input the pressure, f, g and flag matrices, and updates the u and v matrices.
6. For each cell that is adjacent to an edge cell, the (u, v) values of the boundary cells are updated (by taking values from their neighbours). It takes as input the u, v and flag matrices, and updates the u and v matrices.
}
Note that you should focus on parallelizing the functions that consume a considerable amount of time
Coursework 2 – What to Do in General
You will be given a serial program, called Karman. You need to parallelize the Karman code and write a report
Parallelize the Karman code using both a pure MPI approach and a hybrid MPI-OpenMP approach
• one parallelization using MPI only
• the other parallelization using both MPI and OpenMP (a hybrid approach), i.e., parallelizing the computations within one machine using OpenMP, while parallelizing the computations across machines using MPI
Profile the Karman functions to find which are the most time consuming
Design your decomposition strategy, the data exchange strategy between neighboring partitions, and the MPI functions to use (e.g., collective operations)
For a hybrid approach, add OpenMP directives
Benchmark the execution time (e.g., using the MPI_Wtime function) of your parallel code as you
• increase the number of processes
• and/or change the problem size if it helps you observe the trend
For the hybrid MPI-OpenMP implementation,
• benchmark the total runtime of the parallel code and the runtime of the key loops/kernels over different numbers of threads
• benchmark the overhead incurred by OpenMP
• Ensure the simulation results are the same after the parallelization
• Include adequate comments, as good programming practice.
Coursework 2 – Requirements for Report
• Profile the functions to determine which are the most time consuming; discuss the profiling results
• Discuss your MPI implementation, for example,
• your decomposition strategy and load balancing strategy;
• your strategy of exchanging the boundary data between neighboring partitions;
• MPI functions such as collective operations you used;
• Benchmark the change in execution times of your parallel program as you increase the number of processes and/or problem size; present the results in a graph or table
• Analyze and discuss the performance of your parallel program
For the hybrid MPI-OpenMP implementation,
• discuss OpenMP implementation (e.g., the directives used, iteration scheduling used)
• present and discuss the benchmark results for the runtime of the parallel code and the runtime of the key loops/kernels
• present and discuss the overhead of the OpenMP implementation
• Up to 6 A4 pages (not a strict upper limit)
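For the performance analysis, two standard measures that can be derived from the benchmark timings are speedup and parallel efficiency. The notation is introduced here for illustration: T(p) denotes the measured wall-clock time using p processes (or threads), and T(1) the time of the run on a single process.

```latex
S(p) = \frac{T(1)}{T(p)}, \qquad E(p) = \frac{S(p)}{p}
```

Ideal scaling gives S(p) = p and E(p) = 1. Measured values below that typically reflect communication cost, load imbalance, or the serial parts of the code.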