AE3-422 High-performance Computing

Coursework Assignment Deadline: 21st March 2018

Instructions

Please take note of the following when completing this assignment:
• Read all the tasks carefully and plan ahead before you start designing and implementing your code.
• You may use any of the tools and libraries available on the provided Linux environment.
• Your submitted code must compile and run correctly on the provided Linux environment.

Make regular backups of your code onto a separate computer system; no allowance will be made for data loss resulting from human or computer error.

1 Introduction

The objective of this coursework is to write a parallel numerical code for simulating heat transfer in a plate. The heat conduction equation in a steady-state problem will be solved numerically using the finite element method.

The strong form for heat conduction is given by the following laws:

Energy balance: $\nabla^{\top} q = s$ on $\Omega$, (1)

Fourier's law: $q = -D \nabla T$ on $\Omega$. (2)

Eq. (1) expresses conservation of energy: the heat flowing out from a point in the domain, $\nabla^{\top} q$, must equal the heat generated, $s$, in order to maintain a constant amount of heat energy. The term $q$ is the heat flux in units of $\mathrm{W \cdot m^{-2}}$, while $s$ is the heat source, the rate of heat generated per unit volume, in units of $\mathrm{W \cdot m^{-3}}$. In other words, the rate at which heat energy is generated within a control volume must equal the rate at which heat energy leaves the control volume, and hence the energy in the control volume remains constant in a steady-state problem.

Figure 1: Plate subjected to heat transfer.

The constitutive equation for heat flow, also known as Fourier's law, relates the heat flux to the temperature, and is given by Eq. (2). It states that the time rate of heat transfer through a material is proportional to the negative gradient in the temperature and to the area, perpendicular to that gradient, through which the heat flows. $D$ is the conductivity matrix in units of $\mathrm{W \cdot m^{-1} \cdot K^{-1}}$ and $\nabla T$ is the temperature gradient in $\mathrm{K \cdot m^{-1}}$.
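As a small numerical illustration of Eq. (2), the sketch below evaluates $q = -D \nabla T$ for a 2×2 conductivity matrix. The function name `heat_flux` and the numerical values are assumptions chosen for illustration; they are not taken from the provided prototype.

```python
def heat_flux(D, grad_T):
    """Return q = -D * grad_T.

    D is a 2x2 conductivity matrix in W/(m K) and grad_T a temperature
    gradient in K/m, so the returned flux is in W/m^2.
    """
    (kxx, kxy), (kyx, kyy) = D
    gx, gy = grad_T
    return (-(kxx * gx + kxy * gy), -(kyx * gx + kyy * gy))


# Illustrative values only: isotropic conductivity k = 2 W/(m K).
D = ((2.0, 0.0), (0.0, 2.0))
q = heat_flux(D, (3.0, -1.0))
print(q)  # (-6.0, 2.0)
```

The flux points down the temperature gradient: the temperature rises in +x, so the x-component of the flux is negative.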
Boundary conditions are given by:

Natural BC: $q_n \equiv -q^{\top} n = \bar{q}$ on $\Gamma_q$, (3)

Essential BC: $T = \bar{T}$ on $\Gamma_T$. (4)

At the boundaries of the problem domain, either the flux (Eq. (3)) or the temperature (Eq. (4)) must be prescribed; these are the boundary conditions. $\bar{T}$ is the known surface temperature on $\Gamma_T$ and $\bar{q}$ is the prescribed heat flux on the part of the surface $\Gamma_q$. Heat flux is positive if heat (energy) flows out of the domain. Furthermore, $\Gamma_q \cap \Gamma_T = \emptyset$ and $\Gamma_q \cup \Gamma_T = \Gamma$, as shown in Figure 1.

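We solve our problem in the weak sense. The weak form for the problem is given as follows. Find $T \in U$ such that

$\int_{\Omega} (\nabla w)^{\top} \cdot (D \nabla T) \, h \, d\Omega = \int_{\Omega} w \cdot s \, h \, d\Omega - \int_{\Gamma_q} w \cdot \bar{q} \, h \, d\Gamma \quad \forall w \in U_0$ (5)

where $h$ is the plate thickness, $w$ are test functions, and the conductivity matrix $D$ is symmetric, positive definite, and given by

$D = \begin{bmatrix} k_{xx} & k_{xy} \\ k_{yx} & k_{yy} \end{bmatrix}.$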

2 Finite Element Method Discretisation

We will restrict ourselves to using quadrilateral elements. We define a reference element as $[-1, 1]^2$, on which we construct our finite element operators. Isoparametric elements are those in which the coordinates from our physical mesh elements are mapped to the reference element by the same shape functions as those used for the approximation itself.

Let $(\xi, \eta)$ be the coordinate system of the reference element and let the four vertices of element $\Omega^e$ have coordinates $x^e = [x_1^e, x_2^e, x_3^e, x_4^e]^{\top}$ and $y^e = [y_1^e, y_2^e, y_3^e, y_4^e]^{\top}$. Furthermore, let the shape functions on the reference element be linear in each of $\xi$ and $\eta$ and given by $N(\xi, \eta) = [N_1(\xi, \eta), N_2(\xi, \eta), N_3(\xi, \eta), N_4(\xi, \eta)]$, where

$N_1 = 0.25 (1 - \xi)(1 - \eta),$
$N_2 = 0.25 (1 + \xi)(1 - \eta),$
$N_3 = 0.25 (1 + \xi)(1 + \eta),$
$N_4 = 0.25 (1 - \xi)(1 + \eta).$

We can now define coordinates $(x, y)$ in the element $\Omega^e$ in terms of the coordinates of the reference element as a mapping $\chi : \Omega \to \Omega^e$, such that

$x = \chi_x(\xi, \eta) = N(\xi, \eta) \, x^e,$ (6)
$y = \chi_y(\xi, \eta) = N(\xi, \eta) \, y^e.$ (7)
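The mapping in Eqs. (6) and (7) can be sketched in a few lines of plain Python. The function names and the rectangular element used here are illustrative assumptions, not part of the provided prototype.

```python
def shape_functions(xi, eta):
    """Bilinear shape functions N1..N4 on the reference element [-1, 1]^2."""
    return [
        0.25 * (1 - xi) * (1 - eta),
        0.25 * (1 + xi) * (1 - eta),
        0.25 * (1 + xi) * (1 + eta),
        0.25 * (1 - xi) * (1 + eta),
    ]


def map_to_physical(xi, eta, xe, ye):
    """Evaluate x = N xe and y = N ye at the reference point (xi, eta)."""
    N = shape_functions(xi, eta)
    x = sum(n * c for n, c in zip(N, xe))
    y = sum(n * c for n, c in zip(N, ye))
    return x, y


# Hypothetical 2 x 1 rectangular element, nodes listed counter-clockwise.
xe = [0.0, 2.0, 2.0, 0.0]
ye = [0.0, 0.0, 1.0, 1.0]
print(map_to_physical(0.0, 0.0, xe, ye))    # (1.0, 0.5), the element centre
print(map_to_physical(-1.0, -1.0, xe, ye))  # (0.0, 0.0), the first vertex
```

Each reference vertex maps to the corresponding physical vertex because every shape function equals one at its own node and zero at the other three.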

We will also need the gradient of the shape functions, $B^e$, given by the chain rule, such that

$B^e = (J^e)^{-1} G N$

where $G = \left[ \frac{\partial}{\partial \xi}, \frac{\partial}{\partial \eta} \right]^{\top}$ and the Jacobian matrix is $J^e = G N [x^e \; y^e]$.
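A minimal sketch of how $GN$, $J^e$ and $B^e$ might be evaluated at one reference point is given below. It assumes counter-clockwise node ordering (so that the Jacobian determinant is positive); all function names are hypothetical.

```python
def shape_gradients(xi, eta):
    """Return GN, a 2x4 matrix: row 0 holds dN/dxi, row 1 holds dN/deta."""
    return [
        [-0.25 * (1 - eta), 0.25 * (1 - eta), 0.25 * (1 + eta), -0.25 * (1 + eta)],
        [-0.25 * (1 - xi), -0.25 * (1 + xi), 0.25 * (1 + xi), 0.25 * (1 - xi)],
    ]


def jacobian(GN, xe, ye):
    """Return the 2x2 Jacobian J = GN [xe ye]."""
    return [[sum(g * c for g, c in zip(row, coords)) for coords in (xe, ye)]
            for row in GN]


def b_matrix(xi, eta, xe, ye):
    """Return (B, det J) with B = J^{-1} GN, the 2x4 physical gradients."""
    GN = shape_gradients(xi, eta)
    J = jacobian(GN, xe, ye)
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    Jinv = [[J[1][1] / det, -J[0][1] / det],
            [-J[1][0] / det, J[0][0] / det]]
    B = [[Jinv[r][0] * GN[0][a] + Jinv[r][1] * GN[1][a] for a in range(4)]
         for r in range(2)]
    return B, det


# Hypothetical 2 x 1 rectangle; nodal values sampled from T = 3x + 4y.
xe = [0.0, 2.0, 2.0, 0.0]
ye = [0.0, 0.0, 1.0, 1.0]
B, det_j = b_matrix(0.0, 0.0, xe, ye)
Te = [3.0 * x + 4.0 * y for x, y in zip(xe, ye)]
grad_T = [sum(b * t for b, t in zip(row, Te)) for row in B]
print(det_j, grad_T)
```

For a rectangle the determinant is a quarter of the element area (0.5 here), and applying $B^e$ to nodal values of a linear field recovers its gradient, (3, 4) in this example.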

Now consider the integral of a function $f(\xi, \eta)$ over the domain of a quadrilateral element. This can be defined in terms of the shape functions as

$I = \int_{\Omega^e} f(x, y) \, d\Omega = \int_{\eta=-1}^{\eta=1} \int_{\xi=-1}^{\xi=1} |J^e(\xi, \eta)| \, f(\xi, \eta) \, d\xi \, d\eta \approx \sum_{i=1}^{n_{gp}} \sum_{j=1}^{n_{gp}} W_i W_j \, |J^e(\xi_i, \eta_j)| \, f(\xi_i, \eta_j)$

where $|J^e(\xi, \eta)|$ is the determinant of the Jacobian matrix, $n_{gp}$ is the number of Gauss points, $W_i$ are the Gauss quadrature weights and $(\xi_i, \eta_j)$ are the abscissae. For our problem, we choose a two-point rule and therefore use $n_{gp} = 2$, $W_i = W_j = 1$ and $\xi_i, \eta_j = \pm 1/\sqrt{3}$.
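The two-point rule can be sketched as a double loop over the Gauss points. In this sketch the integrand `f` and the Jacobian determinant `det_j` are passed as functions of $(\xi, \eta)$; that interface and the names are assumptions made for illustration.

```python
import math

# Two-point Gauss rule: abscissae +-1/sqrt(3), both weights equal to 1.
GAUSS_POINTS = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))
GAUSS_WEIGHTS = (1.0, 1.0)


def integrate_element(f, det_j):
    """Approximate the element integral as sum_i sum_j W_i W_j |J| f."""
    total = 0.0
    for xi, w_i in zip(GAUSS_POINTS, GAUSS_WEIGHTS):
        for eta, w_j in zip(GAUSS_POINTS, GAUSS_WEIGHTS):
            total += w_i * w_j * det_j(xi, eta) * f(xi, eta)
    return total


# Check on the reference square itself (|J| = 1): the rule integrates
# xi^2 * eta^2 exactly, and the exact value is (2/3)^2 = 4/9.
approx = integrate_element(lambda xi, eta: xi**2 * eta**2,
                           lambda xi, eta: 1.0)
print(abs(approx - 4.0 / 9.0) < 1e-12)  # True
```

A two-point Gauss rule is exact for polynomials up to degree three in each variable, which is why the check above holds to rounding error.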

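The weak integral form in Eq. (5) can be replaced by the sum of elemental contributions from the $n_{el}$ elements in our mesh:

$\sum_{e=1}^{n_{el}} \left\{ \int_{\Omega^e} (\nabla w^e)^{\top} \cdot D^e (\nabla T)^e \, h \, d\Omega - \int_{\Omega^e} (w^e)^{\top} \cdot s \, h \, d\Omega + \int_{\Gamma_q^e} (w^e)^{\top} \cdot \bar{q} \, h \, d\Gamma \right\} = 0$ (8)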

A discrete form of a global quantity, $d$, is related to its elemental form through the scatter matrix $L^e$ as

$d^e = L^e d$
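To make the role of $L^e$ concrete, the sketch below builds it as a Boolean matrix from an element's node list and applies it to a global vector. The six-node mesh, the connectivity and the function names are hypothetical.

```python
def gather_matrix(element_nodes, n_global):
    """Build L^e (4 x n_global): entry (a, A) is 1 when local node a of the
    element is global node A, and 0 otherwise."""
    return [[1.0 if A == node else 0.0 for A in range(n_global)]
            for node in element_nodes]


def apply_matrix(M, d):
    """Plain matrix-vector product M d."""
    return [sum(m * v for m, v in zip(row, d)) for row in M]


# Hypothetical global vector on 6 nodes and one element using nodes 1, 2, 5, 4.
d = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
element_nodes = [1, 2, 5, 4]
Le = gather_matrix(element_nodes, len(d))
de = apply_matrix(Le, d)
print(de)  # [11.0, 12.0, 15.0, 14.0]
```

Because $L^e$ contains a single 1 per row, an implementation would normally index the global vector through the connectivity list directly, as in `[d[i] for i in element_nodes]`, rather than store the matrix.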

The finite element approximation $T^e(x, y)$ for the trial solution $T(x, y)$ in each element is given by:

$T(x, y) \approx T^e(x, y) = N^e(x, y) \, T^e = N^e(x, y) \, L^e T$ (9)

where $T^e = [T_1^e, T_2^e, T_3^e, T_4^e]^{\top}$ is the element temperature vector. Similarly, for the test functions, we have

$w(x, y) \approx w^e(x, y) = N^e(x, y) \, w^e = N^e(x, y) \, L^e w$

$\Rightarrow (w^e(x, y))^{\top} = w^{\top} (L^e)^{\top} (N^e(x, y))^{\top}$ (10)

where $w^e = [w_1^e, w_2^e, w_3^e, w_4^e]^{\top}$ is the vector of element nodal values of the test function. Note that for an isoparametric element formulation (see Eqs. (6) and (7)), the shape functions are expressed in terms of element (natural) coordinates $\xi$ and $\eta$.

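The gradient field is:

$(\nabla T)^e(x, y) = B^e(x, y) \, L^e T$ (11)

and

$((\nabla w)^e(x, y))^{\top} = w^{\top} (L^e)^{\top} (B^e(x, y))^{\top}$ (12)

The global vectors $T$ and $w$ are partitioned as follows:

$T = \begin{bmatrix} T_E \\ T_F \end{bmatrix}, \quad w = \begin{bmatrix} w_E \\ w_F \end{bmatrix} = \begin{bmatrix} 0 \\ w_F \end{bmatrix}$

The part of each vector denoted by the subscript E contains the nodes on the essential boundaries, where the values $T_E$ are known.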

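Substituting Eqs. (9), (10), (11) and (12) into Eq. (8) we obtain:

$w^{\top} \left\{ \sum_{e=1}^{n_{el}} (L^e)^{\top} \left[ \left( \int_{\Omega^e} (B^e)^{\top} D^e B^e \, h \, d\Omega \right) L^e T - \int_{\Omega^e} (N^e)^{\top} s \, h \, d\Omega + \int_{\Gamma_q^e} (N^e)^{\top} \bar{q} \, h \, d\Gamma \right] \right\} = 0 \quad \forall w_F$ (13)

where $w_F$ is the portion of $w$ corresponding to nodes on a natural boundary. From this we can now define the element conductance matrix as

$K^e = \int_{\Omega^e} (B^e)^{\top} D^e B^e \, h \, d\Omega$

and the element flux vector as

$f^e = \underbrace{\int_{\Omega^e} (N^e)^{\top} s \, h \, d\Omega}_{f_{\Omega}^e} - \underbrace{\int_{\Gamma_q^e} (N^e)^{\top} \bar{q} \, h \, d\Gamma}_{f_{\Gamma}^e}$

In the cases considered in this assignment, $s \equiv 0$ and so the term $f_{\Omega}^e$ is zero.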
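The element conductance matrix can be evaluated with the two-point Gauss rule described earlier. The sketch below is one possible arrangement, assuming counter-clockwise node ordering and including the thickness $h$ as it appears in the integrals of Eq. (13); the function name and test values are illustrative.

```python
import math


def element_conductance(xe, ye, D, h):
    """Return the 4x4 matrix K^e = sum over Gauss points of |J| h B^T D B."""
    gp = 1.0 / math.sqrt(3.0)
    Ke = [[0.0] * 4 for _ in range(4)]
    for xi in (-gp, gp):
        for eta in (-gp, gp):  # both Gauss weights are 1
            GN = [[-0.25 * (1 - eta), 0.25 * (1 - eta),
                   0.25 * (1 + eta), -0.25 * (1 + eta)],
                  [-0.25 * (1 - xi), -0.25 * (1 + xi),
                   0.25 * (1 + xi), 0.25 * (1 - xi)]]
            # Jacobian J = GN [xe ye] and its determinant.
            J = [[sum(g * c for g, c in zip(row, coords)) for coords in (xe, ye)]
                 for row in GN]
            det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
            Jinv = [[J[1][1] / det, -J[0][1] / det],
                    [-J[1][0] / det, J[0][0] / det]]
            # B = J^{-1} GN, then DB = D B.
            B = [[Jinv[r][0] * GN[0][a] + Jinv[r][1] * GN[1][a] for a in range(4)]
                 for r in range(2)]
            DB = [[D[r][0] * B[0][a] + D[r][1] * B[1][a] for a in range(4)]
                  for r in range(2)]
            for a in range(4):
                for b in range(4):
                    Ke[a][b] += det * h * (B[0][a] * DB[0][b] + B[1][a] * DB[1][b])
    return Ke


# Illustrative unit-square element with isotropic k = 1 and h = 1.
Ke = element_conductance([0.0, 1.0, 1.0, 0.0], [0.0, 0.0, 1.0, 1.0],
                         [[1.0, 0.0], [0.0, 1.0]], 1.0)
print([round(v, 6) for v in Ke[0]])
```

For this unit square the first row should be close to (2/3, -1/6, -1/3, -1/6). Two quick sanity checks for any element are that $K^e$ is symmetric and that each row sums to zero, since a uniform temperature field produces no flux.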

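This allows us to write Eq. (13) as

$w^{\top} \left[ \left( \sum_{e=1}^{n_{el}} (L^e)^{\top} K^e L^e \right) T - \left( \sum_{e=1}^{n_{el}} (L^e)^{\top} f^e \right) \right] = 0 \quad \forall w_F$

or in terms of global matrices as

$w^{\top} (K T - f) = 0 \quad \forall w_F$

Finally, the resultant system to be solved is:

$\begin{bmatrix} K_{EE} & K_{EF} \\ K_{EF}^{\top} & K_{FF} \end{bmatrix} \begin{bmatrix} T_E \\ T_F \end{bmatrix} = \begin{bmatrix} r_E \\ f_F \end{bmatrix}$ (14)

Figure 2: The heat transfer problem is solved on a plate of length L and upper and lower boundaries as shown, with a thickness of 0.2 m. Material is isotropic.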
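The assembly of $K$ and the partitioned solve of Eq. (14) can be sketched on a deliberately tiny problem. To keep the example short it uses hypothetical two-node toy elements rather than quadrilaterals, a dense Gaussian elimination instead of a production solver, and invented load values; only the scatter-add and the reduction $K_{FF} T_F = f_F - K_{EF}^{\top} T_E$ are the point.

```python
def assemble(n_nodes, connectivity, element_matrices):
    """Scatter-add each K^e into the global K (the sum of L^eT K^e L^e)."""
    K = [[0.0] * n_nodes for _ in range(n_nodes)]
    for nodes, Ke in zip(connectivity, element_matrices):
        for a, A in enumerate(nodes):
            for b, B in enumerate(nodes):
                K[A][B] += Ke[a][b]
    return K


def solve_dense(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def solve_partitioned(K, f, essential, T_E):
    """Solve K_FF T_F = f_F - K_EF^T T_E, then recover r_E from the E rows."""
    n = len(f)
    free = [i for i in range(n) if i not in essential]
    K_FF = [[K[i][j] for j in free] for i in free]
    rhs = [f[i] - sum(K[i][j] * t for j, t in zip(essential, T_E)) for i in free]
    T_F = solve_dense(K_FF, rhs)
    T = [0.0] * n
    for i, t in zip(essential, T_E):
        T[i] = t
    for i, t in zip(free, T_F):
        T[i] = t
    r_E = [sum(K[i][j] * T[j] for j in range(n)) for i in essential]
    return T, r_E


# Toy chain of three nodes and two elements; node 0 is held at T = 10.
Ke = [[1.0, -1.0], [-1.0, 1.0]]
K = assemble(3, [[0, 1], [1, 2]], [Ke, Ke])
T, r_E = solve_partitioned(K, [0.0, 0.0, 5.0], essential=[0], T_E=[10.0])
print(T, r_E)
```

In this toy case the temperatures come out as 10, 15 and 20 and the recovered $r_E$ is -5, which balances the load of 5 applied at the last node. The entry of `f` at an essential node is never read, because that row is only used afterwards to recover $r_E$.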

Tasks

The objective of this coursework is to write a high-performance parallel C++ implementation which will solve the heat transfer problem in a plate, as illustrated in Figure 2.

A prototype unoptimised serial code, written in Python, is provided for you.

Table 1: Test case definitions

Parameter    Case 1 (C1)        Case 2 (C2)         Case 3 (C3)
a            0                  0                   0.25
h1           1                  1                   1
h2           1                  1                   1.3
L            2                  2                   3
Nelx         10                 10                  15
Nely         5                  5                   8
$\bar{T}$    T_xLEFT = 10       T_yBOTTOM = 10      T_xLEFT = −20
$\bar{q}$    q_xRIGHT = 2500    q_yTOP = 2500       q_yBOTTOM = −5000

  1. Extend the Python program provided to solve Case 3. [10%]
  2. Convert the Python code to C++, and optimise it, to solve heat transfer on a plate in serial for all three cases. [40%]
    (a) Accept values for the geometric parameters (a, h1, h2, L and tp), material parameters (kxx, kyy and kxy) and discretisation parameters (Nelx and Nely) as command-line arguments to your program and validate them appropriately. Ensure the conductivity matrix is positive-definite.
    (b) Generate the element conductance matrix Ke and element flux vector fe.
    (c) Assemble the global conductance matrix K and global flux vector f as shown in Eq. (14).
    (d) Find the nodal temperatures by solving Eq. (14).
    (e) Write your results to a VTK file. This can be visualised with the program ParaView (run paraview from a terminal).
  3. Create a Makefile to build and run your C++ code. [5%]
    (a) Create a target compile which compiles your code.
    (b) Create targets c1, c2 and c3 which execute your compiled code with the parameter sets listed in Table 1 and other properties shown in Figure 2.
    (c) Update your Makefile to add a clean target which removes files generated during compilation.
    (d) Define appropriate default and all target rules in your Makefile.
  4. Parallelise your C++ program using MPI to solve the problem on two processes. [30%]
    (a) Split the domain approximately equally between the processes. Program output should be written from process 0.
    (b) Solve the linear system in parallel.
    (c) Update your Makefile and add the targets c1p, c2p and c3p to run the code in parallel on two processes using the parameters for the three cases.
    (d) Investigate the mesh size (choices of Nelx and Nely) required for your code when run on two processes to out-perform your code when run on one process. You could use the Linux time command, e.g. time ./a.out, the Boost timer library, or a performance profiler to make measurements.
  5. Write a brief report (up to 2 pages) which includes [5%]
    (a) Quantitative evidence of validation of your C++ code, in serial and parallel, against analytical solutions for C1 and C2.
    (b) A plot of your solution to C3.
    (c) A brief description of design decisions you made to ensure an efficient implementation.
  6. Demonstrate use of good programming practices [10%]
    (a) Generate a log of your use of Git version control using the command:

        git log --name-status > repository.log

    (b) Document your source code appropriately.
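For the validation step in task 2(a), the positive-definiteness requirement on the conductivity matrix can be checked with Sylvester's criterion: a symmetric 2×2 matrix is positive definite exactly when its leading principal minors are positive. The sketch below is written in Python to match the provided prototype, and the same two comparisons carry over directly to C++; the function name and the sample values are illustrative assumptions.

```python
def is_positive_definite(kxx, kyy, kxy):
    """Check the symmetric matrix [[kxx, kxy], [kxy, kyy]] with Sylvester's
    criterion: first minor kxx > 0 and determinant kxx*kyy - kxy^2 > 0."""
    return kxx > 0.0 and kxx * kyy - kxy * kxy > 0.0


# Illustrative values only (not the coursework's material data).
print(is_positive_definite(250.0, 250.0, 0.0))  # True: isotropic material
print(is_positive_definite(1.0, 1.0, 2.0))      # False: determinant is -3
print(is_positive_definite(-1.0, 1.0, 0.0))     # False: kxx is negative
```

A program might reject the input with an error message when this check fails, alongside simpler checks such as positive lengths and positive element counts.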

Submission and Assessment

When submitting your assignment, make sure you include the following:
• The extended version of the Python code to implement Case 3.
• All the files needed to compile and run your C++ code:

– Source files for a single C++ program which performs all the tasks, i.e. all .cpp and .h files necessary to compile and run the code.

– The Makefile used for both compiling and running the code, including all the targets as specified in the tasks.

• Your 2-page report (in PDF format only).

• The git log (repository.log)

These files should be submitted in a single tar.gz archive file to Blackboard Learn. To generate your tar.gz archive, put all files to be submitted in a directory (e.g. ae3-422-assessment) and run the following command:

tar -cvzf submission.tar.gz ae3-422-assessment

You may make unlimited submissions and the last submission before the deadline will be assessed.