This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License
The Compute Unified Device Architecture (CUDA)
C/C++ Program with both host and CUDA code in it
[Figure: a single source file with interleaved sections of Host code and CUDA code]
CUDA is an NVIDIA-only product. It is very popular, and got the whole GPU-as-CPU ball rolling, which has resulted in other packages like OpenCL.
CUDA also comes with several libraries that are highly optimized for applications such as linear algebra and deep learning.
The CUDA Paradigm: the C/C++ Compiler and the CUDA Compiler
The CPU binary runs on the Host, and the CUDA binary runs on the Device; execution alternates between them (a minimal code sketch follows the list):

1. Run CPU code
2. Send data to GPU
3. Run GPU kernel
4. Get data back from GPU
5. Run CPU code
6. Send data to GPU
7. Run GPU kernel
8. Get data back from GPU
9. Run CPU code
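As a concrete sketch of that flow (my own minimal example, not taken verbatim from these notes; the ArrayMult kernel and the NUMN and BLOCKSIZE constants are illustrative names), a complete host-plus-device program looks roughly like this:

    // Illustrative sketch of the CUDA paradigm: run CPU code, send data, run the kernel, get data back
    #include <stdio.h>
    #include <cuda_runtime.h>

    #define NUMN        (1024*1024)     // dataset size (an exact multiple of BLOCKSIZE here)
    #define BLOCKSIZE   128             // number of threads per block

    __global__ void ArrayMult( int n, float *a, float *b, float *c )
    {
        int gid = blockIdx.x*blockDim.x + threadIdx.x;
        if( gid < n )
            c[ gid ] = a[ gid ] * b[ gid ];
    }

    int main( )
    {
        // 1. Run CPU code: create and fill the host arrays
        float *hA = new float [ NUMN ];
        float *hB = new float [ NUMN ];
        float *hC = new float [ NUMN ];
        for( int i = 0; i < NUMN; i++ )
            hA[ i ] = hB[ i ] = (float)i;

        // 2. Send data to GPU: allocate device arrays and copy host-to-device
        float *dA, *dB, *dC;
        cudaMalloc( (void **)&dA, NUMN*sizeof(float) );
        cudaMalloc( (void **)&dB, NUMN*sizeof(float) );
        cudaMalloc( (void **)&dC, NUMN*sizeof(float) );
        cudaMemcpy( dA, hA, NUMN*sizeof(float), cudaMemcpyHostToDevice );
        cudaMemcpy( dB, hB, NUMN*sizeof(float), cudaMemcpyHostToDevice );

        // 3. Run GPU kernel
        ArrayMult<<< NUMN/BLOCKSIZE, BLOCKSIZE >>>( NUMN, dA, dB, dC );

        // 4. Get data back from GPU: copy device-to-host (this also waits for the kernel to finish)
        cudaMemcpy( hC, dC, NUMN*sizeof(float), cudaMemcpyDeviceToHost );

        // 5. Run CPU code: use the results, then clean up
        fprintf( stderr, "hC[2] = %f\n", hC[2] );
        cudaFree( dA );  cudaFree( dB );  cudaFree( dC );
        delete [ ] hA;  delete [ ] hB;  delete [ ] hC;
        return 0;
    }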
CUDA wants you to break the problem up into Pieces

If you were writing in C/C++, you would say:

    void
    ArrayMult( int n, float *a, float *b, float *c )
    {
        for( int i = 0; i < n; i++ )
            c[ i ] = a[ i ] * b[ i ];
    }

If you were writing in CUDA, you would instead write a kernel in which each thread computes just one array element. Think of the kernel as having an implied for-loop around it, looping through all possible values of gid.
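A minimal sketch of such a kernel (assuming the launch configuration exactly covers the array, so no bounds check is needed):

    __global__ void
    ArrayMult( float *a, float *b, float *c )
    {
        int gid = blockIdx.x*blockDim.x + threadIdx.x;   // which array element this thread handles
        c[ gid ] = a[ gid ] * b[ gid ];
    }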
The C/C++ Program Calls a CUDA Kernel using a Special <<<...>>> Syntax

    KernelFunction<<< NumBlocks , NumThreadsPerBlock >>>( arg1, arg2, … ) ;

These are called “chevrons”.

Note that this is just like calling the C/C++ function:

    KernelFunction( arg1, arg2, … ) ;

except that we have designated it to run on the GPU with a particular block/thread configuration.
One of my own Experiments with Number of Threads Per Block

    KernelFunction<<< NumBlocks , NumThreadsPerBlock >>>( arg1, arg2, … ) ;

    NumBlocks = DataSetSize / NumThreadsPerBlock

[Two performance plots: Performance vs. Number of Threads per Block, one curve per Dataset Size]
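The NumBlocks formula above assumes DataSetSize is an exact multiple of NumThreadsPerBlock. A minimal sketch (my own illustrative names) of the usual round-up-and-guard pattern when it is not:

    #include <cuda_runtime.h>

    __global__ void KernelFunction( int n, float *c )
    {
        int gid = blockIdx.x*blockDim.x + threadIdx.x;
        if( gid < n )                            // the last block may have extra threads -- skip them
            c[ gid ] = 2.f * (float)gid;
    }

    int main( )
    {
        const int DATASETSIZE        = 1024*1024 + 37;   // deliberately not a multiple of the block size
        const int NUMTHREADSPERBLOCK = 128;

        // integer round-up so that every element gets a thread:
        int numBlocks = ( DATASETSIZE + NUMTHREADSPERBLOCK - 1 ) / NUMTHREADSPERBLOCK;

        float *dC;
        cudaMalloc( (void **)&dC, DATASETSIZE*sizeof(float) );
        KernelFunction<<< numBlocks, NUMTHREADSPERBLOCK >>>( DATASETSIZE, dC );
        cudaDeviceSynchronize( );
        cudaFree( dC );
        return 0;
    }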
Getting CUDA Programs to Run under Linux

The Makefile we use:

    CUDA_PATH     = /usr/local/apps/cuda/cuda-10.1
    CUDA_BIN_PATH = $(CUDA_PATH)/bin
    CUDA_NVCC     = $(CUDA_BIN_PATH)/nvcc

    arrayMul:  arrayMul.cu
            $(CUDA_NVCC) -o arrayMul arrayMul.cu

/usr/local/apps/cuda/cuda-10.1 is the path where the CUDA tools are loaded on our Oregon State University systems.
Getting CUDA Programs to Run under Visual Studio
1. Install Visual Studio if you haven’t already. If you are an OSU student, go to:
https://azureforeducation.microsoft.com/devtools
Click the blue Sign In button on the right.
Log in using your username and password.
2. Install the CUDA toolkit. It is available here:
https://developer.nvidia.com/cuda-downloads
Getting CUDA Programs to Run under Visual Studio
From the main screen, click File → New → Project…
Getting CUDA Programs to Run under Visual Studio
Then, in the template search box, type: CUDA
Getting CUDA Programs to Run under Visual Studio

After a few seconds, you will see the CUDA project template appear. Click Next.
Getting CUDA Programs to Run under Visual Studio
1. Navigate to the folder you want to contain this project folder.
2. Give the name you want for the folder and project.
3. Leave this box checked.
4. Click Create.
Getting CUDA Programs to Run under Visual Studio
1. Visual Studio then “writes” a program for you. It has both CUDA and C++ code in it. Its structure looks just like our notes’ examples.
2. You can click Build → Build to compile it, both the C++ and the CUDA code.
3. You can click Debug → Start Without Debugging to run it.
4. You can then either modify this file, or clear it and paste your own code in.
Getting CUDA Programs to Run under Visual Studio
Note: if you are trying to run CUDA on your own Visual Studio system, make sure your machine has the CUDA toolkit installed. It is available here:
https://developer.nvidia.com/cuda-downloads
1. Un-zip the ArrayMul2019.zip file into its own folder.
2. Rename that folder to what you want it to be.
3. Rename arrayMul.cu to whatever you want it to be (keeping the .cu extension). We will call the filename without the .cu extension the basename.
4. Rename the .sln and .vcxproj files to have the same basename as your .cu file has.
5. Edit the *.sln file. Replace all occurrences of “arrayMul” with the basename.
6. Edit the *.vcxproj file. Replace all occurrences of “arrayMul” with the basename. Replace all occurrences of ArrayMul2019 with whatever you renamed the folder to.
7. In the .cu file, rename the CUDA function from ArrayMul to whatever you want it to be. Do this twice, once in the definition of the function and once in the calling of the function.
8. Now modify the CUDA code to perform the computation you require.
Using Multiple GPU Cards with CUDA
int deviceCount;
cudaGetDeviceCount( &deviceCount );
int device;        // 0 ≤ device ≤ deviceCount – 1
cudaSetDevice( device );
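A slightly fuller sketch of using these calls (my own example; cudaGetDeviceProperties is used only to print each card's name):

    #include <stdio.h>
    #include <cuda_runtime.h>

    int main( )
    {
        int deviceCount;
        cudaGetDeviceCount( &deviceCount );

        for( int device = 0; device < deviceCount; device++ )   // 0 ≤ device ≤ deviceCount – 1
        {
            cudaSetDevice( device );        // subsequent allocations and kernel launches go to this card

            cudaDeviceProp prop;
            cudaGetDeviceProperties( &prop, device );
            fprintf( stderr, "Device %d: '%s'\n", device, prop.name );

            // ... cudaMalloc, cudaMemcpy, and kernel launches for this device would go here ...
        }
        return 0;
    }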