CS考试辅导 CS 61C Summer 2023 Calendar Staff Exam Policies Resources Extensions Quic

CS 61C Summer 2023 Calendar Staff Exam Policies Resources Extensions Quick Links
Setup: Git
Setup: Testing
Background

Copyright By PowCoder代写 加微信 powcoder

Task 1: Naive Convolutions
Task 2: Optimization
Task 3: Open MPI
Task 4: Partner/Feedback Form
Benchmarks
Submission and Grading
Appendix: Helper Functions in io.o
Appendix: (Optional) Convolutions
Project 4: CS61ka : Tuesday, August 8, 11:59:59 PM PT
For this project, remember that the verb form of convolution is “convolve” not “convolute”. Convolute means to complicate things,
but in case you forget, here’s a helpful quote:
“Convolute is what I do on exams, convolve is what you do on exams.”
– , EE120 Professor
Lectures 15-20, Discussions 9-10, Labs 7-9, and Homework 5-6 are required for this project, and we highly recommend
finishing those assignments first. Also, make sure you’ve finished the setup in Lab 0.
§ Introduction
One of the major goals of this project is to explore how to speed up our programs. However, as with any real-world performance
tests, there are many different factors that can affect the performance of any particular program.
For this project, one of the greatest factors is the load on the hive machines. Heavy hive machine load may significantly affect
execution time and speedup. In fact, it can drastically misrepresent the performance of your program, when in fact your code may
be correct.
As a result, we recommend the following in order to get the most consistent results from running your tests on the hive:
1. Startworkingontheprojectearly.Inevitably,morestudentswillbeworkingontheprojectclosertothedeadline.Assuch, students who start early will have a higher chance of using the hive machines under lower amounts of load.
2. CheckHivemind tochoosewhichhivemachinetosshinto.Youwillwanttoselectahivemachinethathasalowoverallload,
CPU usage, and number of current users.
3. Tryworkingontheprojectduringtimesthatotherstudentsarenotworking.Youwillfindthatcertainhoursofthedayhaveless
average hive machine load; running your tests during this time will help you get more consistent performance results.
§ Setup: Git
This assignment can be done alone or with a partner.
You must complete this project on the hive machines (not your local machine). See Lab 0 if you need to set up the hive
machines again.
Warning: Once you create a Github repo, you will not be able to change (add, remove, or swap) partners for this project, so
please be sure of your partner before starting the project. You must add your partner on both galloc and to every Gradescope
submission.
If there are extenuating circumstances that require a partner switch (e.g. your partner drops the class, your partner is
unresponsive), please reach out to us privately.
1. VisitGalloc .LoginandstarttheProject4assignment.ThiswillcreateaGitHubrepositoryforyourwork.Thiswillcreatea
GitHub repository for your work. If you have a partner, one partner should create a repo and invite the other partner to that
repo. The other partner should accept the invite without creating their own repo.
2. Clonetherepositoryonahivemachine.
$ git clone 61c-proj4
(replace username with your GitHub username)
3. Navigatetoyourrepository:
$ cd 61c-proj4
4. Addthestarterrepositoryasaremote:
$ git remote add starter https://github.com/61c-teach/su23-proj4-starter.git
If you run into git issues, please check out the common errors page.
§ Setup: Testing
The starter code does not come with any provided tests. To download the staff tests, run the following command:
$ python3 tools/create_tests.py
§ Background
This section is a bit long, so feel free to skip it for now, but please refer to them as needed since they provide helpful background
information for the tasks in this project.
§ Convolutions
For background information about what a convolution is and why it is useful, see Appendix: (Optional) Convolutions.
In this project, a vector is represented as an array of int32_ts, or a int32_t *.
§ Matrices
In this project, we provide a type matrix_t defined as follows: typedef struct {
uint32_t rows;
uint32_t cols;
int32_t *data;
} matrix_t;
In matrix_t, rows represents the number of rows in the matrix, cols represents the number of columns in the matrix, and data is a
1D array of the matrix stored in row-major format (similar to project 2). For example, the matrix [[1, 2, 3], [4, 5, 6]] would be
stored as [1, 2, 3, 4, 5, 6].
§ bin files
Matrices are stored in .bin files as a consecutive sequence of 4-byte integers (identical to project 2). The first and second integers
in the file indicate the number of rows and columns in the matrix, respectively. The rest of the integers store the elements in the
matrix in row-major order.
To view matrix files, you can run xxd -e matrix_file.bin, replacing matrix_file.bin with the matrix file you want to read. The
output should look something like this:
00000000: 00000003 00000003 00000001 00000002 …………….
00000010: 00000003 00000004 00000005 00000006 …………….
00000020: 00000007 00000008 00000009 …………
The left-most column indexes the bytes in the file (e.g. the third row starts at the 0x20th byte of the file). The dots on the right
display the bytes in the file as ASCII, but these bytes don’t correspond to printable ASCII characters so only dot placeholders
The actual contents of the file are listed in 4-byte blocks, with 4 blocks displayed per row. The first row has the numbers 3 (row
count), 3 (column count), 1 (first element), and 2 (second element). This is a 3×3 matrix with elements [1, 2, 3, 4, 5, 6, 7, 8, 9].
In this project, we provide a type task_t defined as follows:
typedef struct {
char *path;
Each task represents a convolution operation (we’ll go into what convolution is later), and is uniquely identified by its path. The path
member of the task_t struct is the relative path to the folder containing the task.
§ Testing Framework
Tests in this project are located in the tests directory. The starter code does not contain any tests, but it contains a script
tools/create_tests.py which will create the tests directory and generate tests. If you would like to make a custom test, please
add it to tools/custom_tests.py. Feel free to use the tests we provide in tools/staff_tests.py as examples.
Once you define a custom test in custom_tests.py, you can run the test using the make commands provided in the task that you’re
currently working on, and the test will be generated for you based on the parameters you specify.
§ Directory Structure
Here’s what folder should look like after running the example above:
61c-proj4/ # Your root project 4 folder
├─src/ # Where your source files are located
├─tests/ # Where your test files are located
│ └─my_custom_test/ # Where a specific test is located
│ ├─task0/
│ ├─input.txt
# The input file to pass to your program
# The first task
│ │ ├─a.bin
│ │ ├─b.bin
│ │ ├─ref.bin
│ │ └─out.bin
│ ├─task1/
│ └─task9/
# Matrix A used in this task
# Matrix B used in this task
# Reference output matrix generated by our script
# Output matrix generated by your program
# The second task, with the same folder structure
# The tenth (last) task
├─tools/ # Where some utility scripts are stored
└─Makefile # Makefile for compilation and running the program.
The path member of the task_t struct for task0 would be would be tests/my_custom_test/task0.
This project uses a Makefile for running tests. The Makefile accepts one variable: TEST. TEST should be a path to a folder in tests/
that contains an input.txt. We’ll provide more guidance on this later in the project.
For example, if you would like the run the test generated above and you’re working on task 1, the command would be
$ make task_1 TEST=tests/my_custom_test
§ Debugging
While cgdb and valgrind may not be as helpful as they are for project 1, they are still helpful for fixing any errors you may
encounter. When you run a test through make, it prints out the actual command used to run the test. For example, the make
command in the above section would print out the following:
Command: ./convolve_naive_naive tests/my_custom_test/input.txt
If you would like to use cgdb, add cgdb before the command printed. Similarly, for valgrind, add valgrind before the command
printed. For example, the commands would be
$ cgdb –args ./convolve_naive_naive tests/my_custom_test/input.txt
$ valgrind ./convolve_naive_naive tests/my_custom_test/input.txt
§ Task 1: Naive Convolutions
In this project, you will implement and optimize 2D convolutions, which is a mathematical operation that has a wide range of
applications. Don’t worry if you’ve never seen convolutions before, it can be simplified to a series of dot products (more on this in
task 1.3).
§ Task 1.1 (Optional): Vector Dot Product
Note: This task is NOT required but you may find it useful to implement as convolutions involve calculating dot products.
Helpful resources: Background: Vectors
01234 01234
• 56789 56•789 0•5+1•6+2•7+3•8+4•9=80 0•5 + 1•6 + 2•7 + 3•8 + 4•9 = 80
Recall from project 2, the dot product is a way of multiplying vectors together, by multiplying the corresponding entries of vectors
together and returning the sum of all the products. For a review, see the project 2 spec.
Implement the int dot(uint32_t n, int32_t* vec1, int32_t* vec2) function in compute_naive.c. This function calculates and
returns the dot product of vectors A and B.
uint32_t n
int32_t* vec1
int32_t* vec2
The length of the vectors we are taking a dot product of.
Pointer to the first vector.
Pointer to the second vector.
The result of the dot product.
§ Task 1.2 (Optional): 1D Convolutions
Note: This task is NOT required but you may find it useful to implement 1D convolutions before 2D convolutions. 1D convolution can
be treated as a special case of 2D convolution, where the number of rows is always 1.
§ Conceptual Overview: 1D Convolutions
Note: For this project, we will only be covering discrete convolutions since the input vectors and matrices only have values at
discrete indexes. This is in contrast to continuous convolutions, where the input would have values for all real numbers.
1D convolution is a special way of multiplying two vectors together in order to determine how well they overlap. This leads to many
different applications that you’ll explore in this project, but first, here are the mechanics for how a 1D convolution is done: (click the
image for a larger version)
012 123 234
432 432 432
7 7 16 7 16 25
0•4 + 1•3 + 2•2 = 7 1•4 + 2•3 + 3•2 = 16 2•4 + 3•3 + 4•2 = 25
In this example, we want to convolve vector A = [0, 1, 2, 3, 4] and vector B = [2, 3, 4].
1. FlippingtheelementsinvectorBhorizontally,makingthelastelementsfirstandthefirstelementslast.Inourexample,weflip vector B to become [4, 3, 2].
2. OverlapvectorAandBsuchthatvectorBisthefurthestleftpossiblewhilestillcompletelyoverlappingwithvectorA.Then perform a dot product of the overlapped elements, and store the result as the first element in the resultant output.
3. SlidevectorBtotherightby1,andrepeatthisprocess.ThiscontinuesuntilanypartofvectorBnolongeroverlapswithvector
You can assume that the width of vector B is less than or equal to the width of vector A.
Note: The output vector has different dimensions. We’d recommend working out some examples to see how the dimensions of the
output vector is related to the input vectors. For example, a convolution between input vectors of size 5 and 2 would output a vector
of size 4.
6-7÷3•23-7÷4•24•7÷1•21-7÷5•2 48293017
§ Implementation
Helpful resources: Background: Matrices, Background: Task
Implement convolve in compute_naive.c for 1D convolutions. You may assume that both a_matrix and b_matrix have exactly 1 row
and the number of columns in b_matrix will smaller or equal to the number of columns in a_matrix (that is, a_matrix is 1 by m and
b_matrix is 1 by n, and n <= m). matrix_t* a_matrix matrix_t* b_matrix matrix_t** output_matrix Pointer to the first vector. Pointer to the second vector. Set *output_matrix to be a pointer to the resulting matrix. You must allocate memory for the resulting matrix in convolve. 0 if successful, -1 if there are any errors. § Testing and Debugging Helpful resources: Background: Testing Framework, Background: Directory Structure, Background: Testing, Background: Debugging To run the staff provided tests for this task: $ make task_1 TEST=tests/test_1d_small If you'd like to create additional tests, please refer to the testing framework section for creating tests as well as the testing and debugging sections for instructions on running tests. Make sure to create tests of different sizes and dimensions! The autograder will test a wide range of sizes for both correctness and § Task 1.3: Convolutions § Conceptual Overview: 2D Convolutions Note: For this project, we will only be covering discrete convolutions since the input vectors and matrices only have values at discrete indexes. This is in contrast to continuous convolutions where the input would have values for all real numbers. If you have seen convolutions in a different class or in some other context, the process we describe below may differ slightly from what you are Convolution is a special way of multiplying two vectors or two matrices together in order to determine how well they overlap. This leads to many different applications that you'll explore in this project, but first, here are the mechanics for how convolution is done: A convolution is when you want to convolve two vectors or matrices together, matrix A and matrix B. 1. YoubeginbyflippingmatrixBinbothdimensions.NotethatflippingmatrixBinbothdimensionsisNOTthesameas transposing the matrix. Flipping an MxN matrix results in an MxN matrix. Transpose results in an NxM matrix. Flippinginboth directions 576278 567 26 12134 3775 2. Onceflippedhorizontallyandvertically,overlapmatrixBinthetopleftcornerofmatrixA.Performanelement-wise multiplication of where the matrices overlap and then add all of the results together to get a single value. This is the top left entry in your resultant matrix. 3. SlidematrixBtotherightby1andrepeatthisprocess.ThiscontinuesuntilanypartofmatrixBnolongeroverlapswithmatrix A. When this happens, move matrix B to first column of matrix A and down by 1 row. 4. RepeattheentireprocessuntilreachingthebottomrightcornerofmatrixA.YouhavenowconvolvedmatrixAandB.(clickthe image for a larger version) 1•0 + 1•1 + 1•1 + 0•2 = 2 1•0 + 1•1 + 0•1 + 2•2 = 5 1•0 + 0•1 + 1•1 + 2•2 = 5 0•0 + 2•1 + 2•1 + 2•2 = 8 You can assume that the height and width of matrix B are less than or equal to the height and width of matrix A. Note: The output matrix has different dimensions from its input matrices. We'd recommend working out some examples to see how the dimensions of the output matrix is related to the input matrices. § Implementation Helpful resources: Background: Matrices, Background: Task Implement (or modify your existing) convolve in compute_naive.c. You may assume that b_matrix will smaller or equal to a_matrix (that is, if a_matrix is m by n and b_matrix is k by l, then k < m and l < n). matrix_t* a_matrix matrix_t* b_matrix matrix_t** output_matrix Pointer to the first vector. Pointer to the second vector. Set *output_matrix to be a pointer to the resulting matrix. You must allocate memory for the resulting matrix in convolve. 0 if successful, -1 if there are any errors. § Testing and Debugging Helpful resources: Background: Testing Framework, Background: Directory Structure, Background: Testing, Background: Debugging To run the staff provided tests for this task: $ make task_1 TEST=tests/test_2d_small If you'd like to create additional tests, please refer to the testing framework section for creating tests as well as the testing and debugging sections for instructions on running tests. Make sure to create tests of different sizes and dimensions! The autograder will test a wide range of sizes for both correctness and § Task 2: Optimization At this point, you should have a complete implementation in compute_naive.c. You will implement all optimizations in compute_optimized.c. § Task 2.1: SIMD Helpful resources: Lab 7, Discussion 9 Optimize your naive solution using SIMD instructions in compute_optimized.c. Not all functions can be optimized using SIMD instructions. For this project, we're using 32-bit integers, so each 256 bit AVX vector can store 8 integers and perform 8 operations As a reminder, you can use the Intel Intrinsics Guide as a reference to look up the relevant instructions. You will have to use the __m256i type to hold 8 integers in a YMM register, and then use the _mm256_* intrinsics to operate on them. Note: if your implementation of convolve relies on dot, you will want to also convert dot to use SIMD instructions. Don't forget to implement any tail case(s)! Hint: You will want to check out the instructions in the AVX family of intrinsic functions, as it may help greatly with certain operations during the convolution process. We understand that there are a lot of functions, but it is a useful skill to be able to read documentation and find information relevant to your task. § Task 2.2: OpenMP Helpful resources: Lab 8, Discussion 9 Optimize your solution from task 2.1 using OpenMP directives in compute_optimized.c. Not all functions can be optimized using OpenMP directives. You can find more information on OpenMP directives on the OpenMP summary card . § Task 2.3: Testing and Debugging Helpful resources: Background: Testing Framework, Background: Directory Structure, Background: Testing, Background: Debugging To run the staff provided tests for this task (note: for this task, you should use make task_2 instead of make task_1): $ make task_2 TEST=tests/test_2d_small If you'd like to create additional tests, please refer to the testing framework section for creating tests as well as the testing and debugging sections for instructions on running tests. In order to receive full credit, your code should achieve the speedups listed in the benchmarks section of the spec. § Task 3: Open MPI Helpful resources: Lab 9, Discussion 10 Up until this point, you've been using the provided coordinator_naive.c as the coordinator with a small number of tasks. However, coordinator_naive.c does not have any PLP capabilities and we can optimize it for a workload that includes a larger number of 1. ImplementanOpenMPIcoordinatorincoordinator_mpi.c. The structure of the code will be similar to the Open MPI coordinator you wrote in lab 9 and discussion 10. 2. Copyyourexistingcomputefunctionsincompute_optimized.ctocompute_optimized_mpi.c,andmakeanychanges necessary to further improve the speedup. The execute_task function may be helpful here: execute_task Return values int task_t *task A task_t struct representing the task to execute. § Testing and Debugging successful, -1 if there Helpful resources: Background: Testing Framework, Background: Directory Structure, Background: Testing, Background: Debugging To run the staff provided tests for this task (note: for this task, you should use make task_3 instead of make task_1 or make $ make task_3 TEST=tests/test_2d_small If you'd like to create additional tests, please refer to the testing framework section for creating tests as well as the testing and debugging sections for instructions on 程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com