Project 4: Numc
Deadline: Sunday, August 8, 11:59:59 PM PT
This project is designed to be both a C project as well as a performance project. In this project you will be implementing a slower version of numpy, a very useful Python library for performing mathematical and logical operations on arrays and matrices. Your version of numpy, numc (how exciting!), is most likely to be slower than numpy, but much faster than the naive implementations of matrix operations. The steps of this project are as follows:
1. You will first complete a naive solution for some matrix functions in C.
2. You will gain a deeper understanding of the Python-C interface by overloading some operators and defining some instance methods for numc.Matrix objects.
3. Finally, you will speed up your naive solution, thus making numc.Matrix operations faster.
Do not expect your final completed numc module to be as good as numpy, but you should expect a very large speedup compared to the naive solution, especially for matrix multiplication and exponentiation!
Tips and Guidelines
Please start early! Because there are many more 61C students than Hive machines, you will likely share resources with your classmates. This might affect the measurement of your code’s speedup. We encourage you to use Hivemind to help balance the load amongst the hive machines.
You will have 6 tokens every 6 hours for the Gradescope assignment. So again, please start early!
For this project, we strongly suggest working on Hive machines under your cs61c account. We have set up a few environment settings that only work if you use your cs61c account. If you run into issues related to not working on the Hive machines, we will NOT be able to help you.
If you would like to compare your solution against the reference solution, you can import the dumbpy library on the Hive machines, where we have already installed it for you!
We will not be directly testing your C code. All tests will be in Python!
You may change the skeleton code in src/numc.c, especially if you are not using a row-major setup for your matrices.
You may NOT add/remove any additional imports.
You may change the function signatures in the following files:
src/matrix.h and src/matrix.c
You may not change the function signatures in the following files:
numc.h and src/numc.c
If you fail to complete Task 4, you will receive negative points, up to twice the number of points the task is worth, so complete as much of it as you can even if you do not finish the entire project!
Getting Started
Warning: this assignment allows groups of 2 students! If you have a group, one member should create the repo and invite the other group member to that repo. The other group member should NOT create their own repo. If you haven’t decided on a group yet, don’t create the repo yet. You will not be able to change groups after creating a repo!
Visit Galloc. Log in and start the Project 4 assignment. A GitHub repo will be created for your team; this will be your repo for any Project 4 work you do. Then, clone your repository locally and add the starter remote:
If we publish changes to the starter code, retrieve them using git pull starter main.
To be able to install the modules that you will complete in this project, you must create a virtual environment by running
$ python3.6 -m venv .venv
Note that you MUST use python 3.6 as our reference module dumbpy only supports this specific version of python. Finally, run the following command to activate the virtual environment (a tool used to create isolated Python environments such that our project can have its own dependencies, regardless of what dependencies every other project has):
$ source .venv/bin/activate
This will put you in a virtual environment needed for this project. Please remember that if you exit the virtual environment and want to return to work on the project, you must re-run source .venv/bin/activate. This also means every time you re-ssh into the hive, you will have to re-run source .venv/bin/activate.
Next, install the required packages by running
$ pip3 install -r requirements.txt
in the virtual environment. This will install all python packages you need for running your custom python tests. Finally, if you have to exit out of the virtual environment, you can do so by running:
$ deactivate
We already have the reference library dumbpy installed for you on Hive machines. You can import it with or without the virtual environment while using python3.6, and all object and function names are the same as the numc module that you will use (please refer to using the setup file). You will only be able to access the dumbpy package on hive as we will not be directly releasing it. You can use it as a reference for both correctness and speed.
Again, for this project, we strongly suggest working on Hive machines in your cs61c class account. You will not be able to import dumbpy if you are using other class accounts. We will be unable to help you with issues caused by working outside of the Hive.
Task 1: Matrix functions in C
Before contacting the staff about any issues that you encounter, please have a look through the FAQ here.
For this task, you will need to complete all functions in src/matrix.c labelled with /* TODO: YOUR CODE HERE */. The comments above each function signature in src/matrix.c contain instructions on how to implement the functions and the comments next to each variable of the matrix struct in src/matrix.h contain details about each variable, so read them carefully before you start coding.
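As a rough illustration of what a naive Task 1 function can look like, here is a minimal sketch of a row-major matrix product. The names demo_matrix and demo_mul and the struct layout are hypothetical and are not the skeleton's definitions; the real struct fields and function signatures are described in src/matrix.h and src/matrix.c. The sketch assumes the result matrix is already allocated with the correct dimensions.

```c
/* Hypothetical minimal matrix for illustration only (not the struct in src/matrix.h).
 * Row-major layout: entry (i, j) lives at data[i * cols + j]. */
typedef struct {
    int rows;
    int cols;
    double *data;
} demo_matrix;

/* Naive triple-loop product: result = a * b.
 * result must already be allocated as a->rows x b->cols.
 * Returns 0 on success and a nonzero value if the shapes do not line up. */
int demo_mul(demo_matrix *result, const demo_matrix *a, const demo_matrix *b) {
    if (a->cols != b->rows || result->rows != a->rows || result->cols != b->cols) {
        return 1;
    }
    for (int i = 0; i < a->rows; i++) {
        for (int j = 0; j < b->cols; j++) {
            double sum = 0.0;
            for (int k = 0; k < a->cols; k++) {
                sum += a->data[i * a->cols + k] * b->data[k * b->cols + j];
            }
            result->data[i * result->cols + j] = sum;
        }
    }
    return 0;
}
```

A version like this is the kind of baseline that the later speedup work is measured against.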
The function allocate_matrix_ref is called from src/numc.c’s Matrix61c_subscript function and is used for getting a row of the matrix (see Info: numc.Matrix indexing for an example). Currently, Matrix61c_subscript and allocate_matrix_ref assume a row-major setup. If you choose to implement your matrices as column-major, you will have to change the implementation of Matrix61c_subscript, and you might also want to change the function signature of allocate_matrix_ref.
Again, you may change any function signature in src/matrix.h and src/matrix.c. Important notes:
The deallocate function as well as the ref_cnt field in the matrix struct have caused a lot of confusion in past semesters. It is important to remember that this ref_cnt is NOT Python’s internal reference count. It is simply a field that will help you implement the deallocate function. It does not have to reflect the true reference count if you deem that setting it to other values will simplify your implementation of deallocate.
For the deallocate function, since there can be multiple matrices that refer to the same data array in memory, you must not free the data until you call deallocate on the last existing matrix that refers to that data. If you are having some difficulties implementing this, here’s a hint: you can keep the matrix struct in memory even if you have already called deallocate on that matrix. You only need to make sure that the struct is freed once the last matrix referring to its data is deallocated.
If this explanation does not make sense now, don’t worry! It will make more sense after you implement the indexing section of task 2.
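One possible way to realize the hint above is sketched below. It is only a sketch under stated assumptions: the names demo_matrix, demo_alloc, demo_alloc_ref, and demo_dealloc are hypothetical, the parent pointer is an assumed field, and the counting scheme (the owner counts itself plus every live view) is just one of several schemes that satisfy the notes. The skeleton's own signatures may differ.

```c
#include <stdlib.h>

/* Hypothetical struct for illustration; see src/matrix.h for the real fields. */
typedef struct demo_matrix {
    int rows;
    int cols;
    double *data;
    int ref_cnt;                /* used on the owner only: owner + live views */
    struct demo_matrix *parent; /* NULL for the matrix that owns data */
} demo_matrix;

/* Allocate a zero-filled owner matrix; returns NULL on bad dimensions or no memory. */
demo_matrix *demo_alloc(int rows, int cols) {
    if (rows <= 0 || cols <= 0) return NULL;
    demo_matrix *m = malloc(sizeof(demo_matrix));
    if (m == NULL) return NULL;
    m->data = calloc((size_t)rows * (size_t)cols, sizeof(double));
    if (m->data == NULL) { free(m); return NULL; }
    m->rows = rows; m->cols = cols; m->ref_cnt = 1; m->parent = NULL;
    return m;
}

/* Allocate a view that shares from's data starting at offset. */
demo_matrix *demo_alloc_ref(demo_matrix *from, int offset, int rows, int cols) {
    if (rows <= 0 || cols <= 0) return NULL;
    demo_matrix *root = from->parent ? from->parent : from;
    demo_matrix *v = malloc(sizeof(demo_matrix));
    if (v == NULL) return NULL;
    v->rows = rows; v->cols = cols;
    v->data = from->data + offset;
    v->ref_cnt = 1;
    v->parent = root;
    root->ref_cnt++; /* one more matrix now refers to root's data */
    return v;
}

/* Release m. The owner's struct stays in memory while views still exist.
 * Returns 1 if this call freed the shared data, 0 otherwise. */
int demo_dealloc(demo_matrix *m) {
    if (m == NULL) return 0;
    demo_matrix *root = m->parent ? m->parent : m;
    root->ref_cnt--;
    int last = (root->ref_cnt == 0);
    if (m != root) free(m);                    /* a view's struct can always go */
    if (last) { free(root->data); free(root); }
    return last;
}
```

With this scheme, deallocating the owner before one of its views leaves the data intact, and the data and the owner's struct are released together when the last view goes away.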
Throwing Errors
For the following errors, please have a look at PyExc_ValueError and PyExc_RuntimeError in PyErr_SetString mentioned here to throw errors correctly. Make sure to test these errors and not to confuse the error types being thrown.
Because you will later use these return values to throw these errors in src/numc.c, make sure that a runtime/value error is thrown in src/numc.c whenever we run out of memory. This includes throwing an error in Matrix61c_init.
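The idea of returning a status code from the C layer and translating it into a Python exception later can be sketched as follows. Everything named here is an assumption for illustration: the numeric codes DEMO_BAD_DIMS and DEMO_NO_MEMORY are hypothetical values and not the ones the project specifies, and demo_error_name only returns the name of the exception as a string so that the sketch compiles without the Python headers. In src/numc.c the corresponding step would be a call to PyErr_SetString with the matching exception.

```c
#include <stdlib.h>

/* Hypothetical status codes; the real return values are defined by the project. */
enum { DEMO_OK = 0, DEMO_BAD_DIMS = 1, DEMO_NO_MEMORY = 2 };

/* Hypothetical minimal matrix for illustration only. */
typedef struct {
    int rows;
    int cols;
    double *data;
} demo_matrix;

/* Allocate a zero-filled rows x cols matrix and store it in *out.
 * *out is only written on success. */
int demo_allocate(demo_matrix **out, int rows, int cols) {
    if (rows <= 0 || cols <= 0) return DEMO_BAD_DIMS;
    demo_matrix *m = malloc(sizeof *m);
    if (m == NULL) return DEMO_NO_MEMORY;
    m->data = calloc((size_t)rows * (size_t)cols, sizeof(double));
    if (m->data == NULL) { free(m); return DEMO_NO_MEMORY; }
    m->rows = rows;
    m->cols = cols;
    *out = m;
    return DEMO_OK;
}

/* Name of the Python exception the interface layer would raise for a code,
 * or NULL when the code signals success. */
const char *demo_error_name(int code) {
    switch (code) {
    case DEMO_BAD_DIMS:  return "PyExc_ValueError";
    case DEMO_NO_MEMORY: return "PyExc_RuntimeError";
    default:             return NULL;
    }
}
```

Keeping the two failure causes as distinct codes is what lets the interface layer pick the right exception type instead of guessing.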
Testing for Correctness
We’ve provided some sanity tests in tests/mat_test.c. These tests make several assumptions:
They assume that all result matrices are already pre-allocated with the correct dimensions and that all input dimensions are valid.
They assume that you have not modified the matrix struct in src/matrix.h
All tests except the tests for get and set assume that your get and set are correct
They assume that you have not modified the function signatures. Note that you may still change the function signatures of matrix.c and matrix.h if you’d like to, as mentioned here. These tests are different from the autograder tests because they only test matrix.c which is why you need to have the same function signatures here.
Violation of one or more of these assumptions may not cause your tests to fail, but please keep this in mind if your tests are failing and you are violating at least one of these assumptions.
To run the CUnit tests, run
$ make test
in the root folder of your project. This will create an executable called test in the root folder and run it.
By default, CUnit will run these tests in Normal mode. When debugging a specific issue, it may be helpful to switch to Verbose mode, which can be done by commenting and uncommenting the relevant lines in mat_test.c:
Make sure that only one of the two lines is uncommented at a time.
Please keep in mind that these tests are not comprehensive, and passing all the sanity tests does not necessarily mean your implementation is correct. This is especially true with the memory functions allocate_matrix, allocate_matrix_ref, and deallocate_matrix. Also keep in mind that the autograder will be using our own set of sanity tests, and will not be running your CUnit tests.
Another thing to note is that the Makefile is written for compilation on the hive machines. If you wish to run it locally, you will have to modify the Makefile by replacing the path to your CUnit/Python libraries in your CUNIT and PYTHON variables. You will also need to make sure that your local computer supports AVX extensions and OpenMP.
Finally, you are welcome to modify the tests/mat_test.c file in the tests directory to implement your custom test cases. To add your own custom tests, you will need to define your own test function and possibly use any of the CU_ASSERT_EQUAL, CU_ASSERT_NOT_EQUAL, or CU_ASSERT_PTR_EQUAL CUnit test cases to compare any value that you would like (Suggestion: there are more possible CUnit Test Cases linked here). Lastly, you will need to call CU_add_test in the main function to run your newly created function! A good place to start is to look at some of the provided tests and apply the general approach to your own specific tests.
Using the setup file
The setup.py file is used for installing your custom-built modules and has been already provided to you. Have a look at the code and make sure to understand what is being done. You should be able to install numc by simply running:
This will uninstall your previously installed numc module if it existed and reinstall numc. We have written src/numc.c so that numc.Matrix will be initialized and ready to import upon successful installation of the numc module. You should rerun make every time you make changes and want them to be reflected in the numc module.
You can uninstall your numc module by running
$ make uninstall
You will likely get a lot of warnings about functions being defined but not used, and that’s ok! You should ignore these warnings for now, and they will be gone after you finish writing Task 2.
Remember that you must be in the virtual environment that you set up in order to install the modules, otherwise you will get a “Read-only file system” error.
Task 2: Writing the Python-C interface
Before contacting the staff about any issues that you encounter, please have a look through the FAQ here.
Now that you have successfully installed your numc module, you can import your numc.Matrix objects in Python programs! Here are some ready-to-use features already implemented for numc.Matrix objects. You might find them helpful when debugging Task 2.
Info: Importing numc.Matrix
Here are several ways of importing numc.Matrix
Info: numc.Matrix initialization
The code block below lists all the different ways of creating a numc.Matrix object.
More specifically:
nc.Matrix(rows: int, cols: int) will create a matrix with rows rows and cols cols. All entries in this matrix are defaulted to 0.
nc.Matrix(rows: int, cols: int, val: int/float) will create a matrix with rows rows and cols cols. All entries in this matrix will be initialized to val.
nc.Matrix(rows: int, cols: int, lst: List[int/float]) will create a matrix with rows rows and cols cols. lst must have length rows * cols, and entries of the matrix will be initialized to values of lst in a row-major order.
nc.Matrix(lst: List[List[int/float]]) will create a matrix with the same shape as the 2D lst (i.e. each list in lst is a row for this matrix).
Info: numc.Matrix indexing
You can index into a matrix and change either the value of one single entry or an entire row. More specifically, mat[i] should give you the ith row of mat. If mat has more than 1 column, mat[i] should also be of type numc.Matrix with (mat’s number of columns, 1) as its shape. In other words, mat[i] returns a column vector. This is to support 2D indexing of numc matrices.
If mat only has one column, then mat[i] will return a double. mat[i][j] should give you the entry at the ith row and jth column. If you are setting one single entry by indexing, the data type must be float or int. If you are setting an entire row of a matrix that has more than one column, you must provide a 1D list that has the same length as the number of columns of that matrix. Every element of this list must be either of type float or int.
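On the C side, the indexing behavior described above comes down to row-major offset arithmetic. The sketch below shows how row i of a matrix can be exposed as a (cols, 1) column vector over the same storage, so that writing through the row changes the original matrix. The names demo_matrix, demo_get, demo_set, and demo_row_view are hypothetical, and the view here is a plain struct filled in by the caller rather than the allocate_matrix_ref function from the skeleton.

```c
/* Hypothetical minimal matrix for illustration only.
 * Row-major layout: entry (row, col) lives at data[row * cols + col]. */
typedef struct {
    int rows;
    int cols;
    double *data;
} demo_matrix;

/* Read the entry at (row, col). */
double demo_get(const demo_matrix *m, int row, int col) {
    return m->data[row * m->cols + col];
}

/* Write the entry at (row, col). */
void demo_set(demo_matrix *m, int row, int col, double val) {
    m->data[row * m->cols + col] = val;
}

/* Fill view so that it describes row i of m as a (m->cols, 1) column vector
 * sharing m's storage. Returns 0 on success, nonzero if i is out of range. */
int demo_row_view(demo_matrix *view, const demo_matrix *m, int i) {
    if (i < 0 || i >= m->rows) return 1;
    view->rows = m->cols;
    view->cols = 1;
    view->data = m->data + i * m->cols; /* row i starts i * cols entries in */
    return 0;
}
```

In this sketch, mat[i][j] corresponds to demo_get on the view at (j, 0), which reads the same memory as demo_get on the original matrix at (i, j).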
$ git clone YOUR_REPO_NAME
$ cd YOUR_REPO_NAME
$ git remote add starter https://github.com/61c-teach/su21-proj4-starter.git
$ git pull starter main
Return value | Error (later thrown in Task 2) | When to return value
 | PyExc_ValueError | If you are trying to allocate matrices with non-positive dimensions.
 | PyExc_RuntimeError | If allocate_matrix or allocate_matrix_ref fails to allocate space.
// CU_basic_set_mode(CU_BRM_NORMAL);
CU_basic_set_mode(CU_BRM_VERBOSE);
from numc import Matrix
import numc
numc.Matrix
import numc as nc
>>> import numc as nc
CS61C Summer 2021 Project 4: numc imported!
>>> nc.Matrix(3, 3) # This creates a 3 * 3 matrix with entries all zeros
[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
>>> nc.Matrix(3, 3, 1) # This creates a 3 * 3 matrix with entries all ones
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
>>> nc.Matrix([[1, 2, 3], [4, 5, 6]]) # This creates a 2 * 3 matrix with first row 1, 2, 3, second row 4, 5, 6
[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
>>> nc.Matrix(1, 2, [4, 5]) # This creates a 1 * 2 matrix with entries 4, 5
[[4.0, 5.0]]
Please note that if mat[i] has more than 1 entry, it will share data with mat, and changing mat[i] will result in a change in mat. The example given below assumes the matrices are initialized from the code block above.
>>> import numc as nc
CS61C Summer 2021 Project 4: numc imported!
>>> mat = nc.Matrix([[1, 2, 3], [4, 5, 6]]) # This creates a 2 * 3 matrix with first row 1, 2, 3, second row 4, 5, 6
>>> mat
[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
>>> slice = mat[0]
>>> slice
[[1.0], [2.0], [3.0]]
>>> slice[0]
1.0
>>> slice[1] = 10.0 # Change a value in slice
>>> slice
[[1.0], [10.0], [3.0]]
>>> mat # Mat is changed as well
[[1.0, 10.0, 3.0], [4.0, 5.0, 6.0]]
Partial slices, however, are not supported. For example,
Info: instance attributes
The matrices and vectors have an attribute shape, which is a tuple of (rows, cols). Example is given below.
Info: Python/C API Reference
We define the Matrix61c struct in numc.h. It is of type PyObject (this means you can always cast Matrix61c to PyObject, but not vice versa), which according to the official documentation, “contains the information Python needs to treat a pointer to an object as an object”. Our Matrix61c has the matrix struct we defined in src/matrix.h.
Then we define a struct PyTypeObject named Matrix61cType to specify the intended behaviors of our Python object Matrix61c. This struct will then be initialized to be our numc.Matrix objects.
For example, .tp_dealloc tells Python which function to call to destroy a numc.Matrix object when its reference count becomes 0, and .tp_members tells Python what instance attributes numc.Matrix objects have. You can take a look at the official documentation if you are curious.
Useful functions:
Here is a list of some functions and Python objects from
PyObject_TypeCheck
PyErr_SetString
Py_BuildValue
PyTupleObject
PyLongObject
PyFloatObject
PyListObject
Now you are ready to complete src/numc.c, the Python-C interface! As before, you will need to fill out all functions and variables labeled /* TODO: YOUR CODE HERE */. The code for initializing the module numc and the object type numc.Matrix is already done for you. Although not required, we encourage you to take a look at the existing code to better understand the interface.
Below are the two main parts for this task.
Note: Don’t forget to use the return values from task 1 to throw PyExc_ValueError and PyExc_RuntimeError in src/numc.c! Here is the error table for your convenience:
Number Methods
For this part, we ask you to overload operators for numc.Matrix objects. For the following operations and functions, please have a look at PyExc_TypeError, PyExc_ValueError, and PyExc_IndexError in PyErr_SetString to throw errors correctly. Make sure to test these errors and not to confuse the error types being thrown. We have made subtraction and negation optional to reduce the amount of redundant work for the project. Feel free to implement these functions if you would like. We have provided autograder tests for you to test their functionality. Extra points will NOT be awarded for implementing the optional functions. Here are the expected behaviors of overloaded operators:
Please note that for all these operations above, you never directly modify the matrix that you pass in. You always make a new numc.Matrix object to hold your result, so make sure you set the shape attribute of the new numc.Matrix. You can use Matrix61c_new to create new numc.Matrix objects. Take a look at the implementation of Matrix61c_subscript for an example.
For all the functions above, throw a runtime error using PyExc_RuntimeError (similarly to PyExc_ValueError, PyExc_TypeError, and PyExc_IndexError as mentioned above) if any error occurs (such as matrix allocation failure) and causes the operation to fail. Moreover, for any operations that involve two instances of numc.Matrix, you will have to make sure that both a and b are indeed of type numc.Matrix as we do not support operations between numc.Matrix and other data/object types. Please read the comments in src/numc.c carefully.
After you implement all the functions above, you will need to fill out the struct Matrix61c_as_number in src/numc.c, which is used to define the object type numc.Matrix. Remember to cast your functions to the correct types when assigning them to Matrix61c_as_number’s fields! Here is the link to the official documentation of a PyNumberMethods struct: https://docs.python.org/3/c-api/typeobj.html#c.PyNumberMethods
Instance Methods
You will implement two instance methods for numc.Matrix:
After you implement all the functions above, you will need to fill out the array of PyMethodDef structs Matrix61c_methods in src/numc.c, which is used to define the object type numc.Matrix.
This link tells you what goes into a PyMethodDef struct: https://docs.python.org/3/c-api/structures.html
Indexing
As mentioned in task 1, if you are storing your matrix data in a non-row-major order, you might want to change your Matrix61c_subscript.
Regardless of how you are storing your matrices, now is a good ti