C语言cache模拟代写 Project Two: Cache Simulator

Objective

Project Two: Cache Simulator

Instructor: Dr. Anup Kumar Das

The objective of this project is to develop a cache simulator in C and then simulate a set associative instruction cache to find the cache configuration that gives the best hit rate. You can choose to work on this project in group of two or work individually. Please read the grading policy on page 4 carefully.

Introduction

Figure 1 Circuit diagram of a four-way set-associative cache

Figure 1 shows a sample circuit diagram of a four-way set-associative cache. The comparators determine which element of the selected block (if any) matches the tag. The output of the comparators is used to select the data from one of the four blocks of the indexed set, using a multiplexor with a decoded select signal. The size of tag and (set) index bits are determined by cache size, associativity and block size.

When a miss occurs in an associative cache, we have a choice of where to place the requested block, and hence a choice of which block to replace. The most commonly used scheme is least recently used (LRU).

There are number of ways to implement LRU scheme. To fully implement LRU, it is necessary to maintain a linked list of all blocks in a set, with the most recently used block at the front and the least used block at the rear. The difficulty is that the list must be updated on every memory reference. Finding a block in the list, deleting it, and then moving it to the front is very time consuming, even in hardware. You are not encouraged to implement your LRU scheme this way.

One straightforward way to implement LRU is by the help of counters. This method requires each set to equip with a 64-bit counter, say ACCESS_COUNTER, that is automatically incremented after each reference (index) to the set. Furthermore, each block must also have a field to contain the counter. After each set reference, the current value of ACCESS_COUNTER is stored in the block that just referenced. When a miss occurs, the hardware (inside each set) just simply examines all the counters in the blocks inside to find the lowest one and this block is the least recently used one. However, one drawback of this method is that there may be more than one block with the same value of access counters. In this case, you need a secondary criterion like FIFO which results in a unique block to replace.

The recommended way to implement LRU is the optimized bit pseudo-LRU policy. Let’s assume the degree of associativity is A, this method requires each set to maintain an A by A matrix, initially all zeros. Whenever block k is referenced, the hardware first sets all the bits of row k to 1, then sets all the bits of column k to 0. At any instant, the row whose binary value is the lowest is the least recently used (block), the row whose value is next lowest is next least recently used (block), and so forth. Figure 2 shows the matrix updates when the following blocks (inside the same set) are referenced in order: 0, 1, 2, 3, 2, 1, 0, 3, 2, 3. You are encouraged to use this method.

Figure 2 Matrix updates every set reference, degree of associativity is 4

Implementation

1. The first step is to create efficient data structures to represent a cache. For example, you can use the following data structure to represent a (set) block

To remind you the fields inside a set block (see figure 1), there are usually three fields: valid bit, tag and data. In this project, you don’t need to consider the data field, which means you don’t need to handle loading block from main memory. If there is a miss occurred, you can simply do the following:

(1) Find the LRU block (choose an arbitrary block if the set is not fully utilized) (2) Update tag and valid bit to the LRU block
(3) Update LRU table inside the current set
(4) set hit=0

2. You also need to design a function called “create_cache()” which initializes the data structures. Your function should contain the following mechanisms:

(1) Check whether user provided parameters (cache size, associativity and line/block size) is the power of 2 or not

(2) Initialize (cache) data structures including LRU tables

3. Finally, you need to implement a function called “access_cache()” which is the key function to simulate cache behavior.

4. “simple_trace” is the file you will use to test your cache and the parser has been provided. Since we are only interested in instruction cache, so you only need to worry about the “I” case. You can assume all the address are 64-bit.

Deliverables

1. Efficient data structures and initialization functions.
2. Efficient implementation of LRU scheme.
3. Efficient implementation of access_cache() function.
3. Test your unified cache (instruction cache) with following setup:

(1) Cache Size: 1 KB, Associativity: 4, Cache Line Size: 128 Bytes; (2) Cache Size: 2 KB, Associativity: 4, Cache Line Size: 128 Bytes; (3) Cache Size: 4 KB, Associativity: 4, Cache Line Size: 128 Bytes; (4) Cache Size: 8 KB, Associativity: 4, Cache Line Size: 128 Bytes;

4. Find the setup in 3 that gives you the best hit rate, then change its associativity to 1, 2, 4 and simulate.

5. Find the setup in 4 that gives you the best hit rage, then change its cache line size to 8, 16, …, 128 and simulate.

6. Discuss what you find in question 3, 4 and 5 in a report and submit it with the zipped codes.

Grading Policy

You shouldn’t expect any partial credits because obviously you need a working cache simulator to find the instruction cache configuration that gives the best hit rate. You should expect the following grades:

1. 100 – Codes can be compiled, executed and give the expected results; report contains detailed implementation steps and discussions; good coding style, data structures and functions are well designed and well commented.

2. 80 – Codes can be compiled, executed and give the expected results; report contains detailed implementation steps and discussions; inefficient data structures or functions.

3. 0
* Any copying from other groups or individuals will receive an F for the final letter grade.

Contact

Project Developer and Grader: Shihao Song

E-mail: ss3695@drexel.edu

Notes: I will not answer any questions regarding basic C programming and debugging; I will not answer any questions regarding basic cache concepts. Please send me your questions through email.