Department of Electronic Engineering
ELE00011H Digital Engineering Coursework Assessment 2021/22 SUMMARY DETAILS
This coursework (Lab Report) contributes 50% of the assessment for this module. Clearly indicate your Exam Number on every separate piece of work submitted.
Submission is via the VLE module submission point. The deadline is 12:00 noon on 25 February 2022, Spring Term, Week 7, Friday. Please try to submit early, as any late submissions will be penalised. Assessment information, including on late penalties, is given in the Statement of Assessment.

ACADEMIC INTEGRITY
It is your responsibility to ensure that you understand and comply with the University’s policy on academic integrity. If this is your first year of study at the University then you also need to complete the mandatory Academic Integrity Tutorial. Further information is available at http://www.york.ac.uk/integrity/.
In particular please note:
● Unless the coursework specifies a group submission, you should assume that all submissions are individual and should therefore be your own work.
● All assessment submissions are subject to the Department’s policy on plagiarism and,
wherever possible, will be checked by the Department using Turnitin software.

UG Students: ELE00011H Digital Engineering
MSc Students: ELE00121M Digital Engineering for MSc
Laboratories: Session 2 Design for performance – Part 1
UG students:
– ALL LABS SHOULD BE DONE IN GROUPS OF TWO STUDENTS
– A mark penalty will apply to single-student submissions unless agreed in advance
– Any issues related to groups (conflicts) should be communicated as soon as possible
MSc students:
– ALL LABS SHOULD BE INDIVIDUAL SUBMISSIONS
Report formatting:
There is no formal report structure – you will be marked on the items listed within each lab script. Most reports will include code printout and simulation screenshots – see Lab 1 appendices for guidelines.
The reports for labs 1 – 4 must be handed in, together in a single zip archive, via the VLE by the deadline indicated on the front page of the script. A single submission should be handed in for each group (by any member of the group).
Each lab report, containing all the material required in the order specified, should be submitted as a separate PDF file. The exam numbers of all members of the group should be printed on the front page of each PDF.
The PDF files must be named Yxxxxxxx-Yxxxxxxx_DE_Lab#.pdf, where the Yxxxxxxx are the exam numbers of the group members (normally two for UG students, one for MSc students) and # is the lab number.
In all cases, read carefully the instructions on the VLE submission page. Failure to follow the instructions could lead to your assignment not being marked and in any case to a mark penalty.
Submission weight on module mark: 12.5%
Mark breakdown [50 marks]:
● General issues (e.g. documentation, layout and comments, structural issues): 10 marks
● Task 2A (e.g. VHDL code, testbench, simulation): 25 marks
● Task 2B (e.g. VHDL code, testbench, simulation): 10 marks
● Task 2C (e.g. VHDL code, testbench, simulation): 5 marks

Task 0: Setup
Create a new project with an appropriate name. Next, download from the module website the “algorithm.vhd” file and import it into the project using the procedure described in lab 1 (remember to add a copy of the source file into your design). Double-click on it to open it in the Xilinx tools.
The VHDL implements, very straightforwardly, a sequence of operations:
O <= (A*3 + B*C)/D + C + 5
The circuit is the following:
[Figure: circuit diagram]
A few notes regarding the coding:
● The use of the IEEE.NUMERIC_STD package (use IEEE.NUMERIC_STD.ALL) is crucial for arithmetic / mathematical operations, but it implies that signals need to be explicitly defined as types SIGNED or UNSIGNED (in this case, we will work with unsigned numbers). Nevertheless, Xilinx still needs to use STD_LOGIC_VECTOR as a default for all top-level entity inputs and outputs.
● Note the use of a generic value for data size. This parameterizable implementation will work for any data size. Note also that there is no provision for overflow. Some of your input vectors might well cause overflow and the result will be incorrect. You should ignore this.
● The size of internal signals is not random! Arithmetic operations defined in the NUMERIC_STD package have precise relationships between the size of the inputs and the size of the outputs. If in any doubt, check the package source itself: http://www.csee.umbc.edu/portal/help/VHDL/numeric_std.vhdl
● Proper circuit design should always include registered inputs and outputs at the top level of your design (i.e. for the signals that will be connected to pins on the FPGA) – unless of course this is impossible because of other constraints. All registers (unless there is a good reason to do otherwise) should be implemented with a synchronous reset. Note also that D is the divisor in one of the operations, and therefore cannot have value 0 and must be reset to a non-zero value (note the “odd” syntax of the test for reset in the I/O register – this is imposed by the simulation setup coupled with the non-zero requirement). Keep this in mind when initialising signals in your testbench!
● Note also the comments in the file. The comments apply to the initial code. Part of your task within the assignment is to make sure that the comments are updated for each modification you make to the circuit.
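The size relationships mentioned in the notes above can be illustrated with a minimal sketch. It is not the code in algorithm.vhd: the entity name size_sketch, the signal names and the default width N = 16 are hypothetical, and the I/O registers are left out so that only the vector sizes are shown. The sizes in the comments follow the NUMERIC_STD definitions of "*", "+" and "/" for UNSIGNED operands.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical entity, for illustration only (not taken from algorithm.vhd).
entity size_sketch is
    generic (N : positive := 16);
    port (
        A, B, C, D : in  STD_LOGIC_VECTOR(N-1 downto 0);
        O          : out STD_LOGIC_VECTOR(N-1 downto 0)
    );
end entity size_sketch;

architecture rtl of size_sketch is
    -- UNSIGNED * NATURAL returns twice the length of the UNSIGNED operand.
    signal prod_a  : UNSIGNED(2*N-1 downto 0);
    -- UNSIGNED * UNSIGNED returns the sum of the two operand lengths.
    signal prod_bc : UNSIGNED(2*N-1 downto 0);
    -- "+" returns the length of the longer operand (no carry bit is added).
    signal sum_ab  : UNSIGNED(2*N-1 downto 0);
    -- "/" returns the length of the dividend (left operand).
    signal quot    : UNSIGNED(2*N-1 downto 0);
    signal result  : UNSIGNED(2*N-1 downto 0);
begin
    prod_a  <= unsigned(A) * 3;
    prod_bc <= unsigned(B) * unsigned(C);
    sum_ab  <= prod_a + prod_bc;
    quot    <= sum_ab / unsigned(D);          -- D must never be zero
    result  <= quot + unsigned(C) + 5;
    -- Only the N least significant bits are kept: no overflow handling.
    O       <= std_logic_vector(result(N-1 downto 0));
end architecture rtl;
```

Declaring any of these signals with a different length makes the assignment fail with a length mismatch, which is a quick way to confirm the rule for a given operator.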
Task A: Timing simulation
Create a new simulation source (testbench). This should be a self-checking testbench and use a record together with assert/report statements (see details below). Set the clock period to 120ns and the initial wait period to 500ns. Remember to re-synchronize to the falling edge of the clock!
Note that this is not exactly the same kind of testbench that you are used to. It is not meant to verify the correct design of the circuit (assume that this was done), but instead to verify its implementation and specifically its timing. What this means is that you do not need to find test patterns that verify the equation, but instead ones that “use” the critical path. Since determining the exact patterns that test the critical path is not straightforward (specific tools exist but are not within the scope of this module), the instructions below will allow you to ensure at least that your patterns use the majority of the logic (and hence are likely to involve paths that are at least close to critical).
Immediately after the re-synchronization, assign initial values to all the inputs (note that input D will have to be set to a non-zero value, so use x"0001" for D and x"0000" for the others), then reset the circuit. Next, inject at least 5 different sets of input values. Choose your test vectors so that every vector inside the circuit – including the inputs and the output – is “large” (i.e. uses some of the most significant bits in the vectors) at least once in the test sequence. To verify this, run a simulation displaying all the internal vectors (INT1-INT5) and try a few different combinations of values.
The inputs should change at every clock cycle. This is of fundamental importance for this lab (and the next) as we need to evaluate the maximum performance of the circuit. Failing to do so will invalidate your testbenches and result in a heavy mark penalty.
Note also that the I/O registers in the design impose a 2-cycle delay before the result for a given set of inputs arrives at the output – this delay should be defined as a constant and the testbench should work for any value of this delay. Keep in mind that the simplest approach in this case is to use two separate processes, one for assigning the inputs and one (delayed by the constant value) to check if the output is correct (see Appendix A). For each set of inputs, an error message should be output to the console if the output is incorrect and a confirmation message should be output if it is correct.
Finally, keep in mind that this is an implementation testbench. Hence, it will be applied to a circuit that has been implemented by the Xilinx tools and is no longer parameterizable. As a consequence, you should not include a generic map for your UUT, as the circuit will no longer have any generics after synthesis.
1.1.1: Print out the self-checking testbench. Make sure to include a short testing strategy in the comments.
Run the behavioural simulation. By default, Xilinx sets a simulation duration of 1us, which might not be sufficient to observe the operation of the circuit. Set the duration to 5us and re-run it. Satisfy yourself that the circuit performs the correct operation (set the display radix of the output to unsigned decimal and note the output values). Familiarize yourself with the interface by observing the values of INT1-INT5.
1.1.2: Print out a screenshot of the behavioural simulation window, zoomed in to display, in readable format, all inputs, the output, and internal signals INT1-INT5 and INTO in unsigned decimal format. Include the console output (as a separate screenshot).
Now synthesize your circuit and open the Synthesis Report. Observe the “RTL Component Statistics” section of the report. Can you readily identify the components you are expecting?
You can also look at the RTL schematic for the circuit by selecting “Open Elaborated Design” in the “RTL Analysis” section of the flow navigator. You should readily recognise the circuit schematic.
Unfortunately, this design has too many inputs to be implemented on the board, but it is still possible to run implementation to let the tools create the circuit for you. The one thing that is absolutely necessary is to specify the clock and its behaviour (after all, we want to analyse the maximum frequency to evaluate the performance!)
Follow the instructions in lab 1 to create an XDC file. Instead of assigning pins, write the line:
create_clock -period 120.000 -name clk -waveform {0.000 60.000} [get_ports clk]
This tells the tools that the clk input is a clock with a 120ns period, with a rising edge at 0ns and a falling edge at 60ns (50% duty cycle - ratio between the time when the clock is 1 and when it is at 0). In this and the next lab, you will modify the clock period several times: make sure to change the fall time in order to maintain (roughly) a 50% duty cycle at all times!
Next, to prevent the tools from “cheating” (we will see this in the next lab), click on “Settings” under the Project Manager entry of the Flow Navigator, select “Synthesis”, then scroll down the options until you see an entry called “-max_dsp” (which should have value -1). Set the value to 0, then click OK (answer “no” if prompted to preserve the previous run).
Click “Run Implementation” in the Flow Navigator (it will probably ask you to re-run synthesis). In the console, select the “Design Runs” tab. The second line is the timing report and the first value will be “WNS(ns)” (approximately 30ns, but there might be fairly significant variations – a few ns either way - due to the random nature of the routing algorithms).
This is the worst negative slack (WNS): essentially, it is the margin you have between the period you asked for (120ns) and what the circuit can run at (which should be somewhere around 120 - 30 = 90ns). The value will be positive, which tells you that your request for a 120ns period (8.333MHz frequency) has been satisfied.
1.1.3: Print out a screenshot of the “Design Runs” tab, showing all columns up to “DSP”. In your own words, and in relation to the lecture material, explain how the WNS is calculated. What is the maximum frequency at which the circuit can run, according to the tools?
Now click again on “Run Simulation”. This time, you will see that more options are available, including post-synthesis and post-implementation simulations. This is because now the tools know the timing of the circuit (back-annotation). Run the “Post-Implementation Timing Simulation”. Note that the same VHDL Test Bench as the behavioural simulation will be applied by default.
As for the behavioural simulation, run for 5us and observe the results. You are likely to see some fluctuations/errors at the start of the simulation (after all, ‘U’ is not really a valid signal in real hardware), but after the reset things should settle down. Can you observe the differences? Verify that the circuit outputs the correct results. Try to observe the internal signals again: can you find INT1-INT5? How about INTO?
1.1.4: Print out a screenshot of the timing simulation window, zoomed in to display, in readable format, all inputs and the outputs, as well as INTO, in unsigned decimal format. Include the console output.
Close the window and change the clock period in your testbench to 50ns. Re-launch the simulation. Can you observe the differences? Is the output still correct (if it is, you really want to change your input vectors)? Why?
1.1.5: Print out a screenshot of the timing simulation window, zoomed in to display, in readable format, all inputs and the outputs, as well as INTO, in unsigned decimal format.
Include the console output.
1.1.6: In your own words, and in relation to the lecture material, explain:
1) Why most internal signals are not available for display in the post-implementation simulation, whereas INTO is.
2) The behaviour of INTO and O in the post-implementation simulation of step 1.1.4, compared to the behavioural simulation of step 1.1.2.
3) The behaviour of INTO and O in the post-implementation simulation of step 1.1.5, compared to the post-implementation simulation of step 1.1.4.

Task B: Tool optimizations
The tools should have reported a WNS of approximately 30ns from the 120ns period. This means that the circuit has a minimum period of approximately 90ns, right?
Open the XDC file and change the period to 80ns. Run the implementation again and check the timing report. Surprised? If all has gone well, the timing requirement has been met, even if you requested a frequency higher than what the first run had told you was possible. Verify this by running the simulation again, with an 80ns period defined in the testbench.
Then try the implementation and simulation again by changing the XDC and the testbench periods to 75ns, then 70ns.
1.2.1: Print out a screenshot of the “Design Runs” tab, showing all columns up to “DSP”, and calculate the maximum frequency for each implementation of the circuit. Using your own words, explain why the WNS changes and what happens when the timing requirements are not met. Relate this to what you see in the timing simulations.
Open the Post-Route Timing Report (see above) and scroll down to the “Timing Details” section again. This time, look at the first few lines. They will look something like this (do not be surprised if there is a difference):
[Figure: first lines of the “Timing Details” section]
In this case, there are 11 paths (you might well have more or fewer) where the clock setup time is violated. What is now the first Path? Is it the same as earlier on? Why do you think it might have changed?
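The arithmetic behind the period and frequency figures quoted in Tasks A and B can be written out as a short worked equation. The numbers are the approximate ones given in this script (a 120ns requested period and a WNS of roughly 30ns); your own runs will report different values. The symbols T_req, T_min and f_max are notation introduced here for illustration.

```latex
\begin{align*}
T_{\min} &\approx T_{\text{req}} - \text{WNS} = 120\,\text{ns} - 30\,\text{ns} = 90\,\text{ns} \\
f_{\max} &= \frac{1}{T_{\min}} \approx \frac{1}{90\,\text{ns}} \approx 11.1\,\text{MHz} \\
\text{WNS} &< 0 \;\Rightarrow\; T_{\min} > T_{\text{req}}
\quad \text{(at least one path violates setup)}
\end{align*}
```

As Task B shows, this estimate only describes the implementation the tools produced for that particular constraint, so a tighter constraint can yield a smaller minimum period.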
1.2.2: In your own words, explain what a setup violation is and how it relates to the critical path.

Task C: Logic optimizations
Open the VHDL file for your circuit and the RTL schematic. Observe the implementation of the algorithm. Is there anything that can be done to reduce the critical path? Note the sequence of operations and in particular the last two sums. Can you think of any ways to rearrange the order of the operations to reduce the critical path? Consider this re-arrangement:
[Figure: re-arranged circuit between INPUT REG and OUTPUT REG]
Modify the circuit (making sure to minimise the vector sizes), then run the implementation again with the 70ns clock constraint in the XDC file, and access the post-route timing report. Check the WNS for the clock. Has it changed with respect to your previous result? What can you conclude?
1.3.1: Print out the modified VHDL code. Add comments within the VHDL to illustrate its operation and your modifications (in particular, their effect on the critical path). Print out a screenshot of the “Design Runs” tab, showing all columns up to “DSP”, and comment on the comparison between the WNS value here and in the pre-modification case.
If the implementation failed, increase the period to 80ns and run it again (it should definitely meet the constraints).
Open the Reports tab at the bottom of the screen and scroll almost all the way down to the “Route Design” section. You should see a “Report timing summary” entry (probably called something like “impl_1_route_report_timing_summary_0”). Note that I will refer to this as the Post-Route Timing Report from here on in the scripts.
Open it and scroll down to the “Timing Details” section. This is a very important section for everything to do with timing. The first few lines provide a bit of additional detail compared to the Design Runs tab you just looked at (we will look at this later). This is followed by a very long list of “Paths”. These are the (detailed!) critical paths for the circuit.
In this instance, the information is important, but not crucial, since the required timing has been met. But when, later on in the script, you start to fail the timing target, this can be used to identify where the problem is.
Of course, the task is not straightforward! Have a look at even a single path. It is quite hard to follow exactly what is happening. However, the first couple of lines are often sufficient to identify at least the area of the circuit that is proving slowest (and therefore most problematic). In this case, the first lines of the first path might look something like the following (the example below is for a 120ns period, so don’t be surprised if you see a difference):
[Figure: first lines of the first path in the Post-Route Timing Report]
Can you make anything out of it? Well, it’s actually not that obscure. Let’s see:
- Slack we have seen (in this case, it is positive as the timing was met)
- Source/destination are the really important pieces of information. They tell you that this path starts from one of the input registers (a D-type FF triggered by the rising edge of the clock) and ends at the output register. Why bit 1 of INTC specifically? That is MUCH harder to figure out, but often it is not necessary. In any case, this outcome is hardly surprising as the circuit is (for the moment) one massive combinational circuit between the I/O registers, so the critical path obviously lies there.
- Ignoring some of the intermediate information, the Data Path Delay tells us that almost half of the delay is in the routing, and not in the logic. Nothing much we can do about it, but it’s interesting.
Here and in the next lab, I will ask you to print a screenshot of the first max delay path several times. The screenshot above shows the information that will be required. Please do NOT print all the details that follow! You are welcome to have a look, by the way: it lists every single slice and propagation delay on the path, and then goes on with several more paths that are very close to being critical.
1.3.2: Print out (use a screenshot) the first few lines of the first Path in the Post-Route Timing Report (see above for details).
Save the current version of the VHDL file as it will be the basis for the following laboratory.
The report required for this lab will consist of the answers / printouts required in the numbered items above.

APPENDIX A: Two-process testbenches
When designing a self-checking testbench for complex circuits, particularly those where a latency is present between the inputs and the outputs, a useful technique relies on the use of two separate processes: one that defines the inputs and a second one that checks the output.
The code below is an example of the use of double-process TBs (clearly, you are expected to have better comments and use much more complete reporting!) The example uses a record, which is not required for this lab (but would be a nice touch, as it fits quite nicely).
ARCHITECTURE behavior OF
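The two-process structure described in this appendix can be sketched as follows. This is a minimal illustration written for this purpose, not the original appendix listing. It assumes a UUT entity called algorithm with ports clk, rst, A, B, C, D and O, 16-bit data and an active-high synchronous reset; all of these names are assumptions and must be matched to the actual design. The two test vectors are placeholders that only check the arithmetic; they are not chosen to exercise the critical path.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity algorithm_tb is
end entity algorithm_tb;

architecture behavior of algorithm_tb is
    constant CLK_PERIOD : time     := 120 ns;
    constant LATENCY    : positive := 2;  -- cycles added by the I/O registers

    -- One test vector: four inputs plus the expected output.
    type test_vector is record
        a, b, c, d : std_logic_vector(15 downto 0);
        expected   : std_logic_vector(15 downto 0);
    end record;
    type test_vector_array is array (natural range <>) of test_vector;

    -- Placeholder vectors (hypothetical values).
    constant TESTS : test_vector_array := (
        (x"0001", x"0002", x"0003", x"0001", x"0011"),  -- (1*3 + 2*3)/1 + 3 + 5 = 17
        (x"0004", x"0005", x"0006", x"0002", x"0020")   -- (4*3 + 5*6)/2 + 6 + 5 = 32
    );

    signal clk, rst               : std_logic := '0';
    signal a_in, b_in, c_in, d_in : std_logic_vector(15 downto 0);
    signal o_out                  : std_logic_vector(15 downto 0);
begin
    clk <= not clk after CLK_PERIOD / 2;

    -- No generic map: the implemented netlist has no generics.
    -- Entity and port names are assumptions.
    uut : entity work.algorithm
        port map (clk => clk, rst => rst,
                  A => a_in, B => b_in, C => c_in, D => d_in, O => o_out);

    -- Process 1: drives the inputs, one new vector per clock cycle.
    stimulus : process
    begin
        wait for 500 ns;
        wait until falling_edge(clk);          -- re-synchronize
        a_in <= x"0000"; b_in <= x"0000"; c_in <= x"0000";
        d_in <= x"0001";                       -- divisor must be non-zero
        rst  <= '1';
        wait for CLK_PERIOD;
        rst  <= '0';
        for i in TESTS'range loop
            a_in <= TESTS(i).a; b_in <= TESTS(i).b;
            c_in <= TESTS(i).c; d_in <= TESTS(i).d;
            wait for CLK_PERIOD;
        end loop;
        wait;
    end process stimulus;

    -- Process 2: same start-up as process 1, then delayed by LATENCY cycles.
    checker : process
    begin
        wait for 500 ns;
        wait until falling_edge(clk);
        wait for CLK_PERIOD;                   -- reset cycle
        wait for LATENCY * CLK_PERIOD;
        for i in TESTS'range loop
            assert o_out = TESTS(i).expected
                report "Vector " & integer'image(i) & ": output incorrect"
                severity error;
            if o_out = TESTS(i).expected then
                report "Vector " & integer'image(i) & ": output correct"
                    severity note;
            end if;
            wait for CLK_PERIOD;
        end loop;
        wait;
    end process checker;
end architecture behavior;
```

Because the checker applies the same start-up waits as the stimulus process and then waits LATENCY further cycles, changing the constant is the only edit needed if the number of register stages changes.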