Department of Electronic Engineering
ELE00011H Digital Engineering Coursework Assessment 2021/22 SUMMARY DETAILS
This coursework (Lab Report) contributes 50% of the assessment for this module. Clearly indicate your Exam Number on every separate piece of work submitted.
Submission is via the VLE module submission point. The deadline is 12:00 noon on 25 February 2022, Spring Term, Week 7, Friday. Please try to submit early, as any late submission will be penalised. Assessment information, including details of late penalties, is given in the Statement of Assessment.

ACADEMIC INTEGRITY
It is your responsibility to ensure that you understand and comply with the University’s policy on academic integrity. If this is your first year of study at the University then you also need to complete the mandatory Academic Integrity Tutorial. Further information is available at http://www.york.ac.uk/integrity/.
In particular please note:
● Unless the coursework specifies a group submission, you should assume that all submissions are individual and should therefore be your own work.
● All assessment submissions are subject to the Department’s policy on plagiarism and, wherever possible, will be checked by the Department using Turnitin software.

UG Students: ELE00011H Digital Engineering
MSc Students: ELE00121M Digital Engineering for MSc
Laboratories: Session 3 Design for performance – Part 2
UG Students:
– ALL LABS SHOULD BE DONE IN GROUPS OF TWO STUDENTS
– A mark penalty will apply to single-student submissions unless agreed in advance
– Any issues related to groups (conflicts) should be communicated as soon as possible
MSc Students:
– ALL LABS SHOULD BE INDIVIDUAL SUBMISSIONS
Report formatting:
There is no formal report structure – you will be marked on the items listed within each lab script. Most reports will include code printout and simulation screenshots – see Lab 1 appendices for guidelines.
The reports for labs 1 – 4 must be handed in, together in a single zip archive, via the VLE by the deadline indicated on the front page of the script. A single submission should be handed in for each group (by any member of the group).
Each lab report, containing all the material required in the order specified, should be submitted as a separate PDF file. The exam numbers of all members of the group should be printed on the front page of each PDF.
The PDF files must be named Yxxxxxxx-Yxxxxxxx_DE_Lab#.pdf, where the Yxxxxxxx are the exam numbers of the group members (normally two for UG students, one for MSc students) and # is the lab number.
In all cases, read carefully the instructions on the VLE submission page. Failure to follow the instructions could lead to your assignment not being marked and in any case to a mark penalty.
Submission weight on module mark: 10%
Mark breakdown [40 marks]:
● General issues (e.g. documentation, layout and comments, structural issues): 10 marks
● Task 3A (e.g. VHDL code, testbench, simulation): 15 marks
● Task 3B (e.g. VHDL code, testbench, simulation): 15 marks

Task A: Pipelining
The circuit you implemented at the end of the previous lab should be the following:
[Figure 1.1: block diagram of the circuit, from INPUT REG to OUTPUT REG]
Create a new project, importing the files from the previous lab, and make sure to change the Synthesis setting “-max_dsp” to 0. Set the clock period of your VHDL test bench to 80ns and run the behavioural simulation.
2.1.1: Print out a screenshot of the simulation window, zoomed in to display, in readable format, all inputs, internal data busses, and the output in unsigned decimal format for the same testbench used in lab 1. Include the console output (as a separate screenshot).
Set an 80ns constraint on the clock in the XDC file (remember to change the fall time so that you maintain a 50% duty cycle). Implement the design.
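As a reminder of what such a constraint can look like, here is a minimal sketch of an XDC clock definition. It assumes that the top-level clock port is called clk; the clock name sys_clk is purely illustrative. Adapt both to your own design.

```tcl
# 80 ns period on the (assumed) top-level port "clk".
# The -waveform list gives the rising and falling edge times in ns:
# a falling edge at 40 ns keeps the duty cycle at 50%.
create_clock -period 80.000 -name sys_clk -waveform {0.000 40.000} [get_ports clk]
```

Whenever you change the period, change the second waveform value to half of the new period.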
IMPORTANT: in this laboratory, you will be asked to do repeated implementation runs on several occasions. Particularly in the later stages, this is likely to require considerable time (especially if you run the project from the M drive). To avoid this, and massively reduce the time required for each run, follow this procedure:
– Click on “Settings” under “Project Manager” in the Flow Navigator pane.
– Select “Implementation” on the left-hand side.
– In the options, scroll down to “Place Design” and select “Quick” in the “-directive” pull-down menu.
– Scroll further to “Route Design” and select “Quick” in the “-directive” pull-down menu.
– Answer “No” if prompted to preserve previous runs.
Remember that this procedure will generate circuits of lesser quality with respect to what you would obtain otherwise. It is simply a “trick” to shorten the time required for you to complete the lab and should be avoided in any sort of “real world” setting!
Note also that this option will introduce a much greater variability in the timing results. Throughout this script, this implies that some of the suggested timings might not match your actual results. Do not worry overmuch (if something looks really odd, ask a demonstrator!) and simply report the values you obtain.
At this stage, you will probably find that using the “quick” option has affected the timing of the implementation, and that the circuit will now be slower than the one found at the end of the previous lab. Your first task is to find the new “baseline” performance. Lower or raise the target clock period (in steps of 1ns) until you find the largest value for which the constraint is not met (you can use the slack value to “skip” steps – i.e. if you see a slack of more than ±5ns, you can “jump” a few ns). This value illustrates the very best that the tools (in “quick” mode) are able to do, given the current design.
2.1.2: Write the best period where the constraints are met (i.e. the one just before it starts to fail) and print out a screenshot of the “Design Runs” tab, showing all columns up to “DSP” (you might need to click on “Open implemented design” to see the “Design runs” tab). What is the highest frequency at which your design should be able to run according to the WNS results?
2.1.3: Print out (use a screenshot) the first few lines of the first Path in the Post-Route Timing Report (see previous script for details).
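For the frequency questions, the usual estimate combines the target clock period with the reported WNS (Worst Negative Slack). The numbers below are purely illustrative and are not expected results for this design.

```latex
\begin{align*}
f_{\max} &= \frac{1}{T_{\text{target}} - \text{WNS}} \\[4pt]
\text{e.g. } T_{\text{target}} = 80\,\text{ns},\ \text{WNS} = 2.5\,\text{ns}
  &\;\Rightarrow\; f_{\max} = \frac{1}{77.5\,\text{ns}} \approx 12.9\,\text{MHz} \\[4pt]
\text{e.g. } T_{\text{target}} = 10\,\text{ns},\ \text{WNS} = -0.4\,\text{ns}
  &\;\Rightarrow\; f_{\max} = \frac{1}{10.4\,\text{ns}} \approx 96.2\,\text{MHz}
\end{align*}
```

A positive WNS means the circuit could run faster than the target; a negative WNS means the constraint is not met and the achievable period is longer than the target.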

The objective of this task is to further accelerate the algorithm by introducing pipeline stages. The stages will be implemented as banks of D-type registers with synchronous reset. The most effective way to implement the registers in VHDL is to introduce additional processes similar to the input and output registers already present in the algorithm.vhd entity (an alternative would be to use a parameterizable register component, but this implies a lot more typing). Remember that the registers needed to pipeline the D vector (the divisor in the division operator) cannot be set to 0 without causing a simulation error.
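The following is a minimal sketch of one such register bank, written as a stand-alone entity so that it compiles on its own. The entity name, port names, data width and active-high reset are all assumptions; in the lab, the equivalent process would normally sit inside algorithm.vhd next to the existing input and output registers.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical stand-alone pipeline stage (all names are illustrative).
entity pipe_stage is
    generic (
        DATA_SIZE : positive := 16  -- assumed data width
    );
    port (
        clk   : in  std_logic;
        rst   : in  std_logic;  -- synchronous reset, assumed active high
        a_in  : in  unsigned(DATA_SIZE - 1 downto 0);  -- ordinary data path
        d_in  : in  unsigned(DATA_SIZE - 1 downto 0);  -- divisor path
        a_out : out unsigned(DATA_SIZE - 1 downto 0);
        d_out : out unsigned(DATA_SIZE - 1 downto 0)
    );
end entity pipe_stage;

architecture rtl of pipe_stage is
begin
    -- Bank of D-type registers with synchronous reset.
    pipe_reg : process (clk)
    begin
        if rising_edge(clk) then
            if rst = '1' then
                a_out <= (others => '0');
                -- The divisor is reset to 1 instead of 0, so that the
                -- divider never sees a division by zero in simulation.
                d_out <= to_unsigned(1, DATA_SIZE);
            else
                a_out <= a_in;
                d_out <= d_in;
            end if;
        end if;
    end process pipe_reg;
end architecture rtl;
```

Every signal that crosses the cut must pass through a register like these; a signal that bypasses the stage arrives one clock cycle earlier than the data it belongs to.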
It is highly recommended to create a separate project for each of the following steps, importing the files from the previous step. Remember to change the Synthesis and Implementation settings!
To start the creation of a pipeline for the circuit, insert a pipeline stage as shown below:
[Figure 1.2: circuit with INPUT REG, one PIPELINE stage, and OUTPUT REG]
This is the most obvious place for a pipeline stage, as it separates the multipliers and the divider, the most complex operators in the design. Make sure that your pipeline includes all signals that traverse the cutset.
Of course, you need to verify that you implemented the pipeline stage correctly. Re-run the behavioural simulation using the same test-bench as above and confirm that the outputs correspond to the expected values. Note that there will be one additional clock cycle delay between the moment your inputs arrive and the corresponding outputs are produced. If you used (as suggested) a constant for the delay between inputs and outputs, you should only need to modify a single value.
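The idea of a single latency constant can be sketched with the self-contained test bench below. The design under test is replaced here by a simple delay line, purely so that the example runs on its own; the names, the 8-bit width, the test value and the latency of 3 are all assumptions and are not taken from the lab design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity latency_tb is
end entity latency_tb;

architecture sim of latency_tb is
    constant CLK_PERIOD : time     := 80 ns;
    -- Clock cycles between an input and its output: the only value to
    -- change when a pipeline stage is added (hypothetical value).
    constant LATENCY    : positive := 3;

    type delay_line_t is array (1 to LATENCY) of unsigned(7 downto 0);

    signal clk  : std_logic := '0';
    signal done : boolean   := false;
    signal din  : unsigned(7 downto 0) := (others => '0');
    signal dly  : delay_line_t := (others => (others => '0'));
    signal dout : unsigned(7 downto 0);
begin
    -- Free-running clock that stops once the stimulus has finished.
    clk_gen : process
    begin
        while not done loop
            clk <= '0';
            wait for CLK_PERIOD / 2;
            clk <= '1';
            wait for CLK_PERIOD / 2;
        end loop;
        wait;
    end process clk_gen;

    -- Stand-in for the real design: LATENCY registers in series.
    dut_model : process (clk)
    begin
        if rising_edge(clk) then
            dly(1) <= din;
            for i in 2 to LATENCY loop
                dly(i) <= dly(i - 1);
            end loop;
        end if;
    end process dut_model;
    dout <= dly(LATENCY);

    -- Apply one value, wait LATENCY clock edges, then check the output.
    stimulus : process
    begin
        din <= to_unsigned(42, 8);
        for i in 1 to LATENCY loop
            wait until rising_edge(clk);
        end loop;
        wait for CLK_PERIOD / 4;  -- let the registered output settle
        assert dout = 42
            report "output did not appear after LATENCY clock cycles"
            severity error;
        done <= true;
        wait;
    end process stimulus;
end architecture sim;
```

With this structure, adding a pipeline stage to the real design only requires incrementing LATENCY; the stimulus and the checks stay unchanged.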
2.1.4: Print out a screenshot of the simulation window, zoomed in to display, in readable format, all inputs, internal data busses, and the output in unsigned decimal format for the same sets of input (and output) values used in the preceding simulations. Include the console output (as a separate screenshot).
Set the clock period to the largest value that failed in the attempts above and run the implementation again. Go again through the process of finding the largest integer value for the target clock period for which the constraint is not met (note that again you can use the Timing Report output to speed up the process by jumping to known valid results).
2.1.5: Write the best period where the constraints are met (i.e. the one just before it starts to fail) and print out a screenshot of the “Design Runs” tab, showing all columns up to “DSP”. What is the highest frequency at which your design should be able to run according to the WNS results?
By introducing the first pipeline stage, you should have seen an improvement in the performance, but maybe not as much as you might have hoped. Let us try to improve this.
First, let us have a look at the Post-Route Timing report.
2.1.6: Print out (use a screenshot) the first few lines of the first Path in the Post-Route Timing Report (see previous script for details). Can you identify where the new critical path lies, with respect to the pipeline above (i.e. in which pipeline stage) and which of the mathematical operation(s) in the algorithm are in the critical path? Based on the lecture material, what are you expecting to happen to the critical path, with respect to the circuit without this pipeline stage, and why? Does the practice match the theory (compare the new clock period to the one you obtained in step 2.1.2)? Provide answers to each of these questions.

Now insert a second pipeline stage as shown below in Figure 1.3. Again, considering that the division is obviously the slowest operation, it makes sense to “bracket” it inside a single pipeline stage. Repeat the procedure above for the new circuit (verify its operation and determine its performance).
[Figure 1.3: circuit with INPUT REG, two PIPELINE stages, and OUTPUT REG]
2.1.7: Print out a screenshot of the simulation window, zoomed in to display, in readable format, all inputs, internal data busses, and the output in unsigned decimal format for the same sets of input (and output) values used in the preceding simulations. Include the console output (as a separate screenshot).
2.1.8: Write the best period where the constraints are met (i.e. the one just before it starts to fail) and print out a screenshot of the “Design Runs” tab, showing all columns up to “DSP”. What is the highest frequency at which your design should be able to run according to the WNS results?
2.1.9: Print out (use a screenshot) the first few lines of the first Path in the Post-Route Timing Report (see previous script for details). Can you identify where the new critical path lies, with respect to the pipeline above (i.e. in which pipeline stage) and which of the mathematical operation(s) in the algorithm are in the critical path? Based on the lecture material, what are you expecting to happen to the critical path, with respect to the circuit without this pipeline stage, and why? Does the practice match the theory (compare the new clock period to the one you obtained in step 2.1.5)? Provide answers to each of these questions.
To experiment further with pipelines, try adding a third pipeline stage as below:
[Figure 1.4: pipelined circuit with a third pipeline stage between INPUT REG and OUTPUT REG]
Repeat the procedure above for the new circuit (verify its operation and determine its performance). Note the warning messages during synthesis. Can you figure out what they mean?

2.1.10: Print out a screenshot of the simulation window, zoomed in to display, in readable format, all inputs, internal data busses, and the output in unsigned decimal format for the same sets of input (and output) values used in the preceding simulations. Include the console output (as a separate screenshot).
2.1.11: Write the best period where the constraints are met (i.e. the one just before it starts to fail) and print out a screenshot of the “Design Runs” tab, showing all columns up to “DSP”. What is the highest frequency at which your design should be able to run according to the WNS results?
2.1.12: Print out (use a screenshot) the first few lines of the first Path in the Post-Route Timing Report (see previous script for details). Can you identify where the new critical path lies, with respect to the pipeline above (i.e. in which pipeline stage) and which of the mathematical operation(s) in the algorithm are in the critical path? Based on the lecture material, what are you expecting to happen to the critical path, with respect to the circuit without this pipeline stage, and why? Does the practice match the theory (compare the new clock period to the one you obtained in step 2.1.8)? Provide answers to each of these questions.

Task B: IP Components
If we want to substantially improve the performance of our circuit, we need to look at ways to improve the speed of the single operations. The problem here is that the slowest component in our design is the division (we have seen with the introduction of the third pipeline stage that it is useless to improve other parts of the design until the divider is accelerated), and Xilinx does not currently provide a divider IP that you just plug into your circuit.
However, a “locked” version of a divider IP is available on the website. Download the zip file and save the directory somewhere on your computer. Next, go to “Add Sources”, select “Add Design Sources”, then “Add Directory”. Select the directory you have just imported and add it to your project.
Remove the last pipeline stage from your design (in other words, go back to the design illustrated by Figure 1.3).
For your information, the IP interface used to generate the divider was the following:
Note in particular the latency parameter: this indicates how many internal pipeline stages are present in the implementation of the divider (in this case, 34). This is not a major issue with the operation of the circuit (a bit inconvenient for simulation, but not too much so). On the other hand, the internal pipeline stages will have to be matched by as many external stages to ensure that the data arrives at the adders in the correct order.
[Figure 2.1: circuit with INPUT REG, two PIPELINE stages, a PIPELINE ARRAY, and OUTPUT REG]
This is a large number of registers! Luckily, VHDL provides for-generate loops. Define a for-generate loop of data_size shift registers as wide as the pipeline is deep (in this case, 34 bits) to create a deep pipeline to synchronize the data. The depth of this pipeline array depends on the latency of the divider, and should be defined as a constant in your code.
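One possible shape for such a pipeline array is sketched below as a stand-alone entity. The entity, port and signal names, the default data width and the active-high synchronous reset are assumptions; only the depth of 34 comes from the divider latency quoted above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical delay array: one shift register per data bit.
entity pipe_array is
    generic (
        DATA_SIZE : positive := 16  -- assumed data width
    );
    port (
        clk  : in  std_logic;
        rst  : in  std_logic;  -- synchronous reset, assumed active high
        din  : in  std_logic_vector(DATA_SIZE - 1 downto 0);
        dout : out std_logic_vector(DATA_SIZE - 1 downto 0)
    );
end entity pipe_array;

architecture rtl of pipe_array is
    -- Depth of the array; must match the latency of the divider.
    constant LATENCY : positive := 34;

    -- DATA_SIZE shift registers, each LATENCY bits long.
    type shift_array_t is array (0 to DATA_SIZE - 1)
        of std_logic_vector(LATENCY - 1 downto 0);
    signal sr : shift_array_t;
begin
    gen_bits : for i in 0 to DATA_SIZE - 1 generate
        -- Each iteration creates one shift register for bit i of the data.
        shift : process (clk)
        begin
            if rising_edge(clk) then
                if rst = '1' then
                    sr(i) <= (others => '0');
                else
                    -- New bit enters at index 0, oldest bit leaves at the top.
                    sr(i) <= sr(i)(LATENCY - 2 downto 0) & din(i);
                end if;
            end if;
        end process shift;

        -- Bit i of the output is bit i of the input, LATENCY cycles later.
        dout(i) <= sr(i)(LATENCY - 1);
    end generate gen_bits;
end architecture rtl;
```

Because the depth appears only in the LATENCY constant, a divider with a different latency needs a change in one place only.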

Add the divider as a component in your VHDL design. To do so, you will need to add two pieces of code. First, a component declaration, which goes together with the internal signal declarations (between “architecture” and “begin”):
component divider
    port (
        clk: in std_logic;
        sclr: in std_logic;
        rfd: out std_logic;
        dividend: in std_logic_vector(31 downto 0);
        divisor: in std_logic_vector(15 downto 0);
        quotient: out std_logic_vector(31 downto 0);
        fractional: out std_logic_vector(15 downto 0)
    );
end component;
Then, a component instantiation (where you will have to replace “pipe1_3”, resp. “pipe1_D”, with your own labels for the 32-bit dividend, resp. 16-bit divisor):
my_divider : divider
    port map (
        clk => clk,
        sclr => rst,
        dividend => std_logic_vector(pipe1_3),
        divisor => std_logic_vector(pipe1_D),
        quotient => quotient
        -- the rfd and fractional outputs are left unconnected here
    );
Design the complete circuit and verify it with behavioural simulation (it should give the same outputs as the previous versions, only much later). Note that the divider will probably output one or more illegal values before the correct latency has been achieved. This is normal.
Note that, at some point, your timing reports might start mentioning failures in WHS (Worst Hold Slack) and/or THS (Total Hold Slack) somewhere in the divider. These seem to be caused by the use of the “quick” options in the Implementation settings and can be safely ignored for the purposes of this lab.
Run the implementation and have a look at the WNS! Repeat the procedure above to find the first integer period where the implementation fails. You might want to start around the 10ns mark…
2.2.1: Print out a screenshot of the simulation window, zoomed in to display, in readable format, all inputs and the output in unsigned decimal format for the same sets of input (and output) values used in the preceding simulations. You will need two separate screenshots because of the depth of the pipeline (do not try to fit inputs and outputs in one screenshot!). Include the console output (as a separate screenshot).
2.2.2: Write the best period where the constraints are met (i.e. the one just before it starts to fail) and print out a screenshot of the “Design Runs” tab, showing all columns up to “DSP” (note the massive increase in FFs!). What is the highest frequency at which your design should be able to run according to the WNS results?
2.2.3: Print out (use a screenshot) the first few lines of the first Path in the Post-Route Timing Report (see previous script for details). Can you identify where the new critical path lies, with respect to the pipeline above (i.e. in which pipeline stage) and which of the mathematical operation(s) in the algorithm are in the critical path?
Let us check what happens now if we re-introduce the third pipeline stage. Repeat the procedure used above to find the new period where the implementation fails.

2.2.4: Print out a screenshot of the simulation window, zoomed in to display, in readable format, all inputs and the output in unsigned decimal format for the same sets of input (and output) values used in the preceding simulations. Again, you will need two separate screenshots because of the pipeline. Include the console output (as a separate screenshot).
2.2.5: Write the best period where the constraints are met (i.e. the one just before it starts to fail) and print out a screenshot of the “Design Runs” tab, showing all columns up to “DSP”. What is the highest frequency at which your design should be able to run according to the WNS results?
2.2.6: Print out (use a screenshot) the first few lines of the first Path in the Post-Route Timing Report (see previous script for details). Can you identify where the new critical path lies, with respect to the pipeline above (i.e. in which pipeline stage) and which of the mathematical operation(s) in the algorithm are in the critical path? Based on the lecture material, what are you expecting to happen to the critical path, with respect to the circuit without this pipeline stage, and why? Does the practice match the theory (compare the new clock period to the one you obtained in step 2.2.2)? Provide answers to each of these questions.
What does the post route timing report say about the critical path now (you might have to reload the design – see the yellow line at the top of the window – check that the slack time matches your new value)? Probably, the timing will be violated by the multiplier. Can we improve it yet more? One way to do it would be to pipeline the multiplier, in the same way as we pipelined the divider, but remember that the Xilinx devices contain very fast embedded multipliers, so there might be an even better (and simpler) way!
Open again the settings of the Project Manager. Under “Synthesis”, find again the “-max_dsp” entry, and change i
