Compiler Construction/Spring 2022 Homework 1
1 Compiler Architecture
Question 1.1 (3 points). Compilers typically separate the frontend (lexing, parsing, semantic analysis, IR generation) and the backend (code generation) to enable reuse and composition. Find a real-world example, where one frontend targets multiple backends. Find a real-world example, where multiple frontends target the same backend.
Answer. GCC provides an example of both: all GCC languages have a separate frontend (https: //gcc.gnu.org/onlinedocs/gccint/Front-End.html), all GCC architectures have a separate backend (https://gcc.gnu.org/onlinedocs/gccint/Back-End.html), and both parts are fully composable.
Copyright By PowCoder代写 加微信 powcoder
Question 1.2 (3 points). Many compilers have multiple optimizers, at different stages in the pipeline. Why is that useful? Find a real-world example of optimization at different stages.
Answer. Early optimization can take advantage of source-language knowledge that may be lost later on, whereas late optimization can take advantage of target-language knowledge. Put more generally, different stages of compilation expose different optimization opportunities.
GCC continues to be a good example: https://gcc.gnu.org/onlinedocs/gccint/Passes.html shows the passes GCC performs, including the optimization ones. Note how, even before knowing the target architecture, different optimizations are applied at the tree (https://gcc.gnu.org/onlinedo cs/gccint/Tree-SSA-passes.html) and intermediate language (https://gcc.gnu.org/onlinedo cs/gccint/RTL-passes.html) levels.
Question 1.3 (3 points). Many compilers use two main intermediate representations – usually trees and some intermediate language. Some compilers go further and introduce more levels of IR. Why is that useful? Find a real-world example of an extra IR (in addition to trees and some intermediate language).
Answer. Just like different stages of compilation present different optimization opportunities, so do different IRs. For example:
• GCC uses GENERIC (trees), GIMPLE (high-level IL), and RTL (low-level IL).
• Rust uses HIR and MIR IRs: https://blog.rust-lang.org/2016/04/19/MIR.html, which then produce LLVM IR (low-level IL).
Adapted from CSCI-GA.2130-001 assignments by and . Page 1 of 4
Question 1.4 (3 points). Some compilers translate a high-level language to another high-level lan- guage rather than assembly, virtual machine code, or machine code. What specific language is a com- mon target for such compilers and why?
Answer. There are two main candidates:
JavaScript is a common target thanks to its ubiquity on the web. Examples of X-to-JavaScript compilers include TypeScript and Emscripten. (A trend exists towards replacing JavaScript, and especially asm.js, as a compilation target: https://en.wikipedia.org/wiki/WebAssembly.)
Another common target is C. C is available on numerous platforms and has highly developed toolchains, allowing an X-to-C compiler to benefit from these advantages, as well as emit code in a universally understood language.
Compiler Phases
These questions are based on Figure 1.7 on page 7 of the Dragon Book. Your task is to determine the output of each of the compiler phases for the following source code statement:
celsius = (fahrenheit – 32) * (5/9).
Question 2.1 (3 points). What sequence of tokens does the lexical analyzer output?
with symbol table: 1:celsius 2:fahrenheit
Question 2.2 (3 points). What abstract syntax tree does the syntax analyzer output? Answer.
Question 2.3 (3 points). What is the abstract syntax tree after the semantic analyzer modifies it?
Adapted from CSCI-GA.2130-001 assignments by and . Page 2 of 4
Answer. Note that if 5 and 9 in our example are integer literals, many programming languages would evaluate 5/9 to 0, ultimately reducing the source statement to celsius = 0 for integer fahrenheit, or celsius = isNaN(fahrenheit) ? NaN : 0.0 for a floating-point one.
Having pointed that out, let us assume that celsius and fahrenheit have been declared as floating- point numbers, and that arithmetic operations in our language always convert their operands to floating-point numbers. Semantic analyzer output then would be:
Question 2.4 (3 points). What sequence of three-address instructions does the IR generator output? Answer.
t1 = int2float(32)
t2 = id2 – t1
t3 = int2float(5)
t4 = int2float(9)
t5 = t3 / t4
t6 = t2 * t5
Question 2.5 (3 points). What sequence of three-address instructions does the optimizer output? Answer.
t2 = id2 – 32.0
id1 = t2 * 0.555555556
Question 2.6 (3 points). What sequence of machine instructions does the code generator output? Use the same assembly language as the Dragon Book or improvise your own.
Adapted from CSCI-GA.2130-001 assignments by and . Page 3 of 4
LDF R1, id2
SUBF R1, R1, #32.0
MULF R1, R1, #0.5555555555555556
STF id1, R1
Adapted from CSCI-GA.2130-001 assignments by and . Page 4 of 4
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com