Compilation, Interpretation &
Overview of Java Virtual Machine
Slide #2 of 37
Lecture Objective
To introduce the basic concepts of compilation,
interpretation and Java Virtual Machine.
Slide #3 of 37
Lecture Outline
Levels of Programming Languages
High Level to Low Level Translation
High Level Program Execution
Compilation vs. Interpretation
Combined Compilation & Interpretation
Compilation and Execution on Virtual Machines
Slide #4 of 37
Levels of Programming Languages
High level languages
e.g. Java, C/C++/C#, Fortran, Cobol, Pascal, etc.
Easier for humans
Low level languages
Machine code – instructions stored in memory (opcodes)
Hard to read and write by humans
Next level up: Assembly code
Can be written or read by humans (using mnemonics)
Watch on Youtube:
Most Popular Programming Languages 1965 – 2019
Slide #5 of 37
Levels of Programming Languages
Slide #6 of 37
Converting High Level to Low Level
To execute on a computer we must have machine code!
Assembly code is translated to machine code to run
Assembler does this (e.g. works out the relative addresses for
jumps etc.), producing relocatable code
Linker: combines the separately assembled parts into a whole
Loader: loads into memory at a given location
Slide #7 of 37
Executing High Level Programs
A program written in a high level language can be
run in two different ways:
Compiled into a program in the native machine language
and then run on the target machine
Directly interpreted and the execution is simulated within an
interpreter
Which approach is more efficient?
Think of C++ vs. JavaScript
Slide #8 of 37
Compilation
Compiler: converts source code (text of a program)
into object code – e.g. machine code – that does the
same thing as the original program
Usually object code is relocatable, so can be later
linked and loaded into memory.
Advantages:
Done once for each program
With clever tricks to optimize object code (by exploiting
hardware features) so that it will run fast
Disadvantages:
Harder than interpreting
Hardware dependent, i.e. the object code cannot run on different platforms
Slide #9 of 37
Compilation
Compiler runs on the same platform X as the target
code
Slide #10 of 37
Cross Compilation
Compiler runs on platform X, target code runs on
platform Y
Slide #11 of 37
Compilation is a Compute Intensive process!
https://xkcd.com/303/
Slide #12 of 37
Interpretation
Interpreter = another program that follows the source
code (text of program) and does appropriate actions
Same principle as:
Humans running through instructions of a program
A processor (CPU) can be viewed as a hardware
implementation of an interpreter for machine code
Advantages:
Facilitates interactive debugging & testing
User can modify the values of variables; can invoke
procedures from the command line
Disadvantages:
Slow Execution (as compared to compilation)
Slide #13 of 37
Interpretation
Running high-level code by an interpreter
Watch on Youtube:
Compiled vs. Interpreted Languages
Slide #14 of 37
Research Example – Simulation Techniques
Full article: https://ieeexplore.ieee.org/document/5620924
Slide #15 of 37
Combined Compilation & Interpretation
Executing high level programs
Compile to an intermediate level (between high and
low) language that can be efficiently interpreted
Slower than pure compilation
Faster than pure interpretation
A single compiler, independent of CPU
Separate task for each CPU is to interpret the
intermediate language
Slide #16 of 37
Example: Java
Source code (.java files) is compiled by javac into Java bytecode (.class files)
The command “java” calls the Java Runtime Environment (JRE),
which runs the bytecode using the Java Virtual Machine (JVM)
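As a concrete illustration of this pipeline, here is a minimal sketch. The class name Hello and its message are made up for illustration. Saved as Hello.java, the command "javac Hello.java" produces the bytecode file Hello.class, and "java Hello" starts the JVM to run it.

```java
// Hello.java (hypothetical file name, for illustration only)
// Compile to bytecode:  javac Hello.java   -> produces Hello.class
// Run on the JVM:       java Hello
class Hello {
    // Returns the greeting so it can be checked without capturing output.
    static String greeting() {
        return "Hello from the JVM";
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```

The same Hello.class file can be copied to any platform that has a JVM and run there without recompiling.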
Slide #17 of 37
Combined Compilation & Interpretation
Slide #18 of 37
Virtual Machines
A virtual machine executes an instruction stream in
software (instead of hardware)
Adopted by Pascal, Java, Smalltalk-80, C#, functional
and logic languages, and some scripting languages
Pascal compilers generate P-code that can be interpreted
or compiled into object code
(https://en.wikipedia.org/wiki/P-code_machine)
Java compilers generate bytecode that is interpreted by
the Java Virtual Machine (JVM)
The JVM may translate bytecode into machine code by
Just-In-Time (JIT) compilation
Slide #19 of 37
Compilation and Execution on Virtual Machines
Compiler generates intermediate program (language)
Virtual machine interprets the intermediate program
We need to have virtual machine on each platform
Slide #20 of 37
Java Virtual Machine (JVM)
Introduction
Watch on Youtube:
What is Java Virtual Machine?
Slide #21 of 37
Lecture Outline
Java Concept and Portability
The JVM Architecture
Stack Machines & Expression Evaluation
IJVM & IJVM Instruction Set / Groups
Compiling Java to IJVM
JVM Instruction Summary
Interpreting JVM & Just In Time (JIT) Compilation
Slide #22 of 37
The Java Concept
Before Java … [Bell Labs]
C and C++ (object-oriented C) were used for systems
programming
WWW has evolved very fast (Animated History)
How to load and run a program over WWW?
different target machines, word length, instruction sets
Security is another issue!
Java [mid-1990s, Sun Microsystems]
language based on C++
has a virtual machine, hence portable
can be downloaded over WWW and executed remotely (using
the applets)
Slide #23 of 37
Portability of Java
Why not compile Java to machine
code?
need to generate code for each target
machine
cannot exchange executable code
The Sun Java solution
design machine architecture (JVM)
specifically for the Java language
translate Java source code into JVM
code (bytecode)
write software interpreter for JVM in C
(widely available)
Thus bytecode can be exchanged
remote execution is possible
Slide #24 of 37
The JVM Architecture
The architecture
Stack machine! Closer to modern high-level languages
than the von Neumann machine (Register machines).
Memory: 32 bit words (=4 bytes)
Instructions: 226 in total, variable length, 1-5 bytes
Program: byte stream
Data: stored in words
Program Counter (PC) contains byte addresses
Here simplified, Integer JVM (IJVM)
no floating point arithmetic
More details: https://en.wikipedia.org/wiki/IJVM
Slide #25 of 37
The JVM Architecture
http://www.santhoshreddymandadi.com/java/java-virtual-machine-jvm-architecture.html
Slide #26 of 37
Stack Machines
Stack
Area of memory, extends upwards or
shrinks downwards
LV (Local Variable), base of stack
SP (Stack Pointer), top of stack
Operations
push on top (increment SP)
pop (decrement SP)
add top two arguments on the stack,
replace with result
More details:
https://en.wikipedia.org/wiki/Stack_machine
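The three operations listed on this slide can be sketched as follows. This is a minimal sketch, not the JVM's actual implementation: the class name WordStack, the fixed capacity of 16 words and the convention that SP indexes the top word are all assumptions made for illustration.

```java
// A minimal sketch of a word stack with an explicit stack pointer (SP).
// The class name and the fixed capacity are assumptions for illustration.
class WordStack {
    private final int[] memory = new int[16]; // stack area, one int per 32-bit word
    private int sp = -1;                      // index of the top word; -1 means empty

    void push(int word) {
        if (sp + 1 == memory.length) throw new IllegalStateException("stack overflow");
        memory[++sp] = word;                  // increment SP, then store the word
    }

    int pop() {
        if (sp < 0) throw new IllegalStateException("stack underflow");
        return memory[sp--];                  // read the top word, then decrement SP
    }

    // Replace the top two words on the stack with their sum.
    void add() {
        int right = pop();
        int left = pop();
        push(left + right);
    }

    // Number of words currently on the stack.
    int depth() {
        return sp + 1;
    }

    public static void main(String[] args) {
        WordStack stack = new WordStack();
        stack.push(2);
        stack.push(3);
        stack.add();
        System.out.println(stack.pop()); // prints 5
    }
}
```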
Slide #27 of 37
Evaluating Expressions on Stack
Slide #28 of 37
What are stacks good for?
Expression Evaluation
can handle bracketed expressions
(a1+a2)*a3 without temporary
variables:
PUSH a1, PUSH a2, ADD, PUSH a3, MULT
See also RPN (https://en.wikipedia.org/wiki/Reverse_Polish_notation) and
Infix, Prefix & Postfix Expressions
(https://runestone.academy/runestone/books/published/pythonds/BasicDS/InfixPrefixandPostfixExpressions.html)
Direct Support for
Local variables for methods
(stored at the base of stack,
deleted when the method exits)
(recursive) method calls: to store
return address
RPN Example: 7 8 + 3 2 + /
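Evaluating a postfix (RPN) expression with a stack can be sketched as below. This is a minimal sketch under stated assumptions: the class RpnEvaluator and its method names are hypothetical, tokens are separated by spaces, and only integer +, -, * and / are supported. Operands are pushed; each operator pops two words and pushes the result, so no temporary variables are needed.

```java
// Minimal sketch of stack-based evaluation of postfix (RPN) expressions.
// RpnEvaluator is a hypothetical name used only for this illustration.
class RpnEvaluator {
    // Evaluates a space-separated postfix expression over integers.
    static int evaluate(String expression) {
        String[] tokens = expression.trim().split("\\s+");
        int[] stack = new int[tokens.length];
        int sp = 0; // number of words currently on the stack
        for (String token : tokens) {
            switch (token) {
                case "+": case "-": case "*": case "/": {
                    if (sp < 2) throw new IllegalArgumentException("missing operand");
                    int right = stack[--sp];   // pop the second operand
                    int left = stack[--sp];    // pop the first operand
                    stack[sp++] = apply(token.charAt(0), left, right); // push the result
                    break;
                }
                default:
                    stack[sp++] = Integer.parseInt(token); // push an operand
            }
        }
        if (sp != 1) throw new IllegalArgumentException("malformed expression");
        return stack[0];
    }

    private static int apply(char op, int left, int right) {
        switch (op) {
            case '+': return left + right;
            case '-': return left - right;
            case '*': return left * right;
            default:  return left / right; // integer division
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate("7 8 + 3 2 + /")); // (7+8)/(3+2), prints 3
        System.out.println(evaluate("1 2 + 3 *"));     // (1+2)*3, prints 9
    }
}
```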
Slide #29 of 37
IJVM Memory
Slide #30 of 37
Main IJVM Instruction Groups
Stack Operations
PUSH/POP – push/pop word on a stack
BIPUSH – push byte on stack
ILOAD/ISTORE – load/store local variable onto/from stack
Integer Arithmetic
IADD/ISUB – add/subtract two top words on stack
Branching
IFEQ – pop top word from stack, branch if zero
Invoking a method / return from a method
INVOKEVIRTUAL, RETURN
Slide #31 of 37
IJVM Instruction Set
Slide #32 of 37
Compiling Java to IJVM
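As a rough illustration of this translation step, the hypothetical method below annotates one Java statement with the IJVM-style instruction sequence that a compiler for a stack machine would typically emit for it. The class, method and variable names are made up, and the comment uses the IJVM mnemonics from the instruction groups rather than literal javac output (the real bytecode of a class can be inspected with "javap -c").

```java
// Hypothetical example: one Java statement and a matching IJVM-style sequence.
class AddExample {
    static int add(int j, int k) {
        // i = j + k  corresponds to:
        //   ILOAD j     push local variable j on the stack
        //   ILOAD k     push local variable k on the stack
        //   IADD        pop two words, push their sum
        //   ISTORE i    pop the sum into local variable i
        int i = j + k;
        return i;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // prints 5
    }
}
```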
Slide #33 of 37
JVM Instruction Summary
Different from most CPUs
Closer to high-level programming languages, rather
than von Neumann architecture
No accumulator/registers – just the stack!
Small, straightforward instruction set
Variable length instructions
Typed instructions, i.e. different instructions for loading
an integer and for loading a pointer (this is to help verify
security constraints)
Slide #34 of 37
Interpreting JVM
Software interpreter for JVM in C (the original Sun
Microsystems solution)
memory for the constant pool, method area and stack
procedure for each instruction
program which fetches, decodes and executes instructions
Produce micro-programmed interpreter
Manufacture hardware chip (picoJava II)
for embedded Java applications
More details:
https://en.wikipedia.org/wiki/PicoJava
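The structure of a software interpreter (memory areas, one action per instruction, and a fetch-decode-execute loop) can be sketched as below. It is written in Java rather than C to match the rest of the examples, and it is a sketch under stated assumptions, not Sun's interpreter: MiniIjvm and run are hypothetical names, only seven instructions are handled, the constant pool and method calls are omitted, and execution simply stops when PC runs past the end of the byte stream. Branch offsets are taken as signed 16-bit values relative to the address of the branch opcode.

```java
// Minimal sketch of a fetch-decode-execute loop for a few IJVM instructions.
// MiniIjvm and run() are hypothetical names; frames and method calls are omitted.
class MiniIjvm {
    static final int BIPUSH = 0x10, ILOAD = 0x15, ISTORE = 0x36,
                     IADD = 0x60, ISUB = 0x64, IFEQ = 0x99, GOTO = 0xA7;

    // Runs the byte stream until PC passes its end; returns the local variables.
    static int[] run(byte[] program, int localCount) {
        int[] locals = new int[localCount];
        int[] stack = new int[64]; // operand stack (fixed size is an assumption)
        int sp = 0;                // number of words on the operand stack
        int pc = 0;                // byte address of the next instruction
        while (pc < program.length) {
            int opcode = program[pc] & 0xFF;   // fetch
            switch (opcode) {                  // decode, then execute
                case BIPUSH:
                    stack[sp++] = program[pc + 1]; // operand byte is sign-extended
                    pc += 2;
                    break;
                case ILOAD:
                    stack[sp++] = locals[program[pc + 1] & 0xFF];
                    pc += 2;
                    break;
                case ISTORE:
                    locals[program[pc + 1] & 0xFF] = stack[--sp];
                    pc += 2;
                    break;
                case IADD: {
                    int right = stack[--sp];
                    int left = stack[--sp];
                    stack[sp++] = left + right;
                    pc += 1;
                    break;
                }
                case ISUB: {
                    int right = stack[--sp];
                    int left = stack[--sp];
                    stack[sp++] = left - right;
                    pc += 1;
                    break;
                }
                case IFEQ: // pop top word, branch if it is zero
                    pc += (stack[--sp] == 0) ? branchOffset(program, pc) : 3;
                    break;
                case GOTO:
                    pc += branchOffset(program, pc);
                    break;
                default:
                    throw new IllegalStateException("unknown opcode " + opcode);
            }
        }
        return locals;
    }

    // Signed 16-bit offset stored in the two bytes after the branch opcode.
    private static int branchOffset(byte[] program, int pc) {
        return (short) (((program[pc + 1] & 0xFF) << 8) | (program[pc + 2] & 0xFF));
    }

    public static void main(String[] args) {
        // Stores 2 and 3 in locals 0 and 1, then computes local 2 = local 0 + local 1.
        byte[] program = {
            BIPUSH, 2, ISTORE, 0,
            BIPUSH, 3, ISTORE, 1,
            ILOAD, 0, ILOAD, 1, IADD, ISTORE, 2
        };
        System.out.println(run(program, 3)[2]); // prints 5
    }
}
```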
Slide #35 of 37
Just In Time (JIT) Compilation
Why not compile directly to target architecture?
more expensive – many varying architectures
more time needed to compile each instruction
But
execution is slower with an interpreter!!!
instructions may have to be parsed repeatedly
Source:
https://en.wikibooks.org/wiki/Java_Programming/The_Java_Platform
Slide #36 of 37
Just In Time (JIT) Compilation
Just In Time (JIT) Compilation
include a compiler from Java bytecode to target machine code within the browser
compile instructions, and reuse them
longer wait till arrival of executable code
Source:
https://en.wikibooks.org/wiki/Java_Programming/The_Java_Platform
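The "compile instructions, and reuse them" idea can be sketched as a cache of translated code. This is only an analogy under stated assumptions, not how a real JIT compiler is built: JitCacheSketch is a hypothetical class, methods are identified by made-up string names, and a Java lambda stands in for generated machine code. The point is that the translation cost is paid on first use and later calls reuse the cached result.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntBinaryOperator;

// Sketch of "compile once, reuse afterwards". All names are hypothetical.
class JitCacheSketch {
    private final Map<String, IntBinaryOperator> compiled = new HashMap<>();
    int compileCount = 0; // how many times the (slow) translation step ran

    // Stand-in for translating a bytecode sequence into native code.
    private IntBinaryOperator compile(String method) {
        compileCount++;
        switch (method) {
            case "add": return (a, b) -> a + b;
            case "sub": return (a, b) -> a - b;
            default: throw new IllegalArgumentException("unknown method " + method);
        }
    }

    int invoke(String method, int a, int b) {
        // Translate on first use only; later calls reuse the cached translation.
        return compiled.computeIfAbsent(method, this::compile).applyAsInt(a, b);
    }

    public static void main(String[] args) {
        JitCacheSketch jit = new JitCacheSketch();
        System.out.println(jit.invoke("add", 2, 3)); // prints 5
        System.out.println(jit.invoke("add", 4, 5)); // prints 9
        System.out.println(jit.compileCount);        // prints 1
    }
}
```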
Slide #37 of 37
Summary
Compilation vs. Interpretation
Interpreted languages
execute with the help of a layer of software, not directly
on a CPU
usually translated into intermediate code
Java
conceived as an interpreted language, to enhance
portability and downloading to foreign/remote
architectures (applets)
has JVM, a virtual stack machine
interpreted by a software interpreter written in C, or by a
hardware chip (picoJava II for embedded Java applications)