Computer Systems
Topic – JVM and Java Bytecode
Dr. Mian M. Hamayun
M.M. .ac.uk
Credits to:
&
Slide #2 of 61
Lecture Objectives
Develop a basic understanding of how the Java Virtual
Machine works and what bytecode is
Understand the role of stack frames in method calls
Understand the mapping of high-level Java code to
low-level bytecode
Slide #3 of 61
Lecture Outline
How the JVM Works
Frames (Stack Frames) – Operand Stack, Local
Variables and Return Values
Bytecode Instructions
Operations on Operand Stack
Call and Return
Runtime Constant Pool
Method Calls – Creating Frames
Recursive Factorial Example
Java Disassembler – High-level to Low-level mapping
Factorial Example – Execution
Slide #4 of 61
Java Virtual Machine (JVM)
Slide #5 of 61
Java Virtual Machine (JVM) (2)
Interprets intermediate level bytecode in .class files
Java Runtime Environment (JRE)
loads .class files
verifies their internal consistency (security checks)
executes their methods using JVM
Watch it on Youtube (JVM vs. JRE vs. JDK):
For different CPU or operating system
same .class files
different JRE
Slide #6 of 61
Frames (or Stack Frames)
Each time you call a method in Java, it gets a new frame
constructed for it.
Each frame has
Storage space for Local Variables (same principle as
registers)
Space for an Operand Stack
Program Counter & Stack Pointer
for Virtual Machine, not for the CPU
Reference back to the frame of calling method.
Other Stuff
Slide #7 of 61
Frames (or Stack Frames) (2)
[Figure: two linked frames. Each frame contains space for the Operand Stack, Local Variables, PC and SP, Other Stuff, and a reference to the Caller’s Frame.]
Slide #8 of 61
Space for Operand Stack & Local Variables
All entries on the Operand Stack and all Local Variable slots are 4
bytes each.
Enough for
boolean … 1 bit
byte … 1 byte
short, char … 2 bytes
int, float … 4 bytes
Also enough for any Reference value … 4 bytes
For long and double, we need two consecutive entries …
8 bytes
Slide #9 of 61
What is a Local Variable?
1) “this” (for a non-static method)
Reference to the object “this”, on which method was called
Kept as a local variable at index 0
2) Parameters of the method
Slot numbers (indices) start at 0 (for static methods) or 1
(for non-static methods)
3) Variables declared inside the method; slot numbers start
after those for parameters.
Each local variable has a “slot” number, to show where it is
stored in the frame.
Sometimes the term “local variable” specifically means (3)!
Slide #10 of 61
What is NOT a Local Variable?
1) Instance Variables
The non-static fields of a class, e.g. “int a;”
These are the instance variables for each object of the class.
If x is a reference to an object then x.a is a variable in that
object
2) Class Variables
The static fields of a class, e.g. “static int b;”
These variables are shared between all objects of the class
One value stored for the whole class. Used as classname.b
Both of the above variables are declared outside of all methods.
Slide #11 of 61
How many Local Variables are there in this
constructor?
public class PosVal {
    private static int nextSerial = 0;
    private int serial;
    private int val; // invariant: val >= 0

    public PosVal(int initVal) {
        int v = initVal;
        if (v < 0) {
            v = -v;
        }
        val = v;
        serial = nextSerial;
        nextSerial += 1;
    }
}
Slide #12 of 61
How many Local Variables are there in this
constructor?
The PosVal constructor has
Three local variables:
(0) “this”
(1) Parameter “initVal”
(2) Variable “v”
indices = slot numbers
public class PosVal {
    private static int nextSerial = 0; // class variable
    private int serial;                // instance variable
    private int val;                   // instance variable; invariant: val >= 0

    public PosVal(int initVal) {
        int v = initVal;
        if (v < 0) {
            v = -v;
        }
        val = v;
        serial = nextSerial;
        nextSerial += 1;
    }
}
Slide #13 of 61
Slot Numbers for Local Variables in JVM
JVM doesn’t know the names of the local variables
(from Java Source Code)
JVM refers to local variables by Slot Numbers, starting at 0
[Figure: a frame laid out as Variable 0, Variable 1, Variable 2, ..., followed by the Operand Stack]
For non-static methods, variable 0 is “this”. Then
parameters start at slot 1, then other local variables.
For static methods, parameters start at slot 0
For long and double, the slot number is the number of the first of
the pair of slots.
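The slot numbering rules above can be sketched in a few lines of Java. This is only an illustration: `SlotLayout` and `parameterSlots` are hypothetical names, not part of any JDK API, and each parameter type is given by a single descriptor letter ('I' int, 'J' long, 'D' double, 'L' reference).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: assigns slot numbers to method parameters
// using the rules on this slide.
final class SlotLayout {
    static List<Integer> parameterSlots(boolean isStatic, char... types) {
        List<Integer> slots = new ArrayList<>();
        int next = isStatic ? 0 : 1;     // slot 0 holds "this" for non-static methods
        for (char t : types) {
            slots.add(next);             // a two-slot value is named by its first slot
            next += (t == 'J' || t == 'D') ? 2 : 1;  // long and double take two slots
        }
        return slots;
    }

    public static void main(String[] args) {
        System.out.println(parameterSlots(true, 'I', 'J', 'I'));   // [0, 1, 3]
        System.out.println(parameterSlots(false, 'I', 'J', 'I'));  // [1, 2, 4]
    }
}
```

For a method with parameters (int, long, int), the long occupies two slots, so the second int lands two slots later.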
Slide #14 of 61
Can Operand Stack Ever Overflow ?
(Run out of space ?)
Short Answer: NO
The Java Compiler works out exactly how much space is
needed (and you should be able to do the same on
paper!).
The Loader checks each method to verify that No
Overflow or Underflow is possible.
StackOverflowError is different: a new frame is
needed, but there is not enough memory for it.
An online Q&A forum inspired by the same concept:
https://stackoverflow.com/
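The difference between operand stack overflow and StackOverflowError can be seen with a small experiment. The sketch below is an assumption-laden illustration (the class and method names are made up): each call of `recurse` needs only a tiny, fixed operand stack, yet unbounded recursion keeps creating frames until the thread runs out of stack memory.

```java
// Minimal sketch: frames are created until the JVM throws StackOverflowError.
// The depth reached depends on the JVM and its settings.
final class FrameExhaustion {
    static int depth = 0;

    static void recurse() {
        depth++;        // one more frame has been created
        recurse();      // no base case, so frames are never released
    }

    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // By the time we get here, the frames of recurse() have been discarded.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames created before StackOverflowError: " + measure());
    }
}
```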
Slide #15 of 61
Bytecode Instructions
Each bytecode instruction is at least one byte long
Opcode (one byte)
May have one or more operand bytes
Remember the two separate meanings of “operand”: an operand byte here,
and an entry on the operand stack
A single operand may be 2 or more bytes taken together as an
integer. The order is big-endian: the most significant byte comes first.
For each opcode, there is a human-readable mnemonic
Slide #16 of 61
More on Endianness
Details at: https://en.wikipedia.org/wiki/Endianness
https://www.youtube.com/watch?v=seZLUbgbB7Y
Slide #17 of 61
Arithmetic on Operand Stack
Examples:
add – adds top two stack entries …
… , val1, val2 => …, val1+val2
but different opcodes for different datatypes
Similarly for other operations
Also sometimes b – byte, s – short,
a – address (reference)
Mnemonic   Opcode (hex)   Type     Prefix letter
iadd       0x60           int      i
ladd       0x61           long     l
fadd       0x62           float    f
dadd       0x63           double   d
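The stack effect of iadd (…, val1, val2 => …, val1+val2) can be modelled with an ordinary Java collection. This is a sketch only: `OperandStackDemo` is a hypothetical class, and a real JVM does not implement its operand stack with `ArrayDeque`.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class OperandStackDemo {
    // Models the effect of iadd: ..., val1, val2 => ..., val1 + val2
    static void iadd(Deque<Integer> stack) {
        int val2 = stack.pop();   // top of stack
        int val1 = stack.pop();   // entry below it
        stack.push(val1 + val2);  // the two entries are replaced by their sum
    }

    static int addOnStack(int a, int b) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(a);
        stack.push(b);
        iadd(stack);
        return stack.pop();
    }
}
```

Calling `OperandStackDemo.addOnStack(3, 4)` pushes 3 and 4, applies the modelled iadd, and pops the single remaining entry, 7.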
Slide #18 of 61
Errors ? What happens if …
Use fadd on int entries?
Use ladd on int entries?
There’s only one entry on the stack? Stack Underflow
Other obvious mistakes?
No checks are performed when the operation is executed!
But JRE verifies code when it loads a class
Checks that types are used consistently etc.
Mainly used to safeguard against security holes.
Slide #19 of 61
Pushing Constants on the Stack
bipush 1-operand_byte
b for byte – 1 byte operand
i for int – pushes 4 bytes
Sign Extends operand to 4 bytes and
pushes the result on stack
Similarly
sipush – two operand bytes
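Sign extension is easy to get wrong on paper, so here is a small sketch of what bipush and sipush do with their operand bytes. The class and method names are hypothetical; the casts to `byte` and `short` are standard Java and perform the sign extension.

```java
final class PushConstants {
    // bipush: one operand byte, sign-extended to a 32-bit int.
    static int bipush(int operandByte) {
        return (byte) operandByte;
    }

    // sipush: two operand bytes, most significant first (big-endian),
    // combined into a signed 16-bit value and then sign-extended.
    static int sipush(int highByte, int lowByte) {
        return (short) (((highByte & 0xFF) << 8) | (lowByte & 0xFF));
    }
}
```

For example, the operand byte 0xFF pushes -1, not 255, and the operand bytes 0x01 0x00 push 256.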
Slide #20 of 61
Simple Opcodes for Common Constants
For Example: int
We have 7 different opcodes for integer constants,
where no operand is needed.
Opcode Pushed on Stack
iconst_m1 -1
iconst_0 0
iconst_1 1
iconst_2 2
… …
iconst_5 5
Slide #21 of 61
Load Instructions
A load instruction loads (pushes) a variable onto stack
Example:
iload slot_number
The slot_number (1 byte operand) specifies the variable
Push an int local variable at a given slot onto stack.
Similarly lload, fload, dload, aload instructions
The JRE verifier ensures that we use types consistently.
E.g. We can’t load as integer and then use as address.
It’s Java, not C++!
Slide #22 of 61
1-Byte Loads / Stores
There are special one-byte opcodes, with no operand, for
slot numbers 0, 1, 2 and 3
e.g. iload_0, iload_1, iload_2, iload_3
Store
Reverse of load: pops top of the Operand Stack into
a local variable
e.g. istore slot_number … 1-byte operand
istore_2 … no operand
Slide #23 of 61
Jumps
A jump must be within the current method i.e. We
can’t jump to a different method
Unconditional Jumps
goto 2-byte_offset
The operand is an offset
It is added to the address of the goto opcode to give
the address of the next opcode to execute
Similarly:
goto_w 4-byte_offset
w: wide_offset
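The target calculation for a 2-byte offset can be sketched as follows. `JumpTarget` is a hypothetical helper; it assumes the offset is a signed 16-bit big-endian value, so backward jumps have negative offsets.

```java
final class JumpTarget {
    // The two operand bytes form a signed 16-bit offset (big-endian) that is
    // added to the address of the jump opcode itself.
    static int target(int opcodeAddress, int offsetHigh, int offsetLow) {
        short offset = (short) (((offsetHigh & 0xFF) << 8) | (offsetLow & 0xFF));
        return opcodeAddress + offset;
    }
}
```

A jump opcode at address 1 with operand bytes 0x00 0x05 goes to address 6; one at address 20 with operand bytes 0xFF 0xF6 (offset -10) goes back to address 10.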
Slide #24 of 61
Conditional Jumps
As before, but using offsets for operand N.
The condition is based on the value at the top of the operand stack.
Also, e.g. if_icmpeq: int comparison
Call & Return
Say method B calls static method A(p0, p1), and A returns a
result.
1) Method B calculates the actual parameters on its operand
stack
2) Construct stack frame for method A
pop actual parameter values from B’s operand stack
use them to initialize local variables p0, p1 for A
3) Method A calculates its result on its operand stack
4) Result is pushed onto method B’s operand stack
5) Return to method B, throw away method A’s stack frame
On method B’s operand stack: A(p0, p1) has the effect of
…, p0, p1 → …, result
Slide #26 of 61
B Calls A … A Executes … A Returns to B
[Figure: three snapshots of the frames for B (caller) and A (callee).
B calls A: P0 and P1 are on top of B’s operand stack.
A executes: A’s frame holds Parameter P0 and Parameter P1, its operand stack starts empty and A computes its result there; B’s frame is not current.
A returns to B: A’s frame is gone and the result is on top of B’s operand stack.]
Slide #27 of 61
What is the Current Frame?
While the method A is executing, its frame is the current
frame
Method B’s frame is not. (that’s why it was greyed out
on previous slide)
When the method A returns, its frame is destroyed
Method B’s frame becomes current again
Slide #28 of 61
Saving Return Address
Each frame has its own PC
While the method A is executed, its PC is used
When A returns, B resumes with its old PC value
B’s local variables are also unchanged by A
Linked frames have the effect of a return stack.
In fact, the chain of linked frames is officially called a Stack
in JVM.
Slide #29 of 61
Summary (so far)
We have seen that, for Local Variables:
Class file doesn’t use the names (instead slot numbers are
used)
Compiler translates names into indices
Used only in their own class
Next: Methods, Instance and Class Variables
Potentially used in other classes too
Remember that classes are compiled separately → compiler
doesn’t know the addresses in other classes
Hence, the .class file must keep the names = Symbolic
Reference
Slide #30 of 61
Runtime Constant Pool
One for each class; contains read-only constants used
in the class
e.g. numeric (3.141), string (“Hello World!”)
Symbolic references of method & its class
Types (of parameters, and return values)
and more …
Each pool entry has an index: 0, 1, 2, …
Each frame, in its “other stuff”, includes pointer to
constant pool for its class
Slide #31 of 61
Runtime Constant Pool
Source:
http://java8.in/java-virtual-machine-run-time-data-areas/
Slide #32 of 61
Static Method Calls
(non-static calls will be discussed, if time permits)
invokestatic 2-byte_index
The operand is used as an index into the constant pool of the
class of the method currently being executed.
The constant pool entry is a symbolic reference
to the class & method being called.
The JVM uses these to find the address of the class, then the size
needed for the new frame & the address of the bytecode for the
method.
Slide #33 of 61
Static Method Calls
Bytecode: invokestatic 2-byte_index
→ Constant pool entry: symbolic reference
→ Class of called method
→ Bytecode for called method
Slide #34 of 61
Method Return
(Static or Non-Static)
For void methods
return
On caller’s operand stack:
… → …
Throw away the current frame
Carry on execution using the caller’s frame
(Its PC shows where it was when it made the call.)
Slide #35 of 61
Method Return
(Static or Non-Static)
For methods that return a result
ireturn
On caller’s operand stack
… → …, result
int result is popped from the operand stack and
pushed onto caller’s operand stack. Then do the
return as on previous slide.
Other Types
Instruction   Type
lreturn       long
freturn       float
dreturn       double
areturn       reference
Slide #36 of 61
Example: Recursive Factorial
/**
 * Calculate factorial.
 * requires: 0 <= n
 * @param n number whose factorial is
 *          to be calculated
 * @return factorial of n
 */
public static int fact(int n) {
    if (n == 0) {
        return 1;
    } else {
        return n * fact(n - 1);
    }
}
Slide #37 of 61
Recursion – When a method calls itself
Example shows how methods use frames
It also shows how frames save PC and Local Variables
So that the recursion works!
Slide #38 of 61
Frame for Factorial Example
One Parameter (n)
No other local variables
How much space do we need on the Operand Stack?
Slide #39 of 61
Frame for Factorial Example
Most complicated calculation
n * fact (n – 1)
Reverse Polish Notation
n n 1 – fact *
Frame contents:
Local Variable: one int (n, in slot 0)
Operand Stack: initially empty, space for 3 ints
Reference to the Caller’s Frame
Slide #40 of 61
Java Disassembler (javap)
The Java “disassembler” converts byte code to
human readable mnemonics
Command javap
Can be applied to a .class file
Displays its structure
e.g.
javap -classpath . -c Factorial > filename
-classpath . … look in the current folder; not always needed
-c … show bytecode
Factorial … disassemble Factorial.class
> filename … send output to a file if you wish
Option -l gives table of local variables
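To try the command, a class file is needed first. The sketch below is one possible Factorial class, assumed here to match the name used in the javap command; it would be compiled with javac before being disassembled. Constant pool indices in the javap output may differ from those shown in these slides, depending on the compiler version and the rest of the class.

```java
// A minimal class containing the recursive factorial method.
class Factorial {
    public static int fact(int n) {
        if (n == 0) {
            return 1;
        } else {
            return n * fact(n - 1);
        }
    }

    public static void main(String[] args) {
        System.out.println(fact(5)); // prints 120
    }
}
```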
Slide #41 of 61
Bytecode in mnemonics
public static int fact(int);
Code:
0: iload_0
1: ifne 6
4: iconst_1
5: ireturn
6: iload_0
7: iload_0
8: iconst_1
9: isub
10: invokestatic #2 // Method fact:(I)I
13: imul
14: ireturn
The numbers before the colons are bytecode addresses, in decimal.
Slide #42 of 61
Bytecode in bytes
Conditional jump is from address 1 (ifne) to 6 (iload_0).
Hence offset = 5.
Mnemonics show absolute address 6.
We are not going to use these opcodes.
But they are what a class file is really made of.
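As an illustration of how bytes and mnemonics relate, the sketch below decodes the 15 bytes of the fact method back into the mnemonic listing. The opcode values are taken from the JVM specification rather than from this slide, `FactBytes` is a hypothetical class, and only the handful of opcodes used by fact are handled.

```java
import java.util.ArrayList;
import java.util.List;

final class FactBytes {
    static final int[] CODE = {
        0x1a,             //  0: iload_0
        0x9a, 0x00, 0x05, //  1: ifne, offset +5
        0x04,             //  4: iconst_1
        0xac,             //  5: ireturn
        0x1a,             //  6: iload_0
        0x1a,             //  7: iload_0
        0x04,             //  8: iconst_1
        0x64,             //  9: isub
        0xb8, 0x00, 0x02, // 10: invokestatic, constant pool index 2
        0x68,             // 13: imul
        0xac              // 14: ireturn
    };

    static List<String> disassemble(int[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            switch (code[pc]) {
                case 0x1a: out.add(pc + ": iload_0");  pc += 1; break;
                case 0x04: out.add(pc + ": iconst_1"); pc += 1; break;
                case 0x64: out.add(pc + ": isub");     pc += 1; break;
                case 0x68: out.add(pc + ": imul");     pc += 1; break;
                case 0xac: out.add(pc + ": ireturn");  pc += 1; break;
                case 0x9a: { // ifne: print absolute target = address + signed offset
                    short offset = (short) ((code[pc + 1] << 8) | code[pc + 2]);
                    out.add(pc + ": ifne " + (pc + offset));
                    pc += 3;
                    break;
                }
                case 0xb8: { // invokestatic: unsigned 2-byte constant pool index
                    int index = (code[pc + 1] << 8) | code[pc + 2];
                    out.add(pc + ": invokestatic #" + index);
                    pc += 3;
                    break;
                }
                default:
                    throw new IllegalArgumentException("unknown opcode at " + pc);
            }
        }
        return out;
    }
}
```

The second decoded line is `1: ifne 6`: the stored offset is 5, and adding it to the opcode address 1 gives the absolute address 6 that the disassembler prints.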
Slide #43 of 61
The Disassembler Helps You
public static int fact(int);
Code:
0: iload_0
1: ifne 6
4: iconst_1
5: ireturn
6: iload_0
7: iload_0
8: iconst_1
9: isub
10: invokestatic #2 // Method fact:(I)I
13: imul
14: ireturn
ifne 6: the jump offset is 5. The disassembler tells you where the
jump goes to, address 6.
invokestatic #2: #2 is an index into the constant pool.
The number itself is meaningless to us; it could be anything.
The disassembler adds a comment to show what information is stored
there: a symbolic reference to the fact method.
Slide #46 of 61
What bytecode does what Java?
public static int fact(int);
Code:
0: iload_0
1: ifne 6
4: iconst_1
5: ireturn
6: iload_0
7: iload_0
8: iconst_1
9: isub
10: invokestatic #2 // Method fact:(I)I
13: imul
14: ireturn
iload_0: push the variable in slot 0, i.e. n
ifne: conditional jump if the top of stack is non-zero
iconst_1: push integer constant 1
ireturn: return integer
isub: integer subtract
imul: integer multiply
Why do all the bytecode instructions start with i?
Mostly: the i shows the int type for the operation.
Exceptions: ifne, invokestatic
Slide #47 of 61
What bytecode does what Java?
public static int fact(int);
Code:                                         Stack
0: iload_0                                    n
1: ifne 6                                     empty
4: iconst_1                                   1
5: ireturn                                    1
6: iload_0                                    n
7: iload_0                                    n, n
8: iconst_1                                   n, n, 1
9: isub                                       n, n – 1
10: invokestatic #2 // Method fact:(I)I       n, fact(n-1)
13: imul                                      n * fact(n-1)
14: ireturn
Slide #49 of 61
Executing fact(2) [1]
public static int fact(int);
Code:
0: iload_0
1: ifne 6
4: iconst_1
5: ireturn
6: iload_0
7: iload_0
8: iconst_1
9: isub
10: invokestatic #2 // Method fact:(I)I
13: imul
14: ireturn
Frames (the caller’s frame first):
Frame for main()
Frame for fact(2): n (parameter, slot 0) = 2; operand stack: empty; PC = 0 (next instruction)
Slide #50 of 61
Executing fact(2) [2]
After 0: iload_0 (push n):
Frame for main()
Frame for fact(2): n = 2; operand stack: 2; PC = 1
Slide #51 of 61
Executing fact(2) [3]
After 1: ifne 6 (the value 2 is non-zero, so the jump is taken):
Frame for main()
Frame for fact(2): n = 2; operand stack: empty; PC = 6
Slide #52 of 61
Executing fact(2) [4]
After 6: iload_0, 7: iload_0, 8: iconst_1 and 9: isub:
Operand stack so far in this frame (top of stack last): 2, then empty, then 2, then 2, 2, then 2, 2, 1, then 2, 1
Frame for main()
Frame for fact(2): n = 2; operand stack: 2, 1; PC = 10
Slide #53 of 61
Recursive call fact(1) [5]
After 10: invokestatic, there are two frames for fact, one for each call, i.e. fact(2) and fact(1):
Frame for main()
Frame for fact(2): n = 2; operand stack: 2; PC = 13
Frame for fact(1): n = 1; operand stack: empty; PC = 0
Slide #54 of 61
Recursive call fact(1) [6]
fact(1) executes up to its own recursive call:
Operand stack so far in the frame for fact(1) (top of stack last): 1, then empty, then 1, then 1, 1, then 1, 1, 1, then 1, 0
Frame for main()
Frame for fact(2): n = 2; operand stack: 2; PC = 13
Frame for fact(1): n = 1; operand stack: 1, 0; PC = 10
Slide #55 of 61
Recursive call fact(0) [7]
After 10: invokestatic in fact(1), there are three frames for fact:
Frame for main()
Frame for fact(2): n = 2; operand stack: 2; PC = 13
Frame for fact(1): n = 1; operand stack: 1; PC = 13
Frame for fact(0): n = 0; operand stack: empty; PC = 0
Slide #56 of 61
Recursive call fact(0) [8]
fact(0) does not take the jump at 1: ifne 6 and reaches 5: ireturn:
Operand stack so far in the frame for fact(0): 0, then empty, then 1
Frame for main()
Frame for fact(2): n = 2; operand stack: 2; PC = 13
Frame for fact(1): n = 1; operand stack: 1; PC = 13
Frame for fact(0): n = 0; operand stack: 1; PC = 5
Return value = 1
Slide #57 of 61
Return from fact(0) [9]
The frame for fact(0) has been thrown away and its result pushed onto the caller’s operand stack:
Frame for main()
Frame for fact(2): n = 2; operand stack: 2; PC = 13
Frame for fact(1): n = 1; operand stack: 1, 1; PC = 13
Slide #58 of 61
Completing fact(1) [10]
After 13: imul in fact(1):
Frame for main()
Frame for fact(2): n = 2; operand stack: 2; PC = 13
Frame for fact(1): n = 1; operand stack: 1; PC = 14
Return value = 1
Slide #59 of 61
Return from fact(1) [11]
The frame for fact(1) has been thrown away and its result pushed onto the caller’s operand stack:
Frame for main()
Frame for fact(2): n = 2; operand stack: 2, 1; PC = 13
Slide #60 of 61
Completing fact(2) [12]
After 13: imul in fact(2), the operand stack goes from 2, 1 to 2:
Frame for main()
Frame for fact(2): n = 2; operand stack: 2; PC = 14
The value on top of the operand stack, 2, is the result of fact(2).
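The whole trace can be replayed with a toy interpreter. The sketch below is not how a real JVM is implemented: `MiniJvm` and `Frame` are hypothetical names, the interpreter dispatches on the bytecode address instead of decoding opcode bytes, and it only knows the fact method. It does show each call getting its own frame with its own local variable, operand stack and PC.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class MiniJvm {
    // One frame per call: local variables, operand stack and its own PC.
    static final class Frame {
        final int[] locals = new int[1];                  // slot 0 holds n
        final Deque<Integer> stack = new ArrayDeque<>();  // operand stack
        int pc = 0;
        Frame(int n) { locals[0] = n; }
    }

    static int maxDepth;  // largest number of fact frames alive at once in the last run

    static int runFact(int n) {
        Deque<Frame> frames = new ArrayDeque<>();  // chain of linked frames
        frames.push(new Frame(n));
        maxDepth = 1;
        while (true) {
            Frame f = frames.peek();               // the current frame
            switch (f.pc) {
                case 0: case 6: case 7:            // iload_0
                    f.stack.push(f.locals[0]); f.pc += 1; break;
                case 1:                            // ifne 6
                    f.pc = (f.stack.pop() != 0) ? 6 : 4; break;
                case 4: case 8:                    // iconst_1
                    f.stack.push(1); f.pc += 1; break;
                case 9: {                          // isub
                    int v2 = f.stack.pop(), v1 = f.stack.pop();
                    f.stack.push(v1 - v2); f.pc += 1; break;
                }
                case 10:                           // invokestatic fact
                    f.pc = 13;                     // where the caller resumes
                    frames.push(new Frame(f.stack.pop()));
                    maxDepth = Math.max(maxDepth, frames.size());
                    break;
                case 13: {                         // imul
                    int v2 = f.stack.pop(), v1 = f.stack.pop();
                    f.stack.push(v1 * v2); f.pc += 1; break;
                }
                case 5: case 14: {                 // ireturn
                    int result = f.stack.pop();
                    frames.pop();                  // throw away the current frame
                    if (frames.isEmpty()) return result;
                    frames.peek().stack.push(result);
                    break;
                }
                default:
                    throw new IllegalStateException("bad pc " + f.pc);
            }
        }
    }
}
```

Running `MiniJvm.runFact(2)` returns 2 and leaves `maxDepth` at 3, matching the three frames for fact(2), fact(1) and fact(0) in the trace.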
Slide #61 of 61
Summary
In this lecture, we have seen:
Stack frames and their role in function execution
Local variables and how they are assigned slot numbers
Bytecode instructions and how operations are performed
using the operand stack
Runtime constant pool and static method calls
Java disassembler and how to map high-level Java code
to low-level bytecode instructions
Bytecode level execution of the recursive factorial function