Computer Systems
Topic – JVM and Java Bytecode
Dr. Mian M. Hamayun
M.M.Hamayun@cs.bham.ac.uk
Credits to:
Ata Kaban & Steve Vickers
Lecture Objectives
Develop a basic understanding of how the Java Virtual Machine works and what bytecode is
Understand the role of stack frames in method Calls
Understand the mapping of high-level Java code to low level Bytecode
Slide #2 of 61
Lecture Outline
How the JVM Works
Frames (Stack Frames) – Operand Stack, Local Variables and Return Values
Bytecode Instructions
Operations on Operand Stack
Call and Return
Runtime Constant Pool
Method Calls – Creating Frames
Recursive Factorial Example
Java Disassembler – High-level to Low-level mapping
Factorial Example – Execution
Slide #3 of 61
Java Virtual Machine (JVM)
Slide #4 of 61
Java Virtual Machine (JVM) (2)
Interprets intermediate level bytecode in .class files
Java Runtime Environment (JRE)
loads .class files
verifies their internal consistency … security checks
executes their methods using the JVM
Watch it on Youtube (JVM vs. JRE vs. JDK):
For a different CPU or operating system: same .class files, different JRE
Slide #5 of 61
Frames (or Stack Frames)
Each time you call a method in Java, it gets a new frame constructed for it.
Each frame has
Storage space for Local Variables (same principle as registers)
Space for an Operand Stack
Program Counter & Stack Pointer for Virtual Machine, not for the CPU
Reference back to the frame of the calling method.
Other Stuff
Slide #6 of 61
Frames (or Stack Frames) (2)
[Figure: two linked frames. Each contains space for the Operand Stack, Local Variables, PC, SP, Other Stuff, and a reference to the Caller’s Frame.]
Slide #7 of 61
Space for Operand Stack & Local Variables
All Entries on the Operand Stack and Local Variables are 4 bytes each.
Enough for
boolean … 1 bit
byte … 1 byte
short, char … 2 bytes
int, float … 4 bytes
Also enough for any Reference value … 4 bytes
For long and double, we need two consecutive entries … 8 bytes
Slide #8 of 61
What is a Local Variable?
1) “this” (for a non-static method)
Reference to the object “this”, on which method was called
Kept as a local variable at index 0
2) Parameters of the method
Slot numbers (indices) start at 0 (for static methods) or 1 (for non-static methods)
3) Variables declared inside the method; slot numbers start after those for parameters.
Each local variable has a “slot” number, to show where it is stored in the frame.
Sometimes the term “local variable” specifically means (3)!
Slide #9 of 61
What is NOT a Local Variable?
1) Instance Variables
The non-static fields in a class, e.g. “int a;”
These are the instance variables for each object of the class.
If x is a reference to an object then x.a is a variable in that object
2) Class Variables
The static fields in a class, e.g. “static int b;”
These are the variables shared between all objects of a class
One value stored for the whole class. Used as classname.b
Both of the above kinds of variables are declared outside of all methods.
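The difference between the two kinds of field can be seen in a minimal sketch. The class name VariableKinds is hypothetical; the field names a and b follow the declarations quoted on this slide.

```java
/** Hypothetical class illustrating instance versus class variables. */
final class VariableKinds {
    int a;          // instance variable: one copy per object
    static int b;   // class variable: one copy for the whole class

    public static void main(String[] args) {
        VariableKinds x = new VariableKinds();
        VariableKinds y = new VariableKinds();
        x.a = 1;
        y.a = 2;               // does not affect x.a
        VariableKinds.b = 7;   // shared by x and y, used as classname.b
        System.out.println(x.a + " " + y.a + " " + VariableKinds.b); // prints "1 2 7"
    }
}
```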
Slide #10 of 61
How many Local Variables are there in this constructor?
public class PosVal {
    private static int nextSerial = 0;
    private int serial;
    private int val; // invariant: val >= 0

    public PosVal(int initVal) {
        int v = initVal;
        if (v < 0) {
            v = -v;
        }
        val = v;
        serial = nextSerial;
        nextSerial += 1;
    }
}
Slide #11 of 61
How many Local Variables are there in this constructor?
The PosVal constructor has three local variables (indices = slot numbers):
(0) “this”
(1) Parameter “initVal”
(2) Variable “v”
Instance Variables: serial, val
Class Variable: nextSerial
Slide #12 of 61
Slot Numbers for Local Variables in JVM
JVM doesn’t know the names of the local variables
(from Java Source Code)
JVM refers to local variables by Slot Numbers, starting at 0
Variable 0 Variable 1 Variable 2
For non-static methods, variable 0 is “this”. Then
parameters start at slot 1, then other local variables.
For static methods, parameters start at slot 0
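For long and double, the slot number is the number of the first of the pair of slots.
[Figure: the frame’s local variable slots (Variable 0, Variable 1, Variable 2, ...) drawn together with its Operand Stack]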
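The slot numbering rules on this slide can be written down as a small calculation. The following is only a sketch: SlotNumbers and assign are hypothetical names, types are passed as plain strings, and it assumes every variable gets a fresh slot in declaration order (a real compiler may reuse slots for variables whose scopes do not overlap).

```java
import java.util.Arrays;

/** Hypothetical helper: assigns slot numbers following the rules on the slide. */
final class SlotNumbers {
    /**
     * @param isStatic false reserves slot 0 for "this"
     * @param types    parameter and local types in declaration order, e.g. "int", "long"
     * @return the (first) slot number of each variable
     */
    static int[] assign(boolean isStatic, String... types) {
        int[] slots = new int[types.length];
        int next = isStatic ? 0 : 1;
        for (int i = 0; i < types.length; i++) {
            slots[i] = next;
            // long and double occupy two consecutive slots
            boolean twoSlots = types[i].equals("long") || types[i].equals("double");
            next += twoSlots ? 2 : 1;
        }
        return slots;
    }

    public static void main(String[] args) {
        // PosVal(int initVal) with local int v: this = 0, initVal = 1, v = 2
        System.out.println(Arrays.toString(assign(false, "int", "int"))); // [1, 2]
        // a static method with (long a, int b): a = 0 (uses slots 0 and 1), b = 2
        System.out.println(Arrays.toString(assign(true, "long", "int"))); // [0, 2]
    }
}
```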
Slide #13 of 61
Can Operand Stack Ever Overflow ? (Run out of space ?)
Short Answer: NO
The Java Compiler works out exactly how much space is needed (and you should be able to do the same on paper!).
The Loader checks each method to verify that No Overflow or Underflow is possible.
StackOverflowError is different: a new frame is needed, but there is not enough memory for it.
An online Q&A forum inspired by the same concept:
https://stackoverflow.com/
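The frame-related error mentioned on this slide can be provoked on purpose. This is a minimal sketch with hypothetical names (DeepRecursion, recurse, overflows): each call needs a new frame, so unbounded recursion eventually fails with StackOverflowError, even though no single operand stack ever overflows. The depth reached depends on the JVM and its settings.

```java
/** Hypothetical demo: unbounded recursion runs out of room for new frames. */
final class DeepRecursion {
    static int depth = 0;   // number of calls made before the error

    static void recurse() {
        depth++;
        recurse();          // every call needs a new frame
    }

    /** Returns true if the recursion ended with a StackOverflowError. */
    static boolean overflows() {
        depth = 0;
        try {
            recurse();
            return false;   // never reached in practice
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(overflows() + " after " + depth + " calls");
    }
}
```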
Slide #14 of 61
Bytecode Instructions
Each bytecode instruction has at least one byte: the opcode
May have one or more operand bytes
Remember the two separate meanings of “operand”: an operand byte here, and an entry on the operand stack
A single operand may be 2 or more bytes taken together as an integer
Big-endian: the most significant byte comes first
For each opcode, there is a human-readable mnemonic
Slide #15 of 61
More on Endianness
Details at:
https://en.wikipedia.org/wiki/Endianness
https://www.youtube.com/watch?v=seZLUbgbB7Y
Slide #16 of 61
Arithmetic on Operand Stack
Examples:
add – adds top two stack entries ...
... , val1, val2 => …, val1+val2
but different opcodes for different datatypes
Mnemonic   Opcode (hex)   Type     Prefix Letter
iadd       0x60           int      i
ladd       0x61           long     l
fadd       0x62           float    f
dadd       0x63           double   d
Similarly for other operations
Also sometimes b – byte, s – short, a – address (reference)
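The stack effect "..., val1, val2 => ..., val1+val2" can be modelled in a few lines. This is only a sketch of the idea, not how a JVM is implemented: OperandStackDemo is a hypothetical class and the operand stack is modelled with java.util.ArrayDeque.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical model of int arithmetic on an operand stack. */
final class OperandStackDemo {
    /** iadd: ..., val1, val2 => ..., val1 + val2 */
    static void iadd(Deque<Integer> stack) {
        int val2 = stack.pop();   // top entry
        int val1 = stack.pop();   // entry below it
        stack.push(val1 + val2);
    }

    /** isub: ..., val1, val2 => ..., val1 - val2 */
    static void isub(Deque<Integer> stack) {
        int val2 = stack.pop();
        int val1 = stack.pop();
        stack.push(val1 - val2);
    }

    public static void main(String[] args) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(7);
        stack.push(5);
        isub(stack);
        System.out.println(stack.peek()); // prints 2
    }
}
```

The order of the two pops matters for isub: the entry pushed last is val2.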
Slide #17 of 61
Errors ? What happens if …
Use fadd on int entries?
Use ladd on int entries?
There’s only one entry on the stack? Stack Underflow
Other obvious mistakes?
No checks are performed when the operation is executed!
But JRE verifies code when it loads a class
Checks that types are used consistently etc.
Mainly used to safeguard against security holes.
Slide #18 of 61
Pushing Constants on the Stack
bipush operand_byte
b for byte – 1 operand byte
i for int – pushes 4 bytes
Sign-extends the operand to 4 bytes and pushes the result on the stack
Similarly:
sipush – two operand bytes
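Sign extension means that the top bit of the operand byte decides whether the pushed int is negative. A minimal sketch, assuming a hypothetical helper class BipushDemo that takes the raw operand byte as a value from 0x00 to 0xFF:

```java
/** Hypothetical model of the value that bipush places on the operand stack. */
final class BipushDemo {
    /** Sign-extends one operand byte to a 4-byte int. */
    static int bipush(int operandByte) {
        // Narrowing to byte and widening back to int performs the sign extension.
        return (byte) operandByte;
    }

    public static void main(String[] args) {
        System.out.println(bipush(0x05)); // prints 5
        System.out.println(bipush(0xFB)); // prints -5
    }
}
```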
Slide #19 of 61
Simple Opcodes for Common Constants
For Example: int
We have 7 different opcodes for integer constants, where no operand is needed.
Opcode      Pushed on Stack
iconst_m1   -1
iconst_0    0
iconst_1    1
iconst_2    2
…           …
iconst_5    5
Slide #20 of 61
Load Instructions
A load instruction loads (pushes) a variable onto the stack
Example:
iload slot_number
The slot_number (1 byte operand) specifies the variable
Push an int local variable at a given slot onto stack.
Similarly lload, fload, dload, aload instructions
The JRE Verifier ensures that we use types consistently.
E.g. We can’t load as integer and then use as address.
It’s Java, not C++!
Slide #21 of 61
1-Byte Loads / Stores
We have some special opcodes with no operand, for slot numbers 0, 1, 2, 3
e.g. iload_0, iload_1, iload_2, iload_3
Store
Reverse of load: pops the top of the Operand Stack into a local variable
e.g. istore slot_number … 1-byte operand
istore_2 … no operand
Slide #22 of 61
Jumps
A jump must be within the current method i.e. We can’t jump to a different method
Unconditional Jumps
goto 2-byte_offset
The operand is an offset
It is added to the address of the goto opcode to give the address of the next opcode to execute
Similarly:
goto_w 4-byte_offset (w: wide offset)
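Putting the big-endian operand rule together with the offset rule gives the jump target. The sketch below uses hypothetical names (JumpTarget, offset16, target) and treats the 2-byte offset as signed, so that backward jumps are possible.

```java
/** Hypothetical helper: computes where a jump with a 2-byte offset lands. */
final class JumpTarget {
    /** Combines two operand bytes, most significant first, into a signed 16-bit offset. */
    static int offset16(int high, int low) {
        return (short) (((high & 0xFF) << 8) | (low & 0xFF));
    }

    /** Target address = address of the jump opcode + offset. */
    static int target(int opcodeAddress, int high, int low) {
        return opcodeAddress + offset16(high, low);
    }

    public static void main(String[] args) {
        System.out.println(target(1, 0x00, 0x05));  // forward jump from 1: prints 6
        System.out.println(target(20, 0xFF, 0xFB)); // backward jump from 20: prints 15
    }
}
```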
Slide #23 of 61
Conditional Jumps
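As before, the operand N is a 2-byte offset, but the jump is taken only if a condition holds.
The condition is based on the value at the top of the operand stack, which is popped (e.g. ifne N jumps if the value is non-zero).
Also, e.g. if_icmpeq: int comparison of the top two entries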
Slide #24 of 61
Call & Return
Say method B calls static method A(p0, p1), and A returns a result.
1) Method B calculates the actual parameters on its operand stack
2) Construct stack frame for method A
pop actual parameter values from B’s operand stack
use them to initialize local variables p0, p1 for A
3) Method A calculates its result on its operand stack
4) Result is pushed onto method B’s operand stack
5) Return to method B, throw away method A’s stack frame
On method B’s operand stack: A(p0, p1) has the effect of
…, p0, p1 → …, result
Slide #25 of 61
B Calls A … A Executes … A Returns to B
[Figure: frames for A and B at each stage. Before the call, B’s operand stack holds …, P0, P1. While A executes, A’s frame holds Parameter P0 and Parameter P1 as local variables, an operand stack that starts empty and ends holding the result, and a link to the Caller’s Frame. After the return, B’s operand stack holds …, result.]
Slide #26 of 61
What is the Current Frame?
While the method A is executing, its frame is the current frame
Method B’s frame is not. (that’s why it was greyed out on previous slide)
When the method A returns, its frame is destroyed
Method B’s frame becomes current again
Slide #27 of 61
Saving Return Address
Each frame has its own PC
While the method A is executed, its PC is used
When A returns, B resumes with its old PC value
B’s local variables are also unchanged by A
Linked frames have the effect of a return stack.
In fact, the chain of linked frames is officially called a Stack in JVM.
Slide #28 of 61
Summary (so far)
We have seen that, for Local Variables:
Class file doesn’t use the names (slot numbers are used instead)
Compiler translates names into indices
Used only in their own class
Next: Methods, Instance and Class Variables
Potentially used in other classes too
Remember that classes are compiled separately → compiler doesn’t know the addresses in other classes
Hence, the .class file must keep the names = Symbolic Reference
Slide #29 of 61
Runtime Constant Pool
One for each class; contains read-only constants used in the class
e.g. numeric (3.141), string (“Hello World!”)
Symbolic references of method & its class
Types (of parameters, and return values) and more …
Each pool entry has an index: 0, 1, 2, …
Each frame, in its “other stuff”, includes pointer to constant pool for its class
Slide #30 of 61
Slide #31 of 61
Static Method Calls
(non-static calls will be discussed, if time permits)
invokestatic 2-byte_index
Used as index into constant pool of the class of method currently being executed.
Constant pool entry is a symbolic reference to class & method
JVM uses these to find address of class, then the size needed for new frame & address of bytecode for method.
Slide #32 of 61
Static Method Calls
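[Figure: in the bytecode, invokestatic is followed by a 2-byte index. The index selects a constant pool entry holding a symbolic reference, which leads to the class of the called method and the bytecode for the called method.]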
Slide #33 of 61
Method Return (Static or Non-Static)
For void methods: return
On caller’s operand stack:
… → …
Throw away the current frame
Carry on execution using the caller’s frame
(Its PC shows where it was when it made the call.)
Slide #34 of 61
Method Return (Static or Non-Static)
For methods that return an int result: ireturn
On caller’s operand stack
… → …, result
The int result is popped from the operand stack and pushed onto the caller’s operand stack.
Then do the return as on the previous slide.
Other Types
lreturn   long
freturn   float
dreturn   double
areturn   reference
Slide #35 of 61
Example: Recursive Factorial
/**
 * Calculate factorial.
 * requires: 0 <= n
 * @param n number whose factorial is to be calculated
 * @return factorial of n
 */
public static int fact(int n) {
    if (n == 0) {
        return 1;
    } else {
        return n * fact(n - 1);
    }
}
Slide #36 of 61
Recursion – When a method calls itself
Example shows how methods use frames
It also shows how frames save PC and Local Variables
So that the recursion works!
Slide #37 of 61
Frame for Factorial Example
One Parameter (n)
No other local variables
How much space do we need on the Operand Stack?
Slide #38 of 61
Frame for Factorial Example
Most complicated calculation: n * fact(n - 1)
Reverse Polish Notation: n n 1 - fact *
Operand Stack: initially empty, space for 3 ints
Local Variables: one int (n: slot 0)
Caller’s Frame
Slide #39 of 61
Java Disassembler (javap)
The Java “disassembler” converts bytecode to human-readable mnemonics
Command: javap
Can be applied to a .class file
Displays its structure
Option -c shows the bytecode
e.g.
javap -classpath . -c Factorial > filename
-classpath . : look in the current folder; not always needed
Factorial : disassemble Factorial.class
> filename : send output to a file if you wish
Option -l gives table of local variables
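To try the command, a class file is needed first. A source file along the following lines could be saved as Factorial.java and compiled with javac Factorial.java. The class name Factorial is taken from the javap command on this slide; the main method is an addition for illustration, and the constant pool indices that javap prints may differ from those shown in the lecture.

```java
/** Hypothetical source file Factorial.java for experimenting with javap. */
class Factorial {
    /** Calculate factorial. requires: 0 <= n */
    public static int fact(int n) {
        if (n == 0) {
            return 1;
        } else {
            return n * fact(n - 1);
        }
    }

    public static void main(String[] args) {
        System.out.println(fact(5)); // prints 120
    }
}
```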
Slide #40 of 61
Bytecode in mnemonics
public static int fact(int);
Code:
(line numbers = bytecode addresses in decimal)
 0: iload_0
 1: ifne 6
 4: iconst_1
 5: ireturn
 6: iload_0
 7: iload_0
 8: iconst_1
 9: isub
10: invokestatic #2 // Method fact:(I)I
13: imul
14: ireturn
Slide #41 of 61
Bytecode in bytes
Conditional jump is from address 1 (ifne) to 6 (iload_0). Hence offset = 5.
Mnemonics show absolute address 6.
We are not going to use these opcodes.
But they are what a class file is really made of.
Slide #42 of 61
The Disassembler Helps You
1: ifne 6
The jump offset is 5.
The disassembler tells you where the jump goes to, address 6.
10: invokestatic #2 // Method fact:(I)I
#2 is an index in the constant pool. That’s meaningless to us. It could be anything.
The disassembler puts in a comment to show what information is stored there: it’s a symbolic reference to the fact method.
Slide #43 of 61
What bytecode does what Java?
public static int fact(int);
Code:
 0: iload_0                              Push variable in slot 0, i.e. n
 1: ifne 6                               Conditional jump, if top of stack is non-zero
 4: iconst_1                             Push integer constant 1
 5: ireturn                              Return integer
 6: iload_0
 7: iload_0
 8: iconst_1
 9: isub                                 Integer subtract
10: invokestatic #2 // Method fact:(I)I
13: imul                                 Integer multiply
14: ireturn
Why do all the bytecode instructions start with i?
Mostly: shows int type for operation.
Exceptions: ifne, invokestatic
Slide #46 of 61
What bytecode does what Java?
public static int fact(int);
Code:                                    Stack
 0: iload_0                              n
 1: ifne 6                               empty
 4: iconst_1                             1
 5: ireturn                              1
 6: iload_0                              n
 7: iload_0                              n, n
 8: iconst_1                             n, n, 1
 9: isub                                 n, n - 1
10: invokestatic #2 // Method fact:(I)I  n, fact(n-1)
13: imul                                 n * fact(n-1)
14: ireturn
Slide #47 of 61
Executing fact(2) [1]
Frame for fact(2):
  PC = 0 (next instruction: 0: iload_0)
  Operand Stack: (empty)
  n (parameter, slot 0) = 2
Frame for main()
Slide #49 of 61
Executing fact(2) [2]
Executed 0: iload_0 (push n)
Frame for fact(2): PC = 1; Operand Stack: 2; n = 2
Frame for main()
Slide #50 of 61
Executing fact(2) [3]
Executed 1: ifne 6 (the popped value 2 is non-zero, so the jump is taken)
Frame for fact(2): PC = 6; Operand Stack: (empty); n = 2
Frame for main()
Slide #51 of 61
Executing fact(2) [4]
Stack after each instruction executed so far:
 0: iload_0     2
 1: ifne 6      Empty
 6: iload_0     2
 7: iload_0     2, 2
 8: iconst_1    2, 2, 1
 9: isub        2, 1
Frame for fact(2): PC = 10; Operand Stack: 2, 1; n = 2
Frame for main()
Slide #52 of 61
Recursive call fact(1) [5]
Two frames for fact: one for each call, i.e. fact(2) and fact(1)
Frame for fact(1): PC = 0; Operand Stack: (empty); n = 1
Frame for fact(2): PC = 13; Operand Stack: 2; n = 2
Frame for main()
Slide #53 of 61
Recursive call fact(1) [6]
Stack after each instruction executed in fact(1):
 0: iload_0     1
 1: ifne 6      Empty
 6: iload_0     1
 7: iload_0     1, 1
 8: iconst_1    1, 1, 1
 9: isub        1, 0
Frame for fact(1): PC = 10; Operand Stack: 1, 0; n = 1
Frame for fact(2): PC = 13; Operand Stack: 2; n = 2
Frame for main()
Slide #54 of 61
Recursive call fact(0) [7]
Frame for fact(0): PC = 0; Operand Stack: (empty); n = 0
Frame for fact(1): PC = 13; Operand Stack: 1; n = 1
Frame for fact(2): PC = 13; Operand Stack: 2; n = 2
Frame for main()
Slide #55 of 61
Recursive call fact(0) [8]
Stack after each instruction executed in fact(0):
 0: iload_0     0
 1: ifne 6      Empty
 4: iconst_1    1
Frame for fact(0): PC = 5; Operand Stack: 1; n = 0; Return Value = 1
Frame for fact(1): PC = 13; Operand Stack: 1; n = 1
Frame for fact(2): PC = 13; Operand Stack: 2; n = 2
Frame for main()
Slide #56 of 61
Return from fact(0) [9]
Frame for fact(1): PC = 13; Operand Stack: 1, 1; n = 1
Frame for fact(2): PC = 13; Operand Stack: 2; n = 2
Frame for main()
Slide #57 of 61
Completing fact(1) [10]
Executed 13: imul (1 * 1 = 1)
Frame for fact(1): PC = 14; Operand Stack: 1; n = 1; Return Value = 1
Frame for fact(2): PC = 13; Operand Stack: 2; n = 2
Frame for main()
Slide #58 of 61
Return from fact(1) [11]
Frame for fact(2): PC = 13; Operand Stack: 2, 1; n = 2
Frame for main()
Slide #59 of 61
Completing fact(2) [12]
Stack around the multiplication:
 before 13: imul    2, 1
 after 13: imul     2 (result of fact(2))
Frame for fact(2): PC = 14; Operand Stack: 2; n = 2
Frame for main()
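The whole walkthrough can be replayed with a toy interpreter. This is a sketch for illustration only, not how a real JVM is built: FactInterpreter, Frame, run and maxFrames are hypothetical names, the bytecode of fact is hard-coded as a switch on the PC, and the recursive invokestatic is modelled directly instead of going through a constant pool.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical toy interpreter for the bytecode of fact(int). */
final class FactInterpreter {
    /** One frame: local variable slot 0 (n), an operand stack, and its own PC. */
    static final class Frame {
        final int n;                      // local variable, slot 0
        final int[] stack = new int[3];   // operand stack: 3 ints are enough for fact
        int sp = 0;                       // number of entries on the operand stack
        int pc = 0;                       // address of the next instruction
        Frame(int n) { this.n = n; }
        void push(int v) { stack[sp++] = v; }
        int pop() { return stack[--sp]; }
    }

    static int maxFrames;   // largest number of fact frames alive during the last run

    static int run(int arg) {
        Deque<Frame> frames = new ArrayDeque<>();
        frames.push(new Frame(arg));
        maxFrames = 1;
        while (true) {
            Frame f = frames.peek();      // the current frame
            switch (f.pc) {
                case 0: f.push(f.n); f.pc = 1; break;           // iload_0
                case 1: f.pc = (f.pop() != 0) ? 6 : 4; break;   // ifne 6
                case 4: f.push(1); f.pc = 5; break;             // iconst_1
                case 6: f.push(f.n); f.pc = 7; break;           // iload_0
                case 7: f.push(f.n); f.pc = 8; break;           // iload_0
                case 8: f.push(1); f.pc = 9; break;             // iconst_1
                case 9: {                                       // isub
                    int v2 = f.pop(), v1 = f.pop();
                    f.push(v1 - v2); f.pc = 10; break;
                }
                case 10:                                        // invokestatic fact
                    f.pc = 13;                                  // caller resumes at imul
                    frames.push(new Frame(f.pop()));            // argument becomes slot 0
                    maxFrames = Math.max(maxFrames, frames.size());
                    break;
                case 13: {                                      // imul
                    int v2 = f.pop(), v1 = f.pop();
                    f.push(v1 * v2); f.pc = 14; break;
                }
                case 5: case 14: {                              // ireturn
                    int result = f.pop();
                    frames.pop();                               // throw away the frame
                    if (frames.isEmpty()) return result;        // back in main()
                    frames.peek().push(result);                 // onto caller's stack
                    break;
                }
                default: throw new IllegalStateException("bad pc " + f.pc);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run(2)); // prints 2
    }
}
```

Running run(2) creates three frames for fact in total (for n = 2, 1 and 0), and the fixed three-entry operand stack never overflows, matching the space worked out for the factorial frame earlier.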
Slide #60 of 61
Summary
In this lecture, we have seen:
Stack frames and their role in function execution
Local variables and how they are assigned slot numbers
Bytecode instructions and how operations are performed using the operand stack
Runtime constant pool and static method calls
Java disassembler and how to map high-level Java code to low-level bytecode instructions
Bytecode level execution of the recursive factorial function
Slide #61 of 61