Announcements
Reminder: Project 1A is out.
¡ñ Very strongly encouraged to work on this project in IntelliJ.
¡ð Having the ability to visually debug your code is incredibly useful.
¡ð Having your IDE yell at you about compilation errors while you are
writing code is really nice to avoid issues with, for example, generics.
¡ñ Autograder is up, but we still want you to write your own tests.
¡ñ Tests not graded.
¡ñ On part 1B there will be graded tests, so might be worthwhile to write tests
just to save yourself some work next week.
CS61B, 2021
Lecture 7: Arrays and Lists
¡ñ A Last Look at Linked Lists
¡ñ Naive Array Lists
¡ñ Resizing Arrays
¡ñ Generic ALists
¡ñ Obscurantism in Java
A Last Look at Linked Lists
Doubly Linked Lists
Behold. The state of the art as we arrived at in last week¡¯s lecture. Through various improvements, we made all of the following operations fast:
¡ñ addFirst,addLast
¡ñ getFirst,getLast
¡ñ removeFirst,removeLast ¡ñ You will build this in project 1A.
addLast()
getLast()
removeLast()
get(int i)
size sentinel
3
??
5
17
38
prev value next
Arbitrary Retrieval
Suppose we added get(int i), which returns the ith item from the list.
Why would get be slow for long lists compared to getLast()? For what inputs?
addLast()
getLast()
removeLast()
get(int i)
size sentinel
3
??
5
17
38
prev value next
Arbitrary Retrieval
Suppose we added get(int i), which returns the ith item from the list.
Why would get be slow for long lists compared to getLast()? For what inputs?
¡ñ Have to scan to desired position. Slow for any i not near the sentinel node.
¡ñ How do we fix this?
addLast()
getLast()
removeLast()
get(int i)
size sentinel
3
??
5
17
38
prev value next
Arbitrary Retrieval
Suppose we added get(int i), which returns the ith item from the list.
Why would get be slow for long lists compared to getLast()? For what inputs?
¡ñ Have to scan to desired position. Slow for any i not near the sentinel node.
¡ñ Will discuss (much later) sophisticated changes that can speed up lists.
¡ñ For now: We¡¯ll take a different tack: Using an array instead (no links!).
addLast()
getLast()
removeLast()
get(int i)
size sentinel
3
??
5
17
38
prev value next
Naive Array Lists
Random Access in Arrays
Retrieval from any position of an array is very fast.
¡ñ Independent* of array size.
¡ñ 61C Preview: Ultra fast random access results from the fact that memory
boxes are the same size (in bits).
Our Goal: AList.java
Want to figure out how to build an array version of a list:
¡ñ In lecture we¡¯ll only do back operations. Project 1A is the front operations.
addLast()
getLast()
removeLast()
get(int i)
size sentinel
3
??
5
17
38
prev value next
addLast()
getLast()
removeLast()
get(int i)
???
Let¡¯s try it out…
Naive AList Code
From last lecture, ¡°things that must be true¡±.
AList Invariants:
¡ñ The position of the next item to be inserted is always size.
¡ñ size is always the number of items in the AList.
¡ñ The last item in the list is always in position size – 1.
Let¡¯s now discuss delete operations.
public class AList {
private int[] items;
private int size;
public AList() {
items = new int[100]; size = 0;
}
public void addLast(int x) {
items[size] = x;
size += 1;
}
public int getLast() {
return items[size – 1];
}
public int get(int i) {
return items[i];
}
public int size() {
return size;
} }
The Abstract vs. the Concrete
When we removeLast(), which memory boxes need to change? To what?- User¡¯s mental model: {5, 3, 1, 7, 22, -1} ¡ú {5, 3, 1, 7, 22}
Actual truth:
addLast()
getLast()
removeLast()
get(int i)
size items
6
5
3
1
7
22
-1
0
0
…
0
0
0 1 2 3 4 5 6 7 97 98 99
Deletion: yellkey.com/enter
When we removeLast(), which memory boxes need to change? To what?
a) size
b) size and items
c) size and items[i] for some i
d) size, items, and items[i] for
some i
e) size, items, and items[i] for
addLast()
getLast()
removeLast()
get(int i)
size items
6
5
3
1
7
22
-1
0
0
…
0
0
0
0 1 2 3 4 5 6 7
many different i 97 98 99
¡ñ The position of the next item to be inserted is always size.
¡ñ size is always the number of items in the AList.
¡ñ The last item in the list is always in position size – 1.
AList invariants.
Naive AList Code
AList Invariants:
¡ñ The position of the next item to be inserted is always size.
¡ñ size is always the number of items in the AList.
¡ñ The last item in the list is always in position size – 1.
Setting deleted item to zero is not necessary to preserve invariants, and thus not necessary for correctness.
public int removeLast() {
int returnItem = items[size – 1];
items[size – 1] = 0;
size -= 1;
return returnItem;
}
public class AList {
private int[] items;
private int size;
public AList() {
items = new int[100]; size = 0;
}
public void addLast(int x) {
items[size] = x;
size += 1;
}
public int getLast() {
return items[size – 1];
}
public int get(int i) {
return items[i];
}
public int size() {
return size;
} }
The Mighty AList
Key Idea: Use some subset of the entries of an array.
addLast()
getLast()
removeLast()
get(int i)
size items
6
5
3
1
7
22
-1
3
2
…
7
12
4
0 1 2 3 4 5 6 7 97 98 99
The Mighty (?) AList
Key Idea: Use some subset of the entries of an array.
addLast()
getLast()
removeLast()
get(int i)
size items
100
5
3
1
7
22
-1
5
7
…
-1
5
7
0 1 2 3 4 5 6 7 97 98 99
What happens if we insert into the AList above? What should we do about it?
Resizing Arrays
Array Resizing
When the array gets too full, e.g. addLast(11), just make a new array:
size==items.length
addLast()
getLast()
removeLast()
get(int i)
size items
100
5
3
1
7
22
-1
5
7
…
-1
5
7
0 1 2 3 4 5 6 7 97 98 99
Array Resizing
When the array gets too full, e.g. addLast(11), just make a new array: ¡ñ int[] a = new int[size+1];
size==items.length
addLast()
getLast()
removeLast()
get(int i)
size items
100
5
3
1
7
22
-1
5
7
…
-1
5
7
a 0 1 2 3 4 5 6 7 97 98 99
0
0
0
0
0
0
0
0
…
0
0
0
0
0 1 2 3 4 5 6 7 97 98 99 100
Array Resizing
When the array gets too full, e.g. addLast(11), just make a new array: ¡ñ int[] a = new int[size+1];
¡ñ System.arraycopy(…)
size==items.length
addLast()
getLast()
removeLast()
get(int i)
size items
100
5
3
1
7
22
-1
5
7
…
-1
5
7
a 0 1 2 3 4 5 6 7 97 98 99
5
3
1
7
22
-1
5
7
…
-1
5
7
0
0 1 2 3 4 5 6 7 97 98 99 100
Array Resizing
When the array gets too full, e.g. addLast(11), just make a new array:
¡ñ int[] a = new int[size+1]; ¡ñ System.arraycopy(…)
¡ñ a[size] = 11;
size==items.length
addLast()
getLast()
removeLast()
get(int i)
size items
100
5
3
1
7
22
-1
5
7
…
-1
5
7
a 0 1 2 3 4 5 6 7 97 98 99
5
3
1
7
22
-1
5
7
…
-1
5
7
11
0 1 2 3 4 5 6 7 97 98 99 100
Array Resizing
When the array gets too full, e.g. addLast(11), just make a new array:
size==items.length
¡ñ int[] a = new int[size+1]; ¡ñ System.arraycopy(…)
addLast()
getLast()
removeLast()
get(int i)
size items
101
¡ñ ¡ñ
a[size] = 11;
items = a; size +=1;
5
3
1
7
22
-1
5
7
…
-1
5
7
a
0 1 2 3 4 5 6 7
97 98 99
5
3
1
7
22
-1
5
7
…
-1
5
7
11
0 1 2 3 4 5 6 7
97 98 99 100
Array Resizing
When the array gets too full, e.g. addLast(11), just make a new array:
¡ñ int[] a = new int[size+1]; ¡ñ System.arraycopy(…)
¡ñ a[size] = 11;
¡ñ items = a; size +=1;
We call this process ¡°resizing¡±
size==items.length
addLast()
getLast()
removeLast()
get(int i)
size items
101
a
5
3
1
7
22
-1
5
7
…
-1
5
7
11
0 1 2 3 4 5 6 7
97 98 99 100
Implementation
Let¡¯s implement the resizing capability.
¡ñ As usual, for those of you watching online, I recommend trying to implement this on your own before watching me do it.
¡ñ Starter code is provided in the lists4 study guide if you want to try it out on a computer.
Resizing Array Code
private void resize(int capacity) {
int[] a = new int[capacity];
System.arraycopy(items, 0, a, 0, size);
items = a;
}
public void addLast(int x) {
if (size == items.length) {
resize(size + 1);
}
items[size] = x;
size += 1; }
public void addLast(int x) {
if (size == items.length) {
int[] a = new int[size + 1];
System.arraycopy(items, 0, a, 0, size);
items = a;
}
items[size] = x;
size += 1;
}
Works
Much Better
Runtime and Space Usage Analysis: yellkey.com/camera
Suppose we have a full array of size 100. If we call addLast two times, how many total array memory boxes will we need to create and fill (for just these 2 calls)?
A. 0
B. 101
C. 203
D. 10,302
Bonus question: What is the maximum number of array boxes that Java will track at any given time? Assume that ¡°garbage collection¡± happens immediately when all references to an object are lost.
private void resize(int capacity) {
int[] a = new int[capacity];
System.arraycopy(items, 0, a, 0, size);
items = a;
}
public void addLast(int x) {
if (size == items.length) {
resize(size + 1);
}
items[size] = x;
size += 1; }
Array Resizing
Resizing twice requires us to create and fill 203 total memory boxes.
¡ñ Bonus answer: Most boxes at any one time is 203.
¡ñ When the second addLast is done, we are left with 102 boxes. 100
5
3
1
7
22
-1
5
7
…
-1
5
7
0 1 2 3 4 5 6 7
97 98 99
5
3
1
7
22
-1
5
7
…
-1
5
7
11
101
0 1 2 3 4 5 6 7
97 98 99 100
5
3
1
7
22
-1
5
7
…
-1
5
7
11
3
102
0 1 2 3 4 5 6 7
97 98 99 100 101
Runtime and Space Usage Analysis: yellkey.com/focus
Suppose we have a full array of size 100. If we call addLast until size = 1000,
roughly how many total array memory boxes will we need to create and fill?
A. 1,000
B. 500,000
C. 1,000,000
D. 500,000,000,000
E. 1,000,000,000,000
Bonus question: What is the maximum number of array boxes that Java will track at any given time? Assume that ¡°garbage collection¡± happens immediately when all references to an object are lost.
private void resize(int capacity) {
int[] a = new int[capacity];
System.arraycopy(items, 0, a, 0, size);
items = a;
}
public void addLast(int x) {
if (size == items.length) {
resize(size + 1);
}
items[size] = x;
size += 1; }
Runtime and Space Usage Analysis
Suppose we have a full array of size 100. If we call addLast until size = 1000, roughly how many total array memory boxes will we need to create and fill?
B. 500,000
Going from capacity 100 to 101: 101 From 101 to 102: 102
…
From: 999 to 1000: 1000
private void resize(int capacity) {
int[] a = new int[capacity];
System.arraycopy(items, 0, a, 0, size);
items = a;
}
Total array boxes created/copied: 101 + 102 + … + 1000
Since sum of 1 + 2 + 3 + … + N = N(N+1)/2, sum(101, …, 1000) is close to 500,000. See: http://mathandmultimedia.com/2010/09/15/sum-first-n-positive-integers
We¡¯ll be doing a lot of this after the midterm.
Resizing Slowness
Inserting 100,000 items requires roughly 5,000,000,000 new containers.
¡ñ Computers operate at the speed of GHz (due billions of things per second).
¡ñ No huge surprise that 100,000 items took seconds.
Note: Insert here is addFirst
Fixing the Resizing Performance Bug
How do we fix this?
private void resize(int capacity) {
int[] a = new int[capacity];
System.arraycopy(items, 0, a, 0, size);
items = a;
}
public void addLast(int x) {
if (size == items.length) {
resize(size + 1);
}
items[size] = x;
size += 1; }
(Probably) Surprising Fact
Geometric resizing is much faster: Just how much better will have to wait.
public void addLast(int x) {
if (size == items.length) {
resize(size + RFACTOR);
}
items[size] = x;
size += 1; }
public void addLast(int x) {
if (size == items.length) {
resize(size * RFACTOR);
}
items[size] = x;
size += 1; }
Great performance.
This is how the Python list is implemented.
Unusably bad.
Performance Problem #2
Suppose we have a very rare situation occur which causes us to:
¡ñ Insert 1,000,000,000 items.
¡ñ Then remove 990,000,000 items.
Our data structure will execute these operations acceptably fast, but afterwards
there is a problem.
¡ñ What is the problem?
Memory Efficiency
An AList should not only be efficient in time, but also efficient in space.
¡ñ ¡ñ ¡ñ
Define the ¡°usage ratio¡± R = size / items.length; Typical solution: Half array size when R < 0.25. More details in a few weeks.
Usage ratio: 4/100 = 0.04
addLast()
getLast()
removeLast()
get(int i)
size items
4
5
3
1
7
0
0
0
0
...
0
0
0
0 1 2 3 4 5 6 7 97 98 99
Later we will consider tradeoffs between time and space efficiency for a variety of algorithms and data structures.
Generic ALists
Generic ALists (similar to generic SLists)
public class AList { bleepblrop
private int[] items;
private int size;
public AList() {
items = new int[8];
size = 0;
}
private void resize(int capacity) {
int[] a = new int[capacity];
System.arraycopy(items, 0,
a, 0, size);
items = a; }
public int get(int i) {
return items[i];
} ...
public class AList
private Glorp[] items;
private int size;
public AList() {
items = (Glorp []) new Object[8];
size = 0;
}
private void resize(int cap) {
Glorp[] a = (Glorp []) new Object[cap];
System.arraycopy(items, 0,
a, 0, size);
items = a; }
public Glorp get(int i) {
return items[i];
} …
Generic ALists (similar to generic SLists)
When creating an array of references to Glorps:
¡ñ (Glorp []) new Object[cap]; ¡ñ Causes a compiler warning,
which you should ignore.
Why not just new Glorp[cap];
¡ñ Will cause a ¡°generic array creation¡± error.
public class AList
private Glorp[] items;
private int size;
public AList() {
items = (Glorp []) new Object[8];
size = 0;
}
private void resize(int cap) {
Glorp[] a = (Glorp []) new Object[cap];
System.arraycopy(items, 0,
a, 0, size);
items = a; }
public Glorp get(int i) {
return items[i];
} …
Nulling Out Deleted Items
Unlike integer based ALists, we actually want to null out deleted items.
¡ñ Java only destroys unwanted objects when the last reference has been lost.
¡ñ Keeping references to unneeded objects is sometimes called loitering.
¡ñ Save memory. Don¡¯t loiter.
public Glorp deleteBack() { Glorp returnItem = getBack(); items[size – 1] = null;
size -= 1;
return returnItem;
}
size items
3
null
0123
Loitering Example
Changing size to 2 yields a correct AList.
¡ñ But memory is wasted storing a reference to the red sign image.
size items
2
null
0123
Loitering Example
Changing size to 2 yields a correct AList.
¡ñ But memory is wasted storing a reference to the red sign image.
By nulling out items[2], Java is free to destroy the unneeded image from memory, which could be potentially megabytes in size.
size items
2
null
null
0123
Loitering Example
Changing size to 2 yields a correct AList.
¡ñ But memory is wasted storing a reference to the red sign image.
By nulling out items[2], Java is free to destroy the unneeded image from memory, which could be potentially megabytes in size.
size items
2
null
null
0123
Obscurantism in Java
One last thought: Obscurantism in Java
We talk of ¡°layers of abstraction¡± often in computer science.
¡ñ Related concept: obscurantism. The user of a class does not and should not know how it works.
User¡¯s mental model: {5, 3, 1, 7, 22, -1} ¡ú {5, 3, 1, 7, 22} Actual truth:
addLast()
getLast()
removeLast()
get(int i)
size items
5
5
3
1
7
22
-1
0
0
…
0
0
0
0 1 2 3 4 5 6 7 97 98 99
One last thought: Obscurantism in Java
We talk of ¡°layers of abstraction¡± often in computer science.
¡ñ Related concept: obscurantism. The user of a class does not and should not know how it works.
¡ð The Java language allows you to enforce this with ideas like private! ¡ñ A good programmer obscures details from themselves, even within a class.
¡ð Example: addFirst and resize should be written totally independently. You should not be thinking about the details of one method while writing the other. Simply trust that the other works.
¡ð Breaking programming tasks down into small pieces (especially functions) helps with this greatly!
¡ð Through judicious use of testing, we can build confidence in these small pieces, as we¡¯ll see in the next lecture.
Citations
Hanging Containers:
http://www.portcalls.com/wp-content/uploads/2012/04/hanging_containers1.j pg
Loitering:
http://i.istockimg.com/file_thumbview_approve/19711163/6/stock-photo-197 11163-red-loitering-prohibited-sign.jpg
http://images.mysecuritysign.com/img/lg/K/No-Loitering-Sign-K-5418.gif http://3.bp.blogspot.com/-NV3y2NQDFy0/UAAXB5gINoI/AAAAAAAALi8/F_bM4 -dmsm4/s1600/DVC00575.JPG
Red pill/blue pill: https://en.wikipedia.org/wiki/Red_pill_and_blue_pill