Lecture 4:
Concurrent Data Structures
(Concurrent Linked Lists)
Companion slides for
The Art of Multiprocessor Programming
by Maurice Herlihy & Nir Shavit With modifications by Lamont Samuels
Concurrent Data Structures
• We assume
– shared-memory multiprocessors
environment
– concurrently execute multiple threads
which communicate and synchronize through data structures in shared memory
Art of Multiprocessor Programming 2
Concurrent Data Structures
• Far more difficult to design than
sequential ones
– Correctness
• Primary source of difficulty is concurrency
• The steps of different threads can be interleaved
arbitrarily
– Scalability (performance)
• We will look at
– Concurrent Linked List/Queue/Stack
Art of Multiprocessor Programming 3
Main performance issue of
lock based system
• Sequential bottleneck
– Atanypointintime,atmostonelock-protectedoperationis doing useful work.
• Memory contention
– Overheadintrafficasaresultofmultiplethreads concurrently attempting to access the same memory location.
• Blocking
– Ifthreadthatcurrentlyholdsthelockisdelayed,thenall other threads attempting to access are also delayed.
– Considernon-blocking(lock-free)algorithm
4
Nonblocking algorithms
• implemented by a hardware operation
– atomically combines a load and a store – Ex) compare-and-swap(CAS)
• lock-free
– if there is guaranteed system-wide progress;
– while a given thread might be blocked by other threads, all CPUs can
continue doing other useful work without stalls.
• wait-free
– if there is also guaranteed per-thread progress.
– in addition to all CPUs continuing to do useful work, no computation can
ever be blocked by another computation.
5
Linked List
• Illustrate these patterns …
• Using a list-based Set
– Common application
– Building block for other apps
Art of Multiprocessor Programming 6
Set Interface
• Unordered collection of items
• No duplicates
• Methods
– add(x) put x in set
– remove(x) take x out of set – contains(x) tests if x in set
Art of Multiprocessor Programming 7
List-Based Sets
public interface Set
}
Art of Multiprocessor Programming 8
List Node
public class Node {
public T item; // item of interest
public int key; // usually hash code publicNodenext; //referencetonextnode
}
Art of Multiprocessor Programming 9
The List-Based Set
-∞ a b c
+∞
Sorted with Sentinel nodes (min & max possible keys)
Art of Multiprocessor Programming 10
Sequential List Based Set
Add()
a
cd
Remove()
a
bc
Art of Multiprocessor Programming 11
Sequential List Based Set
Add()
a
b
cd
Remove()
a
bc
Art of Multiprocessor Programming 12
Course Grained Locking
a
bd
Art of Multiprocessor Programming 13
Course Grained Locking
a
c
bd
Art of Multiprocessor Programming 14
Course Grained Locking
a
honk!
honk!
Simple but hotspot + bottleneck Art of Multiprocessor Programming 15
bd
c
Coarse-Grained Synchronization
• Sequential bottleneck – Threads “stand in line”
• Adding more threads
– Does not improve throughput
– Struggle to keep it from getting worse
• So why even use a multiprocessor? – Well, some apps inherently parallel …
Art of Multiprocessor Programming 16
Coarse-Grained Synchronization
(Linked List)
public boolean add(T item) {
Node pred, curr;
int key = item.hashCode();
lock.lock();
try {
pred = head;
curr = pred.next;
while (curr.key < key) {
pred = curr;
curr = curr.next;
}
if (key == curr.key) {
return false;
} else {
Node node = new Node(item);
node.next = curr;
pred.next = node;
return true;
}
} finally {
lock.unlock();
}
}
17
public class CoarseList
private Node head;
private Node tail;
private Lock lock = new ReentrantLock();
public CoarseList() {
// Add sentinels to start and end
head = new Node(Integer.MIN_VALUE);
tail = new Node(Integer.MAX_VALUE);
head.next = this.tail;
}
public boolean remove(T item) {
Node pred, curr;
int key = item.hashCode();
lock.lock();
try {
pred = this.head;
curr = pred.next;
while (curr.key < key) {
pred = curr;
curr = curr.next;
}
if (key == curr.key)
pred.next = curr.next;
return true;
} else {
return false;
}
} finally {
lock.unlock();
}
}
public boolean contains(T item) {
Node pred, curr;
int key = item.hashCode();
lock.lock();
try {
pred = head;
curr = pred.next;
while (curr.key < key) {
pred = curr;
curr = curr.next;
}
return (key == curr.key);
} finally {
lock.unlock();
}
}
18
Coarse-Grained Locking
• Easy, same as synchronized methods
– “One lock to rule them all ...”
• Simple, clearly correct – Deserves respect!
• Works poorly with contention
Art of Multiprocessor Programming 19
Performance Improvement
• For highly-concurrent objects
• Goal:
– Concurrent access
– More threads, more throughput
Art of Multiprocessor Programming 20
First:
Fine-Grained Synchronization
• Instead of using a single lock ..
• Split object into
– Independently-synchronized components
• Methods conflict when they access – The same component ...
– At the same time
Art of Multiprocessor Programming 21
Second:
Optimistic Synchronization
• Search without locking ...
• If you find it, lock and check ... – OK: we are done
– Oops: start over
• Evaluation
– Usually cheaper than locking – Mistakes are expensive
Art of Multiprocessor Programming 22
Third:
Lazy Synchronization
• Postpone hard work
• Removing components is tricky
– Logical removal
• Mark component to be deleted
– Physical removal
• Do what needs to be done
Art of Multiprocessor Programming 23
Fourth:
Lock-Free Synchronization
• Don’t use locks at all
– Use compareAndSet() & relatives ...
• Advantages
– No Scheduler Assumptions/Support
• Disadvantages – Complex
– Sometimes high overhead
Art of Multiprocessor Programming 24
Fine-grained Locking
• Requires careful thought
– “Do not meddle in the affairs of wizards,
for they are subtle and quick to anger”
• Split object into pieces
– Each piece has own lock
– Methods that work on disjoint pieces
need not exclude each other
Art of Multiprocessor Programming 25
Fine-grained Locking
• Use multiple locks of small granularity to protect different parts of the data structure
• Goal
– To allow concurrent operations to proceed in parallel when they do not access the same parts of the data structure
Art of Multiprocessor Programming 26
Hand-over-Hand locking
abc
Art of Multiprocessor Programming 27
Hand-over-Hand locking
abc
Art of Multiprocessor Programming 28
Hand-over-Hand locking
abc
Art of Multiprocessor Programming 29
Hand-over-Hand locking
abc
Art of Multiprocessor Programming 30
Hand-over-Hand locking
abc
Art of Multiprocessor Programming 31
Removing a Node
abcd
remove(b)
Art of Multiprocessor Programming 32
Removing a Node
abcd
remove(b)
Art of Multiprocessor Programming 33
Removing a Node
abcd
remove(b)
Art of Multiprocessor Programming 34
Removing a Node
abcd
remove(b)
Art of Multiprocessor Programming 35
Removing a Node
abcd
remove(b)
Art of Multiprocessor Programming 36
Removing a Node
acd
remove(b)
Why do we need to always hold 2 locks?
Art of Multiprocessor Programming 37
Concurrent Removes
abcd
remove(c)
remove(b)
Art of Multiprocessor Programming 38
Concurrent Removes
abcd
remove(c)
remove(b)
Art of Multiprocessor Programming 39
Concurrent Removes
abcd
remove(c)
remove(b)
Art of Multiprocessor Programming 40
Concurrent Removes
abcd
remove(c)
remove(b)
Art of Multiprocessor Programming 41
Concurrent Removes
abcd
remove(c)
remove(b)
Art of Multiprocessor Programming 42
Concurrent Removes
abcd
remove(c)
remove(b)
Art of Multiprocessor Programming 43
Concurrent Removes
abcd
remove(c)
remove(b)
Art of Multiprocessor Programming 44
Concurrent Removes
abcd
remove(c)
remove(b)
Art of Multiprocessor Programming 45
Uh, Oh
acd
remove(c)
remove(b)
Art of Multiprocessor Programming 46
Uh, Oh
Bad news, C not removed
acd
remove(c)
remove(b)
Art of Multiprocessor Programming 47
Problem • To delete node c
– Swing node b’s next field to d abc
• Problem is,
– Someone deleting b concurrently could
direct a pointer to c
abc
Art of Multiprocessor Programming 48
Insight
• If a node is locked
– No one can delete node’s successor
• If a thread locks
– Node to be deleted – And its predecessor – Then it works
Art of Multiprocessor Programming 49
Hand-Over-Hand Again
abcd
remove(b)
Art of Multiprocessor Programming 50
Hand-Over-Hand Again
abcd
remove(b)
Art of Multiprocessor Programming 51
Hand-Over-Hand Again
abcd
remove(b)
Art of Multiprocessor Programming 52
Hand-Over-Hand Again
remove(b)
abcd
Found it!
Art of Multiprocessor Programming 53
Hand-Over-Hand Again
remove(b)
abcd
Found it!
Art of Multiprocessor Programming 54
Hand-Over-Hand Again
acd
remove(b)
Art of Multiprocessor Programming 55
Removing a Node
abcd
remove(c)
remove(b)
Art of Multiprocessor Programming 56
Removing a Node
abcd
remove(c)
remove(b)
Art of Multiprocessor Programming 57
Removing a Node
abcd
remove(c)
remove(b)
Art of Multiprocessor Programming 58
Removing a Node
abcd
remove(c)
remove(b)
Art of Multiprocessor Programming 59
Removing a Node
abcd
remove(c)
remove(b)
Art of Multiprocessor Programming 60
Removing a Node
abcd
remove(c)
remove(b)
Art of Multiprocessor Programming 61
Removing a Node
abcd
remove(c)
remove(b)
Art of Multiprocessor Programming 62
Removing a Node
abcd
remove(c)
remove(b)
Art of Multiprocessor Programming 63
Removing a Node
abcd
remove(c)
Must acquire Lock of b
Art of Multiprocessor Programming 64
Removing a Node
abcd
Cannot acquire lock of b
remove(c)
Art of Multiprocessor Programming 65
Removing a Node
abcd
remove(c)
Wait!
Art of Multiprocessor Programming 66
Removing a Node
abd
Proceed to remove(b)
Art of Multiprocessor Programming 67
Removing a Node
abd
remove(b)
Art of Multiprocessor Programming 68
Removing a Node
abd
remove(b)
Art of Multiprocessor Programming 69
Removing a Node
ad
remove(b)
Art of Multiprocessor Programming 70
Removing a Node
ad
Art of Multiprocessor Programming 71
public boolean add(T item) {
int key = item.hashCode();
head.lock();
Node pred = head;
try {
Node curr = pred.next;
curr.lock();
try {
while (curr.key < key) {
pred.unlock();
pred = curr;
curr = curr.next;
curr.lock();
}
if (curr.key == key) return false;
Node newNode = new Node(item);
newNode.next = curr;
pred.next = newNode;
return true;
} finally {
curr.unlock();
}
} finally {
pred.unlock();
}
}
Fine-Grained Synchronization:
hand-over-hand locking Linked List
72
public boolean remove(T item) {
Node pred = null, curr = null;
int key = item.hashCode();
head.lock();
try {
pred = head;
curr = pred.next;
curr.lock();
try {
while (curr.key < key) {
pred.unlock();
pred = curr;
curr = curr.next;
curr.lock();
}
if (curr.key == key) {
pred.next = curr.next;
return true;
}
return false;
} finally {
curr.unlock();
}
} finally {
pred.unlock();
} }
public boolean contains(T item) {
Node last = null, pred = null, curr
= null;
int key = item.hashCode();
head.lock();
try {
pred = head;
curr = pred.next;
curr.lock();
try {
while (curr.key < key) {
pred.unlock();
pred = curr;
curr = curr.next;
curr.lock();
}
return (curr.key == key);
} finally {
curr.unlock();
}
} finally {
pred.unlock();
} }
}
73
Adding Nodes
• To add node e
– Must lock predecessor – Must lock successor
• Neither can be deleted
Art of Multiprocessor Programming 74
Drawbacks
• Better than coarse-grained lock – Threads can traverse in parallel
• Still not ideal
– Long chain of acquire/release – Inefficient
Art of Multiprocessor Programming 75
Optimistic Synchronization
• Find nodes without locking
• Lock nodes
• Check that everything is OK
Art of Multiprocessor Programming 76
Optimistic: Traverse without Locking
a
b
de
Aha!
add(c)
Art of Multiprocessor Programming 77
Optimistic: Lock and Load
a
b
de
add(c)
Art of Multiprocessor Programming 78
What could go wrong?
a
b
de
remove(b)
add(c)
Aha!
Art of Multiprocessor Programming 79
Validate – Part 1
(while holding locks)
a
b
de
Yes, b still reachable from head
add(c)
Art of Multiprocessor Programming 80
What Else Can Go Wrong?
a
b
de
add(c)
Art of Multiprocessor Programming 81
What Else Can Go Wrong?
b’ a
b
de
add(b’)
add(c)
Art of Multiprocessor Programming 82
What Else Can Go Wrong?
b’ a
Aha!
b
de
add(c)
Art of Multiprocessor Programming 83
Validate Part 2
(while holding locks)
a
b
de
Yes, b still points to d
add(c)
Art of Multiprocessor Programming 84
Optimistic: Linearization Point
a
c
b
de
add(c)
Art of Multiprocessor Programming 85
Correctness
• If
– Nodes b and c both locked – Node b still accessible
– Node c still successor to b
• Then
– Neither will be deleted
– OK to delete and return true
Art of Multiprocessor Programming 86
Unsuccessful Remove
abde
Aha!
remove(c)
Art of Multiprocessor Programming 87
Validate (1)
abde
remove(c)
Yes, b still reachable from head
Art of Multiprocessor Programming 88
remove(c)
Validate (2)
abde Yes, b still
points to d
Art of Multiprocessor Programming 89
OK Computer
abde
remove(c)
return false
Art of Multiprocessor Programming 90
Correctness
• If
– Nodes b and d both locked – Node b still accessible
– Node d still successor to b
• Then
– Neither will be deleted
– No thread can add c after b – OK to return false
Art of Multiprocessor Programming 91
Validation
private boolean validate(Node pred,
Node curry) { Node node = head;
while (node.key <= pred.key) { if (node == pred)
return pred.next == curr; node = node.next;
}
return false; }
Art of Multiprocessor Programming 92
Validation
private boolean validate(Node pred,
Node curr) {
Node node = head;
while (node.key <= pred.key) {
if (node == pred)
return pred.next == curr;
node = node.next; }
return false;
}
Predecessor & current nodes
Art of Multiprocessor Programming 93
private boolean validate(Node pred,
Node curr) {
Node node = head;
while (node.key <= pred.key) { if (node == pred)
return pred.next == curr; node = node.next;
}
return false; }
Begin at the beginning
Validation
Art of Multiprocessor Programming 94
private boolean validate(Node pred,
Node curr) { Node node = head;
while (node.key <= pred.key) {
Validation
if (node == pred)
return pred.next == curr;
node = node.next; }
return false; }
Search range of keys
Art of Multiprocessor Programming 95
private boolean validate(Node pred,
Node curr) { Node node = head;
while (node.key <= pred.key) {
if (node == pred)
return pred.next == curr; node = node.next;
}
return false; }
Predecessor reachable
Validation
Art of Multiprocessor Programming 96
Validation
Node curry) { Node node = head;
while (node.key <= pred.key) { if (node == pred)
return pred.next == curr;
private boolean validate(Node pred,
node = node.next; }
return false; }
Is current node next?
Art of Multiprocessor Programming 97
Validation
private boolean validate(Node pred,
Node curr) { Node node = head;
while (node.key <= pred.key) { if (node == pred)
return pred.next == curr;
node = node.next;
}
return false; }
Otherwise move on
Art of Multiprocessor Programming 98
Validation
private boolean Predecessor not reachable validate(Node pred,
Node curr) { Node node = head;
while (node.key <= pred.key) { if (node == pred)
return pred.next == curr; node = node.next;
}
return false;
}
Art of Multiprocessor Programming 99
public boolean add(T item) {
int key = item.hashCode();
while (true) {
Node pred = this.head;
Node curr = pred.next;
while (curr.key < key) {
pred = curr; curr = curr.next;
}
pred.lock(); curr.lock();
try {
if (validate(pred, curr)) {
if (curr.key == key) {
return false;
} else {
Node node = new Entry(item);
entry.next = curr;
pred.next = node;
return true;
} }
} finally {
pred.unlock(); curr.unlock();
} }
}
Optimistic Synchronization
100
public boolean remove(T item) {
int key = item.hashCode();
while (true) {
Node pred = this.head;
Node curr = pred.next;
while (curr.key < key) {
pred = curr; curr = curr.next;
}
pred.lock(); curr.lock();
try {
if (validate(pred, curr)) {
if (curr.key == key) {
pred.next = curr.next;
return true;
} else {
return false;
}
}
} finally {
pred.unlock(); curr.unlock();
}
} }
public boolean contains(T item) {
int key = item.hashCode();
while (true) {
Node pred = this.head;
Node curr = pred.next;
while (curr.key < key) {
pred = curr; curr = curr.next;
}
try {
pred.lock(); curr.lock();
if (validate(pred, curr)) {
return (curr.key == key);
}
} finally {
pred.unlock(); curr.unlock();
} }
}
private boolean validate(Node pred, Node
curr) {
Node node = head;
while (node.key <= pred.key) {
if (node == pred)
return pred.next == curr;
Node = node.next;
}
return false;
}
101
Optimistic List
• Limited hot-spots
– Targets of add(), remove(), contains() – No contention on traversals
• Moreover
– Traversals are wait-free – Food for thought ...
Art of Multiprocessor Programming 102
So Far, So Good
• Much less lock acquisition/release
– Performance – Concurrency
• Problems
– Need to traverse list twice
– contains() method acquires locks
Art of Multiprocessor Programming 103
Evaluation
• Optimistic is effective if
– cost of scanning twice without locks
is less than
– cost of scanning once with locks
• Drawback
– contains() acquires locks – 90% of calls in many apps
Art of Multiprocessor Programming 104
Lazy List
• Like optimistic, except
– Scan once
– contains(x) never locks ...
• Key insight
– Removing nodes causes trouble – Do it “lazily”
Art of Multiprocessor Programming 105
Lazy List
– Scans list (as before)
– Locks predecessor & current (as before)
• Logical delete
– Marks current node as removed (new!)
• Physical delete
– Redirects predecessor’s next (as before)
• remove()
Art of Multiprocessor Programming 106
Lazy Removal
a b c d
Art of Multiprocessor Programming 107
Lazy Removal
a b c d
Present in list
Art of Multiprocessor Programming 108
Lazy Removal
a b c d
Logically deleted
Art of Multiprocessor Programming 109
Lazy Removal
a b c d
Physically deleted
Art of Multiprocessor Programming 110
Lazy Removal
ab d
Physically deleted
Art of Multiprocessor Programming 111
Lazy List
• All Methods
– Scan through locked and marked nodes
– Removing a node doesn’t slow down other
method calls ...
• Must still lock pred and curr nodes.
Art of Multiprocessor Programming 112
Validation
• No need to rescan list!
• Check that pred is not marked • Check that curr is not marked • Check that pred points to curr
Art of Multiprocessor Programming 113
Business as Usual
abc
Art of Multiprocessor Programming 114
Business as Usual
abc
Art of Multiprocessor Programming 115
Business as Usual
abc
Art of Multiprocessor Programming 116
Business as Usual
abc
remove(b)
Art of Multiprocessor Programming 117
Business as Usual
abc
a,b not marked
Art of Multiprocessor Programming 118
Business as Usual
abc
a still points to b
Art of Multiprocessor Programming 119
Business as Usual
abc
Logical delete
Art of Multiprocessor Programming 120
Business as Usual
abc
physical delete
Art of Multiprocessor Programming 121
Business as Usual
abc
Art of Multiprocessor Programming 122
Invariant
• If not marked then item in the set
• and reachable from head
• and if not yet traversed it is reachable from pred
Art of Multiprocessor Programming 123
Validation
private boolean
validate(Node pred, Node curr) {
return
!pred.marked && !curr.marked && pred.next == curr); }
Art of Multiprocessor Programming 124
List Validate Method
private boolean
validate(Node pred, Node curr) {
return
!pred.marked &&
!curr.marked && pred.next == curr); }
Predecessor not Logically removed
Art of Multiprocessor Programming 125
List Validate Method
private boolean
validate(Node pred, Node curr) {
return
!pred.marked && !curr.marked && pred.next == curr); }
Current not Logically removed
Art of Multiprocessor Programming 126
List Validate Method
private boolean
validate(Node pred, Node curr) {
return
!pred.marked && !curr.marked && pred.next == curr); }
Predecessor still Points to current
Art of Multiprocessor Programming 127
Contains
public boolean contains(Item item) { int key = item.hashCode();
Node curr = this.head;
while (curr.key < key) {
curr = curr.next; }
return curr.key == key && !curr.marked; }
Art of Multiprocessor Programming 128
Contains
public boolean contains(Item item) { int key = item.hashCode();
Node curr = this.head;
while (curr.key < key) {
curr = curr.next; }
return curr.key == key && !curr.marked; }
Start at the head
Art of Multiprocessor Programming 129
Contains
public boolean contains(Item item) { int key = item.hashCode();
Node curr = this.head;
while (curr.key < key) {
curr = curr.next; }
return curr.key == key && !curr.marked; }
Search key range
Art of Multiprocessor Programming 130
Contains
public boolean contains(Item item) { int key = item.hashCode();
Node curr = this.head;
while (curr.key < key) {
curr = curr.next;
}
return curr.key == key && !curr.marked; }
Traverse without locking (nodes may have been removed)
Art of Multiprocessor Programming 131
Contains
public boolean contains(Item item) { int key = item.hashCode();
Node curr = this.head;
while (curr.key < key) {
curr = curr.next; }
return curr.key == key && !curr.marked;
}
Present and undeleted?
Art of Multiprocessor Programming 132
public boolean add(T item) {
int key = item.hashCode();
while (true) {
Node pred = this.head;
Node curr = head.next;
while (curr.key < key) {
pred = curr; curr = curr.next;
}
pred.lock();
try {
curr.lock();
try {
if (validate(pred, curr)) {
if (curr.key == key) {
return false;
} else {
Node Node = new Node(item);
Node.next = curr;
pred.next = Node;
return true;
} }
} finally { // always unlock
curr.unlock();
}
} finally { // always unlock
pred.unlock();
}
} }
Lazy Synchronization
133
public boolean remove(T item) {
int key = item.hashCode();
while (true) {
Node pred = this.head;
Node curr = head.next;
while (curr.key < key) {
pred = curr; curr = curr.next;
}
pred.lock();
try {
curr.lock();
try {
if (validate(pred, curr)) {
if (curr.key != key) {
return false;
} else {
curr.marked = true;
pred.next = curr.next;
return true;
} }
} finally {
curr.unlock();
}
} finally {
pred.unlock();
}
} }
public boolean contains(T item) {
int key = item.hashCode();
Node curr = this.head;
while (curr.key < key)
curr = curr.next;
return curr.key == key && !curr.marked;
}
private boolean validate(Node pred, Node
curr) {
return !pred.marked && !curr.marked &&
pred.next == curr;
}
134
Evaluation
• Good:
– contains() doesn’t lock
– Good because typically high % contains() – Uncontended calls don’t re-traverse
• Bad
– Contended add() and remove() calls do re-
traverse
– Traffic jam if one thread delays
Art of Multiprocessor Programming 135
Traffic Jam
• Any concurrent data structure based on mutual exclusion has a weakness
• If one thread
– Enters critical section
– And “eats the big muffin”
• Cache miss, page fault, descheduled ...
– Everyone else using that lock is stuck! – Need to trust the scheduler....
Art of Multiprocessor Programming 136
Reminder: Lock-Free Data Structures
• No matter what ...
– Guarantees minimal progress in any
execution
– i.e. Some thread will always complete a
method call
– Even if others halt at malicious times
– Implies that implementation can’t use
locks
Art of Multiprocessor Programming 137