Atmosphere 2014: Lockless programming - Tomasz Barański

Lockless Programming Tomasz Barański IBM Research

description

In the world of multi-core programming, traditional parallel programming techniques that rely on locks (mutexes and similar mechanisms) create performance bottlenecks. Lockless programming is a set of techniques that use atomic operations to synchronize data exchange between threads. The talk introduces the audience to lockless programming and presents its benefits and pitfalls. The presenter will talk about support for atomic operations in different CPU families as well as in lower- and higher-level languages. He will also cover reordering and memory barriers. He will end the talk with tips on designing lockless algorithms and practical examples of lockless data structures.

Tomasz Barański is a software developer working in Kraków for IBM T.J. Watson Research on projects related to high-performance computing. He has over 12 years of experience in the enterprise world, having worked as a developer, tester, interaction designer and go-to guy.

Transcript of Atmosphere 2014: Lockless programming - Tomasz Barański

Page 1: Atmosphere 2014: Lockless programming - Tomasz Barański

Lockless Programming

Tomasz Barański
IBM Research

Page 2: Atmosphere 2014: Lockless programming - Tomasz Barański

Me

Making software for 15 years

IBM Research @ KRK

Page 3: Atmosphere 2014: Lockless programming - Tomasz Barański

Lockless?

Page 4: Atmosphere 2014: Lockless programming - Tomasz Barański

Programming with multiple threads that access shared memory in such a way that the threads cannot block each other.

Page 5: Atmosphere 2014: Lockless programming - Tomasz Barański

Why?

Page 6: Atmosphere 2014: Lockless programming - Tomasz Barański
Page 7: Atmosphere 2014: Lockless programming - Tomasz Barański
Page 8: Atmosphere 2014: Lockless programming - Tomasz Barański
Page 9: Atmosphere 2014: Lockless programming - Tomasz Barański

And also

(Dead|Live)locks

Priority inversion

Lock convoy

Page 10: Atmosphere 2014: Lockless programming - Tomasz Barański

How?

Page 11: Atmosphere 2014: Lockless programming - Tomasz Barański

Atomic operations Memory barriers

Page 12: Atmosphere 2014: Lockless programming - Tomasz Barański

Atomic operations Memory barriers

(ἄτομος: indivisible)

Page 13: Atmosphere 2014: Lockless programming - Tomasz Barański

Atomic operations Memory barriers

CAS FAA|AAF

Page 14: Atmosphere 2014: Lockless programming - Tomasz Barański

Atomic operations Memory barriers

CAS FAA|AAF

LoadLoad LoadStore

StoreLoad StoreStore

Page 15: Atmosphere 2014: Lockless programming - Tomasz Barański

Compare-And-Swap

cas(val, old, new) =
  if val == old
    val = new
    return SUCCESS
  else
    return FAIL
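
In C++11 the pseudocode above corresponds to std::atomic<T>::compare_exchange_strong. The following is a minimal sketch, assuming an int value; the wrapper name cas, the helper cas_demo and the parameter name desired (used because new is a reserved word in C++) are illustrative only.

```cpp
#include <atomic>
#include <cassert>

// Atomically: if val == expected, set val = desired and report success.
bool cas(std::atomic<int>& val, int expected, int desired) {
    // compare_exchange_strong overwrites `expected` with the observed value
    // on failure; passing it by value hides that detail from the caller.
    return val.compare_exchange_strong(expected, desired);
}

// Single-threaded walk-through of both outcomes.
void cas_demo() {
    std::atomic<int> val{5};

    bool first = cas(val, 5, 7);    // val was 5, so it becomes 7
    assert(first);

    bool second = cas(val, 5, 9);   // val is 7, not 5: nothing is written
    assert(!second);
    assert(val.load() == 7);
}
```

The whole compare-and-write step is one indivisible operation, so no other thread can change val between the comparison and the store.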

Page 16: Atmosphere 2014: Lockless programming - Tomasz Barański

Fetch-And-Add

faa(val, i) =
  tmp = val
  val += i
  return tmp
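
A short C++11 sketch of the same operation, using std::atomic<T>::fetch_add. The function names count_in_parallel and faa_demo are made up for this example. fetch_add returns the old value (fetch-and-add), while operator+= on an atomic returns the new value (add-and-fetch).

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Each worker adds 1 to a shared counter `per_thread` times, without a lock.
long count_in_parallel(int threads, int per_thread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                counter.fetch_add(1);  // atomic read-modify-write
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter.load();
}

// Shows the difference between the FAA and AAF return values.
void faa_demo() {
    std::atomic<int> val{10};

    int tmp = val.fetch_add(5);  // FAA: returns the value before the addition
    assert(tmp == 10);
    assert(val.load() == 15);

    int now = (val += 5);        // AAF: returns the value after the addition
    assert(now == 20);
}
```

Because every increment is atomic, count_in_parallel(4, 10000) yields 40000 regardless of how the threads interleave; with a plain long and no lock, updates could be lost.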

Page 17: Atmosphere 2014: Lockless programming - Tomasz Barański

Sequential consistency

Page 18: Atmosphere 2014: Lockless programming - Tomasz Barański

Pseudo-assembly:

acquire lock
read X
read Y
(...)
store Y
store X
release lock

Page 19: Atmosphere 2014: Lockless programming - Tomasz Barański

As written:

acquire lock
read X
read Y
(...)
store Y
store X
release lock

After reordering (compiler, (JVM), CPU):

acquire lock
read Y
(...)
store X
(...)
read X
(...)
store Y
release lock

Page 20: Atmosphere 2014: Lockless programming - Tomasz Barański

Thread 1:

read Y
(...)
store X
(...)
read X
(...)
store Y

Thread 2:

read Y
(...)
store X
(...)
read X
(...)
store Y

Page 21: Atmosphere 2014: Lockless programming - Tomasz Barański

What are X and Y?

Page 22: Atmosphere 2014: Lockless programming - Tomasz Barański

Sequential consistency

All threads (on all CPUs) agree on the order of all memory operations, and that order is consistent with the order of operations in the source code.
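
One way to see what this definition rules out is the classic "store buffering" test, sketched here in C++11. It assumes the default ordering of std::atomic operations, which is memory_order_seq_cst; the function name both_read_zero is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the store-buffering test once.
// Returns true if both threads read 0 from the other thread's variable.
bool both_read_zero() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;

    std::thread t1([&] {
        x.store(1);      // memory_order_seq_cst by default
        r1 = y.load();
    });
    std::thread t2([&] {
        y.store(1);
        r2 = x.load();
    });
    t1.join();
    t2.join();

    // Under sequential consistency one of the two stores comes first in the
    // single agreed order, so the other thread's later load must observe it.
    return r1 == 0 && r2 == 0;
}
```

With sequentially consistent atomics both_read_zero() should never return true. If the four operations used memory_order_relaxed instead, the outcome r1 == 0 and r2 == 0 would be permitted, because each store could be reordered after the following load.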

Page 23: Atmosphere 2014: Lockless programming - Tomasz Barański

Memory barriers

Page 24: Atmosphere 2014: Lockless programming - Tomasz Barański

As written:

read X
LoadLoad Barrier
read Y
(...)
store Y
store X

After reordering (compiler, (JVM), CPU):

read X
(...)
store X
(...)
read Y
(...)
store Y

Page 25: Atmosphere 2014: Lockless programming - Tomasz Barański

As written:

read X
read Y
(...)
store Y
StoreStore Barrier
store X

After reordering (compiler, (JVM), CPU):

read Y
(...)
store Y
(...)
read X
(...)
store X

Page 26: Atmosphere 2014: Lockless programming - Tomasz Barański

As written:

read X
read Y
(...)
LoadStore Barrier
store Y
store X

After reordering (compiler, (JVM), CPU):

read Y
(...)
read X
(...)
store X
(...)
store Y

Page 27: Atmosphere 2014: Lockless programming - Tomasz Barański

As written:

store X
store Y
(...)
StoreLoad Barrier
read X
read Y

After reordering (compiler, (JVM), CPU):

store Y
(...)
store X
(...)
read X
(...)
read Y

Page 28: Atmosphere 2014: Lockless programming - Tomasz Barański

Full barrier
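
C++11 does not expose the four barrier types by name. As a rough mapping, std::atomic_thread_fence(std::memory_order_release) plays the role of LoadStore plus StoreStore, an acquire fence the role of LoadLoad plus LoadStore, and a memory_order_seq_cst fence is the closest equivalent of a full barrier. The sketch below uses a release/acquire fence pair to hand a plain value from one thread to another; Mailbox, produce, consume and hand_off are names invented for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Mailbox {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // flag that publishes the payload
};

void produce(Mailbox& box, int value) {
    box.payload = value;
    // Release fence: the payload write above may not move below the
    // flag store that follows.
    std::atomic_thread_fence(std::memory_order_release);
    box.ready.store(true, std::memory_order_relaxed);
}

int consume(Mailbox& box) {
    // Spin until the flag is set (no lock is taken).
    while (!box.ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    // Acquire fence: the payload read below may not move above the
    // flag load that precedes it.
    std::atomic_thread_fence(std::memory_order_acquire);
    return box.payload;
}

// Runs one producer and one consumer and returns what the consumer saw.
int hand_off(int value) {
    Mailbox box;
    int received = 0;
    std::thread reader([&] { received = consume(box); });
    std::thread writer([&] { produce(box, value); });
    writer.join();
    reader.join();
    return received;
}
```

Without the two fences the consumer could legally observe ready == true and still read a stale payload.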

Page 29: Atmosphere 2014: Lockless programming - Tomasz Barański

Let's get practical!

Page 30: Atmosphere 2014: Lockless programming - Tomasz Barański

Lock-free (FIFO) queue

(by John D. Valois)

Page 31: Atmosphere 2014: Lockless programming - Tomasz Barański
Page 32: Atmosphere 2014: Lockless programming - Tomasz Barański

enqueue(x) =
  acquire(lock)
  q = new Node
  q.value = x
  q.next = NULL
  tail.next = q
  tail = q
  release(lock)

Page 35: Atmosphere 2014: Lockless programming - Tomasz Barański

enqueue(x) =
  q = new Node
  q.value = x
  q.next = NULL
  do
    p = tail
    succ = CAS(p.next, NULL, q)
    if !succ
      CAS(tail, p, p.next)
  while !succ
  CAS(tail, p, q)

Page 37: Atmosphere 2014: Lockless programming - Tomasz Barański

dequeue() =
  do
    p = head
    if p.next == NULL
      error QUEUE_EMPTY
  while !CAS(head, p, p.next)
  return p.next.value
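
The enqueue and dequeue pseudocode can be sketched in C++11 roughly as follows. This is an illustration under stated assumptions, not the presenter's implementation: the class name LockFreeQueue and the helper stress are invented here, an empty queue is reported by returning false rather than raising QUEUE_EMPTY, and removed nodes are deliberately never freed, which sidesteps both the ABA problem and use-after-free at the cost of leaking memory.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

template <typename T>
class LockFreeQueue {
    struct Node {
        T value{};
        std::atomic<Node*> next{nullptr};
    };
    std::atomic<Node*> head_;  // always points at a dummy node
    std::atomic<Node*> tail_;  // last node, or a node slightly behind it

public:
    LockFreeQueue() {
        Node* dummy = new Node;
        head_.store(dummy);
        tail_.store(dummy);
    }

    void enqueue(const T& x) {
        Node* q = new Node;
        q->value = x;
        Node* p;
        bool succ;
        do {
            p = tail_.load();
            Node* observed = nullptr;
            // Try to link q after the node we believe is last.
            succ = p->next.compare_exchange_strong(observed, q);
            if (!succ) {
                // tail_ is lagging: help move it to the node we observed.
                Node* stale = p;
                tail_.compare_exchange_strong(stale, observed);
            }
        } while (!succ);
        // Swing tail_ to the new node; failing is fine, another thread helped.
        tail_.compare_exchange_strong(p, q);
    }

    // Returns false if the queue is empty, otherwise stores the value in out.
    bool dequeue(T& out) {
        Node* p;
        Node* next;
        do {
            p = head_.load();
            next = p->next.load();
            if (next == nullptr) return false;
        } while (!head_.compare_exchange_strong(p, next));
        out = next->value;  // next is now the dummy node; p is leaked on purpose
        return true;
    }
};

// Several producers enqueue 1..per_producer; one thread then drains the queue.
long long stress(int producers, int per_producer) {
    LockFreeQueue<int> queue;
    std::vector<std::thread> workers;
    for (int t = 0; t < producers; ++t) {
        workers.emplace_back([&queue, per_producer] {
            for (int i = 1; i <= per_producer; ++i) queue.enqueue(i);
        });
    }
    for (auto& w : workers) w.join();
    long long sum = 0;
    int v = 0;
    while (queue.dequeue(v)) sum += v;
    return sum;
}
```

A production version would need a safe memory reclamation scheme before it could delete dequeued nodes.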

Page 38: Atmosphere 2014: Lockless programming - Tomasz Barański

Never waits
Never blocks

Page 39: Atmosphere 2014: Lockless programming - Tomasz Barański

Silver bullet?

Page 40: Atmosphere 2014: Lockless programming - Tomasz Barański

More difficult
ABA problem

Page 41: Atmosphere 2014: Lockless programming - Tomasz Barański
Page 42: Atmosphere 2014: Lockless programming - Tomasz Barański
Page 43: Atmosphere 2014: Lockless programming - Tomasz Barański
Page 44: Atmosphere 2014: Lockless programming - Tomasz Barański
Page 45: Atmosphere 2014: Lockless programming - Tomasz Barański

Solution?

Tagged reference
Intermediate nodes

LL/SC
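
The tagged-reference idea can be sketched in C++11 by pairing the reference with a counter that is bumped on every successful update, so a slot that went from A to B and back to A no longer compares equal to an old snapshot. The sketch assumes a 32-bit node index instead of a raw pointer so that index and tag fit into one 64-bit atomic word; TaggedIndex, pack, unpack, tagged_cas and aba_demo are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

struct TaggedIndex {
    std::uint32_t index;  // which node the slot refers to
    std::uint32_t tag;    // bumped on every successful update
};

std::uint64_t pack(TaggedIndex t) {
    return (static_cast<std::uint64_t>(t.tag) << 32) | t.index;
}

TaggedIndex unpack(std::uint64_t word) {
    return TaggedIndex{static_cast<std::uint32_t>(word & 0xFFFFFFFFu),
                       static_cast<std::uint32_t>(word >> 32)};
}

// CAS on index and tag together; a successful update increments the tag.
bool tagged_cas(std::atomic<std::uint64_t>& slot, TaggedIndex expected,
                std::uint32_t new_index) {
    std::uint64_t old_word = pack(expected);
    std::uint64_t new_word = pack(TaggedIndex{new_index, expected.tag + 1});
    return slot.compare_exchange_strong(old_word, new_word);
}

// Single-threaded replay of an A -> B -> A history.
void aba_demo() {
    std::atomic<std::uint64_t> slot{pack(TaggedIndex{1, 0})};
    TaggedIndex snapshot = unpack(slot.load());  // a slow thread reads "A"

    bool to_b = tagged_cas(slot, unpack(slot.load()), 2);  // A -> B
    bool to_a = tagged_cas(slot, unpack(slot.load()), 1);  // B -> A again
    assert(to_b && to_a);

    // The index matches the snapshot again, but the tag does not.
    assert(unpack(slot.load()).index == snapshot.index);
    bool stale = tagged_cas(slot, snapshot, 3);
    assert(!stale);
}
```

A plain CAS on the index alone would have succeeded in the last step, which is exactly the ABA problem.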

Page 46: Atmosphere 2014: Lockless programming - Tomasz Barański

Load-Link / Store-Conditional

Separates "storage has value" from "storage has been changed".

PowerPC, ARM
but NOT: x86, SPARC

Page 47: Atmosphere 2014: Lockless programming - Tomasz Barański

LoadLink(x) =
  read(x)
  mark(x)

StoreConditional(x) =
  if x marked
    store(x)
    unmark(x)
    return SUCCESS
  else
    return FAILURE
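
C++11 has no direct LL/SC interface. The closest portable construct is compare_exchange_weak, which is allowed to fail spuriously so that compilers can implement it with a load-link/store-conditional pair on architectures such as ARM and PowerPC. A minimal retry-loop sketch follows; the function name fetch_multiply is made up for this example.

```cpp
#include <atomic>
#include <cassert>

// Atomically multiplies target by factor and returns the previous value.
int fetch_multiply(std::atomic<int>& target, int factor) {
    int observed = target.load();
    // On failure (including a spurious one) compare_exchange_weak reloads
    // `observed` with the current value, so the loop simply retries.
    while (!target.compare_exchange_weak(observed, observed * factor)) {
    }
    return observed;
}
```

Because the weak form is always used inside a loop, a spurious failure costs one extra iteration and nothing more.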

Page 48: Atmosphere 2014: Lockless programming - Tomasz Barański

Language support

Page 49: Atmosphere 2014: Lockless programming - Tomasz Barański

C (gcc)

__sync_fetch_and_add (_sub, _or...)
__sync_add_and_fetch (_sub, _or...)

__sync_bool_compare_and_swap
__sync_val_compare_and_swap

__sync_synchronize
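
The builtins above can be tried out as follows. This sketch assumes GCC or Clang, where the same __sync builtins are also available from C++; the function name sync_builtins_demo is hypothetical. GCC documents these as legacy builtins and recommends the newer __atomic family for new code.

```cpp
#include <cassert>

void sync_builtins_demo() {
    long counter = 0;

    // Fetch-and-add: returns the value before the addition.
    long before = __sync_fetch_and_add(&counter, 5L);
    assert(before == 0 && counter == 5);

    // Add-and-fetch: returns the value after the addition.
    long after = __sync_add_and_fetch(&counter, 5L);
    assert(after == 10 && counter == 10);

    // Boolean CAS: reports whether the swap happened.
    bool swapped = __sync_bool_compare_and_swap(&counter, 10L, 20L);
    assert(swapped && counter == 20);

    // Value CAS: returns what was actually found (20), so no swap occurs.
    long seen = __sync_val_compare_and_swap(&counter, 10L, 30L);
    assert(seen == 20 && counter == 20);

    // Full memory barrier.
    __sync_synchronize();
}
```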

Page 50: Atmosphere 2014: Lockless programming - Tomasz Barański

C++11

#include <atomic>

template <class T> struct atomic;

atomic_thread_fence(...)

::store(...)
::load(...)
::compare_exchange_weak(...) / ::compare_exchange_strong(...)
::fetch_add(...)

Page 51: Atmosphere 2014: Lockless programming - Tomasz Barański

Java

java.util.concurrent.atomic

AtomicInteger
.addAndGet
.getAndAdd
.compareAndSet

AtomicIntegerArray

AtomicReference
AtomicStampedReference

Page 52: Atmosphere 2014: Lockless programming - Tomasz Barański

?