In computer science, locking is hard to get right. In cloud computing, lock contention is a problem that must be solved or avoided, so there are many approaches to implementing lock-free data structures. This article introduces only the basics of a lock-free queue.
6. CAS
Compare and Swap/Set (the x86 instruction is cmpxchg)
It compares the contents of a memory location to a given value
and, only if they are the same, modifies the contents of that
memory location to a given new value. This is done as a single
atomic operation. The atomicity guarantees that the new value
is calculated based on up-to-date information; if the value had
been updated by another thread in the meantime, the write
would fail.
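/* Semantics only: a real CAS performs this whole body as one atomic step. */
int compare_and_swap (int* reg, int oldval, int newval)
{
    int old_reg_val = *reg;      /* value currently stored */
    if (old_reg_val == oldval)
        *reg = newval;           /* write only if nobody changed it */
    return old_reg_val;          /* caller compares this with oldval */
}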
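The pseudocode above only describes the contract. As a minimal sketch, assuming C++11 and a hypothetical wrapper that keeps the same name, the same contract can be expressed on top of std::atomic so that the compare and the write really are one atomic step:

```cpp
#include <atomic>

// Hypothetical helper mirroring the pseudocode's contract: returns the
// value that was in reg before the call, and stores newval only if that
// value was equal to oldval.
int compare_and_swap(std::atomic<int>& reg, int oldval, int newval) {
    int expected = oldval;
    // On failure, compare_exchange_strong copies the value it actually
    // found into `expected`; on success `expected` still equals oldval.
    reg.compare_exchange_strong(expected, newval);
    return expected;
}
```

A caller can tell whether the swap happened by comparing the return value with the oldval it passed in.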
7. CAS in C/C++
GCC
bool __sync_bool_compare_and_swap (type *ptr, type oldval, type newval, ...)
type __sync_val_compare_and_swap (type *ptr, type oldval, type newval, ...)
Windows
LONG InterlockedCompareExchange ( __inout LONG volatile *Target,
                                  __in LONG Exchange,
                                  __in LONG Comperand);
C++11
template< class T >
bool atomic_compare_exchange_weak( std::atomic<T>* obj,
                                   T* expected, T desired );
template< class T >
bool atomic_compare_exchange_weak( volatile std::atomic<T>* obj,
                                   T* expected, T desired );
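The weak form is allowed to fail spuriously, so it is normally called in a retry loop. The sketch below is one possible use, not part of the original slides: store_max is a hypothetical helper that raises a shared value to at least a candidate value using the C++11 free function listed above.

```cpp
#include <atomic>

// Hypothetical helper: raise `target` to at least `candidate`.
// Returns true if this call stored `candidate`, false if the current
// value was already greater than or equal to it.
bool store_max(std::atomic<int>& target, int candidate) {
    int seen = target.load();
    while (seen < candidate) {
        // The call may fail spuriously or because another thread changed
        // target; in both cases `seen` is refreshed with the current value
        // and the loop condition is checked again.
        if (std::atomic_compare_exchange_weak(&target, &seen, candidate))
            return true;
    }
    return false;
}
```

The loop never blocks: a thread that loses the race simply re-reads the value and decides again.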
8. Lock-free queue
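List implementation
EnQueue(x)
{
    q = new record();
    q->value = x;
    q->next = NULL;
    do {
        p = tail;                              // snapshot of the current tail
    } while (CAS(p->next, NULL, q) != TRUE);   // link q after the last node
    CAS(tail, p, q);   // why do we NOT check the return value?
}
// The CAS in the while loop succeeds in exactly one thread, say T1; all
// the other threads fail and retry. Until T1 updates tail, p->next is
// not NULL for them, so only T1 can move tail and its second CAS cannot
// fail. After T1 updates the tail pointer, one of the other threads can
// read the new tail pointer and succeed.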
9. Lock-free queue
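Enhancement
If thread T1 is suspended before it updates the tail pointer, all other
threads spin forever in the retry loop. Instead, each thread walks from
tail to the real last node:
EnQueue(x) {
    q = new record();
    q->value = x;
    q->next = NULL;
    p = tail; oldp = p;
    do {
        while (p->next != NULL)    // skip nodes other threads already linked
            p = p->next;
    } while (CAS(p->next, NULL, q) != TRUE);
    CAS(tail, oldp, q);    // may fail if tail has moved; later callers walk forward
}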
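The enhanced EnQueue can be sketched in C++11 as follows. This is an illustration under stated assumptions, not a complete queue: Node, Queue and enqueue are hypothetical names, the list starts with a dummy node, there is no dequeue, and nodes are never freed, so memory reclamation and the ABA issue are left out of scope.

```cpp
#include <atomic>

// Hypothetical node and queue types for this sketch.
struct Node {
    int value;
    std::atomic<Node*> next{nullptr};
    explicit Node(int v) : value(v) {}
};

struct Queue {
    std::atomic<Node*> head;
    std::atomic<Node*> tail;
    Queue() {
        Node* dummy = new Node(0);   // dummy node so tail is never null
        head.store(dummy);
        tail.store(dummy);
    }
};

void enqueue(Queue& q, int x) {
    Node* node = new Node(x);
    Node* oldp = q.tail.load();
    Node* p = oldp;
    for (;;) {
        // Walk past nodes that other threads linked but that tail
        // does not point at yet.
        Node* next;
        while ((next = p->next.load()) != nullptr)
            p = next;
        Node* expected = nullptr;
        if (p->next.compare_exchange_weak(expected, node))
            break;                   // node is now linked at the end
    }
    // Try to swing tail to the new node. A failure is tolerated: tail
    // then lags behind, and later callers walk forward from it.
    q.tail.compare_exchange_strong(oldp, node);
}
```

Because a thread never waits for another thread to publish tail, a suspended enqueuer cannot stall the others; the cost is the extra walk when tail lags.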
11. CAS ABA issue
It's possible that between the time the old value is
read and the time CAS is attempted, some other
processors or threads change the memory location
two or more times such that it acquires a bit pattern
which matches the old value. The problem arises if
this new bit pattern, which looks exactly like the old
value, has a different meaning.
CAS only compares the pointer address. What if that
address has been freed and reused for a different node?
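The interleaving can be replayed deterministically in a single thread. The sketch below is an assumption-laden illustration (aba_demo and the three stack nodes are hypothetical): it plays the steps of two threads on a linked stack in sequence and shows that the stale CAS still succeeds.

```cpp
#include <atomic>

struct Node { int value; Node* next; };

// Returns true if the stale CAS succeeded although the stack changed
// underneath it.
bool aba_demo() {
    Node c{3, nullptr}, b{2, &c}, a{1, &b};
    std::atomic<Node*> top{&a};          // stack: a -> b -> c

    // "Thread 1" begins a pop: it reads top and its successor, then stalls.
    Node* seen_top = top.load();         // &a
    Node* seen_next = seen_top->next;    // &b

    // "Thread 2" meanwhile pops a, pops b, and pushes a back.
    top.store(&c);                       // a and b are gone, stack: c
    a.next = &c;
    top.store(&a);                       // stack: a -> c, same address for a

    // Thread 1 resumes. top holds &a again, so the CAS succeeds and
    // installs &b, a node that is no longer part of the stack.
    bool swapped = top.compare_exchange_strong(seen_top, seen_next);
    return swapped && top.load() == &b;
}
```

In a real program b might already have been freed at this point, which is why the reused address is dangerous.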
12. ABA solution
Double-length CAS
On a 32-bit system, use a 64-bit CAS. The first half holds the pointer and the second half is used
to hold a counter. The compare part of the operation
compares the previously read value of the pointer *and*
the counter, to the current pointer and counter. If they
match, the swap occurs - the new value is written - but
the new value has an incremented counter.
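A minimal sketch of the counter idea, assuming a 32-bit index stands in for the 32-bit pointer and that both halves are packed into one 64-bit atomic word. The names pack, index_of, count_of and tagged_cas are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Assumed layout: low 32 bits hold the index (standing in for a 32-bit
// pointer), high 32 bits hold a modification counter.
inline std::uint64_t pack(std::uint32_t index, std::uint32_t count) {
    return (static_cast<std::uint64_t>(count) << 32) | index;
}
inline std::uint32_t index_of(std::uint64_t w) {
    return static_cast<std::uint32_t>(w);
}
inline std::uint32_t count_of(std::uint64_t w) {
    return static_cast<std::uint32_t>(w >> 32);
}

// Replace the index only if index AND counter still match `seen`.
// The stored word always carries an incremented counter.
bool tagged_cas(std::atomic<std::uint64_t>& slot, std::uint64_t seen,
                std::uint32_t new_index) {
    std::uint64_t desired = pack(new_index, count_of(seen) + 1);
    return slot.compare_exchange_strong(seen, desired);
}
```

After an A, B, A sequence the index matches the stale snapshot again but the counter does not, so a CAS based on that snapshot fails. The counter can wrap around in principle, which this sketch ignores.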
14. Lock-free queue in Disruptor
Ring-buffer implementation
sequence mod array length = array index
Only a tail pointer is maintained
It is faster because the array is cache-friendly and its entries are
pre-allocated and pre-loaded, so there is nothing to clean up
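The sequence-to-index mapping can be sketched as below. This is not the Disruptor's code (the Disruptor is a Java library); it is a C++ illustration with hypothetical names, assuming a single producer, a power-of-two array length so that the modulo becomes a bit mask, and no check that consumers have caught up before a slot is overwritten.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kSize = 8;   // assumed length, must be a power of two

// sequence mod array length, computed with a mask.
inline std::size_t slot_index(std::uint64_t sequence) {
    return static_cast<std::size_t>(sequence & (kSize - 1));
}

struct Ring {
    long slots[kSize] = {};                 // pre-allocated, reused forever
    std::atomic<std::uint64_t> cursor{0};   // next sequence to publish
};

// Single-producer publish: write the slot first, then advance the cursor
// with release ordering so a reader that observes the new cursor also
// observes the slot contents. Returns the sequence that was used.
std::uint64_t publish(Ring& r, long value) {
    std::uint64_t seq = r.cursor.load(std::memory_order_relaxed);
    r.slots[slot_index(seq)] = value;
    r.cursor.store(seq + 1, std::memory_order_release);
    return seq;
}
```

Since the slots are reused in place, nothing is allocated or freed per element, which is the property the slide lists as "no need to clean up".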
18. False sharing
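struct foo {
    int x;
    int y;
};
static struct foo f;
/* f.x and f.y sit in the same cache line, so every ++f.y in inc_b
   invalidates the line that sum_a is reading, although the two
   functions never touch the same variable. */
/* The two following functions are running concurrently: */
int sum_a(void){
    int s = 0;
    int i;
    for (i = 0; i < 1000000; ++i)
        s += f.x;
    return s;
}
void inc_b(void){
    int i;
    for (i = 0; i < 1000000; ++i)
        ++f.y;
}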
19. Eliminate False sharing
Disruptor – cache line padding
public long p1, p2, p3, p4, p5, p6, p7; // cache line padding
private volatile long cursor = 0;
http://www.drdobbs.com/parallel/eliminate-falsesharing/217500206
http://ifeve.com/false-sharing/
http://ifeve.com/volatile/
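In C++11 the same effect is usually obtained with alignas rather than manual padding fields. A minimal sketch, assuming a 64-byte cache line (the real size depends on the CPU) and hypothetical type names:

```cpp
#include <atomic>
#include <cstddef>

constexpr std::size_t kCacheLine = 64;   // assumed cache line size

// Aligning the member to the line size also rounds the struct size up
// to a full line, so two adjacent counters never share one.
struct PaddedCounter {
    alignas(kCacheLine) std::atomic<long> value{0};
};

struct Counters {
    PaddedCounter reads;    // updated by one thread
    PaddedCounter writes;   // updated by another thread
};
```

With this layout a write to writes.value does not invalidate the cache line that holds reads.value.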
20. Memory Barrier
A memory barrier is a type of barrier instruction which causes a central
processing unit (CPU) or compiler to enforce an
ordering constraint on memory operations issued
before and after the barrier instruction. This typically
means that certain operations are guaranteed to be
performed before the barrier, and others after.
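In C++11 a barrier can be requested with std::atomic_thread_fence. The sketch below is one common publish pattern, with hypothetical names: producer is meant to run in one thread and consumer in another, and the release and acquire fences guarantee that a consumer which sees the flag also sees the payload written before it.

```cpp
#include <atomic>

int payload = 0;                    // ordinary, non-atomic data
std::atomic<bool> ready{false};     // publication flag

void producer() {
    payload = 42;
    // Release fence: the write to payload cannot be reordered after
    // the flag store that follows.
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
}

// Returns the payload if it has been published, -1 otherwise.
int consumer() {
    if (!ready.load(std::memory_order_relaxed))
        return -1;
    // Acquire fence: the read of payload cannot be reordered before
    // the flag load above.
    std::atomic_thread_fence(std::memory_order_acquire);
    return payload;
}
```

Without the two fences (or equivalent release/acquire ordering on the flag itself), the compiler or CPU would be free to reorder the accesses and the consumer could read a stale payload.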