2. Time Constraints
• Very advanced talk – limited time
– Must assume prior knowledge and skim a lot!
• Assume:
– Familiarity with reference counting smart-pointers
– Basic understanding of threading concepts.
• No need for C++11 compiler
– Boost provides all missing functionality (v1.48 and up)
3. Time Constraints
• No time to go deep into:
– C++11
• std::shared_ptr<>
• Lock-based facilities – e.g. std::mutex
• New memory model
• Move semantics – r-values
– Concurrent programming
• Lock-based programming
– Pitfalls: dead-locks, data-races, resource release problems
• Lock-free programming
– Pitfalls: hard, tricky, subtle, platform-specific hardware support (e.g.
CAS), ABA etc.
4. A Problem I Had…
• A highly concurrent server system
• Must update a shared vector of in-memory
objects while these objects are being used by
other threads.
• Cannot afford locking the whole vector
• Problem: How to update this shared vector
without:
– Delaying other threads
– Causing multithreaded havoc such as data-races,
deadlocks, memory leaks or premature release.
5. The Ecosystem
• Concurrent hardware becoming pervasive:
– Hyper-Threading
– Multiple cores
– [GPGPUs, cloud…]
• True parallelism – not just time-sharing
• Demand for high performance and fast response
is growing – think web-servers
• One way to maximize the power of parallelism is:
Lock-Free Data-Structures!
6. Lock-free Programming
WARNING:
Lock-free and wait-free data-structures
and algorithms are
HARD to write!
• Some wise advice: “Use Locks!”
Tony Van Eerd, BoostCon 2011 presentation: “The Basics of Lock-free Programming”
• Consider lock-free algorithms as a last resort.
7. Lock-Free Programming
• Non-blocking algorithms
• Guaranteed system-wide progress at all times.
• No locks used
– No dead-locks
– Reduced contention = improved hardware utilization
• Generally, use atomic hardware read-modify-write primitives
– These are like single HW instruction “critical sections”
• A lot of research in recent years = many new lock-free
data-structures. However, implementations are:
– Often designed for particular use-cases
– Not always portable (platform-dep. atomic support)
– Not always well-tested, more proof-of-concept
– Not always suitably licensed for commercial use.
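The "single HW instruction critical section" idea can be sketched with std::atomic and a compare-and-swap retry loop. This is only an illustration: atomic_multiply is a hypothetical helper, not part of the talk.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: atomically multiplies `value` by `factor`.
// There is no hardware "fetch_multiply", so a CAS retry loop is used.
int atomic_multiply(std::atomic<int>& value, int factor)
{
    int expected = value.load();
    // On failure, compare_exchange_weak stores the current value into
    // `expected`, so the next attempt recomputes from fresh data.
    while (!value.compare_exchange_weak(expected, expected * factor))
    {
        // retry
    }
    return expected * factor; // the value this call installed
}
```

No thread ever holds a lock here; a thread that loses the race simply retries with the value the winner installed.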
8. FRECKLE*
• A general scheme for providing portable lock-free access
to, and modification of, standard STL containers.
• Can be easily adapted to non-STL containers.
• Leverages new standard atomic reference-counting
guarantees to provide automatic garbage collection.
• Uses only standard C++11 components.
But:
• It is not a replacement for more specialized lock-free
containers, but in many cases may suffice.
• There is no free lunch, and FRECKLE entails various
trade-offs and certain usage conditions that will be detailed.
* Poor-man’s ‘LoCKFREE’ acronym, a distant cousin of the Pimpl.
9. The FRECKLE Setup
• Provide shared access to a container of shared
objects.
• Useful for
– Frequent map lookups with infrequent map updates.
– Optimizing greedy vector-traversal algorithms
• Use only a single object that is shared between
the multiple threads.
• This shared object is a std::shared_ptr to an STL
container of shared objects.
10. What does “shared” mean?
Two meanings:
1. A shared variable:
A variable that can be seen concurrently by multiple
threads. This ‘shared’ applies to the instance of the
shared_ptr itself.
2. In the shared_ptr type name:
Refers to the fact that multiple objects may be
“pointing” to the same data (from one or more
threads) – this is the pointer abstraction.
This refers to the data pointed to by the shared_ptr
instance.
11. The FRECKLE Setup
• Use a single shared shared_ptr to an STL container of
shared objects.
• The container is a container of shared_ptrs to the desired
data type.
• The data-type can be:
– Arbitrarily large
– Does not even need to be CopyConstructible, CopyAssignable, or
LessThanComparable, so it may be a type that is not generally
allowed for direct use within standard containers. It may also be
an incomplete type.
– A polymorphic type like an abstract interface
using namespace std;
shared_ptr< vector< shared_ptr< MyAbstractInterface const>>> sharedObjects;
shared_ptr< vector< shared_ptr< atomic<int>>>> atomicCounters;
shared_ptr< map < size_t, shared_ptr< int const>>> sharedMap;
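A minimal sketch of such a container, assuming a hypothetical non-copyable abstract interface Shape with two implementations. It also shows that copying the container copies only the shared_ptr elements, not the objects they point to.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical abstract interface: neither CopyConstructible nor CopyAssignable.
struct Shape
{
    Shape() {}
    Shape(Shape const&) = delete;
    Shape& operator=(Shape const&) = delete;
    virtual ~Shape() {}
    virtual std::string name() const = 0;
};
struct Circle : Shape { std::string name() const override { return "circle"; } };
struct Square : Shape { std::string name() const override { return "square"; } };

typedef std::vector<std::shared_ptr<Shape const>> Shapes;

// Builds the container-of-shared-objects behind a shared_ptr.
std::shared_ptr<Shapes> make_shapes()
{
    std::shared_ptr<Shapes> shapes = std::make_shared<Shapes>();
    shapes->push_back(std::make_shared<Circle>());
    shapes->push_back(std::make_shared<Square>());
    return shapes;
}

// Copying the container is shallow: the copy shares every element.
std::shared_ptr<Shapes> shallow_copy(std::shared_ptr<Shapes> const& source)
{
    return std::make_shared<Shapes>(*source);
}
```

After shallow_copy, corresponding elements in both containers compare equal and each element's use_count() is 2.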
12. When/Where Can FRECKLE Help?
1. Multiple concurrent readers that use only const access to the container.
– Readers do not modify the shared shared_ptr to the container;
– Readers do not modify the container itself;
– Readers should not, generally, modify the container data, unless this data is
safe for (read: designed for) concurrent modification (e.g. using std::atomic<>).
2. One or more writers such that:
– Writer(s) may modify the shared shared_ptr to the container;
– Writer(s) may modify the container itself (e.g. insert, remove, sort, resize);
– Writer(s) should not, generally, modify the container data, unless this data is
safe for concurrent modification (e.g. std::atomic<>) or by using the approach
mentioned below.
• Intuitively, FRECKLE provides shared access to a container of shared
objects via “virtual snapshots”.
• These “virtual snapshots” do not, in fact, incur any runtime
access overhead.
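The "virtual snapshot" behaviour can be sketched in a single thread, using the standard atomic_load and atomic_store free functions for shared_ptr. The names shared_container, take_snapshot and publish_appended are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

typedef std::vector<std::shared_ptr<int const>> Container;

// The single shared shared_ptr (hypothetical name).
std::shared_ptr<Container> shared_container = std::make_shared<Container>();

// Reader side: grab a const snapshot of the current container version.
std::shared_ptr<Container const> take_snapshot()
{
    return std::atomic_load(&shared_container);
}

// Writer side: copy the current version, append, publish the new version.
void publish_appended(int value)
{
    std::shared_ptr<Container> next =
        std::make_shared<Container>(*std::atomic_load(&shared_container));
    next->push_back(std::make_shared<int>(value));
    std::atomic_store(&shared_container, std::move(next));
}
```

A snapshot taken before publish_appended keeps seeing the old contents, and the old container is reclaimed automatically once the last snapshot referring to it is released.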
13. C++11 Facilities – shared_ptr
• std::shared_ptr<>
– A replacement for raw pointers for dynamic data
management
– Reference counted smart-pointer
– Automatic deleting of data upon ref-count == 0 in dtor
(RAII)
– Use std::make_shared() instead of new.
– Supports:
• Deleters, weak_ptr, use in containers …
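A short sketch of the reference-counting lifecycle, assuming a hypothetical Tracker type that counts its live instances.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that counts how many instances are alive.
struct Tracker
{
    static int alive;
    Tracker() { ++alive; }
    ~Tracker() { --alive; }
};
int Tracker::alive = 0;

void shared_ptr_lifecycle()
{
    std::shared_ptr<Tracker> p = std::make_shared<Tracker>(); // instead of new
    std::weak_ptr<Tracker> observer = p;  // does not contribute to the ref-count
    std::shared_ptr<Tracker> q = p;       // ref-count is now 2
    assert(p.use_count() == 2 && Tracker::alive == 1);

    p.reset();                            // one owner left, object survives
    assert(Tracker::alive == 1 && !observer.expired());

    q.reset();                            // ref-count reaches 0, object deleted
    assert(Tracker::alive == 0 && observer.expired());
}
```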
14. const Access to shared_ptr is Atomic
• From the Boost shared_ptr documentation about thread-safety:
– shared_ptr objects offer the same level of thread safety as built-in types.
– A shared_ptr instance can be "read" (accessed using only const operations) simultaneously by multiple threads.
– Different shared_ptr instances can be "written to" (accessed using mutable operations such as operator= or reset)
simultaneously by multiple threads (even when these instances are copies, and share the same reference count
underneath.)
– Any other simultaneous accesses result in undefined behavior.
• shared_ptr<int> p(make_shared<int>(42)); // the shared shared_ptr
//--- Example 1 ---
// thread A
shared_ptr<int> p2(p); // reads p
// thread B
shared_ptr<int> p3(p); // OK, multiple reads are safe
• p is in shared memory and visible to both threads A and B.
• Objects p2 and p3 are local variables and are not (necessarily) shared or visible to the other thread.
• In legalese, section §20.7.2.2 of the C++11 ISO standard states:
– For purposes of determining the presence of a data race, member functions shall access and modify only the
shared_ptr [...] objects themselves and not objects they refer to. Changes in use_count() do not reflect modifications
that can introduce data races.
• TAKE HOME MESSAGE: as long as no thread concurrently modifies p itself
(changes to use_count() do not count as modifications), the single reference count
shared by p, p2 and p3 is always adjusted properly, i.e. atomically, with no data-races.
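Example 1 can be extended into a small threaded sketch: several threads repeatedly copy the same shared_ptr through const access only. The helper copy_concurrently is hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical helper: `readers` threads each copy `p` `iterations` times
// and add up the values they read through their local copies.
long copy_concurrently(std::shared_ptr<int> const& p, int readers, int iterations)
{
    std::vector<long> sums(readers, 0);
    std::vector<std::thread> threads;
    for (int r = 0; r < readers; ++r)
        threads.emplace_back([&p, &sums, r, iterations] {
            for (int i = 0; i < iterations; ++i)
            {
                std::shared_ptr<int> local(p); // const read of p; count adjusted atomically
                sums[r] += *local;             // each thread writes only its own slot
            }
        });
    for (auto& t : threads) t.join();

    long total = 0;
    for (long s : sums) total += s;
    return total;
}
```

Once all threads have joined, every local copy is gone and p.use_count() is back to its original value.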
15. Atomic Access to a shared_ptr
• To modify a shared_ptr instance that other threads may be accessing
(item 4 in the list above, otherwise undefined behavior), we must use special
atomic functions provided by the standard. Section §20.7.2.5 of the standard
states:
– Concurrent access to a shared_ptr object from multiple threads does not introduce
a data race if the access is done exclusively via the functions in this section and the
instance is passed as their first argument.
• Of particular interest in this case are:
– atomic_store(), atomic_compare_exchange_strong() / atomic_compare_exchange_weak(), atomic_load()
• atomic_store allows the atomic update of a shared shared_ptr object
to share the same ref-count and ref-object as another (e.g. thread-local)
shared_ptr instance.
• atomic_compare_exchange (in the standard: atomic_compare_exchange_strong
and atomic_compare_exchange_weak) is a more generalized form of
atomic_store.
• atomic_load allows the atomic creation of a (e.g. thread-local)
shared_ptr instance which shares the same ref-count and ref-object of
another shared shared_ptr instance even in the presence of possible
concurrent modification of this shared shared_ptr instance.
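A minimal single-threaded sketch of atomic_store and atomic_load on a shared shared_ptr. The names shared_value and replace_and_read are hypothetical, and the sketch assumes a C++11 to C++17 compiler (these free functions are deprecated from C++20 in favour of std::atomic<std::shared_ptr<T>>).

```cpp
#include <cassert>
#include <memory>

// The shared shared_ptr (hypothetical name).
std::shared_ptr<int> shared_value = std::make_shared<int>(1);

// Hypothetical helper: publishes a new value, then reads it back atomically.
// Returns the value read, or -1 if the loaded pointer is not the stored one.
int replace_and_read(int new_value)
{
    std::shared_ptr<int> local = std::make_shared<int>(new_value);

    // shared_value now shares local's ref-count and ref-object.
    std::atomic_store(&shared_value, local);

    // copy shares the same ref-count and ref-object as shared_value.
    std::shared_ptr<int> copy = std::atomic_load(&shared_value);

    return copy.get() == local.get() ? *copy : -1;
}
```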
16. FRECKLE – Single Writer
class MyDataClass;
typedef vector<shared_ptr<MyDataClass const>> Container;
shared_ptr<Container> freckle;
// The reader thread(s)
void reader()
{ shared_ptr<Container const> p = atomic_load( &freckle );
// use *p;
}
// The single writer case
void writer()
{ // copy container
shared_ptr<Container> p(make_shared<Container>(*freckle));
// update *p reflecting new information;
atomic_store( &freckle, move( p ) );
}
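The single-writer scheme above can be exercised with a small driver. This is a sketch under assumptions: the data type is int, the writer appends 0, 1, 2, ... and the helper names are hypothetical. The writer here also reads freckle through atomic_load, a conservative choice that keeps every access to the shared pointer inside the atomic functions.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>
#include <utility>
#include <vector>

typedef std::vector<std::shared_ptr<int const>> Container;
std::shared_ptr<Container> freckle = std::make_shared<Container>();

// Reader: a snapshot must hold exactly 0, 1, ..., size-1.
bool snapshot_is_consistent()
{
    std::shared_ptr<Container const> p = std::atomic_load(&freckle);
    for (std::size_t i = 0; i < p->size(); ++i)
        if (*(*p)[i] != static_cast<int>(i)) return false;
    return true;
}

// Single writer: copy the container, append, publish.
void append_next()
{
    std::shared_ptr<Container> p =
        std::make_shared<Container>(*std::atomic_load(&freckle));
    p->push_back(std::make_shared<int>(static_cast<int>(p->size())));
    std::atomic_store(&freckle, std::move(p));
}

// Hypothetical driver: readers poll snapshots while one writer updates.
bool run_single_writer(int updates, int readers)
{
    std::atomic<bool> done(false);
    std::atomic<bool> ok(true);
    std::vector<std::thread> threads;
    for (int r = 0; r < readers; ++r)
        threads.emplace_back([&] {
            while (!done.load())
                if (!snapshot_is_consistent()) ok.store(false);
        });
    for (int i = 0; i < updates; ++i) append_next();
    done.store(true);
    for (auto& t : threads) t.join();
    return ok.load();
}
```

Readers never observe a half-updated container, because each update becomes visible only through the final atomic_store.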
17. FRECKLE – Multiple Writers
class MyDataClass;
typedef vector<shared_ptr<MyDataClass const>> Container;
shared_ptr<Container> freckle;
// The reader thread(s)
void reader()
{ shared_ptr<Container const> p = atomic_load( &freckle );
// use *p;
}
// The multiple writers case
void writer()
{ shared_ptr<Container> p = atomic_load( &freckle );
shared_ptr<Container> q;
do
{ // copy container
q = make_shared<Container>( *p );
// update *q reflecting new information;
}
while( !atomic_compare_exchange_strong( &freckle, &p, move( q ) ) );
}
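A runnable sketch of the multiple-writers loop, using the standard name atomic_compare_exchange_strong. The int data type and the helpers append and run_writers are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>
#include <utility>
#include <vector>

typedef std::vector<std::shared_ptr<int const>> Container;
std::shared_ptr<Container> freckle = std::make_shared<Container>();

// One update: copy the current version, modify the copy, try to publish it.
void append(int value)
{
    std::shared_ptr<Container> p = std::atomic_load(&freckle);
    std::shared_ptr<Container> q;
    do
    {
        q = std::make_shared<Container>(*p);           // copy container
        q->push_back(std::make_shared<int>(value));    // update the private copy
    }
    // On failure p is refreshed to the latest version and the loop redoes the copy.
    while (!std::atomic_compare_exchange_strong(&freckle, &p, std::move(q)));
}

// Hypothetical driver: each writer appends `per_writer` distinct values.
std::size_t run_writers(int writers, int per_writer)
{
    std::vector<std::thread> threads;
    for (int w = 0; w < writers; ++w)
        threads.emplace_back([w, per_writer] {
            for (int i = 0; i < per_writer; ++i)
                append(w * per_writer + i);
        });
    for (auto& t : threads) t.join();
    return std::atomic_load(&freckle)->size();
}
```

No update is lost: a writer whose copy has become stale fails the compare-exchange and rebuilds its copy from the newest version.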
18. atomic_compare_exchange()
• The function atomic_compare_exchange_strong() does all of the
following atomically:
– If p and freckle are equivalent – i.e. pointing to the same data,
then:
• freckle is updated to point at q’s data - the newly created copy.
Just like with the single writer case.
– Else:
• p is updated to point at the updated data pointed to by freckle.
This update is equivalent to an internal atomic_load like on the first line.
// The multiple writers case
void writer()
{ shared_ptr<Container> p = atomic_load( &freckle );
shared_ptr<Container> q;
do
{ // copy container
q = make_shared<Container>( *p );
// update *q reflecting new information;
}
while( !atomic_compare_exchange_strong( &freckle, &p, move( q ) ) );
}
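Both outcomes of the compare-exchange can be observed in a single thread. A minimal sketch, with hypothetical names target and demonstrate_cas:

```cpp
#include <cassert>
#include <memory>

std::shared_ptr<int> target = std::make_shared<int>(1);

void demonstrate_cas()
{
    std::shared_ptr<int> expected = std::atomic_load(&target);

    // Success: expected and target are equivalent, so target is replaced.
    bool first = std::atomic_compare_exchange_strong(
        &target, &expected, std::make_shared<int>(2));
    assert(first && *target == 2);
    assert(*expected == 1);   // expected is untouched and now stale

    // Failure: expected is stale, so target is left alone and
    // expected is refreshed to the current version instead.
    bool second = std::atomic_compare_exchange_strong(
        &target, &expected, std::make_shared<int>(3));
    assert(!second && *target == 2);
    assert(*expected == 2);
}
```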
19. Exception Safety
Provides the Strong Exception Safety Guarantee:
The operation has either completed successfully or thrown an
exception, leaving the program state exactly as it was before
the operation started.
All operations are in one of the following categories:
1. const access to shared data - no change in the
program state.
2. Modifications only of stack-local non-shared
variables and data - no change to the rest of the
program state.
3. Only atomic non-throwing operations actually alter
shared data.
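The three categories can be seen in a sketch where the update step throws. It assumes a plain vector<int> container for brevity and a hypothetical update helper that takes the modification as a callable.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

typedef std::vector<int> Container;
std::shared_ptr<Container> freckle = std::make_shared<Container>(3, 0);

// Hypothetical writer: applies `modify` to a private copy, then publishes it.
void update(std::function<void(Container&)> const& modify)
{
    std::shared_ptr<Container> p = std::atomic_load(&freckle); // 1. const access
    std::shared_ptr<Container> q;
    do
    {
        q = std::make_shared<Container>(*p); // 2. may throw; only local state exists
        modify(*q);                          // 2. may throw; only the private copy changes
    }
    // 3. the only step that alters shared data, and it does not throw
    while (!std::atomic_compare_exchange_strong(&freckle, &p, std::move(q)));
}
```

If modify throws, the exception propagates to the caller and freckle still refers to the untouched previous version.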
20. Additional Notes
• Why (*almost)?
• Move semantics
• Primitive Data Types
• Further reading:
– C++ Concurrency in Action
Anthony Williams
21. ∵ Questions ∴
Adi Shavit ∵ adishavit@gmail.com ∴ 050-7637223