9. Threads: Java Monitors
This figure shows the monitor as three rectangles. In the center, a large rectangle contains a single thread, the monitor's owner. On the left, a small rectangle contains the entry set. On the right, another small rectangle contains the wait set. Active threads are shown as dark gray circles. Suspended threads are shown as light gray circles.
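40. Latch: CountDownLatch
Figure labels: wait for the start signal; wait for the done signal; continue.
Two usage patterns: (1) a start signal plus a done signal; (2) an N-part countdown, where the work is split into N parts and each part decrements the latch count. When threads must count down repeatedly in this way, use a CyclicBarrier instead, because the count of a CountDownLatch cannot be reset.
Typical applications: manually controlled transactions; reading several data sets from a database during initialization.
Sequence shown: threads A, B, and C each wait for the start signal; once it is given, A, B, and C run, and each decrements the latch count.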
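The start-signal plus done-signal pattern can be sketched roughly as follows. This is a minimal illustration, not code from the slides: the class name LatchDemo, the method runWorkers, and the counter standing in for real work are all hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class LatchDemo {
    /** Starts n workers, releases them together, and returns how many did their work. */
    static int runWorkers(int n) {
        CountDownLatch startSignal = new CountDownLatch(1); // opened once by the driver
        CountDownLatch doneSignal = new CountDownLatch(n);  // counted down once per worker
        AtomicInteger finished = new AtomicInteger();       // hypothetical stand-in for real work

        for (int i = 0; i < n; i++) {
            new Thread(() -> {
                try {
                    startSignal.await();        // wait for the start signal
                    finished.incrementAndGet(); // the "work"
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    doneSignal.countDown();     // decrement the latch count
                }
            }).start();
        }

        startSignal.countDown();                // let all workers proceed
        try {
            doneSignal.await();                 // wait until every worker has counted down
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return finished.get();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(3));
    }
}
```

The driver thread plays both roles from the figure: it gives the start signal with a latch of count 1 and waits for completion on a latch of count n.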
41. Barrier: CyclicBarrier
[Figure: threads A, B, and C arriving at successive barriers.]
A barrier is a coordination mechanism (an algorithm) that forces the processes participating in a concurrent (or distributed) algorithm to wait until each of them has reached a certain point in its program. The collection of these coordination points is called the barrier. Once all the processes have reached the barrier, they are all permitted to continue past it.
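A small sketch of the barrier behaviour described above, assuming a fixed number of threads that meet at the same barrier several times. The names BarrierDemo and run are hypothetical; the barrier action simply counts how often the barrier trips.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

class BarrierDemo {
    /** Runs `parties` threads through `rounds` barrier points; returns how often the barrier tripped. */
    static int run(int parties, int rounds) {
        AtomicInteger trips = new AtomicInteger();
        // The barrier action runs once each time all parties have arrived.
        CyclicBarrier barrier = new CyclicBarrier(parties, trips::incrementAndGet);

        Thread[] threads = new Thread[parties];
        for (int i = 0; i < parties; i++) {
            threads[i] = new Thread(() -> {
                try {
                    for (int r = 0; r < rounds; r++) {
                        // ... per-round work would go here ...
                        barrier.await(); // block until every party has reached this point
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } catch (BrokenBarrierException e) {
                    throw new IllegalStateException(e);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return trips.get();
    }

    public static void main(String[] args) {
        System.out.println(run(3, 2));
    }
}
```

Unlike a CountDownLatch, the same CyclicBarrier instance is reused for every round, which is why it suits repeated rendezvous.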
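66. Lock-free queue algorithms: the MS-Queue algorithm
Paper: http://www.research.ibm.com/people/m/michael/podc-1996.pdf
The MS-queue algorithm was proposed in 1996 by Maged M. Michael and M. L. Scott. It is the classic algorithm for concurrent FIFO queues, and much current research on concurrent FIFO queues builds on it.
The queue is implemented as a singly linked list with two basic operations, enqueue() and dequeue(). New nodes are always appended after the last element, and elements are always removed from the head. The queue keeps two pointers, head and tail. head always points to the first node of the list, which serves as a dummy (sentinel) node whose stored value is meaningless. tail always points to some node in the list, not necessarily the last one. Each node has two fields: the stored value and a pointer to the next node. Each pointer object holds, besides the pointer to a node, a timestamp that starts at zero and is incremented every time the pointer is modified. On 64-bit systems, overflow of the timestamp need not be considered.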
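The structure described above (dummy head node, possibly lagging tail, CAS on the next pointer) can be sketched in Java as follows. This is a simplified illustration under one stated assumption: the timestamp on each pointer is omitted, because in a garbage-collected language a node cannot be recycled while another thread still holds a reference to it. The class name MSQueueSketch is hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

class MSQueueSketch<T> {
    private static final class Node<T> {
        final T value;
        final AtomicReference<Node<T>> next = new AtomicReference<>(null);
        Node(T value) { this.value = value; }
    }

    private final AtomicReference<Node<T>> head;
    private final AtomicReference<Node<T>> tail;

    MSQueueSketch() {
        Node<T> dummy = new Node<>(null); // sentinel; its value is meaningless
        head = new AtomicReference<>(dummy);
        tail = new AtomicReference<>(dummy);
    }

    void enqueue(T value) {
        Node<T> node = new Node<>(value);
        while (true) {
            Node<T> last = tail.get();
            Node<T> next = last.next.get();
            if (last != tail.get()) continue;       // tail moved while we were reading; retry
            if (next == null) {
                // tail really is the last node: try to link the new node after it
                if (last.next.compareAndSet(null, node)) {
                    tail.compareAndSet(last, node); // swing tail; failure means another thread helped
                    return;
                }
            } else {
                tail.compareAndSet(last, next);     // tail is lagging: help advance it
            }
        }
    }

    /** Returns the oldest element, or null if the queue is empty. */
    T dequeue() {
        while (true) {
            Node<T> first = head.get();
            Node<T> last = tail.get();
            Node<T> next = first.next.get();
            if (first != head.get()) continue;      // head moved while we were reading; retry
            if (first == last) {
                if (next == null) return null;      // only the dummy node is left: empty
                tail.compareAndSet(last, next);     // tail is lagging: help advance it
            } else {
                T value = next.value;               // read before the CAS, as in the paper
                if (head.compareAndSet(first, next)) return value; // next becomes the new dummy
            }
        }
    }
}
```

Each successful enqueue needs one CAS on next plus a best-effort CAS on tail; each successful dequeue needs one CAS on head.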
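67. Lock-free queue algorithms: the Optimistic algorithm
The Optimistic algorithm improves on the MS-queue algorithm above by replacing expensive CAS instructions with ordinary store instructions. Its efficiency comes from representing the queue as a doubly linked list, so that enqueue and dequeue each need only one successful CAS operation. The algorithm guarantees that the list is always connected and that the next pointers are always consistent; when the prev pointers become inconsistent, calling the fixList method restores a consistent state.
Like the MS-queue algorithm, the Optimistic algorithm uses the atomic instruction compare-and-swap (CAS). CAS(a, p, n) atomically compares the value at memory address a with p; if the two are equal, it writes n to address a and returns true, otherwise it returns false. Because the Optimistic algorithm uses CAS, the classic ABA problem arises here as well. The solution is the same as for MS-queue: a tagging mechanism.
Paper: http://nedko.arnaudov.name/soft/L17_Fober.pdf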
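68. Atomic implementation
public final int incrementAndGet() {
    for (;;) {
        int current = get();                  // read the current value
        int next = current + 1;               // compute the new value
        if (compareAndSet(current, next))     // retry if another thread changed it
            return next;
    }
}

public final boolean compareAndSet(int expect, int update) {
    return unsafe.compareAndSwapInt(this, valueOffset, expect, update);
}
Once the code reaches import sun.misc.Unsafe; the source can no longer be viewed, for various reasons (for example, patents).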
Is all memory shared? No, only globals and the heap. In Java, classes are global, and static variables declared inside classes are global too. Everything else is local: all variables declared inside a method are locals, so they are not shared. Heap memory is shared, but another thread can reach a heap object only through a reference held in a global or a local; an object referenced only from one thread's locals is not visible to other threads.
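// Static synchronized method: acquires the lock of the Class object.
public synchronized static int getAge(int i) {
    return i;
}

// Instance synchronized method: acquires the lock of this.
public synchronized String name() {
    return "longhao";
}

public void modifyHeight() {
    synchronized (this) {
        // do something (holding the lock of this)
    }
    // do something (lock already released)
}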
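Which monitor each of the three forms acquires can be checked with Thread.holdsLock. The class LockTargets and its method names are hypothetical, chosen only for this sketch.

```java
class LockTargets {
    /** A static synchronized method holds the monitor of the Class object. */
    static synchronized boolean staticHoldsClassLock() {
        return Thread.holdsLock(LockTargets.class);
    }

    /** An instance synchronized method holds the monitor of this. */
    synchronized boolean instanceHoldsThisLock() {
        return Thread.holdsLock(this);
    }

    /** An instance synchronized method does not, by itself, hold the Class monitor. */
    synchronized boolean instanceHoldsClassLock() {
        return Thread.holdsLock(LockTargets.class);
    }

    /** A synchronized (this) block holds the monitor of this only inside the block. */
    boolean[] blockHoldsThisLock() {
        boolean inside;
        synchronized (this) {
            inside = Thread.holdsLock(this);
        }
        boolean after = Thread.holdsLock(this);
        return new boolean[] { inside, after };
    }
}
```

Because static and instance synchronized methods use different monitors, they do not exclude each other; only methods locking the same object do.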
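Most current multiprocessor systems support atomic instructions. The typical pattern: first read the value A from V, derive the new value B from A, then use CAS to atomically change V from A to B, which succeeds only if no other thread has modified V in the meantime. CAS detects interference from other threads, so the atomic read-modify-write problem can be solved without locks.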
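The read A, derive B, CAS(V, A, B) pattern, sketched with AtomicInteger. The operation chosen here (an atomic maximum) and the names CasLoop and updateMax are hypothetical examples; any pure function of A would do.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasLoop {
    /** Atomically replaces v with max(v, candidate) and returns the resulting value. */
    static int updateMax(AtomicInteger v, int candidate) {
        while (true) {
            int a = v.get();                // read A from V
            int b = Math.max(a, candidate); // derive the new value B from A
            if (v.compareAndSet(a, b)) {    // succeeds only if V still holds A
                return b;
            }
            // another thread changed V between the read and the CAS: retry with the fresh value
        }
    }

    public static void main(String[] args) {
        AtomicInteger v = new AtomicInteger(5);
        System.out.println(updateMax(v, 9));
    }
}
```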
To solve the ABA problem, the JDK provides AtomicStampedReference (and its sibling AtomicMarkableReference), which implements atomic conditional updates on a "versioned" reference: an update changes the reference and the version number together. Reference code: incrementAndGet in the Atomic classes; AtomicReference:

private static final Unsafe unsafe = Unsafe.getUnsafe();
private static final long valueOffset;

static {
    try {
        valueOffset = unsafe.objectFieldOffset
            (AtomicReference.class.getDeclaredField("value"));
    } catch (Exception ex) { throw new Error(ex); }
}

private volatile V value;

/**
 * Creates a new AtomicReference with the given initial value.
 *
 * @param initialValue the initial value
 */
public AtomicReference(V initialValue) {
    value = initialValue;
}

/**
 * Creates a new AtomicReference with null initial value.
 */
public AtomicReference() {
}

/**
 * Gets the current value.
 *
 * @return the current value
 */
public final V get() {
    return value;
}

/**
 * Sets to the given value.
 *
 * @param newValue the new value
 */
public final void set(V newValue) {
    value = newValue;
}

/**
 * Eventually sets to the given value.
 *
 * @param newValue the new value
 * @since 1.6
 */
public final void lazySet(V newValue) {
    unsafe.putOrderedObject(this, valueOffset, newValue);
}

/**
 * Atomically sets the value to the given updated value
 * if the current value {@code ==} the expected value.
 * @param expect the expected value
 * @param update the new value
 * @return true if successful. False return indicates that
 * the actual value was not equal to the expected value.
 */
public final boolean compareAndSet(V expect, V update) {
    return unsafe.compareAndSwapObject(this, valueOffset, expect, update);
}

/**
 * Atomically sets the value to the given updated value
 * if the current value {@code ==} the expected value.
 *
 * <p>May <a href="package-summary.html#Spurious">fail spuriously</a>
 * and does not provide ordering guarantees, so is only rarely an
 * appropriate alternative to {@code compareAndSet}.
 *
 * @param expect the expected value
 * @param update the new value
 * @return true if successful.
 */
public final boolean weakCompareAndSet(V expect, V update) {
    return unsafe.compareAndSwapObject(this, valueOffset, expect, update);
}

/**
 * Atomically sets to the given value and returns the old value.
 *
 * @param newValue the new value
 * @return the previous value
 */
public final V getAndSet(V newValue) {
    while (true) {
        V x = get();
        if (compareAndSet(x, newValue))
            return x;
    }
}
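A single-threaded sketch of how the stamp exposes an A -> B -> A change that a reference-only comparison would miss. The interleaving is simulated in one thread for determinism; the class name StampedAbaDemo and the string values are hypothetical.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

class StampedAbaDemo {
    /** Returns { reference-only check passes, stamped CAS succeeds }. */
    static boolean[] run() {
        String a = "A";
        String b = "B";
        AtomicStampedReference<String> ref = new AtomicStampedReference<>(a, 0);

        // "Thread 1" reads the reference together with its stamp.
        int[] stampHolder = new int[1];
        String seen = ref.get(stampHolder);
        int seenStamp = stampHolder[0];

        // "Thread 2" changes A -> B -> A, bumping the stamp on every update.
        ref.compareAndSet(a, b, 0, 1);
        ref.compareAndSet(b, a, 1, 2);

        // Comparing the reference alone cannot tell that anything happened.
        boolean referenceLooksUnchanged = ref.getReference() == seen;

        // The stamped CAS fails, because the stamp is now 2 rather than 0.
        boolean stampedCasSucceeded = ref.compareAndSet(seen, "C", seenStamp, seenStamp + 1);

        return new boolean[] { referenceLooksUnchanged, stampedCasSucceeded };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1]);
    }
}
```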
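Common data structures implemented so far: stacks, queues, hash tables. Reference code: ConcurrentLinkedQueue, an unbounded thread-safe queue based on linked nodes. This queue orders elements FIFO (first in, first out). The head of the queue is the element that has been in the queue the longest; the tail is the element that has been in the queue the shortest time. New elements are inserted at the tail, and retrieval operations obtain elements from the head. ConcurrentLinkedQueue is an appropriate choice when many threads share access to a common collection. This queue does not permit null elements.

public boolean offer(E e) {
    if (e == null) throw new NullPointerException();
    Node<E> n = new Node<E>(e, null);
    for (;;) {
        Node<E> t = tail;
        Node<E> s = t.getNext();
        if (t == tail) {                  // tail unchanged since it was read
            if (s == null) {              // t is the last node
                if (t.casNext(s, n)) {    // link the new node after it
                    casTail(t, n);        // swing tail; failure is harmless
                    return true;
                }
            } else {
                casTail(t, s);            // tail is lagging: help advance it
            }
        }
    }
}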
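A brief usage sketch for ConcurrentLinkedQueue with several producer threads and no external locking. The class name ClqDemo and the method produceAndDrain are hypothetical.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

class ClqDemo {
    /** Lets `producers` threads offer `perProducer` elements each, then counts what poll() returns. */
    static int produceAndDrain(int producers, int perProducer) {
        ConcurrentLinkedQueue<Integer> queue = new ConcurrentLinkedQueue<>();

        Thread[] threads = new Thread[producers];
        for (int p = 0; p < producers; p++) {
            threads[p] = new Thread(() -> {
                for (int i = 0; i < perProducer; i++) {
                    queue.offer(i);          // inserts at the tail; never blocks
                }
            });
            threads[p].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        int drained = 0;
        while (queue.poll() != null) {       // poll() removes from the head, null when empty
            drained++;
        }
        return drained;
    }

    public static void main(String[] args) {
        System.out.println(produceAndDrain(4, 1000));
    }
}
```

Because null is the "empty" result of poll(), the queue cannot accept null elements, which matches the restriction stated above.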
Concurrent Building Blocks
1. Data structures: a set of lock-free collection classes. Because these data structures were developed using lock-free algorithms, they enjoy basic lock-free properties such as immunity from various kinds of deadlock and immunity from priority inversion.
2. Patterns and scheduling algorithms: most application parallelization efforts follow one or more well-known parallel computation patterns. We provide a set of patterns that developers can use directly to build parallel applications. The patterns we propose to provide include (but are not limited to) Master-Worker, Map-Reduce, Divide and Conquer, and Pipeline. We also plan to provide a set of schedulers, which can be used together with the pattern classes.
3. Parallel implementations of general-purpose functions. Examples include, but are not limited to:
   1. String, sequence, and array functions: Sort, Search, Merge, Rank, Compare, Reverse, Shuffle, Rotate, Median, etc.
   2. Tree and graph functions: Connected Components, Spanning Trees, Shortest Path, Graph Coloring, etc.
4. Atomics, STM, etc.
   1. Deliver a C++ implementation of atomics, based on the draft C++ standard's definition of the interface for atomics.
   2. Deliver an open, flexible implementation of software transactional memory.
STM (software transactional memory): In computer science, software transactional memory (STM) is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. It is an alternative to lock-based synchronization. A transaction in this context is a piece of code that executes a series of reads and writes to shared memory. These reads and writes logically occur at a single instant in time; intermediate states are not visible to other (successful) transactions.
The idea of providing hardware support for transactions originated in a 1986 paper and patent by Tom Knight [1]. The idea was popularized by Maurice Herlihy and J. Eliot B. Moss [2]. In 1995 Nir Shavit and Dan Touitou extended this idea to software-only transactional memory (STM) [3]. STM has recently been the focus of intense research, and support for practical implementations is growing.
Amino is now under active development, and we have several "golden" components such as deque and queue, although some of our components are not yet the best available. Our goal is to become a standard, highly scalable library for Java/C++ programmers.

Using Amino Java Components
Amino Java components depend on JDK version 6. Some components can easily be used on lower versions. Other components, such as Deque, implement an interface from the JDK 6 standard library and can run only on JDK version 6 and later. Refer to the Java components quick start guide for more details.

LockFreeList: a lock-free linked list. Ref. http://www.research.ibm.com/people/m/michael/spaa-2002.pdf
An unbounded thread-safe linked list. A LockFreeList is an appropriate choice when many threads will share access to a common collection. This list does not permit null elements. It is a lock-free implementation intended for highly scalable, thread-safe add, remove, and contains. Index-based methods are not implemented. Unlike a normal list, add() inserts the element at the head of the list.

LockFreeOrderedList: extends LockFreeList. Ref. http://www.research.ibm.com/people/m/michael/spaa-2002.pdf
An unbounded thread-safe linked list whose elements are ordered. A LockFreeOrderedList is an appropriate choice when many threads will share access to a common collection. This list does not permit null elements. All elements in the list are ordered according to compare(). It is a lock-free implementation intended for highly scalable, thread-safe add, remove, and contains. Index-based methods are not thread-safe. Unlike a normal list, add() inserts the element at the head of the list.

LockFreeSet: Ref. http://www.research.ibm.com/people/m/michael/spaa-2002.pdf
The internal data structure is a singly linked list that uses the same algorithm as LockFreeOrderedList. Elements are sorted by the "binary reversal" of their hash. Additionally, an array of dummy nodes is stored to allow quick access to elements in the middle of the list. Elements are wrapped in a HashLinkNode before being stored in the set.

LockFreeDeque: a double-ended queue. Ref. paper: CAS-Based Lock-Free Algorithm for Shared Deques, by Maged M. Michael.

EBDeque: this deque adds an elimination mechanism to deal with high-contention scenarios; see org.amino.utility.EliminationArray for more information. Leaving elimination backoff aside, this class implements the same algorithm as org.amino.ds.lockfree.LockFreeDeque. Uses EliminationArray.

EBStack: implements IStack; uses EliminationArray. Ref. A Scalable Lock-free Stack Algorithm.

EliminationArray: a global elimination array class for several data structures. It can be used to reduce the number of modifications to the central data structure. The idea comes from the following observation: "If two threads execute a push() and a pop() operation on a stack, there is no need to modify the stack at all. We can simply transfer the object from push() to pop(), and both operations succeed." Ref. A Scalable Lock-free Stack Algorithm.

HakanDeque: Ref. http://www.cs.chalmers.se/~dcs/ConcurrentDataStructures/phd_chap7.pdf

LockFreeBlockQueue: its methods are implemented with CAS and take no locks, unlike LinkedBlockingDeque, which uses a locking strategy.

LockFreeDictionary: Ref. Scalable and Lock-Free Concurrent Dictionaries, by Hakan Sundell and Philippas Tsigas.

LockFreePriorityQueue: Ref. Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems, by Hakan Sundell and Philippas Tsigas. Supports Comparator.

LockFreeVector: Ref. Lock-free Dynamically Resizable Arrays.

ParallelRBTree: an implementation of a relaxed balanced red-black tree data structure. Ref. http://citeseer.ist.psu.edu/hanke97relaxed.html and http://citeseer.ist.psu.edu/400640.html
The tree implemented here is a leaf-oriented binary search tree, which is a full binary tree (each node has either two or no children).

GraphAlg: getStrongComponents (strongly connected components of a directed graph), getConnectedComponents (connected components of an undirected graph), getMST (minimum spanning tree), getShortestPath (shortest path).

ParallelPrefix: Ref. http://ocw.mit.edu/NR/rdonlyres/Mathematics/18-337JSpring-2005/95505ED3-630E-4B20-BB66-2FB14108FD39/0/lec4.pdf

ParallelScanner

MultiCAS: http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-579.pdf
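For the ParallelPrefix entry in the list above, the operation itself (an inclusive prefix sum, also called a scan) can be illustrated as below. Note the assumption: this sketch uses the JDK's own java.util.Arrays.parallelPrefix (Java 8 and later), not Amino's ParallelPrefix class, whose API is not shown in these slides. The class name PrefixSumDemo is hypothetical.

```java
import java.util.Arrays;

class PrefixSumDemo {
    /** Returns a new array where out[i] = input[0] + ... + input[i]. */
    static int[] prefixSums(int[] input) {
        int[] out = Arrays.copyOf(input, input.length); // parallelPrefix works in place, so copy first
        Arrays.parallelPrefix(out, Integer::sum);       // the operator must be associative
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(prefixSums(new int[] { 1, 2, 3, 4 })));
    }
}
```

Associativity of the operator is what allows the prefix computation to be split across threads and recombined.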