Lightning Talk on Software Transactional Memory in Scala. The Actors Pattern has gotten a lot of attention in the Scala ecosphere, but STM of often a good first solution for solving concurrency problems. It's odd that it hasn't had as much attention in the Scala world, so this talk aims to show how easy it is to use, and compare it to typical lock-based synchronization by showing how easy it is to make errors with lock-based synchronization
9. ADVANTAGES: PRIORITY INVERSION
val protectedInt = Ref(0)
atomic{ implicit transaction =>
val currentValue = protectedInt.get
protectedInt.set(currentValue + 1)
}
10. ADVANTAGES: COMPOSABILITY
val protectedInt = Ref(0)
val anotherProtedtedInt = Ref(0)
atomic{ implicit transaction =>
val currentValue = protectedInt.get
protectedInt.set(currentValue + 1)
atomic{ implicit transaction =>
val anotherCurrentValue = anotherProtectedInt.get
val anotherProctedInt.set(anotherCurrentValue + 1)
}
}
11. GOTCHAS: IMMUTABILITY!
// bad!
val badMap= Ref(new java.util.HashMap[String, Int]())
val mutableMap = atomic{ implicit transaction => badMap.get }
mutableMap.put(“Wrong!”, 666)
// good
val goodMap = Ref(Map(“Good” -> 7))
atomic{ implicit transaction =>
val tempMap = goodMap.get
goodMap.set(tempMap + (“Good” -> 777))
}
12. GOTCHAS: REFERENTIAL TRANSPARENCY
//bad
val protectedString = Ref("This is a string")
val time = System.currentTimeMillis
atomic{ implicit transaction =>
if(time % 2 == 0) protectedString.set("Time was even")
else protectedString.set("Time was odd")
}
//good
atomic{ implicit transaction =>
val time = System.currentTimeMillis
if(time % 2 == 0) protectedString.set("Time was even")
else protectedString.set("Time was odd")
}
Hello, I'm Dustin Whitney. I'm one of the organizers of the New York ScalaMeetup, and I'm the CTO of Pellucid Analytics. We're hiring. This talk is about Software Transactional Memory (STM). I feel that in the Scala world STM gets less attention than it deserves, which is odd given it's popularity in other programming languages, and I honestly think it's a great first choice when attacking concurrency issues with Scala. So I felt it would be worthwhile to give a lightning talk about it, especially given how simple it is. In fact 30 minutes is probably too much time. -- let's get started
What is STM? It's an alternative to lock-based synchronization, which has been the standard for doing concurrency in Java since its inception. STM is analagous to database transactions in that when a transaction is started, the ACID properties are adheared to, with the acception of the 'D' in ACID. I'll get back to what the ACID properties and how they are manifested in STM in later slides, but it suffices to say that this is a good thing. Also, STM is simple and easy to reason about. What does it look like in Scala?
Eventually STM will be included in the Scala Standard Library, but for now you must include it as a dependency in your sbt build file.This is the import statement, and after you've imported it, there are really only two things you need to know:1. There is a 'Ref', which protects a piece of shared memory. In this case it is protecting an Integer.2. To access the shared memory protected by the 'Ref', you must be inside of an atomic block, where a transaction is created. When you're inside of the atomic block you can access the shared memory with the get method and modify the shared memory with the set method.
Actions taken within the atomic block are 'atomic', the 'A' from ACID. Being atomic means that the actions inside of the block are 'all or nothing', meaning if any part of an atomic block fails in committing a successful transaction, then none of the values in a Ref will be updated. The example I have here shows two Refs. It's possible during the execution of this block, one or both of the Refs could be updated by another thread before the block is done executing. Should this occur, no change to either Ref will occur, and the atomic block will then retry.Something should be said about retries. How is it determined that a success or failure has occurred? STM uses optimistic concurrency control. What that means is, from a high level, when a block is executed a transaction object is created with a unique number related to the state of the Ref. At the end of the block an attempt to commit is made, and if the number of the transaction does not match the current state of the Ref, the transaction is a failure, and the atomic block is retried. Retries will occur until either a success occurs or too many retries occurs. The 'too many retries' number is configurable.
Picking up from there, atomic blocks are consistent, the 'C' in ACID. Your transaction acts as a way to always retrieve the same state of the world throughout your transaction. In this example, I've removed the implicit keyword from the front of the transaction and passed it into the get and set methods manually to make explicit the fact that it is used to fetch a consistent view of the world. It would not be good if another thread were to update the ref and when fetching it you were fetching those changes, which leads to the next letter in ACID
modifications to the Ref are isolated from the rest of the world until a successful commit is made. also when a successful commit is made those changes will not be visible to transactions that began before the successful commit.
STM is not durable since it is protecting memory, which is inherently non-durable.
To make clear some of the advantages of STM over lock-based synchronization, I thought I'd highlight a few things.STM does not suffer from deadlock or livelock, because there are no locks. To explain what deadlock is, imagine a process that much obtain two locks. One thread has one lock and another thread has the other, and both are waiting for the other to release the lock so they can continue processing. They will continue to wait until the program is terminated. With STM, you simply start a transaction, and go. If something goes wrong, the atomic block will execute again.
Another disadvantage of lock-based synchronization is priority inversion. This happens when lots of low priority threads take ahold of a lock and keep a high priority thread blocking. Again, since there are no locks with STM, high priority threads can gain access to the shared resource at will. It should be noted that with STM there can be issues with competition over committing transactions, but resolving these issues are pretty easy since they are easy to track down and debug.
I think the biggest advantage STM has over lock-based synchronization is composability. Locks simply don't compose. Locks are chosen more or less arbitrarily, and when you wish to have multiple parts of a program work together in a threadsafe manner where locks are involved, you must know what all of the locks are, and what their policies are, and it's really tough to make happen. With STM, you can nest atomic blocks, and everything works out the way you'd expect.