2. Parallel Processing - What is it?
• Concurrent System
– Two or more actions progressing in parallel
– This can be on a single core
• Parallel System
– Two or more actions executing in parallel
– This requires multiple cores
– Parallel systems are a subset of concurrent systems
• Distributed System
– Two or more actions executing in parallel
– This requires multiple connected machines
– Communication is primarily via messaging
3. Why do application developers need to know?
• Multi-core processors
• Large volume of data
• Analytics and other new fields
• Cheaper hardware
4. OS Resource Challenges
[Figure: OS resources an application contends for: CPU, registers, cache, memory (ROM/RAM), distributed memory, DB, network (NW), web, service, queue]
5. Representation of an Application
[Figure: Source -> Data Transform -> Data Sink]
• Source: UI, DB, queue, network, service
• Sink: UI, DB, queue, network, service
6. Abstraction of Concurrency
• A program is the execution of a sequence of atomic statements
• A concurrent program is an interleaving of the atomic statements of its processes
• All possible interleavings should produce the same result
• No process should be excluded from any arbitrary interleaving (fairness)
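The interleaving model can be made concrete by enumerating every interleaving of two tiny processes. The sketch below is illustrative only: the class name `Interleavings` and the three-statement "load, add, store" increment are assumptions chosen for the demo. It collects every final value of a shared variable that some interleaving can produce.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo: two processes each run "load; add 1; store" on a shared
// variable n that starts at 0. Every interleaving of the six atomic
// statements is explored and the final values of n are collected.
final class Interleavings {

    // pcA / pcB: index of the next statement (0 = load, 1 = add, 2 = store).
    // regA / regB: the private register of each process.
    static void explore(int pcA, int pcB, int n, int regA, int regB, Set<Integer> results) {
        if (pcA == 3 && pcB == 3) {
            results.add(n); // both processes have finished
            return;
        }
        if (pcA < 3) { // schedule the next statement of process A
            int reg = regA, shared = n;
            if (pcA == 0) reg = shared;        // load
            else if (pcA == 1) reg = reg + 1;  // add
            else shared = reg;                 // store
            explore(pcA + 1, pcB, shared, reg, regB, results);
        }
        if (pcB < 3) { // schedule the next statement of process B
            int reg = regB, shared = n;
            if (pcB == 0) reg = shared;        // load
            else if (pcB == 1) reg = reg + 1;  // add
            else shared = reg;                 // store
            explore(pcA, pcB + 1, shared, regA, reg, results);
        }
    }

    static Set<Integer> finalValues() {
        Set<Integer> results = new TreeSet<>();
        explore(0, 0, 0, 0, 0, results);
        return results;
    }

    public static void main(String[] args) {
        System.out.println("Possible final values: " + finalValues());
    }
}
```

Two different final values (1 and 2) are reachable, so this program breaks the rule that all interleavings produce the same result. Making the increment a single atomic statement, for example with a `synchronized` method, leaves 2 as the only outcome.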
8. To go Parallel?
• Speedup
– Amdahl’s Law
• Speedup <= 1/((1 – PCTpar) + (PCTpar/P))
– PCTpar -> fraction of serial execution time that can run in parallel
– P -> Number of cores
• 75% PCTpar gives at most about 2.9x speedup on 8 cores
– Gustafson-Barsis’s Law
• Speedup <= P + (1 – P) S
– P -> Number of cores
– S -> Fraction of time spent in serial code
• Efficiency
– Speedup/Cores -> resource utilization as a percentage
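The bounds above are easy to evaluate directly. The following is a minimal sketch; the class and method names (`SpeedupLaws`, `amdahl`, `gustafson`, `efficiency`) are illustrative and not taken from the slides. Fractions are passed as values between 0 and 1.

```java
// Sketch of the speedup bounds; all names here are illustrative.
final class SpeedupLaws {

    // Amdahl's Law: parallelFraction is the share of serial run time
    // that can execute in parallel (0..1).
    static double amdahl(double parallelFraction, int cores) {
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / cores);
    }

    // Gustafson-Barsis's Law: serialFraction is the share of time
    // spent in serial code (0..1).
    static double gustafson(double serialFraction, int cores) {
        return cores + (1 - cores) * serialFraction;
    }

    // Efficiency: speedup divided by the number of cores (0..1).
    static double efficiency(double speedup, int cores) {
        return speedup / cores;
    }

    public static void main(String[] args) {
        double s = amdahl(0.75, 8);
        System.out.printf("Amdahl, 75%% parallel, 8 cores: %.2fx%n", s);
        System.out.printf("Efficiency: %.0f%%%n", 100 * efficiency(s, 8));
        System.out.printf("Gustafson-Barsis, 25%% serial, 8 cores: %.2fx%n", gustafson(0.25, 8));
    }
}
```

For 75% parallel code on 8 cores, Amdahl's bound works out to roughly 2.91x, which is an efficiency of about 36%. With a 25% serial share on 8 cores, the Gustafson-Barsis bound is 6.25x.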
9. Things to Consider
• Identify independent computations
• Implement concurrency at the highest level
• Make no assumptions about the number of cores
• Use the processing model that best fits the problem
• Never assume a particular execution order
• Use thread-local storage or associate locks with specific data
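Several of these points can be seen together in one small sketch. The workload (summing an array) and the name `ParallelSum` are assumptions made for illustration. The chunks are independent computations, the worker count comes from the runtime rather than a hard-coded number, each task accumulates into its own local variable, and the combined result does not depend on which task finishes first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: sum an array by splitting it into independent chunks.
final class ParallelSum {

    static long sum(int[] data) {
        // Ask the runtime instead of assuming a core count.
        int workers = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunk = (data.length + workers - 1) / workers; // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int from = Math.min(data.length, w * chunk);
                final int to = Math.min(data.length, from + chunk);
                // Each task writes only to its own local variable, so no lock is needed.
                parts.add(pool.submit(() -> {
                    long local = 0;
                    for (int i = from; i < to; i++) {
                        local += data[i];
                    }
                    return local;
                }));
            }
            // Partial sums are combined in submission order, whatever order
            // the tasks actually completed in.
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while summing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a chunk failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data));
    }
}
```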
10. Methodology
• Start with tuned, functionally correct serial code
• Analysis: identify possible concurrency
– Identify hotspots, typically with a profiler
• Calculate speedup and efficiency
• Design and implementation
• Test for correctness
– Check loop iteration counts and rounding differences against the serial results
• Tune for performance
– Give each thread enough work to compensate for the overheads
• Measure Speedup as a multiplier (e.g. 2x faster)
– Serial code elapsed time/parallel code elapsed time
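The measurement step can be sketched as follows. The workload (a sum of squares computed with `LongStream`, serially and with `.parallel()`) and all names are assumptions made for illustration. A single timed run like this is only a rough indication: a careful benchmark would add JIT warm-up and repeated runs.

```java
import java.util.stream.LongStream;

// Hypothetical harness: check correctness, then report speedup and efficiency.
final class SpeedupMeasurement {

    // Elapsed wall-clock time of one run, in nanoseconds.
    static long elapsedNanos(Runnable work) {
        long start = System.nanoTime();
        work.run();
        return System.nanoTime() - start;
    }

    // Speedup as a multiplier: serial elapsed time / parallel elapsed time.
    static double speedup(long serialNanos, long parallelNanos) {
        return (double) serialNanos / parallelNanos;
    }

    // Illustrative workload: sum of i*i for 0 <= i < n.
    static long serialWork(long n) {
        return LongStream.range(0, n).map(i -> i * i).sum();
    }

    static long parallelWork(long n) {
        return LongStream.range(0, n).parallel().map(i -> i * i).sum();
    }

    public static void main(String[] args) {
        long n = 2_000_000L;
        // Test for correctness before measuring anything.
        if (serialWork(n) != parallelWork(n)) {
            throw new AssertionError("parallel result differs from serial result");
        }
        long serial = elapsedNanos(() -> serialWork(n));
        long parallel = elapsedNanos(() -> parallelWork(n));
        double s = speedup(serial, parallel);
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.printf("speedup %.2fx, efficiency %.0f%%%n", s, 100 * s / cores);
    }
}
```

If the reported speedup is below 1x, the workload is probably too small to pay for the threading overhead.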
12. Highlights of Process Development
• Single threaded functional code (Java)
• Threads spawned off for each message
– Data decomposition
– Task decomposition
– Concurrency at the highest level
• Monitor (synchronized) and condition variable
• Code to avoid deadlocks
• Tested and tuned to remove bottlenecks
• Scalable for any number of cores
13. Monitor
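public class Monitor {
    // Instance state: the synchronized methods lock on this object,
    // so the fields they guard must belong to it as well.
    private int count;
    private final int max;

    public Monitor(int num) {
        count = num;
        max = num;
    }

    // True while at least one permit is currently held.
    public synchronized boolean isSemaphoreUsed() {
        return (count < max);
    }

    private synchronized void incCount() {
        count++;
    }

    private synchronized void decCount() {
        count--;
    }

    public synchronized void acquire() {
        // Deliberate flaw: this loop spins while holding the monitor lock,
        // so release() can never enter and count never changes.
        // Once count reaches 0, acquire() never comes out.
        while (count == 0) {
        }
        decCount();
    }

    public synchronized void release() {
        incCount();
    }
}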
14. Monitor with Condition Variables
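public class Monitor {
    // Instance state, guarded by this object's intrinsic lock.
    private int count;
    private final int max;

    public Monitor(int num) {
        count = num;
        max = num;
    }

    // True while at least one permit is currently held.
    public synchronized boolean isSemaphoreUsed() {
        return (count < max);
    }

    private synchronized void incCount() {
        count++;
    }

    private synchronized void decCount() {
        count--;
    }

    // Blocks until a permit is free. wait() releases the lock while the
    // thread is blocked, so release() can get in. The loop rechecks the
    // condition after every wakeup, including spurious ones.
    // Interruption is propagated rather than swallowed, so a caller never
    // mistakes an interrupted wait for a successful acquire.
    public synchronized void acquire() throws InterruptedException {
        while (count == 0) {
            this.wait();
        }
        decCount();
    }

    // Returns a permit and wakes one waiting thread.
    public synchronized void release() {
        incCount();
        this.notify();
    }
}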
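A short usage sketch shows what the wait/notify monitor buys. It is self-contained, so it restates the monitor in compact form under the hypothetical name `PermitMonitor`; `PermitDemo` and `peakConcurrency` are likewise illustrative names. Several threads compete for a small number of permits, and the demo records the largest number of threads that were inside the guarded region at once.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo of a counting monitor that caps concurrent workers.
final class PermitDemo {

    // Runs `threads` workers through a monitor holding `permits` permits and
    // returns the highest number of workers observed inside at the same time.
    static int peakConcurrency(int threads, int permits) {
        PermitMonitor monitor = new PermitMonitor(permits);
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    monitor.acquire();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return; // no permit was taken, so nothing to release
                }
                try {
                    int now = inside.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    Thread.sleep(10); // simulated work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inside.decrementAndGet();
                    monitor.release(); // always hand the permit back
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining workers", e);
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency = " + peakConcurrency(8, 2));
    }
}

// Compact restatement of the wait/notify monitor so the demo compiles on its own.
final class PermitMonitor {
    private int count;

    PermitMonitor(int permits) {
        count = permits;
    }

    synchronized void acquire() throws InterruptedException {
        while (count == 0) {
            wait(); // releases the lock while blocked
        }
        count--;
    }

    synchronized void release() {
        count++;
        notify();
    }
}
```

The reported peak never exceeds the number of permits, however many threads are started.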
15. General Design Guidelines
• Keep it simple, stupid (KISS)
– Break the problem into simple components
• Keep the number of layers to a minimum
– Don’t add layers just because they are fashionable in the industry
• Do the right thing
– Skeptics will supply the scenarios you need to take care of