1. Why Erlang?
Brad Anderson
BarCamp Atlanta
Oct. 18, 2008
brad@sankatygroup.com
twitter: @boorad
http://boorad.weebly.com
blog url (for now)
2. Huh? Erlang?
Programming Language created at
Ericsson (20 yrs old now)
Designed for scalable, long-lived
systems
Compiled, Functional, Dynamically
Typed, Open Source
about 20 yrs old, open source since 1998.
like a mobile telephone grid
compiled (but to bytecode for a VM)
open source, but no access to VCS, just tarballs
3. 3 Biggies
Massively Concurrent
Seamlessly Distributed
Fault Tolerant
Why Erlang?
Here are my three big ticket items
- massively concurrent
- seamlessly distributed into multi-machine clusters
- extremely fault tolerant
Great for my projects
- data storage & retrieval
- scalable web apps
Maybe not so hot for computationally intensive projects
- unless they lend themselves to parallelism
4. Big #1 - Concurrent
User space “green” threads
VM manages processes across
kernel threads to maximize CPU
utilization across all available cores
Quad core? Almost 4 times faster
No mutable data structures
userspace threads != OS threads, so we can have thousands or more of these little guys
32- and 64-core processors coming
Properly written Erlang code can run close to N times faster on an N-core processor.
I have spawned 500,000 processes on my MBP - it didn't break a sweat
no mutable data == no locks, mutexes, semaphores == easy to parallelize
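A rough way to try the process-count claim yourself. The module name spawn_many and its functions are made up for this sketch, and the timing you get will depend on your machine and VM settings.

```erlang
-module(spawn_many).
-export([run/1]).

%% Spawn N processes that each block waiting for a stop message.
%% Returns {Alive, Micros}: how many were alive after spawning and
%% how long the spawning took in microseconds.
run(N) ->
    {Micros, Pids} = timer:tc(fun() ->
        [spawn(fun wait/0) || _ <- lists:seq(1, N)]
    end),
    Alive = length([P || P <- Pids, is_process_alive(P)]),
    %% Clean up so the processes do not linger.
    lists:foreach(fun(P) -> P ! stop end, Pids),
    {Alive, Micros}.

%% Each process just sits in receive until told to stop.
wait() ->
    receive
        stop -> ok
    end.
```

Try spawn_many:run(100000) in the shell. The VM's default process limit may be below 500,000; if so, start the VM with a higher limit, for example erl +P 1000000.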
5. Big #1 - Concurrent
image: http://english.people.com.cn/200512/21/images/pop2.jpg
Processes are self-contained
Think of objects in Java, Python, etc.
Each process has its own stack
GC is per-process
If a process crashes, it does not affect any other processes
Message Passing Concurrency
No Shared Memory - I have mine, you have yours. Two separate brains.
To change your memory, I send you a message.
We understand concurrency Erlang-style, because the world outside of programming is parallel
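"To change your memory, I send you a message" can be sketched as a tiny counter process. The module name counter_sketch and its message shapes are invented for illustration; the only way to touch the count is to send the process a message.

```erlang
-module(counter_sketch).
-export([start/0, increment/1, value/1]).

%% Start a process whose private state is a count, initially 0.
start() ->
    spawn(fun() -> loop(0) end).

%% Ask the process to bump its own count. Nothing is shared.
increment(Pid) ->
    Pid ! increment,
    ok.

%% Ask the process for its count and wait for the answer.
value(Pid) ->
    Pid ! {value, self()},
    receive
        {Pid, Count} -> Count
    end.

%% The state lives only in the argument of this tail-recursive loop.
loop(Count) ->
    receive
        increment ->
            loop(Count + 1);
        {value, From} ->
            From ! {self(), Count},
            loop(Count)
    end.
```

Messages between the same two processes arrive in the order they were sent, so two increment calls followed by value returns 2.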
6. Big #1 - Concurrent
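-module(test).
-compile(export_all).
%% Spawn the server process, starting loop/1 with an empty state.
start() ->
    spawn(fun() -> loop([]) end).
%% Send Query to Pid and wait for its reply.
rpc(Pid, Query) ->
    Pid ! {self(), Query},
    receive
        {Pid, Reply} ->
            Reply
    end.
%% Print each request, acknowledge it so rpc/2 returns, then wait again.
loop(X) ->
    receive
        {From, Query} ->
            io:format("Received:~p~n", [Query]),
            From ! {self(), ok},
            loop(X)
    end.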
Programming Erlang First Edition - Joe Armstrong
start/0 will spawn the new process, firing off loop/1
loop/1 is tail recursive and waits for a message
rpc/2 sends a message to the process and waits for a reply
7. Big #2 - Distributed
Nodes are separate OS processes,
instances of the VM
They are completely separate from
each other, but connect to form a
cluster
Processes can be started on any
node in the cluster
So, start up two nodes on different servers, and you are distributed.
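Starting a process on another node looks almost the same as starting one locally. A minimal sketch, assuming a made-up module remote_sketch that is loaded on both nodes (the fun belongs to the module, so the remote node needs the same code):

```erlang
-module(remote_sketch).
-export([where_am_i/1]).

%% Start a process on Node and have it report which node it runs on.
%% Returns the node name, or the atom timeout if no answer arrives
%% within five seconds.
where_am_i(Node) ->
    Caller = self(),
    Pid = spawn(Node, fun() -> Caller ! {self(), node()} end),
    receive
        {Pid, RunningOn} -> RunningOn
    after 5000 ->
        timeout
    end.
```

To try it across machines, start one VM with erl -sname a and another with erl -sname b (both need the same cookie), connect them with net_adm:ping/1, then call remote_sketch:where_am_i/1 with the other node's name. The node names here are placeholders.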
8. Big #2 - Distributed
Here we have different nodes started on different machines that form clusters.
There are three clusters split across three machines, but I could reallocate if Red needed more horses.
All of this is fairly seamless - if a new node shows up, it's added to the cluster and can begin to handle new processes for the cluster.
9. Big #3 - Fault Tolerant
Links are formed between processes
If a process exits abnormally, all
linked processes exit too.
System processes trap exits.
Instead of exiting, they receive a
message with the Pid and exit
status of linked processes
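Trapping exits can be sketched in a few lines. The module link_sketch is hypothetical; it runs a fun in a linked process and reports how that process ended instead of dying with it.

```erlang
-module(link_sketch).
-export([watch/1]).

%% Run Fun in a linked process and return {exited, Reason}.
%% With trap_exit set, the exit signal from the linked process
%% arrives as an ordinary {'EXIT', Pid, Reason} message.
watch(Fun) ->
    Old = process_flag(trap_exit, true),
    Pid = spawn_link(Fun),
    Result = receive
                 {'EXIT', Pid, Reason} -> {exited, Reason}
             end,
    %% Put the flag back the way we found it.
    process_flag(trap_exit, Old),
    Result.
```

A fun that simply returns gives {exited, normal}; a fun that calls exit(boom) gives {exited, boom}. Without trap_exit, the second case would take the caller down too.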
Back to our real-world thinking: if someone dies, people will notice.
The mobile telecom system of the UK is a production Erlang application.
9 nines of reliability - 32ms of downtime per year
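Where the 32ms comes from: nine nines means the system may be down for one billionth of the time. A small sketch of the arithmetic, using a made-up module name and assuming a 365.25-day year:

```erlang
-module(nines).
-export([downtime_ms_per_year/1]).

%% Allowed downtime per year, in milliseconds, for a given
%% number of nines of availability (9 means 99.9999999%).
downtime_ms_per_year(Nines) ->
    Unavailability = math:pow(10, -Nines),
    %% Assumption: a year is 365.25 days.
    MsPerYear = 365.25 * 24 * 60 * 60 * 1000,
    Unavailability * MsPerYear.
```

For 9 nines this comes to roughly 31.6 ms per year, which rounds to the 32ms above.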
10. Big #3 - Fault Tolerant
Supervision Trees
Worker processes do the real work
Supervisor processes monitor workers and restart them as needed
Supervisors can also monitor other supervisors
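A minimal sketch of one supervisor watching one worker, using OTP's supervisor behaviour. The module name sup_sketch, the worker, and its crash message are invented for illustration, and the map-style flags assume OTP 18 or later.

```erlang
-module(sup_sketch).
-behaviour(supervisor).
-export([start_link/0, init/1, start_worker/0, worker_loop/0]).

%% Start the supervisor process.
start_link() ->
    supervisor:start_link(?MODULE, []).

%% Supervisor callback: restart only the child that died
%% (one_for_one), giving up after 5 restarts in 10 seconds.
init([]) ->
    SupFlags = #{strategy => one_for_one, intensity => 5, period => 10},
    Child = #{id => worker,
              start => {?MODULE, start_worker, []},
              restart => permanent},
    {ok, {SupFlags, [Child]}}.

%% The supervisor calls this to start (and restart) the worker.
%% It must return {ok, Pid} for a process linked to the supervisor.
start_worker() ->
    {ok, proc_lib:spawn_link(?MODULE, worker_loop, [])}.

%% The worker does nothing useful; send it crash to make it die.
worker_loop() ->
    receive
        crash -> exit(boom);
        _Other -> worker_loop()
    end.
```

After sup_sketch:start_link(), supervisor:which_children/1 shows the worker's pid. Send that pid the atom crash and a moment later which_children/1 shows a fresh pid: the supervisor restarted the worker.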
11. Other Goodies
Lists & Comprehensions
Pattern Matching
Higher-Order Functions
bit syntax
Live Code reloading
ets & dets
Mnesia
OTP
Dialyzer
Lisp or Python comprehensions - leads us to map/reduce goodness
Pattern Matching from Prolog, very cool for elegant coding
functions are first class, can be passed around, and maintain scope for closures
bit syntax is great for working w/ protocols
live code reloading helps with those 9 nines, no downtime
ets & dets are efficient term storage mechanisms
Mnesia - built-in distributed database, no need for ORM, no impedance mismatch, stores Erlang terms
OTP - libraries!
Dialyzer - static analysis, finds type errors and unreachable code
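A few of these goodies in one small sketch. The module goodies and everything in it are made up for illustration, including the 3-byte header layout used for the bit syntax example.

```erlang
-module(goodies).
-export([squares_of_evens/1, area/1, adder/1, parse_header/1]).

%% List comprehension: squares of the even numbers in a list.
squares_of_evens(List) ->
    [X * X || X <- List, X rem 2 =:= 0].

%% Pattern matching in function heads: one clause per shape.
area({square, Side}) -> Side * Side;
area({rectangle, W, H}) -> W * H;
area({circle, R}) -> math:pi() * R * R.

%% Higher-order function: returns a closure that remembers N.
adder(N) ->
    fun(X) -> X + N end.

%% Bit syntax: split a hypothetical header into its fields.
%% Assumed layout: 4-bit version, 4-bit type, 16-bit length,
%% and the rest of the binary is the payload.
parse_header(<<Version:4, Type:4, Length:16, Payload/binary>>) ->
    {Version, Type, Length, Payload}.
```

For example, lists:map(goodies:adder(5), [1, 2]) passes the closure to a library function, and goodies:parse_header/1 pulls apart a binary in a single function head with no manual shifting or masking.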
12. Erlang Hotness
Facebook Chat
Meebo
RabbitMQ
ejabberd
OpenPoker
CouchDB
Scalaris
github / Engine Yard
Yaws
Mochiweb
Yahoo! delicious2
Facebook Chat - already huge, new feature had to scale, chose Erlang
Meebo - web-based IM
RabbitMQ - super-scalable message broker
ejabberd - super scalable XMPP server
OpenPoker - high volume poker server
CouchDB - document database, stores JSON docs, not relational
Scalaris - Distributed Key Value System, colossal amounts of data, ACID, very fast retrieval
github / Engine Yard - project provisioning
Yaws - super scalable web server (ditto for mochiweb)
delicious2 used Erlang to port data to new system, role in PROD system now?
13. Yaws vs. Apache
throughput (KB/s) vs load
http://www.sics.se/~joe/apachevsyaws.html
Apache (blue, green) dies at a load of about 4,000 parallel sessions
red curve is yaws on NFS, blue is Apache running on NFS, green is Apache on local file system
14. Credits
Joe Armstrong
Programming Erlang - http://www.pragprog.com/titles/jaerlang
http://www.pragprog.com/articles/erlang
Toby DiPasquale - http://cbcg.net/talks/erlang.pdf
Sam Tesla - http://www.alieniloquent.com/talks/ErlangConcepts.pdf