Storm is a distributed realtime computation system. Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing realtime computation. Storm is simple, can be used with any programming language, and is a lot of fun to use!
C - Most accessible distributed realtime computation system available
A - Learn about and start using Storm
B - You will get a great new tool in your technology stack, with interesting uses
CEP (complex event processing) - continuous
Not HFT-grade (high-frequency trading)
scaling is painful
poor fault tolerance
coding is hard
tweets, stock ticks, manufacturing machine data, sensor messages
DAG
runs continuously
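To make the "DAG that runs continuously" idea concrete, here is a minimal sketch in plain Python. It is not Storm's API: the three stage names are hypothetical, and generators stand in for the nodes of a small linear graph (source, splitter, counter) fed by an endless stream.

```python
import itertools
from collections import Counter


def sentence_source():
    """Source node: an endless stream of sentences (stands in for a live feed)."""
    sentences = ["the cow jumped over the moon", "the man went to the store"]
    # cycle() never terminates, mirroring a computation that runs continuously.
    yield from itertools.cycle(sentences)


def split_words(sentences):
    """Middle node: turn each incoming sentence into a stream of words."""
    for sentence in sentences:
        for word in sentence.split():
            yield word


def running_counts(words):
    """Sink node: emit (word, count so far) every time a word arrives."""
    counts = Counter()
    for word in words:
        counts[word] += 1
        yield word, counts[word]


# Wire the nodes into a graph: source -> splitter -> counter.
pipeline = running_counts(split_words(sentence_source()))

# The pipeline has no natural end, so a demo takes only the first few updates.
first_updates = list(itertools.islice(pipeline, 8))
print(first_updates)
```

Unlike a batch job, nothing here ever "finishes": results are emitted as each input arrives, and the caller decides how long to keep listening.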
abstractions like Cascading, Hive, and Pig make MapReduce approachable
code size reduction
Kestrel - via Thrift
Kafka - transactional topologies, idempotency, exactly-once processing
ActiveMQ
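One common way to get an exactly-once effect from a source that can replay messages is to store the id of the last applied batch next to the state it changed. The sketch below is plain Python, not Storm or Kafka code. The class and method names are hypothetical, and it assumes batches are applied in order and that a replayed batch carries the same id as the original.

```python
class IdempotentCounter:
    """Counts per key, ignoring a batch that has already been applied."""

    def __init__(self):
        # key -> (count, id of the last batch applied to that key)
        self._state = {}

    def apply_batch(self, txid, key, increment):
        count, last_txid = self._state.get(key, (0, None))
        if last_txid == txid:
            # Same batch id as last time: this is a replay, so skip the update.
            return count
        count += increment
        # Count and batch id are stored together so they cannot drift apart.
        self._state[key] = (count, txid)
        return count

    def get(self, key):
        return self._state.get(key, (0, None))[0]


counter = IdempotentCounter()
counter.apply_batch(1, "storm", 3)
counter.apply_batch(1, "storm", 3)  # replay after a simulated failure: no effect
counter.apply_batch(2, "storm", 2)
print(counter.get("storm"))
```

The replayed batch leaves the count unchanged, so delivering a message more than once still yields the result of processing it once.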
current architecture
data ingest tool for Hadoop (avoid Flume madness)
new architecture
Trending Topics (stream processing of the firehose)
computing the ‘reach’ of a URL (Distributed RPC)
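As a rough illustration of the reach computation, the sketch below assumes "reach" means the number of distinct people exposed to a URL: everyone who follows someone who tweeted it. It is single-process Python with made-up in-memory data, not a Distributed RPC topology. The lookup tables and the function name are hypothetical.

```python
# Hypothetical lookup tables; a real system would query these from a datastore.
TWEETERS_BY_URL = {
    "example.com/a": ["alice", "bob"],
}
FOLLOWERS_BY_USER = {
    "alice": ["carol", "dave"],
    "bob": ["dave", "erin"],
}


def reach(url):
    """Count distinct followers of everyone who tweeted the URL."""
    exposed = set()
    for tweeter in TWEETERS_BY_URL.get(url, []):
        # A set removes people who follow more than one tweeter (dave here).
        exposed.update(FOLLOWERS_BY_USER.get(tweeter, []))
    return len(exposed)


print(reach("example.com/a"))
```

The follower lookups are independent of each other, which is why this kind of query lends itself to being fanned out across many workers and merged at the end.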
C - Exciting times, much like the beginning of Hadoop/NoSQL
A - Start tinkering with Storm, integrate it into your workflows
B - Be more responsive in turning data into information