
Cascalog

  1. Cascalog: Nathan Marz, BackType. Powerful and easy-to-use data analysis tool for Hadoop
  2. About Me: Tech Lead at BackType; have been working on many-terabyte scale systems for two years; ETL workflows; data warehouses
  3. What is Hadoop? Distributed filesystem; MapReduce framework; scales to thousands of machines and petabytes of data
  4. What is Cascalog? Clojure-based query language for Hadoop with Datalog-inspired syntax; queries compile to one or more MapReduce jobs; the tool I wish I had two years ago
  5. Features: inner and outer joins; aggregators; functions; subqueries; sorting; high performance
  6. What sets Cascalog apart? Super simple; full power of Clojure always available; easy to extend with custom operations; dynamic queries; arbitrary inputs and outputs
  7. What sets Cascalog apart? Super simple; full power of Clojure always available; easy to extend with custom operations; dynamic queries; arbitrary inputs and outputs
  8. Experiment with Cascalog: ships with a test dataset that can be queried locally (the “playground”); 5 minutes to set up Hadoop, Clojure, and Cascalog locally (see README)
  9. News feed generator: ranks events in a social network for each person based on “importance” and recency; 38 lines of code
  10. Demo time!
  11. News Feed: “Follows” and “Action” data sources; text files on HDFS
  12. News Feed
  13. News Feed: custom aggregator to produce a news feed in JSON-like form
  14. News Feed: custom function to score each item in the feed
  15. News Feed: data sources
  16. News Feed: subquery to compute follower count for each person
  17. News Feed: tie everything together in a single Cascalog query
  18. Questions? Project page: http://www.github.com/nathanmarz/cascalog Tutorial: http://nathanmarz.com/blog/introducing-cascalog Follow me on Twitter: @nathanmarz
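
The news feed slides (9 and 11 to 17) name the pieces of the demo: two data sources, a subquery for follower counts, a scoring function, and an aggregator that builds each person's feed. The deck's actual Cascalog (Clojure) code is not part of this transcript, so what follows is only a plain-Python sketch of that data flow, not the author's implementation. The sample rows, the function names, the `limit` parameter, and above all the scoring formula are assumptions made for illustration; the slides say only that ranking depends on "importance" and recency.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows standing in for the "Follows" and "Action" text files.
FOLLOWS = [  # (follower, followed)
    ("alice", "bob"),
    ("carol", "bob"),
    ("alice", "carol"),
    ("bob", "alice"),
]
ACTIONS = [  # (person, action, timestamp in seconds)
    ("bob", "posted a photo", 7200),
    ("carol", "wrote a comment", 7200),
    ("alice", "joined a group", 3600),
]


def follower_counts(follows):
    """Subquery step: count how many followers each person has."""
    return Counter(followed for _, followed in follows)


def score(follower_count, timestamp, now):
    """Assumed scoring rule: follower count discounted by the item's age in hours."""
    age_hours = max(now - timestamp, 0) / 3600
    return follower_count / (1.0 + age_hours)


def news_feeds(follows, actions, now, limit=3):
    """Join follows with actions, score each item, and keep the top items per person."""
    counts = follower_counts(follows)

    # Index actions by the person who performed them (the join key).
    actions_by_person = defaultdict(list)
    for person, action, timestamp in actions:
        actions_by_person[person].append((action, timestamp))

    # Join: each follower sees the actions of the people they follow.
    feeds = defaultdict(list)
    for follower, followed in follows:
        for action, timestamp in actions_by_person[followed]:
            feeds[follower].append({
                "person": followed,
                "action": action,
                "score": score(counts[followed], timestamp, now),
            })

    # Aggregation step: highest score first, truncated to `limit` items.
    return {
        person: sorted(items, key=lambda item: item["score"], reverse=True)[:limit]
        for person, items in feeds.items()
    }
```

Calling `news_feeds(FOLLOWS, ACTIONS, now=7200)` returns a dictionary keyed by follower, where each value is a list of dictionaries sorted by descending score. In Cascalog the join, scoring, and aggregation would be expressed declaratively in one query that compiles to MapReduce jobs; the explicit loops here only make the same stages visible.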
