Pig at LinkedIn
Chris Riccomini
9/29/10
Pig at LinkedIn by Chris Riccomini from LinkedIn
Pig is an integral part of data analytics at LinkedIn. Learn about LinkedIn’s analytic stack, and see how Pig is used to design, develop, and deliver data products at LinkedIn. We’ll explore a successful example of Pig deployment at LinkedIn, pain points, and integration with Azkaban, Voldemort, Hadoop, and the rest of LinkedIn’s ecosystem.

Published in: Education, Technology, Business

  1. Pig at LinkedIn, Chris Riccomini, 9/29/10
  2. Who?
  3. What?
  4. LinkedIn Analytics
  5. Pig at LinkedIn
  6. Why?
  7. Production Quality
  8. Streaming
  9. Serialization
  10. VoldemortStorage ~ Avro
  11. views = LOAD '/data/awesome' USING VoldemortStorage();
  12. Voldemort ♥ Pig
  13. Partitioning
  14. YYYY/MM/DD
  15. Last N days?
  16. views = LOAD '/data/etl/tracking/extracted/profile-view' USING VoldemortStorage('date.range', 'num.days=90;days.ago=1');
  17. Some-file-YYYY-MM-DD
  18. member_position = LOAD '/data/etl/replicated/member/member_position/#LATEST' USING VoldemortStorage();
  19. Scheduling
  20. Azkaban
  21. type=pig
      pig.script=myscript.pig
  22. Ad hoc?
  23. Future at LinkedIn
  24. Wishes
  25. Dates
  26. Fix Data Types
  27. JSON
  28. Cross Platform
  29. Questions?
      • criccomini@linkedin.com
      • http://www.riccomini.name
      • http://www.sna-projects.com
      • http://www.project-voldemort.com
      • @criccomini
      • LinkedIn is Hiring! Email me!
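
The partitioning slides (13 to 18) only show the loader calls, not how the paths are resolved. The Python sketch below is one possible reading, not LinkedIn's implementation. It assumes that 'num.days=90;days.ago=1' selects 90 consecutive daily YYYY/MM/DD directories ending one day before today, and that '#LATEST' picks the file whose trailing YYYY-MM-DD stamp is newest. The function names (parse_options, date_range_paths, resolve_latest) and the example file names are hypothetical.

```python
from datetime import date, timedelta


def parse_options(spec):
    """Parse a 'key=value;key=value' string into a dict of ints."""
    options = {}
    for pair in spec.split(";"):
        key, _, value = pair.partition("=")
        options[key.strip()] = int(value)
    return options


def date_range_paths(base, spec, today):
    """Return one base/YYYY/MM/DD path per day, oldest first.

    Assumption: the window ends `days.ago` days before `today`
    and covers `num.days` consecutive days.
    """
    opts = parse_options(spec)
    end = today - timedelta(days=opts["days.ago"])
    days = [end - timedelta(days=i) for i in range(opts["num.days"])]
    return [f"{base}/{d:%Y/%m/%d}" for d in reversed(days)]


def resolve_latest(names):
    """Pick the newest name, assuming each ends in -YYYY-MM-DD.

    ISO-style dates sort correctly as plain strings, so comparing
    the last 10 characters is enough.
    """
    return max(names, key=lambda name: name[-10:])


if __name__ == "__main__":
    paths = date_range_paths(
        "/data/etl/tracking/extracted/profile-view",
        "num.days=90;days.ago=1",
        date(2010, 9, 29),
    )
    print(paths[0], paths[-1], len(paths))
    print(resolve_latest(["some-file-2010-09-27", "some-file-2010-09-28"]))
```

Passing today explicitly keeps the expansion deterministic, which matters when a scheduled job is re-run for a past date.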
