Hadoop meets Cloud with Multi-Tenancy

CTO Kaz's talk at Hadoop Conference Japan 2013 Winter.

  1. Treasure Data: Hadoop meets Cloud with Multi-Tenancy. Kazuki Ohta, Founder and CTO at Treasure Data, Inc. Hadoop User Group (Hadoopユーザー会). k@treasure-data.com, @kzk_mover. Friday, April 5, 13
  2. Who are you?
     • Kazuki Ohta (太田一樹): @kzk_mover, k@treasure-data.com
     • Treasure Data, Inc.: Chief Technology Officer, founded July 2011
     • Hadoop User Group Japan: one of the founders; “Hadoop徹底入門”
     • Open-source enthusiast: Hadoop, memcached, jemalloc, MongoDB, uim, etc.
  3. Treasure Data = Cloud + Big Data. [Figure, © 2012 Forrester Research, Inc., Reproduction Prohibited. Labels: Cloud; On-Premise; Data Volume (1 billion entries or 10 TB); Big Data-as-a-Service; Database-as-a-service; Enterprise; Lightweight RDBMS; Traditional RDBMS; Data Warehouse; DB2; $34B market; $10B market]
  4. What is the Problem?
  5. Big Data? NoSQL?
  6. Too Many Solutions
  7. Hadoop Versions: Too Many Variations (+Eco System). From http://marblejenka.blogspot.jp/2013/01/hadoop.html
  8. Current Big Data Solutions: ‘Feature Creep’. http://en.wikipedia.org/wiki/Feature_creep
  9. We need a Machete :) EVERYTHING with ONE interface. Simple & Discoverable. “Machete Design” by James Lindenbaum, Heroku Co-Founder: http://www.youtube.com/watch?v=3BhDLm9jo5Y
  10. ‘Simplicity’ itself is a feature :) by Anand Babu Periasamy, GlusterFS Co-Founder
  11. Next Topic: Cloud?
  12. http://www.saasblogs.com/saas/demystifying-the-cloud-where-do-saas-paas-and-other-acronyms-fit-in/
  13. Battle Field of IaaS Vendors: SCM. In the near future, most HW buyers will not be individual companies but the cloud. [Chart: HW Performance / Price over Time for IaaS Vendors and On-Premise; annotations: “Decrease with Moore’s Law” and “Battle Field: Supply Chain Management”]
  14. PaaS, SaaS: IT is all about Operation. More Sleep, More Value. “With PaaS, you offload your development operations function and have the PaaS provider handle the tools and components required to deploy and manage applications reliably.” - EngineYard
  15. PaaS/SaaS Battle Field: ‘Time’ is Money. [Chart: Customer Value over Time, starting at “Sign-up or PO”; labels: Ideal, Expectation, Reality (On-Premise), HW/SW Selection, PoC, Deploy..., Obsolete over time, Upgrade]
  16. Introduction to Treasure Data
  17. Company Overview. US team as of July 2012
  18. Company Overview
     • Silicon Valley-based company; all founders are Japanese: Hironobu Yoshikawa, Kazuki Ohta, Sadayuki Furuhashi
     • OSS enthusiasts: MessagePack, Fluentd, etc.; cloud native
  19. Our 50+ customers: Fortune Global 500 leaders and start-ups (customer logos not reproduced). 250 billion records / month in Feb 2013; 2 million jobs executed
  20. Vision: Single Analytics Platform for the World
  21. Investors
     • Bill Tai
     • Naren Gupta - Nexus Ventures, Director of Red Hat, TIBCO
     • Othman Laraki - Former VP Growth at Twitter
     • James Lindenbaum, Adam Wiggins, Orion Henry - Heroku Founders
     • Anand Babu Periasamy, Hitesh Chellani - Gluster Founders
     • Yukihiro “Matz” Matsumoto - Creator of Ruby
     • Dan Scheinman - Director of Arista Networks
     • + 10 more people
     • and... Jerry Yang, Founder of Yahoo!, where Hadoop was invented :) Check out this morning’s (2013/01/21) 日経新聞 (Nikkei newspaper)!
  22. Treasure Data’s Philosophy and Architecture
  23. Big Data Adoption Stages (chart with axes Intelligence and Sophistication; stages listed top to bottom as on the slide)
     • Analytics
       - Optimization: What’s the best?
       - Predictive Analysis: What’s a trend?
       - Statistical Analysis: Why?
     • Reporting: Treasure Data’s FOCUS (80% of needs)
       - Alerts: Error?
       - Drill Down Query: Where exactly?
       - Ad-hoc Reports: Where?
       - Standard Reports: What happened?
  24. Full Stack Support for Big Data Reporting
     • Our best-in-class architecture and operations team ensure the integrity and availability of your data.
     • Data from almost any source can be securely and reliably uploaded using td-agent in streaming or batch mode.
     • Our SQL, REST, JDBC, ODBC and command-line interfaces support all major query tools and approaches.
     • You can store gigabytes to petabytes of data efficiently and securely in our cloud-based columnar datastore.
  25. Treasure Data = Collect + Store + Query
  26. Example in AdTech: MobFox
     1. Europe’s largest independent mobile ad exchange.
     2. 20 billion impressions/month (circa Jan. 2013)
     3. Serving ads for 15,000+ mobile apps (circa Jan. 2013)
     4. Needed Big Data Analytics infrastructure ASAP.
  27. Two Weeks From Start to Finish!
  28. Our Value was Proven :) Our Value: Save Time! [Chart: Customer Value over Time, starting at “Sign-up or PO”; labels: Simple Interface, Reality (On-Premise), HW/SW Selection, PoC, Deploy..., Obsolete over time, Upgrade]
  29. Architecture Breakdown
     • Data Collection
       - Increasing variety of data sources
       - No single data schema
       - Lack of streaming data collection method
       - 60% of Big Data project resource consumed
     • Data Store/Analytics
       - Remaining complexity in both traditional DWH and Hadoop (very slow time to market)
       - Challenges in scaling data volume and expanding cost.
     • Connectivity
       - Required to ensure connectivity with existing BI/visualization/apps by JDBC, REST and ODBC.
  30. 1) Data Collection
     • 60% of BI project resources are consumed here
     • Most ‘underestimated’ and ‘unsexy’ but MOST important
     • Fluentd: OSS lightweight but robust Log Collector (http://fluentd.org/)
     • These talks will cover Fluentd :)
       - 15:40∼ Log analysis system with Hadoop in livedoor 2013, by Satoshi Tagomori @ NHN Japan
       - 16:30∼ いかにしてHadoopにデータを集めるか (How to collect data into Hadoop), by Sadayuki Furuhashi @ Treasure Data, Inc.
  31. 2) Data Store / Analytics - Columnar Storage
  32. 3) Connectivity. [Diagram components: REST API, td-command, and JDBC/ODBC Driver (BI apps) as query entry points; Query API; Query Processing Cluster; Treasure Data Columnar Storage; Result outputs: Web App, MySQL, Postgres]
  33. Most Difficult Challenge: Multi-Tenancy
     • All customers share the Hadoop clusters (4 Data Centers)
     • Resource Sharing (Burst Cores), Rapid Improvement, Ease of Upgrade
     [Diagram: Job Submission + Plan Change feeds a Global Scheduler, which performs On-Demand Resource Allocation across a Local FairScheduler in each of datacenters A, B, C, and D]
  34. Conclusion
     • Big Data is too complex
       - Needs Simplicity
       - Machete vs. Swiss Army Knife (Feature Creep)
     • IT is changing
       - The value of Software itself is decreasing
       - Operation is the key
     • Treasure Data = Cloud + Big Data
       - Currently Focusing on Big Data Reporting
       - Instant Value with Simple Interface
  35. We’re hiring top talent, please contact me :)
  36. Appendix
  37. Big Data Market Growth (average of IDC, Gartner and Wikibon stats). [Chart: Big Data Revenue Breakdown, CAGR 38%]
     • “In 2012…BI and Analytics are rated #1 priorities.” - Ravi Kalakota, Gartner
     • “Big Data is the new definitive source of competitive advantage across all industries.” - Jeff Kelly, Wikibon
     • “More than half a billion dollars in venture capital has been invested in new big data technology.” - Dan Vesset, IDC
  38. Big Data Situation. [Chart: Customer Value over Time, starting at “Sign-up or PO”; labels: Treasure Data, AWS RedShift and EMR, Software A, Software B, On-premise solutions, Obsolescence over time]
  39. Treasure Data Service Architecture. [Diagram components: data sources (User, Apache, App, RDBMS, other data sources); Treasure Data columnar data warehouse; MapReduce jobs; Hive, Pig (to be supported); td-command; Query API; Query Processing Cluster; JDBC, REST; BI apps]
  40. Our Own Open Source technologies
     We are open source natives and proud of our heritage. We’ve contributed to Hibernate, Hadoop, Cassandra, Memcached, KDE, MongoDB among others. Our product reflects our deep commitment to the open-source community and is built on top of open source software we’ve authored and open sourced.
     • Fluentd - a popular data collector daemon written in Ruby. www.fluentd.org (leading users: SlideShare/LinkedIn, One Kings Lane)
     • MessagePack - a fast, compact serializer. www.msgpack.org (leading users: Pinterest, Redis)
     Substantial commitment (Code, Packaging, Documentation, Sponsorship). Tech marketing, possible lead gen
  41. Example in Web Industry
  42. Example Use Case – MySQL to TD
  43. Example Use Case – MySQL to TD
  44. Big Data for the Rest of Us. www.treasure-data.com | @TreasureData
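
The multi-tenancy layout on slide 33 (a global scheduler in front of one local FairScheduler per datacenter) can be illustrated with a small sketch. This is not Treasure Data's implementation: the names `Datacenter`, `submit`, and `max_min_fair_share`, the least-loaded routing rule, and the choice of max-min fairness for the local step are all assumptions made for illustration. The slides only state that all customers share the clusters and that resources are allocated on demand.

```python
from dataclasses import dataclass, field


@dataclass
class Datacenter:
    """Hypothetical datacenter: a core budget plus per-tenant demand."""
    name: str
    capacity: int  # total cores in this datacenter
    demands: dict[str, int] = field(default_factory=dict)  # tenant -> cores wanted

    def load(self) -> float:
        """Requested cores as a fraction of capacity (may exceed 1.0)."""
        return sum(self.demands.values()) / self.capacity


def submit(datacenters: list[Datacenter], tenant: str, cores: int) -> Datacenter:
    """Global step (assumed rule): route the job to the least-loaded datacenter."""
    target = min(datacenters, key=lambda dc: (dc.load(), dc.name))
    target.demands[tenant] = target.demands.get(tenant, 0) + cores
    return target


def max_min_fair_share(capacity: int, demands: dict[str, int]) -> dict[str, int]:
    """Local step (assumed rule): split cores among tenants by max-min fairness.

    Tenants that want less than an equal share keep only what they need;
    the leftover is redistributed to tenants that still want more.
    """
    allocation = {tenant: 0 for tenant in demands}
    remaining = capacity
    unsatisfied = {t for t, d in demands.items() if d > 0}
    while remaining > 0 and unsatisfied:
        share = max(remaining // len(unsatisfied), 1)
        for tenant in sorted(unsatisfied):
            grant = min(share, demands[tenant] - allocation[tenant], remaining)
            allocation[tenant] += grant
            remaining -= grant
        unsatisfied = {t for t in unsatisfied if allocation[t] < demands[t]}
    return allocation


# Two hypothetical datacenters with 10 cores each, three hypothetical tenants.
datacenters = [Datacenter("A", 10), Datacenter("B", 10)]
for tenant, cores in [("acme", 8), ("beta", 8), ("gamma", 6)]:
    submit(datacenters, tenant, cores)
for dc in datacenters:
    print(dc.name, max_min_fair_share(dc.capacity, dc.demands))
```

In this sketch a tenant alone in a datacenter can use whatever it asks for up to capacity, which is one way to read the "Burst Cores" idea, while tenants that collide in the same datacenter are cut back to equal shares.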
