Diese Präsentation wurde erfolgreich gemeldet.
Wir verwenden Ihre LinkedIn Profilangaben und Informationen zu Ihren Aktivitäten, um Anzeigen zu personalisieren und Ihnen relevantere Inhalte anzuzeigen. Sie können Ihre Anzeigeneinstellungen jederzeit ändern.

Big data and Hadoop session - KnowBigData (techgig)

387 Aufrufe

Veröffentlicht am

This is the first introductory session on Big Data & Hadoop.
It clears a lot of myths and confusion about Big Data. How exactly Big Data handling is a challenge and how does Hadoop addresses those problems.

In this session we talks about Big Data and related challenges and how does Hadoop solves those challenges.

By the end of this presentation you should be fairly clear about Hadoop and Big Data.

To watch the video or know more about the course, please visit http://www.knowbigdata.com/page/big-data-and-hadoop-online-instructor-led-training

Veröffentlicht in: Bildung
  • Gehören Sie zu den Ersten, denen das gefällt!

Big data and Hadoop session - KnowBigData (techgig)

  1. 1. www.KnowBigData.comHadoop What, Why & How to Get Started with Big Data & Hadoop
  2. 2. www.KnowBigData.comHadoop ABOUT INSTRUCTOR? 2014 KnowBigData Founded 2014 Amazon Built High Throughput Systems for Amazon.com site similar to storm. 2012 2012 InMobi Built Recommender that churns 200 TB 2011 tBits Global Founded tBits Global Built an enterprise grade Document Management System 2006 D.E.Shaw Built the big data systems before the term was coined 2002 2002 IIT Roorkee Finished B.Tech.
  3. 3. www.KnowBigData.comHadoop ❏ What/why of Big Data? ❏ Why Now? ❏ Examples Customers ❏ What is Hadoop? TODAY’S CLASS ❏ Components of Hadoop ❏ Further Reading/Assignment
  4. 4. www.KnowBigData.comHadoop WHAT IS BIG DATA? • Simply: Data of Very Big Size • Can’t process with usual tools • Distributed Architecture Needed
  5. 5. www.KnowBigData.comHadoop 1.Groups of networked computers 2.Interact with each other 3.To achieve a common goal. DISTRIBUTED COMPUTING
  6. 6. www.KnowBigData.comHadoop CHARACTERSTICS OF BIG DATA Problems Involving the handling of data coming at fast rate. e.g. Number of requests being received by Facebook, Youtube streaming, Google Analytics Problems involving complex data structures e.g. Maps, Social Graphs, Recommendations VOLUME VELOCITY VARIETY Data At Rest Data In Motion Data in Many Forms Problems related to storage of huge data reliably. e.g. Storage of Logs of a website, Storage of data by gmail. FB: 300 PB. 600TB/ day
  7. 7. www.KnowBigData.comHadoop WHY IS IT IMPORTANT NOW? Smart Phones 4.6 billion mobile-phones. 1 - 2 billion people accessing the internet. Facebook:1.06 bn monthly active users, 30 billion pieces shared monthly. ~175 million tweets every day Connectivity: Social Networks The connectivity improved. The devices became cheaper, faster and smaller. Connectivity: Internet Of Things
  8. 8. www.KnowBigData.comHadoop EXAMPLE BIG DATA CUSTOMERS Web and e-commerce 1.Recommendation Engines 2.Search Quality 3.Sentiment Analyses 4.A/B testing Telecommunications 1.Customer Churn Prevention 2.Network Performance Optimization 3.Calling Data Record (CDR) Analysis 4.Analyzing Network to Predict Failure
  9. 9. www.KnowBigData.comHadoop EXAMPLE BIG DATA CUSTOMERS Government 1.Fraud Detection 2.Cyber Security Welfare 3.Justice Healthcare & Life Sciences 1.Health information exchange 2.Gene sequencing 3.Healthcare improvements 4.Drug Safety
  10. 10. www.KnowBigData.comHadoop EXAMPLE BIG DATA PROBLEMS Recommendations
  11. 11. www.KnowBigData.comHadoop EXAMPLE BIG DATA PROBLEMS Recommendations
  12. 12. www.KnowBigData.comHadoop EXAMPLE BIG DATA PROBLEMS Sentiment Analysis
  13. 13. www.KnowBigData.comHadoop SALARY TRENDS Source:Indeed.com
  14. 14. www.KnowBigData.comHadoop BIG DATA SOLUTIONS 1.Apache Hadoop ○ Apache Spark 2.Cassandra 3.MongoDB 4.Google Compute Engine
  15. 15. www.KnowBigData.comHadoop WHAT IS HADOOP? A. Created by Doug Cutting (of Yahoo) and Mike Cafarella B. Based on GFS, GMR & Google Big Table C. Built for Nutch search engine project D. Named after Toy Elephant E. Open Source - Apache F. Power, Popular & Supported G. Framework to handle Big Data H. For reliable, scalable, distributed computing I. Written in Java
  16. 16. www.KnowBigData.comHadoop Workflow SQL Inteface New Language Machine learning / STATS NoSQL Datastore Compute Engine Main Component COMPONENTS
  17. 17. www.KnowBigData.comHadoop ABOUT KNOWBIGDATA ❏ Expert Instructors ❏ CloudxLab ❏ Lifetime access to LMS ❏ Presentations ❏ Class Recording ❏ Assignments + Quizzes ❏ Project Work ❏ Real Life Project ❏ Course Completion Certificate ❏ 24x7 support ❏ KnowBigData - Alumni ❏ Jobs ❏ Stay Abreast (Updated Content, Complimentary Sessions) ❏ Stay Connected
  18. 18. www.KnowBigData.comHadoop WHAT IS CLOUDxLABSTM ? 1. For Real Life Experience 2. An online cluster of servers 3. With all required tools installed 4. Accessible globally 5. Do not require high end configuration
  19. 19. www.KnowBigData.comHadoop www.KnowBigData.com 1.Starting on... ● 12 Dec - 7am Big Data & Hadoop ● 12 Dec - 8:30pm Big Data & Spark 2.Sat-Sun - 3 hours 3.33 hrs - 3 hr x 11 classes 4.₹19999 (25% off) (Incl. Taxes) - $369 5.Includes CloudxLabs + Support + LMS 6.Every class is also recorded. reachus@KnowBigData.com +1 419 665 3276 (US) +91 803 959 1464 (IN) Upcoming Courses
  20. 20. www.KnowBigData.comHadoop Thank you. reachus@knowbigdata.com +1 419 665 3276 (US) +91 803 959 1464 (IN) Subscribe to our Youtube channel for latest videos - https://www. youtube.com/channel/UCxugRFe5wETYA7nMH6VGyEA