
Hadoop for beginners free course ppt

This is a PowerPoint presentation on Hadoop and Big Data. It covers the essential knowledge you should have when stepping into the world of Big Data.

This course is available on hadoop-skills.com for free!

This course builds a basic understanding of Big Data problems and of Hadoop as a solution. It takes you through:
• Big Data problems, with easy-to-understand examples and illustrations.
• The history and advent of Hadoop, going back to when it had not yet been named Hadoop and was still part of Nutch.
• The Hadoop "magic" that makes it so unique and powerful.
• The difference between data science and data engineering, a common source of confusion when choosing a career or understanding a job role.
• Most importantly, demystifying Hadoop vendors such as Cloudera, MapR and Hortonworks.



  1. Facebook, Twitter and Google generate petabytes of data every day. The Hadron Collider project discards large amounts of data because it cannot all be analysed, hoping that nothing valuable has been thrown away. Interesting facts, but why is Big Data important? Let's understand via an example.
  2. Bank; Optimal Price?; Maximise Profit; Insurance; 3rd Party Survey; Expert Debates; Optimal Price
  3. Bank; Optimal Price?; Maximise Profit; Insurance; Optimal Price; Data Warehousing; Repository; Web Activity; Transaction; Competitors Pricing; Market Trends; Statistics; Data Warehouse; Run Statistical Algorithms; Decision Support System
  4. Volume, Velocity, Variety
  5. Bank; Optimal Price?; Maximise Profit; Insurance; Optimal Price; Data Warehousing; Repository; Web Activity; Transaction; Competitors Pricing; Market Trends; Statistics; Data Warehouse; Run Statistical Algorithms; Decision Support System
  6. Decision Support System; Digital Nervous System; Data Fundamental block to; Data Fundamental Block to; Business @ speed of thought; Sense, Interpret, Decide, Act; Organisations behaving like a biological nervous system; Avatar; Skynet
  7. Bank; Repository; Web Activity; Transaction; Competitors Pricing; Market Trends; Statistics; Optimal Price; Mobile Alert with Travel Insurance
  8. International Data Corporation's (IDC) 6th annual study: From 2005 to 2020, the digital universe will grow by a factor of 300, from 130 exabytes to 40,000 exabytes, or 40 trillion gigabytes. More than 5,200 gigabytes for every man, woman, and child in 2020. From now until 2020, the digital universe will about double every two years. 33% of the digital data might be valuable if analysed, compared with 25% today. From Gartner: 4.4 million IT jobs globally to support Big Data by 2015.
  9. Timeline: 1996-2000; 2003-04; 2005-06; 2010; 2013. Events: Big Data problem faced by all search engines and Mike Dreadnaught; Google File System and MapReduce papers; Hadoop spins off from Nutch; 0.xx releases of Hadoop; Doug joins Cloudera; YARN / MapReduce 2 / Next Generation Hadoop
  10. Price Advantage: 1. Clusters use commodity hardware, cheaper than one expensive server. 2. Software license is free.
  11. HDFS; MapReduce; Google File System; Google MapReduce; file1; Name node; Data nodes; map, map, map, map, map; Reduce; User
  12. HDFS; MapReduce; HBase; Pig; Hive; Sqoop/Flume; Log collection; Yahoo; Facebook; Storm; Chukwa; Kafka; Structured Stores; Message broker; Oozie
  13. Complex algorithm on a small dataset; simple algorithm on a large dataset. 1. Complex algorithms need to be correctly sensitive to weak correlations. 2. Complex algorithms are thus difficult to code and design.
  14. Data Engineer. Role: to engineer software solutions. Skills: mostly programming and technical skills, and the ability to architect technical solutions. Data Scientist. Role: to solve business problems using data. Skills: strong mathematical skills and an understanding of statistical models.
  15. -> Skeleton version. -> All the ecosystems need to be additionally installed. -> Important ecosystem members included. -> Few proprietary tools like Enterprise Manager. -> Proprietary Hadoop code written in C. -> Integrated with Hadoop ecosystem members. -> Based on Apache Hadoop. -> Supports .NET framework. -> Launches Hadoop distribution: Pivotal HD.
  16. Thank You!!!
  17. Superstar Doug!!! A small fan: me. And the real Hadoop
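
Slide 11 shows several "map" steps feeding one "Reduce" step. The idea can be sketched as a word count in plain Python. This is only a single-process illustration, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample blocks are made up for this example, and each string stands in for a file block that HDFS would keep on a separate data node.

```python
from collections import defaultdict
from itertools import chain


def map_phase(block):
    """Map step: emit a (word, 1) pair for every word in one block of text."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(blocks):
    """Run map on every block, group by word, then reduce each group."""
    mapped = chain.from_iterable(map_phase(block) for block in blocks)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: three blocks of one file (assumed data, for illustration).
blocks = ["big data big cluster", "data node name node", "big data"]
counts = word_count(blocks)
print(counts)
```

Because each call to `map_phase` looks at only one block, the map steps could run on different machines in parallel; only the grouped results need to be brought together for the reduce step.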
