Next generation technologies
         (The best way to jump into a parade is to jump in front of one that is already going)




We are going to talk about the framework that backs the technological infrastructure of some of the
biggest players in the internet world. Some of them appear in the following image:

[Image: some of the biggest players in the internet world]

These are just some of the biggest names; there are many more on this list. Here we are talking about
next-generation computing technology that offers scalability, fault tolerance, and many more features.
The term “cloud” will not be new to you, but here I am going to talk about a technology that serves as
a backbone of cloud and distributed computing. You may be wondering which technology that is. The
technology we are going to discuss is called “Hadoop”. The best thing about it is that it is open source
and readily available, so you can contribute to it, experiment with it, and use it.
As the Apache website says: “The Apache™ Hadoop™ project develops open-source software for reliable,
scalable, distributed computing.
The Apache Hadoop software library is a framework that allows for the distributed processing of
large data sets across clusters of computers using a simple programming model. It is designed to
scale up from single servers to thousands of machines, each offering local computation and storage.
Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and
handle failures at the application layer, so delivering a highly-available service on top of a cluster of
computers, each of which may be prone to failures.”

Let’s look at some of its best features first:

    •   High scalability.
    •   High availability.
    •   High performance.
    •   Handles multi-dimensional data storage.
    •   Handles distributed storage.

Let’s first look at some scenarios from the internet world:
How much data do you think an internet player needs to process? Do you know how much data Twitter
processes daily? It is about 7 TB per day. How long would a general-purpose computer take to process
this much data?

About 4 hours, and that is just for reading the data, not processing it. Imagine processing all of
Twitter's data on such a machine: would it not take years?
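
As a rough check on the reading time quoted above, the following sketch computes how long one machine needs just to read a day's worth of data. The sustained read rate of 500 MB/s is an assumption chosen for illustration (it is not a figure from the source), and decimal units (1 TB = 10^12 bytes) are assumed.

```python
# Back-of-the-envelope read time for a single machine.
# ASSUMPTIONS: decimal units and a hypothetical sustained read rate.
TB = 10**12  # bytes in a terabyte (decimal)
MB = 10**6   # bytes in a megabyte (decimal)


def read_time_hours(data_bytes: float, bytes_per_second: float) -> float:
    """Return the hours needed to read data_bytes sequentially at the given rate."""
    return data_bytes / bytes_per_second / 3600


daily_data = 7 * TB       # daily volume quoted in the text
assumed_rate = 500 * MB   # hypothetical read rate, not from the source
hours = read_time_hours(daily_data, assumed_rate)
print(f"Reading {daily_data / TB:.0f} TB at {assumed_rate / MB:.0f} MB/s "
      f"takes about {hours:.1f} hours")
```

At a slower assumed rate, for example 100 MB/s, the same read takes roughly five times as long, which is why a single machine quickly becomes the bottleneck.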




So here Hadoop comes into play: in benchmark runs it has sorted a petabyte of data in 16.25 hours and
a terabyte in 62 seconds. Isn't that a good choice? It surely is. Likewise, think about the amount of
data Facebook, Google, and Amazon need to process daily.
The best thing about a Hadoop setup is that you don’t need special, costly, high-end servers; you can
build a Hadoop cluster out of commodity computers. Keep adding computers and you keep increasing
storage and processing power.
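
To put the sort figures above into perspective, this sketch converts them into aggregate throughput and shows how ideal scale-out would shorten a full read of 7 TB. Perfect linear scaling, decimal units, the per-node rate of 500 MB/s, and the node counts are all illustrative assumptions; real clusters lose some throughput to coordination, network, and replication overhead.

```python
# Aggregate throughput implied by the quoted sort times, plus an
# idealized scale-out estimate. All cluster parameters are hypothetical.
PB = 10**15
TB = 10**12
GB = 10**9
MB = 10**6


def throughput_gb_per_s(data_bytes: float, seconds: float) -> float:
    """Average throughput in GB/s for moving data_bytes in the given time."""
    return data_bytes / seconds / GB


def ideal_read_minutes(data_bytes: float, nodes: int, per_node_rate: float) -> float:
    """Minutes to read data_bytes if `nodes` machines read in parallel with no overhead."""
    return data_bytes / (nodes * per_node_rate) / 60


petabyte_sort = throughput_gb_per_s(1 * PB, 16.25 * 3600)  # 1 PB in 16.25 hours
terabyte_sort = throughput_gb_per_s(1 * TB, 62)            # 1 TB in 62 seconds
print(f"Petabyte sort: about {petabyte_sort:.1f} GB/s across the cluster")
print(f"Terabyte sort: about {terabyte_sort:.1f} GB/s across the cluster")

for nodes in (1, 10, 100):  # hypothetical cluster sizes
    minutes = ideal_read_minutes(7 * TB, nodes, 500 * MB)
    print(f"{nodes:>3} node(s): about {minutes:.1f} minutes to read 7 TB")
```

Both sort results work out to well over ten gigabytes per second in aggregate, a rate no single commodity disk can deliver, which is the practical argument for adding machines rather than buying a bigger one.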




So, ultimately, here are some points for “Why Hadoop?”

    •   Need to process multi-petabyte datasets.
    •   It is expensive to build reliability into each application.
    •   Nodes fail every day.
        – Failure is expected, rather than exceptional.
        – The number of nodes in a cluster is not constant.
    •   Need a common infrastructure.
        – Efficient, reliable, open source (Apache License).
    •   The above goals are the same as Condor's, but
        – Workloads are I/O bound, not CPU bound.
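
The point that failure is expected rather than exceptional follows from simple probability. The sketch below assumes that nodes fail independently and uses a hypothetical per-node daily failure probability of 0.1%; neither assumption comes from the source.

```python
# Probability that at least one node in a cluster fails on a given day.
# ASSUMPTIONS: independent failures and a hypothetical per-node probability.
P_NODE_PER_DAY = 0.001  # hypothetical: 0.1% chance that a given node fails in a day


def prob_any_failure(nodes: int, p_node: float) -> float:
    """Return the probability that at least one of `nodes` independent nodes fails."""
    return 1 - (1 - p_node) ** nodes


for nodes in (10, 100, 1000, 4000):  # hypothetical cluster sizes
    chance = prob_any_failure(nodes, P_NODE_PER_DAY)
    print(f"{nodes:>5} nodes: {chance:.1%} chance of at least one failure per day")
```

Even with very reliable individual machines, a cluster of a few thousand nodes should expect a failure almost every day under these assumptions, so the framework handles failures itself rather than leaving that to each application.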

Hadoop basically builds on the following components:

    1. Hadoop Common (Base)
       Hadoop Common is a set of utilities that support the Hadoop subprojects. Hadoop
       Common includes FileSystem, RPC, and serialization libraries.

    2. HDFS (Hadoop Distributed File System) (File System)
       Hadoop Distributed File System (HDFS™) is the primary storage system used by
       Hadoop applications. HDFS creates multiple replicas of data blocks and distributes them on
       compute nodes throughout a cluster to enable reliable, extremely rapid computations.

    3. MapReduce (Code)
       Hadoop MapReduce is a programming model and software framework for writing
       applications that rapidly process vast amounts of data in parallel on large clusters of
       compute nodes.
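
The map and reduce idea can be illustrated with a classic word count. The following is a minimal single-process sketch in plain Python, not Hadoop's actual API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are hypothetical, and a real cluster would run the map and reduce steps on many nodes in parallel.

```python
# A single-process imitation of the map -> shuffle -> reduce flow.
# This is NOT the Hadoop API; all names here are illustrative.
from collections import defaultdict


def map_phase(block: str) -> list[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one block of text."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(pairs: list[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by their key."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(blocks: list[str]) -> dict[str, int]:
    """Run map over every block, shuffle the pairs, then reduce each group."""
    pairs = [pair for block in blocks for pair in map_phase(block)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Each string stands in for one data block stored on a different node.
blocks = ["Hadoop stores data in blocks", "Hadoop processes blocks in parallel"]
counts = word_count(blocks)
print(counts)
```

Because each call to `map_phase` depends only on its own block, the map step can run wherever the block is stored, and only the grouped intermediate pairs need to move between machines.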

So what is it used for?

    1. Internet-scale data:
           a. Web logs: years of logs, terabytes per day.
           b. Web search: all the web pages present on this earth.
           c. Social data: all the data, messages, images, tweets, scraps, and wall posts generated on
               Facebook, Twitter, and other social media.

    2. Cutting-edge analytics:
           a. Machine learning, data mining.

    3. Enterprise applications:
           a. Network instrumentation, mobile logs.
           b. Video and audio processing.
           c. Text mining.

    4. And lots more.

Let's see the timeline:
[Image: Hadoop timeline]
References:
http://hadoop.apache.org, http://developer.yahoo.com/hadoop/
These are the best places to find information about Hadoop. On these websites you'll find many wiki
pages and further links from which you can learn how to get started with Hadoop, along with answers
to common how-to questions.
Visit these sites, explore them, and experiment with the next-generation technology that is going to
be the backbone of the internet.

In upcoming articles, we'll talk about some other technologies related to Hadoop, such as HBase, Hive,
Avro, Cassandra, Chukwa, Mahout, Pig, and ZooKeeper.




                                                                                                  ∞
                                                                                 Shashwat Shriparv
