SlideShare ist ein Scribd-Unternehmen logo
1 von 43
What we know or see




What’s actually there
• Wikipedia : In information technology, big data is a collection
  of data sets so large and complex that it becomes difficult to
  process using on-hand database management tools. The
  challenges include capture, curation, storage, search, sharing,
  analysis, and visualization.
• IDC : Big data technologies describe a new generation of
  technologies and architectures, designed to economically
  extract value from very large volumes of a wide variety of
  data, by enabling high velocity capture, discovery, and/or
  analysis.
• IBM says that ”three characteristics define big data:”
   – Volume (Terabytes -> Zettabytes)
   – Variety (Structured -> Semi-structured -> Unstructured)
   – Velocity (Batch -> Streaming Data)
What Big data should achieve:

Volume: Turn 12 terabytes of Tweets created each day into improved product
sentiment analysis
Convert 350 billion annual meter readings to better predict power consumption

Velocity: Sometimes 2 minutes is too late.
Scrutinize 5 million trade events created each day to identify potential fraud
Analyze 500 million daily call detail records in real-time to predict customer churn
faster

Variety: Big data is any type of data - structured and unstructured data such as
text, sensor data, audio, video, click streams, log files and more.
Monitor 100’s of live video feeds from surveillance cameras to target points of
interest
Exploit the 80% data growth in images, video and documents to improve
customer satisfaction

Veracity: Establishing trust in big data presents a huge challenge as the variety
and number of sources grows.
Big data is more than simply a matter of size; it is an opportunity to find
insights in new and emerging types of data and content, to make your
business more agile, and to answer questions that were previously
considered beyond your reach.

Here are three key technologies that can help you get a handle on big data –
and even more importantly, extract meaningful business value from it .
   • Information management for big data. Manage data as a strategic, core
   asset, with ongoing process control for big data analytics .
   •High-performance analytics for big data. Gain rapid insights from big
   data and the ability to solve increasingly complex problems using more
   data .
   •Flexible deployment options for big data. Choose between options for
   on premises or hosted, software-as-a-service (SaaS) approaches for big
   data and big data analytics
• Massively Parallel Processing (MPP) – This involves a coordinated
  processing of a program by multiple processors (200 or more in number).
  Each of the processors makes use of its own operating system and
  memory and works on different parts of the program. Each part
  communicates via messaging interface. An MPP system is also known as
  “loosely coupled” or “shared nothing” system.

• Distributed file system or network file system allows client nodes to
  access files through a computer network. This way a number of users
  working on multiple machines will be able to share files and storage
  resources. The client nodes will not be able to access the block storage
  but can interact through a network protocol. This enables a restricted
  access to the file system depending on the access lists or capabilities on
  both servers and clients which is again dependent on the protocol.
• Apache Hadoop is key technology used to handle big data, its analytics
  and stream computing. Apache Hadoop is an open source software
  project that enables the distributed processing of large data sets across
  clusters of commodity servers. It can be scaled up from a single server to
  thousands of machines and with a very high degree of fault tolerance.
  Instead of relying on high-end hardware, the resiliency of these clusters
  comes from the software’s ability to detect and handle failures at the
  application layer.

• Data Intensive Computing is a class of parallel computing application
  which uses a data parallel approach to process big data. This works based
  on the principle of collocation of data and programs or algorithms used to
  perform computation. Parallel and distributed system of inter-connected
  stand alone computers that work together as a single integrated
  computing resource is used to process / analyze big data.
• HDFS:
- Large files are split into parts
- Move file parts into a cluster
- Fault-tolerant through replication across nodes while being rack-aware
- Bookkeeping via NameNode
• MapReduce
- Move algorithms close to the data by structuring them for parallel
   execution so that each task works on a part of the data. The power of
   Simplicity!
• noSQL:
- Old technology – before RDBMS
- Sequential files
- Hierarchical DB
- Network DB
- Graph DB (Neo4j, AllegroGraph)
- Document Stores (Lotus Domino, MongoDB, JackRabbit)
- Memory Caches
• Common Tools:
- Infrastructure Management:
    - Chef & Puppet from DevOps movement
- Grid Monitoring tools:
    - Operational insight with Nagios, Ganglia, Hyperic and ZenOSS
- Development support is a need of the hour

• Operational Tools:
   – Chef & Puppet from DevOps movement + Ganglia
• Analytics Platforms (DW Solutions, BI solutions and analytics)
   – Netezza, GreenPlum, Vertica
• Universal Data (All data storage needs at enterprise level)
   – Lily & Spire
Source links and some more useful
information…
•   http://www.sas.com/resources/whitepaper/wp_46345.pdf
•   http://wikibon.org/blog/big-data-statistics/
•   http://wikibon.org/blog/big-data-infographics/
•   http://www.marketingtechblog.com/ibm-big-data-marketing/
•   http://blogs.gartner.com/doug-laney/
•   http://www-01.ibm.com/software/data/bigdata/
•   http://en.wikipedia.org/wiki/Big_data
•   http://mike2.openmethodology.org/wiki/Big_Data_Definition
•   http://pinterest.com/rtkrum/cool-infographics-gallery
•   http://www.go-gulf.com/blog/category/blog/seo-sem/
•   http://www.forbes.com/sites/davefeinleib/2012/07/24/big-data-trends/
•   http://hortonworks.com/wp-content/uploads/2012/05/bigdata_diagram.png
•   http://visual.ly/big-data
•   http://csrc.nist.gov/groups/SMA/forum/documents/june2012presentations/fcsm_june20
    12_cooper_mell.pdf
•   http://www.ramcoblog.com/big-data-technologies-a-quick-way-to-get-through-more
Big Data

Weitere ähnliche Inhalte

Was ist angesagt?

Big data analytics, research report
Big data analytics, research reportBig data analytics, research report
Big data analytics, research reportJULIO GONZALEZ SANZ
 
Data mining with big data
Data mining with big dataData mining with big data
Data mining with big datakk1718
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataAmpoolIO
 
big data overview ppt
big data overview pptbig data overview ppt
big data overview pptVIKAS KATARE
 
Presentation About Big Data (DBMS)
Presentation About Big Data (DBMS)Presentation About Big Data (DBMS)
Presentation About Big Data (DBMS)SiamAhmed16
 
Lesson 1 introduction to_big_data_and_hadoop.pptx
Lesson 1 introduction to_big_data_and_hadoop.pptxLesson 1 introduction to_big_data_and_hadoop.pptx
Lesson 1 introduction to_big_data_and_hadoop.pptxPankajkumar496281
 
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...BigMine
 
Analysis of big data in pandemic case
Analysis of big data in pandemic case Analysis of big data in pandemic case
Analysis of big data in pandemic case Muh Saleh
 
Big data introduction
Big data introductionBig data introduction
Big data introductionChirag Ahuja
 
Big Data - Applications and Technologies Overview
Big Data - Applications and Technologies OverviewBig Data - Applications and Technologies Overview
Big Data - Applications and Technologies OverviewSivashankar Ganapathy
 
Bigdata Analytics using Hadoop
Bigdata Analytics using HadoopBigdata Analytics using Hadoop
Bigdata Analytics using HadoopNagamani Gurram
 
Big data? No. Big Decisions are What You Want
Big data? No. Big Decisions are What You WantBig data? No. Big Decisions are What You Want
Big data? No. Big Decisions are What You WantStuart Miniman
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real TimeAlbert Bifet
 
JPJ1417 Data Mining With Big Data
JPJ1417   Data Mining With Big DataJPJ1417   Data Mining With Big Data
JPJ1417 Data Mining With Big Datachennaijp
 

Was ist angesagt? (20)

Big data
Big dataBig data
Big data
 
Big data analytics, research report
Big data analytics, research reportBig data analytics, research report
Big data analytics, research report
 
Big Data Hadoop
Big Data HadoopBig Data Hadoop
Big Data Hadoop
 
Introduction to BigData
Introduction to BigData Introduction to BigData
Introduction to BigData
 
Data mining with big data
Data mining with big dataData mining with big data
Data mining with big data
 
Thilga
ThilgaThilga
Thilga
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
big data overview ppt
big data overview pptbig data overview ppt
big data overview ppt
 
Exploring Big Data Analytics Tools
Exploring Big Data Analytics ToolsExploring Big Data Analytics Tools
Exploring Big Data Analytics Tools
 
Presentation About Big Data (DBMS)
Presentation About Big Data (DBMS)Presentation About Big Data (DBMS)
Presentation About Big Data (DBMS)
 
Lesson 1 introduction to_big_data_and_hadoop.pptx
Lesson 1 introduction to_big_data_and_hadoop.pptxLesson 1 introduction to_big_data_and_hadoop.pptx
Lesson 1 introduction to_big_data_and_hadoop.pptx
 
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
 
Big Data ppt
Big Data pptBig Data ppt
Big Data ppt
 
Analysis of big data in pandemic case
Analysis of big data in pandemic case Analysis of big data in pandemic case
Analysis of big data in pandemic case
 
Big data introduction
Big data introductionBig data introduction
Big data introduction
 
Big Data - Applications and Technologies Overview
Big Data - Applications and Technologies OverviewBig Data - Applications and Technologies Overview
Big Data - Applications and Technologies Overview
 
Bigdata Analytics using Hadoop
Bigdata Analytics using HadoopBigdata Analytics using Hadoop
Bigdata Analytics using Hadoop
 
Big data? No. Big Decisions are What You Want
Big data? No. Big Decisions are What You WantBig data? No. Big Decisions are What You Want
Big data? No. Big Decisions are What You Want
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
 
JPJ1417 Data Mining With Big Data
JPJ1417   Data Mining With Big DataJPJ1417   Data Mining With Big Data
JPJ1417 Data Mining With Big Data
 

Andere mochten auch

Big Data
Big DataBig Data
Big DataNGDATA
 
Big Data Analytics 2014
Big Data Analytics 2014Big Data Analytics 2014
Big Data Analytics 2014Stratebi
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with HadoopPhilippe Julio
 
Big data diapositivas
Big data diapositivasBig data diapositivas
Big data diapositivassgcuadrado
 
Introduction To Big Data Analytics On Hadoop - SpringPeople
Introduction To Big Data Analytics On Hadoop - SpringPeopleIntroduction To Big Data Analytics On Hadoop - SpringPeople
Introduction To Big Data Analytics On Hadoop - SpringPeopleSpringPeople
 
A Brief History of Big Data
A Brief History of Big DataA Brief History of Big Data
A Brief History of Big DataBernard Marr
 
IT trends – 2013 & beyond
IT trends – 2013 & beyondIT trends – 2013 & beyond
IT trends – 2013 & beyondNeha Mehta
 
Big Data, Big Opportunity: A Primer for Understanding The Big Data Frontier
Big Data, Big Opportunity: A Primer for Understanding The Big Data FrontierBig Data, Big Opportunity: A Primer for Understanding The Big Data Frontier
Big Data, Big Opportunity: A Primer for Understanding The Big Data FrontierCA Technologies
 
Systems of Intelligence - Wikibon/theCUBE
Systems of Intelligence - Wikibon/theCUBESystems of Intelligence - Wikibon/theCUBE
Systems of Intelligence - Wikibon/theCUBEGeorge Gilbert
 
Cuadro comparativo
Cuadro comparativoCuadro comparativo
Cuadro comparativoroxydereyes
 
Data Management In Cellular Networks Using Activity Mining
Data Management In Cellular Networks Using Activity MiningData Management In Cellular Networks Using Activity Mining
Data Management In Cellular Networks Using Activity MiningIDES Editor
 
Cellint transportation
Cellint transportation Cellint transportation
Cellint transportation Robert Pauley
 

Andere mochten auch (20)

What is Big Data?
What is Big Data?What is Big Data?
What is Big Data?
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Big data ppt
Big  data pptBig  data ppt
Big data ppt
 
Big Data
Big DataBig Data
Big Data
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
Big Data Analytics 2014
Big Data Analytics 2014Big Data Analytics 2014
Big Data Analytics 2014
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
 
What is big data?
What is big data?What is big data?
What is big data?
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Big data diapositivas
Big data diapositivasBig data diapositivas
Big data diapositivas
 
Palestra Introdução a Big Data
Palestra Introdução a Big DataPalestra Introdução a Big Data
Palestra Introdução a Big Data
 
Introduction To Big Data Analytics On Hadoop - SpringPeople
Introduction To Big Data Analytics On Hadoop - SpringPeopleIntroduction To Big Data Analytics On Hadoop - SpringPeople
Introduction To Big Data Analytics On Hadoop - SpringPeople
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
A Brief History of Big Data
A Brief History of Big DataA Brief History of Big Data
A Brief History of Big Data
 
IT trends – 2013 & beyond
IT trends – 2013 & beyondIT trends – 2013 & beyond
IT trends – 2013 & beyond
 
Big Data, Big Opportunity: A Primer for Understanding The Big Data Frontier
Big Data, Big Opportunity: A Primer for Understanding The Big Data FrontierBig Data, Big Opportunity: A Primer for Understanding The Big Data Frontier
Big Data, Big Opportunity: A Primer for Understanding The Big Data Frontier
 
Systems of Intelligence - Wikibon/theCUBE
Systems of Intelligence - Wikibon/theCUBESystems of Intelligence - Wikibon/theCUBE
Systems of Intelligence - Wikibon/theCUBE
 
Cuadro comparativo
Cuadro comparativoCuadro comparativo
Cuadro comparativo
 
Data Management In Cellular Networks Using Activity Mining
Data Management In Cellular Networks Using Activity MiningData Management In Cellular Networks Using Activity Mining
Data Management In Cellular Networks Using Activity Mining
 
Cellint transportation
Cellint transportation Cellint transportation
Cellint transportation
 

Ähnlich wie Big Data

Fundamentals of big data analytics and Hadoop
Fundamentals of big data analytics and HadoopFundamentals of big data analytics and Hadoop
Fundamentals of big data analytics and HadoopArchana Gopinath
 
Foxvalley bigdata
Foxvalley bigdataFoxvalley bigdata
Foxvalley bigdataTom Rogers
 
big data and hadoop
 big data and hadoop big data and hadoop
big data and hadoopahmed alshikh
 
Lecture 5 - Big Data and Hadoop Intro.ppt
Lecture 5 - Big Data and Hadoop Intro.pptLecture 5 - Big Data and Hadoop Intro.ppt
Lecture 5 - Big Data and Hadoop Intro.pptalmaraniabwmalk
 
Big Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKBig Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKRajesh Jayarman
 
VTU 6th Sem Elective CSE - Module 4 cloud computing
VTU 6th Sem Elective CSE - Module 4  cloud computingVTU 6th Sem Elective CSE - Module 4  cloud computing
VTU 6th Sem Elective CSE - Module 4 cloud computingSachin Gowda
 
module4-cloudcomputing-180131071200.pdf
module4-cloudcomputing-180131071200.pdfmodule4-cloudcomputing-180131071200.pdf
module4-cloudcomputing-180131071200.pdfSumanthReddy540432
 
Lecture 3.31 3.32.pptx
Lecture 3.31  3.32.pptxLecture 3.31  3.32.pptx
Lecture 3.31 3.32.pptxRATISHKUMAR32
 
Lecture4 big data technology foundations
Lecture4 big data technology foundationsLecture4 big data technology foundations
Lecture4 big data technology foundationshktripathy
 
Introduction to BIG DATA
Introduction to BIG DATA Introduction to BIG DATA
Introduction to BIG DATA Zeeshan Khan
 
Big data unit 2
Big data unit 2Big data unit 2
Big data unit 2RojaT4
 
Hadoop - Architectural road map for Hadoop Ecosystem
Hadoop -  Architectural road map for Hadoop EcosystemHadoop -  Architectural road map for Hadoop Ecosystem
Hadoop - Architectural road map for Hadoop Ecosystemnallagangus
 
Module-2_HADOOP.pptx
Module-2_HADOOP.pptxModule-2_HADOOP.pptx
Module-2_HADOOP.pptxShreyasKv13
 
BIg Data Analytics-Module-2 vtu engineering.pptx
BIg Data Analytics-Module-2 vtu engineering.pptxBIg Data Analytics-Module-2 vtu engineering.pptx
BIg Data Analytics-Module-2 vtu engineering.pptxVishalBH1
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataHaluan Irsad
 
Vikram Andem Big Data Strategy @ IATA Technology Roadmap
Vikram Andem Big Data Strategy @ IATA Technology Roadmap Vikram Andem Big Data Strategy @ IATA Technology Roadmap
Vikram Andem Big Data Strategy @ IATA Technology Roadmap IT Strategy Group
 

Ähnlich wie Big Data (20)

Fundamentals of big data analytics and Hadoop
Fundamentals of big data analytics and HadoopFundamentals of big data analytics and Hadoop
Fundamentals of big data analytics and Hadoop
 
Foxvalley bigdata
Foxvalley bigdataFoxvalley bigdata
Foxvalley bigdata
 
big data and hadoop
 big data and hadoop big data and hadoop
big data and hadoop
 
Lecture 5 - Big Data and Hadoop Intro.ppt
Lecture 5 - Big Data and Hadoop Intro.pptLecture 5 - Big Data and Hadoop Intro.ppt
Lecture 5 - Big Data and Hadoop Intro.ppt
 
Big Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKBig Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RK
 
VTU 6th Sem Elective CSE - Module 4 cloud computing
VTU 6th Sem Elective CSE - Module 4  cloud computingVTU 6th Sem Elective CSE - Module 4  cloud computing
VTU 6th Sem Elective CSE - Module 4 cloud computing
 
module4-cloudcomputing-180131071200.pdf
module4-cloudcomputing-180131071200.pdfmodule4-cloudcomputing-180131071200.pdf
module4-cloudcomputing-180131071200.pdf
 
Lecture 3.31 3.32.pptx
Lecture 3.31  3.32.pptxLecture 3.31  3.32.pptx
Lecture 3.31 3.32.pptx
 
Lecture4 big data technology foundations
Lecture4 big data technology foundationsLecture4 big data technology foundations
Lecture4 big data technology foundations
 
Big data Question bank.pdf
Big data Question bank.pdfBig data Question bank.pdf
Big data Question bank.pdf
 
Introduction to BIG DATA
Introduction to BIG DATA Introduction to BIG DATA
Introduction to BIG DATA
 
Big data unit 2
Big data unit 2Big data unit 2
Big data unit 2
 
Hadoop - Architectural road map for Hadoop Ecosystem
Hadoop -  Architectural road map for Hadoop EcosystemHadoop -  Architectural road map for Hadoop Ecosystem
Hadoop - Architectural road map for Hadoop Ecosystem
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
 
unit 1 big data.pptx
unit 1 big data.pptxunit 1 big data.pptx
unit 1 big data.pptx
 
Module-2_HADOOP.pptx
Module-2_HADOOP.pptxModule-2_HADOOP.pptx
Module-2_HADOOP.pptx
 
BIg Data Analytics-Module-2 vtu engineering.pptx
BIg Data Analytics-Module-2 vtu engineering.pptxBIg Data Analytics-Module-2 vtu engineering.pptx
BIg Data Analytics-Module-2 vtu engineering.pptx
 
Big Data
Big DataBig Data
Big Data
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Vikram Andem Big Data Strategy @ IATA Technology Roadmap
Vikram Andem Big Data Strategy @ IATA Technology Roadmap Vikram Andem Big Data Strategy @ IATA Technology Roadmap
Vikram Andem Big Data Strategy @ IATA Technology Roadmap
 

Big Data

  • 1.
  • 2. What we know or see What’s actually there
  • 3.
  • 4. • Wikipedia : In information technology, big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools. The challenges include capture, curation, storage, search, sharing, analysis, and visualization. • IDC : Big data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high velocity capture, discovery, and/or analysis. • IBM says that ”three characteristics define big data:” – Volume (Terabytes -> Zettabytes) – Variety (Structured -> Semi-structured -> Unstructured) – Velocity (Batch -> Streaming Data)
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28. What Big data should achieve: Volume: Turn 12 terabytes of Tweets created each day into improved product sentiment analysis Convert 350 billion annual meter readings to better predict power consumption Velocity: Sometimes 2 minutes is too late. Scrutinize 5 million trade events created each day to identify potential fraud Analyze 500 million daily call detail records in real-time to predict customer churn faster Variety: Big data is any type of data - structured and unstructured data such as text, sensor data, audio, video, click streams, log files and more. Monitor 100’s of live video feeds from surveillance cameras to target points of interest Exploit the 80% data growth in images, video and documents to improve customer satisfaction Veracity: Establishing trust in big data presents a huge challenge as the variety and number of sources grows.
  • 29. Big data is more than simply a matter of size; it is an opportunity to find insights in new and emerging types of data and content, to make your business more agile, and to answer questions that were previously considered beyond your reach. Here are three key technologies that can help you get a handle on big data – and even more importantly, extract meaningful business value from it . • Information management for big data. Manage data as a strategic, core asset, with ongoing process control for big data analytics . •High-performance analytics for big data. Gain rapid insights from big data and the ability to solve increasingly complex problems using more data . •Flexible deployment options for big data. Choose between options for on premises or hosted, software-as-a-service (SaaS) approaches for big data and big data analytics
  • 30.
  • 31. • Massively Parallel Processing (MPP) – This involves a coordinated processing of a program by multiple processors (200 or more in number). Each of the processors makes use of its own operating system and memory and works on different parts of the program. Each part communicates via messaging interface. An MPP system is also known as “loosely coupled” or “shared nothing” system. • Distributed file system or network file system allows client nodes to access files through a computer network. This way a number of users working on multiple machines will be able to share files and storage resources. The client nodes will not be able to access the block storage but can interact through a network protocol. This enables a restricted access to the file system depending on the access lists or capabilities on both servers and clients which is again dependent on the protocol.
  • 32. • Apache Hadoop is key technology used to handle big data, its analytics and stream computing. Apache Hadoop is an open source software project that enables the distributed processing of large data sets across clusters of commodity servers. It can be scaled up from a single server to thousands of machines and with a very high degree of fault tolerance. Instead of relying on high-end hardware, the resiliency of these clusters comes from the software’s ability to detect and handle failures at the application layer. • Data Intensive Computing is a class of parallel computing application which uses a data parallel approach to process big data. This works based on the principle of collocation of data and programs or algorithms used to perform computation. Parallel and distributed system of inter-connected stand alone computers that work together as a single integrated computing resource is used to process / analyze big data.
  • 33. • HDFS: - Large files are split into parts - Move file parts into a cluster - Fault-tolerant through replication across nodes while being rack-aware - Bookkeeping via NameNode • MapReduce - Move algorithms close to the data by structuring them for parallel execution so that each task works on a part of the data. The power of Simplicity! • noSQL: - Old technology – before RDBMS - Sequential files - Hierarchical DB - Network DB - Graph DB (Neo4j, AllegroGraph) - Document Stores (Lotus Domino, MongoDB, JackRabbit) - Memory Caches
  • 34. • Common Tools: - Infrastructure Management: - Chef & Puppet from DevOps movement - Grid Monitoring tools: - Operational insight with Nagios, Ganglia, Hyperic and ZenOSS - Development support is a need of the hour • Operational Tools: – Chef & Puppet from DevOps movement + Ganglia • Analytics Platforms (DW Solutions, BI solutions and analytics) – Netezza, GreenPlum, Vertica • Universal Data (All data storage needs at enterprise level) – Lily & Spire
  • 35.
  • 36.
  • 37.
  • 38.
  • 39.
  • 40.
  • 41.
  • 42. Source links and some more useful information… • http://www.sas.com/resources/whitepaper/wp_46345.pdf • http://wikibon.org/blog/big-data-statistics/ • http://wikibon.org/blog/big-data-infographics/ • http://www.marketingtechblog.com/ibm-big-data-marketing/ • http://blogs.gartner.com/doug-laney/ • http://www-01.ibm.com/software/data/bigdata/ • http://en.wikipedia.org/wiki/Big_data • http://mike2.openmethodology.org/wiki/Big_Data_Definition • http://pinterest.com/rtkrum/cool-infographics-gallery • http://www.go-gulf.com/blog/category/blog/seo-sem/ • http://www.forbes.com/sites/davefeinleib/2012/07/24/big-data-trends/ • http://hortonworks.com/wp-content/uploads/2012/05/bigdata_diagram.png • http://visual.ly/big-data • http://csrc.nist.gov/groups/SMA/forum/documents/june2012presentations/fcsm_june20 12_cooper_mell.pdf • http://www.ramcoblog.com/big-data-technologies-a-quick-way-to-get-through-more

Hinweis der Redaktion

  1. IDC = Definition critical with regards to introduction of Gartner’s principle of 3V’s of Big DataIBM = adds the benchmark of what the 3V’s should be talking about for Big Data
  2. Can markets still do without big data?
  3. Source links as well as additional information on the subject
  4. It’s December!