© Copyright 2013 EMC Corporation. All rights reserved.
Big Data – General Introduction
- Vignesh Gopalan, IIG
Agenda
Big Data – Definition
Importance of Big Data
Technologies used in Big Data Analysis
Big Data – A Definition
The ‘V’s of Big Data:
Volume
Variety
Velocity
Veracity
Why is Big Data Important?
Business Analytics
Big Science, such as the LHC and gene sequencing programs
Big Government
Big Data – Technologies Primer
MapReduce computation framework and Hadoop
Distributed File System
Distributed databases
NoSQL technologies
MapReduce
Published by Google
Scalable
Fault-Tolerant
Batch Computation in parallel
A distributed computation framework
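MapReduce … continued
Consists of two functions operating on key-value pairs.
Map – filters and transforms input records, emitting intermediate key-value pairs; the framework then sorts and groups them by key.
Reduce – performs a summary operation on the grouped Map step results.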
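The Map and Reduce roles described above can be sketched as a single-process word count. This is only an illustration of the programming model, not Hadoop's API: the names `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical, and the in-memory grouping step stands in for the distributed shuffle a real framework performs.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_fn(_offset: int, line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit an intermediate (word, 1) pair for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def reduce_fn(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reduce: summarize all values collected for one key."""
    return word, sum(counts)


def run_mapreduce(lines: List[str]) -> Dict[str, int]:
    """Hypothetical driver: map, group by key, then reduce, all in memory."""
    # Map phase: each input record is processed independently,
    # which is what lets a real framework run this step in parallel.
    intermediate: List[Tuple[str, int]] = []
    for offset, line in enumerate(lines):
        intermediate.extend(map_fn(offset, line))

    # Shuffle phase (stand-in): group intermediate values by key.
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)

    # Reduce phase: one call per distinct key, in sorted key order.
    return dict(reduce_fn(key, values) for key, values in sorted(groups.items()))


counts = run_mapreduce(["big data is big", "data is everywhere"])
```

Because `map_fn` looks at one record at a time and `reduce_fn` at one key at a time, both phases can in principle be spread across many machines, which is where the scalability claim comes from.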
MapReduce (two figure slides, images not reproduced)
Image Courtesy – Big Data by Nathan Marz and James Warren, Manning Publications
Distributed File System
Distributed and scalable file system
Highly Available
Exposes block locations so Map and Reduce tasks can run close to their data
Supports horizontal and vertical partitioning
HDFS – Hadoop Distributed File System
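A toy sketch of why a distributed file system can be both scalable and highly available: a file is cut into fixed-size blocks and each block is copied to several nodes. Everything here is an assumption for illustration. The function names are hypothetical, the round-robin placement is a simplification rather than HDFS's actual placement policy, and the block size and replication factor are example values (128 MB and 3 are commonly cited HDFS defaults, but both are configurable).

```python
import math
from typing import Dict, List

BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size in bytes
REPLICATION = 3                 # assumed number of copies per block


def place_blocks(
    file_size: int,
    nodes: List[str],
    block_size: int = BLOCK_SIZE,
    replication: int = REPLICATION,
) -> Dict[int, List[str]]:
    """Assign each block of a file to `replication` distinct nodes (round-robin)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    num_blocks = math.ceil(file_size / block_size)
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


def readable_after_failure(placement: Dict[int, List[str]], failed_node: str) -> bool:
    """A file stays readable if every block has a replica on a surviving node."""
    return all(
        any(node != failed_node for node in replicas)
        for replicas in placement.values()
    )


nodes = ["node-a", "node-b", "node-c", "node-d"]
layout = place_blocks(300 * 1024 * 1024, nodes)  # a 300 MB file
survives = readable_after_failure(layout, "node-b")
```

Adding nodes adds both capacity and places to run computation, and losing any single node leaves every block with surviving copies.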
HDFS Architecture
Image Courtesy – Big Data by Nathan Marz and James Warren, Manning Publications
Apache Hadoop
Open Source implementation of MapReduce + DFS
Image Courtesy – Wikipedia
NoSQL Databases
Often highly optimized key-value stores
Typically relax ACID guarantees in favor of eventual consistency
Fault-tolerant, distributed architecture
Amazon Dynamo and Redis are examples
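Eventual consistency can be sketched with two replicas that accept writes independently and reconcile later. This is a minimal illustration under stated assumptions: the `Replica` class and `sync` function are hypothetical, and last-write-wins by timestamp is a deliberately simple conflict rule chosen for the example, not a description of how any particular store resolves conflicts.

```python
from typing import Any, Dict, Optional, Tuple


class Replica:
    """Hypothetical in-memory replica storing key -> (timestamp, value)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, Tuple[int, Any]] = {}

    def put(self, key: str, value: Any, timestamp: int) -> None:
        # Last-write-wins: keep the value with the newest timestamp.
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self.store.get(key)
        return None if entry is None else entry[1]


def sync(a: Replica, b: Replica) -> None:
    """Exchange entries in both directions so the replicas converge."""
    for src, dst in ((a, b), (b, a)):
        for key, (timestamp, value) in list(src.store.items()):
            dst.put(key, value, timestamp)


r1, r2 = Replica("r1"), Replica("r2")
r1.put("cart", ["book"], timestamp=1)
r2.put("cart", ["book", "pen"], timestamp=2)  # newer write lands on r2 only

stale_read = r1.get("cart")  # r1 has not seen the newer write yet
sync(r1, r2)                 # background reconciliation
fresh_read = r1.get("cart")
```

Between the second write and the sync, a reader of `r1` sees old data; after the sync both replicas agree. That window of disagreement is the trade-off accepted in exchange for availability.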
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 

Big Data – General Introduction

  • 1. Big Data – General Introduction. Vignesh Gopalan, IIG
  • 2. Agenda: Big Data – Definition; Importance of Big Data; Technologies used in Big Data Analysis
  • 3. Big Data – A Definition. The ‘V’s of Big Data: Volume, Variety, Velocity, Veracity
  • 4. Why is Big Data Important? Business Analytics; Big Science like LHC, Gene Sequencing Programs; Big Government
  • 5. Big Data – Technologies Primer: MapReduce computation framework and Hadoop; Distributed File System; Distributed databases; NoSQL technologies
  • 6. MapReduce, a distributed computation framework: Published by Google; Scalable; Fault-Tolerant; Batch Computation in parallel
  • 7. MapReduce (continued): Consists of two functions operating on key-value pairs. Map performs filtering and sorting. Reduce performs a summary operation on the Map step results.
  • 8. Map Reduce… Image Courtesy – Big Data by Nathan Marz, James Warren, Manning Publications
  • 9. Map Reduce… Image Courtesy – Big Data by Nathan Marz, James Warren, Manning Publications
  • 10. Distributed File System (HDFS – Hadoop Distributed File System): Distributed and scalable file system; Highly Available; Intrinsically aware of Map and Reduce jobs; Supports horizontal and vertical partitioning
  • 11. HDFS Architecture. Image Courtesy – Big Data by Nathan Marz, James Warren, Manning Publications
  • 12. Apache Hadoop: Open Source implementation of MapReduce + DFS. Image Courtesy – Wikipedia
  • 13. NoSQL Databases: Highly optimized key-value stores; Typically no ACID guarantees, eventual consistency instead; Fault-tolerant, distributed architecture; Amazon Dynamo and Redis are examples.
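
The Map and Reduce functions described on slide 7 can be illustrated with a small word count. This is a minimal single-process sketch in Python, not Hadoop's API: the function names (map_words, shuffle, reduce_counts, word_count) are hypothetical, and the in-memory shuffle only stands in for the grouping by key that a real framework would perform across machines.

```python
from collections import defaultdict
from itertools import chain


def map_words(line):
    """Map step: emit a (word, 1) key-value pair for every word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key (assumed stand-in for the framework's shuffle phase)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: summarize all values seen for one key (here, by summing)."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_words(line) for line in lines)
    grouped = shuffle(mapped)
    # Sort by key so the result order is deterministic.
    return dict(reduce_counts(k, v) for k, v in sorted(grouped.items()))


if __name__ == "__main__":
    print(word_count(["big data is big", "data is everywhere"]))
```

Because each call to map_words looks at only one line and each call to reduce_counts looks at only one key, both steps could in principle run in parallel on separate machines, which is the property the slides attribute to MapReduce.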