INTRODUCTION
BIG DATA. HADOOP. NOSQL.
1. BIG DATA
WHAT IS BIG DATA?
Big Data refers to TECHNOLOGY and INITIATIVES that involve data that is too DIVERSE, FAST-CHANGING or MASSIVE for conventional technologies, skills and infrastructure to address efficiently.
BIG DATA CHARACTERISTICS
VOLUME: High data capacity (terabytes or petabytes)
VELOCITY: Batch, real-time, streams
VARIETY: Various kinds (structured, unstructured, semi-structured)
VERACITY: Quality, consistency, reliability
BIG DATA TYPES
| Type | Characteristics | Examples | Technology |
|---|---|---|---|
| STRUCTURED data | Entities with a pre-defined format/schema. | RDBMS records | RDBMS, NoSQL |
| SEMI-STRUCTURED data | Less structured; may have a schema. | XML files, JSON files | NoSQL, MapReduce |
| UNSTRUCTURED data | No structure | Email content, images, videos, PDF files | MapReduce |
BIG DATA CHALLENGES IN STORAGE & ANALYSIS
1. SLOW PROCESSING, UNSCALABLE
   SSD (800Mb/s, 2ms seek)
   SATA (300Mb/s)
   IDE drive (75MB/sec, 10ms seek)
2. UNRELIABLE MACHINES
   Risky
3. RELIABILITY
   Scalability
   Data recovery
   Partial failure
4. BACKUP
5. PARALLEL PROCESSING
6. EXPENSIVE COST
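To make challenge 1 concrete, here is a back-of-the-envelope sketch of how long a full scan takes at the 75 MB/s IDE figure quoted on the slide. The 1 TB data set and the split across 100 drives are illustrative assumptions, not numbers from the deck, and seek time and coordination overhead are ignored.

```python
# Rough scan-time estimate; ignores seek time and coordination overhead.
MB_PER_TB = 1_000_000  # decimal units: 1 TB = 1,000,000 MB


def scan_seconds(data_tb: float, throughput_mb_s: float, drives: int = 1) -> float:
    """Seconds to read data_tb terabytes split evenly across `drives` drives."""
    return data_tb * MB_PER_TB / (throughput_mb_s * drives)


single = scan_seconds(1, 75)          # one IDE drive at 75 MB/s
parallel = scan_seconds(1, 75, 100)   # assumed: 100 such drives read in parallel

print(f"1 drive:    {single / 3600:.1f} hours")     # 1 drive:    3.7 hours
print(f"100 drives: {parallel / 60:.1f} minutes")   # 100 drives: 2.2 minutes
```

Under these assumptions the gap between hours and minutes is what motivates spreading both storage and processing over many machines.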
2. HADOOP
WHAT IS HADOOP?
A free, Java-based framework that allows the DISTRIBUTED PROCESSING of LARGE DATA SETS across CLUSTERS OF COMPUTERS using SIMPLE PROGRAMMING MODELS
HADOOP ORIGIN
GOOGLE PUBLISHES GFS & MAP REDUCE PAPERS
2002-2004
DOUG CUTTING ADDS GFS & MAP REDUCE TO NUTCH
2004
YAHOO! HIRES DOUG, BUILDS A TEAM TO DEVELOP HADOOP
2007
NY TIMES CONVERTS 4 TB OF ARCHIVES (100 EC2 CLUSTER)
WEB SCALE DEVELOPMENT AT YAHOO, FACEBOOK, TWITTER
YAHOO! DOES FASTEST SORT OF A TB IN 62 SEC
2009
YAHOO! SORTS A PB IN 16.25 HOURS (3658 NODES)
APACHE HADOOP IS NOW AN OPEN SOURCE
HADOOP ARCHITECTURE
Hadoop is designed and built on top of two independent parts:
HADOOP = HDFS + MAP REDUCE
HDFS: storage file system
MAP REDUCE: processing
HDFS – Hadoop distributed file system
Distributed across “NODES”
NAME NODE
Master of the system
Stores metadata: transaction log, list of files, list of blocks, data nodes
Maintains and manages blocks on data nodes
DATA NODE
Slaves; deployed on each machine
Provide actual storage
Responsible for serving read/write requests
[Figure: HDFS MODEL]
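The split between a metadata master and storage slaves can be sketched with a toy in-memory model. This is not Hadoop's API: the class names, the 4-byte block size, the replication factor of 3 and the round-robin placement are all illustrative assumptions.

```python
from collections import defaultdict

BLOCK_SIZE = 4    # bytes per block; tiny on purpose (illustrative)
REPLICATION = 3   # copies kept of each block (illustrative)


class DataNode:
    """Slave: provides the actual storage for blocks."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block_id -> bytes


class NameNode:
    """Master: stores only metadata (files -> blocks -> data nodes)."""

    def __init__(self, datanodes):
        self.datanodes = datanodes
        self.files = defaultdict(list)  # path -> [block_id, ...]
        self.locations = {}             # block_id -> [DataNode, ...]

    def write(self, path, data):
        for offset in range(0, len(data), BLOCK_SIZE):
            block_id = f"{path}#{offset // BLOCK_SIZE}"
            # Round-robin placement; real placement policies are more involved.
            start = len(self.locations)
            targets = [self.datanodes[(start + i) % len(self.datanodes)]
                       for i in range(REPLICATION)]
            for node in targets:
                node.blocks[block_id] = data[offset:offset + BLOCK_SIZE]
            self.files[path].append(block_id)
            self.locations[block_id] = targets

    def read(self, path, failed=()):
        """Reassemble a file, skipping data nodes listed in `failed`."""
        out = b""
        for block_id in self.files[path]:
            live = [n for n in self.locations[block_id] if n.name not in failed]
            out += live[0].blocks[block_id]
        return out


nodes = [DataNode(f"dn{i}") for i in range(4)]
nn = NameNode(nodes)
nn.write("/logs/a.txt", b"hello hadoop")
recovered = nn.read("/logs/a.txt", failed={"dn0"})
print(recovered)  # b'hello hadoop'
```

Because every block lives on three of the four toy data nodes, the file is still readable with `dn0` marked as failed; the name node itself never holds file contents.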
MAP REDUCE COMPONENTS
JOB TRACKER
Master; manages jobs & resources in the cluster
TASK TRACKER
Slaves, deployed on each machine
Run the map & reduce tasks as the job tracker requires
[Figure: MAP REDUCE MODEL]
MAP REDUCE ALGORITHM
o Parallel algorithm
o 3 basic steps
   Map step: split data into key & value
   Shuffle step: sorted by key
   Reduce step: gather
o Logical functions: MAPPER & REDUCER
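The three steps can be sketched as a single-process word count. This is plain Python rather than Hadoop code; the function names and the sample lines are illustrative assumptions, and on a real cluster the map and reduce calls would run in parallel on different machines.

```python
from collections import defaultdict


def mapper(line):
    """Map step: split a line into (key, value) pairs."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group values by key and sort by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reducer(key, values):
    """Reduce step: gather the values for one key into a result."""
    return key, sum(values)


lines = ["big data big cluster", "data node"]
mapped = [pair for line in lines for pair in mapper(line)]
counts = [reducer(key, values) for key, values in shuffle(mapped)]
print(counts)  # [('big', 2), ('cluster', 1), ('data', 2), ('node', 1)]
```

Only `mapper` and `reducer` carry application logic; the shuffle in between is generic, which is why the framework can own it.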
MAP REDUCE FUNCTIONS
o Hadoop handles distributing MAP & REDUCE tasks across the cluster
o MAP & REDUCE functions are written, packaged as .jars, and submitted to Hadoop clusters.
o Typically batch oriented.
[Figure: HADOOP ECOSYSTEM MODEL]
HADOOP FEATURES SUMMARY
STORE ANYTHING
   Unstructured data, semi-structured data
STORAGE CAPACITY
   Scale linearly
   Cost is not exponential
DATA LOCALITY & PROCESS IN YOUR WAY
FAILURE & FAULT TOLERANCE
   Detect failure & heal itself (data replicated, failed task is re-run, no need to maintain backup data)
COST EFFECTIVE
PRIMARILY USED FOR BATCH PROCESSING, NOT REAL-TIME
WHO IS USING HADOOP & FOR WHAT
SEARCH
LOG PROCESSING
RECOMMENDATION SYSTEMS
DATA WAREHOUSE
VIDEO & IMAGE ANALYSIS
AND MANY MORE …
3. NOSQL
WHAT IS NOSQL?
NOSQL = Not Only SQL
SCHEMA FREE
NOSQL CATEGORIES
KEY VALUE STORE
   DYNAMO, AZURE, REDIS, MEMCACHED
BIG TABLE / COLUMN STORE (GOOGLE)
   HBASE; CASSANDRA
   Similar to RDBMS but handles semi-structured data
GRAPH DB
   NEO4J
DOCUMENT STORE
   MONGODB, REDIS, COUCHDB
   Similar to key-value store but the DB knows what the value is
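The difference between a key-value store and a document store can be sketched with two toy in-memory classes. These are hypothetical illustrations, not the API of any product listed above: the key-value store treats the value as an opaque blob, while the document store understands its fields and can query on them.

```python
class KeyValueStore:
    """Values are opaque: the store can only get/put by key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class DocumentStore(KeyValueStore):
    """Values are documents (dicts), so the store can look inside them."""

    def find(self, **criteria):
        return [doc for doc in self._data.values()
                if all(doc.get(f) == v for f, v in criteria.items())]


kv = KeyValueStore()
kv.put("user:1", b'{"name": "An", "city": "Hanoi"}')  # just bytes to the store

docs = DocumentStore()
docs.put("user:1", {"name": "An", "city": "Hanoi"})
docs.put("user:2", {"name": "Binh", "city": "Hue"})

hanoi = docs.find(city="Hanoi")
print(hanoi)  # [{'name': 'An', 'city': 'Hanoi'}]
```

With the key-value store, finding every user in Hanoi would require the application to fetch and parse each value itself.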
MONGO DB – DATA MODELING CONCEPT
Data in MongoDB has A FLEXIBLE SCHEMA.
Data is stored in the form of DOCUMENTS (JSON-like key-value pairs).
COLLECTION: a group of RELATED DOCUMENTS
No JOIN; instead, there are 2 types of DOCUMENT STRUCTURE: Reference and Embedded
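The two document structures can be sketched with plain Python dicts standing in for documents. The field names (`address`, `address_id`) and the sample values are illustrative assumptions; no MongoDB driver is used here.

```python
# Embedded: the related data lives inside the parent document.
user_embedded = {
    "_id": 1,
    "name": "An",
    "address": {"city": "Hanoi", "street": "Kim Ma"},
}

# Reference: the related data is a separate document linked by its _id.
users = {1: {"_id": 1, "name": "An", "address_id": 100}}
addresses = {100: {"_id": 100, "city": "Hanoi", "street": "Kim Ma"}}


def resolve_address(user_id):
    """Follow the reference by hand: a second lookup replaces the JOIN."""
    user = users[user_id]
    return addresses[user["address_id"]]


city_embedded = user_embedded["address"]["city"]  # one read
city_referenced = resolve_address(1)["city"]      # two reads
print(city_embedded, city_referenced)  # Hanoi Hanoi
```

Embedding favors reads that always need the related data together; referencing avoids duplicating data that is shared or updated independently.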
* Always consider the usage of data (queries or updates) when designing data models
MODEL RELATIONSHIP BETWEEN DOCUMENTS
   One-to-one
   One-to-many
MODEL TREE STRUCTURES
   Parent reference
   Child reference
   Array of ancestors
   Materialized paths
   Nested sets
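Two of the tree patterns listed above, parent reference and materialized paths, can be sketched with lists of dicts. The category tree and the comma-delimited path format are illustrative assumptions chosen for the example.

```python
# Parent reference: each node stores the _id of its parent.
by_parent = [
    {"_id": "Books", "parent": None},
    {"_id": "Programming", "parent": "Books"},
    {"_id": "Databases", "parent": "Programming"},
    {"_id": "MongoDB", "parent": "Databases"},
]

# Materialized paths: each node stores the full path from the root.
by_path = [
    {"_id": "Books", "path": None},
    {"_id": "Programming", "path": ",Books,"},
    {"_id": "Databases", "path": ",Books,Programming,"},
    {"_id": "MongoDB", "path": ",Books,Programming,Databases,"},
]


def children(node_id):
    """Parent reference: direct children are a single equality match."""
    return [n["_id"] for n in by_parent if n["parent"] == node_id]


def descendants(node_id):
    """Materialized paths: the whole subtree is a substring match on the path."""
    marker = f",{node_id},"
    return [n["_id"] for n in by_path if n["path"] and marker in n["path"]]


print(children("Books"))           # ['Programming']
print(descendants("Programming"))  # ['Databases', 'MongoDB']
```

Which pattern fits depends on the queries: parent references keep updates cheap, while materialized paths make subtree reads a single match.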
MONGO DB – CRUD OPERATIONS
COMPARING: SQL VS MONGO STATEMENTS
QUERY STATEMENT
CREATE / INSERT / UPDATE / DELETE
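The slide's comparison tables were images and did not survive extraction, so the following is only a sketch of the idea. It runs real SQL through Python's built-in `sqlite3` module next to a toy document matcher; the `users` data and the `find` helper are assumptions for illustration, and the MongoDB shell statements appear only in comments for comparison.

```python
import sqlite3

# CREATE / INSERT
#   SQL:   CREATE TABLE users (...); INSERT INTO users VALUES (...)
#   Mongo: db.createCollection("users"); db.users.insertOne({name: "An", age: 31})
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, age INTEGER)")
db.executemany("INSERT INTO users VALUES (?, ?)", [("An", 31), ("Binh", 24)])
users = [{"name": "An", "age": 31}, {"name": "Binh", "age": 24}]


def find(collection, query):
    """Toy matcher: supports equality and the {"$gt": n} operator only."""
    def matches(doc):
        for field, cond in query.items():
            if isinstance(cond, dict):
                if field not in doc or not doc[field] > cond["$gt"]:
                    return False
            elif doc.get(field) != cond:
                return False
        return True
    return [doc for doc in collection if matches(doc)]


# QUERY
#   SQL:   SELECT name FROM users WHERE age > 25
#   Mongo: db.users.find({age: {$gt: 25}})
sql_names = [row[0] for row in db.execute("SELECT name FROM users WHERE age > 25")]
doc_names = [d["name"] for d in find(users, {"age": {"$gt": 25}})]

# UPDATE
#   SQL:   UPDATE users SET age = 25 WHERE name = 'Binh'
#   Mongo: db.users.updateOne({name: "Binh"}, {$set: {age: 25}})
db.execute("UPDATE users SET age = 25 WHERE name = 'Binh'")
for d in find(users, {"name": "Binh"}):
    d["age"] = 25

# DELETE
#   SQL:   DELETE FROM users WHERE name = 'An'
#   Mongo: db.users.deleteOne({name: "An"})
db.execute("DELETE FROM users WHERE name = 'An'")
users = [d for d in users if d["name"] != "An"]

print(sql_names, doc_names)  # ['An'] ['An']
```

The point of the comparison is that a MongoDB query is itself a document describing the match, where SQL uses a separate statement language.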
THE END

Weitere ähnliche Inhalte

Was ist angesagt?

NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLRamakant Soni
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...Simplilearn
 
Hadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsHadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsLynn Langit
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with HadoopPhilippe Julio
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architectureBishal Khanal
 
Introduction To Hadoop | What Is Hadoop And Big Data | Hadoop Tutorial For Be...
Introduction To Hadoop | What Is Hadoop And Big Data | Hadoop Tutorial For Be...Introduction To Hadoop | What Is Hadoop And Big Data | Hadoop Tutorial For Be...
Introduction To Hadoop | What Is Hadoop And Big Data | Hadoop Tutorial For Be...Simplilearn
 
Spark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark MeetupSpark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark MeetupDatabricks
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented DatabasesFabio Fumarola
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component rebeccatho
 
Introduction To HBase
Introduction To HBaseIntroduction To HBase
Introduction To HBaseAnil Gupta
 

Was ist angesagt? (20)

NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQL
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
 
Big data and Hadoop
Big data and HadoopBig data and Hadoop
Big data and Hadoop
 
Hadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsHadoop MapReduce Fundamentals
Hadoop MapReduce Fundamentals
 
Hadoop
Hadoop Hadoop
Hadoop
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Hadoop
HadoopHadoop
Hadoop
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architecture
 
Hadoop YARN
Hadoop YARNHadoop YARN
Hadoop YARN
 
Introduction To Hadoop | What Is Hadoop And Big Data | Hadoop Tutorial For Be...
Introduction To Hadoop | What Is Hadoop And Big Data | Hadoop Tutorial For Be...Introduction To Hadoop | What Is Hadoop And Big Data | Hadoop Tutorial For Be...
Introduction To Hadoop | What Is Hadoop And Big Data | Hadoop Tutorial For Be...
 
Introduction to HDFS
Introduction to HDFSIntroduction to HDFS
Introduction to HDFS
 
What is ETL?
What is ETL?What is ETL?
What is ETL?
 
Spark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark MeetupSpark SQL Deep Dive @ Melbourne Spark Meetup
Spark SQL Deep Dive @ Melbourne Spark Meetup
 
Hadoop Ecosystem
Hadoop EcosystemHadoop Ecosystem
Hadoop Ecosystem
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented Databases
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
 
Unit 4-apache pig
Unit 4-apache pigUnit 4-apache pig
Unit 4-apache pig
 
Introduction To HBase
Introduction To HBaseIntroduction To HBase
Introduction To HBase
 
Hadoop
HadoopHadoop
Hadoop
 

Andere mochten auch

Big Data using NoSQL Technologies
Big Data using NoSQL TechnologiesBig Data using NoSQL Technologies
Big Data using NoSQL TechnologiesAmit Singh
 
Introduction to Bigdata and NoSQL
Introduction to Bigdata and NoSQLIntroduction to Bigdata and NoSQL
Introduction to Bigdata and NoSQLTushar Shende
 
SQL or NoSQL - how to choose
SQL or NoSQL - how to chooseSQL or NoSQL - how to choose
SQL or NoSQL - how to chooseLars Thorup
 
NoSQL with Microsoft Azure
NoSQL with Microsoft AzureNoSQL with Microsoft Azure
NoSQL with Microsoft AzureKhalid Salama
 
BigData Overview
BigData OverviewBigData Overview
BigData OverviewHoryun Lee
 
Limits of RDBMS and Need for NoSQL in Bioinformatics
Limits of RDBMS and Need for NoSQL in BioinformaticsLimits of RDBMS and Need for NoSQL in Bioinformatics
Limits of RDBMS and Need for NoSQL in BioinformaticsDan Sullivan, Ph.D.
 
NoSQL-Database-Concepts
NoSQL-Database-ConceptsNoSQL-Database-Concepts
NoSQL-Database-ConceptsBhaskar Gunda
 
Introduction to NOSQL databases
Introduction to NOSQL databasesIntroduction to NOSQL databases
Introduction to NOSQL databasesAshwani Kumar
 
SQL to NoSQL Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...
SQL to NoSQL   Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...SQL to NoSQL   Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...
SQL to NoSQL Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...Amazon Web Services
 
Migration and Coexistence between Relational and NoSQL Databases by Manuel H...
 Migration and Coexistence between Relational and NoSQL Databases by Manuel H... Migration and Coexistence between Relational and NoSQL Databases by Manuel H...
Migration and Coexistence between Relational and NoSQL Databases by Manuel H...Big Data Spain
 
5 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/25 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/2Fabio Fumarola
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureVenu Anuganti
 
Sql vs NoSQL
Sql vs NoSQLSql vs NoSQL
Sql vs NoSQLRTigger
 
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
Slides: NoSQL Data Modeling Using JSON Documents – A Practical ApproachSlides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
Slides: NoSQL Data Modeling Using JSON Documents – A Practical ApproachDATAVERSITY
 
DI&A Slides: Data Lake vs. Data Warehouse
DI&A Slides: Data Lake vs. Data WarehouseDI&A Slides: Data Lake vs. Data Warehouse
DI&A Slides: Data Lake vs. Data WarehouseDATAVERSITY
 
A Beginners Guide to noSQL
A Beginners Guide to noSQLA Beginners Guide to noSQL
A Beginners Guide to noSQLMike Crabb
 

Andere mochten auch (20)

Big Data using NoSQL Technologies
Big Data using NoSQL TechnologiesBig Data using NoSQL Technologies
Big Data using NoSQL Technologies
 
Introduction to Bigdata and NoSQL
Introduction to Bigdata and NoSQLIntroduction to Bigdata and NoSQL
Introduction to Bigdata and NoSQL
 
BigData - NoSQL
BigData -  NoSQL BigData -  NoSQL
BigData - NoSQL
 
SQL & NoSQL
SQL & NoSQLSQL & NoSQL
SQL & NoSQL
 
SQL or NoSQL - how to choose
SQL or NoSQL - how to chooseSQL or NoSQL - how to choose
SQL or NoSQL - how to choose
 
NoSQL with Microsoft Azure
NoSQL with Microsoft AzureNoSQL with Microsoft Azure
NoSQL with Microsoft Azure
 
BigData Overview
BigData OverviewBigData Overview
BigData Overview
 
Limits of RDBMS and Need for NoSQL in Bioinformatics
Limits of RDBMS and Need for NoSQL in BioinformaticsLimits of RDBMS and Need for NoSQL in Bioinformatics
Limits of RDBMS and Need for NoSQL in Bioinformatics
 
NoSQL-Database-Concepts
NoSQL-Database-ConceptsNoSQL-Database-Concepts
NoSQL-Database-Concepts
 
Introduction to NOSQL databases
Introduction to NOSQL databasesIntroduction to NOSQL databases
Introduction to NOSQL databases
 
SQL to NoSQL Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...
SQL to NoSQL   Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...SQL to NoSQL   Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...
SQL to NoSQL Best Practices with Amazon DynamoDB - AWS July 2016 Webinar Se...
 
Migration and Coexistence between Relational and NoSQL Databases by Manuel H...
 Migration and Coexistence between Relational and NoSQL Databases by Manuel H... Migration and Coexistence between Relational and NoSQL Databases by Manuel H...
Migration and Coexistence between Relational and NoSQL Databases by Manuel H...
 
5 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/25 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/2
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data Architecture
 
Sql vs NoSQL
Sql vs NoSQLSql vs NoSQL
Sql vs NoSQL
 
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
Slides: NoSQL Data Modeling Using JSON Documents – A Practical ApproachSlides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
 
DI&A Slides: Data Lake vs. Data Warehouse
DI&A Slides: Data Lake vs. Data WarehouseDI&A Slides: Data Lake vs. Data Warehouse
DI&A Slides: Data Lake vs. Data Warehouse
 
Nosql databases
Nosql databasesNosql databases
Nosql databases
 
A Beginners Guide to noSQL
A Beginners Guide to noSQLA Beginners Guide to noSQL
A Beginners Guide to noSQL
 
Big data ppt
Big  data pptBig  data ppt
Big data ppt
 

Ähnlich wie Introduction of Big data, NoSQL & Hadoop

From oracle to hadoop with Sqoop and other tools
From oracle to hadoop with Sqoop and other toolsFrom oracle to hadoop with Sqoop and other tools
From oracle to hadoop with Sqoop and other toolsGuy Harrison
 
Druid: Under the Covers (Virtual Meetup)
Druid: Under the Covers (Virtual Meetup)Druid: Under the Covers (Virtual Meetup)
Druid: Under the Covers (Virtual Meetup)Imply
 
Database Modernization (Azure SQL Database)
Database Modernization (Azure SQL Database)Database Modernization (Azure SQL Database)
Database Modernization (Azure SQL Database)Radu Vunvulea
 
Python in an Evolving Enterprise System (PyData SV 2013)
Python in an Evolving Enterprise System (PyData SV 2013)Python in an Evolving Enterprise System (PyData SV 2013)
Python in an Evolving Enterprise System (PyData SV 2013)PyData
 
Making Big Data, small
Making Big Data, smallMaking Big Data, small
Making Big Data, smallMarcinJedyk
 
Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013Nathan Bijnens
 
Cowboy dating with big data TechDays at Lohika-2020
Cowboy dating with big data TechDays at Lohika-2020Cowboy dating with big data TechDays at Lohika-2020
Cowboy dating with big data TechDays at Lohika-2020b0ris_1
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogC4Media
 
Analyzing Big data in R and Scala using Apache Spark 17-7-19
Analyzing Big data in R and Scala using Apache Spark  17-7-19Analyzing Big data in R and Scala using Apache Spark  17-7-19
Analyzing Big data in R and Scala using Apache Spark 17-7-19Ahmed Elsayed
 
DataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdf
DataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdfDataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdf
DataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdfMiguel Angel Fajardo
 
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, Guindy
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, GuindyScaling up with hadoop and banyan at ITRIX-2015, College of Engineering, Guindy
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, GuindyRohit Kulkarni
 
Big Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataBig Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataPentaho
 
Hadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter PointHadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter PointInside Analysis
 
Modernizing Mission-Critical Apps with SQL Server
Modernizing Mission-Critical Apps with SQL ServerModernizing Mission-Critical Apps with SQL Server
Modernizing Mission-Critical Apps with SQL ServerMicrosoft Tech Community
 
Pilot Hadoop Towards 2500 Nodes and Cluster Redundancy
Pilot Hadoop Towards 2500 Nodes and Cluster RedundancyPilot Hadoop Towards 2500 Nodes and Cluster Redundancy
Pilot Hadoop Towards 2500 Nodes and Cluster RedundancyStuart Pook
 
Ai tour 2019 Mejores Practicas en Entornos de Produccion Big Data Open Source...
Ai tour 2019 Mejores Practicas en Entornos de Produccion Big Data Open Source...Ai tour 2019 Mejores Practicas en Entornos de Produccion Big Data Open Source...
Ai tour 2019 Mejores Practicas en Entornos de Produccion Big Data Open Source...nnakasone
 
Part 4 - Hadoop Data Output and Reporting using OBIEE11g
Part 4 - Hadoop Data Output and Reporting using OBIEE11gPart 4 - Hadoop Data Output and Reporting using OBIEE11g
Part 4 - Hadoop Data Output and Reporting using OBIEE11gMark Rittman
 
MATLAB_BIg_Data_ds_Haddop_22032015
MATLAB_BIg_Data_ds_Haddop_22032015MATLAB_BIg_Data_ds_Haddop_22032015
MATLAB_BIg_Data_ds_Haddop_22032015Asaf Ben Gal
 

Ähnlich wie Introduction of Big data, NoSQL & Hadoop (20)

From oracle to hadoop with Sqoop and other tools
From oracle to hadoop with Sqoop and other toolsFrom oracle to hadoop with Sqoop and other tools
From oracle to hadoop with Sqoop and other tools
 
Hadoop
HadoopHadoop
Hadoop
 
Druid: Under the Covers (Virtual Meetup)
Druid: Under the Covers (Virtual Meetup)Druid: Under the Covers (Virtual Meetup)
Druid: Under the Covers (Virtual Meetup)
 
Lambda architecture
Lambda architectureLambda architecture
Lambda architecture
 
Database Modernization (Azure SQL Database)
Database Modernization (Azure SQL Database)Database Modernization (Azure SQL Database)
Database Modernization (Azure SQL Database)
 
Python in an Evolving Enterprise System (PyData SV 2013)
Python in an Evolving Enterprise System (PyData SV 2013)Python in an Evolving Enterprise System (PyData SV 2013)
Python in an Evolving Enterprise System (PyData SV 2013)
 
Making Big Data, small
Making Big Data, smallMaking Big Data, small
Making Big Data, small
 
Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013
 
Cowboy dating with big data TechDays at Lohika-2020
Cowboy dating with big data TechDays at Lohika-2020Cowboy dating with big data TechDays at Lohika-2020
Cowboy dating with big data TechDays at Lohika-2020
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @Datadog
 
Analyzing Big data in R and Scala using Apache Spark 17-7-19
Analyzing Big data in R and Scala using Apache Spark  17-7-19Analyzing Big data in R and Scala using Apache Spark  17-7-19
Analyzing Big data in R and Scala using Apache Spark 17-7-19
 
DataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdf
DataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdfDataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdf
DataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdf
 
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, Guindy
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, GuindyScaling up with hadoop and banyan at ITRIX-2015, College of Engineering, Guindy
Scaling up with hadoop and banyan at ITRIX-2015, College of Engineering, Guindy
 
Big Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataBig Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big Data
 
Hadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter PointHadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter Point
 
Modernizing Mission-Critical Apps with SQL Server
Modernizing Mission-Critical Apps with SQL ServerModernizing Mission-Critical Apps with SQL Server
Modernizing Mission-Critical Apps with SQL Server
 
Pilot Hadoop Towards 2500 Nodes and Cluster Redundancy
Pilot Hadoop Towards 2500 Nodes and Cluster RedundancyPilot Hadoop Towards 2500 Nodes and Cluster Redundancy
Pilot Hadoop Towards 2500 Nodes and Cluster Redundancy
 
Ai tour 2019 Mejores Practicas en Entornos de Produccion Big Data Open Source...
Ai tour 2019 Mejores Practicas en Entornos de Produccion Big Data Open Source...Ai tour 2019 Mejores Practicas en Entornos de Produccion Big Data Open Source...
Ai tour 2019 Mejores Practicas en Entornos de Produccion Big Data Open Source...
 
Part 4 - Hadoop Data Output and Reporting using OBIEE11g
Part 4 - Hadoop Data Output and Reporting using OBIEE11gPart 4 - Hadoop Data Output and Reporting using OBIEE11g
Part 4 - Hadoop Data Output and Reporting using OBIEE11g
 
MATLAB_BIg_Data_ds_Haddop_22032015
MATLAB_BIg_Data_ds_Haddop_22032015MATLAB_BIg_Data_ds_Haddop_22032015
MATLAB_BIg_Data_ds_Haddop_22032015
 

Mehr von Savvycom Savvycom

Reactive programming with RxAndroid
Reactive programming with RxAndroidReactive programming with RxAndroid
Reactive programming with RxAndroidSavvycom Savvycom
 
Realm Java 2.2.0: Build better apps, faster apps
Realm Java 2.2.0: Build better apps, faster appsRealm Java 2.2.0: Build better apps, faster apps
Realm Java 2.2.0: Build better apps, faster appsSavvycom Savvycom
 
Vietnam - Asia's newest IT and Outsourcing Tiger
Vietnam - Asia's newest IT and Outsourcing TigerVietnam - Asia's newest IT and Outsourcing Tiger
Vietnam - Asia's newest IT and Outsourcing TigerSavvycom Savvycom
 
Pros and Cons of Blackberry 10
Pros and Cons of Blackberry 10Pros and Cons of Blackberry 10
Pros and Cons of Blackberry 10Savvycom Savvycom
 
Do's and Don'ts in mobile game development
Do's and Don'ts in mobile game developmentDo's and Don'ts in mobile game development
Do's and Don'ts in mobile game developmentSavvycom Savvycom
 
Trends of Information Technology in 2013
Trends of Information Technology in 2013Trends of Information Technology in 2013
Trends of Information Technology in 2013Savvycom Savvycom
 
Cloud computing - Pros and Cons
Cloud computing - Pros and ConsCloud computing - Pros and Cons
Cloud computing - Pros and ConsSavvycom Savvycom
 
Steps of outsourcing strategy
Steps of outsourcing strategySteps of outsourcing strategy
Steps of outsourcing strategySavvycom Savvycom
 
The role of QR code in daily life
The role of QR code in daily lifeThe role of QR code in daily life
The role of QR code in daily lifeSavvycom Savvycom
 
Why are social games so successful?
Why are social games so successful?Why are social games so successful?
Why are social games so successful?Savvycom Savvycom
 
What makes a complete mobile site
What makes a complete mobile siteWhat makes a complete mobile site
What makes a complete mobile siteSavvycom Savvycom
 

Mehr von Savvycom Savvycom (20)

Reactive programming with RxAndroid
Reactive programming with RxAndroidReactive programming with RxAndroid
Reactive programming with RxAndroid
 
Realm Java 2.2.0: Build better apps, faster apps
Realm Java 2.2.0: Build better apps, faster appsRealm Java 2.2.0: Build better apps, faster apps
Realm Java 2.2.0: Build better apps, faster apps
 
Serenity-BDD training
Serenity-BDD trainingSerenity-BDD training
Serenity-BDD training
 
Best PHP Framework For 2016
Best PHP Framework For 2016Best PHP Framework For 2016
Best PHP Framework For 2016
 
Vietnam - Asia's newest IT and Outsourcing Tiger
Vietnam - Asia's newest IT and Outsourcing TigerVietnam - Asia's newest IT and Outsourcing Tiger
Vietnam - Asia's newest IT and Outsourcing Tiger
 
Vietnam smartphone usage
Vietnam smartphone usageVietnam smartphone usage
Vietnam smartphone usage
 
Mobile payment
Mobile paymentMobile payment
Mobile payment
 
Swift Introduction
Swift IntroductionSwift Introduction
Swift Introduction
 
Project manegement
Project manegementProject manegement
Project manegement
 
Business Etiquette Training
Business Etiquette TrainingBusiness Etiquette Training
Business Etiquette Training
 
Pros and Cons of Blackberry 10
Pros and Cons of Blackberry 10Pros and Cons of Blackberry 10
Pros and Cons of Blackberry 10
 
Do's and Don'ts in mobile game development
Do's and Don'ts in mobile game developmentDo's and Don'ts in mobile game development
Do's and Don'ts in mobile game development
 
Trends of Information Technology in 2013
Trends of Information Technology in 2013Trends of Information Technology in 2013
Trends of Information Technology in 2013
 
Cloud computing - Pros and Cons
Cloud computing - Pros and ConsCloud computing - Pros and Cons
Cloud computing - Pros and Cons
 
Steps of outsourcing strategy
Steps of outsourcing strategySteps of outsourcing strategy
Steps of outsourcing strategy
 
Outsourcing to asia
Outsourcing to asiaOutsourcing to asia
Outsourcing to asia
 
The role of QR code in daily life
The role of QR code in daily lifeThe role of QR code in daily life
The role of QR code in daily life
 
Why are social games so successful?
Why are social games so successful?Why are social games so successful?
Why are social games so successful?
 
What makes a complete mobile site
What makes a complete mobile siteWhat makes a complete mobile site
What makes a complete mobile site
 
From app idea to reality
From app idea to realityFrom app idea to reality
From app idea to reality
 


Introduction of Big data, NoSQL & Hadoop

  • 3-7. WHAT IS BIG DATA? Big Data refers to TECHNOLOGY and INITIATIVES that involve data that is too DIVERSE, FAST-CHANGING or MASSIVE for conventional technologies, skills and infrastructure to address efficiently. BIG DATA CHARACTERISTICS: VOLUME (high data capacity: terabytes or petabytes); VELOCITY (batch, real-time, streams); VARIETY (various kinds: structured, unstructured, semi-structured); VERACITY (quality, consistency, reliability).
  • 8. BIG DATA TYPES (type: characteristics; examples; technology). STRUCTURED data: entities with a pre-defined format/schema; RDBMS records; RDBMS, NoSQL. SEMI-STRUCTURED data: less rigidly structured, may have a schema; XML files, JSON files; NoSQL, MapReduce. UNSTRUCTURED data: no structure; email content, images, videos, PDF files; MapReduce.
  • 9-14. BIG DATA CHALLENGES IN STORAGE & ANALYSIS: 1. SLOW PROCESSING, UNSCALABLE: SSD (800Mb/s, 2ms seek), SATA (300Mb/s), IDE drive (75MB/sec, 10ms seek). 2. UNRELIABLE MACHINES (risky). 3. RELIABILITY (scalability, data recovery, partial failure). 4. BACKUP. 5. PARALLEL PROCESSING. 6. EXPENSIVE COST.
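To put the "process slowly" challenge in numbers, here is a rough calculation using the 75 MB/sec IDE figure from the slide. The 1 TB dataset size and the 100-drive cluster are assumptions chosen only for illustration, and seek time is ignored.

```python
# Back-of-the-envelope scan times; dataset size and drive count are assumed.
DATASET_BYTES = 1_000_000_000_000  # 1 TB (assumption, not from the slides)
DRIVE_BYTES_PER_SEC = 75_000_000   # 75 MB/sec, the IDE figure from the slide


def read_seconds(total_bytes: int, bytes_per_sec: int, drives: int = 1) -> float:
    """Time to scan the data when it is split evenly across `drives` disks."""
    return total_bytes / (bytes_per_sec * drives)


single = read_seconds(DATASET_BYTES, DRIVE_BYTES_PER_SEC)
parallel = read_seconds(DATASET_BYTES, DRIVE_BYTES_PER_SEC, drives=100)

print(f"1 drive:    {single / 3600:.1f} hours")     # 3.7 hours
print(f"100 drives: {parallel / 60:.1f} minutes")   # 2.2 minutes
```

Spreading the same scan over many drives is what motivates the parallel-processing item in the list, at the price of having to cope with unreliable machines.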
  • 16. WHAT IS HADOOP? A free, Java-based framework that allows the DISTRIBUTED PROCESSING of LARGE DATA SETS across CLUSTERS OF COMPUTERS using SIMPLE PROGRAMMING MODELS.
  • 17. HADOOP ORIGIN: GOOGLE PUBLISHES GFS & MAPREDUCE PAPERS; 2002-2004; DOUG CUTTING ADDS GFS & MAPREDUCE TO NUTCH; 2004; YAHOO! HIRES DOUG, BUILDS A TEAM TO DEVELOP HADOOP; 2007; NY TIMES CONVERTS 4 TB OF ARCHIVE (100 EC2 CLUSTER).
  • 18. HADOOP ORIGIN (continued): WEB SCALE DEVELOPMENT AT YAHOO, FACEBOOK, TWITTER; YAHOO! DOES FASTEST SORT OF A TB IN 62 SEC; 2009; YAHOO! SORTS A PB IN 16.25 HOURS (3658 NODES); APACHE HADOOP IS NOW OPEN SOURCE.
  • 19. HADOOP ARCHITECTURE: Hadoop is designed and built on top of two independent parts: HDFS (storage file system) + MAPREDUCE (processing) = HADOOP.
  • 20-21. HADOOP ARCHITECTURE: HDFS (Hadoop Distributed File System), distributed across "NODES". NAME NODE: master of the system; stores metadata (transaction log, list of files, list of blocks, data nodes); maintains and manages blocks on data nodes. DATA NODE: slaves, deployed on each machine; provide the actual storage; responsible for serving read/write requests.
  • 22. HADOOP ARCHITECTURE: HDFS (Hadoop Distributed File System) MODEL.
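The split between a name node that holds only metadata and data nodes that hold the blocks can be sketched as a toy model. Everything below is hypothetical: the `NameNode` class, the 128 MB block size, the replication factor of 3 and the round-robin placement are assumptions for illustration, and real HDFS uses a more elaborate placement policy.

```python
from collections import defaultdict
from itertools import cycle

BLOCK_SIZE_MB = 128   # assumed block size
REPLICATION = 3       # assumed number of copies per block


class NameNode:
    """Toy master: keeps metadata only, never the file contents."""

    def __init__(self, data_nodes):
        self.data_nodes = list(data_nodes)
        self.file_blocks = {}                      # file name -> block ids
        self.block_locations = defaultdict(list)   # block id -> data nodes
        self._next_start = cycle(range(len(self.data_nodes)))

    def add_file(self, name, size_mb):
        n_blocks = -(-size_mb // BLOCK_SIZE_MB)    # ceiling division
        self.file_blocks[name] = []
        for i in range(n_blocks):
            block_id = f"{name}#blk{i}"
            start = next(self._next_start)
            # Simplified placement: consecutive nodes, round-robin start.
            for r in range(REPLICATION):
                node = self.data_nodes[(start + r) % len(self.data_nodes)]
                self.block_locations[block_id].append(node)
            self.file_blocks[name].append(block_id)

    def locate(self, name):
        """Answer 'which data nodes hold each block of this file?'"""
        return {b: self.block_locations[b] for b in self.file_blocks[name]}


name_node = NameNode(["dn1", "dn2", "dn3", "dn4"])
name_node.add_file("logs.txt", size_mb=300)
for block, nodes in name_node.locate("logs.txt").items():
    print(block, "->", nodes)
```

A client in this model would ask the name node where the blocks are and then read them from the data nodes directly, which matches the slide's statement that data nodes serve read/write requests.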
  • 23. HADOOP ARCHITECTURE: MAPREDUCE COMPONENTS. JOB TRACKER: master; manages jobs and resources in the cluster. TASK TRACKER: slaves, deployed on each machine; run the map & reduce tasks as the job tracker requires.
  • 26-28. HADOOP ARCHITECTURE: MAPREDUCE ALGORITHM. A parallel algorithm with 3 basic steps: Map step (split data into key & value); Shuffle step (sorted by key); Reduce step (gather).
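The three steps can be illustrated with a single-process word count. This is a minimal sketch, not Hadoop code: the function names and the two sample lines are made up, and on a real cluster the map and reduce calls would run in parallel on different machines.

```python
from collections import defaultdict


def map_step(line):
    """Map: split one input record into (key, value) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_step(pairs):
    """Shuffle: group values by key and hand the groups over sorted by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_step(key, values):
    """Reduce: gather all values for one key into a single result."""
    return key, sum(values)


lines = ["big data big cluster", "data node name node"]
mapped = [pair for line in lines for pair in map_step(line)]
result = [reduce_step(key, values) for key, values in shuffle_step(mapped)]
print(result)
# [('big', 2), ('cluster', 1), ('data', 2), ('name', 1), ('node', 2)]
```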
  • 29. HADOOP ARCHITECTURE: MAPREDUCE FUNCTIONS. Logical functions: MAPPER & REDUCER. MAP & REDUCE functions are written by the developer and submitted to the Hadoop cluster as .jar files. Hadoop handles distributing MAP & REDUCE tasks across the cluster. Typically batch oriented.
  • 32-36. HADOOP FEATURES SUMMARY. STORE ANYTHING: unstructured data, semi-structured data. STORAGE CAPACITY: scales linearly; cost is not exponential. DATA LOCALITY & PROCESS IN YOUR WAY. FAILURE & FAULT TOLERANCE: detects failure & heals itself (data is replicated, a failed task is re-run, no need to maintain backup data). COST EFFECTIVE. PRIMARILY USED FOR BATCH PROCESSING, NOT REAL-TIME.
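The "failed task is re-run" idea can be sketched as a retry loop over the nodes that hold a replica of the data. This is only an illustration under assumed names: `run_task`, `count_lines` and the simulated failure on `dn1` are hypothetical and do not reflect Hadoop's actual scheduler.

```python
def run_task(task, nodes):
    """Try the task on each replica-holding node until one attempt succeeds."""
    history = []
    for node in nodes:
        try:
            result = task(node)
        except RuntimeError as error:
            history.append((node, f"failed: {error}"))
            continue
        history.append((node, "ok"))
        return result, history
    raise RuntimeError("task failed on every node")


def count_lines(node):
    """Hypothetical task; node 'dn1' is simulated as broken."""
    if node == "dn1":
        raise RuntimeError("disk error")
    return 42


result, history = run_task(count_lines, ["dn1", "dn2", "dn3"])
print(result)   # 42
print(history)  # [('dn1', 'failed: disk error'), ('dn2', 'ok')]
```

The retry only works because the data is replicated: a second node can run the same task on its own copy of the block.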
  • 37-38. WHO IS USING HADOOP & FOR WHAT: SEARCH; LOG PROCESSING; RECOMMENDATION SYSTEMS; DATA WAREHOUSE; VIDEO & IMAGE ANALYSIS; AND MANY MORE…
  • 39. NOSQL
  • 40-44. WHAT IS NOSQL? NOSQL = Not Only SQL; SCHEMA FREE. NOSQL CATEGORIES: KEY-VALUE STORE (DYNAMO, AZURE, REDIS, MEMCACHED). BIG TABLE (GOOGLE) / COLUMN STORE (HBASE, CASSANDRA): similar to an RDBMS but handles semi-structured data. GRAPH DB (NEO4J). DOCUMENT STORE (MONGODB, REDIS, COUCHDB): similar to a key-value store, but the DB knows what the value is.
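The difference between a key-value store and a document store can be shown with plain dictionaries. This is a conceptual sketch only: the sample records and the `find_by_field` helper are hypothetical and stand in for what a real database would do internally.

```python
import json

# Key-value store: the value is an opaque blob, so lookups go through the key.
kv_store = {
    "user:1": json.dumps({"name": "An", "city": "Hanoi"}),
    "user:2": json.dumps({"name": "Binh", "city": "Hue"}),
}
blob = kv_store["user:2"]  # returned as-is; the store does not interpret it

# Document store: values are structured documents the store can look inside.
doc_store = {
    "user:1": {"name": "An", "city": "Hanoi"},
    "user:2": {"name": "Binh", "city": "Hue"},
}


def find_by_field(store, field, wanted):
    """Query on a field inside the value, which needs a store that understands it."""
    return [key for key, doc in store.items() if doc.get(field) == wanted]


print(find_by_field(doc_store, "city", "Hue"))  # ['user:2']
```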
  • 45. NOSQL: MONGODB DATA MODELING CONCEPT. Data in MongoDB has A FLEXIBLE SCHEMA and is stored in the form of DOCUMENTS (JSON-like key-value pairs). COLLECTION: a group of RELATED DOCUMENTS.
  • 46. NOSQL: MONGODB DATA MODELING CONCEPT. No JOIN; instead, there are 2 types of DOCUMENT STRUCTURE: Reference and Embedded.
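The two document structures can be sketched with plain Python dictionaries standing in for MongoDB documents. The user and address records, the field names and the `addresses_for` helper are made up for illustration.

```python
# Embedded: the address lives inside the user document (one read gets both).
user_embedded = {
    "_id": 1,
    "name": "An",
    "address": {"street": "1 Le Loi", "city": "Hue"},
}

# Reference: the address is its own document and points at the user by id.
users = {1: {"_id": 1, "name": "An"}}
addresses = [{"_id": 10, "user_id": 1, "street": "1 Le Loi", "city": "Hue"}]


def addresses_for(user_id):
    """Application-side lookup that follows the reference instead of a JOIN."""
    return [a for a in addresses if a["user_id"] == user_id]


print(user_embedded["address"]["city"])           # Hue
print([a["city"] for a in addresses_for(1)])      # ['Hue']
```

Embedding keeps related data together in one document; referencing avoids duplicating data that is shared or grows without bound, at the cost of a second lookup.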
  • 47. NOSQL: MONGODB DATA MODELING CONCEPT. MODEL RELATIONSHIPS BETWEEN DOCUMENTS: one-to-one; one-to-many. MODEL TREE STRUCTURES: parent reference; child reference; array of ancestors; materialized paths; nested sets. * Always consider the usage of data (queries or updates) when designing data models.
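Four of the tree-modeling patterns named on the slide can be compared on one small tree. The category names, field names and helper functions below are assumptions for illustration (nested sets are left out for brevity), with Python dictionaries standing in for documents.

```python
# One tree, Books > Programming > Databases, stored four different ways.
parent_refs = [
    {"_id": "Books", "parent": None},
    {"_id": "Programming", "parent": "Books"},
    {"_id": "Databases", "parent": "Programming"},
]
child_refs = [
    {"_id": "Books", "children": ["Programming"]},
    {"_id": "Programming", "children": ["Databases"]},
    {"_id": "Databases", "children": []},
]
ancestor_arrays = [
    {"_id": "Books", "ancestors": []},
    {"_id": "Programming", "ancestors": ["Books"]},
    {"_id": "Databases", "ancestors": ["Books", "Programming"]},
]
materialized_paths = [
    {"_id": "Books", "path": ","},
    {"_id": "Programming", "path": ",Books,"},
    {"_id": "Databases", "path": ",Books,Programming,"},
]


def descendants_by_ancestors(root):
    """Array of ancestors: one membership test per document."""
    return [d["_id"] for d in ancestor_arrays if root in d["ancestors"]]


def descendants_by_path(root):
    """Materialized paths: a substring match on the stored path."""
    return [d["_id"] for d in materialized_paths if f",{root}," in d["path"]]


print(descendants_by_ancestors("Books"))  # ['Programming', 'Databases']
print(descendants_by_path("Books"))       # ['Programming', 'Databases']
```

Parent and child references are cheap to update but need one lookup per level to walk the tree, while ancestor arrays and materialized paths answer "all descendants" in a single pass; which one fits depends on the queries, as the slide's closing note says.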
  • 48. NOSQL: MONGODB CRUD OPERATIONS. COMPARING SQL VS MONGO STATEMENTS: QUERY STATEMENT; CREATE / INSERT / UPDATE / DELETE.
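As a rough companion to the comparison named on this slide, the pairs below line up common SQL statements with approximately equivalent MongoDB shell calls. The slide's own examples are not in the transcript, so the `users` table/collection and its fields are hypothetical; the Python code only prints the pairs and does not talk to a database.

```python
# Each pair: a SQL statement and a roughly equivalent MongoDB shell call.
# The "users" table/collection and its fields are made up for illustration.
SQL_VS_MONGO = [
    ("INSERT INTO users (name, age) VALUES ('An', 30)",
     "db.users.insertOne({name: 'An', age: 30})"),
    ("SELECT * FROM users WHERE age > 25",
     "db.users.find({age: {$gt: 25}})"),
    ("UPDATE users SET age = 31 WHERE name = 'An'",
     "db.users.updateMany({name: 'An'}, {$set: {age: 31}})"),
    ("DELETE FROM users WHERE name = 'An'",
     "db.users.deleteMany({name: 'An'})"),
]

for sql, mongo in SQL_VS_MONGO:
    print(f"SQL:   {sql}")
    print(f"Mongo: {mongo}")
    print()
```

MongoDB filters and updates are themselves documents, so conditions such as "greater than" are written as operator keys like `$gt` rather than as SQL expressions.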