Big Data – Hadoop - NoSQL and Graph Database
Ramazan FIRIN
20.11.2012




  This document is intended for only AVEA İletişim Hizmetleri A.Ş.("AVEA"), its dealers, employees and/or others specifically authorised. The contents of this document are
  confidential and any disclosure, copying, distribution and/or taking any action in reliance with the content of this document is prohibited. AVEA is not liable for the transmission
  of this document in any manner to any third parties that are not authorised to receive.
AGENDA

•   Big Data
•   Hadoop
•   NoSQL
•   Graph DB and Neo4j
•   Possible Usage in Telco
•   Demo




                               2
Executive Summary

       • Big Data is a new IT trend

       • Hadoop and NoSQL can be used to process Big Data

       • Possible usage areas in telco:
           - Preventing churn
           - Offering customer-specific campaigns
           - Acquiring more customers




AVEA                                   3                 R&D / MW Development
What is Big Data?




   Datasets that are too awkward to work with using traditional,
             hands-on database management tools.




                                 4
Big Data - 3V Concept




                       5
Big Data Sources

1.   Social network profiles - Facebook, LinkedIn, Yahoo, Google
2.   Social influencers - blog comments, user forums, review sites
3.   Activity-generated data - application logs, sensor data
4.   Public data - Wikipedia, IMDb, etc.
5.   Data warehouse appliances - transactional data
6.   Network and in-stream monitoring
7.   Legacy documents




                                    6
Big Data To Smart Data




 Cover of The Economist
                          7
Volume




         8
New Data Sources - Internet


•   2 billion internet users by 2011
•   Twitter processes 7 terabytes of data every day
•   Facebook processes 10 terabytes of data every day
•   4.6 billion mobile phones
•   Google processes 24 petabytes of data every day




                                       9
Big Data Approach




                    10
Big Data Design




                  11
Big Data Usage Sector




                        12
Sample Usage - 360° View of the Customers




                      13
Sample Usage – Customer Sentiment




                     14
Sample Usage – Detect Churn Pattern




                      15
Sample Usage - Health




                        16
Big Data Market




                  17
Big Data Solutions – Oracle Big Data Appliance




                       18
Big Data Solutions – IBM Pure Data




                       19
Top 10 Technology Trends 2012 from CSC




                     20
Gartner: Top 10 IT Trends for 2013




                     21
Gartner: 10 Critical IT Trends for the Next Five Years

•      The third trend is bigger data and storage.
•      By 2015, big data demand will generate 1 million jobs in the Global
       1000, but only one-third of those jobs will be filled because of a
       shortage of talent.
•      Analytics and pattern recognition are key.
•      New specialized ARM-based servers are appearing for specialty analytics.




                     22
HADOOP




  23
What is HADOOP?




     The Apache Hadoop software library is a framework that
    allows for the distributed processing of large data sets
  across clusters of computers using simple programming models
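
MapReduce is the best known of these simple programming models. The following is a minimal sketch in plain Python, with no Hadoop involved, of the map, shuffle and reduce steps for a word count. The function names and the sample input are illustrative assumptions, not Hadoop APIs.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # On a real cluster, each line (or block of lines) could be mapped on a
    # different machine; here everything runs in one process.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big graph data"]))
```

The map and reduce functions never share state, which is what allows a framework to run them on many machines in parallel.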




                               24
History




          25
Hadoop Components




                    26
HADOOP ARCHITECTURE




                      27
Hadoop Ecosystem




Pig - a data processing language that simplifies Hadoop programming
Hive - SQL-like queries
HBase - random read/write access to tables with billions of rows and millions of columns
  (NoSQL)

                                   28
Other Google Research




                        29
NoSQL




 30
RDBMS PERFORMANCE




                     31
Join is killer...




                     32
What is NoSQL?


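•       Stands for "Not Only SQL"
•       Non-relational
•       Cheap, easy to implement
•       Scalability
    –   Vertical - add more resources (CPU, memory, storage) to a single machine
    –   Horizontal - add more machines and spread the data across them
•       No predefined schema
•       No join operations
•       Usually relaxes ACID guarantees; design is guided by the CAP theorem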
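
Horizontal scaling depends on deciding which machine stores which record. The following is a minimal sketch of hash-based sharding, assuming hypothetical node names and subscriber keys; it is not taken from any particular NoSQL product.

```python
import hashlib


def shard_for(key, nodes):
    """Pick a node for a key by hashing the key (simple modulo placement)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def place(keys, nodes):
    """Return a mapping of node -> list of keys stored on that node."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[shard_for(key, nodes)].append(key)
    return placement


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]  # hypothetical machine names
    keys = ["subscriber-%d" % i for i in range(10)]
    for node, stored in place(keys, nodes).items():
        print(node, stored)
```

With modulo placement, changing the number of nodes moves most keys to a different node, which is why real systems often use consistent hashing instead.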



                                          33
NoSQL DB Types


1. Key-Value Stores
2. Document Databases
3. Column Family Stores
4. Graph Databases
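
To make the four types concrete, here is a rough sketch of how the same subscriber record might be shaped in each model, using only plain Python structures. The record, its field names and the key formats are hypothetical; no database is involved.

```python
import json

# Key-value store: an opaque value looked up by a single key.
key_value = {"subscriber:42": json.dumps({"name": "Ada", "plan": "prepaid"})}

# Document database: the value is a structured, queryable document.
document = {
    "_id": 42,
    "name": "Ada",
    "plan": "prepaid",
    "devices": [{"type": "phone", "os": "Android"}],
}

# Column-family store: row key -> column family -> column -> value.
column_family = {
    "42": {"profile": {"name": "Ada"}, "billing": {"plan": "prepaid"}},
}

# Graph database: nodes plus typed relationships between them.
graph = {
    "nodes": {42: {"name": "Ada"}, 43: {"name": "Bob"}},
    "relationships": [(42, "KNOWS", 43)],
}


def plan_from_each_model():
    """Read the same fact (the plan) out of the first three models."""
    return (
        json.loads(key_value["subscriber:42"])["plan"],
        document["plan"],
        column_family["42"]["billing"]["plan"],
    )
```

The graph model is the only one of the four in which the connection between two records is itself stored as data.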




                          34
Key-Value Stores




 -   Redis, Voldemort
                   35
Document Database




- CouchDB, MongoDB
                    36
Column Family Stores




 -   Cassandra, HBase
                       37
Graph Database




- Neo4j, InfoGrid, InfiniteGraph
                 38
RDBMS Supports ACID



•   Atomicity - a transaction is all or nothing
•   Consistency - only valid data is written to the database
•   Isolation - concurrent transactions behave as if they were executed
    serially
•   Durability - once a transaction is committed, its changes survive failures
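
Atomicity and consistency can be sketched with Python's built-in sqlite3 module. The table, the column names and the amounts below are hypothetical. The second transfer credits one account and then violates a CHECK constraint on the other, so the whole transaction is rolled back.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move credit between two accounts atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The debit broke the CHECK constraint; the credit was undone too.
        return False


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (id INTEGER PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))")
    with conn:
        conn.executemany("INSERT INTO account VALUES (?, ?)",
                         [(1, 100), (2, 0)])
    ok = transfer(conn, 1, 2, 60)      # valid transfer
    failed = transfer(conn, 1, 2, 60)  # would overdraw account 1
    balances = dict(conn.execute("SELECT id, balance FROM account"))
    conn.close()
    return ok, failed, balances


if __name__ == "__main__":
    print(demo())
```

After the failed transfer, account 2 does not keep the credit it received inside the aborted transaction, which is the all-or-nothing behaviour described above.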




                                       39
NoSQL Supports CAP Theorem




                    40
NoSQL Supports CAP Theorem




•   Consistency - every client always sees the same view of the data.
•   Availability - all clients can always read and write.
•   Partition tolerance - the system keeps working even if failures cut off
    communication between some of its nodes.



                     You can pick only two...
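
The trade-off can be illustrated with a toy model: a register replicated on two nodes that either refuses writes during a partition (consistent but not available) or accepts them on one side only (available but not consistent). The class and its "CP" and "AP" modes are illustrative assumptions, not a real database.

```python
class Replica:
    """One copy of the stored value."""

    def __init__(self):
        self.value = None


class TwoNodeStore:
    """Toy replicated register with two replicas and a switchable policy."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [Replica(), Replica()]
        self.partitioned = False  # True while the replicas cannot communicate

    def write(self, replica_index, value):
        if not self.partitioned:
            for replica in self.replicas:  # normal case: update every copy
                replica.value = value
            return True
        if self.mode == "CP":
            return False  # refuse the write: consistent, but not available
        # AP: accept the write locally; the two copies now disagree.
        self.replicas[replica_index].value = value
        return True

    def read(self, replica_index):
        return self.replicas[replica_index].value
```

A real system also has to reconcile the diverged copies once the partition heals, which this sketch leaves out.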


                                        41
Visual Guide to NoSQL Systems




                     42
NoSQL Complexity




                   43
NoSQL Performance




                    44
Job Trends




                     45
Graph DB and Neo4j




       46
Graph DB

A graph database uses graph structures with nodes, edges, and properties
  to represent and store data.
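
Storing edges directly means that finding related records is a traversal rather than a join. The following is a minimal sketch of a breadth-first traversal over a small in-memory graph; the names in it are hypothetical, and this is plain Python rather than a graph database API.

```python
from collections import deque

# Hypothetical social graph: node -> set of directly connected nodes.
KNOWS = {
    "ayse": {"mehmet", "can"},
    "mehmet": {"ayse", "elif"},
    "can": {"ayse"},
    "elif": {"mehmet", "deniz"},
    "deniz": {"elif"},
}


def within_hops(graph, start, max_hops):
    """Breadth-first traversal: return {node: hop distance} up to max_hops."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the requested depth
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


def friends_of_friends(graph, person):
    """People exactly two hops away: candidates for a recommendation."""
    return {n for n, d in within_hops(graph, person, 2).items() if d == 2}
```

Each step only touches the neighbours of the current node, so the cost depends on the part of the graph that is visited rather than on the total amount of data.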




                                  47
Graph DB Usage Area



•   Recommendations
•   Business intelligence
•   Social networking
•   MDM (master data management)
•   System management
•   Time series data
•   Product catalogue
•   Web analytics
•   Scientific computing
•   Indexing your slow RDBMS


                           48
Relational Databases are Graphs!




                       49
Neo4j


•   Leading graph database
•   Open source
•   Transaction support (ACID)
•   Traversal framework
•   Indexing
•   High performance (traverses 1,000,000+ relationships per second)
•   Querying
•   REST support
•   Robust (in 24/7 operation since 2003)
•   Disk based
•   Massive scalability
                     50
Neo4j Data Model


Neo4j has nodes and relationships.
Both nodes and relationships have properties.


   Node1 --[ relationship type: knows; property: date of meeting ]--> Node2

   Node1 properties: name, surname
   Node2 properties: name, surname
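
The model above can be sketched with two small Python classes. This is an illustration of the property graph idea only, not the Neo4j API, and the names and the meeting date are made-up sample values.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node carries an arbitrary set of properties."""
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed link between two nodes, with its own properties."""
    start: Node
    end: Node
    type: str
    properties: dict = field(default_factory=dict)


# Sample data (hypothetical values).
node1 = Node({"name": "Ali", "surname": "Kaya"})
node2 = Node({"name": "Zeynep", "surname": "Demir"})
knows = Relationship(node1, node2, "knows", {"date_of_meeting": "2012-11-20"})


def describe(rel):
    """Render a relationship as a one-line, human-readable string."""
    return "({}) -[{} {}]-> ({})".format(
        rel.start.properties["name"], rel.type, rel.properties,
        rel.end.properties["name"])


if __name__ == "__main__":
    print(describe(knows))
```

Putting the date of meeting on the relationship rather than on either node is the point of the model: the fact belongs to the connection itself.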




                                        51
Neo4j Performance




http://www.neotechnology.com/2012/10/20-billion-relationships-imported-into-neo4j-on-ec2/

                                   52
Who Uses Neo4j?




•    Cisco - Master Data Management
•    Telenor Group - Customer organization structure (203 million
     subscribers)
•    Deutsche Telekom - Social football site (150 million subscribers)
                                    53
Cypher For Query




                   54
Sample Code




              55
Spring Data Neo4j




                    56
Neoclipse




            57
Product Catalog




                     58
Sample OM Data Model




                       59
Hardware Calculating Tool




                      60
Hardware Calculating Tool Result


Calculation Result - Prod Environment
                           •   4 physical machines
                           •   3 nodes on each machine
                           •   1024 MHz CPU
                           •   65536 MB RAM




                      61
OrientDB


•   The document-graph database
•   ACID support
•   SQL and native queries
•   Schema-less, schema-full and schema-mixed modes
•   Roles + security
•   Functions
•   HTTP / RESTful / JSON / binary protocol support
•   Hooks
•   Fetch plans
•   Inheritance
•   200,000 inserts per second (6 M node traversals with cache)

                               62
FluxGraph

•     Temporal Graph Database
•     Supports checkpoints
•     Compatible with Neo4j




                     63
Examples for Telcos


•      CDR
•      Routing
•      Social graphs
•      Master Data Management
•      Spatial and LBS
•      Network topology analysis
•      Neo4j and Android
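
As one illustration of the CDR and social graph items, call detail records can be aggregated into a weighted call graph. The record layout (caller, callee, duration in seconds) and the subscriber numbers are simplifying assumptions for this sketch; real CDRs carry many more fields.

```python
from collections import Counter

# Hypothetical, simplified CDRs: (caller, callee, duration in seconds).
CDRS = [
    ("5301", "5302", 120),
    ("5301", "5302", 30),
    ("5302", "5303", 45),
    ("5303", "5301", 10),
]


def build_call_graph(cdrs):
    """Aggregate CDRs into directed edges weighted by total call seconds."""
    edges = Counter()
    for caller, callee, seconds in cdrs:
        edges[(caller, callee)] += seconds
    return edges


def strongest_contact(edges, subscriber):
    """Return the callee this subscriber spent the most seconds calling."""
    outgoing = {dst: w for (src, dst), w in edges.items() if src == subscriber}
    return max(outgoing, key=outgoing.get) if outgoing else None


if __name__ == "__main__":
    graph = build_call_graph(CDRS)
    print(graph)
    print(strongest_contact(graph, "5301"))
```

In a graph database, each aggregated edge would become a relationship between two subscriber nodes, with the total duration stored as a relationship property.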




                     64
CDR Analysis




                     65
Master Data Management




                     66
Network Management




                     67
Cell Network Analysis




                     68
Sample Scenarios



•   Customer-specific campaigns
•   Preventing churn
•   Acquiring more customers
•   Special offers for campaigns




                                  69
Thanks




  70

Big Data - Hadoop, NoSQL and Graph Databases

  • 1. Big Data – Hadoop - NoSQL and Graph Database Ramazan FIRIN 20.11.2012 This document is intended for only AVEA İletişim Hizmetleri A.Ş.("AVEA"), its dealers, employees and/or others specifically authorised. The contents of this document are confidential and any disclosure, copying, distribution and/or taking any action in reliance with the content of this document is prohibited. AVEA is not liable for the transmission of this document in any manner to any third parties that are not authorised to receive.
  • 2. AGENDA • Big Data • Hadoop • NoSQL • Graph DB and Neoj • Possible Usage in Tellco • Demo 2
  • 3. Executive Summary • Big Data is a new IT trend • Hadoop and NoSQL can used to process Big Data • Possible usage area in Tellco : - Prevent Churn - to offer customer spesific campaign - to get more customer AVEA 3 R&D /MW Developement
  • 4. What is Big Data? Datasets that are too awkward to work with using traditional, hands-ondatabase management tools. 4
  • 5. Big Data- 3V Concept 5
  • 6. Big Data Sources 1. Social network profiles -Facebook, LinkedIn, Yahoo, Google 2. Social influencers - blog comments, user forums, review sites, 3. Activity-generated data - application logs, sensor data 4. Public—Wikipedia, IMDb, etc 5. Data warehouse appliances - transactional data 6. Network and in-stream monitoring 7. Legacy documents— 6
  • 7. Big Data To Smart Data Cover of The Economist 7
  • 8. Volume 8
  • 9. New Data Sources - Internet • 2 Billion internet users by 2011 • Twitter processes 7 terabytes data of every day • Facebook processes 10 terabytes data of every day • 4.6 billion mobile phone • Google processes 24 petabytes data of every day 9
  • 12. Big Data Usage Sector 12
  • 13. Sample Usage - 360°Degree View of the Customers 13
  • 14. Sample Usage – Customer Sentiment 14
  • 15. Sample Usage – Detect Churn Pattern 15
  • 16. Sample Usage - Healty 16
  • 18. Big Data Solutions – Oracle Big Data Appliance 18
  • 19. Big Data Solutions – IBM Pure Data 19
  • 20. TOP 10 Tecnology Trend 2012 from CSC 20
  • 21. Gartner: Top 10 IT Trends for 2013 Avea 21 21R&D /MW Developement
  • 22. Gartner:10 Critical IT Trends For The Next Five Years • Third trend is Bigger data and storage: • By 2015, big data demand will generate 1 million jobs in the Global 1000, • but only a one-third of jobs will get filled due to shortage of talent. • Analytics and pattern recognition are key. • Seeing new specialized ARM-based servers to do specialty analytics. Avea 22 22R&D /MW Developement
  • 24. What is HADOOP? The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models 24
  • 25. History 25
  • 28. Hadoop Ecosystem Pig - simplifies hadoop programming, data processing language Hive - SQL like queries HBase - Random read/write, billions of row and millions of colums (NoSQL) 28
  • 31. RDBMS PERFORMANCE Avea 31 31R&D /MW Developement
  • 32. Join is killer... Avea 32 32R&D /MW Developement
  • 33. What is NoSQL? • Stands for Not Only SQL • Non relational • Cheap, Easy to implement • Scalability – Vertically - Add more data – Horizontally - Add more storage • No pre-defined schema • No join operations • Not ACID, support CAP threom 33
  • 34. NoSQL DB Types 1. Key-Value Stores 2. Document Databases 3. Column Family Stores 4. Graph Databases
  • 35. Key-Value Stores - Redis, Voldemort
  • 37. Column Family Stores - Cassandra, HBase
  • 38. Graph Databases - Neo4j, InfoGrid, InfiniteGraph
  • 39. RDBMSs Support ACID • Atomicity - a transaction is all or nothing • Consistency - only valid data is written to the database • Isolation - concurrent transactions behave as if they were executed serially • Durability - once a transaction is committed, its changes survive failures
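
Atomicity and consistency from slide 39 can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, assuming a made-up `account` table and a hypothetical `transfer` helper; it is meant only to show the all-or-nothing behaviour, not a production pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is the "only valid data" rule for this example.
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount between accounts; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # both updates are committed
failed = transfer(conn, 1, 2, 500)  # debit violates the CHECK, credit is undone
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

In the second call the credit to account 2 has already been executed when the debit fails, yet after the rollback neither change is visible. That is the property many NoSQL stores relax in exchange for easier horizontal scaling.
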
  • 40. NoSQL Supports the CAP Theorem
  • 41. NoSQL Supports the CAP Theorem • Consistency - each client always has the same view of the data. • Availability - all clients can always read and write. • Partition tolerance - the system keeps working even if nodes fail or the network between them is split. You can pick only two...
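
The "pick only two" trade-off on slide 41 becomes visible once a network partition occurs. The toy model below is a heavily simplified sketch with two replicas and a `mode` switch; the class names and the `partitioned` flag are assumptions made for illustration, and real systems use quorums, versioning and repair mechanisms that are not modelled here.

```python
class Replica:
    """One node holding its own copy of the data."""

    def __init__(self):
        self.data = {}


class TwoNodeStore:
    """Toy two-replica store; mode 'CP' or 'AP' decides behaviour under partition."""

    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False  # True means the replicas cannot reach each other

    def write(self, replica, key, value):
        if not self.partitioned:
            # Healthy network: every write reaches both replicas.
            self.a.data[key] = value
            self.b.data[key] = value
            return True
        if self.mode == "CP":
            # Stay consistent by refusing the write: availability is given up.
            return False
        # AP: accept the write locally; the replicas may now disagree.
        replica.data[key] = value
        return True


cp = TwoNodeStore("CP")
cp.write(cp.a, "x", 1)
cp.partitioned = True
cp_accepted = cp.write(cp.a, "x", 2)

ap = TwoNodeStore("AP")
ap.write(ap.a, "x", 1)
ap.partitioned = True
ap_accepted = ap.write(ap.a, "x", 2)
```

After the partition, the CP store rejects the write and both replicas still agree, while the AP store accepts it and the two replicas return different values for `x` until they are reconciled.
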
  • 42. Visual Guide to NoSQL Systems
  • 45. Job Trends
  • 46. Graph DB and Neo4j
  • 47. Graph DB: A graph database uses graph structures with nodes, edges, and properties to represent and store data.
  • 48. Graph DB Usage Areas • Recommendations • Time series data • Business intelligence • Product catalogue • Social networking • Web analytics • MDM • Scientific computing • System management • Indexing your slow RDBMS
  • 50. Neo4j • Leading graph database • Open source • Transaction support (ACID) • Traversal framework • High performance (traverses 1,000,000+ relationships/second) • Indexing • Querying • REST support • Robust (in 24/7 operation since 2003) • Disk-based • Massive scalability
  • 51. Neo4j Data Model: Neo4j has nodes and relationships. Nodes and relationships have properties. Example: Node1 (properties: name, surname) is connected to Node2 (properties: name, surname) by a relationship of type "knows" with the property "date of meeting".
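
The data model on slide 51 (nodes, typed relationships, properties on both) can be sketched in plain Python. This is not Neo4j's API; the classes, the `relate` and `friends_of_friends` helpers, and the sample people are hypothetical and only mirror the "knows" example from the slide.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    properties: dict
    relationships: list = field(default_factory=list)


@dataclass(eq=False)
class Relationship:
    type: str
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)


def relate(start, rel_type, end, **properties):
    """Create a directed, typed relationship and attach it to its start node."""
    rel = Relationship(rel_type, start, end, properties)
    start.relationships.append(rel)
    return rel


def neighbours(node, rel_type):
    """Follow outgoing relationships of one type."""
    return [r.end for r in node.relationships if r.type == rel_type]


def friends_of_friends(node):
    """Two-step traversal: people known by the people this node knows."""
    direct = neighbours(node, "knows")
    found = []
    for friend in direct:
        for candidate in neighbours(friend, "knows"):
            if candidate is not node and candidate not in direct and candidate not in found:
                found.append(candidate)
    return found


ali = Node({"name": "Ali", "surname": "Kaya"})
veli = Node({"name": "Veli", "surname": "Demir"})
can = Node({"name": "Can", "surname": "Yilmaz"})
met = relate(ali, "knows", veli, date_of_meeting="2012-11-20")
relate(veli, "knows", can)

suggestions = [n.properties["name"] for n in friends_of_friends(ali)]
```

The traversal only touches the relationships attached to the nodes it visits, which is why this kind of friend-of-a-friend query avoids the repeated joins a relational schema would need.
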
  • 53. Who uses Neo4j? • Cisco - Master Data Management • Telenor Group - customer organization structure (203 million subscribers) • Deutsche Telekom - social football site (150 million subscribers)
  • 57. Neoclipse
  • 58. Product Catalog
  • 59. Sample OM Data Model
  • 61. Hardware Calculating Tool Result: calculation result for the prod environment • 4 physical machines • 3 nodes on each machine • 1024 MHz CPU • 65536 MB RAM
  • 62. OrientDB • The document-graph database • HTTP / RESTful / JSON / binary protocol support • ACID support • Hooks • SQL and native queries • Fetch plans • Schema-less, schema-full and schema-mixed modes • Inheritance • 200,000 inserts per second (6 M node traversals with cache) • Roles + security • Functions
  • 63. FluxGraph • Temporal graph database • Has checkpoints • Compatible with Neo4j
  • 64. Examples for Telcos • CDR • Routing • Social graphs • Master Data Management • Spatial and LBS • Network topology analysis • Neo4j and Android
  • 65. CDR Analysis
  • 66. Master Data Management
  • 67. Network Management
  • 68. Cell Network Analysis
  • 69. Sample Scenarios • Customer-specific campaigns • Prevent churn • Get more customers • Special offers for campaigns
