Hadoop for carrier
Flytxt
Harnessing Hadoop for Big Data, Series II
1.
Leveraging Hadoop Cluster for Carrier-Grade Application
Copyright © 2011 Flytxt B.V. All rights reserved. 1/17/2012
2.
No Personalization Service discovery
3.
600-800 GB of CDR per day
◦ GPRS Signaling: 50 GB/day
◦ 3G Signaling: 300 GB/day
◦ Voice: 100 GB/day
◦ SMS: 200 GB/day
100-200 GB/day of Web Data
Mammoth Data
Data Analysis
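As a quick check of the figures above, the four CDR streams can be summed and combined with the web data range. This is a minimal Python sketch; the variable names are illustrative and not taken from the slides.

```python
# Daily CDR volumes in GB, as listed on the slide.
cdr_gb_per_day = {
    "GPRS signaling": 50,
    "3G signaling": 300,
    "Voice": 100,
    "SMS": 200,
}
# Web data is given only as a range (GB/day).
web_gb_per_day = (100, 200)

cdr_total = sum(cdr_gb_per_day.values())
total_low = cdr_total + web_gb_per_day[0]
total_high = cdr_total + web_gb_per_day[1]

print(f"CDR total: {cdr_total} GB/day")
print(f"CDR + web: {total_low}-{total_high} GB/day")
```

The itemized streams add up to 650 GB/day, which sits inside the stated 600-800 GB range; with web data the combined intake is roughly 750-850 GB/day.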
6.
Framework for distributed processing of large data sets across clusters
Consists of
◦ Hadoop Distributed File System, aka HDFS (file system)
◦ Hadoop MapReduce (programming model)
Characteristics
◦ Performance shall scale linearly
◦ Compute should move to data
◦ Simple core, modular and extensible
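The MapReduce programming model named above can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle and reduce steps only; it does not use any Hadoop API, and the function names and sample records are assumptions made for the example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values of one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on an in-memory data set."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Hypothetical input lines standing in for a large distributed file.
    lines = ["voice sms voice", "sms gprs", "voice"]
    print(run_job(lines))
```

In a real cluster the map calls run on the nodes that hold the data blocks and the shuffle moves data over the network; here everything stays in one process so the shape of the model is easy to see.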
7.
Current Bottleneck
◦ Data resides in multiple nodes/zones/VM instances, with no elegant, reliable and efficient way of extracting it
◦ Loading terabytes of data into a database is slow
◦ Parallel computing is not possible in conventional BI ETL
◦ User profile and application data reside in a DB that can scale only vertically
8.
Structured Data
sqoop --connect jdbc:mysql://db.example.com/website --table USERS --as-sequencefile
Unstructured Data
9.
A distributed data collection server
◦ Scalable
◦ Configurable
◦ Extensible
◦ Manageable
Built around the concept of flows
◦ A single flow corresponds to a type of data source
◦ Supports compression, batching and reliability setups per flow
Data comes in through a source
◦ Optionally processed by one or more decorators
◦ And transmitted out via a sink
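The flow described above (a source, optional decorators, then a sink) can be sketched as a chain of Python generators. This is only an illustration of the pattern; it is not the collection server's API, and every name here (`line_source`, `batch`, `compress`, `run_flow`) is hypothetical.

```python
import gzip
from typing import Callable, Iterable, Iterator, List

Event = bytes
Decorator = Callable[[Iterator[Event]], Iterator[Event]]


def line_source(lines: Iterable[str]) -> Iterator[Event]:
    """Source: turn incoming text lines into byte events."""
    for line in lines:
        yield line.encode("utf-8")


def batch(size: int) -> Decorator:
    """Decorator factory: join every `size` events into one larger event."""
    def decorate(events: Iterator[Event]) -> Iterator[Event]:
        buf: List[Event] = []
        for event in events:
            buf.append(event)
            if len(buf) == size:
                yield b"\n".join(buf)
                buf = []
        if buf:  # flush the final, possibly smaller, batch
            yield b"\n".join(buf)
    return decorate


def compress(events: Iterator[Event]) -> Iterator[Event]:
    """Decorator: gzip-compress each event."""
    for event in events:
        yield gzip.compress(event)


def run_flow(source: Iterator[Event], decorators: List[Decorator], sink: List[Event]) -> None:
    """Wire source -> decorators -> sink; the sink here is just a list."""
    stream = source
    for decorator in decorators:
        stream = decorator(stream)
    for event in stream:
        sink.append(event)


if __name__ == "__main__":
    collected: List[Event] = []
    run_flow(line_source(["a", "b", "c", "d", "e"]), [batch(2), compress], collected)
    print(len(collected), gzip.decompress(collected[0]))
```

Batching and compression are expressed as per-flow decorators, so a second flow for a different data source could use a different decorator list without touching the source or sink code.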
12.
MapReduce is very powerful, but:
◦ It requires a Java programmer
◦ User has to re-invent common functionality (join, filter, etc.)
Execution engine atop Hadoop
Pig provides a higher-level language, Pig Latin
Opens the system to non-Java programmers
Provides common operations like join, group, filter, sort
13.
Web log processing.
Data processing for web search platforms.
Ad hoc queries across large data sets.
Rapid prototyping of algorithms for processing large data sets.
Pig runs on the local machine and the job gets executed in the Hadoop cluster
$ cd /usr/share/cloudera/pig/
$ bin/pig -x local
grunt> log = LOAD 'excite-small.log' AS (user, timestamp, query);
grunt> grpd = GROUP log BY user;
grunt> cntd = FOREACH grpd GENERATE group, COUNT(log);
grunt> STORE cntd INTO 'output';
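To make clear what the Pig script above computes, here is a rough single-machine equivalent in Python: it groups log rows by user and counts the rows per user. It assumes tab-separated fields (Pig's default loader delimiter) and, as a simplification, skips rows that do not have exactly three fields; the sample rows are invented for the example.

```python
from collections import Counter


def count_queries_per_user(lines):
    """Count log rows per user, mirroring GROUP ... BY user + COUNT."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # simplification: ignore malformed rows
        user, _timestamp, _query = fields
        counts[user] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical rows in (user, timestamp, query) layout.
    sample = [
        "u1\t970916105432\tyahoo chat\n",
        "u2\t970916001949\tminnesota weather\n",
        "u1\t970916105450\tyahoo mail\n",
    ]
    print(count_queries_per_user(sample))
```

The four Pig statements express the same load, group, count and store steps, with the difference that Pig compiles them into jobs that can run across the cluster.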
14.
System for querying and managing structured data
Built on top of Hadoop
Uses MapReduce for execution
SQL-like syntax; supports
◦ FROM-clause subquery
◦ ANSI join (equi-join)
◦ Multi-table insert
◦ Multi group-by
◦ Sampling
◦ Object traversal
Engagement
◦ Summarization
◦ Ad hoc analysis
◦ Spam detection
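Two of the query shapes listed above, a FROM-clause subquery and an equi-join, can be illustrated with a small summarization query. The sketch below runs against SQLite through Python's standard `sqlite3` module purely to show the SQL shape; it is not Hive, Hive's dialect differs in places, and the tables and rows are made up for the example.

```python
import sqlite3

# In-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER, plan TEXT);
CREATE TABLE events (user_id INTEGER, kind TEXT);
INSERT INTO users VALUES (1, 'prepaid'), (2, 'postpaid'), (3, 'prepaid');
INSERT INTO events VALUES (1, 'sms'), (1, 'voice'), (2, 'sms'), (3, 'sms'), (3, 'sms');
""")

# The inner SELECT is a FROM-clause subquery (events per user);
# the JOIN ... ON u.user_id = e.user_id is an equi-join;
# the outer GROUP BY summarizes events per plan.
query = """
SELECT u.plan, SUM(e.n) AS total_events
FROM (SELECT user_id, COUNT(*) AS n FROM events GROUP BY user_id) e
JOIN users u ON u.user_id = e.user_id
GROUP BY u.plan
ORDER BY u.plan
"""
rows = conn.execute(query).fetchall()
print(rows)
```

On a cluster, a query of this shape would be compiled into MapReduce jobs rather than executed by a single database process.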
16.
Feature                          Hive                Pig
Language                         SQL-like            Pig Latin
Schemas/Types                    Yes (explicit)      Yes (implicit)
Partitions                       Yes                 No
Server                           Optional (Thrift)   No
User Defined Functions           Yes                 Yes
Custom Serializer/Deserializer   Yes                 Yes
DFS Direct Access                Yes (implicit)      Yes (explicit)
Join/Order/Sort                  Yes                 Yes
Shell                            Yes                 Yes
Streaming                        Yes                 No
Web Interface                    Yes                 No
JDBC/ODBC                        Yes (limited)       No