Demystify Big Data Breakfast Briefing - Juergen Urbanski, T-Systems
1. Capturing Big Value in Big Data at Deutsche Telekom
Jürgen Urbanski
VP Big Data Architectures & Technologies
T-Systems
Board Member Big Data & Analytics
BITKOM (German IT Industry Association)
Christian Wirth
VP BI & Big Data
T-Systems
2. Introducing Deutsche Telekom and T-Systems
Deutsche Telekom is Europe’s largest telecom service provider
– Revenue: €58 billion
– Employees: 232,342
T-Systems is the enterprise division of Deutsche Telekom
– Revenue: €10 billion
– Employees: 52,742
– Services: data center, end user computing, networking, systems integration, cloud and big data
3. Disruptive Innovations in Big Data
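Relational Database, MPP Analytics and Data Warehouse compared with Hadoop
Schema
– Relational / MPP / Data Warehouse: pre-defined, fixed; required on write
– Hadoop: required on read; store first, ask questions later
Processing
– Relational / MPP / Data Warehouse: no or limited data processing
– Hadoop: compute & storage co-located; parallel scale-out processing
Data types
– Relational / MPP / Data Warehouse: structured
– Hadoop: any, including unstructured
Physical infrastructure
– Relational / MPP / Data Warehouse: default is enterprise grade; mission critical
– Hadoop: default is commodity; much cheaper storage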
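The difference between "required on write" and "required on read" can be illustrated with a minimal Python sketch. It is not Hadoop code and not from the slides: the `SCHEMA` fields, the two store classes, and the JSON-lines input format are all hypothetical, chosen only to show where validation happens in each model.

```python
import json

# Hypothetical fixed schema: field name -> required Python type.
SCHEMA = {"customer_id": int, "plan": str}


def conforms(record, schema):
    """Return True if the record has every schema field with the right type."""
    return all(isinstance(record.get(k), t) for k, t in schema.items())


class SchemaOnWriteStore:
    """Validates on write: non-conforming records are rejected up front."""

    def __init__(self, schema):
        self.schema = schema
        self.rows = []

    def write(self, record):
        if not conforms(record, self.schema):
            raise ValueError(f"record does not match schema: {record!r}")
        self.rows.append({k: record[k] for k in self.schema})

    def read(self):
        return list(self.rows)


class SchemaOnReadStore:
    """Stores raw text first; structure is applied only when reading."""

    def __init__(self):
        self.raw_lines = []

    def write(self, raw_line):
        self.raw_lines.append(raw_line)  # no validation: store first

    def read(self, schema):
        rows = []
        for line in self.raw_lines:
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skipped for this read, but still kept in storage
            if isinstance(record, dict) and conforms(record, schema):
                rows.append({k: record[k] for k in schema})
        return rows
```

In the schema-on-read store, the same raw lines can later be read with a different schema (for example `{"device": str}`) without reloading anything, which is the "store first, ask questions later" idea in miniature.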
4. Target Hadoop Use Cases
Potential value (axis): High to Moderate
IT Infrastructure & Operations
– Lower Cost Storage for Tier 3 / 4 workloads (active archive)
Business Intelligence & Data Warehousing
– Enterprise Data Warehouse Offload
– Enterprise Data Warehouse Archive
Line of Business
Telecommunications & Media
– Data Products
– Capacity Planning & Utilization
– Customer Profiling & Revenue Analytics
– Targeted Advertising Analytics
– Service Renewal Implementation
– CDR based Data Analytics
– Fraud Management
Other Industries
– Connected Car
– Smart Home
Cost effective storage, processing, and analysis
Foundation for profitable growth
Numbered markers (1, 2, 3, 4 ... N) = Highlighted today
5. Enterprise Data Warehouse Offload
The Challenge
– Many EDWs are at capacity
– Running out of budget before running out of relevant data
– Older data archived “in the dark”, not available for exploration
The Solution
– Hadoop for data storage and processing: parse, cleanse, apply structure and transform
– Free EDW for valuable queries
– Retain all data for analysis!
Data warehouse workload before offload: Operational (44%), ETL Processing (42%), Analytics (11%)
After offload
– Hadoop: Storage & Processing
– Data warehouse: Operational (50%), Analytics (50%)
Cost is 1/10th
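The "parse, cleanse, apply structure and transform" steps can be sketched as a small pipeline. The Python below is only an illustration under stated assumptions: plain Python stands in for processing that would run on Hadoop, and the raw line format (`date;customer_id;amount`), the sample data, and all function names are hypothetical rather than taken from the presentation.

```python
from collections import defaultdict

# Hypothetical raw format: "date;customer_id;amount" (not from the slides).
RAW_LINES = [
    "2013-10-01;c1;10.00",
    "2013-10-01; c2 ;5.50",
    "garbage line",
    "2013-10-02;c1;2.50",
    "2013-10-02;c2;n/a",
]


def parse(line):
    """Split a raw line into fields; return None if the shape is wrong."""
    parts = line.split(";")
    return parts if len(parts) == 3 else None


def cleanse(fields):
    """Trim whitespace and drop rows whose amount is not numeric."""
    date, customer, amount = (f.strip() for f in fields)
    try:
        return date, customer, float(amount)
    except ValueError:
        return None


def apply_structure(row):
    """Attach column names to a cleansed row."""
    return dict(zip(("date", "customer_id", "amount"), row))


def transform(records):
    """Aggregate to one total per customer, ready to load into the warehouse."""
    totals = defaultdict(float)
    for r in records:
        totals[r["customer_id"]] += r["amount"]
    return dict(totals)


def offload_pipeline(lines):
    """Run all four steps; only the compact result goes to the warehouse."""
    parsed = (parse(line) for line in lines)
    cleansed = (cleanse(p) for p in parsed if p is not None)
    structured = [apply_structure(c) for c in cleansed if c is not None]
    return transform(structured)
```

Calling `offload_pipeline(RAW_LINES)` keeps the raw lines untouched and hands the warehouse only the aggregated totals, which mirrors the idea of moving ETL work off the EDW while retaining all data for later analysis.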
(Highlighted use case 1)
6. Data Products: ImmobilienScout (a DT subsidiary)
The Situation
Europe’s leading real estate marketplace with data on...
– 1m properties listed currently
– 20m properties cumulative
– 6 million saved searches
– Geographical coordinates
– Enriched by socio-demographic data on 19m properties
Team
– Product Manager
– Data Scientists
– 2 Scrum Teams
The Solution
“Market Navigator” service
– Supports realtors in acquiring customers
– Local market analysis helps with price setting for rent and buy
– Integrates third-party data
Functionality includes
– Price heat maps & trending
– Demand- and supply-side info
– Local area information
– Comparable transactions
(Highlighted use case 2)
8. Connected Car (a T-Systems offering)
When cars go online...
– Calling the repair center
– Read out vehicle data
– On-Board signaling
– Online combinations
– Machine data enriched with Web data
– Based on Cloud Technology
– Reduced incidence of product recalls
– Better management of product life cycle
– Early error detection
– Direct online link to dealers and the OEM
– Preventative maintenance, quicker repair turnaround
– Usage-based feedback for product development
– 40 million new mobile contracts
– Higher customer satisfaction
V: Volume, Velocity, Variety, Value
(Highlighted use case 3)
9. Smart Home: Gigaset (a T-Systems customer)
– Gigaset Elements is a sensor- and cloud-based solution for home networks
– Cutting-edge sensors are combined with each other and linked with an Internet-capable DECT ULE base station and a secure Web server
– That permits a large number of applications in the home, notably home security and assisted living for the elderly
– The intelligent, learning system is powered by Hadoop
– At a price of less than €200 for a Starter Kit, the system is intended to be suitable for the mass market
(Highlighted use case 4)
10. Which Distribution is Right for You Today and Tomorrow?
13 original Apache Hadoop projects
No commercial support
Fully open source distribution (incl. management tools)
Reputation for cost-effective licensing
Strong developer ecosystem momentum
GTM partners incl. Microsoft, Teradata, Informatica, Talend, NetApp
Widely adopted distribution
Management tools and Impala not fully open source
GTM partners include Oracle, HP, Dell, IBM
Appeals to some business critical use cases prior to Hadoop 2.0
GTM partner AWS (M3 and M5 versions only)
Just announced by EMC, very early stage
Scale: Open, Open & proprietary, Proprietary
11. How We Evaluate Hadoop Distributions
Hortonworks well positioned prior to HDP 2.0
12. HDP 2.0 is Architected to be a Good Fit with these Enterprise Requirements
13. T-Systems Approach to Big Data Projects
Assessment in three phases:
1. Maturity & Potential Evaluation
– Capability maturity benchmarking
– Identification and prioritization of potential vs. challenges
– Deliverables: Current versus future mode gap analysis
2. Proof of Concept
– Selection of initial use case
– Standup of test environment with customer data
– Validation of feasibility and potential
– Deliverables: Testing of customer-specific scenario including cost-benefit analysis
3. Strategy & Roadmap
– Development of enterprise-wide Big Data strategy
– Prioritization of road map
– Implementation planning
– Deliverables: Business case, prioritized roadmap, implementation plan
14. Deutsche Telekom Perspective
– The Hadoop ecosystem delivers powerful innovation in storage, databases and business intelligence, promising unprecedented price / performance compared to existing technologies
– Hadoop is becoming an enterprise-wide landing zone for big data. Increasingly it is also used to transform data
– We look forward to realizing cost reductions in areas such as enterprise data warehousing. More importantly, Big Data opens up new business opportunities for ourselves and our customers
In that journey we are partnering closely with
15. Big Data = Big Opportunity!
Jürgen Urbanski
juergen.urbanski@t-systems.com
Christian Wirth
christian.wirth@t-systems.com
Editor's notes
Line of Business: Demand 360 view of customer, employee, market, etc., but cannot be certain about what matters for analysis
Business Analysts: Need to incorporate more data into analysis, LOBs not sure what matters; want to reuse existing skill sets
Data Warehouse Owners: Must efficiently store, process, organize, deliver massive and growing data volume and variety while meeting SLAs
IT Management: Drive innovation, reduce costs, meet growing analytic demands of LOBs, mitigate risk of adopting new technology
System Administrators: Ensure stability and reliability of systems
Buyers:
– VP Analytics
– VP/Director Business Intelligence
– VP/Director Data Warehousing/Management
– VP/Director Infrastructure
– VP/Director Operations/IT Systems
Faster customer acquisition
Better product development
Better quality
Lower churn
Which distribution will ensure you stay on the main path of open source innovation, vs. trap you in proprietary forks?