Dr. Susan Wegner, Telekom Innovation Laboratories
25 February 2015, BITKOM Big Data Summit, Hanau
Future analytics –
Fabrication of Synthetic Data
DATA NATIVES 2015, Dr. Susan Wegner, Telekom Innovation Laboratories
www.laboratories.telekom.com @T_Labs
ACCESS TO DATA IS STILL AN ISSUE
DUE TO DIFFERENT TECHNOLOGY AND DATA SOURCES
Depersonalization approaches
Depersonalization
Standard Anonymization Approaches
Adaptation of real data using data manipulation
techniques to increase k-anonymity*
Methods shown: Perturbation, Regression, Classification Tree, Generalization, Suppression, Replacement, Markov Chain, Further Methods
*Each person contained in the release cannot be distinguished from at least k-1 other individuals whose information also appears in the release.
Source: http://whimsley.typepad.com/whimsley/2011/09/data-anonymization-and-re-identification-some-basics-of-data-privacy.html
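The k-anonymity footnote above can be illustrated with a minimal sketch. The records, the field names (`age`, `zip`, `diagnosis`) and the choice of quasi-identifiers are hypothetical and not taken from the slides; the sketch only shows how k is measured and how generalization, one of the methods named above, can raise it.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

def generalize_age(record, width=10):
    """Generalization: replace an exact age with a range such as '30-39'."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}

# Hypothetical toy records (assumption, for illustration only).
records = [
    {"age": 31, "zip": "10115", "diagnosis": "A"},
    {"age": 34, "zip": "10115", "diagnosis": "B"},
    {"age": 38, "zip": "10115", "diagnosis": "A"},
    {"age": 45, "zip": "10117", "diagnosis": "C"},
    {"age": 47, "zip": "10117", "diagnosis": "B"},
]

# Every exact age is unique, so each person stands alone.
k_raw = k_anonymity(records, ["age", "zip"])
# After generalizing age to decades, people share groups.
k_generalized = k_anonymity([generalize_age(r) for r in records], ["age", "zip"])
print(k_raw, k_generalized)
```

Generalizing the age column raises k, but the exact ages are lost, which is the kind of data loss the tradeoff between anonymity and usefulness refers to.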
Synthesization
Creation of new data with the same properties using
machine learning methods (ongoing research)
Standard Anonymization Approaches
A tradeoff between anonymity and usefulness
Tradeoff: Perfect Anonymity vs. Perfect Usefulness
It is not possible to create a perfectly anonymised dataset that is perfectly useful to researchers at the same time.
Perfect Anonymity
Pro: 100% data privacy
Con: Data loss and distortions can compromise conclusions
Perfect Usefulness
Pro: A maximum of data-based insights is possible
Con: Disclosure of individuals is possible
Decreasing the intensity of anonymization moves a dataset from perfect anonymity toward perfect usefulness.
Standard Anonymization Approaches
Combining data sources endangers anonymity
Sources
Anonymized Netflix Data (2007)
• About 100 million movie ratings
• 500,000 customers
• Personal details were removed and replaced by random numbers
Public IMDB* Data (2007)
• Users who entered movie ratings using their real name
Combination
• The anonymized Netflix data and the public IMDB data overlap in their ratings
Danger
• Users on IMDB using their real name had similar rating patterns in the Netflix data
• It was possible to find all other preferences of those users in the Netflix data
Conclusion
• You can never fully estimate the anonymity of your data using standard approaches
*IMDB = Internet Movie Database
Source: Narayanan & Shmatikov (2008), Robust De-anonymization of Large Datasets
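The combination of sources described above can be sketched as a simple linkage on rating patterns. This is a simplified illustration, not the algorithm from the cited paper: the scoring rule, the thresholds, the user IDs, the name and all ratings are hypothetical.

```python
def overlap_score(ratings_a, ratings_b, tolerance=1):
    """Count movies rated in both profiles whose ratings differ by at most `tolerance`."""
    shared = set(ratings_a) & set(ratings_b)
    return sum(1 for m in shared if abs(ratings_a[m] - ratings_b[m]) <= tolerance)

def link_profiles(anonymized, public, min_score=3):
    """Map each public name to the anonymized ID with the best, unambiguous match."""
    links = {}
    for name, public_ratings in public.items():
        scored = sorted(
            ((overlap_score(public_ratings, anon_ratings), anon_id)
             for anon_id, anon_ratings in anonymized.items()),
            reverse=True,
        )
        best_score, best_id = scored[0]
        runner_up = scored[1][0] if len(scored) > 1 else 0
        # Accept only a match that is strong and clearly better than the next one.
        if best_score >= min_score and best_score > runner_up:
            links[name] = best_id
    return links

# Hypothetical "anonymized" ratings keyed by random IDs (assumption).
anonymized = {
    "u1837": {"Movie A": 5, "Movie B": 1, "Movie C": 4, "Movie D": 2, "Movie E": 5},
    "u5521": {"Movie A": 2, "Movie C": 3, "Movie F": 4},
    "u9044": {"Movie B": 5, "Movie D": 5, "Movie G": 1},
}
# Hypothetical public ratings posted under a real name (assumption).
public = {"Jane Example": {"Movie A": 5, "Movie B": 2, "Movie C": 4}}

links = link_profiles(anonymized, public)
matched_id = links["Jane Example"]
# Everything the matched record contains beyond the public ratings is now exposed.
revealed = sorted(set(anonymized[matched_id]) - set(public["Jane Example"]))
print(links, revealed)
```

Three public ratings are enough here to single out one record, after which the remaining ratings in that record are attributable to a named person.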
synthetic data to overcome privacy issues
ADVANTAGES & DISADVANTAGES OF TECHNIQUES
Anonymization
Rendering anonymous means the modification of personal data so that the information concerning personal or material circumstances can no longer be attributed to an identified or identifiable individual.
Synthetization
Actual data is used to develop patterns in which the characteristics of this actual data are largely retained. These patterns are then used to generate new data, which no longer has any reference to an individual in the actual data.
Synthetic data makes it possible, for the first time, to use data that was previously unavailable.
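The idea of developing patterns from actual data and generating new data from them can be sketched with a first-order Markov chain, one of the method families named earlier in the deck. This is an illustrative assumption, not the presenters' model: the event names and sequences are hypothetical, and the function names are invented for this sketch.

```python
import random
from collections import Counter, defaultdict

def fit_markov_chain(sequences):
    """Estimate start counts and transition counts from observed event sequences."""
    starts = Counter(seq[0] for seq in sequences if seq)
    transitions = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            transitions[current][nxt] += 1
    return starts, transitions

def sample_sequence(starts, transitions, length, rng):
    """Generate one synthetic sequence by walking the fitted chain."""
    state = rng.choices(list(starts), weights=list(starts.values()))[0]
    sequence = [state]
    # Stop early if the current state was never followed by anything.
    while len(sequence) < length and transitions[state]:
        options = transitions[state]
        state = rng.choices(list(options), weights=list(options.values()))[0]
        sequence.append(state)
    return sequence

# Hypothetical observed sequences (assumption, for illustration only).
observed = [
    ["call", "sms", "call"],
    ["sms", "call", "data"],
    ["call", "data", "data"],
]

starts, transitions = fit_markov_chain(observed)
rng = random.Random(42)  # fixed seed so the sketch is repeatable
synthetic = [sample_sequence(starts, transitions, 3, rng) for _ in range(5)]
print(synthetic)
```

The generated sequences follow the observed transition frequencies. A small model like this can still reproduce an observed sequence exactly, so it says nothing on its own about whether individuals are protected.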
Standard Anonymization Approaches vs. Synthetic Data
Synthetic data is not always superior
Criteria compared for Standard Anonymization and Synthesization:
• Creation is fast and easy
• Suitable for real-time provision
• Individual is retained
• Unrestricted data transfer
• Unrestricted data storage
• 100% protection of individuals
• No data loss or distortion
Conclusion
• No approach is completely superior to the other
• Synthetic data beats standard anonymization if the latter leads to data loss or does not allow unrestricted storage and transfer of data due to data privacy issues or volume restrictions
Main Advantages of Synthetic Data
First results
Comparison of Distributions
Deviations are statistically not significant
[Figure: Distribution of the variable 'Place' in the source and synthetized data set. Two bar charts of case counts per location, with locations given as longitude*latitude pairs.]
[Figure: Distribution of the variable 'activity' in the source and synthetized data set. Amount of cases for each activity (Activity 1 to Activity 5), Source vs. Synthetized. Axis labels: AIF/MOC, AIF/MTC, AIF/Update Location, IuCS MOC, IuCS MTC.]
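One way to check whether deviations between a source and a synthetic distribution are statistically significant is a chi-square test of homogeneity. The slides do not say which test was used, so this is an assumed choice, and the counts below are hypothetical rather than read from the charts. The critical value 9.488 is the standard chi-square threshold for 4 degrees of freedom at the 5% level.

```python
def chi_square_homogeneity(source_counts, synthetic_counts):
    """Return (statistic, degrees_of_freedom) for two sets of category counts."""
    categories = sorted(set(source_counts) | set(synthetic_counts))
    n_src = sum(source_counts.values())
    n_syn = sum(synthetic_counts.values())
    total = n_src + n_syn
    statistic = 0.0
    for category in categories:
        o_src = source_counts.get(category, 0)
        o_syn = synthetic_counts.get(category, 0)
        column_total = o_src + o_syn
        # Expected counts if both data sets share one underlying distribution.
        e_src = n_src * column_total / total
        e_syn = n_syn * column_total / total
        statistic += (o_src - e_src) ** 2 / e_src + (o_syn - e_syn) ** 2 / e_syn
    return statistic, len(categories) - 1

# Hypothetical counts per activity (assumption, not the values behind the charts).
source = {"Activity 1": 1200, "Activity 2": 800, "Activity 3": 1500,
          "Activity 4": 300, "Activity 5": 200}
synthetic = {"Activity 1": 1185, "Activity 2": 820, "Activity 3": 1490,
             "Activity 4": 310, "Activity 5": 195}

CRITICAL_VALUE_DF4_ALPHA_05 = 9.488  # chi-square critical value, df=4, alpha=0.05

statistic, df = chi_square_homogeneity(source, synthetic)
significant = statistic > CRITICAL_VALUE_DF4_ALPHA_05
print(round(statistic, 3), df, significant)
```

A statistic below the critical value means the test does not reject the hypothesis that both data sets follow the same distribution for this variable. It does not test joint distributions across several variables.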
Why Synthetic data?
The Solution/USP:
• Synthetic data have nearly the same quality as the original.
• They cannot be traced back to their origin.
• They are 100% compliant with data privacy.
• Patents pending (disruptive technology).
• They can be stored in any way and transferred to others.
• This makes new services, including individualized services, possible.
Thank you
WE SHAPE THE FUTURE
BACKUP
Data modelling
From Real Data to New Data:
1. Collection of several events
2. Clustering
3. Formation of regional patterns
4. Probability model
5. Fabrication of synthetic data
All possible events
Regional distribution
Local distribution
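The five steps above can be sketched end to end under strong simplifying assumptions. The slides do not describe the actual clustering or probability model, so this sketch substitutes a coarse coordinate grid for clustering and simple relative frequencies for the probability model. All events, coordinates and function names are hypothetical.

```python
import random
from collections import Counter, defaultdict

def cluster_by_grid(events, cell_size=1.0):
    """Step 2: group events into regions using a coarse longitude/latitude grid."""
    regions = defaultdict(list)
    for lon, lat, activity in events:
        cell = (int(lon // cell_size), int(lat // cell_size))
        regions[cell].append((lon, lat, activity))
    return regions

def build_model(regions):
    """Steps 3-4: per region, store its share of all events and its activity counts."""
    total = sum(len(members) for members in regions.values())
    return {
        cell: {
            "weight": len(members) / total,
            "activities": Counter(activity for _, _, activity in members),
        }
        for cell, members in regions.items()
    }

def fabricate(model, n, cell_size, rng):
    """Step 5: draw a region, then a random point inside it and an activity."""
    cells = list(model)
    weights = [model[c]["weight"] for c in cells]
    synthetic = []
    for _ in range(n):
        cx, cy = rng.choices(cells, weights=weights)[0]
        activities = model[(cx, cy)]["activities"]
        activity = rng.choices(list(activities), weights=list(activities.values()))[0]
        lon = (cx + rng.random()) * cell_size
        lat = (cy + rng.random()) * cell_size
        synthetic.append((lon, lat, activity))
    return synthetic

# Step 1: hypothetical collected events as (longitude, latitude, activity).
events = [
    (13.40, 52.52, "Activity 1"), (13.45, 52.50, "Activity 2"),
    (13.38, 52.51, "Activity 1"), (11.58, 48.14, "Activity 3"),
    (11.56, 48.13, "Activity 1"), (8.68, 50.11, "Activity 2"),
]

regions = cluster_by_grid(events)
model = build_model(regions)
synthetic_events = fabricate(model, 10, 1.0, random.Random(7))
print(len(regions), synthetic_events[:2])
```

The fabricated events keep the regional shares and the per-region activity mix of the input, while the exact coordinates are redrawn inside each grid cell.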
Future Analytics – exemplary application
Vision – include everything in one picture
100% Compliant with Data Privacy
16

Weitere ähnliche Inhalte

Ähnlich wie "Future Analytics - Fabrication of Synthetic Data", Dr. Susan Wegner,VP Smart Data Analytics and Communication at Deutsche Telekom

Data lifecycle mgt across the enterprise
Data lifecycle mgt across the enterpriseData lifecycle mgt across the enterprise
Data lifecycle mgt across the enterpriseOSTHUS
 
Optim test data management for IMS 2011
Optim test data management for IMS 2011Optim test data management for IMS 2011
Optim test data management for IMS 2011evgeni77
 
Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...acijjournal
 
A Comparative Study on Privacy Preserving Datamining Techniques
A Comparative Study on Privacy Preserving Datamining  TechniquesA Comparative Study on Privacy Preserving Datamining  Techniques
A Comparative Study on Privacy Preserving Datamining TechniquesIJMER
 
A review on privacy preservation in data mining
A review on privacy preservation in data miningA review on privacy preservation in data mining
A review on privacy preservation in data miningijujournal
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Miningijujournal
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Miningijujournal
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Miningijujournal
 
"Implementing data quality automation with open source stack" - Max Martynov,...
"Implementing data quality automation with open source stack" - Max Martynov,..."Implementing data quality automation with open source stack" - Max Martynov,...
"Implementing data quality automation with open source stack" - Max Martynov,...Grid Dynamics
 
IEEE 2014 JAVA DATA MINING PROJECTS M privacy for collaborative data publishing
IEEE 2014 JAVA DATA MINING PROJECTS M privacy for collaborative data publishingIEEE 2014 JAVA DATA MINING PROJECTS M privacy for collaborative data publishing
IEEE 2014 JAVA DATA MINING PROJECTS M privacy for collaborative data publishingIEEEFINALYEARSTUDENTPROJECTS
 
2014 IEEE JAVA DATA MINING PROJECT M privacy for collaborative data publishing
2014 IEEE JAVA DATA MINING PROJECT M privacy for collaborative data publishing2014 IEEE JAVA DATA MINING PROJECT M privacy for collaborative data publishing
2014 IEEE JAVA DATA MINING PROJECT M privacy for collaborative data publishingIEEEMEMTECHSTUDENTSPROJECTS
 
IRJET- Two ways Verification for Securing Cloud Data
IRJET- Two ways Verification for Securing Cloud DataIRJET- Two ways Verification for Securing Cloud Data
IRJET- Two ways Verification for Securing Cloud DataIRJET Journal
 
How to Maximize Data Governance in Snowflake Test Environment
How to Maximize Data Governance in Snowflake Test EnvironmentHow to Maximize Data Governance in Snowflake Test Environment
How to Maximize Data Governance in Snowflake Test EnvironmentJade Global
 
Dynamic Talks: "Implementing data quality automation with open source stack" ...
Dynamic Talks: "Implementing data quality automation with open source stack" ...Dynamic Talks: "Implementing data quality automation with open source stack" ...
Dynamic Talks: "Implementing data quality automation with open source stack" ...Grid Dynamics
 
Practical risk management for the multi cloud
Practical risk management for the multi cloudPractical risk management for the multi cloud
Practical risk management for the multi cloudUlf Mattsson
 
Secure Your Data with Virtual Data Fabric (ASEAN)
Secure Your Data with Virtual Data Fabric (ASEAN)Secure Your Data with Virtual Data Fabric (ASEAN)
Secure Your Data with Virtual Data Fabric (ASEAN)Denodo
 
Building AI with Security and Privacy in mind
Building AI with Security and Privacy in mindBuilding AI with Security and Privacy in mind
Building AI with Security and Privacy in mindgeetachauhan
 
Building AI with Security Privacy in Mind
Building AI with Security Privacy in MindBuilding AI with Security Privacy in Mind
Building AI with Security Privacy in Mindgeetachauhan
 
IRJET - Improving Password System using Blockchain
IRJET - Improving Password System using BlockchainIRJET - Improving Password System using Blockchain
IRJET - Improving Password System using BlockchainIRJET Journal
 
V1_I1_2012_Paper4.doc
V1_I1_2012_Paper4.docV1_I1_2012_Paper4.doc
V1_I1_2012_Paper4.docpraveena06
 

Ähnlich wie "Future Analytics - Fabrication of Synthetic Data", Dr. Susan Wegner,VP Smart Data Analytics and Communication at Deutsche Telekom (20)

Data lifecycle mgt across the enterprise
Data lifecycle mgt across the enterpriseData lifecycle mgt across the enterprise
Data lifecycle mgt across the enterprise
 
Optim test data management for IMS 2011
Optim test data management for IMS 2011Optim test data management for IMS 2011
Optim test data management for IMS 2011
 
Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...
 
A Comparative Study on Privacy Preserving Datamining Techniques
A Comparative Study on Privacy Preserving Datamining  TechniquesA Comparative Study on Privacy Preserving Datamining  Techniques
A Comparative Study on Privacy Preserving Datamining Techniques
 
A review on privacy preservation in data mining
A review on privacy preservation in data miningA review on privacy preservation in data mining
A review on privacy preservation in data mining
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Mining
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Mining
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Mining
 
"Implementing data quality automation with open source stack" - Max Martynov,...
"Implementing data quality automation with open source stack" - Max Martynov,..."Implementing data quality automation with open source stack" - Max Martynov,...
"Implementing data quality automation with open source stack" - Max Martynov,...
 
IEEE 2014 JAVA DATA MINING PROJECTS M privacy for collaborative data publishing
IEEE 2014 JAVA DATA MINING PROJECTS M privacy for collaborative data publishingIEEE 2014 JAVA DATA MINING PROJECTS M privacy for collaborative data publishing
IEEE 2014 JAVA DATA MINING PROJECTS M privacy for collaborative data publishing
 
2014 IEEE JAVA DATA MINING PROJECT M privacy for collaborative data publishing
2014 IEEE JAVA DATA MINING PROJECT M privacy for collaborative data publishing2014 IEEE JAVA DATA MINING PROJECT M privacy for collaborative data publishing
2014 IEEE JAVA DATA MINING PROJECT M privacy for collaborative data publishing
 
IRJET- Two ways Verification for Securing Cloud Data
IRJET- Two ways Verification for Securing Cloud DataIRJET- Two ways Verification for Securing Cloud Data
IRJET- Two ways Verification for Securing Cloud Data
 
How to Maximize Data Governance in Snowflake Test Environment
How to Maximize Data Governance in Snowflake Test EnvironmentHow to Maximize Data Governance in Snowflake Test Environment
How to Maximize Data Governance in Snowflake Test Environment
 
Dynamic Talks: "Implementing data quality automation with open source stack" ...
Dynamic Talks: "Implementing data quality automation with open source stack" ...Dynamic Talks: "Implementing data quality automation with open source stack" ...
Dynamic Talks: "Implementing data quality automation with open source stack" ...
 
Practical risk management for the multi cloud
Practical risk management for the multi cloudPractical risk management for the multi cloud
Practical risk management for the multi cloud
 
Secure Your Data with Virtual Data Fabric (ASEAN)
Secure Your Data with Virtual Data Fabric (ASEAN)Secure Your Data with Virtual Data Fabric (ASEAN)
Secure Your Data with Virtual Data Fabric (ASEAN)
 
Building AI with Security and Privacy in mind
Building AI with Security and Privacy in mindBuilding AI with Security and Privacy in mind
Building AI with Security and Privacy in mind
 
Building AI with Security Privacy in Mind
Building AI with Security Privacy in MindBuilding AI with Security Privacy in Mind
Building AI with Security Privacy in Mind
 
IRJET - Improving Password System using Blockchain
IRJET - Improving Password System using BlockchainIRJET - Improving Password System using Blockchain
IRJET - Improving Password System using Blockchain
 
V1_I1_2012_Paper4.doc
V1_I1_2012_Paper4.docV1_I1_2012_Paper4.doc
V1_I1_2012_Paper4.doc
 

Mehr von Dataconomy Media

Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & David An...
Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & 	David An...Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & 	David An...
Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & David An...Dataconomy Media
 
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...Dataconomy Media
 
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...Dataconomy Media
 
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...Dataconomy Media
 
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...Dataconomy Media
 
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...Dataconomy Media
 
Data Natives Vienna v 7.0 | "Building Kubernetes Operators with KUDO for Dat...
Data Natives Vienna v 7.0  | "Building Kubernetes Operators with KUDO for Dat...Data Natives Vienna v 7.0  | "Building Kubernetes Operators with KUDO for Dat...
Data Natives Vienna v 7.0 | "Building Kubernetes Operators with KUDO for Dat...Dataconomy Media
 
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...Dataconomy Media
 
Data Natives Cologne v 4.0 | "The Data Lorax: Planting the Seeds of Fairness...
Data Natives Cologne v 4.0  | "The Data Lorax: Planting the Seeds of Fairness...Data Natives Cologne v 4.0  | "The Data Lorax: Planting the Seeds of Fairness...
Data Natives Cologne v 4.0 | "The Data Lorax: Planting the Seeds of Fairness...Dataconomy Media
 
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...Dataconomy Media
 
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...Dataconomy Media
 
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...Dataconomy Media
 
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...Dataconomy Media
 
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...Dataconomy Media
 
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...Dataconomy Media
 
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...Dataconomy Media
 
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...Dataconomy Media
 
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...Dataconomy Media
 
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...Dataconomy Media
 
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Dataconomy Media
 

Mehr von Dataconomy Media (20)

Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & David An...
Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & 	David An...Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & 	David An...
Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & David An...
 
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
 
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
 
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
 
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...
 
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
 
Data Natives Vienna v 7.0 | "Building Kubernetes Operators with KUDO for Dat...
Data Natives Vienna v 7.0  | "Building Kubernetes Operators with KUDO for Dat...Data Natives Vienna v 7.0  | "Building Kubernetes Operators with KUDO for Dat...
Data Natives Vienna v 7.0 | "Building Kubernetes Operators with KUDO for Dat...
 
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
 
Data Natives Cologne v 4.0 | "The Data Lorax: Planting the Seeds of Fairness...
Data Natives Cologne v 4.0  | "The Data Lorax: Planting the Seeds of Fairness...Data Natives Cologne v 4.0  | "The Data Lorax: Planting the Seeds of Fairness...
Data Natives Cologne v 4.0 | "The Data Lorax: Planting the Seeds of Fairness...
 
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
 
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
 
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
 
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
 
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
 
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
 
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
 
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
 
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
 
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
 
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
 

"Future Analytics - Fabrication of Synthetic Data", Dr. Susan Wegner,VP Smart Data Analytics and Communication at Deutsche Telekom

  • 1. Future Analytics - Fabrication of Synthetic Data. Dr. Susan Wegner, Telekom Innovation Laboratories. 25 February 2015, BITKOM Big Data Summit, Hanau. DATA NATIVES 2015.
  • 2. www.laboratories.telekom.com @T_Labs ACCESS TO DATA IS STILL AN ISSUE DUE TO DIFFERENT TECHNOLOGY AND DATA SOURCES
  • 3. Depersonalization approaches. Standard anonymization approaches: adaptation of real data using data manipulation techniques to increase k-anonymity*. Synthesization: creation of new data with the same properties using machine learning methods (ongoing research). Methods shown: perturbation, regression, classification tree, generalization, suppression, replacement, Markov chain, further methods. *Each person contained in the release cannot be distinguished from at least k-1 individuals whose information also appears in the release. Source: http://whimsley.typepad.com/whimsley/2011/09/data-anonymization-and-re-identification-some-basics-of-data-privacy.html
  • 4. Standard anonymization approaches: a tradeoff between anonymity and usefulness. It is not possible to create a perfectly anonymised dataset that is perfectly useful to researchers at the same time. Perfect anonymity. Pro: 100% data privacy. Con: data loss and distortions can compromise conclusions. Perfect usefulness. Pro: a maximum of data-based insights is possible. Con: disclosure of individuals is possible. Decreasing the intensity of anonymization moves from perfect anonymity towards perfect usefulness.
  • 5. Standard anonymization approaches: combining data sources endangers anonymity. Sources: anonymized Netflix data (2007) with 10 million movie ratings from 500,000 customers, in which personal details were removed and replaced by random numbers; and public IMDB* data (2007) from users who entered movie rankings using their real name. Combination: users on IMDB using their real name had similar ranking patterns in the Netflix data. Danger: it was possible to find all other preferences of those users in the Netflix data. Conclusion: you can never fully estimate the anonymity of your data using standard approaches. *IMDB = Internet Movie Database. Source: Narayanan & Shmatikov (2008), Robust De-anonymization of Large Datasets
  • 6. Depersonalization approaches. Standard anonymization approaches: adaptation of real data using data manipulation techniques to increase k-anonymity*. Synthesization: creation of new data with the same properties using machine learning methods (ongoing research). Methods shown: perturbation, regression, classification tree, generalization, suppression, replacement, Markov chain, further methods. *Each person contained in the release cannot be distinguished from at least k-1 individuals whose information also appears in the release. Source: http://whimsley.typepad.com/whimsley/2011/09/data-anonymization-and-re-identification-some-basics-of-data-privacy.html
  • 8. Advantages & disadvantages of techniques. Anonymization: rendering anonymous means the modification of personal data so that the information concerning personal or material circumstances can no longer be attributed to an identified or identifiable individual. Synthesization: actual data is used to develop patterns in which the characteristics of this actual data are largely retained. These patterns are then used to generate new data, which no longer has any reference to an individual in the actual data. Synthetic data makes it possible, for the first time, to use data that was previously unavailable.
  • 9. Standard anonymization approaches vs. synthetic data: synthetic data is not always superior. Criteria compared for standard anonymization and synthesization (with the main advantages of synthetic data highlighted): creation is fast and easy; suitable for real-time provision; individual is retained; unrestricted data transfer; unrestricted data storage; 100% protection of individuals; no data loss or distortion. Conclusion: no approach is completely superior to the other. Synthetic data beats standard anonymization if the latter leads to data loss or does not allow unrestricted storage and transfer of data due to data privacy issues or volume restrictions.
  • 10. First results: comparison of distributions. [Figure: distribution of the variable 'Place' in the source and synthesized data set; x-axis: coordinate pairs, y-axis: 0 to 160.] [Figure: distribution of the variable 'activity' in the source and synthesized data set; amount of cases for each of five activities (AIF/MOC, AIF/MTC, AIF/Update Location, IuCS MOC, IuCS MTC), axis: 0 to 200,000; legend: source, synthesized.] Deviations are statistically not significant.
  • 11. Why synthetic data? The solution/USP: Synthetic data have nearly the same quality as the original. They cannot be traced back to their origin. 100% compliant with data privacy. Patents pending (disruptive technology). They can be stored in any way and transferred to others. This makes new services, including individualized services, possible.
  • 14. Data modelling: from real data to new data. 1. Collection of several events 2. Clustering 3. Formation of regional patterns 4. Probability model 5. Fabrication of synthetic data. Diagram labels: all possible events, regional distribution, local distribution.
  • 16. Vision: include everything in one picture. 100% compliant with data privacy.
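
The five data-modelling steps on slide 14 (collect events, cluster, form regional patterns, build a probability model, fabricate synthetic data) can be illustrated with a small Python sketch. This is not the T-Labs method, which the slides do not describe in detail. It assumes that events are `(lon, lat, activity)` tuples, replaces real clustering with a fixed grid, and uses hypothetical names (`fit_model`, `fabricate`, `total_variation`, `CELL_SIZE`). It also makes no privacy guarantee: sparsely populated cells can still reveal individuals.

```python
import random
from collections import Counter, defaultdict

CELL_SIZE = 0.5  # grid width in degrees; an assumption, not taken from the slides


def fit_model(events, size=CELL_SIZE):
    """Steps 1-4: group events into grid cells and count per-cell activities.

    A fixed grid is a crude stand-in for the clustering step.
    """
    region_counts = Counter()              # regional distribution
    activity_counts = defaultdict(Counter)  # local distribution per region
    for lon, lat, activity in events:
        cell = (int(lon // size), int(lat // size))
        region_counts[cell] += 1
        activity_counts[cell][activity] += 1
    return region_counts, activity_counts


def fabricate(model, n, size=CELL_SIZE, seed=0):
    """Step 5: draw n synthetic events from the fitted counts."""
    rng = random.Random(seed)
    region_counts, activity_counts = model
    cells = list(region_counts)
    cell_weights = [region_counts[c] for c in cells]
    synthetic = []
    for _ in range(n):
        # Pick a region in proportion to how many real events it contained.
        cell = rng.choices(cells, weights=cell_weights)[0]
        acts = activity_counts[cell]
        # Pick an activity in proportion to its frequency inside that region.
        activity = rng.choices(list(acts), weights=list(acts.values()))[0]
        # Place the event uniformly inside the cell rather than at a real coordinate.
        lon = (cell[0] + rng.random()) * size
        lat = (cell[1] + rng.random()) * size
        synthetic.append((lon, lat, activity))
    return synthetic


def total_variation(sample_a, sample_b):
    """Total variation distance between two categorical samples (0 means identical)."""
    ca, cb = Counter(sample_a), Counter(sample_b)
    return 0.5 * sum(
        abs(ca[k] / len(sample_a) - cb[k] / len(sample_b)) for k in set(ca) | set(cb)
    )


# Toy input, invented for illustration; the activity labels are borrowed from slide 10.
real = (
    [(8.68, 50.11, "AIF/MOC")] * 30
    + [(8.69, 50.12, "IuCS MTC")] * 10
    + [(13.40, 52.52, "AIF/MTC")] * 20
)
synthetic = fabricate(fit_model(real), n=600)
distance = total_variation([e[2] for e in real], [e[2] for e in synthetic])
print(f"activity distribution distance: {distance:.3f}")
```

A small distance indicates that the synthetic activity distribution tracks the source, which is the kind of comparison slide 10 reports. A formal significance test would require a statistical test, which this sketch omits.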