SlideShare ist ein Scribd-Unternehmen logo
1 von 43
1© 2018 All rights reserved.
Distributed Database
Architecture for GDPR
Karthik Ranganathan
PostgresConf Silicon Valley
Oct 15, 2018
2© 2018 All rights reserved.
About Us
Kannan Muthukkaruppan, CEO
Nutanix ♦ Facebook ♦ Oracle
IIT-Madras, University of California-Berkeley
Karthik Ranganathan, CTO
Nutanix ♦ Facebook ♦ Microsoft
IIT-Madras, University of Texas-Austin
Mikhail Bautin, Software Architect
ClearStory Data ♦ Facebook ♦ D.E.Shaw
Nizhny Novgorod State University, Stony Brook
 Founded Feb 2016
 Apache HBase committers and early engineers on Apache Cassandra
 Built Facebook’s NoSQL platform powered by Apache HBase
 Scaled the platform to serve many mission-critical use cases
• Facebook Messages (Messenger)
• Operational Data Store (Time series Data)
 Reassembled the same Facebook team at YugaByte along with
engineers from Oracle, Google, Nutanix and LinkedIn
Founders
3© 2018 All rights reserved.
WHAT IS
YUGABYTE DB?
4© 2018 All rights reserved.
A transactional, planet-scale database
for building high-performance cloud services.
5© 2018 All rights reserved.
NoSQL + SQL Cloud Native
6© 2018 All rights reserved.
TRANSACTIONAL PLANET-SCALEHIGH PERFORMANCE
Single Shard & Distributed ACID Txns
Document-Based, Strongly
Consistent Storage
Low Latency, Tunable Reads
High Throughput
OPEN SOURCE
Apache 2.0
Popular APIs Extended
Apache Cassandra, Redis and PostgreSQL (BETA)
Auto Sharding & Rebalancing
Global Data Distribution
Design Principles
CLOUD NATIVE
Built For The Container Era
Self-Healing, Fault-Tolerant
7© 2018 All rights reserved.
WHAT IS GDPR?
8© 2018 All rights reserved.
GDPR : General Data Protection Regulation
9© 2018 All rights reserved.
Citizens of EU can control sharing and protection
of their personal data by businesses.
10© 2018 All rights reserved.
Personal Data, also called
PII (Personally Identifiable Information)
• User name
• Email address
• Date of birth
• Bank details
• Location details
• Computer IP address
11© 2018 All rights reserved.
Control over personal data
• Consent & data location
• Data privacy and safety
• Right to be forgotten
• Data access on demand
• Notify on data breach
• Data portability
• Ability to fix errors in data
• Restrict processing
Database concerns Application concerns
12© 2018 All rights reserved.
#1 USER CONSENT
AND DATA LOCATION
13© 2018 All rights reserved.
Data must be stored in EU by default. Businesses
need explicit user consent to move it outside.
14© 2018 All rights reserved.
Why is this hard?
• EU user data lives in that region
• Other countries have compliance regulation – more geo’s
• Public clouds may not have coverage – hybrid deployments
• Architecture depends on data – multiple per service
Think Global Deployments first!
15© 2018 All rights reserved.
Example – online ecommerce site
• Products table needs globally replication – not PII data
16© 2018 All rights reserved.
Read Replicas
Global Replication
Non-PII Data
Global Replication
with YugaByte DB
17© 2018 All rights reserved.
Example – online ecommerce site
• Users, orders and shipments needs locality – PII data
• Product locations table needs scale – may be PII
18© 2018 All rights reserved.
Primary Data in EU
PII Data
Non-EU Data
Non-EU Data
Geo-Partitioning
with YugaByte DB
19© 2018 All rights reserved.
Replicate data on demand to other geo’s
• User may be ok with replicating data
• Read replicas on demand (for remote, low-latency reads)
• Change data capture (for analytics)
20© 2018 All rights reserved.
Read Replicas
Primary Data in EU
PII Data with YugaByte DB
Read Replicas with
YugaByte DB
21© 2018 All rights reserved.
#2 DATA PRIVACY
AND SAFETY
22© 2018 All rights reserved.
Data must be secured by using best practices by
default. Users need to be notified on breach.
23© 2018 All rights reserved.
Implement end-to-end encryption on day #1
24© 2018 All rights reserved.
• Use TLS Encryption
• Between client and server for app interaction
• Between database servers for replication
Encrypt All Network Communication
25© 2018 All rights reserved.
TLS Encryption
Database Cluster
User
Server to server
communication
26© 2018 All rights reserved.
• Encryption at rest
• Integrate with external Key Management Systems
• Ability to rotate keys on demand
Encryption All Storage
Have a key-value table with id to cipher key. Encrypt PII data with
the cipher key for fine-grained control. More in the next section.
27© 2018 All rights reserved.
Encryption at Rest
Database Cluster
User
Encryption on disk
Key Management
Service
28© 2018 All rights reserved.
#3 RIGHT TO BE
FORGOTTEN
29© 2018 All rights reserved.
Data must be erased if on explicit request or when
data is no longer relevant to original intent.
30© 2018 All rights reserved.
• Have a key-value table with id to cipher key
• Encrypt PII data with the cipher key on write
• Decrypt PII data on access
• Delete cipher key to forget PII data
Use Encryption of Data Attributes
31© 2018 All rights reserved.
SET email=foo@bar.com FOR USER ID=XXX
Example - Storing User Profile Data
SET email=ENCRYPTED FOR USER ID=XXX
Get encryption
key for user
Encryption PII Data
Store encrypted data
• Reads require decryption
• Data not accessible without key
32© 2018 All rights reserved.
• Many cases where value not needed
• Anonymize PII data with one way hash functions
• Use hashed ids for in data warehouse
• There is no PII data if hashed ids are used!
Use Anonymization of Data Attributes
33© 2018 All rights reserved.
USER=foo@bar.com CHECKED OUT PRODUCT=X, CATEGORY=Gadget
Example – Website Analytics
USER=HASHED_VAL CHECKED OUT PRODUCT=X, CATEGORY=Gadget
One-way hash
user id
Analytics
34© 2018 All rights reserved.
Example – Website Analytics
• User no longer identifiable
• Hashed data still useful!
35© 2018 All rights reserved.
#4 DATA ACCESS
ON DEMAND
36© 2018 All rights reserved.
Ability to inform a user about what data is being used,
for what purpose and where it is stored.
37© 2018 All rights reserved.
• Store in a separate information architecture table
• Make tagging a part of the process
• Easy to find what PII data is stored on demand
Tag Tables and Columns with PII
38© 2018 All rights reserved.
• Ensure PII are encrypted
• Ensure non-PII columns do not have sensitive data
• Use Spark/Presto to perform scan periodically
• Run scan on a read replica to not impact production
Run Continuous Compliance Checks
39© 2018 All rights reserved.
Ensure PII columns are encrypted
Ensure no PII data in other columns
Tag PII Columns
40© 2018 All rights reserved.
PUTTING IT ALL TOGETHER
41© 2018 All rights reserved.
GDPR Reference Architecture
Primary Cluster
(in EU)
Read Replica Clusters
(Anywhere in the World)
Encrypted Encrypted
App clients
Encrypted Async
Replication
Reads & Writes, Encrypted
Analytics clients
Read only, Encrypted
At-Rest Encryption for All Nodes At-Rest Encryption for All Nodes
PII Columns Encrypted w/
Cipher Key
Tag PII Columns
Ensure PII columns are
encrypted
Ensure no PII data in other
columns
42© 2018 All rights reserved.
43© 2018 All rights reserved.
Questions?
Try it at
docs.yugabyte.com/latest/quick-start

Weitere ähnliche Inhalte

Was ist angesagt?

MariaDB Galera Cluster
MariaDB Galera ClusterMariaDB Galera Cluster
MariaDB Galera Cluster
Abdul Manaf
 
Security and Multi-Tenancy with Apache Pulsar in Yahoo! (Verizon Media) - Pul...
Security and Multi-Tenancy with Apache Pulsar in Yahoo! (Verizon Media) - Pul...Security and Multi-Tenancy with Apache Pulsar in Yahoo! (Verizon Media) - Pul...
Security and Multi-Tenancy with Apache Pulsar in Yahoo! (Verizon Media) - Pul...
StreamNative
 

Was ist angesagt? (20)

reInvent reCap 2022
reInvent reCap 2022reInvent reCap 2022
reInvent reCap 2022
 
How Oath (a Verizon Company) Built a Multi-Region GDPR Application with Amazo...
How Oath (a Verizon Company) Built a Multi-Region GDPR Application with Amazo...How Oath (a Verizon Company) Built a Multi-Region GDPR Application with Amazo...
How Oath (a Verizon Company) Built a Multi-Region GDPR Application with Amazo...
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse Technology
 
Amazon Aurora Storage Demystified: How It All Works (DAT363) - AWS re:Invent ...
Amazon Aurora Storage Demystified: How It All Works (DAT363) - AWS re:Invent ...Amazon Aurora Storage Demystified: How It All Works (DAT363) - AWS re:Invent ...
Amazon Aurora Storage Demystified: How It All Works (DAT363) - AWS re:Invent ...
 
Building a Modern Data Warehouse - Deep Dive on Amazon Redshift
Building a Modern Data Warehouse - Deep Dive on Amazon RedshiftBuilding a Modern Data Warehouse - Deep Dive on Amazon Redshift
Building a Modern Data Warehouse - Deep Dive on Amazon Redshift
 
Microsoft Data Platform - What's included
Microsoft Data Platform - What's includedMicrosoft Data Platform - What's included
Microsoft Data Platform - What's included
 
Amazon S3 Masterclass
Amazon S3 MasterclassAmazon S3 Masterclass
Amazon S3 Masterclass
 
Continuous Data Replication into Cloud Storage with Oracle GoldenGate
Continuous Data Replication into Cloud Storage with Oracle GoldenGateContinuous Data Replication into Cloud Storage with Oracle GoldenGate
Continuous Data Replication into Cloud Storage with Oracle GoldenGate
 
Whoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
Whoops, The Numbers Are Wrong! Scaling Data Quality @ NetflixWhoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
Whoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
 
MariaDB Galera Cluster
MariaDB Galera ClusterMariaDB Galera Cluster
MariaDB Galera Cluster
 
Data Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the CloudData Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the Cloud
 
AWS Cloud Security
AWS Cloud SecurityAWS Cloud Security
AWS Cloud Security
 
Understand AWS Pricing
Understand AWS PricingUnderstand AWS Pricing
Understand AWS Pricing
 
AWS S3 and GLACIER
AWS S3 and GLACIERAWS S3 and GLACIER
AWS S3 and GLACIER
 
GDPR compliance application architecture and implementation using Hadoop and ...
GDPR compliance application architecture and implementation using Hadoop and ...GDPR compliance application architecture and implementation using Hadoop and ...
GDPR compliance application architecture and implementation using Hadoop and ...
 
Module 2: Core AWS Compute and Storage Services - Virtual AWSome Day June 2018
Module 2: Core AWS Compute and Storage Services - Virtual AWSome Day June 2018Module 2: Core AWS Compute and Storage Services - Virtual AWSome Day June 2018
Module 2: Core AWS Compute and Storage Services - Virtual AWSome Day June 2018
 
SRV308 Deep Dive on Amazon Aurora
SRV308 Deep Dive on Amazon AuroraSRV308 Deep Dive on Amazon Aurora
SRV308 Deep Dive on Amazon Aurora
 
Réseaux de stockage
Réseaux de stockageRéseaux de stockage
Réseaux de stockage
 
Security and Multi-Tenancy with Apache Pulsar in Yahoo! (Verizon Media) - Pul...
Security and Multi-Tenancy with Apache Pulsar in Yahoo! (Verizon Media) - Pul...Security and Multi-Tenancy with Apache Pulsar in Yahoo! (Verizon Media) - Pul...
Security and Multi-Tenancy with Apache Pulsar in Yahoo! (Verizon Media) - Pul...
 
Hybrid cloud : why and how to connect your datacenters to OVHcloud ?
Hybrid cloud : why and how to connect your datacenters to OVHcloud ? Hybrid cloud : why and how to connect your datacenters to OVHcloud ?
Hybrid cloud : why and how to connect your datacenters to OVHcloud ?
 

Ähnlich wie Distributed Database Architecture for GDPR

Ähnlich wie Distributed Database Architecture for GDPR (20)

YugaByte DB - "Designing a Distributed Database Architecture for GDPR Complia...
YugaByte DB - "Designing a Distributed Database Architecture for GDPR Complia...YugaByte DB - "Designing a Distributed Database Architecture for GDPR Complia...
YugaByte DB - "Designing a Distributed Database Architecture for GDPR Complia...
 
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by YugabyteA Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
 
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by YugabyteA Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
A Planet-Scale Database for Low Latency Transactional Apps by Yugabyte
 
Cisco Connect Toronto 2018 an introduction to Cisco kinetic
Cisco Connect Toronto 2018   an introduction to Cisco kineticCisco Connect Toronto 2018   an introduction to Cisco kinetic
Cisco Connect Toronto 2018 an introduction to Cisco kinetic
 
Cisco Connect Toronto 2018 an introduction to Cisco kinetic
Cisco Connect Toronto 2018   an introduction to Cisco kineticCisco Connect Toronto 2018   an introduction to Cisco kinetic
Cisco Connect Toronto 2018 an introduction to Cisco kinetic
 
PayPal Notebooks at Jupytercon 2018
PayPal Notebooks at Jupytercon 2018PayPal Notebooks at Jupytercon 2018
PayPal Notebooks at Jupytercon 2018
 
Managing Biomedical Data and Metadata in Large Scale Collaborations
Managing Biomedical Data and Metadata in Large Scale CollaborationsManaging Biomedical Data and Metadata in Large Scale Collaborations
Managing Biomedical Data and Metadata in Large Scale Collaborations
 
Webinar: Three Reasons Storage Security is Failing and How to Fix It
Webinar: Three Reasons Storage Security is Failing and How to Fix ItWebinar: Three Reasons Storage Security is Failing and How to Fix It
Webinar: Three Reasons Storage Security is Failing and How to Fix It
 
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
 
Zero to Snowflake Presentation
Zero to Snowflake Presentation Zero to Snowflake Presentation
Zero to Snowflake Presentation
 
Embedded-ml(ai)applications - Bjoern Staender
Embedded-ml(ai)applications - Bjoern StaenderEmbedded-ml(ai)applications - Bjoern Staender
Embedded-ml(ai)applications - Bjoern Staender
 
[AIIM18] Automation and Integration (Not Rip and Replace) - Stephen Ludlow
[AIIM18] Automation and Integration (Not Rip and Replace) - Stephen Ludlow[AIIM18] Automation and Integration (Not Rip and Replace) - Stephen Ludlow
[AIIM18] Automation and Integration (Not Rip and Replace) - Stephen Ludlow
 
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud RealitiesWebinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
 
How to scale MongoDB
How to scale MongoDBHow to scale MongoDB
How to scale MongoDB
 
YugaByte + PKS CloudFoundry Meetup 10/15/2018
YugaByte + PKS CloudFoundry Meetup 10/15/2018YugaByte + PKS CloudFoundry Meetup 10/15/2018
YugaByte + PKS CloudFoundry Meetup 10/15/2018
 
Datenvirtualisierung: Wie Sie Ihre Datenarchitektur agiler machen (German)
Datenvirtualisierung: Wie Sie Ihre Datenarchitektur agiler machen (German)Datenvirtualisierung: Wie Sie Ihre Datenarchitektur agiler machen (German)
Datenvirtualisierung: Wie Sie Ihre Datenarchitektur agiler machen (German)
 
Metadata Strategies
Metadata StrategiesMetadata Strategies
Metadata Strategies
 
Application Security Logging with Splunk using Java
Application Security Logging with Splunk using JavaApplication Security Logging with Splunk using Java
Application Security Logging with Splunk using Java
 
Modern data integration expert sessions
Modern data integration expert sessionsModern data integration expert sessions
Modern data integration expert sessions
 
Modern Data Integration Expert Session Webinar
Modern Data Integration Expert Session Webinar Modern Data Integration Expert Session Webinar
Modern Data Integration Expert Session Webinar
 

Mehr von Yugabyte

Mehr von Yugabyte (7)

Distributed SQL Databases Deconstructed
Distributed SQL Databases DeconstructedDistributed SQL Databases Deconstructed
Distributed SQL Databases Deconstructed
 
Running Stateful Apps on Kubernetes
Running Stateful Apps on KubernetesRunning Stateful Apps on Kubernetes
Running Stateful Apps on Kubernetes
 
How YugaByte DB Implements Distributed PostgreSQL
How YugaByte DB Implements Distributed PostgreSQLHow YugaByte DB Implements Distributed PostgreSQL
How YugaByte DB Implements Distributed PostgreSQL
 
YugaByte DB on Kubernetes - An Introduction
YugaByte DB on Kubernetes - An IntroductionYugaByte DB on Kubernetes - An Introduction
YugaByte DB on Kubernetes - An Introduction
 
YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions
 
Scale Transactional Apps Across Multiple Regions with Low Latency
Scale Transactional Apps Across Multiple Regions with Low LatencyScale Transactional Apps Across Multiple Regions with Low Latency
Scale Transactional Apps Across Multiple Regions with Low Latency
 
Demystifying Kubernetes Statefulsets
Demystifying Kubernetes StatefulsetsDemystifying Kubernetes Statefulsets
Demystifying Kubernetes Statefulsets
 

Kürzlich hochgeladen

AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
VictorSzoltysek
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
shinachiaurasa2
 

Kürzlich hochgeladen (20)

%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxBUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
Pharm-D Biostatistics and Research methodology
Pharm-D Biostatistics and Research methodologyPharm-D Biostatistics and Research methodology
Pharm-D Biostatistics and Research methodology
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 

Distributed Database Architecture for GDPR

  • 1. 1© 2018 All rights reserved. Distributed Database Architecture for GDPR Karthik Ranganathan PostgresConf Silicon Valley Oct 15, 2018
  • 2. 2© 2018 All rights reserved. About Us Kannan Muthukkaruppan, CEO Nutanix ♦ Facebook ♦ Oracle IIT-Madras, University of California-Berkeley Karthik Ranganathan, CTO Nutanix ♦ Facebook ♦ Microsoft IIT-Madras, University of Texas-Austin Mikhail Bautin, Software Architect ClearStory Data ♦ Facebook ♦ D.E.Shaw Nizhny Novgorod State University, Stony Brook  Founded Feb 2016  Apache HBase committers and early engineers on Apache Cassandra  Built Facebook’s NoSQL platform powered by Apache HBase  Scaled the platform to serve many mission-critical use cases • Facebook Messages (Messenger) • Operational Data Store (Time series Data)  Reassembled the same Facebook team at YugaByte along with engineers from Oracle, Google, Nutanix and LinkedIn Founders
  • 3. 3© 2018 All rights reserved. WHAT IS YUGABYTE DB?
  • 4. 4© 2018 All rights reserved. A transactional, planet-scale database for building high-performance cloud services.
  • 5. 5© 2018 All rights reserved. NoSQL + SQL Cloud Native
  • 6. 6© 2018 All rights reserved. TRANSACTIONAL PLANET-SCALEHIGH PERFORMANCE Single Shard & Distributed ACID Txns Document-Based, Strongly Consistent Storage Low Latency, Tunable Reads High Throughput OPEN SOURCE Apache 2.0 Popular APIs Extended Apache Cassandra, Redis and PostgreSQL (BETA) Auto Sharding & Rebalancing Global Data Distribution Design Principles CLOUD NATIVE Built For The Container Era Self-Healing, Fault-Tolerant
  • 7. 7© 2018 All rights reserved. WHAT IS GDPR?
  • 8. 8© 2018 All rights reserved. GDPR : General Data Protection Regulation
  • 9. 9© 2018 All rights reserved. Citizens of EU can control sharing and protection of their personal data by businesses.
  • 10. 10© 2018 All rights reserved. Personal Data, also called PII (Personally Identifiable Information) • User name • Email address • Date of birth • Bank details • Location details • Computer IP address
  • 11. 11© 2018 All rights reserved. Control over personal data • Consent & data location • Data privacy and safety • Right to be forgotten • Data access on demand • Notify on data breach • Data portability • Ability to fix errors in data • Restrict processing Database concerns Application concerns
  • 12. 12© 2018 All rights reserved. #1 USER CONSENT AND DATA LOCATION
  • 13. 13© 2018 All rights reserved. Data must be stored in EU by default. Businesses need explicit user consent to move it outside.
  • 14. 14© 2018 All rights reserved. Why is this hard? • EU user data lives in that region • Other countries have compliance regulation – more geo’s • Public clouds may not have coverage – hybrid deployments • Architecture depends on data – multiple per service Think Global Deployments first!
  • 15. 15© 2018 All rights reserved. Example – online ecommerce site • Products table needs globally replication – not PII data
  • 16. 16© 2018 All rights reserved. Read Replicas Global Replication Non-PII Data Global Replication with YugaByte DB
  • 17. 17© 2018 All rights reserved. Example – online ecommerce site • Users, orders and shipments needs locality – PII data • Product locations table needs scale – may be PII
  • 18. 18© 2018 All rights reserved. Primary Data in EU PII Data Non-EU Data Non-EU Data Geo-Partitioning with YugaByte DB
  • 19. 19© 2018 All rights reserved. Replicate data on demand to other geo’s • User may be ok with replicating data • Read replicas on demand (for remote, low-latency reads) • Change data capture (for analytics)
  • 20. 20© 2018 All rights reserved. Read Replicas Primary Data in EU PII Data with YugaByte DB Read Replicas with YugaByte DB
  • 21. 21© 2018 All rights reserved. #2 DATA PRIVACY AND SAFETY
  • 22. 22© 2018 All rights reserved. Data must be secured by using best practices by default. Users need to be notified on breach.
  • 23. 23© 2018 All rights reserved. Implement end-to-end encryption on day #1
  • 24. 24© 2018 All rights reserved. • Use TLS Encryption • Between client and server for app interaction • Between database servers for replication Encrypt All Network Communication
  • 25. 25© 2018 All rights reserved. TLS Encryption Database Cluster User Server to server communication
  • 26. 26© 2018 All rights reserved. • Encryption at rest • Integrate with external Key Management Systems • Ability to rotate keys on demand Encryption All Storage Have a key-value table with id to cipher key. Encrypt PII data with the cipher key for fine-grained control. More in the next section.
  • 27. 27© 2018 All rights reserved. Encryption at Rest Database Cluster User Encryption on disk Key Management Service
  • 28. 28© 2018 All rights reserved. #3 RIGHT TO BE FORGOTTEN
  • 29. 29© 2018 All rights reserved. Data must be erased if on explicit request or when data is no longer relevant to original intent.
  • 30. 30© 2018 All rights reserved. • Have a key-value table with id to cipher key • Encrypt PII data with the cipher key on write • Decrypt PII data on access • Delete cipher key to forget PII data Use Encryption of Data Attributes
  • 31. 31© 2018 All rights reserved. SET email=foo@bar.com FOR USER ID=XXX Example - Storing User Profile Data SET email=ENCRYPTED FOR USER ID=XXX Get encryption key for user Encryption PII Data Store encrypted data • Reads require decryption • Data not accessible without key
  • 32. 32© 2018 All rights reserved. • Many cases where value not needed • Anonymize PII data with one way hash functions • Use hashed ids for in data warehouse • There is no PII data if hashed ids are used! Use Anonymization of Data Attributes
  • 33. 33© 2018 All rights reserved. USER=foo@bar.com CHECKED OUT PRODUCT=X, CATEGORY=Gadget Example – Website Analytics USER=HASHED_VAL CHECKED OUT PRODUCT=X, CATEGORY=Gadget One-way hash user id Analytics
  • 34. 34© 2018 All rights reserved. Example – Website Analytics • User no longer identifiable • Hashed data still useful!
  • 35. 35© 2018 All rights reserved. #4 DATA ACCESS ON DEMAND
  • 36. 36© 2018 All rights reserved. Ability to inform a user about what data is being used, for what purpose and where it is stored.
  • 37. 37© 2018 All rights reserved. • Store in a separate information architecture table • Make tagging a part of the process • Easy to find what PII data is stored on demand Tag Tables and Columns with PII
  • 38. 38© 2018 All rights reserved. • Ensure PII are encrypted • Ensure non-PII columns do not have sensitive data • Use Spark/Presto to perform scan periodically • Run scan on a read replica to not impact production Run Continuous Compliance Checks
  • 39. 39© 2018 All rights reserved. Ensure PII columns are encrypted Ensure no PII data in other columns Tag PII Columns
  • 40. 40© 2018 All rights reserved. PUTTING IT ALL TOGETHER
  • 41. 41© 2018 All rights reserved. GDPR Reference Architecture Primary Cluster (in EU) Read Replica Clusters (Anywhere in the World) Encrypted Encrypted App clients Encrypted Async Replication Reads & Writes, Encrypted Analytics clients Read only, Encrypted At-Rest Encryption for All Nodes At-Rest Encryption for All Nodes PII Columns Encrypted w/ Cipher Key Tag PII Columns Ensure PII columns are encrypted Ensure no PII data in other columns
  • 42. 42© 2018 All rights reserved.
  • 43. 43© 2018 All rights reserved. Questions? Try it at docs.yugabyte.com/latest/quick-start