Five steps to secure big data
1. Five Steps to Secure Big Data
Ulf Mattsson, CTO
Protegrity
ulf.mattsson AT protegrity.com
2. Ulf Mattsson, CTO Protegrity
20 years with IBM
• Research & Development & Global Services
Inventor
• Encryption, Tokenization & Intrusion Prevention
Involvement
• PCI Security Standards Council (PCI SSC)
• American National Standards Institute (ANSI) X9
• Encryption & Tokenization
• International Federation for Information Processing
• IFIP WG 11.3 Data and Application Security
• ISACA New York Metro chapter
4. What is Big Data?
Hadoop
• Designed to handle the emerging “4 V’s”
• Massively Parallel Processing (MPP)
• Elastic scale
• Usually Read-Only
• Allows for data insights on massive, heterogeneous
data sets
• Includes an ecosystem of components:
Hive
Pig
Other
Application Layers
MapReduce
HDFS
Storage Layers
Physical Storage
8. Many Ways to Hack Big Data
BI Reporting
RDBMS
Hackers
Pig (Data Flow)
Hive (SQL)
Sqoop
Unvetted
Applications
Or
Ad Hoc
Processes
MapReduce
(Job Scheduling/Execution System)
Hbase (Column DB)
HDFS
(Hadoop Distributed File System)
Source: http://nosql.mypopescu.com/post/1473423255/apache-hadoop-and-hbase
Avro (Serialization)
Zookeeper (Coordination)
ETL Tools
Privileged
Users
9. Current Data Security for Big Data
Authentication
• Who am I and how do I prove it?
• Ensure the identity of the users, services and hosts that make up and use the system is authoritatively known
Authorization
• What am I allowed to see and do?
• Ensure services and data are accessed only by entitled identities
Data Protection
• How is my data being protected?
• Ensure data cannot be usefully stolen or undetectably tampered with
Auditing
• What have I attempted to do or done?
• Ensure a permanent record of who did what, when
11. Achieving Best Data Security for Big Data
Massively Scalable Data Security
Maximum Transparency
Maximum Performance
Easy to Use
Heterogeneous System Compatibility
Enterprise Ready
12. Many Layers of Defense
Corporate Enterprise
Kerberos Authentication
Encrypted Communications
Big Data
Corporate Firewall
Authorization through ACLs
Fine Grained
Big Data Cluster
Data Security Policy
Protegrity
Coarse Grained
13. Protecting the Big Data Ecosystem
BI Applications
BI Applications are authorized to access
sensitive data through the policy.
Data Access Framework
Pig
Hive
Data Processing Framework
(MapReduce)
Data Storage Framework
(HDFS)
User Defined Functions (UDFs) enable
Field Level data protection with Policy
based access controls with Monitoring.
Java API enables Field Level data
protection with Policy based access
controls with Monitoring.
File level data protection with Policy
based access controls for existing and
new data.
Volume or File Encryption with Policy
based access controls at the OS file
system level.
15. File Based Encryption Example
Files with personally identifiable information
Stored in Hadoop cluster
Root user logged-in to one of the nodes
Search for sensitive information on disk
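The last step of this demo, searching the disk for sensitive information, could look roughly like the sketch below. It is an illustration only: the function names are hypothetical, and the card-number pattern plus Luhn check is an assumed way of recognizing card data, not the tool used in the demo. On an unprotected file the scan finds the card number; on a file protected with file encryption the raw bytes are ciphertext, so the same scan should find nothing useful.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or dashes
# (an assumed pattern for payment card numbers).
CARD_PATTERN = re.compile(rb"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(raw: bytes) -> list[str]:
    """Scan raw bytes (for example a block read from disk) for card-like numbers."""
    hits = []
    for match in CARD_PATTERN.finditer(raw):
        digits = re.sub(rb"[ -]", b"", match.group()).decode("ascii")
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

A privileged user could run such a scan over any file they can read, which is why access controls alone do not protect data at rest.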
17. Fine Grained Protection: Field Protection
Production Systems
Encryption
• Reversible
• Policy Control (Authorized / Unauthorized Access)
• Lacks Integration Transparency
• Complex Key Management
• Example !@#$%a^.,mhu7///&*B()_+!@
Tokenization / Pseudonymization
• Reversible
• Policy Control (Authorized / Unauthorized Access)
• Integrates Transparently
• No Complex Key Management
• Business Intelligence Credit Card: 0389 3778 3652 0038
Non-Production Systems
Masking
• Not reversible
• No Policy, Everyone Can Access the Data
• Integrates Transparently
• No Complex Key Management
• Example 0389 3778 3652 0038
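The difference between reversible tokenization and non-reversible masking can be sketched as follows. This is a toy under stated assumptions: `TokenVault`, `mask`, and the role names are hypothetical, and the in-memory lookup table is a simple stand-in technique, not Protegrity's tokenization method. Both outputs keep the format of the input, which is what lets them integrate transparently.

```python
import secrets


def _random_digits_like(value: str) -> str:
    """Replace every digit with a random digit; keep spaces and separators."""
    return "".join(
        secrets.choice("0123456789") if ch.isdigit() else ch for ch in value
    )


class TokenVault:
    """Toy in-memory token vault (hypothetical, not a production design)."""

    def __init__(self, authorized_roles: set[str]) -> None:
        self._authorized_roles = authorized_roles
        self._token_to_value: dict[str, str] = {}
        self._value_to_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        """Return a format-preserving token; the same value gets the same token."""
        if not any(ch.isdigit() for ch in value):
            raise ValueError("value must contain at least one digit")
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = _random_digits_like(value)
        while token in self._token_to_value or token == value:
            token = _random_digits_like(value)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str, role: str) -> str:
        """Reversible, but only for roles the policy authorizes."""
        if role not in self._authorized_roles:
            return token  # unauthorized callers keep seeing the token
        return self._token_to_value.get(token, token)


def mask(value: str) -> str:
    """Masking: no mapping is stored, so the original cannot be recovered."""
    return _random_digits_like(value)
```

For example, `TokenVault({"analyst"}).tokenize("0389 3778 3652 0038")` returns another 16-digit value in the same grouping, while `mask` returns a similar-looking value that nobody can reverse.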
18. Field Level Protection Example
Files with personally identifiable information
Loaded into a Hive table
Select data from that table
Root user logged-in to one of the nodes
Search for sensitive information on disk
20. Policy Based Access Control
Combination of what data needs to be protected and who has access to that data is the key to creating a meaningful policy
What
What is the sensitive data that needs to be protected. Data Element.
Who
Who should have access to sensitive data and who should not. Security access control. Roles & Members.
21. Protegrity Data Security Policy
What
What is the sensitive data that needs to be protected. Data
Element.
How
How you want to protect and present sensitive data. There are
several methods for protecting sensitive data. Encryption,
tokenization, monitoring, etc.
Who
Who should have access to sensitive data and who
should not. Security access control. Roles &
Members.
When
When should sensitive data access be granted to those
who have access. Day of week, time of day.
Where
Where is the sensitive data stored? This will be
where the policy is enforced. At the protector.
Audit
Audit authorized or un-authorized access to sensitive
data. Optional audit of protect/unprotect.
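The six policy dimensions above map naturally onto a small data structure. The sketch below is an assumption about how such a policy might be modeled, not Protegrity's policy format; the class, field, and function names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class DataSecurityPolicy:
    data_element: str                 # What: the sensitive data element
    protection_method: str            # How: e.g. "tokenization" or "encryption"
    authorized_roles: frozenset[str]  # Who: roles allowed to see clear data
    allowed_weekdays: frozenset[int]  # When: 0 = Monday ... 6 = Sunday
    allowed_from: time                # When: start of the daily access window
    allowed_until: time               # When: end of the daily access window
    enforcement_point: str            # Where: the protector enforcing the policy
    audit_unauthorized: bool = True   # Audit: record denied attempts
    audit_authorized: bool = False    # Audit: optionally record granted access


def access_granted(policy: DataSecurityPolicy, role: str, at: datetime) -> bool:
    """Grant access only if the role (Who) and the time (When) both match."""
    return (
        role in policy.authorized_roles
        and at.weekday() in policy.allowed_weekdays
        and policy.allowed_from <= at.time() <= policy.allowed_until
    )
```

A policy for a credit card element might authorize only a data scientist role on weekdays during business hours, with every denied attempt written to the audit trail.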
22. Policy Based Field Protection Example
Files with personally identifiable information
Loaded into a Hive table
Create a view on that table
Select data as authorized user
Select data as privileged user
24. End to End Data Security Across the Enterprise
Enterprise Heterogeneous Coverage
• File Protectors: AIX, HPUX, Linux, Solaris, Windows
• Database Protectors: DB2, SQL Server, Oracle, Teradata, Informix, Netezza, Greenplum
• Big Data Protectors: BigInsights, Cloudera, Greenplum, MapR, Aster, Apache Hadoop, Hortonworks
• Big Iron Platform: zSeries, HP Non-Stop
25. Best Practices for Protecting Big Data
Start Early
Fine Grained protection
Select the optimal protection for the future
Enterprise coverage
Protection against insider threat
Transparent protection to the analysis process
Policy based protection and audit
26. Five Point Data Protection Methodology
1. Classify
2. Discovery
3. Protect
4. Enforce
5. Monitor
28. Select US Regulations for Security and Privacy
Financial Services
Healthcare and Pharmaceuticals
Infrastructure and Energy
Federal Government
29. 1. Classify: Examples of Sensitive Data
Sensitive Information | Compliance Regulation / Laws
Credit Card Numbers | PCI DSS
Names | HIPAA, State Privacy Laws
Address | HIPAA, State Privacy Laws
Dates | HIPAA, State Privacy Laws
Phone Numbers | HIPAA, State Privacy Laws
Personal ID Numbers | HIPAA, State Privacy Laws
Personally owned property numbers | HIPAA, State Privacy Laws
Personal Characteristics | HIPAA, State Privacy Laws
Asset Information | HIPAA, State Privacy Laws
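A classification step like this can be captured as a lookup from data category to applicable regulations. The sketch below covers a subset of the categories listed above; the keyword hints used to guess a category from a column name are purely hypothetical, and a real classification would rely on much more than column names.

```python
# Categories and regulations taken from the slide above (subset).
REGULATIONS_BY_CATEGORY = {
    "credit_card_number": ["PCI DSS"],
    "name": ["HIPAA", "State Privacy Laws"],
    "address": ["HIPAA", "State Privacy Laws"],
    "date": ["HIPAA", "State Privacy Laws"],
    "phone_number": ["HIPAA", "State Privacy Laws"],
    "personal_id_number": ["HIPAA", "State Privacy Laws"],
}

# Hypothetical keyword hints; the first matching keyword wins.
COLUMN_HINTS = [
    ("card", "credit_card_number"),
    ("name", "name"),
    ("addr", "address"),
    ("date", "date"),
    ("phone", "phone_number"),
    ("ssn", "personal_id_number"),
]


def classify_columns(columns: list[str]) -> dict[str, list[str]]:
    """Map each column that looks sensitive to the regulations that may apply."""
    result = {}
    for column in columns:
        lowered = column.lower()
        for keyword, category in COLUMN_HINTS:
            if keyword in lowered:
                result[column] = REGULATIONS_BY_CATEGORY[category]
                break
    return result
```

Columns that match no hint, such as an order total, are left out of the result and treated as not sensitive in this sketch.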
31. 2. Discovery in a large enterprise with many systems
[Diagram: many separate systems and the Corporate Firewall]
32. 2. Discovery: Determine the context to the Business
[Diagram: systems labeled by business context (Retail, Employees, Corporate IP, Healthcare) and the Corporate Firewall]
33. 2. Discovery: Context to the Business and to Security
Collecting
transactions
Stores &
Ecommerce
Databases
Data Protection
Solution
Requirements
File Server
Hadoop
Applications
File Server
containing IP
Corporate Firewall
Research
Databases
35. Balancing Security and Data Insight
Tug of war between security and data insight
Big Data is designed for access
Privacy regulations require de-identification
Granular data-level protection
Traditional security doesn’t allow for seamless
data use
36. Protection Beyond Kerberos
ETL Tools
BI Reporting
RDBMS
Pig (Data Flow)
Hive (SQL)
Sqoop
MapReduce
(Job Scheduling/Execution System)
API enabled Field level data protection
API enabled Field level data protection
Hbase (Column DB)
HDFS
Field level data protection for existing
and new data.
(Hadoop Distributed File System)
Volume Encryption
38. File Encryption – Authorized User
Entire file is in the
clear when analyzed
MapReduce
HDFS
Protected with
File Encryption
39. File Encryption – Non Authorized User
Entire file is
unreadable when
analyzed
MapReduce
HDFS
Protected with
File Encryption
40. Volume Encryption + Gateway Field Protection
Granular Field
Level Protection
MapReduce
HDFS
Data Protection File
Gateway
Kerberos
Access
Control
Protected with
Volume Encryption
41. Volume Encryption + Internal MapReduce Field Protection
Analytics
Granular Field
Level Protection
MapReduce
Hadoop
Staging
HDFS
MapReduce
Kerberos
Access Control
Protected with
Volume Encryption
42. Enforce
Policies are used to enforce
rules about how sensitive data
should be treated in the
enterprise.
43. A Data Security Policy
What
What is the sensitive data that needs to be protected. Data
Element.
How
How you want to protect and present sensitive data. There are
several methods for protecting sensitive data. Encryption,
tokenization, monitoring, etc.
Who
Who should have access to sensitive data and who should not.
Security access control. Roles & Members.
When
When should sensitive data access be granted to those who
have access. Day of week, time of day.
Where
Where is the sensitive data stored? This will be where the policy
is enforced. At the protector.
Audit
Audit authorized or un-authorized access to sensitive data.
Optional audit of protect/unprotect.
44. Volume Encryption + Field Protection + Policy Enforcement
MapReduce
HDFS
Protected with
Volume Encryption
Data Protection Policy
46. 4. Authorized User Example
Presentation to requestor
Name: Joe Smith
Address: 100 Main Street, Pleasantville, CA
Data Scientist,
Business Analyst
Selected data displayed (least privilege)
Response
Request
Policy
Enforcement
Authorized
Does the requestor have the authority to
access the protected data?
Protection at rest
Name: csu wusoj
Address: 476 srta coetse, cysieondusbak, CA
47. 4. Un-Authorized User Example
Presentation to requestor
Name: csu wusoj
Address: 476 srta coetse, cysieondusbak, CA
Privileged User,
DBA, System
Administrators,
Bad Guy
Response
Request
Policy
Enforcement
Not
Authorized
Does the requestor have the authority to
access the protected data?
Protection at rest
Name: csu wusoj
Address: 476 srta coetse, cysieondusbak, CA
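The two request/response examples above can be sketched as one enforcement function. This is a minimal illustration, assuming a per-field policy of authorized roles; the role names, `FIELD_POLICY`, and the lookup table standing in for the protector's unprotect operation are all hypothetical. The sample values are the ones shown on the slides.

```python
# Stand-in for the protector's unprotect operation (hypothetical lookup table).
PROTECTED_TO_CLEAR = {
    "csu wusoj": "Joe Smith",
    "476 srta coetse, cysieondusbak, CA": "100 Main Street, Pleasantville, CA",
}

# Hypothetical policy: which roles may see each field in the clear.
FIELD_POLICY = {
    "name": {"data_scientist", "business_analyst"},
    "address": {"data_scientist", "business_analyst"},
}


def respond(stored_record: dict[str, str], role: str) -> dict[str, str]:
    """Build the response: clear values only where the policy authorizes the role."""
    response = {}
    for field_name, protected_value in stored_record.items():
        if role in FIELD_POLICY.get(field_name, set()):
            response[field_name] = PROTECTED_TO_CLEAR[protected_value]
        else:
            # Not authorized: the value stays exactly as it is protected at rest.
            response[field_name] = protected_value
    return response
```

With this shape, a DBA or system administrator who queries the record gets the same protected values that sit on disk, while an authorized analyst gets only the fields the policy selects.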
48. Monitor
A critically important part of a
security solution is the ongoing
monitoring of any activity on
sensitive data.
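One simple form of such monitoring is to record every access event and flag users with repeated denied attempts. The sketch below is an assumed, minimal approach: the `AccessEvent` fields, the function name, and the threshold are hypothetical, and real monitoring would look at far more signals.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AccessEvent:
    user: str
    data_element: str
    authorized: bool  # result of the policy decision
    at: datetime


def users_to_review(events: list[AccessEvent], max_denied: int = 3) -> list[str]:
    """Return users whose denied attempts exceed the threshold, sorted by name."""
    denied = Counter(event.user for event in events if not event.authorized)
    return sorted(user for user, count in denied.items() if count > max_denied)
```

Feeding the audit records of protect and unprotect operations into a check like this turns the audit trail into an early warning about abnormal behavior.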
49. Best Practices for Protecting Big Data
Start early
Granular protection
Select the optimal protection
Enterprise coverage
Protection against insider threat
Protect highly sensitive data in a way that is mostly
transparent to the analysis process
Policy based protection
Record data access events
50. How Protegrity Can Help
1. We can help you Classify the sensitive data
2. We can help you Discover where the sensitive data sits
3. We can help you Protect your sensitive data in a flexible way
4. We can help you Enforce policies that enable business functions and keep sensitive data out of the wrong hands.
5. We can help you Monitor sensitive data to gain insights on abnormal behaviors.
51. Protegrity Summary
Proven enterprise data security software and innovation leader
• Sole focus on the protection of data
• Patented Technology, Continuing to Drive Innovation
Cross-industry applicability
• Financial Services, Insurance, Banking
• Healthcare
• Telecommunications, Media and Entertainment
• Retail, Hospitality, Travel and Transportation
• Manufacturing and Government
52. Please contact us for more information
Ulf.Mattsson@protegrity.com
Info@protegrity.com