SlideShare ist ein Scribd-Unternehmen logo
1 von 32
Downloaden Sie, um offline zu lesen
© Hortonworks Inc. 2017
Keeping Your Enterprise’s
Big Data Secure
Nov 2017
Page 1
Owen O’Malley
owen@hortonworks.com
@owen_omalley
© Hortonworks Inc. 2017
Security
Page 2
• What are the important threats?
• How to minimize attack surfaces?
• How do we control who has access?
• What data do we have?
© Hortonworks Inc. 2017
Threat: Accidental File Deletion
Page 3
• Need HDFS permissions
• Trash needs to be enabled too!
• Accountability
• Audit log is very important
• Avoid group accounts
© Hortonworks Inc. 2017
Threat: Accidental Killing Tasks
Page 4
• Need Linux Container Executor
• Small setuid executable
• Enables isolation between users
• Also provides local file protection
• In the works, Docker containers
• Stronger task killing
• Does not require user accounts
© Hortonworks Inc. 2017
Threat: Pretending to be a User
Page 5
• Need Kerberos
• Provides mutual authentication
• User logs in to get ticket
• Delegation tokens authenticate jobs
• Good for 1 day, renewable for 7
• Renewed and cancelled by YARN
© Hortonworks Inc. 2017
Threat: User accesses private data
Page 6
• Need Kerberos
• HDFS permissions are critical
• ACLs add additional flexibility
• Define user-to-group mapping
• Need group directories
• Minimize user or global spaces
© Hortonworks Inc. 2017
Threat: Pretending to be a Service
Page 7
• Need Kerberos integration
• Each service needs a keytab
• Hadoop’s UserGroupInformation
• loginFromKeytab
• When you are testing, test for longer
than 24 hours
© Hortonworks Inc. 2017
Threat: Remote Access
Page 8
• Need Service Level Authorization
• Which users can use each service
• Apache Ranger simplifies this
• Need firewall around cluster
• Never let users log in to masters
© Hortonworks Inc. 2017
Security Architecture
Page 9
© Hortonworks Inc. 2017
Threat: Eavesdropping Inside Data Center
Page 10
• Super user can watch network traffic
• Need wire encryption
• RPC encryption
• Data transfer
• MapReduce Shuffle
• Very expensive
• Generally not recommended
© Hortonworks Inc. 2017
Threat: Eavesdropping Outside Data Center
Page 11
• Intranet hard to control
• Need HTTPS encryption for outside
• Set up SSL certificates
• Create a master certificate to sign
all of the others.
• Have users add master certificate
to their browsers
© Hortonworks Inc. 2017
Threat: Physical access
Page 12
• Very Rare
• Attacker can get to physical box
• Hopeless
• Can remove hard drives
• Includes access to retired drives
• Need raw file system encryption
© Hortonworks Inc. 2017
Threat: Attackers gets Hadoop Admin
Page 13
• Attacker is an Hadoop Admin
• But not root
• Need HDFS Encryption Zones
• Directory sub-tree encryption
• Each file gets unique key
• Client decrypts data
© Hortonworks Inc. 2017
HDFS Encryption
Page 14
• Each zone has master key
• Each file gets unique sub-key
• HDFS stores sub-key encrypted with
master key
• Client uses sub-key to decrypt file
• Noticeable performance impact
© Hortonworks Inc. 2017
KeyProvider API
Page 15
• Key management
• Allows 3rd
party plugins
• Named keys
• Key versions and key rolling
• Ranger provides a Key Management
Server
© Hortonworks Inc. 2017
Ranger KMSRanger KMS
Transparent Data Encryption in HDFS
NNNN
A B
C D
HDFS
Client
HDFS
Client
A B
C D
A B
C D
DN DN DN
Benefits
 Selective encryption of relevant
files/folders
 Fine grained access controls
 Transparent to end application w/o
changes
 Ranger KMS integrated to external HSM
(Safenet Luna) adding to reliability/security
of KMS
SafeNet-
Luna HSM
© Hortonworks Inc. 2017
Threat: Sensitive Tabular Data
Page 17
• Need to protect sensitive columns
• Example - Personal information
• Restrict rows based on filters
• Example – Different Regions
• HDFS permissions don’t work
• Need Hive Server 2
© Hortonworks Inc. 2017
Hive Architecture with Storage-Based Auth
Page 18
© Hortonworks Inc. 2017
Hive Architecture with Hive Server 2
Page 19
© Hortonworks Inc. 2017
Configuring Hive Server 2
Page 20
• Configure Ranger security policies
• Or use Hive SQL Standard Auth
• Restrict permissions in HDFS
• Only Hive has access
• All normal users use Hive Server 2
© Hortonworks Inc. 2017
Attribute-Based Access Control (ABAC)
Page 21
• Ranger policies allow you to
control access with ABAC.
• Combination of the subject,
action, resource, and
environment
• Uses descriptive attributes: AD
group, Atlas-based tags or
classifications, geo-location, etc.
• Consistent with NIST 800-162
• Avoid role proliferation and
manageability issues
© Hortonworks Inc. 2017
Data Masking and Anonymization
Page 22
• Some users need limited access
• Very difficult problem
• Ranger’s dynamic masking
• Redact – Replace character sets
• Hash – Replace with hash
• Tokenization – Replace with ID
• User defined
© Hortonworks Inc. 2017
ORC Column Encryption
Page 23
• Encrypts just the sensitive columns
• Columns may use different keys
• Uses Ranger KMS
• Local key for each column & file
• Unencrypted data can be statically
masked
• Follow progress in ORC-14.
© Hortonworks Inc. 2017
Tabular Access from Spark
Page 24
• Require fine-grain access control
• Hive has created LLAP
• Live, Long & Process
• Caches hot columns and partitions
• Runs sub-query on daemons
• Created an LLAP Context for Spark
© Hortonworks Inc. 2017
Apache Atlas – Data Governance
Page 25
• As data and clusters grow need to
track the data.
• Atlas provides Data Governance
• What data do you have?
• Where did it come from?
• Extensible data modelling
• Integrates with other system
© Hortonworks Inc. 2017
Apache Atlas: Lineage and Impact
© Hortonworks Inc. 2017
Apache Atlas: Classification
• Categorize and curate data assets for easier discovery
• Associate context with data– Governance, Security, Business
GOVERNANCEGOVERNANCE
SECURITYSECURITY
BUSINESSBUSINESS
Metadata Catalog Search : Basic
Search for a hive_table classified as ‘PII’ and name starting with ‘prov’
Filter by
Data Asset type
Filter by
Data Asset type
Filter by
Classification
Filter by
Classification
Search text
Wildcards: prov*, *sum*
Logical expressions: prov* AND *sum*
Search text
Wildcards: prov*, *sum*
Logical expressions: prov* AND *sum*
Metadata Catalog Search : Advanced
Filter by
Data asset type
Filter by
Data asset type
Search for a hive_table named ‘employees’ and owner ‘hive’
DSL search with SQL like syntax
Select columns from impressions table in raw database
hive_column where table.name=‘impressions’ and table.db.name
= ‘raw’
DSL query
string
DSL query
string
© Hortonworks Inc. 2017
Key Take Aways
Page 30
• Think about security in terms of
threats.
• Think holistically about security.
• Consider encryption and masking.
• Create Data Catalog with Atlas.
• Identify and classify data.
• Understand data propagation.
© Hortonworks Inc. 2017
Thank You!
Page 31
Owen O’Malley
@owen_omalley
owen@hortonworks.com

Weitere ähnliche Inhalte

Was ist angesagt?

Using Hadoop to Drive Down Fraud for Telcos
Using Hadoop to Drive Down Fraud for TelcosUsing Hadoop to Drive Down Fraud for Telcos
Using Hadoop to Drive Down Fraud for TelcosCloudera, Inc.
 
Open Source Security Tools for Big Data
Open Source Security Tools for Big DataOpen Source Security Tools for Big Data
Open Source Security Tools for Big DataRommel Garcia
 
Achieving cyber mission assurance with near real-time impact
Achieving cyber mission assurance with near real-time impactAchieving cyber mission assurance with near real-time impact
Achieving cyber mission assurance with near real-time impactElasticsearch
 
Nl HUG 2016 Feb Hadoop security from the trenches
Nl HUG 2016 Feb Hadoop security from the trenchesNl HUG 2016 Feb Hadoop security from the trenches
Nl HUG 2016 Feb Hadoop security from the trenchesBolke de Bruin
 
Better Search and Business Analytics at Southern Glazer’s Wine & Spirits
Better Search and Business Analytics at Southern Glazer’s Wine & SpiritsBetter Search and Business Analytics at Southern Glazer’s Wine & Spirits
Better Search and Business Analytics at Southern Glazer’s Wine & SpiritsElasticsearch
 
Apache Knox - Hadoop Security Swiss Army Knife
Apache Knox - Hadoop Security Swiss Army KnifeApache Knox - Hadoop Security Swiss Army Knife
Apache Knox - Hadoop Security Swiss Army KnifeDataWorks Summit
 
Encryption and Masking for Sensitive Apache Spark Analytics Addressing CCPA a...
Encryption and Masking for Sensitive Apache Spark Analytics Addressing CCPA a...Encryption and Masking for Sensitive Apache Spark Analytics Addressing CCPA a...
Encryption and Masking for Sensitive Apache Spark Analytics Addressing CCPA a...Databricks
 
Cloud native policy enforcement with Open Policy Agent
Cloud native policy enforcement with Open Policy AgentCloud native policy enforcement with Open Policy Agent
Cloud native policy enforcement with Open Policy AgentLibbySchulze
 
Turning Evidence into Insights: How NCIS Leverages Elastic
Turning Evidence into Insights: How NCIS Leverages Elastic Turning Evidence into Insights: How NCIS Leverages Elastic
Turning Evidence into Insights: How NCIS Leverages Elastic Elasticsearch
 
Empower your security practitioners with the Elastic Stack
Empower your security practitioners with the Elastic StackEmpower your security practitioners with the Elastic Stack
Empower your security practitioners with the Elastic StackElasticsearch
 
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...Spark Summit
 
The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014Cloudera, Inc.
 
Securing Data in Hybrid on-premise and Cloud Environments Using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments Using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments Using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments Using Apache RangerDataWorks Summit
 
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStaxWebinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStaxDataStax
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...DataWorks Summit
 
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...DataWorks Summit
 
Lightning Fast Analytics with Hive LLAP and Druid
Lightning Fast Analytics with Hive LLAP and DruidLightning Fast Analytics with Hive LLAP and Druid
Lightning Fast Analytics with Hive LLAP and DruidDataWorks Summit
 
RedisConf18 - Redis Enterprise on Cloud Native Platforms
RedisConf18 - Redis Enterprise on Cloud  Native  Platforms RedisConf18 - Redis Enterprise on Cloud  Native  Platforms
RedisConf18 - Redis Enterprise on Cloud Native Platforms Redis Labs
 
Reporting from the Trenches: Intuit & Cassandra
Reporting from the Trenches: Intuit & CassandraReporting from the Trenches: Intuit & Cassandra
Reporting from the Trenches: Intuit & CassandraDataStax
 

Was ist angesagt? (20)

Using Hadoop to Drive Down Fraud for Telcos
Using Hadoop to Drive Down Fraud for TelcosUsing Hadoop to Drive Down Fraud for Telcos
Using Hadoop to Drive Down Fraud for Telcos
 
Open Source Security Tools for Big Data
Open Source Security Tools for Big DataOpen Source Security Tools for Big Data
Open Source Security Tools for Big Data
 
Achieving cyber mission assurance with near real-time impact
Achieving cyber mission assurance with near real-time impactAchieving cyber mission assurance with near real-time impact
Achieving cyber mission assurance with near real-time impact
 
Nl HUG 2016 Feb Hadoop security from the trenches
Nl HUG 2016 Feb Hadoop security from the trenchesNl HUG 2016 Feb Hadoop security from the trenches
Nl HUG 2016 Feb Hadoop security from the trenches
 
Better Search and Business Analytics at Southern Glazer’s Wine & Spirits
Better Search and Business Analytics at Southern Glazer’s Wine & SpiritsBetter Search and Business Analytics at Southern Glazer’s Wine & Spirits
Better Search and Business Analytics at Southern Glazer’s Wine & Spirits
 
Apache Knox - Hadoop Security Swiss Army Knife
Apache Knox - Hadoop Security Swiss Army KnifeApache Knox - Hadoop Security Swiss Army Knife
Apache Knox - Hadoop Security Swiss Army Knife
 
Encryption and Masking for Sensitive Apache Spark Analytics Addressing CCPA a...
Encryption and Masking for Sensitive Apache Spark Analytics Addressing CCPA a...Encryption and Masking for Sensitive Apache Spark Analytics Addressing CCPA a...
Encryption and Masking for Sensitive Apache Spark Analytics Addressing CCPA a...
 
Cloud native policy enforcement with Open Policy Agent
Cloud native policy enforcement with Open Policy AgentCloud native policy enforcement with Open Policy Agent
Cloud native policy enforcement with Open Policy Agent
 
Turning Evidence into Insights: How NCIS Leverages Elastic
Turning Evidence into Insights: How NCIS Leverages Elastic Turning Evidence into Insights: How NCIS Leverages Elastic
Turning Evidence into Insights: How NCIS Leverages Elastic
 
Empower your security practitioners with the Elastic Stack
Empower your security practitioners with the Elastic StackEmpower your security practitioners with the Elastic Stack
Empower your security practitioners with the Elastic Stack
 
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
 
The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014
 
Securing Data in Hybrid on-premise and Cloud Environments Using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments Using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments Using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments Using Apache Ranger
 
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStaxWebinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...
 
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
 
Lightning Fast Analytics with Hive LLAP and Druid
Lightning Fast Analytics with Hive LLAP and DruidLightning Fast Analytics with Hive LLAP and Druid
Lightning Fast Analytics with Hive LLAP and Druid
 
RedisConf18 - Redis Enterprise on Cloud Native Platforms
RedisConf18 - Redis Enterprise on Cloud  Native  Platforms RedisConf18 - Redis Enterprise on Cloud  Native  Platforms
RedisConf18 - Redis Enterprise on Cloud Native Platforms
 
HIPAA Compliance in the Cloud
HIPAA Compliance in the CloudHIPAA Compliance in the Cloud
HIPAA Compliance in the Cloud
 
Reporting from the Trenches: Intuit & Cassandra
Reporting from the Trenches: Intuit & CassandraReporting from the Trenches: Intuit & Cassandra
Reporting from the Trenches: Intuit & Cassandra
 

Ähnlich wie Keeping your Enterprise’s Big Data Secure by Owen O’Malley at Big Data Spain 2017

TriHUG October: Apache Ranger
TriHUG October: Apache RangerTriHUG October: Apache Ranger
TriHUG October: Apache Rangertrihug
 
Securing Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextSecuring Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextHellmar Becker
 
Hadoop and Data Access Security
Hadoop and Data Access SecurityHadoop and Data Access Security
Hadoop and Data Access SecurityCloudera, Inc.
 
Securing Hadoop in an Enterprise Context (v2)
Securing Hadoop in an Enterprise Context (v2)Securing Hadoop in an Enterprise Context (v2)
Securing Hadoop in an Enterprise Context (v2)Hellmar Becker
 
2014 sept 4_hadoop_security
2014 sept 4_hadoop_security2014 sept 4_hadoop_security
2014 sept 4_hadoop_securityAdam Muise
 
Fighting cyber fraud with hadoop
Fighting cyber fraud with hadoopFighting cyber fraud with hadoop
Fighting cyber fraud with hadoopNiel Dunnage
 
Hdp security overview
Hdp security overview Hdp security overview
Hdp security overview Hortonworks
 
Project Rhino: Enhancing Data Protection for Hadoop
Project Rhino: Enhancing Data Protection for HadoopProject Rhino: Enhancing Data Protection for Hadoop
Project Rhino: Enhancing Data Protection for HadoopCloudera, Inc.
 
Hadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happyHadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happyDataWorks Summit
 
Hadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happyHadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happyAnurag Shrivastava
 
Protect your Private Data in your Hadoop Clusters with ORC Column Encryption
Protect your Private Data in your Hadoop Clusters with ORC Column EncryptionProtect your Private Data in your Hadoop Clusters with ORC Column Encryption
Protect your Private Data in your Hadoop Clusters with ORC Column EncryptionDataWorks Summit
 
Protect your Private Data in your Hadoop Clusters with ORC Column Encryption
Protect your Private Data in your Hadoop Clusters with ORC Column EncryptionProtect your Private Data in your Hadoop Clusters with ORC Column Encryption
Protect your Private Data in your Hadoop Clusters with ORC Column EncryptionDataWorks Summit
 
Open Source Security Tools for Big Data
Open Source Security Tools for Big DataOpen Source Security Tools for Big Data
Open Source Security Tools for Big DataGreat Wide Open
 
Fine Grain Access Control for Big Data: ORC Column Encryption
Fine Grain Access Control for Big Data: ORC Column EncryptionFine Grain Access Control for Big Data: ORC Column Encryption
Fine Grain Access Control for Big Data: ORC Column EncryptionOwen O'Malley
 
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...VMworld
 
BigData Security - A Point of View
BigData Security - A Point of ViewBigData Security - A Point of View
BigData Security - A Point of ViewKaran Alang
 
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Cloudera, Inc.
 
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by ClouderaBig Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by ClouderaCaserta
 

Ähnlich wie Keeping your Enterprise’s Big Data Secure by Owen O’Malley at Big Data Spain 2017 (20)

TriHUG October: Apache Ranger
TriHUG October: Apache RangerTriHUG October: Apache Ranger
TriHUG October: Apache Ranger
 
Securing Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextSecuring Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise Context
 
Hadoop and Data Access Security
Hadoop and Data Access SecurityHadoop and Data Access Security
Hadoop and Data Access Security
 
Securing Hadoop in an Enterprise Context (v2)
Securing Hadoop in an Enterprise Context (v2)Securing Hadoop in an Enterprise Context (v2)
Securing Hadoop in an Enterprise Context (v2)
 
Securing Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextSecuring Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise Context
 
2014 sept 4_hadoop_security
2014 sept 4_hadoop_security2014 sept 4_hadoop_security
2014 sept 4_hadoop_security
 
Fighting cyber fraud with hadoop
Fighting cyber fraud with hadoopFighting cyber fraud with hadoop
Fighting cyber fraud with hadoop
 
Hdp security overview
Hdp security overview Hdp security overview
Hdp security overview
 
Project Rhino: Enhancing Data Protection for Hadoop
Project Rhino: Enhancing Data Protection for HadoopProject Rhino: Enhancing Data Protection for Hadoop
Project Rhino: Enhancing Data Protection for Hadoop
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
 
Hadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happyHadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happy
 
Hadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happyHadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happy
 
Protect your Private Data in your Hadoop Clusters with ORC Column Encryption
Protect your Private Data in your Hadoop Clusters with ORC Column EncryptionProtect your Private Data in your Hadoop Clusters with ORC Column Encryption
Protect your Private Data in your Hadoop Clusters with ORC Column Encryption
 
Protect your Private Data in your Hadoop Clusters with ORC Column Encryption
Protect your Private Data in your Hadoop Clusters with ORC Column EncryptionProtect your Private Data in your Hadoop Clusters with ORC Column Encryption
Protect your Private Data in your Hadoop Clusters with ORC Column Encryption
 
Open Source Security Tools for Big Data
Open Source Security Tools for Big DataOpen Source Security Tools for Big Data
Open Source Security Tools for Big Data
 
Fine Grain Access Control for Big Data: ORC Column Encryption
Fine Grain Access Control for Big Data: ORC Column EncryptionFine Grain Access Control for Big Data: ORC Column Encryption
Fine Grain Access Control for Big Data: ORC Column Encryption
 
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
 
BigData Security - A Point of View
BigData Security - A Point of ViewBigData Security - A Point of View
BigData Security - A Point of View
 
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
 
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by ClouderaBig Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
 

Mehr von Big Data Spain

Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017Big Data Spain
 
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Big Data Spain
 
AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017Big Data Spain
 
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Big Data Spain
 
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...Big Data Spain
 
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...Big Data Spain
 
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Big Data Spain
 
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...Big Data Spain
 
State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...Big Data Spain
 
Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...Big Data Spain
 
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...Big Data Spain
 
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
 The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a... The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...Big Data Spain
 
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...Big Data Spain
 
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017Big Data Spain
 
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...Big Data Spain
 
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...Big Data Spain
 
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...Big Data Spain
 
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...Big Data Spain
 
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...Big Data Spain
 
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017Big Data Spain
 

Mehr von Big Data Spain (20)

Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
 
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
 
AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017
 
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
 
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
 
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
 
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
 
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
 
State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...
 
Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...
 
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
 
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
 The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a... The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
 
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
 
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
 
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
 
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
 
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
 
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
 
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
 
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
 

Kürzlich hochgeladen

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 

Kürzlich hochgeladen (20)

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

Keeping your Enterprise’s Big Data Secure by Owen O’Malley at Big Data Spain 2017

  • 1.
  • 2. © Hortonworks Inc. 2017 Keeping Your Enterprise’s Big Data Secure Nov 2017 Page 1 Owen O’Malley owen@hortonworks.com @owen_omalley
  • 3. © Hortonworks Inc. 2017 Security Page 2 • What are the important threats? • How to minimize attack surfaces? • How do we control who has access? • What data do we have?
  • 4. © Hortonworks Inc. 2017 Threat: Accidental File Deletion Page 3 • Need HDFS permissions • Trash needs to be enabled too! • Accountability • Audit log is very important • Avoid group accounts
  • 5. © Hortonworks Inc. 2017 Threat: Accidental Killing Tasks Page 4 • Need Linux Container Executor • Small setuid executable • Enables isolation between users • Also provides local file protection • In the works, Docker containers • Stronger task killing • Does not require user accounts
  • 6. © Hortonworks Inc. 2017 Threat: Pretending to be a User Page 5 • Need Kerberos • Provides mutual authentication • User logs in to get ticket • Delegation tokens authenticate jobs • Good for 1 day, renewable for 7 • Renewed and cancelled by YARN
  • 7. © Hortonworks Inc. 2017 Threat: User accesses private data Page 6 • Need Kerberos • HDFS permissions are critical • ACLs add additional flexibility • Define user-to-group mapping • Need group directories • Minimize user or global spaces
  • 8. © Hortonworks Inc. 2017 Threat: Pretending to be a Service Page 7 • Need Kerberos integration • Each service needs a keytab • Hadoop’s UserGroupInformation • loginFromKeytab • When you are testing, test for longer than 24 hours
  • 9. © Hortonworks Inc. 2017 Threat: Remote Access Page 8 • Need Service Level Authorization • Which users can use each service • Apache Ranger simplifies this • Need firewall around cluster • Never let users log in to masters
  • 10. © Hortonworks Inc. 2017 Security Architecture Page 9
  • 11. © Hortonworks Inc. 2017 Threat: Eavesdropping Inside Data Center Page 10 • Super user can watch network traffic • Need wire encryption • RPC encryption • Data transfer • MapReduce Shuffle • Very expensive • Generally not recommended
  • 12. © Hortonworks Inc. 2017 Threat: Eavesdropping Outside Data Center Page 11 • Intranet hard to control • Need HTTPS encryption for outside • Set up SSL certificates • Create a master certificate to sign all of the others. • Have users add master certificate to their browsers
  • 13. © Hortonworks Inc. 2017 Threat: Physical access Page 12 • Very Rare • Attacker can get to physical box • Hopeless • Can remove hard drives • Includes access to retired drives • Need raw file system encryption
  • 14. © Hortonworks Inc. 2017 Threat: Attackers gets Hadoop Admin Page 13 • Attacker is an Hadoop Admin • But not root • Need HDFS Encryption Zones • Directory sub-tree encryption • Each file gets unique key • Client decrypts data
  • 15. © Hortonworks Inc. 2017 HDFS Encryption Page 14 • Each zone has master key • Each file gets unique sub-key • HDFS stores sub-key encrypted with master key • Client uses sub-key to decrypt file • Noticeable performance impact
  • 16. © Hortonworks Inc. 2017 KeyProvider API Page 15 • Key management • Allows 3rd party plugins • Named keys • Key versions and key rolling • Ranger provides a Key Management Server
  • 17. © Hortonworks Inc. 2017 Ranger KMSRanger KMS Transparent Data Encryption in HDFS NNNN A B C D HDFS Client HDFS Client A B C D A B C D DN DN DN Benefits  Selective encryption of relevant files/folders  Fine grained access controls  Transparent to end application w/o changes  Ranger KMS integrated to external HSM (Safenet Luna) adding to reliability/security of KMS SafeNet- Luna HSM
  • 18. © Hortonworks Inc. 2017 Threat: Sensitive Tabular Data Page 17 • Need to protect sensitive columns • Example - Personal information • Restrict rows based on filters • Example – Different Regions • HDFS permissions don’t work • Need Hive Server 2
  • 19. © Hortonworks Inc. 2017 Hive Architecture with Storage-Based Auth Page 18
  • 20. © Hortonworks Inc. 2017 Hive Architecture with Hive Server 2 Page 19
  • 21. © Hortonworks Inc. 2017 Configuring Hive Server 2 Page 20 • Configure Ranger security policies • Or use Hive SQL Standard Auth • Restrict permissions in HDFS • Only Hive has access • All normal users use Hive Server 2
  • 22. © Hortonworks Inc. 2017 Attribute-Based Access Control (ABAC) Page 21 • Ranger policies allow you to control access with ABAC. • Combination of the subject, action, resource, and environment • Uses descriptive attributes: AD group, Atlas-based tags or classifications, geo-location, etc. • Consistent with NIST 800-162 • Avoid role proliferation and manageability issues
  • 23. © Hortonworks Inc. 2017 Data Masking and Anonymization Page 22 • Some users need limited access • Very difficult problem • Ranger’s dynamic masking • Redact – Replace character sets • Hash – Replace with hash • Tokenization – Replace with ID • User defined
  • 24. © Hortonworks Inc. 2017 ORC Column Encryption Page 23 • Encrypts just the sensitive columns • Columns may use different keys • Uses Ranger KMS • Local key for each column & file • Unencrypted data can be statically masked • Follow progress in ORC-14.
  • 25. © Hortonworks Inc. 2017 Tabular Access from Spark Page 24 • Require fine-grain access control • Hive has created LLAP • Live, Long & Process • Caches hot columns and partitions • Runs sub-query on daemons • Created an LLAP Context for Spark
  • 26. © Hortonworks Inc. 2017 Apache Atlas – Data Governance Page 25 • As data and clusters grow need to track the data. • Atlas provides Data Governance • What data do you have? • Where did it come from? • Extensible data modelling • Integrates with other system
  • 27. © Hortonworks Inc. 2017 Apache Atlas: Lineage and Impact
  • 28. © Hortonworks Inc. 2017 Apache Atlas: Classification • Categorize and curate data assets for easier discovery • Associate context with data– Governance, Security, Business GOVERNANCEGOVERNANCE SECURITYSECURITY BUSINESSBUSINESS
  • 29. Metadata Catalog Search : Basic Search for a hive_table classified as ‘PII’ and name starting with ‘prov’ Filter by Data Asset type Filter by Data Asset type Filter by Classification Filter by Classification Search text Wildcards: prov*, *sum* Logical expressions: prov* AND *sum* Search text Wildcards: prov*, *sum* Logical expressions: prov* AND *sum*
  • 30. Metadata Catalog Search : Advanced Filter by Data asset type Filter by Data asset type Search for a hive_table named ‘employees’ and owner ‘hive’ DSL search with SQL like syntax Select columns from impressions table in raw database hive_column where table.name=‘impressions’ and table.db.name = ‘raw’ DSL query string DSL query string
  • 31. © Hortonworks Inc. 2017 Key Take Aways Page 30 • Think about security in terms of threats. • Think holistically about security. • Consider encryption and masking. • Create Data Catalog with Atlas. • Identify and classify data. • Understand data propagation.
  • 32. © Hortonworks Inc. 2017 Thank You! Page 31 Owen O’Malley @owen_omalley owen@hortonworks.com