SlideShare ist ein Scribd-Unternehmen logo
1 von 32
Securing the Hadoop Ecosystem
Hadoop Security and Compliance Challenges
2
• History
• Security was not a priority in early Hadoop adopters like Yahoo!
and Facebook / it is now!
• Data concentration
• Quantity and diversity of data creates compliance challenges
• Flexibility of the Hadoop architecture
• Many paths for data in, out, processing
• Access data at different granularities, from fields to files
• ELT: sensitive data “discovery” occurs after data arrives
Cloudera has led in investments in security
3
Authentication
• First Hadoop distribution to offer strong authentication throughout
Encryption
• First Hadoop distribution to support encryption on wire
Audit
• Only Hadoop distribution to support audit histories for all data objects & access
paths
• Single point for log capture, audit
Authorization
• Founded the Apache Sentry project along with Oracle and Lab41 to manage fine-
grained permissions
Automation
• Cloudera Manager automates security configurations & LDAP/AD integration
Case Study: Finance and Banking
• Identify patterns in financially-sensitive, PCI and PII
data
• Before: Unable to build applications on Hadoop; forced
to use other systems, to greatly limit Hadoop access, or
to forgo analysis due to privacy concerns
• Now: Provide broad analysis capabilities with Impala to
large population and secured by Sentry
Fraud and Purchasing
Behavior Analysis
Enterprise Security in Hadoop overview
5
Four Functional Areas
Hadoop Cluster
Users
Applications Operators
Perimeter
Data
Access
Visibility
Defining the Functional Areas
6
Perimeter
Guarding access to the
cluster itself
Technical Concepts:
Authentication
Network isolation
Data
Protecting data in the
cluster from
unauthorized visibility
Technical Concepts:
Encryption, Tokenization,
Data masking
Access
Defining what users
and applications can do
with data
Technical Concepts:
Permissions
Authorization
Visibility
Reporting on where
data came from and
how it’s being used
Technical Concepts:
Auditing
Lineage
Enabling Enterprise Security
7
Perimeter
Guarding access to the
cluster itself
Technical Concepts:
Authentication
Network isolation
Data
Protecting data in the
cluster from
unauthorized visibility
Technical Concepts:
Encryption, Tokenization,
Data masking
Access
Defining what users
and applications can do
with data
Technical Concepts:
Permissions
Authorization
Visibility
Reporting on where
data came from and
how it’s being used
Technical Concepts:
Auditing
Lineage
SentryKerberos | AD/LDAP Cloudera NavigatorNative | Certified Partners
Enabling Enterprise Security
8
Perimeter
Guarding access to the
cluster itself
Technical Concepts:
Authentication
Network isolation
Data
Protecting data in the
cluster from
unauthorized visibility
Technical Concepts:
Encryption, Tokenization,
Data masking
Access
Defining what users
and applications can do
with data
Technical Concepts:
Permissions
Authorization
Visibility
Reporting on where
data came from and
how it’s being used
Technical Concepts:
Auditing
Lineage
SentryKerberos | AD/LDAP Cloudera NavigatorNative | Certified Partners
Perimeter: Authentication in Hadoop
10
Kerberos
• Provably strong authentication between all
Hadoop services and (optionally) to end-points
• Cloudera Manager hides complexity
LDAP/AD
• Username / password
• Option for Hue, Hive Metastore, Impala
connectors, Cloudera Manager admin logins
SAML
• For Single Sign-On (SSO) for listed options
• Kerberos clients no longer required on most user
end-points
Authentication Options and Coverage
11
HDFS
DN NN
YARN
RM AM
Impala
ID SS
MapReduce
JT TT
… Services …
(Oozie, Search, etc.)
3rd Party
Gateway …
Client
Client
Client
Client
… Applications …
(Pig, Hive, Hue, etc.)
“End-to-End” Kerberos
“Core” Kerberos “Edge” AD/LDAP/SAML
IT Integration: Kerberos
• Users don’t want Yet Another Credential
• Corp IT doesn’t want to provision and maintain thousands of service principals and
keytabs
• Solution: local KDC + one-way trust
• Run MIT Kerberos KDC in the cluster
• Put all service principals here
• Set up one-way trust of central corporate realm by local KDC
• Normal user credentials can be used to access Hadoop
• Recommended: Use Cloudera Manager
• To properly tune inter-related configuration knobs
• To manage principals/keytabs creation and distribution
• To preserve service monitoring with Kerberos security enabled
IT Integration: Kerberos + LDAP
Hadoop Cluster
Local KDC (MIT Kerberos)
hdfs/host1@HADOOP.EXAMPLE.COM
yarn/host2@HADOOP.EXAMPLE.COM
…
Central
Active Directory
user@EXAMPLE.COM …
Cross-realm
trust
NN JT
LDAP group
mapping
Network Access Management
• Use Hue to front-end both Hadoop and Oozie to control access through a web browser
• HTTP proxy servers:
• Oozie : MR jobs, Pig jobs, Hive jobs
• HttpFS: hadoop fs is front-ended over HTTP
• HBase REST server: HBase reads
Secure configuration with Oozie, Hue and HttpFS front-ends co-located to act as network
bridge
Hue supports AD/LDAP based authentication instead of Kerberos for client simplicity
Enabling Enterprise Security
15
Perimeter
Guarding access to the
cluster itself
Technical Concepts:
Authentication
Network isolation
Data
Protecting data in the
cluster from
unauthorized visibility
Technical Concepts:
Encryption, Tokenization,
Data masking
Access
Defining what users
and applications can do
with data
Technical Concepts:
Permissions
Authorization
Visibility
Reporting on where
data came from and
how it’s being used
Technical Concepts:
Auditing
Lineage
SentryKerberos | AD/LDAP Cloudera NavigatorNative | Certified Partners
Data: Protection in Hadoop
16
Data in Motion Data at Rest
“Network Encryption”
• SASL: Network RPC
• SSL: MapReduce shuffle
• SSL: Web-based user and
administration tools
• SSL: JDBC
• HDFS data transfer protocol
“Data Encryption”
• Certified partner solutions
• Field-level encryption
• Data masking or tokenization
• OS-level file system encryption
Enabling Enterprise Security
18
Perimeter
Guarding access to the
cluster itself
Technical Concepts:
Authentication
Network isolation
Data
Protecting data in the
cluster from
unauthorized visibility
Technical Concepts:
Encryption, Tokenization,
Data masking
Access
Defining what users
and applications can do
with data
Technical Concepts:
Permissions
Authorization
Visibility
Reporting on where
data came from and
how it’s being used
Technical Concepts:
Auditing
Lineage
SentryKerberos | AD/LDAP Cloudera NavigatorNative | Certified Partners
Prior State of Authorization
Two Sub-Optimal Choices for SQL on Hadoop
19
• Insecure Advisory Authorization
• Users could grant themselves permissions
• Intended to prevent accidental deletion of data
• Problem: Did not guard against malicious users
• Problem: Only worked with Hive
• HDFS Impersonation
• Data was only protected at the file level by HDFS permissions
• Problem: File-level not granular enough
• Problem: Lacked flexibility; not role-based
Sentry: Key Capabilities
21
Fine-Grained Authorization
• Specify security for SERVERS, DATABASES, TABLES,
VIEWS, and search indices
Role-Based Authorization
• SELECT privilege on views & tables
• INSERT privilege on tables
• TRANSFORM privilege on servers
• ALL privilege on the server, databases, tables & views
• ALL privilege is needed to create/modify schema
Multitenant Administration
• Separate policies for each database/schema
• Can be maintained by separate admins
Sentry Architecture
22
Binding
Layer
Impala
Impala Hive
Policy Engine
Search
Policy Provider
File Database
HiveServer2
Authorization
Provider Evaluation, Validation
Parsing
Interface
Interface
Local FS/HDFS
Search
QueryMR
SQL
Query Execution Flow
23
Parse
Build
Check
Plan
Sentry
Validate SQL grammar
Construct statement tree
Validate statement objects
• First check: Authorization
Forward to execution planner
Multitenant Security
Global
[groups]
admin_group = admin_role
dep1_admin = uri_role
[roles]
admin_role = server=server1
uri_role = hdfs:///ha-nn-uri/data
[databases]
db1 = hdfs://ha-nn-
uri/user/hive/sentry/db1.ini
Per Database
[groups]
dep1_admin = db1_admin_role
dep1_analyst = db1_read_role
[roles]
db1_admin_role = server=server1-
>db=db1
db1_read_role = server=server1-
>db=db1->table=*->action=select
Apache Ecosystem and Sentry
Inline support in Cloudera Impala
Extensibility plug-in for Apache HiveServer2
Inline support in Cloudera Search
Complementary security with HDFS ACLs
Access: Authorization in Hadoop
26
File ACL
Admin RBAC
Data RBAC
• Permission at file-level granularity
• HDFS POSIX-style permissions: u/g/o
• Access Control Lists (ACL)
• HBase, Oozie, MapReduce
• Permissions on tables, views, indices
• Sentry for HiveServer2, Impala, Search
App and Workflow
• Cloudera Manager, Hue
Enabling Enterprise Security
28
Perimeter
Guarding access to the
cluster itself
Technical Concepts:
Authentication
Network isolation
Data
Protecting data in the
cluster from
unauthorized visibility
Technical Concepts:
Encryption, Tokenization,
Data masking
Access
Defining what users
and applications can do
with data
Technical Concepts:
Permissions
Authorization
Visibility
Reporting on where
data came from and
how it’s being used
Technical Concepts:
Auditing
Lineage
SentryKerberos | AD/LDAP Cloudera NavigatorNative | Certified Partners
Visibility: Cloudera Navigator
29
Audit & Access Control
• Maintain full audit history
• Ensuring appropriate
permissions and reporting
on data access for
compliance
Discovery & Exploration
• Finding out what data is
available and what it looks
like
Lineage
• Tracing data back to its
original source
Lifecycle Management
• Migration of data based on
policies
3RD PARTY
APPS
STORAGE FOR ANY TYPE OF DATA
UNIFIED, ELASTIC, RESILIENT, SECURE
CLOUDERA’S ENTERPRISE DATA HUB
BATCH
PROCESSING
MAPREDUCE
ANALYTIC
SQL
IMPALA
SEARCH
ENGINE
SOLR
MACHINE
LEARNING
SPARK
STREAM
PROCESSING
SPARK STREAMING
WORKLOAD MANAGEMENT YARN
FILESYSTEM
HDFS
ONLINE NOSQL
HBASE
DATA
MANAGEMENT
CLOUDERANAVIGATOR
SYSTEM
MANAGEMENT
CLOUDERAMANAGER
SENTRY, SECURE
Why Navigator?
30
Lots of Data Landing in Cloudera Enterprise
 Huge quantities
 Many different sources – structured and unstructured
 Varying levels of sensitivity
1
Many Users Working with the Data
 Administrators and compliance officers
 Analysts and data scientists
 Business users
2
Need to Effectively Control and Consume Data
 Get visibility and control over the environment
 Discover and explore data
3
31
31
32
32
33
33
Leading Investment to Address the Challenges
34
Authentication First Hadoop distribution to offer strong authentication
throughout
Encryption First Hadoop distribution to support encryption on wire
Audit Only Hadoop distribution to support audit histories for all data
objects and access paths; Single point for log capture, audit
Authorization Founded the Apache Sentry project along with Oracle and
Lab41 to manage fine-grained permissions
Automation Cloudera Manager automates security configurations &
LDAP/AD integration
Cloudera 5: Enabling the Enterprise Data Hub
35
Open Source
Scalable
Flexible
Cost-Effective
✔
Managed ✖
Open
Architecture ✖
Secure and
Governed ✖
✔
✔
✔
3RD PARTY
APPS
STORAGE FOR ANY TYPE OF DATA
UNIFIED, ELASTIC, RESILIENT, SECURE
CLOUDERA’S ENTERPRISE DATA HUB
BATCH
PROCESSING
MAPREDUCE
ANALYTIC
SQL
IMPALA
SEARCH
ENGINE
SOLR
MACHINE
LEARNING
SPARK
STREAM
PROCESSING
SPARK STREAMING
WORKLOAD MANAGEMENT YARN
FILESYSTEM
HDFS
ONLINE NOSQL
HBASE
DATA
MANAGEMENT
CLOUDERANAVIGATOR
SYSTEM
MANAGEMENT
CLOUDERAMANAGER
SENTRY
Hadoop and Data Access Security

Weitere ähnliche Inhalte

Was ist angesagt?

Keep me in the Loop: INotify in HDFS
Keep me in the Loop: INotify in HDFSKeep me in the Loop: INotify in HDFS
Keep me in the Loop: INotify in HDFSDataWorks Summit
 
HBase replication
HBase replicationHBase replication
HBase replicationwchevreuil
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryCloudera, Inc.
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergFlink Forward
 
Zookeeper Architecture
Zookeeper ArchitectureZookeeper Architecture
Zookeeper ArchitecturePrasad Wali
 
Apache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesApache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesIsheeta Sanghi
 
Managing your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache AmbariManaging your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache AmbariDataWorks Summit
 
Overview of HDFS Transparent Encryption
Overview of HDFS Transparent Encryption Overview of HDFS Transparent Encryption
Overview of HDFS Transparent Encryption Cloudera, Inc.
 
HBase Advanced - Lars George
HBase Advanced - Lars GeorgeHBase Advanced - Lars George
HBase Advanced - Lars GeorgeJAX London
 
Cloudera Impala Internals
Cloudera Impala InternalsCloudera Impala Internals
Cloudera Impala InternalsDavid Groozman
 
Webinar "fonctions documentaires Alfresco"
Webinar "fonctions documentaires Alfresco"Webinar "fonctions documentaires Alfresco"
Webinar "fonctions documentaires Alfresco"Michael Harlaut
 
Apache Tez – Present and Future
Apache Tez – Present and FutureApache Tez – Present and Future
Apache Tez – Present and FutureDataWorks Summit
 
Hadoop Security Architecture
Hadoop Security ArchitectureHadoop Security Architecture
Hadoop Security ArchitectureOwen O'Malley
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?DataWorks Summit
 
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the EnterpriseUsing Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the EnterpriseDataWorks Summit
 
Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0J.B. Langston
 

Was ist angesagt? (20)

Keep me in the Loop: INotify in HDFS
Keep me in the Loop: INotify in HDFSKeep me in the Loop: INotify in HDFS
Keep me in the Loop: INotify in HDFS
 
HBase replication
HBase replicationHBase replication
HBase replication
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 
Scaling HBase for Big Data
Scaling HBase for Big DataScaling HBase for Big Data
Scaling HBase for Big Data
 
Zookeeper Architecture
Zookeeper ArchitectureZookeeper Architecture
Zookeeper Architecture
 
HDFS Selective Wire Encryption
HDFS Selective Wire EncryptionHDFS Selective Wire Encryption
HDFS Selective Wire Encryption
 
Apache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesApache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup Slides
 
Managing your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache AmbariManaging your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache Ambari
 
Overview of HDFS Transparent Encryption
Overview of HDFS Transparent Encryption Overview of HDFS Transparent Encryption
Overview of HDFS Transparent Encryption
 
ClickHouse Keeper
ClickHouse KeeperClickHouse Keeper
ClickHouse Keeper
 
HBase Advanced - Lars George
HBase Advanced - Lars GeorgeHBase Advanced - Lars George
HBase Advanced - Lars George
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
Cloudera Impala Internals
Cloudera Impala InternalsCloudera Impala Internals
Cloudera Impala Internals
 
Webinar "fonctions documentaires Alfresco"
Webinar "fonctions documentaires Alfresco"Webinar "fonctions documentaires Alfresco"
Webinar "fonctions documentaires Alfresco"
 
Apache Tez – Present and Future
Apache Tez – Present and FutureApache Tez – Present and Future
Apache Tez – Present and Future
 
Hadoop Security Architecture
Hadoop Security ArchitectureHadoop Security Architecture
Hadoop Security Architecture
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
 
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the EnterpriseUsing Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
 
Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0
 

Andere mochten auch

Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifySimplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifyHortonworks
 
Performance Update: When Apache ORC Met Apache Spark
Performance Update: When Apache ORC Met Apache SparkPerformance Update: When Apache ORC Met Apache Spark
Performance Update: When Apache ORC Met Apache SparkDataWorks Summit
 
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise UsersApache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise UsersDataWorks Summit
 
Big Data and Security - Where are we now? (2015)
Big Data and Security - Where are we now? (2015)Big Data and Security - Where are we now? (2015)
Big Data and Security - Where are we now? (2015)Peter Wood
 
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...Kevin Minder
 
Built-In Security for the Cloud
Built-In Security for the CloudBuilt-In Security for the Cloud
Built-In Security for the CloudDataWorks Summit
 
Information security in big data -privacy and data mining
Information security in big data -privacy and data miningInformation security in big data -privacy and data mining
Information security in big data -privacy and data miningharithavijay94
 
Troubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the BeastTroubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the BeastDataWorks Summit
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop SecurityDataWorks Summit
 
Apache Knox setup and hive and hdfs Access using KNOX
Apache Knox setup and hive and hdfs Access using KNOXApache Knox setup and hive and hdfs Access using KNOX
Apache Knox setup and hive and hdfs Access using KNOXAbhishek Mallick
 
Hadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureHadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureUwe Printz
 
Hdp security overview
Hdp security overview Hdp security overview
Hdp security overview Hortonworks
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...DataWorks Summit
 
Hadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxHadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxVinay Shukla
 
Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)Emilio Coppa
 
OAuth - Open API Authentication
OAuth - Open API AuthenticationOAuth - Open API Authentication
OAuth - Open API Authenticationleahculver
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY pptsravya raju
 
Cours Big Data Chap1
Cours Big Data Chap1Cours Big Data Chap1
Cours Big Data Chap1Amal Abid
 

Andere mochten auch (20)

Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifySimplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
 
Performance Update: When Apache ORC Met Apache Spark
Performance Update: When Apache ORC Met Apache SparkPerformance Update: When Apache ORC Met Apache Spark
Performance Update: When Apache ORC Met Apache Spark
 
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise UsersApache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
 
Big Data and Security - Where are we now? (2015)
Big Data and Security - Where are we now? (2015)Big Data and Security - Where are we now? (2015)
Big Data and Security - Where are we now? (2015)
 
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
 
Built-In Security for the Cloud
Built-In Security for the CloudBuilt-In Security for the Cloud
Built-In Security for the Cloud
 
Information security in big data -privacy and data mining
Information security in big data -privacy and data miningInformation security in big data -privacy and data mining
Information security in big data -privacy and data mining
 
Troubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the BeastTroubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the Beast
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop Security
 
Hadoop
HadoopHadoop
Hadoop
 
An Approach for Multi-Tenancy Through Apache Knox
An Approach for Multi-Tenancy Through Apache KnoxAn Approach for Multi-Tenancy Through Apache Knox
An Approach for Multi-Tenancy Through Apache Knox
 
Apache Knox setup and hive and hdfs Access using KNOX
Apache Knox setup and hive and hdfs Access using KNOXApache Knox setup and hive and hdfs Access using KNOX
Apache Knox setup and hive and hdfs Access using KNOX
 
Hadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureHadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, Future
 
Hdp security overview
Hdp security overview Hdp security overview
Hdp security overview
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...
 
Hadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxHadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache Knox
 
Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)
 
OAuth - Open API Authentication
OAuth - Open API AuthenticationOAuth - Open API Authentication
OAuth - Open API Authentication
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY ppt
 
Cours Big Data Chap1
Cours Big Data Chap1Cours Big Data Chap1
Cours Big Data Chap1
 

Ähnlich wie Hadoop and Data Access Security

Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Cloudera, Inc.
 
Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Shravan (Sean) Pabba
 
Fighting cyber fraud with hadoop
Fighting cyber fraud with hadoopFighting cyber fraud with hadoop
Fighting cyber fraud with hadoopNiel Dunnage
 
大数据数据治理及数据安全
大数据数据治理及数据安全大数据数据治理及数据安全
大数据数据治理及数据安全Jianwei Li
 
Securing the Hadoop Ecosystem
Securing the Hadoop EcosystemSecuring the Hadoop Ecosystem
Securing the Hadoop EcosystemDataWorks Summit
 
Combat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache HadoopCombat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache HadoopCloudera, Inc.
 
Cloudera GoDataFest Security and Governance
Cloudera GoDataFest Security and GovernanceCloudera GoDataFest Security and Governance
Cloudera GoDataFest Security and GovernanceGoDataDriven
 
Hadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster AccessHadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster AccessCloudera, Inc.
 
The Future of Data Management - the Enterprise Data Hub
The Future of Data Management - the Enterprise Data HubThe Future of Data Management - the Enterprise Data Hub
The Future of Data Management - the Enterprise Data HubDataWorks Summit
 
The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014Cloudera, Inc.
 
Project Rhino: Enhancing Data Protection for Hadoop
Project Rhino: Enhancing Data Protection for HadoopProject Rhino: Enhancing Data Protection for Hadoop
Project Rhino: Enhancing Data Protection for HadoopCloudera, Inc.
 
2014 sept 4_hadoop_security
2014 sept 4_hadoop_security2014 sept 4_hadoop_security
2014 sept 4_hadoop_securityAdam Muise
 
Securing Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextSecuring Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextHellmar Becker
 
Bringing Trus and Visibility to Apache Hadoop
Bringing Trus and Visibility to Apache HadoopBringing Trus and Visibility to Apache Hadoop
Bringing Trus and Visibility to Apache HadoopDataWorks Summit
 
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by ClouderaBig Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by ClouderaCaserta
 
Intel boubker el mouttahid
Intel boubker el mouttahidIntel boubker el mouttahid
Intel boubker el mouttahidBigDataExpo
 
BigData Security - A Point of View
BigData Security - A Point of ViewBigData Security - A Point of View
BigData Security - A Point of ViewKaran Alang
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop SecurityChris Nauroth
 
TriHUG October: Apache Ranger
TriHUG October: Apache RangerTriHUG October: Apache Ranger
TriHUG October: Apache Rangertrihug
 
大数据数据安全
大数据数据安全大数据数据安全
大数据数据安全Jianwei Li
 

Ähnlich wie Hadoop and Data Access Security (20)

Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
 
Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015
 
Fighting cyber fraud with hadoop
Fighting cyber fraud with hadoopFighting cyber fraud with hadoop
Fighting cyber fraud with hadoop
 
大数据数据治理及数据安全
大数据数据治理及数据安全大数据数据治理及数据安全
大数据数据治理及数据安全
 
Securing the Hadoop Ecosystem
Securing the Hadoop EcosystemSecuring the Hadoop Ecosystem
Securing the Hadoop Ecosystem
 
Combat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache HadoopCombat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache Hadoop
 
Cloudera GoDataFest Security and Governance
Cloudera GoDataFest Security and GovernanceCloudera GoDataFest Security and Governance
Cloudera GoDataFest Security and Governance
 
Hadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster AccessHadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster Access
 
The Future of Data Management - the Enterprise Data Hub
The Future of Data Management - the Enterprise Data HubThe Future of Data Management - the Enterprise Data Hub
The Future of Data Management - the Enterprise Data Hub
 
The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014
 
Project Rhino: Enhancing Data Protection for Hadoop
Project Rhino: Enhancing Data Protection for HadoopProject Rhino: Enhancing Data Protection for Hadoop
Project Rhino: Enhancing Data Protection for Hadoop
 
2014 sept 4_hadoop_security
2014 sept 4_hadoop_security2014 sept 4_hadoop_security
2014 sept 4_hadoop_security
 
Securing Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextSecuring Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise Context
 
Bringing Trus and Visibility to Apache Hadoop
Bringing Trus and Visibility to Apache HadoopBringing Trus and Visibility to Apache Hadoop
Bringing Trus and Visibility to Apache Hadoop
 
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by ClouderaBig Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
 
Intel boubker el mouttahid
Intel boubker el mouttahidIntel boubker el mouttahid
Intel boubker el mouttahid
 
BigData Security - A Point of View
BigData Security - A Point of ViewBigData Security - A Point of View
BigData Security - A Point of View
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop Security
 
TriHUG October: Apache Ranger
TriHUG October: Apache RangerTriHUG October: Apache Ranger
TriHUG October: Apache Ranger
 
大数据数据安全
大数据数据安全大数据数据安全
大数据数据安全
 

Mehr von Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 

Mehr von Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Kürzlich hochgeladen

Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfryanfarris8
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfproinshot.com
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfVishalKumarJha10
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdfPearlKirahMaeRagusta1
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesVictorSzoltysek
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...software pro Development
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 

Kürzlich hochgeladen (20)

Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 

Hadoop and Data Access Security

  • 2. Hadoop Security and Compliance Challenges 2 • History • Security was not a priority in early Hadoop adopters like Yahoo! and Facebook / it is now! • Data concentration • Quantity and diversity of data creates compliance challenges • Flexibility of the Hadoop architecture • Many paths for data in, out, processing • Access data at different granularities, from fields to files • ELT: sensitive data “discovery” occurs after data arrives
  • 3. Cloudera has led in investments in security 3 Authentication • First Hadoop distribution to offer strong authentication throughout Encryption • First Hadoop distribution to support encryption on wire Audit • Only Hadoop distribution to support audit histories for all data objects & access paths • Single point for log capture, audit Authorization • Founded the Apache Sentry project along with Oracle and Lab41 to manage fine- grained permissions Automation • Cloudera Manager automates security configurations & LDAP/AD integration
  • 4. Case Study: Finance and Banking • Identify patterns in financially-sensitive, PCI and PII data • Before: Unable to build applications on Hadoop; forced to use other systems, to greatly limit Hadoop access, or to forgo analysis due to privacy concerns • Now: Provide broad analysis capabilities with Impala to large population and secured by Sentry Fraud and Purchasing Behavior Analysis
  • 5. Enterprise Security in Hadoop overview 5 Four Functional Areas Hadoop Cluster Users Applications Operators Perimeter Data Access Visibility
  • 6. Defining the Functional Areas 6 Perimeter Guarding access to the cluster itself Technical Concepts: Authentication Network isolation Data Protecting data in the cluster from unauthorized visibility Technical Concepts: Encryption, Tokenization, Data masking Access Defining what users and applications can do with data Technical Concepts: Permissions Authorization Visibility Reporting on where data came from and how it’s being used Technical Concepts: Auditing Lineage
  • 7. Enabling Enterprise Security 7 Perimeter Guarding access to the cluster itself Technical Concepts: Authentication Network isolation Data Protecting data in the cluster from unauthorized visibility Technical Concepts: Encryption, Tokenization, Data masking Access Defining what users and applications can do with data Technical Concepts: Permissions Authorization Visibility Reporting on where data came from and how it’s being used Technical Concepts: Auditing Lineage SentryKerberos | AD/LDAP Cloudera NavigatorNative | Certified Partners
  • 8. Enabling Enterprise Security 8 Perimeter Guarding access to the cluster itself Technical Concepts: Authentication Network isolation Data Protecting data in the cluster from unauthorized visibility Technical Concepts: Encryption, Tokenization, Data masking Access Defining what users and applications can do with data Technical Concepts: Permissions Authorization Visibility Reporting on where data came from and how it’s being used Technical Concepts: Auditing Lineage SentryKerberos | AD/LDAP Cloudera NavigatorNative | Certified Partners
  • 9. Perimeter: Authentication in Hadoop 10 Kerberos • Provably strong authentication between all Hadoop services and (optionally) to end-points • Cloudera Manager hides complexity LDAP/AD • Username / password • Option for Hue, Hive Metastore, Impala connectors, Cloudera Manager admin logins SAML • For Single Sign-On (SSO) for listed options • Kerberos clients no longer required on most user end-points
  • 10. Authentication Options and Coverage 11 HDFS DN NN YARN RM AM Impala ID SS MapReduce JT TT … Services … (Oozie, Search, etc.) 3rd Party Gateway … Client Client Client Client … Applications … (Pig, Hive, Hue, etc.) “End-to-End” Kerberos “Core” Kerberos “Edge” AD/LDAP/SAML
  • 11. IT Integration: Kerberos • Users don’t want Yet Another Credential • Corp IT doesn’t want to provision and maintain thousands of service principals and keytabs • Solution: local KDC + one-way trust • Run MIT Kerberos KDC in the cluster • Put all service principals here • Set up one-way trust of central corporate realm by local KDC • Normal user credentials can be used to access Hadoop • Recommended: Use Cloudera Manager • To properly tune inter-related configuration knobs • To manage principals/keytabs creation and distribution • To preserve service monitoring with Kerberos security enabled
  • 12. IT Integration: Kerberos + LDAP Hadoop Cluster Local KDC (MIT Kerberos) hdfs/host1@HADOOP.EXAMPLE.COM yarn/host2@HADOOP.EXAMPLE.COM … Central Active Directory user@EXAMPLE.COM … Cross-realm trust NN JT LDAP group mapping
  • 13. Network Access Management • Use Hue to front-end both Hadoop and Oozie to control access through a web browser • HTTP proxy servers: • Oozie : MR jobs, Pig jobs, Hive jobs • HttpFS: hadoop fs is front-ended over HTTP • HBase REST server: HBase reads Secure configuration with Oozie, Hue and HttpFS front-ends co-located to act as network bridge Hue supports AD/LDAP based authentication instead of Kerberos for client simplicity
  • 14. Enabling Enterprise Security 15 Perimeter Guarding access to the cluster itself Technical Concepts: Authentication Network isolation Data Protecting data in the cluster from unauthorized visibility Technical Concepts: Encryption, Tokenization, Data masking Access Defining what users and applications can do with data Technical Concepts: Permissions Authorization Visibility Reporting on where data came from and how it’s being used Technical Concepts: Auditing Lineage SentryKerberos | AD/LDAP Cloudera NavigatorNative | Certified Partners
  • 15. Data: Protection in Hadoop 16 Data in Motion Data at Rest “Network Encryption” • SASL: Network RPC • SSL: MapReduce shuffle • SSL: Web-based user and administration tools • SSL: JDBC • HDFS data transfer protocol “Data Encryption” • Certified partner solutions • Field-level encryption • Data masking or tokenization • OS-level file system encryption
  • 16. Enabling Enterprise Security 18 Perimeter Guarding access to the cluster itself Technical Concepts: Authentication Network isolation Data Protecting data in the cluster from unauthorized visibility Technical Concepts: Encryption, Tokenization, Data masking Access Defining what users and applications can do with data Technical Concepts: Permissions Authorization Visibility Reporting on where data came from and how it’s being used Technical Concepts: Auditing Lineage SentryKerberos | AD/LDAP Cloudera NavigatorNative | Certified Partners
  • 17. Prior State of Authorization Two Sub-Optimal Choices for SQL on Hadoop 19 • Insecure Advisory Authorization • Users could grant themselves permissions • Intended to prevent accidental deletion of data • Problem: Did not guard against malicious users • Problem: Only worked with Hive • HDFS Impersonation • Data was only protected at the file level by HDFS permissions • Problem: File-level not granular enough • Problem: Lacked flexibility; not role-based
  • 18. Sentry: Key Capabilities 21 Fine-Grained Authorization • Specify security for SERVERS, DATABASES, TABLES, VIEWS, and search indices Role-Based Authorization • SELECT privilege on views & tables • INSERT privilege on tables • TRANSFORM privilege on servers • ALL privilege on the server, databases, tables & views • ALL privilege is needed to create/modify schema Multitenant Administration • Separate policies for each database/schema • Can be maintained by separate admins
  • 19. Sentry Architecture 22 Binding Layer Impala Impala Hive Policy Engine Search Policy Provider File Database HiveServer2 Authorization Provider Evaluation, Validation Parsing Interface Interface Local FS/HDFS Search
  • 20. QueryMR SQL Query Execution Flow 23 Parse Build Check Plan Sentry Validate SQL grammar Construct statement tree Validate statement objects • First check: Authorization Forward to execution planner
  • 21. Multitenant Security Global [groups] admin_group = admin_role dep1_admin = uri_role [roles] admin_role = server=server1 uri_role = hdfs:///ha-nn-uri/data [databases] db1 = hdfs://ha-nn- uri/user/hive/sentry/db1.ini Per Database [groups] dep1_admin = db1_admin_role dep1_analyst = db1_read_role [roles] db1_admin_role = server=server1- >db=db1 db1_read_role = server=server1- >db=db1->table=*->action=select
  • 22. Apache Ecosystem and Sentry Inline support in Cloudera Impala Extensibility plug-in for Apache HiveServer2 Inline support in Cloudera Search Complementary security with HDFS ACLs
  • 23. Access: Authorization in Hadoop 26 File ACL Admin RBAC Data RBAC • Permission at file-level granularity • HDFS POSIX-style permissions: u/g/o • Access Control Lists (ACL) • HBase, Oozie, MapReduce • Permissions on tables, views, indices • Sentry for HiveServer2, Impala, Search App and Workflow • Cloudera Manager, Hue
  • 24. Enabling Enterprise Security 28 Perimeter Guarding access to the cluster itself Technical Concepts: Authentication Network isolation Data Protecting data in the cluster from unauthorized visibility Technical Concepts: Encryption, Tokenization, Data masking Access Defining what users and applications can do with data Technical Concepts: Permissions Authorization Visibility Reporting on where data came from and how it’s being used Technical Concepts: Auditing Lineage SentryKerberos | AD/LDAP Cloudera NavigatorNative | Certified Partners
  • 25. Visibility: Cloudera Navigator 29 Audit & Access Control • Maintain full audit history • Ensuring appropriate permissions and reporting on data access for compliance Discovery & Exploration • Finding out what data is available and what it looks like Lineage • Tracing data back to its original source Lifecycle Management • Migration of data based on policies 3RD PARTY APPS STORAGE FOR ANY TYPE OF DATA UNIFIED, ELASTIC, RESILIENT, SECURE CLOUDERA’S ENTERPRISE DATA HUB BATCH PROCESSING MAPREDUCE ANALYTIC SQL IMPALA SEARCH ENGINE SOLR MACHINE LEARNING SPARK STREAM PROCESSING SPARK STREAMING WORKLOAD MANAGEMENT YARN FILESYSTEM HDFS ONLINE NOSQL HBASE DATA MANAGEMENT CLOUDERANAVIGATOR SYSTEM MANAGEMENT CLOUDERAMANAGER SENTRY, SECURE
  • 26. Why Navigator? 30 Lots of Data Landing in Cloudera Enterprise  Huge quantities  Many different sources – structured and unstructured  Varying levels of sensitivity 1 Many Users Working with the Data  Administrators and compliance officers  Analysts and data scientists  Business users 2 Need to Effectively Control and Consume Data  Get visibility and control over the environment  Discover and explore data 3
  • 27. 31 31
  • 28. 32 32
  • 29. 33 33
  • 30. Leading Investment to Address the Challenges 34 Authentication First Hadoop distribution to offer strong authentication throughout Encryption First Hadoop distribution to support encryption on wire Audit Only Hadoop distribution to support audit histories for all data objects and access paths; Single point for log capture, audit Authorization Founded the Apache Sentry project along with Oracle and Lab41 to manage fine-grained permissions Automation Cloudera Manager automates security configurations & LDAP/AD integration
  • 31. Cloudera 5: Enabling the Enterprise Data Hub 35 Open Source Scalable Flexible Cost-Effective ✔ Managed ✖ Open Architecture ✖ Secure and Governed ✖ ✔ ✔ ✔ 3RD PARTY APPS STORAGE FOR ANY TYPE OF DATA UNIFIED, ELASTIC, RESILIENT, SECURE CLOUDERA’S ENTERPRISE DATA HUB BATCH PROCESSING MAPREDUCE ANALYTIC SQL IMPALA SEARCH ENGINE SOLR MACHINE LEARNING SPARK STREAM PROCESSING SPARK STREAMING WORKLOAD MANAGEMENT YARN FILESYSTEM HDFS ONLINE NOSQL HBASE DATA MANAGEMENT CLOUDERANAVIGATOR SYSTEM MANAGEMENT CLOUDERAMANAGER SENTRY