Extended Attributes and Transparent
Encryption in Apache Hadoop
Uma Maheswara Rao G
Yi Liu (刘轶)
Copyright © 2014 Intel Corporation.
Who are we?
• Uma Maheswara Rao G
- Software Engineer at Intel
- PMC member at Apache Hadoop
- Active committer of Apache Hadoop
@UmaMaheswaraG | umamahesh@apache.org
• Yi Liu (刘轶)
- Software Engineer at Intel
- Active committer of Apache Hadoop
- PMC/committer at Apache Tajo
- Senior security expert in Big Data
Intel BigData Team
• Global team, local focus
• Worldwide (China, US and India) teams, >80% in China
• Local collaborations (industry & academic) a high priority
• Greater impact through open source
• Active open source development (Spark, Hadoop, HBase, Storm, etc.)
• Widely used in the industry (from Facebook to Alibaba to Cloudera to China Mobile …)
• Strong influence in the open source community
• ~10 project committers in the team
• Technology and innovation oriented
• Next generation of Big Data technologies
• Real-time, in-memory, complex analytics (statistical modeling, machine learning, graph analysis, …)
• Bridging advanced research and real-world applications
Agenda
• Extended Attributes
• Transparent Encryption
Hadoop Ecosystem
[Figure: Hadoop ecosystem stack. Storage: HDFS (Hadoop Distributed File System). Resource management: YARN. Workloads: batch processing (MapReduce, Hive, Pig), HBase, SQL, search, stream, machine learning, Spark, … Also shown: data integration (Sqoop, Flume, …) and ZooKeeper.]
HDFS Extended Attributes
HDFS-2006
Introduction
• Allow users to associate additional metadata with files and directories
• XAttrs are set as key-value pairs on any INode
• The file system does not interpret the XAttrs
• Derived from the Linux XAttrs feature, so functionally similar
• Users can choose their own encoding format for XAttr values
Namespaces of XAttrs
• XAttr names must be prefixed with a namespace
• HDFS supports 5 XAttr namespaces:
USER
• Access permission is defined by the file/directory permission bits
• For sticky directories, only the owner and privileged users can write
TRUSTED
• Only visible and accessible to privileged users
SYSTEM
• Not visible to users
• Only available for internal system use
SECURITY
• Not visible to users
• Only available for internal system use, for storing security information
RAW
• Like SYSTEM attributes, but accessible only on files/directories under /.reserved/raw, and only by superusers
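The namespace rules on this slide can be sketched as a small check. This is an illustrative Python model, not HDFS source code: the function names `parse_xattr_name` and `can_access` are hypothetical, and the permission-bit and sticky-directory logic for USER is reduced to a single boolean supplied by the caller.

```python
# Illustrative model of XAttr namespace handling; NOT HDFS source code.
NAMESPACES = ("user", "trusted", "system", "security", "raw")


def parse_xattr_name(full_name):
    """Split 'namespace.name' and validate the namespace prefix."""
    prefix, sep, name = full_name.partition(".")
    if not sep or not name:
        raise ValueError("XAttr name must be prefixed with a namespace: %r" % full_name)
    if prefix not in NAMESPACES:
        raise ValueError("unknown XAttr namespace: %r" % prefix)
    return prefix, name


def can_access(namespace, is_superuser, has_permission, under_reserved_raw=False):
    """Return True if a caller may access an XAttr in the given namespace.

    has_permission stands in for the file/directory permission-bit check
    (assumption: the caller has already evaluated it).
    """
    if namespace == "user":
        return has_permission
    if namespace == "trusted":
        return is_superuser
    if namespace == "raw":
        # Only superusers, and only through paths under /.reserved/raw.
        return is_superuser and under_reserved_raw
    # system and security are not visible to users.
    return False
```

For example, `parse_xattr_name("user.myAttr")` yields the pair `("user", "myAttr")`, while a name without a known prefix is rejected before any access check runs.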
Implementation details
• XAttrs are stored as a separate INode feature in the NameNode
• XAttrs are persisted as part of the INode information
• XAttrs are validated against the namespaces at the NameNode
Configuration
• dfs.namenode.xattrs.enabled
Whether support for XAttrs is enabled in HDFS.
• dfs.namenode.fs-limits.max-xattrs-per-inode
Maximum number of XAttrs per INode. Default: 32.
• dfs.namenode.fs-limits.max-xattr-size
Maximum combined size of the name and value of an XAttr. Default: 16384 bytes.
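How these three settings interact can be sketched as follows. This is a simplified Python model under stated assumptions, not the NameNode's actual code: `set_xattr` is a hypothetical helper, an inode is modeled as a plain dict, the per-inode limit is treated as a count of attributes (32), and the size is measured as the UTF-8 length of the full name plus the value length, which may differ from how HDFS measures it.

```python
# Simplified model of the XAttr limits; NOT the NameNode implementation.
MAX_XATTRS_PER_INODE = 32   # dfs.namenode.fs-limits.max-xattrs-per-inode (a count)
MAX_XATTR_SIZE = 16384      # dfs.namenode.fs-limits.max-xattr-size (bytes)


def set_xattr(xattrs, name, value, enabled=True):
    """Set one XAttr on an inode modeled as a dict of name -> bytes value.

    enabled stands in for dfs.namenode.xattrs.enabled.
    """
    if not enabled:
        raise RuntimeError("XAttr support is disabled")
    # Assumption: combined size = encoded name length + value length.
    size = len(name.encode("utf-8")) + len(value)
    if size > MAX_XATTR_SIZE:
        raise ValueError(f"XAttr too large: {size} > {MAX_XATTR_SIZE} bytes")
    # Overwriting an existing attribute does not add to the count.
    if name not in xattrs and len(xattrs) >= MAX_XATTRS_PER_INODE:
        raise ValueError(f"too many XAttrs on inode (limit {MAX_XATTRS_PER_INODE})")
    xattrs[name] = value
    return xattrs
```

In this model an inode that already holds 32 attributes can still have an existing attribute overwritten, but a 33rd distinct name is rejected.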
Use Cases
• Storing encrypted data encryption keys (EDEKs) as XAttrs in HDFS, in an encrypted cluster environment
• Storing storage policies for heterogeneous storage
Transparent Encryption in Hadoop
(HADOOP-10150 & HDFS-6134)
Outlines
• Transparent to upper-layer applications: all HDFS clients access encrypted files transparently.
• High performance: encryption must not become a bottleneck.
• Encryption is independent of file type and data format.
• Scalable key management.
• End-to-end encryption: data can only be encrypted and decrypted by the client. This satisfies two typical requirements for encryption: at-rest encryption and in-transit encryption.
• Security: HDFS never handles unencrypted data or unencrypted data encryption keys.
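Write file
[Figure: write path between the DFS Client, HDFS (NN, DN), the KMS and its backing keystore. Step labels recoverable from the diagram:]
• EDEK cache is filled in the background
• 2. EDEK taken from the cache and persisted to file metadata
• 4. Decrypt EDEK and get DEK
• 5. Encrypt data using DEK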
Read file
[Figure: read path between the DFS Client, HDFS (NN, DN), the KMS and its backing keystore. Step labels recoverable from the diagram:]
• 2. Read EDEK from file metadata
• 4. Decrypt EDEK and get DEK
• 6. Decrypt data using DEK
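The write and read paths can be sketched end to end in a few lines. This is a toy Python model under loud assumptions: `toy_cipher` is a SHA-256-based XOR keystream standing in for real encryption and is NOT secure, `ToyKMS`, `write_file` and `read_file` are hypothetical names, the NameNode and DataNodes are plain dicts, IVs are omitted, and the background EDEK cache is skipped (an EDEK is generated on demand instead). Only the step numbers visible on the slides are referenced.

```python
import hashlib
import os


def toy_cipher(key, data):
    """Stand-in cipher: XOR with a SHA-256 counter keystream. NOT secure."""
    out = bytearray()
    for start in range(0, len(data), 32):
        pad = hashlib.sha256(key + (start // 32).to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[start:start + 32], pad))
    return bytes(out)


class ToyKMS:
    """Holds encryption zone keys; hands out DEKs only via EDEK decryption."""

    def __init__(self):
        self._keystore = {}  # backing keystore: key name -> zone key

    def create_key(self, name):
        self._keystore[name] = os.urandom(16)

    def generate_edek(self, name):
        dek = os.urandom(16)                          # fresh DEK per file
        return toy_cipher(self._keystore[name], dek)  # EDEK = encrypted DEK

    def decrypt_edek(self, name, edek):
        return toy_cipher(self._keystore[name], edek)


def write_file(kms, namenode, datanodes, key_name, path, plaintext):
    edek = kms.generate_edek(key_name)
    namenode[path] = edek                         # step 2: EDEK persisted in file metadata
    dek = kms.decrypt_edek(key_name, edek)        # step 4: decrypt EDEK, get DEK
    datanodes[path] = toy_cipher(dek, plaintext)  # step 5: encrypt data using DEK


def read_file(kms, namenode, datanodes, key_name, path):
    edek = namenode[path]                         # step 2: read EDEK from file metadata
    dek = kms.decrypt_edek(key_name, edek)        # step 4: decrypt EDEK, get DEK
    return toy_cipher(dek, datanodes[path])       # step 6: decrypt data using DEK
```

In this model the two dicts standing in for HDFS only ever hold the EDEK and the ciphertext; the DEK and the plaintext exist only inside the client-side functions.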
Implementation details
• Positional read (pread) is supported.
• With AES-CTR, the original file and the cipher file have the same length, with a 1:1 byte correspondence.
• AES-NI on Intel platforms is used to improve encryption performance: 20x speedup.
• We define encryption zones; files in a zone are transparently encrypted and decrypted.
• We use two layers of keys: the encryption zone key (EZK) and the data encryption key (DEK), which is encrypted by the EZK. Each file has a different DEK.
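Why counter mode gives equal file lengths and cheap positional reads can be shown with a short sketch. The Python below illustrates the counter-mode construction only: `keystream_block` uses SHA-256 as a stand-in for the AES block operation (an assumption made to stay within the standard library, and not secure), and the names `ctr_xor`, `key` and `iv` are hypothetical.

```python
import hashlib

BLOCK = 16  # AES block size in bytes


def keystream_block(key, iv, counter):
    """Stand-in for AES applied to (iv, counter). Not AES, not secure."""
    return hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()[:BLOCK]


def ctr_xor(key, iv, data, offset=0):
    """Encrypt or decrypt `data` that starts at byte `offset` of the file.

    Each byte is XORed with one keystream byte chosen purely by its
    absolute position, so output length always equals input length and
    any byte range can be processed without touching earlier bytes.
    """
    out = bytearray(len(data))
    for i in range(len(data)):
        pos = offset + i
        pad = keystream_block(key, iv, pos // BLOCK)  # block holding this byte
        out[i] = data[i] ^ pad[pos % BLOCK]           # byte within that block
    return bytes(out)
```

A positional read of bytes 37 to 59 then only needs `ctr_xor(key, iv, ciphertext[37:60], offset=37)`; nothing before offset 37 has to be read or decrypted.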
Encryption/Decryption for HDFS Blocks
User Ops
• Create Key
hadoop key create <keyname> [-cipher <cipher>] [-size <size>]
[-description <description>]
[-attr <attribute=value>]
[-provider <provider>]
• Roll Key
hadoop key roll <keyname> [-provider <provider>]
• Delete Key
hadoop key delete <keyname> [-provider <provider>]
• List Keys
hadoop key list [-provider <provider>] [-metadata]
Admin Ops
• Create Encryption Zone
hdfs crypto -createZone -keyName <keyName> -path <path>
• List Encryption Zones
hdfs crypto -listZones
Usage Example
• As a normal user, create a new encryption key:
$ hadoop key create myKey
• As the super user, create a new empty directory and make it an
encryption zone:
$ sudo -u hdfs hadoop fs -mkdir /zone
$ sudo -u hdfs hdfs crypto -createZone -keyName myKey -path /zone
• Change its ownership to the normal user:
$ sudo -u hdfs hadoop fs -chown myuser:myuser /zone
• As the normal user, put a file in, read it out:
$ hadoop fs -put helloWorld /zone
$ hadoop fs -cat /zone/helloWorld
Performance
[Figure: TestDFSIO benchmark results, AES-NI enabled]
Call for Collaborations
• Close collaborations with local ecosystems
• Intel Big Data engineering teams, industry partners and academic research
• Building the next generation of Big Data technologies
• Real-time, in-memory, complex analytics, etc.
• Bridging advanced research and real-world applications
• Highly impactful through open source, university research (e.g., UC Berkeley) and
industry adoptions (e.g., Alibaba, Cloudera, etc.)
Q & A
Thanks!
Notices and Disclaimers:
• Copyright © 2014 Intel Corporation.
• Intel, the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as
the property of others.
• All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product
specifications and roadmaps.
• Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.
Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations
and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance
tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other
products.
For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
• Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel
microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the
availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel.
Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not
specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference
Guides for more information regarding the specific instruction sets covered by this notice.
• Intel technologies may require enabled hardware, specific software, or services activation. Check with your system manufacturer or
retailer.
• No computer system can be absolutely secure. Intel does not assume any liability for lost or stolen data or systems or any damages
resulting from such losses.
• You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel
products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which
includes subject matter disclosed herein.
• No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
• The products described may contain design defects or errors known as errata which may cause the product to deviate from publish.

 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 

Apache HDFS Extended Attributes and Transparent Encryption

  • 1. Extended Attributes and Transparent Encryption in Apache Hadoop Uma Maheswara Rao G Yi Liu (刘轶) Copyright © 2014 Intel Corporation.
  • 2. Who we are? • Uma Maheswara Rao G - Software Engineer at Intel - PMC member at Apache Hadoop - Active committer of Apache Hadoop @UmaMaheswaraG | umamahesh@apache.org • Yi Liu (刘轶) - Software Engineer at Intel - Active committer of Apache Hadoop - PMC/committer at Apache Tajo - Senior security expert of Big data
  • 3. Intel BigData Team • Global team, local focus • Worldwide (China, US and India) teams, >80% in China • Local collaborations (industry & academic) a high priority • Greater impact thru open source • Active open source development (Spark, Hadoop, HBase, Storm, etc.) • Widely used in the industry (from Facebook to Alibaba to Cloudera to China Mobile …) • Strong influence in the open source community • ~10 project committers in the team • Technology and innovation oriented • Next generations of Big Data Technologies • Real-time, in-memory, complex analytics (statistic modeling, machine learning, graph analysis, …) • Bridging advanced research and real-world applications
  • 4. Agenda • Extended Attributes • Transparent Encryption
  • 5. HADOOP Ecosystem (diagram) • HDFS (Hadoop Distributed File System) • YARN (Resource Management) • Batch Processing (MAPREDUCE, HIVE, PIG) • HBase • SQL • Search • Stream • Machine Learning • SPARK • ZooKeeper • DATA INTEGRATION (Sqoop, Flume …)
  • 6. HDFS Extended Attributes (HDFS-2006)
  • 7. Introduction • Allows users to associate additional metadata with files/directories • XAttrs are set as key-value pairs on any INode • The file system does not interpret XAttrs • Derived from the Linux XAttrs feature, so it is functionally similar • Allows users to choose their own encoding format for XAttrs
  • 8. Namespaces of XAttrs • XAttr names must be prefixed with a namespace • HDFS supports 5 XAttr namespaces • USER: access is defined by the file/directory permission bits; for sticky directories, only the owner and privileged users can write • TRUSTED: visible and accessible only to privileged users • SYSTEM: not visible to users; available only to the system kernel (HDFS internals) • SECURITY: not visible to users; available only to the system kernel, for storing security information • RAW: like SYSTEM attributes, but accessible only by super users, and only on files/directories under /.reserved/raw
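The namespace-prefix rule on slide 8 can be illustrated with a small sketch. This is not HDFS code: the function name `split_xattr_name` is hypothetical, and treating the prefix as case-insensitive is an assumption made for the example.

```python
from typing import Tuple

# The five namespaces listed on the slide.
NAMESPACES = {"user", "trusted", "system", "security", "raw"}


def split_xattr_name(full_name: str) -> Tuple[str, str]:
    """Split 'user.myAttr' into ('user', 'myAttr').

    Hypothetical helper: rejects names without a namespace prefix
    and names whose prefix is not one of the five namespaces.
    """
    prefix, sep, name = full_name.partition(".")
    if not sep or not name:
        raise ValueError(f"XAttr name needs a namespace prefix: {full_name!r}")
    namespace = prefix.lower()  # assumption: prefix compared case-insensitively
    if namespace not in NAMESPACES:
        raise ValueError(f"unknown XAttr namespace: {prefix!r}")
    return namespace, name


namespace, name = split_xattr_name("user.myAttr")
```

Only the first dot separates the namespace from the name, so a name such as `trusted.a.b` keeps `a.b` as the attribute name.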
  • 9. Implementation details • XAttrs are stored as a separate INode feature in the NameNode • XAttrs are persisted as part of the INode information • XAttrs are validated against the namespaces at the NameNode
  • 10. Configuration • dfs.namenode.xattrs.enabled: whether XAttr support is enabled in HDFS • dfs.namenode.fs-limits.max-xattrs-per-inode: maximum number of XAttrs per INode. Default: 32 • dfs.namenode.fs-limits.max-xattr-size: maximum combined size of the name and value of an XAttr. Default: 16384 bytes
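To make the two limits on slide 10 concrete, here is a minimal sketch of the kind of check they imply. It is not the NameNode's implementation: `check_xattr_limits` is a hypothetical name, and measuring the name as its UTF-8 byte length (prefix included) is an assumption.

```python
MAX_XATTRS_PER_INODE = 32   # default of dfs.namenode.fs-limits.max-xattrs-per-inode
MAX_XATTR_SIZE = 16384      # default of dfs.namenode.fs-limits.max-xattr-size, in bytes


def check_xattr_limits(existing: dict, name: str, value: bytes) -> None:
    """Raise ValueError if setting name=value would exceed either limit.

    `existing` maps XAttr names already on the inode to their values.
    """
    # Assumption: the combined size is name bytes plus value bytes.
    size = len(name.encode("utf-8")) + len(value)
    if size > MAX_XATTR_SIZE:
        raise ValueError(f"XAttr too large: {size} > {MAX_XATTR_SIZE} bytes")
    # Replacing an existing attribute does not increase the count.
    if name not in existing and len(existing) >= MAX_XATTRS_PER_INODE:
        raise ValueError(f"inode already has {len(existing)} XAttrs")


attrs = {}
check_xattr_limits(attrs, "user.owner-team", b"analytics")
attrs["user.owner-team"] = b"analytics"
```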
  • 11. Use Cases • Storing encrypted data encryption keys (EDEKs) as XAttrs in an encrypted HDFS cluster • Storing storage policies for heterogeneous storage
  • 12. Transparent Encryption in Hadoop (HADOOP-10150 & HDFS-6134)
  • 13. Outline • Transparent to upper-layer applications: all HDFS clients can access encrypted files transparently • High performance: encryption is not a bottleneck • Encryption is independent of file type and data format • Scalable key management • End-to-end encryption: data can only be encrypted and decrypted by the client. This satisfies two typical requirements for encryption: at-rest encryption and in-transit encryption • Security: HDFS never handles unencrypted data or unencrypted data encryption keys
  • 14. Write file (diagram: DFS Client, NameNodes (NN), DataNodes (DN), KMS with backing keystore, HDFS). Legible steps: 2. Get an EDEK from the cache and persist it to the file metadata • 4. Decrypt the EDEK and get the DEK • 5. Encrypt data using the DEK • The EDEK cache is filled in the background
  • 15. Read file (diagram: DFS Client, NameNodes (NN), DataNodes (DN), KMS with backing keystore, HDFS). Legible steps: 2. Read the EDEK from the file metadata • 4. Decrypt the EDEK and get the DEK • 6. Decrypt data using the DEK
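The legible steps of the write and read diagrams (slides 14 and 15) can be sketched as a toy key flow. Everything here is illustrative: `ToyKMS`, `write_file`, `read_file` and the metadata field names are hypothetical, and `toy_xor` is a stand-in for AES built from the standard library. It is not real cryptography.

```python
import hashlib
import hmac
import os


def toy_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """XOR up to 32 bytes with an HMAC-derived pad (stand-in for AES).

    NOT secure; the same call both encrypts and decrypts.
    """
    pad = hmac.new(key, iv, hashlib.sha256).digest()
    if len(data) > len(pad):
        raise ValueError("toy cipher handles at most 32 bytes")
    return bytes(d ^ p for d, p in zip(data, pad))


class ToyKMS:
    """Hypothetical key server: zone keys never leave this object."""

    def __init__(self) -> None:
        self._zone_keys = {}

    def create_key(self, name: str) -> None:
        self._zone_keys[name] = os.urandom(16)

    def generate_edek(self, name: str):
        """Create a fresh DEK and return only its encrypted form."""
        iv, dek = os.urandom(16), os.urandom(16)
        return iv, toy_xor(self._zone_keys[name], iv, dek)

    def decrypt_edek(self, name: str, iv: bytes, edek: bytes) -> bytes:
        return toy_xor(self._zone_keys[name], iv, edek)


def write_file(kms: ToyKMS, key_name: str, plaintext: bytes):
    iv, edek = kms.generate_edek(key_name)
    # Write step 2: the EDEK is persisted with the file metadata.
    metadata = {"key_name": key_name, "iv": iv, "edek": edek,
                "file_iv": os.urandom(16)}
    # Write step 4: the client has the KMS decrypt the EDEK into the DEK.
    dek = kms.decrypt_edek(key_name, iv, edek)
    # Write step 5: the client encrypts the data with the DEK.
    return metadata, toy_xor(dek, metadata["file_iv"], plaintext)


def read_file(kms: ToyKMS, metadata: dict, ciphertext: bytes) -> bytes:
    # Read steps 2 and 4: take the EDEK from metadata, have the KMS decrypt it.
    dek = kms.decrypt_edek(metadata["key_name"], metadata["iv"], metadata["edek"])
    # Read step 6: the client decrypts the data with the DEK.
    return toy_xor(dek, metadata["file_iv"], ciphertext)


kms = ToyKMS()
kms.create_key("myKey")
meta, stored = write_file(kms, "myKey", b"hello zone")
```

In this sketch the stored metadata holds only the EDEK, never the DEK, and the storage side sees only `stored`, which mirrors the claim that HDFS never handles unencrypted data or keys.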
  • 16. Implementation details • Pread (positional read) support • The original file and the cipher file have the same length and correspond 1:1, because AES-CTR is used • AES-NI on Intel platforms is used to improve encryption performance: 20x speedup • We define encryption zones; files in a zone are transparently encrypted/decrypted • We use two layers of keys: an encryption zone key (EZK), and a data encryption key (DEK) that is encrypted by the EZK. Each file has a different DEK
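Why counter mode gives equal lengths and cheap positional reads can be shown with a short sketch. It implements the counter-mode construction only: the keystream block comes from HMAC-SHA256 as a stand-in for the AES block cipher (an assumption made so the example needs nothing beyond the standard library), and `ctr_xor` is a hypothetical name.

```python
import hashlib
import hmac

BLOCK = 16  # AES block size in bytes


def _keystream_block(key: bytes, iv: bytes, counter: int) -> bytes:
    """Keystream for one block: PRF(key, iv + counter).

    Stand-in for AES encryption of the counter block; not real AES.
    """
    ctr = (int.from_bytes(iv, "big") + counter) % (1 << 128)
    return hmac.new(key, ctr.to_bytes(16, "big"), hashlib.sha256).digest()[:BLOCK]


def ctr_xor(key: bytes, iv: bytes, data: bytes, offset: int = 0) -> bytes:
    """Encrypt or decrypt `data` that starts at byte `offset` of the file.

    The counter is derived from the byte position, so any range can be
    processed without touching the bytes before it.
    """
    out = bytearray(len(data))
    block_index, block = -1, b""
    for i, byte in enumerate(data):
        pos = offset + i
        if pos // BLOCK != block_index:
            block_index = pos // BLOCK
            block = _keystream_block(key, iv, block_index)
        out[i] = byte ^ block[pos % BLOCK]
    return bytes(out)


key, iv = b"k" * 16, bytes(16)
plaintext = bytes(range(100))
ciphertext = ctr_xor(key, iv, plaintext)
# Positional read: decrypt bytes 37..52 without decrypting bytes 0..36.
middle = ctr_xor(key, iv, ciphertext[37:53], offset=37)
```

Because the keystream is XORed byte for byte, the ciphertext is exactly as long as the plaintext, and byte N of one corresponds to byte N of the other.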
  • 17. Encryption/Decryption for HDFS Blocks
  • 18. User Ops • Create Key: hadoop key create <keyname> [-cipher <cipher>] [-size <size>] [-description <description>] [-attr <attribute=value>] [-provider <provider>] • Roll Key: hadoop key roll <keyname> [-provider <provider>] • Delete Key: hadoop key delete <keyname> [-provider <provider>] • List Keys: hadoop key list [-provider <provider>] [-metadata]
  • 19. Admin Ops • Create Encryption Zone: hdfs crypto -createZone -keyName <keyName> -path <path> • List Encryption Zones: hdfs crypto -listZones
  • 20. Usage Example • As a normal user, create a new encryption key: $ hadoop key create myKey • As the super user, create a new empty directory and make it an encryption zone: $ sudo -u hdfs hadoop fs -mkdir /zone ; $ sudo -u hdfs hdfs crypto -createZone -keyName myKey -path /zone • Change its ownership to the normal user: $ sudo -u hdfs hadoop fs -chown myuser:myuser /zone • As the normal user, put a file in and read it out: $ hadoop fs -put helloWorld /zone ; $ hadoop fs -cat /zone/helloWorld
  • 22. Call for Collaborations • Close collaborations with local ecosystems • Intel Big Data engineering teams, industry partners and academic research • Building next generations of Big Data Technologies • Real-time, in-memory, complex analytics, etc. • Bridging advanced research and real-world applications • Highly impactful through open source, university research (e.g., UC Berkeley) and industry adoptions (e.g., Alibaba, Cloudera, etc.)
  • 23. Q & A Thanks!