Lessons Learned on How to Secure Petabytes of Data
DataWorks Summit
1.
© Copyright 2014 Booz Allen Hamilton
Lesson Learned Securing Data at Scale
Drew Farris, Peter Guerra
Hadoop Summit 2014
3.
Photo: CC BY 2.0: https://www.flickr.com/photos/atoach/5015711744
4.
Photo: CC BY 2.0: https://www.flickr.com/photos/dutchamsterdam/
5.
Who we are
Founded and run DC Hadoop Users Group Meetup
  – http://www.meetup.com/Hadoop-DC
Technical talks at multiple conferences
  – Strata, Data Science Summit, IDGA Gov Cloud Conference, Cloudera Hadoop Summit, Yahoo! Hadoop Summit, IEEE Cloud Conference, CSA Congress, Black Hat
Multiple client engagements over the last 7 years
  – Defense
  – Civil and Commercial Health
  – Civil and Commercial Financial Services
  – Commercial and International
+ Booz Allen Big Data and Data Science Points-of-View
+ http://www.boozallen.com/cloud
+ http://www.boozallen.com/datascience
+ Advancing the Art of Analytics & Big Data
+ http://www.boozallen.com/insights/expertvoices/big-data
+ http://www.federalnewsradio.com/?nid=154&sid=2080808
+ Tackling Large Scale Data in Government
+ http://www.cloudera.com/blog/2010/11/tackling-large-scale-data-in-government/
+ IT Architectures for Complex Search and Information Retrieval
+ http://www.slideshare.net/cloudera/fuzzy-table-final
+ http://www.slideshare.net/ydn/3-biometric-hadoopsummit2010
6.
Agenda
+ Securing Data in Hadoop
+ Architectural Case Study
+ What we did
+ How we did it
+ What tools we used
+ Smart Data
+ Emerging Security Capabilities
7.
Securing Data in Hadoop
8.
What are the security challenges with these architectures?
+ Data is growing exponentially and our ability to securely store and process it is falling behind
+ Security policies haven’t kept up with the technology
+ Most security policies and tools were not written for Big Data systems, so mapping can be difficult
+ Clients are often not prepared for the security challenges when integrating multiple data sources
9.
Our approach to data security has made adoption more difficult
+ For the last 20 years we have built systems in silos: isolated data containers (databases, applications, and so forth)
+ Most organizations secure each silo individually and protect access by database
+ Most certification and accreditation programs (FISMA), PCI, HIPAA, and SANS top 20 controls define security controls around each data silo
+ Most security controls implemented protect the servers, user access, or network access to data
10.
Example: SANS 20 – Control 15; Controlled Access on Need to Know
Deploy data protection such as IDS, firewalls, anti-virus, HIPS, DLP, GRC…
Wrap those around a number of Big Data technologies, most of which are based on Apache Hadoop or integrate with it:
+ Hortonworks / Cloudera Stack
+ NoSQL MongoDB / CouchDB / Cassandra
+ BigTable (Apache Accumulo / Apache HBase)
Distributed systems by nature have different security challenges because of their architecture
SANS Control 15: … the data classification system and permission baseline is the blueprint for how authentication and access of data is controlled…
+ Step 1: An appropriate data classification system and permissions baseline applied to production data systems
+ Step 2: Access appropriately logged to a log management system
+ Step 3: Proper access control applied to portable media/USB drives
+ Step 4: Active scanner validates, checks access, and checks data classification
+ Step 5: Host-based encryption and data-loss prevention validates and checks all access requests
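Steps 1 and 2 of Control 15 can be sketched in a few lines. The following is a minimal illustration, not part of the original talk: the classification levels, the role names, and the `BASELINE` mapping are all hypothetical, and a real deployment would forward the audit records to a log management system rather than the default Python logger.

```python
import logging
from dataclasses import dataclass

# Hypothetical classification levels, ordered from least to most sensitive.
LEVELS = {"public": 0, "internal": 1, "restricted": 2}

# Hypothetical permissions baseline: the highest level each role may read.
BASELINE = {"analyst": "internal", "auditor": "restricted"}

audit_log = logging.getLogger("data_access_audit")


@dataclass(frozen=True)
class Dataset:
    name: str
    classification: str  # must be a key of LEVELS


def can_read(role: str, dataset: Dataset) -> bool:
    """Step 1: compare the role's clearance with the dataset's classification.
    Step 2: record every decision, allowed or denied, in the audit log."""
    clearance = BASELINE.get(role)
    allowed = (
        clearance is not None
        and LEVELS[clearance] >= LEVELS[dataset.classification]
    )
    audit_log.info(
        "role=%s dataset=%s classification=%s allowed=%s",
        role, dataset.name, dataset.classification, allowed,
    )
    return allowed
```

Roles that are missing from the baseline are denied by default, which keeps the sketch on the need-to-know side of the control.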
11.
Overview of Security Architecture
Components
+ Infrastructure & Network
+ Encryption (at Rest & in Transit)
+ Authentication (User Principal and Device)
+ Authorization (Privileged Access Management)
+ Access Controls (Data Visibility)
+ Auditing & Monitoring of Data Access
+ Policy & Compliance
Driving Principles
+ Start with People, Process and Culture
+ Understand the Data and the Threat
+ Start small and build
+ Never finished
12.
Apache Hadoop Security Challenges
Scale
+ The large number of tasks presents problems with direct authentication
HDFS / File System
+ NameNodes have ACLs, while DataNodes don't
Job Execution
+ Propagation of credentials to executing nodes
Job Data
+ Task parameters / intermediate output accessible via HTTP
Multi-Tenancy
+ Access to intermediate output & local block storage
Trust of Auxiliary Services
+ Oozie, Hadoop clients, Hadoop Pipes/Streaming
13.
First Hadoop release with Kerberos in 2008
A better solution was available, not always implemented:
+ Tokens: Delegation Token, Block Access Token, Job Token
+ Symmetric Encryption == Shared Keys
+ Large Cluster = Thousands of Copies of Shared Keys
+ Performance goals (less than 3% impact) lead to weak SASL QoP
+ Pluggable authentication left to end-user
+ HDFS proxies for bulk transfer expose data
Often not implemented in favor of putting Hadoop into an enclave, but that still doesn't fully regulate access to data
Alternatives?
+ Tahoe-LAFS. Cool, but significant performance impact
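The token bullets on this slide rest on a symmetric key shared between the issuer and every verifier. A minimal Python sketch of that idea follows, assuming an HMAC-SHA256 signature over a JSON payload. The function names, the payload fields, and the key are hypothetical illustrations and are not Hadoop's actual token format.

```python
import hashlib
import hmac
import json
import time

# Hypothetical shared secret. In a large cluster every verifying node
# would hold a copy of this key, which is the exposure the slide notes.
SHARED_KEY = b"example-shared-key"


def issue_token(user, block_id, ttl_seconds, key=SHARED_KEY, now=None):
    """Issuer side: sign (user, block, expiry) with the shared key."""
    issued_at = time.time() if now is None else now
    payload = json.dumps(
        {"user": user, "block": block_id, "expires": issued_at + ttl_seconds},
        sort_keys=True,
    ).encode("utf-8")
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, signature


def verify_token(payload, signature, key=SHARED_KEY, now=None):
    """Verifier side: recompute the HMAC, then check the expiry."""
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return False  # forged, corrupted, or signed with a different key
    current = time.time() if now is None else now
    return json.loads(payload)["expires"] > current
```

Any node that can verify a token in this scheme can also mint one, so the security of the whole cluster depends on how well every copy of the key is protected.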
14.
Apache Hadoop 2.x Security
Hadoop RPC
+ Clients, MapReduce jobs, Hadoop daemons
+ SASL with varying levels of protection (QoP):
  - Authentication, Integrity Protection and Confidentiality
Direct TCP/IP
+ HDFS data transfer between clients and DataNodes (DN)
+ Tunnel existing protocol over SASL (HDFS-3637)
HTTP
+ Web UI, FSImage operations between NameNode (NN) / Secondary NameNode (SNN)
+ HTTPS, reloadable Java keystore, others
+ MAPREDUCE-4417, HADOOP-8581
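The QoP levels for Hadoop RPC are selected with the `hadoop.rpc.protection` setting, whose values `authentication`, `integrity`, and `privacy` correspond to the SASL QoP tokens `auth`, `auth-int`, and `auth-conf`. A small Python sketch of that mapping follows; the helper name `sasl_qop` is hypothetical and only illustrates the correspondence.

```python
# hadoop.rpc.protection value -> SASL quality-of-protection token
RPC_PROTECTION_TO_QOP = {
    "authentication": "auth",      # authentication only
    "integrity": "auth-int",       # authentication + integrity protection
    "privacy": "auth-conf",        # authentication + integrity + confidentiality
}


def sasl_qop(rpc_protection):
    """Translate a comma-separated hadoop.rpc.protection value
    into the matching comma-separated SASL QoP tokens."""
    levels = [level.strip().lower() for level in rpc_protection.split(",")]
    unknown = [level for level in levels if level not in RPC_PROTECTION_TO_QOP]
    if unknown:
        raise ValueError(f"unknown protection level(s): {unknown}")
    return ",".join(RPC_PROTECTION_TO_QOP[level] for level in levels)
```

Choosing `privacy` encrypts RPC traffic at the cost of throughput, which is the trade-off behind the weak QoP defaults mentioned earlier in the deck.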
15.
Architectural Case Study: Commercial Client
16.
Challenges
+ Client is a multi-national Fortune 500 company with over 100,000 employees
+ Client had multiple data sources for each business unit: R&D, Manufacturing, Sales and Marketing, Corporate
+ Client wanted to combine data, but there were many sensitive issues around new product development and access to data by third-party contractors and others within its network boundaries
+ Previous efforts to integrate data had failed because of political and technical issues
+ Could not get CISO to sign off on combining data!
17.
Securing the Enterprise Ecosystem
Design Goals
+ Build a fully realized "Data Lake" combining information from many different sources
+ Protect from unauthorized release or modification of information
+ Focus primarily on full-text retrieval but enable a variety of analytic functions
+ Enable the use of a variety of components from the Hadoop ecosystem
+ Implement in a series of phases based on client requirements
18.
Our Common Reference Architecture for Big Data
[Figure: layered reference architecture]
+ Human Insights and Actions: enabled by customizable interfaces and visualizations of the data
+ Analytics and Services: your tools for analysis, modeling, testing, and simulations
+ Data Management: the single, secure repository for all of your valuable data
+ Infrastructure: the technology platform for storing and managing your data
Figure labels: Visualization, Reporting, Dashboards, and Query Interface; Services (SOA); Analytics and Discovery; Views and Indexes; Data Lake; Metadata Tagging; Data Sources; Infrastructure/Management; Machine Learning; Free-Computation; Alerting; Geographic; Language Translation; Entity Relationship; Event; Grab; Dense/Sparse; Structured; Unstructured; Streaming; Provisioning; Deployment; Monitoring; Workflow; Streaming Analytics; Streaming Indexes
19.
Data Lake Platform Components & Search App. Architecture
[Figure: architecture diagram; open source components are shown in green. Labels recovered from the figure, in extraction order:]
+ Distributed Storage; Extract; Distributed Analytics & Indexing; Presentation Layer
+ periodic updates; Non-Relational Stores; Static Relational Databases; Static Data; Custom Ingest Logic; Sqoop
+ Hadoop HDFS; Storm+Lucene Processing Layer; Index Files; Index Persistence & Meta-data Management (depending on use case)
+ Jetty App Server; Applications & Services Layer; interactive search; batch reporting; View / UI Model; Browser App Front-end Client (On-Network Users)
+ Enterprise Security, Monitoring, and Governance Controls
+ Hadoop Map/Reduce; Search & BI Logic; Kerberos SSO Connector; Directory Services; On-Premise Firewall; Hive
+ DNS, DHCP, NTP, SMTP, Proxy (package updates) Services; ZooKeeper; Information Model / Hive meta-store
+ Security Groups (FW); Network ACLs; Standard AWS Machine Images; Encrypted Data Volumes; Antivirus & System Monitoring; Knox Gateway & Audit Logging
+ AWS Direct Connect; AWS Virtual Private Cloud (EC2); On-Premise Network; Remote Access Certificate (2-way SSL)
+ Accumulo; Data Governance & Stewardship; Analytic App & BI Users (On-Network); Spotfire & Other BI Tools; Privileged Users / Data Scientists (Direct Access)
+ Streaming Data; User Uploaded Data Sets; Relational Database Triggers; Kafka; low-latency updates
20.
tl;dr
+ Data loading via Sqoop / custom transport
+ Ingest / index via MapReduce
+ Distributed query via Storm+Lucene
+ Batch / reporting via MR / Hive
+ Authentication via Kerberos
+ Access via web application & Knox
+ Currently 100TB / 50% used, 150TB by EOY
21.
Infrastructure and Network Security
+ Amazon Web Services provided:
  + Virtual Private Cloud / Security Groups
  + Time to deployment in early phases
  + Physical access to data centers, network isolation, etc.
+ Future transition to on-premise infrastructure
  + Concerned with procurement time
  + Other clients we've worked with have seen a 3-6 month turnaround for infrastructure prep
+ Instance-level malware detection tuned to co-exist with cluster workloads
22.
Encryption
At Rest:
+ LUKS (Linux Unified Key Setup) for ephemeral storage volumes
+ "Lock it up and throw away the key"
In Transit:
+ SSL to web app endpoints and Knox Gateway
+ Internal network isolation: VPC controls prevent traffic interception & MITM attacks
23.
Authentication and Authorization
+ Authentication via Kerberos
+ Authorization via LDAP
+ Future transition to enterprise authentication services: Oracle IAM
+ Multi-factor authentication for both users and devices via PKI
+ Authorization performed at both the user and device level
24.
Privileged Access Management
+ Operating system user accounts and groups for users, projects and teams are reflected in HDFS permissions
+ Privileged access is via a Knox Gateway extension which provides access via SSH, plus auditing, monitoring and control of administrative connections into the cluster (KNOX-250)
[Figure labels: External Sources; Identity Provider; Knox Gateway; Hadoop Cluster (Master) (Oozie) (Hive2 Server); REST/SSL; SSH; HTTP SPNEGO]
25.
Putting it All Together
+ Search UI is a web application accessed via SSL
+ Knox is the primary access mechanism for users who need access to the cluster. Knox provides access to the following services:
  + WebHDFS, WebHCat, Hive, Oozie
+ Knox for administrative access, via custom SSH plugin
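Clients reach the services listed on this slide through gateway URLs rather than through the cluster hosts directly. The Python sketch below builds such URLs under stated assumptions: the common Knox pattern `https://{host}:{port}/{gateway-path}/{topology}/{service}`, the default port 8443, the gateway path `gateway`, and the service path suffixes shown in the table. The host name, topology name, and the helper `knox_url` are hypothetical, and the suffixes should be checked against the deployed Knox version.

```python
from urllib.parse import quote, urlencode

# Assumed service path suffixes behind the gateway.
KNOX_SERVICE_PATHS = {
    "webhdfs": "webhdfs/v1",
    "webhcat": "templeton/v1",
    "oozie": "oozie",
    "hive": "hive",
}


def knox_url(host, topology, service, resource="", params=None,
             port=8443, gateway_path="gateway"):
    """Build the gateway URL for a service exposed through Knox."""
    url = (f"https://{host}:{port}/{gateway_path}/{topology}/"
           f"{KNOX_SERVICE_PATHS[service]}")
    if resource:
        # Percent-encode the path but keep the slashes between segments.
        url += "/" + quote(resource.strip("/"))
    if params:
        url += "?" + urlencode(params)
    return url


# Example: list a directory over WebHDFS through the gateway.
listing_url = knox_url("knox.example.com", "default", "webhdfs",
                       "/data/lake", {"op": "LISTSTATUS"})
```

Because every request passes through one endpoint, authentication and audit logging can be enforced in a single place instead of on each cluster service.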
26.
Future Directions
+ Role-Based Access Control is an emerging client need. This will require:
  + Integration with enterprise role management
  + Passing roles through web app & Knox to backend
  + Role-based access in Accumulo, Lucene indexes
+ Smart Data tagging strategy …
27.
Smart Data
28.
Smart Data
+ How many organizations have data security requirements?
+ A structured, verifiable representation of security tags bound to the data is required in order for the enterprise to become inherently "smarter" about the information flowing in and around it: Smart Data
+ Overview of design principles:
  + PKI
  + Implement ABAC controls in IdAM
  + Define trusted data format based on data security
  + Tag all your data
  + Deploy Hadoop platform that leverages tags to track access
  + Log, monitor, and audit everything
29.
Overview of Smart Data
[Figure labels: User; IDAM; Authentication; Authorization; Attributes (red, orange, blue); Data Element; Visibility Tags (red | blue | green); Apache Accumulo]
[Reference architecture labels repeated in the figure: Machine Learning; Free-Computation; Alerting; Geographic; Language Translation; Entity Relationship; Event; Grab; Dense/Sparse; Structured; Unstructured; Streaming; Provisioning; Deployment; Monitoring; Workflow; Streaming Analytics; Streaming Indexes]
30.
Smart Data Security Controls
+ Trusted Client: authorization and authentication using PKI
+ Trusted Data Format: data visibility is controlled using Boolean expressions
  + Ex. "((red|blue|green) & (white|yellow))"
  + Clients present authorizations (red, blue, green, yellow) to Apache Accumulo
  + Corresponding tags are bound to data stored in Apache Accumulo
+ Trusted Log: all data interactions are logged and audited
Identity and Access Management
+ Attribute Based Access Control: users are all assigned a series of attributes
+ Attributes and authorization bound by XACML, SAML
  + Policy Decision Point (PDP)
  + Policy Enforcement Point (PEP)
  + Policy Retrieval Point (PRP)
  + Policy Information Point (PIP)
  + Policy Administration Point (PAP)
Example policy:
  Allow access to resource MedicalJournal with attribute patientID=x
    if Subject match DesignatedDoctorOfPatient and action is read
  with obligation
    on Permit: doLog_Inform(patientID,Subject,time)
    on Deny: doLog_UnauthorizedLogin(patientID,Subject,time)
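The Boolean visibility expression on this slide, "((red|blue|green) & (white|yellow))", can be checked against a client's authorizations with a small recursive-descent evaluator. The Python sketch below is an illustration only and is not Accumulo's implementation: the function names are hypothetical, labels are assumed to be alphanumeric, and `&` is given higher precedence than `|`, whereas Accumulo expects parentheses when the two operators are mixed.

```python
import re

# A label, an operator, or a parenthesis, with optional leading whitespace.
TOKEN = re.compile(r"\s*([A-Za-z0-9_]+|[&|()])")


def tokenize(expression):
    """Split a visibility expression into labels, operators and parentheses."""
    expression = expression.strip()
    tokens, pos = [], 0
    while pos < len(expression):
        match = TOKEN.match(expression, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_visible(expression, authorizations):
    """Return True if the authorizations satisfy the visibility expression."""
    tokens = tokenize(expression)
    if not tokens:
        return True  # an empty expression places no restriction
    auths = set(authorizations)
    pos = 0

    def parse_or():
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            right = parse_and()  # always parse, then combine
            result = result or right
        return result

    def parse_and():
        nonlocal pos
        result = parse_term()
        while pos < len(tokens) and tokens[pos] == "&":
            pos += 1
            right = parse_term()
            result = result and right
        return result

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            result = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if token in ("&", "|", ")"):
            raise ValueError(f"unexpected token {token!r}")
        return token in auths  # a bare label: does the client hold it?

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result
```

With the slide's authorizations (red, blue, green, yellow) the expression is satisfied, because the client holds at least one label from each parenthesized group; a client holding only red, blue, and green is refused because neither white nor yellow is present.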
31.
Tagging Smart Data
Formulate the tags used to control data from multiple perspectives
+ Data Origin
+ Level of Access Required
+ Information Governance Policy
+ Data Owners
+ Intended Recipients
Use fine-grained tags, assign users many roles
+ Tag at the field level so that existence can be verified without revealing the full data record
In Accumulo:
+ Capitalize on the richness of Boolean expressions in visibility tags
+ Differential compression eliminates the impact of repetition of data
+ Visibility tags are bound to the data, so changing visibilities is not trivial: it means a delete and a re-add
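Two points from this slide, field-level tagging and the cost of changing a visibility, can be illustrated with a toy in-memory store in which the tag is part of each cell's key. This Python sketch is a simplification under stated assumptions: the class `TaggedStore` and its methods are hypothetical, and a tag is modeled as a plain set of labels that must all be held, not as a full Boolean expression.

```python
class TaggedStore:
    """Toy key-value store where the visibility tag is part of each key."""

    def __init__(self):
        self._cells = {}  # (row, column, frozenset of labels) -> value

    def put(self, row, column, visibility, value):
        self._cells[(row, column, frozenset(visibility))] = value

    def delete(self, row, column, visibility):
        del self._cells[(row, column, frozenset(visibility))]

    def retag(self, row, column, old_visibility, new_visibility):
        """Changing a tag is a delete of the old cell plus a re-add."""
        value = self._cells[(row, column, frozenset(old_visibility))]
        self.delete(row, column, old_visibility)
        self.put(row, column, new_visibility, value)

    def scan(self, row, authorizations):
        """Return only the fields whose labels are all held by the reader."""
        auths = set(authorizations)
        return {
            column: value
            for (r, column, labels), value in self._cells.items()
            if r == row and labels <= auths
        }


store = TaggedStore()
store.put("patient1", "name", ["STAFF"], "Ada")
store.put("patient1", "diagnosis", ["STAFF", "HIPAA"], "flu")
staff_view = store.scan("patient1", ["STAFF"])  # record exists, diagnosis hidden
```

Because the tag is stored in the key, `retag` cannot update a cell in place, which mirrors why a visibility change on real data is a rewrite rather than a metadata edit.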
32.
Representational versus Referential Tags
Representational tags encode the specific visibilities they represent, including all alternate controls for a specific document
+ User has roles of ACCOUNTING, RESEARCH and PII
+ If data has tag PII&RESEARCH, user can access data
+ If data has tag HIPAA&ACCOUNTING, user can't access data
Referential tags are a code that relies on external translation between assigned access controls and visibility markings
+ Data has marking of 03DECAF00D
+ User has roles of ACCOUNTING, RESEARCH and PII
+ At lookup, translation of user roles into possible referential tags
Choice depends on security posture: what are the consequences of getting it wrong versus the ease of shifting policy or data?
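The difference between the two tag styles can be sketched in a few lines of Python. In this illustration, expressions are written as nested tuples instead of parsed strings, and the translation table is entirely hypothetical: the slide does not say what 03DECAF00D stands for, so the mapping to PII&RESEARCH and the second code `0BADC0FFEE` are assumptions made for the example.

```python
def satisfies(expression, roles):
    """Evaluate a bare label or a nested ("and" | "or", operand, ...) tuple."""
    if isinstance(expression, str):
        return expression in roles
    operator, *operands = expression
    results = [satisfies(operand, roles) for operand in operands]
    if operator == "and":
        return all(results)
    if operator == "or":
        return any(results)
    raise ValueError(f"unknown operator: {operator!r}")


USER_ROLES = {"ACCOUNTING", "RESEARCH", "PII"}

# Representational: the tag stored with the data is the expression itself.
can_read_research = satisfies(("and", "PII", "RESEARCH"), USER_ROLES)
can_read_hipaa = satisfies(("and", "HIPAA", "ACCOUNTING"), USER_ROLES)

# Referential: the data carries only an opaque code; an external table
# (hypothetical contents) translates codes into access rules.
REFERENTIAL_TABLE = {
    "03DECAF00D": ("and", "PII", "RESEARCH"),
    "0BADC0FFEE": ("and", "HIPAA", "ACCOUNTING"),
}


def readable_markings(roles, table=REFERENTIAL_TABLE):
    """At lookup, translate the user's roles into the codes they may read."""
    return {code for code, rule in table.items() if satisfies(rule, roles)}


user_markings = readable_markings(USER_ROLES)
```

With referential tags a policy change only touches the table, while representational tags require rewriting the data; the price is that a wrong or compromised table silently changes who can read everything marked with a code.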
33.
Emerging Security Capabilities
34.
Ecosystem for security capabilities for Hadoop is growing rapidly
Cloudera (with Intel Rhino)
+ Sentry (ACLs for Hive / Impala)
+ Gazzang (Filesystem Encryption)
+ Intel Rhino
  + Encryption Codec Support: HADOOP-9331
  + Key Distribution & Management: MAPREDUCE-5025
  + Token Based Authentication: HADOOP-9392
  + Unified Authorization Framework: HADOOP-9466
  + Transparent Encryption for HBase/ZooKeeper
  + Others, see https://github.com/intel-hadoop/project-rhino/
Hortonworks
+ Production Ready Apache Knox
+ XA Secure
  + Central Administration
  + Authorization for HDFS / Hive / HBase
  + Compliance Controls
Lots of talks at this Hadoop Summit on data security:
+ The Future of Hadoop Security – Joey Echeverria
+ Hadoop REST API Security with the Apache Knox Gateway – Kevin Minder, Larry McCay
+ Securing Big Data: Lock it Down, or Liberate? – Jeff Graham, Mark Tomallo
+ Improvements in Hadoop Security – Sanjay Radia, Chris Nauroth
35.
Summary
+ Security for Hadoop has come a long way and is changing rapidly, but is still maturing
+ Securing the data in Hadoop means thinking differently about the architecture when combining multiple data sources
+ Your Hadoop architecture should provide consistent security mechanisms across all of the data
+ A more complete way to secure data is to implement Smart Data (ABAC and fine-grained access controls), but this hasn't been embraced consistently across the Hadoop ecosystem yet
+ The next 6 months will be interesting …
36.
Just Released! The Field Guide to Data Science
120 page e-book of data science geekery
Download for free: http://www.boozallen.com/datascience
Thanks!
Drew (@drewfarris)
Peter (@petrguerra)