SlideShare a Scribd company logo
1 of 27
Download to read offline
1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Extend Governance in Hadoop
with Atlas Ecosystem:
Waterline, Attivo & Trifacta
June 30, 2016
Apache Atlas
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Disclaimer
This	document	may	contain	product	features	and	technology	 directions	that	are	under	development,	 may	be	
under	development	 in	the	future	or	may	ultimately	 not	be	developed.
Project	capabilities	 are	based	on	information	 that	is	publicly	 available	 within	the	Apache	Software	Foundation	
project	websites	("Apache").		Progress	of	the	project	capabilities	 can	be	tracked	from	inception	 to	release	
through	Apache,	however,	technical	feasibility,	 market	demand,	 user	feedback	and	the	overarching	 Apache	
Software	Foundation	 community	 development	process	can	all	effect	timing	and	final	 delivery.
This	document’s	 description	 of	these	features	and	technology	 directions	does	not	represent	a	contractual	
commitment,	 promise	or	obligation	 from	Hortonworks	 to	deliver	these	features	in	any	generally	available	
product.
Product	features	and	technology	directions	 are	subject	to	change,	 and	must	not	be	included	 in	contracts,	
purchase	orders,	or	sales	agreements	of	any	kind.
Since	this	document	contains	an	outline	of	general	product	development	 plans,	 customers	should	 not	rely	upon	it	
when	making	 purchasing	 decisions.
3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
STRUCTURED
UNSTRUCTURED
Vision - Enterprise Data Governance Across Platforms
TRADITIONAL
RDBMS
METADATA
MPP
APPLIANCES
Project 1
Project 5
Project 4
Project 3
METADATA
Project 6
DATA
LAKE
Atlas: Metadata Truth in Hadoop
Data Management
along the entire data lifecycle with
integrated provenance and lineage
capability
Modeling with Metadata
enables comprehensive data lineage through
a hybrid approach with enhanced tagging
and attribute capabilities
Interoperable Solutions
across the Hadoop ecosystem, through a
common metadata store
4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Governance Ready Certification Program
Discovery
Tagging
Prep	 /	
Cleanse
ETL
Governance
BPM
Self	 Service
Visualization
Choice: Customers choose features that they want to
deploy—a la carte versus vendor lock
Curated & Fast: Selected group of vendor partners to
provide rich, complimentary and complete features ready
to deploy
Agile: Low switching costs, Faster deployment and
innovation
Centralized: Common SLA & common open metadata
store
Flexibility: Interoperability of products through Atlas
metadata
Safe: HDP at core to provide stability and interoperability
5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Governance Ready Certification Program
Discovery
Tagging
Prep	 /	
Cleanse
ETL
Governance
BPM
Self	 Service
Visualization
The Apache open source community is
committed to collaboration which critical for
proper data governance. Partners have
adopted this commitment and are extending
governance capabilities by integrating their
products with Atlas -- which is providing a rich
innovative community with a common
metadata store backed by Atlas. This session
will showcase 3 vendors:
– Waterline
– Attivo
– Trifacta
6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Additional Atlas Sessions
• BOF:	Apache	Knox	and	Apache	Ranger	provide	Hadoop	security	
while	Atlas	provides	a	Hadoop	metadata	store	and	enterprise	
compliance.	Come	learn	and	discuss	security	&	governance	
innovations	and	future	directions.
Thursday 5-7 PM @ Room 210A
7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Learn More:
• Hortonworks	links:		http://hortonworks.com/solutions/security-and-
governance/
• Tutorials:			https://github.com/hortonworks/tutorials/tree/atlas-ranger-
tp/tutorials/hortonworks/atlas-ranger-preview
Waterline Data
The Smart Data Catalog Company
Unlock TheValue Of The Data LakeWith
WaterlineData’s Smart Data Catalog
Time To Value
AUTOMATICALLY catalog data
assets across ALL the data AND
enable SELF-SERVICE access
Tribal Knowledge Sharing
AUGMENT semantic discovery by
CROWDSOURCING tribal data
knowledge
Trust
Enable AGILE GOVERNANCE
with automated tagging, data
stewardship, and SECURE SELF-
SERVICE access to data based on
role and policy
ShoppingMetaphorFor “Managed” Self-Service:
Amazon.com
Catalog
Find, Understand And
Collaborate
Provision
Waterline Data Is LikeAmazon For The Data Lake
Catalog
Find, Understand And
Collaborate
Provision
WorkflowOf EnablingSelf-ServiceAnalytics
With Hortonworks
Hortonworks Atlas And Ranger
Data Prep Analytics &
Visualization
Smart Data
Discovery
Profiling,Sensitive
Data & Data
Lineage Discovery,
AutomatedTagging
Data
Stewardship
Curate Tags
Self-Service
Data Catalog
Find,Collaborate
And Take Action
Metadata,Tags,
Data Lineage
Metadata,Tags,
Roles&Access
Control
Roles&Access
Control
Demo
Waterline Data
The Smart Data Catalog Company
UNIFY YOUR DATA ACROSS
SILOS
Joe Lichtman
Vice President, Product
jlichtman@attivio.com
1 © 2016 ATTIVIO | PROPRIETARY AND CONFIDENTIAL
WHO IS ATTIVIO?
Attivio
unifies your data across silos
to provide a 360° view of your
business
2 © 2016 ATTIVIO | PROPRIETARY AND CONFIDENTIAL
Gartner
Magic Quadrant
For Enterprise
Search, Q3
2015
Forrester
Wave: Big
Data Search
and Knowledge
Discovery
Solutions,
Q3 2015
Forrester Wave:
Big Data Text Analytics
Platforms, Q2 2016
Gartner
Magic Quadrant For
Enterprise Search,
August 2015
Forrester Wave:
Big Data Search and
Knowledge Discovery
Solutions, Q3 2015
LEADER IN SEARCH, DATA DISCOVERY AND TEXT ANALYTICS
3 © 2016 ATTIVIO | PROPRIETARY AND CONFIDENTIAL
SEMANTIC DATA CATALOG
Attivio radically reduces time spent finding and
understanding data sources to speed time-to-analytics.
• Catalogs all your enterprise information
• Identifies what’s most relevant
• Unifies all structuredand semi-structured sources
in a visual model
• Provisions leadingBI and predictive analytics
tools such as Qlik, R, RapidMiner, Spotfire, and Tableau
58%
of the effort for BI initiatives is
wasted on data exploration and
integration
33%
of businesses cite big data
discovery as a challenge they
are facing
50%
Businesses use less than half
of their available data for BI
4 © 2016 ATTIVIO | PROPRIETARY AND CONFIDENTIAL
CATALOG ALL YOUR ENTERPRISE INFORMATION
• Spiders and extracts
metadata for all
information types
• Automatically catalogs
data and content with
semantic meaning
• Applies human expertise
to fine-tune tagging and
align with business rules
5 © 2016 ATTIVIO | PROPRIETARY AND CONFIDENTIAL
IDENTIFY THE RIGHT INFORMATION
• Delivers natural language
and keyword search
• Provides an eCommerce-like
shopping cart for data
• Recommends the most
relevant data for your
context
6 © 2016 ATTIVIO | PROPRIETARY AND CONFIDENTIAL
UNIFY THE INFORMATION FOR YOUR ANALYTIC CONTEXT
• Automatically generates
data models
• Correlates all structured
data and unstructured
content
• Simplifies provisioning to
BI and advanced analytic
tools
7 © 2016 ATTIVIO | PROPRIETARY AND CONFIDENTIAL
PROVISION & OPERATIONALIZE DATA AS A STRATEGIC ASSET
• Provision directly to agile
BI and analytics tools
OR
• Rationalize the data
warehouse for greater
simplicity and lower cost
• Power domain-specific
apps
THANK YOU!
Trifacta + Hortonworks:
Apache Atlas Integration
DATAWRANGLING
What is Data Wrangling?
2
QUESTION ANALYZE INSIGHTDISCOVER STRUCTURE CLEANSE ENRICH VALIDATE PUBLISH
I
N
G
E
S
T
I
O
N
A
C
C
E
S
S
DATA SOURCES
Transactional Data
banking
credit cards
lending
wealth
mortgages
ledgers
trades
payments
Interaction Data
social
webchat
Analytics
Reporting
Data Product Models
BUSINESS OPERATIONS
Data Wrangling within the Hortonworks Data Lake
Discovery
Zone
Shared
Zone
Raw Data
Zone
Trifacta + Hortonworks & Apache Atlas
Governance
Ingestion
Metadata & APIs
Data Wrangling
Analysis & Consumption

More Related Content

What's hot

GDPR-focused partner community showcase for Apache Ranger and Apache Atlas
GDPR-focused partner community showcase for Apache Ranger and Apache AtlasGDPR-focused partner community showcase for Apache Ranger and Apache Atlas
GDPR-focused partner community showcase for Apache Ranger and Apache AtlasDataWorks Summit
 
Automatic Detection, Classification and Authorization of Sensitive Personal D...
Automatic Detection, Classification and Authorization of Sensitive Personal D...Automatic Detection, Classification and Authorization of Sensitive Personal D...
Automatic Detection, Classification and Authorization of Sensitive Personal D...DataWorks Summit/Hadoop Summit
 
Interactive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using DruidInteractive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using DruidDataWorks Summit
 
Make Streaming Analytics work for you: The Devil is in the Details
Make Streaming Analytics work for you: The Devil is in the DetailsMake Streaming Analytics work for you: The Devil is in the Details
Make Streaming Analytics work for you: The Devil is in the DetailsDataWorks Summit/Hadoop Summit
 
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...Hortonworks
 
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHortonworks
 
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks
 
Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)Jeffrey T. Pollock
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalHortonworks
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks
 
Hortonworks, Novetta and Noble Energy Webinar
Hortonworks, Novetta and Noble Energy Webinar Hortonworks, Novetta and Noble Energy Webinar
Hortonworks, Novetta and Noble Energy Webinar Hortonworks
 
Design a Dataflow in 7 minutes with Apache NiFi/HDF
Design a Dataflow in 7 minutes with Apache NiFi/HDFDesign a Dataflow in 7 minutes with Apache NiFi/HDF
Design a Dataflow in 7 minutes with Apache NiFi/HDFHortonworks
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Hortonworks
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalHortonworks
 
Security, ETL, BI & Analytics, and Software Integration
Security, ETL, BI & Analytics, and Software IntegrationSecurity, ETL, BI & Analytics, and Software Integration
Security, ETL, BI & Analytics, and Software IntegrationDataWorks Summit
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data LakeVMware Tanzu
 
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsPredicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsHortonworks
 

What's hot (20)

GDPR-focused partner community showcase for Apache Ranger and Apache Atlas
GDPR-focused partner community showcase for Apache Ranger and Apache AtlasGDPR-focused partner community showcase for Apache Ranger and Apache Atlas
GDPR-focused partner community showcase for Apache Ranger and Apache Atlas
 
Automatic Detection, Classification and Authorization of Sensitive Personal D...
Automatic Detection, Classification and Authorization of Sensitive Personal D...Automatic Detection, Classification and Authorization of Sensitive Personal D...
Automatic Detection, Classification and Authorization of Sensitive Personal D...
 
Interactive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using DruidInteractive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using Druid
 
Make Streaming Analytics work for you: The Devil is in the Details
Make Streaming Analytics work for you: The Devil is in the DetailsMake Streaming Analytics work for you: The Devil is in the Details
Make Streaming Analytics work for you: The Devil is in the Details
 
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
 
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
 
Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks
 
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
 
Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.final
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
 
Hortonworks, Novetta and Noble Energy Webinar
Hortonworks, Novetta and Noble Energy Webinar Hortonworks, Novetta and Noble Energy Webinar
Hortonworks, Novetta and Noble Energy Webinar
 
Smart Cities: An APAC Necessity
Smart Cities: An APAC Necessity Smart Cities: An APAC Necessity
Smart Cities: An APAC Necessity
 
Design a Dataflow in 7 minutes with Apache NiFi/HDF
Design a Dataflow in 7 minutes with Apache NiFi/HDFDesign a Dataflow in 7 minutes with Apache NiFi/HDF
Design a Dataflow in 7 minutes with Apache NiFi/HDF
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_final
 
Security, ETL, BI & Analytics, and Software Integration
Security, ETL, BI & Analytics, and Software IntegrationSecurity, ETL, BI & Analytics, and Software Integration
Security, ETL, BI & Analytics, and Software Integration
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake
 
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsPredicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
 

Viewers also liked

Apache Atlas: Tracking dataset lineage across Hadoop components
Apache Atlas: Tracking dataset lineage across Hadoop componentsApache Atlas: Tracking dataset lineage across Hadoop components
Apache Atlas: Tracking dataset lineage across Hadoop componentsDataWorks Summit/Hadoop Summit
 
Role of Analytics in Consumer Packaged Goods Industry
Role of Analytics in Consumer Packaged Goods IndustryRole of Analytics in Consumer Packaged Goods Industry
Role of Analytics in Consumer Packaged Goods IndustryPerceptive Analytics
 
How PepsiCo's Big Data Strategy is Disrupting CPG Retail Analytics
How PepsiCo's Big Data Strategy is Disrupting CPG Retail AnalyticsHow PepsiCo's Big Data Strategy is Disrupting CPG Retail Analytics
How PepsiCo's Big Data Strategy is Disrupting CPG Retail AnalyticsHortonworks
 
Préparation de Données Hadoop avec Trifacta
Préparation de Données Hadoop avec TrifactaPréparation de Données Hadoop avec Trifacta
Préparation de Données Hadoop avec TrifactaVictor Coustenoble
 
Data Wrangling on Hadoop - Olivier De Garrigues, Trifacta
Data Wrangling on Hadoop - Olivier De Garrigues, TrifactaData Wrangling on Hadoop - Olivier De Garrigues, Trifacta
Data Wrangling on Hadoop - Olivier De Garrigues, Trifactahuguk
 
Data Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
Data Preparation vs. Inline Data Wrangling in Data Science and Machine LearningData Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
Data Preparation vs. Inline Data Wrangling in Data Science and Machine LearningKai Wähner
 

Viewers also liked (6)

Apache Atlas: Tracking dataset lineage across Hadoop components
Apache Atlas: Tracking dataset lineage across Hadoop componentsApache Atlas: Tracking dataset lineage across Hadoop components
Apache Atlas: Tracking dataset lineage across Hadoop components
 
Role of Analytics in Consumer Packaged Goods Industry
Role of Analytics in Consumer Packaged Goods IndustryRole of Analytics in Consumer Packaged Goods Industry
Role of Analytics in Consumer Packaged Goods Industry
 
How PepsiCo's Big Data Strategy is Disrupting CPG Retail Analytics
How PepsiCo's Big Data Strategy is Disrupting CPG Retail AnalyticsHow PepsiCo's Big Data Strategy is Disrupting CPG Retail Analytics
How PepsiCo's Big Data Strategy is Disrupting CPG Retail Analytics
 
Préparation de Données Hadoop avec Trifacta
Préparation de Données Hadoop avec TrifactaPréparation de Données Hadoop avec Trifacta
Préparation de Données Hadoop avec Trifacta
 
Data Wrangling on Hadoop - Olivier De Garrigues, Trifacta
Data Wrangling on Hadoop - Olivier De Garrigues, TrifactaData Wrangling on Hadoop - Olivier De Garrigues, Trifacta
Data Wrangling on Hadoop - Olivier De Garrigues, Trifacta
 
Data Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
Data Preparation vs. Inline Data Wrangling in Data Science and Machine LearningData Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
Data Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
 

Similar to Extend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & Trifacta

Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...
Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...
Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...DataWorks Summit/Hadoop Summit
 
What the #$* is a Business Catalog and why you need it
What the #$* is a Business Catalog and why you need it What the #$* is a Business Catalog and why you need it
What the #$* is a Business Catalog and why you need it DataWorks Summit/Hadoop Summit
 
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies DataWorks Summit/Hadoop Summit
 
Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?DataWorks Summit/Hadoop Summit
 
Classification based security in Hadoop
Classification based security in HadoopClassification based security in Hadoop
Classification based security in HadoopMadhan Neethiraj
 
Security and Data Governance using Apache Ranger and Apache Atlas
Security and Data Governance using Apache Ranger and Apache AtlasSecurity and Data Governance using Apache Ranger and Apache Atlas
Security and Data Governance using Apache Ranger and Apache AtlasDataWorks Summit/Hadoop Summit
 
Oracle analytics cloud overview feb 2017
Oracle analytics cloud overview   feb 2017Oracle analytics cloud overview   feb 2017
Oracle analytics cloud overview feb 2017aioughydchapter
 
Using Tableau with Hortonworks Data Platform
Using Tableau with Hortonworks Data PlatformUsing Tableau with Hortonworks Data Platform
Using Tableau with Hortonworks Data PlatformHortonworks
 
A7 storytelling with_oracle_analytics_cloud
A7 storytelling with_oracle_analytics_cloudA7 storytelling with_oracle_analytics_cloud
A7 storytelling with_oracle_analytics_cloudDr. Wilfred Lin (Ph.D.)
 
Enterprise Apache Hadoop: State of the Union
Enterprise Apache Hadoop: State of the UnionEnterprise Apache Hadoop: State of the Union
Enterprise Apache Hadoop: State of the UnionHortonworks
 
Implementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceImplementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceHortonworks
 
Big Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San JoseBig Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San JoseJeffrey T. Pollock
 
Enabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseEnabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseHortonworks
 
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopEnrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopHortonworks
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack EuropeHortonworks
 
Level Up – How to Achieve Hadoop Acceleration
Level Up – How to Achieve Hadoop AccelerationLevel Up – How to Achieve Hadoop Acceleration
Level Up – How to Achieve Hadoop AccelerationInside Analysis
 
The Implacable advance of the data
The Implacable advance of the dataThe Implacable advance of the data
The Implacable advance of the dataDataWorks Summit
 
Analytic Excellence - Saying Goodbye to Old Constraints
Analytic Excellence - Saying Goodbye to Old ConstraintsAnalytic Excellence - Saying Goodbye to Old Constraints
Analytic Excellence - Saying Goodbye to Old ConstraintsInside Analysis
 
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...Platfora
 

Similar to Extend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & Trifacta (20)

Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...
Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...
Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...
 
What the #$* is a Business Catalog and why you need it
What the #$* is a Business Catalog and why you need it What the #$* is a Business Catalog and why you need it
What the #$* is a Business Catalog and why you need it
 
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
 
Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?
 
Classification based security in Hadoop
Classification based security in HadoopClassification based security in Hadoop
Classification based security in Hadoop
 
Apache Atlas: Governance for your Data
Apache Atlas: Governance for your DataApache Atlas: Governance for your Data
Apache Atlas: Governance for your Data
 
Security and Data Governance using Apache Ranger and Apache Atlas
Security and Data Governance using Apache Ranger and Apache AtlasSecurity and Data Governance using Apache Ranger and Apache Atlas
Security and Data Governance using Apache Ranger and Apache Atlas
 
Oracle analytics cloud overview feb 2017
Oracle analytics cloud overview   feb 2017Oracle analytics cloud overview   feb 2017
Oracle analytics cloud overview feb 2017
 
Using Tableau with Hortonworks Data Platform
Using Tableau with Hortonworks Data PlatformUsing Tableau with Hortonworks Data Platform
Using Tableau with Hortonworks Data Platform
 
A7 storytelling with_oracle_analytics_cloud
A7 storytelling with_oracle_analytics_cloudA7 storytelling with_oracle_analytics_cloud
A7 storytelling with_oracle_analytics_cloud
 
Enterprise Apache Hadoop: State of the Union
Enterprise Apache Hadoop: State of the UnionEnterprise Apache Hadoop: State of the Union
Enterprise Apache Hadoop: State of the Union
 
Implementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceImplementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data Governance
 
Big Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San JoseBig Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San Jose
 
Enabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical EnterpriseEnabling the Real Time Analytical Enterprise
Enabling the Real Time Analytical Enterprise
 
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopEnrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack Europe
 
Level Up – How to Achieve Hadoop Acceleration
Level Up – How to Achieve Hadoop AccelerationLevel Up – How to Achieve Hadoop Acceleration
Level Up – How to Achieve Hadoop Acceleration
 
The Implacable advance of the data
The Implacable advance of the dataThe Implacable advance of the data
The Implacable advance of the data
 
Analytic Excellence - Saying Goodbye to Old Constraints
Analytic Excellence - Saying Goodbye to Old ConstraintsAnalytic Excellence - Saying Goodbye to Old Constraints
Analytic Excellence - Saying Goodbye to Old Constraints
 
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
The Big Data Gusher: Big Data Analytics, the Internet of Things and the Oil B...
 

More from DataWorks Summit/Hadoop Summit

Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerDataWorks Summit/Hadoop Summit
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformDataWorks Summit/Hadoop Summit
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDataWorks Summit/Hadoop Summit
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...DataWorks Summit/Hadoop Summit
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...DataWorks Summit/Hadoop Summit
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLDataWorks Summit/Hadoop Summit
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)DataWorks Summit/Hadoop Summit
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...DataWorks Summit/Hadoop Summit
 

More from DataWorks Summit/Hadoop Summit (20)

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 

Recently uploaded

Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 

Recently uploaded (20)

Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 

Extend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & Trifacta

  • 1. 1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Extend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & Trifacta June 30, 2016 Apache Atlas
  • 2. 2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Disclaimer This document may contain product features and technology directions that are under development, may be under development in the future or may ultimately not be developed. Project capabilities are based on information that is publicly available within the Apache Software Foundation project websites ("Apache"). Progress of the project capabilities can be tracked from inception to release through Apache, however, technical feasibility, market demand, user feedback and the overarching Apache Software Foundation community development process can all effect timing and final delivery. This document’s description of these features and technology directions does not represent a contractual commitment, promise or obligation from Hortonworks to deliver these features in any generally available product. Product features and technology directions are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind. Since this document contains an outline of general product development plans, customers should not rely upon it when making purchasing decisions.
  • 3. 3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved STRUCTURED UNSTRUCTURED Vision - Enterprise Data Governance Across Platforms TRADITIONAL RDBMS METADATA MPP APPLIANCES Project 1 Project 5 Project 4 Project 3 METADATA Project 6 DATA LAKE Atlas: Metadata Truth in Hadoop Data Management along the entire data lifecycle with integrated provenance and lineage capability Modeling with Metadata enables comprehensive data lineage through a hybrid approach with enhanced tagging and attribute capabilities Interoperable Solutions across the Hadoop ecosystem, through a common metadata store
  • 4. 4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Governance Ready Certification Program Discovery Tagging Prep / Cleanse ETL Governance BPM Self Service Visualization Choice: Customers choose features that they want to deploy—a la carte versus vendor lock Curated & Fast: Selected group of vendor partners to provide rich, complimentary and complete features ready to deploy Agile: Low switching costs, Faster deployment and innovation Centralized: Common SLA & common open metadata store Flexibility: Interoperability of products through Atlas metadata Safe: HDP at core to provide stability and interoperability
  • 5. 5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Governance Ready Certification Program Discovery Tagging Prep / Cleanse ETL Governance BPM Self Service Visualization The Apache open source community is committed to collaboration which critical for proper data governance. Partners have adopted this commitment and are extending governance capabilities by integrating their products with Atlas -- which is providing a rich innovative community with a common metadata store backed by Atlas. This session will showcase 3 vendors: – Waterline – Attivo – Trifacta
  • 6. 6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Additional Atlas Sessions • BOF: Apache Knox and Apache Ranger provide Hadoop security while Atlas provides a Hadoop metadata store and enterprise compliance. Come learn and discuss security & governance innovations and future directions. Thursday 5-7 PM @ Room 210A
  • 7. 7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Learn More: • Hortonworks links: http://hortonworks.com/solutions/security-and- governance/ • Tutorials: https://github.com/hortonworks/tutorials/tree/atlas-ranger- tp/tutorials/hortonworks/atlas-ranger-preview
  • 8. Waterline Data The Smart Data Catalog Company
  • 9. Unlock TheValue Of The Data LakeWith WaterlineData’s Smart Data Catalog Time To Value AUTOMATICALLY catalog data assets across ALL the data AND enable SELF-SERVICE access Tribal Knowledge Sharing AUGMENT semantic discovery by CROWDSOURCING tribal data knowledge Trust Enable AGILE GOVERNANCE with automated tagging, data stewardship, and SECURE SELF- SERVICE access to data based on role and policy
  • 11. Waterline Data Is LikeAmazon For The Data Lake Catalog Find, Understand And Collaborate Provision
  • 12. WorkflowOf EnablingSelf-ServiceAnalytics With Hortonworks Hortonworks Atlas And Ranger Data Prep Analytics & Visualization Smart Data Discovery Profiling,Sensitive Data & Data Lineage Discovery, AutomatedTagging Data Stewardship Curate Tags Self-Service Data Catalog Find,Collaborate And Take Action Metadata,Tags, Data Lineage Metadata,Tags, Roles&Access Control Roles&Access Control
  • 13. Demo
  • 14. Waterline Data The Smart Data Catalog Company
  • 15. UNIFY YOUR DATA ACROSS SILOS Joe Lichtman Vice President, Product jlichtman@attivio.com
  • 16. 1 © 2016 ATTIVIO | PROPRIETARY AND CONFIDENTIAL WHO IS ATTIVIO? Attivio unifies your data across silos to provide a 360° view of your business
  • 17. 2 © 2016 ATTIVIO | PROPRIETARY AND CONFIDENTIAL Gartner Magic Quadrant For Enterprise Search, Q3 2015 Forrester Wave: Big Data Search and Knowledge Discovery Solutions, Q3 2015 Forrester Wave: Big Data Text Analytics Platforms, Q2 2016 Gartner Magic Quadrant For Enterprise Search, August 2015 Forrester Wave: Big Data Search and Knowledge Discovery Solutions, Q3 2015 LEADER IN SEARCH, DATA DISCOVERY AND TEXT ANALYTICS
  • 18. 3 © 2016 ATTIVIO | PROPRIETARY AND CONFIDENTIAL SEMANTIC DATA CATALOG Attivio radically reduces time spent finding and understanding data sources to speed time-to-analytics. • Catalogs all your enterprise information • Identifies what’s most relevant • Unifies all structuredand semi-structured sources in a visual model • Provisions leadingBI and predictive analytics tools such as Qlik, R, RapidMiner, Spotfire, and Tableau 58% of the effort for BI initiatives is wasted on data exploration and integration 33% of businesses cite big data discovery as a challenge they are facing 50% Businesses use less than half of their available data for BI
  • 19. 4 © 2016 ATTIVIO | PROPRIETARY AND CONFIDENTIAL CATALOG ALL YOUR ENTERPRISE INFORMATION • Spiders and extracts metadata for all information types • Automatically catalogs data and content with semantic meaning • Applies human expertise to fine-tune tagging and align with business rules
  • 20. 5 © 2016 ATTIVIO | PROPRIETARY AND CONFIDENTIAL IDENTIFY THE RIGHT INFORMATION • Delivers natural language and keyword search • Provides an eCommerce-like shopping cart for data • Recommends the most relevant data for your context
  • 21. 6 © 2016 ATTIVIO | PROPRIETARY AND CONFIDENTIAL UNIFY THE INFORMATION FOR YOUR ANALYTIC CONTEXT • Automatically generates data models • Correlates all structured data and unstructured content • Simplifies provisioning to BI and advanced analytic tools
  • 22. 7 © 2016 ATTIVIO | PROPRIETARY AND CONFIDENTIAL PROVISION & OPERATIONALIZE DATA AS A STRATEGIC ASSET • Provision directly to agile BI and analytics tools OR • Rationalize the data warehouse for greater simplicity and lower cost • Power domain-specific apps
  • 24. Trifacta + Hortonworks: Apache Atlas Integration
  • 25. DATAWRANGLING What is Data Wrangling? 2 QUESTION ANALYZE INSIGHTDISCOVER STRUCTURE CLEANSE ENRICH VALIDATE PUBLISH
  • 26. I N G E S T I O N A C C E S S DATA SOURCES Transactional Data banking credit cards lending wealth mortgages ledgers trades payments Interaction Data social webchat Analytics Reporting Data Product Models BUSINESS OPERATIONS Data Wrangling within the Hortonworks Data Lake Discovery Zone Shared Zone Raw Data Zone
  • 27. Trifacta + Hortonworks & Apache Atlas Governance Ingestion Metadata & APIs Data Wrangling Analysis & Consumption