DMT 3260
Citizens Bank Data Lake Implementation: Selecting
BigInsights ViON Spark/Hadoop Appliance
Dana Rafiee, Destiny Corporation
John DiFranco, Citizens Bank
Order of Presentation
Destiny Background
The Data Scientist
Client Infrastructure Challenges
Tools Used at Clients
Client Architecture Case Studies
Citizens Bank
Financial Processing Organization
Abstract
Citizens Bank, formerly part of the Royal Bank of Scotland, is implementing
a BigInsights Hadoop Data Lake with PureData System for Analytics
(Netezza) to support all of its internal data initiatives. The goal is to provide
an improved experience for customers and to grow market share. Along
its ETL journey, the team has used Netezza SQL, Hadoop and finally IBM
BigIntegrate and BigInsights. Testing BigIntegrate on BigInsights yielded the
productivity, maintainability and performance that Citizens was looking for,
and this all came prepackaged in the ViON Hadoop Appliance that was
rolled into its data centers, greatly simplifying entry into the Hadoop world.
Destiny Background
• Business and Technology Consulting Firm
• Advising Fortune 500 Corporations for 30 years
• Build Data Lakes, Warehouses, Reporting and
Analytics environments for large corporations and
government
• Business Consultants
• Data Warehouse/Modeling Specialists
• Advanced Analytic Practitioners
• SAS and IBM Business Partner
• Objective Opinions
Copyright © 2016 Destiny Corporation – Business and Technology Consulting - www.destinycorp.com
Who is the Data Scientist?
• Data science is an interdisciplinary field about
processes and systems to extract knowledge or
insights from data in various forms, either structured
or unstructured.
• Statistics
• Machine learning
• Data mining
• Predictive analytics
• “Data Scientist is the new title for the Analyst”
• Paul Kent, VP of Big Data at SAS Institute
Requirements of the Data Scientist Community
• Immediate access to data no matter where it exists
• Simple access to systems
• Legacy and Open Community Tools
• Ample resources to do their work
• Ability to store analytical results
• Fast Execution
• Access to In-House Data and External Data
• Nimble IT shop or I will find another option (Cloud)
Why is the Playing Field Different Today?
• Legacy Data and Systems
• OLTP Systems of Record
• Mainframes
• Data Warehouses and Marts
• Dark Data (Archived)
• New Data Sources
• Social Media
• Internet of Things
• Streaming Data
• Data Brokers – Search Yourself?
Some Big Data Use Cases
• Macy’s Inc. – Real-time pricing on 73 million items based on demand and inventory.
• Tipp24 AG – Betting on European lotteries with predictive analytics, building models in less than 10% of the time previously required.
• Walmart – Text analytics, machine learning and synonym mining to produce relevant website search results, increasing
conversions by 10-15%.
• Fast Food and Digital Menus – Long drive-through lines display quickly delivered items, while short lines display higher-margin
items that take longer to prepare.
• Morton’s Steak House – For a publicity stunt, analyzed tweets about Morton’s, matched data to a frequent Morton’s diner and
then delivered him dinner as he landed at the airport.
• PredPol Inc. – Los Angeles and Santa Cruz police use a model adapted from earthquake aftershock prediction, fed with
historical crime data, to predict where crimes are likely to happen. Reported crime reductions are up to 33%.
• Tesco PLC – Track 70 million refrigerator data points to be more proactive with maintenance and cut down energy costs.
• American Express – Predicting and reducing customer churn through analysis of historical buying patterns.
• Express Scripts Holding Co. – Through analysis, determined people were forgetting to take their medications. Invented beeping
medicine capsules and implemented automated phone calls.
• Infinity Property and Casualty Corp. – Re-analyzing dark data on claims now allows them to recover $12M in subrogation claims.
IT’s Challenges in Supporting the Data Scientists
• Building Proper Infrastructure to Support the Business
– Timely Access to data and systems
– Simple to use
– Open to new technologies and capabilities
– Accurate data
– Current data to support business needs
– Powerful enough to crunch all the data
– Fast or Cheap
– Robust and Reliable in an Open Environment
– On-Premise or Cloud or Hybrid
– Support Mandated Regulations
The Traditional IT Architecture
[Diagram: Data Input, Mainframe, Data Warehouse, Analyst, Information]
Why is it Not Enough?
• Inflexible
• Cannot capture new forms of data
• Cannot easily analyze new forms of data
• Cannot economically handle large data
volumes
• Cannot easily integrate with the Open
Community
• Long Lead Times for IT Projects
Designing the New Infrastructure
• New Non-Standard Data Sources
• Structured
• Unstructured
• Streaming
• NoSQL forms
• External Sources
• Ability to Land All Data Economically
• Let the business decide what data is required
• New Analytics Requirements
Some IT Infrastructure Considerations
• Limited Budgets and Resources
• Master Data Management
• Hadoop
– Bronze, Silver, Gold
– Single copy of the Data
– Spectrum Scale/GPFS
– Other Options
• Storage Mechanisms
– Elastic Storage Server
– DS8800, XIV
– Flash
• Types of Queries
• Historical Information
• Speed of Processing
– Fast, Expensive
– Slow, Cheap
• Location
– On-Premise
– Cloud
• Mobile Device Requirements
• Virtual Desktop
• Keeping Data In-Sync – Production and DR
– Update Strategies
– Replication Strategies
– Database
– SAN Store Utilities
• Data In-Flight
• Data Lineage
• Appliances
– PDA/Netezza
– SAP HANA on Power
– DB2 BLU – On-Premises
– DataAdapt Spark Hadoop Appliance
(BigInsights)
• Grid Processing
• Regulatory Compliance
• Data Governance
• In-house maintenance or Managed Service
• IEEE 802.3ba 40GbE, Direct Attached SAN,
NAS
• Politics
Data Classifications
[Chart: data volume by classification (Bronze, Silver, Gold); audiences shown: Data Scientist, Power User, BI End User]
Discovery and Transformation of Data
• Tools to Analyze and Transform Data
– DataStage
– Podium
– Trillium
– DataFlux
– Informatica
– Talend
• User Tools to Gain Insight into the Data
– Watson Explorer
– Attivio
• In-Database
• In Memory and Machine Learning
– Apache Spark – Micro Batches
– Apache Flink – Streaming Data Flow Engine and Memory
Management
• Other
Building Analytics Processes and the
Challenges
• Three Categories
– Ad Hoc
– Standard Analysis and Reporting
– Statistical Models
• Challenges for IT
– Skill Sets of the Data Scientist and Power Users
– Playing Nicely Together
– Structure of the Data – Data Modeling vs. SQL Tools
– Location and Movement of the Data
Case Studies
• Citizens Bank BigInsights Deployment
• Global Financial Advisors Deployment
• Financial Processing Organization Design
Citizens Bank Original Environment
• Teradata Data Warehouse
• Raw Data and History (Staging from record systems)
• Conformed Data to a Data Model (Mapped to industry standard model)
• Data Marts (Fit for purpose business specific)
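As a rough illustration only, the three layers listed above (raw staging, conformed model, fit-for-purpose marts) can be sketched in a few lines of Python. This is not how the Teradata warehouse was built. The sample rows, the field names, the `Account` model and the helper functions are all hypothetical and serve only to show how data changes shape from one layer to the next.

```python
from dataclasses import dataclass

# Hypothetical raw rows as they might arrive from a system of record.
RAW_ROWS = [
    {"acct": "001", "bal": "1500.25", "seg": "retail"},
    {"acct": "002", "bal": "320.00", "seg": "RETAIL"},
    {"acct": "003", "bal": "98000.10", "seg": "commercial"},
]


@dataclass(frozen=True)
class Account:
    """Hypothetical conformed model: typed fields, normalized values."""
    account_id: str
    balance: float
    segment: str


def stage(rows):
    """Layer 1: land raw rows unchanged (copied, not transformed)."""
    return [dict(row) for row in rows]


def conform(staged):
    """Layer 2: map staged rows onto one consistent, typed model."""
    return [
        Account(r["acct"], float(r["bal"]), r["seg"].lower())
        for r in staged
    ]


def build_mart(accounts, segment):
    """Layer 3: a fit-for-purpose subset for one business area."""
    return [a for a in accounts if a.segment == segment]


staged = stage(RAW_ROWS)
conformed = conform(staged)
retail_mart = build_mart(conformed, "retail")
```

In this sketch the staging layer keeps the source values exactly as received, so any cleanup (here, numeric typing and lower-casing the segment) happens only in the conformed layer, and each mart is derived from the conformed data rather than from the raw feed.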
Challenges with the Teradata Environment
• Processing on Teradata was slow due to:
• Traditional Teradata Data Warehouse Framework
• Reference Model
• Slow Time to Market
• Extremely Expensive in Labor Costs
• Extremely Expensive to add Additional Computing Capacity
• System and SAS costs increasing
Looking for Alternatives
• Execution of an information Proof of Concept
• IBM
• Oracle
• Cloudera
• Hortonworks
Conclusions and Choices Made
• The IBM BigInsights Appliance was the most cost-effective option
• Minimal engagement from internal infrastructure organization
• Delivered fully assembled with hardware and software
• Appliance Model value proposition similar to a Netezza Appliance
Standard Tools at Citizens
• IBM Big SQL
• Assurance that standard tools would work well with it (DB2 LUW V10.5)
• All products support this platform
• Oracle OBIEE – Operational Reporting
• SAS for Statistical Modeling
• Tableau for Visual Reporting
• DataStage for ETL – centralized application development model
• Spectrum Scale (GPFS) vs. Hadoop for better management of the data
and less raw storage
• Fluid Query for connections to BigInsights
POC on BigInsights Appliance
• DataStage processing running on Teradata was moved to BigInsights
• Client connectivity, queries, testing and validation
• Proved that the platform could be used as the server and storage to run
enterprise DataStage processing
Results
• Moved Analytics processing from Teradata to Netezza
(cost/performance)
• Increase in SAS performance by running in Netezza database
• Repurposed some SAS costs
• Reduced data warehouse admin support costs (Teradata DBAs
reallocated)
• Implemented BigInsights Hadoop for a data lake (staging and
conformity)
• Avoided large capital outlays for additional Teradata capacity
• Reduction in Labor Effort to use the new platforms
Future Plans
• Evaluating and Planning Implementation of dashDB (Bridge to Cloud) to
move some items to Cloud
• Instead of paying for another year of S&S, using the funds for Bridge to
Cloud
• Attractive price point
• Adding new applications (Risk) to Netezza and the Data Lake
Complimentary Consultation
o Contact Us at: info@destinycorp.com
• Discovery Session
• Analysis of Architecture
• Business Process
• Governance
• High Level Recommendations
Questions and Answers
Contact Information
Dana Rafiee
Managing Director
Destiny Corporation
860-721-1684 x2007
drafiee@destinycorp.com
www.destinycorp.com
John DiFranco
SVP - Director of Enterprise Data Management
Citizens Bank
John.difranco@citizensbank.com
www.citizensbank.com
781-655-4489
Thank you for your time

Weitere ähnliche Inhalte

Was ist angesagt?

Enterprise Data Governance Framework With Change Management
Enterprise Data Governance Framework With Change ManagementEnterprise Data Governance Framework With Change Management
Enterprise Data Governance Framework With Change ManagementSlideTeam
 
Best Practices in Metadata Management
Best Practices in Metadata ManagementBest Practices in Metadata Management
Best Practices in Metadata ManagementDATAVERSITY
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...DataScienceConferenc1
 
DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Busi...DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Busi...DATAVERSITY
 
Enterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureEnterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureDATAVERSITY
 
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...HostedbyConfluent
 
Change data capture
Change data captureChange data capture
Change data captureRon Barabash
 
Modern Metadata Strategies
Modern Metadata StrategiesModern Metadata Strategies
Modern Metadata StrategiesDATAVERSITY
 
How to Use a Semantic Layer to Deliver Actionable Insights at Scale
How to Use a Semantic Layer to Deliver Actionable Insights at ScaleHow to Use a Semantic Layer to Deliver Actionable Insights at Scale
How to Use a Semantic Layer to Deliver Actionable Insights at ScaleDATAVERSITY
 
Gartner: Master Data Management Functionality
Gartner: Master Data Management FunctionalityGartner: Master Data Management Functionality
Gartner: Master Data Management FunctionalityGartner
 
Big Data Analytics for Banking, a Point of View
Big Data Analytics for Banking, a Point of ViewBig Data Analytics for Banking, a Point of View
Big Data Analytics for Banking, a Point of ViewPietro Leo
 
BI Consultancy - Data, Analytics and Strategy
BI Consultancy - Data, Analytics and StrategyBI Consultancy - Data, Analytics and Strategy
BI Consultancy - Data, Analytics and StrategyShivam Dhawan
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMatei Zaharia
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?DATAVERSITY
 
Idiro Analytics - Analytics & Big Data
Idiro Analytics - Analytics & Big DataIdiro Analytics - Analytics & Big Data
Idiro Analytics - Analytics & Big DataIdiro Analytics
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?James Serra
 
Azure data analytics platform - A reference architecture
Azure data analytics platform - A reference architecture Azure data analytics platform - A reference architecture
Azure data analytics platform - A reference architecture Rajesh Kumar
 

Was ist angesagt? (20)

Enterprise Data Governance Framework With Change Management
Enterprise Data Governance Framework With Change ManagementEnterprise Data Governance Framework With Change Management
Enterprise Data Governance Framework With Change Management
 
Best Practices in Metadata Management
Best Practices in Metadata ManagementBest Practices in Metadata Management
Best Practices in Metadata Management
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
 
DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Busi...DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Busi...
 
8 Steps to Creating a Data Strategy
8 Steps to Creating a Data Strategy8 Steps to Creating a Data Strategy
8 Steps to Creating a Data Strategy
 
Enterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureEnterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data Architecture
 
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
How to Build the Data Mesh Foundation: A Principled Approach | Zhamak Dehghan...
 
Change data capture
Change data captureChange data capture
Change data capture
 
Modern Metadata Strategies
Modern Metadata StrategiesModern Metadata Strategies
Modern Metadata Strategies
 
How to Use a Semantic Layer to Deliver Actionable Insights at Scale
How to Use a Semantic Layer to Deliver Actionable Insights at ScaleHow to Use a Semantic Layer to Deliver Actionable Insights at Scale
How to Use a Semantic Layer to Deliver Actionable Insights at Scale
 
Gartner: Master Data Management Functionality
Gartner: Master Data Management FunctionalityGartner: Master Data Management Functionality
Gartner: Master Data Management Functionality
 
Big Data Analytics for Banking, a Point of View
Big Data Analytics for Banking, a Point of ViewBig Data Analytics for Banking, a Point of View
Big Data Analytics for Banking, a Point of View
 
BI Consultancy - Data, Analytics and Strategy
BI Consultancy - Data, Analytics and StrategyBI Consultancy - Data, Analytics and Strategy
BI Consultancy - Data, Analytics and Strategy
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse Technology
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Idiro Analytics - Analytics & Big Data
Idiro Analytics - Analytics & Big DataIdiro Analytics - Analytics & Big Data
Idiro Analytics - Analytics & Big Data
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?
 
Azure data analytics platform - A reference architecture
Azure data analytics platform - A reference architecture Azure data analytics platform - A reference architecture
Azure data analytics platform - A reference architecture
 
Implementing a Data Lake
Implementing a Data LakeImplementing a Data Lake
Implementing a Data Lake
 

Andere mochten auch

Medical University of South Carolina: Using Big Data and Predictive Analytics...
Medical University of South Carolina: Using Big Data and Predictive Analytics...Medical University of South Carolina: Using Big Data and Predictive Analytics...
Medical University of South Carolina: Using Big Data and Predictive Analytics...Seeling Cheung
 
Southwest Power Pool big data case study
Southwest Power Pool big data case study Southwest Power Pool big data case study
Southwest Power Pool big data case study Seeling Cheung
 
Constant Contact: An Online Marketing Leader’s Data Lake Journey
Constant Contact: An Online Marketing Leader’s Data Lake JourneyConstant Contact: An Online Marketing Leader’s Data Lake Journey
Constant Contact: An Online Marketing Leader’s Data Lake JourneySeeling Cheung
 
CSNI: How State Medicaid Agencies Can Use Analytics to Predict Opioid Abuse a...
CSNI: How State Medicaid Agencies Can Use Analytics to Predict Opioid Abuse a...CSNI: How State Medicaid Agencies Can Use Analytics to Predict Opioid Abuse a...
CSNI: How State Medicaid Agencies Can Use Analytics to Predict Opioid Abuse a...Seeling Cheung
 
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...Seeling Cheung
 
AddReality company overview
AddReality company overviewAddReality company overview
AddReality company overviewAddReality
 
Automate Hadoop Cluster Deployment in a Banking Ecosystem
Automate Hadoop Cluster Deployment in a Banking EcosystemAutomate Hadoop Cluster Deployment in a Banking Ecosystem
Automate Hadoop Cluster Deployment in a Banking EcosystemHellmar Becker
 
Hooduku - Big data analytics - case study
Hooduku - Big data analytics - case studyHooduku - Big data analytics - case study
Hooduku - Big data analytics - case studySudhi Seshachala
 
The Warranty Data Lake – After, Inc.
The Warranty Data Lake – After, Inc.The Warranty Data Lake – After, Inc.
The Warranty Data Lake – After, Inc.Richard Vermillion
 
World of Watson 2016 - Architecting your Analytics House
World of Watson 2016 - Architecting your Analytics HouseWorld of Watson 2016 - Architecting your Analytics House
World of Watson 2016 - Architecting your Analytics HouseKeith Redman
 
Big Data: Querying complex JSON data with BigInsights and Hadoop
Big Data:  Querying complex JSON data with BigInsights and HadoopBig Data:  Querying complex JSON data with BigInsights and Hadoop
Big Data: Querying complex JSON data with BigInsights and HadoopCynthia Saracco
 
Big Data: Using free Bluemix Analytics Exchange Data with Big SQL
Big Data: Using free Bluemix Analytics Exchange Data with Big SQL Big Data: Using free Bluemix Analytics Exchange Data with Big SQL
Big Data: Using free Bluemix Analytics Exchange Data with Big SQL Cynthia Saracco
 
Big Data: Getting started with Big SQL self-study guide
Big Data:  Getting started with Big SQL self-study guideBig Data:  Getting started with Big SQL self-study guide
Big Data: Getting started with Big SQL self-study guideCynthia Saracco
 
Big Data: HBase and Big SQL self-study lab
Big Data:  HBase and Big SQL self-study lab Big Data:  HBase and Big SQL self-study lab
Big Data: HBase and Big SQL self-study lab Cynthia Saracco
 
Big Data: Working with Big SQL data from Spark
Big Data:  Working with Big SQL data from Spark Big Data:  Working with Big SQL data from Spark
Big Data: Working with Big SQL data from Spark Cynthia Saracco
 
Big Data: Big SQL and HBase
Big Data:  Big SQL and HBase Big Data:  Big SQL and HBase
Big Data: Big SQL and HBase Cynthia Saracco
 
Big Data Case study - caixa bank
Big Data Case study - caixa bankBig Data Case study - caixa bank
Big Data Case study - caixa bankChungsik Yun
 
Luxury 3.0- a new Retail Scenario for Product Mass Customization and On Deman...
Luxury 3.0- a new Retail Scenario for Product Mass Customization and On Deman...Luxury 3.0- a new Retail Scenario for Product Mass Customization and On Deman...
Luxury 3.0- a new Retail Scenario for Product Mass Customization and On Deman...ELSE CORP
 
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...PwC
 
Apache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingApache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingm_hepburn
 

Andere mochten auch (20)

Medical University of South Carolina: Using Big Data and Predictive Analytics...
Medical University of South Carolina: Using Big Data and Predictive Analytics...Medical University of South Carolina: Using Big Data and Predictive Analytics...
Medical University of South Carolina: Using Big Data and Predictive Analytics...
 
Southwest Power Pool big data case study
Southwest Power Pool big data case study Southwest Power Pool big data case study
Southwest Power Pool big data case study
 
Constant Contact: An Online Marketing Leader’s Data Lake Journey
Constant Contact: An Online Marketing Leader’s Data Lake JourneyConstant Contact: An Online Marketing Leader’s Data Lake Journey
Constant Contact: An Online Marketing Leader’s Data Lake Journey
 
CSNI: How State Medicaid Agencies Can Use Analytics to Predict Opioid Abuse a...
CSNI: How State Medicaid Agencies Can Use Analytics to Predict Opioid Abuse a...CSNI: How State Medicaid Agencies Can Use Analytics to Predict Opioid Abuse a...
CSNI: How State Medicaid Agencies Can Use Analytics to Predict Opioid Abuse a...
 
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...
Fiducia & GAD IT AG: From Fraud Detection to Big Data Platform: Bringing Hado...
 
AddReality company overview
AddReality company overviewAddReality company overview
AddReality company overview
 
Automate Hadoop Cluster Deployment in a Banking Ecosystem
Automate Hadoop Cluster Deployment in a Banking EcosystemAutomate Hadoop Cluster Deployment in a Banking Ecosystem
Automate Hadoop Cluster Deployment in a Banking Ecosystem
 
Hooduku - Big data analytics - case study
Hooduku - Big data analytics - case studyHooduku - Big data analytics - case study
Hooduku - Big data analytics - case study
 
The Warranty Data Lake – After, Inc.
The Warranty Data Lake – After, Inc.The Warranty Data Lake – After, Inc.
The Warranty Data Lake – After, Inc.
 
World of Watson 2016 - Architecting your Analytics House
World of Watson 2016 - Architecting your Analytics HouseWorld of Watson 2016 - Architecting your Analytics House
World of Watson 2016 - Architecting your Analytics House
 
Big Data: Querying complex JSON data with BigInsights and Hadoop
Big Data:  Querying complex JSON data with BigInsights and HadoopBig Data:  Querying complex JSON data with BigInsights and Hadoop
Big Data: Querying complex JSON data with BigInsights and Hadoop
 
Big Data: Using free Bluemix Analytics Exchange Data with Big SQL
Big Data: Using free Bluemix Analytics Exchange Data with Big SQL Big Data: Using free Bluemix Analytics Exchange Data with Big SQL
Big Data: Using free Bluemix Analytics Exchange Data with Big SQL
 
Big Data: Getting started with Big SQL self-study guide
Big Data:  Getting started with Big SQL self-study guideBig Data:  Getting started with Big SQL self-study guide
Big Data: Getting started with Big SQL self-study guide
 
Big Data: HBase and Big SQL self-study lab
Big Data:  HBase and Big SQL self-study lab Big Data:  HBase and Big SQL self-study lab
Big Data: HBase and Big SQL self-study lab
 
Big Data: Working with Big SQL data from Spark
Big Data:  Working with Big SQL data from Spark Big Data:  Working with Big SQL data from Spark
Big Data: Working with Big SQL data from Spark
 
Big Data: Big SQL and HBase
Big Data:  Big SQL and HBase Big Data:  Big SQL and HBase
Big Data: Big SQL and HBase
 
Big Data Case study - caixa bank
Big Data Case study - caixa bankBig Data Case study - caixa bank
Big Data Case study - caixa bank
 
Luxury 3.0- a new Retail Scenario for Product Mass Customization and On Deman...
Luxury 3.0- a new Retail Scenario for Product Mass Customization and On Deman...Luxury 3.0- a new Retail Scenario for Product Mass Customization and On Deman...
Luxury 3.0- a new Retail Scenario for Product Mass Customization and On Deman...
 
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...
 
Apache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingApache hadoop bigdata-in-banking
Apache hadoop bigdata-in-banking
 

Ähnlich wie Citizens Bank Implements BigInsights ViON Spark/Hadoop Appliance for Data Lake

ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Denodo
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization Denodo
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Big data4businessusers
Big data4businessusersBig data4businessusers
Big data4businessusersBob Hardaway
 
Lecture1 BIG DATA and Types of data in details
Lecture1 BIG DATA and Types of data in detailsLecture1 BIG DATA and Types of data in details
Lecture1 BIG DATA and Types of data in detailsAbhishekKumarAgrahar2
 
Assessing New Databases– Translytical Use Cases
Assessing New Databases– Translytical Use CasesAssessing New Databases– Translytical Use Cases
Assessing New Databases– Translytical Use CasesDATAVERSITY
 
Introduction to Cloud computing and Big Data-Hadoop
Introduction to Cloud computing and  Big Data-HadoopIntroduction to Cloud computing and  Big Data-Hadoop
Introduction to Cloud computing and Big Data-HadoopNagarjuna D.N
 
Content1. Introduction2. What is Big Data3. Characte.docx
Content1. Introduction2. What is Big Data3. Characte.docxContent1. Introduction2. What is Big Data3. Characte.docx
Content1. Introduction2. What is Big Data3. Characte.docxdickonsondorris
 
A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)Denodo
 
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...Denodo
 
IARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptxIARE_BDBA_ PPT_0.pptx
IARE_BDBA_ PPT_0.pptxAIMLSEMINARS
 
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
[DSC Europe 22] Overview of the Databricks Platform - Petar ZecevicDataScienceConferenc1
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Denodo
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataRoi Blanco
 

Citizens Bank Implements BigInsights ViON Spark/Hadoop Appliance for Data Lake

  • 1. DMT 3260 Citizens Bank Data Lake Implementation: Selecting BigInsights ViON Spark/Hadoop Appliance Dana Rafiee, Destiny Corporation John DiFranco, Citizens Bank
  • 2. DMT 3260 Order of Presentation Destiny Background The Data Scientist Client Infrastructure Challenges Tools Used at Clients Client Architecture Case Studies Citizens Bank Financial Processing Organization
  • 3. DMT Abstract: Citizens Bank, formerly part of the Royal Bank of Scotland, is implementing a BigInsights Hadoop Data Lake with PureData System for Analytics (Netezza) to support all of its internal data initiatives. The goal is to provide an improved experience for customers and to grow market share. Along their ETL journey, we’ve used Netezza SQL, Hadoop and finally IBM BigIntegrate and BigInsights. Testing BigIntegrate on BigInsights yielded the productivity, maintenance and performance that Citizens was looking for, and this all came prepackaged in the ViON Hadoop Appliance that was rolled into its data centers, greatly simplifying entry into the Hadoop world.
  • 4. DMT 3260 Destiny Background • Business and Technology Consulting Firm • Advising Fortune 500 Corporations for 30 years • Build Data Lakes, Warehouses, Reporting and Analytics environments for large corporations and government • Business Consultants • Data Warehouse/Modeling Specialists • Advanced Analytic Practitioners • SAS and IBM Business Partner • Objective Opinions Copyright © 2016 Destiny Corporation – Business and Technology Consulting - www.destinycorp.com
  • 5. DMT 3260 Who is the Data Scientist? • Data science is an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured. • Statistics • Machine learning • Data mining • Predictive analytics • “Data Scientist is the new title for the Analyst” • Paul Kent, VP of Big Data at SAS Institute
  • 6. DMT 3260 Requirements of the Data Scientist Community • Immediate access to data no matter where it exists • Simple access to systems • Legacy and Open Community Tools • Ample resources to do their work • Ability to store analytical results • Fast Execution • Access to In-House Data and External Data • Nimble IT shop or I will find another option (Cloud)
  • 7. DMT 3260 Why is the Playing Field Different Today? • Legacy Data and Systems • OLTP Systems of Record • Mainframes • Data Warehouses and Marts • Dark Data (Archived) • New Data Sources • Social Media • Internet of Things • Streaming Data • Data Brokers – Search Yourself?
  • 8. DMT 3260 Some Big Data Use Cases • Macy’s Inc. - Real-Time Pricing on 73 Million items based on demand and inventory. • Tipp24AG - Betting on European lotteries with predictive analytics, building models in less than 10% of the time. • Walmart – Text Analytics, machine learning and synonym mining to produce relevant web site search results, increasing conversions by 10-15%. • Fast Food and Digital Menus – Long drive-through lines display quickly delivered items, while short lines display higher margin items that take longer to prepare. • Morton’s Steak House – For a publicity stunt, analyzed tweets about Morton’s, matched data to a frequent Morton’s diner and then delivered him dinner as he landed at the airport. • PredPol Inc. – Los Angeles and Santa Cruz Police use data about earthquakes and crime to predict where crimes will happen after an earthquake. There is up to a 33% reduction in crimes. • Tesco PLC – Track 70 million refrigerator data points to be more proactive with maintenance and cut down energy costs. • American Express – Predicting and reducing customer churn through analysis of historical buying patterns. • Express Scripts Holding Co. – Through analysis, determined people were forgetting to take their medications. Invented beeping medicine capsules and implemented automated phone calls. • Infinity Property and Casualty Corp. - Re-analyzing dark data on claims now allows them to recover $12M in subrogation claims.
  • 9. DMT 3260 IT’s Challenges in Supporting the Data Scientists • Building Proper Infrastructure to Support the Business – Timely Access to data and systems – Simple to use – Open to new technologies and capabilities – Accurate data – Current data to support business needs – Powerful enough to crunch all the data – Fast or Cheap – Robust and Reliable in an Open Environment – On-Premise or Cloud or Hybrid – Support Mandated Regulations
  • 10. DMT 3260 The Traditional IT Architecture (diagram labels: Mainframe, Data Warehouse, Data Input, Analyst, Information)
  • 11. DMT 3260 Why is it Not Enough? • Inflexible • Cannot capture new forms of data • Cannot easily analyze new forms of data • Cannot economically handle large data volumes • Cannot easily integrate with the Open Community • Long Lead Times for IT Projects
  • 12. DMT 3260 Designing the New Infrastructure • New Non-Standard Data Sources • Structured • Unstructured • Streaming • NoSQL forms • External Sources • Ability to Land All Data Economically • Let the business decide what data is required • New Analytics Requirements
  • 13. DMT 3260 Some IT Infrastructure Considerations • Limited Budgets and Resources • Master Data Management • Hadoop – Bronze, Silver, Gold – Single copy of the Data – Spectrum Scale/GPFS – Other Options • Storage Mechanisms – Elastic Storage Server – DS8800, XIV – Flash • Types of Queries • Historical Information • Speed of Processing – Fast, Expensive – Slow, Cheap • Location – On-Premise – Cloud • Mobile Device Requirements • Virtual Desktop • Keeping Data In-Sync – Production and DR – Update Strategies – Replication Strategies – Database – SAN Store Utilities • Data In-Flight • Data Lineage • Appliances – PDA/Netezza – SAP HANA on Power – DB2 BLU – On Premises – DataAdapt Spark Hadoop Appliance (BigInsights) • Grid Processing • Regulatory Compliance • Data Governance • In-house maintenance or Managed Service • IEEE 802.3ba 40GbE, Direct Attached SAN, NAS • Politics
  • 14. DMT 3260 Data Classifications (chart; categories: Bronze, Silver, Gold; labels: Volume, Data Scientist, Power User, BI End User)
  • 15. DMT 3260 Discovery and Transformation of Data • Tools to Analyze and Transform Data – DataStage – Podium – Trillium – DataFlux – Informatica – Talend • User Tools to Gain Insight into the Data – Watson Explorer – Attivio • In-Database • In Memory and Machine Learning – Apache Spark – Micro Batches – Apache Flink – Streaming Data Flow Engine and Memory Management • Other
  • 16. DMT 3260 Building Analytics Processes and the Challenges • Three Categories – Ad Hoc – Standard Analysis and Reporting – Statistical Models • Challenges for IT – Skill Sets of the Data Scientist and Power Users – Playing Nicely Together – Structure of the Data – Data Modeling vs. SQL Tools – Location and Movement of the Data
  • 17. DMT 3260 Case Studies • Citizens Bank BigInsights Deployment • Global Financial Advisors Deployment • Financial Processing Organization Design
  • 18. DMT 3260 Citizens Bank Original Environment • Teradata Data Warehouse • Raw Data and History (Staging from record systems) • Conformed Data to a Data Model (Mapped to industry standard model) • Data Marts (Fit for purpose business specific)
  • 19. DMT 3260 Challenges with the Teradata Environment • Processing on Teradata was slow due to: • Traditional Teradata Data Warehouse Framework • Reference Model • Slow Time to Market • Extremely Expensive in Labor Costs • Extremely Expensive to add Additional Computing Capacity • System and SAS costs increasing
  • 20. DMT 3260 Looking for Alternatives • Execution of an information Proof of Concept • IBM • Oracle • Cloudera • Hortonworks
  • 21. DMT 3260 Conclusions and Choices Made • The IBM BigInsights Appliance is the most cost-effective • Minimal engagement from internal infrastructure organization • Delivered fully assembled with hardware and software • Appliance Model value proposition similar to a Netezza Appliance
  • 22. DMT 3260 Standard Tools at Citizens • IBM BigSQL • Assurance that standard tools would work well with (DB2 LUW V 10.5) • All products support this platform • Oracle OBI-EE – Operational Reporting • SAS for Statistical Modeling • Tableau for Visual Reporting • DataStage for ETL – centralized application development model • Spectrum Scale (GPFS) vs. Hadoop for better management of the data and less raw storage • Fluid Query for connections to BigInsights
  • 23. DMT 3260 POC on BigInsights Appliance • DataStage processing running on Teradata was moved to BigInsights • Client Connectivity, queries, testing and validation • Proved that the platform could be used as the server and storage to run enterprise DataStage processing
  • 24. DMT 3260 Results • Moved Analytics processing from Teradata to Netezza (cost/performance) • Increase in SAS performance by running in Netezza database • Repurposed some SAS costs • Reduced data warehouse admin support costs (Teradata DBAs reallocated) • Implemented BigInsights Hadoop for a data lake (staging and conformity) • Avoided large capital outlays for additional Teradata capacity • Reduction in Labor Effort to use the new platforms
  • 25. DMT 3260 Future Plans • Evaluating and Planning Implementation of dashDB (Bridge to Cloud) to move some items to Cloud • Instead of paying for another year of S&S, using the funds for Bridge to Cloud • Attractive price point • Adding new applications (Risk) to Netezza and the Data Lake
  • 26. DMT 3260 Complimentary Consultation o Contact Us at: info@destinycorp.com • Discovery Session • Analysis of Architecture • Business Process • Governance • High Level Recommendations
  • 28. DMT 3260 Contact Information Dana Rafiee Managing Director Destiny Corporation 860-721-1684 x2007 drafiee@destinycorp.com www.destinycorp.com John DiFranco SVP - Director of Enterprise Data Management Citizens Bank John.difranco@citizensbank.com www.citizensbank.com 781-655-4489 Thank you for your time