Today, financial services firms rely on data as the basis of their industry. In the absence of the means of production for physical goods, data is the raw material used to create value for and capture value from the market. However, as data volume and variety increase, so do the susceptibility to fraud and the temptation to hackers. Learn how an enterprise data hub built on Hadoop enables advanced security and machine learning on much more descriptive and real-time data to detect and prevent fraud, from payment encryption to anti-money-laundering processes.
It can take up to 24 months to build a compliant big data infrastructure and, even then, a faulty data privacy configuration can lead to the types of massive breaches we've read about in the news. That's why we've created a PCI Hadoop Security offering to simplify and ensure the deployment and compliance process for everyone. By developing a certification plan for your big data infrastructure that leverages MasterCard's proven PCI Hadoop Security playbooks, templates, and architectures with Cloudera, you can cut your time to production and total cost by more than half and focus on realizing the business value of all your data.
No individual record is particularly valuable, but having every record opens the door to extreme value.
OPPORTUNITY ABOUNDS in FINSERV & RETAIL
Recovering from the 2008 Recession
Proactive risk management through stress testing
Fraud, anomaly, and insider threat detection and avoidance
Macro-level financial system security
Comply with more demanding regulation – data auditing, transparency
Flexibility to business model requirements and changes
Assure customer privacy
Profiling & Personalization
Better customer segmentation across LOBs – prevent churn, lower cost of acquisition
Tailor and track product cross-sell, up-sell, and bundling opportunities – recommendations, targeted marketing
Use more data to better understand sentiment – social, website, survey, public
Micro-adjustments to products and policies based on predictive analytics
Competitive Advantage
More complete view of the market
Quicker time to insight
More relevant modeling and innovation
Efficiency advantages that afford new opportunities – utilization, multi-tenancy, robust and real-time
Open-source to avoid legacy lock-in and remain flexible with less specialization
Less reliance on third-party data sources
This is the one list that no company wants to be on; it is one of the few cases in which not all publicity is good publicity. Data security is such a hot topic that President Obama hosted a summit at Stanford University in February 2015, along with executives from MasterCard and others, to underscore how important safeguarding data will be to the future of business, personal privacy, and national security. Obama also appointed DJ Patil as the first Chief Data Scientist of the USA, and addressed the Strata + Hadoop World audience in San Jose.
Note: PCI compliance is not the end point for this project and pitch. Complete security is. Even a compliant system can be (and has been) hacked.
The Payment Card Industry Data Security Standard (PCI DSS) originated as separate data security standards established by the five major credit card companies: Visa, MasterCard, Discover, American Express, and the Japan Credit Bureau (some of whom are Cloudera Enterprise customers). The goal of ensuring that cardholder data is properly secured and protected and that merchants meet minimum security levels when storing, processing, and transmitting this data was formalized as an industry-wide standard in 2004 by the Payment Card Industry Security Standards Council.
In January 2014, PCI DSS Version 3.0 went into effect, requiring organizations to mitigate payment card risks posed by third parties such as cloud computing and storage providers and payment processors. The new version also stresses that businesses and organizations that accept and/or process cards are responsible for ensuring that the third parties on whom they rely for outsourced solutions and services use appropriate security measures. In the event of a security breach resulting from non-compliance, the breached organization could be subject to stiff penalties and fines.
Direct Costs
There is a 22% likelihood that any U.S. company will experience a data breach compromising at least 10,000 records during the next 24 months (Ponemon Institute’s 2014 Cost of Data Breach Study: Global Analysis).
Operating costs related to compliance are expected to reach $10 BILLION per year for the largest banks (according to Citigroup co-president Jamie Forese, quoted by Bloomberg in September 2014).
Indirect Costs
Fines: as much as $5,000-$100,000 per month (as large as $400m/yr for AML fines)
Responsibility for fraudulent charges
Cover credit monitoring charges
Cover chargebacks
Increase in transaction fees
Enormous legal fees
Brand damage
Suspension of processing rights
Elimination and turnover of staff
Escalation to a higher compliance tier
Increased annual compliance auditing costs
Government scrutiny
Worst case – termination of relationship
PCI DSS is…
The Payment Card Industry Data Security Standard originated as separate data security standards established by five major credit card companies: Visa, MasterCard, Discover, American Express, and the Japan Credit Bureau.
All credit card data must be properly secured and protected both at rest and in motion, including digital channels.
Must meet minimum encryption and privacy levels when storing, processing, or transmitting account-related data.
Applies to all applications, databases, and file systems, including those owned or managed by merchants and third-party solution providers (especially in the cloud).
Why MasterCard as a partner?
First PCI-certified Hadoop platform
10 PB of data secured in compliance with PCI every day
Founding member of the PCI Security Standards Council
Sits on the PCI Executive Committee
Four decades of data security experience
Secures 2 billion cards and 65 million transactions per minute in 210 countries
Never, ever had a data breach
Hadoop Security Maturity Curve
Stage 0 is an open source Hadoop cluster straight out of the box with no security configured. Storing your data here means that it’s highly vulnerable and your data is at risk.
Stage 1 provides slightly reduced risk exposure where you’ve configured the basic security controls for Authorization, Authentication, and Auditing. This is the most other distributions are able to deliver.
Stage 2 offers a managed, secure, and protected environment taking advantage of Cloudera components like Navigator and Sentry.
Stage 3 is an Enterprise Data Hub configured as a Secure Data Vault. This is fully compliance audit-ready and protected. At this point, your system has all of the technology components configured and ready to undergo an audit. Note that passing a full audit requires additional steps: a combination of People, Process, and Technology, plus additional services that are available from Cloudera to support your team.
There are 3 key things you want to make a customer realize about the security maturity model:
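To make the maturity curve concrete when walking a customer through it, the stages can be sketched as a simple classification. This is only an illustration: the control names below are hypothetical flags, and the mapping of Stage 3 to encryption at rest plus separate key management is an assumption drawn from the rest of this deck, not an official Cloudera definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterControls:
    """Hypothetical checklist of what is configured on a cluster."""
    authentication: bool = False           # e.g. Kerberos
    authorization: bool = False            # e.g. role-based access control
    auditing: bool = False
    governance_tools: bool = False         # assumption: Navigator and Sentry in use
    encryption_at_rest: bool = False       # assumption: part of the "vault" stage
    external_key_management: bool = False  # assumption: keys kept off the cluster


def maturity_stage(c: ClusterControls) -> int:
    """Return the stage (0-3) that the configured controls correspond to."""
    basics = c.authentication and c.authorization and c.auditing
    if not basics:
        return 0  # out-of-the-box cluster, data highly vulnerable
    if not c.governance_tools:
        return 1  # basic controls only
    if not (c.encryption_at_rest and c.external_key_management):
        return 2  # managed, secure, and protected
    return 3      # technology is audit-ready; people and process still needed
```

A customer who believes they are at Stage 2 but has no auditing configured would come out at Stage 0 here, which is the "where they really are" conversation.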
Where they think they are
Where they really are
Where Cloudera can help them to get to
Encrypting Data at Rest
Customer Pain Point: Encrypt all data at rest with isolated key management
Cloudera Solution: Navigator is the only encryption tool native to Hadoop
Customer Pain Point: Securely store more data without losing performance
Cloudera Solution: Transparent layer between application and file system
Safeguarding Data in Motion
Customer Pain Point: Ensure compliant transmission across public networks
Cloudera Solution: Keys are separated in secure, access-controlled servers
Customer Pain Point: Insufficient key release policies for cloud applications
Cloudera Solution: Trustee approval and audit logs for all access requests
Managing Access to the Cluster
Customer Pain Point: Keep tenants from accessing privileged apps and data
Cloudera Solution: Kerberos and Sentry provide strong role-based access control
Customer Pain Point: Audit Hadoop interactions and manage data lifecycle
Cloudera Solution: Full governance, lineage, and discovery with Navigator
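For audiences who ask what role-based access control means in practice, the idea can be sketched in a few lines. This is a conceptual model only: it assumes the user has already been authenticated (the job Kerberos does), it does not call any Sentry or Kerberos API, and the class and method names are hypothetical. It follows the general pattern of users belonging to groups, groups holding roles, and roles carrying privileges.

```python
from typing import Dict, Set, Tuple

Privilege = Tuple[str, str]  # (action, object), e.g. ("select", "sales.transactions")


class RoleBasedAccess:
    """Toy model: user -> groups -> roles -> privileges, deny by default."""

    def __init__(self) -> None:
        self.user_groups: Dict[str, Set[str]] = {}
        self.group_roles: Dict[str, Set[str]] = {}
        self.role_privileges: Dict[str, Set[Privilege]] = {}

    def add_user_to_group(self, user: str, group: str) -> None:
        self.user_groups.setdefault(user, set()).add(group)

    def grant_role_to_group(self, role: str, group: str) -> None:
        self.group_roles.setdefault(group, set()).add(role)

    def grant_privilege_to_role(self, action: str, obj: str, role: str) -> None:
        self.role_privileges.setdefault(role, set()).add((action, obj))

    def is_allowed(self, user: str, action: str, obj: str) -> bool:
        # Access is granted only if some role reachable from the user's
        # groups carries exactly this privilege; everything else is denied.
        for group in self.user_groups.get(user, set()):
            for role in self.group_roles.get(group, set()):
                if (action, obj) in self.role_privileges.get(role, set()):
                    return True
        return False
```

Because nothing is allowed unless a role grants it, a tenant with no role on the cardholder tables simply cannot read them, which is the point of the first pain point above.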
An enterprise data hub addresses each of the earlier challenges:
1. You can keep unlimited data online, in its original fidelity and format. As a data staging area, it can serve as an automatic archive of any data sent to it, and process that data quickly and cost-effectively.
2. Diverse users can get direct access to all business-relevant data, through the best tool for the job, whether that’s SQL, search, programming, or a favorite BI or analytics tool. Users who previously had no way to benefit from data can now find and generate insights.
3. All of this can be done with confidence, thanks to Cloudera’s enterprise-grade security, governance, and management tools.
An enterprise data hub unlocks more value, from more data, for more users, in less time.
Representative Customer Stories
Costco: From disparate and limited data views to unlimited information access.
Challenge: Costco set out to refine their global data center without disrupting the existing heterogeneous environment consisting of Informatica, Oracle, Teradata, IBM, and others. A short term goal was to find a new solution for their transaction log (TLOG) processing, which was a bottleneck on the incumbent DB2 system that resulted in throwing data away.
Solution: Costco implemented Cloudera on Cisco UCS hardware to integrate their many different in-house technologies within an interconnected enterprise data hub environment, leveraging tools including Impala, Mahout, Spark, and Solr to support a variety of user requirements and workloads.
Benefit: Costco no longer throws data away, and their data processing performance has been accelerated, while alleviating at least 20% of the load on IBM systems. The traditional infrastructure is 100X more costly than Cloudera combined with Cisco hardware, so any relief on the legacy environment saves the company substantially. Their ultimate goal is to be able to capture and process all incoming data for a more comprehensive understanding of product movement by location, driving smarter decisions in market research and procurement strategy.
SFR: From analytics for some to insights for all.
Challenge: SFR’s data warehouse has served the company well for ten years, containing data on products, device usage, invoices, contracts, price plans, and call detail records (CDR), but SFR wanted to create a shared, detailed view into the customer journey -- available to employees across the company -- for real-time search, reporting, and analysis, while also aiming to bring in multi-structured data from new sources.
Solution: By complementing its data warehouse infrastructure with Cloudera Enterprise, Data Hub Edition, SFR is delivering the 360-degree view that will help the company optimize its customer journey. SFR’s 9,000+ employees now have access to a self-service discovery environment enabling query and exploration of a single, centralized data store.
Benefit: Employees across the country can now operate based on a centralized, real-time customer view that spans many devices and data sources. And by offloading large-scale data ingest, processing, and exploration of multi-structured data sets from the data warehouse to Cloudera, the data warehouse will deliver optimal system performance for 8-9 years, vs. needing an upgrade every 3 years.
MasterCard: From risk due to regulation and customer privacy concerns to trust in a secure and compliant platform.
Challenge: Given the growing importance of Hadoop in MasterCard's long-term enterprise data hub strategy, its Cloudera platform required PCI Certification to allow Hadoop not only to host PCI or potentially PCI datasets, but also to integrate with other PCI-certified environments.
Solution: MasterCard's Cloudera environment -- which is a key component of the company's overarching enterprise data hub strategy -- has been certified by an external auditor to fully conform to the PCI DSS Version 2.0 security standard.
Benefit: Achieving PCI compliance on its Cloudera environment is not only important in the evaluation of current workload suitability for Hadoop at MasterCard, but also for a new generation of applications and datasets that can now be hosted on the Cloudera platform which require a PCI Certified environment.
Reference Architecture taken from Hadoop Security, written by Clouderans.
Now that we’ve discussed the need for, and framework of, the PCI solution, the next question is “OK, how do we get started and what do timelines look like?”
Upon a client decision to engage with Cloudera and MasterCard on PCI compliance, we will begin a three-phase engagement that will take approximately 35 to 45 weeks. Without this solution, implementing a compliant big data infrastructure could take at least two years and carry inherent security risks. Cloudera and MasterCard will reduce time to implementation and increase security reliability.
Phase 1: Assess: In the initial phase, Cloudera and MasterCard will work with you to understand your current environment and define a plan/roadmap to march your business towards PCI compliance.
Phase 2: Configure/Repair: In the second phase, Cloudera and MasterCard will leverage the plans created in the first phase not only to build out the software, but also to build the proper documentation, roles, and processes to ensure that the implementation will be successful.
Phase 3: Report and Present: In the final phase, we will perform an audit assessment along with final documentation. We will also educate auditors on Hadoop and make sure that the client is set up for success.
The three-step process to PCI compliance of (1) Assess, (2) Configure/Repair, and (3) Report/Present helps to ensure we’ve achieved a 360-degree view of compliance. It’s important to remember that compliance isn’t just about technology, but also the people and processes that support and access the technology platform. Our documentation and training will make sure that your processes, people, and technology complement each other, which will help ensure your enterprise security and compliance.
1. Technology
Enterprise Data Hub built on Hadoop
Massive full-fidelity data retention (HDFS, HBase)
Reporting and retrieval for audit (Impala)
Scalable data security (Navigator Encrypt & Key Trustee)
Central administration and governance for lifecycle management (Cloudera Manager)
2. Process
Security Integration & Transformation with MasterCard Advisors
Review security requirements
Audit architecture and current systems
Tailor a security reference architecture
Review audit and lineage
Install and configure custom system
Implement ongoing compliance plan
Ongoing process transformation
3. People
Technical Training & Support
Removal of complexity to deliver results
Leadership for broad and deep adoption
Experience to develop effective projects
Proactive technical guidance/planning
Predictive performance optimization
Implementation of ongoing compliance
Cloudera and MasterCard Advisors are partnering to bring the leaders in payments and PCI compliance together with the world’s leading Hadoop distribution. Cloudera’s unique security offering on Hadoop pairs well with MasterCard Advisors’ deep industry knowledge to provide a solution that no two other partners can deliver to every organization and company handling credit card data, ranging from payment card processors to retailers to banks and beyond.
It’s important to remember that MasterCard is a payments company with a deep and rich history of focusing on technology and data. MasterCard sits at the core of over 2 billion payment cards and 65 million transactions each minute. MasterCard Advisors is their consulting and innovation branch, which keeps MasterCard and its clients at the forefront of the payments industry.
The simplest way to comply with the PCI DSS requirement to protect stored cardholder data is to encrypt all data-at-rest and store the encryption keys away from the protected data. An enterprise data hub featuring Cloudera Navigator—the first fully integrated data security and governance application for Hadoop-based systems—is the only Hadoop platform offering out-of-the-box encryption for data-in-motion between processes and systems, as well as for data-at-rest as it persists on disk or other storage media.
Within the tool, the Navigator Encrypt feature is a transparent data encryption solution that enables organizations to secure data-at-rest in Linux. This includes primary account numbers, 16-digit credit card numbers, and other personally identifiable information. The cryptographic keys are managed by the Navigator Key Trustee feature, a software-based universal key server that stores, manages, and enforces policies for Cloudera and other cryptographic keys. Navigator Key Trustee offers robust key management policies that prevent cloud and operating system administrators, hackers, and other unauthorized personnel from accessing cryptographic keys and sensitive data.
Navigator Key Trustee can also help organizations meet the PCI DSS encryption requirements across public networks by managing the keys and certificates used to safeguard sensitive data during transmission. Navigator Key Trustee provides robust security policies—including multifactor authentication—governing access to sensitive secure socket layer (SSL) and secure shell (SSH) keys. Storing these keys in a Navigator Key Trustee server will prevent unauthorized access in the event that a device is stolen or a file is breached. Even if a hacker were able to access SSH login credentials and sign in as a trusted user, the Navigator Key Trustee key release policy is pre-set to automatically trigger a notification to designated trustees requiring them to approve a key release. If a trustee denies the key release, SSH access is denied, and an audit log showing the denial request is created.
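The trustee approval flow described above can be sketched as follows. This is a simplified model for discussion, not the Navigator Key Trustee API: the `KeyServer` class, the `ask_trustee` callback, and the audit entry fields are all hypothetical names. It shows the two behaviors the notes call out, namely that a key is released only after a trustee approves, and that every request, including a denial, leaves an audit record.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List, Optional


@dataclass
class AuditEntry:
    timestamp: datetime
    requester: str
    key_id: str
    decision: str  # "released" or "denied"


@dataclass
class KeyServer:
    """Toy key server that holds keys apart from the systems that use them."""
    keys: Dict[str, bytes]
    # Hypothetical hook: notifies a designated trustee and returns their decision.
    ask_trustee: Callable[[str, str], bool]
    audit_log: List[AuditEntry] = field(default_factory=list)

    def request_key(self, requester: str, key_id: str) -> Optional[bytes]:
        # Unknown keys are denied without bothering the trustee.
        approved = key_id in self.keys and self.ask_trustee(requester, key_id)
        # Every request is logged, whether it was approved or denied.
        self.audit_log.append(
            AuditEntry(
                timestamp=datetime.now(timezone.utc),
                requester=requester,
                key_id=key_id,
                decision="released" if approved else "denied",
            )
        )
        return self.keys[key_id] if approved else None
```

In this sketch, valid login credentials alone are not enough: a request the trustee rejects returns no key, and the rejection itself is recorded.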
With Navigator Encrypt, only the authorized database accounts with assigned database rights connecting from applications on approved network clients can access cardholder data stored on a server. Operating system users without access to Navigator Encrypt keys cannot read the encrypted data. Providing an additional layer of security, Navigator Key Trustee allows organizations to set a variety of key release policies that factor in who is requesting the key, where the request originated, the time of day, and the number of times a key can be retrieved, among others.
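As a rough illustration of how those policy factors combine, the sketch below checks a key request against who is asking, where the request comes from, the time of day, and a retrieval limit. The `KeyReleasePolicy` class and its fields are hypothetical and chosen only to mirror the four factors named above; the real product's policy configuration is not shown here.

```python
from dataclasses import dataclass
from datetime import time
from ipaddress import ip_address, ip_network
from typing import FrozenSet


@dataclass
class KeyReleasePolicy:
    """Hypothetical policy covering requester, origin, time of day, and count."""
    allowed_requesters: FrozenSet[str]
    allowed_network: str   # CIDR block, e.g. "10.0.0.0/24"
    window_start: time     # assumes the window does not cross midnight
    window_end: time
    max_retrievals: int
    retrievals: int = 0

    def permits(self, requester: str, source_ip: str, at: time) -> bool:
        ok = (
            requester in self.allowed_requesters
            and ip_address(source_ip) in ip_network(self.allowed_network)
            and self.window_start <= at <= self.window_end
            and self.retrievals < self.max_retrievals
        )
        if ok:
            # Count only successful releases against the retrieval limit.
            self.retrievals += 1
        return ok
```

All four conditions must hold at once, so a request from an approved account still fails if it arrives from an unapproved network, outside the allowed hours, or after the retrieval limit has been reached.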
Our security story is one that we’re building hand-in-hand with Intel. In 2013, Intel established Project Rhino, which is a blueprint for enterprise-grade security. It’s meant to address many of the security concerns with Hadoop and we are working closely with them on many of these concerns – specifically around delivering unified authorization for Hadoop through Apache Sentry and bringing new encryption and key management frameworks to a Hadoop cluster.