Success or failure? How to handle Big Data in the financial sector, Keith Prince, IBM UK
1. Making Sense of Big Data - The Highs
and Lows of Big Data in Financial
Services
Keith Prince – IBM Industry Solutions Executive, EMEA
2. What Is Big Data & Big Analytics?
"Big Data & Big Analytics" are terms applied to data sets whose size is
beyond the ability of commonly used software tools and data management
processes to capture, manage, and process the data within an acceptable
elapsed time.
“If we look at our financial services clients, tick data is prevalent—they’re
holding 40,000 to 50,000 positions and a universe of 7 million to 8 million
securities. With that going back 15 years, it’s a lot of information. If you look at
compliance, there are things like phone transcripts — you’re storing every
phone call made in and out of a bank and again, that’s an enormous quantity of
data. We’re seeing ways of communicating outside of the telephone proliferating
as well. Whereas before you’d see email archives getting up to thousands of
terabytes, you’re now tracking and monitoring instant messaging and social
media.”
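The tick-data figures quoted above invite a back-of-envelope sizing exercise. The sketch below uses the slide's numbers (7-8 million securities, 15 years of history); the bytes-per-tick and ticks-per-day values are purely illustrative assumptions, not IBM data.

```python
# Back-of-envelope sizing of a tick-data store. Only the security count and
# history depth come from the text; the per-tick constants are assumptions.

BYTES_PER_TICK = 64          # assumed: timestamp, price, size, venue, flags
TICKS_PER_DAY = 5_000        # assumed average across liquid and illiquid names
TRADING_DAYS_PER_YEAR = 252

def tick_store_bytes(securities: int, years: int) -> int:
    """Raw (uncompressed) size of the tick history in bytes."""
    return securities * years * TRADING_DAYS_PER_YEAR * TICKS_PER_DAY * BYTES_PER_TICK

size_pb = tick_store_bytes(7_500_000, 15) / 1e15
print(f"~{size_pb:.0f} PB uncompressed")  # → ~9 PB
```

Even under these conservative assumptions the raw store lands in petabyte territory, which is why "commonly used software tools" struggle with it.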
4. Challenges Facing Financial Services Firms
• Balancing Risks & Rewards
• Increased Regulatory Pressures
• Performance & Growth in Downturn
• Global Market Volatility
• Customer Engagement & Trust
• Operational Efficiency and Cost Reduction
5. Challenges Facing Financial Services Firms
1 in 3 business leaders frequently make decisions based on information they don't trust, or don't have.
83% of CIOs cited "business intelligence and analytics" as part of their visionary plans.
1 in 2 business leaders say they don't have access to the information they need to do their jobs.
60% of CEOs need to do a better job capturing and understanding information rapidly in order to make swift business decisions.
Customer & Market Insights
• Develop client analytics and new segmentation strategies
• Optimize client channel interactions
• Attract and retain specialized skills
• Re-establish the brand/customer trust
• Turn clients into advocates
• Enhanced product development
Business Agility
• Add new data quickly & cost effectively to enhance competitiveness
• Break line of business silos
• Modernize IT delivery
• Ability to use analytics anywhere
• Protect investment in business analytics
Enterprise Risk Management
• Adjacent & end-to-end risk modeling
• Create a risk-aware culture
• Improve/automate compliance frameworks
• Continuously measure and forecast risk
• Address risk models, scenarios, stress testing and data quality
9. Building Insight Is A Step-by-Step Process
1. Categorize, describe & model prospects & customers
2. Forecast campaign performance
3. Campaign selection
4. Campaign execution
5. Campaign measurement
6. Propensity modelling
7. Simple optimisation
8. Full optimisation
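The propensity-modelling step in the ladder above can be sketched minimally: score each customer's likelihood of responding to a campaign and target the high scorers. The features, weights and threshold below are toy values for illustration; a real model would be fitted on campaign history.

```python
# Minimal propensity-scoring sketch (logistic model with assumed weights).
import math

def propensity(features, weights, bias):
    """Logistic score in [0, 1]: P(customer responds to the offer)."""
    z = bias + sum(f * w for f, w in zip(features, weights))
    return 1.0 / (1.0 + math.exp(-z))

# assumed features per customer: [recency_months, frequency, monetary_k]
weights = [-0.4, 0.3, 0.05]   # illustrative coefficients, not fitted
bias = -1.0

customers = {"A": [1, 12, 40], "B": [10, 1, 2]}
scores = {k: propensity(v, weights, bias) for k, v in customers.items()}
target = [k for k, s in sorted(scores.items(), key=lambda kv: -kv[1]) if s > 0.5]
print(target)  # → ['A']
```

The campaign-selection step then consumes the ranked `target` list rather than the raw customer base.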
10. Segmenting Value In A Social Context
Brand Advocates (High Value)
'I want all my friends, family and community to benefit from my positive experiences'
• Active communicator • Values oriented & chooses brands that match
• Engages with multiple high value brands • Significant wallet share
Engaged Consumers (High-Medium Value)
'I like being recognized for sharing my experiences and ideas'
• Highly engaged over long term • Wants a more personalized experience
• Connects with FFF • Enjoys being served
Smart Purchasers (Medium Value)
'I want the best products for my family & lifestyle'
• Low-Medium engagement but RFV is high • Sees rewards as part of price
• Information is important • QVC important
Price Conscious (Low Value)
'I need to buy the best value product'
• Low engagement & value • Only needs basic product
• Shops for simplicity & price • Not brand conscious
Engagement axis (value analytics → engagement analytics): Transaction Based → Rewards Seekers → Loyalty Matters → Community Spirited → Social Activist
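The two-axis segmentation above (value vs. social engagement) can be expressed as a simple mapping. The thresholds below are illustrative assumptions; in practice the scores would come from RFV and engagement analytics.

```python
# Sketch of the slide's four-segment scheme. Cut-offs are illustrative only.

def segment(value_score: float, engagement_score: float) -> str:
    """Map value/engagement scores (each 0-1) to one of the four segments."""
    if value_score >= 0.7 and engagement_score >= 0.7:
        return "Brand Advocate"
    if engagement_score >= 0.5:
        return "Engaged Consumer"
    if value_score >= 0.4:
        return "Smart Purchaser"
    return "Price Conscious"

print(segment(0.9, 0.8))  # high spend, active communicator → "Brand Advocate"
print(segment(0.5, 0.2))  # decent RFV, little engagement → "Smart Purchaser"
```

A scheme like this is the "multiple personas" idea in its simplest form: each customer gets a position on both axes rather than a single cluster label.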
11. Things Can Go Wrong As Well As Right
• Bad data is inevitable as volume and variety increase
• Big Data technologies aren't a universal hammer
• It's not about acquiring more data, it's about understanding context
• How quickly can you react to unforeseen events?
• Sometimes the data is the question – data finds data
• More data should lead to better prediction
• Sometimes you already have the data you need to make a start
• Big Data can help you think – what if this was different?
• Single version of the truth – or all versions of the truth?
• Adopt a policy of Test & Learn
• Structured and unstructured content are both data
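"Test & Learn" in practice usually means controlled experiments: run a change on a test group, keep a control group, and check whether the observed lift is real. A minimal sketch, using a standard two-proportion z-test and made-up campaign numbers:

```python
# Two-proportion z-test for a Test & Learn campaign comparison.
# The conversion counts below are invented for illustration.
import math

def z_test(conv_a, n_a, conv_b, n_b):
    """z statistic for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p = (conv_a + conv_b) / (n_a + n_b)               # pooled rate
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# assumed: control converts 200/5000, variant converts 260/5000
z = z_test(200, 5000, 260, 5000)
print(f"z = {z:.2f}")  # |z| > 1.96 → significant at the 5% level
```

The point of the policy is that a "failed" test is still a learning: the result feeds back into the next segmentation or campaign iteration.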
12. Getting It Right From The Start
• Ensure exec sponsorship for using Big Data to make it easier and more cost effective to deliver, manage and change
• Design-in acquisition, storage and use of the best detail (geocode & timestamp everything, risk-to-value analysis vintages)
• Design-in flexibility to update data and deliver insights/outcomes at varying speeds and times
• Testing requirements will ask big questions of the data infrastructure
• Capture all data/signals around the creation and use of the model(s), data, validation and sign-off by internal compliance owners and governance oversight committee(s)
• Design data and auditing flows to easily cope with adjacent models and more than one exchange standard
• Maximise integration, attribution and correlation of diverse alternative data
• Help to incorporate long-tail walk-throughs for modelled and un-modelled risks
• Use robust data interpolation techniques to upscale short-term history (climate model based on <100 years of observations when we need 1 in 200/300 context)
• Use multiple information infrastructure delivery models to test and implement scalable architecture (insourced/outsourced, managed, cloud, MPP, Hadoop, etc.)
IBM Big Data Platform diagram: Analytic Applications (BI/Reporting, Exploration/Visualization, Functional App, Industry App, Predictive Analytics, Content Analytics); platform services (Visualization & Discovery, Application Development, Systems Management, Accelerators); engines (Hadoop System, Stream Computing, MPP Data Warehouse); Information Integration & Governance.
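Several of the bullets above come down to designing data quality in from the start (geocode and timestamp everything, then score how complete the result is). A tiny sketch of a per-record completeness score; the field names are illustrative, not a real schema:

```python
# Completeness scoring for a risk/exposure record (illustrative fields).

REQUIRED = ("geocode", "timestamp", "construction", "occupancy")

def completeness(record: dict) -> float:
    """Fraction of required attributes that are present and non-empty."""
    present = sum(1 for f in REQUIRED if record.get(f) not in (None, ""))
    return present / len(REQUIRED)

rec = {"geocode": "51.5,-0.1", "timestamp": "2012-06-01T09:00Z",
       "construction": "masonry", "occupancy": ""}
print(completeness(rec))  # → 0.75, the empty occupancy field is penalised
```

Aggregating such scores across a portfolio gives the "data quality map" that the testing requirements call for.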
13. Big Data Experiences
11 TRILLION CALCULATIONS IN 6 MINUTES – Top 5 Investment Mgmt Firm
200-1000x speed up of real-time analysis incorporating 100,000 simulations & 27 economic sensitivities for 400,000 securities across 150 funds, at 10% of the previous system's TCO.
CONTINUOUS ASSESSMENT OF CATASTROPHIC EVENTS – Global Re-Insurer
Catastrophe modeling utilizes diverse data sources, including real-time weather and event monitoring; run time reduced from 7 days to 2 hours, helping to minimize portfolio risk, maximize underwriting opportunities and optimize capital reserves.
DYNAMIC RISK BASED PRICING – A New Market Entrant
Dynamic pricing and driver risk evaluation based on miles covered, time of journey and location, utilising data from a telematics device in the insured vehicle.
THE ANALYTICS PLATFORM FOR INTERNAL MODELING & ORSA – A Top 3 UK Insurer
Collecting all customer, policy, financial and market data to explore competing approaches and create the most accurate capital and integrated risk models.
MAKING SENSE OF 40 MILLION EMAILS A MONTH – A Top 3 US Bank
Mining emails for consumer sentiment and employee effectiveness; analysing voice data for changes in tone and inflection has helped to increase the Promoter universe.
ANALYZING DATA @ MARKET SPEED – Global Exchange Group
Stays ahead of the game despite trade speeds measured in microseconds, data volumes compounding at 100% pa, peak volumes that can double in a day and 4TB of fresh data every day.
14. Big Data Success – By Design
Case Study | Description | Relevance to FS Clients
Enhanced Customer Profiling | Consumer profiling for offer targeting and optimisation by the merchant/issuer | Stimulates usage and provides opportunities for campaign mgmt
Merchant Offer Targeting | Offers linked to consumer's location and proximity to qualifying (or new) merchants | Trigger/event based marketing service
Social Media Analytics | Classification of intent, sentiment, influence and engagement | Addition of social dimensions to consumer profiling across brands/time
Lifestage Merchant Campaigns | Enhanced/enriched consumer profiling to recognise lifestage events | Enables merchants to develop pro-active campaigns for lifestage signals
Customer Experience Management | Use all available information sources to understand and monitor 'experience' | Issuer and merchant relevance, but also indicative of quality of service
Customer Data Integration | CDI for consumer and commercial entities and ability to graph interrelationships | Extends influencer network analysis beyond 'the brand' – yield and uplift impact
Fraud Detection | Expand data inputs to improve detection and prevention of loss due to fraud | Economic value of accurately identifying fraud events to improve detection & loss prevention
Catastrophe Modelling | Ingestion and analysis of massive amounts of 'event' data for catastrophe modelling | Ingestion and correlation of a wide array of data to predict impact on engagement
Market Smart Service | FICO's marketing analytics service for its customers – expansion of ad-hoc analytics | Flexible analytic appliance to provision data and analytic services at a low cost point
Transaction Analysis | Consumer spend and merchant category profiling across all history + fraud for credit card association network | Broad spectrum analysis of all transaction data for profiling and opportunity analysis
Marketing Services | Direct behaviour based marketing for customer incentive and loyalty programs – retail + CPG + health | Creation and delivery of incentive programs
Campaign Management | Reduction in campaign management latency – event lag | Campaign execution in less time and at less cost
Nielsen Customer Insight | Improving the delivery of data services from one of the largest global brands | Delivering data services and insights that are easy to consume to tight SLAs – at volume
The German Government 'finds' itself €55bn richer than it thought. Not every data problem can, or should, be a Big Data problem. The hype makes people try to solve everything with Hadoop & Co., move everything to NoSQL databases and stop thinking in relational terms. In the end, every traditional BI engineer, DB administrator and similar likes to be perceived by management as working with cutting-edge technology: "We are doing Big Data."

There were a number of major PR gaffes in the last year, where banks were caught short over social media usage. The biggest was from Bank of America, who tried to introduce a charge of $5 a month to use debit cards in October 2011, in response to the Durbin amendment to the Dodd-Frank Act that limits debit card transaction charges to 12 cents per transaction. Customers didn't like the new fee, and one of them, Molly Katchpole, a 22-year-old, forced the bank to change its position purely by using Change.org to create a petition that garnered over 300,000 signatures. The episode earned BoA the award for worst PR gaffe of 2011 and a 20 percent increase in account closures in Q4 2011.

But smarts requires much more than just available data and good correlation. Two additional critical elements of smart systems are:
1. An ability to make assertions based on new data points
2. An ability to use new data points to reverse earlier assertions

You can have issues over reputation, but the number one reason customers close accounts is when mistakes are made. In this case, the RBS glitch was huge, with customers at Ulster Bank locked out of their accounts for almost a month. The mistake was made by an update to CA-7, a core payments program, which corrupted the payments files. The issue has been that no matter how hard RBS try, they cannot recreate the issue to find out what caused it, or so I hear.

You would think that with global AML controls in place, a bank of HSBC's size and breadth could handle a little tracking of terrorist cash. David Bagley, Global Head of Compliance for HSBC, has been co-chair of the Wolfsberg Group, which has set the rules for anti-money laundering worldwide, since 2005, so they should know something about it. However, he and the bank got caught out as accounts in Mexico enabled drug cartels to move money via the Cayman Islands, and terrorists were engaged in similar activities via the Saudi bank division and its counterparty Al-Rajhi Bank.

It's not so much that LIBOR had faults so that rates could be rigged, which we all now know, but more that the messenger got shot. Barclays were the messenger who blew the whistle on LIBOR rate rigging. That's why they got the first massive fine, as they had been co-operating with the authorities. Meanwhile, all the other banks have now been drawn into the crossfire, with HSBC, RBS, Lloyds, Deutsche, Mitsubishi and more being investigated and facing multimillion-dollar fines. So what's the problem here? That Barclays allowed the release of the LIBOR news to make them look like the sole bad guys. Result: Diamond, del Missier, Agius and more all go, and the bank becomes a headless mess. At the same time, Barclays had to pull all online advertising due to negative customer feedback.

Some banks have been able to double the share of customers that accept offers of loans and reduce loan losses by a quarter, simply by using data they already have. Card networks and other retailers are also getting in on this business. In America, Visa has teamed up with Gap, a clothes retailer, to send discount offers to cardholders who swipe their cards near Gap's stores.

Yet in peering so obviously into people's spending habits, banks run a risk of spooking their customers and falling foul of privacy advocates. Target, an American retailer, received unwelcome attention earlier this year when it reportedly discovered from a teenage girl's shopping patterns that she was pregnant, and mailed her baby-related coupons, before she had told her father. A less controversial way of using the data banks hold is to draw on it to offer something genuinely useful to their customers. Britain's Lloyds Banking Group is thinking of tweaking its systems to tell customers not just how much money is in their accounts when they ask for a balance, but also how much they will have available once all their usual bills are paid. "We have deep and rich information about customers that we can use to give them better insights, rather than just providing us with better insight to improve our risk management," says Alison Brittain, head of consumer banking at Lloyds.

Yet even as Big Data is helping banks, it is also throwing up new competitors from outside the industry. One such firm is ZestCash, which provides loans to people with bad or no credit histories. It was started by Douglas Merrill, a former chief information officer and head of engineering at Google. The big difference between ZestCash and most banks is the sheer quantity of data that the firm crunches. Whereas most American banks rely on FICO credit scores, thought to be based on 15-20 variables, such as the proportion of credit that is used and whether payments have been missed, ZestCash looks at thousands of indicators. If a customer calls to say he will miss a payment, most banks would see this as a signal that he is a high risk. But ZestCash has found that such customers are in fact more likely to repay in full. Another useful signal is the length of time customers spend on ZestCash's website before applying for a loan. "Every bit of data is noise, but when you add enough of them together in a clever enough way you can make sense of the garbage," Mr Merrill said at a recent conference.

Tesco Bank is using its retail and finance data to create multiple personas that will allow it to target a series of specific profiled behaviours for each individual, rather than cluster individuals into one profile. Multi-dimensional segmentation schemas are becoming more important.
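The Lloyds idea mentioned above (showing not just the balance but what will be left after the usual bills) is, at its core, a simple projection over scheduled payments. A minimal sketch; the function and field names are hypothetical, not Lloyds' implementation:

```python
# "Available after bills" projection: balance minus known upcoming bills.
# Names and numbers are illustrative only.

def available_after_bills(balance: float, scheduled_bills: list) -> float:
    """Balance remaining once all known upcoming bills are paid."""
    return balance - sum(scheduled_bills)

# e.g. rent, utilities and a subscription due before the next pay day
print(available_after_bills(1200.0, [450.0, 60.0, 35.5]))  # → 654.5
```

The hard part is not the arithmetic but reliably identifying "usual bills" from transaction history, which is where the customer analytics discussed throughout this deck come in.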
• Ensure exec sponsorship for data assurance by using Big Data to make it easier and more cost effective to deliver, manage and change.
• Design-in acquisition, storage and use of the best exposure detail (geocode everything, accurate replacement costs & constraints, insurance-to-value analysis vintages).
• Design-in flexibility to update data and deliver insights/outcomes at varying speeds and times. Not everything conforms to daily or weekly dimensions, or to batch and real time; on-demand (operational ad-hoc) is a time dimension in its own right. Timing (inflow and outflow) will need to vary depending on the portfolio (e.g., personal lines versus commercial lines, small risks versus large risks).
• Testing requirements will ask big questions of the data infrastructure, for example:
  - a data quality map to show occupancy, construction and geocoding attribution
  - data validation tests, such as comparison of exposure changes (over varying time sequences) for each insured and for the portfolio as a whole
  - historical loss experience to help identify specific portfolio coding issues and behaviour vs. model construction and assumptions
  - portfolio-specific data quality sensitivity tests as a regular part of the portfolio risk analysis process, incorporated into risk decision making; include data quality scoring (completeness, accuracy, use) throughout the life-cycle
• Capture all data/signals around the creation and use of the catastrophe model(s), data, validation and sign-off by internal compliance owners and governance oversight committee(s):
  - understand and value any models tested and used
  - understand key sources of model uncertainty and their impact on the portfolio
  - parallel-run R&D (challenger) models alongside core risk management (champion) models to better understand the sensitivity of results to key areas of uncertainty and market behaviour
  - run multiple analytical cycles on both exposure and model assumptions to quantify risk boundaries
  - ensure model transparency across the code set and the data flows around the model
  - support multiple data model standards for inflows and outflows
• Design data and auditing flows to easily cope with more than one cat model and more than one exchange standard.
• Maximise integration, attribution and correlation of diverse alternative data:
  - include all risk data that could change the loss in the model (e.g., demand surge, business interruption, additional living expenses, loss adjustment expenses)
  - depending on the coverages underwritten, include fire following earthquake, property contents, business income, marine, energy, flood, and auto physical damage in the model, or determine an additional estimate for these potential losses outside the modelled results for risk families adjacent to the event or area of impact
  - make it easy for all stakeholders and industry agencies to peer review and test the reasonableness of model outputs and revisions to the model
  - help to incorporate long-tail walk-throughs for modelled and un-modelled risks
• Use robust data interpolation techniques to upscale short-term history (e.g., a climate model based on <100 years of observations when we need 1-in-200/300 context).
• Use multiple information infrastructure delivery models to test and implement scalable architecture (insourced/outsourced, managed, cloud, MPP, Hadoop, etc.).
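The champion/challenger parallel run described in the notes above can be sketched as a per-scenario comparison of modelled losses. The loss models and severity figures below are toy stand-ins for real catastrophe models:

```python
# Champion/challenger parallel run: where do the two models diverge?
# Both "models" here are illustrative closed-form stand-ins.

def parallel_run(scenarios, champion, challenger):
    """Per-scenario loss divergence between champion and challenger models."""
    return {s: challenger(s) - champion(s) for s in scenarios}

# toy loss models: loss as a function of event severity; the challenger
# assumes 10% demand surge on severe (tail) events
champion = lambda severity: 100 * severity
challenger = lambda severity: 100 * severity * 1.1 if severity > 5 else 100 * severity

divergence = parallel_run([1, 5, 8], champion, challenger)
print(divergence)  # only the tail scenario (severity 8) shows disagreement
```

Large divergence concentrated in the tail is exactly the signal the notes ask for: it localises which assumptions (here, the surge factor) drive sensitivity, so they can be prioritised for validation.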