1. Information 3.0: The Future of Data Management Lisa Schlosser, CTO Thomson Reuters Content Marketplace & Charlie Vanek, Sr. Director, Product Marketing, Hubbard One
2. The Future of Data Management Introduction Data: Big Data Drives Transformational Value Technology: Case Studies in Big Data Lisa Schlosser Charlie Vanek People: Collaboration between CTO & CMO Recommendations to overcome Impediments to Transformational Value 2
3. Big Data Big Data: datasets whose size is beyond the ability of typical software tools to capture, store, manage and analyze 3
5. [Figure: hype cycle of emerging technologies, as of July 2011. Axes: expectations (vertical) vs. time (horizontal). Phases: Technology Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, Plateau of Productivity. Legend: years to mainstream adoption (less than 2 years, 2 to 5 years, 5 to 10 years, more than 10 years, obsolete before plateau). Highlighted entry: "Big Data" and Extreme Information Processing and Management.]
11. Big Data Tailored to business, KPIs Deployed sequentially, building capabilities over time IT Investment evolved simultaneously with managerial innovation 11
12. Big Data Big Data: will it produce sector-wide productivity gains like it did for Big Iron? 12
16. CONTENT MARKETPLACE is an information architecture – a set of common standards and policies for the way we create, consume, describe, manage and distribute our content – that enables content interoperability
17. Technology Content Marketplace enabling content interoperability across Thomson Reuters – People Data Content Marketplace is helping us innovate, locate, and understand our content Thomson Reuters-wide 17
18. Technology: CLEAR Product Example Documents Entity Extraction Relationships and Attributes People Warehouse People Authority R Company Warehouse R Company Authority 18
20. Technology: Content Marketplace Content Marketplace is enabling content interoperability across Thomson Reuters. Big data will be solved through a combination of enhancements to the people, processes and technology strengths of an enterprise. Financial Science Officers and Directors Researchers Legal Who’s Who In China Media Attorneys Tax & Acct Journalists Accountants 20
26. Technology: Business of Law Example. Hon. R. Jones: Judge, 9th Circuit. Sam Carson, Orrick, Herrington: Plaintiff, Client: Intel. Mary Vasaly, Maslon Edelman: Defense, Client: AMD.
28. Technology: Business of Law Example “Things we’re interested in” Relationships and Profiling Information Data Sources Documents Entity Extraction Relationships and Attributes People Warehouse ERM People Authority EM R T & B Company Warehouse R Company Authority Third-party data 28
37. Telephone #’s
ERM: With a relationship strength index of Z
EM: With whom we’ve done specific corporate work
T & B: Top 25 Companies for Practice Area X by Billing
Third-party data: With YoY revenues of +5%
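The slide above layers one filter per data source onto a single company list. A minimal sketch of that combination follows; the company names, scores, and thresholds are all hypothetical, and reading "+5%" as "at least 5% growth" is an assumption.

```python
# Hypothetical per-source attributes keyed by company name (all values invented).
relationship_strength = {"Acme": 0.9, "Globex": 0.4, "Initech": 0.8}  # ERM index
prior_corporate_work = {"Acme", "Initech"}                            # EM: past matters
billing_rank = ["Acme", "Globex", "Initech"]                          # T & B: ranked by billing
yoy_revenue_growth = {"Acme": 0.07, "Globex": 0.06, "Initech": 0.02}  # third-party data


def find_targets(min_strength, min_growth, top_n=25):
    """Return companies that pass every source's filter, sorted by name."""
    top_billers = set(billing_rank[:top_n])
    return sorted(
        company
        for company in top_billers
        if relationship_strength.get(company, 0.0) >= min_strength
        and company in prior_corporate_work
        and yoy_revenue_growth.get(company, 0.0) >= min_growth
    )


targets = find_targets(min_strength=0.5, min_growth=0.05)
print(targets)
```

Each source contributes one condition, so a new source can be added as one more clause without touching the others.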
60. Recommendations for Infrastructure
A strong information infrastructure assumes a phased implementation approach that utilizes legacy systems, designs for scale, and prioritizes by information valuation.
Recognize that different project team members use different applications, formats, and standards to exchange information. Look for common ways to normalize and extract meaning from all types of content so that it can be exchanged across the organization.
Assess the current range of information management infrastructure capabilities, identifying gaps (missing capabilities) and overlap/redundancies (multiple approaches to a capability, and/or multiple technologies supporting it).
Use existing systems and designs as starting points to develop common models that can then be shared by different processing components and system entities.
Identify an initial set of information management “common capabilities” and begin to leverage these in support of in-demand cases.
Information Valuation - Identify information that matters most
69. Example of Big Data Transformational Value. Exhibit identifies the different ways in which data is used and structured by an organization, and the competitive advantage that organizations achieve. Each of these functions addresses a range of questions about an organization’s business processes. Analytics provides a higher level and proactive solution to these questions. Source: IBM
80. Technology: Business of Law Example “Things we’re interested in” Relationships and Profiling Information Data Sources Customer Warehouse CRM People Authority Accounting System 56 User Behavior on WL Third-party data
82. Reps achieved 105% of quota and grew sales 116% over prior year
SMART Leads has increased my number of potential sales for existing accounts: 80%
Using SMART Leads saves me time compared to my previous process: 67%
SMART Leads has increased my number of new prospects: 60%
Thanks to SMART Leads, I am more effective at converting leads: 53%
Thanks to SMART Leads, I can sell more in less time: 47%
SMART Leads has prompted me to pitch products that I wouldn’t have otherwise considered: 33%
Note the three themes at the forum: Data, Technology & People are covered here.
But we can say “in the big,” as a former colleague used to put it, that investments in IT capital deepening, along with managerial innovation, have led to gains in productivity for businesses. The question is, will investments in Big Data produce similar productivity gains?
What is that “Other Contribution”?
History at TR
Over 24 years of technology experience (started as programmer)
Held many leadership positions across TR (Application development, data center, Shared Platforms, BU CTOs)
Current Role as CTO Content Marketplace
Content Marketplace is helping our business make sense of the content it owns, understand ever-expanding attributes about that content, and understand and build relationships between content. It’s improving the customer experience for our users as well as helping us identify new product opportunities. Let’s talk about how we are doing that.
CLEAR is an investigative platform designed for professionals who need information about individuals and companies. With fast access to a vast collection of public and proprietary records, CLEAR helps you find out about people and their connections.
Marriage license: 2 names, common address
News article: indicates company and phone number and mentions someone who works there
Boat license: owner and boat ID
Car lease: owner and make, model, serial number of car
A crime happens and someone remembers a CA license plate but only remembers the FOR part of the number. We look up the FOR in CA and find many hits but only a few in the bay area. We find not only the current owner of the car but the previous owner. The previous owner had several run-ins with the law so we look up his place of business, his current home address, and even his relatives.
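The lookup described above can be sketched as a partial-plate search followed by a hop from the vehicle to its owners' profiles. This is only an illustration under stated assumptions, not how CLEAR is implemented: the record layout, the `search_plates` and `owner_profiles` helpers, and every name and plate below are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class VehicleRecord:
    plate: str
    state: str
    region: str
    current_owner: str
    previous_owners: Tuple[str, ...] = ()


# Hypothetical vehicle records (invented for illustration).
RECORDS = [
    VehicleRecord("4FOR123", "CA", "Bay Area", "A. Smith", ("B. Jones",)),
    VehicleRecord("7XFOR88", "CA", "Los Angeles", "C. Lee"),
    VehicleRecord("FOR9001", "NV", "Reno", "D. Kim"),
    VehicleRecord("2ABC456", "CA", "Bay Area", "E. Diaz"),
]

# Hypothetical person profiles, standing in for a people authority.
PEOPLE: Dict[str, dict] = {
    "B. Jones": {
        "employer": "Acme Marine",
        "address": "12 Harbor St",
        "relatives": ["F. Jones"],
    },
}


def search_plates(
    records: List[VehicleRecord],
    fragment: str,
    state: str,
    region: Optional[str] = None,
) -> List[VehicleRecord]:
    """Match plates containing the remembered fragment, narrowed by state and region."""
    fragment = fragment.upper()
    return [
        r for r in records
        if fragment in r.plate.upper()
        and r.state == state
        and (region is None or r.region == region)
    ]


def owner_profiles(record: VehicleRecord, people: Dict[str, dict]) -> Dict[str, dict]:
    """Collect known profile details for the current and previous owners."""
    names = (record.current_owner, *record.previous_owners)
    return {name: people.get(name, {}) for name in names}


ca_hits = search_plates(RECORDS, "FOR", "CA")
bay_hits = search_plates(RECORDS, "FOR", "CA", region="Bay Area")
profiles = owner_profiles(bay_hits[0], PEOPLE)
```

Narrowing by region first keeps the candidate list short before the more expensive step of following relationships out from each owner.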
Most organizations have a disjointed information infrastructure: disparate capabilities deployed in a project- or application-specific manner, with little consistency. The challenges emerging from the growth of ‘big data’, as well as business demand for more flexible, timely and well-governed information in support of new use cases, are causing existing approaches to information architecture to break. Organizations that establish a road map toward a cohesive, application-independent and information source-independent set of information management technology capabilities are best positioned to support long-term enterprise information management (EIM) goals. Content Marketplace is our approach to this challenging problem.
We cannot approach this as an integration of all data into a warehouse. We need a federated model.
Internal use case: Who’s Who in China. We are utilizing the people and company information we have gathered across the organization and building a UI for journalists to find information to enhance their stories. This realizes the transformational value of new products and services via big data and Content Marketplace.
Focused on high-priority entities (relating to the political handover of power in China)
Major state-owned companies and political organizations
End-user interface for journalists’ lead generation
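To make the warehouse-versus-federation point concrete, here is a minimal sketch in which each business unit keeps its own store and answers lookups by a shared entity ID, and a federated query merges the answers at request time. The source names, IDs, and attributes are hypothetical and do not describe the actual Content Marketplace interfaces.

```python
from typing import Callable, Dict

# A source is any callable that returns attributes for a shared entity ID.
Source = Callable[[str], Dict[str, str]]


def make_source(table: Dict[str, Dict[str, str]]) -> Source:
    """Wrap a business unit's own store behind a common lookup interface."""
    def lookup(entity_id: str) -> Dict[str, str]:
        return dict(table.get(entity_id, {}))
    return lookup


# Hypothetical business-unit stores; the data stays where it lives.
SOURCES: Dict[str, Source] = {
    "financial": make_source({"P001": {"role": "Director"}}),
    "legal": make_source({"P001": {"bar_state": "NY"}}),
    "media": make_source({"P002": {"outlet": "Example News"}}),
}


def federated_lookup(entity_id: str, sources: Dict[str, Source]) -> Dict[str, Dict[str, str]]:
    """Ask every source about one entity and keep only the sources that know it."""
    merged = {}
    for name, lookup in sources.items():
        attributes = lookup(entity_id)
        if attributes:
            merged[name] = attributes
    return merged


profile = federated_lookup("P001", SOURCES)
```

The shared entity ID is what makes the fan-out possible, which is the role the common standards and authority files play in the approach described above.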
Information architecture at its simplest is structuring, labeling, and organizing information. This is a critical role in Content Marketplace and a role that hasn’t appeared (officially) in any technology group I’ve worked with in the past 20+ years. Content Marketplace has a group of about 8 now.
IA has shown itself most often in web design. IA presented unexplored terrain as the Web and related technologies were beginning to explode (30B pieces of content shared each month, some estimates indicate doubling humanity’s total output every 5 years). Someone was going to have to organize all that content. But IA is not just confined to the web. Consider the information architecture of something familiar to us: the book, with its tables of contents, indices, pagination, chapters, and the details on the spine and cover.
The enterprise setting is typically a large information space made up of disjointed silos controlled by different business units. Users are confronted by a counter-intuitive architecture that closely resembles the enterprise's org chart. Politics, culture, geography, technology, and other interesting constraints complicate this particular context. Enterprise IA is a unique IA genre, as are the architectures for ecommerce sites or entertainment sites. So we can prioritize and often prescribe which aspects of an IA will succeed in this particular genre and which won't. For example, for obvious reasons it's difficult to manually tag all documents using controlled vocabulary terms in an enterprise environment. It's easier to implement site-wide search, which doesn't necessarily require bothering content authors or manually touching their content. We can look at the entire IA palette and make some reasonable suggestions as to what will work in an enterprise environment, what won't, and when.
A well-designed information architecture maps the needs of users to available content, all against the backdrop of a specific business context.
Must take a balanced approach with focus on users and the processes that serve them.
These items are in no particular order but will need to be addressed in varying levels to be successful.
As larger amounts of data become digitized and travel across organization boundaries, privacy, security, intellectual property, and even liability policies become increasingly important.
Building a technology stack to handle big data and enabling a phased approach for legacy systems to participate must be planned for.
Skills and talents of employees need to be considered.
Access to data may need to broaden.
The industry structure can indicate how easy or difficult the transformational value will be to obtain.
Let’s start with recommendations for issues around Data Policies.
Our CLEAR example showed big data that is at the high end of security and policy concern. Public records content contains information that is highly regulated, with policy demands that differ by state and country. Thomson Reuters has several teams focused on careful management of this personally identifiable information. There are three main areas of focus: employee misuse, customer misuse, and external compromise.
Making use of large digital data sets will require the assembly of a technology stack from storage and computing through analytical and visual software applications. (Use Exadata and file distribution as an example.) Think of capabilities in terms of publishers and consumers of content. What do they need?
Legacy systems need to be taken into account. (Use WCA ID v OA ID as example.) This is where information architecture becomes critical. Utilize information architecture to identify patterns, define key terms, establish policy, and build a subject area map (high-level categorization of the types of info available in the enterprise), entity maps, information flow diagrams, etc. All of this development must consider the end customer’s needs. Information valuation will help prioritize how to approach it. (Use Org Auth example.)
Organizational leaders need to understand that big data can unlock value, and how to use it to that effect.
By 2018, the US alone could face a shortage of 140k to 190k people with deep analytical skills, as well as 1.5M managers and analysts able to analyze big data and make decisions based on their findings. That’s 50-60% greater than the projected supply.
Technologies are available to combine web analytics, subscriber analytics, social analytics, and sentiment data from multiple sources to build predictive models and determine more relevant ad placement and content matching. Data scientists with the necessary statistical, modeling, and processing skills are scarce.
The relative ease, or difficulty, of capturing value from big data will sometimes depend on the structure of a particular sector or industry. A lack of competitive intensity and performance transparency is likely to slow the full capture of big data’s benefits.