This document summarizes a presentation about using content analytics to kickstart an information governance initiative. It discusses challenges organizations face with growing data volumes and regulatory obligations. It then describes how content assessment, using analytics, classification, and collection, can help organizations understand their information landscape, prioritize efforts, and enable defensible disposition of data. The presentation includes an example case study of how one large financial organization used these techniques.
2. Cohasset Associates, Inc.
NOTES
Complex Legal and Regulatory Obligations
RISING LITIGATION COSTS
o Average cost to collect, cull and review information per legal case (1): $3M
REGULATORY OBLIGATIONS AND COMPLIANCE
o Classifying the information for regulatory obligations and compliance is complex across the enterprise’s data
Portion of Information Unnecessarily Retained (2): 70%
o Chart segments: Subject to Legal Hold; Regulatory Record Keeping; Has Business Utility; Everything Else
Sources:
1. Litigation Cost Survey of Major Companies, 2010 (from Conference on Civil Litigation, Duke Law School, May 2010)
2. Industry Estimates
Viewpointe Proprietary & Confidential Information. Copyright Viewpointe 2012. All Rights Reserved. 4
Information Growth and Complexity
70% of organizations have 6 or more
document repositories
By 2015, 60% of information workers will
interact with content through a mobile device
Data doubles every 12-18 months
By 2014, refusing to communicate with
customers via social channels will be as
harmful as ignoring a customer email or
phone call today
80% of data is considered unstructured while
90% is considered unmanaged
Source:
Gartner PPC 2012
Slow Progress to Take Control
MISALIGNED PROCESSES AND BUDGETING
o Fortune 1000 companies that have initiated comprehensive information governance strategies (1)
DEFENSIBLE DISPOSITION OF CONTENT
o Companies that can ‘claim’ that they can defensibly dispose of content today (2)
Figures shown on the slide: 44x; 22%; 2020; 5-8%; 35 ZB*
* Zettabyte = 1 trillion gigabytes
Sources:
1, 3. IDC Analyst Estimate, October 2011
2. CGOC Benchmark Report on Information Governance, October 2010
2012 Managing Electronic Records Conference 5.2
Why Is It So Difficult?
Understanding Your Information Landscape
o Where do I start?
o What don’t we know?
o How do I sort through the debris?
Unnecessary Information | Necessary Information
What Is Content Assessment?
o Combination of software, services and best practices that
enable informed decisions about your content
o Provides organizations with the ability to
explore, analyze and organize the content across their
information landscape, and to validate their actions based on
business value
o Content Assessment enables
• discovery of content
• insight into the value of your content
• identification of compliance gaps
• defensible disposition
• proactive management of valuable business data
Content Assessment: Components
o Technology
• Analyze – Content analytics
• Classify – Auto-classification
• Collect – Content crawlers
o Services
• Assess
o Best practices and industry standards
o Evaluation and recommendations
Content Analytics
o Enterprise content assessed to understand the content that matters
o Automatically extract and analyze concepts, entities, relationships,
metadata and classifications
o Multiple graphical views of the facets of unstructured content (i.e.,
clustering)
o Automatic highlighting of interesting anomalies and correlations in the data
o Analysis focused on more than metadata for in-depth understanding
• Topic of document and topic changes
• Purpose of document
• Organizations mentioned
• Individuals mentioned
• Concepts mentioned
• Key cutoff dates
• Key business concepts
How Analytics Work
Source Information
o Internal (ECM, Files, DBMS, etc.) and External (Social, News, etc.)
Analyzed Content
o Example sentence: "John sprained his ankle on the step"
• John: Noun, tagged as Person
• sprained: Verb, tagged as Injury
• his ankle: Noun Phrase, tagged as Body Part
• on the step: Prep Phrase, tagged as Location
Extracted Concept (and Data)
o Claimant: Soft Tissue Injury
Automatic Visualization for Interactive Exploration and Assessment
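The tagging shown on this slide can be approximated with a small rule-based sketch. It is only an illustration of the idea (sentence in, facets and a rolled-up concept out). The word lists, the `extract_facets` function and its rules are hypothetical; a content analytics product would rely on trained linguistic models rather than hand-written lexicons.

```python
import re

# Hypothetical mini-lexicons, invented for this illustration only.
INJURY_VERBS = {"sprained", "strained", "bruised"}
BODY_PARTS = {"ankle", "wrist", "knee", "back"}


def extract_facets(sentence):
    """Return a dict of facets found in one sentence (naive rules)."""
    tokens = re.findall(r"[A-Za-z]+", sentence)
    facets = {}
    # Naive assumption: a capitalized first token names the person.
    if tokens and tokens[0][0].isupper():
        facets["person"] = tokens[0]
    for word in (t.lower() for t in tokens):
        if word in INJURY_VERBS and "injury" not in facets:
            facets["injury"] = word
        if word in BODY_PARTS and "body_part" not in facets:
            facets["body_part"] = word
    # Read a prepositional phrase "on the <noun>" as the location.
    match = re.search(r"\bon the (\w+)", sentence, flags=re.IGNORECASE)
    if match:
        facets["location"] = match.group(1).lower()
    # Roll the facets up into a concept, as the slide does.
    if {"person", "injury", "body_part"} <= facets.keys():
        facets["concept"] = "Claimant: Soft Tissue Injury"
    return facets


print(extract_facets("John sprained his ankle on the step"))
```

Once every document carries facets like these, they can be counted and cross-tabulated, which is what makes the interactive visualization possible.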
Leveraging Analytics for Better Assessment
o Assess and decide what information to manage, trust and
leverage
o Determine content silos demanding priority attention
o Expose sensitive data such as financial info and personally
identifiable information (PII)
o Detect near duplicates
o Enable defensible disposition of duplicate, over-retained or
irrelevant information (i.e., ROT: redundant, obsolete or trivial content)
o Retain relevant information and business context
Auto-Classification
o Applies taxonomy/classification schedule to content in an
automated or semi-automated fashion to meet confidence
thresholds
o Categories unique to your organization
• Used to explore content and match documents as well as refine
categories and rules as needed
o Can also create category schemes and training sets
• Automatically clusters content into proposed categories
• Explores content based on those category schemes, self-learns
• Export categories and content for training and ongoing
application
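A minimal sketch of classification against confidence thresholds, assuming a simple keyword-overlap score. The category names, keyword sets, threshold values and the `classify` function are all hypothetical; they are not taken from any product named in this deck, and a real classifier would be trained on sample documents.

```python
# Hypothetical categories and keywords; real schedules are organization-specific.
CATEGORIES = {
    "Loan Records": {"loan", "mortgage", "borrower", "principal"},
    "HR Records": {"employee", "payroll", "benefits", "hire"},
}


def classify(text, auto_threshold=0.5, review_threshold=0.25):
    """Return (category, confidence, action) for one document."""
    words = set(text.lower().split())
    best_category, best_score = None, 0.0
    for category, keywords in CATEGORIES.items():
        # Confidence = share of the category's keywords found in the text.
        score = len(words & keywords) / len(keywords)
        if score > best_score:
            best_category, best_score = category, score
    if best_score >= auto_threshold:
        return best_category, best_score, "auto"      # automated
    if best_score >= review_threshold:
        return best_category, best_score, "review"    # semi-automated
    return None, best_score, "unclassified"


print(classify("borrower missed a mortgage payment on the loan"))
print(classify("new employee orientation"))
```

The two thresholds model the "automated or semi-automated" split: high-confidence matches are filed directly, borderline ones go to a person, and the reviewed results can feed back into the category rules.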
Content Collection
o Content crawlers can automate the collection of content
identified as valuable
• Connectors to repositories
• Content in file shares
o Auto-classification rules can be applied on ingestion
o Options
• Decommission of legacy repository
• X-forward
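As a rough sketch of a file-share crawler that applies a classification rule on ingestion: the extension-to-category rule table and the `crawl` function below are hypothetical stand-ins, and connectors to managed repositories are out of scope here.

```python
import os
from pathlib import Path

# Hypothetical extension-to-category rule applied at ingestion time.
RULES = {".docx": "Documents", ".xlsx": "Spreadsheets", ".msg": "Email"}


def crawl(root):
    """Yield (relative path, size in bytes, category) for each file under root."""
    root = Path(root)
    for folder, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = Path(folder) / name
            category = RULES.get(path.suffix.lower(), "Unclassified")
            yield path.relative_to(root).as_posix(), path.stat().st_size, category


if __name__ == "__main__":
    for record in crawl("."):
        print(record)
```

Recording the path, size and category for every file gives the inventory needed to decide which content to move forward and which legacy locations can be decommissioned.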
Services and Evaluation
o Help prioritize information governance strategy based on
• Potential cost-reduction opportunities
• Identified risk or compliance gaps and needed remediation
o Best practice recommendations
• ARMA Diagnostic Tool – Assessing records management
maturity
• GARP® Assessment – Evaluation of more than 100 attributes of
information governance that you can deploy across your organization
to measure how you compare against the GARP® principles.
Individual principle scores roll up into an
overall GARP® score for information governance readiness
Case Study
About Viewpointe
o Founded in 2001 with an original charter to solve information management
for financial services companies
• Built by the financial services industry for the financial services industry,
Viewpointe understands the complexities of highly regulated industries
• Over 1,500 of the top U.S. financial institutions use Viewpointe solutions
o Viewpointe has an established and proven legacy of providing information-
centric services via our private cloud
o Provides solutions and services for information governance, check
exchange and clearing & settlement
Viewpointe at a glance:
• Provides the nation’s largest trusted archive at over 29 PBs in our private cloud
• Manages 184 billion items, with over six million sub-second retrievals daily from its private cloud
• Selected as one of the best financial technology service providers by FinTech 100 since 2006
• A trusted partner to many of the nation’s largest, most complex companies, storing some of their most sensitive data
Information Governance Platform
OnPointe is an information governance platform – enabling compliance across
the enterprise, enforcing retention and disposal of content – delivered via a
secure, scalable, private cloud built to augment an organization’s current IT
infrastructure.
CLASSIFY RETAIN
ANALYZE PRESERVE
COLLECT DISPOSE
OnPointe Provides…
o Information Governance
• Helps customers meet regulatory, legal and business requirements for data access, monitoring, retention management and defensible destruction of information via automated enforcement of your governance policies. Improve compliance and eDiscovery processes to help increase predictability, mitigate risk and lower cost.
o Secure Cloud
• Customer data is securely managed and retained in our geographically dispersed, Tier 3+ data centers to help ensure data protection and business continuity. Our services undergo an annual SSAE-16 (formerly SAS70 Type-II) review.
o On-Demand Access
• Sophisticated retrieval capabilities help to ensure near-instant access to important data for knowledge sharing, inquiry and decision making.
o Cost Management & Efficiency
• Consumption-based model helps you control costs and minimize upfront investments in hardware, software, installation and maintenance, helping to deliver a lower total cost of ownership (TCO) for your organization.
o Flexible Integration
• Using a standard application programming interface (API) to our distributed infrastructure allows for ease of integration with existing business processes and applications.
o Assessment Services
• While support from the OnPointe team during implementation is inherent in the platform, these optional services help you understand your enterprise information landscape while highlighting specific areas for improvement based on industry best practices, providing recommendations for achieving holistic information governance.
Assessing an Organization’s Environment
The Viewpointe Content Assessment process is designed to benefit
organizations with an evaluation of their RIM environment
o Services typically included:
• Ranking the organization against industry peers using the ARMA diagnostic tool
• Evaluating sample data sources by using automated assessment tools,
identifying gaps or deficiencies
• Identifying potential areas of risk for remediation and lowering of their risk
profile
• Identifying potential cost-reduction opportunities
o Deliverables typically included:
• Analysis of customer’s records management environment
• Recommendations affecting prioritized areas within prospective
environment
• Provision of a future state policy, procedures and technology vision
• Development of a business case and technical solution to optimize the
company’s records management program and information governance
strategy
Viewpointe Content Assessment – Typical Findings
Metric | Description | Typical Customer Status
Maturity | ARMA RIM Diagnostic Survey | Companies typically are in the 30-70% range.
Record Retention Policy & Rules | Defines document categories for business records and their associated retention, which can be time and/or event-based. | Despite up-to-date policies and retention schedules, may not be universally known or enforced.
Record Retention | Records / non-records past deletion dates are in the system, exposing companies to risk and cost. | Companies typically have geographically based file shares with no retention management; high levels (>20%) of over-retained records.
Duplication | Unnecessary duplicates maintained in system | Typical duplication rates are 30-60%; higher for emails and lower for SharePoint.
PII Data | PII in the system in potential violation of company’s own rules and government regulation | Typically 0-5%; higher rates for older documents.
eDiscovery Readiness | Added eDiscovery cost and risk assumed when their systems are not under proper information governance control | Too much ESI from duplication and over-retention. No clear audit, understanding of data sources, automation of legal holds or early case assessments.
Viewpointe Content Assessment: Case Study
Large Financial Services Organization
o Business Problem:
• Corporate compliance initiative to get content under control
• Multiple repositories of content from years of acquisitions
• IT costs too high; CIO directive to reduce storage costs
• Legal mandate for all content to be compliant
• Intellectual property (IP) scattered across the organization
• IP lost due to lack of security and proper management
• Data duplication
o Business Objective:
• All digital content under control and in compliance
• Unnecessary digital content removed
• New, incoming content is managed and controlled
• All IP managed and secured
• Accessibility only to those who need it
• Duplication reduced and cost-savings on storage
Case Study - Overview
o A large financial services company and customer of Viewpointe
requested an assessment of its current information governance and
recommendations for improvement.
o Viewpointe conducted a detailed examination of information
governance practices within a sample of the customer’s key
business units.
o Provided a diagnostic with quantitative scoring of the customer’s records
management; eDiscovery and PII practices were benchmarked
against a peer group of financial services companies.
o Working with IBM software and support, Viewpointe conducted an
automated content assessment against approximately 1.3 million
files from SharePoint, File Shares and email.
Case Study – Tools Used
Content Assessment tools include:
• Records Management/eDiscovery diagnostic tool
• IBM Content Analytics (ICA) and IBM Classification Module along with Viewpointe’s tuned knowledge base
• Templates for records retention schedules, records management policy and procedures, as needed
Figure: Sample Record Retention Schedule
Case Study – Content Analytics & Classification Process
o ICA was used against unstructured documents as part of the
process to:
• Help identify credit card, account and phone numbers, addresses and
monetary amounts.
• Match documents against an existing taxonomy, creating a Windows
Explorer hierarchical view of the documents.
• Examine documents that didn’t fall into an existing taxonomy.
o Used IBM Classification Module to match documents to existing
document retention categories; refined classification categories and
rules, as needed.
o Compared document dates against defined document-retention
rules by category to determine over-retention amounts.
o Analyzed exact and near-exact duplicate amounts to determine
storage inefficiencies.
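The over-retention comparison described above reduces to simple date arithmetic. The sketch below assumes time-based retention only (event-based triggers are not modeled); the retention periods, category names and the `over_retention_rate` function are hypothetical.

```python
from datetime import date

# Hypothetical retention periods in years, per category.
RETENTION_YEARS = {"Loan Records": 7, "HR Records": 3}


def over_retention_rate(documents, today):
    """documents: iterable of (category, last_modified date) pairs.

    Returns the share of documents with a known rule that are past
    their retention cutoff.
    """
    eligible = 0
    expired = 0
    for category, last_modified in documents:
        years = RETENTION_YEARS.get(category)
        if years is None:
            continue  # no rule defined for this category; cannot judge
        eligible += 1
        try:
            cutoff = last_modified.replace(year=last_modified.year + years)
        except ValueError:
            # Feb 29 in a non-leap target year: fall back to Feb 28.
            cutoff = last_modified.replace(year=last_modified.year + years, day=28)
        if cutoff < today:
            expired += 1
    return expired / eligible if eligible else 0.0


sample = [
    ("Loan Records", date(2003, 1, 15)),
    ("Loan Records", date(2010, 6, 1)),
    ("HR Records", date(2008, 3, 1)),
]
print(over_retention_rate(sample, today=date(2012, 5, 1)))
```

Documents whose category has no rule are skipped rather than counted as compliant, so the rate is only as good as the classification step that precedes it.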
Content Analytics - Exposing Sensitive data
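One common way to flag candidate credit card numbers in unstructured text is a digit-pattern match followed by a Luhn checksum, which filters out most random digit runs. This is a generic sketch, not the method used by the tools in the case study; the pattern and function names are illustrative.

```python
import re

# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def find_card_numbers(text):
    """Return digit strings in `text` that look like valid card numbers."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        candidate = re.sub(r"\D", "", match.group())
        if luhn_valid(candidate):
            hits.append(candidate)
    return hits


print(find_card_numbers("Call 555-123-4567 about card 4111-1111-1111-1111."))
```

The number `4111 1111 1111 1111` is a widely used test value, not a real account. Phone numbers, addresses and monetary amounts would each need their own patterns.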
Auto-Classification – Proposing Taxonomy
Content Analytics - Detecting Near Duplicates
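A minimal sketch of one standard approach to duplicate detection: a content hash for exact duplicates, and Jaccard similarity over word shingles for near duplicates. The shingle size and the 0.6 threshold are assumptions chosen for illustration, and this is not necessarily how the products in this deck detect duplicates.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles in `text` (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of the shingle sets of two texts."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def is_exact_duplicate(a, b):
    """Exact duplicates have identical content hashes."""
    return (hashlib.sha256(a.encode("utf-8")).digest()
            == hashlib.sha256(b.encode("utf-8")).digest())


def is_near_duplicate(a, b, threshold=0.6):
    """Near duplicates share most of their shingles (assumed threshold)."""
    return jaccard(a, b) >= threshold


doc_a = "the quarterly loan report is attached for review"
doc_b = "the quarterly loan report is attached for approval"
print(is_exact_duplicate(doc_a, doc_b), is_near_duplicate(doc_a, doc_b))
```

Comparing every pair of documents this way does not scale to millions of files; at that volume a sketching technique such as MinHash is typically used to find candidate pairs first.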
Enabling Collection
Case Study – Sample Recommendations
1. Improve decentralized records management governance structure
• Explicitly extend the structure to electronic records
• Provide increased centralized oversight (franchise model)
• Provide representation and assistance from IT
• Introduce formal accountability measures
2. Complete an inventory of all electronic records
• Centralized data map linked to records-retention schedule,
identifying systems of record and custodians
3. Develop corporate template for electronic records management
procedures and training
• Can be modified by LOBs as needed; gives them a ‘quick start’
and identifies baseline requirements and knowledge
4. Develop and implement standard electronic records management
controls, KPIs and audit procedures to monitor compliance and
measure business benefits
Case Study – Initial Recommendations
5. Implement an electronic records management solution for structured and
unstructured records
• Phase 1 with select core LOBs and repositories/applications
• Develop a plan to grow in phases until 20 percent of the systems are
covered
6. Develop a more streamlined records-retention schedule
• Automate classification/declaration, active/inactive management and
disposition workflow
• Move toward a ‘big bucket’ retention schedule that takes into account
the finer grained search and automated classification
7. Reduce backlog of electronic records
• Make use of the new structure and tools to reduce duplication and
over-retention
• If required, incorporate legal hold/eDiscovery of otherwise over-
retained records for legal preservation purposes
Viewpointe Content Assessment: Sample Results
Content assessment helps determine document types, document aging,
conformance with customer retention rules, duplication of information and
potential PII risks.
Content Assessment Benefits – In Summary
o Insight
• identification and understanding of your information landscape and
content in context
• make decisions based on content value and get started today
• understand your information governance readiness, benchmarked
against industry best practices and peers; plan to meet your goals
o Cost Reduction
• retire legacy systems / repositories quickly
• reduce storage requirements and adjust tiering strategies for
immediate gains
o Risk Reduction, Compliance and Litigation Readiness
• the identification of unknown risk or compliance gaps
• prioritize areas of focus based on need for remediation
• defensibly dispose of data debris or ROT
Q&A
Kristi Perdue, ERMs
Director, Product Marketing, OnPointe Services
Viewpointe, LLC
kristi.perdue@viewpointe.com