A SMARTER WAY TO SELECT A DATA
EXTRACTION METHOD FOR LEGACY DATA
EXECUTIVE SUMMARY
Deciding how to archive data within legacy applications is a difficult choice for IT organizations,
especially considering the huge growth in data and the expenses associated with maintaining
legacy applications. Add to that regulatory, governance and legal discovery requirements, and
it’s easy to understand how vexing this has become for organizations.
That’s why selecting the most appropriate extraction method for legacy data is such an
important decision. Choosing from among such diverse approaches as table archiving,
data record archiving, file archiving and hybrid record archiving means taking into account
not only the organization’s current needs, but also trying to predict its future requirements—
certainly no easy task.
What is clear, however, is that this is a choice that must be made because organizations will
continue to struggle with too many systems, aging applications and rising costs to retain and
produce data. Fortunately, there are innovative archiving solutions that support a full range of
structured and unstructured data in a cohesive, efficient platform that works seamlessly with
any data extraction method.
Read this white paper to learn more about archiving solutions that make it easier for
organizations to select the best data extraction method for their legacy data, and in so
doing, ensure the most flexible approach to creating an enterprise-wide, standards-based,
infrastructure-agnostic archiving platform.
TABLE OF CONTENTS
EXECUTIVE SUMMARY
WHAT TO LOOK FOR IN AN ARCHIVING PLATFORM
HOW EMC INFOARCHIVE HELPS
CONCLUSION
One of the biggest—and perhaps one of the most complex—challenges faced by
today’s IT organizations centers on archiving data created by, and residing within,
legacy applications. Although many enterprises would like to address the twin
problems of rapid data growth and the expense associated with maintaining legacy
applications, IT departments ultimately must confront the vexing problem of how
best to extract data from legacy systems.
Making that choice, however, requires considerable thought about the organization’s
current and future needs for that data. The decision is also made more difficult
by figuring out how to deal with such issues as compliance; changing business
processes and workflows; growing e-discovery demands; and the mounting cost and
complexity of data storage, archiving and recovery.
Selecting the best data extraction method typically means evaluating and choosing
from among four different approaches:
1. Table archiving, typically used when an organization does not need to maintain the relationships between tables. It is well suited for the decommissioning of structured data.
2. Data record archiving, where data is presented as a single record. It is typically employed for compliance requirements where unstructured data is not a factor.
3. File archiving, which allows unstructured content to be archived. This approach doesn’t necessarily work well for data-mining use cases, but it is a good fit when the internal data structure isn’t known and the important data can already be produced.
4. Hybrid record archiving, which unites structured and unstructured content into a single record. It is ideal for compliance requirements and complex business records, letting business users view essential unstructured data and its related structured data in a single business object that supports big data mining.
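As a rough sketch of how these selection criteria interact, the four approaches can be reduced to a simple rule function. The predicates and return strings below are illustrative only, not part of any product:

```python
def suggest_extraction_method(structured: bool, unstructured: bool,
                              compliance_record: bool) -> str:
    """Map the criteria above to one of the four approaches (illustrative)."""
    # Hybrid record archiving unites structured and unstructured content.
    if structured and unstructured:
        return "hybrid record archiving"
    # File archiving covers purely unstructured content.
    if unstructured:
        return "file archiving"
    # Data record archiving presents data as a single record for compliance.
    if compliance_record:
        return "data record archiving"
    # Table archiving suits structured data with no inter-table relationships.
    return "table archiving"
```

In practice the inputs would come from an assessment of the legacy system rather than three booleans, but the ordering of the tests mirrors the descriptions above: hybrid when both content types matter, file for unstructured-only, record for compliance-driven structured data, table otherwise.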
Selecting from among these four approaches isn’t a simple task, because different
circumstances—even within the same organization—can tilt the scale toward
different methods. But one thing is certain: This is a choice that must be made,
because organizations continue to wrestle with too many systems and huge growth
in data volumes. Access to essential information must be maintained even when
legacy applications are being retired—even when data sits in a state of disuse.
According to the Data Management Institute, 40% of data on enterprise systems is
inert, while another 15% is either “orphaned” or being used inappropriately on IT
infrastructure.1
This growing complexity, combined with the heightened sense of urgency IT
departments feel to preserve access to important data on legacy systems, has a
major implication: Your extraction methodology needs to support archiving of a full
range of records, both structured and unstructured, in a single, unified archive. But
that’s not all: You need an archiving platform that works seamlessly and securely
with any or all of these four data extraction methods.
EMC’s InfoArchive platform simplifies data handling through Archival Information
Units (AIUs), which allow individual data objects to be stored and extracted.
Multiple AIUs that share properties, such as a retention period, form an Archival
Information Package (AIP).
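That AIU-to-AIP grouping can be sketched in a few lines. The field names here are hypothetical, and InfoArchive's actual AIU/AIP schema is richer than this:

```python
from dataclasses import dataclass
from itertools import groupby

@dataclass(frozen=True)
class AIU:
    """Archival Information Unit: one individually storable data object."""
    object_id: str
    retention_years: int  # the shared property used for grouping in this sketch

def build_aips(aius):
    """Group AIUs that share a retention period into AIPs (illustrative rule)."""
    key = lambda u: u.retention_years
    # groupby requires its input sorted by the same key.
    return {years: [u.object_id for u in group]
            for years, group in groupby(sorted(aius, key=key), key=key)}
```

For example, `build_aips([AIU("a", 7), AIU("b", 10), AIU("c", 7)])` yields two packages: one holding the two 7-year objects and one holding the 10-year object.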
1 “Avoiding the Data Apocalypse,” SearchStorage.com, 2014
WHAT TO LOOK FOR IN AN ARCHIVING PLATFORM
Because organizations are dynamic and fluid, it makes sense to avoid being locked into
an archiving platform that doesn’t provide maximum choice and flexibility in extraction
methods. Other factors, such as re-architecting data warehouses, changing database
management applications or even undergoing a corporate merger or acquisition can
result in the need to use different data extraction procedures and technologies.
As a result, your archiving platform must include a number of key features and functions
in order to support any of the four extraction methods. These include:
• Single unified repository. Regardless of the extraction method, pulling data
from a single, enterprise-wide repository is easier, more efficient and more
reliable. A single location that consolidates data from myriad applications,
warehouses and databases makes for a much cleaner process. This is particularly
useful when organizations decide to decommission legacy applications because
of the cost and time required to support aging and even archaic systems,
but must maintain access to key data for governance, compliance and
e-discovery needs.
•	 Support for both structured and unstructured data. Although unstructured
data now makes up about 80-90% of all new data, a huge amount of legacy
data locked in old systems is in a structured format. Organizations can’t afford
to be limited in their ability to extract either structured or unstructured data by
the characteristics of a particular data extraction method.
•	 Compliance-centric design. Look for a platform with a wide range of
compliance-aware functionality, such as date- and event-based retention;
centralized retention policy management; single and multiple retention policies;
inheritance of retention policies based on archived data’s classification; support
for RSA Data Protection Manager for security and encryption; and long-term
retention of regulated information.
• Enterprise scalability. Data volumes are expected to grow at double-digit
rates for the foreseeable future, and your archiving platform must keep
pace. Enterprise scalability must come not only from higher capacities and
more intelligent, automated data-tiering policies, but also from high
availability, high-IOPS performance and low latency.
•	 Preserved data integrity. Regardless of the method used, IT and business
executives must have supreme confidence in the integrity of all extracted
data. The archiving platform must have safeguards to ensure that integrity
so all users of the extracted data are seeing precisely the same information
at all times.
• Easy management and a low administration footprint to reduce impact
on IT. Policy management for archiving and related functions such as
data tiering and storage management should be integrated and automated
across all extraction methods. Policies and automation ease management
headaches and reduce the need for manual search, extraction and reporting
by IT professionals.
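The compliance-centric design bullet above calls for retention policies that are inherited from an archived record's classification. A minimal sketch of that lookup, using hypothetical classifications and retention periods rather than InfoArchive's actual policy model:

```python
# Hypothetical classification-to-policy mapping, for illustration only.
POLICY_BY_CLASSIFICATION = {
    "financial-record": {"retention_years": 7,  "trigger": "date"},
    "hr-record":        {"retention_years": 10, "trigger": "event"},
}
# Fallback policy for data with no explicit classification.
DEFAULT_POLICY = {"retention_years": 5, "trigger": "date"}

def resolve_policy(classification: str) -> dict:
    """Archived data inherits the retention policy of its classification."""
    return POLICY_BY_CLASSIFICATION.get(classification, DEFAULT_POLICY)
```

Centralizing the mapping in one table is what makes "centralized retention policy management" possible: changing a classification's policy updates every record that inherits from it, with no per-record edits.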
HOW EMC INFOARCHIVE HELPS
EMC InfoArchive is a versatile, flexible and robust archiving platform. It supports all data
extraction technologies and approaches, and provides organizations with an enterprise-
wide, standards-based, infrastructure-agnostic platform for archiving.
As an integrated product suite, InfoArchive is designed for data integrity, compliance,
high availability, security, scalability and automation for both structured and
unstructured data. Among InfoArchive’s key capabilities are:
• No dependencies on the original application.
• Support for a wide range of data retention policies.
• Assurance of data quality.
• Support for industry standards (OAIS) for long-term retention and easy access.
• Consolidated repository for ingestion and retention of all data types.
• Data security and encryption.
• Scalability to hundreds of billions of records.
• Single view of data and content.
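The OAIS support and data-quality assurance listed above revolve around packaging content together with preservation metadata. The helper below is a hedged sketch of an OAIS-style package manifest with fixity (checksum) information, not InfoArchive's actual package format:

```python
import hashlib

def make_aip_manifest(package_id: str, files: dict) -> dict:
    """Sketch of an OAIS-style package manifest: a content listing plus
    fixity checksums so integrity can be verified over long-term retention."""
    return {
        "package_id": package_id,
        "content_information": sorted(files),              # archived file names
        "fixity": {name: hashlib.sha256(data).hexdigest()  # integrity checksums
                   for name, data in files.items()},
    }
```

Recomputing the checksums at access time and comparing them against the manifest is the standard way an archive demonstrates that data has not been altered since ingestion.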
InfoArchive’s design ensures that authorized users both inside and outside the enterprise
can easily, quickly, reliably and securely access archived information, and that IT
departments can use whatever extraction method is most appropriate for the particular
use case. Users can search concurrently for data across multiple datasets without
experiencing unacceptable latency or worrying about inconsistent data integrity from
search to search.
As a result, InfoArchive helps organizations reduce the cost and complexity of data
extraction and archiving while ensuring compliance and enabling smart, non-disruptive
application decommissioning.
A final element of the added value that InfoArchive brings is an expert coalition of
technology and service partners that offer different skills and experience to organizations
looking to get the most out of InfoArchive. The seven core organizations that make up
the InfoArchive Consortium have a range of specialties, all of them crucial to helping
companies adopt next-generation unified archiving based on EMC InfoArchive.
One of those members—Revolution Data Systems (RDS)—is an expert in hybrid record
data migration. RDS offers the expertise and ETL tools to manage archiving from legacy
systems. By utilizing the RDS team, organizations can migrate data and documents from
homegrown and third-party databases, ERP systems, Web applications and flat files.
RDS data migration services facilitate the often complex task of deciding the best
extraction method for any given legacy system. They also ensure that your data—
regardless of complexity—is archived properly, allowing seamless access to your
legacy data.
CONCLUSION
As data volumes grow—unstructured data in particular—IT organizations increasingly
have to make tough decisions about which data to keep available on primary storage,
which to move to secondary storage for archiving purposes and which to get rid of
entirely. But compliance, governance, analytics and e-discovery are just some of the
reasons that organizations must always maintain that data somewhere—and in an easily
accessible state—even as legacy applications are being decommissioned.
IT organizations have four different data extraction methods to choose from, and it’s
not at all unusual for organizations to need to use multiple methods depending upon
the use case or workload. As a result, organizations need to adopt an enterprise-wide
archiving platform that supports any and all extraction methods for both structured and
unstructured data.
InfoArchive is an ideal data archiving platform because it provides organizations with
maximum flexibility in selecting the right data extraction method for their needs.
InfoArchive enables a single, unified repository designed for enterprise scalability,
security, data integrity and fast, reliable extraction.
With the technical and business leadership of EMC and the unique capabilities of
InfoArchive Consortium members like Revolution Data Systems, InfoArchive gives IT
organizations the ability to archive all types of information, achieve controlled access to
archive data and mitigate the need for dependencies on the originating application.
CONTACT US
For more information on InfoArchive or Revolution Data Systems’ enterprise content management tools and services, please visit www.emc.com/content-management/infoarchive or http://www.revolutiondatasystems.com/.

More Related Content

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Recently uploaded (20)

Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 

Featured

Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
Kurio // The Social Media Age(ncy)
 
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them wellGood Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Saba Software
 
Introduction to C Programming Language
Introduction to C Programming LanguageIntroduction to C Programming Language
Introduction to C Programming Language
Simplilearn
 

Featured (20)

How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy Presentation
 
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them wellGood Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
 
Introduction to C Programming Language
Introduction to C Programming LanguageIntroduction to C Programming Language
Introduction to C Programming Language
 

A smarter way to select a data extraction method for legacy data

  • 1. A SMARTER WAY TO SELECT A DATA EXTRACTION METHOD FOR LEGACY DATA EXECUTIVE SUMMARY Deciding how to archive data within legacy applications is a difficult choice for IT organizations, especially considering the huge growth in data and the expenses associated with maintaining legacy applications. Add to that regulatory, governance and legal discovery requirements, and it’s easy to understand how vexing this has become for organizations. That’s why selecting the most appropriate extraction method for legacy data is such an important requirement. Choosing from among such diverse approaches as table archiving, data record archiving, file archiving and hybrid record archiving means taking into account not only the organization’s current needs, but also trying to predict its future requirements— certainly no easy task. What is clear, however, is that this is a choice that must be made because organizations will continue to struggle with too many systems, aging applications and rising costs to retain and produce data. Fortunately, there are innovative archiving solutions that support a full range of structured and unstructured data in a cohesive, efficient platform that works seamlessly with any data extraction method. Read this white paper to learn more about archiving solutions that make it easier for organizations to select the best data extraction method for their legacy data, and in so doing, ensure the most flexible approach to creating an enterprise-wide, standards-based, infrastructure-agnostic archiving platform.
  • 2. TABLE OF CONTENTS EXECUTIVE SUMMARY.......................................................................... 1 WHAT TO LOOK FOR IN AN ARCHIVING PLATFORM.............................. 4 HOW EMC INFOARCHIVE HELPS........................................................... 4 CONCLUSION........................................................................................ 5 2
  • 3. One of the biggest—and perhaps one of the most complex—challenges faced by today’s IT organizations centers on archiving data created by, and residing within, legacy applications. Although many enterprises would like to address the twin problems of rapid data growth and the expense associated with maintaining legacy applications, IT departments ultimately must confront the vexing problem of how best to extract data from legacy systems. Making that choice, however, requires considerable thought about the organization’s current and future needs for that data. The decision is also made more difficult by figuring out how to deal with such issues as compliance; changing business processes and workflows; growing e-discovery demands; and the mounting cost and complexity of data storage, archiving and recovery. Selecting the best data extraction method typically means evaluating and choosing from among four different approaches: 1. Table archiving, typically used when an organization does not need to maintain the relationships between tables. It is well suited for the decommissioning of structured data. 2. Data record archiving, where data is presented as a single record. It is typically employed for compliance requirements; unstructured data is not a factor. 3. File archiving, which allows unstructured content to be archived. While this approach doesn’t necessarily work well for data mining use cases, it is good in circumstances where the internal data structure isn’t known but where important data can already be produced. 4. Hybrid record archiving, which unites structured and unstructured content into a single record. It is ideal for compliance requirements and complex business records, letting business users view essential unstructured data as well as its related structured data in a single business object that allows for big data mining. 
Selecting from among these four approaches isn’t a simple task, because different circumstances—even within the same organization—can tilt the scale toward different methods. But one thing is certain: This is a choice that must be made, because organizations continue to wrestle with too many systems and huge growth in data volumes. Access to essential information must be maintained even when legacy applications are being retired—even when data sits in a state of disuse. According to the Data Management Institute, 40% of data on enterprise systems is inert, while another 15% is either “orphaned” or being used inappropriately on IT infrastructure.1 This growing complexity, combined with the heightened sense of urgency IT departments feel to preserve access to important data on legacy systems, has a major implication: Your extraction methodology needs to support archiving of a full range of records, both structured and unstructured, in a single, unified archive. But that’s not all: You need an archiving platform that works seamlessly and securely with any or all of these four data extraction methods. In EMC’s InfoArchive archiving platform, data handling is simplified by the utilization of Archival Information Units (AIUs), which allow individual data objects to be stored and extracted. Multiple AIUs with similar properties, such as retention, create an Archival Information Package (AIP). 1 “Avoiding the Data Apocalypse,” SearchStorage.com, 2014 3
  • 4. WHAT TO LOOK FOR IN AN ARCHIVING PLATFORM Because organizations are dynamic and fluid, it makes sense to avoid being locked into an archiving platform that doesn’t provide maximum choice and flexibility in extraction methods. Other factors, such as re-architecting data warehouses, changing database management applications or even undergoing a corporate merger or acquisition can result in the need to use different data extraction procedures and technologies. As a result, your archiving platform must include a number of key features and functions in order to support any of the four extraction methods. These include: • Single unified repository. Regardless of the relevant extraction method, it becomes easier, more efficient and more reliable if you are able to pull data from a single, enterprise-wide repository. By having a single location that pulls together data from myriad applications, warehouses and databases, it becomes a much cleaner process. This is particularly useful in scenarios where organizations decide they’d like to decommission legacy applications because of the cost and time required to support aging and even archaic systems, but must maintain access to key data for governance, compliance and e-discovery needs. • Support for both structured and unstructured data. Although unstructured data now makes up about 80-90% of all new data, a huge amount of legacy data locked in old systems is in a structured format. Organizations can’t afford to be limited in their ability to extract either structured or unstructured data by the characteristics of a particular data extraction method. • Compliance-centric design. 
Look for a platform with a wide range of compliance-aware functionality, such as date- and event-based retention; centralized retention policy management; single and multiple retention policies; inheritance of retention policies based on archived data’s classification; support for RSA Data Protection Manager for security and encryption; and long-term retention of regulated information. • Enterprise scalability. Data volumes are expected to grow at double-digit rates for the foreseeable future, and your archiving platform must keep pace. Enterprise scalability must come not only in higher capacities and more intelligent/automated data tiering policies, but also must ensure high availability, high-IOPS performance and low latency. • Preserved data integrity. Regardless of the method used, IT and business executives must have supreme confidence in the integrity of all extracted data. The archiving platform must have safeguards to ensure that integrity so all users of the extracted data are seeing precisely the same information at all times. • Management facility and low administration footprint to reduce impact on IT. Policy management for archiving and its related functions such as data tiering and storage management needs to be an integrated, automated deliverable for the archiving platform across all extraction methods. Policies and automation help ease management headaches and reduce the need for manual search, extraction and reporting by IT professionals. HOW EMC INFOARCHIVE HELPS EMC InfoArchive is a versatile, flexible and robust archiving platform. It supports all data extraction technologies and approaches, and provides organizations with an enterprise- wide, standards-based, infrastructure-agnostic platform for archiving. 4
As an integrated product suite, InfoArchive is designed for data integrity, compliance, high availability, security, scalability and automation for both structured and unstructured data. Among InfoArchive’s key capabilities are:

• No dependencies on the original application.
• Support for a wide range of data retention policies.
• Assurance of data quality.
• Support for industry standards (OAIS) for long-term retention and easy access.
• Consolidated repository for ingestion and retention of all data types.
• Data security and encryption.
• Scalability to hundreds of billions of records.
• Single view of data and content.

InfoArchive’s design ensures that authorized users both inside and outside the enterprise can access archived information easily, quickly, reliably and securely, and that IT departments can use whichever extraction method is most appropriate for the particular use case. Users can search concurrently across multiple datasets without experiencing unacceptable latency or worrying about inconsistent results from search to search. As a result, InfoArchive helps organizations reduce the cost and complexity of data extraction and archiving while ensuring compliance and enabling smart, non-disruptive application decommissioning.

A final element of InfoArchive’s added value is an expert coalition of technology and service partners that bring different skills and experience to organizations looking to get the most out of InfoArchive. The seven core organizations that make up the InfoArchive Consortium have a range of specialties, all of them crucial to helping companies adopt next-generation unified archiving based on EMC InfoArchive. One of those members, Revolution Data Systems (RDS), is an expert in hybrid record data migration. RDS offers the expertise and ETL tools to manage archiving from legacy systems.
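ETL-driven migrations like these typically verify fixity as records move from the source system into the archive. The sketch below illustrates the idea behind OAIS-style packaging with checksums; it is a hypothetical illustration, not RDS tooling or InfoArchive’s actual package format.

```python
import hashlib
from datetime import datetime, timezone


def sha256_of(data: bytes) -> str:
    """Fixity value used to verify integrity on every later access."""
    return hashlib.sha256(data).hexdigest()


def build_aip_manifest(package_id: str, files: dict) -> dict:
    """Build a minimal OAIS-style Archival Information Package manifest:
    each payload file is recorded with its size and checksum so any
    consumer can verify the archived data has not changed."""
    return {
        "package_id": package_id,
        "created": datetime.now(timezone.utc).isoformat(),
        "files": [
            {"name": name, "size": len(data), "sha256": sha256_of(data)}
            for name, data in files.items()
        ],
    }


def verify(manifest: dict, files: dict) -> bool:
    """Re-compute checksums and compare against the manifest."""
    return all(
        sha256_of(files[entry["name"]]) == entry["sha256"]
        for entry in manifest["files"]
    )


payload = {"invoice-001.xml": b"<invoice id='001'/>"}
manifest = build_aip_manifest("AIP-2015-0001", payload)
print(verify(manifest, payload))  # True: payload unchanged since ingestion
```

Because the checksums travel with the package, any user retrieving the data later can confirm they are seeing exactly what was ingested, which is the integrity guarantee described earlier.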
By engaging the RDS team, organizations can migrate data and documents from homegrown and third-party databases, ERP systems, Web applications and flat files. RDS data migration services simplify the often complex task of choosing the best extraction method for any given legacy system, and ensure that your data, regardless of complexity, is archived properly, allowing seamless access to your legacy data.

CONCLUSION

As data volumes grow, unstructured data in particular, IT organizations increasingly have to make tough decisions about which data to keep available on primary storage, which to move to secondary storage for archiving purposes and which to discard entirely. But compliance, governance, analytics and e-discovery are just some of the reasons organizations must maintain that data somewhere, in an easily accessible state, even as legacy applications are decommissioned. IT organizations have four data extraction methods to choose from, and it is not at all unusual to need multiple methods depending on the use case or workload. As a result, organizations need an enterprise-wide archiving platform that supports any and all extraction methods for both structured and unstructured data.
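As a rough illustration of choosing among the four extraction methods named in this paper (table, data record, file and hybrid record archiving), the sketch below encodes one possible set of selection criteria. The criteria themselves are simplified assumptions for illustration, not a formal decision procedure from EMC.

```python
def suggest_extraction_method(structured: bool, needs_row_context: bool,
                              has_documents: bool) -> str:
    """Simplified, assumption-laden heuristic for picking among the four
    extraction methods discussed in this paper."""
    if structured and has_documents:
        return "hybrid record archiving"  # structured rows plus related files
    if structured and needs_row_context:
        return "data record archiving"    # business records with relationships
    if structured:
        return "table archiving"          # whole tables, schema preserved
    return "file archiving"               # unstructured documents and files


print(suggest_extraction_method(True, False, False))  # table archiving
```

In practice the decision also depends on query patterns, retention rules and the state of the legacy system, which is why a platform that supports all four methods keeps the choice from becoming a one-way door.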
InfoArchive is an ideal data archiving platform because it gives organizations maximum flexibility in selecting the right data extraction method for their needs. InfoArchive provides a single, unified repository designed for enterprise scalability, security, data integrity and fast, reliable extraction. With the technical and business leadership of EMC and the unique capabilities of InfoArchive Consortium members like Revolution Data Systems, InfoArchive gives IT organizations the ability to archive all types of information, control access to archived data and eliminate dependencies on the originating application.

CONTACT US

For more information on InfoArchive or Revolution Data Systems’ enterprise content management tools and services, please visit www.emc.com/content-management/infoarchive or http://www.revolutiondatasystems.com/.