How Can Content Management Software Keep Pace?

San Francisco Gilbane Conference 2009
Content Integration Strategies
Dick Weisinger
June 4, 2009
Dick Weisinger
 Vice President and Chief Technologist
  Formtek, Inc
 20+ years of experience in Content,
  Document and Image Management
 Regular blogger at
  http://www.formtek.com/blog
Formtek
 An ECM software and services company
  – 25-year history
 Experts in general ECM and CM space
 Depth of experience in engineering data
  management
 Formtek Orion ECM Software
 Alfresco Gold Integration Partner
Drowning in Digital Data
 Hand-held devices
 High-resolution video
 High-End Video Games
 High-Resolution Graphics and Images
 Scientific Data
 E-Discovery / Records Management
 Digitized Business Data
 Financial and Health Records
 Business Continuity Backups

Analysts at:
         Gartner Group,
         Forrester Research,
         IDC and
         The 451 Group
all predict massive growth in digital data.
Size of the Digital Universe
    2003 – 20 exabytes
    2006 – 161 exabytes
    2007 – 281 exabytes
    2008 – 486 exabytes
    2010 – 988 exabytes of data
    2011 – 1800 exabytes of data
    2012 – 2500 exabytes of data
      (30% of data is created by enterprises)   Source: IDC

One Exabyte == 1 billion gigabytes or 1000 petabytes
                   (about 250 million DVDs)
161 exabytes is the equivalent of 12 stacks of books each
extending 93 million miles from the earth to the Sun.
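
The unit equivalences on this slide can be checked with a few lines of arithmetic. This is a minimal sketch assuming decimal (SI) units and a nominal single-layer DVD capacity of 4.7 GB; the function names are illustrative and not from the talk.

```python
# Decimal (SI) unit sizes in bytes; an assumption, since the slide does not
# say whether decimal or binary units are meant.
BYTES_PER_GB = 10**9
BYTES_PER_PB = 10**15
BYTES_PER_EB = 10**18


def exabytes_to(unit_bytes: int, exabytes: float = 1.0) -> float:
    """Express a number of exabytes in a smaller unit of unit_bytes bytes."""
    return exabytes * BYTES_PER_EB / unit_bytes


def dvds_per_exabyte(dvd_capacity_gb: float = 4.7) -> float:
    """Number of DVDs of the given capacity needed to hold one exabyte."""
    return exabytes_to(BYTES_PER_GB) / dvd_capacity_gb


if __name__ == "__main__":
    print(f"1 EB = {exabytes_to(BYTES_PER_GB):,.0f} GB")
    print(f"1 EB = {exabytes_to(BYTES_PER_PB):,.0f} PB")
    print(f"1 EB is about {dvds_per_exabyte() / 1e6:.0f} million 4.7 GB DVDs")
```

The slide's figure of about 250 million DVDs corresponds to roughly 4 GB per disc; with a nominal 4.7 GB disc the count is somewhat lower, at a little over 210 million.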
Data in Business and Science
 Walmart adds a billion rows of data to
  its 600 terabyte database every hour
 Chevron’s gas and oil exploration
  collects 2 terabytes of data daily
 Large Hadron Collider in Switzerland to
  collect 300 exabytes per year
 Department of Energy has increased
  its data by a factor of 10 every four
  years since 1990
Hardware’s Shrinking Cost

Year    Cost/MB
1986    $51.30
1991    $13.00
1994    $1.00
1997    $0.09
2000    $0.07
2003    $0.02
2009    $0.0002

Storage costs are plummeting, but not as fast as the amount of data is growing.
Cheap storage costs also encourage applications to store ever more data.
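
One way to compare the two trends is to reduce each to a constant yearly multiplier. The sketch below is only a rough check: it uses the first and last rows of the cost table and the 2003 and 2008 entries from the "Size of the Digital Universe" slide, and assumes smooth exponential change between those endpoints. The helper name `annual_factor` is illustrative.

```python
def annual_factor(start_value: float, end_value: float, years: int) -> float:
    """Constant yearly multiplier that turns start_value into end_value."""
    return (end_value / start_value) ** (1.0 / years)


# Figures copied from the slides.
data_growth = annual_factor(20, 486, 2008 - 2003)        # exabytes, 2003 to 2008
cost_change = annual_factor(51.30, 0.0002, 2009 - 1986)  # $/MB, 1986 to 2009

# A falling cost per megabyte is the same as rising capacity per dollar.
capacity_per_dollar_growth = 1.0 / cost_change

if __name__ == "__main__":
    print(f"data volume grows about {data_growth:.2f}x per year")
    print(f"storage per dollar grows about {capacity_per_dollar_growth:.2f}x per year")
```

On these endpoints, data volume grows by roughly 1.9x per year while storage per dollar grows by roughly 1.7x per year, which is consistent with the slide's claim. The two series cover different year ranges, so the comparison is indicative only.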
Can Software Keep Pace?
How Can We Find Anything?

 Search Algorithms have evolved and
  improved, but…
 Internet Search is only Fair to Good
  – Google PageRank
      8+ billion web pages, hundreds of thousands of servers
 Enterprise Search is Poor
  – Usage patterns are hard to model
The Problem of Search

 49 percent of business users say that finding
  data is difficult and time consuming.
                           -- AIIM 2008 Market Study

 Users have a 50 percent success rate at
  search
                            -- Recommind Survey,
                            March 2009
Scattered Data Repositories
 Corporate Applications
    –   ERP
    –   PLM/PDM
    –   Business Intelligence / Knowledge Management
    –   Content and Document Management
 Relational Databases
 Local and Shared File Systems
 Internet/Intranet HTTP servers
 Email Servers
 Disk Appliances (digital cameras, cell phone…)
Multiple Repository Challenge
Problem
 How to access and search data to achieve:
      Compliance
      eDiscovery
      Business Intelligence
Challenge
 Many organizations have multiple repositories from
  multiple vendors
 Lack of standards around APIs and query languages
 Each system is different, with very little common
  reuse
Unstructured Data Search is Hard
 80 percent of enterprise data is unstructured
  – E.g., emails, PDF, Word and Office docs
 No underlying data model or schema
  – Emails and IM often lack context and use
    shorthand and abbreviations that increase the
    search challenge
Huge Data Sets Bring Huge Problems
   Search gets harder as data sets grow
    – Longer to index and search
    – Harder to determine context
   The more systems, the harder to secure
   The more systems, the harder to
    consolidate search
   Conflicting or Inconsistent Data
    – Which is the system of reference?
Getting Data Under Control
 Ultimate goal: Content Intelligence
  – Knowledge extraction
  – Ability to distill, condense and summarize data

How?
 Apply more Structure and Reuse
  – XML Tags
 Allow greater access across data sources
  – Consolidation of Systems
  – Integration of Systems
Creating Structure
Semi-Structured Data
 Use a structured native data format
  – XML Authoring/Publishing applications
      DITA publishing XML
  – Microsoft Office 2007 docx, etc. (Office Open
    XML)
      Complex: 29 namespaces and 89 schema models
 Add Structure
  – Append Headers and Embedded Properties
      E.g., TIFF, JPEG images
      PDF and embedded Microsoft Office files
 Associate tags and metadata with
  unstructured data
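
As a rough illustration of associating tags and metadata with unstructured data, the sketch below writes a small XML "sidecar" record for a file and then finds files by tag. The element names (`document`, `properties`, `tags`) and both function names are hypothetical and were chosen for this example; they do not come from DITA, Office Open XML, or any other format named on the slide.

```python
import xml.etree.ElementTree as ET


def build_sidecar(filename, properties, tags):
    """Return an XML string describing one otherwise unstructured file."""
    root = ET.Element("document", attrib={"file": filename})
    props = ET.SubElement(root, "properties")
    for name, value in properties.items():
        ET.SubElement(props, "property", attrib={"name": name}).text = str(value)
    tag_list = ET.SubElement(root, "tags")
    for tag in tags:
        ET.SubElement(tag_list, "tag").text = tag
    return ET.tostring(root, encoding="unicode")


def find_by_tag(sidecars, tag):
    """Return the file names whose sidecar XML carries the given tag."""
    hits = []
    for xml_text in sidecars:
        root = ET.fromstring(xml_text)
        if any(t.text == tag for t in root.iter("tag")):
            hits.append(root.get("file"))
    return hits


if __name__ == "__main__":
    scan = build_sidecar("scan-001.tif", {"author": "jdoe"}, ["invoice", "2009"])
    memo = build_sidecar("memo.pdf", {}, ["memo"])
    print(find_by_tag([scan, memo], "invoice"))
```

Once every file has such a record, a search can run against the structured tags instead of the raw binary content.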
Centralized Repository Efficiency

   Management efficiencies of scale
   More efficient search
     – No need to consolidate search results
   Available to users via a single interface
Integration of Repositories
 Content-Intelligence Platforms can
  integrate/unite multiple repositories
 XML is the pipeline for integration
 Integration via APIs or XML Web
  services
   – REST Web Services have momentum
   – Integration with SOA
CMIS -- ECM Integration

 ECM vendors have united to create a
  new interoperability standard:
  Content Management Interoperability
  Services (CMIS)
  – Web services for sharing information
    between different content repositories
  – “SQL for Document Management”
What is CMIS?

 Content Management Interoperability Services
  – Defines a lowest-common-denominator CM capability set
  – CM content is accessed as SOAP or AtomPub
    (REST) web services
  – A single application works identically with content
    from any CMIS vendor
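
To give a feel for the "SQL for Document Management" idea, here is a minimal sketch that builds a CMIS query statement as a string. It assumes the type and property names of the later CMIS 1.0 specification (`cmis:document`, `cmis:objectId`, `cmis:name`), which postdates this talk; the 0.61 draft discussed here may name them differently. The helper names are illustrative, and the sketch does not send the query to any repository.

```python
def cmis_quote(value: str) -> str:
    """Quote a Python string as a CMIS query string literal.

    Assumes CMIS 1.0 escaping rules: backslash and single quote are
    escaped with a leading backslash.
    """
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def documents_named(name: str) -> str:
    """Build a query that selects documents with an exact name match."""
    return (
        "SELECT cmis:objectId, cmis:name "
        "FROM cmis:document "
        f"WHERE cmis:name = {cmis_quote(name)}"
    )


if __name__ == "__main__":
    print(documents_named("report.pdf"))
```

The same statement text would be submitted through either the SOAP or the AtomPub binding, which is what lets one application target repositories from different vendors.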
CMIS Timeline
 1993 – ODMA (Open Document Management API)
 1996 – DMA (AIIM Document Management Alliance)
 1996 – WebDAV (Web-based Distributed Authoring and Versioning)
 2002 – JSR-170 / Java Content Repository (Day Software)
 2005 – iECM (AIIM Interoperable ECM)
 October 2006 – CMIS started
 August 2008 – Contributing members invited
 September 2008 – Draft Specification submitted to
  OASIS
 Possible completion and acceptance in late 2009 or
  early 2010
JCR versus CMIS
  JCR                                 CMIS
  Session-based API                   Services Based
  Java Only                           Language Agnostic
  “Complete” ECM                      Core ECM functions
  Infrastructure                      Interoperability
  Targets DM, RM, DAM, WCM…           Intended specifically for DM
  Complex                             Simple
  Prescriptive                        Little or No Change
  Connectors by Day                   Vendor Connectors
  Version 2.0                         Version 0.61
  Design spearheaded by Day Software  Design Led by Top Tier ECM Vendors
CMIS: Creators and Participants
 Founding Companies for the Original Standard
  – EMC/Documentum
  – IBM/FileNet
  – Microsoft
 Contributing Members (after August 7, 2008)
  –   Alfresco
  –   Open Text
  –   Oracle
  –   SAP
  –   More …
CMIS – The Model
 Documents
   – E.g., Office document or image
   – Content, Metadata and Version History
 Folders
   – Defines Organization and Hierarchy
   – Container, Metadata and Hierarchy/Organization
 Object Links and Relations
   – Reference between two folders or documents
   – Requires a source and target
 Policies
   – Set of rules that can be applied to control other objects, e.g.
     ACLs or retention policy
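
The four object kinds above can be pictured as a small set of data classes. This is an illustrative sketch only: the class and field names are assumptions made for this example and are not taken from the CMIS specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CmisObject:
    """Common base: every object has an id, metadata and applied policies."""
    object_id: str
    metadata: Dict[str, str] = field(default_factory=dict)
    policy_ids: List[str] = field(default_factory=list)


@dataclass
class Document(CmisObject):
    """Content plus metadata plus a simple version history."""
    content: bytes = b""
    versions: List[bytes] = field(default_factory=list)

    def check_in(self, new_content: bytes) -> None:
        # Keep the previous content as an older version.
        self.versions.append(self.content)
        self.content = new_content


@dataclass
class Folder(CmisObject):
    """Container that defines hierarchy through parent and child ids."""
    parent_id: Optional[str] = None
    child_ids: List[str] = field(default_factory=list)


@dataclass
class Relationship(CmisObject):
    """Reference between two folders or documents."""
    source_id: str = ""
    target_id: str = ""

    def __post_init__(self) -> None:
        # A relationship requires both a source and a target.
        if not self.source_id or not self.target_id:
            raise ValueError("relationship needs a source and a target")


@dataclass
class Policy(CmisObject):
    """Rule applied to other objects, such as an ACL or retention rule."""
    rule: str = ""
```

For example, `Document("d1", content=b"v1")` followed by `check_in(b"v2")` leaves `b"v1"` in the version history, and a `Relationship` without a target is rejected.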
Benefits of CMIS
 Standardized Core ECM functions
 Enables Interoperability between repositories
 Encourages Flexible Application Development
 Encourages ‘mash-up’ composite applications
 A single application can consolidate and
  aggregate content from multiple CMIS
  repositories
 Business Processes/Workflow can span and
  touch all enterprise content
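
The idea of a single application aggregating content from several repositories can be sketched as follows. This is a hedged illustration rather than a CMIS client: `Repository`, `InMemoryRepository` and `federated_search` are hypothetical names, and the in-memory class stands in for a real vendor connector that would speak SOAP or AtomPub.

```python
from typing import Dict, Iterable, List, Protocol


class Repository(Protocol):
    """Minimal interface that every repository connector is assumed to offer."""
    name: str

    def search(self, term: str) -> Iterable[Dict[str, str]]: ...


class InMemoryRepository:
    """Stand-in for a real connector; holds documents in a Python list."""

    def __init__(self, name: str, documents: List[Dict[str, str]]) -> None:
        self.name = name
        self._documents = documents  # each item: {"id": ..., "title": ...}

    def search(self, term: str) -> List[Dict[str, str]]:
        needle = term.lower()
        return [d for d in self._documents if needle in d["title"].lower()]


def federated_search(repositories: Iterable[Repository], term: str) -> List[Dict[str, str]]:
    """Run the same search on every repository and merge the hits."""
    results = []
    for repo in repositories:
        for hit in repo.search(term):
            # Record where each hit came from so the caller can fetch it later.
            results.append({"repository": repo.name, **hit})
    return sorted(results, key=lambda r: (r["title"], r["repository"]))


if __name__ == "__main__":
    repo_a = InMemoryRepository("repo-a", [{"id": "1", "title": "Budget 2009"}])
    repo_b = InMemoryRepository("repo-b", [{"id": "9", "title": "Annual budget"}])
    for hit in federated_search([repo_a, repo_b], "budget"):
        print(hit)
```

Because every connector exposes the same `search` method, the aggregating code does not need to know which vendor is behind each repository.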
CMIS Weak Points
   Only Basic Content Functions Available
   Does not cover Admin/Management
   Does not cover User Authentication
   Does not handle Security/Authorization
Applications
 Workflow/Business Processes
  – Connect work packages from any
    repository
 Portals and Mash-ups
  – Aggregated Content from multiple sources
 E-Discovery and Compliance
Summary
 Massive Growth in Content Creation
 Advances in hardware technology are
  fueling content creation and storage
 Search and Retrieval of content grows
  in complexity with its volume
 Content Intelligence is needed to bring
  understanding to data
 Standards like XML and CMIS provide
  consistent classification and handling of
  data

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 

Gilbane 2009 -- How Can Content Management Software Keep Pace?

  • 1. How Can Content Management Software Keep Pace? San Francisco Gilbane Conference 2009 Content Integration Strategies Dick Weisinger June 4, 2009
  • 2. Dick Weisinger ▪ Vice President and Chief Technologist, Formtek, Inc ▪ 20+ years of experience in Content, Document and Image Management ▪ Regular blogger at http://www.formtek.com/blog
  • 3. Formtek ▪ An ECM software and services company – 25-year history ▪ Experts in general ECM and CM space ▪ Depth of experience in engineering data management ▪ Formtek Orion ECM Software ▪ Alfresco Gold Integration Partner
  • 4. Drowning in Digital Data ▪ Hand-held devices ▪ High-resolution video ▪ High-End Video Games ▪ High-Resolution Graphics and Images ▪ Scientific Data ▪ E-Discovery / Records Management ▪ Digitized Business Data ▪ Financial and Health Records ▪ Business Continuity Backups. Analysts at Gartner Group, Forrester Research, IDC and The 451 Group all predict massive growth in digital data.
  • 5. Size of the Digital Universe ▪ 2003 – 20 exabytes ▪ 2006 – 161 exabytes ▪ 2007 – 281 exabytes ▪ 2008 – 486 exabytes ▪ 2010 – 988 exabytes of data ▪ 2011 – 1800 exabytes of data ▪ 2012 – 2500 exabytes of data (30% of data is created by enterprises) Source: IDC. One exabyte = 1 billion gigabytes or 1000 petabytes (about 250 million DVDs). 161 exabytes is the equivalent of 12 stacks of books, each extending 93 million miles from the Earth to the Sun.
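The IDC figures on slide 5 imply a compound annual growth rate, which the short sketch below works out. It uses only the numbers quoted on the slide; the function and constant names are illustrative, and the 2010 to 2012 values are treated as projections because the talk was given in 2009.

```python
# IDC estimates quoted on slide 5, in exabytes.
# Values for 2010 onward are treated here as projections (the talk dates from 2009).
DIGITAL_UNIVERSE_EB = {
    2003: 20, 2006: 161, 2007: 281, 2008: 486,
    2010: 988, 2011: 1800, 2012: 2500,
}

GB_PER_EB = 1_000_000_000  # slide: one exabyte is 1 billion gigabytes
PB_PER_EB = 1_000          # slide: one exabyte is 1000 petabytes


def annual_growth_factor(start_year: int, end_year: int) -> float:
    """Compound annual growth factor between two years listed in the table."""
    start = DIGITAL_UNIVERSE_EB[start_year]
    end = DIGITAL_UNIVERSE_EB[end_year]
    years = end_year - start_year
    return (end / start) ** (1 / years)


if __name__ == "__main__":
    for first, last in [(2003, 2008), (2008, 2012)]:
        factor = annual_growth_factor(first, last)
        print(f"{first}-{last}: grows by a factor of {factor:.2f} per year")
    print(f"2008 in petabytes: {DIGITAL_UNIVERSE_EB[2008] * PB_PER_EB:,}")
```

On these figures the digital universe grows by somewhat less than a factor of two per year between 2003 and 2008, and the projected rate for 2008 to 2012 is lower.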
  • 6. Data in Business and Science ▪ Walmart adds a billion rows of data to its 600 terabyte database every hour ▪ Chevron’s gas and oil exploration collects 2 terabytes of data daily ▪ Large Hadron Collider in Switzerland to collect 300 exabytes per year ▪ Department of Energy has increased their data by a factor of 10 every four years since 1990
  • 7. Hardware’s Shrinking Cost. Year and Cost/MB: 1986 $51.30 ▪ 1991 $13.00 ▪ 1994 $1.00 ▪ 1997 $0.09 ▪ 2000 $0.07 ▪ 2003 $0.02 ▪ 2009 $0.0002. Storage costs are plummeting, but not as fast as the amount of data is growing. Cheap storage costs also encourage applications to store ever more data.
  • 8. Can Software Keep Pace? How Can We Find Anything? ▪ Search Algorithms have evolved and improved, but… ▪ Internet Search is only Fair to Good – Google PageRank ▪ 8+ billion web pages, hundreds of thousands of servers ▪ Enterprise Search is Poor – Usage patterns are hard to model
  • 9. The Problem of Search ▪ 49 percent of business users say that finding data is difficult and time consuming. -- AIIM 2008 Market Study ▪ Users have a 50 percent success rate at search -- Recommind Survey March 2009
  • 10. Scattered Data Repositories ▪ Corporate Applications – ERP – PLM/PDM – Business Intelligence / Knowledge Management – Content and Document Management ▪ Relational Databases ▪ Local and Shared File Systems ▪ Internet/Intranet HTTP servers ▪ Email Servers ▪ Disk Appliances (digital cameras, cell phone…)
  • 11. Multiple Repository Challenge. Problem: ▪ How to access and search data to achieve Compliance, eDiscovery and Business Intelligence. Challenge: ▪ Many organizations have multiple repositories from multiple vendors ▪ Lack of standards around API and query language ▪ Each system is different and has very little common reuse
  • 12. Unstructured Data Search is Hard ▪ 80 percent of enterprise data is unstructured – E.g., emails, PDF, Word and Office docs ▪ No underlying data model or schema – emails and IM often lack context and use shorthand and abbreviations that increase the search challenge
  • 13. Huge Data Sets Bring Huge Problems ▪ Search gets harder as data sets grow – Longer to index and search – Harder to determine context ▪ The more systems, the harder to secure ▪ The more systems, the harder to consolidate search ▪ Conflicting or Inconsistent Data – Which is the system of reference?
  • 14. Getting Data Under Control ▪ Ultimate goal: Content Intelligence – Knowledge extraction – Ability to distill, condense and summarize data. How? ▪ Apply more Structure and Reuse – XML Tags ▪ Allow greater access across data sources – Consolidation of Systems – Integration of Systems
  • 15. Creating Structure: Semi-Structured Data ▪ Use a structured native data format – XML Authoring/Publishing applications ◦ DITA publishing XML – Microsoft Office 2007 docx, etc. (Office Open XML) ◦ Complex: 29 namespaces and 89 schema models ▪ Add Structure – Append Headers and Embedded Properties ◦ E.g., TIFF, JPEG images ◦ PDF and embedded Microsoft Office files ▪ Associate tags and metadata with unstructured data
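As a rough illustration of slide 15's last point, associating tags and metadata with unstructured data, the sketch below wraps free text in a small XML envelope and then finds records by tag. The element names (`record`, `properties`, `property`, `tags`, `tag`, `body`) and both function names are invented for this example and follow no published schema.

```python
import xml.etree.ElementTree as ET


def wrap_with_metadata(body: str, properties: dict[str, str], tags: list[str]) -> str:
    """Wrap free text in a small XML envelope carrying properties and tags.

    The element names used here are made up for this example and do not
    follow any published schema.
    """
    record = ET.Element("record")
    props = ET.SubElement(record, "properties")
    for name, value in properties.items():
        ET.SubElement(props, "property", name=name).text = value
    tag_list = ET.SubElement(record, "tags")
    for tag in tags:
        ET.SubElement(tag_list, "tag").text = tag
    ET.SubElement(record, "body").text = body
    return ET.tostring(record, encoding="unicode")


def find_by_tag(xml_records: list[str], wanted: str) -> list[str]:
    """Return the body text of every record that carries the wanted tag."""
    hits = []
    for xml_text in xml_records:
        record = ET.fromstring(xml_text)
        if any(tag.text == wanted for tag in record.iter("tag")):
            hits.append(record.findtext("body"))
    return hits
```

Once the tags travel with the content, a terse email such as "pls review Q3 nums asap" can be found by its `finance` tag even though the word never appears in the body.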
  • 16. Centralized Repository Efficiency ▪ Management efficiencies of scale ▪ More efficient search – No need to consolidate search results ▪ Available to users via a single interface
  • 17. Integration of Repositories ▪ Content-Intelligence Platforms can integrate/unite multiple repositories ▪ XML is the pipeline for integration ▪ Integration via APIs or XML Web services – REST Web Services have momentum – Integration with SOA
  • 18. CMIS -- ECM Integration ▪ ECM vendors have united to create a new interoperability standard: Content Management Interoperability Services (CMIS) – Web services for sharing information between different content repositories – “SQL for Document Management”
  • 19. What is CMIS? ▪ Content Management Interoperability Services – Defines a lowest-common-denominator CM capability set – CM content is accessed as SOAP or AtomPub (REST) web services – A single application works identically with content from any CMIS vendor
  • 20. CMIS Timeline ▪ 1993 – ODMA (Open Document Management API) ▪ 1996 – DMA (AIIM Document Management Alliance) ▪ 1996 – WebDAV (Web-based Distributed Authoring and Versioning) ▪ 2002 – JSR-170 / Java Content Repository (Day Software) ▪ 2005 – iECM (AIIM Interoperable ECM) ▪ October 2006 – CMIS started ▪ August 2008 – Contributing members invited ▪ September 2008 – Draft Specification submitted to OASIS ▪ Possible completion and acceptance in late 2009 or early 2010
  • 21. JCR versus CMIS (JCR | CMIS): Session-based API | Services Based ▪ Java Only | Language Agnostic ▪ “Complete” ECM | Core ECM functions ▪ Infrastructure | Interoperability ▪ Targets DM, RM, DAM, WCM… | Intended specifically for DM ▪ Complex | Simple ▪ Prescriptive | Little or No Change ▪ Connectors by Day | Vendor Connectors ▪ Version 2.0 | Version .61 ▪ Design spearheaded by Day Software | Design Led by Top Tier ECM Vendors
  • 22. CMIS: Creators and Participants ▪ Founding Companies for the Original Standard – EMC/Documentum – IBM/Filenet – Microsoft ▪ Contributing Members (after August 7, 2008) – Alfresco – Open Text – Oracle – SAP – More …
  • 23.
  • 24. CMIS – The Model ▪ Documents – E.g., Office document or image – Content, Metadata and Version History ▪ Folders – Defines Organization and Hierarchy – Container, Metadata and Hierarchy/Organization ▪ Object Links and Relations – Reference between two folders or documents – Requires a source and target ▪ Policies – Set of rules that can be applied to control other objects, e.g., ACLs or retention policy
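The four object types on slide 24 can be pictured as a small data model. The sketch below is a simplified illustration in Python, not the CMIS schema: every class, field and method name (`CmisObject`, `add_version`, `applied_to`, and so on) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CmisObject:
    """Common base: every object has an id and metadata properties."""
    object_id: str
    metadata: dict[str, str] = field(default_factory=dict)


@dataclass
class Document(CmisObject):
    """Content plus metadata plus a version history."""
    content: bytes = b""
    versions: list[bytes] = field(default_factory=list)

    def add_version(self, new_content: bytes) -> None:
        # Keep the previous content in the history before replacing it.
        self.versions.append(self.content)
        self.content = new_content


@dataclass
class Folder(CmisObject):
    """A container that gives the repository its hierarchy."""
    children: list[CmisObject] = field(default_factory=list)


@dataclass
class Relationship(CmisObject):
    """A reference between two folders or documents; needs source and target."""
    source: Optional[CmisObject] = None
    target: Optional[CmisObject] = None

    def __post_init__(self) -> None:
        if self.source is None or self.target is None:
            raise ValueError("a relationship requires a source and a target")


@dataclass
class Policy(CmisObject):
    """A named rule (for example an ACL or retention rule) applied to objects."""
    rule: str = ""
    applied_to: list[CmisObject] = field(default_factory=list)
```

Modelling a relationship as its own object, rather than as a field on a document, mirrors the slide's point that a link needs both a source and a target.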
  • 25. Benefits of CMIS ▪ Standardized Core ECM functions ▪ Enables Interoperability between repositories ▪ Encourages Flexible Application Development ▪ Encourages ‘mash-up’ composite applications ▪ A single application can consolidate and aggregate content from multiple CMIS repositories ▪ Business Processes/Workflow can span and touch all enterprise content
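Slide 25's claim that a single application can aggregate content from multiple repositories depends on every repository exposing the same interface. The sketch below shows that idea with a hypothetical `Repository` protocol and an in-memory stand-in. It does not call any real CMIS service, and all names are assumptions for illustration.

```python
from typing import Protocol


class Repository(Protocol):
    """Hypothetical minimal interface; not the CMIS service definition."""
    name: str

    def search(self, term: str) -> list[str]: ...


class InMemoryRepository:
    """Stand-in for a vendor repository exposed through the common interface."""

    def __init__(self, name: str, documents: list[str]) -> None:
        self.name = name
        self._documents = documents

    def search(self, term: str) -> list[str]:
        # Case-insensitive substring match on document titles.
        needle = term.lower()
        return [doc for doc in self._documents if needle in doc.lower()]


def search_all(repositories: list[Repository], term: str) -> list[tuple[str, str]]:
    """Run the same search against every repository and tag hits with their origin."""
    results = []
    for repo in repositories:
        for title in repo.search(term):
            results.append((repo.name, title))
    return results
```

Because `search_all` depends only on the shared interface, adding a repository from another vendor means adding one more object to the list rather than writing a new integration.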
  • 26. CMIS Weak Points ▪ Only Basic Content Functions Available ▪ Does not cover Admin/Management ▪ Does not cover User Authentication ▪ Does not handle Security/Authorization
  • 27. Applications ▪ Workflow/Business Processes – Connect work packages from any repository ▪ Portals and Mash-ups – Aggregated Content from multiple sources ▪ E-Discovery and Compliance
  • 28. Summary ▪ Massive Growth in Content Creation ▪ Advances in hardware technology are fueling content creation and storage ▪ Search and Retrieval of content grows in complexity with its volume ▪ Content Intelligence is needed to bring understanding to data ▪ Standards like XML and CMIS provide consistent classification and handling of data