Cohasset Associates, Inc.

         Will Technology-Assisted Predictive Modeling and Auto-
                Classification End the ‘End-User’ Burden in
                          Records Management?



                             2012 Managing Electronic Records Conference
                                                              Chicago, IL
                                                             May 7, 2012

                                                               Jason R. Baron, Esq.
                                                                     Director of Litigation
                                                               Office of General Counsel
                                            National Archives and Records Administration

                                                                   Dave Lewis, Ph.D.
                                                           David D. Lewis Consulting, LLC
                                                                             Chicago, IL




         A New Era of Government
                “[P]roper records management is the backbone of open Government.”
                      President Obama’s Memorandum dated November 28, 2011
                                 re “Managing Government Records”
         http://www.whitehouse.gov/the-press-office/2011/11/28/presidential-memorandum-managing-government-records




        Reality:
        The era of Big Data has just
        begun….
          Lehman Brothers Investigation
             -- 350 billion page universe (3 petabytes)
             -- Examiner narrowed collection by selecting key
                custodians and using dozens of Boolean searches
             -- Reviewed 5 million docs (40 million pages) using
                70 contract attorneys
          Source: Report of Anton R. Valukas, Examiner, In re Lehman Brothers Holdings Inc., et al., Chapter 11
          Case No. 08-13555 (U.S. Bankruptcy Ct. S.D.N.Y. March 11, 2010), Vol. 7, Appx. 5, at
          http://lehmanreport.jenner.com/.




        Process Optimization Problem 1: The
        transactional toll of user-based
        recordkeeping schemes (“as is” RM)








        …. and the need for
        better, automated solutions ….




        Impact of Technology on E-Records
        Management: Snapshot 2012 (“As is”)
           • A universe of proprietary products exists in the
             marketplace: document management and
             records management applications (RMAs)
           • DoD 5015.2 version 3 compliant products
           • However, scalability issues exist
           • Agencies must prepare to confront significant
             front-end process issues when transitioning to
             electronic recordkeeping
           • Records schedule simplification is key




        RM wish list for 2012….
           • RM’s “easy button”: the elusive goal of zero
             extra keystrokes to comply with RM
             requirements (capture)
           • A technology app that automatically tags
             records in compliance with RM policies and
             practices (categorize)
           • Supervised learning RM with minimal records
             officer or end user involvement (learn)
           • Rule-based and role-based RM
           • Advanced search




        Electronic Archiving As The
        First Step
           • What is it?
               - 100% snapshot of (typically) email, plus in some
                 cases other selected ESI applications
           • How does it differ from an RMA?
               - Goal is preservation of evidence, not records
                 management per se
           • NARA Bulletin 2008-05




        A Possible Path Forward?
           • Email archiving in short term, synced to existing
             proprietary software on email system
           • Designation of key senior officials as creating
             permanent records, consistent with existing records
             schedules
           • Additional designations of permanent records by
             agency component
           • “Smart” filters/categorical rules built in based on
             content, to the extent feasible
           • By default, records go into designated temporary record
             buckets and are disposed of under existing records
             schedules.
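
The ordering of rules on this slide (senior-official designation, then component designation, then content filters, then a temporary default) can be sketched in a few lines of Python. This is only an illustration: the role names, component names, keywords, and the `Email` and `disposition` names are hypothetical and are not taken from any agency schedule or product.

```python
from dataclasses import dataclass

# Hypothetical designations; a real agency would take these from its
# approved records schedules, not from hard-coded sets.
PERMANENT_ROLES = {"secretary", "deputy secretary", "general counsel"}
PERMANENT_COMPONENTS = {"office of the secretary"}
PERMANENT_KEYWORDS = ("policy directive", "final decision")


@dataclass
class Email:
    sender_role: str
    component: str
    subject: str


def disposition(email: Email) -> str:
    """Return 'permanent' or 'temporary', checking rules in the slide's order."""
    if email.sender_role.lower() in PERMANENT_ROLES:
        return "permanent"   # key senior official
    if email.component.lower() in PERMANENT_COMPONENTS:
        return "permanent"   # additional designation by agency component
    subject = email.subject.lower()
    if any(keyword in subject for keyword in PERMANENT_KEYWORDS):
        return "permanent"   # content-based "smart" filter
    return "temporary"       # default bucket


examples = [
    Email("General Counsel", "Office of General Counsel", "Lunch on Friday"),
    Email("Analyst", "Budget Office", "Draft policy directive on retention"),
    Email("Analyst", "Budget Office", "Parking garage closed"),
]
results = [disposition(e) for e in examples]
```

Because the role rule fires first, everything a designated official sends is kept as permanent regardless of content, which is the point of the "zero extra keystrokes" goal: no end user has to file anything.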




         A pyramid approach combines disposition policy with automated
         tools to bring FRA email under records management,
         preservation, and access

         [Figure: pyramid with a slider marking the set-point. Legend:
         permanent or top officials / temporary or staff and support]




         The position of the “set-point” for email capture depends on policy and resources:
         setting it higher allows use of tools now available to get 100% of email at lower
         volumes;* setting it lower means more records will be captured and smarter tools
         are needed to distinguish and disposition temporary- and non-record.

         Implementing an email archiving policy is feasible now, since tools are readily
         available to capture 100% of email traffic at the individual or organizational level, in
         formats that can be archived.




        How To Avoid A Train Wreck
        With Email Archiving….




                     Capture E-mail But Utilize Records Management!




        Functional Requirements for
        Categorization Products in the Federal
        workplace

           Ease of use … Scalability … Archiving in native
           formats … Metadata preservation … Seamless integration
           with existing software apps … Versioning … Compatibility
           with big bucket records schedules … Advanced search
           capabilities … Ease of training / machine learning using
           records officers or end users … Cost




        Process Optimization Problem 2: The
        Coming Age of Dark Archives (and the
        inability to provide access)




                 Emerging New Strategies:
                  “Predictive Analytics”




        Improved review and case assessment: cluster docs
        through use of software with minimal human
        intervention at front end to code “seeded” data set

        Slide adapted from Gartner Conference, June 23, 2010, Washington, D.C.




                      Language Processing
                         Technologies
                             Retrieval / Search                      2.
        Information            Classification                             1.
        Retrieval
                           Question Answering
                              Summarization
                            Entity Recognition
                          Information Extraction              Natural
                                                              Language
                           Machine Translation                Processing
                                     :




         Text Classification
           • Deciding which of several groups a text
             belongs to
           • Crudest form of language understanding...
               - ...but often can be automated
                 with high accuracy
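
As a rough illustration of "deciding which of several groups a text belongs to," the sketch below trains a small multinomial naive Bayes classifier on manually labeled examples. Naive Bayes is chosen here only because it fits in a few lines; the slides do not name a specific algorithm, and the training texts, labels, and function names are all invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(examples):
    """examples: list of (text, label) pairs assigned by a human reviewer."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training data standing in for a records officer's manual coding.
training = [
    ("Final signed contract for facility lease", "permanent"),
    ("Signed policy directive on records retention", "permanent"),
    ("Lunch order for Friday team meeting", "temporary"),
    ("Reminder: parking garage closed Friday", "temporary"),
]
model = train(training)
label_a = classify("Signed lease contract", model)
label_b = classify("Friday lunch meeting", model)
```

With only four training examples this is a toy; the same structure applied to thousands of coded documents is what makes high accuracy plausible.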


                       Why Classify?
         Reduce infinite variety of text...
         ...to finite set of classes...
         ...to specify an action for every possible input.








        Other Advantages of Text
        Classification
           • Supervised learning:
               - Classifiers (rules) can be learned by imitating
                 manual classifications
           • Straightforward numerical measures of quality
               [Illustration: recall: 85% +/- 4%, precision: 75% +/- 3%]
           • Objective reason why a decision was made
               [Illustration: classification rule]
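
The "numerical measures of quality" on this slide can be computed directly by comparing classifier output against manual labels. The sketch below is a minimal example; the label values and sample data are invented, and `margin_95` assumes the slide's "+/-" figures are sampling margins from a normal approximation, which the slide does not actually state.

```python
import math


def precision_recall(manual, predicted, positive="record"):
    """Compare classifier output against manual (reference) labels."""
    pairs = list(zip(manual, predicted))
    tp = sum(1 for m, p in pairs if m == positive and p == positive)
    fp = sum(1 for m, p in pairs if m != positive and p == positive)
    fn = sum(1 for m, p in pairs if m == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def margin_95(p, n):
    """Half-width of a normal-approximation 95% interval for a proportion
    p estimated from n sampled documents (an assumption, see lead-in)."""
    return 1.96 * math.sqrt(p * (1 - p) / n) if n else 0.0


# Invented sample: six documents coded by a reviewer and by the software.
manual = ["record", "record", "record", "non-record", "non-record", "record"]
predicted = ["record", "record", "non-record", "record", "non-record", "record"]
precision, recall = precision_recall(manual, predicted)
```

Precision answers "of what the software tagged, how much was right," and recall answers "of what should have been tagged, how much was found"; reporting both guards against a classifier that looks good on only one.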





        Variations on Classification
           • Binary vs. multiclass
           • Hierarchical
           • Probabilistic (illustration: 83% / 17%)
           • Graded / ordered / fuzzy
        Defining Sets of Classes
           • Tradeoff among
               - Ideal classes to implement policy
               - Classes you can teach people to assign
               - Classes you can teach software to assign
           • Be skeptical of automatic discovery of classes




        Text Retrieval Systems
           • AKA search engines, semi-structured databases,
             text databases, etc.




                Classification          Search

                autonomous              interactive
                long term               transitory
                organizational          personal
                structured              independent




        Some Distinctions Among
        Search Approaches
           • Exact Match vs. Ranked Retrieval vs. Browsing
           • Text Representations
           • Matching Aids

           [Side label: "Concepts" vs. "Keywords"]




        Exact Match Search
           • Query specifies conditions document must meet
               - Example: budget AND Knoxville
                 AND (revised OR preliminary)
           • Variants
               - Boolean
               - SQL
               - Faceted
           • Often (ambiguously) called "keyword" search
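
To make the exact-match idea concrete, the sketch below evaluates the slide's sample query against a few invented documents. The nested-tuple query representation and the function names are assumptions chosen for brevity, not any product's query syntax, and matching is on single words only (no phrases or wildcards).

```python
import re


def terms_of(text):
    """Set of lowercase alphanumeric tokens in a document."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def evaluate(query, terms):
    """Evaluate a nested-tuple Boolean query against a document's term set."""
    op = query[0]
    if op == "TERM":
        return query[1] in terms
    if op == "AND":
        return all(evaluate(q, terms) for q in query[1:])
    if op == "OR":
        return any(evaluate(q, terms) for q in query[1:])
    if op == "NOT":
        return not evaluate(query[1], terms)
    raise ValueError(f"unknown operator: {op}")


# budget AND Knoxville AND (revised OR preliminary)
QUERY = (
    "AND",
    ("TERM", "budget"),
    ("TERM", "knoxville"),
    ("OR", ("TERM", "revised"), ("TERM", "preliminary")),
)

# Invented documents for illustration.
docs = {
    "d1": "Revised budget for the Knoxville office",
    "d2": "Knoxville budget, final version",
    "d3": "Preliminary Knoxville budget estimates",
}
hits = sorted(d for d, text in docs.items() if evaluate(QUERY, terms_of(text)))
```

A document either satisfies every condition or it is excluded; there is no notion of a "close" match, which is the key contrast with ranked retrieval.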





        A Faceted Search Interface




        Ranked Retrieval
           • Query specifies important attributes of desired
             documents
           • System statistically weights those attributes
           • Results returned in order of
             strength of match
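
One common way to "statistically weight" query attributes is TF-IDF: a term counts for more when it is frequent in a document and rare in the collection. The sketch below uses a smoothed IDF and invented documents; it is one possible weighting under those assumptions, not the scheme of any particular search product.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(query, docs):
    """Return document ids ordered by descending TF-IDF match score."""
    doc_counts = {d: Counter(tokenize(t)) for d, t in docs.items()}
    n = len(docs)
    scores = {}
    for doc_id, counts in doc_counts.items():
        score = 0.0
        for term in set(tokenize(query)):
            df = sum(1 for c in doc_counts.values() if term in c)
            if df == 0:
                continue  # term appears nowhere in the collection
            idf = math.log((n + 1) / (df + 1)) + 1  # smoothed inverse doc frequency
            score += counts[term] * idf
        scores[doc_id] = score
    # Ties are broken by document id so the ordering is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Invented documents for illustration.
docs = {
    "memo1": "budget budget revised for Knoxville",
    "memo2": "Knoxville picnic schedule",
    "memo3": "travel budget",
}
ranking = rank("revised Knoxville budget", docs)
```

Unlike the exact-match case, every document gets a score, so partial matches such as `memo2` and `memo3` still appear, just lower in the list.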






        Statistical Evidence in Ranked
        Retrieval
           • Corpus statistics
               - Word (and metadata) counts
           • Unsupervised learning
               - Clustering, LSI/LSA, etc.
               - finds (maybe useless) patterns
           • Supervised learning
               - aka "relevance feedback"
               - learn indicators of user interest




        Browsing
           • Hierarchies
           • Networks
           • Clusters
           • Spaces / Maps / Dimensions
               - make great pictures / demos
               - unclear if useful for finding information




        Visual Analysis Examples
        (Presentation by Dr. Victoria Lemieux, Univ. British Columbia,
        at Society of American Archivists Annual Meeting 2010, Washington, D.C.)




                    With acknowledgments to Jeffrey Heer, Exploring Enron, http://hci.stanford.edu/jheer/projects/enron/,
                             Adam Perer, Contrasting Portraits, http://hcil.cs.umd.edu/trs/2006-08/2006-08.pdf,
                             and Fernanda Viegas, Email Conversations, http://fernandaviegas.com/email.html




        What Evidence Can The
        Search Software Use?
           • Words, phrases, etc.
           • Manually assigned categories
           • Metadata
               - Author, organization, creation date, change
                 date, access date, length, file type,...
           • Contextual information (links,
             attachments,...)




        What Resources Aid
        Matching?
           • Linguistic analysis
               - At word level or higher
           • Clusters / spaces / ...
           • Thesauri / semantic nets /
             concept maps / ...
               - Suited to your task?
               - Modifiable?
               - How is text determined to
                 belong to category?




        Concepts v. Keywords
        Supreme Court of Information Retrieval, Case No. 1-tfidf-0-2902, 2009


           • Search software marketing:
               - Them = keyword search = bad
               - Us = concept search = good
           • Reality:
               - Both terms have referred to dozens of
                 different technologies...
               - ...including some of the same ones!
           • Conceptual search is an aspiration, not
             a technology




            Example of Boolean search string
            from U.S. v. Philip Morris

            (((master settlement agreement OR msa) AND NOT (medical
             savings account OR metropolitan standard area)) OR s. 1415
             OR (ets AND NOT educational testing service) OR (liggett
             AND NOT sharon a. liggett) OR atco OR lorillard OR (pmi
             AND NOT presidential management intern) OR pm usa OR
             rjr OR (b&w AND NOT photo*) OR phillip morris OR batco
             OR ftc test method OR star scientific OR vector group OR
             joe camel OR (marlboro AND NOT upper marlboro)) AND
             NOT (tobacco* OR cigarette* OR smoking OR tar OR
             nicotine OR smokeless OR synar amendment OR philip
             morris OR r.j. reynolds OR ("brown and williamson") OR
             ("brown & williamson") OR bat industries OR liggett group)





         U.S. v. Philip Morris E-mail Winnowing
         Process

            Stage                                       Count
            Email records                               20 million
            Hits based on keyword terms used (1%)       200,000
            Relevant emails                             100,000
            Produced to opposing party                  80,000
            Placed on privilege logs                    20,000

             A PROBLEM: only a handful entered as exhibits at trial
             A BIGGER PROBLEM: the 1% figure does not scale
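
The arithmetic behind "the 1% figure does not scale" can be worked through directly. In the sketch below the 20 million figure and the 1% hit rate come from the slide; the one-billion-record collection is a hypothetical size chosen only to show how the review pile grows.

```python
def keyword_hits(collection_size, hit_rate_percent):
    """Documents left for human review after keyword filtering.
    Integer arithmetic avoids floating-point rounding surprises."""
    return collection_size * hit_rate_percent // 100


# Figures from the slide: 20 million emails, 1% keyword hit rate.
philip_morris_hits = keyword_hits(20_000_000, 1)

# Hypothetical larger collection at the same 1% hit rate.
hypothetical_hits = keyword_hits(1_000_000_000, 1)

growth_factor = hypothetical_hits // philip_morris_hits
```

At the same hit rate, a collection 50 times larger produces a manual review set 50 times larger (10 million documents in this hypothetical), so keyword filtering alone does not keep review manageable as volumes grow.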





        Judicial endorsement of predictive analytics
        in document review by Judge Peck in Da
        Silva Moore v. Publicis Groupe (S.D.N.Y. Feb.
        24, 2012)
             This opinion appears to be the first in which a Court
             has approved of the use of computer-assisted review.
             . . . What the Bar should take away from this Opinion
             is that computer-assisted review is an available tool
             and should be seriously considered for use in large-
             data-volume cases where it may save the producing
             party (or both parties) significant amounts of legal
             fees in document review. Counsel no longer have to
             worry about being the ‘first’ or ‘guinea pig’ for judicial
             acceptance of computer-assisted review . . .
             Computer-assisted review can now be considered
             judicially-approved for use in appropriate cases.




              Social Networking/Links Analysis Example




                                            From Marc Smith, posted on Flickr
                                            under Creative Commons License




        Judicial second guessing of failure to use
        e-search capabilities: Capitol Records v.
        MP3 Tunes, 261 F.R.D. 44 (S.D.N.Y. 2009)


           • “In [a prior case] the Court notes its dismay that the
             party opposing discovery of its ESI had organized its
             files in a manner which seemed to serve no purpose
             other than ‘to discourage audits. . .’ Similarly, in this
             case, [the party] host[ed] no ediscovery software on
             their servers and apparently are unable to conduct
             centralized email searches of groups of users
             without downloading them to a separate file and
             relying on the services of an outside vendor.”




        Judicial second guessing of failure to use
        e-search capabilities: Capitol Records v.
        MP3 Tunes (cont’d)
        Court went on to add:
        “The day undoubtedly will come when
          burden arguments based on a large
          organization’s lack of internal ediscovery
          software will be received about as well as the
          contention that a party should be spared from
          retrieving paper documents because it had
          filed them sequentially, but in no apparent
          groupings, in an effort to avoid the added
          expense of file folders or indices.”




        Problem 3: Innovative
        Thinking








        The records management world of
        tomorrow….




        References
        Background Law Review Referencing Autocategorization &
           Advanced Search
        J. Baron, “Law in the Age of Exabytes: Some Further Thoughts on
           ‘Information Inflation’ and Current Issues in E-Discovery
           Search,” 17 Richmond J. Law & Technology (2011), see
           http://law.richmond.edu

        Latest “Predictive Coding” Case Law to follow in blogs online:
           • Da Silva Moore v. Publicis Groupe & MSL Group, 11 Civ. 1279
             (S.D.N.Y.) (Peck, M.J.) (Opinion dated Feb. 24, 2012)
           • Kleen Products, LLC v. Packaging Corp. of America, 10 C 5711
             (N.D. Ill.) (Nolan, M.J.)





           Jason R. Baron
           Director of Litigation
           Office of General Counsel
           National Archives and
           Records Administration

           (301) 837-1499
           Email: jason.baron@nara.gov




          Dave Lewis, Ph.D.
          David D. Lewis Consulting, LLC
          Chicago, IL

          Email: consult@DavidDLewis.com
          http://www.DavidDLewis.com




 
M12S05 - CASE STUDY: Leveraging Content Analytics to Kick-Start your Informat...
M12S05 - CASE STUDY: Leveraging Content Analytics to Kick-Start your Informat...M12S05 - CASE STUDY: Leveraging Content Analytics to Kick-Start your Informat...
M12S05 - CASE STUDY: Leveraging Content Analytics to Kick-Start your Informat...MER Conference
 
M12S02 - ERM Software: Historic Timeline, Lessons Learned, Current Issues, Fu...
M12S02 - ERM Software: Historic Timeline, Lessons Learned, Current Issues, Fu...M12S02 - ERM Software: Historic Timeline, Lessons Learned, Current Issues, Fu...
M12S02 - ERM Software: Historic Timeline, Lessons Learned, Current Issues, Fu...MER Conference
 
M12S07 - Retention & ESI - Paths to Success - Part Two
M12S07 - Retention & ESI - Paths to Success - Part TwoM12S07 - Retention & ESI - Paths to Success - Part Two
M12S07 - Retention & ESI - Paths to Success - Part TwoMER Conference
 

Mehr von MER Conference (13)

M12S23 - Right-sizing Your Information Footprint by Chucking Your Dead Data
M12S23 - Right-sizing Your Information Footprint by Chucking Your Dead DataM12S23 - Right-sizing Your Information Footprint by Chucking Your Dead Data
M12S23 - Right-sizing Your Information Footprint by Chucking Your Dead Data
 
M12S21 - "Corporate Alzheimer's": The Impending Crisis in Accessing Digital R...
M12S21 - "Corporate Alzheimer's": The Impending Crisis in Accessing Digital R...M12S21 - "Corporate Alzheimer's": The Impending Crisis in Accessing Digital R...
M12S21 - "Corporate Alzheimer's": The Impending Crisis in Accessing Digital R...
 
M12S19 - S19 - CASE STUDY: e-RIM Success with Structured Data Systems
 M12S19 - S19 - CASE STUDY: e-RIM Success with Structured Data Systems M12S19 - S19 - CASE STUDY: e-RIM Success with Structured Data Systems
M12S19 - S19 - CASE STUDY: e-RIM Success with Structured Data Systems
 
M12S18 - Records and Information Management: What Healthcare Should be Learni...
M12S18 - Records and Information Management: What Healthcare Should be Learni...M12S18 - Records and Information Management: What Healthcare Should be Learni...
M12S18 - Records and Information Management: What Healthcare Should be Learni...
 
M12S15 - CASE STUDY: Spoliation - The Actual Case As It Was To Be Argued in ...
 M12S15 - CASE STUDY: Spoliation - The Actual Case As It Was To Be Argued in ... M12S15 - CASE STUDY: Spoliation - The Actual Case As It Was To Be Argued in ...
M12S15 - CASE STUDY: Spoliation - The Actual Case As It Was To Be Argued in ...
 
M12S13 - RIM for the Next Generation: A Call to Action
 M12S13 - RIM for the Next Generation: A Call to Action M12S13 - RIM for the Next Generation: A Call to Action
M12S13 - RIM for the Next Generation: A Call to Action
 
M12S11 - The Do's and Don'ts of Managing Social Media
 M12S11 - The Do's and Don'ts of Managing Social Media M12S11 - The Do's and Don'ts of Managing Social Media
M12S11 - The Do's and Don'ts of Managing Social Media
 
M12S01 - The Information Tsunami: Where We Are and How to Move Forward
M12S01 - The Information Tsunami: Where We Are and How to Move ForwardM12S01 - The Information Tsunami: Where We Are and How to Move Forward
M12S01 - The Information Tsunami: Where We Are and How to Move Forward
 
M12S09 - ERM Case Law: The Latest News, Trends, and Issues
M12S09 - ERM Case Law: The Latest News, Trends, and IssuesM12S09 - ERM Case Law: The Latest News, Trends, and Issues
M12S09 - ERM Case Law: The Latest News, Trends, and Issues
 
M12S08 - Transforming RIM to 'Responsible Information Management'
M12S08 - Transforming RIM to 'Responsible Information Management'M12S08 - Transforming RIM to 'Responsible Information Management'
M12S08 - Transforming RIM to 'Responsible Information Management'
 
M12S05 - CASE STUDY: Leveraging Content Analytics to Kick-Start your Informat...
M12S05 - CASE STUDY: Leveraging Content Analytics to Kick-Start your Informat...M12S05 - CASE STUDY: Leveraging Content Analytics to Kick-Start your Informat...
M12S05 - CASE STUDY: Leveraging Content Analytics to Kick-Start your Informat...
 
M12S02 - ERM Software: Historic Timeline, Lessons Learned, Current Issues, Fu...
M12S02 - ERM Software: Historic Timeline, Lessons Learned, Current Issues, Fu...M12S02 - ERM Software: Historic Timeline, Lessons Learned, Current Issues, Fu...
M12S02 - ERM Software: Historic Timeline, Lessons Learned, Current Issues, Fu...
 
M12S07 - Retention & ESI - Paths to Success - Part Two
M12S07 - Retention & ESI - Paths to Success - Part TwoM12S07 - Retention & ESI - Paths to Success - Part Two
M12S07 - Retention & ESI - Paths to Success - Part Two
 

Kürzlich hochgeladen

Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxRoyAbrique
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 

Kürzlich hochgeladen (20)

Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 

M12S06 - Will Technology-Assisted Predictive Modeling and Auto-Classification End the 'End User' Burden in Records Management?

  • 1. Cohasset Associates, Inc. Will Technology-Assisted Predictive Modeling and Auto-Classification End the ‘End-User’ Burden in Records Management? 2012 Managing Electronic Records Conference, Chicago, IL, May 7, 2012. Jason R. Baron, Esq., Director of Litigation, Office of General Counsel, National Archives and Records Administration; Dave Lewis, Ph.D., David D. Lewis Consulting, LLC, Chicago, IL.
    A New Era of Government: “[P]roper records management is the backbone of open Government.” President Obama’s Memorandum dated November 28, 2011 re “Managing Government Records,” http://www.whitehouse.gov/the-press-office/2011/11/28/presidential-memorandum-managing-government-records
  • 2. Reality: The era of Big Data has just begun….
      • Lehman Brothers Investigation: a 350 billion page universe (3 petabytes)
      • Examiner narrowed the collection by selecting key custodians and using dozens of Boolean searches
      • Reviewed 5 million docs (40 million pages) using 70 contract attorneys
      • Source: Report of Anton R. Valukas, Examiner, In re Lehman Brothers Holdings Inc., et al., Chapter 11 Case No. 08-13555 (U.S. Bankruptcy Ct. S.D.N.Y. March 11, 2010), Vol. 7, Appx. 5, at http://lehmanreport.jenner.com/.
    Process Optimization Problem 1: The transactional toll of user-based recordkeeping schemes (“as is” RM)
    …. and the need for better, automated solutions ….
  • 3. Impact of Technology on E-Records Management: Snapshot 2012 (“As is”)
      • A universe of proprietary products exists in the marketplace: document management and records management applications (RMAs)
      • DoD 5015.2 version 3 compliant products
      • However, scalability issues exist
      • Agencies must prepare to confront significant front-end process issues when transitioning to electronic recordkeeping
      • Records schedule simplification is key
    RM wish list for 2012….
      • RM’s “easy button”: the elusive goal of zero extra keystrokes to comply with RM requirements (capture)
      • A technology app that automatically tags records in compliance with RM policies and practices (categorize)
      • Supervised learning RM with minimal records officer or end user involvement (learn)
      • Rule-based and role-based RM
      • Advanced search
    Electronic Archiving As The First Step
      • What is it? A 100% snapshot of (typically) email, plus in some cases other selected ESI applications
      • How does it differ from an RMA? The goal is preservation of evidence, not records management per se
      • NARA Bulletin 2008-05
  • 4. A Possible Path Forward?
      • Email archiving in the short term, synced to existing proprietary software on the email system
      • Designation of key senior officials as creating permanent records, consistent with existing records schedules
      • Additional designations of permanent records by agency component
      • “Smart” filters/categorical rules built in based on content, to the extent feasible
      • By default, records fall into designated temporary record buckets and are disposed of under existing records schedules
    A pyramid approach combines disposition policy with automated tools to bring FRA email under records management, preservation, and access
      • [Figure: pyramid with a slider. Legend: permanent (top officials); temporary (staff and support officials).]
      • The position of the “set-point” for email capture depends on policy and resources: setting it higher allows use of tools now available to get 100% of email at lower volumes;* setting it lower means more records will be captured, and smarter tools are needed to distinguish and disposition temporary records and non-records.
      • Implementing an email archiving policy is feasible now, since tools are readily available to capture 100% of email traffic at the individual or organizational level, in formats that can be archived.
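The set-point idea on the pyramid slide can be illustrated with a short Python sketch. The numeric seniority scale, the `Account` type, and the example addresses are assumptions made for illustration only; the slides do not say how officials would be ranked or how a real archiving tool would apply the rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    address: str
    seniority: int  # hypothetical rank: a higher number means a more senior official


def disposition(account: Account, set_point: int) -> str:
    """Accounts at or above the set-point are treated as creating permanent records."""
    return "permanent" if account.seniority >= set_point else "temporary"


def permanent_addresses(accounts, set_point):
    """List the addresses whose email would be captured as permanent."""
    return [a.address for a in accounts if disposition(a, set_point) == "permanent"]


# Hypothetical accounts, ordered from the top of the pyramid down.
accounts = [
    Account("secretary@agency.example", 9),
    Account("deputy@agency.example", 7),
    Account("analyst@agency.example", 3),
]

# A high set-point captures few accounts as permanent; lowering it captures more.
print(permanent_addresses(accounts, set_point=8))
print(permanent_addresses(accounts, set_point=5))
```

Lowering `set_point` moves more accounts into the permanent group, which is the trade-off the slide describes: more records captured, and more need for tools that can sort temporary records from non-records.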
  • 5. How To Avoid A Train Wreck With Email Archiving…. Capture E-mail But Utilize Records Management!
    Functional Requirements for Categorization Products in the Federal workplace
      • Ease of use
      • Scalability
      • Archiving in native formats
      • Metadata preservation
      • Seamless integration with existing software apps
      • Versioning
      • Compatibility with big bucket records schedules
      • Advanced search capabilities
      • Ease of training / machine learning using records officers or end users
      • Cost
    Process Optimization Problem 2: The Coming Age of Dark Archives (and the inability to provide access)
  • 6. Emerging New Strategies: “Predictive Analytics”
      • Improved review and case assessment: cluster docs through use of software with minimal human intervention at the front end to code a “seeded” data set
      • Slide adapted from Gartner Conference, June 23, 2010, Washington, D.C.
    Language Processing Technologies
      • [Diagram: Natural Language Processing, with Information Retrieval (1. Classification; 2. Retrieval / Search), Question Answering, Summarization, Entity Recognition, Information Extraction, Machine Translation.]
    Text Classification
      • Deciding which of several groups a text belongs to
      • Crudest form of language understanding...
      • ...but often can be automated with high accuracy
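Deciding which of several groups a text belongs to can be sketched in a few lines of Python. The example below is a minimal multinomial naive Bayes classifier with add-one smoothing; the technique, the two class names, and the four training messages are assumptions chosen for illustration, not the method or data used by the presenters.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_texts):
    """Count documents per class and words per class from manually labeled examples."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    for text, label in labeled_texts:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the class with the highest smoothed log-probability for the text."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical manually classified messages.
training = [
    ("final contract signed by the director", "permanent"),
    ("policy decision memo from the director", "permanent"),
    ("lunch order for the team", "temporary"),
    ("parking reminder for staff", "temporary"),
]
model = train(training)
print(classify("memo on contract policy", model))
print(classify("lunch reminder for staff", model))
```

A real system would need far more training data and a held-out sample to measure quality; this sketch only shows the shape of the task.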
  • 7. Why Classify?
      • Reduce infinite variety of text...
      • ...to finite set of classes...
      • ...to specify an action for every possible input.
    Other Advantages of Text Classification
      • Supervised learning: classifiers (rules) can be learned by imitating manual classifications
      • Straightforward numerical measures of quality (recall: 85% +/- 4%; precision: 75% +/- 3%)
      • Objective reason why a decision was made (classification rule)
    Variations on Classification
      • Binary vs. multiclass
      • Hierarchical
      • Probabilistic (83% / 17%)
      • Graded / ordered / fuzzy
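The "straightforward numerical measures of quality" above can be computed from a reviewed sample. The sketch below is one way to do it in Python, assuming a normal-approximation 95% margin of error; the slides do not say how their +/- figures were derived, and the counts used here are made up for illustration.

```python
import math


def proportion_with_margin(successes, trials, z=1.96):
    """Return a proportion and its normal-approximation margin of error."""
    p = successes / trials
    margin = z * math.sqrt(p * (1 - p) / trials)
    return p, margin


def precision_recall(tp, fp, fn):
    """Precision = tp / (tp + fp); recall = tp / (tp + fn), each with a margin."""
    precision = proportion_with_margin(tp, tp + fp)
    recall = proportion_with_margin(tp, tp + fn)
    return precision, recall


# Hypothetical sample: 300 correct assignments, 100 wrong assignments, 53 misses.
(precision, precision_margin), (recall, recall_margin) = precision_recall(300, 100, 53)
print(f"precision: {precision:.0%} +/- {precision_margin:.0%}")
print(f"recall: {recall:.0%} +/- {recall_margin:.0%}")
```

The normal approximation is adequate for samples of a few hundred documents with proportions away from 0 or 1; for small samples or extreme proportions another interval would be a better choice.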
  • 8. Defining Sets of Classes
      • Tradeoff among: ideal classes to implement policy; classes you can teach people to assign; classes you can teach software to assign
      • Be skeptical of automatic discovery of classes
    Text Retrieval Systems
      • AKA search engines, semi-structured databases, text databases, etc.
    Classification vs. Search
      Classification    Search
      autonomous        interactive
      long term         transitory
      organizational    personal
      structured        independent
      ? ? ?
  • 9. Some Distinctions Among Search Approaches
      • Exact Match vs. Ranked Retrieval vs. Browsing
      • "Concepts" vs. "Keywords"
      • Text Representations
      • Matching Aids
    Exact Match Search
      • Query specifies conditions document must meet: budget AND Knoxville AND (revised OR preliminary)
      • Variants: Boolean, SQL, Faceted
      • Often (ambiguously) called "keyword" search
    A Faceted Search Interface [screenshot]
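The exact match query on the slide, budget AND Knoxville AND (revised OR preliminary), can be expressed directly as a Boolean test over the words in a document. This Python sketch assumes simple lowercase word tokenization and hypothetical function names; real search engines use an index rather than scanning each text.

```python
import re


def words(text):
    """Return the set of lowercase words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches(text):
    """budget AND Knoxville AND (revised OR preliminary)"""
    w = words(text)
    return "budget" in w and "knoxville" in w and ("revised" in w or "preliminary" in w)


documents = [
    "Revised budget for the Knoxville office",
    "Knoxville budget, final version",
    "Preliminary budget notes, Knoxville and Memphis",
]
hits = [d for d in documents if matches(d)]
print(hits)
```

A document either meets every condition or is excluded; there is no ordering of results by strength of match, which is the contrast with ranked retrieval.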
  • 10. Ranked Retrieval
      • Query specifies important attributes of desired documents
      • System statistically weights those attributes
      • Results returned in order of strength of match
    Statistical Evidence in Ranked Retrieval
      • Corpus statistics: word (and metadata) counts
      • Unsupervised learning: clustering, LSI/LSA, etc.; finds (maybe useless) patterns
      • Supervised learning: aka "relevance feedback"; learn indicators of user interest
    Browsing
      • Hierarchies
      • Networks
      • Clusters
      • Spaces / Maps / Dimensions
      • Make great pictures / demos
      • Unclear if useful for finding information
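Ranked retrieval, as described above, weights query terms using corpus statistics and returns results in order of strength of match. The Python sketch below uses a simple TF-IDF score as one example of such weighting; the smoothing formula, function names, and sample documents are assumptions for illustration, not the weighting any particular product uses.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(query, docs):
    """Return indices of matching docs, strongest match first."""
    doc_counts = [Counter(tokenize(d)) for d in docs]
    n = len(docs)

    def idf(term):
        # Rarer terms get a higher weight (smoothed inverse document frequency).
        df = sum(1 for counts in doc_counts if term in counts)
        return math.log((n + 1) / (df + 1)) + 1

    query_terms = set(tokenize(query))
    scored = []
    for index, counts in enumerate(doc_counts):
        score = sum(counts[t] * idf(t) for t in query_terms)
        scored.append((score, index))
    # Sort by descending score; ties keep original document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for score, index in scored if score > 0]


docs = [
    "revised budget for Knoxville",
    "budget meeting agenda",
    "holiday party schedule",
]
print(rank("Knoxville budget", docs))
```

Unlike the exact match query, a document that contains only some of the query terms still appears in the results, just lower in the ranking.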
  • 11. Visual Analysis Examples (Presentation by Dr. Victoria Lemieux, Univ. British Columbia, at Society of American Archivists Annual Mtg. 2010, Washington, D.C.)
      • With acknowledgments to Jeffrey Heer, Exploring Enron, http://hci.stanford.edu/jheer/projects/enron/; Adam Perer, Contrasting Portraits, http://hcil.cs.umd.edu/trs/2006-08/2006-08.pdf; and Fernanda Viegas, Email Conversations, http://fernandaviegas.com/email.html
  • 12. What Evidence Can The Search Software Use?
      • Words, phrases, etc.
      • Manually assigned categories
      • Metadata: author, organization, creation date, change date, access date, length, file type,...
      • Contextual information (links, attachments,...)
    What Resources Aid Matching?
      • Linguistic analysis, at word level or higher
      • Clusters / spaces / ...
      • Thesauri / semantic nets / concept maps / ...
      • Suited to your task?
      • Modifiable?
      • How is text determined to belong to category?
    Concepts v. Keywords (Supreme Court of Information Retrieval, Case No. 1-tfidf-0-2902, 2009)
      • Search software marketing: Them = keyword search = bad; Us = concept search = good
      • Reality: both terms have referred to dozens of different technologies... ...including some of the same ones!
      • Conceptual search is an aspiration, not a technology
  • 13. Example of Boolean search string from U.S. v. Philip Morris
      • (((master settlement agreement OR msa) AND NOT (medical savings account OR metropolitan standard area)) OR s. 1415 OR (ets AND NOT educational testing service) OR (liggett AND NOT sharon a. liggett) OR atco OR lorillard OR (pmi AND NOT presidential management intern) OR pm usa OR rjr OR (b&w AND NOT photo*) OR phillip morris OR batco OR ftc test method OR star scientific OR vector group OR joe camel OR (marlboro AND NOT upper marlboro)) AND NOT (tobacco* OR cigarette* OR smoking OR tar OR nicotine OR smokeless OR synar amendment OR philip morris OR r.j. reynolds OR ("brown and williamson") OR ("brown & williamson") OR bat industries OR liggett group)
    U.S. v. Philip Morris E-mail Winnowing Process
      20 million email records
        → 200,000 hits based on keyword terms used (1%)
        → 100,000 relevant emails
        → 80,000 produced to opposing party
        → 20,000 placed on privilege logs
      • A PROBLEM: only a handful entered as exhibits at trial
      • A BIGGER PROBLEM: the 1% figure does not scale
    Judicial endorsement of predictive analytics in document review by Judge Peck in Da Silva Moore v. Publicis Groupe (S.D.N.Y. Feb. 24, 2012)
      • “This opinion appears to be the first in which a Court has approved of the use of computer-assisted review. . . . What the Bar should take away from this Opinion is that computer-assisted review is an available tool and should be seriously considered for use in large-data-volume cases where it may save the producing party (or both parties) significant amounts of legal fees in document review. Counsel no longer have to worry about being the ‘first’ or ‘guinea pig’ for judicial acceptance of computer-assisted review . . . Computer-assisted review can now be considered judicially-approved for use in appropriate cases.”
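The winnowing figures from U.S. v. Philip Morris can be checked with a little arithmetic, which also shows why the slide says the 1% figure does not scale. In this Python sketch the stage counts come from the slide; the one-billion-email collection is a hypothetical size chosen only to illustrate the scaling problem.

```python
# Stage counts as given on the slide.
stages = [
    ("email records", 20_000_000),
    ("keyword hits", 200_000),
    ("relevant emails", 100_000),
    ("produced to opposing party", 80_000),
    ("placed on privilege logs", 20_000),
]


def share_of_total(stages):
    """Express every stage as a fraction of the starting collection."""
    total = stages[0][1]
    return {name: count / total for name, count in stages}


def hits_at_same_rate(collection_size, hit_rate=0.01):
    """Documents needing review if keyword searches keep hitting at the same rate."""
    return round(collection_size * hit_rate)


for name, share in share_of_total(stages).items():
    print(f"{name}: {share:.2%}")

# Hypothetical larger collection: the same 1% hit rate leaves millions to review.
print(hits_at_same_rate(1_000_000_000))
```

At a constant hit rate the review set grows in step with the collection, so a keyword process that was workable for 20 million emails produces a manual review burden fifty times larger for a billion.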
  • 14. Social Networking/Links Analysis Example
      • From Marc Smith, posted on Flickr under Creative Commons License
    Judicial second guessing of failure to use e-search capabilities: Capitol Records v. MP3 Tunes, 261 F.R.D. 44 (S.D.N.Y. 2009)
      • “In [a prior case] the Court notes its dismay that the party opposing discovery of its ESI had organized its files in a manner which seemed to serve no purpose other than ‘to discourage audits. . .’ Similarly, in this case, [the party] host[ed] no ediscovery software on their servers and apparently are unable to conduct centralized email searches of groups of users without downloading them to a separate file and relying on the services of an outside vendor.”
    Judicial second guessing of failure to use e-search capabilities: Capitol Records v. MP3 Tunes (cont’d)
      • Court went on to add: “The day undoubtedly will come when burden arguments based on a large organization’s lack of internal ediscovery software will be received about as well as the contention that a party should be spared from retrieving paper documents because it had filed them sequentially, but in no apparent groupings, in an effort to avoid the added expense of file folders or indices.”
  • 15. Problem 3: Innovative Thinking
    The records management world of tomorrow….
    References
      • Background Law Review Referencing Autocategorization & Advanced Search: J. Baron, “Law in the Age of Exabytes: Some Further Thoughts on ‘Information Inflation’ and Current Issues in E-Discovery Search,” 17 Richmond J. Law & Technology (2011), see http://law.richmond.edu
      • Latest “Predictive Coding” Case Law to follow in blogs online:
      • Da Silva Moore v. Publicis Groupe & MSL Group, 11 Civ. 1279 (S.D.N.Y.) (Peck, M.J.) (Opinion dated Feb. 24, 2012)
      • Kleen Products, LLC v. Packaging Corp. of America, 10 C 5711 (N.D. Ill.) (Nolan, M.J.)
  • 16. Jason R. Baron, Director of Litigation, Office of General Counsel, National Archives and Records Administration, (301) 837-1499, Email: jason.baron@nara.gov
    Dave Lewis, Ph.D., David D. Lewis Consulting, LLC, Chicago, IL, Email: consult@DavidDLewis.com, http://www.DavidDLewis.com