
Improving Data Capture Accuracy

  • 1. Why is it even important?
      • Protect your investment
      • Get the greatest possible ROI
      • Ready for future demands
      • Positioned for scaling
      • Only when you have mastered your solution today should you, or can you, advance to the next
  • 2. FIRST! Know your documents
      • The most common question asked of all vendors: “How accurate are you?” (the unanswerable question)
      • It’s not as simple as you think:
          • Type of text / handprint
          • Full page / data capture
          • Fixed / semi-structured
          • Business process associated with recognition
          • Level of variation
          • Likelihood of new document types
      • What are you willing to give up in exchange for accuracy?
  • 3. Form Design
      • Corner stones
      • No underlined fields
      • Ideal field types
      • Proper field spacing
      • Favor simplicity
      • Numbers
      • Separated fields, e.g. a phone number consists of 3 separate fields
  • 4. Paper Document Prep
      • Bad prep, bad!
      • Folds
      • Scan batch size
      • Paper type
      • Tacky paper
  • 5. Scanning
      • Resolution
      • Bit depth
          • Black and white: size and accuracy
          • Grayscale: size, accuracy, repurposing
          • Color: accuracy, repurposing, and future demands
      • Scanning feed path
      • [Figure: resolution scale of 72, 96, 150, 200, 300, and 600 dpi, labeled “Greater Accuracy”]
  • 6. Digital Document Prep
      • Some image clean-up is good for OCR, some isn’t
      • Relative to the document type
      • Border removal
      • Deskew
      • Despeckling
      • Background removal
      • Character regeneration
      • Thresholding
      • Dropout
  • 7. Recognition Setup
      • Pushing and pulling levers
          • Settings for your document type
          • Settings for performance generally decrease accuracy
      • Fine-tuning
          • Samples, samples, samples
      • Does your workflow route exceptions to the right place?
      • Determine mile markers: realistic, goal, optimum
  • 8. Final Thoughts
      • The opportunity to improve the accuracy of your current installation is very high
      • The opportunity to utilize more of the technology you already own is very high
      • Other business units
      • Don’t play the blame game; it’s a waste of time
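
The size side of the Scanning slide's "size and accuracy" trade-off can be made concrete with a little arithmetic. The sketch below estimates the uncompressed size of one scanned page from resolution and bit depth. The function name, the US Letter page dimensions (8.5 x 11 in), and the bit depths (1 for black and white, 8 for grayscale, 24 for color) are illustrative assumptions, not values from the slides. Row padding and compression are ignored, so real files will differ and are usually much smaller.

```python
def raw_page_bytes(dpi: int, bits_per_pixel: int,
                   width_in: float = 8.5, height_in: float = 11.0) -> int:
    """Estimate the uncompressed size in bytes of one scanned page.

    Assumes a US Letter page by default (hypothetical choice) and
    ignores row padding, headers, and compression.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Assumed bit depths for the three modes named on the slide.
MODES = {"black and white": 1, "grayscale": 8, "color": 24}

if __name__ == "__main__":
    for dpi in (72, 96, 150, 200, 300, 600):
        sizes = ", ".join(
            f"{name}: {raw_page_bytes(dpi, bits) / 1_000_000:.2f} MB"
            for name, bits in MODES.items()
        )
        print(f"{dpi:>3} dpi  {sizes}")
```

Doubling the resolution quadruples the pixel count, so each step up the dpi scale costs storage and processing time well out of proportion to the step itself.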
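
For the "Samples, samples, samples" and mile-marker points on the Recognition Setup slide, one common way to put a number on accuracy is character-level accuracy against hand-keyed ground truth, computed from the Levenshtein edit distance. This is a generic metric swapped in for illustration, not necessarily the one the author has in mind; the function names and sample strings below are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_accuracy(recognized: str, truth: str) -> float:
    """Character accuracy of one field, in the range 0.0 to 1.0."""
    if not truth:
        return 1.0 if not recognized else 0.0
    return max(0.0, 1.0 - edit_distance(recognized, truth) / len(truth))


def sample_accuracy(pairs: list[tuple[str, str]]) -> float:
    """Character accuracy over a whole sample set of (recognized, truth)."""
    total = sum(len(truth) for _, truth in pairs)
    errors = sum(edit_distance(rec, truth) for rec, truth in pairs)
    return max(0.0, 1.0 - errors / total) if total else 1.0


if __name__ == "__main__":
    # Hypothetical sample: OCR output paired with hand-keyed truth.
    sample = [("1NVOICE", "INVOICE"), ("555-0100", "555-0100")]
    print(f"sample accuracy: {sample_accuracy(sample):.3f}")
```

Running the same sample set after each settings change gives a repeatable number to compare against the realistic, goal, and optimum mile markers.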