Image recognition was probably one of the hottest topics of 2014, with announcements such as the launch of the Amazon Firefly app and millions in VC funding and M&A activity in this space. Image recognition has the potential to become ubiquitous in our day-to-day interactions with real-world objects that are connected to the digital world.
This talk is divided into four parts. First, it will cover basic aspects of the technology: the different approaches, the types of objects that can be recognized, and the limitations of each technique, shown through demonstrations. Second, the audience will be guided through the steps required to embed an image recognition solution into an app or service. Third, a number of vendor solutions will be described to give hands-on pointers to those who want to start integrating them. Finally, the talk will discuss the future of image recognition in different fields.
You can watch the video of the presentation here: https://www.youtube.com/watch?v=ilbTvfchtQY
2. The visual recognition market is estimated to grow from $9.65 billion in 2014 to $25.65 billion by 2019 (according to Image Recognition Market, MarketsandMarkets, May 2014).
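Those forecast figures imply a compound annual growth rate of roughly 22% per year. A quick check (the dollar figures come from the slide; the arithmetic is the standard CAGR formula):

```python
# Implied compound annual growth rate (CAGR) for the cited market forecast.
start, end, years = 9.65, 25.65, 5  # $B in 2014 -> $B in 2019

cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # about 21.6% per year
```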
12-13. Choose the IR mode that fits best

                          Cloud Service          On-Device SDK
  IR requires Internet    Yes                    No
  IR speed                Depends on network     Controlled
  Content updates         Immediate              Require local sync
  Analytics               Latest available       Rely on app connection
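The trade-offs in the comparison above can be collapsed into a simple decision rule. A minimal sketch — the criteria names here are our own shorthand for the rows of the table, not any vendor's API:

```python
def choose_ir_mode(works_offline: bool, instant_content_updates: bool) -> str:
    """Pick an image-recognition mode based on the cloud vs. on-device trade-offs.

    On-device SDK: no Internet needed and controlled speed, but content
    requires a local sync. Cloud service: needs a connection, but content
    updates and analytics are immediate.
    """
    if works_offline:
        return "on-device SDK"   # cloud IR cannot run without a connection
    if instant_content_updates:
        return "cloud service"   # on-device content requires a local sync
    return "either (compare recognition speed on your target network)"

print(choose_ir_mode(works_offline=True, instant_content_updates=False))
```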
14. Outline
What works with image recognition
How to put image recognition into your app
Vendor comparison
Trends
23. Takeaways
1. Image recognition is the door to a broad range of applications and services.
2. Improve performance with better image databases.
3. Choose on-device or cloud IR depending on your use case.
4. Catchoom is already behind 420M interactions and is looking to meet upcoming trends.
27. Challenges with benchmarks
Label a database with both reference and test images
Identify infrastructure differences
Understand that performance is not necessarily optimized for your use case
28. How to benchmark
Small dataset: 1. Contact the vendor.
Full test: 1. Contact the vendor. 2. Label your database. 3. Use the APIs.
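The full-test procedure can be sketched as a small harness: given a labelled set of query images and any `recognize()` callable, report top-1 accuracy. This is a sketch with a stand-in stub; in a real benchmark the stub would be replaced by calls to the vendor's API, and the image names and labels here are hypothetical:

```python
from typing import Callable, Dict

def benchmark(labels: Dict[str, str], recognize: Callable[[str], str]) -> float:
    """Run every labelled query image through `recognize` and return
    the fraction of correct top-1 matches."""
    if not labels:
        raise ValueError("labelled dataset is empty")
    correct = sum(1 for img, truth in labels.items() if recognize(img) == truth)
    return correct / len(labels)

# Stand-in recognizer: a real run would call the vendor's API here.
fake_db = {"query_01.jpg": "book_cover_42", "query_02.jpg": "poster_07"}
stub = lambda img: fake_db.get(img, "no_match")

labels = {"query_01.jpg": "book_cover_42",
          "query_02.jpg": "poster_07",
          "query_03.jpg": "book_cover_42"}
print(f"top-1 accuracy: {benchmark(labels, stub):.2f}")  # 2 of 3 correct
```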
Editor's notes
The visual recognition market is growing extremely quickly.
The two main reasons for this growth are kind of obvious:
There is a big proliferation of images on the Internet, and
there has also been a big expansion in the use of mobile for searching and purchasing.
In December 1975, Steve Sasson at Kodak invented the digital camera. Ever since we have been able to process images and video digitally, we have been developing visual recognition, trying to make machines understand the environment.
Visual Recognition at large is a field of activity that has many branches. It is important to know that each one uses different computer vision approaches and there is not yet one ring to rule them all.
The most prominent branches are Image Recognition, Face Recognition, Object classification, and Optical Character Recognition, and each one has a different level of maturity.
Image Recognition enables a fast search for images in a database to match an image taken by a smartphone or tablet. The image match pulls up related content, and users can interact, shop or rate products.
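At its core, that database search is nearest-neighbour matching over image fingerprints: each reference image is reduced to a feature vector, and a query is matched to the closest one. A toy stdlib-only sketch (real IR engines use local feature descriptors, not the tiny made-up fingerprints used here):

```python
import math

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match(query, database, threshold=0.5):
    """Return the database entry closest to the query fingerprint,
    or None if nothing is close enough (i.e. no match in the database)."""
    best_name, best_dist = None, float("inf")
    for name, fingerprint in database.items():
        d = distance(query, fingerprint)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist <= threshold else None

# Toy fingerprints (e.g. tiny grayscale histograms) for two reference images.
db = {"book_cover": [0.9, 0.1, 0.3, 0.2],
      "poster":     [0.1, 0.8, 0.7, 0.4]}

print(match([0.85, 0.15, 0.3, 0.25], db))  # close to "book_cover"
print(match([0.0, 0.0, 0.0, 0.0], db))     # nothing close -> None
```

The threshold is what separates "recognized" from "not in the database" — tuning it is exactly the benchmarking problem discussed later in the deck.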
Face Recognition is basically the same but instead of comparing with images or any object, it focuses on faces. Most face recognition solutions work by training a system with very large databases of images of faces previously labelled. The main use case is security or photo album organization.
Object classification is a bit different in scope. Instead of searching for a very specific match in a database, it tries to understand the elements present in a picture. This is the closest to what a kid does: this is a chair, this is a dog, or more complex descriptions like "this is a steam train below the Swiss Matterhorn". The use case is simple: Google.
Optical Character Recognition identifies letters and numbers in an image. It is used, for instance, in digitizing ancient books.
In this tutorial, I’ll talk about Image Recognition and give you an overview of the technology, guidelines to build apps and services, and trends that we see in the market.
Why am I talking about IR in an AR conf?
Image Recognition is the door to most AR interactions in the world.
Via Image Recognition, a machine can tell what the user is seeing through her camera. If we know that, we can provide limitless options connected to the digital world.
For instance, we can augment the environment with an immersive experience that helps the user make a better decision.
Computer Vision tries to understand what is there and what is happening in the world via images and videos.
Let’s take a look at the world with the eyes of a machine and try to see what will make us suffer.
In the first row, you can find samples of objects that differ in the amount of visual pattern available for recognition.
In the second row, you see two sorts of objects that differ a lot in how many different samples of the very same object can exist.
It is important to set the expectations right with respect to the kinds of objects that I showed before and the technology that is available.
If an object has a lot of texture, it has a higher probability of being more distinguishable within a large set of images, for instance, book covers.
It does not work so well when two hundred objects have no distinguishing pattern and are all the same shade of grey.
On the other hand, if the goal is to say “this is a blue shirt”, object classification works smoothly.
If an object is deformable, we could create a database with tons of samples, but that becomes unmanageable if you want to do it for a hundred thousand objects.
On the other hand, you can still train a classification system with many samples of that object in different deformations.
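To make the texture point concrete, here is a minimal sketch (not Catchoom's engine, and greatly simplified) of how feature-based matching works: compact binary descriptors extracted around distinctive points are compared by Hamming distance. A textured object like a book cover yields many distinctive descriptors; a plain grey object yields few, so its matches are ambiguous. All descriptor values below are made up for illustration.

```python
# Toy illustration of descriptor matching: why texture helps recognition.

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")

def match_descriptors(query, reference, max_distance=10):
    """Pair each query descriptor with its nearest reference descriptor,
    keeping only pairs that are close enough to be considered a match."""
    matches = []
    for q in query:
        best = min(reference, key=lambda r: hamming(q, r))
        if hamming(q, best) <= max_distance:
            matches.append((q, best))
    return matches

# Descriptors from a textured reference image (e.g. a book cover)...
reference = [0b10110010011, 0b01001101100, 0b11100010101]
# ...and from a user's query photo of the same object:
query = [0b10110010001, 0b01001101110]

print(len(match_descriptors(query, reference)))  # number of matches found
```

A real engine extracts hundreds of such descriptors per image; the more distinctive texture an object has, the more unambiguous matches survive, which is why textureless grey boxes are hard.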
What happens if an object is transparent?
Let me tell you a story: when Logitech launched a mouse that could work over glass surfaces a few years ago... well, rumor has it that on the day of the demo they had to scratch the glass to make it work. The reason was that the sensor needed to "see" dirt and scratches to translate them into motion.
As another example, time-of-flight cameras like Kinect see through glass, or in other words, they do not see the glass in front of them.
These examples showcase the challenge that glass poses to any kind of sensing.
-------
I’ve been restrictive here; for instance, Catchoom’s IR engine works with deformable objects, as long as they are textured. Textureless objects are possible, but it depends on the size of the database and how close two objects can be.
In this second part, I’ll cover fundamental aspects of project development and discuss the pieces necessary to deploy an app that includes Image Recognition.
There are three elements you need to build an Image Recognition app.
The base of the pyramid is the image database. This is something that is often overlooked at the beginning. Sometimes we find customers that only consider the collection of images that trigger experiences after they’ve spent resources on building the app. We suggest spending as much time as possible on the reference images to get the best experience for your users.
The second piece is the technology component. There are many options here and I’ll give you some pointers in a minute.
And last but not least, Content is always king. Make sure your app is valuable to your users. Image recognition is impressive, but even more impressive is when users want to repeat and come back to your app.
Imagine you prepared your database with any of the images below. Then you try to recognize that logo with a query image like the one on top.
For different reference images, you’ll get very different results.
The message here is to devote time to the image database. Typically, you’ll learn what works and what doesn’t, but it is good to chat with us to know what will work and what may be an issue.
One of our customers augments tattoos. You definitely want to get it right before tattooing your skin.
On-device IR makes sense especially in cases where it is preferable to offload the server infrastructure and provide quick responses to users. This is the case for second-screen environments where the user gets content or offers in sync with a TV show.
Cloud IR on the other hand is very well suited for magazines or any content that is frequently updated and has a rather uniform traffic.
Let’s compare both at the feature level.
While on-device looks technically more appealing, it has some limitations when it comes to enabling common business interests like content updates or analytics.
In general, you will achieve the same results with both, so it depends on the use case or even your business model.
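The trade-off above can be sketched as a simple decision helper. This is only an illustrative heuristic based on the criteria discussed in the talk; the capacity threshold and the criteria names are my own assumptions, not a vendor recommendation.

```python
# Hedged sketch: choosing between on-device and cloud image recognition
# based on the trade-offs discussed above. Thresholds are illustrative.

def choose_deployment(needs_offline: bool,
                      frequent_content_updates: bool,
                      needs_server_analytics: bool,
                      database_size: int,
                      on_device_capacity: int = 10_000) -> str:
    # Cloud is the natural fit when content changes often (magazines),
    # when central analytics matter, or when the database outgrows
    # what fits on a smartphone.
    if frequent_content_updates or needs_server_analytics:
        return "cloud"
    if database_size > on_device_capacity:
        return "cloud"
    # Otherwise on-device gives instant responses and works offline,
    # e.g. second-screen apps synced with a TV show.
    return "on-device"

print(choose_deployment(True, False, False, 500))   # → on-device
print(choose_deployment(False, True, False, 500))   # → cloud
```

In practice you would weigh these factors against your business model rather than apply a hard rule, which is exactly the point of the slide.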
I’ll give you an overview of the vendors in the AR and outside the AR space that can help you with that.
In this list we have AR-vendors.
AR vendors offer IR that is used to trigger AR experiences at scale. In other words, they allow you to search through larger databases than would fit into a smartphone by relying on the cloud.
The disadvantage of most AR vendors who offer cloud IR is that they’re designed only for AR and are not that flexible when used for non-AR use cases.
Also, for augmented reality it is now commonly known that patterns need well-spread texture. Image recognition is not as demanding, but benefits from curation.
In this list, we have vendors that offer the core service, independently of how you want to use it whether it is to render an AR experience, compare products or anything you’d like to do.
The table shows one additional column, “On Premises”. Instead of a SaaS, some vendors, including Catchoom, license the core server technology to let others build entire platforms. For example, Times of India, the largest publisher in India, runs Catchoom inside its own servers, as do other AR browsers.
As you can see from this and the previous slide, Catchoom is the only vendor that offers solutions in both spaces, AR and IR, and also has the full set of options.
But the real reason why I like Catchoom is that we have a unique combination of ingredients in our magic sauce.
First, our image recognition tests are performed using pictures snapped by users in real world environments – so our technology knows how to handle difficult angles, blurry images, low light conditions and reflections.
Second, our passion for seamless interactions. Catchoom was built to give users an easy, seamless image recognition experience – with no knowledge of the technology required. They just keep snapping photos like they always do.
Third, the results speak for themselves. An independent benchmark study using images taken by real users rated Catchoom 20% higher on image recognition than our competitors. We also ensure a response within half a second regardless of your location thanks to our servers in the US and EU.
And last, you can build entire platforms. Whether you use our service or an on-premises installation, our image recognition software is designed to deliver outstanding performance regardless of the traffic or size of your database. From hundreds of requests per second, to millions of images, we’ve engineered our software to be prepared.
Catchoom is, in fact, already one of the most used IR engines.
Even though you may not have heard of the brand Catchoom, our solution is already behind 420 million image recognitions globally.
And now I’m getting to the last part of the talk to discuss some of the trends that we see in this space.
There are a number of businesses with a long list of products that have a head and a long tail of popularity. This is typically the case for eCommerce sites.
What we see is an increasing demand to search on-device on a subset of images and if there is no match, continue with a cloud request.
We have patented technology to support these kinds of environments without cutting any corners on performance.
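The hybrid pattern just described can be sketched as follows. This is a toy illustration of the general idea, not Catchoom's patented approach: a small on-device index holds the "head" images, and only a miss triggers a cloud request for the long tail. The index contents and the cloud stub are made up.

```python
# Hedged sketch: on-device search for popular items, cloud fallback
# for the long tail. The catalogue data here is illustrative.

def hybrid_search(image_id, local_index, cloud_lookup):
    """Return (result, source): an on-device hit if possible, else cloud."""
    if image_id in local_index:
        return local_index[image_id], "device"   # instant, offline-capable
    return cloud_lookup(image_id), "cloud"       # long-tail network request

# Popular "head" products cached on the device:
local_index = {"shoe-001": "Runner X", "bag-042": "Tote Classic"}

# Stand-in for a network call against the full catalogue:
def cloud_lookup(image_id):
    full_catalogue = {"shoe-777": "Limited Edition Y"}
    return full_catalogue.get(image_id)

print(hybrid_search("shoe-001", local_index, cloud_lookup))  # device hit
print(hybrid_search("shoe-777", local_index, cloud_lookup))  # cloud fallback
```

The appeal is that the common case never touches the network, while the long tail still resolves, so users get fast responses without capping the database size.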
Imagine you’re a technician that has to repair a very specific part in a Star Destroyer.
How can you search through the whole catalogue of parts in a fraction of a second just by scanning that part?
This is another research line that Catchoom is working on right now.
Fashion is one of the most exciting sectors for image recognition.
Being able to recognize a pair of shoes, a handbag or a complete look is in the mindset of thousands of fashionistas around the world.
Catchoom is investing in recent advances in the field of computer vision using a technique that is called deep learning. Deep learning allows neural networks to learn the visual properties of certain objects and be able to classify them with very high precision.
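To give a flavour of the classification step, here is a minimal sketch of how a neural network's raw output scores (logits) become class probabilities via a softmax, the usual final layer of such classifiers. The class names and score values are made up for illustration and are not Catchoom's model.

```python
# Minimal sketch: turning a classifier's raw scores into probabilities.
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

classes = ["shoe", "handbag", "dress"]
logits = [2.0, 0.5, 0.1]          # hypothetical network output for one photo
probs = softmax(logits)
best = classes[probs.index(max(probs))]
print(best)  # → shoe
```

A deep network learns the mapping from raw pixels to those scores; this last step is simply how the scores are read out as "this is a shoe with high confidence".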
-----
Those three are the main trends that we see in the IR space, and Catchoom’s Labs are heavily investing in building the technology that will make them possible in the near future.
1. Image recognition is the door to a broad range of applications and services in a fast growing market.
2. You can significantly improve the performance with better image databases.
3. Choose on-device or cloud depending on your technical and business needs.
4. Catchoom is already behind 420M interactions and is working on the current trends to meet them in the near future.
Please visit our booth in the next couple of days for live demos.
Thank you very much for your time!
There are a number of challenges when trying to compare the performance of image recognition vendors.
1. How many of you have around 100,000 images on both sides of the equation, references and test images?
That’s probably the scale you need before extrapolating to 1M images.
2. Is the infrastructure showing the real experience that your users will have?
Let me give you an example, Catchoom has servers in the US and in EU that allow apps to connect to the closest server wherever you are in the world. Is your app global, or simply your customer is in another continent? Take that into account.
3. Performance is not necessarily optimized for a specific use case. So the question is: does that vendor actually perform well or poorly for your case?
Most vendors provide the same experience to all customers because they cannot fine-tune parameters, but rather offer performance that is on average good for a large variety of cases.
If you use 100,000 images, you probably have multiple use cases represented, but if you just have a few, you may not capture the full performance of that solution.
You’re probably in one of two situations:
Situation #1: you have a customer with very few images and you just want it to work like a charm.
Situation #2: you’re building a self-served service, where your customers or partners will upload images without any supervision.
In both cases, my suggestion is to contact the vendor to know exactly what is possible and what is not, and whether some tweaks here and there can significantly improve the results.
For instance, at Catchoom, we look at particular cases in your results to try to identify improvements, or simply different profiles of the internal parameters that can be tuned for your case.
But the reality is that unless you have an On Premises license, you won’t be able to fine-tune any parameter, as all cloud service providers offer the same performance across all customers.