Augustin Marty, CEO @deepomatic, discussed computer vision's progress thanks to deep learning at the 2016 Hello Tomorrow Summit. He puts forward a solution to tackle the challenges in computer vision, making AI available to every company. Learn more at www.deepomatic.com
Tackling Challenges in Computer Vision
1. In the past few years we have been witnessing incredible
progress in the field of computer vision, mainly due to deep
learning.
Tackling challenges in
computer vision
Augustin Marty
CEO Deepomatic
2. Deep learning has changed a great deal in how image-related questions are solved.
You need to feed your model with examples: give it thousands of images so that it learns to differentiate.
ImageNet: the iconic challenge in the world of computer vision, with thousands of categories. Algorithms must place every image into the correct category.
Democratisation of image recognition: the error rate has dropped from 26% to 3% today. The error rate of a human is 5%.
DEEP LEARNING
IS THE NEW PARADIGM
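As a rough illustration of how an error rate like the one quoted above can be measured, here is a minimal Python sketch. It assumes the figures are top-5 error rates (the talk does not name the metric), and the scores and labels below are made up for the example.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of images whose true label is not among the k highest-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the first k.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Toy data (hypothetical): 3 images, 6 classes, one score per class.
scores = [
    [0.05, 0.60, 0.15, 0.10, 0.07, 0.03],
    [0.40, 0.04, 0.10, 0.25, 0.15, 0.06],
    [0.02, 0.08, 0.30, 0.35, 0.20, 0.05],
]
labels = [1, 3, 5]  # true class index of each image

print("top-1 error:", top_k_error(scores, labels, k=1))
print("top-5 error:", top_k_error(scores, labels, k=5))
```

A prediction counts as correct under this metric as soon as the true category appears anywhere in the model's k best guesses, which is why the top-5 error is never higher than the top-1 error.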
3. Progress in deep learning was also made possible because the tech giants started investing massively in it, developing bigger models (with more layers and millions of parameters).
This is a heavy process
GoogLeNet (2014)
22 LAYERS
ResNet (2015)
152 LAYERS
4. Bigger and more complex models and algorithms need great computing power.
Here NVIDIA’s CEO, Jen-Hsun Huang, is presenting an OpenAI supercomputer to Elon Musk.
This is a relatively small box, lined with GPUs that process calculations amazingly fast.
This type of supercomputer is now affordable and accessible, especially since it is available in the Cloud.
This progress and computing power have led to interesting applications in image recognition:
OPEN-AI SUPERCOMPUTER
5. Google Show and Tell: Google’s image captioning model.
The model is able to describe the scene with a collection of
verbs and adjectives, as a human does.
Google’s Show and Tell
6. Style transfer is another example:
Deep learning algorithms can capture the style of a
painting and reproduce it.
Here the algorithm captures the style of Van Gogh’s expressionist painting and,
given a picture of houses along a river,
reproduces that style, creating a new picture.
Style
Transfer
7. Video Colouration:
Artificial intelligence colours the video, turning a black and
white video into a coloured one.
How was this achieved?
1. Engineers took thousands of coloured videos and made
them black and white.
2. They then trained the algorithm to learn the
correlation between the B&W videos and the respective
coloured ones.
3. The algorithm was then able to colour new videos that
were originally B&W.
Video
Colouration
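The first step above, building training pairs by removing the colour from existing footage, can be sketched as follows. This is a minimal illustration, not the engineers' actual pipeline: frames are assumed to be nested lists of (r, g, b) tuples, and the grey level uses the common BT.601 luma weights, which is a choice made for this example.

```python
def to_grayscale(frame):
    """Convert one RGB frame (rows of (r, g, b) tuples) to grey levels 0-255."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in frame
    ]


def make_training_pairs(colour_frames):
    """Pair each black-and-white input with its original coloured frame as the target."""
    return [(to_grayscale(frame), frame) for frame in colour_frames]


# One tiny 1x3 frame (hypothetical data): a red, a white and a black pixel.
frames = [[[(255, 0, 0), (255, 255, 255), (0, 0, 0)]]]
pairs = make_training_pairs(frames)
bw_input, colour_target = pairs[0]
print(bw_input)
print(colour_target)
```

The point of the trick is that the labels come for free: every coloured video already contains the answer the model must learn to predict from its black-and-white version.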
8. For industry-specific problems, those models don’t necessarily work. The following 3 images are taken from Microsoft’s online image recognition platform.
Here an automotive part is mistaken for a “close up of a plane”: not quite!
“Close up of a plane”
9. A terrorist is labelled as “man looking at the ocean”.
Of course there's an ocean and a man, but first of all he’s
clearly looking away from the ocean. The machine doesn’t
recognise the dark knife in his hands, and can’t properly
identify the face because of the mask.
“Man looking at the ocean”
10. “A group of men standing on a dirt field”:
it’s not wrong: you have a group of men and yes, it’s a dirt road.
But we humans understand the context of this picture better than the algorithm: this group of men are fighters, they have weapons, and they are probably engaging in an act of war.
All this goes to show that progress is undeniable but there is still much to do.
When you want to apply these technologies to your industry- or company-specific challenges, they might not work. You may therefore think that it’s not for you, that AI and computer vision don’t solve your need.
But…
“Group of men standing on top of a dirt field”
11. Artificial Intelligence is for everyone, for every company.
The following examples show industry-specific problems
that were solved thanks to computer vision.
AI IS FOR EVERY COMPANY
12. SADAKO TECHNOLOGIES, a Spanish firm, has developed a waste sorting device by combining robotics and computer vision.
It is therefore able to automatically distinguish plastic from other waste on a conveyor belt.
This has very promising applications for future waste sorting and management systems, and for other cleaning applications.
CASE STUDY: SADAKO
Waste Sorting
13. Regaind is a startup that is able to assess image
aesthetics. Selecting the best pictures (from among thousands of
pictures taken during a vacation), it creates photo albums
automatically by analysing the quality of each picture.
CASE STUDY: REGAIND
Image Aesthetics
14. Coming back to the image of the fighters:
it’s possible, with today’s technology, to develop weapon
detection for images and videos. This is of great use and
importance to the military and intelligence services.
CASE STUDY: SECURITY
Weapon Detection
15. The three previous applications have been developed by
small companies. If they were able to do this, then so can
you, if you follow the right methodology.
There is a secret sauce to tackle image recognition
challenges specific to your industry:
THE SECRET AI SAUCE
TO SOLVE
YOUR PROBLEMS
16. First, you need a deep learning framework.
These are readily available because they are open source. You just
need an engineer to use them.
A framework
17. Second ingredient: annotated images.
These must be relevant to the task you are tackling.
The images differ for each use case and problem, so you need to
develop your own dataset.
ANNOTATED DATA
19. Assemble the 3 ingredients:
You train an algorithm using your dataset and
framework.
Your first algorithm is applied to never-before-seen images
(it is never perfect at first).
Your algorithm won’t give an answer on all the new images
provided: you need humans in the loop to annotate
those it could not answer (when the machine lacked confidence),
completing the task.
AI + annotators, who complete the job and also keep
building the dataset, creating a better algorithm …
[Diagram: a loop linking Dataset, Neural network models and Humans in the loop, with steps labelled Training, Annotation and Calling humans when the model is not sure]
THE LOOP
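One round of this loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the 0.8 confidence threshold, the `(image_id, label, confidence)` tuple layout and the `ask_human` callback are all hypothetical names chosen for the example, not Deepomatic's implementation.

```python
def route_predictions(predictions, threshold=0.8):
    """Split predictions into auto-accepted answers and images sent to annotators."""
    accepted, needs_human = [], []
    for image_id, label, confidence in predictions:
        if confidence >= threshold:
            accepted.append((image_id, label))
        else:
            needs_human.append(image_id)  # the model is not sure: call a human
    return accepted, needs_human


def human_in_the_loop_round(predictions, ask_human, dataset, threshold=0.8):
    """Answer every image, and add each human annotation to the training dataset."""
    accepted, needs_human = route_predictions(predictions, threshold)
    results = dict(accepted)
    for image_id in needs_human:
        label = ask_human(image_id)
        results[image_id] = label
        dataset.append((image_id, label))  # human answers keep building the dataset
    return results


# Hypothetical example: the model is confident on "a" but not on "b".
dataset = []
predictions = [("a", "bottle", 0.95), ("b", "bottle", 0.40)]
results = human_in_the_loop_round(predictions, lambda image_id: "can", dataset)
print(results)
print(dataset)
```

Every image still gets an answer, and each human correction becomes a new training example, so retraining on the grown dataset should reduce how often humans need to be called.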
20. This may all seem relatively simple, but there’s a catch: for it to work well you need a very good dataset, meaning a
huge number of perfectly annotated images.
Creating these datasets takes lots of time.
BUT…
THERE’S A CATCH
21. Here’s an example of furniture detection.
To develop an algorithm that detects furniture in images, you need a dataset with boxes around every single item in the image.
Consequently, you need to do this manually at first, making sure the boxes fit perfectly around each item and that no object is missed.
This takes 10 minutes for 1 image!
Sadako Technologies, mentioned earlier, needed to do this to train their technology: they put millions of boxes around plastic bottles to create their dataset.
FURNITURE
DETECTION
(10 min)
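To give an idea of what a single box annotation might look like, and of one common way to check how closely two boxes agree, here is a minimal sketch. The `Box` fields and the use of intersection over union (IoU) as the check are assumptions for illustration; the talk does not describe the actual annotation format used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """One annotated object: a label plus pixel coordinates of its bounding box."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self):
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


def iou(a, b):
    """Intersection over union: 1.0 for identical boxes, 0.0 for disjoint ones."""
    width = max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    height = max(0.0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    intersection = width * height
    union = a.area() + b.area() - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical example: a reference box and a sloppier box for the same chair.
reference = Box("chair", 0, 0, 10, 10)
sloppy = Box("chair", 5, 0, 15, 10)
print(iou(reference, sloppy))
```

A low IoU between an annotator's box and a reviewed reference is one signal that the box does not fit the item tightly enough.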
22. Some tasks are even more time-consuming. If you want to
develop algorithms for robotics (automated cars, robots,
drones, etc.), they need to understand their entire
environment.
So in this case, to train algorithms you need to determine
what each pixel in the image represents.
This segmentation task takes over an hour per image.
URBAN
SEGMENTATION
(70 min)
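To see how quickly these per-image times add up, here is a small back-of-the-envelope calculation. The 10 and 70 minute figures come from the slides; the dataset size of 10,000 images and the 8-hour working day are hypothetical values chosen only for illustration.

```python
def annotation_hours(num_images, minutes_per_image):
    """Total annotation time in hours for one pass over the dataset."""
    return num_images * minutes_per_image / 60


NUM_IMAGES = 10_000  # hypothetical dataset size, not a figure from the talk

for task, minutes in [("bounding boxes", 10), ("pixel-wise segmentation", 70)]:
    hours = annotation_hours(NUM_IMAGES, minutes)
    # Working days assume a single annotator and 8-hour days.
    print(f"{task}: {hours:,.0f} hours, about {hours / 8:,.0f} working days")
```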
24. Good dataset creation is crucial to speed up the pace of AI
progress
LACK OF DATA IS SLOWING DOWN AI EXPANSION
25. Today there are 2 ways of making datasets:
1) Have it done internally by data scientists.
2) Use crowdsourcing, such as Amazon’s Mechanical Turk. This isn’t too bad, but it is time-consuming and you need to do many quality reviews and checks to ensure satisfactory results.
Every data scientist has, at least once, thrown out a dataset due to its poor quality.
Solutions
AMAZON MECHANICAL TURK: Time consuming, poor quality
or
DO IT INTERNALLY: Make your data scientists want to quit
26. Industrialising the dataset creation process is the real solution for moving forward and solving image-related challenges for each company.
A few elements may help the industrialisation and democratisation of dataset creation:
INDUSTRIALISING THE
ANNOTATION PROCESS
27. We need dedicated software: today there is no software to produce datasets. It’s crazy to think that each company develops its own small tool.
A big leap in productivity can be achieved simply by improving the design and the annotation experience.
1. Improve the UX of annotation tools
28. The second element to increase the pace of AI production is to work on active learning.
You don’t want to annotate millions of images. Active learning is the science of selecting the most informative images, so that AI can be built with as few images as possible.
2. Active Learning & HITL
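One common way to pick "the most informative images" is uncertainty sampling: annotate first the images on which the current model is least sure. The sketch below is an illustration of that general idea, not necessarily the strategy the speaker has in mind; `predict_proba` and the image names are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a class-probability list; higher means less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_informative(unlabelled, predict_proba, budget):
    """Return the `budget` images with the most uncertain model predictions."""
    ranked = sorted(unlabelled, key=lambda item: entropy(predict_proba(item)), reverse=True)
    return ranked[:budget]


# Hypothetical model outputs for three unlabelled images (two classes).
model_outputs = {
    "img_a": [0.99, 0.01],  # very confident
    "img_b": [0.50, 0.50],  # completely unsure
    "img_c": [0.80, 0.20],  # somewhat unsure
}
to_annotate = select_most_informative(list(model_outputs), model_outputs.get, budget=2)
print(to_annotate)
```

With an annotation budget of two images, the confident prediction is skipped and human time goes to the two images the model hesitates on.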
29. Improve software with machine learning: if the software knows
what you’re doing, it can really improve the ease of the task.
3. Improve tools with AI
30. We intend to reduce the time from 70 to 5 minutes,
increasing productivity by more than 10x.
We intend to reduce the time from
70 to 5 minutes in 10 months
31. Here machine learning helps the software to annotate images.
This video shows that if you are looking for a face, the box
will automatically adjust around the head. The same goes
for annotating objects pixel-wise, speeding up the
completion of the task.
32. AI really is for everyone and can solve any company’s
challenges. Algorithms are becoming commodities and
datasets are the bottleneck of AI.
Democratising and industrialising the process of dataset
creation will allow all of us, all companies, to move
forward with our AI applications and goals.
THANK YOU
Editor’s notes
Third ingredient: annotators (humans in the loop).
The real bottleneck is now the dataset creation