© 2013 IBM Corporation
Real time video analytics with InfoSphere Streams, OpenCV and R
data2day conference 2014, Karlsruhe
Stephan Reimann – IT Specialist Big Data – stephan.reimann@de.ibm.com @stereimann
Wilfried Hoge – IT Architect Big Data – hoge@de.ibm.com @wilfriedhoge
© 2014 IBM Corporation
Motivation: Use machine data to make machines smarter
Modern machines produce an enormous amount of data
Use machine-generated data to
–make machines more efficient
–reduce downtimes with better maintenance management
–prevent failures
-> make machines smarter
Also use unstructured data such as video
Use that data in real time
The demo scenario: Imagine tunnel drilling equipment whose conveyor belt is continuously monitored by a video camera
What if you could detect a problem in real time and take appropriate action, such as stopping the machine to prevent damage?
Our demo analyzes the data from a single camera to keep it easy to understand; in a real-life scenario there are usually many structured and unstructured data sources, and they are most likely combined (e.g. analyzing the image data together with speed information)
And since we did not have one, we created one
Traditional approach
– Historical fact finding
– Analyze persisted data
– (Micro-) Batch philosophy
– PULL approach
Streaming analytics
– Analyze the current moment / the now
– Analyze data directly "in motion", without storing it
– Analyze data at the speed it is created
– PUSH approach
Streaming analytics is a paradigm shift from pull to push analytics in real time, directly "on the wire"; data does not need to be persisted
Traditional approach: Data -> Repository -> Analysis -> Insight
Streaming analytics: Data -> Analysis -> Insight
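The difference between the two approaches can be sketched in a few lines of plain Python. This is only an illustration: the sensor readings, the function `pull_analysis` and the class `RunningMean` are made up for this sketch and have nothing to do with the InfoSphere Streams API.

```python
from statistics import mean


def pull_analysis(repository):
    """Traditional (pull): the data is persisted first, then analyzed as a batch."""
    return mean(repository)


class RunningMean:
    """Streaming (push): every event updates the insight on arrival; no event is stored."""

    def __init__(self):
        self.count = 0
        self.value = 0.0

    def on_event(self, x):
        self.count += 1
        # Incremental mean update, needs only the previous mean and the count.
        self.value += (x - self.value) / self.count
        return self.value


readings = [4.0, 6.0, 5.0, 9.0]  # hypothetical sensor readings

# Pull: store everything, analyze afterwards.
batch_result = pull_analysis(readings)

# Push: an up-to-date result exists after every single event.
stream = RunningMean()
live_results = [stream.on_event(x) for x in readings]
```

Both variants end up with the same mean, but the push variant already has an answer after the first event and never keeps the raw data.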
We have used standard algorithms from OpenCV to extract the interesting part of the pictures by learning and removing the background
We are only interested in the objects that are on the conveyor belt, not in the conveyor belt
But we don't know which objects will pass by; there may be many different ones
One approach is to describe the background and filter it out, in other words: outlier analysis
We have used a standard algorithm (CodeBook) from OpenCV (open source image analytics library)
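The idea of "describe the background, then treat everything else as an outlier" can be sketched as follows. Note that this is not the CodeBook algorithm from OpenCV; it is a much simpler stand-in that learns one intensity range per pixel, written in plain Python on tiny made-up grayscale frames. The function names and the `tolerance` parameter are assumptions for this sketch.

```python
def learn_background(frames):
    """Learn per-pixel [min, max] intensity bounds from frames that show only background."""
    height, width = len(frames[0]), len(frames[0][0])
    low = [[min(f[y][x] for f in frames) for x in range(width)] for y in range(height)]
    high = [[max(f[y][x] for f in frames) for x in range(width)] for y in range(height)]
    return low, high


def foreground_mask(frame, model, tolerance=10):
    """Return a mask: 255 (white) where a pixel is an outlier, 0 (black) where it is background."""
    low, high = model
    return [
        [0 if low[y][x] - tolerance <= value <= high[y][x] + tolerance else 255
         for x, value in enumerate(row)]
        for y, row in enumerate(frame)
    ]


# Hypothetical 2x3 grayscale frames of the empty conveyor belt.
background_frames = [
    [[50, 52, 51], [49, 50, 53]],
    [[48, 55, 50], [51, 52, 50]],
]
model = learn_background(background_frames)

# A new frame in which a bright object covers two pixels of the first row.
frame = [[51, 200, 210], [50, 51, 52]]
mask = foreground_mask(frame, model)
```

The model never needs to know what the objects look like; anything that does not fit the learned background ends up white in the mask.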
The background removal included several steps, such as data preparation & cleansing and background detection & removal
Filter: Select the area of interest
Cleanse: Reduce the noise level
Analyze & Transform: Learn the background and create a mask: black=background, white=foreground
Cleanse: Reduce the noise level of the background detection
Transform: Combine the background detection with the original image; it's basically a logical AND
Just for visualization: Create the blue separator image
Publish: The Export operators provide the data to other streaming analytics applications (here: the visualization & the color analytics) via publish & subscribe
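The Filter, Cleanse and Transform steps above can be sketched in plain Python on a tiny made-up image. This is an illustration only, not the operators from the demo: the helper names, the 4-neighbour noise rule and the sample values are assumptions.

```python
def select_roi(image, top, left, height, width):
    """Filter: keep only the area of interest."""
    return [row[left:left + width] for row in image[top:top + height]]


def despeckle(mask):
    """Cleanse: drop white mask pixels that have no white 4-neighbour (isolated noise)."""
    h, w = len(mask), len(mask[0])

    def white(y, x):
        return 0 <= y < h and 0 <= x < w and mask[y][x] == 255

    return [
        [255 if mask[y][x] == 255
         and (white(y - 1, x) or white(y + 1, x) or white(y, x - 1) or white(y, x + 1))
         else 0
         for x in range(w)]
        for y in range(h)
    ]


def apply_mask(image, mask):
    """Transform: bitwise AND of each pixel with the mask (255 keeps the pixel, 0 blanks it)."""
    return [[pixel & m for pixel, m in zip(image_row, mask_row)]
            for image_row, mask_row in zip(image, mask)]


image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
]
roi = select_roi(image, top=0, left=1, height=2, width=3)

raw_mask = [[255, 255, 0], [0, 0, 255]]  # hypothetical background detection result
clean_mask = despeckle(raw_mask)
result = apply_mask(roi, clean_mask)
```

After the AND step only the pixels that belong to an object keep their original values; everything else is black.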
[Figure: overview of image and video analytics. Multimedia data is described by features (e.g. color, texture, shape, edges, motion, regions, tracks, spectrum, energy). Models (e.g. AdaBoost, k-means, regression, Bayes nets, nearest neighbor, neural nets, deep belief nets, GMM, Markov models, decision trees, SVMs, ensemble classifiers, active learning) are trained on labeled data (positive and negative examples) and unlabeled data. The result is semantics (e.g. scenes, locations, objects, people, faces, vehicles, animals, activities, events).]
One approach to image analytics is extracting features and using a variety of statistical/mathematical concepts to deduce the semantics
[Figure: taxonomy of visual features. Color: color histogram, color correlogram, color moments, dominant colors. Texture: Tamura texture, wavelet texture, color wavelet texture, local binary patterns. Edges/shape: edge histogram, shape moments, Fourier shape, Hough circle, Siftogram. Further features: interest points, curvelets, max-response filters, image type, image statistics, thumbnail image. Spatial granularities: global, grid, pyramid, layout, center, cross, horizontal and vertical parts.]
Typical image features used for analytics include color, shape, texture and many more; we have focused on color for the demo
We have calculated several color features and the object's area; now we can use them for calculations / analytics
Calculated attributes: area (in pixels), absolute color values, color histogram
The cool thing: Now you have attributes! It's structured! You can directly use it or combine it with other data sources, e.g. calculate conveyor belt throughput based on area and speed information.
Import: Receives the data from the background separation app via subscribe
Analytics: Calculates the color attributes and the area
Visualization: Writes the text and draws the color histogram
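How an image turns into structured attributes can be sketched like this in plain Python. The function name `color_features`, the number of histogram bins, the sample pixels and the throughput formula (area times belt speed) are assumptions for illustration; the demo itself may compute these values differently.

```python
def color_features(image, mask, bins=4):
    """Compute area, mean color and a histogram per channel for the foreground pixels."""
    pixels = [pixel
              for image_row, mask_row in zip(image, mask)
              for pixel, m in zip(image_row, mask_row) if m]
    area = len(pixels)
    features = {"area": area}
    for i, channel in enumerate(("red", "green", "blue")):
        values = [pixel[i] for pixel in pixels]
        features["mean_" + channel] = sum(values) / area if area else 0.0
        histogram = [0] * bins
        for v in values:
            histogram[v * bins // 256] += 1  # map 0..255 onto the bin index
        features["hist_" + channel] = histogram
    return features


# Hypothetical 2x2 RGB image and its foreground mask (255 = object).
image = [[(200, 10, 10), (0, 0, 0)],
         [(220, 30, 20), (210, 20, 30)]]
mask = [[255, 0], [255, 255]]
features = color_features(image, mask)

# Combining with another data source: an assumed, simplified throughput measure.
belt_speed = 0.5  # hypothetical speed reading from a sensor
throughput = features["area"] * belt_speed
```

From here on the image is just a row of numbers that can be filtered, joined and aggregated like any other structured data.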
We have "marked" (labeled) the structured data from the color analytics application and used it to train a model to detect object classes
Describing explicitly what is characteristic of an object class is difficult, if not impossible. Instead, we let the algorithm behind the model learn it from the numbers. The algorithm just needs the marked data (= the training data set). Marked data means that we provided the information about which object class was visible at which time.
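Marking the data by time can be sketched as a simple join between the feature rows and a list of labeled time intervals. All names and values here (`label_rows`, the `t` timestamp field, the classes "brick" and "stone") are hypothetical and only illustrate the idea.

```python
def label_rows(rows, intervals):
    """Attach the object class that was visible at each row's timestamp."""
    labeled = []
    for row in rows:
        for start, end, object_class in intervals:
            if start <= row["t"] < end:
                labeled.append({**row, "label": object_class})
                break
        # Rows that fall outside every interval are left out of the training set.
    return labeled


# Feature rows as they could arrive from the color analytics (made-up values).
rows = [
    {"t": 1.0, "mean_red": 210.0, "area": 120},
    {"t": 4.5, "mean_red": 35.0, "area": 300},
    {"t": 9.0, "mean_red": 90.0, "area": 50},
]

# Which object class was visible at which time: (start, end, class).
intervals = [(0.0, 3.0, "brick"), (3.0, 6.0, "stone")]

training_set = label_rows(rows, intervals)
```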
The model is created when the application is started based on the training data, and predicts the object class for each image in real time
We have used R (open source software for statistics and advanced analytics) to create the predictive model
The model is created when the streaming analytics application is started
Once the application is running, the score and the prediction are calculated for each individual image (in other words: the predictive model is applied); this is called scoring
In our demo the model is trained only once at startup and remains constant afterwards, but it is also possible to refresh models continuously or at certain intervals
Import: Receives the data from our analytics app via subscribe
Visualization: Visualizes the results
Visualization: Writes the prediction as text on the image
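The "train once at startup, then score every image" pattern can be sketched as follows. This is not the R model from the demo: as a stand-in it uses a very simple nearest-centroid classifier in plain Python, and the feature names, classes and values are made up.

```python
import math
from collections import defaultdict


def train(training_set, feature_names):
    """Train once at startup: one centroid (mean feature vector) per object class."""
    grouped = defaultdict(list)
    for row in training_set:
        grouped[row["label"]].append([row[name] for name in feature_names])
    return {label: [sum(column) / len(column) for column in zip(*vectors)]
            for label, vectors in grouped.items()}


def score(model, row, feature_names):
    """Score one image: return the predicted class and the distance to its centroid."""
    vector = [row[name] for name in feature_names]
    distances = {label: math.dist(vector, centroid) for label, centroid in model.items()}
    best = min(distances, key=distances.get)
    return best, distances[best]


FEATURES = ("mean_red", "mean_blue")
training_set = [
    {"mean_red": 200.0, "mean_blue": 20.0, "label": "red object"},
    {"mean_red": 220.0, "mean_blue": 40.0, "label": "red object"},
    {"mean_red": 30.0, "mean_blue": 210.0, "label": "blue object"},
    {"mean_red": 10.0, "mean_blue": 190.0, "label": "blue object"},
]

# Created once when the application starts.
model = train(training_set, FEATURES)

# Applied to every incoming image while the application is running.
incoming = [{"mean_red": 205.0, "mean_blue": 35.0},
            {"mean_red": 25.0, "mean_blue": 180.0}]
predictions = [score(model, row, FEATURES)[0] for row in incoming]
```

Refreshing the model continuously or at intervals would simply mean calling `train` again with newer marked data and swapping the result in.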
[Figure: the same overview of features, models and semantics as before, with the feature (Color) and the model (Decision Tree) used in the demo called out.]
The demo has shown image analytics based on one feature and one model; in reality a combination of several features & models is used
A freely available webcast from IBM Research provides further insights into image and video analytics and the theory behind them
IBM Analytics Education Series: Lecture 7 - Multimedia - Image and Video Analytics
We have used InfoSphere Streams for the real time analytics and have extended it with R and OpenCV for the implementation
R
–Open Source software for statistics and advanced analytics
–http://cran.r-project.org/
OpenCV
–Open Source computer vision and machine learning software library
–http://opencv.org/ & InfoSphere Streams OpenCV Toolkit on GitHub
InfoSphere Streams
–Software for real time analytics on any kind of Big Data
–Free Quickstart Edition: ibm.co/streamsqs
–Developer Community (Tutorials, Labs, Forum, ...): www.ibmdw.net/streamsdev/
–GitHub Community (Toolkits): github.com/IBMStreams
InfoSphere Streams is the result of an IBM research project, designed for high throughput and low latency, and to make streaming analytics easy
InfoSphere Streams capabilities
Scale out
–Millions of events per second
Complex data & analytics
–All kinds of data
–Complex analytics: Everything you can express via an algorithm
Low latency
–Analyzes data at the speed it is created
–Latencies down to μs
–Immediate action in real time
How it works
–Define apps as flow graphs consisting of sources (inputs), operators & sinks (outputs)
–Extend the functionality with your code if required for full flexibility
–The clustered, distributed runtime on commodity hardware scales almost without limit
–GUIs for rapid development and operations make streaming analytics easy
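The flow graph idea (sources, operators, sinks) can be sketched with Python generators. This is only a conceptual illustration in plain Python, not the InfoSphere Streams programming model; the operator names and the thresholds are made up.

```python
def source(values):
    """Source: emits tuples one at a time."""
    for v in values:
        yield {"value": v}


def filter_op(stream, predicate):
    """Operator: forwards only the tuples that match the predicate."""
    return (t for t in stream if predicate(t))


def map_op(stream, fn):
    """Operator: transforms each tuple; this is where your own code can be plugged in."""
    return (fn(t) for t in stream)


def sink(stream, out):
    """Sink: consumes the stream, here by collecting the tuples in a list."""
    for t in stream:
        out.append(t)


# Wire the graph: source -> filter -> map -> sink.
results = []
graph = map_op(
    filter_op(source([3, 12, 7, 20]), lambda t: t["value"] > 5),
    lambda t: {**t, "alert": t["value"] > 10},
)
sink(graph, results)
```

Each tuple travels through the whole graph before the next one is read, so nothing has to be stored in between.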
Telecommunication
Transport
Manufacturing
Security
Radio astronomy
Healthcare
Industrie 4.0
Energy & Utilities
Connected Car
... optimizes the traffic in Stockholm and Dublin
... analyzes acoustic signals to protect sensitive areas
... optimizes the quality of mobile networks
... is the foundation for real-time campaigns to increase customer satisfaction and revenues
... analyzes and selects images in real time within the world's largest radio telescope
... and is a core component within many innovation initiatives
Present / In production
Trends
Prototypes
InfoSphere Streams is already used in a broad range of real time analytics applications across industries
Where technology meets business potential: Start making sense of your data (in real time). It is possible!
Gain value from your data
There are many opportunities to gain value from data. Let's talk about how to make sense of your data!
http://www-05.ibm.com/de/events/workshop/bigdata/
Make maintenance more predictable to reduce downtimes
Detect error patterns to prevent failures
Better understand complex systems and their dependencies to improve efficiency