The Intersection of Robotics, Search and AI with Solr, MyRobotLab, and Deep Learning - Kevin Watters, KMW Technology
1. Robotics, Search and AI with
Solr, MyRobotLab, and Deep
Learning
Kevin Watters
Founder, KMW Technology
@kwatters76
#Activate18 #ActivateSearch
2. What are we going to show today?
• A life-sized, humanoid, 3D-printed, open source robot that can
learn from its surroundings and interact with humans in a
meaningful and natural manner.
• Teach a robot to recognize people by making an introduction.
• Just as humans meet and remember each other, so should
robots.
• Lower the barriers to human-robot interaction.
4. Introduction
• KMW Technology
• Boston Based Search Professional Services
• Founded in 2010
• Search consulting and contracting
• Solr
• Elastic
• Deeplearning4j
• NLP/NLU
• ETL & Custom Connectors
• www.kmwllc.com
5. How did I get here?
• Open source supporter, committer, contributor
• Enjoy teaching and sharing
• Maker Faire / Maker movement
• NYC / Bay Area / Denver / Boston / Paris
• EE from Northeastern University
• Passion for building and integrating.
• Search and AI passion.
7. Introducing Lloyd
• Started Construction 2014 ?
• MakerBot Replicator 2
• Powered By MyRobotLab
• 2 Arduinos
• 1 Raspberry Pi
• 2 cameras
• 25 servo motors (more to come.)
• Speech Recognition
• Speech Synthesis
• Memory
• And telepresence and remote operation with Oculus Rift support!
8. InMoov
• Open source, life-sized, 3D-printed humanoid robot
• Designed by Gael Langevin, Paris, France
• Started in 2012
• Inspired 3D-printed prosthetics projects
• Bionico / e-NABLE
• Approx. 500 exist around the world
• Gael believes in Open source and that it takes a world to raise a
robot.
• More Info at:
www.inmoov.fr
9. MyRobotLab
I for one welcome our new robotic
overlords!
• Started by Greg Perry, Portland, Oregon
• Java-based open source framework
• Hosted on GitHub
• Borg in technologies
• Over 100 open source projects integrated...
and counting!
• Pub / Sub service based architecture
• Scripting via Python / Jython
• Multi-platform
(Windows/Linux/Mac/Raspberry Pi)
• www.myrobotlab.org
10. How do we make this robot “smart”?
• How can it recognize people and interact with them?
• What were the challenges? Where do we want to take the robot?
• How do we get this robot to learn and understand?
• What tools exist out there to solve this problem?
• How can we wire it all together…
• We should be able to interact with the robot without a keyboard.
• We should be able to teach the robot new things.
• Let’s make it cognitive... and open source!
14. Speech Recognition
• Initially using CMU Sphinx
• Need to specify a fixed grammar
• Not very active project
• English only
• Offline
• Webkit Speech Recognition
• Built into Google Chrome
• Accessed via WebGui & JavaScript
• Supports ~ 60 languages & dialects
• Requires internet connection
16. Speech Synthesis
• MaryTTS
• Open source, supports a few different voices and languages
• No internet connection required
• Mycroft – AI Speech Synthesis
• Open source, limited number of voices
• No internet connection required
• LocalSpeech – invoke existing command line utilities
• FreeTTS
• Festival
• Natural Reader / Acapela Speech
• Good quality
• Requires internet connection
• Lots of voices for various languages
• Amazon Polly
• Requires account, small cost
• Good quality, requires internet connection
18. Natural Language Understanding
• Based on ProgramAB
• AIML 2.0 (XML based)
• Created by Dr. Richard S. Wallace in 1995. (Yeah, it’s old, but it
works.)
• Case-based reasoning
• Uses recursion to break down user utterances
• Special fork of the project on GitHub to support MyRobotLab-specific
“OOB” (Out-Of-Band) calls to MyRobotLab services or
external web services
• Pandorabots: create your own bot online
• Mitsuku is the current winner of the Loebner Prize and is AIML based.
19. ProgramAB extensions
• Maven Based
• Proper Logging
• 40x faster loading of large AIML sets
• SRAIX handler for extensibility to call out to external services.
• Localization / Locale support
• CJK support with Lucene Tokenization
• CJK Tokenizer (Chinese / Korean)
• Kuromoji Tokenizer (Japanese)
20. AIML Tags & Simple Example
“Category” is the basic unit that defines a response.
“Pattern” is the string that the utterance is matched against.
Patterns can contain wildcards or reference a “set” of items for brevity.
“Template” defines how to produce the response.
“That” is an optional tag that specifies additional matching criteria
based on the previous response from the robot. It is used to create
“multi-pass” conversations.
“Topic” is an optional tag that sets the precedence of matching.
Categories in the current topic are matched before the default topic.
Useful for talking about particular subjects in more detail without giving
generic responses.
Additional mappings for the utterances “Greetings” and “Hey”
recursively return the response for “Hi”.
Any utterance that starts with “HELLO” is mapped to return the response for “Hi”.
User : Hello Robot!
Bot: Hello User!
<category>
  <pattern>HI</pattern>
  <template>Hello user!</template>
</category>
<category>
  <pattern>GREETINGS</pattern>
  <template><srai>HI</srai></template>
</category>
<category>
  <pattern>HEY</pattern>
  <template><srai>HI</srai></template>
</category>
<category>
  <pattern>HELLO *</pattern>
  <template><srai>HI</srai></template>
</category>
21. Learn Tag Example
• AIML has a built-in ability to add new categories and responses dynamically.
• Use the wildcards to generate a template to teach the robot.
• Use the “think” tag so the robot only thinks it and doesn’t say it!
• Use the “learn” tag to add a new category that is filled out from the pattern.
• Use the “eval” tag to insert the values that matched the first and second * in the
pattern.
• A helper category matches any utterance starting with “What is” and returns the
response for whatever comes after the words “what is”.
• User : Learn that Pi is yummy
• Bot: Ok Pi is yummy.
• User : What is Pi?
• Bot: Yummy.
<category>
  <!-- "THAT" is part of the pattern so the first * captures only the subject -->
  <pattern>LEARN THAT * IS *</pattern>
  <template>OK <star/> IS <star index="2"/>
    <think>
      <learn>
        <category>
          <pattern><eval><star/></eval></pattern>
          <template><eval><star index="2"/></eval></template>
        </category>
      </learn>
    </think>
  </template>
</category>
<category>
  <pattern>WHAT IS *</pattern>
  <template><srai><star/></srai></template>
</category>
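The wildcard matching, srai-style redirection and runtime learning described on this slide can be illustrated with a toy matcher. This is only a minimal Python sketch, not how ProgramAB is implemented: the TinyBot class, its regex-based matching and the fallback reply are hypothetical, and it assumes the teaching utterance has the form "Learn that X is Y".

```python
import re


class TinyBot:
    """Toy AIML-like matcher: '*' wildcards, srai-style redirects, runtime learning."""

    def __init__(self):
        # Each category is (compiled pattern, handler); earlier entries win.
        self.categories = []
        self.add("LEARN THAT * IS *", self._learn)
        # Like <srai><star/></srai>: re-submit whatever followed "what is".
        self.add("WHAT IS *", lambda stars: self.respond(stars[0]))

    @staticmethod
    def _normalize(text):
        # Drop punctuation and collapse whitespace before matching.
        return " ".join(re.sub(r"[^\w\s]", " ", text).split())

    def add(self, pattern, handler):
        # A '*' word matches one or more words; every other word matches literally.
        parts = [r"(\S+(?: \S+)*)" if word == "*" else re.escape(word)
                 for word in pattern.split()]
        regex = re.compile("^" + " ".join(parts) + "$", re.IGNORECASE)
        self.categories.append((regex, handler))

    def respond(self, utterance):
        text = self._normalize(utterance)
        for regex, handler in self.categories:
            match = regex.match(text)
            if match:
                return handler(list(match.groups()))
        return "I do not know."

    def _learn(self, stars):
        subject, value = stars
        # Like <learn>: add a new category built from the two wildcard values.
        self.add(subject, lambda _stars: value)
        return f"OK {subject} is {value}"


bot = TinyBot()
print(bot.respond("Learn that Pi is yummy"))  # OK Pi is yummy
print(bot.respond("What is Pi?"))             # yummy
```

Unlike real AIML, this sketch picks the first matching category rather than the most specific one, which is enough to show the learn-then-recall loop.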
22. Wikipedia based Q/A API Integration
• Using ProgramAB to convert the utterance to a Solr query.
• MyRobotLab support for indexing XML, JDBC, RSS, etc
• Support for Document Processing Pipeline with pluggable
stages
• Indexed Wikipedia using the Sweble Java parser
• Extracting Infoboxes and indexing them as triples
• Constructing high-precision queries to answer free-form
questions.
• KMW based NLU web service integrated to simplify.
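As a rough illustration of turning a free-form question into a high-precision query over infobox triples, here is a minimal Python sketch. In the talk the utterance matching is done by ProgramAB; a single regular expression stands in for it here. The field names subject, predicate and object are assumptions made for this example, not the schema actually used.

```python
import re
from urllib.parse import urlencode

# Hypothetical question shape: "What is the <predicate> of <subject>?"
QUESTION = re.compile(
    r"^(?:what|who) (?:is|was|are) the (?P<predicate>.+?) of (?P<subject>.+?)\??$",
    re.IGNORECASE,
)


def escape_phrase(value):
    # Escape backslashes and double quotes so the value is safe inside a quoted phrase.
    return value.replace("\\", "\\\\").replace('"', '\\"')


def question_to_solr_params(utterance):
    """Return Solr query parameters for a recognized question, or None."""
    match = QUESTION.match(utterance.strip())
    if match is None:
        return None
    subject = escape_phrase(match.group("subject").strip())
    predicate = escape_phrase(match.group("predicate").strip())
    return {
        # Both clauses are required, which keeps precision high.
        "q": f'subject:"{subject}" AND predicate:"{predicate}"',
        "fl": "object",  # only the answer value is needed
        "rows": 1,
    }


params = question_to_solr_params("What is the population of France?")
print(params["q"])
print(urlencode(params))
```

Requiring an exact phrase match on both fields trades recall for precision: the robot either finds the specific fact or says nothing, rather than reading out a loosely related document.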
24. OpenCV / JavaCV
• Open source…
• Java bindings via the JavaCPP project. Thanks Samuel Audet!
• 2000+ different algorithms to manipulate and extract information
from image and video data
• Support for a wide range of different hardware from webcams to
remote Mjpeg video streams.
• Modular, dynamic video processing pipeline of filters that
enhances the image with metadata and classifications.
26. Memory (Embedded and Cloud Solr)
• Solr is integrated into MyRobotLab as either an embedded Solr
server or an external SolrCloud instance.
• Can attach to all information flows between services in
MyRobotLab to capture data in flight between services
• Records what the robot hears, sees, says, recognizes, and how
the robot moves
• Stores image data in a non-searchable binary field.
• Can be queried to produce training datasets for deep learning
and custom AI models.
27. More Solr Stuff!
• Dynamically attach to the Inbox or Outbox of an MRL Service
• Serialize the Message object that is passed between services
into a SolrInputDocument.
• Ability to specify stateful information to be tagged on data being
indexed to label incoming data
• Facet-based metrics on the amount of data flowing through the
robot’s nervous system.
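The idea of attaching a recorder to a service's outbox and flattening each message into a document can be sketched as follows. This is an illustrative Python model only: the Service and MemoryRecorder classes, the message keys, the method names and the document field names are all hypothetical, and a plain list of dicts stands in for Solr.

```python
import time
from collections import Counter


class Service:
    """Toy stand-in for an MRL service whose outbox listeners can attach to."""

    def __init__(self, name):
        self.name = name
        self.listeners = []

    def attach(self, listener):
        # Subscribe a callable to every message this service publishes.
        self.listeners.append(listener)

    def publish(self, method, data):
        message = {"sender": self.name, "method": method, "data": data, "ts": time.time()}
        for listener in self.listeners:
            listener(message)


class MemoryRecorder:
    """Turns each message into a flat, Solr-style document (a plain dict here)."""

    def __init__(self):
        self.docs = []
        self.labels = {}  # stateful tags copied onto every incoming document

    def on_message(self, message):
        doc = {
            "id": f"{message['sender']}-{len(self.docs)}",
            "sender": message["sender"],
            "method": message["method"],
            "data": str(message["data"]),
            "timestamp": message["ts"],
        }
        doc.update(self.labels)
        self.docs.append(doc)

    def facet(self, field):
        # Count documents per field value, similar in spirit to a Solr facet.
        return Counter(doc[field] for doc in self.docs if field in doc)


ear = Service("ear")
eye = Service("opencv")
memory = MemoryRecorder()
ear.attach(memory.on_message)
eye.attach(memory.on_message)

memory.labels = {"label": "kevin"}  # everything recorded from now on is tagged
ear.publish("publishText", "hello robot")
eye.publish("publishFrame", "<frame bytes>")
print(memory.facet("sender"))
```

Setting the label before data arrives is what later lets a query pull out "all frames seen while talking to Kevin" as a labeled training set.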
30. Deeplearning4j
• Java based deep learning framework supported by SkyMind.io
• GPU acceleration and native support across many platforms
with JavaCPP project.
• Relies on ND4J with JNI to do heavy matrix math, modeled after
Python’s NumPy
• Can load models from TensorFlow, Keras, Caffe
• Has a pre-trained model zoo
• Supports custom network topologies
• Feed forward, CNN, RNN, LSTM support
31. Solr for Deep Learning in 1, 2, 3
Solr’s native support for faceting, random sort ordering, and
pagination is ideal for generating a random sampling to produce
both the testing and training datasets.
1. Query to get total training dataset count with a facet on the
label field to get all labels for a dataset.
2. Query with a random sort order ascending for the training
dataset (max pagination offset based on percentage of
dataset.)
3. Query with the same random sort seed descending for the
remaining examples for the testing dataset.
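The three queries can be sketched as plain parameter dictionaries. This is a minimal Python sketch under stated assumptions: it assumes the schema defines a random_* dynamic field of type RandomSortField (as Solr's sample schemas do), that the descending order is the exact reverse of the ascending one, and that the index does not change between queries. The function names are made up for this example.

```python
def count_query(label_field):
    # Step 1: no rows, just the total hit count plus a facet listing every label.
    return {"q": "*:*", "rows": 0, "facet": "true", "facet.field": label_field}


def split_queries(total, train_fraction, seed):
    # Steps 2 and 3: the same seeded random order, read from opposite ends.
    train_rows = int(total * train_fraction)
    sort_field = f"random_{seed}"
    train = {"q": "*:*", "sort": f"{sort_field} asc", "start": 0, "rows": train_rows}
    test = {"q": "*:*", "sort": f"{sort_field} desc", "start": 0, "rows": total - train_rows}
    return train, test


# "total" would come from numFound in the response to count_query(...).
train, test = split_queries(total=1000, train_fraction=0.8, seed=42)
print(count_query("label"))
print(train)
print(test)
```

Because both queries share one seed, the first 80% read in ascending order and the first 20% read in descending order do not overlap, so no example leaks from the training set into the testing set.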
32. VGG-16 Image Classification
• Visual Geometry Group (U. Oxford)
• ImageNet ILSVRC-2014 (1st runner up)
• Can classify 1,000 classes of objects (ImageNet)
• 16-layer deep Convolutional Neural Network
• Input layer 224x224 pixels with 3 color channels (150,528 values)
• Open source pre-trained (Creative Commons Attribution)
• Available in Model-Zoo for DL4J
33. Yolo - Darknet
• Yolo (You Only Look Once)
• Classification and Localization (Bounding Box)
• Trained on COCO dataset (Common Objects in Context)
• Pre-trained model available, with native support in OpenCV for
loading it
• Currently using YOLOv2 (Input 416x416x3 pixels)
34. Transfer learning
• Training neural networks takes a long time and a lot of compute
resources.
• Training VGG16 took multiple weeks.
• 138,357,544 parameters to train
• Use pre-trained model, chop off last (output) layer
• Add new output layer with the number of classes that you want to
classify.
• Hold all layers frozen except for the output layer (~ 4,097 parameters
per class to train.)
• Training time ~ 5 minutes to get to 80+ % accuracy
• Small training dataset ( ~ 50 examples per class.)
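The parameter counts quoted above can be checked with a little arithmetic. The sketch below assumes the standard VGG16 layout (thirteen 3x3 convolution layers followed by three fully connected layers, with a 7x7x512 feature map feeding the first dense layer); the helper names are made up for this example.

```python
# (input channels, output channels) for the thirteen 3x3 convolution layers
CONV_LAYERS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]
# (inputs, outputs) for the three fully connected layers
DENSE_LAYERS = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]


def conv_params(c_in, c_out, kernel=3):
    # One kernel x kernel filter per input/output channel pair, plus one bias per output.
    return kernel * kernel * c_in * c_out + c_out


def dense_params(n_in, n_out):
    # One weight per input/output pair, plus one bias per output.
    return n_in * n_out + n_out


total = (sum(conv_params(i, o) for i, o in CONV_LAYERS)
         + sum(dense_params(i, o) for i, o in DENSE_LAYERS))
print(total)  # 138357544

# Transfer learning: only the new output layer is trained.
people = 5
print(dense_params(4096, 1))       # 4097 parameters per class
print(dense_params(4096, people))  # 20485 trainable parameters for 5 people
```

With every other layer frozen, the trainable part shrinks from roughly 138 million parameters to a few thousand per class, which is why a few minutes and about 50 examples per class can be enough.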
35. Combine Yolo & VGG
• We can use Yolo to detect an object and produce its bounding
box.
• Crop the image to the bounding box and pass it to a transfer-
learned VGG16 model to sub-classify
• Yolo detects Person
• Pass the cropped image of the person to VGG16 to sub-classify and
identify which person it is!
• Store training data in Solr from Yolo
• Custom Dataset Iterator that queries embedded Solr
• Inspired by SOLR-11838
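The detect, crop, then sub-classify flow can be sketched like this. It is a minimal Python illustration, not the MyRobotLab code: the detector and the person classifier are passed in as plain functions (stand-ins for the Yolo filter and the transfer-learned VGG16 model), images are nested lists, and boxes are assumed to be (x, y, width, height) in pixels.

```python
def crop(image, box):
    """Cut a box out of an image given as a list of pixel rows."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def identify_people(image, detect, classify_person):
    """Run the detector, then refine every 'person' detection with the classifier."""
    results = []
    for label, box in detect(image):
        if label == "person":
            # Second stage: the cropped region goes to the fine-grained model.
            label = classify_person(crop(image, box))
        results.append((label, box))
    return results


# Stand-ins for the two models, for demonstration only.
def fake_detect(image):
    return [("person", (1, 1, 2, 2)), ("chair", (0, 0, 1, 1))]


def fake_classify(face):
    return "kevin"


image = [[row * 4 + col for col in range(4)] for row in range(4)]
print(crop(image, (1, 1, 2, 2)))  # [[5, 6], [9, 10]]
print(identify_people(image, fake_detect, fake_classify))
```

Keeping the detector generic and the second stage small means new people can be added by retraining only the sub-classifier.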
38. The Singularity is near!
• We have presented an integration of many open source
technologies to demonstrate emergent behavior and to help
provide a bit of our vision of how robots and their human
caretakers will evolve in the decades to come.
• This is an open source project in its inception. Feel free to
contribute or take from this as much or as little as you like.
• Try it out! Download it. Don’t like it? Make it better!
• Want to help out? Pull Requests welcome!
39. What’s next?
• New technologies?
• DeepWave – Speech?
• LSTM based Speech Recognition?
• Performance and stability
• Better distributed and swarm capabilities
• SLAM – Simultaneous Localization and Mapping
• Current Release is “Manticore”
• Upcoming release is “Nixie” (Soon!)
• Next release is “Ogre”