Come along with me on a survey of a couple of Google's machine learning APIs: Cloud Vision and Natural Language. Learn how to leverage preexisting models without needing to worry about the complexity of selecting a learning algorithm and running training.
Then, for good measure, we explore the mechanisms behind setting up a basic neural network to classify MNIST digits so that you can run other people's TensorFlow models to classify your own examples.
To follow along with the examples, you will want to check out my GitHub repository at https://github.com/mrcity/mlworkshop/ for example code and setup instructions, including how to run an existing TensorFlow model from your own code in order to classify examples.
3. Today’s Mission
Touch on ML API offerings
Explore Google’s RESTful ML APIs
Cloud Vision
Natural Language
TensorFlow (Not a RESTful API but still cool)
Allocate time to play, develop ideas
Have good conversations, network
Find learning partner or group
@SWebCEO +StephenWylie #MachineLearning
4. Before We Start…
Hopefully you followed instructions on
https://github.com/mrcity/mlworkshop/
Get access to the APIs
Install TensorFlow
5. What is Machine Learning?
Programming computers to deduce things from data…
Conclusions
Patterns
Objects in images
…using generic mathematical methods
No advance knowledge of trends in data
Lots of algorithms available
The process can create beautiful constructs
6. Who’s using ML?
Chat bots
Self-driving cars
Pepper the robot
MarI/O
Document recognition & field extraction
7. Machine Learning Tools Ecosystem
APIs you interface with
HP, Amazon, Microsoft, IBM, Google, Facebook’s Caffe on mobile & Web
Software you use
Orange (U of Ljubljana, Slovenia)
Weka (U of Waikato, New Zealand)
Hardware you compile programs to run on
nVidia GPUs with CUDA, DGX-1 supercomputer
Movidius neural compute stick
Amazon DeepLens camera
8. Google’s ML APIs In Particular
Google Play Services
Mobile Vision API
RESTful ML API Services
Cloud Vision, Cloud Speech, Natural Language, Translation
Job Discovery*, DialogFlow, Cloud Video Intelligence
Cloud ML Engine
Local ML Services
TensorFlow, TensorFlow Lite
TensorFlow Serving
^ Pre-defined models
v User-defined models
* private beta
10. Detect Faces, Parse Barcodes, Segment Text
Availability (from the slide's matrix):
FACE API: Native Android, Native iOS, RESTful API
BARCODE API: Native Android, Native iOS
TEXT API: Native Android, Native iOS, RESTful API
11. What do you see in that cloud?
Breaks down into more features than just FACE, BARCODE, and TEXT:
From https://cloud.google.com/vision/docs/requests-and-responses
Feature Type: Description
LABEL_DETECTION: Execute Image Content Analysis on the entire image and return labels
TEXT_DETECTION: Perform Optical Character Recognition (OCR) on text within an image
FACE_DETECTION: Detect faces within the image
LANDMARK_DETECTION: Detect geographic landmarks within the image
LOGO_DETECTION: Detect company logos within the image
SAFE_SEARCH_DETECTION: Determine safe search properties of the image
IMAGE_PROPERTIES: Compute image properties, including dominant colors
12. What do you see in that cloud?
Beta API 1.1 offers additional features:
Note Google does not guarantee any SLAs, deprecation policies, or future
backward compatibility with these services.
Feature Type: Description
IMAGE_PROPERTIES: Predict crop hints for the image, in addition to previous data
WEB_DETECTION: Detect news, events, or celebrities within an image, then search Google Images for similar photos
DOCUMENT_TEXT_DETECTION: Optimize the text parser for dense OCR (e.g. documents), rather than sparse text within an image
13. Cloud Vision APIs
Can simultaneously detect multiple features
Features billed individually per use on image
No Barcode feature
Simple JSON request/response format
Submit image from Cloud Storage or in Base64
Returns 0 or more annotations by confidence
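As a rough illustration of the request format above, here is a minimal Python sketch that builds an images:annotate body with a Base64-encoded image and two features; the image bytes, maxResults value, and API key are placeholders for illustration, not values from the talk:

```python
import base64
import json

# Build an images:annotate request body for two features at once.
# (Sketch only: substitute real image bytes and your API key before sending.)
def build_annotate_request(image_bytes):
    return {
        "requests": [{
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [
                {"type": "LABEL_DETECTION", "maxResults": 5},
                {"type": "FACE_DETECTION"},
            ],
        }]
    }

body = build_annotate_request(b"\x89PNG...fake bytes for illustration")
print(json.dumps(body)[:60])
# POST this JSON to:
# https://vision.googleapis.com/v1/images:annotate?key={YOUR_API_KEY}
```

The response comes back as a matching JSON structure with zero or more annotations per requested feature, ordered by confidence.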
14. For Your Eyes Only
No OAuth required for Cloud Vision
Make requests using API Key
POST https://vision.googleapis.com/v1/images:annotate?key={YOUR_API_KEY}
Easy to script using Service Account
15. Response types
Feature: Returns
Label: Description of the picture's contents; confidence score
Text, Logo: Text contents or logo owner name; bounding polygon containing the text or logo; [Text only] locale (language); [Logo only] confidence score
Face: Bounding polygon and rotational characteristics of the face; positions of various characteristics such as eyes, ears, lips, chin, forehead; confidence scores for exhibiting joy, sorrow, anger, or surprise
Landmark: Description of the landmark and confidence score; bounding polygon of the recognized landmark in the picture
Safe Search: Likelihood that the image contains adult or violent content, is a spoof, or contains graphic medical imagery
16. Response types
Feature: Returns
Image properties: Array of dominant RGB colors within the image, ordered by fraction of pixels; crop hints by confidence and "importanceFraction"
Web Entities and Pages: News, events, celebrities, or other labels found within an image; similar images from Google Image Search; website URLs containing matching images
Document Text: List of nested objects of types Page > Block > Paragraph > Word > Symbol; bounding box of the recognized text
Good example of the Document Text hierarchy at https://cloud.google.com/vision/docs/detecting-fulltext#vision-document-text-detection-python
18. Mobile Vision vs. Cloud Vision
Mobile Vision is for Native Android
Free; no usage quotas
Handles more data processing
Can utilize camera video
Takes advantage of hardware
20. Natural Language API: Analyze Any ASCII
Parses text for parts of speech
Discovers entities like organizations, people, locations
Analyzes text sentiment
Use Speech, Vision, Translate APIs upstream
Works with English, Spanish, or Japanese
21. Sample NL API Request
From https://cloud.google.com/natural-language/docs/basics
Notes on the sample request JSON shown on the slide: one field is optional and can be guessed automatically; another is optional but recommended; use only ONE of "content" or "gcsContentUri"
Pick one of:
https://language.googleapis.com/v1/documents:analyzeEntities
https://language.googleapis.com/v1/documents:analyzeEntitySentiment
https://language.googleapis.com/v1/documents:analyzeSentiment
https://language.googleapis.com/v1/documents:analyzeSyntax
https://language.googleapis.com/v1/documents:classifyText
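A minimal sketch of one such request body in Python, with field names taken from the NL API basics page; the content string is just a stand-in:

```python
import json

# Request body for documents:analyzeSentiment (illustrative content).
request_body = {
    "document": {
        "type": "PLAIN_TEXT",
        "language": "en",          # optional: can be guessed automatically
        "content": "I love this workshop!",
    },
    "encodingType": "UTF8",        # optional, but recommended
}
print(json.dumps(request_body, indent=2))
# POST to https://language.googleapis.com/v1/documents:analyzeSentiment?key={YOUR_API_KEY}
```

Swap the endpoint name to run any of the other four query types against the same document object.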
22. Combined Sample NL API Request
From https://cloud.google.com/natural-language/docs/basics
Note on the sample request JSON shown on the slide: one field is optional and can be guessed automatically
Make request to https://language.googleapis.com/v1/documents:annotateText
26. Making Predictions With Google
Build “trained” model or use “hosted” model
Hosted models (all demos):
Language identifier
Tag categorizer (tags such as android, appengine, chrome, youtube)
Sentiment predictor
Trained models:
Submit attributes and labels for each example
Need at least six examples
Store examples in Cloud Storage
27. Don’t Model Trains; Train Your Model
Train API against data
prediction.trainedmodels.insert
Send prediction query
prediction.trainedmodels.predict
Update the model
prediction.trainedmodels.update
Other CRUD operations: list, get, delete
28. Don’t Model Trains; Train Your Model
Insert query requires:
id
modelType
storageDataLocation
Don’t forget: poll for status updates
29. Permissions To Make Predictions
OAuth is required for Predictions
Easy to script using Service Account
Or, get Web app credentials:
https://console.developers.google.com/apis/credentials
31. Smartly Auto Fill Google Sheets
Add-on for Google Spreadsheets
No previously trained model needed
Use data with partially-labeled
examples
Specify column to auto-fill
Wait... a... long... time...
34. About TensorFlow
Offline library for large-scale numerical computation
Think of a graph:
Nodes represent mathematical operations
Edges represent tensors flowing between them
Excellent at building deep neural networks
Softmax(x_i) = e^(x_i) / Σ_j e^(x_j)
ReLU(x) = max(0, x)
35. Tense About Tensors?
Think about MNIST handwritten digits
Each digit image is 28 × 28 pixels (784 in total)
There are 10 classes: the digits 0-9
36. Tense About Tensors?
Define an input tensor of shape (any batch size, 784)
x = tf.placeholder(tf.float32, shape=[None, 784])
Define a target output tensor of shape (any batch size, 10)
y_ = tf.placeholder(tf.float32, shape=[None, 10])
Define weights matrix (784×10) and biases vector (10-D)
37. One-Hot: Cool To the Touch
Load the input data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
One-hot?!
Think about encoding categorical features:
US = 0, UK = 1, India = 2, Canada = 3, …
This implies ordinal properties and confuses learners
Break encoding into Booleans:
This is where the 10-D target output tensor comes from
US = [1, 0, 0, 0]
UK = [0, 1, 0, 0]
Etc…
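The encoding above can be sketched in plain Python; the helper name one_hot is my own, not from the slides:

```python
# Minimal one-hot encoder matching the country example on the slide.
def one_hot(index, width):
    """Return a list with a 1 at `index` and 0 elsewhere."""
    vec = [0] * width
    vec[index] = 1
    return vec

countries = ["US", "UK", "India", "Canada"]
encoded = {name: one_hot(i, len(countries)) for i, name in enumerate(countries)}
print(encoded["US"])   # [1, 0, 0, 0]
print(encoded["UK"])   # [0, 1, 0, 0]

# MNIST labels work the same way: the digit 3 becomes a 10-D vector.
print(one_hot(3, 10))  # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```

Because no category's vector is "closer" to another, the learner cannot mistake the encoding for an ordinal relationship.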
39. TensorFlow Data Structures – Variables
Values (i.e. model parameters) inside nodes
Used and modified by learning process
W (weights to scale inputs by), b (bias to add to scaled value)
Need to be initialized with:
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
init = tf.global_variables_initializer()
40. Training a Dragon, if the Dragon is a Model
Your Simple Model:
y = tf.matmul(x, W) + b
Cross-entropy: distance between guess & correct answer
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
Gradient descent: minimize cross-entropy
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
Learning rate: 0.5
H_y'(y) = −Σ_i y'_i log(y_i)
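To make the formula concrete, here is a small plain-Python sketch of cross-entropy against a one-hot label; the example distributions are invented for illustration:

```python
import math

# H_{y'}(y) = -sum_i y'_i * log(y_i): cost of the guess y against one-hot label y'.
def cross_entropy(label, guess):
    return -sum(l * math.log(g) for l, g in zip(label, guess) if l > 0)

label = [0, 0, 1]              # true class is index 2
good  = [0.05, 0.05, 0.90]     # confident, correct guess
bad   = [0.60, 0.30, 0.10]     # confident, wrong guess

print(round(cross_entropy(label, good), 3))  # small cost
print(round(cross_entropy(label, bad), 3))   # much larger cost
```

A confidently wrong guess is penalized far more heavily than a confidently correct one, which is exactly the gradient signal the optimizer needs.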
41. Dragon Get Wiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiings!
Start a Session
Run global_variables_initializer
Run training for 1000 steps
sess = tf.InteractiveSession()  # installs itself as the default session, so .run()/.eval() work below
sess.run(init)
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    train_step.run(feed_dict={x: batch_xs, y_: batch_ys})
Expensive to use all training data at once!
Pick 100 random samples each step
42. Test Flight Evaluation
Compare labels between guess y and correct y_
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
Cast each Boolean result into either a 0 or 1, then average it
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
Print the final figure
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
44. Future Talks, Other Talks
Follow me if you want to hear these!
Build a Neural Network in Python with NumPy
Build a Neural Network with nVidia CUDA
Elsewhere,
Mapmaking with Google Maps API, Polymer, and
Firebase
The Process Of Arcade Game ROM Hacking
45. More Resources
Google’s “Googly Eyes” Android app [Mobile Vision API]
https://github.com/googlesamples/android-vision/tree/master/visionSamples/googly-eyes
Quick, Draw! Google classification API for sketches
https://quickdraw.withgoogle.com/
Making Android Apps With Intelligence, by Margaret Maynard-Reid (video + slides)
https://realm.io/news/360andev-margaret-maynard-reid-making-android-apps-with-intelligence/
Some machine learning algorithms work better for certain types of problems than others. However, basically all of them start out with a blank slate, or perhaps random intermediate values, and then analyze the dataset to find which attributes of an example provide the most information gain toward making the correct classification on the training data. Learning happens iteratively, converging over time on an ever more accurate answer.
Chat bots need machine learning in order to parse the conversation into its grammatical components, and to understand the context of the conversation so it can come up with an appropriate response.
Pepper the robot is used for a variety of emotion-based tasks such as teaching, helping kids with autism stay focused, and training people to make standard responses to events if they are not socially adjusted.
I hate to say this, because my mom works in Accounts Payable, but eventually her job will be automated by a system that can automatically scan an invoice, look for the amount due, and facilitate the payment over a payment network electronically. Document recognition is important for banks too, as statements which prove income for purposes of granting a loan come in many different forms.
HPE Haven – text analysis, speech recognition, image recognition, face detection, recommendation engine, knowledge graph analysis
Amazon – industry-standard ML algorithms on top of your own data sets with listeners to re-evaluate upon new data (kind of like the now-deprecated Google Cloud Prediction API)
Amazon also just rolled out SageMaker, which is a cloud-based service to help users visualize data and select the best ML algorithm, then deploy it at scale for the inference stage
Amazon Rekognition image & video analysis that ties in with the DeepLens camera
Azure APIs – language analytics, face & emotion & explicit content detection, speech recognition, and recommendation APIs
IBM Watson
Facebook – Ported Caffe for several platforms to run light & efficiently
Things above the line are pre-defined models that Google has already tuned for you to give what they feel are the best results. Things below the line are where you have to provide the data and possibly the tuning yourself in order to get the best outcome.
Tensorflow Serving falls in Local because you have to set up the parameters of the infrastructure yourself, whereas Cloud ML Engine will size the model inference environment for you automatically. However, TF Serving can also be run on any hardware you have, not just inside GCP.
The native implementations are brought to you by the Mobile Vision API. The RESTful API which is called Cloud Vision does not feature barcode reading capabilities since that’s something a device should be able to handle on its own natively.
Safe search includes Adult, Medical, Spoof, and Violence.
No Barcode feature because why should you send images over the network just to find out if it’s a barcode when chances are it’s not.
Before we get the response, there’s one more detail we need to form our request: the authentication.
Google provides several ways to allow use of the Machine Learning APIs, and the exact ways you implement them depend on what level of user data you seek to access in your application.
The Cloud Vision API does not require you to pass along OAuth prompts to your users. You can simply make a request by finding your browser API key (usually starting with the letters AIza) and then appending that to your query string. However, this precludes you from using Google’s handy dandy API libraries for your programming language of choice. (Then again, if you’re using ALGOL 68, SmallTalk, or 6502 assembly, maybe it’s your only option.)
To use Google’s libraries, you need to at least set up a Service Account. This type of authentication scheme is usually used for servers communicating to other servers and requires minimal end user intervention. Also, typical controls you might find on accounts in a Google Apps domain do not apply to service accounts, therefore you could be inadvertently granting your users the chance to do something dangerous like share documents and data outside of your domain if you are not careful with the permissions you grant them.
Nevertheless, to save ourselves some time and trouble, we are going to use the basic Service Account approach for the sake of this demo. The instructions on how to create an appropriate service worker are on this workshop’s GitHub repo.
importanceFraction is the “fraction of importance of this salient region with respect to the original image.” Not really sure how they come to that conclusion.
This demo consists of running the Googly Eyes Node.js application provided in my GitHub repo. With some files in my Google Cloud Storage bucket, I could refer to that storage bucket name and the file name to load up files with human faces in them, and then this program would use the parameters returned by the Face Detection API and do some mathematics to superimpose googly eyes on top of the image using HTML5 Canvas.
One would presume the native libraries would be taking advantage of lower-level calls to get to the bare metal of the phone and make the analysis faster.
In the past, Sentiment Analysis was only available for English text. The Changelog https://cloud.google.com/natural-language/release-notes from 11/15/16 indicates that Japanese & Spanish support is available on sentiment analysis, but their How-To guide for “Analyzing Sentiment” indicates only English is supported. What’s interesting is when an English string is scored, then sent through Translate, and then you see totally different sentiment scores.
There are client libraries for C#, Go, Java, and other languages, plus RPC calls, available for interaction with the NL API. For RESTful calls, this is the syntax to use if you wish to just run one particular NL API query at a time.
analyzeEntities gives you phrases in the text that are known entities, such as persons, locations, or organizations, and references the saliency of the entity to the article as well as how many times it was mentioned.
analyzeSentiment gives you sentiment scores for the entire text body, and analyzeEntitySentiment will give you sentiment scores for text associated with each entity and its mentions. More on this in the next slide.
analyzeSyntax gives you all the parts of speech and grammatical considerations for each word.
classifyText will give you an array of potential categories for the text as well as the confidence of that category classification.
In this construct with the “annotateText” endpoint invoked, specify in the “features” object which of the three types of NLP queries you want to run.
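A hedged sketch of such an annotateText body in Python; the flag names follow the public NL API docs, and the content string is a stand-in:

```python
import json

# annotateText request: the "features" object selects which analyses to run
# in a single call (sketch only; substitute your own document content).
body = {
    "document": {"type": "PLAIN_TEXT", "content": "Google was founded in Menlo Park."},
    "features": {
        "extractSyntax": True,
        "extractEntities": True,
        "extractDocumentSentiment": True,
    },
    "encodingType": "UTF8",
}
print(json.dumps(body)[:40])
# POST to https://language.googleapis.com/v1/documents:annotateText?key={YOUR_API_KEY}
```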
The score (formerly known as polarity) is a [-1, 1] scale of the emotional sentiment (tone) of the article, from negative/scathing/blistering to positive/gushing/cheerful.
The magnitude of the article is a scale of how emotional the document comes across. Since each expression in the text contributes to the magnitude, longer articles will have a higher magnitude, regardless of whether the tone of each individual emotional word in the text is positive or negative in sentiment.
The small snippet of JSON here is now contained within an object called “documentSentiment”. A sibling of this object in the returned JSON structure is called “sentences”, which is an array consisting of objects containing the text and offset of each sentence, plus a “sentiment” object containing the individual magnitude and score for that piece.
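To illustrate that structure, here is a hand-written Python dict shaped like such a response (the values are invented, not real API output), and how you might walk it:

```python
# Illustrative analyzeSentiment response: "documentSentiment" for the whole
# text, plus a sibling "sentences" array with per-sentence sentiment.
response = {
    "documentSentiment": {"score": 0.4, "magnitude": 1.9},
    "language": "en",
    "sentences": [
        {"text": {"content": "The keynote was fantastic.", "beginOffset": 0},
         "sentiment": {"score": 0.9, "magnitude": 0.9}},
        {"text": {"content": "The wifi was dreadful.", "beginOffset": 27},
         "sentiment": {"score": -0.8, "magnitude": 0.8}},
    ],
}

overall = response["documentSentiment"]
print("overall score:", overall["score"], "magnitude:", overall["magnitude"])
for s in response["sentences"]:
    print(s["text"]["content"], "->", s["sentiment"]["score"])
```

Note how two strongly opposed sentences can still add up to a high document magnitude while the overall score lands near neutral.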
In this demo, I simply navigated to the Web sandbox at https://cloud.google.com/natural-language/ to run various strings of text through the tool.
Types of models:
- Regression: estimate a numeric answer based on the examples along a continuous curve, akin to a formula that you would solve for
- Categorization: group possible answers into buckets that don’t have a clear distance to other buckets
Here again, we need to discuss authentication, because it’s a little bit stricter for the Prediction API than for Cloud Vision.
Users must be authenticated into the Prediction API in order to use it. We can still use Service Accounts to authenticate ourselves into the app, but you have to specify a few more settings when creating them, as specified in this workshop’s GitHub repository.
However, to push the authentication screen to your end users, go into the API Manager (at console.*developers*.google.com), open up the Credentials page, and create an OAuth Client ID credential. It will guide you through steps based on the type of devices your application is targeting, and then use the client secrets file it creates when you are defining the client object in your application. Depending on the language you are using, the objects you create to authenticate a user through a Web-based service or a standalone application might look different than that used for Service Accounts.
In this example, I took the data set from Problem Set #2 from my old Machine Learning homework from Fall 2010 (at http://users.eecs.northwestern.edu/~ahu340/eecs349-ps2/ ), and loaded in the train.csv file into my Google Cloud Storage Bucket. I had to massage the CSV file into the exact format it was looking for. This includes things like removing all double quotes and changing the outcome of the example to a text value rather than a numeric value so it would classify the outcome as a categorization rather than a regression, since we probably don’t want to force the values of “0” for no and “1” for yes to be the regression we solve for.
After loading in the training data, I took some of the classification examples and pasted them into the Prediction API sample app running locally, downloadable from this workshop’s GitHub repo.
If you want to try this in API Explorer, break out the attributes of each example into JSON in the following format (note if copying the following to watch for unintended fancy double quotes):
{
"input": {
"csvInstance": [
"104042",
"0",
"77",
"m",
"s",
"0",
"9",
"7",
"523",
"428824",
"24",
"115315"
]
}
}
The ReLU (rectified linear unit) function is a popular function to use inside perceptrons.
We will describe the Softmax function in great detail later.
The old way, initialize_all_variables(), should have been removed back in March 2017.
The softmax function is one that attempts to normalize the vector of guesses so that all the outputs sum to exactly 1, thus converting the output of guesses into a probability distribution. It also serves to multiplicatively increase the weight given to any particular outcome per additional unit of evidence, but also reduce that weight with each unit of evidence withheld. Thus, it is often used to represent categorical distributions. After all, we are categorizing pictures of digits as 0 through 9; that doesn’t mean they will exist linearly in space away from each other, or even in the same order, if you were to look at the values of each attribute that makes the system think it’s one particular digit compared to another.
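A quick plain-Python sketch of the normalization described above (subtracting the max first is a standard numerical-stability trick, not something from the slides):

```python
import math

# Softmax(x)_i = e^(x_i) / sum_j e^(x_j): turns raw scores into a probability distribution.
def softmax(scores):
    m = max(scores)                          # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
print([round(p, 3) for p in probs])
print(sum(probs))  # sums to 1 (up to float rounding)
# Adding one unit of evidence to a score multiplies its unnormalized weight by e.
```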
Cross-entropy is used as a function to compute the cost of encoding a particular event drawn from a set. The less information required to make the distinction, the better. In this case, we can also represent how far apart the guess is from the correct answer on the training set. Cross-entropy helps perform learning faster by avoiding slowdowns that could be encountered by other functions like the quadratic cost function, where learning is impaired when it gets close to the correct answer.
Incidentally, the formula for cross-entropy has changed slightly and has been updated in this presentation. It was noted in the comments of the example code that the previous way could become numerically unstable, and so it has been changed to what is listed here. This also simplifies the model because the new function will handle taking the softmax of y.
The first code snippet takes the vectors and compares the first dimension of each element between y and y_. It then generates a new vector equal to the length of y (or y_ for that matter) and each result is a Boolean as to whether the things being compared were equal or not.
The second code snippet converts each Boolean into a 0 or 1, and then takes the average of all the elements in the vector to give you accuracy as a fraction.
The third code snippet actually causes the accuracy to be calculated. Note this used to be print(sess.run(accuracy, feed_dict=…)) and it’s still like that in the code examples, but the way listed here is in the latest documentation and it works fine.
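The same three steps can be sketched in plain Python without TensorFlow; the toy predictions here are invented for illustration:

```python
# Argmax each prediction, compare to the label's argmax,
# cast the booleans to 0/1, then average to get accuracy.
def argmax(vec):
    return max(range(len(vec)), key=vec.__getitem__)

predictions = [[0.1, 0.8, 0.1], [0.7, 0.2, 0.1], [0.2, 0.3, 0.5]]
labels      = [[0, 1, 0],       [0, 1, 0],       [0, 0, 1]]

correct = [argmax(p) == argmax(l) for p, l in zip(predictions, labels)]
accuracy = sum(1.0 if c else 0.0 for c in correct) / len(correct)
print(correct)   # [True, False, True]
print(accuracy)  # 2/3
```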
In this demo, I launched TensorFlow through a Docker instance and ran my Python code, which uses a previously generated model to classify one of the images represented in one of the datasets.
If you happened to terminate a previously-running Docker instance, you can reactivate it and get into it again with the following commands on the command line:
docker start `docker ps -q -l`
docker attach `docker ps -q -l`
Note there are several versions of Margaret’s talk available to be found on the Internet, including one from Big Android BBQ 2016 that’s a couple months newer and might have slightly different content (like how this slide deck is different from November’s deck).