Google Big Data Expo

A short 30-minute walk
Robert Saxby - Big Data Product Specialist

Sustainability
Google datacenters have half the
overhead of typical industry data centers
Largest private investor in renewables: $2
billion generating 3.2 GW
Applying Machine Learning produced
40% reduction in cooling energy

Drivers of Success in AI/ML Projects
● Large datasets
● Cutting-edge models
● Compute at scale

End to End: Google Cloud AI Spectrum
Audiences: ML researcher, data scientist, app developer
Approaches: use/extend OSS SDK, build custom models, use pre-built models
Products: Cloud MLE, ML Perception services

What is TensorFlow?
● A system for distributed, parallel machine learning
● It’s based on general-purpose dataflow graphs
● It targets heterogeneous devices
○ A single PC with CPU
○ A single PC with GPU(s)
○ A mobile device
○ Clusters of 100s or 1000s of CPUs, GPUs and TPUs

Another data flow system
Graph of Nodes, also called Operations or ops
[Figure: dataflow graph. examples and weights feed MatMul; its result and biases feed Add, then Relu; the Relu output and labels feed Xent]

With tensors
Edges are N-dimensional arrays: Tensors
[Figure: the MatMul → Add → Relu → Xent graph, with tensors flowing along its edges]

What’s in a name?
Rank  Name      Description              Example
0     Scalar    magnitude only           s = 483
1     Vector    magnitude and direction  v = [1.1, 2.2, 3.3]
2     Matrix    table of numbers         m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
3     3-Tensor  cube of numbers          t = [[[2], [4], [6]], [[8], [10], [12]], [[14], [16], [18]]]
n     n-Tensor  you get the idea         ....
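
As a rough illustration of the table above, the sketch below builds the same example values as NumPy arrays and reads back their rank and shape. NumPy is an assumption made for brevity; the deck itself uses TensorFlow, whose tensors expose the same notion of rank and shape.

```python
import numpy as np

# The example values from the table, as NumPy arrays.
s = np.array(483)                                   # rank 0: scalar
v = np.array([1.1, 2.2, 3.3])                       # rank 1: vector
m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])     # rank 2: matrix
t = np.array([[[2], [4], [6]],
              [[8], [10], [12]],
              [[14], [16], [18]]])                  # rank 3: 3-tensor

for name, x in [("s", s), ("v", v), ("m", m), ("t", t)]:
    # ndim is the rank (number of dimensions); shape lists the size of each one.
    print(name, "rank:", x.ndim, "shape:", x.shape)
```

The rank is simply the number of indices needed to address a single number in the array.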

Convolutional layer
W[4, 4, 3, 2]: filter size 4 x 4, 3 input channels, 2 output channels
W1 [4, 4, 3] and W2 [4, 4, 3]: one filter per output channel
Also shown: +padding, stride
[Figure: a network alternating convolutional and subsampling stages, three times]
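
To make the weight shape W[4, 4, 3, 2] concrete, here is a naive NumPy sketch of a convolutional layer. It is an illustration under stated assumptions, not the deck's implementation: the function name `conv2d_valid` is hypothetical, the 28 x 28 x 3 input is made up, and no padding is applied (the slide mentions padding, which would preserve the input size).

```python
import numpy as np

def conv2d_valid(image, weights, stride=1):
    """Naive convolution without padding (hypothetical helper).

    image:   [height, width, in_channels]
    weights: [filter_h, filter_w, in_channels, out_channels]
    """
    fh, fw, cin, cout = weights.shape
    h, w, c = image.shape
    assert c == cin, "input channels must match the filter"
    out_h = (h - fh) // stride + 1
    out_w = (w - fw) // stride + 1
    out = np.zeros((out_h, out_w, cout))
    for i in range(out_h):
        for j in range(out_w):
            # Patch of the image currently under the filter.
            patch = image[i * stride:i * stride + fh,
                          j * stride:j * stride + fw, :]
            # Weighted sum of the patch, once per output channel.
            out[i, j, :] = np.tensordot(patch, weights,
                                        axes=([0, 1, 2], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
image = rng.random((28, 28, 3))      # assumed input: 28 x 28 pixels, 3 channels
W = rng.random((4, 4, 3, 2))         # filter size 4 x 4, 3 inputs, 2 outputs
features = conv2d_valid(image, W, stride=2)
print(features.shape)
```

With a stride of 2 the layer also subsamples: each output channel is smaller than the input image.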

With state
'Biases' is a variable
Some ops compute gradients
−= updates biases
[Figure: graph in which biases feed Add; a gradient (...) is multiplied by the learning rate (Mul) and subtracted from biases (−=)]

And distributed
[Figure: the update graph (Add, Mul, biases, learning rate, −=) split across Device A and Device B]

Canned Estimators: models in a box
Estimator, Keras Model: train and evaluate models
Layers: build models
Python Frontend, C++ Frontend, ...
TensorFlow Distributed Execution Engine
CPU, GPU, Android, iOS, ...

Artificial Intelligence: the science of making things smart
Machine Learning: building machines that can learn
Neural Network: a type of algorithm in machine learning

The popular imagination of what ML is
Lots of data → Complex mathematics in multidimensional spaces → Magical results

In reality, ML is
1. Define objectives
2. Collect data
3. Understand and prepare the data
4. Create the model
5. Refine the model
6. Serve the model

Neural Network is a function that can learn

How about this?

Neural Network can extract hidden features from data

A very simple model
[Figure: a 28x28-pixel image, flattened to 784 pixels, feeds 10 neurons (0, 1, 2 ... 9); each neuron computes a weighted sum of all pixels + bias, and softmax is applied to the neuron outputs]

100 images at a time
L = X.W + b
X [100, 784]: 100 images, one per line, each flattened to 784 pixels
W [784, 10]: weights w0,0 … w783,9 (784 lines, 10 columns)
b [10]: the same 10 biases b0 … b9, broadcast on all lines
L [100, 10]: results L0,0 … L99,9, one line per image

Going Deep, 5 Layers Deep
Layer sizes: 784 inputs → 200 → 100 → 60 → 30 → 10 outputs (0, 1, 2 ... 9)
Sigmoid function on the hidden layers, softmax on the output layer
overkill ;-)

Activation Functions
Identity, Binary Step, Sigmoid, Tanh, Relu
Softmax: scores (logits, each a weighted sum of all pixels + bias) → probabilities
2.0 → 0.7
1.0 → 0.2
0.1 → 0.1
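
The activation functions named on this slide can be sketched in a few lines of NumPy. This is an illustrative sketch, not the deck's code; the function names are chosen here, and the softmax subtracts the maximum logit first, a common numerical-stability step that does not change the result.

```python
import numpy as np

def identity(x):
    return x

def binary_step(x):
    # 1 for non-negative inputs, 0 otherwise.
    return np.where(x >= 0, 1.0, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0.0, x)

def softmax(logits):
    # Shift by the max so the exponentials cannot overflow.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=-1, keepdims=True)

logits = np.array([2.0, 1.0, 0.1])       # the scores from the slide
print(np.round(softmax(logits), 1))      # [0.7 0.2 0.1]
```

Softmax differs from the other functions in that it acts on a whole vector of scores and returns values that sum to 1.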

Softmax on a batch of images
Y = softmax(X.W + b)
Y[100, 10]: predictions
X[100, 784]: images
W[784, 10]: weights
b[10]: biases
X.W is a matrix multiply; b is broadcast on all lines; softmax is applied line by line
Tensor shapes in [ ]
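
The shapes on this slide can be checked with a small NumPy sketch. The random inputs and the small random weights are assumptions for illustration only; what matters is the matrix multiply, the broadcast of the biases over all 100 lines, and the softmax applied line by line.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 784))                   # 100 flattened images (made-up data)
W = rng.normal(scale=0.01, size=(784, 10))   # weights
b = np.zeros(10)                             # biases

# Matrix multiply gives [100, 10]; b [10] is broadcast onto every line.
logits = X @ W + b

# Softmax applied line by line (axis=1), shifted for numerical stability.
shifted = logits - logits.max(axis=1, keepdims=True)
Y = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

print(Y.shape)   # one line of 10 probabilities per image
```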

Cross entropy:
digit:                   0   1   2   3   4   5   6   7   8   9
actual probabilities:    0   0   0   0   0   0   1   0   0   0    (“one-hot” encoded: this is a “6”)
computed probabilities:  0.1 0.2 0.1 0.3 0.2 0.1 0.9 0.2 0.1 0.1
Success?
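
A minimal sketch of the cross-entropy computation, using the same expression as the TensorFlow code in this deck, -sum(Y_ * log(Y)). The computed probabilities below are hypothetical values chosen to sum to 1; they are not the numbers from the slide.

```python
import math
import numpy as np

# One-hot encoded label: this is a "6".
actual = np.array([0, 0, 0, 0, 0, 0, 1, 0, 0, 0], dtype=float)

# Hypothetical model output (assumption for this example), sums to 1.
computed = np.array([0.01, 0.01, 0.02, 0.03, 0.01,
                     0.01, 0.85, 0.03, 0.02, 0.01])

# Cross entropy: only the term for the correct class survives the one-hot mask.
cross_entropy = -np.sum(actual * np.log(computed))
print(cross_entropy)
```

Because the label is one-hot, the loss reduces to minus the log of the probability assigned to the correct digit: the closer that probability is to 1, the closer the loss is to 0.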

Gradient Descent
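
As a minimal sketch of the idea, the loop below runs gradient descent on a toy one-parameter loss. The loss function, starting point and learning rate are assumptions chosen so the result is easy to verify; in the MNIST model the same update is applied to every weight and bias, with the gradient of the cross entropy.

```python
def loss(w):
    # Toy loss with its minimum at w = 3 (assumption for illustration).
    return (w - 3.0) ** 2

def gradient(w):
    # Derivative of the toy loss with respect to w.
    return 2.0 * (w - 3.0)

w = 0.0                # starting point
learning_rate = 0.1
for step in range(100):
    # Move against the gradient, scaled by the learning rate.
    w -= learning_rate * gradient(w)

print(w, loss(w))
```

Each step shrinks the distance to the minimum by a constant factor here; a learning rate that is too large would instead overshoot and diverge.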

The whole code

import tensorflow as tf
# Assumption: MNIST is loaded with the TF 1.x tutorial helper; the slide does not show this step.
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("data", one_hot=True, reshape=False)

# initialisation
X = tf.placeholder(tf.float32, [None, 28, 28, 1])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
init = tf.global_variables_initializer()  # replaces the deprecated tf.initialize_all_variables()

# model
Y = tf.nn.softmax(tf.matmul(tf.reshape(X, [-1, 784]), W) + b)
# placeholder for correct answers
Y_ = tf.placeholder(tf.float32, [None, 10])

# loss function
cross_entropy = -tf.reduce_sum(Y_ * tf.log(Y))

# success metrics: % of correct answers found in batch
is_correct = tf.equal(tf.argmax(Y, 1), tf.argmax(Y_, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))

# training step
optimizer = tf.train.GradientDescentOptimizer(0.003)
train_step = optimizer.minimize(cross_entropy)

# run
sess = tf.Session()
sess.run(init)
for i in range(10000):
    # load batch of images and correct answers
    batch_X, batch_Y = mnist.train.next_batch(100)
    train_data = {X: batch_X, Y_: batch_Y}
    # train
    sess.run(train_step, feed_dict=train_data)
    # success? add code to print it
    a, c = sess.run([accuracy, cross_entropy], feed_dict=train_data)
    # success on test data?
    test_data = {X: mnist.test.images, Y_: mnist.test.labels}
    a, c = sess.run([accuracy, cross_entropy], feed_dict=test_data)

Workshop
Self-paced code lab (summary below ↓): goo.gl/mVZloU
Code: github.com/martin-gorner/tensorflow-mnist-tutorial
1-5. Theory (install then sit back and listen or read)
Neural networks 101: softmax, cross-entropy, mini-batching, gradient descent, hidden layers, sigmoids, and how to implement them in TensorFlow
6. Practice (full instructions for this step)
Open file: mnist_1.0_softmax.py
Run it, play with the visualisations (keyboard shortcuts on previous slide), read and understand the code as well as the basic structure of a TensorFlow program.
7. Practice
Start from the file mnist_1.0_softmax.py and add one or two hidden layers.
Solution in: mnist_2.0_five_layers_sigmoid.py
8. Theory
Special care for deep neural networks: use RELU activation functions, use a better optimiser, initialise weights with random values and beware of the log(0)
9-10. Practice (full instructions for this step)
Use a decaying learning rate and then add dropout
Solution in: mnist_2.2_five_layers_relu_lrdecay_dropout.py
11. Theory (sit back and listen or read)
Convolutional networks
12. Practice
Replace your model with a convolutional network, without dropout.
Solution in: mnist_3.0_convolutional.py
13. Challenge (full instructions for this step)
Try a bigger neural network (good hyperparameters on slide 43) and add dropout on the last layer to get >99%
Solution in: mnist_3.0_convolutional_bigger_dropout.py
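
The two techniques from the learning-rate-decay and dropout practice step can be sketched without TensorFlow. This is a hedged illustration: the function names, the exponential decay schedule and all numeric values (maximum and minimum rate, decay speed, keep probability) are assumptions, not values taken from the slides.

```python
import math
import numpy as np

def decayed_learning_rate(step, lr_max=0.003, lr_min=0.0001, decay_steps=2000.0):
    # Exponential decay from lr_max towards lr_min as training progresses.
    return lr_min + (lr_max - lr_min) * math.exp(-step / decay_steps)

def dropout(activations, keep_prob, rng):
    # Inverted dropout: zero each activation with probability 1 - keep_prob,
    # then rescale the survivors so the expected value is unchanged.
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
hidden = np.ones((4, 5))                       # made-up layer activations
print(decayed_learning_rate(0), decayed_learning_rate(10000))
print(dropout(hidden, keep_prob=0.75, rng=rng))
```

Dropout is applied during training only; at evaluation time the keep probability is set to 1 so that all neurons take part.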

https://cloud.google.com/solutions/running-distributed-tensorflow-on-compute-engine
Distributed TensorFlow on Compute Engine

Machine Learning on any data, of any size
Cloud ML Engine
Portable models with TensorFlow
Services are designed to work together
Managed distributed training infrastructure
that supports CPUs and GPUs
Automatic hyperparameter tuning

Custom Estimators: The Model
https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/census
...
def _model_fn(mode, features, labels):
    ...
    if mode == Modes.PREDICT:
        ...
        return tf.estimator.EstimatorSpec(
            mode, predictions=predictions, export_outputs=export_outputs)
    ...
    if mode == Modes.TRAIN:
        ...
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
    ...

https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/census
Custom Estimators: The Task
...
train_input = lambda: model.generate_input_fn(
    hparams.train_files,
    num_epochs=hparams.num_epochs,
    batch_size=hparams.train_batch_size)
...
def _experiment_fn(run_config, hparams):
    """This function is used by learn_runner to create an Experiment which executes model code
    provided in the form of an Estimator and input functions."""
    # Assumption, based on the docstring: the Estimator is wrapped in a TF 1.x
    # tf.contrib.learn.Experiment, which is what takes the input functions.
    # The slide elides this wrapper.
    return tf.contrib.learn.Experiment(
        tf.estimator.Estimator(
            model.generate_model_fn(
                ...
            )
        ),
        train_input_fn=train_input,
        eval_input_fn=eval_input,
        **experiment_args
    )
...

Running locally
Train locally:
gcloud ml-engine local train \
    --module-name trainer.task --package-path trainer/ \
    -- \
    --train-files $TRAIN_DATA --eval-files $EVAL_DATA --train-steps 1000 --job-dir $MODEL_DIR
$TRAIN_DATA: training data; $EVAL_DATA: evaluation data; $MODEL_DIR: output directory

Single trainer running in the cloud
Train in the cloud:
gcloud ml-engine jobs submit training $JOB_NAME --job-dir $OUTPUT_PATH \
    --runtime-version 1.0 --module-name trainer.task --package-path trainer/ --region $REGION \
    -- \
    --train-files $TRAIN_DATA --eval-files $EVAL_DATA --train-steps 1000 --verbosity DEBUG
$OUTPUT_PATH: Google Cloud Storage location; $REGION: region

Distributed training in the cloud
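Same job submission command as for the single trainer, with one flag added before the -- separator to make the job distributed:
--scale-tier STANDARD_1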

Refine the model
Feature engineering
Better algorithms
More examples, more data
Hyperparameter tuning

● Automatic hyperparameter tuning service
● Build better performing models faster and save
many hours of manual tuning
● Google-developed search (Bayesian Optimisation)
algorithm efficiently finds better hyperparameters
for your model/dataset
[Figure: Objective plotted against HyperParam #1. One point is marked "We want to find this"; others are marked "Not these".]
https://cloud.google.com/blog/big-data/2017/08/hyperparameter-tuning-in-cloud-machine-learning-engine-using-bayesian-optimization

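Hyperparameter tuning: the job submission command additionally takes a config file, passed before the -- separator:
--scale-tier STANDARD_1 --config $HPTUNING_CONFIG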

hptuning_config.yaml:
trainingInput:
  hyperparameters:
    goal: MAXIMIZE
    hyperparameterMetricTag: accuracy
    maxTrials: 4
    maxParallelTrials: 2
    params:
      - parameterName: first-layer-size
        type: INTEGER
        minValue: 50
        maxValue: 500
        scaleType: UNIT_LINEAR_SCALE
...

task.py:
...
# Construct layer sizes with exponential decay
hidden_units=[
    max(2, int(hparams.first_layer_size *
               hparams.scale_factor**i))
    for i in range(hparams.num_layers)
],
...
parser.add_argument(
    '--first-layer-size',
    help='Number of nodes in the 1st layer of the DNN',
    default=100,
    type=int
)
...
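
To see how a tuned `first-layer-size` value turns into layer sizes, the sketch below wires the argument parsing and the exponential-decay expression together. Only `--first-layer-size` appears in the slides; the `--num-layers` and `--scale-factor` flags and their defaults are assumptions mirroring the `hparams.num_layers` and `hparams.scale_factor` names in the snippet.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--first-layer-size', type=int, default=100,
                    help='Number of nodes in the 1st layer of the DNN')
# The two flags below are assumed for this example.
parser.add_argument('--num-layers', type=int, default=4,
                    help='Number of hidden layers (assumed flag)')
parser.add_argument('--scale-factor', type=float, default=0.5,
                    help='Size ratio between consecutive layers (assumed flag)')

# The tuning service would pass a trial value such as this on the command line.
hparams = parser.parse_args(['--first-layer-size', '100'])

# Construct layer sizes with exponential decay, never going below 2 nodes.
hidden_units = [
    max(2, int(hparams.first_layer_size * hparams.scale_factor**i))
    for i in range(hparams.num_layers)
]
print(hidden_units)   # [100, 50, 25, 12]
```

argparse converts the dashes in the flag name to underscores, which is why the YAML parameter `first-layer-size` is read as `hparams.first_layer_size`.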

Deploying the model
Creating model
gcloud ml-engine models create $MODEL_NAME --regions=$REGION
Creating versions
gcloud ml-engine versions create v1 --model $MODEL_NAME --origin $MODEL_BINARIES \
    --runtime-version 1.0
gcloud ml-engine models list

Predicting
gcloud ml-engine predict --model $MODEL_NAME --version v1 --json-instances ../test.json
Using REST:
POST https://ml.googleapis.com/v1/{name=projects/**}:predict
JSON format (in this case):
{"age": 25, "workclass": "private", "education": "11th", "education_num": 7, "marital_status": "Never-married", "occupation": "machine-op-inspector", "relationship": "own-child", "gender": "male", "capital_gain": 0, "capital_loss": 0, "hours_per_week": 40, "native_country": "United-States"}
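
The two input formats can be sketched in Python. The instance is the one shown above; the variable names are chosen here. The sketch assumes the usual conventions: the file given to `--json-instances` holds one JSON object per line, while the REST `predict` call takes a JSON body with an `instances` list.

```python
import json

instance = {
    "age": 25, "workclass": "private", "education": "11th",
    "education_num": 7, "marital_status": "Never-married",
    "occupation": "machine-op-inspector", "relationship": "own-child",
    "gender": "male", "capital_gain": 0, "capital_loss": 0,
    "hours_per_week": 40, "native_country": "United-States",
}
instances = [instance]

# Content for the --json-instances file: one JSON object per line.
json_instances_text = "\n".join(json.dumps(i) for i in instances) + "\n"

# Body for the REST predict call: all instances wrapped in one object.
request_body = json.dumps({"instances": instances})

print(json_instances_text)
print(request_body)
```

Writing `json_instances_text` to a file such as test.json gives an input suitable for the gcloud command; sending the REST request itself also requires authentication, which is left out here.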
