6. Key Announcements
TensorFlow 2.0, Fairness Learning, ML Kit, AI Hub, Federated Learning, TPU v3, Cloud TPU Pods, Swift for TensorFlow, TensorFlow Lite for IoT Devices, TensorFlow Agents, TensorFlow Extended (TFX), TensorFlow.js, Google Coral, Firebase Predictions, Edge TPU
10. This session, we will talk about: Distributed Learning
14. The problem
Previous learning environment:
● Learning time depends on the batch-size and the GPU model
● The model update process works on only a single GPU
● A high-spec GPU machine is too expensive
● A single GPU has practical limitations
● There is no way to support scalability
15. SGD with multiple GPUs
Previous learning environment:
[Diagram] GPU 1, GPU 2, and GPU 3 each hold a copy of the model and compute a loss and a gradient; the CPU gathers ∆w₁, ∆w₂, ∆w₃, aggregates them (AVG), and applies the update ∆w.
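A minimal sketch of the scheme in this diagram, with the "GPUs" simulated as loop iterations in plain NumPy. The linear-regression model, data, and learning rate are illustrative assumptions, not from the talk: each device computes a gradient on its own data shard, and the "CPU" averages the deltas and applies a single update.

import numpy as np

def grad_mse(w, X, y):
    # Gradient of 0.5 * mean((Xw - y)^2) with respect to w.
    return X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.01 * rng.standard_normal(300)
w = np.zeros(5)
lr, n_gpus = 0.1, 3

for step in range(200):
    # "Gather": one gradient per simulated GPU, each on its own shard.
    shards = zip(np.array_split(X, n_gpus), np.array_split(y, n_gpus))
    grads = [grad_mse(w, Xi, yi) for Xi, yi in shards]
    # "Aggregate (AVG)" on the CPU, then a single weight "Update".
    w -= lr * np.mean(grads, axis=0)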
16. The issues we can find
Previous learning environment:
● Data transmission between GPU memory and the CPU is slow
● There is a GPU stickiness issue (*a GPU load-balancing issue)
● This solution works only on a single bare-metal server (node)
19. The definition of Distribution
Increase efficiency by dividing the problem into smaller parts.
[Diagram] A Problem is split across several Workers, whose partial results are combined into the Answer.
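As a toy illustration of this picture, here is the same split/compute/combine shape using Python's standard library; the worker function and the problem (summing a range) are made up for the example.

from concurrent.futures import ProcessPoolExecutor

def partial_sum(bounds):
    # Each worker solves one smaller part of the problem.
    lo, hi = bounds
    return sum(range(lo, hi))

if __name__ == "__main__":
    # Divide the problem into four parts, one per worker ...
    parts = [(i, i + 250_000) for i in range(0, 1_000_000, 250_000)]
    with ProcessPoolExecutor(max_workers=4) as pool:
        # ... and combine the partial answers into the answer.
        answer = sum(pool.map(partial_sum, parts))
    assert answer == sum(range(1_000_000))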
20. Three ways of distribution
Parallel | Concurrent | Parallel + Concurrent
To build a distributed environment, we should understand the difference between these three categories of distributed solutions.
21. Well-known distributed solutions
● DistBelief (Google Brain's 1st distributed environment for Deep Learning)
● Horovod (Uber's distributed TensorFlow environment)
● AllReduce (today's topic!)
● Federated Learning (announced by Google in 2018)
● CollectiveAllReduce (TensorFlow's tf.contrib.distribute.CollectiveAllReduce; see the sketch after this list)
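For reference, the tf.contrib API named above was folded into tf.distribute in TensorFlow 2.x. A minimal sketch of its stable successor, MirroredStrategy, which all-reduces gradients across local GPUs; the toy model and random data are assumptions for illustration, not from the talk.

import tensorflow as tf

# MirroredStrategy mirrors variables on each local GPU and all-reduces
# gradients every step (it falls back to one device if no GPU is present).
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():  # variables created here are mirrored per replica
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="sgd", loss="mse")

# Toy data; the global batch is split across replicas automatically.
x = tf.random.normal((256, 10))
y = tf.random.normal((256, 1))
model.fit(x, y, batch_size=32, epochs=2)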
25. Use case of Uber (Horovod)
https://eng.uber.com/horovod/
http://www.cs.fsu.edu/~xyuan/paper/09jpdc.pdf
Ring AllReduce
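A pure-NumPy simulation of ring all-reduce as described in the paper linked above. Ranks are list indices here rather than real processes, the function name is ours, and the step schedule follows the standard reduce-scatter then all-gather form.

import numpy as np

def ring_allreduce(per_rank_chunks):
    """Simulate ring all-reduce: N ranks, each holding its gradient
    split into N chunks; after 2*(N-1) steps every rank has the sum."""
    n = len(per_rank_chunks)
    buf = [[c.copy() for c in rank] for rank in per_rank_chunks]

    # Phase 1 (reduce-scatter): in step s, rank r sends chunk (r - s)
    # to rank r+1, which adds it to its own copy of that chunk.
    for s in range(n - 1):
        outgoing = [buf[r][(r - s) % n].copy() for r in range(n)]
        for r in range(n):
            buf[(r + 1) % n][(r - s) % n] += outgoing[r]

    # Phase 2 (all-gather): rank r now owns the fully reduced chunk
    # (r + 1) % n; circulate the finished chunks around the ring.
    for s in range(n - 1):
        outgoing = [buf[r][(r + 1 - s) % n].copy() for r in range(n)]
        for r in range(n):
            buf[(r + 1) % n][(r + 1 - s) % n] = outgoing[r]
    return buf

# Demo: 4 "GPUs", each with an 8-element gradient split into 4 chunks.
rng = np.random.default_rng(0)
grads = [rng.standard_normal(8) for _ in range(4)]
result = ring_allreduce([np.split(g, 4) for g in grads])
assert all(np.allclose(np.concatenate(r), sum(grads)) for r in result)

Note that each step moves only one chunk per link, which is why the bandwidth cost stays near-optimal regardless of the number of ranks.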
52. Summary
● The TPU's interconnect design gives high-speed communication between units
● TPU v3 and TPU Pods basically follow AllReduce (1-D ring AllReduce, 2-D AllReduce)
● TPU Pods are not generally available yet (Alpha, as of 2019-06-30)