
Soumith Chintala - Increasing the Impact of AI Through Better Software


  1. SOUMITH CHINTALA, FACEBOOK AI: Increasing the Impact of AI through Better Software
  2. AI at Facebook
  3. Enhancing Existing Products: Social Recommendations
  4. Enhancing Existing Products: Social Recommendations, Machine Translations
  5. Enhancing Existing Products: Social Recommendations, Machine Translations, Accessibility ("Image contains" captions such as "Diana and Hanna on green bicycles", "Riding bikes in Santo Domingo", "Biking near a red building")
  6. Powering New Experiences: Bots & Assistants, Generated Content, AR Effects, VR Hardware
  7. Helping Protect the Community: Share Baiting, Suicide Prevention
  8. AI Is Advancing Quickly: Reinforcement Learning, Computer Vision, Natural Language Processing
  9. Research and Experimentation
  10. Deep Learning Frameworks
  11. PyTorch Overview
  12. What is PyTorch? An ndarray library with GPU support, an automatic differentiation engine, and a gradient-based optimization package; used for deep learning, reinforcement learning, and as a NumPy alternative, with utilities (data loading, etc.)
  13. ndarray library: np.ndarray <-> torch.Tensor; 200+ operations, similar to NumPy; very fast acceleration on GPUs
  14. ndarray library: NumPy vs. PyTorch, a side-by-side code comparison (sketched below)
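     The side-by-side code on this slide did not survive extraction; a minimal sketch of the kind of comparison it shows, with torch.Tensor mirroring the np.ndarray API (the exact snippet is an assumption):

        import numpy as np
        import torch

        # NumPy
        a = np.ones((2, 3))
        b = a * 2 + 1
        print(b.sum())

        # PyTorch: nearly identical API
        x = torch.ones(2, 3)
        y = x * 2 + 1
        print(y.sum())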
  15.-18. ndarray / Tensor library (code examples)
  19.-22. NumPy bridge: zero memory-copy, very efficient (sketched below)
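     A minimal sketch of the zero-copy bridge; the API calls are standard PyTorch, the specific values are an assumption:

        import numpy as np
        import torch

        a = np.ones(5)
        t = torch.from_numpy(a)   # shares memory with `a`: no copy is made
        a[0] = 42.0
        print(t[0])               # the tensor sees the change

        b = t.numpy()             # back to NumPy, again without copying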
  23. Seamless GPU Tensors (sketched below)
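     The GPU examples on this slide were image-only; a hedged sketch of moving work onto a GPU with the standard API:

        import torch

        x = torch.randn(1024, 1024)
        if torch.cuda.is_available():
            x = x.cuda()      # move the tensor to the GPU
            y = x @ x         # the matmul now runs on the GPU
            z = y.cpu()       # bring the result back to host memory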
  24.-26. Neural Networks (see the sketch below)
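     These slides showed model-definition code that the transcript dropped; a minimal torch.nn sketch of the usual pattern (the Net name and layer sizes are illustrative assumptions):

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class Net(nn.Module):
            def __init__(self):
                super(Net, self).__init__()
                self.fc1 = nn.Linear(784, 128)
                self.fc2 = nn.Linear(128, 10)

            def forward(self, x):
                x = F.relu(self.fc1(x))
                return self.fc2(x)

        net = Net()
        out = net(torch.randn(32, 784))   # forward pass; autograd records ops for backward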
  27. Optimization package: SGD, Adagrad, RMSprop, LBFGS, etc. (training-step sketch below)
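     A minimal training-step sketch with torch.optim (the toy model and random data are assumptions):

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        model = nn.Linear(10, 1)
        opt = torch.optim.SGD(model.parameters(), lr=0.01)

        for _ in range(100):
            x, target = torch.randn(32, 10), torch.randn(32, 1)
            loss = F.mse_loss(model(x), target)
            opt.zero_grad()   # clear gradients from the previous step
            loss.backward()   # autograd fills in new gradients
            opt.step()        # gradient-based parameter update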
  28. Distributed training (scalability). Challenges: scaling to hundreds of GPUs; heterogeneous clusters, Ethernet/InfiniBand; potentially unreliable nodes. In PyTorch 1.0: a fully revamped distributed backend, c10d
  29. Distributed PyTorch: MPI-style distributed communication; broadcast tensors to other nodes; reduce tensors among nodes, for example to sum gradients among all nodes (sketched below)
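     A sketch of the collectives the slide lists, using torch.distributed; the backend choice and env-based initialization are assumptions, and rank/world size normally come from the launcher:

        import torch
        import torch.distributed as dist

        # One process per node/GPU; RANK, WORLD_SIZE, MASTER_ADDR set by the launcher.
        dist.init_process_group(backend="gloo", init_method="env://")

        t = torch.ones(4)
        dist.broadcast(t, src=0)                   # send rank 0's tensor to every node
        dist.all_reduce(t, op=dist.ReduceOp.SUM)   # e.g. sum gradients across all nodes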
  30.-31. Distributed Data Parallel (see the sketch below)
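     The DistributedDataParallel code on these slides was image-only; a hedged sketch of the wrapper, with launcher and device wiring omitted and a stand-in model:

        import torch
        import torch.distributed as dist
        import torch.nn as nn
        from torch.nn.parallel import DistributedDataParallel as DDP

        dist.init_process_group(backend="nccl")   # typically one process per GPU
        model = nn.Linear(10, 10).cuda()          # stand-in for a real model
        model = DDP(model)                        # gradients are averaged across processes
        # ...then train exactly as in the single-process loop above.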
  32.-33. Work items in practice: writing dataset loaders, building models, implementing the training loop, checkpointing models, interfacing with environments, dealing with GPUs, building optimizers, building baselines. Python + PyTorch: an environment to do all of this
  34. Timeline since release: 0.1.12, first public release; 0.2.0, distributed training, higher-order gradients, broadcasting, advanced indexing; 0.3.0, framework overhead reduced by 10x, ONNX support; 0.4.0, Windows support, Tensor <-> Variable merge, distributions; 1.0.0, torch.jit
  35.-38. 1.0.0 torch.jit: Code == Model == Data. Decorating a function with @torch.jit.script compiles it to a graph IR that can be exported and run anywhere:

        def myfun(x):
            y = x * x
            z = y.tanh()
            return z

        @torch.jit.script
        def myfun(x):
            y = x * x
            z = y.tanh()
            return z

        graph(%x : Dynamic) {
          %y : Dynamic = aten::mul(%x, %x)
          %z : Dynamic = aten::tanh(%y)
          return (%z)
        }
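     A runnable version of the slide's example, for inspecting the IR yourself (PyTorch 1.0+ assumed):

        import torch

        @torch.jit.script
        def myfun(x):
            y = x * x
            z = y.tanh()
            return z

        print(myfun(torch.randn(3)))  # executes the compiled function
        print(myfun.graph)            # prints the graph IR shown on the slide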
  39.-43. 1.0.0 torch.jit: eager execution (a plain Python function) vs. execution with ahead-of-time analysis (@torch.jit.script):

        # Eager execution
        def myfun(x):
            y = x * x
            z = y.tanh()
            return z

        # Execution with ahead-of-time analysis
        @torch.jit.script
        def myfun(x):
            y = x * x
            z = y.tanh()
            return z
  44.-45. With ahead-of-time analysis, the JIT can rewrite the scripted body into a fused operation, which helps saturate faster hardware and enables whole-program optimizations:

        # Eager execution: unchanged
        def myfun(x):
            y = x * x
            z = y.tanh()
            return z

        # Execution with ahead-of-time analysis: multiply and tanh fused
        @torch.jit.script
        def myfun(x):
            return tanh_mul(x)
  46. Eager Mode: PyTorch models are Python programs, with autograd for derivatives. + Simple; + debuggable (print and pdb); + hackable (use any Python library); - needs Python to run; - difficult to optimize and parallelize. Script Mode: models are programs written in an optimizable subset of Python. + Production deployment; + no Python dependency; + optimizable
  47. PyTorch JIT: tools to transition eager code into script mode, namely @torch.jit.script and torch.jit.trace. Eager mode is for prototyping, training, and experiments; script mode is for use at scale in production (trace sketched below)
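     A minimal torch.jit.trace sketch (the function and example input are assumptions): tracing runs the function once on an example input and records the operations it performs:

        import torch

        def f(x):
            return x * x + 1.0

        traced = torch.jit.trace(f, torch.randn(2, 2))  # record ops for this input shape
        print(traced(torch.randn(2, 2)))                # reuse the recorded graph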
  48. Vision: build tools for transitioning from research to production in the same environment. Values: fully loaded (expose building blocks for performance and scaling); opt-in (trade off flexibility only when beneficial); same environment (continuously refactor your model)
  49. Deployment in C++ (scalability and cross-platform). Often Python is not an option: high overhead on small models; multithreaded services bottleneck on the GIL; the deployment service might be C++ only. In PyTorch 1.0: convert the inference part of the model to Torch Script (Torch Script + state_dict) and link with libtorch.so in your C++ application (export sketched below)
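     The Python side of this workflow, sketched under the assumption of a traceable model; torchvision's resnet18 is used purely as an example and is not from the slide:

        import torch
        import torchvision

        model = torchvision.models.resnet18().eval()
        traced = torch.jit.trace(model, torch.randn(1, 3, 224, 224))
        traced.save("model.pt")   # load from C++ with torch::jit::load("model.pt")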
  50. PyTorch models are Python programs, with autograd for derivatives: + simple; + debuggable (print and pdb); + hackable (use any Python library)
  51. Ecosystem: Probabilistic Programming (http://pyro.ai/, github.com/probtorch/probtorch)
  52. Ecosystem: Gaussian Processes (https://github.com/cornellius-gp/gpytorch)
  53. Ecosystem: Machine Translation (https://github.com/OpenNMT/OpenNMT-py, https://github.com/facebookresearch/fairseq-py)
  54. Ecosystem: AllenNLP (http://allennlp.org/)
  55. Ecosystem: Pix2PixHD (https://github.com/NVIDIA/pix2pixHD)
  56. Ecosystem: Sentiment Discovery (https://github.com/NVIDIA/sentiment-discovery)
  57. Ecosystem
  58. Get Started: local install, cloud partners
  59. Thanks
