Towards Safe Automated Refactoring of Imperative Deep Learning Programs to Graph Execution

Raffi Khatchadourian, Associate Professor of Computer Science at City University of New York (CUNY) Hunter College
Towards Safe Automated Refactoring of Imperative Deep Learning Programs to Graph Execution

Raffi Khatchadourian 1,2, Tatiana Castro Vélez 2, Mehdi Bagherzadeh 3, Nan Jia 2, Anita Raja 1,2

1 City University of New York (CUNY) Hunter College, USA
2 City University of New York (CUNY) Graduate Center, USA
3 Oakland University, USA
International Conference on Automated Software Engineering
September 14, 2023, Kirchberg, Luxembourg
Deep Learning Systems & Run-time Performance
Machine Learning (ML), including Deep Learning (DL), systems are pervasive.
As datasets grow, efficiency becomes essential to support responsiveness [Zhou et al., 2020].
For efficiency, DL frameworks have traditionally embraced a deferred execution style supporting graph-based (DNN) computation.
Scalable, but development is . . .
Error-prone.
Cumbersome.
Produces programs that are difficult to debug.
Because graph computation executes statements in a non-imperative order, traditional SE tools cannot help troubleshoot bugs [Arpteg et al., 2018].
TensorFlow Deferred Execution-style Code
1 # Build a graph.
2 a = tf.constant(5.0)
3 b = tf.constant(6.0)
4 c = a * b
5
6 # Launch graph in a session.
7 sess = tf.Session()
8
9 # Evaluate the tensor `c`.
10 print(sess.run(c)) # prints 30.0
Lines 2–4 build a computation graph.
Line 4 does not execute until the Session is run on line 10.
No native support for common imperative program constructs, e.g., iteration.
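For instance, the sketch below (a hypothetical illustration assuming the TensorFlow 1.x API) shows that even a simple counting loop must be expressed with a dedicated graph operation, tf.while_loop, rather than Python's native while statement:

import tensorflow as tf  # assumes TensorFlow 1.x

i = tf.constant(0)
total = tf.constant(0.0)

# Python's `while` cannot build graph iterations; tf.while_loop must be used.
i, total = tf.while_loop(
    cond=lambda i, total: i < 10,
    body=lambda i, total: [i + 1, total + 2.0],
    loop_vars=[i, total])

with tf.Session() as sess:
    print(sess.run(total))  # prints 20.0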
Imperative DL Programming, Eager Execution, & Hybridization
Imperative DL frameworks (e.g., TensorFlow Eager, Keras, PyTorch) that encourage eager execution are more natural, less error-prone, and easier to debug.
However, eager execution sacrifices run-time performance.
Thus, hybrid approaches (e.g., Hybridize, TorchScript, AutoGraph) have surfaced that:
Execute imperative DL programs as static graphs at run-time.
Are integrated into mainstream DL frameworks (e.g., TensorFlow, MXNet, PyTorch).
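For contrast, a minimal sketch (assuming TensorFlow 2.x, where eager execution is the default) of the earlier example under eager execution; operations run immediately, with no graph construction or Session:

import tensorflow as tf  # assumes TensorFlow 2.x (eager by default)

a = tf.constant(5.0)
b = tf.constant(6.0)
c = a * b   # executes immediately; no Session required
print(c)    # tf.Tensor(30.0, shape=(), dtype=float32)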
Eager TensorFlow Imperative (OO) DL Model Code
1 class SequentialModel(tf.keras.Model):
2 def __init__(self, **kwargs):
3 super(SequentialModel, self).__init__(...)
4 self.flatten = layers.Flatten(input_shape=(28, 28))
5 num_layers = 100 # Add many small layers.
6 self.layers = [layers.Dense(64, activation="relu") for n in range(num_layers)]
7 self.dropout = tf.keras.layers.Dropout(0.2)
8 self.dense_2 = tf.keras.layers.Dense(10)
9
10
11 def __call__(self, x):
12 x = self.flatten(x)
13 for layer in self.layers:
14 x = layer(x)
15 x = self.dropout(x)
16 x = self.dense_2(x)
17 return x
Hybridized TensorFlow Imperative (OO) DL Model Code
1 class SequentialModel(tf.keras.Model):
2 def __init__(self, **kwargs):
3 super(SequentialModel, self).__init__(...)
4 self.flatten = layers.Flatten(input_shape=(28, 28))
5 num_layers = 100 # Add many small layers.
6 self.layers = [layers.Dense(64, activation="relu") for n in range(num_layers)]
7 self.dropout = tf.keras.layers.Dropout(0.2)
8 self.dense_2 = tf.keras.layers.Dense(10)
9
10 @tf.function(...) # Executes model as graph (optional args).
11 def __call__(self, x):
12 x = self.flatten(x)
13 for layer in self.layers:
14 x = layer(x)
15 x = self.dropout(x)
16 x = self.dense_2(x)
17 return x
On line 10, AutoGraph is used to potentially enhance performance.
It decorates the model’s call() method with @tf.function, possibly providing optional yet influential decorator arguments.
At run-time, call()’s execution will be “traced” (∼9.22× speedup).
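For illustration, a minimal sketch (assuming TensorFlow 2.x; the function body is a hypothetical stand-in for the model above) of supplying one such influential decorator argument, input_signature, which fixes the accepted tensor shapes and dtypes so that differently sized batches reuse a single traced graph:

import tensorflow as tf  # assumes TensorFlow 2.x

@tf.function(input_signature=[tf.TensorSpec(shape=[None, 28, 28], dtype=tf.float32)])
def forward(x):  # hypothetical stand-in for the model's __call__ body
    return tf.reduce_sum(x, axis=[1, 2])

forward(tf.random.uniform((32, 28, 28)))  # traced once
forward(tf.random.uniform((64, 28, 28)))  # same signature: reuses the graph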
Hybridization Drawbacks
Needs non-trivial, specialized metadata [Jeong et al., 2019].
Exhibits limitations and known issues with native program constructs.
Subtle considerations are required to:
Specify (decorate) the functions to be migrated.
Make code amenable to safe, accurate, and efficient graph execution.
Avoid performance bottlenecks and semantically inequivalent results [Cao et al., 2021, Castro Vélez et al., 2022] (see the sketch below).
Manual analysis and refactoring (semantics-preserving, source-to-source transformation) for optimal results can be error- and omission-prone [Dig et al., 2009].
Further complicated by:
Increasing Object-Orientation (OO) in DL model code [Chollet, 2020].
Dynamically-typed languages (e.g., Python).
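A minimal sketch (assuming TensorFlow 2.x) of one such semantically inequivalent result: a native Python list mutation inside a hybridized function runs only while the function is traced, not on every call:

import tensorflow as tf  # assumes TensorFlow 2.x

calls = []

@tf.function
def record(x):
    calls.append(x)  # Python side effect: executes only during tracing
    return x + 1

record(tf.constant(1))
record(tf.constant(2))
record(tf.constant(3))
print(len(calls))  # 1 under graph execution; would be 3 under eager execution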
Key Insight
Although imperative DL code is sequentially executed, hybridizing code resembles parallelizing sequential code.
Example
To avoid unexpected behavior, hybrid functions, like concurrent programs, should avoid side-effects.
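A minimal sketch (assuming TensorFlow 2.x) of why side-effects are problematic: Python-level effects run only when the function is traced, whereas graph-level operations run on every call:

import tensorflow as tf  # assumes TensorFlow 2.x

@tf.function
def step(x):
    print("traced")       # Python side effect: runs only while tracing
    tf.print("executed")  # graph operation: runs on every call
    return x * 2

step(tf.constant(1.0))  # prints "traced" and "executed"
step(tf.constant(2.0))  # same signature: only "executed" is printed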
Idea
Adapt concepts from automated refactorings that parallelize sequential code, e.g., Streaming APIs [Khatchadourian et al., 2019].
Work In Progress
Two new, fully automated refactorings are in progress:
Convert Eager Function to Hybrid: Transforms otherwise eagerly executed imperative (Python) DL code for enhanced run-time performance.
Automatically specifies (decorates) whether and how code could be reliably and efficiently executed as graphs at run-time.
Avoids hybridizing code under certain conditions (e.g., side-effecting code) to preserve semantics.
Optimize Hybrid Function: Transforms code already running as graphs for optimal run-time performance (see the sketch below).
Modifies existing decorator parameters (e.g., tensor shape specs).
Potentially restructures code to be more amenable to graph transformation.
Possibly dehybridizes code when eager execution could be faster (e.g., graph “retracing”).
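For instance, a minimal sketch (assuming TensorFlow 2.x; the function and argument names are hypothetical) of a retracing pattern that Optimize Hybrid Function could target, since each distinct Python scalar argument triggers a new trace:

import tensorflow as tf  # assumes TensorFlow 2.x

@tf.function
def scale(x, factor):  # hypothetical hybrid function
    return x * factor

x = tf.constant([1.0, 2.0])

for f in (1.0, 2.0, 3.0):
    scale(x, f)               # Python scalars: a new trace per value

for f in (1.0, 2.0, 3.0):
    scale(x, tf.constant(f))  # tensors: one trace, reused across calls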
Approach Highlights
Novel tensor analysis for imperative DL code.
Current analyzers work only on procedural (TF 1) code.
Modernization of WALA Ariadne [Dolby et al., 2018] for imperative (TF 2) code is underway.
Implemented as a PyDev Eclipse IDE plug-in [Zadrozny, 2023].
Integrates Ariadne for tensor type inference and (static) shape analysis.
Approach Challenges
Lack of static type information.
Needed to determine candidate functions (at least one Tensor parameter).
Unlike, e.g., Java, Python has no restrictions on decorator (annotation) arguments.
tf.function may be called as a function instead of as a decorator.
Example
hyb_call = tf.function(call)
hyb_call()
Determining tensor shapes.
Existing analyses work only on procedural (TF 1) code.
Working towards statically resolving imperative (TF 2) code (see the sketch below).
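For example, a minimal sketch (assuming TensorFlow 2.x; call is a hypothetical eager function) of the functional form a refactoring must also detect, together with the tf.TensorSpec shape information the analysis aims to resolve statically:

import tensorflow as tf  # assumes TensorFlow 2.x

def call(x):  # hypothetical eager function
    return x * 2

# Functional form: hybridization without decorator syntax.
hyb_call = tf.function(
    call, input_signature=[tf.TensorSpec(shape=[None], dtype=tf.float32)])

print(hyb_call(tf.constant([1.0, 2.0, 3.0])))  # tf.Tensor([2. 4. 6.], ...)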
Conclusion
Imperative Deep Learning code is easier to debug, write, and maintain than traditional DL code that runs in a deferred-execution style.
However, it comes at the expense of (run-time) performance.
Hybrid approaches bridge the gap between eager and graph execution.
Using hybrid techniques to achieve optimal performance and semantics preservation is non-trivial.
Future Work
Automated client-side analyses and transformations to use hybridization APIs correctly and optimally are in progress.
Evaluation using the dataset from our MSR ’22 empirical study [Castro Vélez et al., 2022].
More details in the paper! http://bit.ly/tf2-ase23
For Further Reading I
Abadi, Martín et al. (2016). “TensorFlow: A System for Large-Scale Machine Learning”. In: Symposium on Operating Systems
Design and Implementation.
Agrawal, Akshay et al. (2019). TensorFlow Eager: A Multi-Stage, Python-Embedded DSL for Machine Learning. arXiv:
1903.01855 [cs.PL].
Apache (Apr. 8, 2021). Hybridize. Apache MXNet documentation. url:
https://mxnet.apache.org/versions/1.8.0/api/python/docs/tutorials/packages/gluon/blocks/hybridize.html (visited
on 04/08/2021).
Arpteg, A., B. Brinne, L. Crnkovic-Friis, and J. Bosch (2018). “Software Engineering Challenges of Deep Learning”. In: Euromicro
Conference on Software Engineering and Advanced Applications. IEEE, pp. 50–59. doi: 10.1109/SEAA.2018.00018.
Cao, Junming, Bihuan Chen, Chao Sun, Longjie Hu, and Xin Peng (Dec. 3, 2021). Characterizing Performance Bugs in Deep
Learning Systems. arXiv: 2112.01771 [cs.SE].
Castro Vélez, Tatiana, Raffi Khatchadourian, Mehdi Bagherzadeh, and Anita Raja (May 2022). “Challenges in Migrating
Imperative Deep Learning Programs to Graph Execution: An Empirical Study”. In: International Conference on Mining Software
Repositories. MSR ’22. ACM/IEEE. ACM. doi: 10.1145/3524842.3528455. arXiv: 2201.09953 [cs.SE].
Chen, Tianqi, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang
(2015). “MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems”. In: Workshop on
Machine Learning Systems at NIPS. arXiv: 1512.01274 [cs.DC].
Chollet, François (2020). Deep Learning with Python. 2nd ed. Manning.
Dig, Danny, John Marrero, and Michael D. Ernst (2009). “Refactoring sequential Java code for concurrency via concurrent
libraries”. In: International Conference on Software Engineering. IEEE, pp. 397–407. doi: 10.1109/ICSE.2009.5070539.
Dilhara, Malinda, Ameya Ketkar, Nikhith Sannidhi, and Danny Dig (2022). “Discovering Repetitive Code Changes in Python ML
Systems”. In: International Conference on Software Engineering. ICSE ’22. To appear.
For Further Reading II
Dolby, Julian, Avraham Shinnar, Allison Allain, and Jenna Reinen (2018). “Ariadne: Analysis for Machine Learning Programs”. In:
International Workshop on Machine Learning and Programming Languages. MAPL 2018. ACM SIGPLAN. Philadelphia, PA, USA:
Association for Computing Machinery, pp. 1–10. isbn: 9781450358347. doi: 10.1145/3211346.3211349.
Facebook Inc. (2019). PyTorch Documentation. TorchScript. en. url: https://pytorch.org/docs/stable/jit.html (visited on
02/19/2021).
Jeong, Eunji, Sungwoo Cho, Gyeong-In Yu, Joo Seong Jeong, Dong-Jin Shin, Taebum Kim, and Byung-Gon Chun (July 2019).
“Speculative Symbolic Graph Execution of Imperative Deep Learning Programs”. In: SIGOPS Oper. Syst. Rev. 53.1, pp. 26–33.
issn: 0163-5980. doi: 10.1145/3352020.3352025.
Khatchadourian, Raffi, Yiming Tang, Mehdi Bagherzadeh, and Syed Ahmed (2019). “Safe Automated Refactoring for Intelligent
Parallelization of Java 8 Streams”. In: International Conference on Software Engineering. ICSE ’19. IEEE Press, pp. 619–630. doi:
10.1109/ICSE.2019.00072.
Kim, Miryung, Thomas Zimmermann, and Nachiappan Nagappan (Nov. 2012). “A Field Study of Refactoring Challenges and
Benefits”. In: Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software
Engineering. FSE ’12. Cary, North Carolina: ACM. isbn: 9781450316149. doi: 10.1145/2393596.2393655.
Moldovan, Dan, James M. Decker, Fei Wang, Andrew A. Johnson, Brian K. Lee, Zachary Nado, D. Sculley, Tiark Rompf, and
Alexander B. Wiltschko (2019). AutoGraph: Imperative-style Coding with Graph-based Performance. arXiv: 1810.08061 [cs.PL].
Negara, Stas, Nicholas Chen, Mohsen Vakilian, Ralph E. Johnson, and Danny Dig (2013). “A Comparative Study of Manual and
Automated Refactorings”. In: European Conference on Object-Oriented Programming. Ed. by Giuseppe Castagna. Berlin,
Heidelberg: Springer Berlin Heidelberg, pp. 552–576. isbn: 978-3-642-39038-8.
OpenAI, Inc. (Aug. 18, 2023). ChatGPT. url: https://chat.openai.com (visited on 08/18/2023).
Paszke, Adam et al. (Dec. 3, 2019). PyTorch: An Imperative Style, High-Performance Deep Learning Library. arXiv: 1912.01703
[cs.LG].
For Further Reading III
Zadrozny, Fabio (Apr. 15, 2023). PyDev. url: https://www.pydev.org (visited on 05/31/2023).
Zhou, Weijie, Yue Zhao, Guoqiang Zhang, and Xipeng Shen (2020). “HARP: Holistic Analysis for Refactoring Python-Based
Analytics Programs”. In: International Conference on Software Engineering, pp. 506–517. doi: 10.1145/3377811.3380434.
Appendix
Why Static Analysis?
Refactorings must operate on (at least some) static information.
Must eventually transform the source code.
May eventually integrate hybrid analyses to resolve difficult static cases.
Why Automated Refactoring?
In general, such problems may also be handled by compilers or runtimes; however, refactoring has several benefits:
Gives developers more control over where the optimizations take place and makes graph execution explicit.
Can be issued multiple times, e.g., prior to major releases.
Unlike static checkers, refactorings transform source code, a task that can otherwise be error-prone and involve subtle nuances.
Refactorings can act like recommendation systems, which is important for analyzing and transforming programs written in dynamic languages, where static assumptions may be easily violated!
Refactoring Developer Adoption
Developers generally underuse automated refactorings [Kim et al., 2012, Negara et al., 2013].
Data scientists and engineers may be more open to using automated (refactoring) tools.
Our approach will be fully automated with minimal barrier to entry.
LLMs & Big Data Refactoring
LLMs [OpenAI, Inc., 2023] can also perform refactorings.
Other Big Data-driven refactorings [Dilhara et al., 2022] are exciting and promising.
Obtaining a (correct) dataset large enough to automatically extract the proposed refactorings is challenging as developers struggle with (manually) migrating DL code to graph execution [Castro Vélez et al., 2022].
LLM inference capabilities are currently limited.
LLMs have a token limitation.
Hybridization requires interprocedural analysis.
Notebook Support
We plan to investigate notebook support in the future.
We envision the approach to be used on (larger) DL systems, consisting of multiple files.