This document describes Optimize Streams, a tool that uses automated refactoring and static analysis to optimize Java 8 stream code for improved performance. The tool analyzes stream code to determine when running it in parallel is safe and interference-free. It was evaluated on 11 Java projects totaling roughly 642,000 lines of code, where the refactored stream code showed an average speedup of 1.55. The tool integrates analyses from the WALA and SAFE frameworks, combining a novel ordering analysis with typestate analysis.
A Tool for Optimizing Java 8 Stream Software via Automated Refactoring
Raffi Khatchadourian¹,², Yiming Tang², Mehdi Bagherzadeh³, Syed Ahmed³
IEEE International Working Conference on Source Code Analysis and Manipulation
September 2018, Madrid, Spain
¹ Computer Science, City University of New York (CUNY) Hunter College, USA
² Computer Science, City University of New York (CUNY) Graduate Center, USA
³ Computer Science & Engineering, Oakland University, USA
3. Streaming APIs
• Streaming APIs are widely available in today’s mainstream, object-oriented programming languages [Biboudis et al., 2015].
• They incorporate MapReduce-like operations on native data structures such as collections.
• They can make writing parallel code easier and less error-prone (avoiding data races and thread contention).
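As an illustration of the MapReduce-like style described above, the following minimal sketch runs the same pipeline over a Java collection sequentially and in parallel. The class and method names (`StreamBasics`, `sumOfEvenSquares`, `sumOfEvenSquaresParallel`) are made up for this example and do not come from the talk.

```java
import java.util.Arrays;
import java.util.List;

public class StreamBasics {
    // Sequential pipeline: filter, map, then reduce (sum) over a collection.
    static int sumOfEvenSquares(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)   // keep even numbers
                .mapToInt(n -> n * n)      // square each one
                .sum();                    // reduce to a single value
    }

    // Same pipeline evaluated in parallel; only the stream source call changes.
    static int sumOfEvenSquaresParallel(List<Integer> numbers) {
        return numbers.parallelStream()
                .filter(n -> n % 2 == 0)
                .mapToInt(n -> n * n)
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(sumOfEvenSquares(numbers));         // prints 56
        System.out.println(sumOfEvenSquaresParallel(numbers)); // prints 56
    }
}
```

Both versions give the same result here because the lambdas are stateless and the reduction (integer addition) is associative.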
4. Problem
• MapReduce traditionally runs in highly distributed environments with no shared memory.
• Streaming APIs typically execute on a single node under multiple threads or cores in a shared memory space.
    • Collections reside in local memory.
• Issues may arise from close ties between shared memory and the operations.
• Developers must manually determine whether running stream code in parallel is efficient and interference-free.
    • Requires thorough understanding of the API.
    • Error-prone, possibly requiring complex analysis.
    • Omission-prone, optimization opportunities may be missed.
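To make the shared-memory concern concrete, here is a small sketch contrasting a pipeline whose lambda mutates shared state with one that does not. It is an illustrative example only, not code from the tool or its evaluation subjects; the names `InterferenceExample`, `doubledUnsafe`, and `doubledSafe` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class InterferenceExample {
    // UNSAFE in parallel: the lambda mutates a shared, non-thread-safe list,
    // so elements may be lost or an exception may be thrown.
    static List<Integer> doubledUnsafe(int n) {
        List<Integer> results = new ArrayList<>();
        IntStream.range(0, n).parallel()
                .forEach(i -> results.add(i * 2)); // data race on 'results'
        return results;
    }

    // Safe: no shared mutable state; the collector merges partial results.
    static List<Integer> doubledSafe(int n) {
        return IntStream.range(0, n).parallel()
                .map(i -> i * 2)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("safe size:   " + doubledSafe(10_000).size());
        try {
            // The size printed here can vary from run to run.
            System.out.println("unsafe size: " + doubledUnsafe(10_000).size());
        } catch (RuntimeException e) {
            System.out.println("unsafe version failed: " + e);
        }
    }
}
```

Deciding by hand which of these two shapes a given pipeline has is the kind of analysis the slide describes as error-prone and omission-prone.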
5. Solution
• Fully automated refactoring tool named Optimize Streams.
• Transforms Java 8 stream code for improved performance.
• Publicly available as an open source Eclipse IDE (http://eclipse.org) plug-in (available at http://git.io/vpTLk).
• Includes a fully functional UI, preview pane, and unit tests.
• Based on:
    • Novel ordering analysis.
        • Infers when maintaining ordering is necessary for semantics preservation.
    • Typestate analysis [Fink et al., 2008; Strom and Yemini, 1986].
        • Augments the type system with “state.”
        • Traditionally used for preventing resource usage errors.
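The question the ordering analysis answers can be illustrated with two small pipelines: one whose result depends on encounter order and one whose result does not. This is a hand-written sketch using standard `java.util.stream` calls, not output of the tool; the names `OrderingExample`, `firstLongWord`, and `totalLength` are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

public class OrderingExample {
    // Ordering matters: findFirst() depends on encounter order, so a parallel
    // version must keep the stream ordered to preserve semantics.
    static String firstLongWord(List<String> words) {
        return words.parallelStream()
                .filter(w -> w.length() > 3)
                .findFirst()
                .orElse("");
    }

    // Ordering does not matter: a sum is the same in any order, so the
    // ordering constraint can be dropped with unordered().
    static int totalLength(List<String> words) {
        return words.parallelStream()
                .unordered()
                .mapToInt(String::length)
                .sum();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("to", "stream", "or", "not", "parallel");
        System.out.println(firstLongWord(words)); // prints stream
        System.out.println(totalLength(words));   // prints 21
    }
}
```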
6. • First to integrate automated refactoring with typestate analysis (to the best of our knowledge).
• Uses the WALA static analysis framework (http://wala.sf.net) and the SAFE typestate analysis engine (http://git.io/vxwBs).
• Combines analysis results from different intermediate representations (SSA, AST).
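The idea of attaching "state" to a stream can be sketched as a tiny state machine that follows an ordering property across a chain of operations. This is a hypothetical, heavily simplified illustration: it does not use the WALA or SAFE APIs, and the states, operation names, and transitions below are assumptions rather than the automaton used by the tool.

```java
import java.util.Arrays;
import java.util.List;

public class StreamTypestateSketch {
    // Hypothetical abstract states for one property of a stream.
    enum Ordering { ORDERED, UNORDERED }

    // Transition function: how each (illustrative) operation changes the state.
    static Ordering step(Ordering current, String operation) {
        switch (operation) {
            case "sorted":    return Ordering.ORDERED;   // imposes an order
            case "unordered": return Ordering.UNORDERED; // drops the constraint
            default:          return current;            // e.g. filter, map
        }
    }

    // Runs the state machine over a sequence of operation names.
    static Ordering finalState(Ordering initial, List<String> operations) {
        Ordering state = initial;
        for (String op : operations) {
            state = step(state, op);
        }
        return state;
    }

    public static void main(String[] args) {
        // A stream built from a HashSet would start out unordered.
        System.out.println(finalState(Ordering.UNORDERED,
                Arrays.asList("filter", "sorted", "map")));    // prints ORDERED
        // A stream built from a List would start out ordered.
        System.out.println(finalState(Ordering.ORDERED,
                Arrays.asList("map", "unordered", "filter"))); // prints UNORDERED
    }
}
```

A real typestate analysis would run over program paths and aliases instead of a flat list of names, which is where an engine such as SAFE comes in.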
10. Preliminary Results
• Applied to 11 Java projects of varying size and domain, totaling ∼642 KSLOC.
• 36.31% of candidate streams were refactorable.
• Observed an initial average speedup of 1.55 during performance testing.
• See the paper for more details, including user feedback and tool and data set engineering challenges.
12. • Optimize Streams is an open source, automated refactoring tool that assists developers with writing optimal Java 8 stream code.
• Integrates an Eclipse refactoring with the advanced static analyses offered by WALA and SAFE.
• The tool was assessed on 11 Java projects totaling ∼642 thousand lines of code.
• A preliminary study observed a speedup of 1.55 on the refactored code.
13. For Further Reading
Biboudis, Aggelos, Nick Palladinos, George Fourtounis, and Yannis Smaragdakis (2015).
“Streams à la carte: Extensible Pipelines with Object Algebras”. In: ECOOP,
pp. 591–613. doi: 10.4230/LIPIcs.ECOOP.2015.591.
Fink, Stephen J., Eran Yahav, Nurit Dor, G. Ramalingam, and Emmanuel Geay (2008).
“Effective Typestate Verification in the Presence of Aliasing”. In: ACM TOSEM 17.2,
pp. 9:1–9:34. doi: 10.1145/1348250.1348255.
Strom, Robert E. and Shaula Yemini (1986). “Typestate: A programming language
concept for enhancing software reliability”. In: IEEE TSE SE-12.1, pp. 157–171. doi:
10.1109/tse.1986.6312929.
14. Provocative Statements
1. Actual streaming API usage does not match the usage that the API
designers envisioned.
Question
What are the consequences for future versions of such APIs?
2. Using streaming APIs in mainstream, Object-Oriented languages
has many benefits, such as conciseness and succinct
parallelism, but hinders code reuse, thus promoting clones.
Question
Is writing multiple, similar lambda expressions easier than writing
reusable functions?
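The trade-off behind this question can be seen in a short sketch: the same lambda repeated inline in two pipelines versus a predicate that is named once and reused. The example is illustrative only; `LambdaReuseExample`, `IS_LONG`, and the method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class LambdaReuseExample {
    // Clone-prone style: the same lambda is written inline in each pipeline.
    static long countLongInline(List<String> words) {
        return words.stream().filter(w -> w.length() > 3).count();
    }

    static List<String> longWordsInline(List<String> words) {
        return words.stream().filter(w -> w.length() > 3)
                .collect(Collectors.toList());
    }

    // Reusable style: the predicate is defined once and shared.
    static final Predicate<String> IS_LONG = w -> w.length() > 3;

    static long countLongReusable(List<String> words) {
        return words.stream().filter(IS_LONG).count();
    }

    static List<String> longWordsReusable(List<String> words) {
        return words.stream().filter(IS_LONG).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("to", "stream", "or", "not", "parallel");
        System.out.println(countLongInline(words));   // prints 2
        System.out.println(longWordsReusable(words)); // prints [stream, parallel]
    }
}
```

The inline form is shorter at each call site, while the named predicate gives a single place to change the rule.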