The document discusses the artifact evaluation experience at the CGO and PPoPP 2015 conferences. It provides an overview of the joint artifact evaluation process used, which involved authors submitting related materials that were evaluated over 2 weeks by committees. The highest ranked artifacts from each conference won prizes. Challenges discussed include differing software/hardware environments and accessing proprietary tools/benchmarks. Suggestions are made to improve reproducibility, such as providing common infrastructure and expanding hardware/software resources.
2. Grigori Fursin and Bruce Childers “Artifact Evaluation Experience: CGO and PPoPP 2015”
Outline
• What is Artifact Evaluation (AE)?
• Joint AE process for CGO’15 and PPoPP’15
• Two Prizes for highest-ranked artifacts from CGO and PPoPP
• Challenges
• Suggestions for future AE
Sponsors
Some issues
Article
Tools
Scripts
Hardware
Simulators
Benchmarks
Data sets
Libraries
OS
Compilers
VMs
Related material
Experimental results
Databases
Some issues
Rising number of articles. Where is the related material?
Why bother?
• Time consuming - waste of time
• Not needed for promotion
• Life span – MS/PhD/project
• Can cause competition
Some issues
Rising number of articles
• Difficult or even impossible to reproduce results from publications
• Demotivating to redevelop past techniques
• Little trust from industry
• Computer engineering is often considered hacking - difficult to attract students
Possible solution:
• Make it sexy to share code and data (at least to reproduce results)
• Engage with the community
Data mandates from governmental funding agencies
What is Artifact Evaluation (AE)?
Authors of accepted articles have the option to submit related material for evaluation by an AE committee
PC members nominate one or two senior students/engineers for the AE committee
• Abstract
• Packed artifact (or remote access)
• ReadMe (how to validate results)
~2 weeks for evaluation, at least 2 reviews per artifact, 4 days for rebuttal
• Summary and contributions of the paper
• Artifact packaging and reproducibility
• Artifact implementation and usability
• Overall assessment
• On what platform/how the artifact was evaluated
Ranking:
1. Significantly exceeded expectations
2. Exceeded expectations
3. Met expectations
4. Fell below expectations
5. Significantly fell below expectations
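The review rules above (at least 2 reviews per artifact, a five-level ranking where 1 is best) can be sketched in a few lines of Python. This is only an illustration: the slides do not say how individual reviews were combined, so averaging the scores, the pass threshold of 3, and the names `aggregate`, `met_expectations`, and the artifact labels are all assumptions.

```python
from statistics import mean

# Ranking scale taken from the slide: 1 is best, 5 is worst.
SCALE = {
    1: "Significantly exceeded expectations",
    2: "Exceeded expectations",
    3: "Met expectations",
    4: "Fell below expectations",
    5: "Significantly fell below expectations",
}

MIN_REVIEWS = 2  # "at least 2 reviews per artifact"


def aggregate(reviews):
    """Return (artifact, mean score) pairs sorted best first.

    `reviews` maps a hypothetical artifact label to a list of scores.
    Averaging is an assumption; the slides do not specify the method.
    """
    ranked = []
    for artifact, scores in reviews.items():
        if len(scores) < MIN_REVIEWS:
            raise ValueError(f"{artifact}: needs at least {MIN_REVIEWS} reviews")
        if any(score not in SCALE for score in scores):
            raise ValueError(f"{artifact}: scores must be between 1 and 5")
        ranked.append((artifact, mean(scores)))
    # Lower score means a better ranking.
    return sorted(ranked, key=lambda item: item[1])


def met_expectations(mean_score):
    """Assumed pass rule: an average of 'Met expectations' or better."""
    return mean_score <= 3


if __name__ == "__main__":
    example = {"artifact-A": [1, 2], "artifact-B": [3, 4, 3]}
    for name, score in aggregate(example):
        print(name, round(score, 2), met_expectations(score))
```

Requiring the minimum review count up front means an artifact with a single review is reported as an error instead of silently receiving a ranking.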
Joint AE process for CGO’15 and PPoPP’15
CGO/PPoPP’15 organizers:
Aaron Smith, Kunle Olukotun, Robert Hundt, Jason Mars, Chris Fensch
Albert Cohen, David Grove, Calin Cascaval
Acknowledgments:
Reviewers:
David Boehme, Santiago Bock, Lingda Li, Lin Ma, Yiannis Nikolakopulos, Jeeva Paudel, Paul Thomson, Peter Libic, Dave Wilkinson, Weiwei Chen, Riyadh Baghdadi, Na Meng, Arun Raman, Bapi Chatterjee, Martin Maas, Vojtech Horky, Vasileios Trigonakis, Mahdi Eslamimehr, Yuhao Zhu, Melanie Kambadur, Michael Laurenzano
Related AE:
Shriram Krishnamurthi
Authors:
8 submitted artifacts for CGO and 10 for PPoPP
Accepted artifacts
CGO’15
cTuning.org/event/ae-cgo2015
• Locality-Centric Thread Scheduling for Bulk-synchronous Programming Models on CPU Architectures
Hee-Seok Kim, Izzat El Hajj, John Stratton, Steven Lumetta and Wen-Mei Hwu
• MemorySanitizer: fast detector of uninitialized memory use in C++
Evgeniy Stepanov and Konstantin Serebryany
• A Parallel Abstract Interpreter for JavaScript
Kyle Dewey, Vineeth Kashyap and Ben Hardekopf
• A Graph-Based Higher-Order Intermediate Representation
Roland Leißa, Marcel Köster and Sebastian Hack
• Optimizing the flash-RAM energy trade-off in deeply embedded systems
James Pallister, Kerstin Eder and Simon J. Hollis
• Scalable Conditional Induction Variable (CIV) Analysis
Cosmin E. Oancea and Lawrence Rauchwerger
PPoPP’15
cTuning.org/event/ae-cgo2015
• NUMA-aware Graph-structured Analytics
Kaiyuan Zhang, Rong Chen and Haibo Chen
• Predicate RCU: An RCU for Scalable Concurrent Updates
Maya Arbel and Adam Morrison
• Scalable and Efficient Implementation of 3D Unstructured Meshes Computation: A Case Study on Matrix Assembly
Loïc Thébault, Eric Petit and Quang Dinh
• VirtCL: A Framework for OpenCL Device Abstraction and Management
Yi-Ping You, Hen-Jung Wu, Yeh-Ning Tsai and Yen-Ting Chao
• Dynamic deadlock verification for general barrier synchronisation
Tiago Cogumbreiro, Raymond Hu, Francisco Martins and Nobuko Yoshida
• Low-Overhead Software Transactional Memory with Progress Guarantees and Strong Semantics
Minjia Zhang, Jipeng Huang, Man Cao and Michael Bond
• The SprayList: A Scalable Relaxed Priority Queue
Justin Kopinsky, Dan Alistarh, Jerry Li and Nir Shavit
• Performance Implications of Dynamic Memory Allocators on Transactional Memory Systems
Alexandro Baldassin, Edson Borin and Guido Araujo
• More than You Ever Wanted to Know about Synchronization
Vincent Gramoli
• Cache-Oblivious Wavefront: Improving Parallelism of Recursive DP Algorithms without Losing Cache-efficiency
Yuan Tang, Ronghui You, Haibin Kan, Jesmin Tithi, Pramod Ganapathi and Rezaul Chowdhury
Highest-ranked artifacts from CGO and PPoPP
1st place
2nd place
Quadro K6000
(will be shipped directly)
Acer C720P
“The SprayList: A scalable relaxed priority queue”
Justin Kopinsky, Dan Alistarh, Jerry Li and Nir Shavit
“A graph-based higher-order intermediate representation”
Roland Leißa, Marcel Köster and Sebastian Hack
Challenges
We need your feedback to improve AE!
• Should we replicate or reproduce results?
• Should we allow reviewers to communicate with authors (while keeping anonymity)?
• Can we slightly change experimental setups?
• What if new results invalidate paper claims?
• Do we need to be able to reinstall tools from scratch?
• Is the artifact consistent with the paper? Well documented? Easy to use?
Need to provide better guidelines for authors and reviewers!
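One way to make the "replicate or reproduce" and "slightly changed setup" questions concrete is to compare a reviewer's measurement against the value reported in the paper using an explicit tolerance. The sketch below is only an illustration: the 10% default tolerance, the verdict labels, and the function names are assumptions, not part of the AE process described in these slides.

```python
def relative_deviation(reported, measured):
    """Relative difference between the paper's value and the reviewer's value."""
    if reported == 0:
        raise ValueError("reported value must be non-zero")
    return abs(measured - reported) / abs(reported)


def classify(reported, measured, tolerance=0.10):
    """Label a re-measured result.

    The tolerance is a hypothetical threshold that reviewers and authors
    would have to agree on; a different platform or compiler version may
    justify a looser bound.
    """
    deviation = relative_deviation(reported, measured)
    if deviation <= tolerance:
        return "consistent"
    return "needs discussion"


if __name__ == "__main__":
    # Hypothetical speedups: 2.0x claimed in the paper, 2.1x seen by a reviewer.
    print(classify(2.0, 2.1))
    # A much larger gap would be raised with the authors during rebuttal.
    print(classify(2.0, 3.0))
```

Stating the tolerance up front gives authors and reviewers a shared criterion before any numbers are compared.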
Challenges
Different SW/HW
[Word cloud of software and hardware that varies between setups: GCC 4.1.x, 4.3.x, 4.4.x, 4.5.x, 4.6.x, 4.9.x, 5.x; ICC 11.0, 11.1, 12.0; LLVM 2.6, 2.8, 3.x; MVS 2013; XLC; Open64; Jikes; HMPP; OpenMP; MPI; OpenCL; CUDA 5.x; ARM v8; SSE4; perf; PAPI; TAU; Scalasca; ATLAS; MKL; SimpleScalar; LTO; function-level; hardware counters; pass reordering; frequency; genetic algorithms; predictive scheduling; polyhedral transformations; KNN; bandwidth; memory size; cache size; threads; execution time; algorithm precision]
• 6 VirtualBox images (2x2GB, 1x20GB); do not include unrelated software such as OpenOffice, GNOME, …
• 2 VMware images (proprietary)
• 2 CDE
• 1 Docker
• 3 accesses to a remote machine with preinstalled software
• 4 compressed tarballs
VMs are not good for performance evaluation!
We should have a large pool of qualified reviewers who can install tools and know some basic script debugging
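As a quick consistency check, the packaging counts listed above can be tallied against the number of submitted artifacts (8 for CGO and 10 for PPoPP). The sketch below uses only numbers from the slides; the dictionary keys and variable names are illustrative labels.

```python
# Packaging formats and counts as listed on the slide.
packaging = {
    "VirtualBox image": 6,
    "VMware image": 2,
    "CDE package": 2,
    "Docker image": 1,
    "remote machine access": 3,
    "compressed tarball": 4,
}

# Submitted artifacts per conference, from the acknowledgments slide.
submitted = {"CGO": 8, "PPoPP": 10}

total_packages = sum(packaging.values())
total_submitted = sum(submitted.values())

# Artifacts shipped as full virtual machine images (VirtualBox or VMware).
vm_based = packaging["VirtualBox image"] + packaging["VMware image"]
vm_share = vm_based / total_packages

if __name__ == "__main__":
    print(f"packages listed: {total_packages}, artifacts submitted: {total_submitted}")
    print(f"share delivered as VM images: {vm_share:.0%}")
```

The totals agree (18 in both cases), and close to half of the artifacts arrived as VM images, which is why the warning about VMs and performance evaluation matters for this pool of submissions.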
Challenges
• Accessing proprietary/paid/large benchmarks (SPEC2006, EEMBC, etc.)
Authors should add some benchmarks/data sets to test their code. If the benchmarks/data sets are proprietary, please provide a couple of synthetic or public ones
• Installing proprietary/paid/large tools such as Intel compilers and performance analysis tools
• Reinstalling large software tools with many dependencies
• Accessing non-public tools (such as large, academic, unreleased compilers), say just to validate 1 pass
• Getting access to very rare and/or powerful hardware (e.g. clusters, supercomputers, or hardware with specific counters such as those measuring consumed energy)
• Getting anonymous access to the authors’ machines
• Requiring sole and long access to (authors’) busy machines (say for performance or energy tuning)
Some ideas
• Arrange an AE server with the most commonly used software pre-installed and with access to some hardware
• FPGAs
• Microcontrollers
• ARM/Qualcomm/Intel development boards
• Arrange access to various distributed machines at authors’ sites with pre-installed tools
• Arrange access to the most commonly used clusters (registration will be done by AE chairs to preserve anonymity of the reviewers):
• XSEDE, PRACE, GRID5000, CINES, opensciencegrid.org
• Build a pool of good artifact evaluators - need to get at least 3 reviews per artifact (2 is not enough)
• Develop common experimental infrastructure (workflows, meta-information)?
http://github.com/ctuning/ck http://cknowledge.org/repo
Discussions with ACM about formalization / meta-description / stamp.
Keep in touch
AE for CGO/PPoPP:
• Grigori Fursin, grigori.fursin@cTuning.org
• Bruce Childers, childers@cs.pitt.edu
AE for PLDI/OOPSLA
• Shriram Krishnamurthi
• Jan Vitek
• Eric Eide
Our projects:
• http://cknowledge.org/reproducibility
• http://www.occamportal.org
• http://github.com/ctuning/ck
Sponsors are welcome!