Putting Compilers to Work
August 19th, 2015 Drew Paroski
Why do compilers and runtimes matter?
•Virtually all modern software is powered by
compilers
•Compilers can have a huge effect on the
efficiency and performance of software
•They can also affect how programmers develop
software
Efficiency and Performance
•Qualitative:
•Change what user experiences are possible
•Quantitative:
•Reduce CPU and resource usage
•This often translates into reduced costs
and/or increased revenue
How do compilers work?
•Demo: Demystifying machine code
Let’s look at a 50-line C++ program that
generates machine code at run time and
invokes it
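The demo's C++ source is not included in these slides. As a rough stand-in, the following Python sketch shows the same idea: write a few bytes of machine code into memory at run time and call them. It assumes an x86-64 CPU and a Unix-like OS that allows a mapping to be both writable and executable; the name `run_generated_code` and the six-byte routine are illustrative, not taken from the demo.

```python
import ctypes
import mmap
import platform

# x86-64 machine code for: mov eax, 42 ; ret
CODE = bytes([0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3])


def run_generated_code():
    """Copy CODE into executable memory and call it; None if unsupported."""
    if platform.machine() not in ("x86_64", "AMD64"):
        return None  # the bytes above are only valid on x86-64
    if not hasattr(mmap, "PROT_EXEC"):
        return None  # e.g. Windows, where mmap takes no prot argument
    try:
        buf = mmap.mmap(-1, mmap.PAGESIZE,
                        prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
    except OSError:
        return None  # hardened systems may refuse writable+executable pages
    buf.write(CODE)
    # Take the address of the mapping and treat it as a C function
    # that takes no arguments and returns an int.
    address = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    func = ctypes.CFUNCTYPE(ctypes.c_int)(address)
    return func()


print(run_generated_code())
```

A real JIT does the same thing at a larger scale: it decides which bytes to emit based on the program being compiled, rather than using a fixed byte string.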
Example Compilation Pipeline (HHVM)
[Diagram: PHP Source → parse → Abstract Syntax Tree (AST) → emit → Virtual ISA (aka bytecode) → analyze → Intermediate Representation → codegen → x64 Machine Code → CPU. The HHVM Runtime's interpreter can also execute the bytecode directly (interpretation).]
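To make the parse, emit, and interpretation stages concrete, here is a toy pipeline for integer arithmetic. It is only a sketch: Python's `ast` module stands in for a real parser, and the `PUSH`/`ADD`/`SUB`/`MUL` instruction set and the `emit` and `interpret` functions are made up for illustration. It has nothing to do with HHVM's actual bytecode.

```python
import ast
import operator

# Binary operators supported by this toy pipeline.
BINOPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}
HANDLERS = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul}


def emit(node, code):
    """Walk the AST and append stack-machine instructions to `code`."""
    if isinstance(node, ast.Expression):
        emit(node.body, code)
    elif isinstance(node, ast.Constant) and isinstance(node.value, int):
        code.append(("PUSH", node.value))
    elif isinstance(node, ast.BinOp) and type(node.op) in BINOPS:
        emit(node.left, code)
        emit(node.right, code)
        code.append((BINOPS[type(node.op)], None))
    else:
        raise ValueError(f"unsupported syntax: {ast.dump(node)}")
    return code


def interpret(code):
    """Execute the bytecode on an operand stack and return the result."""
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(HANDLERS[opcode](left, right))
    return stack.pop()


tree = ast.parse("2 + 3 * 4", mode="eval")  # parse: source -> AST
bytecode = emit(tree, [])                    # emit: AST -> bytecode
result = interpret(bytecode)                 # interpretation
print(bytecode, result)
```

A JIT would add the remaining stages: analyze the bytecode, lower it to an intermediate representation, and generate machine code instead of interpreting.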
The Rise of Web Programming
•1993: CGI scripting (Perl / C)
•1995: JavaScript, PHP, and Java applets
•1996: Flash, ASP, and ActiveX
•1999: JSP
•2004: Ruby on Rails
•2005: Django (Python)
•2007: Silverlight (XAML / C#)
•Engines for most of these languages were
interpreters
•Often ~100x slower (or more) than C and C++
•Most interpreters don’t have a noticeable
compilation delay
•Supports a rapid iterative development
workflow (edit-save-run)
Web Programming Languages
•Websites grew in their complexity
•Developers started to bump against the
limitations of early engines’ performance
•Bonus points: Try to preserve the rapid
iterative development workflow of these
languages (edit-save-run)
Building Efficient Compilers for
Dynamic Languages
Compiler Improvements in Web Programming
•2002: ASP.NET
•2007: Cython
•2008: JRuby
•2009: TraceMonkey, WebKit SFX/Nitro, V8
•2010: HipHop for PHP, PyPy 1.2
•2011: HHVM
•2014: Pyston
•Let’s talk about performance optimization in
general
Improving software performance
Three Areas to Optimize
1) Data fetching and I/O in general
2) How memory is used on an individual machine
3) How computation is actually performed by the
CPU
•Always measure using profiling tools to
determine what areas you should focus your
optimization efforts on
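As one way to act on the "always measure" advice, the sketch below runs a function under Python's standard `cProfile` module and prints the most expensive calls. The workload `slow_sum` and the helper `profile_call` are hypothetical names for illustration; in practice you would profile your real entry point.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical workload standing in for real application code."""
    return sum(i * i for i in range(n))


def profile_call(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)  # top 5 by cumulative time
    return result, stream.getvalue()


result, report = profile_call(slow_sum, 100_000)
print(report)
```

The report shows where time is actually spent, which is the evidence you need before choosing which of the three areas to work on.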
•Typically the biggest issue to look at
•Does your application frequently stall while
waiting for data?
Data Fetching and other I/O
[Figure: wall-time timeline alternating between CPU work and periods blocked on I/O]
•Are you fetching more data than you need to?
•Are you making lots of round trips in a serial
fashion?
•Fetch data in batches to reduce # of round trips
•Is your application still blocking a lot?
•Use async I/O APIs
Data Fetching and other I/O
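The difference between serial round trips and overlapped requests can be sketched with Python's `asyncio`. Here `fetch` is a hypothetical stand-in for a network or database call, simulated with a sleep; the function names and the delay are assumptions for illustration.

```python
import asyncio


async def fetch(key, delay=0.05):
    """Hypothetical stand-in for a network or database call."""
    await asyncio.sleep(delay)
    return f"value-for-{key}"


async def fetch_serial(keys):
    # One round trip at a time: total latency grows with len(keys).
    return [await fetch(k) for k in keys]


async def fetch_concurrent(keys):
    # Issue all requests at once and wait for them together, so the
    # waits overlap instead of adding up.
    return list(await asyncio.gather(*(fetch(k) for k in keys)))


print(asyncio.run(fetch_concurrent(["a", "b", "c"])))
```

Both functions return the same values in the same order. The serial version waits roughly `len(keys)` times the per-request latency, while the concurrent version waits roughly one latency. Batching, where the backend supports it, goes one step further by sending a single request for all keys.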
•Are you taking a lot of cache misses?
•Cache misses slow things down by making the CPU
stall
•Pay attention to your application’s memory usage
•Use as little memory as possible
•Avoid pointer chasing
How is memory used?
•Is your application repeatedly re-computing
something that could be memoized?
•Are common operations as efficient as possible?
•Is your application needlessly making lots of
short-lived allocations?
•Is there unnecessary contention with locks or
interlocked operations?
How is computation performed on the CPU?
•Compilers require a fair amount of
investment before they pay off
•Have you exhausted the low hanging fruit
from other avenues of performance
optimization?
•Are there better engines already out there
you can use?
When to invest in compilers
•Before building anything, think about what your
goals really are
• How much better does execution performance need
to be? 2x? 4x?
• Do you need fast compile times? How fast?
• Do you need to support a rapid iterative workflow?
• Do these goals help improve the user experience for
your product or your company’s bottom line?
What are your goals?
•Everyone wants the best performance, but it doesn’t
come for free
•Do you really need 5x better execution performance?
Or would 2x be good enough for the next few years?
•Is it worth it for your company to pay one or more
engineers to work on this instead of other things
your company needs?
What are your goals?
•Does your use case require doing compilation
on-line?
•On average, how big are the source programs
that are being compiled?
•Does the code your compiler outputs execute for
short periods of time, or is it longer running?
•Is your source language statically typed or
dynamically typed?
What are the constraints of your use case?
• Do you need fine control over low-level / native stuff?
• Do you need to integrate with existing native code or data
structures?
• Do you need to work on multiple processors/platforms?
• How quickly do you need a working solution?
• What’s the best fit for your use case?
• Ahead of time (AOT) compilation
• Just-in-time (JIT) compilation
• Interpreted execution
What are the constraints of your use case?
•Full custom approach
•Interpreter approach
•Transpiler approach
•Build on top of an existing backend (LLVM, JVM)
•Meta-tracing / partial evaluation frameworks
(PyPy, Truffle)
Approaches
•Easy to write, maintain, and debug
•Can be built relatively quickly
•Very low compile times
•Well-crafted interpreters can deliver better
execution performance than you probably think
Interpreters
•Interpreters built to execute a virtual ISA (bytecode)
tend to be faster than AST interpreters
•Generally try to limit branches in your opcode
handlers
• Common trick: split up an opcode into several separate
opcodes
•The dispatch loop is one of the major sources of
overhead; invest some time in optimizing it
Interpreters
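The two interpreter tips above (split opcodes to avoid branches in handlers, and keep the dispatch loop tight) can be sketched as follows. The opcode set, the handler names, and the `run` function are made up for illustration. Python will not show the performance benefit that a C or C++ interpreter would; the point is the structure.

```python
# Opcode numbers double as indexes into the handler table.
LOAD_CONST, LOAD_LOCAL, ADD, RETURN = range(4)


def op_load_const(vm, arg):
    vm["stack"].append(arg)


def op_load_local(vm, arg):
    vm["stack"].append(vm["locals"][arg])


def op_add(vm, arg):
    right = vm["stack"].pop()
    vm["stack"][-1] += right


# A single generic LOAD opcode would need an `if` on the operand kind in
# its handler. Splitting it into LOAD_CONST and LOAD_LOCAL moves that
# decision to bytecode-emission time, so each handler is branch-free.
HANDLERS = [op_load_const, op_load_local, op_add]


def run(code, local_values):
    """Dispatch loop: fetch an instruction, look up its handler, call it."""
    vm = {"stack": [], "locals": local_values}
    pc = 0
    while True:
        opcode, arg = code[pc]
        pc += 1
        if opcode == RETURN:
            return vm["stack"].pop()
        HANDLERS[opcode](vm, arg)  # table dispatch, not an if/elif chain


program = [(LOAD_LOCAL, 0), (LOAD_CONST, 10), (ADD, None), (RETURN, None)]
print(run(program, [32]))
```

In a native interpreter the same table-driven idea is often implemented with a jump table or computed gotos.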
•A transpiler converts one programming language to another,
and relies on an existing compiler to compile and
run the target language (typically C or C++)
•Can be built fairly quickly, relatively easy to debug
•Can deliver better execution perf than interpreters
•Examples: Cython, HipHop for PHP
Transpiler approach
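A minimal sketch of the transpiler idea: translate a tiny subset of Python arithmetic into C source text and leave the actual compilation to an existing C compiler. The functions `to_c_expr` and `transpile`, and the choice of `long` for every value, are assumptions made for this example; this is not how Cython or HipHop for PHP work internally.

```python
import ast

C_OPERATORS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}


def to_c_expr(node):
    """Translate a small subset of Python expressions to a C expression."""
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return str(node.value)
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.BinOp) and type(node.op) in C_OPERATORS:
        left = to_c_expr(node.left)
        right = to_c_expr(node.right)
        return f"({left} {C_OPERATORS[type(node.op)]} {right})"
    raise ValueError(f"unsupported syntax: {ast.dump(node)}")


def transpile(name, params, source):
    """Emit a C function computing `source` over long-typed parameters."""
    body = to_c_expr(ast.parse(source, mode="eval").body)
    args = ", ".join(f"long {p}" for p in params)
    return f"long {name}({args}) {{ return {body}; }}"


c_source = transpile("area", ["w", "h"], "w * h + 1")
print(c_source)
```

Even this toy shows a semantic gap: Python integers have arbitrary precision, while a C `long` can overflow, so the generated code is not a faithful translation for large values.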
• Often has lower ceiling for best possible long-term
performance vs. the full custom approach
• Primitive operations the source language exposes often
do not cleanly map onto the primitive operations exposed
by the target language
• Transpiler architecture can get unwieldy as your
system evolves and you squeeze for more perf
• For use cases that involve compiling medium-sized
programs or larger, transpiling to C/C++ effectively
locks you into an AOT-compilation-only model
Downsides of Transpilers
•LLVM is the backend for the clang C/C++ compiler
•Unlike most compiler backends, LLVM has well
designed external facing APIs
•Most suitable for compiling statically typed
languages where longer compile times are
acceptable
•LLVM is good when you need tight integration
with existing native code where perf really matters
Building on top of LLVM
•Examples: Scala, JRuby
•Works really well if the source language was
designed with the JVM in mind
•Not great if you need tight integration with
existing native code where perf matters
•Suffers from some of the same problems as
transpilers if your source language wasn’t
designed with the JVM in mind
Building on top of the JVM
•Examples: PyPy, Truffle
•Takes an interpreter for your source language as input
•Analyzes the interpreter implementation, stitches
together code from different opcode handlers, and
then does optimization passes on these fragments
and emits them to machine code
•New and therefore not as proven as other
approaches
Meta-tracing and partial evaluation frameworks
•Most expensive option in terms of time/effort
•Gives you maximum control over every part of
your system, lets you craft everything to your
exact use case
•Can produce the best possible execution
performance and compile-time performance
•Ex: JVM, .NET, gcc, clang, V8, HHVM
Full Custom Approach
•Major risks:
•Can take too long to build
•Can go off the rails if you don’t have the proper
expertise
Full Custom Approach
•Depending on the approach you take, you’ll
need to optimize different parts of the system
•Memory, memory, memory
•Reduce memory usage and cache misses
•Optimize your runtime’s binary layout
•Try out jemalloc or tcmalloc
•Try out Linux’s “huge pages” feature
Optimization Advice
What’s Next for Compilers?
•Continued focus on engines that both
deliver superior execution performance and
support the rapid iterative development
workflow
Predictions: What’s Next?
•DB query compilation will be an interesting space
to watch
•Disk is not the bottleneck anymore
•Growing demand for real-time analytics on huge,
constantly-changing datasets
•I think we’ll see different database systems striving
to deliver the highest quality SQL->machine code
compilation
Questions?

A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 

Kürzlich hochgeladen (20)

Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 

Putting Compilers to Work

  • 1. Putting Compilers to Work August 19th, 2015 Drew Paroski
  • 2. Why do compilers and runtimes matter? •Virtually all modern software is powered by compilers •Compilers can have a huge effect on the efficiency and performance of software •Also can affect how programmers develop software
  • 3. Efficiency and Performance •Qualitative: •Change what user experiences are possible •Quantitative: •Reduce CPU and resource usage •This often translates into reduced costs and/or increased revenue
  • 4. How do compilers work? •Demo: Demystifying machine code Let’s look at a 50-line C++ program that generates machine code at run time and invokes it
  • 5.
  • 6. Example Compilation Pipeline (HHVM) [Diagram: PHP Source → (parse) → Abstract Syntax Tree (AST) → (emit) → Virtual ISA (aka bytecode) → (analyze) → Intermediate Representation → (codegen) → x64 Machine Code → CPU; the Virtual ISA can also be executed by the Interpreter in the HHVM Runtime (interpretation)]
  • 7. The Rise of Web Programming •1993: CGI scripting (Perl / C) •1995: Javascript, PHP, and Java applets •1996: Flash, ASP, and ActiveX •1999: JSP •2004: Ruby on Rails •2005: Django (Python) •2007: Silverlight (XAML / C#)
  • 8. Web Programming Languages •Engines for most of these languages were interpreters •Often ~100x slower (or more) than C and C++ •Most interpreters don’t have a noticeable compilation delay •Supports a rapid iterative development workflow (edit-save-run)
  • 9. Building Efficient Compilers for Dynamic Languages •Websites grew in their complexity •Developers started to bump against the limitations of early engines’ performance •Bonus points: Try to preserve the rapid iterative development workflow of these languages (edit-save-run)
  • 10. Compiler Improvements in Web Programming •2002: ASP.NET •2007: Cython •2008: JRuby •2009: TraceMonkey, WebKit SFX/Nitro, V8 •2010: HipHop for PHP, PyPy 1.2 •2011: HHVM •2014: Pyston
  • 11. Improving software performance •Let’s talk about performance optimization in general
  • 12. Three Areas to Optimize 1) Data fetching and I/O in general 2) How memory is used on an individual machine 3) How computation is actually performed by the CPU •Always measure using profiling tools to determine what areas you should focus your optimization efforts on
  • 13. Data Fetching and other I/O •Typically the biggest issue to look at •Does your application frequently stall while waiting for data? [Diagram: timeline alternating between CPU work and blocking on I/O; x-axis: wall time]
  • 14. Data Fetching and other I/O •Are you fetching more data than you need to? •Are you making lots of round trips in a serial fashion? •Fetch data in batches to reduce # of round trips •Is your application still blocking a lot? •Use async I/O APIs
  • 15. How is memory used? •Are you taking a lot of cache misses? •Cache misses slow things down by making the CPU stall •Pay attention to your application’s memory usage •Use as little memory as possible •Avoid pointer chasing
  • 16. How is computation performed on the CPU? •Is your application repeatedly re-computing something that could be memoized? •Are common operations as efficient as possible? •Is your application needlessly making lots of short-lived allocations? •Is there unnecessary contention with locks or interlocked operations?
  • 17. When to invest in compilers •Compilers require a fair amount of investment before they pay off •Have you exhausted the low-hanging fruit from other avenues of performance optimization? •Are there better engines already out there you can use?
  • 18. What are your goals? •Before building anything, think about what your goals really are • How much better does execution performance need to be? 2x? 4x? • Do you need fast compile times? How fast? • Do you need to support a rapid iterative workflow? • Do these goals help improve the user experience for your product or your company’s bottom line?
  • 19. What are your goals? •Everyone wants the best performance, but it doesn’t come for free •Do you really need 5x better execution performance? Or would 2x be good enough for the next few years? •Is it worth it for your company to pay one or more engineers to work on this instead of other things your company needs?
  • 20. What are the constraints of your use case? •Does your use case require doing compilation on-line? •On average, how big are the source programs that are being compiled? •Does the code your compiler outputs execute for short periods of time, or is it longer running? •Is your source language statically typed or dynamically typed?
  • 21. What are the constraints of your use case? • Do you need fine control over low-level / native stuff? • Do you need to integrate with existing native code or data structures? • Do you need to work on multiple processors/platforms? • How quickly do you need a working solution? • What’s the best fit for your use case? • Ahead of time (AOT) compilation • Just-in-time (JIT) compilation • Interpreted execution
  • 22. Approaches •Full custom approach •Interpreter approach •Transpiler approach •Build on top of an existing backend (LLVM, JVM) •Meta-tracing / partial evaluation frameworks (PyPy, Truffle)
  • 23. Interpreters •Easy to write, maintain, and debug •Can be built relatively quickly •Very low compile times •Well-crafted interpreters can deliver better execution performance than you probably think
  • 24. Interpreters •Interpreters built to execute a virtual ISA (bytecode) tend to be faster than AST interpreters •Generally try to limit branches in your opcode handlers • Common trick: split up an opcode into several separate opcodes •The dispatch loop is one of the major sources of overhead; invest some time in optimizing it
  • 25. Transpiler approach •A transpiler converts one programming language to another, and relies on an existing compiler to compile and run the target language (typically C or C++) •Can be built fairly quickly, relatively easy to debug •Can deliver better execution perf than interpreters •Examples: Cython, HipHop for PHP
  • 26. Downsides of Transpilers • Often has lower ceiling for best possible long-term performance vs. the full custom approach • Primitive operations the source language exposes often do not cleanly map onto the primitive operations exposed by the target language • Transpiler architecture can get unwieldy as your system evolves and you squeeze for more perf • For use cases that involve compiling medium-sized programs or larger, transpiling to C/C++ effectively locks you into an AOT-compilation-only model
  • 27. Building on top of LLVM •LLVM is the backend for the clang C/C++ compiler •Unlike most compiler backends, LLVM has well designed external facing APIs •Most suitable for compiling statically typed languages where longer compile times are acceptable •LLVM is good when you need tight integration with existing native code where perf really matters
  • 28. Building on top of the JVM •Examples: Scala, JRuby •Works really well if the source language was designed with the JVM in mind •Not great if you need tight integration with existing native code where perf matters •Suffers from some of the same problems as transpilers if your source language wasn’t designed with the JVM in mind
  • 29. Meta-tracing and partial evaluation frameworks •Examples: PyPy, Truffle •Takes an interpreter for your source language as input •Analyzes the interpreter implementation, stitches together code from different opcode handlers, and then does optimization passes on these fragments and emits them to machine code •New and therefore not as proven as other approaches
  • 30. Full Custom Approach •Most expensive option in terms of time/effort •Gives you maximum control over every part of your system, lets you craft everything to your exact use case •Can produce the best possible execution performance and compile-time performance •Ex: JVM, .NET, gcc, clang, V8, HHVM
  • 31. Full Custom Approach •Major risks: •Can take too long to build •Can go off the rails if you don’t have the proper expertise
  • 32. Optimization Advice •Depending on the approach you take, you’ll need to optimize different parts of the system •Memory, memory, memory •Reduce memory usage and cache misses •Optimize your runtime’s binary layout •Try out jemalloc or tcmalloc •Try out Linux’s “huge pages” feature
  • 33. What’s Next for Compilers? •Continued focus on engines that both deliver superior execution performance and support the rapid iterative development workflow
  • 34. Predictions: What’s Next? •DB query compilation will be an interesting space to watch •Disk is not the bottleneck anymore •Growing demand for real-time analytics on huge, constantly-changing datasets •I think we’ll see different database systems striving to deliver the highest quality SQL->machine code compilation
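The interpreter advice on slide 24 (execute a virtual ISA, keep opcode handlers free of extra branches, and treat the dispatch loop as the hot path) can be sketched roughly as below. This is a minimal illustration in Python rather than anything from the talk: the opcode set, the `Op` enum, and the `run` function are all hypothetical names invented for the example. `ADD` and `MUL` are deliberately separate opcodes instead of one generic "binary op" opcode that would branch on an operator tag, and each handler reads what it needs from the instruction's immediate argument.

```python
from enum import IntEnum


class Op(IntEnum):
    """A hypothetical stack-based virtual ISA, invented for this sketch."""
    PUSH_CONST = 0  # immediate: constant value to push
    LOAD = 1        # immediate: index of the local to push
    STORE = 2       # immediate: index of the local to pop into
    ADD = 3         # separate opcode, not a generic BINOP with an operator tag
    MUL = 4
    RETURN = 5


def run(code, num_locals=0):
    """Execute a list of (opcode, immediate) pairs and return the result."""
    stack = []
    local_vars = [0] * num_locals
    pc = 0
    while True:  # the dispatch loop: fetch, decode, jump to the handler
        op, arg = code[pc]
        pc += 1
        if op == Op.PUSH_CONST:
            stack.append(arg)
        elif op == Op.LOAD:
            stack.append(local_vars[arg])
        elif op == Op.STORE:
            local_vars[arg] = stack.pop()
        elif op == Op.ADD:
            rhs = stack.pop()
            stack.append(stack.pop() + rhs)
        elif op == Op.MUL:
            rhs = stack.pop()
            stack.append(stack.pop() * rhs)
        elif op == Op.RETURN:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")


# Bytecode for (2 + 3) * 4, with the sum parked in local slot 0.
program = [
    (Op.PUSH_CONST, 2),
    (Op.PUSH_CONST, 3),
    (Op.ADD, None),
    (Op.STORE, 0),
    (Op.LOAD, 0),
    (Op.PUSH_CONST, 4),
    (Op.MUL, None),
    (Op.RETURN, None),
]
result = run(program, num_locals=1)
```

A production interpreter would usually be written in C, C++, or assembly, where the cost of the dispatch loop and of each extra branch is far more visible than it is in Python; the sketch only shows the shape of the design.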

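The transpiler approach from slides 25 and 26 can be illustrated with a toy source-to-source translator. The sketch below assumes a made-up source language consisting only of integer arithmetic expressions (parsed here with Python's standard `ast` module for convenience) and emits a C function as text; `expr_to_c` and `transpile_function` are hypothetical helper names, and handing the output to a C compiler is left out.

```python
import ast

# Only operators whose meaning is close enough in both languages are mapped.
_C_OPERATORS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}


def expr_to_c(node):
    """Translate a small arithmetic expression AST into C expression text."""
    if isinstance(node, ast.BinOp) and type(node.op) in _C_OPERATORS:
        left = expr_to_c(node.left)
        right = expr_to_c(node.right)
        return f"({left} {_C_OPERATORS[type(node.op)]} {right})"
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return str(node.value)
    raise NotImplementedError(f"unsupported construct: {ast.dump(node)}")


def transpile_function(name, params, source_expr):
    """Emit a C function that returns the value of source_expr."""
    tree = ast.parse(source_expr, mode="eval")
    body = expr_to_c(tree.body)
    param_list = ", ".join(f"long {p}" for p in params)
    return f"long {name}({param_list}) {{ return {body}; }}"


c_source = transpile_function("f", ["x", "y"], "x * 2 + y")
# c_source == "long f(long x, long y) { return ((x * 2) + y); }"
```

Even this toy shows the mismatch described on slide 26: the source expression uses arbitrary-precision integers, while the emitted C `long` can overflow, so the translation only preserves behavior for values that fit in a `long`. Division is left unsupported for the same reason, since its semantics differ between the two languages.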
Editor’s Notes

  1. This talk is about compilers from a practical perspective, about why compilers are important, and how and when to invest in building compilers.
  2. Software = greatest force powering technology over last 50 years
  3. - Reducing how many servers you need - Producing higher yields for your trading algorithms - Increasing user engagement with your product - Making your product better (an edge over your competitors) in ways like improving your product’s battery life
  4. [Live demo]
  5. Unlike the demo, of course, real compilers are more complex and have multiple stages that are organized into what’s called a “compilation pipeline”. Here on this slide, for example, we have a diagram of the compilation pipeline used by HHVM. The demo showed what happens at the very end of the compilation pipeline. Pipelines can differ a fair amount when comparing different compilers. You’ll almost always see ASTs early in the pipeline, and mid-to-lower level SSA-based IR later in the pipeline. Also, for engines that take the “virtual machine” approach, you’ll typically see v-ISA around the middle of the pipeline. Pipelines are structured based on your constraints, your goals, and what kind of optimizations you want to focus on.
  6. Many new languages were created during this period as developers faced new challenges involved in building web sites and web applications.
  7. Most of the engines were interpreters that were in the ballpark of 100x slower than C/C++, but for small programs that mostly blocked on I/O the performance was acceptable.
  8. Over the last 10-15 years, there were a lot of new efficient engines that came out for the popular dynamic languages: Javascript, PHP, Python, and Ruby. These advancements changed what was possible with web development. Sites could be richer, have more dynamic content, and be more interactive while still loading quickly.
  9. Before digging into compilers and runtimes specifically, it’s important to dig into performance optimization in general
  10. The diagram on this slide shows an example of an application that alternates between doing computation on the CPU and blocking on I/O. The x-axis is wall time. Notice how even if you made the “CPU” portions shrink to zero, the most you can reduce wall time here is about 50% (a 2x performance improvement)
  11. Use async I/O APIs and switch between multiple independent tasks to keep the CPU busy and avoid blocking.
  12. Small programs completely fit in cache, giving the illusion that memory is faster than it actually is. Larger programs will miss in the cache sometimes. Cache misses are similar to blocking on I/O, except at a lower level of granularity. Instead of blocking dozens or hundreds of times, each time for a few milliseconds (this is what I/O does), with memory the CPU will stall many millions of times, each time for a tenth of a microsecond (death by a million cuts). Reduce the number of allocations, reduce the size of common data structures, avoid pointer-chasing and needless indirections where possible, and when two variables are frequently read at the same time, put the variables next to each other in your data structures. Profiling what actually causes cache misses can be tricky. If you look where you took a cache miss, you’ll be able to identify the “victim” (i.e. what piece of data got evicted from the cache) but it will take some investigation to find the cause or the “perpetrator” (what piece of code polluted the cache and caused the other data to get evicted).
  13. Use profilers to look at callstacks where the most CPU time is being spent. Don’t operate solely on hunches – even if you’re right most of the time, you might waste time focusing on the wrong things. Another technique that works sometimes: Get your hands dirty and trace through machine code in the debugger, focus on hot functions and common operations.
  14. Short amounts of time: less than 1 ms or 100 ms. Longer running: greater than 1 second or 1 minute or more.
  15. Interpreters: instead of executing your program by compiling all the way to machine code (or some target language that is then compiled down to machine code), interpreters run programs using a technique called “interpreted execution” where the interpreter engine is effectively executing on behalf of the source program.
  16. If you plan on doing optimizations at the bytecode level, consider preferring a register-based bytecode over a stack-based one. Design your bytecode so that opcode handlers can get most of what they need directly from an instruction’s immediate arguments. As you squeeze for more performance, the dispatch loop can be written in assembly.
  17. Depending on your use case, transpilers can deliver better performance than interpreters. Like interpreters, there’s no need to worry about the low-level nitty-gritty details of generating raw machine code. No need to perform certain kinds of low-level compiler optimizations.
  18. Often has lower ceiling for best possible long-term performance vs. the full custom approach. Now, the same can be said for interpreters, but I wanted to specifically call this out for transpilers because there’s often a misconception that if you transpile to C/C++ you’re going to get performance on par with hand-written C/C++ programs. Primitive operations that the source language exposes (whose precise semantics are relied on by most large codebases written in the source language) often do not cleanly map onto the primitive operations exposed by the target language. Transpiling to C/C++ effectively locks you into AOT for most use cases.
  19. Remember that LLVM only handles the end of your compilation pipeline for you (low-level IR to machine code), but the rest of the compilation pipeline (from source code down to low-level IR) is still on you. Doesn’t provide automatic memory management for you (no GC). It was built originally for an AOT compiler for a static language where fast compilation was not the top goal. LLVM has been incorporated into some major JIT-compilation-based engines for languages like JavaScript and PHP, but often these engines still have their own JIT and LLVM is often used as a second or third gear that is only used to compile the hottest ~1% of the program.
  20. Benefits over LLVM: it’s a little higher level and provides GC for you. InvokeDynamic was a promising development that came out ~5 years ago; JRuby takes advantage of InvokeDynamic.
  21. Meta-tracing and partial evaluation frameworks are the newest approach when building efficient compilers. New and therefore not as proven as other approaches, but I think it’s a really exciting space and I’ll be curious to see what happens in the coming years.
  22. Can go off the rails if you don’t have the proper expertise (it can go off the rails even if you do have expertise).
  23. Depending on the approach you take, you’ll need to optimize different parts of the system. Obviously, optimize the parts of the system you control, but if you’re using an existing backend/framework you’ll need to make sure you’re doing the right things to get the best perf out of that technology. Use jemalloc or tcmalloc instead of the default allocator implementations from the C/C++ runtimes. If you have a huge amount of generated code, take advantage of Linux’s “huge pages” feature.
  24. DB query compilation will be an interesting space to watch. With the rise of in-memory DBs, disk is not the bottleneck anymore. I think we’ll see different database systems striving to deliver the highest quality SQL->machine code compilation (and when I say that I’m referring both to execution performance of the machine code, as well as fast compile times)
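The arithmetic behind note 10 (if about half of wall time is spent blocked on I/O, shrinking the CPU portions to zero gives at most a 2x improvement) can be written out as a short calculation. This is a sketch of the standard Amdahl's-law style bound; the function names are hypothetical and the 50% split is taken from the note's example diagram, not from a measurement.

```python
def wall_time_speedup(cpu_fraction, cpu_speedup):
    """Overall speedup when only the CPU-bound share of wall time gets faster.

    cpu_fraction: share of the original wall time spent computing (0..1);
                  the rest is assumed to be blocked on I/O and unchanged.
    cpu_speedup:  factor by which the CPU-bound share is accelerated.
    """
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    if cpu_speedup <= 0:
        raise ValueError("cpu_speedup must be positive")
    new_time = (1.0 - cpu_fraction) + cpu_fraction / cpu_speedup
    return 1.0 / new_time


def max_wall_time_speedup(cpu_fraction):
    """Upper bound on the speedup as the CPU-bound share shrinks to zero."""
    blocked = 1.0 - cpu_fraction
    return float("inf") if blocked == 0 else 1.0 / blocked


# With half of wall time blocked on I/O, even an infinitely fast compiler
# output caps out at 2x; a 2x faster CPU portion yields about 1.33x overall.
bound = max_wall_time_speedup(0.5)
realistic = wall_time_speedup(0.5, 2.0)
```

This is why the talk puts data fetching and I/O ahead of compiler work: until the blocked share of wall time is reduced, the return on faster generated code is bounded by it.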