Devel::NYTProf
Perl Source Code Profiler


        Tim Bunce - July 2009
        Screencast available at
http://blog.timbunce.org/tag/nytprof/
Devel::DProf
• Oldest Perl profiler (1995)

• Design flaws make it practically useless
  on modern systems

• Limited to 0.01 second resolution
  even for realtime measurements!
Devel::DProf Is Broken
$ perl -we 'print "sub s$_ { sqrt(42) for 1..100 };
 s$_({});\n" for 1..1000' > x.pl

$ perl -d:DProf x.pl

$ dprofpp -r
   Total Elapsed Time =    0.108 Seconds
            Real Time =    0.108 Seconds
   Exclusive Times
   %Time ExclSec CumulS #Calls sec/call Csec/c   Name
    9.26   0.010 0.010       1   0.0100 0.0100   main::s76
    9.26   0.010 0.010       1   0.0100 0.0100   main::s323
    9.26   0.010 0.010       1   0.0100 0.0100   main::s626
    9.26   0.010 0.010       1   0.0100 0.0100   main::s936
    0.00       - -0.000      1        -      -   main::s77
    0.00       - -0.000      1        -      -   main::s82
Lots of Perl Profilers
• Take your pick...
   Devel::DProf          |   1995   |   Subroutine
   Devel::SmallProf      |   1997   |   Line
   Devel::AutoProfiler   |   2002   |   Subroutine
   Devel::Profiler       |   2002   |   Subroutine
   Devel::Profile        |   2003   |   Subroutine
   Devel::FastProf       |   2005   |   Line
   Devel::DProfLB        |   2006   |   Subroutine
   Devel::WxProf         |   2008   |   Subroutine
   Devel::Profit         |   2008   |   Line
   Devel::NYTProf        |   2008   |   Line & Subroutine
Evolution

   Devel::DProf          |   1995   |   Subroutine
   Devel::SmallProf      |   1997   |   Line
   Devel::AutoProfiler   |   2002   |   Subroutine
   Devel::Profiler       |   2002   |   Subroutine
   Devel::Profile        |   2003   |   Subroutine
   Devel::FastProf       |   2005   |   Line
   Devel::DProfLB        |   2006   |   Subroutine
   Devel::WxProf         |   2008   |   Subroutine
   Devel::Profit         |   2008   |   Line
   Devel::NYTProf v1     |   2008   |   Line
   Devel::NYTProf v2     |   2008   |   Line & Subroutine
 ...plus lots of innovations!
What To Measure?

              CPU Time   Real Time
Subroutines      ?          ?
Statements       ?          ?
CPU Time vs Real Time
• CPU time
 - Very poor resolution (0.01s) on many systems
 - Not (much) affected by load on system
 - Doesn’t include time spent waiting for i/o etc.
• Real time
 - High resolution: microseconds or better
 - Is affected by load on system
 - Includes time spent waiting
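
The difference between the two clocks can be seen with a minimal sketch that uses only core Perl. The helper name measure() is made up for illustration; it uses Time::HiRes for real time and the built-in times for CPU time, which is an assumption about how you might check this yourself, not how NYTProf reads its clocks.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Return (real_seconds, cpu_seconds) consumed while running $code.
# measure() is a hypothetical helper, not part of any profiler.
sub measure {
    my ($code) = @_;
    my $real0 = time;                 # wall clock, sub-second resolution
    my ($user0, $sys0) = times;       # process CPU time, often coarse ticks
    $code->();
    my ($user1, $sys1) = times;
    my $real1 = time;
    return ($real1 - $real0, ($user1 + $sys1) - ($user0 + $sys0));
}

# Waiting (here: sleeping) adds to real time but hardly any CPU time.
my ($real, $cpu) = measure(sub { sleep 0.2 });
printf "sleep: real=%.3fs cpu=%.3fs\n", $real, $cpu;
```

On most systems the real figure should be close to the sleep duration while the CPU figure stays near zero.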
Sub vs Line
• Subroutine Profiling
 - Measures time between subroutine entry and exit
 - That’s the Inclusive time. Exclusive by subtraction.
 - Reasonably fast, reasonably small data files
• Problems
 - Can be confused by funky control flow
 - No insight into where time spent within large subs
 - Doesn’t measure code outside of a sub
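
"Exclusive by subtraction" can be sketched as follows. The sub names, times, and call tree are invented for illustration, and the code is not taken from any profiler.

```perl
use strict;
use warnings;

# Hypothetical inclusive times (seconds) and direct callees for a small call tree.
my %inclusive = (main => 1.00, parse => 0.70, read_file => 0.40, report => 0.20);
my %calls     = (main => [qw(parse report)], parse => ['read_file']);

# Exclusive time = inclusive time minus the inclusive time of direct callees.
sub exclusive_time {
    my ($sub) = @_;
    my $below = 0;
    $below += $inclusive{$_} for @{ $calls{$sub} || [] };
    return $inclusive{$sub} - $below;
}

printf "%-10s incl=%.2f excl=%.2f\n", $_, $inclusive{$_}, exclusive_time($_)
    for sort keys %inclusive;
```

In this made-up tree the exclusive times add up to the inclusive time of the top-level sub, which is a handy sanity check on any subroutine profile.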
Sub vs Line
• Line/Statement profiling
 - Measure time from start of one statement to next
 - Exclusive time (except includes built-ins & xsubs)
 - Fine grained detail
• Problems
 - Very expensive in CPU & I/O
 - Assigns too much time to some statements
 - Too much detail for large subs (want time per sub)
 - Hard to get overall subroutine times
Devel::NYTProf
v1 Innovations

• Fork of Devel::FastProf by Adam Kaplan
  - working at the New York Times
• HTML report borrowed from Devel::Cover
• More accurate: Discounts profiler overhead
  including cost of writing to the file
• Test suite!
v2 Innovations


• Profiles time per block!
  - Statement times can be aggregated
    to enclosing block
    and enclosing sub
v2 Innovations

• Dual Profilers!
 - Is a statement profiler
 - and a subroutine profiler
 - At the same time!
v2 Innovations
• Subroutine profiler
 -   tracks time per calling location
 -   even for xsubs
 -   calculates exclusive time on-the-fly
 -   discounts overhead of statement profiler
 -   immune from funky control flow
 -   in memory, writes to file at end
 -   extremely fast
v2 Innovations

• Statement profiler gives correct timing
  after leave ops
 - unlike previous statement profilers...
 - last statement in loops doesn’t accumulate
   time spent evaluating the condition
 - last statement in subs doesn’t accumulate time
   spent in remainder of calling statement
v2 Other Features
•   Profiles compile-time activity
•   Profiling can be enabled & disabled on the fly
•   Handles forks with no overhead
•   Correct timing for mod_perl
•   Sub-microsecond resolution
•   Multiple clocks, including high-res CPU time
•   Can snapshot source code & evals into profile
•   Built-in zip compression
Profiling Performance
                 Time      Size
 Perl            x 1       -
 SmallProf       x 22      -
 FastProf        x 6.3     42,927KB
 NYTProf         x 3.9     11,174KB
   + blocks=0    x 3.5      9,628KB
   + stmts=0     x 2.5*       205KB
 DProf           x 4.9     60,736KB
v3 Features

•   Profiles slow opcodes: system calls, regexps, ...
•   Subroutine caller name noted, for call-graph
•   Handles goto &sub; e.g. AUTOLOAD
•   HTML report includes interactive TreeMaps
•   Outputs call-graph in Graphviz dot format
Running NYTProf

perl -d:NYTProf ...


perl -MDevel::NYTProf ...


PERL5OPT=-d:NYTProf


NYTPROF=file=/tmp/nytprof.out:addpid=1:slowops=1
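
Enabling and disabling profiling on the fly can be combined with the NYTPROF options roughly as below. The start=no option and the DB::enable_profile() / DB::disable_profile() calls are taken from the Devel::NYTProf documentation rather than from these slides, so check them against your installed version. profile_only() and hot_path() are hypothetical names.

```perl
use strict;
use warnings;

# Run as:  NYTPROF=start=no perl -d:NYTProf script.pl
# The guards let the script also run unprofiled, when the DB:: subs are absent.
sub profile_only {
    my ($code) = @_;
    DB::enable_profile()  if defined &DB::enable_profile;
    my @result = $code->();
    DB::disable_profile() if defined &DB::disable_profile;
    return @result;
}

# hot_path() is a hypothetical stand-in for the code worth profiling.
sub hot_path {
    my $sum = 0;
    $sum += sqrt($_) for 1 .. 10_000;
    return $sum;
}

my ($sum) = profile_only(\&hot_path);
print "sum=$sum\n";
```

Run this way, the profile should contain only what ran inside profile_only(), which keeps the data file small for long-running programs.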
Reporting NYTProf
• CSV - old, limited, dull
  $ nytprofcsv


  # Format: time,calls,time/call,code
  0,0,0,sub foo {
  0.000002,2,0.00001,print "in sub foo\n";
  0.000004,2,0.00002,bar();
  0,0,0,}
  0,0,0,
Reporting NYTProf
• KCachegrind call graph - new and cool
   - contributed by C. L. Kao.
   - requires KCachegrind

  $ nytprofcg   # generates nytprof.callgraph
  $ kcachegrind # load the file via the gui
Reporting NYTProf
• HTML report
   - page per source file, annotated with times and links
   - subroutine index table with sortable columns
   - interactive Treemaps of subroutine times
   - generates Graphviz dot file of call graph

$ nytprofhtml # writes HTML report in ./nytprof/...
$ nytprofhtml --file=/tmp/nytprof.out.793 --open
Summary
 - Links to annotated source code
 - Link to sortable table of all subs
 - Timings for perl builtins
Exclusive vs. Inclusive
• Exclusive Time = Bottom up
 - Detail of time spent “just here”
 - Where the time actually gets spent
 - Useful for localized (peephole) optimisation


• Inclusive Time = Top down
 - Overview of time spent “in and below”
 - Useful to prioritize structural optimizations
Overall time spent in and below this sub (in + below)
 - Color coding based on Median Absolute Deviation relative to rest of this file
 - Timings for each location calling into, or out of, the subroutine
Treemap showing relative proportions of exclusive time
 - Boxes represent subroutines
 - Colors only used to show packages (and aren’t pretty yet)
 - Hover over box to see details
 - Click to drill-down one level in package hierarchy
Let’s take a look...
Optimizing
  Hints & Tips
Phase 0
Before you start
DON'T
DO IT!
“The First Rule of Program Optimization:
Don't do it.

The Second Rule of Program Optimization
(for experts only!): Don't do it yet.”

- Michael A. Jackson
Why not?
“More computing sins are committed in the
name of efficiency (without necessarily
achieving it) than for any other single
reason - including blind stupidity.”

- W.A. Wulf
“We should forget about small efficiencies,
say about 97% of the time: premature
optimization is the root of all evil.
Yet we should not pass up our
opportunities in that critical 3%.”

- Donald Knuth
How?
“Bottlenecks occur in surprising places, so
don't try to second guess and put in a speed
hack until you have proven that's where the
bottleneck is.”

- Rob Pike
“Measure twice, cut once.”

- Old Proverb
Phase 1
Low Hanging Fruit
Low Hanging Fruit
1.   Profile code running representative workload.
2.   Look at Exclusive Time of subroutines.
3.   Do they look reasonable?
4.   Examine worst offenders.
5.   Fix only simple local problems.
6.   Profile again.
7.   Fast enough? Then STOP!
8.   Rinse and repeat once or twice, then move on.
“Simple Local Fixes”


 Changes unlikely to introduce bugs
Move invariant
 expressions
 out of loops
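
A minimal sketch of hoisting an invariant expression, using made-up data (%config and @prices are assumptions for illustration):

```perl
use strict;
use warnings;

my %config = (rates => { tax => 0.2 });   # hypothetical data
my @prices = (10, 20, 30);

# Before: the nested hash lookup and addition are repeated on every iteration.
my @slow = map { $_ * (1 + $config{rates}{tax}) } @prices;

# After: compute the invariant factor once, outside the loop.
my $factor = 1 + $config{rates}{tax};
my @fast   = map { $_ * $factor } @prices;

print "@fast\n";
```

Both versions produce the same values; only the amount of work per iteration changes.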
Avoid->repeated
     ->chains
->of->accessors(...)

   Use a temporary variable
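
The effect of a temporary variable can be sketched with two hypothetical classes whose accessors count how often they are called. Customer, Address, and the counter exist only for this illustration.

```perl
use strict;
use warnings;

my $accessor_calls = 0;   # counts every accessor call, for illustration only

{
    package Address;      # hypothetical class
    sub new  { my $class = shift; return bless { city => 'Dublin', zip => 'D02' }, $class }
    sub city { $accessor_calls++; return $_[0]{city} }
    sub zip  { $accessor_calls++; return $_[0]{zip} }

    package Customer;     # hypothetical class
    sub new     { my $class = shift; return bless { address => Address->new }, $class }
    sub address { $accessor_calls++; return $_[0]{address} }
}

my $customer = Customer->new;

# Repeated chain: address() is called again for every field.
my $label_slow = $customer->address->city . ' ' . $customer->address->zip;
my $calls_slow = $accessor_calls;

# Temporary variable: address() is called once in total.
$accessor_calls = 0;
my $address    = $customer->address;
my $label_fast = $address->city . ' ' . $address->zip;
my $calls_fast = $accessor_calls;

print "$label_fast: $calls_slow calls vs $calls_fast calls\n";
```

The saving grows with the length of the chain and the number of times it is repeated.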
Use faster accessors


   Class::Accessor
   -> Class::Accessor::Fast
   --> Class::Accessor::Faster
   ---> Class::XSAccessor
Avoid calling subs that
 don’t do anything!
  my $unused_variable = $self->foo;


  my $is_logging = $log->info(...);
  while (...) {
      $log->info(...) if $is_logging;
      ...
  }
Exit subs and loops early
  Delay initializations
  return if not ...a cheap test...;
  return if not ...a more expensive test...;
  my $foo = ...initializations...;
  ...body of subroutine...
Fix silly code

-   return exists $nav_type{$country}{$key}
-               ? $nav_type{$country}{$key}
-               : undef;
+   return $nav_type{$country}{$key};
Beware pathological
regular expressions

      NYTPROF=slowops=2
Avoid unpacking args
  in very hot subs
   sub foo { shift->delegate(@_) }

   sub bar {
       return shift->{bar} unless @_;
       return $_[0]->{bar} = $_[1];
   }
Retest.

Fast enough?

        STOP!
Put the profiler down and walk away
Phase 2
 Deeper Changes
Profile with a
    known workload


E.g., 1000 identical requests
Check Inclusive Times
 (especially top-level subs)


Reasonable percentage
  for the workload?
Check subroutine
   call counts

    Reasonable
for the workload?
Add caching
    if appropriate
   to reduce calls


Remember invalidation
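
Caching with explicit invalidation might look like the sketch below. expensive_price_lookup() is a hypothetical stand-in for a slow call such as a database query, and all names are assumptions for illustration.

```perl
use strict;
use warnings;

my %price_cache;          # key => cached result
my $lookups = 0;          # counts calls to the expensive sub

# Hypothetical expensive operation, e.g. a database query.
sub expensive_price_lookup {
    my ($sku) = @_;
    $lookups++;
    return length($sku) * 10;
}

# Return the cached value, doing the expensive lookup only on a miss.
sub price_for {
    my ($sku) = @_;
    $price_cache{$sku} = expensive_price_lookup($sku)
        unless exists $price_cache{$sku};
    return $price_cache{$sku};
}

# Call this whenever the underlying data changes, or the cache goes stale.
sub invalidate_price {
    my ($sku) = @_;
    delete $price_cache{$sku};
    return;
}

price_for('widget') for 1 .. 1000;   # one real lookup, 999 cache hits
invalidate_price('widget');
price_for('widget');                 # looked up again after invalidation
print "lookups=$lookups\n";
```

The call count reported by the profiler for the expensive sub should drop accordingly, which makes it easy to confirm the cache is being hit.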
Walk up call chain
 to find good spots
     for caching


Remember invalidation
Creating many objects
 that don’t get used?


 Lightweight proxies
 e.g. DateTimeX::Lite
Retest.

Fast enough?

        STOP!
Put the profiler down and walk away
Phase 3
Structural Changes
Push loops down


-   $object->walk($_) for @dogs;

+   $object->walk_these(@dogs);
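
A sketch of what the two methods could look like. Walker is a hypothetical class standing in for whatever $object is. The point is that the loop moves inside the sub, so method dispatch and sub entry and exit are paid once rather than once per item.

```perl
use strict;
use warnings;

{
    package Walker;       # hypothetical class standing in for $object
    sub new { my $class = shift; return bless { walked => [] }, $class }

    # One dog per call: method dispatch and sub entry/exit are paid per dog.
    sub walk {
        my ($self, $dog) = @_;
        push @{ $self->{walked} }, $dog;
        return;
    }

    # All dogs in one call: the loop runs inside the sub.
    sub walk_these {
        my $self = shift;
        push @{ $self->{walked} }, $_ for @_;
        return;
    }
}

my @dogs = qw(Rex Fido Spot);

my $one_by_one = Walker->new;
$one_by_one->walk($_) for @dogs;     # one method call per dog

my $batched = Walker->new;
$batched->walk_these(@dogs);         # a single method call

print "same result\n"
    if "@{ $one_by_one->{walked} }" eq "@{ $batched->{walked} }";
```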
Change the data
   structure

hashes <–> arrays
Change the algorithm

What’s the “Big O”?
O(n^2) or O(log n) or ...
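
As one illustration of how data structure and algorithm interact, a membership test against an array scans every element, while a hash lookup takes roughly constant time. The data and sub names below are assumptions for illustration.

```perl
use strict;
use warnings;

my @allowed = map { "user$_" } 1 .. 1000;   # hypothetical data

# Array: each membership test scans the whole list, O(n) per lookup.
sub is_allowed_scan {
    my ($name) = @_;
    return scalar grep { $_ eq $name } @allowed;
}

# Hash: build the index once, then each lookup is O(1) on average.
my %is_allowed = map { $_ => 1 } @allowed;
sub is_allowed_hash {
    my ($name) = @_;
    return exists $is_allowed{$name} ? 1 : 0;
}

print is_allowed_hash('user42') ? "allowed\n" : "denied\n";
```

Building the hash costs one pass over the array, so the switch pays off once the same list is tested repeatedly.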
Rewrite hot-spots in C

     Inline::C
It all adds up!

“I achieved my fast times by
multitudes of 1% reductions”

       - Bill Raymond
Questions?
    Tim.Bunce@pobox.com
@timbunce on twitter occasionally
    http://blog.timbunce.org

The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 

Recently uploaded (20)

Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 

Devel::NYTProf 2009-07 (OUTDATED, see 201008)

  • 1. Devel::NYTProf Perl Source Code Profiler Tim Bunce - July 2009 Screencast available at http://blog.timbunce.org/tag/nytprof/
  • 2. Devel::DProf • Oldest Perl Profiler —1995 • Design flaws make it practically useless on modern systems • Limited to 0.01 second resolution even for realtime measurements!
  • 3. Devel::DProf Is Broken
       $ perl -we 'print "sub s$_ { sqrt(42) for 1..100 }; s$_({});\n" for 1..1000' > x.pl
       $ perl -d:DProf x.pl
       $ dprofpp -r
       Total Elapsed Time = 0.108 Seconds
       Real Time = 0.108 Seconds
       Exclusive Times
       %Time ExclSec CumulS #Calls sec/call Csec/c  Name
        9.26   0.010  0.010      1   0.0100 0.0100  main::s76
        9.26   0.010  0.010      1   0.0100 0.0100  main::s323
        9.26   0.010  0.010      1   0.0100 0.0100  main::s626
        9.26   0.010  0.010      1   0.0100 0.0100  main::s936
        0.00       - -0.000      1        -      -  main::s77
        0.00       - -0.000      1        -      -  main::s82
  • 4. Lots of Perl Profilers
       • Take your pick...
       Devel::DProf        | 1995 | Subroutine
       Devel::SmallProf    | 1997 | Line
       Devel::AutoProfiler | 2002 | Subroutine
       Devel::Profiler     | 2002 | Subroutine
       Devel::Profile      | 2003 | Subroutine
       Devel::FastProf     | 2005 | Line
       Devel::DProfLB      | 2006 | Subroutine
       Devel::WxProf       | 2008 | Subroutine
       Devel::Profit       | 2008 | Line
       Devel::NYTProf      | 2008 | Line & Subroutine
  • 5. Evolution
       Devel::DProf        | 1995 | Subroutine
       Devel::SmallProf    | 1997 | Line
       Devel::AutoProfiler | 2002 | Subroutine
       Devel::Profiler     | 2002 | Subroutine
       Devel::Profile      | 2003 | Subroutine
       Devel::FastProf     | 2005 | Line
       Devel::DProfLB      | 2006 | Subroutine
       Devel::WxProf       | 2008 | Subroutine
       Devel::Profit       | 2008 | Line
       Devel::NYTProf v1   | 2008 | Line
       Devel::NYTProf v2   | 2008 | Line & Subroutine
       ...plus lots of innovations!
  • 6. What To Measure?
                   | CPU Time | Real Time
       Subroutines |    ?     |    ?
       Statements  |    ?     |    ?
  • 7. CPU Time vs Real Time
       • CPU time
         - Very poor resolution (0.01s) on many systems
         - Not (much) affected by load on system
         - Doesn’t include time spent waiting for i/o etc.
       • Real time
         - High resolution: microseconds or better
         - Is affected by load on system
         - Includes time spent waiting
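
The difference between the two clocks can be seen with core Perl alone. The following is a minimal sketch, not part of the slides: `measure` is a hypothetical helper that reads the wall clock via Time::HiRes and the process CPU time via the built-in `times`, and the sleep stands in for any kind of waiting (I/O, locks, a remote service).

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper: returns (real_seconds, cpu_seconds) used by $code.
sub measure {
    my ($code) = @_;
    my $real0 = time();              # wall clock, sub-second resolution
    my ($user0, $sys0) = times();    # process CPU time, often coarse
    $code->();
    my ($user1, $sys1) = times();
    my $real1 = time();
    return ($real1 - $real0, ($user1 + $sys1) - ($user0 + $sys0));
}

# Waiting costs real time but almost no CPU time.
my ($real, $cpu) = measure(sub { sleep(0.2) });
printf "sleep: real=%.3fs cpu=%.3fs\n", $real, $cpu;
```

A profile taken with a CPU clock would make this code look nearly free, while a real-time profile shows the wait.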
  • 8. Sub vs Line
       • Subroutine Profiling
         - Measures time between subroutine entry and exit
         - That’s the Inclusive time. Exclusive by subtraction.
         - Reasonably fast, reasonably small data files
       • Problems
         - Can be confused by funky control flow
         - No insight into where time spent within large subs
         - Doesn’t measure code outside of a sub
  • 9. Sub vs Line
       • Line/Statement profiling
         - Measure time from start of one statement to next
         - Exclusive time (except includes built-ins & xsubs)
         - Fine grained detail
       • Problems
         - Very expensive in CPU & I/O
         - Assigns too much time to some statements
         - Too much detail for large subs (want time per sub)
         - Hard to get overall subroutine times
  • 11. v1 Innovations
        • Fork of Devel::FastProf by Adam Kaplan
          - working at the New York Times
        • HTML report borrowed from Devel::Cover
        • More accurate: Discounts profiler overhead including cost of writing to the file
        • Test suite!
  • 12. v2 Innovations
        • Profiles time per block!
          - Statement times can be aggregated to enclosing block and enclosing sub
  • 13. v2 Innovations
        • Dual Profilers!
          - Is a statement profiler
          - and a subroutine profiler
          - At the same time!
  • 14. v2 Innovations
        • Subroutine profiler
          - tracks time per calling location
          - even for xsubs
          - calculates exclusive time on-the-fly
          - discounts overhead of statement profiler
          - immune from funky control flow
          - in memory, writes to file at end
          - extremely fast
  • 15. v2 Innovations
        • Statement profiler gives correct timing after leave ops
          - unlike previous statement profilers...
          - last statement in loops doesn’t accumulate time spent evaluating the condition
          - last statement in subs doesn’t accumulate time spent in remainder of calling statement
  • 16. v2 Other Features
        • Profiles compile-time activity
        • Profiling can be enabled & disabled on the fly
        • Handles forks with no overhead
        • Correct timing for mod_perl
        • Sub-microsecond resolution
        • Multiple clocks, including high-res CPU time
        • Can snapshot source code & evals into profile
        • Built-in zip compression
  • 18. Profiling Performance
                    | Time   | Size
        Perl        | x 1    | -
        SmallProf   | x 22   | -
        FastProf    | x 6.3  | 42,927KB
        NYTProf     | x 3.9  | 11,174KB
         + blocks=0 | x 3.5  | 9,628KB
         + stmts=0  | x 2.5* | 205KB
        DProf       | x 4.9  | 60,736KB
  • 19. v3 Features
        • Profiles slow opcodes: system calls, regexps, ...
        • Subroutine caller name noted, for call-graph
        • Handles goto &sub; e.g. AUTOLOAD
        • HTML report includes interactive TreeMaps
        • Outputs call-graph in Graphviz dot format
  • 20. Running NYTProf
        perl -d:NYTProf ...
        perl -MDevel::NYTProf ...
        PERL5OPT=-d:NYTProf
        NYTPROF=file=/tmp/nytprof.out:addpid=1:slowops=1
  • 21. Reporting NYTProf
        • CSV - old, limited, dull
        $ nytprofcsv
        # Format: time,calls,time/call,code
        0,0,0,sub foo {
        0.000002,2,0.00001,print "in sub foo\n";
        0.000004,2,0.00002,bar();
        0,0,0,}
        0,0,0,
  • 22. Reporting NYTProf
        • KCachegrind call graph - new and cool
          - contributed by C. L. Kao.
          - requires KCachegrind
        $ nytprofcg # generates nytprof.callgraph
        $ kcachegrind # load the file via the gui
  • 24. Reporting NYTProf
        • HTML report
          - page per source file, annotated with times and links
          - subroutine index table with sortable columns
          - interactive Treemaps of subroutine times
          - generates Graphviz dot file of call graph
        $ nytprofhtml # writes HTML report in ./nytprof/...
        $ nytprofhtml --file=/tmp/nytprof.out.793 --open
  • 26. Summary
        - Links to annotated source code
        - Link to sortable table of all subs
        - Timings for perl builtins
  • 27. Exclusive vs. Inclusive
        • Exclusive Time = Bottom up
          - Detail of time spent “just here”
          - Where the time actually gets spent
          - Useful for localized (peephole) optimization
        • Inclusive Time = Top down
          - Overview of time spent “in and below”
          - Useful to prioritize structural optimizations
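
The relationship between the two figures is simple subtraction, as the earlier "Exclusive by subtraction" remark says. A small sketch with made-up numbers follows; the `%profile` call tree and the sub names in it are hypothetical, not output from NYTProf.

```perl
use strict;
use warnings;

# Hypothetical call tree: inclusive seconds per sub, plus the subs it calls.
my %profile = (
    main   => { incl => 1.00, calls => [qw(parse render)] },
    parse  => { incl => 0.30, calls => [] },
    render => { incl => 0.60, calls => [qw(escape)] },
    escape => { incl => 0.45, calls => [] },
);

# Exclusive time = inclusive time minus the inclusive time of direct callees.
sub exclusive_time {
    my ($name) = @_;
    my $excl = $profile{$name}{incl};
    $excl -= $profile{$_}{incl} for @{ $profile{$name}{calls} };
    return $excl;
}

# List subs by exclusive time, worst offender first.
for my $name (sort { exclusive_time($b) <=> exclusive_time($a) } keys %profile) {
    printf "%-7s incl=%.2f excl=%.2f\n",
        $name, $profile{$name}{incl}, exclusive_time($name);
}
```

Here `main` has the largest inclusive time but the smallest exclusive time, while `escape` is where the time actually gets spent.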
  • 29. - Overall time spent in and below this sub (in + below)
        - Color coding based on Median Absolute Deviation relative to rest of this file
        - Timings for each location calling into, or out of, the subroutine
  • 31. - Treemap showing relative proportions of exclusive time
        - Boxes represent subroutines
        - Colors only used to show packages (and aren’t pretty yet)
        - Hover over box to see details
        - Click to drill-down one level in package hierarchy
  • 32. Let’s take a look...
  • 36. “The First Rule of Program Optimization: Don't do it. The Second Rule of Program Optimization (for experts only!): Don't do it yet.” - Michael A. Jackson
  • 38. “More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason - including blind stupidity.” - W.A. Wulf
  • 39. “We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.” - Donald Knuth
  • 41. How?
  • 42. “Bottlenecks occur in surprising places, so don't try to second guess and put in a speed hack until you have proven that's where the bottleneck is.” - Rob Pike
  • 43. “Measure twice, cut once.” - Old Proverb
  • 45. Low Hanging Fruit
        1. Profile code running representative workload.
        2. Look at Exclusive Time of subroutines.
        3. Do they look reasonable?
        4. Examine worst offenders.
        5. Fix only simple local problems.
        6. Profile again.
        7. Fast enough? Then STOP!
        8. Rinse and repeat once or twice, then move on.
  • 46. “Simple Local Fixes” Changes unlikely to introduce bugs
  • 48. Avoid->repeated->chains->of->accessors(...)
        Use a temporary variable
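
A minimal sketch of the temporary-variable fix. The `My::App` and `My::Settings` classes are hypothetical and exist only to make the accessor chain runnable; the `$calls` counter is there purely to show how many method calls each version makes.

```perl
use strict;
use warnings;

my $calls = 0;    # counts accessor calls, for illustration only

{
    package My::Settings;    # hypothetical class
    sub new   { return bless { limit => 10 }, shift }
    sub limit { $calls++; return $_[0]{limit} }
}
{
    package My::App;         # hypothetical class
    sub new      { return bless { settings => My::Settings->new }, shift }
    sub settings { $calls++; return $_[0]{settings} }
}

my $app   = My::App->new;
my @items = (1 .. 1000);

# Slow: two method calls per item just to fetch an unchanging value.
my $kept_slow  = grep { $_ <= $app->settings->limit } @items;
my $calls_slow = $calls;

# Faster: walk the chain once and keep the result in a temporary.
$calls = 0;
my $limit      = $app->settings->limit;
my $kept_fast  = grep { $_ <= $limit } @items;
my $calls_fast = $calls;

print "accessor calls: $calls_slow vs $calls_fast\n";
```

This is only safe when the value cannot change during the loop, which is what makes it a "simple local fix".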
  • 49. Use faster accessors Class::Accessor -> Class::Accessor::Fast --> Class::Accessor::Faster ---> Class::XSAccessor
  • 50. Avoid calling subs that don’t do anything!
        my $unused_variable = $self->foo;
        my $is_logging = $log->info(...);
        while (...) {
            $log->info(...) if $is_logging;
            ...
        }
  • 51. Exit subs and loops early
        Delay initializations
        return if not ...a cheap test...;
        return if not ...a more expensive test...;
        my $foo = ...initializations...;
        ...body of subroutine...
  • 52. Fix silly code
        - return exists $nav_type{$country}{$key}
        -     ? $nav_type{$country}{$key}
        -     : undef;
        + return $nav_type{$country}{$key};
  • 54. Avoid unpacking args in very hot subs
        sub foo { shift->delegate(@_) }
        sub bar {
            return shift->{bar} unless @_;
            return $_[0]->{bar} = $_[1];
        }
  • 55. Retest. Fast enough? STOP! Put the profiler down and walk away
  • 56. Phase 2 Deeper Changes
  • 57. Profile with a known workload E.g., 1000 identical requests
  • 58. Check Inclusive Times (especially top-level subs) Reasonable percentage for the workload?
  • 59. Check subroutine call counts Reasonable for the workload?
  • 60. Add caching if appropriate to reduce calls Remember invalidation
  • 61. Walk up call chain to find good spots for caching Remember invalidation
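
One way to sketch caching with explicit invalidation, using only core Perl. Everything here is hypothetical: `slow_lookup` stands in for whatever expensive call the profile pointed at, and the `$lookups` counter is only there to make the saving visible.

```perl
use strict;
use warnings;

my %cache;
my $lookups = 0;

# Stand-in for an expensive call (imagine a database query).
sub slow_lookup {
    my ($key) = @_;
    $lookups++;
    return uc $key;
}

# Return the cached value, computing it only on the first request.
sub cached_lookup {
    my ($key) = @_;
    $cache{$key} = slow_lookup($key) unless exists $cache{$key};
    return $cache{$key};
}

# Invalidation: drop one entry, or everything when called without a key.
sub invalidate {
    my ($key) = @_;
    if (defined $key) { delete $cache{$key} }
    else              { %cache = () }
    return;
}

cached_lookup('dog') for 1 .. 1000;    # one real lookup, 999 cache hits
invalidate('dog');                     # the underlying data changed
my $value = cached_lookup('dog');      # forces a second real lookup
print "real lookups: $lookups\n";
```

The higher up the call chain the cache sits, the more work each hit saves, but the more events there are that must trigger `invalidate`.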
  • 62. Creating many objects that don’t get used? Lightweight proxies e.g. DateTimeX::Lite
  • 63. Retest. Fast enough? STOP! Put the profiler down and walk away
  • 65. Push loops down
        - $object->walk($_) for @dogs;
        + $object->walk_these(@dogs);
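
A runnable sketch of the same change. The `Kennel` class is hypothetical (the slide does not show the class behind `$object`), and `$method_calls` exists only to count method dispatches.

```perl
use strict;
use warnings;

my $method_calls = 0;    # counts method dispatches, for illustration only

{
    package Kennel;      # hypothetical class, not from the slides
    sub new { return bless { walked => 0 }, shift }

    # One method call per dog.
    sub walk {
        my ($self, $dog) = @_;
        $method_calls++;
        $self->{walked}++;
        return;
    }

    # One method call for the whole list; the loop lives inside.
    sub walk_these {
        my ($self, @dogs) = @_;
        $method_calls++;
        $self->{walked}++ for @dogs;
        return;
    }
}

my @dogs = map { sprintf 'dog%d', $_ } 1 .. 10_000;

my $per_item = Kennel->new;
$per_item->walk($_) for @dogs;
my $calls_per_item = $method_calls;

$method_calls = 0;
my $batched = Kennel->new;
$batched->walk_these(@dogs);
my $calls_batched = $method_calls;

printf "per-item: %d calls, batched: %d call\n", $calls_per_item, $calls_batched;
```

Both objects end up in the same state; the batched version simply pays the sub-call overhead once instead of once per element.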
  • 66. Change the data structure hashes <–> arrays
  • 67. Change the algorithm
        What’s the “Big O”? O(n²) or O(log n) or ...
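
As an illustration of how a data-structure change alters the Big O, here is a hedged sketch of a membership test done two ways. The names (`@names`, `seen_linear`, `seen_hash`) are hypothetical: scanning an array is O(n) per query, while a hash built once gives O(1) lookups on average.

```perl
use strict;
use warnings;
use List::Util qw(first);

my @names = map { sprintf 'user%d', $_ } 1 .. 5000;

# O(n) per query: scans the array until a match is found.
sub seen_linear {
    my ($name) = @_;
    my $hit = first { $_ eq $name } @names;
    return defined $hit ? 1 : 0;
}

# Build the index once (O(n)), then each query is O(1) on average.
my %is_known = map { $_ => 1 } @names;

sub seen_hash {
    my ($name) = @_;
    return exists $is_known{$name} ? 1 : 0;
}

for my $name (qw(user42 nobody)) {
    printf "%-7s linear=%d hash=%d\n", $name, seen_linear($name), seen_hash($name);
}
```

With m queries the array version costs O(n·m) overall and the hash version O(n + m), which is the kind of difference that shows up as an unreasonable call count or inclusive time in a profile.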
  • 68. Rewrite hot-spots in C Inline::C
  • 69. It all adds up! “I achieved my fast times by multitudes of 1% reductions” - Bill Raymond
  • 70. Questions? Tim.Bunce@pobox.com @timbunce on twitter occasionally http://blog.timbunce.org