Top 10 Perl Performance Tips

Perrin Harkins
We Also Walk Dogs

Ground Rules

● Make a repeatable test to measure progress with
○ Sometimes turns up surprises
● Use a profiler (Devel::NYTProf) to find where the time is
going
○ Don't flail and waste time optimizing the wrong things!
● Try to weigh the cost of developer time vs buying more
hardware
○ Optimization is crack for developers, hard to know when
to stop
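
A repeatable test can be as small as a script built on the core Benchmark module. The sketch below is only an illustration: the log line and the two candidates are made up, and the point is the shape (check that the candidates agree, then time them), not the numbers it prints on any particular machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical workload: does a log line contain a fixed substring?
my $line   = 'GET /index.html HTTP/1.1 200 "Mozilla/5.0" ' x 20;
my $needle = 'Mozilla';

# Make sure both candidates give the same answer before timing them.
my $via_regex = ($line =~ /\Q$needle\E/)      ? 1 : 0;
my $via_index = (index($line, $needle) >= 0) ? 1 : 0;
die "candidates disagree" unless $via_regex == $via_index;

# A negative count means "run each candidate for at least that many CPU seconds".
cmpthese(-1, {
    regex => sub { my $hit = $line =~ /\Q$needle\E/ },
    index => sub { my $hit = index($line, $needle) >= 0 },
});
```

Keep the script around and rerun it after each change so progress is measured against the same baseline.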

1. The Big Picture

● The biggest gains usually come from changing your high-
level approach
○ Is there a more efficient algorithm?
○ Can you restructure to reduce duplicated effort?
● Sometimes you just need to tune your SQL
● A boatload of RAM hides a multitude of sins
● The bottleneck is usually I/O
○ Files
○ Database
○ Network
○ Batch I/O often makes a huge difference

2. Use DBI Efficiently

● Can make a huge difference in tight loops with many small
queries
● connect_cached() avoids connection overhead
○ Or use your favorite connection cache, but beware
overuse of ping()
● prepare_cached() avoids object creation and server-side
prepare overhead
● Use bind parameters to reuse SQL statements instead of
creating new ones


● Use bind_columns() in a fetch() loop for the most efficient
retrieval.
○ Less copying is faster.
○ Alternatively, fetchrow_arrayref()
● One prepare() followed by many execute() calls is faster
than calling do() repeatedly
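
As a rough sketch of these calls working together, the script below reuses one cached statement handle with bind parameters and fetches through bound columns. It assumes DBD::SQLite is installed so that an in-memory database can stand in for a real one; the table and column names are invented for the example.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; :memory: keeps the sketch self-contained.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });

$dbh->do('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

# One statement handle, reused with bind parameters for every row.
my $insert = $dbh->prepare_cached('INSERT INTO users (id, name) VALUES (?, ?)');
$insert->execute($_, "user$_") for 1 .. 5;

# Bind the output columns once, after execute(); each fetch() then
# updates $id and $name in place instead of building a new list.
my $select = $dbh->prepare_cached(
    'SELECT id, name FROM users WHERE id > ? ORDER BY id');
$select->execute(2);
my ($id, $name);
$select->bind_columns(\$id, \$name);

my @seen;
while ($select->fetch) {
    push @seen, "$id:$name";
}
print join(', ', @seen), "\n";

$dbh->disconnect;
```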


● Turn off AutoCommit for batch changes
○ Committing every thousand rows or so saves work for your
database
● Use your database's bulk loader when possible
○ Writing rows to CSV and using MySQL's LOAD DATA
INFILE crushes the fastest DBI code
○ 10X speedup is not unusual
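
A minimal sketch of batched commits, again assuming DBD::SQLite and an invented table. The batch size of 1000 is a placeholder to tune against your own database, not a recommendation from the slides beyond "every thousand rows or so".

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available. AutoCommit is off, so we commit explicitly.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 0 });

$dbh->do('CREATE TABLE events (id INTEGER, payload TEXT)');
$dbh->commit;

my $batch_size = 1000;    # hypothetical value; tune for your database
my $sth = $dbh->prepare('INSERT INTO events (id, payload) VALUES (?, ?)');

my $pending = 0;
my $commits = 0;
for my $id (1 .. 2500) {
    $sth->execute($id, "payload $id");
    if (++$pending == $batch_size) {
        $dbh->commit;
        $commits++;
        $pending = 0;
    }
}
# Flush the final partial batch.
if ($pending) {
    $dbh->commit;
    $commits++;
}

my ($count) = $dbh->selectrow_array('SELECT COUNT(*) FROM events');
print "$count rows in $commits commits\n";

$dbh->commit;
$dbh->disconnect;
```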


● Use ORMs Wisely
○ Consider using straight DBI for the most performance
sensitive sections
■ Removing a layer means fewer method calls and
faster code
○ Write report queries by hand if they seem slow
■ Optimizer hints and choices about SQL variations are
beyond the scope of ORMs but make a huge
difference for this kind of query

3. Choose the Fastest Hash Storage

● memcached is not the fastest option for a local cache
○ BerkeleyDB (not DB_File!) and Cache::FastMmap are
about twice as fast
● CHI abstracts the storage layer
○ Useful if you think network strategy may change later

Cache                        Get time   Set time   Run time
CHI::Driver::Memory          0.03 ms    0.05 ms    0.35 s
BerkeleyDB                   0.05 ms    0.17 ms    0.57 s
Cache::FastMmap              0.06 ms    0.09 ms    0.62 s
CHI::Driver::File            0.10 ms    0.26 ms    1.11 s
Cache::Memcached::Fast       0.12 ms    0.15 ms    1.23 s
Memcached::libmemcached      0.14 ms    0.16 ms    1.40 s
CHI::Driver::DBI (SQLite)    0.11 ms    1.94 ms    2.05 s
Cache::Memcached             0.29 ms    0.21 ms    2.88 s
CHI::Driver::DBI (MySQL)     0.45 ms    0.33 ms    4.41 s

4. Generate Code and Compile to a Subroutine

● This is how most templating tools work.
● Remove the cost of things that won't change for a while
○ Skip re-parsing templates
○ Skip large groups of conditionals
○ Choose architecture-specific code

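my %subs;
my $thing = 'world';                      # fixed when the code is generated
my $code = qq{print "Hello $thing\\n";};  # \\n leaves a literal \n in the source
$subs{'hello'} = eval "sub { $code }" or die $@;
$subs{'hello'}->();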
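
To illustrate "skip large groups of conditionals", here is a small sketch that builds the source for a clean-up routine from a configuration hash and compiles it once. The config keys and clean-up steps are hypothetical; the idea is only that the config checks run at generation time rather than on every call.

```perl
use strict;
use warnings;

# Hypothetical config: which clean-up steps to apply to each record.
my %config = (trim => 1, lowercase => 1, strip_digits => 0);

# Build the source once, including only the enabled steps.
my @steps;
push @steps, 's/^\s+|\s+$//g;' if $config{trim};
push @steps, '$_ = lc;'        if $config{lowercase};
push @steps, 'tr/0-9//d;'      if $config{strip_digits};

my $source = 'sub { local $_ = shift; ' . join(' ', @steps) . ' return $_; }';
my $clean  = eval $source or die "compile failed: $@";

# The per-record call now runs no config checks at all.
my @cleaned = map { $clean->($_) } ('  Hello World 42 ', 'PERL  ');
print join('|', @cleaned), "\n";
```

With this config the script prints `hello world 42|perl`. Printing `$source` while debugging shows exactly what was compiled.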

5. Sling Text Efficiently

● Slurp files when possible.

my $text = do { local $/; <$fh> };

● Seems obvious, but I still see people doing this:
my @lines = <$fh>;
my $text = join('', @lines);
● Consider memory with huge files.

● Use a "sliding window" to search very large files.
○ Too big to slurp, but line-by-line is slow.
○ Chunks of 8K or 16K are much faster, but require
bookkeeping code.
○ http://www.perlmonks.org/?node_id=128925
● Use the cheapest string tests you can get away with.
○ index() beats a regex when you just want to know if a
string contains another string
● Use a fast CSV parser
○ Text::CSV_XS is much faster than the regexes you
copied from that web page.
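
The bookkeeping for a sliding-window search might look like the sketch below. It is not the code from the PerlMonks node, just one way to do it: read fixed-size chunks, search with index(), and carry over the last length($needle) - 1 bytes so that a match straddling a chunk boundary is still found. The function name and the demo data are invented.

```perl
use strict;
use warnings;

# Return the byte offset of the first occurrence of $needle, or -1.
sub find_in_file {
    my ($fh, $needle, $chunk_size) = @_;
    $chunk_size ||= 16 * 1024;
    my $overlap = length($needle) - 1;
    my $window  = '';
    my $base    = 0;    # file offset of the first byte in $window

    while (read($fh, my $chunk, $chunk_size)) {
        $window .= $chunk;
        my $pos = index($window, $needle);
        return $base + $pos if $pos >= 0;

        # Keep only the tail that could begin a boundary-spanning match.
        my $drop = length($window) - $overlap;
        if ($drop > 0) {
            substr($window, 0, $drop, '');
            $base += $drop;
        }
    }
    return -1;
}

# Demo on an in-memory filehandle with a deliberately tiny chunk size.
my $data = ('x' x 50) . 'NEEDLE' . ('y' x 50);
open(my $fh, '<', \$data) or die "open: $!";
print find_in_file($fh, 'NEEDLE', 8), "\n";
close($fh);
```

The demo prints 50, the offset of the needle. For binary data or multi-byte encodings you would want to open the handle in raw mode so that offsets are byte offsets.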

6. Replace LWP With Something Faster

● LWP is amazing, but lighter-weight modules and modules built
on C libraries tend to be faster.
○ LWP::Curl (built on libcurl)
○ HTTP::Lite (minimal pure-Perl client)
○ Maybe HTTP::Async for parallel requests

Module         Rate
LWP            32.8/s
HTTP::Async    64.5/s
HTTP::Lite     200/s
LWP::Curl      1000/s

7. Use a Fast Serializer

● Data::Dumper is great for debugging, but slow for
serialization.
● JSON::XS is the new speed king, and is human-readable
and cross-language.
● Storable handles more and is second-best in speed.

Serializer      Rate
YAML            84.7/s
XML::Simple     800/s
Data::Dumper    2143/s
FreezeThaw      2635/s
YAML::Syck      4307/s
JSON::Syck      4654/s
Storable        9774/s
JSON::XS        41473/s
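
For a feel of what the round trip looks like, here is a small sketch that uses only core modules: Storable's freeze/thaw next to Data::Dumper plus eval. The sample record is made up. JSON::XS is not shown because it is not a core module, but its encode_json and decode_json functions slot into the same pattern.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);
use Data::Dumper;

my $record = { id => 42, tags => ['perl', 'speed'], nested => { ok => 1 } };

# Storable: compact binary format, Perl-only.
my $frozen   = freeze($record);
my $restored = thaw($frozen);

# Data::Dumper: the output is Perl source and must be eval'd to get
# the structure back, which is part of why it is slow.
local $Data::Dumper::Terse = 1;    # emit a bare expression, no "$VAR1 ="
my $dumped = Dumper($record);
my $evaled = eval $dumped or die "eval failed: $@";

print "Storable:     $restored->{tags}[1]\n";
print "Data::Dumper: $evaled->{tags}[1]\n";
```

Only eval serialized data that you produced yourself; treating untrusted input as Perl source is unsafe.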

8. Avoid Startup Costs

● Use a daemon to run code persistently
○ Skip the costs of compiling
○ Cache data
○ Open connections ahead of time
● mod_perl, FastCGI, Plack, etc. for web
● PPerl for command-line
○ Or hit your web server with lwp-request (or its GET alias)

9. Sometimes You Have to Get Crazy

● Use the @_ array directly to avoid copying

sub add_to_sql {
    # original version: copies every argument into a lexical
    my $sqlbase = shift; # hashref
    my ($name, $value) = @_;
    if ($value) {
        push(@{ $sqlbase->{'names'} }, $name);
        push(@{ $sqlbase->{'values'} }, $value);
    }
    return $sqlbase;
}

sub add_to_sql {
    # takes 3 params: hashref, name, and value
    return if not $_[2];

    push(@{ $_[0]->{'names'} }, $_[1]);
    push(@{ $_[0]->{'values'} }, $_[2]);
}

● 40% faster than original
● More than 40% harder to read
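
To check how the two versions compare on your own Perl, a sketch along these lines using the core Benchmark module may help. The subs are renamed add_copying and add_direct here only so that both can live in one script, and the test arguments are invented. The ratio you see will depend on your Perl build and hardware.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

sub add_copying {
    my $sqlbase = shift;
    my ($name, $value) = @_;
    if ($value) {
        push(@{ $sqlbase->{'names'} },  $name);
        push(@{ $sqlbase->{'values'} }, $value);
    }
    return $sqlbase;
}

sub add_direct {
    return if not $_[2];
    push(@{ $_[0]->{'names'} },  $_[1]);
    push(@{ $_[0]->{'values'} }, $_[2]);
}

# Sanity check: both versions must leave the hash in the same state.
my %slow = (names => [], values => []);
my %fast = (names => [], values => []);
add_copying(\%slow, 'color', 'red');
add_direct(\%fast, 'color', 'red');
die "versions disagree"
    unless "@{ $slow{names} } @{ $slow{values} }" eq "@{ $fast{names} } @{ $fast{values} }";

# Each run gets a fresh hash so the arrays do not grow without bound.
cmpthese(-1, {
    copying => sub { my %s = (names => [], values => []); add_copying(\%s, 'color', 'red') },
    direct  => sub { my %s = (names => [], values => []); add_direct(\%s, 'color', 'red') },
});
```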

10. Consider Compiling Your Own Perl

● Compiling without threads can give you a free 15% or so.
● No code changes needed!
● Has maintenance costs.

Resources

Tim Bunce's Advanced DBI slides:
http://www.slideshare.net/Tim.Bunce/dbi-advanced-tutorial-2007

Also see Tim's NYTProf slides:
http://www.slideshare.net/Tim.Bunce/develnytprof-v4-at-oscon-201007

man perlperf

Programming Perl appendix on performance

Thank you!

Slides will be available on the
conference website

Avoid tie()

● Slower than method calls!
● PITA to debug too.

Use a Fast Sort

● For sorting on derived keys, consider a GRT sort.
○ Faster than Schwartzian Transform
○ Use Sort::Maker to build it.
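
For a sense of what a GRT (Guttman-Rosler Transform) looks like when written by hand rather than generated by Sort::Maker, here is a minimal sketch. The "name:score" records are invented, and the packing assumes the key is a non-negative integer that fits in 32 bits. The speed comes from using Perl's default string sort with no comparison block.

```perl
use strict;
use warnings;

# Hypothetical input: "name:score" records to be ordered by numeric score.
my @records = ('carol:300', 'alice:7', 'bob:42', 'dave:42');

my @sorted =
    map  { substr($_, 4) }                     # 3. strip the 4-byte key prefix
    sort                                       # 2. plain string sort, no sort block
    map  { pack('N', (split /:/)[1]) . $_ }    # 1. prefix each record with a packed key
    @records;

print join(' ', @sorted), "\n";
```

This prints `alice:7 bob:42 dave:42 carol:300`. Records with equal keys fall back to comparing the record text itself, because the record follows the key in the sorted string.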
