Performance and Scalability Tuning

Scalable Performance

Building enterprise-scale web applications that perform

Scalability vs. Performance

• Scalability
o Ratio of the increase in throughput to an increase in resources
o Support additional users at the least incremental cost
o Predictability of application behavior as users are added
• Performance
o Serving a single request in the shortest amount of time
o Inverse of latency
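
As a rough illustration of the scalability definition above, the ratio can be computed as the relative gain in throughput divided by the relative gain in resources. The function name and the sample figures are hypothetical, not measurements from this deck.

```python
def scalability_ratio(throughput_before, throughput_after,
                      resources_before, resources_after):
    """Relative throughput gain divided by relative resource gain.

    1.0 means linear scaling; below 1.0 means each added resource
    buys less throughput than the ones before it.
    """
    throughput_gain = (throughput_after - throughput_before) / throughput_before
    resource_gain = (resources_after - resources_before) / resources_before
    return throughput_gain / resource_gain


# Hypothetical numbers: going from 2 to 4 app servers raises
# throughput from 400 to 700 requests per second.
ratio = scalability_ratio(400, 700, 2, 4)
print(f"scalability ratio: {ratio:.2f}")
```

A ratio that stays close to 1.0 as servers are added corresponds to the "predictable behavior" bullet; a falling ratio suggests a bottleneck in a shared tier.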

Scalable Performance

• Number of requests that can be concurrently served
(throughput) while meeting a minimum level of service
(response time)
• Measuring:
o Resource utilization
o Throughput
o Response time
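
One way to read this definition: run a stepped load test, then take the highest throughput among the steps that still meet the response-time target. The sketch below assumes a hypothetical `LoadStep` record and made-up measurements; it is not tied to any particular load-testing tool.

```python
from dataclasses import dataclass


@dataclass
class LoadStep:
    concurrent_users: int
    throughput_rps: float      # requests served per second at this step
    response_time_ms: float    # e.g. 95th percentile at this step


def scalable_performance(steps, max_response_time_ms):
    """Return the step with the highest throughput that still meets
    the response-time target, or None if no step meets it."""
    within_target = [s for s in steps
                     if s.response_time_ms <= max_response_time_ms]
    return max(within_target, key=lambda s: s.throughput_rps, default=None)


# Hypothetical load-test results.
steps = [
    LoadStep(50, 120.0, 180.0),
    LoadStep(100, 230.0, 240.0),
    LoadStep(200, 310.0, 450.0),
    LoadStep(400, 330.0, 1900.0),  # more throughput, but far too slow
]
best = scalable_performance(steps, max_response_time_ms=500.0)
print(best)
```

In this made-up data the 400-user step has the highest raw throughput, but it misses the service level, so the 200-user step is the one that counts.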

Horizontal vs. Vertical Scaling

• Horizontal scaling
o Increase the hardware resources
o Separate types of processing into tiers
o Commodity hardware is a predictable cost per user
o Limitations to scaling are dictated by application architecture
o Can degrade performance
• Vertical scaling
o Improve hardware capabilities (CPU, RAM, storage, etc.)
o No network bottleneck
o Becomes increasingly expensive
o Practical and financial limitations to the ability to scale
o Typically increases performance

General Observations

• Performance decreases in each later tier (LB > web > app >
DB) due to increasing complexity
o Service requests in the earliest possible tier
• The cost of scalability increases in each later tier (LB < web <
app < DB)
o Architect bottlenecks in the earliest possible tier
• Scalable performance is ultimately limited by the operations
that do not scale linearly
• Ideally, each request that makes it to the database tier
would have its own connection
o Realistically, this means serving requests in earlier tiers because of
constraints on DB scaling

Application Bottlenecks

• Thread starvation
• Thread contention
• IO contention
• IO performance
• Memory limitations
• Data access

Resource Capacity Settings

• Database CPUs
• Database connection pool
• Application server CPUs
• Application server thread pool
• JVM Heap settings
• Web server thread pool

Resource Capacity Settings

• Walk through the application architecture and identify the
points where a request could potentially wait.
• Open all wait points.
• Generate balanced and representative load against the
environment.
• Identify the limiting wait point’s saturation point.
• Tighten all wait points to facilitate only the maximum load of
the limiting wait point.
• Force all pending requests to wait at the Web server.
• Add more resources.
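
The middle steps of this procedure can be sketched as a small calculation. It assumes, as a simplification, that every request passes through each wait point and holds one slot there; the wait point names and saturation figures are hypothetical.

```python
def tighten_wait_points(saturation_points):
    """Find the limiting wait point and size every wait point to it.

    saturation_points maps a wait point name to the highest number of
    concurrent requests it sustained before degrading, measured under
    representative load with all wait points opened up.
    """
    limiting = min(saturation_points, key=saturation_points.get)
    limit = saturation_points[limiting]
    # Let through only what the limiting wait point can handle, so
    # excess requests queue in front of the first tier (the web server).
    settings = {name: limit for name in saturation_points}
    return limiting, settings


# Hypothetical measurements from a load test.
measured = {
    "web server thread pool": 400,
    "app server thread pool": 150,
    "database connection pool": 60,
}
limiting, settings = tighten_wait_points(measured)
print(limiting, settings)
```

In a real system not every request reaches every tier, so the tightened values would normally be scaled by the share of traffic each tier sees rather than set equal.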

Profiling

• Long running http requests
• Long running methods
• Memory leaks
• Deadlocks
• Long running queries

Database Tier

• System of record
• Difficult to scale horizontally and expensive to scale
vertically
• Keep connections limited to what the server will support
o Block at the app tier
• Perform data processing on a staging server
• Each database has its own optimization techniques
o Explain plan to locate and eliminate full table scans
o Query and table caches
o Buffer sizes
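
A minimal sketch of "block at the app tier": a semaphore caps the number of connections in use, and callers wait in the application instead of piling onto the database. `ConnectionLimiter` is a hypothetical helper, and an in-memory SQLite database stands in for the real server.

```python
import sqlite3
import threading
from contextlib import contextmanager


class ConnectionLimiter:
    """Caps concurrent database connections; extra callers block here."""

    def __init__(self, max_connections, connect):
        self._slots = threading.BoundedSemaphore(max_connections)
        self._connect = connect  # callable that opens a new connection

    @contextmanager
    def connection(self, timeout=None):
        # Wait in the app tier until a slot frees up (or time out).
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("no database connection available")
        try:
            conn = self._connect()
            try:
                yield conn
            finally:
                conn.close()
        finally:
            self._slots.release()


# Assumed limit of 2 connections, purely for illustration.
limiter = ConnectionLimiter(2, lambda: sqlite3.connect(":memory:"))
with limiter.connection() as conn:
    value = conn.execute("SELECT 1").fetchone()[0]
print(value)
```

A production pool would also reuse connections rather than opening one per request; the sketch only shows where the waiting happens.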

Application Tier

• Generally cannot be stateless due to security and business
requirements
• Sticky vs. clustered sessions
o Use the HttpSession sparingly
o If the app does not need to be HA, it may not require clean failover
and can drop sessions
• Scaling horizontally could potentially put more load on the
DB
o Caching can be used to offset the load

Application Caching

• Read-only data can be cached like static data in the web
tier
• Writable data can be cached but will impose limitations on
clustering
o Will probably need to be either kept in sync across the cluster or turned off
completely
• Filters can provide caching of dynamic, secure data at the
earliest point in this tier
o Caches entire response
o Not recommended for user-specific data
• Service layer or data access caches provide a simple way
to stop requests from continuing to the database
o Transparent to the calling code
o Cache interceptors

Tomcat Tuning

• maxThreads controls actively served connections
• backlog (the acceptCount connector attribute) controls the
number of connections that can be queued
• maxThreads + backlog = total accepted connections
• connectionTimeout can be used to drop faulty connections
• bufferSize is by default set to -1, no buffering of output
• Keep heap size manageable, < 2GB
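
A worked example of the connection arithmetic on this slide, following its model as written. The values for maxThreads, backlog, and the incoming load are hypothetical, and real behavior also depends on the connector type and the operating system.

```python
def connection_capacity(max_threads, backlog):
    """Split accepted connections into served and queued."""
    return {
        "served": max_threads,               # handled by a thread right away
        "queued": backlog,                   # waiting for a free thread
        "accepted": max_threads + backlog,   # total before refusals start
    }


# Hypothetical settings and load.
capacity = connection_capacity(max_threads=150, backlog=100)
concurrent_connections = 300
refused = max(0, concurrent_connections - capacity["accepted"])
print(capacity, refused)
```

With these numbers 250 connections are accepted and the remaining 50 are turned away, which is the point at which an earlier tier should be holding the overflow instead.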

Web Tier

• Clustering is easiest in the web tier because web servers are
generally stateless
• Web tier clustering can provide super-linear scaling (IO
contention, context switching costs)
• The web tier should serve all static data (images, static
html)
• The web tier can serve dynamic requests by caching non-secure
data that only depends upon URL parameters
o Squid reverse proxy
o Apache mod_cache
o Memcached
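
When a response depends only on URL parameters, the URL itself can serve as the cache key. The sketch below is one possible normalization (path plus sorted query parameters); the `cache_key` function and the example URLs are hypothetical and do not reflect how Squid, mod_cache, or Memcached build keys internally.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url):
    """Build a key from the path and the sorted query parameters,
    so the same request in a different parameter order hits the same entry."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return f"{parts.path}?{urlencode(params)}"


key_a = cache_key("http://example.com/catalog?page=2&category=books")
key_b = cache_key("http://example.com/catalog?category=books&page=2")
print(key_a)
```

Anything that varies by user (cookies, session data) is deliberately left out of the key, which is why this approach only suits non-secure data.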

Apache Tuning

• Limit connections in the web tier to prevent overloading
later tiers (MaxClients)
• ServerLimit x memory per process < available RAM, to limit
swapping
• Avg connections = ThreadsPerChild x Apache hosts / App
Server hosts
• ProxyPass max = ThreadsPerChild
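
The two formulas on this slide can be checked with a few lines of arithmetic. The sketch follows them as written; the function names and all numbers (process size, RAM, host counts) are hypothetical.

```python
def fits_in_ram(server_limit, memory_per_process_mb, ram_available_mb):
    """True if ServerLimit x memory per process stays below available RAM."""
    return server_limit * memory_per_process_mb < ram_available_mb


def avg_connections_per_app_server(threads_per_child, apache_hosts,
                                   app_server_hosts):
    """Avg connections = ThreadsPerChild x Apache hosts / app server hosts."""
    return threads_per_child * apache_hosts / app_server_hosts


# Hypothetical sizing: 50 MB per Apache process, 1024 MB of RAM.
ok_16 = fits_in_ram(16, 50, 1024)   # 800 MB
ok_32 = fits_in_ram(32, 50, 1024)   # 1600 MB, would swap
avg = avg_connections_per_app_server(threads_per_child=25,
                                     apache_hosts=4, app_server_hosts=2)
print(ok_16, ok_32, avg)
```

The average connection count is the figure to compare against the application server thread pool, so that the web tier does not send more than the next tier can serve.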

Other

• Grid Caches
o GigaSpaces
o Coherence
• CDN
o Akamai
• Compute Appliances
o Azul Systems

References

• http://www.theserverside.com/tt/articles/content/JIApresentations/JIA-HASP.pdf
• http://www.mnot.net/cache_docs/
• http://httpd.apache.org/docs/2.2/misc/perf-tuning.html
• http://www.yourkit.com/overview/index.jsp
• http://dev2dev.bea.com/pub/a/2006/05/declarative-caching.html
• http://azulsystems.com/
• http://www.theserverside.com/tt/knowledgecenter/knowledgecenter.tss?l=ProJavaEE_Ch06
