How to build integrations with Mule ESB that meet your customers' needs around performance, scalability, and reliability. Presented with Rupesh Ramachandran at MuleSoft CONNECT 2014.
Performance: Case Study
Test Case: Mule ESB 3.5 EE as API Gateway
Infrastructure: Amazon EC2 with 10GbE network
Throughput: ~8000 tps
Latency: ~5 ms
Scale: linear scale-out, to 6 boxes*
Someone is going to define them, intentionally or not
Personal story: non-functional requirements left undefined become emergent properties you discover in production
Instead, determine these requirements along with the functional ones
Define terms: Service Level Agreement (SLA), originally from telecom, is used informally in integration work to describe precise non-functional requirements at system boundaries. OLA (Operational Level Agreement) is another term you may see used.
Opposing Forces
Tuning for reliability can negatively impact performance and scalability, …
Complexity is the currency
Primary:
Response Time
TPS (transactions per second)
Secondary:
How many concurrent users can be handled
Is performance sustainable over time?
Peak throughput and response times require tuning of the entire stack. For instance:
JVM (heap size, GC algorithm chosen for low GC pause)
OS (Linux ulimits, TCP/IP stack settings, HugePages, etc.)
File system (avoid excess logging; SSD vs. HDD vs. NFS)
Network (1 Gb Ethernet cannot carry more than 125 MB/s per channel and quickly becomes the bottleneck)
Downstream systems: backend systems being integrated by Mule (DB, SOAP web services, JMS servers, WebSphere MQ, etc.) need to be tuned/scaled to perform as well as or better than the ESB, or they become the limiting factor
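As a quick sanity check for the JVM items above, the effective heap size and active GC algorithm can be read back through the standard JDK management beans (a minimal sketch, not Mule-specific):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class JvmTuningCheck {
    public static void main(String[] args) {
        // Confirm the heap actually picked up the -Xmx setting.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap (MB): " + maxHeapMb);

        // The bean names reveal which collector is active
        // (e.g. "G1 Young Generation" when running with -XX:+UseG1GC).
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("GC: " + gc.getName()
                    + " collections=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Collection counts and times from these beans are also a cheap way to watch for long GC pauses during a load test.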
Synchronous: request-response with fast response times. Synchronous processing allows a single thread per request and avoids context-switching overhead. If a request takes too long, that thread is held up even though it is rarely on the CPU, so core usage is sub-optimal. For non-blocking synchronous HTTP, use Jetty.
Asynchronous: leverages SEDA and provides better CPU utilization for longer-running processes; threads are not held up.
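The trade-off can be sketched outside Mule with plain Java executors: a synchronous call holds its thread for the full request, while handing slow work to a pool frees the caller (the slow backend here is a hypothetical stand-in, simulated with a sleep):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SyncVsAsync {
    // Hypothetical slow downstream call, simulated with a sleep.
    static String slowBackend() throws InterruptedException {
        Thread.sleep(100);
        return "response";
    }

    public static void main(String[] args) throws Exception {
        // Synchronous: the calling thread is held for the full 100 ms,
        // mostly off-CPU, which wastes a thread under load.
        String sync = slowBackend();

        // Asynchronous: the call runs on a pool thread; the caller is free
        // to accept other work and only blocks when it needs the result.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Future<String> async = pool.submit(SyncVsAsync::slowBackend);
        String result = async.get();
        pool.shutdown();

        System.out.println(sync + " / " + result);
    }
}
```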
Real-time: typically synchronous, but not necessarily. All records are read into memory and processed at once, which is problematic with occasional spikes or large payloads.
Batch: records are processed in stages/batches, like ETL jobs.
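The batch idea can be sketched in plain Java (an illustration, not Mule's batch module): process records in fixed-size chunks so memory stays bounded instead of reading everything at once.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchChunks {
    // Split records into fixed-size batches; the last batch may be smaller.
    static <T> List<List<T>> chunks(List<T> records, int size) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < records.size(); i += size) {
            out.add(records.subList(i, Math.min(i + size, records.size())));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 10; i++) records.add(i);
        // 10 records in batches of 4 -> batches of 4, 4, and 2.
        List<List<Integer>> batches = chunks(records, 4);
        System.out.println(batches.size() + " batches");
    }
}
```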
Stateful: typically required for long-running processes, for H/A via an in-memory grid, and for state-based flow controls like aggregator, scatter-gather, etc.
Stateless: no state means no persistence overhead; helps achieve web scale.
Managing SLAs depending on the consumer
Get SLA metrics from API Analytics
Works for all HTTP endpoints
Business-critical API. Backend is a .NET WS and a REST service. Payloads 10k–500k.
8000 tps = 650+ million records per day (could go higher if not for 10GbE saturation)
Average latency of 5 ms is round-trip from the test client; Mule-added latency is approximately 1 ms
The scale test was done on smaller boxes; it stopped at 6 because the 10GbE network was saturated.
Single Mule ESB: 2 GB heap on a 36 GB RAM box; 50% CPU used on a 24-vCPU box. SOAP proxy use case with data transformation from SOAP to REST.
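The arithmetic behind these figures is easy to verify; the ~150 KB average payload below is an assumption chosen inside the stated 10k–500k range:

```java
public class SlaMath {
    public static void main(String[] args) {
        long tps = 8000;
        long perDay = tps * 86_400L;                  // seconds in a day
        System.out.println("records/day: " + perDay); // 691,200,000 -> "650+ million"

        // 10GbE tops out at 10,000 Mbit/s = 1250 MB/s.
        // At an assumed ~150 KB average payload, 8000 tps moves ~1170 MB/s,
        // which is why the network saturates before CPU does.
        double mbPerSec = tps * 150 / 1024.0;
        System.out.printf("approx MB/s: %.0f (link limit 1250)%n", mbPerSec);
    }
}
```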
Prioritize: establishing a baseline one SLA at a time lets you protect the most important things first
Model: how will you generate load? How realistic is it?
Measure: instrument, record, apply statistics
<60 seconds, describe iterative process
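For the "apply statistics" part of Measure, averages hide tail latency; a minimal sketch of the nearest-rank percentile over recorded response times (sample values are illustrative):

```java
import java.util.Arrays;

public class LatencyStats {
    // Nearest-rank percentile: p in (0, 100].
    static double percentile(double[] samplesMs, double p) {
        double[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int idx = (int) Math.ceil(p / 100.0 * sorted.length) - 1;
        return sorted[Math.max(0, idx)];
    }

    public static void main(String[] args) {
        // Mostly ~5 ms with one 50 ms outlier: the mean looks acceptable,
        // while the p95 exposes the tail.
        double[] samples = {4, 5, 5, 6, 5, 4, 50, 5, 6, 5};
        double avg = Arrays.stream(samples).average().orElse(0);
        System.out.printf("avg=%.1f ms, p95=%.1f ms%n", avg, percentile(samples, 95));
    }
}
```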
Scale up = additive; scale out = multiplicative
SEDA: focuses threading controls on hot spots
Store and forward: supports out-of-process SEDA
MOM (message-oriented middleware): supports scale-out
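SEDA's core idea, stages decoupled by queues with independently tuned thread pools, can be sketched with plain Java concurrency primitives (an illustration, not Mule's internal implementation):

```java
import java.util.concurrent.*;

public class SedaSketch {
    public static void main(String[] args) throws Exception {
        // Bounded queue between stages: when it fills, the producer blocks,
        // which is the backpressure that protects the hot spot.
        BlockingQueue<String> stageQueue = new LinkedBlockingQueue<>(100);

        // Stage 1: receive events and enqueue (cheap; runs first here).
        for (int i = 0; i < 5; i++) stageQueue.put("event-" + i);

        // Stage 2: its own pool, sized for the expensive work.
        ExecutorService stage2 = Executors.newFixedThreadPool(2);
        ConcurrentLinkedQueue<String> done = new ConcurrentLinkedQueue<>();
        for (int w = 0; w < 2; w++) {
            stage2.submit(() -> {
                String e;
                while ((e = stageQueue.poll()) != null) {
                    done.add(e.toUpperCase()); // stand-in for real processing
                }
            });
        }
        stage2.shutdown();
        stage2.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("processed: " + done.size());
    }
}
```

Store and forward swaps the in-memory queue for a persistent one, and MOM moves it out of process entirely, which is what enables scale-out.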
Work with the project manager to determine the list of services that will go live, and what realistic and peak load might be
Rank them by importance and by volume, so we work on the right one first
Mock outbound services; instrument with JMX, a profiler, and network monitoring
Model request patterns in a JMeter test plan; execute against the QA environment
Observe response time, error rate, CPU, memory, I/O, threads, network
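Several of the observed metrics map directly onto the JDK's management beans, so they can be recorded alongside JMeter's numbers (a minimal self-instrumentation sketch; a production setup would use remote JMX or a profiler instead):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.OperatingSystemMXBean;
import java.lang.management.ThreadMXBean;

public class ObserveMetrics {
    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();

        System.out.println("heap used (MB): "
                + mem.getHeapMemoryUsage().getUsed() / (1024 * 1024));
        System.out.println("live threads: " + threads.getThreadCount());
        // Returns -1 on platforms where load average is unavailable.
        System.out.println("system load avg: " + os.getSystemLoadAverage());
    }
}
```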
See next slides