Things you should know about Scalability!

WJAX 2011, 08.11.2011, Munich
Robert Mederer
Copyright © 2011 Accenture All Rights Reserved. Accenture, its logo, and High Performance Delivered are trademarks of Accenture.

Abstract
Delivering architecture at Internet scale poses several challenges that
must be solved before a system is ready for extreme scalability. This
session is about the art of scale, scalability, and scaling of web
architectures. It gives an overview of challenges, good practices and
solutions for achieving high scalability in web-based systems.


Who am I?

Robert Mederer
Lead Architecture & Execution
Anni-Albers-Straße 11
80807 München
Mobile: +49-175-57-68012
robert.mederer@accenture.com

Experience
•  2000 - 2005: Technology Architect and Software Engineer in several projects
•  2006: Technical Architecture Lead, Integration and Execution Architecture for Location-Based Service Provider
•  2009: Technical Architecture Lead, Frontend and Execution Architecture for a Government Agency
•  2009/2010: Technical Architecture and front-office integration build lead, Integration and Execution Architecture Financial Services Agency
•  2011: Architect and QA for Location Based Services Platform


Accenture
High performance achieved
Company Profile
•  Global management consulting, technology services and outsourcing company
•  236.000 employees
•  Rank 47 among the “Best Global Brands 2008”
•  Top 100 Employer
•  28 of the DAX-30-Companies
•  96 of the Fortune-Global-100
•  More than three-quarters of the Fortune-Global-500
•  87 of our Top 100-clients have been with us for 10 or more years
Worldwide Revenues $25.5 billion (in US$ billion, as of August 31, 2011)
[Chart: revenues by operating group: Communications & High Tech, Resources, Public Service, Financial Services, Products]


Local Accenture … ???
Geographic unit
•  Austria
•  Switzerland
•  Germany
Employees
•  >6000
•  We are hiring!
Exciting Technology work
•  Large scale projects (100+ people / multiple years)
•  Most challenging requirements
–  Stock Exchange / Banking / Trading Systems
–  AEMS Mobility Platform
–  Large Scale Web Applications (> 1M page views / day)
–  Batch Architectures
Locations: Berlin, Düsseldorf, Frankfurt, Erlangen/Nürnberg, Munich, Vienna, Zurich


Agenda

•  Introduction
•  Case Study
•  Solution and Good Practice
•  Further Topics
•  Conclusion




Introduction

High Scalability / Overload

Source: Ezprezzo

Introduction | Question

Audience?
Who are you?
– Developers
– Architects
– IT Managers
How large is your application (QPS)?
– 10-100?
– 100-1000?
– 1000-10000?
– 10000+?
How large is your total database?
– < 10 GB?
– 10 GB-100 GB?
– 100 GB-1 TB?
– 1 TB-10 TB?
– 10 TB+?

Introduction | What is Performance?

How do I know if I have a performance problem?

If your system is
slow for a single user


Introduction | What is Scalability?

How do I know if I have a scalability problem?

If your system is
fast for an individual user
but slow under high load


Non-Functional Testing
Performance Testing of Web based systems
Definition
•  Performance testing is defined as the technical investigation to determine
or validate the speed, scalability, and/or stability characteristics of the web
based system under test.
•  Performance-related activities, such as testing and tuning, are concerned
with achieving response times, throughput, and resource-utilization levels
that meet the performance objectives for the application (SLA) under test.
Key Types of Performance Testing:
•  Performance Testing: "Will it be fast enough?"
•  Load Testing: "Will it support all of my clients?"
•  Stress Testing: "What happens if something goes wrong?"
•  Capacity Testing: "What do I need to plan for when I get more customers?"
Source: Thomas Werft, Performance Engineer at Accenture

Non-Functional Testing
Performance Testing of Web based systems
Key performance indicators:
•  Response Time (first / last byte in ms)
–  Average: the value found by adding all of the numbers in a set together and then dividing by the quantity of numbers in the set.
–  Percentile (Target 98%): a measure that tells us what percent of the total frequency scored at or below that measure.
–  Median: the middle value in a data set when sequenced from lowest to highest.
•  Throughput (QPS)
–  Requests per Second; Transactions per Second: throughput is the number of units of work that can be handled per unit of time; for instance, requests per second, calls per day, hits per second, reports per year, etc.
•  Resource Utilization
–  Processor; Memory; Disk I/O; Network I/O: resource utilization is the cost of the project in terms of system resources. Utilization is the percentage of time that a resource is busy servicing user requests. The remaining percentage of time is considered idle time.
→ Results are used for Performance Engineering, Performance Tuning
Source: Thomas Werft, Performance Engineer at Accenture
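
The response time KPIs above can be computed from raw measurements as in the following minimal sketch. The sample data and the function names are made up for illustration, and the nearest-rank method is only one of several common percentile definitions.

```python
import math
import statistics


def nearest_rank_percentile(samples, percent):
    """Smallest sample with at least `percent` % of all samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(percent * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(response_times_ms):
    """Return the three response time KPIs for a list of measurements."""
    return {
        "average": statistics.mean(response_times_ms),
        "median": statistics.median(response_times_ms),
        "p98": nearest_rank_percentile(response_times_ms, 98),
    }


# Hypothetical measurement: 95 fast requests and 5 slow outliers.
sample_ms = [20] * 95 + [1000] * 5
summary = summarize(sample_ms)
print(summary)
```

In this made-up sample the median hides the slow requests completely and the average blurs them, while the 98th percentile exposes them. This is one reason to define a target as a percentile.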


Scalability
Definition

A system’s capacity to uphold the
same performance under heavier
volumes.

Source: Patterns for Performance and Operability: Building and Testing Enterprise Software, Chris Ford et al., 2008


Vertical Scalability
Is achieved by increasing the capacity of a single node
•  CPU,
•  Memory,
•  Bandwidth, …
Simple Process
•  Application is generally not affected by
those changes
Classic examples are large single systems like
•  HP Integrity Superdome
•  IBM Mainframe

Source: Hewlett-Packard


Horizontal Scalability
•  Application is spread across a cluster with several nodes
•  Nodes can be added to scale out
•  Produces overhead
–  Keep cluster consistent
–  Node error detection and handling
–  Communication between nodes
•  May be used to increase reliability and availability
•  Distributed systems and programs like
–  SETI@Home
–  World Wide Web
–  Domain Name System
Source: Space Sciences Laboratory, U.C. Berkeley

Introduction | Scalability Trade-Offs | Availability vs. Consistency

CAP Theorem (Brewer‘s Theorem)

• Consistency – all clients see the same data at the same time
• Availability – all clients can find all data even in presence of failure
• Partition Tolerance – system works even when nodes cannot reach each other (e.g. a node or network link failed)

[Diagram: Consistency, Availability and Partition Tolerance as overlapping areas; having all three at once is marked "Impossible"]
Source: PODC-keynote, Towards Robust Distributed Systems, Dr. Eric A. Brewer, 2000


CAP Theorem
Normally, only two of these properties can be achieved for any shared-data
system
Consistency + Availability
•  High data integrity
•  Single site, cluster database, LDAP, etc.
•  2-phase commit, data replication, etc.
Consistency + Partition Tolerance
•  Distributed database, distributed locking, etc.
•  Pessimistic locking, etc.
Availability + Partition Tolerance
•  High scalability
•  Distributed cache, DNS, etc.
•  Optimistic locking, expiration/leases (timeout), etc.

Source: “Architecting Cloudy Applications”, David Chou
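
The optimistic locking mentioned above can be sketched as follows. This is a single-process illustration with hypothetical class and function names, not the API of any particular product. A real store would have to perform the version check and the write as one atomic step.

```python
class VersionConflict(Exception):
    """Raised when a record changed since it was read."""


class VersionedStore:
    """In-memory key-value store in which every value carries a version number."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        # Unknown keys are reported as version 0 with no value.
        return self._data.get(key, (0, None))

    def write(self, key, expected_version, value):
        # Accept the write only if nobody changed the record in the meantime.
        current_version, _ = self.read(key)
        if current_version != expected_version:
            raise VersionConflict(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        self._data[key] = (current_version + 1, value)
        return current_version + 1


def update_with_retry(store, key, update, max_attempts=3):
    """Re-read and retry on conflict instead of holding a lock."""
    for _ in range(max_attempts):
        version, value = store.read(key)
        try:
            return store.write(key, version, update(value))
        except VersionConflict:
            continue
    raise VersionConflict(f"{key}: gave up after {max_attempts} attempts")
```

No lock is held between read and write, so readers never block. The price is that a writer has to be prepared to retry.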


Data and Scalability
Distributed non-relational data store solutions must relax guarantees
around consistency, partition tolerance and availability, resulting in
systems optimized for different combinations of properties.

Consistent & Available
•  RDBMSs (MySQL, Postgres, etc.)
•  Greenplum
•  Vertica

Available & Partition Tolerant
•  Cassandra
•  SimpleDB
•  CouchDB
•  Riak
•  Dynamo
•  Voldemort
•  Tokyo Cabinet
•  KAI

Consistent & Partition Tolerant
•  BigTable
•  HyperTable
•  HBase
•  MongoDB
•  Terrastore
•  Scalaris
•  BerkeleyDB
•  MemcacheDB
•  Redis

Data Models Key: Relational (comparison), Key-Value, Column-Oriented, Document-Oriented
Source: Visual Guide to NoSQL Systems, http://blog.nahurst.com/tag/cap


Analysis and Classification



ACID - Do I really need it?
Relational databases were originally designed for transactional data processing
– reliably processing and maintaining data integrity – on different HW architectures.
In order to guarantee transactional integrity, the traditional relational database
management system (RDBMS) was architected to guarantee four core properties:
Atomicity, Consistency, Isolation and Durability (ACID).

Atomicity
A database is said to be atomic if, when one part of the transaction fails, the
entire transaction fails and the database state is left unchanged.

Consistency
A database is said to be consistent if it remains in a consistent state after any
transaction. Therefore, if a transaction violates the consistency of the database
(e.g. the value is not the right type) then the transaction should be rolled back.

Isolation
A database is said to be isolated if transactions can’t have access to data
currently being modified by another transaction.

Durability
A database is said to be durable if it recovers all of the committed transactions
in the system even after system failure.
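
Atomicity and consistency can be tried out with the SQLite module from the Python standard library. The following is a minimal sketch. The `accounts` table, the names and the amounts are invented for illustration.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint was violated, so the credit to dst is undone too.
        return False


def balances(conn):
    """Return all balances as a dict of name -> balance."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()
```

A call such as `transfer(conn, "alice", "bob", 500)` credits bob first and then fails on the debit. Because both statements belong to one transaction, the credit is rolled back and the balances stay as they were.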



BASE
Modern Internet systems: focused on BASE
• Basically Available
• Soft-state (or scalable)
• Eventually consistent
Example: Amazon outage in April 2011 brought thousands
of customers down, including Pfizer, Netflix, Quora,
Foursquare, Reddit, …
• The Amazon.com 2010 Shareholder Letter Focusses on Technology
• http://www.allthingsdistributed.com/2011/04/the_amazoncom_2010_shareholder.html
• http://broadcast.oreilly.com/2011/04/the-aws-outage-the-clouds-shining-moment.html
• http://www.nytimes.com/2011/04/23/technology/23cloud.html

• http://www.allthingsdistributed.com/2007/12/eventually_consistent.html Dec. 2007
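
"Eventually consistent" can be illustrated with two replicas that accept writes independently and exchange their state later. The sketch below assumes a last-write-wins rule based on caller-supplied timestamps. That rule is only one of several possible conflict resolution strategies, and the `Replica` class is hypothetical.

```python
class Replica:
    """One copy of the data; accepts writes without asking other replicas."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Last write wins: keep the entry with the newest timestamp.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        # Background exchange: replay the other replica's entries locally.
        for key, (timestamp, value) in other.data.items():
            self.write(key, value, timestamp)


# Two replicas receive different writes for the same key.
a, b = Replica("a"), Replica("b")
a.write("stock", 5, timestamp=1)
b.write("stock", 4, timestamp=2)
before_sync = (a.read("stock"), b.read("stock"))  # replicas disagree
a.sync_from(b)
b.sync_from(a)
after_sync = (a.read("stock"), b.read("stock"))  # replicas agree again
```

Between the writes and the synchronization, clients of the two replicas see different values. Both replicas stay available the whole time, which is the trade-off BASE accepts.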



ACID vs. BASE
ACID
•  Strong consistency for transactions highest priority
•  Availability less important
•  Pessimistic
•  Complex mechanisms

BASE
•  Availability and scaling highest priorities
•  Weak consistency
•  Optimistic
•  Simple and fast


Introduction | Scalability Trade-Offs - Latency vs. Throughput

Network Latency vs. Throughput

Network protocols such as TCP have an inherent throughput bottleneck that
becomes more severe with increased packet loss and latency
Source: http://www.asperasoft.com/en/technology/shortcomings_of_TCP_2/the_shortcomings_of_TCP_file_transfer_2
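
The effect of latency and packet loss on a single TCP connection can be estimated with two well-known rules of thumb: at most one window of data per round trip, and the approximation by Mathis et al. for throughput under random loss. The sketch below is a rough model and not a measurement. The example values in the comments are assumptions.

```python
import math


def window_limited_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound when at most one window of data is in flight per round trip."""
    return window_bytes * 8 / rtt_seconds


def mathis_throughput_bps(mss_bytes, rtt_seconds, loss_rate):
    """Approximate steady-state TCP throughput: (MSS / RTT) * (C / sqrt(p)).

    C = sqrt(3/2) is the constant of the simplest variant of the model.
    """
    return (mss_bytes * 8 / rtt_seconds) * (math.sqrt(1.5) / math.sqrt(loss_rate))


# Assumed example: 64 KB window, 1460 byte segments, 100 ms round trip, 1 % loss.
window_bound = window_limited_throughput_bps(65535, 0.1)
loss_bound = mathis_throughput_bps(1460, 0.1, 0.01)
print(f"window bound: {window_bound / 1e6:.2f} Mbit/s")
print(f"loss bound:   {loss_bound / 1e6:.2f} Mbit/s")
```

Both bounds shrink in proportion to the round trip time, no matter how much bandwidth the link offers. This is one argument for moving content closer to the user.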

Introduction | Scalability and Edge Computing

Edge Computing
Transferring data or services from a centralized point to the
edge of the network
• Processing load is distributed
• Closer to the user
• Decreases latency
• Lower cost of hardware
• Increases service levels
• Greater flexibility in responding to
service requests
• Seasonal spikes in demand can be
off-loaded to other edge servers


Introduction | Caching

Caching and Types of Caches
Object cache
•  Store objects for the application to be reused
•  Cache data from database or generated by application
•  E.g. ehCache, memcached, etc.
Application Cache
•  Speed up performance or minimize resources used
•  Proxy caching / Reverse proxy caching
•  E.g. Squid, Varnish, etc
Content Delivery Network (CDN)
•  Faster response time and fewer requests on the origin servers
•  Push content closer to end user
•  E.g. Akamai, Savvis, Mirror Image Internet, Netscaler, Amazon
CloudFront, etc.
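
An object cache of the kind listed above can be sketched in a few lines. The example uses a plain dictionary with a time-to-live per entry. It is not the API of ehCache or memcached, and the class and parameter names are made up.

```python
import time


class TTLCache:
    """Object cache that reloads an entry once it is older than ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so that expiry can be tested
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the slow backend is not touched
        value = loader(key)  # cache miss or expired: ask the slow backend
        self._entries[key] = (now + self._ttl, value)
        return value
```

The loader stands for the expensive call, for example a database query. Expiring entries after a fixed time avoids explicit invalidation at the price of serving data that may be up to `ttl_seconds` old.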


CDN
Abstract architecture of a Content Delivery Network (CDN)

Source: Content Delivery Network (CDN) Research Directory, http://ww2.cs.mu.oz.au/~apathan/CDNs.html


CDN
Basic interaction flows in a CDN environment

Source: Basic interaction flows in a CDN environment, http://ww2.cs.mu.oz.au/~apathan/CDNs.html

Introduction
Basics
Load Balancing
Definition:
• Methodology to distribute workload across multiple computers
or a computer cluster, network links, central processing units,
disk drives, or other resources, to achieve optimal resource
utilization, maximize throughput, minimize response time, and
avoid overload
• Using multiple components with load balancing, instead of a
single component, may increase reliability through
redundancy. The load balancing service is usually provided by
dedicated software or hardware, such as a multilayer switch or
a Domain Name System server.


Introduction

Load Balancing (Major) Usage

Server LB
•  Distributing the load across multiple servers
•  Target is to scale beyond the capacity of one server, and to tolerate a
server failure.

Global Server LB
•  Directing users to different data center sites consisting of server farms
•  Target is to provide users with fast response time and to tolerate a
complete data center failure (availability, business continuity, disaster
recovery, geographic routing)

Firewall LB
•  Distribute the load across multiple firewalls
•  Target is to scale beyond the capacity of one firewall, and tolerate a
firewall failure.

Transparent Cache Switching
•  Transparently directs traffic to caches to accelerate the response time
for clients
•  Or improve the performance of web servers by offloading the static
content to caches.

Source: Load Balancing Servers, Firewalls, and Caches by Chandra Kopparapu; John Wiley & Sons © 2002

Introduction
Basics
Load Balancing Algorithms
Random Allocation
•  Pros: Simple to implement.
•  Cons: Can overload one server while others are under-utilized.
Round-Robin Allocation
•  Pros: Better than random allocation because the requests are equally
divided among the available servers in an orderly fashion.
•  Cons: Not sufficient when requests differ in the processing overhead
they require, or when the servers in the group do not have identical
specifications.
Weighted Round-Robin Allocation
•  Pros: Takes care of the capacity of the servers in the group.
•  Cons: Does not consider the advanced load balancing requirements such
as processing times for each individual request.
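
The three allocation strategies above can be sketched as Python generators. The server names and weights are placeholders. The weighted variant simply repeats each server according to its weight, which is the most naive form and sends requests to the same server in bursts.

```python
import itertools
import random


def random_allocation(servers, rng=random):
    """Pick any server for each request, ignoring all previous picks."""
    while True:
        yield rng.choice(servers)


def round_robin(servers):
    """Hand out the servers one after another, then start over."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """weights: dict of server -> positive integer weight (relative capacity)."""
    expanded = [
        server for server, weight in weights.items() for _ in range(weight)
    ]
    return itertools.cycle(expanded)


# Example with hypothetical servers: "big" has three times the capacity of "small".
picker = weighted_round_robin({"big": 3, "small": 1})
first_eight = list(itertools.islice(picker, 8))
```

None of the three looks at the current load of a server or at the cost of an individual request, which matches the cons listed above.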

Introduction
Basics
Server Load Balancing
•  Hardware
– Barracuda Networks
– Cisco Systems
– Citrix Systems
– F5 Networks (BIG-IP)
– Etc.
•  Software
– HAProxy
– Apache HTTP Server with mod_proxy for Tomcat
– …

Simple Load Balancing over DNS (list of IPs with round robin)
Does that work?
Problem:
• No real load balancing due to TTL of DNS
• No health check for service availability

Introduction | Load Balancing

Global Server Load Balancing

•  Functionality
– DNS based routing
– Based on IP GEO database (Geographic routing)
– Assumption: Local DNS for client

•  Provider
– F5 Networks (Global Load Balancing Solutions)
– UltraDNS (Traffic Controller Service)
– Level3 (Traffic Manager, BCDR Solution)

Introduction | Load Balancing

Global Server Load Balancing
Characteristics / Usage
•  Increase application availability in event of entire site failure or overload
(Business Continuity, Disaster Recovery)
•  Scale application performance by load balancing traffic across multiple
sites (Edge Computing (together with CDN))
•  Need for more granularity and control in directing Web traffic
•  More flexibility in building and managing Internet infrastructures
–  E.g. Site based downtime management during release upgrade
•  Cons: Does not always work, because it assumes that the client uses a
local DNS resolver (with a public DNS service or DNS over VPN, the lookup
can fail to return the nearest server location)
–  (see: http://www.royans.net/arch/fixing-gslb-global-server-load-balancing/)
•  Fix: Google proposed a DNS extension that bases the answer on (part of)
the client / end-user IP instead of the DNS resolver IP (see:
http://googlecode.blogspot.com/2010/01/proposal-to-extend-dns-protocol.html )




Case Study – Internet Scale Web Services

Case Study – Non-Functional Requirements

USA: 50 Mil.
EU: 30 Mil.
ASIA: 15 Mil.
AU: 2 Mil.

User groups:
•  Web Browser users
•  Mobile users

Availability: 99,99 %
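
To get a feeling for the availability requirement, the percentage can be converted into permitted downtime. The helper below is plain arithmetic. The function name and the 365-day year are choices made for this sketch.

```python
def allowed_downtime_minutes(availability_percent, period_days=365):
    """Minutes per period in which the service may be unavailable."""
    unavailable_fraction = (100 - availability_percent) / 100
    return unavailable_fraction * period_days * 24 * 60


per_year = allowed_downtime_minutes(99.99)
per_month = allowed_downtime_minutes(99.99, period_days=30)
print(f"99.99 %: {per_year:.2f} min per year, {per_month:.2f} min per 30 days")
```

An availability of 99,99 % leaves less than one hour of downtime per year, planned maintenance included. This is hard to reach without failover between sites.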


USA:
2 data centers:
•  New York
•  San Francisco
Peak: 20.000 QPS

EU:
2 data centers:
•  Frankfurt
•  London
Peak: 10.000 QPS

ASIA:
1 data center:
•  Singapore
Peak: 5.000 QPS

AU:
1 data center:
•  Sydney
Peak: 3.000 QPS

Performance in Case of Failure

USA:
Failover New York ↔ San Francisco: 40.000 QPS

EU:
Failover Frankfurt ↔ London: 20.000 QPS

AU / ASIA:
Failover Singapore ↔ Sydney: 8.000 QPS
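
The failover figures above are consistent with one reading of the requirements: each peak value applies per data center, and the surviving site of a pair has to carry the peak load of both. Under that assumption, which the slides do not state explicitly, the numbers can be reproduced as follows.

```python
# Peak load per data center in QPS (assumed to be per site, see lead-in).
PEAK_QPS = {
    "New York": 20_000,
    "San Francisco": 20_000,
    "Frankfurt": 10_000,
    "London": 10_000,
    "Singapore": 5_000,
    "Sydney": 3_000,
}

FAILOVER_PAIRS = [
    ("New York", "San Francisco"),
    ("Frankfurt", "London"),
    ("Singapore", "Sydney"),
]


def failover_capacity(pair):
    """QPS one site must sustain when its partner site is down."""
    return sum(PEAK_QPS[site] for site in pair)


required = {pair: failover_capacity(pair) for pair in FAILOVER_PAIRS}
for (first, second), qps in required.items():
    print(f"{first} <-> {second}: {qps} QPS")
```

In other words, every site has to be sized for the combined peak of its failover pair, which for identical sites is twice its own peak.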

Response Times

RESTful Web Services:
•  Calculate service: 100 ms (50 ms latency)
•  Binary service: 60 ms (50 ms latency)
•  Search service: 50 ms (50 ms latency)

Data

100 TByte in each geography
-  Binary (video, image, …)
-  Index data



Case Study – Solution




Further Topics

• Organization
– People, Process and Tools
– Governance (Lifecycle management)

• Where do I find the truth in a highly scaled and
distributed architecture?
– Logging
• Log Analytics (e.g. Scribe (not really), Splunk)
– End-to-end data visualization




Conclusion

Content Caching
Reverse proxy Caching
•  Fast and Scales well
•  Dealing with invalidation is tricky
•  Direct cache invalidation scales badly
•  Instead, change URLs of modified resources
•  Old ones will drop out of cache naturally
CDN – Content Delivery Network
•  Faster response time and fewer requests on the origin servers
•  No 100% control of caching. Based on internal statistics (Akamai).
•  Operated by 3rd parties. Already in place. Not for Free
•  Once something is cached on CDN, assume that it never changes
•  Sometimes does load balancing as well
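
The advice above to change the URLs of modified resources, instead of invalidating caches directly, is often implemented by putting a content hash into the file name. A minimal sketch, assuming static files and an invented naming scheme:

```python
import hashlib
from pathlib import PurePosixPath


def versioned_url(path, content, digest_chars=8):
    """Return the path with a short hash of the content before the suffix."""
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    original = PurePosixPath(path)
    return str(original.with_name(f"{original.stem}.{digest}{original.suffix}"))


# Hypothetical stylesheet in two versions.
old_url = versioned_url("/static/app.css", b"body { color: black; }")
new_url = versioned_url("/static/app.css", b"body { color: navy; }")
print(old_url)
print(new_url)
```

Since every change of the content produces a new URL, reverse proxies and CDNs can cache each URL without an expiry concern, and the old version simply drops out of the cache once nobody requests it anymore.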


Conclusion

Common Concepts of Scalable Architecture

7 Habits of Good Distributed Systems
•  parallelization
•  asynchronous
•  idempotent operations
•  partitioned data
•  fault-tolerance
•  optimistic concurrency
•  shared nothing
•  loosely coupled
Source: "Architecting Cloudy Applications", David Chou
Source: highscalability.com

Conclusion

Questionnaire
•  Is there a need to scale my application?
–  Vertical scaling is easier to achieve (Cost)
–  Use horizontal scaling only when required (Complexity)

•  Is there a plan to prove your designed solution?
–  Plan to do a lot of realistic Proof-of-Concepts

•  Is there a one size fits all solution?
–  NO!

•  How important is ACID?
– Is BASE enough?
– Can a NoSQL solution be used?

References
The Art of Scalability: Scalable Web Architecture, Processes and Organizations for the Modern Enterprise; Michael T. Fisher, Martin L. Abbott; Addison-Wesley Professional; 1 edition

Scalability Rules: 50 Principles for Scaling Web Sites; Martin L. Abbott, Michael T. Fisher; Addison-Wesley Professional; 1 edition (May 15, 2011)

Scalable Internet Architectures; Theo Schlossnagle; Sams; 1 edition (July 31, 2006)

Building Scalable Web Sites; Cal Henderson; O'Reilly

Websites: HighScalability.com, InfoQ.com, Qcon.com, …


Thank You!

Contribution and Review:
Bukowski, Markus; Conradt, Steffen; Jacobs, Mareike; Krogemann,
Markus; Peuker, Jan; Van Isacker, Pieter; Wagenknecht, Dominik; Wagner,
Hubert; Werft, Thomas; Zakotnik, Jure

