Architecting for the Cloud
Len and Matt Bass
Scalability
Link to yesterday’s slides
http://www.slideshare.net/lenbass/architecting-for-the-cloud-intro-virtualization-iaa-s
Outline
• Introduction to scalability
• CPU scaling
• I/O scaling
Characteristic of cloud from NIST
• On-demand self-service. A consumer can unilaterally
provision computing capabilities, such as server time and
network storage, as needed automatically without requiring
human interaction with each service’s provider.
Scale in the Cloud
• Many people think that you get scalability just by virtue
of being in the cloud
• This isn’t true
• What the cloud gives you is the ability to quickly and
easily add resources
– It doesn’t guarantee that this results in additional capacity
• Just like with security you need to design scalability in
What is Scalability?
• (Problem definition) Scalability is the ability of a
system to support a growing amount of work.
– May be from additional users
– May be from additional requests from current users
– May be from operational activities.
• (Solution definition) Scalability is the ability to
increase or decrease the resources available to
your application by either changing the number
of servers or disks or changing the size of the
servers or disks.
Why scale?
• Are more users always a good thing?
– This is a cost/benefit question.
– More users have benefits – presumably more people
receive service and the organization earns more revenue.
– More users have a cost – hardware, software, and
personnel.
• Do costs scale linearly with users?
– For Netflix, the answer is yes.
– For LinkedIn, the answer is no.
The different aspects of scalability
• Adding users
– Large numbers of new users may require new computation
facilities
• Adding data
– Large amounts of new data require
• More computation
• Careful attention to the distribution of this data.
• Adding computation
– Computation is embedded in virtual machines
– Elasticity means adding new virtual machines
• Scaling should not impact existing activities
• May need to scale by adding computation capacity (CPU) or
by adding I/O capacity
Scaling Up vs Scaling Out
• Scaling up means adding more capacity to
existing hardware
– More memory
– More disk
– Faster CPU or more cores
• Scaling out means adding additional hardware
– More systems
Costs in scaling out
• Each virtual machine has a cost – per hour
• Licensing costs.
– Many software packages charge licenses per CPU or per (virtual)
computer.
– Every new instance that utilizes one of these packages incurs
licensing costs
• Personnel costs
– In small to medium size organizations, one sysadmin can
administer ~30 machines.
– In large, highly automated organizations, one sysadmin can
administer ~1000s of machines.
– A movement called “DevOps” has as one goal the reduction of
personnel costs in operations (more on this later).
How much lead time for growth of
number of users?
• Some things are predictable
– Seasonal variation.
• Christmas
• Tax season
– Daily variation
• Working hours or non-working hours in various time zones
• Holidays
– Promotions or special offers
– Sporting events
• Other things are not predictable
– Being “SlashDotted”
– News items
– Rapid growth in popularity of a company.
– Disaster
Managing growth in number of users
• A lead time allows planning
– Restructure database
– Add or restructure software
• When no lead time is available, elasticity of
the cloud is the main mechanism.
Outline
• Introduction to scalability
• CPU scaling
– Load balancers
– Rule Based Scaling
– Scaling Patterns
• I/O scaling
Why have a load balancer?
• Suppose there are too many users for a single instance of a service
• The cloud allows us to create another instance of that service
(elasticity)
• We would like to have half the users use one instance and half
use the other
• Two options:
1. Couple instances and users (half and half). This is accomplished by
having users access an instance of a service directly by IP address.
2. Use an intermediary (load balancer) to distribute half of the
requests to one instance and the other half to the other.
Option 2 is preferable for a variety of reasons which we will see.
Load Balancing
• Physically a load balancer is a box that
looks like it belongs in a computer
network.
Load Balancer
Logically, a load balancer takes requests from
clients and distributes them to copies of an
application executing on multiple different
servers
[Figure: clients, load balancer, servers]
Message sequence – client makes a request
Message sequence – request arrives at load balancer
Message sequence – request is sent to one server
Message sequence – reply goes directly back to client
Suppose Load Balancer Becomes Overloaded –
Load Balance the Load Balancers
Hierarchy of Load Balancers
• Server always sends message back to client.
• Load balancers use a variety of algorithms to choose
the instance for a message
– Round robin. Rotate requests evenly across instances.
– Weighted round robin. Rotate requests according to some
weighting.
– Hashing. Hash the IP address of the source to determine the
instance. A request from a particular client is then always sent
to the same instance as long as that instance is still in service.
• Note that these algorithms do not require knowledge
of an instance’s load. That situation we will cover in a
little bit.
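A minimal Python sketch of the three selection algorithms above. The instance names, the client address, and the choice of CRC32 as the hash are assumptions made for illustration; the slides do not specify them.

```python
import itertools
import zlib


def round_robin(instances):
    """Yield instances in strict rotation."""
    return itertools.cycle(instances)


def weighted_round_robin(weights):
    """Yield instances in rotation, repeating each one according to its weight.

    `weights` maps an instance name to a positive integer weight.
    """
    expanded = [name for name, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def ip_hash(client_ip, instances):
    """Pick an instance from the client's source IP address.

    The same IP maps to the same instance as long as the instance list
    does not change. CRC32 is used only because it is stable across runs.
    """
    return instances[zlib.crc32(client_ip.encode("ascii")) % len(instances)]


instances = ["vm-a", "vm-b", "vm-c"]  # hypothetical instance names

rr = round_robin(instances)
first_six = [next(rr) for _ in range(6)]

wrr = weighted_round_robin({"vm-a": 2, "vm-b": 1})
first_six_weighted = [next(wrr) for _ in range(6)]

chosen = ip_hash("203.0.113.7", instances)
```

None of the three functions looks at instance load, which matches the note above. The hash variant also shows the caveat: if the instance list changes, most clients are remapped to a different instance.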
Outline
• Introduction to scalability
• CPU scaling
– Load balancers
– Rule based scaling
– Scaling Patterns
• I/O scaling
Rule Based Scaling
Server
• A server is a virtual machine without any software
• A virtual machine can be allocated with varying amounts of
memory, CPU, disk
• Each variant has different cost, typically per hour
Machine Image
• A machine image is a copy of the contents of the memory
of a computer.
• A machine image may be created from any contents of a
computer. Some options:
– Bare metal
– With OS
– With LAMP Stack
• Linux
• Apache HTTP Server
• MySQL
• PHP or Python
• If licensed software is contained in the machine image,
then a license fee is paid when it is loaded
Executable Virtual Machine
• An executable virtual machine is created by
loading a machine image into a server.
• Executable virtual machine can then be
– Booted
– Paused
– Shut down
[Figure: Machine Image, Server]
Adding/Removing Resources
• Example shows two servers with one to be removed.
• Could be N servers with one to be added or removed
• Creating a new instance
takes some time
• Removing an instance also
takes time – it must satisfy
existing requests and be
detached from existing
connections.
Autoscaling group
• An autoscaling group is a collection of
instances that have been defined to be scaled
together.
• Typically these represent instances of the
same application.
Creating an autoscaling group
• An autoscaling group needs to know
– Machine image id
– VM type
– Scaling policy
Scaling Policy
• Specify minimum, maximum, and desired
number of instances
• Can specify scaling based on time of day
– E.g. scale up during 9:00-5:00 and down other times
• Can scale based on average CPU usage
– E.g. average CPU utilization <40% means delete
instance
– Average CPU utilization >60% means add instance.
– Values come from monitor.
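A rough Python sketch of the CPU rule above, using the 40% and 60% thresholds from the slide. The class and function names are hypothetical, and real providers add details this sketch leaves out (cooldown periods, step sizes, time-of-day schedules).

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    min_instances: int
    max_instances: int
    scale_in_below: float = 40.0   # average CPU % below which one instance is removed
    scale_out_above: float = 60.0  # average CPU % above which one instance is added


def desired_count(policy, current, cpu_samples):
    """Return the instance count the group should move to.

    `cpu_samples` holds one CPU utilization percentage per running instance,
    as reported by the monitor.
    """
    average = sum(cpu_samples) / len(cpu_samples)
    if average > policy.scale_out_above:
        target = current + 1
    elif average < policy.scale_in_below:
        target = current - 1
    else:
        target = current
    # Never leave the [min, max] range set by the policy.
    return max(policy.min_instances, min(policy.max_instances, target))


policy = ScalingPolicy(min_instances=2, max_instances=5)
```

With this policy, three instances averaging about 72% CPU would grow to four, and two instances at low load would stay at two because of the minimum.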
Outline
• Introduction to scalability
• CPU scaling
– Load balancers
– Rule Based Scaling
– Scaling Patterns
• I/O scaling
Scaling Patterns
• Autoscaling implements Push Pattern for
messages
• Another pattern is Pull Pattern
Push Pattern
Push Pattern Description
• Client sends a request (e.g. HTTP message) to
the app in the cloud.
• Request arrives at a load balancer
• Load balancer forwards request to one of the
VMs in the resource pool.
• Load balancer uses scheduling strategy to
decide which VM gets the request, e.g.
dispatch to VM with lowest CPU utilization.
How does the load balancer know?
• The load balancer knows CPU utilization of the VMs and it
knows how many requests it (the load balancer) has received,
and possibly how long it took to service the requests. It does
not know application specifics such as how many requests a
VM can process.
• When resource pool is overloaded, new resources are
allocated.
• The monitor decides (based on controller rules) when new
resources are needed. It must have direct insight into the VM
instances in order to do this. Hence, the monitor utilizes a
monitoring service provided by the cloud for each instance.
Pull architecture pattern (aka Producer-
Consumer)
Pull architecture description
• Each request from the client is application
specific and typed.
• The queue manager keeps a separate queue for each
application running on the VMs.
• A VM requests the next message of a particular
type (pull) and processes it.
• The monitor can now see how long a request
waits in a queue or the average queue length and
this is an indication of the load on the VMs that
have applications that service requests of that
type.
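The pull pattern can be sketched in a few lines of Python, with threads standing in for VMs and an in-process `queue.Queue` standing in for the queue manager. The request type "square" and the helper names are made up for illustration.

```python
import queue
import threading
from collections import defaultdict

# One queue per request type, as in the description above.
queues = defaultdict(queue.Queue)
results = []
results_lock = threading.Lock()


def submit(request_type, payload):
    """Client side: enqueue a typed request."""
    queues[request_type].put(payload)


def worker(request_type, handler):
    """VM side: pull requests of one type until a None sentinel arrives."""
    q = queues[request_type]
    while True:
        payload = q.get()
        if payload is None:
            break
        with results_lock:
            results.append(handler(payload))


def queue_length(request_type):
    """Monitor side: approximate backlog for one request type."""
    return queues[request_type].qsize()


for n in range(5):
    submit("square", n)
backlog_before = queue_length("square")

t = threading.Thread(target=worker, args=("square", lambda x: x * x))
t.start()
submit("square", None)  # sentinel: tell the worker to stop
t.join()
```

The monitor only needs `queue_length` (or the waiting time of the oldest request) to judge load for one request type; it does not need to inspect the workers themselves.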
Differences
• Push is more responsive to requests. They are
immediately forwarded to a service. There is a
possibility that the service is overloaded.
• Pull is less responsive since it relies on servers to
de-queue messages.
• In the pull architecture, a service polls for new
messages even if there is nothing in its queue and
this introduces overhead.
• It is easier to monitor and control workload in the
pull architecture since messages are application
specific and typed.
Outline
• Introduction to scalability
• CPU scaling
• I/O scaling
– Multiple sites
– Software techniques
I/O Scaling
• Scaling out assumes scaling requirement is
solved with more CPUs.
• It may be that I/O is also a problem.
– You may run your application in multiple sites
– Half the clients go to one site, half to another
Questions when you have multiple
sites
How do clients know which site to use?
How are databases used by the applications
coordinated across sites? (We defer this question.)
Domain Name Server (DNS)
Client sends the host name from the URL to DNS
DNS takes as input a host name and returns an IP address
Client uses IP address to send message to load balancer for a site
[Figure: the client asks the Domain Name Server (DNS) for Website.com and receives 123.45.67.89; Site 1 and Site 2 are shown]
DNS with multiple sites
• DNS server returns IP address of both sites.
• DNS server will vary which address is listed
first.
• Client will, typically, choose first entry.
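A toy Python model of the rotation described above. This is not a real DNS server; the class name is hypothetical and the addresses come from the ranges reserved for documentation.

```python
from collections import deque


class RotatingResolver:
    """Toy resolver that rotates the order of a name's addresses per query."""

    def __init__(self, records):
        self._records = {name: deque(addrs) for name, addrs in records.items()}

    def resolve(self, name):
        addrs = self._records[name]
        answer = list(addrs)
        addrs.rotate(-1)  # next query sees a different first entry
        return answer


# Documentation-range addresses standing in for the two sites.
resolver = RotatingResolver({"website.example": ["192.0.2.10", "198.51.100.20"]})
first_choices = [resolver.resolve("website.example")[0] for _ in range(4)]
```

Clients that take the first entry alternate between the two sites, which gives the rough half-and-half split. In practice, caching by clients and intermediate resolvers makes the split less even than this sketch suggests.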
[Figure: the Domain Name Server (DNS) returns both 123.45.67.89 and 456.77.88.99 for Website.com; Site 1 and Site 2 are shown]
Outline
• Introduction to scalability
• CPU scaling
• I/O scaling
– Multiple sites
– Software techniques
Recall Pull Pattern
To Scale for I/O - Make the queue
manager more sophisticated
Key Value Store
Publisher – takes values from key-value store and distributes them
Clients
Summary
• Scalability is the ability to respond to
increasing or decreasing workload
– Add CPU capacity through utilizing features of
cloud provider
– Add I/O capacity through
• Distributing requests to multiple sites
• Using fast message-passing software
QUESTIONS?
Architecting for the Cloud
Introduction to Availability
Outline
• What is availability
• Faults
• Availability patterns
Cost of Downtime
• According to a recent survey the average cost of
unplanned downtime is $7,900/minute*
• 91% of reporting companies have experienced an
unplanned outage in the last 24 months
• The average outage lasts 118 minutes
• The average frequency of outages over a 24
month period was:
– 10.16 limited outages
– 5.88 local outages
– 2.04 total outages
* Emerson Network Power, Ponemon Institute Study 2013
Cost of Downtime II
• As the previous numbers indicate downtime can be expensive
• Experienced in August 2013
– The New York Times had a 2-hour outage (stock price declined, Twitter
exploded, and the Wall Street Journal dropped its fees to try to
capture readership)
– Google had between 1 – 5 minutes of downtime (~$500,000 direct
loss and 40% reduction in overall web traffic)
– Amazon had an outage of under an hour (> $5 million)
• In addition to direct losses indirect losses are experienced
– Loss of confidence, reputation, and good will
– Productivity losses
– Compliance penalties
– …
Availability: a Business Concern
• The availability of the business service impacts
the earnings and associated value of an
organization
• If the organization relies on an IT system to
deliver business service then the availability of
the IT system impacts the value of the
organization
• In this section we are going to look at the
availability of the system
– We want to keep in mind, however, that the objective
is the availability of the business service
What Is Availability?
• Availability in general refers to the degree to
which a system is in an operable state
• This is typically articulated as the percentage
of time the system is available (or we’d like to
have the system available) e.g. 99.99%
• There are many related terms e.g.
– Availability
– Fault-Tolerance
– Reliability
How is Availability Measured?
Availability is typically measured as:
Availability = MTBF / (MTBF + MTTR)
MTBF = Mean Time Between Failures
MTTR = Mean Time To Repair
9s
Availability         Downtime per Year
90% (1-nine)         36.5 days/year
99% (2-nines)        3.65 days/year
99.9% (3-nines)      8.76 hours/year
99.99% (4-nines)     52 minutes/year
99.999% (5-nines)    5 minutes/year
99.9999% (6-nines)   31 seconds/year
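The formula and the table can be checked with a short Python calculation. The MTBF and MTTR figures are invented for the example, and a 365-day year is assumed.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability from mean time between failures and mean time to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_hours_per_year(avail, hours_per_year=365 * 24):
    """Expected unavailable hours in a 365-day year."""
    return (1.0 - avail) * hours_per_year


# Hypothetical component: fails on average every 999 hours, takes 1 hour to repair.
a = availability(mtbf_hours=999.0, mttr_hours=1.0)
three_nines_downtime = downtime_hours_per_year(a)
```

Here `a` is 0.999 (3-nines) and the downtime works out to 8.76 hours a year, matching the table row. Halving the MTTR would roughly halve the downtime, which is why repair time matters as much as failure rate.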
Calculating System Availability I
• Each component = 99% (3.65 days of downtime a year)
• The overall system, however, has an availability that
is the product of each component’s availability
– 99% × 99% ≈ 98% (7.26 days of downtime a year)
[Figure: two 99% components in series]
Calculating System Availability
• Each component = 99% (3.65 days of downtime
a year)
• The overall system in this case,
however, is based on the
likelihood that both components
would fail at the same time
1 – ((100% – 99%) × (100% – 99%)) =
99.99% (about 52 minutes of downtime a year!)
[Figure: Redundant Elements, two 99% components side by side]
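Both calculations, dependent components and redundant components, can be written as two small Python functions. The function names are illustrative, and the redundant case assumes the two components fail independently.

```python
from math import prod


def series(*components):
    """All components must be up: multiply the availabilities."""
    return prod(components)


def parallel(*components):
    """Any one component suffices: one minus the product of the unavailabilities.

    Assumes failures are independent, which real deployments may not satisfy.
    """
    return 1.0 - prod(1.0 - a for a in components)


HOURS_PER_YEAR = 365 * 24

in_series = series(0.99, 0.99)
redundant = parallel(0.99, 0.99)

series_downtime_days = (1.0 - in_series) * 365
redundant_downtime_minutes = (1.0 - redundant) * HOURS_PER_YEAR * 60
```

Two 99% components that both must work give 0.9801, about 7.26 days of downtime a year. The same two components used redundantly give 0.9999, about 52.6 minutes a year, in line with the 4-nines row of the table.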
Availability Measures
• A couple of things to keep in mind
– These measures refer to the mean not the minimum
time between failures
– As the MTBF increases the impact of MTTR decreases
– As the MTTR approaches 0 the overall availability
approaches 1
• Historically these measures were developed for
hardware components
Availability Requirements
• MTBF can be measured for operational systems
• How do you predict the MTBF for a system that
is yet to be built, however?
• Does it make sense to use the previously
defined availability measure as a requirement?
• If not, how should requirements be articulated?
Actionable Requirements
• Remember that as a business the concern is that the
services are available as needed
• In order to determine the likely availability of a
system (or design) you must
– Understand the likelihood that various kinds of faults could
occur
– Understand the impact of these faults on overall system
availability
• You must therefore translate the desired
business objective into a set of fault scenarios
End to End Availability
• Engineers often think about availability of some
portion of the system e.g.
– Availability of the database or web server
• Organizations, however, are concerned with end to
end availability
• When thinking about availability requirements you
should think about the organizational perspective
– Once you’ve done this you’ll then need to map this to the
engineering perspective
Requirements Vary
• We start with the desired requirements from a business perspective
• We then look at the system context to determine what faults might
disrupt the desired behavior
– This is likely an iterative process
• One thing to keep in mind is that different business contexts imply
different requirements
• Consider the needs of Discrete Manufacturing vs. Continuous
Manufacturing
• Discrete manufacturing is when you manufacture discrete products
– e.g. an automobile assembly line
• Continuous process automation is when you manufacture things like
chemicals or concrete
• How might the systems respond differently in the event of a fault?
Example Scenario
If a processor in one of the servers fails during
peak load, the system shall continue to operate
without dropping any of the current tasks and
without any noticeable delay
Relationship to Goals
• How does this scenario relate to availability goals?
– It does not in and of itself guarantee a particular level of
availability
• This, in conjunction with scenarios for other faults that
could impact a service, does improve availability, however
• In order to understand how to think about the design
we need to:
– Identify the activities that require availability
– Identify the related faults
– Identify the desired response if the fault occurs
Outline
• What is availability
• Faults
• Availability patterns
Fault Characteristics
• “Fail silent” vs. “fail operational”
– Fail silent → when a component fails it no longer operates
– Fail operational → a component continues to operate (although not
correctly) when a fault is present
• Transient vs. deterministic
– Some faults will always occur in a consistent way
– Others may come and go intermittently
• Some will look similar to other faults e.g.
– A hung process, a processor crash, and a network outage can all look
the same
A System Can Fail Silently …
Let’s look at an example interaction
[Figure: Client Machine, Network, Server, FileSystem. The user asks “Hmm … what’s the best vegetarian restaurant in Bogota?” and then says “What’s the matter with this $#@!#% computer …”]
Symptoms of Faults
• From an end user’s perspective many faults
exhibit themselves similarly
• These faults could all look the same to an end
user:
– A hung process
– A crashed processor
– A network outage
– An overloaded element
Or Fail Operational …
[Figure: Client Machine, Network, Server, FileSystem. The user asks “Hmm … what’s the best vegetarian restaurant in Bogota?” and then says “Carnes de Res is the best vegetarian restaurant???”]
Fault Manifestation
• These types of faults could occur in any of the
elements of the system
• Depending on where they occur different mitigation
strategies might be appropriate
• As a result you need to
– Analyze your system and determine what faults might
occur
– Identify the desired response if they do occur
• This is called a fault model
Fault Model
• A fault model describes the system faults that
could disrupt the critical functionality
• The fault model is going to depend on both the
critical functionality and the specific architecture of
the system
• Once the fault model is identified you’ll need to
describe the desired response if the fault occurs
Cost of Availability
• We’ve established that downtime can be
expensive
• It’s also the case that “uptime” can be expensive
– Implementing a mechanism to be resilient to faults
can be expensive
• We want to understand the cost and benefit for
proposed strategies and select the set that makes
sense from a business perspective
• This means the initial requirements might change
…
Example
• We want “appropriate” availability
• A study has been done for mobile carrier
customers
– This study has determined that customers will
tolerate 2 dropped calls per 100 calls made
– As soon as the system drops 3 calls per 100 they
will start to change providers
• What does this say about the “appropriate”
availability of the system?
Outline
• What is availability
• Faults
• Availability patterns
Elements of Availability
• Fault detection
– The system recognizes that a fault has occurred
• Masking faults
– The system is able to continue to operate despite the fault
• Recover from the fault
– The system is able to repair the faulty element of the
system
Fault Detection
• There are standard “tactics” that we can use for
fault detection
• They don’t detect the same types of faults,
however
• They also have different “costs”
– This cost can be in terms of effort or overhead of one
kind or another
• We need to understand something about the
kinds of faults we are trying to detect before we
can select the appropriate tactic
Detecting Silent Faults
• It’s much easier to detect elements that fail
silently
• Essentially we monitor the “liveness” of the
element where the fault could exist
• Example tactics are:
– Exceptions
– Heartbeat
– Ping/echo
Exceptions
• When an anomalous or exceptional event occurs
it can be detected by exception handlers
• When the exception is “caught” an alternate path
of execution is triggered
• The exception handling code can notify other
portions of the system of the issue
• Doesn’t impose significant overhead on the
system
Heart Beat
• A component emits a regular “heart beat”
• Another element will listen for this
• If this heart beat is not detected it is assumed
that the component is no longer operational
• Does add overhead to the system
• Only an indication of the “liveness” of the
component
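A small Python sketch of the listening side of the heartbeat tactic. Timestamps are passed in explicitly to keep the example deterministic; a real listener would more likely read a monotonic clock. The component names and the 3-second timeout are assumptions.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat per component and flags those that go quiet."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}

    def beat(self, component, now):
        """Record a heartbeat received from `component` at time `now`."""
        self.last_seen[component] = now

    def suspected_failed(self, now):
        """Return components whose last heartbeat is older than the timeout."""
        return sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout_seconds=3.0)
monitor.beat("db", now=0.0)
monitor.beat("web", now=0.0)
monitor.beat("web", now=4.0)   # "db" stays silent after t=0
```

At time 5.0 the monitor would report only "db". As the slide says, this shows liveness and nothing more: a component that keeps beating while returning wrong answers is never flagged.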
Ping/Echo
• Similar to heart beat except a “watchdog” sends a
ping and listens for a response
• If no response is heard it is assumed the component
is not operational
• Requires more coupling than heart beat
• Increases network traffic
• Again it’s only an indication of the liveness of the
component
Failing Operational
• If an element or system fails operational it’s
more difficult to detect
• You don’t just monitor if the system responds
but also need to determine if the results are
“correct”
• Example tactics include:
– Exceptions
– Voting
– Check sum
Voting
• You compare the response of multiple elements
performing the same operation
• If the results of one of the elements doesn’t
match the others you assume it’s faulty
• Can detect erroneous output
• Adds overhead (must wait for multiple responses
and compare)
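A minimal majority voter in Python, assuming each replica's response can be compared for equality. The replica names and the strict-majority rule are choices made for this sketch.

```python
from collections import Counter


def vote(responses):
    """Return (majority_value, dissenting_replicas) for replica responses.

    `responses` maps a replica name to the value it returned. Raises
    ValueError when no value is held by a strict majority.
    """
    counts = Counter(responses.values())
    value, count = counts.most_common(1)[0]
    if count * 2 <= len(responses):
        raise ValueError("no strict majority among replicas")
    dissenters = sorted(name for name, v in responses.items() if v != value)
    return value, dissenters


value, faulty = vote({"r1": 42, "r2": 42, "r3": 41})
```

Here the voter accepts 42 and marks "r3" as suspect. The overhead noted above is visible in the signature: the voter cannot answer until all responses it compares have arrived.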
Check Sum
• A mathematical calculation that’s applied to a
piece of data to determine if it’s been altered
• Does add some processing overhead to the
system
• Can detect data corruption
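As an illustration in Python, using SHA-256 from the standard library as the checksum function. The slide does not name an algorithm, so that choice and the sample message are assumptions.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected: str) -> bool:
    """True when the data still matches the checksum recorded earlier."""
    return checksum(data) == expected


original = b"transfer 100 to account 7"
recorded = checksum(original)          # stored or sent alongside the data
corrupted = b"transfer 900 to account 7"
```

Checking `is_intact(corrupted, recorded)` returns False, so the single changed character is detected. The checksum says that the data changed, not what the correct data was.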
Tolerating Faults
• In many cases you realize that faults will occur
– Particularly in large distributed systems
• You can’t tolerate outages every time one of the
nodes experiences a fault
• You therefore need to hide the fact that the
system has a faulty component
• This is called “fault masking”
• Again the strategies associated with masking the
fault are going to be dependent on the kind of
fault being masked
Strategies For Fault Masking
• Modular redundancy
• Rollback
– Restoring the system to a previously identified “safe state”
• Roll forward
– “skipping” an operation that is causing a problem
• Retrying an operation
• Shedding load
• …
Modular Redundancy
• Redundant systems have multiple replicated elements
(copies)
– Not to be confused with load balancing approaches
– The thing to realize is that the state is replicated across the copies
• There are multiple strategies for software replication
– Cold standby
– Warm standby
– Hot standby
Redundancy: Cold Standby
• There are non-operational copies available
• State is stored (e.g. in logs) but is not loaded on the
copies until they are needed
• When a failure occurs the state is reconstructed and
the replica is introduced
• Reduces operational overhead associated with
maintaining copies
• Increases MTTR
Cold Standby
Redundancy: Warm Standby
• In this configuration you have a primary replica that is actively
processing requests
• You have passive replicas that are not actively processing
requests although they are online
• State is periodically loaded into the backup replicas
• As with cold standbys the processing overhead is reduced
• The MTTR is dependent on the state checkpoints (typically
less than with cold standbys)
Warm Standby
Redundancy: Hot Standby
• All copies are processing requests
• All of the duplicate responses will be suppressed
• The copies need to be synchronized continuously
– Thus the processing overhead is increased as the number of replicas
increases
• The MTTR is reduced to virtually zero, however, in the event
that one of the replicas fails
Hot Standby
Considerations
• State management
– If there is state that is managed in the replicated elements you need to
worry about synchronizing state
• State can be pushed to other elements …
– This impacts other concerns such as performance or security, however
– Caching commonly accessed data is a typical strategy for dealing with
performance concerns
• Kinds of replicas
• Frequency of check pointing
State Management
Roll Back
• Roll back is when you undo a transaction
• You need to manage state appropriately
– You need to define an atomic set of actions
• This could be taking complete snap shot of
system state or just roll back of a transaction
Roll Forward
• Roll forward essentially skips a task and then
applies the changes involved in the
transactions
• The system will then be in the state consistent
with the desired change
Retrying an Operation
• This is as simple as it sounds
• When a given operation fails you retry it
• It can be used in conjunction with a detection
mechanism like exceptions
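Retry combined with exceptions might look like the following Python sketch. The helper name, the default of three attempts, and the deliberately flaky operation are all hypothetical.

```python
import time


def retry(operation, attempts=3, delay_seconds=0.0, retry_on=(Exception,)):
    """Call `operation` until it succeeds or `attempts` calls have failed.

    The last exception is re-raised if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)


calls = {"count": 0}


def flaky():
    """Hypothetical operation that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = retry(flaky, attempts=5)
```

Retrying only helps with transient faults and is only safe when the operation can be repeated without side effects. A deterministic fault fails the same way on every attempt.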
Shedding Load
• Sometimes issues occur due to an overload
situation
• This can lead to:
– Timing errors
– Buffer overflows
– Memory consumption issues
• Shedding less critical load can help alleviate the
problem
Strategies For Fault Recovery
• Reboot
– This could be a partial (e.g. restarting an
application or process) or total system reboot
• Removal of faulty component
• Restore component to a previously
identified safe state
• …
Reboot
• Rebooting the system can often correct the
issue
• This can also be done as a preventative
measure
• It can be a complete or partial reboot
• There is such a thing as a “micro reboot” that
takes milliseconds
Component Removal
• If you have a faulty component you can
remove it from service
• You might try other remedies such as
restarting first
Checkpointing State
• You can periodically take a snap shot of the
system
• If at some point you have an issue, you can
restore the system to the previously defined
state
• The more frequently you take a snap shot of the
state the smaller the loss but the more
overhead
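A minimal in-memory sketch of checkpoint and restore in Python. A real system would write snapshots to durable storage; the class name and the account example are made up.

```python
import copy


class Checkpointed:
    """Holds mutable state and can restore the most recent snapshot."""

    def __init__(self, state):
        self.state = state
        self._snapshot = copy.deepcopy(state)

    def checkpoint(self):
        """Record the current state as the safe state."""
        self._snapshot = copy.deepcopy(self.state)

    def restore(self):
        """Discard everything since the last checkpoint."""
        self.state = copy.deepcopy(self._snapshot)


store = Checkpointed({"balance": 100})
store.state["balance"] = 120
store.checkpoint()
store.state["balance"] = -5      # a faulty update after the checkpoint
store.restore()
```

After the restore the balance is 120 again. Any correct work done between the last checkpoint and the fault would be lost as well, which is the trade-off between snapshot frequency and overhead described above.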
Availability in the Cloud
• From a high level achieving availability in the
cloud is the same process as elsewhere
– It needs to be designed in
• That means you need to understand the faults
that could occur
• You then need to apply the appropriate
decisions to achieve the desired result
Fault Model
• We will give specific faults that occur later in the
course
– This requires first a better understanding of the
architecture of the cloud
• At this point it’s useful to understand that the cloud is
made up of faulty components
– Failures happen on a regular basis
• There are mechanisms built in to handle this, but
– They aren’t always successful
– They don’t deal with application specific concerns
– Some things that might be a fault for your application aren’t
considered a fault by the infrastructure
Summary
• Availability measures are not adequate for design
• You need to be able to translate availability goals into
a set of actionable requirements that identify the
possible faults and desired responses
• The approaches should support the desired
responses in the event that a fault occurs
Questions??

IBM MQ High Availabillity and Disaster Recovery (2017 version)IBM MQ High Availabillity and Disaster Recovery (2017 version)
IBM MQ High Availabillity and Disaster Recovery (2017 version)
 
Knowledge share about scalable application architecture
Knowledge share about scalable application architectureKnowledge share about scalable application architecture
Knowledge share about scalable application architecture
 
IBM WebSphere MQ: Managing Workloads, Scaling and Availability with MQ Clusters
IBM WebSphere MQ: Managing Workloads, Scaling and Availability with MQ ClustersIBM WebSphere MQ: Managing Workloads, Scaling and Availability with MQ Clusters
IBM WebSphere MQ: Managing Workloads, Scaling and Availability with MQ Clusters
 
Dynamodb tutorial
Dynamodb tutorialDynamodb tutorial
Dynamodb tutorial
 
What's New with Amazon DynamoDB - SRV311 - Chicago AWS Summit
What's New with Amazon DynamoDB - SRV311 - Chicago AWS SummitWhat's New with Amazon DynamoDB - SRV311 - Chicago AWS Summit
What's New with Amazon DynamoDB - SRV311 - Chicago AWS Summit
 

Architecting for the cloud scability-availability

  • 1. Architecting for the Cloud Len and Matt Bass Scalability
  • 2. Link to yesterday’s slides http://www.slideshare.net/lenbass/architecting-for-the-cloud-intro-virtualization-iaa-s
  • 3. Outline • Introduction to scalability • CPU scaling • I/O scaling
  • 4. Characteristic of cloud from NIST • On-demand self-service. A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed automatically without requiring human interaction with each service’s provider.
  • 5. Scale in the Cloud • Many people think that you get scalability just by virtue of being in the cloud • This isn’t true • What the cloud gives you is the ability to quickly and easily add resources – It doesn’t guarantee that this results in additional capacity • Just like with security you need to design scalability in
  • 6. What is Scalability? • (Problem definition) Scalability is the ability of a system to support a growing amount of work. – May be from additional users – May be from additional requests from current users – May be from operational activities. • (Solution definition) Scalability is the ability to increase or decrease the resources available to your application by either changing the number of servers or disks or changing the size of the servers or disks.
  • 7. Why scale? • Are more users always a good thing? – This is a cost/benefit question. – More users have benefits – presumably more people receive service and the organization receives more revenue. – More users have a cost – hardware, software, and personnel. • Do costs scale linearly with users? – For Netflix, the answer is yes. – For LinkedIn, the answer is no.
  • 8. The different aspects of scalability • Adding users – Large numbers of new users may require new computation facilities • Adding data – Large amounts of new data require • More computation • Careful attention to the distribution of this data. • Adding computation – Computation is embedded in virtual machines – Elasticity means adding new virtual machines • Scaling should not impact existing activities • May need to scale by adding computation capacity (CPU) or by adding I/O capacity
  • 9. Scaling Up vs Scaling Out • Scaling up means adding more capacity to existing hardware – More memory – More disk – Faster CPU or more cores • Scaling out means adding additional hardware – More systems
  • 10. Costs in scaling out • Each virtual machine has a cost – per hour • Licensing costs. – Many software packages charge licenses per CPU or per (virtual) computer. – Every new instance that utilizes one of these packages incurs licensing costs • Personnel costs – In small to medium size organizations, one sysadmin can administer ~30 machines. – In large, highly automated organizations, one sysadmin can administer thousands of machines. – A movement called “DevOps” has as one goal the reduction of personnel costs in operations (more on this later).
  • 11. How much lead time for growth of number of users? • Some things are predictable – Seasonal variation. • Christmas • Tax season – Daily variation • Working hours or non-working hours in various time zones • Holidays – Promotions or special offers – Sporting events • Other things are not predictable – Being “SlashDotted” – News items – Rapid growth in popularity of a company. – Disaster
  • 12. Managing growth in number of users • A lead time allows planning – Restructure database – Add or restructure software • When no lead time is available, elasticity of the cloud is the main mechanism.
  • 13. Outline • Introduction to scalability • CPU scaling – Load balancers – Rule Based Scaling – Scaling Patterns • I/O scaling
  • 14. Why have a load balancer? • Suppose there are too many users for a single instance of a service • The cloud allows us to create another instance of that service (elasticity) • We would like to have half the users use one instance and half use the other • Two options: 1. Couple instances and users (half and half). This is accomplished by having users access an instance of a service directly by IP address. 2. Use an intermediary (load balancer) to distribute half of the requests to one instance and the other half to the other. Option 2 is preferable for a variety of reasons which we will see.
  • 15. Load Balancing • Physically a load balancer is a box that looks like it belongs in a computer network.
  • 16. Load Balancer • Logically, a load balancer takes requests from clients and distributes them to copies of an application executing on multiple different servers. (Figure: clients, load balancer, servers)
  • 17. Message sequence – client makes a request (Figure: clients, load balancer, servers)
  • 18. Message sequence – request arrives at load balancer (Figure: clients, load balancer, servers)
  • 19. Message sequence – request is sent to one server (Figure: clients, load balancer, servers)
  • 20. Message sequence – reply goes directly back to client (Figure: clients, load balancer, servers)
  • 21. Suppose Load Balancer Becomes Overloaded – Load Balance the Load Balancers
  • 22. Hierarchy of Load Balancers • Server always sends message back to client. • Load balancers use a variety of algorithms to choose an instance for a message – Round robin. Rotate requests evenly. – Weighted round robin. Rotate requests according to some weighting. – Hashing. Hash the IP address of the source to determine the instance. This means that a request from a particular client is always sent to the same instance as long as that instance is still in service. • Note that these algorithms do not require knowledge of an instance’s load. We will cover that situation in a little bit.
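The three selection algorithms on slide 22 can be sketched in a few lines of Python. This is an illustrative sketch, not the algorithm of any particular load balancer: the function names, the instance names (`vm-a`, `vm-b`, `vm-c`), and the use of SHA-256 for hashing are all assumptions made for the example.

```python
import hashlib
from itertools import cycle


def round_robin(instances):
    """Return an iterator that hands out instances in strict rotation."""
    return cycle(instances)


def weighted_round_robin(weights):
    """Return an iterator that repeats each instance `weight` times per cycle."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


def ip_hash(client_ip, instances):
    """Map a client IP to one instance; stable while the instance list is unchanged."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return instances[int.from_bytes(digest[:8], "big") % len(instances)]


if __name__ == "__main__":
    pool = ["vm-a", "vm-b", "vm-c"]  # hypothetical instance names
    rr = round_robin(pool)
    print([next(rr) for _ in range(4)])
    wrr = weighted_round_robin({"vm-a": 2, "vm-b": 1})
    print([next(wrr) for _ in range(6)])
    print(ip_hash("203.0.113.7", pool))
```

None of the three functions looks at instance load, which matches the note on the slide. With this simple modulo hashing, removing an instance remaps many clients; production systems often use consistent hashing to limit that effect.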
  • 23. Outline • Introduction to scalability • CPU scaling – Load balancers – Rule based scaling – Scaling Patterns • I/O scaling
  • 25. Server • A server is a virtual machine without any software • A virtual machine can be allocated with varying amounts of memory, CPU, disk • Each variant has different cost, typically per hour
  • 26. Machine Image • A machine image is a copy of the contents of the memory of a computer. • A machine image may be created from any contents of a computer. Some options: – Bare metal – With OS – With LAMP Stack • Linux • Apache HTTP Server • MySQL • PHP or Python • If licensed software is contained in the machine image, then a license fee is paid when it is loaded
  • 27. Executable Virtual Machine • An executable virtual machine is created by loading a machine image into a server. • An executable virtual machine can then be – Booted – Paused – Shut down (Figure: machine image loaded into a server)
  • 28. Adding/Removing Resources • Example shows two servers with one to be removed. • Could be N servers with one to be added or removed • Creating a new instance takes some time • Removing an instance also takes time – it must satisfy existing requests and be detached from existing connections.
  • 29. Autoscaling group • An autoscaling group is a collection of instances that have been defined to be scaled together. • Typically these represent instances of the same application.
  • 30. Creating an autoscaling group • An autoscaling group needs to know – Machine instance id – VM type – Scaling policy
  • 31. Scaling Policy • Specify minimum, maximum, and desired number of instances • Can specify scaling based on time of day – E.g. scale up during 9:00-5:00 and down other times • Can scale based on average CPU usage – E.g. average CPU utilization <40% means delete instance – Average CPU utilization >60% means add instance. – Values come from monitor.
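A CPU-based scaling rule like the one on slide 31 can be sketched as follows. This is a simplified sketch, not a cloud provider's API: the `ScalingPolicy` class and `desired_count` function are hypothetical names, and the 40%/60% thresholds are taken from the slide's example. The rule adds or removes one instance per evaluation and never leaves the configured minimum/maximum range.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy record; the field names are assumptions."""
    min_instances: int
    max_instances: int
    scale_in_below: float = 40.0   # average CPU %, from the slide's example
    scale_out_above: float = 60.0  # average CPU %, from the slide's example


def desired_count(policy, current, avg_cpu):
    """Return the instance count the group should move to for this reading."""
    if avg_cpu > policy.scale_out_above:
        target = current + 1      # overloaded: add an instance
    elif avg_cpu < policy.scale_in_below:
        target = current - 1      # underused: delete an instance
    else:
        target = current          # inside the band: do nothing
    # Clamp to the configured minimum and maximum.
    return max(policy.min_instances, min(policy.max_instances, target))


if __name__ == "__main__":
    policy = ScalingPolicy(min_instances=2, max_instances=5)
    for cpu in (75.0, 50.0, 30.0):
        print(cpu, desired_count(policy, current=3, avg_cpu=cpu))
```

The gap between the two thresholds keeps the group from adding and deleting instances on every small fluctuation in the monitored average.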
  • 32. Outline • Introduction to scalability • CPU scaling – Load balancers – Rule Based Scaling – Scaling Patterns • I/O scaling
  • 33. Scaling Patterns • Autoscaling implements Push Pattern for messages • Another pattern is Pull Pattern
  • 35. Push Pattern Description • Client sends a request (e.g. HTTP message) to the app in the cloud. • Request arrives at a load balancer • Load balancer forwards request to one of the VMs in the resource pool. • Load balancer uses scheduling strategy to decide which VM gets the request, e.g. dispatch to VM with lowest CPU utilization.
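The "dispatch to the VM with the lowest CPU utilization" strategy from slide 35 might look like the sketch below. The function names, the VM names, and the fixed `cost_per_request` used to simulate the load each request adds are assumptions for illustration; a real load balancer would read utilization from a monitoring service instead.

```python
def pick_least_loaded(cpu_by_vm):
    """Return the name of the VM with the lowest reported CPU utilization."""
    if not cpu_by_vm:
        raise ValueError("resource pool is empty")
    return min(cpu_by_vm, key=cpu_by_vm.get)


def dispatch(requests, cpu_by_vm, cost_per_request=5.0):
    """Push each request to the least-loaded VM, simulating the added load."""
    load = dict(cpu_by_vm)  # work on a copy so the caller's readings are untouched
    assignment = {}
    for request in requests:
        vm = pick_least_loaded(load)
        assignment[request] = vm
        load[vm] += cost_per_request  # assumed cost; real load comes from the monitor
    return assignment


if __name__ == "__main__":
    readings = {"vm-a": 50.0, "vm-b": 42.0}  # hypothetical CPU % per VM
    print(dispatch(["r1", "r2", "r3"], readings))
```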
  • 36. How does the load balancer know? • The load balancer knows CPU utilization of the VMs and it knows how many requests it (the load balancer) has received, and possibly how long it took to service the requests. It does not know application specifics such as how many requests a VM can process. • When resource pool is overloaded, new resources are allocated. • The monitor decides (based on controller rules) when new resources are needed. It must have direct insight into the VM instances in order to do this. Hence, the monitor utilizes a monitoring service provided by the cloud for each instance.
  • 37. Pull architecture pattern (aka Producer-Consumer)
  • 38. Pull architecture description • Each request from the client is application specific and typed. • The queue keeps separate queues for each application running on the VMs. • A VM requests the next message of a particular type (pull) and processes it. • The monitor can now see how long a request waits in a queue or the average queue length and this is an indication of the load on the VMs that have applications that service requests of that type.
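The pull pattern on slide 38 can be sketched with Python's standard `queue` and `threading` modules. The `TypedQueues` class, the request types (`resize`, `email`), and the `None` shutdown sentinel are assumptions made for this example; they stand in for whatever queueing service a real deployment would use.

```python
import queue
import threading


class TypedQueues:
    """One FIFO queue per request type, as in the pull architecture."""

    def __init__(self, request_types):
        self._queues = {t: queue.Queue() for t in request_types}

    def put(self, request_type, payload):
        self._queues[request_type].put(payload)

    def pull(self, request_type):
        """Block until a message of this type is available, then return it."""
        return self._queues[request_type].get()

    def lengths(self):
        """What a monitor would read: current queue length per request type."""
        return {t: q.qsize() for t, q in self._queues.items()}


def worker(queues, request_type, results):
    """A VM-side loop: pull the next message of one type and process it."""
    while True:
        payload = queues.pull(request_type)
        if payload is None:  # sentinel: no more work for this worker
            break
        results.append((request_type, payload))


if __name__ == "__main__":
    queues = TypedQueues(["resize", "email"])
    for name in ("a.png", "b.png", "c.png"):
        queues.put("resize", name)
    queues.put("email", "welcome")
    print(queues.lengths())  # queue length is the load signal for the monitor

    results = []
    threads = [threading.Thread(target=worker, args=(queues, t, results))
               for t in ("resize", "email")]
    for t in threads:
        t.start()
    for request_type in ("resize", "email"):
        queues.put(request_type, None)
    for t in threads:
        t.join()
    print(sorted(results))
```

Here the worker blocks on `get()` rather than polling; a service that polls an empty queue instead pays the overhead mentioned on the following slide.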
  • 39. Differences • Push is more responsive to requests. They are immediately forwarded to a service. There is a possibility that the service is overloaded. • Pull is less responsive since it relies on servers to de-queue messages. • In the pull architecture, a service polls for new messages even if there is nothing in its queue and this introduces overhead. • It is easier to monitor and control workload in the pull architecture since messages are application specific and typed.
  • 40. Outline • Introduction to scalability • CPU scaling • I/O scaling – Multiple sites – Software techniques
  • 41. I/O Scaling • Scaling out assumes scaling requirement is solved with more CPUs. • It may be that I/O is also a problem. – You may run your application in multiple sites – Half the clients go to one site, half to another
  • 42. Questions when you have multiple sites • How do clients know which site to use? • How are databases used by the applications coordinated across sites? (We defer this question.)
  • 43. Domain Name Server (DNS) • Client sends URL to DNS • DNS takes as input a URL and returns an IP address • Client uses IP address to send message to load balancer for a site • [Diagram: Domain Name Server maps Website.com to 123.45.67.89; Site 1 and Site 2 shown]
  • 44. DNS with multiple sites • DNS server returns IP address of both sites. • DNS server will vary which address is listed first. • Client will, typically, choose first entry. • [Diagram: Domain Name Server returns 123.45.67.89 and 456.77.88.99 for Website.com; Site 1 and Site 2 shown]
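The rotation described on this slide can be illustrated with a toy resolver. This is a sketch only: `RotatingResolver` is a hypothetical class, not a real DNS library, and the addresses are documentation-range placeholders rather than the ones on the slide.

```python
from itertools import count


class RotatingResolver:
    """Toy resolver that returns every site address, rotating which comes first."""

    def __init__(self, records):
        self._records = records
        self._calls = count()  # number of lookups served so far

    def resolve(self, name):
        addrs = self._records[name]
        shift = next(self._calls) % len(addrs)
        return addrs[shift:] + addrs[:shift]


# Hypothetical records: one load-balancer address per site.
resolver = RotatingResolver({"website.com": ["192.0.2.10", "198.51.100.20"]})
first = resolver.resolve("website.com")
second = resolver.resolve("website.com")
print(first[0], second[0])
```

Because clients typically take the first entry, successive lookups end up spread across the two sites.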
  • 45. Outline • Introduction to scalability • CPU scaling • I/O scaling – Multiple sites – Software techniques
  • 47. To Scale for I/O - Make the queue manager more sophisticated • Publisher – takes values from key-value store and distributes them • [Diagram: Key Value Store, Publisher, Clients]
  • 48. Summary • Scalability is the ability to respond to increasing or decreasing workload – Add CPU capacity through utilizing features of cloud provider – Add I/O capacity through • Distributing requests to multiple sites • Using fast message-passing software
  • 50. Architecting for the Cloud Introduction to Availability
  • 51. Outline • What is availability • Faults • Availability patterns
  • 53. Cost of Downtime • According to a recent survey the average cost of unplanned downtime is $7,900/minute* • 91% of reporting companies have experienced an unplanned outage in the last 24 months • The average outage lasts 118 minutes • The average frequency of outages over a 24 month period was: – 10.16 limited outages – 5.88 local outages – 2.04 total outages * Emerson Network Power, Ponemon Institute Study 2013
  • 54. Cost of Downtime II • As the previous numbers indicate, downtime can be expensive • Outages experienced in August 2013: – New York Times had a 2 hour outage (stock price declined, Twitter exploded, and Wall Street Journal dropped their fees to try and capture readership) – Google had between 1 and 5 minutes of downtime (~$500,000 direct loss and 40% reduction in overall web traffic) – Amazon had an outage of under an hour (> $5 million) • In addition to direct losses, indirect losses are experienced – Loss of confidence, reputation, and good will – Productivity losses – Compliance penalties – …
  • 55. Availability: a Business Concern • The availability of the business service impacts the earnings and associated value of an organization • If the organization relies on an IT system to deliver business service then the availability of the IT system impacts the value of the organization • In this section we are going to look at the availability of the system – We want to keep in mind, however, that the objective is the availability of the business service
  • 56. What Is Availability? • Availability in general refers to the degree to which a system is in an operable state • This is typically articulated as the percentage of time the system is available (or we’d like to have the system available) e.g. 99.99% • There are many related terms e.g. – Availability – Fault-Tolerance – Reliability
  • 57. How is Availability Measured? • Availability is typically measured as: $\text{Availability} = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTR}}$ • MTBF = Mean Time Between Failures • MTTR = Mean Time To Repair
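A small worked example of the measure, using made-up figures: a component that runs 999 hours between failures on average and takes 1 hour to repair.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability = MTBF / (MTBF + MTTR), as a fraction between 0 and 1."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures: 999 hours between failures, 1 hour to repair.
a = availability(mtbf_hours=999.0, mttr_hours=1.0)
print(f"{a:.3%}")
```

With these inputs the result is 999 / 1000, i.e. three nines.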
  • 58. 9s • Availability: Downtime per Year • 90% (1-nine): 36.5 days/year • 99% (2-nines): 3.65 days/year • 99.9% (3-nines): 8.76 hours/year • 99.99% (4-nines): 52 minutes/year • 99.999% (5-nines): 5 minutes/year • 99.9999% (6-nines): 31 seconds/year
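The downtime figures follow directly from the availability percentage. A short sketch, assuming a 365-day year:

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def downtime_hours_per_year(availability_pct: float) -> float:
    """Hours of downtime per year implied by an availability percentage."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR


for pct in (90, 99, 99.9, 99.99, 99.999, 99.9999):
    print(f"{pct}% -> {downtime_hours_per_year(pct):.4f} hours/year")
```

Each additional nine cuts the allowed downtime by a factor of ten.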
  • 59. Calculating System Availability I • Each component = 99% (3.65 days of downtime a year) • The overall system, however, has an availability that is the product of each component’s availability – 99% × 99% ≈ 98% (7.26 days a year) • [Diagram: two components, each 99% available]
  • 60. Calculating System Availability • Each component = 99% (3.65 days of downtime a year) • The overall system in this case, however, is based on the likelihood that both components would fail at the same time – 1 – ((100% – 99%) × (100% – 99%)) = 99.99% (about 52.6 minutes a year!!) • [Diagram: Redundant Elements, each 99% available]
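Both calculations, the product for components that must all work and the redundant case, can be checked with a short sketch. It assumes component failures are independent, which the slides imply but do not state; the function names `series` and `parallel` are illustrative.

```python
from math import prod

HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def series(avails):
    """All components must be up: multiply the availabilities."""
    return prod(avails)


def parallel(avails):
    """Redundant components: the system is down only if every copy is down."""
    return 1 - prod(1 - a for a in avails)


both_needed = series([0.99, 0.99])
redundant = parallel([0.99, 0.99])
print(f"series:   {both_needed:.4f}, {(1 - both_needed) * 365:.2f} days down per year")
print(f"parallel: {redundant:.4f}, {(1 - redundant) * HOURS_PER_YEAR * 60:.1f} minutes down per year")
```

Two 99% components give roughly 98% when both are required and 99.99% when either one suffices.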
  • 61. Availability Measures • A couple of things to keep in mind – These measures refer to the mean not the minimum time between failures – As the MTBF increases the impact of MTTR decreases – As the MTTR approaches 0 the overall availability approaches 1 • Historically these measures were developed for hardware components
  • 62. Availability Requirements • MTBF can be measured for operational systems • How do you predict the MTBF for a system that is yet to be built, however? • Does it make sense to use the previously defined availability measure as a requirement? • If not, how should requirements be articulated?
  • 63. Actionable Requirements • Remember that as a business the concern is that the services are available as needed • In order to determine the likely availability of a system (or design) you must – Understand the likelihood that various kinds of faults could occur – Understand the impact of these faults on overall system availability • You must therefore translate the desired business objective into a set of fault scenarios
  • 64. End to End Availability • Engineers often think about availability of some portion of the system e.g. – Availability of the database or web server • Organizations, however, are concerned with end to end availability • When thinking about availability requirements you should think about the organizational perspective – Once you’ve done this you’ll then need to map this to the engineering perspective
  • 65. Requirements Vary • We start with the desired requirements from a business perspective • We then look at the system context to determine what faults might disrupt the desired behavior – This is likely an iterative process • One thing to keep in mind is that different business contexts imply different requirements • Consider the needs of Discrete Manufacturing vs. Continuous Manufacturing • Discrete manufacturing is when you manufacture discrete products – e.g. an automobile assembly line • Continuous process automation is when you manufacture things like chemicals or concrete • How might the systems respond differently in the event of a fault?
  • 66. Example Scenario If a processor in one of the servers fails during peak load, the system shall continue to operate without dropping any of the current tasks and without any noticeable delay
  • 67. Relationship to Goals • How does this scenario relate to availability goals? – It does not in and of itself guarantee a particular level of availability • This scenario, in conjunction with scenarios for other faults that could impact a service, does improve availability, however • In order to understand how to think about the design we need to: – Identify the activities that require availability – Identify the related faults – Identify the desired response if the fault occurs
  • 68. Outline • What is availability • Faults • Availability patterns
  • 69. Fault Characteristics • “Fail silent” vs. “fail operational” – Fail silent: when a component fails it no longer operates – Fail operational: a component continues to operate (although not correctly) when a fault is present • Transient vs. deterministic – Some faults will always occur in a consistent way – Others may come and go intermittently • Some will look similar to other faults e.g. – A hung process, a processor crash, and a network outage can all look the same
  • 70. A System Can Fail Silently … • Let’s look at an example interaction • [Diagram: Client Machine, Network, Server, FileSystem. The user asks “Hmm … what’s the best vegetarian restaurant in Bogota?”, gets no answer, and reacts “What’s the matter with this $#@!#% computer …”]
  • 71. Symptoms of Faults • From an end user’s perspective many faults exhibit themselves similarly • These faults could all look the same to an end user: – A hung process – A crashed processor – A network outage – An overloaded element
  • 72. Or Fail Operational … • [Diagram: Client Machine, Network, Server, FileSystem. The user asks “Hmm … what’s the best vegetarian restaurant in Bogota?” and receives the answer “Carnes de Res is the best vegetarian restaurant???”]
  • 73. Fault Manifestation • These types of faults could occur in any of the elements of the system • Depending on where they occur different mitigation strategies might be appropriate • As a result you need to – Analyze your system and determine what faults might occur – Identify the desired response if they do occur • This is called a fault model
  • 74. Fault Model • A fault model describes the system faults that could disrupt the critical functionality • The fault model is going to depend on both the critical functionality and the specific architecture of the system • Once the fault model is identified you’ll need to describe the desired response if the fault occurs
  • 75. Cost of Availability • We’ve established that downtime can be expensive • It’s also the case that “uptime” can be expensive – Implementing a mechanism to be resilient to faults can be expensive • We want to understand the cost and benefit for proposed strategies and select the set that make sense from a business perspective • This means the initial requirements might change …
  • 76. Example • We want “appropriate” availability • A study has been done for mobile carrier customers – This study has determined that customers will tolerate 2 dropped calls per 100 calls made – As soon as the system drops 3 calls per 100 they will start to change providers • What does this say about the “appropriate” availability of the system?
  • 77. Outline • What is availability • Faults • Availability patterns
  • 78. Elements of Availability • Fault detection – The system recognizes that a fault has occurred • Masking faults – The system is able to continue to operate despite the fault • Recover from the fault – The system is able to repair the faulty element of the system
  • 79. Fault Detection • There are standard “tactics” that we can use for fault detection • They don’t detect the same types of faults, however • They also have different “costs” – This cost can be in terms of effort or overhead of one kind or another • We need to understand something about the kinds of faults we are trying to detect before we can select the appropriate tactic
  • 80. Detecting Silent Faults • It’s much easier to detect elements that fail silently • Essentially we monitor the “liveness” of the element where the fault could exist • Example tactics are: – Exceptions – Heartbeat – Ping/echo
  • 81. Exceptions • When an anomalous or exceptional event occurs it can be detected by exception handlers • When the exception is “caught” an alternate path of execution is triggered • The exception handling code can notify other portions of the system of the issue • Doesn’t impose significant overhead on the system
  • 82. Heart Beat • A component emits a regular “heart beat” • Another element will listen for this • If this heart beat is not detected it is assumed that the component is no longer operational • Does add overhead to the system • Only an indication of the “liveness” of the component
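A minimal sketch of the listening side of the heart beat tactic. `HeartbeatMonitor` is a hypothetical class; timestamps are passed in explicitly (in seconds) so the example is deterministic, whereas a real listener would read a clock.

```python
class HeartbeatMonitor:
    """Records the last heart beat per component and flags the silent ones."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self._last_seen = {}

    def beat(self, component: str, now: float) -> None:
        """Called whenever a heart beat from `component` is received."""
        self._last_seen[component] = now

    def suspected_down(self, now: float) -> list:
        """Components whose last heart beat is older than the timeout."""
        return sorted(
            name for name, seen in self._last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = HeartbeatMonitor(timeout_s=5.0)
monitor.beat("db", now=0.0)
monitor.beat("web", now=8.0)
down = monitor.suspected_down(now=10.0)
print(down)
```

As the slide notes, this only indicates liveness: a component that keeps beating while returning wrong results would not be flagged.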
  • 83. Ping/Echo • Similar to heart beat except a “watchdog” sends a ping and listens for a response • If no response is heard it is assumed the component is not operational • Requires more coupling than heart beat • Increases network traffic • Again it’s only an indication of the liveness of the component
  • 84. Failing Operational • If an element or system fails operational it’s more difficult to detect • You don’t just monitor if the system responds but also need to determine if the results are “correct” • Example tactics include: – Exceptions – Voting – Check sum
  • 85. Voting • You compare the response of multiple elements performing the same operation • If the results of one of the elements doesn’t match the others you assume it’s faulty • Can detect erroneous output • Adds overhead (must wait for multiple responses and compare)
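Voting can be sketched as a majority count over the replicas' responses. The `vote` function and replica names are hypothetical, and the sketch assumes results are directly comparable (hashable and equal when correct).

```python
from collections import Counter


def vote(responses: dict):
    """Return (majority_result, suspect_replicas) for replica -> result pairs."""
    if not responses:
        raise ValueError("no responses to compare")
    tally = Counter(responses.values())
    value, votes = tally.most_common(1)[0]
    if votes <= len(responses) / 2:
        raise ValueError("no majority among replicas")
    # Replicas that disagree with the majority are assumed to be faulty.
    suspects = sorted(r for r, v in responses.items() if v != value)
    return value, suspects


result, suspects = vote({"replica-1": 42, "replica-2": 42, "replica-3": 41})
print(result, suspects)
```

The overhead mentioned on the slide is visible here: the caller must wait for every replica's response before the comparison can run.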
  • 86. Check Sum • A mathematical calculation that’s applied to a piece of data to determine if it’s been altered • Does add some processing overhead to the system • Can detect data corruption
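As one concrete instance, CRC-32 from Python's standard `zlib` module can serve as the check sum. The choice of CRC-32 and the helper names are assumptions for illustration; the slides do not name a particular algorithm.

```python
import zlib


def with_checksum(data: bytes):
    """Pair the data with its CRC-32 check sum before storing or sending it."""
    return data, zlib.crc32(data)


def is_intact(data: bytes, checksum: int) -> bool:
    """Recompute the check sum and compare it with the one that was recorded."""
    return zlib.crc32(data) == checksum


payload, crc = with_checksum(b"hello")
print(is_intact(payload, crc))    # data unchanged
print(is_intact(b"hellp", crc))   # last byte corrupted
```

A CRC catches accidental corruption; it is not a defense against deliberate tampering.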
  • 87. Tolerating Faults • In many cases you realize that faults will occur – Particularly in large distributed systems • You can’t tolerate outages every time one of the nodes experiences a fault • You therefore need to hide the fact that the system has a faulty component • This is called “fault masking” • Again the strategies associated with masking the fault are going to be dependent on the kind of fault being masked
  • 88. Strategies For Fault Masking • Modular redundancy • Rollback – Restoring the system to a previously identified “safe state” • Roll forward – “skipping” an operation that is causing a problem • Retrying an operation • Shedding load • …
  • 89. Modular Redundancy • Redundant systems have multiple replicated elements (copies) – Not to be confused with load balancing approaches – The thing to realize is that the state is replicated across the copies • There are multiple strategies for software replication – Cold standby – Warm standby – Hot standby
  • 90. Redundancy: Cold Standby • There are non-operational copies available • State is stored (e.g. in logs) but is not loaded on the copies until they are needed • When a failure occurs the state is reconstructed and the replica is introduced • Reduces operational overhead associated with maintaining copies • Increases MTTR
  • 92. Redundancy: Warm Standby • In this configuration you have a primary replica that is actively processing requests • You have passive replicas that are not actively processing requests although they are online • State is periodically loaded into the backup replicas • As with cold standbys the processing overhead is reduced • The MTTR is dependent on the state checkpoints (typically less than with cold standbys)
  • 94. Redundancy: Hot Standby • All copies are processing requests • All of the duplicate responses will be suppressed • The copies need to be synchronized continuously – Thus the processing overhead is increased as the number of replicas increases • The MTTR is reduced to virtually zero, however, in the event that one of the replicas fails
  • 96. Considerations • State management – If there is state that is managed in the replicated elements you need to worry about synchronizing state • State can be pushed to other elements … – This impacts other concerns such as performance or security, however – Caching commonly accessed data is a typical strategy for dealing with performance concerns • Kinds of replicas • Frequency of check pointing
  • 98. Roll Back • Roll back is when you undo a transaction • You need to manage state appropriately – You need to define an atomic set of actions • This could mean restoring a complete snapshot of system state or just rolling back a single transaction
  • 99. Roll Forward • Roll forward essentially skips a task and then applies the changes involved in the transactions • The system will then be in the state consistent with the desired change
  • 100. Retrying an Operation • This is as simple as it sounds • When a given operation fails you retry it • It can be used in conjunction with a detection mechanism like exceptions
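A minimal sketch of retrying in conjunction with exceptions as the detection mechanism. The `retry` helper and the `flaky` operation are hypothetical; the delay defaults to zero so the example runs instantly.

```python
import time


def retry(operation, attempts: int = 3, delay_s: float = 0.0):
    """Call `operation` until it succeeds or `attempts` calls have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except Exception as exc:  # the exception is the fault detection signal
            last_error = exc
            time.sleep(delay_s)
    raise last_error


calls = {"n": 0}


def flaky():
    """Hypothetical operation that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient fault")
    return "ok"


outcome = retry(flaky, attempts=3)
print(outcome, calls["n"])
```

Retrying only masks transient faults; a deterministic fault will fail on every attempt and the last exception is re-raised.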
  • 101. Shedding Load • Sometimes issues occur due to an overload situation • This can lead to: – Timing errors – Buffer overflows – Memory consumption issues • Shedding less critical load can help alleviate the problem
  • 102. Strategies For Fault Recovery • Reboot – This could be a partial (e.g. restarting an application or process) or total system reboot • Removal of faulty component • Restore component to a previously identified safe state • …
  • 103. Reboot • Rebooting the system can often correct the issue • This can also be done as a preventative measure • It can be a complete or partial reboot • There is such a thing as a “micro reboot” that takes milliseconds
  • 104. Component Removal • If you have a faulty component you can remove it from service • You might try other remedies such as restarting first
  • 105. Checkpointing State • You can periodically take a snap shot of the system • If at some point you have an issue, you can restore the system to the previously defined state • The more frequently you take a snap shot of the state the smaller the loss but the more overhead
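Checkpoint and restore can be sketched with an in-memory deep copy. The `Checkpointed` class is hypothetical, and a real system would write the snapshot to durable storage rather than keep it in the same process.

```python
import copy


class Checkpointed:
    """Holds mutable state plus the most recent snapshot of it."""

    def __init__(self, state):
        self.state = state
        self._snapshot = copy.deepcopy(state)

    def checkpoint(self) -> None:
        """Take a snapshot of the current state."""
        self._snapshot = copy.deepcopy(self.state)

    def restore(self) -> None:
        """Return to the last snapshot; later changes are lost."""
        self.state = copy.deepcopy(self._snapshot)


system = Checkpointed({"orders": ["o-1"]})
system.state["orders"].append("o-2")
system.checkpoint()
system.state["orders"].append("o-3")  # change made after the snapshot
system.restore()
print(system.state)
```

The order `o-3` is gone after the restore, which is the loss the slide refers to: more frequent snapshots shrink that window at the cost of more copying.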
  • 106. Availability in the Cloud • From a high level achieving availability in the cloud is the same process as elsewhere – It needs to be designed in • That means you need to understand the faults that could occur • You then need to apply the appropriate decisions to achieve the desired result
  • 107. Fault Model • We will give specific faults that occur later in the course – This requires first a better understanding of the architecture of the cloud • At this point it’s useful to understand that the cloud is made up of faulty components – Failures happen on a regular basis • There are mechanisms built in to handle this, but – They aren’t always successful – They don’t deal with application specific concerns – Some things that might be a fault for your application aren’t considered a fault by the infrastructure
  • 108. Summary • Availability measures are not adequate for design • You need to be able to translate availability goals into a set of actionable requirements that identify the possible faults and desired responses • The approaches should support the desired responses in the event that a fault occurs