This document discusses the importance of reliability, availability, scalability, and performance for applications and systems. It outlines techniques for improving availability, such as high availability and redundancy. It also covers performance factors like latency and scaling architectures. The document proposes using a data grid to cache data in memory across multiple servers for improved scalability, availability, and performance over traditional database-centric architectures. Benefits of a data grid include linear scalability, high availability, location independence, and the ability to store large amounts of data in-memory.
If you don’t have these non-functional requirements then no amount of whizzy functionality will win over customers. Take each one in turn.
97% availability means about 11 days of downtime a year.
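The 11-day figure is simple arithmetic, and a small sketch makes it easy to repeat for other targets. The function name and the extra percentages below are illustrative choices, not from the slides.

```python
def downtime_days_per_year(availability_percent: float, days_in_year: float = 365.0) -> float:
    """Return the expected downtime, in days per year, for an availability percentage."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * days_in_year


if __name__ == "__main__":
    # 97% is the figure from the slide; the other targets are added for comparison.
    for percent in (97.0, 99.0, 99.9, 99.99):
        days = downtime_days_per_year(percent)
        print(f"{percent}% available: {days:.2f} days ({days * 24:.1f} hours) down per year")
```

For 97% this gives 0.03 x 365 = 10.95 days, which rounds to the 11 days quoted.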
Two key techniques: Redundancy is doubling up the components, therefore reducing the failure rate. Decoupling is removing the lifecycle dependency for service availability through messaging or caching etc.
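To see why doubling up helps, here is a rough sketch of the usual parallel-availability calculation. It assumes component failures are independent and that any one surviving copy can carry the service; both are simplifying assumptions, and the function name is made up for illustration.

```python
def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant components, each available `single` of the time.

    Assumes independent failures: the service is down only when every copy is down.
    """
    if copies < 1:
        raise ValueError("need at least one component")
    all_down = (1.0 - single) ** copies
    return 1.0 - all_down


if __name__ == "__main__":
    for copies in (1, 2, 3):
        print(copies, "copies:", f"{combined_availability(0.97, copies):.6f}")
```

Under those assumptions, two 97% components give 1 - 0.03 x 0.03 = 99.91%. Correlated failures (shared power, shared network, the same bug on both nodes) make the real figure lower.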
Algorithmic performance can be helped with profilers etc.; in my experience it is very rarely the performance issue in a production system. Resource limitations require tuning and the ability to scale yourself out of the problem (see scalability). If you can’t scale you will hit a wall. Resource contention requires careful tuning, as you can’t scale your way out of this problem; you must architect and design your system to
Latency factors in distributed applications: Pinging a server in London could be 60ms whereas the US could be 150ms. Mobile networks are more unreliable and data rates to the same client can fluctuate rapidly. Data size: shifting large amounts of data takes longer than small amounts. Making many fine-grained calls means you hit the network latency many times, whereas one or two large calls reduce round trips but may increase data latency. Contention on resources like locks.
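The trade-off between chatty and chunky calls can be sketched with a very simple cost model: each call pays one round trip, and the payload pays for bandwidth. The 60ms and 150ms round trips come from the notes above; the payload size and bandwidth are made-up numbers, and the model ignores contention, retries and protocol overhead.

```python
def total_time_ms(calls: int, round_trip_ms: float, payload_bytes: int,
                  bandwidth_bytes_per_ms: float) -> float:
    """Rough transfer time: one round trip per call plus time to move the payload."""
    return calls * round_trip_ms + payload_bytes / bandwidth_bytes_per_ms


if __name__ == "__main__":
    payload = 1_000_000   # hypothetical: 1 MB in total
    bandwidth = 1_000.0   # hypothetical: 1,000 bytes per millisecond
    for place, rtt in (("London", 60.0), ("US", 150.0)):
        chatty = total_time_ms(100, rtt, payload, bandwidth)  # 100 fine-grained calls
        chunky = total_time_ms(2, rtt, payload, bandwidth)    # 2 large calls
        print(f"{place}: chatty {chatty:.0f} ms, chunky {chunky:.0f} ms")
```

The same data moves in both cases; only the number of round trips changes, which is why the gap widens as the server gets further away.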
Obviously we all want increased volume so that we know our service is successful. Mobile has massively increased the load on some systems, because always-on users access systems at times they never would have before (e.g. the National Rail website). Periodic load variation throughout the day. Cloud computing, whereby servers can be started on demand, enables elastic scaling; remember your capacity planning and HA requirements.
Scale Out means adding more and more servers and nodes into your cluster using a load balancer. Scale Up means adding more CPUs and memory to a single server.
Typically Scalability is non-linear due to other limits in the system – Database, Network, Concurrency Hot Spots
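One common way to illustrate this kind of non-linear scaling is Amdahl's law. It is not mentioned in the slides and is brought in here only as an illustration. It treats the shared bottleneck (for example the database) as a fraction of work that cannot be spread across nodes; the 10% figure below is a made-up assumption.

```python
def amdahl_speedup(serial_fraction: float, nodes: int) -> float:
    """Amdahl's law: speedup on `nodes` servers when `serial_fraction` cannot be parallelised."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / nodes)


if __name__ == "__main__":
    # Hypothetical system where 10% of each request is serialised on a shared resource.
    for nodes in (1, 2, 4, 8, 16, 32):
        print(f"{nodes:>2} nodes: {amdahl_speedup(0.10, nodes):.2f}x")
```

With a 10% serial fraction the speedup can never exceed 10x, however many servers are added.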
Goal is for the load balancer to redirect to “any” node as they are all homogeneous. Sometimes sticky sessions are used to redirect to the state holder for non-critical state.
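A minimal sketch of the two routing modes, assuming a fixed list of nodes: round robin when any node will do, and a hash of the session id when the request should stick to the node holding its state. The class and node names are hypothetical; real load balancers usually implement stickiness with cookies or source-IP affinity rather than a bare hash.

```python
import itertools
import zlib


class LoadBalancer:
    """Toy load balancer: round robin by default, sticky when a session id is given."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._round_robin = itertools.cycle(self.nodes)

    def route(self, session_id=None):
        if session_id is None:
            # Homogeneous, stateless nodes: any node can serve the request.
            return next(self._round_robin)
        # Sticky session: the same session id always maps to the same node.
        index = zlib.crc32(session_id.encode("utf-8")) % len(self.nodes)
        return self.nodes[index]


if __name__ == "__main__":
    balancer = LoadBalancer(["node-a", "node-b", "node-c"])
    print([balancer.route() for _ in range(4)])
    print(balancer.route("session-42"), balancer.route("session-42"))
```

If the sticky node dies its session state goes with it, which is why the notes limit this approach to non-critical state.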
Pseudo-stateless architectures give the typical non-linear scalability issues, as the database must be scaled.
INCREASES AVAILABILITY VIA DECOUPLING
REDUCES LATENCY BY MOVING DATA CLOSER TO PROCESSING
CLUSTERING REQUIRED FOR HA AND SCALABILITY!! Data is not in Node B, so it needs to do read-through caching. Cluster cache needs to be preloaded. Subsequent requests will use cached data.
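Read-through caching and preloading can be sketched in a few lines. This is a single-process illustration, not a clustered cache: the `loader` callable stands in for the database lookup and all names are hypothetical.

```python
class ReadThroughCache:
    """On a miss, load the value from the backing store and keep it for later requests."""

    def __init__(self, loader):
        self._loader = loader   # stand-in for the database read
        self._entries = {}
        self.loads = 0          # number of trips to the backing store

    def get(self, key):
        if key not in self._entries:
            # Cache miss: read through to the backing store.
            self._entries[key] = self._loader(key)
            self.loads += 1
        return self._entries[key]

    def preload(self, keys):
        """Warm the cache so that first requests do not pay the load cost."""
        for key in keys:
            self.get(key)


if __name__ == "__main__":
    database = {"customer:1": "Alice", "customer:2": "Bob"}
    cache = ReadThroughCache(database.__getitem__)
    cache.preload(["customer:1"])
    print(cache.get("customer:1"), cache.get("customer:2"), cache.get("customer:2"))
    print("database reads:", cache.loads)
```

Only the first request for each key reaches the database; subsequent requests are served from memory.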
GIVES SCALABILITY
NOT HA. ADD DUPLICATES FOR HA
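Assuming the slides here describe a partitioned cache, the sketch below shows why partitioning alone is not HA and how a duplicate fixes it: each key has a primary owner and a backup copy on a different node. The hashing scheme and class names are illustrative assumptions, and the re-replication a real grid performs after a failure is not modelled.

```python
import zlib


class PartitionedStore:
    """Toy partitioned store that keeps one backup copy of every entry."""

    def __init__(self, node_names):
        self._names = list(node_names)
        if len(self._names) < 2:
            raise ValueError("need at least two nodes to hold a backup")
        self.nodes = {name: {} for name in self._names}

    def owners(self, key):
        """Return (primary, backup) node names; they are always different nodes."""
        primary = zlib.crc32(key.encode("utf-8")) % len(self._names)
        backup = (primary + 1) % len(self._names)
        return self._names[primary], self._names[backup]

    def put(self, key, value):
        for name in self.owners(key):
            if name in self.nodes:
                self.nodes[name][key] = value

    def get(self, key):
        for name in self.owners(key):
            if name in self.nodes and key in self.nodes[name]:
                return self.nodes[name][key]
        raise KeyError(key)

    def fail(self, name):
        """Simulate losing a node and everything held in its memory."""
        self.nodes.pop(name, None)


if __name__ == "__main__":
    store = PartitionedStore(["node-a", "node-b", "node-c"])
    store.put("table:7", {"seats": 6})
    primary, _ = store.owners("table:7")
    store.fail(primary)
    print(store.get("table:7"))  # still readable from the backup copy
```

Without the backup write, losing the primary node would lose the entry.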
Not a lot of hardware!!!! Some customers I know (not necessarily Coherence) are looking at 200 hardware nodes with 32 GB RAM per node for some big data clusters.
Often “Off Heap Storage”. Not a database!!!!
Consider what needs to be truly persistent: financial records, customer details and accounts. What doesn’t need to be: who’s sitting at what table, what cards they have been dealt, how much they raised the bet, what positions they took up.
Without computation you just have a big cache! Good but not radical! Very expensive to pull all the data across the grid.
Moves the processing to the data, not the other way around! Much more efficient, as the processor will likely have a small amount of data associated with it whereas the cache size may be very large! Massively REDUCES LATENCY through not sending the data. Massively INCREASES PARALLELISM. Can REDUCE LOCK overhead as lock acquisition is local.
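The idea of shipping the processing to the data can be sketched as follows. This is a single-process toy and not the Coherence API: `Grid`, `Partition`, `invoke` and `raise_bet` are hypothetical names. The point is that the caller sends a small function and gets back a small result, while the entry itself never leaves its partition.

```python
import zlib


class Partition:
    """Holds a slice of the data and runs processors next to it."""

    def __init__(self):
        self.entries = {}

    def invoke(self, key, processor):
        # The processor runs where the entry lives; only its result is returned.
        new_value, result = processor(self.entries.get(key))
        self.entries[key] = new_value
        return result


class Grid:
    def __init__(self, partition_count):
        self.partitions = [Partition() for _ in range(partition_count)]

    def _partition_for(self, key):
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        return self.partitions[index]

    def put(self, key, value):
        self._partition_for(key).entries[key] = value

    def invoke(self, key, processor):
        return self._partition_for(key).invoke(key, processor)


def raise_bet(amount):
    """Build a processor that raises a player's stake and returns the new stake."""
    def processor(player):
        updated = dict(player, stake=player["stake"] + amount)
        return updated, updated["stake"]
    return processor


if __name__ == "__main__":
    grid = Grid(partition_count=4)
    grid.put("player:1", {"stake": 10, "hand": ["AS", "KD"]})
    print(grid.invoke("player:1", raise_bet(5)))
```

In a real grid the update is applied on the node that owns the entry, so any locking is local to that node and independent keys can be processed in parallel on different nodes.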
Coherence supports programmatic querying of the grid. CohQL allows straight queries but there is also a programmatic API. CohQL was new in Coherence 3.6 (July 2010).
Used for chat. Could be used for new trades for a trader, or bids or offers on a security. Asynchronous notification. Management.
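A minimal sketch of cache events, assuming a listener is just a callable that receives the key plus the old and new values. The names are hypothetical, and listeners are called inline here for simplicity, whereas a real grid would deliver the notifications asynchronously to remote clients.

```python
class ObservableCache:
    """Toy cache that notifies registered listeners whenever an entry changes."""

    def __init__(self):
        self._entries = {}
        self._listeners = []

    def add_listener(self, callback):
        """Register callback(key, old_value, new_value)."""
        self._listeners.append(callback)

    def put(self, key, value):
        old_value = self._entries.get(key)
        self._entries[key] = value
        for callback in self._listeners:
            callback(key, old_value, value)


if __name__ == "__main__":
    bids = ObservableCache()
    bids.add_listener(lambda key, old, new: print(f"{key}: {old} -> {new}"))
    bids.put("bid:ACME", 101.5)
    bids.put("bid:ACME", 101.75)
```

A trader's screen or a chat window would register a listener like this and be pushed each change rather than polling for it.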
Huge scalable grid with huge data storage capacity, backed by a database with write-behind processing. Events to PUSH data to clients via web sockets or Ajax. Parallel computation and updates. Write-behind processing for asynchronous storage. Elastically expand capacity through rebalancing the partitions.
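Write-behind can be sketched as a cache that acknowledges writes immediately and hands the database a batch later. Here `flush` is called by hand to keep the example deterministic, whereas a real grid would flush on a timer or a background thread. `store_batch` stands in for the database write and all names are hypothetical.

```python
class WriteBehindCache:
    """Toy write-behind cache: writes hit memory first and reach the store in batches."""

    def __init__(self, store_batch):
        self._store_batch = store_batch   # stand-in for a bulk database write
        self._entries = {}
        self._dirty = {}                  # entries changed since the last flush

    def put(self, key, value):
        self._entries[key] = value
        # Repeated writes to one key coalesce: only the latest value is stored.
        self._dirty[key] = value

    def get(self, key):
        return self._entries[key]

    def flush(self):
        """Write all pending changes in one batch and return how many were written."""
        if not self._dirty:
            return 0
        batch, self._dirty = self._dirty, {}
        self._store_batch(batch)
        return len(batch)


if __name__ == "__main__":
    database = {}
    cache = WriteBehindCache(database.update)
    cache.put("account:1", 100)
    cache.put("account:1", 120)   # overwrites the pending write
    cache.put("account:2", 50)
    print("written:", cache.flush(), database)
```

The trade-off is that changes not yet flushed exist only in memory, which is one reason the grid keeps backup copies of its entries.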