Non-Functional Properties of Event Processing. Presenters: Opher Etzion and Tali Yatzkar-Haham. Participated in the preparation: Ella Rabinovich and Inna Skarbovsky.
Introduction to non-functional properties of event processing
The variety. There is a variety of cheesecakes. Likewise, there are many systems that conceptually look like an EPN, but they differ in their non-functional properties.
Two examples. Very large network management: millions of events every minute; very few are significant, and the same event is repeated. Time windows are very short. Patient monitoring according to a medical treatment protocol: sporadic events, but each is meaningful; time windows can span weeks. Both can be implemented by event processing, but very differently.
Agenda. I: Introduction to non-functional properties of event processing. II: Performance and scalability considerations. III: Availability considerations. IV: Usability considerations. V: Security and privacy considerations. VI: Summary.
Performance and Scalability Considerations
Performance benchmarks. There is a large variance among applications, thus a collection of benchmarks should be devised, and each application should be classified to a benchmark. Some classification criteria: application complexity, filtering rate, required performance metrics.
Performance benchmarks – cont. Previous studies indicate that there is a major performance degradation as application complexity increases. References: Adi A., Etzion O. Amit - the situation manager. The VLDB Journal – The International Journal on Very Large Databases. Volume 13 Issue 2, 2004. Mendes M., Bizarro P., Marques P. Benchmarking event processing systems: current state and future directions. WOSP/SIPEW 2010: 259-260.
Some benchmark scenarios (Adi A., Etzion O. Amit - the situation manager. The VLDB Journal – The International Journal on Very Large Databases. Volume 13 Issue 2, 2004). Total external events: 100000 in each of scenarios 1 to 4. Accumulated latency (ms): 124319, 1742, 1372. Throughput (event/s): 1923, 57470, 72887. Column order of these values: scenario 4, scenario 2, scenario 1. Scenario 3: 16503, 7903.
Performance indicators: one of the sources of variety. Observations: the same system behaves extremely differently depending on the type of functions employed. Different applications may require different metrics.
Throughput. Input throughput measures the number of input events that the system can digest within a given time interval. Processing throughput measures total processing time / # of events processed within a given time interval. Output throughput measures the # of events that were emitted to consumers within a given time interval.
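The three measures on this slide can be sketched in a few lines. This is a minimal illustration only: the function names and the sample timestamps are assumptions made for the example, not part of the tutorial.

```python
def throughput(timestamps, start, end):
    """Events per second whose timestamp falls in the interval [start, end)."""
    if end <= start:
        raise ValueError("interval must have positive length")
    count = sum(1 for t in timestamps if start <= t < end)
    return count / (end - start)


def processing_time_per_event(processing_times):
    """Total processing time divided by the number of events processed."""
    if not processing_times:
        raise ValueError("no events processed")
    return sum(processing_times) / len(processing_times)


# Hypothetical measurements (seconds) for a 10-second interval.
arrivals = [0.5, 1.0, 1.5, 2.0, 4.0, 6.0, 8.0, 9.5]  # input events digested
emissions = [2.1, 6.2]                               # events emitted to consumers
per_event_cost = [0.002, 0.004, 0.003, 0.003]        # processing time of each event

input_tp = throughput(arrivals, 0.0, 10.0)
output_tp = throughput(emissions, 0.0, 10.0)
avg_cost = processing_time_per_event(per_event_cost)
```

Input and output throughput can differ widely in the same run, which is one reason a single "throughput" figure is hard to compare across applications.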
Latency. The latency definition: at the E2E level, latency is defined as the elapsed time FROM the time-point when the producer emits an input event TO the time-point when the consumer receives an output event. But an input event may not result in an output event: it may be filtered out, participate in a pattern without resulting in pattern detection, or participate in a deferred operation (e.g. aggregation). Similar definitions apply at the EPA level or the path level.
Latency definition, two variations. [Figure: Producers 1, 2 and 3 send E1, E2 and E3 to an EPA detecting Sequence (E1, E2, E3) within a sliding window of 1 hour, which emits to a consumer; timeline marks 11:00, 11:10, 11:15, 11:30, 11:40, 12:00.] Variation I: we measure the latency of E3 only. Variation II: we measure the latency of each event; for events that don't create derived events directly, we measure the time until the system finishes processing them.
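The difference between the two variations can be sketched as follows. The `Trace` record, its field names and the sample times are hypothetical, chosen only to illustrate the two definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trace:
    event_id: str
    emitted_at: float                            # producer emits the input event
    finished_at: float                           # system finishes processing it
    derived_received_at: Optional[float] = None  # consumer receives the derived event, if any


def latency_variation_1(traces):
    """Variation I: only events that directly produced a derived event."""
    return {t.event_id: t.derived_received_at - t.emitted_at
            for t in traces if t.derived_received_at is not None}


def latency_variation_2(traces):
    """Variation II: every event; without a derived event, use processing end."""
    result = {}
    for t in traces:
        end = t.derived_received_at if t.derived_received_at is not None else t.finished_at
        result[t.event_id] = end - t.emitted_at
    return result


# Hypothetical times in seconds; only E3 completes the sequence.
traces = [
    Trace("E1", 0.0, 0.25),
    Trace("E2", 300.0, 300.5),
    Trace("E3", 1200.0, 1200.5, derived_received_at=1201.0),
]
v1 = latency_variation_1(traces)
v2 = latency_variation_2(traces)
```

Under Variation I the report contains a single figure for E3; under Variation II every event contributes a measurement, so averages over the two are not comparable.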
Performance goals and metrics. Max throughput. All/80% have max/avg latency < δ. All/90% of time units have throughput > Ω. minmax latency. minavg latency. Latency leveling.
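Goals of the form "80% of events have latency < δ" or "90% of time units have throughput > Ω" can be checked mechanically. The sketch below assumes latencies in milliseconds and per-second event counts; the function names and sample data are illustrative, not taken from the tutorial.

```python
def meets_latency_goal(latencies, delta, required_fraction=0.8):
    """True if at least required_fraction of the events have latency < delta."""
    below = sum(1 for x in latencies if x < delta)
    return below / len(latencies) >= required_fraction


def meets_throughput_goal(counts_per_unit, omega, required_fraction=0.9):
    """True if at least required_fraction of the time units have throughput > omega."""
    above = sum(1 for c in counts_per_unit if c > omega)
    return above / len(counts_per_unit) >= required_fraction


# Hypothetical measurements.
latencies_ms = [5, 7, 8, 9, 12, 6, 7, 30, 8, 9]
events_per_second = [120, 130, 95, 140, 150, 125, 135, 128, 122, 131]

goal_80 = meets_latency_goal(latencies_ms, delta=10)
goal_all = meets_latency_goal(latencies_ms, delta=10, required_fraction=1.0)
goal_tp = meets_throughput_goal(events_per_second, omega=100)
```

The same data satisfies the 80% goal but not the "all events" goal, which is why the metric has to be fixed before systems are compared.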
Optimization tools. Blackbox optimizations: distribution, parallelism, scheduling, load balancing, load shedding. Whitebox optimizations: implementation selection, implementation optimization, pattern rewriting.
Scalability. Scalability is the ability of a system to handle growing amounts of work in a graceful manner, or its ability to be enlarged effortlessly and transparently to accommodate this growth. Scale up (vertical scalability): adding resources within the same logical unit to increase capacity. Scale out (horizontal scalability): adding logical units to increase processing power.
Vertical scalability (scaling up): adding resources to a single logical unit to increase its processing abilities. Qualifications of an application designed for scale-up.
Horizontal scalability (scaling out): adding multiple logical units and making them work as a single unit. Qualifications of an application designed for scale-out. For stateful applications. Different patterns associated.
Scale-out and scale-up tradeoffs. Scale up. Scale out.
General approach to scalability. Usually applications combine the two approaches… Scaling out by… Scaling up by…
Scalability in event processing, various dimensions: # of producers, # of input events, # of EPA types, # of concurrent runtime instances, # of concurrent runtime contexts, internal state size, # of consumers, # of derived events, processing complexity, # of geographical locations.
Event-processing techniques for scalability: load shedding; load partitioning according to EPA topology and runtime contexts.
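As one possible illustration of load shedding, the sketch below uses a bounded queue that drops newly arriving events once it is full. This drop-newest policy, the class name and the capacity are assumptions made for the example; real systems may shed by sampling, priority or semantic importance.

```python
from collections import deque


class SheddingQueue:
    """Bounded queue that sheds (drops) new events once capacity is reached."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.dropped = 0

    def offer(self, event):
        """Try to enqueue an event; return False if it was shed."""
        if len(self.items) >= self.capacity:
            self.dropped += 1
            return False
        self.items.append(event)
        return True

    def poll(self):
        """Remove and return the oldest queued event, or None if empty."""
        return self.items.popleft() if self.items else None


queue = SheddingQueue(capacity=3)
accepted = [queue.offer(e) for e in ["e1", "e2", "e3", "e4", "e5"]]
first = queue.poll()
```

Shedding trades completeness for bounded latency, so it suits applications such as the network-management example, where most events are repeated or insignificant.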
Scalability in event volume. Applicable scale-up and scale-out techniques. Scale up techniques. Scale out techniques. Some applications requiring high event throughput: financial, weather, phone-call tracking.
Scalability in quantity of event processing agents. Applicable scale-up and scale-out techniques. Optimization in agent assignment (mapping between logical and physical artifacts).
Scalability in quantity of event processing agents: partitioning and parallelism. Parallelism/Distribution. Partitioning.
Scalability in the number of producers/consumers. Growth in the number of producers usually results in growth in event load, even if the number of events produced by each one is small. Growth in the number of consumers requires optimization at the routing level, such as multicasting.
Scalability in the number of context partitions and context-state size. [Figure: events routed to nodes by Hash (customer id).] Each context partition is represented by internal state of a certain size. Growth in the number of context partitions. Significant growth of internal state for a single context partition.
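Routing events to nodes by a hash of the customer id keeps each context partition's state on a single node. A minimal sketch follows; the event layout, the node count and the choice of SHA-256 as a stable hash are assumptions for illustration.

```python
import hashlib


def node_for(customer_id, node_count):
    """Map a customer id to a node index with a hash that is stable across runs."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def partition(events, node_count):
    """Group events by the node responsible for their customer's context."""
    nodes = {i: [] for i in range(node_count)}
    for event in events:
        nodes[node_for(event["customer_id"], node_count)].append(event)
    return nodes


# Hypothetical events.
events = [
    {"customer_id": "c1", "amount": 10},
    {"customer_id": "c2", "amount": 25},
    {"customer_id": "c1", "amount": 40},
]
nodes = partition(events, node_count=4)
```

Simple modulo hashing spreads many small partitions well, but it does not help when a single partition's state grows large, and changing `node_count` remaps most customers.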
Availability Considerations
Availability. Availability is the ratio of the time the system is perceived as functioning by its users to the time it is required or expected to function. Can be expressed as…
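The ratio in this definition is straightforward to compute. In the sketch below, the function names, the 0.999 example and the translation into yearly downtime are illustrative assumptions; the slide's own list of expressions did not survive extraction.

```python
def availability(uptime, required_time):
    """Ratio of time perceived as functioning to time required to function."""
    if required_time <= 0:
        raise ValueError("required_time must be positive")
    return uptime / required_time


def allowed_downtime_hours_per_year(avail, hours_per_year=365 * 24):
    """Downtime budget per year implied by a given availability ratio."""
    return (1.0 - avail) * hours_per_year


# Hypothetical example: 999 hours up out of 1000 hours required.
ratio = availability(999.0, 1000.0)
budget = allowed_downtime_hours_per_year(ratio)
```

Under these assumptions, an availability of 0.999 leaves a budget of roughly 8.76 hours of downtime per year.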
Availability expectations and solutions. Continuous availability provides the ability to keep the business application running without any noticeable downtime. Continuous operation is the ability to avoid planned outages. Major outages… Minor outages…
Components of high availability. Fault avoidance: redundancy and duplication. Fault tolerance: recoverability.
Redundancy and duplication. Redundancy. Scale out techniques. Failover: automatic reconfiguration. Load balancing is one of the players. Duplication.
Recoverability in stateful applications: state management tradeoffs. Data grid: replication of state between multiple machines. Memory based state. Better performance than pure db. In-memory db with caching capabilities.
High availability costs. Implementing some of the HA practices can be very expensive…
Availability in event processing. Using the general availability techniques…
Cost-effectiveness of recoverability techniques in EP. One has to consider whether implementing recoverability is cost-effective.
Usability Considerations
Usability 101. Definition by Jakob Nielsen (http://www.useit.com/alertbox/20030825.html). Learnability: how easy is it for users to accomplish basic tasks the first time they encounter the system? Efficiency: once users have learned the system, how quickly can they perform tasks? Memorability: when users return after a period of not using the system, how easily can they reestablish proficiency? Errors: how many errors do users make, how severe are these errors, and how easily can they recover from them? Satisfaction: how pleasant is it to use the system? Utility: does the system do what the user intended?
In this part of the tutorial we'll talk about: build time IDE; runtime control and audit tools; correctness (internal consistency); debug and validation; consistency with the environment (transactional behavior).
Build time interfaces: text based programming languages, visual languages, form based languages, natural language interfaces.
Text-based IDE (Sybase/CCL)
Another Text-based IDE (Apama)
Visual language – StreamSQL EventFlow (Streambase)
Visual language – StreamSQL EventFlow (Streambase) – cont.
Form based language – WebSphere Business Events (IBM)
Natural language for event processing. Based on work done by Mark H Linehan (IBM T.J. Watson Research Center). Free text: Frequent big cash deposit pattern is defined as “at least 4 big cash deposits to the same account”, where the big deposit decision depends on the customer’s profile. Structured English: A derived event that is derived from a big cash deposit using the frequent deposits in same account applying threshold the count of the participant event set of frequent big cash deposits is greater than or equal to 4.
Run time tools: performance monitoring, dashboards, audit and provenance. Two types of run time tools: monitoring the application; monitoring the event processing system.
Performance Monitoring (Aleri/Sybase)
Dashboard (Apama)
Dashboard Construction (Apama)
Dashboard (IBM WBE)
Provenance and audit. Tracking all consequences of an event. Tracking the reasons that something happens. Within the event processing system: derivation of events, routing of events, actions triggered by the events.
Example: Pharmaceutical pedigree
Validation and debugging: debugger, testing and simulation, validation.
Breakpoints and Debugging
Breakpoints and Debugging (StreamBase)
Testing & simulation – IBM WBE
Application validation
Validation techniques. Static analysis: build-time, development phase. Dynamic analysis: run-time, development and production phases. Analysis with formal methods: build-time, development phase.
Static analysis
Dynamic analysis. [Figure labels: Runtime Scenario; Dynamic Analysis Component; EP Application Definition; History Data Store; Observations for dynamic analysis; EP system invocation on runtime scenario; Results analysis for correctness and coverage; Analysis results.]
Advanced verification with formal methods
Correctness: the ability of a developer to create a correct implementation for all cases (including the boundaries). Observation: a substantial amount of effort is invested today in many of the tools to work around the inability of the language to easily create correct solutions.
Some correctness topics: the right interpretation of language constructs; the right order of events; the right classification of events to windows.
The right interpretation of language constructs, example. All (E1, E2): what do we mean? A customer both sells and buys the same security in value of more than $1M within a single day. Deal fulfillment: package arrival and payment arrival. [Timeline marks: 6/3 10:00, 7/3 11:00, 8/3 11:00, 8/3 14:00.]
Fine tuning of the semantics (I): When should the derived event be emitted? When the pattern is matched? At the window end?
Fine tuning of the semantics (II): How many instances of derived events should be emitted? Only once? Every time there is a match?
Fine tuning of the semantics (III): What happens if the same event happens several times? Only one participates (first, last, higher/lower value on some predicate)? All of them participate in a match?
Fine tuning of the semantics (IV): Can we consume or reuse events that participate in a match?
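The effect of the consume/reuse choice on All (E1, E2) can be seen in a small sketch. The matcher below is a simplified illustration, not the semantics of any particular product: the `consume` flag, the oldest-first pairing and the event ids are assumptions.

```python
def _pair(etype, eid, partner):
    """Order a match as (E1 id, E2 id) regardless of arrival order."""
    return (eid, partner) if etype == "E1" else (partner, eid)


def match_all(events, consume=True):
    """Detect All(E1, E2) over (type, id) events given in arrival order.

    consume=True:  each event participates in at most one match.
    consume=False: events are reused, so every E1/E2 combination matches.
    """
    pending = {"E1": [], "E2": []}
    matches = []
    for etype, eid in events:
        other = "E2" if etype == "E1" else "E1"
        if consume:
            if pending[other]:
                partner = pending[other].pop(0)  # oldest unmatched partner
                matches.append(_pair(etype, eid, partner))
            else:
                pending[etype].append(eid)
        else:
            for partner in pending[other]:
                matches.append(_pair(etype, eid, partner))
            pending[etype].append(eid)
    return matches


stream = [("E1", "a1"), ("E1", "a2"), ("E2", "b1"), ("E2", "b2")]
consumed = match_all(stream, consume=True)
reused = match_all(stream, consume=False)
```

The same four input events yield two derived events under the consume policy and four under the reuse policy, which is why the policy has to be explicit.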
Fine tuning of semantics, conclusion. In other cases, explicit programming and workarounds are used if the intended semantics differ from the default semantics.
The right order of events, scenario. Trace:
===Input Bids===
Bid Start 12:55:00
credit bid id=2,occurrence time=12:55:32,price=4
cash bid id=29,occurrence time=12:55:33,price=4
cash bid id=33,occurrence time=12:55:34,price=3
credit bid id=66,occurrence time=12:55:36,price=4
credit bid id=56,occurrence time=12:55:59,price=5
Bid End 12:56:00
===Winning Bid===
cash bid id=29,occurrence time=12:55:33,price=4
Race conditions: between events; between events and window start/end.
Ordering in a distributed environment, possible issues. Even if the occurrence time of an event is accurate, it might arrive after some processing has already been done. If we use the occurrence time of an event as reported by the source, it might not be accurate, due to clock accuracy in the source. Most systems order events by detection time, but events may switch their order on the way.
Clock accuracy in the source. Clock synchronization. Time server, example: http://tf.nist.gov/service/its.htm
Buffering technique. [Figure labels: Producers; Sorted Buffer (by occurrence time); To; t > To + …; Event Processing.]
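A sorted buffer of this kind can be sketched with a heap: events are held until the current time exceeds their occurrence time by some delay, then released in occurrence-time order. The class name, the `delay` parameter and the sample bids (times in seconds within the minute) are assumptions for illustration; the delay term on the slide did not survive extraction.

```python
import heapq


class ReorderBuffer:
    """Holds events and releases them sorted by occurrence time after a delay."""

    def __init__(self, delay):
        self.delay = delay
        self._heap = []
        self._seq = 0  # tie-breaker so events themselves are never compared

    def add(self, occurrence_time, event):
        heapq.heappush(self._heap, (occurrence_time, self._seq, event))
        self._seq += 1

    def release(self, now):
        """Return events with occurrence_time + delay < now, oldest first."""
        ready = []
        while self._heap and self._heap[0][0] + self.delay < now:
            _, _, event = heapq.heappop(self._heap)
            ready.append(event)
        return ready


buffer = ReorderBuffer(delay=2)
buffer.add(33, "cash bid id=29")    # arrives first, occurred second
buffer.add(32, "credit bid id=2")   # arrives second, occurred first
buffer.add(34, "cash bid id=33")
first_batch = buffer.release(now=36)
second_batch = buffer.release(now=40)
```

The delay bounds how late an event may arrive and still be ordered correctly, at the price of adding that delay to every event's latency.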
Retrospective compensation
Classification to windows, scenario. Calculate statistics for each player (aggregate per quarter). Calculate statistics for each team (aggregate per quarter). Window classification: player statistics are calculated at the end of each quarter. Team statistics are calculated at the end of each quarter, based on the player events that arrived within the same quarter. All instances of player statistics that occur within a quarter window must be classified to the same window, even if they are derived after the window termination.
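One way to satisfy this requirement is to classify an event by its occurrence time rather than by the time it was derived or detected. The sketch below assumes fixed-length windows; the 720-second quarter length and the sample times are hypothetical.

```python
def window_index(occurrence_time, window_start, window_length):
    """Index of the fixed-length window that contains occurrence_time."""
    if occurrence_time < window_start:
        raise ValueError("event occurred before the first window")
    return int((occurrence_time - window_start) // window_length)


QUARTER_SECONDS = 720  # hypothetical quarter length

# A player statistic that occurred at second 715 but was derived at second 722.
by_occurrence = window_index(715, 0, QUARTER_SECONDS)
by_derivation = window_index(722, 0, QUARTER_SECONDS)
```

Classifying by derivation time would place this statistic in the next quarter, so the team aggregate for the first quarter would miss it.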
Transactional behavior. Nothing gets out of the system until the transaction is committed.
Transactional behavior in event processing? Typically, event processing systems have a decoupled architecture and do not exhibit transactional behavior. However, in several cases event processing is embedded within a transactional environment.
Case I: Transactional ECA at the consumer side. When a derived event is emitted to a consumer, there is an ECA rule with several actions that is required to run as an atomic unit. If it fails, the derived event should be withdrawn.
Case II: An event processing system monitors a transactional system. In this case, the producer may emit events that are not confirmed and may be rolled back.
Case III: Event processing is part of a chain. There is some transactional relationship between the producer and the consumer. The event processing system should transfer the rollback notice from the consumer to the producer.
Case IV: A path in the event processing network should act as a “unit of work”. Example: if “determine winner” fails and the bid is cancelled, all bid events are not kept in the event stores and are withdrawn for other processing purposes.
Transactions in event processing systems. All (E1, E2): E2 arrived 5 days after E1, and the processing of the pattern failed. What do we mean? Withdraw only E2? Withdraw also E1 after 5 days?
Security and Privacy Considerations
Security, privacy and trust. Security requirements ensure that operations are only performed by authorized parties, and that privacy considerations are met. Characteristics of a secure application (based on Enhancing the Development Life Cycle to Produce Secure Software [DHS/DACS 08]). Trustworthiness: containing no malicious logic that causes it to behave in a malicious manner. Survivability: recovering as quickly as possible with as little damage as possible from attacks. Dependability: executing predictably and operating correctly under all conditions, including hostile conditions.
Towards security assurance. Identify and categorize the information the software is going to contain. Low sensitivity: the impact of a security violation is minimal. High sensitivity: a violation may pose a threat to human life. Develop security requirements.
Security in event processing systems
Security in event processing systems – cont.
Security patterns in event processing
Summary
Summary. Non-functional properties determine the nature of event processing applications: distribution, availability, optimization, correctness and security are some of the dimensions. They are often the main decision factor in selecting whether to use an event processing system, and in the selection among various alternatives.

Ähnlich wie Debs 2011 tutorial on non functional properties of event processing

USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATIONUSING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
gerogepatton
 
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATIONUSING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
ijaia
 
Using Semi-supervised Classifier to Forecast Extreme CPU Utilization
Using Semi-supervised Classifier to Forecast Extreme CPU UtilizationUsing Semi-supervised Classifier to Forecast Extreme CPU Utilization
Using Semi-supervised Classifier to Forecast Extreme CPU Utilization
gerogepatton
 

Ähnlich wie Debs 2011 tutorial on non functional properties of event processing (20)

Event-driven BPM the JBoss way
Event-driven BPM the JBoss wayEvent-driven BPM the JBoss way
Event-driven BPM the JBoss way
 
Performance testing : An Overview
Performance testing : An OverviewPerformance testing : An Overview
Performance testing : An Overview
 
Being Elastic -- Evolving Programming for the Cloud
Being Elastic -- Evolving Programming for the CloudBeing Elastic -- Evolving Programming for the Cloud
Being Elastic -- Evolving Programming for the Cloud
 
T3 Consortium's Performance Center of Excellence
T3 Consortium's Performance Center of ExcellenceT3 Consortium's Performance Center of Excellence
T3 Consortium's Performance Center of Excellence
 
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATIONUSING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
 
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATIONUSING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
USING SEMI-SUPERVISED CLASSIFIER TO FORECAST EXTREME CPU UTILIZATION
 
Using Semi-supervised Classifier to Forecast Extreme CPU Utilization
Using Semi-supervised Classifier to Forecast Extreme CPU UtilizationUsing Semi-supervised Classifier to Forecast Extreme CPU Utilization
Using Semi-supervised Classifier to Forecast Extreme CPU Utilization
 
ScaleFast Grid And Flow
ScaleFast Grid And FlowScaleFast Grid And Flow
ScaleFast Grid And Flow
 
Performance testing wreaking balls
Performance testing wreaking ballsPerformance testing wreaking balls
Performance testing wreaking balls
 
Cloud Storage and Security
Cloud Storage and SecurityCloud Storage and Security
Cloud Storage and Security
 
Error Isolation and Management in Agile Multi-Tenant Cloud Based Applications
Error Isolation and Management in Agile Multi-Tenant Cloud Based Applications Error Isolation and Management in Agile Multi-Tenant Cloud Based Applications
Error Isolation and Management in Agile Multi-Tenant Cloud Based Applications
 
Error isolation and management in agile
Error isolation and management in agileError isolation and management in agile
Error isolation and management in agile
 
Best Practices for Large-Scale Websites -- Lessons from eBay
Best Practices for Large-Scale Websites -- Lessons from eBayBest Practices for Large-Scale Websites -- Lessons from eBay
Best Practices for Large-Scale Websites -- Lessons from eBay
 
Review se
Review seReview se
Review se
 
Architecting and Tuning IIB/eXtreme Scale for Maximum Performance and Reliabi...
Architecting and Tuning IIB/eXtreme Scale for Maximum Performance and Reliabi...Architecting and Tuning IIB/eXtreme Scale for Maximum Performance and Reliabi...
Architecting and Tuning IIB/eXtreme Scale for Maximum Performance and Reliabi...
 
Construção de uma plataforma de observabilidade centralizada
Construção de uma plataforma de observabilidade centralizadaConstrução de uma plataforma de observabilidade centralizada
Construção de uma plataforma de observabilidade centralizada
 
Software architecture unit 4
Software architecture unit 4Software architecture unit 4
Software architecture unit 4
 
Performance testing basics
Performance testing basicsPerformance testing basics
Performance testing basics
 
AFITC 2018 - Using Process Maturity and Agile to Strengthen Cyber Security
AFITC 2018 - Using Process Maturity and Agile to Strengthen Cyber SecurityAFITC 2018 - Using Process Maturity and Agile to Strengthen Cyber Security
AFITC 2018 - Using Process Maturity and Agile to Strengthen Cyber Security
 
Software Requirements and Design Process in the Aerospace Industry
Software Requirements and Design Process in the Aerospace IndustrySoftware Requirements and Design Process in the Aerospace Industry
Software Requirements and Design Process in the Aerospace Industry
 

Debs 2011 tutorial on non functional properties of event processing

  • 1. Non Functional Properties of Event Processing Presenters: Opher Etzion and Tali Yatzkar-Haham Participated in the preparation: Ella Rabinovich and Inna Skarbovsky
  • 2. Introduction to non functional properties of event processing
  • 3. The variety: There is a variety of cheesecakes. Likewise, there are many systems that conceptually look like an EPN (event processing network), but they differ in their non-functional properties.
  • 4. Two examples. Very large network management: millions of events every minute; very few are significant; the same event is repeated; time windows are very short. Patient monitoring according to a medical treatment protocol: sporadic events, but each is meaningful; time windows can span weeks. Both can be implemented by event processing, but very differently.
  • 5. Agenda: I. Introduction to non-functional properties of event processing; II. Performance and scalability considerations; III. Availability considerations; IV. Usability considerations; V. Security and privacy considerations; VI. Summary.
  • 7. Performance benchmarks: There is a large variance among applications, so a collection of benchmarks should be devised, and each application should be classified to a benchmark. Some classification criteria: application complexity; filtering rate; required performance metrics.
  • 8. Performance benchmarks (cont.): Previous studies indicate that there is a major performance degradation as application complexity increases. Sources: Adi A., Etzion O. Amit - the situation manager. The VLDB Journal - The International Journal on Very Large Databases. Volume 13, Issue 2, 2004. Mendes M., Bizarro P., Marques P. Benchmarking event processing systems: current state and future directions. WOSP/SIPEW 2010: 259-260.
  • 10. Performance indicators: one of the sources of variety. Observations: the same system behaves very differently depending on the type of functions employed; different applications may require different metrics.
  • 11. Throughput. Input throughput: the number of input events that the system can digest within a given time interval. Processing throughput: total processing time divided by the number of events processed within a given time interval. Output throughput: the number of events emitted to consumers within a given time interval.
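The three throughput measures on this slide can be sketched in a few lines of Python. This is only an illustration: the `IntervalStats` structure and its field names are assumptions made for the example, not part of the tutorial or of any product.

```python
from dataclasses import dataclass


@dataclass
class IntervalStats:
    """Counters for one measurement interval (hypothetical structure)."""
    interval_seconds: float    # length of the measurement interval
    input_events: int          # events accepted from producers
    output_events: int         # events emitted to consumers
    processing_seconds: float  # total time spent processing the input events


def input_throughput(s: IntervalStats) -> float:
    """Input events digested per second."""
    return s.input_events / s.interval_seconds


def output_throughput(s: IntervalStats) -> float:
    """Events emitted to consumers per second."""
    return s.output_events / s.interval_seconds


def mean_processing_time(s: IntervalStats) -> float:
    """Total processing time divided by the number of events processed."""
    return s.processing_seconds / s.input_events


# Hypothetical numbers: heavy filtering makes output much smaller than input.
stats = IntervalStats(interval_seconds=10.0, input_events=50_000,
                      output_events=500, processing_seconds=4.0)
print(input_throughput(stats), output_throughput(stats), mean_processing_time(stats))
```

Keeping the three numbers separate matters when the filtering rate is high, because input and output throughput can then differ by orders of magnitude.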
  • 12. Latency. At the end-to-end (E2E) level, latency is defined as the elapsed time from the time-point when the producer emits an input event to the time-point when the consumer receives an output event. However, an input event may not result in an output event: it may be filtered out, it may participate in a pattern that is never detected, or it may participate in a deferred operation (e.g. aggregation). Similar definitions apply at the EPA level and at the path level.
  • 13. Latency definition, two variations. [Figure: Producers 1 to 3 emit E1, E2 and E3 to an EPA detecting Sequence (E1, E2, E3) within a sliding window of 1 hour, which feeds a consumer; the timeline runs from 11:00 to 12:00 with events marked at 11:10, 11:15, 11:30 and 11:40.] Variation I: we measure the latency of E3 only. Variation II: we measure the latency of each event; for events that don’t create derived events directly, we measure the time until the system finishes processing them.
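A minimal sketch of the two latency variations, assuming each input event carries three timestamps. The `EventTrace` record, its field names, and the millisecond values are hypothetical and chosen only to make the difference visible.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class EventTrace:
    """Timestamps recorded for one input event, in milliseconds (hypothetical)."""
    name: str
    emitted_at: int                             # producer emits the event
    processed_at: int                           # system finishes handling the event
    consumer_received_at: Optional[int] = None  # set only if the event directly
                                                # triggered a derived event


def latency_variation_1(traces: List[EventTrace]) -> Dict[str, int]:
    """Variation I: only events that directly produce a derived event are measured."""
    return {t.name: t.consumer_received_at - t.emitted_at
            for t in traces if t.consumer_received_at is not None}


def latency_variation_2(traces: List[EventTrace]) -> Dict[str, int]:
    """Variation II: every event is measured; events without a derived event
    are measured until the system finishes processing them."""
    result = {}
    for t in traces:
        end = t.consumer_received_at if t.consumer_received_at is not None else t.processed_at
        result[t.name] = end - t.emitted_at
    return result


# Sequence (E1, E2, E3): only E3 completes the pattern and reaches the consumer.
traces = [
    EventTrace("E1", emitted_at=0, processed_at=4),
    EventTrace("E2", emitted_at=300, processed_at=303),
    EventTrace("E3", emitted_at=1200, processed_at=1205, consumer_received_at=1212),
]
print(latency_variation_1(traces))
print(latency_variation_2(traces))
```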
  • 15. Optimization tools. Blackbox optimizations: distribution, parallelism, scheduling, load balancing, load shedding. Whitebox optimizations: implementation selection, implementation optimization, pattern rewriting.
  • 16. Scalability is the ability of a system to handle growing amounts of work gracefully, or its ability to be enlarged effortlessly and transparently to accommodate this growth. Scale up (vertical scalability): adding resources within the same logical unit to increase capacity. Scale out (horizontal scalability): adding logical units to increase processing power.
  • 21. Scalability in event processing, various dimensions: # of producers; # of input events; # of EPA types; # of concurrent runtime instances; # of concurrent runtime contexts; internal state size; # of consumers; # of derived events; processing complexity; # of geographical locations.
  • 22. Event-processing techniques for scalability: load shedding; load partitioning according to the EPA topology and runtime contexts.
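One way to picture the two techniques together is the sketch below. It assumes that each event is a dict carrying a `context` key identifying its runtime context, that partitions are chosen by a stable hash of that key, and that shedding simply drops events arriving at a full queue. All of these are illustrative choices, not something the slides prescribe.

```python
import zlib
from collections import deque


def partition_for(context_key: str, num_partitions: int) -> int:
    """Stable hash, so every event of one runtime context lands on the same partition."""
    return zlib.crc32(context_key.encode("utf-8")) % num_partitions


class SheddingQueue:
    """Bounded queue that sheds (drops) new events once it is full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = deque()
        self.shed = 0  # number of events dropped so far

    def offer(self, event) -> bool:
        if len(self.items) >= self.capacity:
            self.shed += 1
            return False
        self.items.append(event)
        return True


def route(events, num_partitions: int, capacity: int):
    """Partition events by context, shedding load on overloaded partitions."""
    queues = [SheddingQueue(capacity) for _ in range(num_partitions)]
    for event in events:
        queues[partition_for(event["context"], num_partitions)].offer(event)
    return queues
```

Dropping the newest event is the simplest shedding policy; a real system might instead shed by event priority or by sampling.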
  • 26. Scalability in the number of producers/consumers. Growth in the number of producers usually results in growth in event load, even if the number of events produced by each one is small. Growth in the number of consumers requires optimization at the routing level, such as multicasting.
  • 38. Usability 101. Definition by Jakob Nielsen (http://www.useit.com/alertbox/20030825.html). Learnability: how easy is it for users to accomplish basic tasks the first time they encounter the system? Efficiency: once users have learned the system, how quickly can they perform tasks? Memorability: when users return after a period of not using the system, how easily can they reestablish proficiency? Errors: how many errors do users make, how severe are these errors, and how easily can they recover from them? Satisfaction: how pleasant is it to use the system? Utility: does the system do what the user intended?
  • 39. In this part of the tutorial we’ll talk about: build-time IDE; runtime control and audit tools; correctness (internal consistency); debug and validation; consistency with the environment (transactional behavior).
  • 40. Build-time interfaces: text-based programming languages; visual languages; form-based languages; natural language interfaces.
  • 43. Visual language – StreamSQL EventFlow (Streambase)
  • 44. Visual language – StreamSQL EventFlow (Streambase) – cont.
  • 47. Run time tools: performance monitoring; dashboards; audit and provenance. Two types of run time tools: monitoring the application; monitoring the event processing system.
  • 52. Provenance and audit: tracking all consequences of an event; tracking the reasons that something happens. Within the event processing system: derivation of events, routing of events, actions triggered by the events.
  • 54. Validation and debugging: debugger; testing and simulation; validation.
  • 57. Testing & simulation – IBM WBE
  • 63. Correctness: the ability of a developer to create a correct implementation for all cases (including the boundaries). Observation: many tools today invest a substantial amount of effort in working around the inability of the language to easily create correct solutions.
  • 64. Some correctness topics: the right interpretation of language constructs; the right order of events; the right classification of events to windows.
  • 65. The right interpretation of language constructs, example: All (E1, E2). What do we mean? A customer both sells and buys the same security, in a value of more than $1M, within a single day. Deal fulfillment: package arrival and payment arrival. [Timeline: 6/3 10:00, 7/3 11:00, 8/3 11:00, 8/3 14:00]
  • 66. Fine tuning of the semantics (I): When should the derived event be emitted? When the pattern is matched, or at the window end?
  • 67. Fine tuning of the semantics (II): How many instances of derived events should be emitted? Only once, or every time there is a match?
  • 68. Fine tuning of the semantics (III): What happens if the same event happens several times? Does only one participate (the first, the last, or the one with the higher/lower value on some predicate), or do all of them participate in a match?
  • 69. Fine tuning of the semantics (IV): Can we consume or reuse events that participate in a match?
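To see how much these policy choices change the result, here is a small sketch of an All (E1, E2) matcher over the events of a single window. The function name, the `cardinality` and `consume` parameters, and the fixed "first" policy for repeated events are all assumptions made for illustration; emission timing (question I) is not modeled, since the returned list could equally be emitted match by match or held until the window end.

```python
def match_all(events, cardinality="every", consume=True):
    """Detect All(E1, E2) over one window.

    events: list of (event_type, event_id) in arrival order; only the types
            "E1" and "E2" are expected (an assumption of this sketch).
    cardinality: "once" stops after the first match, "every" keeps matching.
    consume: if True, both events of a match are removed; if False, they
             stay available and can be reused by later matches.
    Returns a list of (e1_id, e2_id) pairs.
    """
    pending = {"E1": [], "E2": []}
    matches = []
    for etype, eid in events:
        other = "E2" if etype == "E1" else "E1"
        if pending[other]:
            partner = pending[other][0]  # "first" policy for repeated events
            matches.append((eid, partner) if etype == "E1" else (partner, eid))
            if cardinality == "once":
                break
            if consume:
                pending[other].pop(0)    # the partner is consumed...
                continue                 # ...and so is the arriving event
        pending[etype].append(eid)
    return matches


window = [("E1", "a"), ("E1", "b"), ("E2", "x"), ("E2", "y")]
print(match_all(window, "every", consume=True))   # pairs each E1 with a distinct E2
print(match_all(window, "every", consume=False))  # the first E1 is reused
print(match_all(window, "once"))                  # a single derived event
```

The same four input events yield three different sets of derived events, which is why these semantics need to be stated explicitly rather than left to the implementation.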
  • 72. Ordering in a distributed environment, possible issues: Even if the occurrence time of an event is accurate, the event might arrive after some processing has already been done. If we use the occurrence time of an event as reported by the source, it might not be accurate, because of limited clock accuracy at the source. Most systems order events by detection time, but events may switch their order on the way.
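One common way to cope with events that switch order in transit is a reorder buffer: hold events for a bounded delay and release them in occurrence-time order. The sketch below is one possible shape of such a buffer, not a technique the slides name; the `ReorderBuffer` class and the `max_delay` parameter are hypothetical, and source clocks are assumed to be accurate enough to compare.

```python
import heapq


class ReorderBuffer:
    """Releases events in occurrence-time order after a bounded wait."""

    def __init__(self, max_delay):
        self.max_delay = max_delay    # how long we are willing to wait for stragglers
        self._heap = []               # (occurrence_time, sequence, payload)
        self._seq = 0                 # tie-breaker so payloads are never compared
        self._max_seen = None         # largest occurrence time seen so far
        self._released_up_to = None   # occurrence time of the last released event

    def push(self, occurrence_time, payload):
        """Returns (released_events, too_late_events) for this arrival."""
        if self._released_up_to is not None and occurrence_time < self._released_up_to:
            # Processing has already moved past this point in time.
            return [], [(occurrence_time, payload)]
        heapq.heappush(self._heap, (occurrence_time, self._seq, payload))
        self._seq += 1
        if self._max_seen is None or occurrence_time > self._max_seen:
            self._max_seen = occurrence_time
        watermark = self._max_seen - self.max_delay
        released = []
        while self._heap and self._heap[0][0] <= watermark:
            t, _, p = heapq.heappop(self._heap)
            released.append((t, p))
            self._released_up_to = t
        return released, []

    def flush(self):
        """Release everything still buffered, in occurrence-time order."""
        released = []
        while self._heap:
            t, _, p = heapq.heappop(self._heap)
            released.append((t, p))
            self._released_up_to = t
        return released
```

The trade-off is explicit: a larger `max_delay` tolerates more disorder but adds latency to every event, and events later than that bound still have to be handled separately.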
  • 73. Clock accuracy in the source Clock synchronization Time server, example: http://tf.nist.gov/service/its.htm
  • 76. Classification to windows, scenario: calculate statistics for each player (aggregate per quarter); calculate statistics for each team (aggregate per quarter). Window classification: player statistics are calculated at the end of each quarter. Team statistics are calculated at the end of each quarter, based on the player events that arrived within the same quarter. All instances of player statistics that occur within a quarter window must be classified to the same window, even if they are derived after the window termination.
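The rule in this scenario amounts to classifying a derived player statistic by its occurrence time rather than by the time it was derived. A minimal sketch follows, assuming a 12-minute quarter, half-open windows, and dict records with `occurrence_time` and `derived_at` fields; all of these are hypothetical details chosen for the example.

```python
from collections import defaultdict

QUARTER_SECONDS = 12 * 60  # hypothetical quarter length, in seconds of game time


def quarter_of(timestamp):
    """Map a timestamp to a 1-based quarter using half-open windows [start, end)."""
    return int(timestamp // QUARTER_SECONDS) + 1


def team_totals(player_stats, time_field="occurrence_time"):
    """Sum player points per (team, quarter), classifying by the given time field."""
    totals = defaultdict(int)
    for stat in player_stats:
        key = (stat["team"], quarter_of(stat[time_field]))
        totals[key] += stat["points"]
    return dict(totals)


# Player statistics occur within a quarter but are derived just after it ends.
player_stats = [
    {"team": "A", "points": 10, "occurrence_time": 719.0, "derived_at": 721.5},
    {"team": "A", "points": 6, "occurrence_time": 719.5, "derived_at": 723.0},
    {"team": "A", "points": 4, "occurrence_time": 1439.0, "derived_at": 1441.0},
]
print(team_totals(player_stats))                           # classified correctly
print(team_totals(player_stats, time_field="derived_at"))  # shifted one quarter late
```

Classifying by `derived_at` silently moves every statistic into the following quarter, which is the misclassification the slide warns about.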
  • 78. Transactional behavior in event processing? Typically, event processing systems have a decoupled architecture and do not exhibit transactional behavior. However, in several cases event processing is embedded within a transactional environment.
  • 79. CASE I: Transactional ECA at the consumer side. When a derived event is emitted to a consumer, there is an ECA rule with several actions that is required to run as an atomic unit. If it fails, the derived event should be withdrawn.
  • 80. CASE II: An event processing system monitors a transactional system. In this case, the producer may emit events that are not confirmed and may be rolled back.
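As a rough illustration of Case I, the sketch below runs the actions of an ECA rule as one unit by pairing every action with a compensating undo step. This is a compensation-based approximation, not a real transaction manager, and the `run_atomically` helper and the (do, undo) pairing are assumptions made for the example.

```python
def run_atomically(actions):
    """Run (do, undo) pairs as one unit.

    If any action raises, the actions that already completed are undone in
    reverse order and False is returned, signalling the caller that the
    derived event should be withdrawn. Returns True if all actions succeed.
    """
    completed_undos = []
    try:
        for do, undo in actions:
            do()
            completed_undos.append(undo)
    except Exception:
        for undo in reversed(completed_undos):
            undo()
        return False
    return True


# Hypothetical rule with two actions; the second one fails.
ledger = []


def charge_customer():
    raise RuntimeError("payment rejected")


committed = run_atomically([
    (lambda: ledger.append("reserve stock"), lambda: ledger.remove("reserve stock")),
    (charge_customer, lambda: None),
])
if not committed:
    print("rule failed, withdraw the derived event; ledger =", ledger)
```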
  • 82. Case IV: A path in the event processing network should act as a “unit of work”. Example: if “determine winner” fails and the bid is cancelled, none of the bid events are kept in the event stores, and they are withdrawn from any other processing.
  • 84. Security and Privacy Considerations
  • 85. Security, privacy and trust. Security requirements ensure that operations are only performed by authorized parties, and that privacy considerations are met. Characteristics of a secure application: Trustworthiness: containing no malicious logic that causes it to behave in a malicious manner. Survivability: recovering as quickly as possible, with as little damage as possible, from attacks. Dependability: executing predictably and operating correctly under all conditions, including hostile conditions. Based on Enhancing the Development Life Cycle to Produce Secure Software [DHS/DACS 08].
  • 91. Summary: Non-functional properties determine the nature of event processing applications; distribution, availability, optimization, correctness and security are some of the dimensions. They are often the main decision factor in choosing whether to use an event processing system, and in selecting among the various alternatives.

Editor's notes

  1. For example, scalability can refer to the capability of a system to increase total throughput under an increased load when resources (typically hardware) are added, and to be upgraded easily and transparently without shutting the system down.
  2. The actor model is a concurrent computation model that treats "actors" as the universal primitives of concurrent computation: in response to a message that it receives, an actor can make local decisions, create more actors, send more messages, and determine how to respond to the next message received.
  3. The Master/Worker pattern consists of two logical entities: a Master, and one or more instances of a Worker. The Master initiates the computation by creating a set of tasks, puts them in some shared space, and then waits for the tasks to be picked up and completed by the Workers.
     A Shared Nothing (SN) system typically partitions its data among many nodes on different databases (assigning different computers to deal with different users or queries), or may require every node to maintain its own copy of the application's data, using some kind of coordination protocol. This is often referred to as data sharding. One approach to achieving an SN architecture for stateful applications (which typically maintain state in a centralized database) is the use of a data grid, also known as distributed caching.
     Space-Based Architecture (SBA) is a software architecture pattern for achieving linear scalability of stateful, high-performance applications using the tuple space paradigm. Applications are built out of a set of self-sufficient units, known as processing units (PUs). These units are independent of each other, so the application can scale by adding more units. Services are packaged into PUs based on their runtime dependencies, to reduce network chattiness and the number of moving parts. 1. Scaling out by spreading the application bundles across the set of available machines. 2. Scaling up by running multiple threads in each bundle.
     MapReduce is a framework for processing huge datasets of certain kinds of distributable problems using a large number of nodes in a cluster. Computational processing can occur on data stored either in a filesystem (unstructured) or within a database (structured). "Map" step: the master node takes the input, partitions it into smaller sub-problems, and distributes those to worker nodes. A worker node may do this again in turn, leading to a multi-level tree structure. The worker node processes that smaller problem and passes the answer back to its master node. "Reduce" step: the master node then takes the answers to all the sub-problems and combines them in some way to get the output: the answer to the problem it was originally trying to solve.
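The "Map" and "Reduce" steps described in this note can be mimicked in a single process. The sketch below counts events per type; the function names, the dict-shaped events, and the round-robin split are assumptions for illustration, and a real framework would run each chunk on a separate worker node.

```python
from collections import Counter


def map_step(events, num_workers):
    """Master partitions the input; each 'worker' counts event types in its chunk."""
    chunks = [events[i::num_workers] for i in range(num_workers)]
    return [Counter(e["type"] for e in chunk) for chunk in chunks]


def reduce_step(partial_counts):
    """Master combines the workers' partial answers into the final result."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return dict(total)


events = [{"type": "login"}, {"type": "bid"}, {"type": "login"}, {"type": "login"}]
print(reduce_step(map_step(events, num_workers=2)))
```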
  4. Increased management complexity: the system has to deal with partial failure and consistency. Issues such as throughput and latency between nodes: network traffic costs, serialization/deserialization.
  5. Scaling out: spreading application modules (services with runtime dependencies) across a set of available machines; load partitioning and load balancing between the application modules; using a distributed cache for stateful applications.
  6. Care should be taken when referring to a large event volume (the MAX input throughput metric). Some systems might filter out a large percentage of events before they hit the “heavy” processing layer. The complexity of computation should be taken into account.
  7. Growth in the number of context partitions leads to growth in the overall internal state of the system.
  8. Redundancy. Failover: automatic reconfiguration of the system, ensuring continuation of service after failure of one or more of its components. Load balancing is one of the players in implementing failover. Components are monitored continuously (“heartbeat monitoring”). When one fails, the load balancer no longer sends traffic to it and instead sends it to another component. When the initial component comes back online, the load balancer begins to route traffic back to it.
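A minimal sketch of heartbeat-driven failover, assuming nodes report heartbeats with an explicit timestamp and the balancer rotates over nodes whose last heartbeat is recent enough. The `HeartbeatBalancer` class and its `timeout` parameter are hypothetical, not taken from any particular load balancer.

```python
class HeartbeatBalancer:
    """Routes traffic round-robin over nodes with a recent heartbeat."""

    def __init__(self, nodes, timeout):
        self.nodes = list(nodes)
        self.timeout = timeout  # maximum heartbeat age for a node to count as healthy
        self.last_beat = {}     # node -> time of its last heartbeat
        self._next = 0          # round-robin counter

    def heartbeat(self, node, now):
        self.last_beat[node] = now

    def healthy(self, now):
        return [n for n in self.nodes
                if n in self.last_beat and now - self.last_beat[n] <= self.timeout]

    def route(self, now):
        """Pick the next healthy node; failed nodes are skipped automatically."""
        candidates = self.healthy(now)
        if not candidates:
            raise RuntimeError("no healthy node available")
        node = candidates[self._next % len(candidates)]
        self._next += 1
        return node
```

A node that resumes sending heartbeats re-enters the rotation on the next call to `route`, which corresponds to the load balancer routing traffic back once the component is online again.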
  9. After detection of a failure, and possibly reconfiguration or resolution of the fault, the effects of errors must be eliminated. Normally the system operation is backed up to some point in its processing that preceded the fault detection, and operation recommences from this point. This form of recovery, often called rollback, usually entails strategies using backup files, checkpointing, and journaling. With an in-memory DB, the implementation of the persistence layer is more complex: we need to decide how to sync with the DB (write-through? periodically?) and how and when to load data on cache misses. Many commercial solutions now exist for in-memory DBs with caching.
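Checkpointing and journaling can be sketched as follows. In this toy version the checkpoint and the journal are plain Python objects standing in for durable storage, and the `RecoverableState` class and its methods are assumptions made for illustration.

```python
class RecoverableState:
    """In-memory counters protected by a checkpoint plus a journal."""

    def __init__(self):
        self.state = {}        # volatile in-memory state
        self._checkpoint = {}  # stand-in for a durable snapshot
        self._journal = []     # stand-in for a durable log of updates since the snapshot

    def apply(self, key, amount):
        self._journal.append((key, amount))  # journal first, then update memory
        self.state[key] = self.state.get(key, 0) + amount

    def checkpoint(self):
        """Snapshot the state; the journal can then be truncated."""
        self._checkpoint = dict(self.state)
        self._journal.clear()

    def crash(self):
        """Simulate a failure that loses the in-memory state."""
        self.state = {}

    def recover(self):
        """Roll back to the checkpoint, then replay the journal."""
        self.state = dict(self._checkpoint)
        for key, amount in self._journal:
            self.state[key] = self.state.get(key, 0) + amount


s = RecoverableState()
s.apply("bids", 2)
s.checkpoint()
s.apply("bids", 3)
s.crash()
s.recover()
print(s.state)
```

How often to checkpoint is the same trade-off the note raises for syncing with the DB: frequent snapshots shorten recovery but cost more during normal operation.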