http://chinesemilitaryreview.blogspot.com/2012/01/plaafs-j-10a-refueling-from-h-6u-badger.html

1
• Motivation
• Data flow update types
• Quantitative/Qualitative metrics
• Update strategies
• Evaluation
• Conclusion

http://www.gilaberttax.com/2013/04/16/targetedpartnership-tax-allocations/

2
Gartner, “Big Data,”
http://www.gartner.com/itglossary/big-data/

Twitter Storm

3
http://smartgrid.usc.edu
“The Smart Grid Explained, The Hype and The Promise”, WESCO, FMEA/FMPA Conference 2009

4
• Mission-critical data flows cannot suffer downtime
– How can continuous dataflow applications be updated with minimal disruption?

• Evaluating dynamic updates
– Performance impact
• Throughput, latency

– Consistency
• Data loss
• Reproducibility

5

http://www.ipandora.net/2009/08/09/pray-steadfastly/
• Formalize the different types of data flow update needs.
• Identify qualitative and quantitative metrics to be considered when designing update strategies.
• Introduce five data flow update strategies and analytically characterize their performance metrics.
• Implement a consistent, low-latency update strategy in the Floe continuous dataflow engine and evaluate it against a simple update strategy for a motivating application from the Los Angeles power grid project.

http://www.flickr.com/photos/dhammakaya/7095451689/

6
• A continuous data flow τ(Ƿ,С) is a directed graph
– Ƿ: set of processors
– С: set of directed edges (channels) connecting processors
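
As a rough illustration of the definition above, a data flow can be held as a set of processors plus a set of directed channels. This is a minimal sketch, not Floe's data model: the class names, the `version` field, and the `downstream` helper are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Processor:
    name: str         # processor identifier, e.g. "P2"
    version: int = 1  # hypothetical counter, bumped when the logic changes


@dataclass
class DataFlow:
    processors: dict = field(default_factory=dict)  # name -> Processor
    channels: set = field(default_factory=set)      # directed (source, sink) pairs

    def add_processor(self, name, version=1):
        self.processors[name] = Processor(name, version)

    def add_channel(self, source, sink):
        # A channel may only connect processors that are already in the graph.
        if source not in self.processors or sink not in self.processors:
            raise KeyError("both endpoints must be processors in the data flow")
        self.channels.add((source, sink))

    def downstream(self, name):
        # Processors that receive messages directly from `name`.
        return sorted(sink for source, sink in self.channels if source == name)


flow = DataFlow()
for name in ["P1", "P2", "P3"]:
    flow.add_processor(name)
flow.add_channel("P1", "P2")
flow.add_channel("P1", "P3")
```

Keeping channels as plain `(source, sink)` pairs makes it easy to compare the channel sets of two data flow versions later on.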

7
• Processor update
• Channel update
• Independent sub graph update
• Connected sub graph update

8
[Figure: data flow graph of processors P1 to P6, with P2 replaced by P2++]

• Updates to one or more processors
• | Ƿ | remains constant
• С remains constant

9
[Figure: data flow graph of processors P1 to P6]

• Change in number of channels or connectivity
• No changes to processors

10
[Figure: data flow graph of processors P1 to P6, with P2 replaced by P2++]

• Updates to one or more processors and channels
– No change in number of processors
– Channel connectivity change, Channel addition/removal
11
[Figure: data flow graph of processors P1 to P6, with P2++ labels marking the replacement sub-graph]

• A connected sub-graph in the data flow is replaced by another connected sub-graph
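
The update types on the preceding slides can be told apart by comparing the old and new graphs. The sketch below is one possible reading, under stated assumptions: processors are given as a hypothetical `name -> version` mapping, channels as a set of `(source, sink)` pairs, and any change to the set of processor names is treated as a sub-graph replacement without checking that the replaced part is connected.

```python
def classify_update(old_procs, old_chans, new_procs, new_chans):
    """Classify an update from the old and new processor and channel sets."""
    if set(old_procs) != set(new_procs):
        # Processors were added or removed: treated here as a sub-graph replacement.
        return "sub-graph update"
    procs_changed = any(old_procs[name] != new_procs[name] for name in old_procs)
    chans_changed = set(old_chans) != set(new_chans)
    if procs_changed and chans_changed:
        return "processor and channel update"
    if procs_changed:
        return "processor update"
    if chans_changed:
        return "channel update"
    return "no update"


old_procs = {"P1": 1, "P2": 1, "P4": 1}
old_chans = {("P1", "P2"), ("P2", "P4")}

# P2 becomes P2++ while the channels stay the same.
kind_processor = classify_update(old_procs, old_chans,
                                 {"P1": 1, "P2": 2, "P4": 1}, old_chans)

# Same processors, one extra channel.
kind_channel = classify_update(old_procs, old_chans,
                               old_procs, old_chans | {("P1", "P4")})
```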
12
• Quantitative
– Refresh latency
– Lag latency
– Throughput
– Message loss

• Qualitative
– Consistency
– Interleaved vs Delineated
http://theculturevulture.co.uk/blog/reviews/what-happened-at-whats-next/

13
• Refresh latency
– Time between update start and the first message from the new workflow component

• Lag latency
– Time between update start and the time at which the last message from the old workflow is emitted

• Throughput
– Drop in message throughput at update time

• Message loss
– Are messages lost? If so, how many?
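
To make the quantitative metrics concrete, the sketch below derives refresh latency, lag latency, and message loss from a log of output emissions. The log format, the `"old"`/`"new"` version labels, and the function name are assumptions made for illustration; throughput is left out because it needs a time window that the slide does not specify.

```python
def update_metrics(update_start, emissions, messages_sent):
    """emissions: list of (emit_time, version) pairs, version is "old" or "new"."""
    old_times = [t for t, version in emissions if version == "old"]
    new_times = [t for t, version in emissions if version == "new"]

    # Refresh latency: update start until the first message from the new workflow.
    refresh = min(new_times) - update_start if new_times else None
    # Lag latency: update start until the last message from the old workflow;
    # zero when the old workflow emits nothing after the update starts.
    lag = max(0.0, max(old_times) - update_start) if old_times else 0.0
    # Message loss: messages that entered the data flow but were never emitted.
    loss = messages_sent - len(emissions)
    return {"refresh_latency": refresh, "lag_latency": lag, "message_loss": loss}


log = [(9.0, "old"), (11.0, "old"), (12.5, "new"), (13.0, "new")]
metrics = update_metrics(update_start=10.0, emissions=log, messages_sent=5)
```

In this made-up log the update starts at time 10.0, the old workflow emits its last message one time unit later, and one of the five messages sent never comes out.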

14
• Consistency
– Is each message processed entirely by a single version of the data flow?

• Interleaved & Delineated
• Let t_f be the time at which the first message is processed and emitted from τ_{s+1}, and t_l be the time at which the last message is processed and emitted from τ_s
• Delineated if t_f > t_l
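
Both qualitative properties can be checked mechanically once each output message records which data flow version handled it. The record layout below (`emitted_at`, `emitted_by`, `versions_seen`) is a hypothetical format chosen for this sketch, not something defined on the slides.

```python
def is_consistent(message):
    # Consistent: every processor on the message's path belonged to one version.
    return len(set(message["versions_seen"])) == 1


def is_delineated(messages):
    # Delineated: the first "new" emission comes strictly after the last "old" one.
    old_times = [m["emitted_at"] for m in messages if m["emitted_by"] == "old"]
    new_times = [m["emitted_at"] for m in messages if m["emitted_by"] == "new"]
    if not old_times or not new_times:
        return True  # only one version emitted, so there is nothing to interleave
    return min(new_times) > max(old_times)


interleaved_run = [
    {"emitted_at": 1.0, "emitted_by": "old", "versions_seen": ["old", "old"]},
    {"emitted_at": 2.0, "emitted_by": "new", "versions_seen": ["new", "new"]},
    {"emitted_at": 3.0, "emitted_by": "old", "versions_seen": ["old", "old"]},
]
delineated_run = [
    {"emitted_at": 1.0, "emitted_by": "old", "versions_seen": ["old", "old"]},
    {"emitted_at": 3.0, "emitted_by": "new", "versions_seen": ["new", "new"]},
]
mixed_message = {"emitted_at": 2.0, "emitted_by": "new", "versions_seen": ["old", "new"]}
```

The first run is consistent but interleaved, which is the combination the versioned strategies later in the deck are described as providing.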

15
• Naïve Consistent Lossy update (NCL)
• Naïve Consistent High latency update (NCH)
• High-Throughput Inconsistent Update (HTI)
• Message Versioned Consistent Update (MVC)
• Path Versioned Consistent Update (PVC)

https://www.ubat.com/blog/do-i-really-need-a-business-plan/
16
[Figure: data flow graph of processors P1 to P6, with the input marked "Pause"]

• Pause input stream, terminate data flow, deploy new data flow, resume workflow
– Consistent
– Delineated
– Lag latency = 0
– Refresh latency = Deployment time + Min(wave head time)
– Throughput = 0, starting at update start time, for a duration of the refresh latency

17
[Figure: data flow graph of processors P1 to P6, marked "Flush" and "Pause"]

• Pause input stream, flush in-flight messages (TTLold), terminate data flow, deploy new data flow, resume workflow
– Consistent
– Delineated
– Refresh latency = DT + TTLold + Min(wave head time)
– Lag latency = TTLold
– No message loss
– Throughput goes to 0
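
The latency expressions for the two pause-based strategies (which appear to correspond to NCL and NCH in the list of strategies) can be evaluated side by side. This is a sketch under assumptions: `wave_head_times` is taken to be a list with one "wave head time" per path, DT is read as deployment time, and the function names and example numbers are invented for illustration.

```python
def terminate_and_redeploy(deployment_time, wave_head_times):
    """Pause, terminate immediately, deploy, resume (in-flight messages are lost)."""
    refresh = deployment_time + min(wave_head_times)
    return {
        "lag_latency": 0.0,               # the old data flow stops emitting at once
        "refresh_latency": refresh,
        "zero_throughput_window": refresh,  # throughput is 0 for this long
        "loses_in_flight_messages": True,
    }


def flush_and_redeploy(deployment_time, ttl_old, wave_head_times):
    """Pause, flush in-flight messages for ttl_old, terminate, deploy, resume."""
    return {
        "lag_latency": ttl_old,           # old messages keep draining for ttl_old
        "refresh_latency": deployment_time + ttl_old + min(wave_head_times),
        "loses_in_flight_messages": False,
    }


lossy = terminate_and_redeploy(deployment_time=5.0, wave_head_times=[0.5, 0.75])
flushed = flush_and_redeploy(deployment_time=5.0, ttl_old=2.0,
                             wave_head_times=[0.5, 0.75])
```

With these made-up inputs, flushing costs exactly `ttl_old` of extra refresh latency in exchange for not dropping in-flight messages.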
18
[Figure: data flow graph of processors P1 to P6]

• Perform in-place updates upon request
– Inconsistent
– Interleaving messages
– Low latencies (bounds are derived per update type)

19
[Figure: data flow graph of processors P1 to P6, labelled "Update current version", with P4 shown twice]

• Tags messages at the sources
• Message versions are used to find the correct processor/channel/sub-graph
– Consistent
– Interleaved
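
A minimal sketch of message versioning, assuming a single source and a per-processor table of implementations keyed by data flow version. The classes, the dictionary-based message, and the rule "use the newest implementation not newer than the message" are illustrative assumptions, not Floe's implementation.

```python
class Source:
    """Tags every outgoing message with the current data flow version."""

    def __init__(self):
        self.current_version = 1

    def emit(self, payload):
        return {"version": self.current_version, "payload": payload}


class VersionedProcessor:
    """Keeps one implementation per data flow version that changed it."""

    def __init__(self, name, impl, version=1):
        self.name = name
        self.impls = {version: impl}

    def register(self, version, impl):
        self.impls[version] = impl

    def process(self, message):
        # Use the newest implementation that is not newer than the message,
        # so a message tagged with the old version never sees new logic.
        usable = [v for v in self.impls if v <= message["version"]]
        return self.impls[max(usable)](message["payload"])


source = Source()
parse = VersionedProcessor("Parse", lambda text: text.upper())

in_flight = source.emit("a")  # tagged with version 1 before the update

# Update "Parse" to "Parse++" and bump the version applied at the source.
parse.register(2, lambda text: text.upper() + "++")
source.current_version = 2
fresh = source.emit("b")      # tagged with version 2

old_result = parse.process(in_flight)
new_result = parse.process(fresh)
```

The message emitted before the update is still handled by the old logic even though it reaches the processor after the update, which is why old and new outputs can interleave while each message stays consistent.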
20
• Extension of MVC
• Each message is tagged with the path it has taken so far
• A message is dispatched to the new version if it has been processed through the new version, or through components present in both the new and old versions of the workflow.
– Consistent
– Interleaved
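
One way to read the dispatch rule above is as a function of the path recorded on the message. In this sketch the path is a list of hypothetical `(processor_name, version_used)` pairs and `changed` is the set of processors that differ between the old and new workflows; both representations are assumptions for the example.

```python
def choose_version(path, changed):
    """Decide which workflow version should handle the message next."""
    touched = [(name, version) for name, version in path if name in changed]
    if not touched:
        # Only components shared by both versions so far: safe to go to the new one.
        return "new"
    if all(version == "new" for _, version in touched):
        # Already processed by the new version: stay on it.
        return "new"
    # Processed by an old-only component: finish on the old version.
    return "old"


changed = {"P2", "P4"}

shared_only = [("P1", "old")]
through_old = [("P1", "old"), ("P2", "old")]
through_new = [("P1", "old"), ("P2", "new")]

decisions = [choose_version(p, changed) for p in (shared_only, through_old, through_new)]
```

Compared with tagging at the source only, this lets a message that has only passed through unchanged components move to the new version immediately.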

21
• Implemented MVC in the Floe [1] continuous data flow engine.
• Compared MVC against the Naïve Consistent Lossy update.
• Used the message context as a carrier of the data flow version.

Floe Message: Key, Properties<K,V>, Payload

[1] https://github.com/usc-cloud/floe
22
• Update processor “Parse” to “Parse++”

23
24
• Online updates to mission-critical continuous data flows are an important problem space.
• Formalized and analyzed
– Update models
– Evaluation metrics
– Update strategies and their trade-offs

• Empirically evaluated the MVC and NCL update strategies.

25
http://thesciencepresenter.wordpress.com/category/behaviour-management/
26

Escience2013-Continuous Data Flow Update Strategies for Mission-Critical Applications


Editor's notes

  1. The story behind this paper is somewhat interesting. It started last winter break, when I began implementing a dynamic update for our in-house continuous data flow engine. After the winter break, I met with Yogesh and one of my colleagues in the lab who maintained Floe at that time, and we had a disagreement regarding the consistency provided by my implementation. In this discussion we realized that there can be different dynamic update types for distributed continuous data flows, with different trade-offs. This paper is the result of looking into this problem in detail.
  2. Motivation behind the need for dynamic updates to continuous data flows. Different possible types of updates to data flows. Identify metrics to evaluate dynamic data flow updates. Introduce a set of update strategies which offer different performance/quality trade-offs for different update types. Present empirical evaluation results for some update strategies. Finish the presentation with a conclusion.
  3. Big data: high-volume, high-variety, high-velocity data assets. Initial efforts focused on batch processing systems. Cyber-physical systems, sensor networks, and social network streams need data stream processing systems. This is where continuous data flows come into the picture.
  4. Power grids are transforming into smart grids. Smart meters allow electricity consumption events to be transferred in near real time. Intelligent management: demand response optimization to predict and forecast power grid demand and allow corrective measures to be taken if demand > supply. USC acts as a micro-grid test bed to evaluate forecasting models and curtailment techniques. Process data streams from over 100 buildings / 50k sensors to measure power usage, equipment status, ambient temperature, etc. Continuous data flows: read data, parse data, extract and validate readings, annotate data, and insert it into the RDF storage used by the smart grid web portal; parsed data is also directed to the analytics model, which does energy forecasting. Those trigger actions.
  5. Those workflows can't suffer downtime. Need to update: to improve or fix issues, e.g. parsing/annotating logic needs to change when sensor streams change or get updated. User needs: no message loss, ordering of new/old messages, reproducibility of data. An update should not affect the reproducibility property. Some applications might accept delay, such as the web portal, while facility management might not. Shutting down the data flow and restarting versus on-the-fly updates.
  6. I'll not bore you by going through all the formalism and theoretical bounds. Rather, I'll go through examples and give an intuition, which will be useful when you read the paper.
  7. A continuous data flow is a directed graph with processor nodes and channel edges. Processors do the data processing, while channels connect processors and carry data between them.
  8. We define four types of updates that can be made to a continuous data flow.
  9. A processor update is defined as an update to one or more processors without changing the number of processors, the channels, or the channel connectivity.
  10. A channel update changes either the number of channels in the data flow or their connectivity.
  11. Combination of the previous update types. Update one or more processors and channels. No change in the number of processors. No change in
  12. In a connected sub-graph update, a connected sub-graph in the data flow is replaced by another connected sub-graph.
  13. We identify and formalize different quantitative and qualitative metrics that can be used to evaluate dynamic data flow updates. We define refresh latency (RL), lag latency (LL), throughput (T), and message loss (ML) as quantitative metrics, and consistency and interleaved vs delineated as qualitative metrics.
  14. Refresh latency: how fast do we see the effect of the update? Lag latency: how long does it take to flush messages from the old workflow? Throughput: is there an impact on throughput? Message loss: Does this update strategy cause
  15. Consistency: is the data reproducible? Delineated: can we draw a line between new and old messages such that no old messages are emitted from the system after we see the first new message?
  16. We introduce 5 different data flow update strategies that can be implemented with different trade-offs. And we analyze the evaluation metrics against
  17. Pause the input stream, terminate the data flow immediately, deploy the new data flow, and then resume the workflow. Consistent: we terminate the old data flow immediately upon request, and messages being processed in the data flow are lost. So all messages are processed either by the old version of the data flow or by the new version. Delineated: since we terminate the data flow along with its in-flight messages, no old messages are emitted from the data flow after the new data flow is deployed. Lag latency is zero since we terminate the data flow immediately after the update request. Refresh latency