Some Open Problems in
Publish/Subscribe Networking

     David S. Rosenblum
       Chief Technology Officer
       PreCache Inc.
Acknowledgments
   Alexander L. Wolf
       University of Colorado at Boulder


   Antonio Carzaniga
       University of Colorado at Boulder


   PreCache Engineering Team
Background
Information-Centric
Internet Applications
   Software and Antivirus Updates
   Consumer Alerts
   Location-Based Services for Mobile Wireless
   Multiplayer Online Games
   Web Search Engines
   e-Business (e.g., Supply Chain Mgmt)
   Distributed Sensor Networks

        Publish/subscribe is a natural fit!
Publish/Subscribe Networking
   Publish/subscribe is traditionally
    implemented by centralized servers
   But server-based realizations do not scale
    to Internet-wide applications
   So existing networks require “faking it”
       Request/response interaction
       Continual subscriber polling
       Enormous server farms
       Dumb caching
   And so we must realize publish/subscribe
    via a distributed network of routers
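SIENA Content-Based Routing
       Subscription Forwarding
   [Figure: routers 1-9, with client a attached to router 1. Client a issues
    subscription s1. Labels shown for s1: s1:a at router 1; s1:1 at routers
    2 and 4; s1:2 at routers 3 and 5; s1:3 at routers 6 and 7; s1:5 at
    router 8; s1:6 at router 9.]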
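The forwarding step can be sketched in a few lines of Python. This is an illustrative sketch, not SIENA's implementation: the `LINKS` adjacency is an assumption inferred from the labels on the slide, and `forward_subscription` is a hypothetical helper that floods a subscription and records, at each router, the neighbor it arrived from.

```python
from collections import deque

# Router adjacency inferred from the slide's labels (an assumption).
LINKS = {
    1: [2, 4], 2: [1, 3, 5], 3: [2, 6, 7], 4: [1],
    5: [2, 8], 6: [3, 9], 7: [3], 8: [5], 9: [6],
}

def forward_subscription(sub, client, edge_router, links=LINKS):
    """Flood `sub` from `edge_router`; each router records the neighbor
    (or client) the subscription arrived from, i.e. the reverse path."""
    tables = {router: {} for router in links}
    tables[edge_router][sub] = client
    queue = deque([edge_router])
    while queue:
        router = queue.popleft()
        for neighbor in links[router]:
            if sub not in tables[neighbor]:
                tables[neighbor][sub] = router
                queue.append(neighbor)
    return tables

tables = forward_subscription("s1", "a", 1)
for router in sorted(tables):
    print(router, tables[router])
```

Under the assumed topology, the recorded entries correspond to the `s1:x` labels on the slide (for example, router 1 records `a` and router 9 records `6`).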
SIENA Content-Based Routing
       Subscription Merging
   [Figure: same network, with client b attached to router 8. Client b issues
    subscription s2, and s1 covers s2. Labels shown for s2: s2:b at router 8;
    s2:8 at router 5; s2:5 at router 2; s2:2 at router 1. The s1 labels are
    unchanged from the Subscription Forwarding figure.]
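The covering relation behind merging can be illustrated as follows. This is a minimal sketch under a simplifying assumption that is not in the slides: a subscription is a conjunction of closed numeric intervals, one per attribute. `covers` and `forward_needed` are hypothetical names.

```python
import math

def covers(s1, s2):
    """True if every notification matching s2 also matches s1.
    A subscription maps attribute -> (low, high) closed interval."""
    for attr, (lo1, hi1) in s1.items():
        if attr not in s2:
            return False          # s2 leaves attr unconstrained, s1 does not
        lo2, hi2 = s2[attr]
        if lo2 < lo1 or hi2 > hi1:
            return False          # s2 admits values that s1 rejects
    return True

def forward_needed(new_sub, already_forwarded):
    """A router can skip forwarding new_sub on a link where a covering
    subscription has already been forwarded."""
    return not any(covers(old, new_sub) for old in already_forwarded)

s1 = {"price": (0, math.inf)}
s2 = {"price": (10, 50), "volume": (1000, math.inf)}

print(covers(s1, s2))            # s1 is the more general subscription
print(forward_needed(s2, [s1]))  # s2 adds nothing where s1 was sent
```

The check is conservative in one direction only: when `covers` returns `False`, the router simply forwards the new subscription, which costs bandwidth but never loses notifications.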
SIENA Content-Based Routing
       Notification Delivery
   [Figure: same network and labels as the Subscription Merging figure.
    Notification n1 (drawn near router 7) matches both s1 and s2.]
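Delivery then follows the recorded entries in reverse. The sketch below hard-codes the table labels visible on the slide and assumes, for illustration only, that n1 enters the network at router 7; `deliver` is a hypothetical helper, not SIENA's API.

```python
# Forwarding tables transcribed from the slide's labels
# (router -> {subscription: neighbor or client it arrived from}).
TABLES = {
    1: {"s1": "a", "s2": 2},
    2: {"s1": 1, "s2": 5},
    3: {"s1": 2},
    4: {"s1": 1},
    5: {"s1": 2, "s2": 8},
    6: {"s1": 3},
    7: {"s1": 3},
    8: {"s1": 5, "s2": "b"},
    9: {"s1": 6},
}

def deliver(matching_subs, start, tables=TABLES):
    """Forward a notification toward every matching subscription.
    Returns (clients reached, router-to-router hops taken)."""
    delivered, hops = set(), []
    stack = [(start, None)]
    while stack:
        router, came_from = stack.pop()
        next_hops = {hop for sub, hop in tables[router].items()
                     if sub in matching_subs and hop != came_from}
        for hop in sorted(next_hops, key=str):
            if isinstance(hop, str):      # a letter names a client
                delivered.add(hop)
            else:
                hops.append((router, hop))
                stack.append((hop, router))
    return delivered, hops

# n1 matches both s1 and s2; router 7 as entry point is an assumption.
delivered, hops = deliver({"s1", "s2"}, start=7)
print(sorted(delivered), hops)
```

Routers 4, 6 and 9 never see n1 in this run, which is the point of content-based routing: the notification travels only along paths that lead to a matching subscriber.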
PreCache
       NETINJECTOR Architecture
   [Figure: a Publisher and a Subscriber connected across the Internet.
    Legend: Event Agent, Routing Engine, Channel Manager.]
PreCache NETINJECTOR
   Routing and forwarding based on SIENA
       Generalize idea of subscription merging
           Compute single subscription covering all received
            subscriptions
       Employ approximate matching
           Constant time and space complexity
           Log time and space with additional leakage reduction
   Channel services
       Namespace management
       Resource allocation
       Load balancing, fault tolerance, authentication
Open Problems
Comments on the Problems
   Problems identified based on
    experience with NETINJECTOR and SIENA
   Many of the problems arise because of
    a desire for scalability
   Some problems are deeply technical
   Other problems are simply pragmatic
Problem
       Wireless Mobile Devices (WMDs?)
   [Figure: the Subscription Forwarding network for s1, with client a drawn
    twice: at router 1, where s1:a is recorded, and again at router 8.]
Problem
Issues with Wireless Mobile Devices

   Caching notifications in the network
   Stream reconstruction and duplicate
    suppression
   Frequency of movement versus
    overhead of reconfiguration
   Gateways for email, SMS, etc.
Problem
Security

   Traditional security properties are address-based
       Example: Authentication
            Bob wants to make sure Alice sent the message
       Content-based analog
            Bob wants to make sure a message represents reality
   Pub/sub admits new kinds of vulnerabilities
       Example: Denial of Service
            Highly generic subscription (“Price > 0”) causes flood of
             notifications to subscriber
       How do you distinguish a malicious subscriber from a
        greedy subscriber?
   How do you do content-based routing when the
    content is encrypted???
Problem
Client Connections and Firewalls

   Want constant connection between
    subscriber and edge router
       Otherwise subscriber polls for notifications
       Connection limits may require multiplexing
   Client must initiate connection to edge
    router in order to breach firewall
   And if port 80 is the only open port …
       Need HTTP encapsulation of messages
       May need HTML formatting of messages
       Routers need to multiplex and/or demultiplex
        message traffic
Problem
Approximate Matching

   Rationale: High-performance routing
       Expect approximate matching to have better
        time/space complexity than exact matching
   Approximation must be conservative
       False positives OK, false negatives not
       Must still perform exact match at some point
        before delivery to subscriber
   Leakage may increase traffic
       Tradeoff in computational resources
   We need simulation tools to explore this!
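One way to picture a conservative approximation is a single bounding interval standing in for a whole set of subscriptions. The sketch below is an illustration under that assumption and is not NETINJECTOR's algorithm; all function names are hypothetical. It shows constant-time matching in the core, leakage as false positives, and the exact match that must still happen before delivery.

```python
def bounding_subscription(subs):
    """Smallest single interval covering every (low, high) subscription."""
    return (min(lo for lo, _ in subs), max(hi for _, hi in subs))

def approx_match(value, bound):
    """Constant-time test used inside the network."""
    lo, hi = bound
    return lo <= value <= hi

def exact_match(value, subs):
    """Exact test, performed before delivery to the subscriber."""
    return any(lo <= value <= hi for lo, hi in subs)

subs = [(10, 20), (40, 50)]
bound = bounding_subscription(subs)

values = range(0, 61)
leaked = [v for v in values if approx_match(v, bound) and not exact_match(v, subs)]
missed = [v for v in values if exact_match(v, subs) and not approx_match(v, bound)]

print(bound, len(leaked), missed)
```

Here `missed` stays empty (no false negatives), while `leaked` holds the values in the gap between the two intervals: traffic that is carried through the network and discarded only at the exact-match step.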
Problem
Optimizing for Traffic Variations

   Can routers dynamically optimize for traffic
    variations?
   Example: The Britney Spears Effect
       All subscribers want certain notifications N1
       Few subscribers want other notifications N2
       N1 notifications may flood network
   Example: The Google Effect
       Certain subscribers S1 want all notifications
       Other subscribers S2 want few notifications
       S1 subscriptions may dominate routing
   We need simulation tools to explore this!
Problem
Service-Provider Deployment

   Difficult to convince network service
    providers to enhance their networks with
    publish/subscribe
       Application demand not yet critical
       Lack of standards
   Economic barriers govern router design
       Example: 100M users, $10K/router
           1000 users/router: 100K routers, $1B outlay
           100 routers: $1M outlay, 1M users/router
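The two design points above are simple arithmetic, worked out here in Python. The user count and router price come from the slide; the helper name `routers_needed` is illustrative.

```python
def routers_needed(users, users_per_router):
    """Ceiling division: routers required to serve all users."""
    return -(-users // users_per_router)

USERS = 100_000_000
COST_PER_ROUTER = 10_000          # dollars per router, from the slide

# Design point 1: small routers, 1000 users each.
many = routers_needed(USERS, 1_000)
many_outlay = many * COST_PER_ROUTER

# Design point 2: only 100 routers, each carrying a large share of users.
few = 100
few_outlay = few * COST_PER_ROUTER
few_load = USERS // few

print(many, many_outlay)          # routers and dollars for design point 1
print(few_outlay, few_load)       # dollars and users/router for design point 2
```

The thousandfold difference in outlay is bought with a thousandfold increase in per-router load, which is why the economics end up governing router design.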
Problem
Peer-to-Peer Deployment

   Reasonable alternative to service-provider
    deployment
       “Grass roots” generation of demand
   Challenges
       Dynamically aligning peer topology to
        underlying network topology
       Dynamically partitioning routing responsibilities
        across peers
       Ensuring reliability, privacy and/or integrity of
        messages
Problem
Unicast Fanout at Edge Routers

   Example: 100M users on 1K routers
       100K users per router
       10Kbyte notification
           >80 milliseconds over OC-192
           >80 seconds over 10Mb Ethernet
           >4 hours over 56K modem
   Idea: Use publish/subscribe for “leveling”
       Partition users into classes
           Example: Last digit of serial number
       Publish once per class
       Tune publication rate to available bandwidth and SLA
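The fanout times and the leveling idea can be checked with a short calculation. This is a sketch under stated assumptions: the OC-192 line rate is taken as 9.95328 Gbit/s, and `user_class` and `class_schedule` are hypothetical helpers showing one reading of "publish once per class". The listed times are reproduced when each notification is about 1 Kbyte (1024 bytes); a 10 Kbyte payload gives times ten times as long.

```python
LINK_BPS = {
    "OC-192": 9.95328e9,        # assumed line rate in bits per second
    "10Mb Ethernet": 10e6,
    "56K modem": 56e3,
}

def fanout_seconds(users, notification_bytes, link_bps):
    """Time to unicast one notification to every user over one link."""
    return users * notification_bytes * 8 / link_bps

def user_class(serial_number, classes=10):
    """Partition users by the last digit of their serial number."""
    return serial_number % classes

def class_schedule(users_per_class, notification_bytes, link_bps, classes=10):
    """Start each class's publication after the previous class's fanout
    has had time to drain (one possible way to tune the rate)."""
    slot = fanout_seconds(users_per_class, notification_bytes, link_bps)
    return [(c, c * slot) for c in range(classes)]

USERS_PER_ROUTER = 100_000
for name, bps in LINK_BPS.items():
    print(name, fanout_seconds(USERS_PER_ROUTER, 1024, bps), "seconds")

schedule = class_schedule(USERS_PER_ROUTER // 10, 1024, LINK_BPS["10Mb Ethernet"])
print(user_class(1234567), schedule[:3])
```

Leveling does not reduce the total bytes sent; it spreads them over time so that each burst fits the available bandwidth and the service-level agreement.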
Conclusion
   Many Internet applications naturally
    require publish/subscribe messaging
   Scalability can be achieved through
    publish/subscribe networking
   SIENA, PreCache, and others have
    established many fundamental results
   But many open problems remain to be
    solved
Some Open Problems in Publish/Subscribe Networking (keynote talk at DEBS 2003)
