Some Open Problems in Publish/Subscribe Networking (keynote talk at DEBS 2003)
1. Some Open Problems in
Publish/Subscribe Networking
David S. Rosenblum
Chief Technology Officer
PreCache Inc.
2. Acknowledgments
Alexander L. Wolf
University of Colorado at Boulder
Antonio Carzaniga
University of Colorado at Boulder
PreCache Engineering Team
4. Information-Centric
Internet Applications
Software and Antivirus Updates
Consumer Alerts
Location-Based Services for Mobile Wireless
Multiplayer Online Games
Web Search Engines
e-Business (e.g., Supply Chain Mgmt)
Distributed Sensor Networks
Publish/subscribe is a natural fit!
5. Publish/Subscribe Networking
Publish/subscribe is traditionally
implemented by centralized servers
But server-based realizations do not scale
to Internet-wide applications
So existing networks require “faking it”
Request/response interaction
Continual subscriber polling
Enormous server farms
Dumb caching
And so we must realize publish/subscribe
via a distributed network of routers
10. PreCache NETINJECTOR
Routing and forwarding based on SIENA
Generalize idea of subscription merging
Compute single subscription covering all received
subscriptions
Employ approximate matching
Constant time and space complexity
Logarithmic time and space with additional leakage reduction
Channel services
Namespace management
Resource allocation
Load balancing, fault tolerance, authentication
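The merging idea on this slide can be illustrated with a minimal sketch. It assumes a subscription is a conjunction of closed numeric ranges, one per attribute; this model and the names Subscription, matches, and merge are hypothetical and are not taken from NETINJECTOR or SIENA. The merged subscription keeps only attributes that every input constrains and widens each range to the bounding range, so it covers every input but may also admit notifications that no input wanted (leakage).

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple

Range = Tuple[float, float]  # closed interval [low, high]


@dataclass
class Subscription:
    # Hypothetical model: attribute name -> allowed closed range.
    constraints: Dict[str, Range]


def matches(sub: Subscription, notification: Dict[str, float]) -> bool:
    """Exact match: every constrained attribute is present and in range."""
    for attr, (low, high) in sub.constraints.items():
        value = notification.get(attr)
        if value is None or not (low <= value <= high):
            return False
    return True


def merge(subs: Iterable[Subscription]) -> Subscription:
    """Compute one subscription that covers all of the given subscriptions."""
    subs = list(subs)
    if not subs:
        raise ValueError("need at least one subscription to merge")
    # An attribute may stay constrained only if every input constrains it.
    common = set(subs[0].constraints)
    for sub in subs[1:]:
        common &= set(sub.constraints)
    merged = {}
    for attr in common:
        lows = [sub.constraints[attr][0] for sub in subs]
        highs = [sub.constraints[attr][1] for sub in subs]
        merged[attr] = (min(lows), max(highs))
    return Subscription(merged)


s1 = Subscription({"price": (10, 20), "volume": (100, 500)})
s2 = Subscription({"price": (50, 60)})
merged = merge([s1, s2])

wanted = {"price": 15, "volume": 200}   # matches s1, so it must match merged
leaked = {"price": 35}                  # matches neither s1 nor s2
print(merged.constraints, matches(merged, wanted), matches(merged, leaked))
```

In this sketch a router would store only the merged subscription for a link, which keeps the forwarding state small at the price of forwarding the leaked notification.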
12. Comments on the Problems
Problems identified based on
experience with NETINJECTOR and SIENA
Many of the problems arise because of
a desire for scalability
Some problems are deeply technical
Other problems are simply pragmatic
13. Problem
Wireless Mobile Devices (WMDs?)
[Figure: network of routers numbered 1 to 9, with forwarding entries for subscription s1 labeled on the links and a mobile client a]
14. Problem
Issues with Wireless Mobile Devices
Caching notifications in the network
Stream reconstruction and duplicate
suppression
Frequency of movement versus
overhead of reconfiguration
Gateways for email, SMS, etc.
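Stream reconstruction and duplicate suppression can be sketched as follows, assuming each notification carries a per-publisher sequence number. That assumption, and the class name StreamReassembler, are hypothetical and are not stated in the talk. The sketch buffers out-of-order arrivals, drops anything already seen, and releases notifications in sequence order.

```python
class StreamReassembler:
    """Reorders one publisher's notifications and drops duplicates."""

    def __init__(self, first_seq=0):
        self.next_seq = first_seq   # next sequence number to deliver
        self.pending = {}           # buffered out-of-order arrivals

    def receive(self, seq, payload):
        """Return the payloads that become deliverable, in order."""
        if seq < self.next_seq or seq in self.pending:
            return []               # duplicate: already delivered or buffered
        self.pending[seq] = payload
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


# A device that moves between edge routers may see reordered and
# repeated notifications (for example, replayed from a network cache).
stream = StreamReassembler()
log = [
    stream.receive(1, "b"),   # arrives early, buffered
    stream.receive(0, "a"),   # fills the gap, releases both
    stream.receive(1, "b"),   # replayed duplicate, suppressed
    stream.receive(2, "c"),   # in order
]
print(log)
```

The buffer in this sketch is unbounded; a real design would have to bound it, which is one place where the frequency of movement and the cost of reconfiguration interact.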
15. Problem
Security
Traditional security properties are address-based
Example: Authentication
Bob wants to make sure Alice sent the message
Content-based analog
Bob wants to make sure a message represents reality
Pub/sub admits new kinds of vulnerabilities
Example: Denial of Service
Highly generic subscription (“Price > 0”) causes flood of
notifications to subscriber
How do you distinguish a malicious subscriber from a
greedy subscriber?
How do you do content-based routing when the
content is encrypted???
16. Problem
Client Connections and Firewalls
Want constant connection between
subscriber and edge router
Otherwise subscriber polls for notifications
Connection limits may require multiplexing
Client must initiate connection to edge
router in order to breach firewall
And if port 80 is the only open port …
Need HTTP encapsulation of messages
May need HTML formatting of messages
Routers need to multiplex and/or demultiplex
message traffic
17. Problem
Approximate Matching
Rationale: High-performance routing
Expect approximate matching to have better
time/space complexity than exact matching
Approximation must be conservative
False positives OK, false negatives not
Must still perform exact match at some point
before delivery to subscriber
Leakage may increase traffic
Tradeoff in computational resources
We need simulation tools to explore this!
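The conservative property described on this slide can be illustrated with a Bloom filter, used here only as one well-known conservative approximation; the talk does not say which technique is used in practice. The sketch assumes subscriptions are equality tests on a single "symbol" attribute, and the names BloomFilter and forward_then_deliver are hypothetical. The filter can report false positives (leakage) but never false negatives, and an exact match is still performed before delivery.

```python
import hashlib


class BloomFilter:
    """Fixed-size set summary: may say yes wrongly, never says no wrongly."""

    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def forward_then_deliver(notifications, subscribed, approx):
    """Approximate match in the core, exact match at the edge."""
    delivered, leaked = [], 0
    for n in notifications:
        if not approx.might_contain(n["symbol"]):
            continue                      # dropped early by the approximation
        if n["symbol"] in subscribed:     # exact match before delivery
            delivered.append(n)
        else:
            leaked += 1                   # false positive: wasted traffic
    return delivered, leaked


subscribed = {"IBM", "SUNW", "CSCO"}
approx = BloomFilter()
for symbol in subscribed:
    approx.add(symbol)

traffic = [{"symbol": s} for s in ["IBM", "MSFT", "CSCO", "ORCL", "SUNW"]]
delivered, leaked = forward_then_deliver(traffic, subscribed, approx)
print([n["symbol"] for n in delivered], leaked)
```

The filter check costs the same regardless of how many subscriptions were added, while a smaller filter raises the leakage count; this is the tradeoff between computational resources and traffic that the slide points to.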
18. Problem
Optimizing for Traffic Variations
Can routers dynamically optimize for traffic
variations?
Example: The Britney Spears Effect
All subscribers want certain notifications N1
Few subscribers want other notifications N2
N1 notifications may flood network
Example: The Google Effect
Certain subscribers S1 want all notifications
Other subscribers S2 want few notifications
S1 subscriptions may dominate routing
We need simulation tools to explore this!
19. Problem
Service-Provider Deployment
Difficult to convince network service
providers to enhance their networks with
publish/subscribe
Application demand not yet critical
Lack of standards
Economic barriers govern router design
Example: 100M users, $10K/router
1000 users/router: 100K routers, $1B outlay
100 routers: $1M outlay, 1M users/router
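The arithmetic behind this example can be reproduced with a few lines. The function name deployment_cost is hypothetical, and the sketch assumes users are spread evenly across routers.

```python
def deployment_cost(users, users_per_router, cost_per_router):
    """Return (number of routers, total outlay in dollars)."""
    routers = -(-users // users_per_router)   # ceiling division
    return routers, routers * cost_per_router


USERS = 100_000_000
COST_PER_ROUTER = 10_000

many_small = deployment_cost(USERS, 1_000, COST_PER_ROUTER)
few_large = deployment_cost(USERS, 1_000_000, COST_PER_ROUTER)
print(many_small)   # 100,000 routers, 1,000,000,000 dollars
print(few_large)    # 100 routers, 1,000,000 dollars
```

The outlay scales inversely with the number of users each router can serve, which is why the economics push toward routers that handle very large subscriber populations.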
20. Problem
Peer-to-Peer Deployment
Reasonable alternative to service-provider
deployment
“Grass roots” generation of demand
Challenges
Dynamically aligning peer topology to
underlying network topology
Dynamically partitioning routing responsibilities
across peers
Ensuring reliability, privacy and/or integrity of
messages
21. Problem
Unicast Fanout at Edge Routers
Example: 100M users on 1K routers
100K users per router
10Kbyte notification
>80 milliseconds over OC-192
>80 seconds over 10Mb Ethernet
>4 hours over 56K modem
Idea: Use publish/subscribe for “leveling”
Partition users into classes
Example: Last digit of serial number
Publish once per class
Tune publication rate to available bandwidth and SLA
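The fanout estimate and the class partitioning can be sketched as below. The sketch assumes 1 Kbyte = 1,000 bytes, nominal line rates (about 9.953 Gbit/s for OC-192), no protocol overhead, and serial numbers that are strings ending in a digit; the function names are hypothetical. Under these assumptions the computed times come out above the lower bounds quoted on the slide.

```python
from collections import defaultdict


def fanout_seconds(users, notification_bytes, link_bits_per_second):
    """Time to unicast one notification to every user over one link."""
    return users * notification_bytes * 8 / link_bits_per_second


USERS_PER_ROUTER = 100_000_000 // 1_000    # 100K users per router
NOTIFICATION_BYTES = 10_000                # 10 Kbyte notification

LINKS = {"OC-192": 9.953e9, "10Mb Ethernet": 10e6, "56K modem": 56e3}
times = {
    name: fanout_seconds(USERS_PER_ROUTER, NOTIFICATION_BYTES, rate)
    for name, rate in LINKS.items()
}
for name, seconds in times.items():
    print(f"{name}: {seconds:.3g} s")


def partition_by_last_digit(serial_numbers):
    """Group users into up to ten classes by the last digit of the serial."""
    classes = defaultdict(list)
    for serial in serial_numbers:
        classes[int(serial[-1])].append(serial)
    return dict(classes)


classes = partition_by_last_digit(["1001", "2001", "3007"])
print(classes)
```

If serial numbers are spread evenly over the ten classes, publishing once per class turns one large burst into ten bursts of roughly one tenth the size, and the interval between them can be tuned to the available bandwidth.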
23. Conclusion
Many Internet applications naturally
require publish/subscribe messaging
Scalability can be achieved through
publish/subscribe networking
SIENA, PreCache, and others have
established many fundamental results
But many open problems remain to be
solved