ABSTRACT
Modern telco billing systems face major challenges in moving from mostly offline (batch) processing of billing events to handling huge volumes of events in near real time. The challenge for the architect of such a system is how to process near-real-time events at high throughput and low latency while maintaining strict transactional semantics and providing high availability and scalability of the service.
We present our extreme transaction processing (XTP) solution, which uses Oracle Coherence as an in-memory data grid (IMDG) that provides reliable messaging, data storage, and asynchronous write-back to an RDBMS. The system follows the staged event-driven architecture (SEDA) design pattern, which yields a flexible and manageable design that is both highly available and scalable. We discuss lessons learned during performance tuning and profiling, plus best practices that apply to any Coherence user.
OUTLINE
0. Agenda
1. Extreme Transaction Processing systems
2. Discuss Billing Event processing as XTP
3. What are the approaches
a. RDBMS, RDBMS + SSD
b. In-memory DB (Oracle TimesTen)
c. RDBMS + Caching
d. IMDG
4. Leverage IMDG using a SEDA Architecture
a. Specifics of architecture
5. Performance and tuning
a. Profiling tools that we built
b. What bottlenecks
6. Lessons learned - best practices
a. Incubator libraries
b. Locking
c. Tips and tricks
7. Q&A
21. Use Case - Telco Billing
[Diagram: the Billing System (tariffing of services, balance management, service enable/disable) exchanging data with Telecommunication services (events: logs of services provided), the Payment system (events: payments), and CRM (user data, billing rules, balance data).]
...
Copyright 2010 Grid Dynamics
25. RDBMS + SSD
• Pros
• Same as RDBMS...
• Speed up processing
• Cons
• Only improves performance, not scale
26. Dedicated Hardware
• Pros
• Can speed up performance
• Often programming model is same
• Cons
• Expensive to scale
• Difficult to provide H/A
27. In-memory database
• Pros
• Mature programming model
• Excellent latency
• Cons
• Does not scale horizontally
• System capacity limited
31. Complex event processing
Event driven model
One event may trigger another event, which
triggers another and so on in a cascade.
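The cascade described above can be sketched in a few lines. This is only an illustration, not the system from the talk: the `EventBus` class, the event names, and the tariff of 2 units per minute are all hypothetical.

```python
from collections import defaultdict, deque


class EventBus:
    """Tiny single-threaded event dispatcher (illustrative only)."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self._pending = deque()

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        self._pending.append((event_type, payload))

    def run(self):
        """Drain the queue; handlers may publish follow-up events."""
        processed = []
        while self._pending:
            event_type, payload = self._pending.popleft()
            processed.append(event_type)
            for handler in self._handlers[event_type]:
                handler(self, payload)
        return processed


BALANCES = {"alice": 5}   # hypothetical starting balance
DISABLED = set()          # users whose service has been switched off


def on_call_ended(bus, call):
    # Hypothetical tariff: 2 units per minute.
    bus.publish("charge_computed",
                {"user": call["user"], "amount": 2 * call["minutes"]})


def on_charge_computed(bus, charge):
    BALANCES[charge["user"]] -= charge["amount"]
    if BALANCES[charge["user"]] < 0:
        bus.publish("balance_exhausted", {"user": charge["user"]})


def on_balance_exhausted(bus, info):
    DISABLED.add(info["user"])


bus = EventBus()
bus.subscribe("call_ended", on_call_ended)
bus.subscribe("charge_computed", on_charge_computed)
bus.subscribe("balance_exhausted", on_balance_exhausted)

# One inbound event triggers a second, which triggers a third.
bus.publish("call_ended", {"user": "alice", "minutes": 4})
trace = bus.run()
```

Here a single `call_ended` event ends up disabling the user's service two hops later, without any handler knowing about the handlers downstream of it.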
33. Scale Eight (2000)
• Web based delivery of media files
• Basically Amazon S3
• Total system capacity - several petabytes
• On-site appliance for local read/write
38. On-site appliance
• Local cache of remote data
• Exported NAS as either NFS or SMB
• Version 1 - C++ Synchronous Threaded
• Version 2 - C++ Asynchronous Event-based
53. Lessons Learned
Threaded Application
• Linear workflows easy
• Branching workflows hard
• Locking is hard
• Context switching kills performance
• Complexity means large code base is unmaintainable
• Only experts can do it right
Event Based Application
• Linear workflows complex
• Branching workflows easy(ier)
• Locking is easy
• Easy to max 1 CPU, hard to max many
• Complexity means large code base is unmaintainable
• Only experts can do it right
54. What we need is
something in between...
61. Introducing SEDA...
• Staged Event Driven Architecture
• Introduced by Matt Welsh in 2002 as a research paper
• Blends threaded and event based models
• Stages are completely independent and are threaded
• Stages are connected via queues
• Code branching occurs between stages
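As a rough illustration of these points, the sketch below builds stages that each own an inbound queue and a small thread pool, and that hand results to the next stage only through its queue. It is a single-process toy, not the Coherence-based framework from the talk; the `Stage` class and the three example stages are assumptions made for the example.

```python
import queue
import threading

_STOP = object()  # sentinel that tells a worker thread to exit


class Stage:
    """One SEDA stage: an inbound queue drained by its own thread pool."""

    def __init__(self, name, handler, workers=2):
        self.name = name
        self.handler = handler      # plain synchronous function: event -> event or None
        self.inbound = queue.Queue()
        self.outbound = None        # inbound queue of the next stage, set by connect()
        self._threads = [
            threading.Thread(target=self._run, daemon=True) for _ in range(workers)
        ]

    def connect(self, next_stage):
        self.outbound = next_stage.inbound
        return next_stage

    def start(self):
        for t in self._threads:
            t.start()

    def stop(self):
        # One sentinel per worker; events queued earlier are processed first.
        for _ in self._threads:
            self.inbound.put(_STOP)
        for t in self._threads:
            t.join()

    def _run(self):
        while True:
            event = self.inbound.get()
            if event is _STOP:
                break
            result = self.handler(event)
            if result is not None and self.outbound is not None:
                self.outbound.put(result)


collected = []
parse = Stage("parse", lambda raw: int(raw))
double = Stage("double", lambda n: n * 2)
sink = Stage("sink", collected.append, workers=1)  # append returns None, so nothing is forwarded
parse.connect(double).connect(sink)

for stage in (parse, double, sink):
    stage.start()
for raw in ["1", "2", "3"]:
    parse.inbound.put(raw)
for stage in (parse, double, sink):  # stop upstream stages first so no event is lost
    stage.stop()

doubled = sorted(collected)  # arrival order may vary because stages are multi-threaded
```

Each handler is ordinary synchronous code, while concurrency lives entirely in the stage wrapper. Because a stage only sees its own queue, its worker count can be tuned without touching its neighbours.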
74. SEDA Benefits
• Easy to design
• Easy to understand
• Easy to test
• Easy to reuse
• Event-driven architecture, synchronous
programming model
• Easy to scale
84. Distributed SEDA
[Layer diagram, top to bottom: business logic components (four shown), the distributed SEDA framework, Coherence.]
85. Components...
• Generic: generic processing unit, base for the whole hierarchy
• Transformer: consumes inbound message and produces outbound message
• Router: routes inbound message to one of several outbound queues
• Fork: duplicates inbound message to several outbound queues
• Junction: consumes several inbound messages and produces outbound message. Synchronization point
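To make the component roles concrete, here is a minimal single-threaded sketch. The class names follow the slide, but every constructor signature, the `accept` method, and the use of plain callables in place of distributed queues are assumptions for illustration, not the framework's actual API.

```python
class Generic:
    """Base processing unit; every component accepts one message at a time."""

    def accept(self, msg):
        raise NotImplementedError


class Transformer(Generic):
    """Consumes one inbound message and produces one outbound message."""

    def __init__(self, fn, out):
        self.fn, self.out = fn, out

    def accept(self, msg):
        self.out(self.fn(msg))


class Router(Generic):
    """Sends each inbound message to exactly one of several outputs."""

    def __init__(self, choose, outs):
        self.choose, self.outs = choose, outs  # choose(msg) returns a key of outs

    def accept(self, msg):
        self.outs[self.choose(msg)](msg)


class Fork(Generic):
    """Duplicates each inbound message to every output."""

    def __init__(self, outs):
        self.outs = outs

    def accept(self, msg):
        for out in self.outs:
            out(dict(msg))  # shallow copy so branches do not share state


class Junction(Generic):
    """Synchronization point: waits for `expected` parts per key, then merges them."""

    def __init__(self, key, expected, merge, out):
        self.key, self.expected, self.merge, self.out = key, expected, merge, out
        self._parts = {}

    def accept(self, msg):
        k = self.key(msg)
        parts = self._parts.setdefault(k, [])
        parts.append(msg)
        if len(parts) == self.expected:
            del self._parts[k]
            self.out(self.merge(parts))


# Fork an event into two branches, enrich each, and join the results.
joined = []
junction = Junction(
    key=lambda m: m["id"],
    expected=2,
    merge=lambda parts: {k: v for part in parts for k, v in part.items()},
    out=joined.append,
)
add_cost = Transformer(lambda m: {**m, "cost": 2 * m["minutes"]}, junction.accept)
add_flag = Transformer(lambda m: {**m, "roaming": False}, junction.accept)
fork = Fork([add_cost.accept, add_flag.accept])
fork.accept({"id": 1, "minutes": 3})

# Route events to one of two outputs by size.
small, large = [], []
router = Router(
    lambda m: "large" if m["minutes"] > 10 else "small",
    {"small": small.append, "large": large.append},
)
router.accept({"id": 2, "minutes": 30})
router.accept({"id": 3, "minutes": 1})
```

In a distributed setting each output would be a queue in the grid rather than a direct call, but the wiring between component types would look the same.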
86. Now model Billing Processing as Network
[Diagram: network of stages between inbound event and outbound event: search for relevant counters, write-off check, tariffing, rules processing, changes fixation, asynchronous RDBMS replication.]
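The asynchronous RDBMS replication step can be illustrated with a write-behind sketch: balance changes are applied in memory first and flushed to the database later in batches by a background thread. This is an assumption-laden toy, using SQLite in place of the production RDBMS and a hypothetical `WriteBehindBalances` class and `balances` table; it does not show how Coherence implements write-behind.

```python
import queue
import sqlite3
import threading


class WriteBehindBalances:
    """In-memory balances whose changes are replicated to an RDBMS asynchronously."""

    _STOP = object()

    def __init__(self, conn, initial=None, batch_size=100):
        self._conn = conn
        self._batch_size = batch_size
        self._balances = dict(initial or {})
        self._lock = threading.Lock()
        self._changes = queue.Queue()
        self._writer = threading.Thread(target=self._drain, daemon=True)
        self._writer.start()

    def write_off(self, user_id, amount):
        """Apply the charge in memory and return; the database write happens later."""
        with self._lock:
            new_balance = self._balances.get(user_id, 0) - amount
            self._balances[user_id] = new_balance
            # Enqueue under the lock so changes reach the writer in update order.
            self._changes.put((user_id, new_balance))
        return new_balance

    def close(self):
        """Flush all pending changes and stop the writer thread."""
        self._changes.put(self._STOP)
        self._writer.join()

    def _drain(self):
        while True:
            batch = [self._changes.get()]          # block for the first change
            while len(batch) < self._batch_size:   # then take whatever else is queued
                try:
                    batch.append(self._changes.get_nowait())
                except queue.Empty:
                    break
            stop = any(item is self._STOP for item in batch)
            rows = [item for item in batch if item is not self._STOP]
            if rows:
                with self._conn:                   # one transaction per batch
                    self._conn.executemany(
                        "INSERT OR REPLACE INTO balances (user_id, balance) VALUES (?, ?)",
                        rows,
                    )
            if stop:
                return


# check_same_thread=False lets the writer thread use a connection opened here.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE balances (user_id TEXT PRIMARY KEY, balance INTEGER)")
conn.commit()

store = WriteBehindBalances(conn, initial={"alice": 100, "bob": 50})
store.write_off("alice", 10)
store.write_off("alice", 5)
store.write_off("bob", 7)
store.close()

persisted = dict(conn.execute("SELECT user_id, balance FROM balances"))
```

The caller of `write_off` never waits on the database, which is the point of the pattern. The trade-off is that changes still sitting in the queue are lost if the process dies, which is why a real deployment would need the pending changes held in replicated storage rather than in one process.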
87. And scale it out...
Horizontal
Scaling
88. How to implement it in
Coherence...