2. Pimp My Data Grid
Brian Oliver
Senior Principal Solutions Architect (brian.oliver@oracle.com)
Oracle Coherence | Oracle Fusion Middleware
3. Agenda
• An Architectural Challenge
• Enter the Data Grid
• Architectural Patterns that Limit Application Scalability
• Pimping Data Grid
• Service Grids
• Trading Exchange
• Agile Groovy Grid
• Unstoppable Spring
(c) Copyright 2007. Oracle Corporation
5. Scale this...
• Domain: Retail Banking Infrastructure
  • Over 500 Banks
  • 100,000+ Teller Staff Desktop Applications
  • 10,000+ Cash Machines (ATMs)
  • 10,000,000's of Internet Banking Transactions per day
• Current Infrastructure
  • Java SE based (no J2EE, apart from Servlets)
  • Oracle RAC (not an issue; scaling across a WAN ☺)
  • Messaging (serious challenges)
  • Processing Business Tasks (challenges approaching)
• 30,000,000+ Business Tasks a day, minimum
  • Must do 100,000,000 per day effortlessly before going live
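To put the daily volumes above in perspective, the following sketch converts them into average per-second rates. It assumes load is spread evenly over 24 hours, which the slides do not state; real peaks would be higher. The class and method names are illustrative only.

```java
/** Back-of-the-envelope conversion of daily task volumes into average rates. */
final class TaskRates {
    static final long SECONDS_PER_DAY = 24L * 60 * 60; // 86,400

    /** Average tasks per second, assuming an even spread over the day. */
    static double perSecond(long tasksPerDay) {
        return (double) tasksPerDay / SECONDS_PER_DAY;
    }

    public static void main(String[] args) {
        // Volumes quoted on the slide: current minimum and go-live target.
        System.out.printf("current minimum: %.0f tasks/s%n", perSecond(30_000_000L));
        System.out.printf("go-live target:  %.0f tasks/s%n", perSecond(100_000_000L));
    }
}
```

On these assumptions the current minimum averages roughly 347 tasks per second and the go-live target roughly 1,157 tasks per second.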
6. Scale this...
• Execution of Business Tasks
  • Account Balance, Credit/Debit, Funds Transfer, Statement Processing, Batch Processing, Payment Processing
  • Tasks arrive from a variety of clients (thin, rich, cross-platform, mainframes...) and in a variety of languages
• Goal:
  • Tasks are executed by the “cloud”
  • Don’t want to build own “cloud” software
• Their knowledge:
  • Massive experience in scale-out. Could build it themselves, but buying will save budget (time/resources/money).
[Figure: The Cloud]
8. Constraints...
• No Single Points of Failure
• No Single Points of Bottleneck
• No Service Registries
• No Masters + Workers
  • already got one that is partitioned into over 200 separate clusters
• No Manual Partitioning
• Keep everything in Memory
• Active + Active Sites
  • Across WAN
• Develop system on a notebook
• Scale to over 500 servers
• No reconfiguration outages
• No byte-code manipulation / proxies
• No Data or Task Loss
  • During failure
  • During server upgrade
  • During scale out
• No Transactions (XA)
• Support multiple versions
• Predictable response times
• Predictable scale-out costs
• Manage via JMX, from any point in the “Cloud”
• Pure Java Standard Edition
• Infrastructure adds a maximum of 3ms latency to tasks
• Integrate with existing applications (Java 1.4.2+)
9. Enter the Data Grid
(c) Copyright 2008. Oracle Corporation
10. Enter the Data Grid
• Data Grid: horizontally scalable in-memory data management
• Goal
  • Eliminate data source contention by scaling out data management with commodity hardware
• Underlying Philosophies…
  • Keep “data” in the “application-tier” (where it’s used)
  • “Disks are slow and databases are evil”
  • “Data Grids will solve your application scalability and performance problems”
12. With this…
Note to Marketing: Replace “Cloud” with Data Grid, Distributed Cache,
Data Fabric, Information Fabric, Network Attached Storage, Java Space,
Service Grid, Compute Grid, Object Grid, Shared Memory or other term ☺
16. Client + Server Pattern
• Server is point of contention
• Contention increases Server response time = increased Client latencies
• Client scale-out increases contention
• Not just Database related. Consider Store-and-Forward messaging systems and Spaces
• The server may be a “switch”
• Lesson: Avoid Single Points of Contention / Bottleneck (SPOB)
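The effect of client scale-out on a single server can be illustrated with a deliberately simple model. It assumes all requests arrive at the same instant and the server handles them strictly one at a time; the 2 ms service time is an illustrative number, not a figure from the slides.

```java
/** Toy model: n requests arrive together at one server that serves them one at a time. */
final class SingleServerModel {

    /** Latency of the last request served: it waits behind all the others. */
    static long worstLatencyMillis(int clients, long serviceMillis) {
        return clients * serviceMillis;
    }

    /** Mean latency over all n requests: serviceMillis * (n + 1) / 2. */
    static double meanLatencyMillis(int clients, long serviceMillis) {
        return serviceMillis * (clients + 1) / 2.0;
    }

    public static void main(String[] args) {
        long serviceMillis = 2; // illustrative value only
        for (int clients : new int[] {1, 10, 100, 1000}) {
            System.out.printf("%4d clients: mean %.1f ms, worst %d ms%n",
                    clients,
                    meanLatencyMillis(clients, serviceMillis),
                    worstLatencyMillis(clients, serviceMillis));
        }
    }
}
```

In this model latency grows linearly with the number of clients even though the server itself never gets slower, which is the contention the slide warns about.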
17. Master + Worker Pattern
• Master is point of contention
• Contention increases Master response time = increased Worker (and requestor) latencies
• Scale-out increases contention
• Lesson: Avoid Single Points of Contention / Bottleneck (SPOB)
18. Master + Worker Pattern
Reality...
• Typically, Master + Worker is actually also Client + Server!
• Lesson: Avoid patterns with SPOB!
19. Master + Worker Pattern
Continued...
• Typically, Master + Worker is actually also Client + Server!
• Often the driving requirement for a “Data Grid” in a “Compute Grid”
• Lesson: Avoid patterns with multiple SPOB!
20. Increasing Resilience
• Increasing resilience increases latency
• Synchronously maintained resilience typically doubles latencies
• Asynchronously maintained resilience will always introduce data integrity issues
• Lesson: Resilience rarely has zero-latency properties
• Lesson: Resilience ≠ Persistence
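The trade-off between synchronous and asynchronous backups can be sketched with a toy in-process model. Everything here is hypothetical (class name, hop counting, the single backup copy); it is not how any particular product implements backups, only an illustration of why one approach costs latency and the other risks losing acknowledged writes.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Toy model of a primary store with one backup copy (illustrative names only). */
final class BackupModel {
    final Map<String, String> primary = new HashMap<>();
    final Map<String, String> backup = new HashMap<>();
    // Writes already acknowledged to the caller but not yet applied to the backup.
    final Queue<String[]> pending = new ArrayDeque<>();

    /** Synchronous: returns only after both copies are updated (two hops). */
    int putSync(String key, String value) {
        primary.put(key, value);
        backup.put(key, value);
        return 2; // hops on the caller's critical path
    }

    /** Asynchronous: returns after one hop; the backup catches up later. */
    int putAsync(String key, String value) {
        primary.put(key, value);
        pending.add(new String[] {key, value});
        return 1;
    }

    /** Applies queued writes to the backup (would run in the background). */
    void drain() {
        for (String[] w; (w = pending.poll()) != null; ) {
            backup.put(w[0], w[1]);
        }
    }

    /** Simulates losing the primary: only what the backup holds survives. */
    Map<String, String> failPrimary() {
        primary.clear();
        pending.clear();
        return backup;
    }
}
```

A write made with `putSync` survives `failPrimary()` at the cost of a second hop; a write made with `putAsync` is lost if the primary fails before `drain()` runs.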
21. Partition for Parallelism
• Partition Data onto separate Masters to provide load-balancing and increase parallelism
• Not easy, especially if access patterns are dynamic and load is uneven
• “Joins” become very difficult, but queries work in parallel
• Lesson: Hot spots are inevitable
• Lesson: Partition failure may corrupt state. RAID is a better partitioning strategy
• Lesson: Avoid “registries” to locate data/services (i.e., Masters)
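One common way to locate data without a registry is to derive the owner from the key itself. The sketch below is a generic hash-partitioning example under stated assumptions: every node knows the same ordered member list and the same fixed partition count. It is not presented as the algorithm any specific data grid uses, and the names are hypothetical.

```java
import java.util.List;

/** Registry-free lookup: every node derives the owner of a key from the key itself. */
final class Partitioner {
    private final List<String> members;  // same ordered member list on every node (assumed)
    private final int partitionCount;    // fixed number of partitions (assumed)

    Partitioner(List<String> members, int partitionCount) {
        this.members = members;
        this.partitionCount = partitionCount;
    }

    /** Maps a key to a partition; floorMod keeps the result non-negative. */
    int partitionOf(Object key) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    /** Maps the key's partition to a member without asking any central service. */
    String ownerOf(Object key) {
        return members.get(partitionOf(key) % members.size());
    }
}
```

Two nodes holding the same member list compute the same owner for a key independently, so no lookup service sits on the request path. A heavily used key still maps to a single owner, which is why hot spots remain possible, and this naive scheme reshuffles ownership whenever the member list changes.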
22. Summary
• Avoid Single Points of Contention
• Avoid Single Points of Failure
• Avoid Client + Server
• Avoid Master + Worker
• Active + Active better than Active + Passive
• Ensure fair utilization of resources
• Exploit Parallelism
• Partition Data for Parallelism
• Hot Spots are unavoidable
• Resilience = Redundancy
• RAID is a good pattern
• XML is not great
• Interoperability is best achieved at the binary level (hardest, but best)
• Avoid moving data
• Exploit Data Affinity
  • Data + Data and Data + Compute
• Deploy code everywhere
  • It’s smaller
• Dynamic code deployment is dangerous in transactional systems
• Resilience increases latency
• Resilience ≠ Persistence
• Pipeline architectures help significantly
• Use Caching to reduce I/O
• Cache Coherency is not free
• Cache Coherency is essential for Data Integrity
• Understand the underlying implementation of solutions!
23. Achieving Scalability and High Performance means...
1. Doing something completely different architecturally... including inside the “Cloud”.
2. Avoiding patterns that limit scalability or performance
3. Ensuring each architectural component (from external providers) avoids the “limiting” patterns
   = knowing the internals of the provided solutions
25. Oracle Coherence
• Provides…
  • Container-less peer-to-peer Clustering of Java Processes
  • Data Structures to manage Data across a Cluster / Grid
• Other Stuff…
  • Real-Time Event Observation (Listener Pattern)
  • Materialized Views of Data (Continuous Queries)
  • Parallel Queries and Aggregation (Object-based Queries)
  • Parallel Data Processing
  • Parallel Grid Processing
  • RemoteException Free Distributed Computing
  • Clustered JMX
  • MAN + WAN Connectivity
26. Oracle Coherence
• Development Toolkit
  • Pure Java 1.4.2+ Libraries
  • Pure .NET 1.1 and 2.0 (Client Libraries)
  • No Third-Party Dependencies
  • No Open Source Dependencies
  • No Masters
  • No Registries
• Other Libraries for…
  • Database and File System Integration
  • TopLink and Hibernate
  • HTTP Session Management, Spring, …
27. Oracle Coherence
• Some uses…
  • Caching state in the Application-tier
    • Relieve load on lower-tier systems
    • Databases, Mainframes, Web Servers, Web Services
  • Reliably managing Application state in the Application-tier
  • Scaling out application state (in the application-tier)
  • In-Memory HTTP Session Management
  • Reliable and Automatically Partitioned Grid Processing
  • Temporary System of Record for Extreme Transaction Processing
30. Strategy
• Business Tasks are regular Java objects (POJOs)
• Place Business Tasks into Coherence
  • Coherence dynamically distributes Tasks across the Cluster
  • Tasks are resilient in the Cluster
  • May use “affinity” to ensure related Tasks are processed together
• Register Backing Map Listeners in the Cluster members to execute Tasks
• Scaling out Coherence = Scaling out Task Processing
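The “affinity” idea can be sketched in plain Java: if placement is decided by a shared attribute rather than by the full key, related Tasks end up in the same partition. The sketch does not use the Coherence API; `TaskKey`, `accountId`, `associatedKey()` and the partition count are hypothetical names and values chosen for illustration.

```java
import java.util.Objects;

/** Key of a business task; placement follows the account, not the task id. */
final class TaskKey {
    final String taskId;
    final String accountId; // hypothetical grouping attribute

    TaskKey(String taskId, String accountId) {
        this.taskId = taskId;
        this.accountId = accountId;
    }

    /** The value that decides placement; related tasks share it. */
    Object associatedKey() {
        return accountId;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof TaskKey)) return false;
        TaskKey other = (TaskKey) o;
        return taskId.equals(other.taskId) && accountId.equals(other.accountId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(taskId, accountId);
    }
}

final class AffinityDemo {
    /** Partition chosen from the associated key, so related tasks are co-located. */
    static int partitionOf(TaskKey key, int partitionCount) {
        return Math.floorMod(key.associatedKey().hashCode(), partitionCount);
    }
}
```

Two distinct Tasks for the same account, for example a debit and a balance query, get different keys but the same partition, so they can be processed on the same member without moving data.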
31. What is a Backing Map Listener?
• Coherence distributes, manages and stores state (objects) using “Backing Maps”
• Backing Map...
  • Class that is responsible for managing state.
  • Can be replaced to change how state is managed.
  • E.g., in heap, off heap, Hibernate, BDB, TopLink, WAN, file system, memory-mapped files across a WAN.
  • May be replaced, composed and customized.
• Backing Map Listener...
  • Class that receives data events from Backing Maps
32. Strategy
• As Tasks enter the “Cloud”, Coherence notifies the Backing Map Listener (BML)
• Our BML implementation schedules, manages and executes the Tasks (using a Java 5 Executor)
  • Cleans up Tasks when executed
  • Deals with Task recovery (idempotent with status)
• BML is written in standard Java
• No Transactions
• Fault Tolerant
• Distributed + Scalable + Event Driven Architecture
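The flow described above (event on insert, execution on an Executor, clean-up, idempotent recovery) can be sketched with the standard library alone. This is a stand-in under stated assumptions, not the presenter's implementation: a `ConcurrentHashMap` plays the role of the backing map, `onEvent` plays the role of the listener callback, and all class and method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Plain-Java stand-in for listener-driven task execution. */
final class TaskGrid {
    enum Status { PENDING, DONE }

    /** A business task: a POJO that carries its own status. */
    static final class Task {
        final String id;
        final Runnable work;
        volatile Status status = Status.PENDING;

        Task(String id, Runnable work) {
            this.id = id;
            this.work = work;
        }
    }

    private final Map<String, Task> store = new ConcurrentHashMap<>(); // stands in for the backing map
    private final ExecutorService executor = Executors.newFixedThreadPool(4);

    /** Storing a new task raises an "inserted" event. */
    void submit(Task task) {
        if (store.putIfAbsent(task.id, task) == null) {
            onEvent(task);
        }
    }

    /** Listener callback; may be invoked again for the same task during recovery. */
    void onEvent(Task task) {
        executor.execute(() -> {
            synchronized (task) {
                if (task.status == Status.DONE) {
                    return; // idempotent: a re-delivered event never runs the work twice
                }
                task.work.run();
                task.status = Status.DONE;
            }
            store.remove(task.id); // clean up once executed
        });
    }

    int pending() {
        return store.size();
    }

    /** Stops accepting work and waits briefly for queued tasks to finish. */
    boolean shutdownAndWait() {
        executor.shutdown();
        try {
            return executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Recording the status on the Task itself is what makes re-delivery safe: after a simulated failover, calling `onEvent` again for an already completed Task is a no-op.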
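33. Backing Map Listener Code
import com.tangosol.net.BackingMapManagerContext;
import com.tangosol.util.MapEvent;

// AbstractMultiplexingBackingMapListener and the Cause type it uses are assumed
// to be on the classpath; the slides do not say which package provides them.
public class ExampleBackingMapListener extends AbstractMultiplexingBackingMapListener {

    public ExampleBackingMapListener(BackingMapManagerContext context) {
        super(context);
        System.out.println("Created our ExampleBackingMapListener");
    }

    // Called for each backing map event; this example only logs it.
    @Override
    protected void onBackingMapEvent(MapEvent mapEvent, Cause cause) {
        System.out.println("Cause:" + cause + ", Event:" + mapEvent);
    }
}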
35. Results
• While submitting Tasks (regular system load)
  • Test 1: Scale from 1 server to over 400
    • No reconfiguration
  • Test 2: Randomly kill servers
    • No reconfiguration
  • Test 3: Kill 1, 2, 4, 8, 16, 32, 64, 128, 160 servers at once
    • Any data loss?
    • Can it be identified?
• Possible execution capacity of 1,200,000,000 Tasks per day
• Client may reduce current hardware by 75%
37. Trading Exchange
• Similar requirements and constraints
• Order processing (Foreign Exchange)
  • 1,000's per second (initial) per currency pair
  • No manual partitioning
  • No transactions
  • 10ms max latency for full accept, validate, match, respond
• Achieved with Coherence using BMLs (< 3ms)
• 14 weeks development (start to go live)
38. Previous Next Generation Approach (failed to meet SLAs)
40. Pimp my Data Grid
41. Pimp it!
• Most Data Grids, especially Coherence, are fully pluggable
• Coherence provides peer-to-peer JVM clustering and resilient data management with events to support a distributed event-driven architecture (EDA)
• You’re generally only limited by your creativity
42. Pimp it - with Groovy
• Instead of building object-based queries, why not use Groovy expressions?
  • E.g., Filters, Queries and Agents are completely customizable in Coherence
  • new GroovyFilter("entry.value in [...]");
• Serious projects are looking to use Groovy across the Data Grid to provide processing agility
43. Pimp it - with Spring
• Instead of Spring wrapping your Data Grid, embed Spring applications in a Data Grid to:
  • Virtualize them
  • Make them resilient to failure
  • Scale them out
• Coherence is pure Java, so it plays well with Spring
• Use Coherence as clustering infrastructure for Spring: make it unstoppable ☺
48. The preceding is intended to outline general product
use and direction. It is intended for information
purposes only, and may not be incorporated into any
contract. It is not a commitment to deliver any
material, code, or functionality, and should not be
relied upon in making purchasing decisions.
The development, release, and timing of any
features or functionality described for Oracle’s
products remains at the sole discretion of Oracle.