This presentation was given by Barry Thompson, CTO of Tervela, to TSAM (a financial buy-side technology & operations event) in July 2011. It covers trends in big data and how to solve problems with data movement, warehousing, and virtualization solutions.
3. Trend #1: Cost Structure of Storage
• Cost
• Distributed commodity storage is 25x cheaper than Tier 1 SAN
• High reliability (replication) it is closer to 55x
• Performance
• Distributed is now faster
• Flash Exacerbates
• Decreasing Differentiators
• Perceived Reliability
• Enterprise Management
• Legacy Compatibility
2009
- 67 TB in 4U
- 24.5x multiple
- Reliability as key differentiator
- With replication (55x)
- Equivalent Performance
2011
- 145 TB in 4U (Disk)
- 27 TB in 4U (Flash)
- 26x multiple
- w/ Data Fabric / Virtualization is as reliable
- Higher Performance
Source: BackBlaze.com
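The density figures on this slide can be compared with a few lines of arithmetic. The sketch below uses only the TB-per-4U numbers quoted above; the constant and function names are illustrative, and it does not attempt to reproduce the cost multiples, since the slide gives no prices.

```python
# Capacity figures quoted on the slide (TB per 4U chassis).
TB_PER_4U_DISK_2009 = 67
TB_PER_4U_DISK_2011 = 145
TB_PER_4U_FLASH_2011 = 27


def growth_factor(old_tb: float, new_tb: float) -> float:
    """Return how many times more capacity the newer chassis holds."""
    return new_tb / old_tb


# Disk density change between the 2009 and 2011 figures.
disk_growth = growth_factor(TB_PER_4U_DISK_2009, TB_PER_4U_DISK_2011)

# 2011 flash capacity as a fraction of 2011 disk capacity in the same 4U.
flash_vs_disk = TB_PER_4U_FLASH_2011 / TB_PER_4U_DISK_2011

print(f"Disk density growth 2009 -> 2011: {disk_growth:.2f}x")
print(f"2011 flash capacity relative to disk: {flash_vs_disk:.0%}")
```

On these numbers, disk capacity per 4U roughly doubled in two years, while a flash chassis held about a fifth of what a disk chassis did.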
4. Trend #2: Moving from Blocks to Data
• Blocks are a legacy of tape storage
• Deeply embedded in the OS / Driver fabric and most legacy DB architectures
• Horribly inefficient for modern requirements
• Replication / Synchronization (>100x retransmission)
• Networks are not designed for blocks
• Applications have to Load / Store
• Wall Street data usage is different from the standard Fortune 500 (more dynamic data and higher churn rates)
• WAN Optimization cannot fully solve
• Atomic Data is an emerging model
• DB Rows / Messages are the historical Atomic example
• PaaS interfaces are ALL data and file driven
• What is YOUR interface?
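One way to see where a ">100x retransmission" figure can come from is to compare the bytes a block-level replicator ships with the bytes that actually changed. The sketch below is illustrative only: the 4096-byte block size and 40-byte record are assumptions chosen for the example, not numbers from the presentation, and the function names are hypothetical.

```python
def block_bytes_sent(change_offset: int, change_len: int, block_size: int) -> int:
    """Bytes shipped by a block-level replicator: every touched block, in full."""
    first_block = change_offset // block_size
    last_block = (change_offset + change_len - 1) // block_size
    return (last_block - first_block + 1) * block_size


def amplification(change_offset: int, change_len: int, block_size: int) -> float:
    """Ratio of bytes shipped to bytes actually changed."""
    return block_bytes_sent(change_offset, change_len, block_size) / change_len


BLOCK_SIZE = 4096  # assumed block size, for illustration
RECORD_LEN = 40    # assumed size of one updated row or message

# A small update that fits inside one block still ships the whole block.
aligned = amplification(0, RECORD_LEN, BLOCK_SIZE)

# The same update straddling a block boundary ships two whole blocks.
straddling = amplification(BLOCK_SIZE - 6, RECORD_LEN, BLOCK_SIZE)

print(f"Within one block: {aligned:.1f}x")
print(f"Across a boundary: {straddling:.1f}x")
```

Replicating the row or message itself (the "atomic data" model) would ship roughly the 40 changed bytes plus framing, which is the contrast the slide draws.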
5. Trend #3: End of Single Location
• Single Location Warehouses are Challenged
• Time to Query
• User Experience & SLA
• Data volumes and WAN bandwidth
• Regulatory and Security
• Integrated System Dependencies
• Clients / customers / applications are all in motion (mobile platform & need for
• Impact of Moving from Single Location
• Dynamic data synchronization
• 1 Second global SLA for data synchronization – emerging standard for risk
• Mechanisms for distributed data sync are different
• PUSH = the new Data Fabric
• PULL = existing WAN Optimization
• Need for a new model for WAN optimization (beyond zlib / dedupe)
• Networks can’t handle file copy (block); it must be data
• Elasticity in data movement – the “fabric” must be able to buffer
• Turns the file and database replication model on its head: 1 to many
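The push, 1-to-many, buffered model described on this slide can be sketched as a small in-memory simulation. This is a toy under stated assumptions, not Tervela's API: the `Fabric` and `Subscriber` classes, the site names, and the buffer capacities are all hypothetical, and a real fabric would add networking, persistence, and retry logic.

```python
from collections import deque


class Subscriber:
    """A remote site with a bounded buffer that absorbs bursts of updates."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.buffer = deque()
        self.state = {}  # the site's local copy of the data

    def offer(self, key, value):
        """Accept a pushed update, or refuse it when the buffer is full."""
        if len(self.buffer) >= self.capacity:
            return False
        self.buffer.append((key, value))
        return True

    def drain(self):
        """Apply buffered updates to local state in arrival order."""
        applied = 0
        while self.buffer:
            key, value = self.buffer.popleft()
            self.state[key] = value
            applied += 1
        return applied


class Fabric:
    """Pushes each update once to every subscriber (1 to many)."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, subscriber):
        self.subscribers.append(subscriber)

    def publish(self, key, value):
        """Push one record-level update; return names of sites that refused it."""
        return [s.name for s in self.subscribers if not s.offer(key, value)]


fabric = Fabric()
london = Subscriber("london", capacity=8)
tokyo = Subscriber("tokyo", capacity=2)  # deliberately small buffer
fabric.subscribe(london)
fabric.subscribe(tokyo)

# Publish a burst of three updates before either site drains its buffer.
rejected = [fabric.publish(k, v) for k, v in [("IBM", 100), ("MSFT", 250), ("IBM", 120)]]

london.drain()
tokyo.drain()
print(rejected)
print(london.state, tokyo.state)
```

The undersized buffer at the hypothetical "tokyo" site refuses the third update, leaving that site stale until the publisher retries, which illustrates why the slide asks for elasticity in the fabric rather than in each consumer.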
6. Data Virtualization & Distributed Storage
• Data Virtualization Layers
• Data (storage, DB, cache, streaming sources, state, etc…)
• Data Fabric (data movement, reliability, buffering, WAN services)
• Data transformation (EII) and coordination services (virtualization)
• Data Access / Interface
&
• Distributed Storage Model
• Data (storage, DB, cache, streaming sources, state, etc…)
• Data Fabric (data movement, reliability, buffering, WAN services)
• Legacy Interfaces
7. Impact of the New Model
• Database Vendor Market
• New Architectures (column store & distributed) can have the same reliability, enterprise features and far better performance
• Monolithic DB solutions no longer need to rely upon storage for DR / reliability
• Cost Structure – One size does NOT fit all
• Platform
• Cloud – Public / Private
• Existing Infrastructure
• Is there any difference?
• Elasticity of Compute
8. Adoption
• Early Adopters of the Model in the Enterprise
• Big Data and Mining:
• Options
• Back testing
• Regulatory and compliance
• Real-time risk
• Global position & Instrument Master
• Best Execution
• Hot-Hot DR
• Global Data Availability
• Flexible Computing Utilizing Cloud Technologies
• Complex derivative pricing
• Grid – DR
• Seamless integration of remote locations / venues
9. About Tervela: Data In Motion
The Tervela Data Fabric
The fastest, most reliable, and cost effective data transport system for globally distributed, mission-critical applications.
• 10-100x performance increase over traditional solutions
• Beyond 5x9’s
- built-in fault tolerance & high availability
• 50% faster to deliver new apps
- simple development tools & embedded services
• Data-layer security
- integrated data entitlements & protection
Products
• TMX: Message Switch
- Message transport through the fabric
• TPE: Persistence Engine
- Embedded storage within the fabric
• TPM: Provisioning & Management
- Central management of the fabric
• Data Fabric
- Optimized for Distributed Data and Applications
• Client APIs
- C, C++, C#, Java, JMS, PaaS
• Virtual Data Fabric Appliance
- Free Download: www.tervela.com/download