Data-Centric Infrastructure for Agile Development
SLIDE: 1
© COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
The Data-Centered Data Center
Presented by: Jim Clark, Senior Director of Product Management
SLIDE: 2
THE WORLD IS VERY APPLICATION-CENTRIC
SLIDE: 3
1. Design the application
2. Determine needed data
3. Determine needed queries
4. Design the schema and indexing strategy
5. Build a database
6. Design the ETL strategy
7. Load the data
8. Code the application
SLIDE: 4
OLTP
Warehouse
Data Marts
Archives
“Unstructured”
Video
Audio
Signals, Logs, Streams
Social
Documents, Messages
Metadata
Search
Reference Data
SLIDE: 5
HOW DO YOU DETERMINE IN ADVANCE WHAT'S USEFUL?
“Love the application... can you go back and include the data from 1990 – 1995?”
SLIDE: 6
TOO MUCH DATA TO BE COPYING FOR EVERY NEW APPLICATION
“Serious?! Third time this month I'm moving that data around!”
SLIDE: 7
ETL CONSUMES ALL RESOURCES
“With all of the new data we're trying to get into the database, there's no time to build new features!”
SLIDE: 8
TOO MANY TECHNOLOGIES CREATE SCALING HEADACHES
“To scale this system, we've got to buy new hardware. We can take the old hardware and move it to this other system. That one can't get any bigger. Period.”
SLIDE: 9
TOO MUCH AND TOO MANY COPIES... YOU'VE LOST CONTROL
“Who's reading it? Who's editing it? Where's the master copy? What's happened to it over time? Is it reliable?”
“How up-to-date is this data store? Are the security models consistent? Are there different backup models? Are the lifecycles, retention, disposal policies the same?”
SLIDE: 10
APPLICATION-CENTRIC DATA CENTER
SLIDE: 12
The data-centered data center
SLIDE: 13
How?
1. Hadoop
2. Low-cost Tiered Storage
3. Elasticity with no downtime
4. Manage the data lifecycle
5. On-premises, Cloud... both!
6. Create powerful data services
7. Complete database platform
8. Enterprise Readiness
SLIDE: 14
Enter Hadoop…
Hadoop
Staging
Analytics
Persistence
SLIDE: 16
Legacy RDBMS
o Indexes
o Transactions
o Security
o Enterprise operations
“NoSQL”
o Flexible data model
o Commodity scale out
o Distributed, fault-tolerant
o Hadoop sink/source
Why must we choose?
SLIDE: 17
Enterprise NoSQL
Flexible data model, comprehensive indexes
o Documents: Hierarchy, text, values, tags—schema “when you need it”
o Scalars: Aggregates and range filters, including geospatial
o Triples: Linked facts and inferencing
o Permissions: Users, roles, compartments, and privileges
o Queries: Reverse indexes for alerting, matching
Ad hoc queries, lock-free reads
Real-time transformation
Strict consistency, security throughout
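The "reverse indexes for alerting, matching" item can be illustrated with a toy sketch. Rather than indexing documents and running a query over them, the stored queries are indexed and each incoming document is matched against them. Everything here is an assumption made for illustration: the `ReverseIndex` class, its methods, and the alert names are hypothetical, queries are modeled simply as sets of required terms, and none of it is MarkLogic's API or implementation.

```python
from collections import defaultdict


class ReverseIndex:
    """Toy reverse index: stored queries are indexed, documents are matched on arrival."""

    def __init__(self):
        self.required = {}               # query id -> set of required terms
        self.by_term = defaultdict(set)  # term -> ids of queries that need it

    def register(self, query_id, terms):
        """Store a query as a set of lower-cased terms that must all be present."""
        terms = {t.lower() for t in terms}
        self.required[query_id] = terms
        for term in terms:
            self.by_term[term].add(query_id)

    def match(self, text):
        """Return the sorted ids of stored queries whose terms all occur in the text."""
        words = set(text.lower().split())
        hits = defaultdict(int)
        for word in words:
            for query_id in self.by_term.get(word, ()):
                hits[query_id] += 1
        return sorted(q for q, n in hits.items() if n == len(self.required[q]))


# Hypothetical stored alerts.
index = ReverseIndex()
index.register("fx-alert", ["swap", "eur"])
index.register("credit-alert", ["default"])

matched = index.match("New EUR interest rate swap booked")
```

Only the stored queries that share a term with the incoming document are examined, which is what makes this arrangement attractive when there are many standing alerts and a steady stream of new documents.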
SLIDE: 18
Data-centered
Enterprise
NoSQL
Hadoop
MarkLogic
SLIDE: 19
NoSQL
Online applications
Delivery
Decision-making
Real-time
Granular updates
Distributed indexes
Hadoop
Offline analytics
Staging
Model-building
Long-haul batch
Write-once, read-many
Distributed file system
Complementary approaches
SLIDE: 20
TIERED STORAGE
SLIDE: 21
With Tiered Storage You Can
o Provide multiple Service Level Agreements (SLAs) in a single system
o Decrease time and costs of ETL to bring offline content back online
o Empower your operations team without imposing burdens on your developers
SLIDE: 22
Tiered Storage
Here’s how you enable tiered storage…
Define data tiers based on a range index
Have content balanced into forests by tier
Move an entire tier to different storage
Query one tier…
…or the other tier…
…or both at once!
All with no downtime, and 100% consistency!
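The steps on this slide can be sketched in a few lines of Python. This is an illustrative model only, not MarkLogic's API: the `Tier` class, the `assign_tier` and `query` helpers, the tier names, the `trade_date` field, and the date boundaries are all hypothetical. It shows documents being placed into a tier according to a range over a date value, and a query running against one tier or against both.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Tier:
    """A named tier covering a half-open date range [lower, upper)."""
    name: str
    lower: date
    upper: date
    storage: str  # hypothetical label for where the tier lives
    docs: list = field(default_factory=list)

    def covers(self, value: date) -> bool:
        return self.lower <= value < self.upper


def assign_tier(tiers, doc):
    """Place a document in the first tier whose range covers its date."""
    for tier in tiers:
        if tier.covers(doc["trade_date"]):
            tier.docs.append(doc)
            return tier.name
    raise ValueError(f"no tier covers {doc['trade_date']}")


def query(tiers, predicate, tier_names=None):
    """Run a predicate over the chosen tiers (all tiers when tier_names is None)."""
    selected = [t for t in tiers if tier_names is None or t.name in tier_names]
    return [d for t in selected for d in t.docs if predicate(d)]


# Hypothetical tiers and boundaries, chosen only for illustration.
tiers = [
    Tier("active", date(2014, 1, 1), date(2015, 1, 1), storage="ssd"),
    Tier("archive", date(1990, 1, 1), date(2014, 1, 1), storage="hdfs"),
]

for doc in (
    {"id": 1, "trade_date": date(2014, 3, 5), "amount": 250},
    {"id": 2, "trade_date": date(1993, 7, 1), "amount": 900},
    {"id": 3, "trade_date": date(2014, 6, 9), "amount": 40},
):
    assign_tier(tiers, doc)


def is_large(doc):
    return doc["amount"] >= 100


active_only = query(tiers, is_large, tier_names={"active"})
both = query(tiers, is_large)

# Moving a tier changes only where it lives; its contents stay queryable.
tiers[1].storage = "s3"
after_move = query(tiers, is_large)
```

In this model the query code never mentions storage, which mirrors the slide's point that a tier can be moved without the application changing how it asks for data.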
SLIDE: 23
Case Study: OPERATIONAL TRADE STORE
SLIDE: 24
Tier 1 Bank: Operational trade store
“What are the bank’s obligations?”
ETL
Trade execution
Post-trade processing
Reporting
Analytics
Trade stores
Reference data
SLIDE: 25
Legacy trade store challenges
Long development cycles for new instrument types
Complex combinations of ETL and data models
Limited visibility across the business
Governance risk, maintenance costs of siloed infrastructure
Varied SLAs and access patterns created inefficiencies
SLIDE: 26
Preserving Context with Documents
Trade Cashflows
Party Identifier
Net Payment
Payment Date
Party Reference
Payer Party
Trade ID
Payment Amount
Receiver Party
Application Model
Provider Model
Persistence Model
SLIDE: 27
Information lifecycle
Time: Active → Historical → Archive
Active: SSD, DAS, SAN, Hadoop
Historical: DAS, SAN, NAS, Hadoop, S3
Archive: NAS, Hadoop, S3
SLIDE: 28
Active
o Local 10K SAS, RAID10
o Replication for HA
o Merge overhead for updates
o 20 hosts, 320 shards
o 4 TB of SSD cache
o 96 TB
SLIDE: 29
Compliance
o Shared NAS
o 63 hosts
o Effective 8 TB/host
Chart (TB): Active 96, Compliance 504
SLIDE: 30
Analytic
o Hadoop
o 120 hosts
o Effective 12 TB/host
o 10 MarkLogic hosts
Chart (TB): Active 96, Compliance 504, Analytic 1,044
SLIDE: 31
Online migration
Chart (TB): Active, Compliance, Analytic
SLIDE: 32
                             Operational   Compliance   Analytic
Total Size (TB)              96            504          1,044
Total Cost ($000)            592           2,066        2,080
Effective Unit Cost ($/GB)   $25           $4           $1.50
SLIDE: 33
Align infrastructure with objectives
Data volumes are increasing, but IT budgets are not
Storage is the dominant factor in the overall cost
Value of data and pattern of access varies widely and changes over time
Last month’s news
Current quarter’s open transactions
Latest message traffic
SLIDE: 35
ELASTICITY
SLIDE: 36
With Elasticity You Can
Know when to scale
Know how much to scale
Programmatically expand and contract
On premises or in the cloud
SLIDE: 37
Elasticity
Scale up and down with
o Tools to understand in detail how your cluster is performing, and to find bottlenecks
o Fine-grained tuning parameters for optimization of indexes, cache sizes, etc.
o Cloud orchestration APIs to expand and contract clusters programmatically on-prem or in the cloud
o Continuous, online rebalancing of content across nodes in a cluster to keep performance optimal for your cluster size
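A rough sketch of what "know when to scale, how much to scale" and "online rebalancing" might look like in code follows. It is a simplified model under stated assumptions: the `target_hosts` and `rebalance` functions, the utilization thresholds, and the round-robin placement are all hypothetical and are not taken from any MarkLogic or cloud orchestration API.

```python
import math


def target_hosts(current_hosts, utilization, low=0.3, high=0.7, goal=0.5):
    """Return the host count that brings average utilization back to `goal`.

    The cluster is left alone while utilization stays within [low, high].
    All thresholds are illustrative assumptions, not product defaults.
    """
    if low <= utilization <= high:
        return current_hosts
    return max(1, math.ceil(current_hosts * utilization / goal))


def rebalance(doc_ids, hosts):
    """Spread documents evenly across `hosts` nodes (round-robin for simplicity)."""
    placement = {h: [] for h in range(hosts)}
    for i, doc_id in enumerate(doc_ids):
        placement[i % hosts].append(doc_id)
    return placement


hosts = 4
hosts = target_hosts(hosts, utilization=0.9)  # overloaded, so the cluster expands
layout = rebalance(range(12), hosts)          # content is spread over the new size
```

The same `target_hosts` call also contracts the cluster when utilization falls below the lower threshold, so expansion and contraction share one decision rule in this sketch.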
SLIDE: 39
The data-centered data center
Index once
Single security model
Flexible data model
Transactions
Elastic operations
…when you need them
Simplified governance
SLIDE: 40
THE DATA-CENTERED DATA CENTER
SECURE: Minimize duplication, costly ETL, reduce risk
REAL-TIME: Enterprise-class database for real-time search, delivery & analytics
RUN APPLICATIONS: Run mission critical applications directly on HDFS
SLIDE: 41
POWERFUL
Deliver more value, build more powerful applications
o Full Text Search
o Scalable Analytic Functions
o Alerting & Event Processing
o Geospatial Query
o In-database MapReduce
o Visualization Widgets
o Semantics: RDF & SPARQL
o Flexible Indexes
o JSON Storage
o REST & Java APIs
o Triple Index
AGILE
Prepare for and respond quickly to change
o BI Integration
o HDFS & Amazon S3 Storage
o Elastic
o Programmatic Controls & Metering
o Application Builder
o Information Studio
o SQL Support
o Hadoop Connector
o Tiered Storage
o Cloud Ready
o Schema-Agnostic
o mlcp Content Pump
TRUSTED
Enterprise-ready and secure for mission-critical apps
o ACID Transactions
o XA Distributed Transactions
o Database Rollback
o Backup/Restore
o Automated Failover
o Journal Archiving
o Replication
o Point-in-time Recovery
o Monitoring & Management
o Role-based Security & LDAP Support
o Common Criteria Security Certification
o Configuration Management
SLIDE: 42
Take-Aways
New and more data is both an opportunity and a threat
Last generation of data management is not sufficient
More copies, representations, transformations increase risk and slow innovation
Index once and reuse across workloads, lifecycle
NoSQL: indexing and updates for interactive apps
Hadoop: staging, persistence, and analytics
SLIDE: 43
SEARCH
DATABASE
APPLICATION SERVICES
SLIDE: 44
Any Questions?