Apache Cassandra is the leading distributed database, in use at thousands of sites with the world’s most demanding scalability and availability requirements. Apache Spark is a distributed data analytics framework that has gained significant traction for processing large amounts of data efficiently and in a user-friendly manner. Together they provide a powerful combination of real-time data collection and analytics. After a brief overview of Cassandra and Spark, this class dives into various aspects of the integration.
3. 1. KPI is a Silver Level DataStax Partner
2. KPI is a top-tier sponsor of Cassandra Summit
• September 22-24, 2015, Santa Clara, CA
3. KPI and its consultants have implemented DataStax at multiple retail and financial services customers
5. 1. Use Case Requirements for Data Model
2. Security and Encryption Requirements
3. Service Level Agreements
4. Operational Requirements (Monitor and Manage)
5. Search Requirements (DataStax Search)
6. Analytics Requirements (DataStax Analytics)
6. 1. Key to success: “get the data model right”
2. Leverage what is in place:
1. Query logs
2. Define specific Create, Read, Update, and Delete “CRUD” requirements
3. DataStax Security
1. Authentication requirements (e.g. Kerberos, password, SSL, LDAP)
2. Authorization requirements (e.g. access to a schema, table, or other database components)
4. Encryption
1. Client application to DataStax, i.e. client-to-cluster (a hedged driver-side sketch follows this slide)
2. Node-to-node (internode, within the cluster)
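To make the authentication and client-side encryption items concrete, here is a minimal sketch using the DataStax Python driver (cassandra-driver), assuming password authentication and client-to-node TLS; the contact point, credentials, and CA file path are hypothetical. Node-to-node encryption is configured on the servers, not in the driver.

```python
# Hedged sketch: password auth + client-to-node TLS with cassandra-driver.
# The contact point, credentials, and CA path below are hypothetical.
import ssl

from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster

# Password authentication; Kerberos or LDAP would use a different provider.
auth = PlainTextAuthProvider(username="app_user", password="app_secret")

# TLS for the client-to-cluster hop; node-to-node (internode) encryption
# is configured server-side in cassandra.yaml, not here.
ssl_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS)
ssl_ctx.load_verify_locations("/path/to/cluster_ca.pem")
ssl_ctx.verify_mode = ssl.CERT_REQUIRED

cluster = Cluster(["10.0.0.1"], auth_provider=auth, ssl_context=ssl_ctx)
session = cluster.connect()
```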
7. 5. SLAs
1. A highly recommended “must have”
2. Lack of SLAs leads to project failure.
6. Understand that you are building a mission-critical system
1. Make sure to define operational monitoring and management of the system
7. DataStax Search
1. Define Search Requirements
2. Determine the fields that will be searched on and returned (e.g. multiple search fields vs. a single search field, faceted results vs. ranked-list results, etc.)
8. 7. DataStax Analytics
1. Analytics requirements should be captured at this time.
8. Analytics requirements should incorporate:
1. statistical algorithms,
2. required data sources,
3. data movement/modifications,
4. security/access,
5. other analytical requirements, all captured at a clear enough level to enable a thorough design.
9. 1. Data Model Design
2. Data Access Object Design
3. Data Movement Design
4. Operational Design (Management and Monitoring)
5. Search Design
6. Analytics Design
10. 1. Data Model Design should clearly include:
1. Keyspace design (replication strategy, name)
2. Table design (table names, partition keys, clustering columns (if applicable), and physical table properties as necessary, e.g. encryption, bloom filter settings, etc.)
3. Any relationships between tables. Note that server-side joins are not supported within DataStax Enterprise, so the model must be designed around queries instead. However, relationships between tables are still important, especially for the application developers (a minimal data-model sketch follows this slide).
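As an illustration of the keyspace and table design points above, here is a minimal query-first sketch using the driver session from the earlier example; the keyspace, table, columns, and data center name are hypothetical.

```python
# Hypothetical "readings by device per day" model, designed around the
# query "latest readings for a device on a given day".
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS iot
    WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3}
""")

# Composite partition key (device_id, day) bounds partition size; the
# clustering column reading_time keeps each partition sorted so the
# target query is a single-partition, already-ordered read.
session.execute("""
    CREATE TABLE IF NOT EXISTS iot.readings_by_device_day (
        device_id    uuid,
        day          date,
        reading_time timestamp,
        value        double,
        PRIMARY KEY ((device_id, day), reading_time)
    ) WITH CLUSTERING ORDER BY (reading_time DESC)
""")
```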
11. 2. Projects are more successful when they leverage simple Data Access Objects
1. Simple Data Access Objects are the best way to encapsulate and abstract data-manipulation logic.
2. This is opposed to the current trend in application development, where projects leverage frameworks to encapsulate, abstract, and represent database components as application objects, e.g. Hibernate, LINQ, JPA, and other ORMs.
3. Designing the Data Access Object up front, as much as possible, will help the application development team as they build out higher-level functionality (see the DAO sketch after this slide).
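A minimal sketch of such a Data Access Object over the hypothetical table above, using prepared statements from the Python driver; error handling and paging are omitted for brevity.

```python
# Hedged DAO sketch: one class owns all CQL for the readings table, so
# application code never touches statements directly.
class ReadingsDao:
    def __init__(self, session):
        self.session = session
        self._insert = session.prepare(
            "INSERT INTO iot.readings_by_device_day "
            "(device_id, day, reading_time, value) VALUES (?, ?, ?, ?)")
        self._read_day = session.prepare(
            "SELECT reading_time, value FROM iot.readings_by_device_day "
            "WHERE device_id = ? AND day = ?")

    def save(self, device_id, day, reading_time, value):
        self.session.execute(self._insert, (device_id, day, reading_time, value))

    def readings_for_day(self, device_id, day):
        return list(self.session.execute(self._read_day, (device_id, day)))
```

Keeping the DAO this thin avoids the ORM-style impedance mismatch noted above while still giving the application a single, testable seam for data access.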
12. 3. Data Movement Design is essential to your success
1. Batch and real-time data integration between systems
2. ETL, Change Data Capture, data pipelines, etc.
3. Data types, transformation logic, error handling, look-ups, and data normalization should be clearly documented (a hedged batch example follows this slide).
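As one concrete shape this documentation can take, here is a hedged batch-ETL sketch using Spark and the Spark Cassandra Connector; the file path, connector coordinates, host, and table names are hypothetical, and column types are assumed to line up with the target schema.

```python
# Hedged batch-movement sketch (assumes the Spark Cassandra Connector is
# on the classpath, e.g. via --packages
# com.datastax.spark:spark-cassandra-connector_2.12:3.4.1).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("readings-etl")
         .config("spark.cassandra.connection.host", "10.0.0.1")
         .getOrCreate())

# Extract: a raw CSV drop from an upstream system (hypothetical path).
raw = spark.read.option("header", "true").csv("/data/incoming/readings.csv")

# Transform: type coercion and simple error handling (bad rows dropped);
# a real pipeline would also log the rejects.
clean = (raw
         .withColumn("value", F.col("value").cast("double"))
         .dropna(subset=["device_id", "day", "reading_time", "value"]))

# Load: append into the Cassandra table from the data model sketch.
(clean.write.format("org.apache.spark.sql.cassandra")
      .options(keyspace="iot", table="readings_by_device_day")
      .mode("append")
      .save())
```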
13. 4. Operational Design
1. Tooling and the techniques used to:
1. deploy new nodes, configure and upgrade nodes in the cluster, run backup and restore operations, monitor the cluster, use OpsCenter, run repairs, alert, and execute disaster management processes
2. KPI recommends using a “playbook” approach to Operational Design (a minimal playbook-entry sketch follows this slide).
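As an example of what one playbook entry might look like, here is a minimal, hedged health-check sketch that parses nodetool status output; it assumes nodetool is on the PATH of the host running the check.

```python
# Hedged playbook-entry sketch: flag any node that is not Up/Normal.
import subprocess

def nodes_needing_attention():
    out = subprocess.run(["nodetool", "status"],
                         capture_output=True, text=True, check=True).stdout
    # Node lines begin with a two-letter code: U/D (up/down) plus
    # N/L/J/M (normal/leaving/joining/moving). Anything but "UN" is
    # worth a look.
    bad_codes = ("DN", "DL", "DJ", "DM", "UL", "UJ", "UM")
    return [line for line in out.splitlines() if line[:2] in bad_codes]

if __name__ == "__main__":
    problems = nodes_needing_attention()
    if problems:
        print("ALERT: nodes needing attention:")
        print("\n".join(problems))
```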
14. 5. Search Design
1. Incorporate items such as:
1. searchable terms, returned terms, tokenizers, filters, multi-document search terms, etc. (see the search sketch after this slide)
6. DataStax Analytics Design
1. Determine which Analytics components will be leveraged in the solution.
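To illustrate the search-design items, here is a hedged sketch of querying DSE Search through CQL's solr_query pseudo-column; it assumes a search index has already been created on a hypothetical iot.devices table with name and site fields.

```python
# Hedged DSE Search sketch: Solr query syntax passed through CQL,
# reusing the session from the driver setup shown earlier.
rows = session.execute(
    "SELECT device_id, name FROM iot.devices WHERE solr_query = %s",
    ["name:sensor* AND site:warehouse"])
for row in rows:
    print(row.device_id, row.name)
```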
15. 1. Infrastructure
2. Deployment and Configuration Management
3. Software Components (Data Model and Application)
4. Unit Testing of Components
16. 1. Application Development – use an Agile or Waterfall methodology, as desired by your organization
2. Deployment and Configuration Management Mechanism
1. The key in a distributed system is to automate as much as possible
2. OpsCenter, Docker, Vagrant, Chef, Puppet, etc. should be leveraged.
3. Unit Testing of Components
1. More complex with distributed systems than with single-node systems.
2. Specific defects, such as race conditions, are only observed “at scale”.
3. Unit testing should be executed over a small cluster that contains more than a single node.
4. Tools such as ccm can be used by developers to automate the process of quickly launching test clusters as part of a unit test (a hedged fixture sketch follows this slide).
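A minimal sketch of such a fixture using ccm's Python library (ccmlib); the Cassandra version string is hypothetical, and the first run downloads that version, so expect some startup latency.

```python
# Hedged test-fixture sketch: launch a 3-node local cluster with ccmlib.
import tempfile

from ccmlib.cluster import Cluster

def make_test_cluster():
    path = tempfile.mkdtemp()
    cluster = Cluster(path, "unit_test", cassandra_version="2.1.8")
    cluster.populate(3).start()  # three local nodes, not just one
    return cluster

# Tests would then connect a driver to the local nodes and call
# cluster.stop() in teardown.
```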
18. 1. Critical to enable the project team to identify actual issues prior to going to production “at scale”
2. A minimum two-week period where the application runs at production scale.
3. It may take several iterations of configuration, code changes, and refactoring to enable full execution.
19. 4. Operational Readiness Checklist
1. Replace a downed node and a dead seed node
2. Configure and execute repair (within gc_grace_seconds; a hedged scheduling sketch follows this checklist)
3. Add a node to a cluster
4. Replace a downed Data Center
5. Add a Data Center to the cluster
6. Decommission a node
7. Restore a backup
8. At the cluster level and per node, report on errors, throughput, latency, resource saturation, bottlenecks, compactions, flushes, and health
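For the repair item in particular, here is a hedged sketch of a scripted, per-keyspace primary-range repair that a scheduler (cron, etc.) could run on each node well within gc_grace_seconds; the keyspace list is hypothetical.

```python
# Hedged repair sketch: primary-range repair per keyspace via nodetool.
import subprocess

KEYSPACES = ["iot"]  # run well within gc_grace_seconds (default 10 days)

def repair(keyspace):
    # -pr repairs only this node's primary ranges; running it on every
    # node in turn covers the whole cluster without duplicate work.
    subprocess.run(["nodetool", "repair", "-pr", keyspace], check=True)

for ks in KEYSPACES:
    repair(ks)
```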
20. Highlight the normal, operational mode of an application built on DataStax Enterprise.
Prepare for all eventualities, and address them by adding nodes to expand the capacity of the system when needed.
Scale with DataStax Enterprise.
The attached presentation is intended for technical audiences. It provides good detail on data modeling as well as pre-production testing. The main takeaway is that, if the PoC is well constructed, you can move directly into the pre-production testing phase of this approach, skipping the requirements-through-implementation phases. This highlights the scaling advantage of Apache Cassandra and DataStax Enterprise.