Contact Us
510.818.9480 | www.kpipartners.com | © KPI Partners Inc.
Start Here
Brian Dominguez | Director of Client Services | KPI Partners
DataStax and Analytics
Implementation Methodology
1. KPI is a Silver Level DataStax Partner
2. KPI is a top tier sponsor at Cassandra Summit
• September 22-24, 2015, Santa Clara, CA
3. KPI and its consultants have implemented
DataStax at multiple retail and financial services
customers
1. Use Case Requirements for Data Model
2. Security and Encryption Requirements
3. Service Level Agreements
4. Operational Requirements (Monitor and Manage)
5. Search Requirements (DataStax Search)
6. Analytics Requirements (DataStax Analytics)
1. Key to success: “get the data model right”
2. Leverage what is in place:
1. Query logs
2. Define specific Create, Read, Update, and Delete (CRUD) requirements
3. DataStax Security
1. Authentication requirements (e.g. Kerberos, password, SSL, LDAP, etc.)
2. Authorization requirements (e.g. access to a schema, table, or other database
components)
4. Encryption
1. Client Application to DataStax (the Cluster)
2. Node-to-Node (Inter-Cluster)
5. SLAs
1. Highly recommended; treat them as a “must have”
2. A lack of SLAs leads to project failure.
6. Understand you are building a mission critical system
1. Make sure to define operational monitoring and management of the system
7. DataStax Search
1. Define Search Requirements
2. Determine the fields that will be searched on and returned (e.g. multiple
search fields or a single search field, the use of faceted results vs. ranked list
results, etc.)
8. DataStax Analytics
1. Analytics requirements should be captured at this time.
9. Analytics requirements should incorporate:
1. statistical algorithms,
2. required data sources,
3. data movement/modifications,
4. security/access,
5. other analytical requirements at a clear enough level to enable a thorough
design.
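To make the SLA requirement above concrete, a latency SLA can be written as a percentile target and checked against measured samples. The following is a minimal sketch only: the function names, the sample latencies, and the 95th-percentile threshold are illustrative assumptions, not figures from the deck.

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list; pct is an integer 1..100."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def meets_sla(latencies_ms, pct, threshold_ms):
    """True if the pct-th percentile latency is at or below the threshold."""
    return percentile(latencies_ms, pct) <= threshold_ms


# Hypothetical read latencies in milliseconds, for illustration only.
read_latencies_ms = [4, 5, 5, 6, 7, 8, 9, 12, 15, 40]

if __name__ == "__main__":
    print("p95 under 50 ms:", meets_sla(read_latencies_ms, 95, 50))
    print("p95 under 20 ms:", meets_sla(read_latencies_ms, 95, 20))
```

Stating the SLA as a percentile rather than an average keeps a single slow outlier from being hidden by many fast requests.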
1. Data Model Design
2. Data Access Object Design
3. Data Movement Design
4. Operational Design (Management and Monitoring)
5. Search Design
6. Analytics Design
1. Data Model Design should clearly include:
1. Keyspace Design (Replication Strategy, Name)
2. Table Design (Table Names, Partition Keys, Clustering Columns (if applicable),
and physical table properties as necessary (e.g. encryption, bloom filter
settings, etc.))
3. Any relationships between tables. Note that joins between tables are not
supported at the database (CQL) level in DataStax Enterprise. However,
relationships between tables are still important, especially for the application
developers.
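As an illustration of what a data model design document pins down, the sketch below generates the CQL for a keyspace (name and replication strategy) and a table (partition key and clustering column). The keyspace, data center, table, and column names are hypothetical, and the helper functions are assumptions made for this example rather than part of any DataStax tooling.

```python
def keyspace_ddl(name, dc_replication):
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy.

    dc_replication maps a data center name to its replication factor.
    """
    opts = ", ".join(
        "'%s': %d" % (dc, rf) for dc, rf in sorted(dc_replication.items())
    )
    return (
        "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = "
        "{'class': 'NetworkTopologyStrategy', %s};" % (name, opts)
    )


def table_ddl(keyspace, table, columns, partition_key, clustering=()):
    """Build a CREATE TABLE statement with an explicit partition key."""
    cols = ", ".join("%s %s" % (col, cql_type) for col, cql_type in columns)
    key = ", ".join(["(%s)" % ", ".join(partition_key)] + list(clustering))
    return "CREATE TABLE IF NOT EXISTS %s.%s (%s, PRIMARY KEY (%s));" % (
        keyspace, table, cols, key)


if __name__ == "__main__":
    # Hypothetical names: one query ("orders for a customer") drives one table.
    print(keyspace_ddl("retail", {"dc1": 3}))
    print(table_ddl(
        "retail", "orders_by_customer",
        [("customer_id", "uuid"), ("order_time", "timestamp"), ("total", "decimal")],
        partition_key=["customer_id"], clustering=["order_time"]))
```

Naming the table after the query it serves (here, orders by customer) is one common way to keep the table relationships visible to application developers when joins are unavailable.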
2. Projects are more successful when they leverage
simple Data Access Objects
1. Simple Data Access Objects are best to encapsulate and abstract data
manipulation logic.
2. This runs against the current trend in application development, where
projects leverage frameworks to encapsulate, abstract, and represent
database components as application objects, e.g. Hibernate, LINQ, JPA, and
other ORMs.
3. Designing the Data Access Object, as much as possible, up front will help the
application development team as they build out higher-level functionality.
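A simple Data Access Object might look like the sketch below: one small class per table that owns the CQL and exposes plain methods to the rest of the application. The class, table, and column names are hypothetical. The session is assumed to offer an `execute(query, parameters)` method in the style of the DataStax Python driver, and a stand-in session is used here so the example runs without a cluster.

```python
class OrderDao:
    """Encapsulates all CQL for the (hypothetical) orders_by_customer table."""

    INSERT = (
        "INSERT INTO retail.orders_by_customer (customer_id, order_time, total) "
        "VALUES (%s, %s, %s)"
    )
    SELECT = (
        "SELECT order_time, total FROM retail.orders_by_customer "
        "WHERE customer_id = %s"
    )

    def __init__(self, session):
        # Any object with execute(query, parameters) works (assumed interface).
        self._session = session

    def add_order(self, customer_id, order_time, total):
        self._session.execute(self.INSERT, (customer_id, order_time, total))

    def orders_for(self, customer_id):
        return list(self._session.execute(self.SELECT, (customer_id,)))


class RecordingSession:
    """Stand-in for a real session: records calls instead of contacting a cluster."""

    def __init__(self):
        self.calls = []

    def execute(self, query, parameters=None):
        self.calls.append((query, parameters))
        return []


if __name__ == "__main__":
    session = RecordingSession()
    dao = OrderDao(session)
    dao.add_order("c-1", "2015-09-22 10:00:00", 19.99)
    print(dao.orders_for("c-1"), len(session.calls))
```

Because the DAO accepts the session as a constructor argument, the same class can be exercised against a stand-in in fast tests and against a real multi-node cluster later.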
3. Data Movement Design is essential to your success
1. Batch and real-time data integration between systems
2. ETL, Change Data Capture, data pipelines, etc.
3. Data types, transformation logic, error handling, look-ups, and data
normalization should be clearly documented.
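The items a data movement design should document (data types, transformation logic, error handling, look-ups, and normalization) can be illustrated with a small batch step. This is a sketch under assumptions: the record fields, the country look-up table, and the reject-and-continue error policy are invented for the example and are not taken from the deck.

```python
from datetime import datetime

# Hypothetical look-up table used to normalize free-text country values.
COUNTRY_LOOKUP = {"us": "US", "usa": "US", "united states": "US", "de": "DE"}


def transform(record):
    """Convert one raw source record into typed, normalized values.

    Raises KeyError for a missing field and ValueError for a bad value.
    """
    country_key = record["country"].strip().lower()
    if country_key not in COUNTRY_LOOKUP:
        raise ValueError("unknown country: %r" % record["country"])
    return {
        "customer_id": record["customer_id"].strip(),
        "order_time": datetime.strptime(record["order_time"], "%Y-%m-%d %H:%M:%S"),
        "total_cents": int(round(float(record["total"]) * 100)),
        "country": COUNTRY_LOOKUP[country_key],
    }


def run_batch(records):
    """Transform a batch, collecting rejects with a reason instead of aborting."""
    loaded, rejected = [], []
    for record in records:
        try:
            loaded.append(transform(record))
        except (KeyError, ValueError) as exc:
            rejected.append((record, str(exc)))
    return loaded, rejected


if __name__ == "__main__":
    sample = [
        {"customer_id": " c-1 ", "order_time": "2015-09-22 10:00:00",
         "total": "19.99", "country": "USA"},
        {"customer_id": "c-2", "order_time": "2015-09-22 10:05:00",
         "total": "5.00", "country": "Atlantis"},
    ]
    loaded, rejected = run_batch(sample)
    print(len(loaded), "loaded;", len(rejected), "rejected")
```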
4. Operational Design
1. Tooling and techniques used to:
1. Deploy new nodes, configure and upgrade nodes in the cluster, back up and
restore data, monitor the cluster, use OpsCenter, run repairs, raise alerts,
manage disaster recovery processes, etc.
2. KPI recommends using a "playbook" approach to Operational
Design.
5. Search Design
1. Incorporate items such as:
1. Searchable terms, returned terms, tokenizers, filters,
multi-document search terms, etc.
6. DataStax Analytics Design
1. Determine which Analytics components will be leveraged in the
solution.
1. Infrastructure
2. Deployment and Configuration Management
3. Software Components (Data Model and
Application)
4. Unit Testing of Components
1. Application Development – use Agile or Waterfall methodology as
desired by your organization
2. Deployment and Configuration Management Mechanism
1. Key in a distributed system is the need to automate as much as possible
2. OpsCenter, Docker, Vagrant, Chef, Puppet, etc. should be leveraged.
3. Unit Testing of Components
1. More complex with distributed systems compared to single node systems.
2. Specific defects, such as race conditions, are only observed “at scale”.
3. Unit testing should be executed over a small cluster that contains more than a
single node.
4. Tools such as ccm can be used by developers to automate the process of
quickly launching test clusters as part of a unit test.
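The ccm approach described above can be sketched as a small helper that starts a multi-node local cluster before a test and removes it afterwards. The argument lists follow the `ccm create <name> -v <version> -n <nodes> -s` and `ccm remove <name>` forms from the ccm README and should be checked against the installed ccm version. The cluster name, the Cassandra version, and the helper names are placeholders chosen for this example.

```python
import subprocess


def ccm_create_command(cluster_name, cassandra_version, node_count):
    """Return the ccm arguments that create, populate, and start a local cluster."""
    if node_count < 2:
        # A single node hides the distributed behaviour the test should exercise.
        raise ValueError("use more than one node for cluster-level unit tests")
    return ["ccm", "create", cluster_name, "-v", cassandra_version,
            "-n", str(node_count), "-s"]


def ccm_remove_command(cluster_name):
    """Return the ccm arguments that stop and delete the named cluster."""
    return ["ccm", "remove", cluster_name]


def run_with_test_cluster(test_fn, cluster_name="unit_test",
                          cassandra_version="2.1.9", node_count=3):
    """Run test_fn against a throwaway cluster; requires ccm on the PATH."""
    subprocess.check_call(
        ccm_create_command(cluster_name, cassandra_version, node_count))
    try:
        return test_fn()
    finally:
        # Always clean up, even when the test fails.
        subprocess.check_call(ccm_remove_command(cluster_name))


if __name__ == "__main__":
    # Only prints the command; call run_with_test_cluster to launch a cluster.
    print(" ".join(ccm_create_command("unit_test", "2.1.9", 3)))
```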
1. Defect tracking (JIRA, Issue Log)
2. Operational readiness checklist completed
1. Critical to enable the project team to identify actual
issues prior to going to production “at scale”
2. A minimum two-week period where the application is running
at production scale.
3. It may take several iterations of configuration, code
change, and refactoring to enable full execution
4. Operational Readiness Checklist
1. Replace a downed node and a dead seed node
2. Configure and execute repair (within the GC grace period, gc_grace_seconds)
3. Add a node to a cluster
4. Replace a downed Data Center
5. Add a Data Center to the cluster
6. Decommission a node
7. Restore a backup
8. At a Cluster Level and Per Node Level, report on errors, throughput, latency,
resource saturation, bottlenecks, compactions, flushes, and health
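For the repair item in the checklist above, one quick sanity check is whether a full repair cycle completes inside the GC grace period. The sketch below uses Cassandra's default `gc_grace_seconds` of 864000 (10 days). The function name and the example repair interval and duration are assumptions for illustration.

```python
DEFAULT_GC_GRACE_SECONDS = 864000  # Cassandra's default gc_grace_seconds (10 days)


def repair_schedule_ok(repair_interval_hours, repair_duration_hours,
                       gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """True if one repair cycle (interval plus run time) fits in gc_grace_seconds."""
    cycle_seconds = (repair_interval_hours + repair_duration_hours) * 3600
    return cycle_seconds < gc_grace_seconds


if __name__ == "__main__":
    # Hypothetical schedules: weekly and fortnightly repairs that take 12 hours.
    print("weekly repair ok:", repair_schedule_ok(7 * 24, 12))
    print("fortnightly repair ok:", repair_schedule_ok(14 * 24, 12))
```

If a table overrides `gc_grace_seconds`, the table's own value should be passed in place of the default.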
• Highlight the normal, operational mode of an application built on
DataStax Enterprise.
• Prepare for all eventualities, and address them by adding nodes to expand
the capacity of the system when needed.
• Scale with DataStax Enterprise.
Tableau via ODBC
R for Visualization (Spark Analytics)
Next Steps
DataStax Representative KPI Partners
DataStax Pricing
DataStax Demo
• Schedule a Lunch & Learn
• Free 1 Hour DataStax Assessment Call
Contact
Brian Dominguez
brian.dominguez@kpipartners.com
617-510-7512
or at
info@kpipartners.com
www.kpipartners.com
Who To Contact?
KPI PARTNERS
Booth 111
September 22-24
KP Partners: DataStax and Analytics Implementation Methodology

Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 

KPI Partners: DataStax and Analytics Implementation Methodology

  • 1. Contact Us 510.818.9480 | www.kpipartners.com | © KPI Partners Inc. Start Here. Brian Dominguez | Director of Client Services | KPI Partners. DataStax and Analytics Implementation Methodology
  • 2. 2
  • 3. 1. KPI is a Silver Level DataStax Partner 2. KPI is a top-tier sponsor at Cassandra Summit • September 22-24, 2015, Santa Clara, CA 3. KPI and its consultants have implemented DataStax at multiple retail and financial services customers
  • 4.
  • 5. 1. Use Case Requirements for Data Model 2. Security and Encryption Requirements 3. Service Level Agreements 4. Operational Requirements (Monitor and Manage) 5. Search Requirements (DataStax Search) 6. Analytics Requirements (DataStax Analytics)
  • 6. 1. Key to success: “get the data model right” 2. Leverage what is in place: 1. Query logs 2. Define specific Create, Read, Update, and Delete (“CRUD”) requirements 3. DataStax Security 1. Authentication Req. (e.g. Kerberos, Password, SSL, LDAP, etc.) 2. Authorization Req. (e.g. access to Schema, Table, or other database components) 4. Encryption 1. Client Application to DataStax (the Cluster) 2. Node-to-Node (Inter-Cluster)
  • 7. 5. SLAs 1. Highly recommended; treat as a “must have” 2. Lack of SLAs leads to project failure. 6. Understand you are building a mission-critical system 1. Make sure to define operational monitoring and management of the system 7. DataStax Search 1. Define Search Requirements 2. Determine the fields that will be searched on and returned (e.g. multiple search fields or a single search field, the use of faceted results vs. ranked list results, etc.)
  • 8. 8. DataStax Analytics 1. Analytics requirements should be captured at this time. 9. Analytics requirements should incorporate: 1. statistical algorithms, 2. required data sources, 3. data movement/modifications, 4. security/access, 5. other analytical requirements at a clear enough level to enable a thorough design.
  • 9. 1. Data Model Design 2. Data Access Object Design 3. Data Movement Design 4. Operational Design (Management and Monitoring) 5. Search Design 6. Analytics Design
  • 10. 1. Data Model Design should clearly include: 1. Keyspace Design (Replication Strategy, Name) 2. Table Design (Table Names, Partition Keys, Clustering Columns (if applicable), and physical table properties as necessary (e.g. encryption, bloom filter settings, etc.)) 3. Any relationships between tables. Note that database joins are not technically feasible within DataStax Enterprise. However, relationships between tables are still important, especially for the application developers.
  • 11. 2. Projects are more successful when they leverage simple Data Access Objects 1. Simple Data Access Objects are best for encapsulating and abstracting data manipulation logic. 2. This runs counter to the current trend in application development, where projects leverage frameworks to encapsulate, abstract, and represent database components as application objects (e.g. Hibernate, LINQ, JPA, and other ORMs). 3. Designing the Data Access Object up front, as much as possible, will help the application development team as they build out higher-level functionality.
  • 12. 3. Data Movement Design is essential to your success 1. Batch and real-time data integration between systems 2. ETL, Change Data Capture, data pipelines, etc. 3. Data types, transformation logic, error handling, look-ups, and data normalization should be clearly documented.
  • 13. 4. Operational Design 1. Tooling and techniques used to: 1. deploy new nodes, configure and upgrade nodes in the cluster, run backup and restore operations, monitor the cluster, use OpsCenter, run repairs, handle alerting, carry out disaster management processes, etc. 2. KPI recommends using a "playbook" approach to Operational Design.
  • 14. 5. Search Design 1. Incorporate items such as: 1. searchable terms, returned terms, tokenizers, filters, multi-document search terms, etc. 6. DataStax Analytics Design 1. Determine which Analytics components will be leveraged in the solution.
  • 15. 1. Infrastructure 2. Deployment and Configuration Management 3. Software Components (Data Model and Application) 4. Unit Testing of Components
  • 16. 1. Application Development: use Agile or Waterfall methodology as desired by your organization 2. Deployment and Configuration Management Mechanism 1. Key in a distributed system is the need to automate as much as possible 2. OpsCenter, Docker, Vagrant, Chef, Puppet, etc. should be leveraged. 3. Unit Testing of Components 1. More complex with distributed systems than with single-node systems. 2. Specific defects, such as race conditions, are only observed "at scale". 3. Unit testing should be executed over a small cluster that contains more than a single node. 4. Tools such as ccm can be used by developers to automate the process of quickly launching test clusters as part of a unit test.
  • 17. 1. Defect tracking (JIRA, Issue Log) 2. Operational readiness checklist completed
  • 18. 1. Critical to enable the project team to identify actual issues “at scale” prior to going to production 2. Minimum two-week period where the application is running at production scale. 3. It may take several iterations of configuration, code change, and refactoring to enable full execution
  • 19. 4. Operational Readiness Checklist 1. Replace a downed node and a dead seed node 2. Configure and execute repair (within the gc_grace_seconds period) 3. Add a node to a cluster 4. Replace a downed Data Center 5. Add a Data Center to the cluster 6. Decommission a node 7. Restore a backup 8. At a Cluster Level and Per Node Level, report on errors, throughput, latency, resource saturation, bottlenecks, compactions, flushes, and health
  • 20. Highlight the normal, operational mode of an application built on DataStax Enterprise. Prepare for all eventualities, and address them by adding nodes to expand the capacity of the system when needed. Scale with DataStax Enterprise.
  • 21. Tableau via ODBC R for Visualization (SPARK Analytics)
  • 22. Tableau via ODBC R for Visualization (SPARK Analytics)
  • 23. Next Steps: DataStax Representative, KPI Partners, DataStax Pricing, DataStax Demo • Schedule a Lunch & Learn • Free 1 Hour DataStax Assessment Call. Who to contact: Brian Dominguez, brian.dominguez@kpipartners.com, 617-510-7512, or info@kpipartners.com, www.kpipartners.com
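
Slide 11 recommends simple Data Access Objects over heavyweight mapping frameworks. The following is a minimal sketch of that idea in Python, not KPI's implementation. The Order type, the shop.orders_by_customer table, and the FakeSession class are all hypothetical; the DAO only assumes a session object exposing execute(query, params), which is the general shape of a driver session.

```python
from dataclasses import dataclass
from typing import Any, Optional, Protocol, Sequence


class Session(Protocol):
    """Anything exposing execute(query, params), e.g. a database driver session."""

    def execute(self, query: str, params: Sequence[Any] = ()) -> Sequence[Any]: ...


@dataclass(frozen=True)
class Order:
    # Hypothetical record type used only for illustration.
    customer_id: str
    order_id: str
    total_cents: int


class OrderDao:
    """Hypothetical DAO: one small method per documented CRUD requirement."""

    def __init__(self, session: Session, keyspace: str = "shop") -> None:
        self._session = session
        self._table = f"{keyspace}.orders_by_customer"  # assumed table name

    def save(self, order: Order) -> None:
        self._session.execute(
            f"INSERT INTO {self._table} (customer_id, order_id, total_cents) "
            "VALUES (%s, %s, %s)",
            (order.customer_id, order.order_id, order.total_cents),
        )

    def find(self, customer_id: str, order_id: str) -> Optional[Order]:
        rows = self._session.execute(
            f"SELECT customer_id, order_id, total_cents FROM {self._table} "
            "WHERE customer_id = %s AND order_id = %s",
            (customer_id, order_id),
        )
        for row in rows:
            return Order(*row)
        return None


class FakeSession:
    """In-memory stand-in so the DAO can be exercised without a cluster."""

    def __init__(self) -> None:
        self.rows: dict = {}

    def execute(self, query: str, params: Sequence[Any] = ()) -> Sequence[Any]:
        key = (params[0], params[1])
        if query.startswith("INSERT"):
            self.rows[key] = tuple(params)
            return []
        row = self.rows.get(key)
        return [row] if row else []
```

Because the query text lives in one small class, application code calls dao.save(...) and dao.find(...) and never sees CQL, and swapping FakeSession for a real session does not change the callers.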

Editor's Notes

  1. The attached presentation is intended for technical audiences. It provides some good details on data modeling as well as Pre-Production testing. The main takeaway is that, if the PoC is well constructed, then you can move directly into the Pre-Production testing phase of this approach, skipping the requirements through implementation phases. This highlights the scaling advantage of Apache Cassandra and DataStax Enterprise.
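
To make the data modeling details from slide 10 concrete (keyspace name and replication strategy, table name, partition key, clustering columns), here is a small sketch that renders those design decisions as CQL statements. The helper names, the shop keyspace, the dc1 data center, and the orders_by_customer table are hypothetical; only the CQL statement shapes are standard.

```python
from typing import Mapping, Sequence


def keyspace_cql(name: str, replicas_per_dc: Mapping[str, int]) -> str:
    """Render CREATE KEYSPACE with NetworkTopologyStrategy and per-DC replicas."""
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += [f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items()]
    options = ", ".join(parts)
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} "
        "WITH replication = {" + options + "};"
    )


def table_cql(
    keyspace: str,
    table: str,
    columns: Mapping[str, str],
    partition_key: Sequence[str],
    clustering: Sequence[str] = (),
) -> str:
    """Render CREATE TABLE with an explicit partition key and clustering columns."""
    cols = ", ".join(f"{col} {cql_type}" for col, cql_type in columns.items())
    # The inner parentheses mark the partition key; what follows are clustering columns.
    key = ", ".join(["(" + ", ".join(partition_key) + ")", *clustering])
    return (
        f"CREATE TABLE IF NOT EXISTS {keyspace}.{table} "
        f"({cols}, PRIMARY KEY ({key}));"
    )


if __name__ == "__main__":
    print(keyspace_cql("shop", {"dc1": 3}))
    print(
        table_cql(
            "shop",
            "orders_by_customer",
            {"customer_id": "text", "order_id": "timeuuid", "total_cents": "int"},
            partition_key=["customer_id"],
            clustering=["order_id"],
        )
    )
```

Keeping the design in a reviewable form like this lets the partition key and clustering choices be checked against the documented query requirements before any table is created.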