C* Summit 2013: The Perils and Triumphs of using Cassandra at a .NET/Microsoft Shop by Derek Bromenshenkel and Jeff Smoley
1. reimagining the business of
apps
#Cassandra13
©2013 NativeX Holdings, LLC
The Perils and Triumphs of using
Cassandra at a .NET/Microsoft Shop
2. #Cassandra13
About the Presenters
Jeff Smoley – Infrastructure Architect
Derek Bromenshenkel – Infrastructure Architect
3. #Cassandra13
Agenda
• About NativeX
• Why Cassandra?
• Challenges
• Auto Id Generation
• FluentCassandra
• Hector
• IKVM.NET
• HectorNet
• Reporting Integration
4. ©2013 NativeX Holdings, LLC#Cassandra13
About NativeX
• Formerly W3i
• Home Office in Sartell, MN
• 75 miles NW of Minneapolis
• Remote Offices in MSP and SF
• 150 Employees
5. ©2013 NativeX Holdings, LLC#Cassandra13
What NativeX Does
• Marketing technology platform that enables developers to build successful businesses around their apps.
• We provide Publishers with a way to monetize and Advertisers with a way to gain distribution.
6. #Cassandra13
Mobile Vanity Metrics
• Over 700M unique devices
• 1000s of Apps
• > 100M Monthly Active Users
• > 200GB of data ingest per week
7. ©2013 NativeX Holdings, LLC#Cassandra13
Backstory
• From 100M sessions/quarter to 5B.
• Anticipate 7B sessions in Q2.
• Growth was anticipated.
• Realized infrastructure needed to change to support this.
[Chart: API Requests, in billions (axis 0 to 6), by quarter from 2011 Q4 to 2013 Q1]
8. ©2013 NativeX Holdings, LLC#Cassandra13
Original OLTP Architecture
• Microsoft SQL Server
• 2 Node Cluster (Failover)
• 12 cores / node
• 192 GB mem / node
• Compellent SAN
• 172 Tiered Disk
• SSD, FC, SATA
9. ©2013 NativeX Holdings, LLC#Cassandra13
Objectives
Scale
• Horizontal
• Incremental cost structure
Resiliency
• No single point of failure
• Geographically distributed
10. ©2013 NativeX Holdings, LLC#Cassandra13
What is NoSQL
• Stands for Not Only SQL.
• The NoSQL movement is about understanding problems and
focusing on solutions.
• It’s not about silver bullets and black boxes.
• It is about using the right tool for the right problem.
11. ©2013 NativeX Holdings, LLC#Cassandra13
Researched Products
• Compared features like:
• Distributed / Shared Nothing
• Multi-Cluster Support
• Maturity & Popularity
• Documentation
• .NET Support
12. ©2013 NativeX Holdings, LLC#Cassandra13
Why Cassandra?
• Multi-node
• Multi-cluster
• Highly Available
• Durable
• Shared Nothing
• Tunable Consistency
13. ©2013 NativeX Holdings, LLC#Cassandra13
Cassandra at NativeX
• C* was not a replacement DB system.
• We continue to use MS SQL Server alongside C*.
• SQL Server used for storing configuration data.
• C* solves a very specific problem for us.
• Writing large volumes of data quickly.
• Reading very specific data out of a large record set.
14. #Cassandra13
Challenges
• C* does not have Auto Id generation.
• How to connect to C* with C#?
• Finding a connector with good Failure Tolerance.
• How to integrate our reporting system?
15. ©2013 NativeX Holdings, LLC#Cassandra13
Auto ID Generation
• Pre-existing requirements
• Unique, 64-bit positive integers
• Increasing (sortable) a plus
• Previously SQL Server Identity column
• A Time-based UUID is sortable and unique
• Changed everything we could
• The future for us
16. ©2013 NativeX Holdings, LLC#Cassandra13
Auto ID – What are the options?
• SQL dummy table
• Easy & familiar, but limited
• Pre-generated range
• Proposed by @mdennis
• Distributed, but more complicated to implement
• Sharding [Instagram]
• Discovered too late
• Unfamiliar with Postgres
17. ©2013 NativeX Holdings, LLC#Cassandra13
We chose Snowflake
• Built by Twitter, Apache 2.0 license
• https://github.com/twitter/snowflake
• “… network service for generating unique ID numbers at high
scale..”
• Same motivation; MySQL -> C*
• A few tweaks for our Windows environment
18. ©2013 NativeX Holdings, LLC#Cassandra13
Technical reasons for Snowflake
• Meets all requirements
• Tested in high transaction system
• Java based [Scala] implementation
• Thrift server
• Run as a Windows service with Apache Daemon
• Con: Requires Apache Zookeeper
• Coordinate the worker id
19. ©2013 NativeX Holdings, LLC#Cassandra13
Connecting to Snowflake
• Built our own .NET Snowflake Client
• Snowflake server on each web node
• Local instance is primary
• Round robin failover to other nodes
• Auto failover AND recovery
• “Circuit Breaker” pattern
[Diagram: four web nodes, each pairing a Web App with a local SF (Snowflake) Server, numbered 1 to 4]
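The failover behavior described on this slide can be sketched roughly as follows. This is a Java illustration of the general "Circuit Breaker" pattern, not the NativeX .NET client. The names IdEndpoint, Breaker, and FailoverIdClient, the cool-down parameter, and the injectable clock are all hypothetical. Endpoint 0 stands in for the local Snowflake server. The others are tried round robin when the local one is unavailable, and a failed endpoint is retried once its cool-down has passed, which gives recovery as well as failover.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical ID-service endpoint; a real client would make a Thrift call here. */
interface IdEndpoint {
    long nextId() throws Exception;
}

/** Minimal circuit breaker: opens on failure, permits a retry after a cool-down. */
final class Breaker {
    private final long coolDownMs;
    private final LongSupplier clock;
    private long openedAt = -1; // -1 means closed (endpoint considered healthy)

    Breaker(long coolDownMs, LongSupplier clock) {
        this.coolDownMs = coolDownMs;
        this.clock = clock;
    }

    boolean allowsCall() {
        return openedAt < 0 || clock.getAsLong() - openedAt >= coolDownMs;
    }

    void recordSuccess() { openedAt = -1; }

    void recordFailure() { openedAt = clock.getAsLong(); }
}

final class FailoverIdClient {
    private final List<IdEndpoint> endpoints; // index 0 is the local (primary) instance
    private final Breaker[] breakers;
    private int nextRemote = 1; // round-robin cursor over the remote endpoints

    FailoverIdClient(List<IdEndpoint> endpoints, long coolDownMs, LongSupplier clock) {
        this.endpoints = endpoints;
        this.breakers = new Breaker[endpoints.size()];
        for (int i = 0; i < breakers.length; i++) {
            breakers[i] = new Breaker(coolDownMs, clock);
        }
    }

    synchronized long nextId() {
        Long id = tryEndpoint(0); // always prefer the local instance
        if (id != null) return id;
        int remotes = endpoints.size() - 1;
        for (int attempt = 0; attempt < remotes; attempt++) {
            int index = nextRemote;
            nextRemote = nextRemote % remotes + 1; // cycles through 1..remotes
            id = tryEndpoint(index);
            if (id != null) return id;
        }
        throw new IllegalStateException("no ID endpoint available");
    }

    /** Returns an ID, or null if the endpoint is skipped (breaker open) or fails. */
    private Long tryEndpoint(int index) {
        Breaker breaker = breakers[index];
        if (!breaker.allowsCall()) return null;
        try {
            long id = endpoints.get(index).nextId();
            breaker.recordSuccess();
            return id;
        } catch (Exception e) {
            breaker.recordFailure();
            return null;
        }
    }

    public static void main(String[] args) {
        IdEndpoint local = () -> { throw new Exception("local server down"); };
        IdEndpoint remote = () -> 42L;
        FailoverIdClient client =
                new FailoverIdClient(Arrays.asList(local, remote), 5000L, System::currentTimeMillis);
        System.out.println(client.nextId()); // served by the remote endpoint
    }
}
```

While a breaker is open the client does not call the failed endpoint at all, so a dead local server costs one failed call per cool-down period rather than one per request.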
20. #Cassandra13
Challenges
• C* does not have Auto Id generation.
• How to connect to C* with C#?
• Finding a connector with good Failure Tolerance.
• How to integrate our reporting system?
21. ©2013 NativeX Holdings, LLC#Cassandra13
Connecting to Cassandra with C#
• Thrift alone too low level
• Needs
• CQL support
• Active development / support
• Wants
• ADO.NET / LINQ feel
• ????
• FluentCassandra is where we started
22. ©2013 NativeX Holdings, LLC#Cassandra13
Vetting FluentCassandra
• Pros
• Open source -
https://github.com/fluentcassandra/fluentcassandra
• Nick Berardi, project owner, is excellent
• Designed for CQL
• Familiar feel
• Were able to start project development with it
23. ©2013 NativeX Holdings, LLC#Cassandra13
Vetting FluentCassandra
• Cons
• Immaturity
• Few users with high transaction system
• Permanent node blacklisting
• Lacked auto retry
• Couldn’t live with these limitations
• Tried hiring independent contractor to help us mature it
24. #Cassandra13
Challenges
• C* does not have Auto Id generation.
• How to connect to C* with C#?
• Finding a connector with good Failure Tolerance.
• How to integrate our reporting system?
25. ©2013 NativeX Holdings, LLC#Cassandra13
Hector: Yes, please
• Popular C* connector
• Use cases matching ours
• Good maturity
• Auto node discovery
• Auto retry
• Auto failure recovery
• Written in Java – major roadblock
26. ©2013 NativeX Holdings, LLC#Cassandra13
Help!
• We knew we still needed help.
• We found a company named Concord.
• Based out of the Twin Cities.
• Specialize in System, Process, and Data Integration.
• http://concordusa.com/
27. ©2013 NativeX Holdings, LLC#Cassandra13
Concord’s Recommendation
• Concord recommended that we use IKVM.NET to port Hector to
a .NET assembly.
• They had previous success using IKVM for other Java to .NET
ports.
• They felt that maturing FluentCassandra was going to take
longer than our timeline allowed.
28. ©2013 NativeX Holdings, LLC#Cassandra13
About the IKVM.NET Project
• http://www.ikvm.net/
• Open Source Project.
• Main contributor is Jeroen Frijters.
• He is actively contributing to the project.
• License allows for use in commercial applications.
29. ©2013 NativeX Holdings, LLC#Cassandra13
What is IKVM.NET?
• IKVM.NET includes the following components:
• A Java Virtual Machine implemented in .NET.
• A .NET implementation of the Java class libraries.
• Set of tools that enable Java and .NET interoperability.
30. ©2013 NativeX Holdings, LLC#Cassandra13
Uses for IKVM
• Drop-in JVM
• Included is a distribution of a .NET implementation of a Java
Virtual Machine.
• Allows you to run jar files using the .NET stack.
• Example: ikvm -jar myapp.jar
31. ©2013 NativeX Holdings, LLC#Cassandra13
Uses for IKVM
• Use Java libraries in your .NET applications
• Using ikvmc you can compile Java bytecode to .NET IL.
• Example: ikvmc -target:library mylib.jar
32. ©2013 NativeX Holdings, LLC#Cassandra13
Uses for IKVM
• Develop .NET applications in Java
• Write code in Java.
• Compile to JVM bytecode.
• Use ikvmc to produce a .NET Executable.
• Can also use .NET APIs in Java code using the ikvmstub
application to generate a Java jar file.
• Example: ikvmstub MyDotNetAssemblyName
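As a rough illustration of the "write Java, compile to bytecode, run ikvmc" workflow above, the following is an ordinary Java class with nothing .NET-specific in it. The class and file names are hypothetical, and the commands in the comments are a sketch based on the ikvmc options shown on the earlier slide (-target) plus its -out option, so check them against the IKVM.NET documentation for the version in use.

```java
// Hello.java: a plain Java class with no .NET-specific code.
//
// Sketch of the workflow (file names are hypothetical):
//   javac Hello.java                                compile to JVM bytecode
//   ikvmc -target:exe -out:Hello.exe Hello.class    compile bytecode to a .NET executable
//
// The resulting executable still needs the core IKVM assemblies at run time.
final class Hello {
    static String greeting(String name) {
        return "Hello from Java, " + name + "!";
    }

    public static void main(String[] args) {
        System.out.println(greeting(args.length > 0 ? args[0] : ".NET"));
    }
}
```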
33. ©2013 NativeX Holdings, LLC#Cassandra13
Hector Converted to .NET
• Per Concord’s recommendation we chose to compile the Hector
jar into a .NET Assembly.
• Hector and all of its dependencies are pulled into one .NET
dll that can be referenced by any .NET assembly.
• In addition you will have to reference some core IKVM
assemblies.
• Each Java dependency is given its own namespace within
the .NET dll.
34. ©2013 NativeX Holdings, LLC#Cassandra13
HectorNet
• Concord also created a dll called HectorNet that wraps some of
the Hector behaviors and makes it feel more like .NET.
• Such as supporting connection strings.
• Mapping Thrift byte arrays to .NET data types.
• Mapping to native .NET collections instead of using Java
collections.
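The byte-array mapping mentioned above is needed because the Thrift interface hands column names and values back as raw bytes. A minimal sketch of that kind of mapping, written in Java for consistency with Hector rather than in C#, might look like the following. It is not HectorNet's code. The class and method names are hypothetical, and it assumes the usual Cassandra encodings of 8-byte big-endian for long values and UTF-8 for text.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrative conversions between raw column bytes and typed values. */
final class ColumnBytes {
    private ColumnBytes() { }

    /** Encodes a long as 8 bytes, big-endian (ByteBuffer's default order). */
    static byte[] fromLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    /** Decodes an 8-byte big-endian value back into a long. */
    static long toLong(byte[] raw) {
        if (raw.length != Long.BYTES) {
            throw new IllegalArgumentException("expected 8 bytes, got " + raw.length);
        }
        return ByteBuffer.wrap(raw).getLong();
    }

    /** Encodes text as UTF-8 bytes. */
    static byte[] fromText(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes UTF-8 bytes back into text. */
    static String toText(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] id = fromLong(1234567890123L);
        byte[] name = fromText("session");
        System.out.println(toText(name) + " = " + toLong(id));
    }
}
```

A wrapper that does this in one place keeps application code working with native types instead of byte buffers.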
35. #Cassandra13
Challenges
• C* does not have Auto Id generation.
• How to connect to C* with C#?
• Finding a connector with good Failure Tolerance.
• How to integrate our reporting system?
36. ©2013 NativeX Holdings, LLC#Cassandra13
Integrating Reporting
[Diagram: OLTP (C*) → Extract, Transform, Load (ETL with SSIS) → OLAP (MS SQL) and CUBE (SSAS)]
37. ©2013 NativeX Holdings, LLC#Cassandra13
Integrating Reporting
• The SSIS Extract process uses C# Script Tasks.
• Script Task needs references to HectorNet and all of its
dependencies.
• SSIS can only reference assemblies that are in the GAC.
• Assemblies in the GAC have to be Signed.
38. #Cassandra13
Why Not DataStax C# Driver?
• We built everything using CQL 2.0.
• Wasn’t ready in time for our launch date.
39. #Cassandra13
DSE for the Win!
• We use DataStax Enterprise.
• Mainly for support, which continues to be a life saver.
40. ©2013 NativeX Holdings, LLC#Cassandra13
Thank you!
• We are hiring
• http://nativex.com/careers/
• Join the MSP C* Meetup
• http://www.meetup.com/Minneapolis-St-Paul-Cassandra-Meetup/
• Contact us
• Jeff.Smoley@nativex.com
• Derek.Bromenshenkel@nativex.com @breakingtrail
• Slide Deck
• http://www.slideshare.net/jjsmoley/the-perils-and-triumphs-of-using-cassandra-at-a-netmicrosoft-shop