
How THINQ runs both transactions and analytics at scale

THINQ provides a cloud-based Communications-Platform-as-a-Service (CPaaS) that routes tens of millions of phone calls per day for customers in enterprise and telecommunications industries. In this session Sasha Vaniachine, Senior Database Administrator at THINQ, explains how he combined MariaDB Server and MariaDB ColumnStore to support both high-performance transaction processing and scalable analytics. In addition, he shares some of THINQ's best practices and lessons learned from supporting an ever-increasing database workload that currently exceeds 10,000 transactions per second.

Published in: Software

How THINQ runs both transactions and analytics at scale

  1. How THINQ runs both transactions and analytics at scale
     • MySQL Triangle Meetup | 2019-01-24
  2. Open Source Enthusiast (MariaDB OpenWorks Conference | 2019-02-26)
     • I am the most senior DBA at THINQ and am fortunate to be among the early MySQL users
     • One of my use cases was listed in An Open Letter to the Community from MySQL Founders David Axmark & Michael "Monty" Widenius on 2 August 2005:
        • This year, we are celebrating ten years of MySQL…
        • Major free software projects and hugely-popular Web sites such as the Sahana (disaster recovery system for the tsunami), Ensembl.org and Human Genome Project (used for cancer research), Wikipedia, Bugzilla, Craigslist, Feedster, Flickr, Freshmeat, LiveJournal, Neopets, Slashdot, SugarCRM, Technorati, Wordpress, CERNs ATLAS Experiment -- all taking advantage of MySQL's speed, ease of use, flexibility, scalability and ecosystem
  3. Outline
     • THINQ provides a cloud-based Communications-Platform-as-a-Service (CPaaS) that routes tens of millions of phone calls per day for customers in enterprise and telecommunications
     • I will present how THINQ combined MariaDB ColumnStore/InfiniDB and MariaDB Galera Cluster to support both high-performance transaction processing and scalable analytics
     • In addition, I will share some of our best practices and lessons learned from supporting an ever-increasing database workload, a consequence of continued growth
  4. What is VoIP and LCR?
     • Since 2010, THINQ has provided Least Cost Routing for VoIP phone calls
     • VoIP stands for Voice over Internet Protocol
        • An example: Skype, which uses a proprietary protocol to set up calls
        • THINQ and most phone carriers use the standard Session Initiation Protocol (SIP)
        • THINQ and many others use the open-source application OpenSIPS to set up calls
     • LCR stands for Least Cost Routing
        • While setting up a VoIP phone call, choose the carrier providing the least expensive rate
     • Providing Least Cost Routing is not trivial (see next slide)
  5. Least Cost Routing: Solutions by Others
     • Least Cost Routing: the science of finding the most cost-effective path to connect your customers' calls
     • [Screenshot: About CGRateS (2); actively maintained; stats provided by openhub.net]
  6. Least Cost Routing: Solutions by THINQ
     • Industry's only toll-free LCR engine
     • 170+ Countries
     • 60+ Carriers
  7. Data types, retention periods and volumes
     • SIP Messages: retained for days; Terabytes
     • CDRs: retained for weeks or months; Terabytes
     • Aggregates: retained permanently; Gigabytes
  8. Setting up Phone Calls
     • THINQ processes dozens of SIP messages (text format, like HTTP/1.1) to set up one call
     • For debugging (SIP tracing), all SIP messages are captured in the MariaDB Server database
     • As a consequence of our continued growth, this results in an ever-increasing database workload: capturing more than 75,000 SIP messages per second
  9. Is there a way to scale up SIP message inserts in an RDBMS rather than inserting rows one by one?
  10. Scaling up Insert Rate: Multi-Row Inserts
     • Instead of flushing inserted rows one by one into the database, OpenSIPS will now store rows in memory, and only flush them to DB when a certain number of rows have piled up in memory
     • The flushing of the rows will be done within a single SQL command
     • 1. On a constant rate of 1500 CPS pushed through the proxy, the load on the mysql daemon was:
        • OpenSIPS 1.6.4: 71.5 %
        • OpenSIPS 1.7.0: 35.1 % -> BOOST = ~50 % lower load on mysql
     • http://www.opensips.org/About/PerformanceTests-InsertBuffering
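The insert buffering described on the slide above can be sketched in a few lines. This is a minimal illustration, not the OpenSIPS implementation: the `InsertBuffer` class, the `execute` callback, and the table and column names used in the usage note are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def build_multi_row_insert(table: str, columns: Sequence[str], n_rows: int) -> str:
    """Return one INSERT statement carrying n_rows placeholder tuples."""
    if n_rows < 1:
        raise ValueError("n_rows must be at least 1")
    row = "(" + ", ".join("?" for _ in columns) + ")"
    rows = ", ".join(row for _ in range(n_rows))
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES {rows}"


class InsertBuffer:
    """Collect rows in memory and flush them within a single SQL command."""

    def __init__(
        self,
        table: str,
        columns: Sequence[str],
        flush_size: int,
        execute: Callable[[str, List[object]], None],
    ) -> None:
        self.table = table
        self.columns = list(columns)
        self.flush_size = flush_size
        self.execute = execute  # hypothetical callback that runs the statement
        self.rows: List[Tuple[object, ...]] = []

    def add(self, row: Sequence[object]) -> None:
        if len(row) != len(self.columns):
            raise ValueError("row length does not match the column list")
        self.rows.append(tuple(row))
        if len(self.rows) >= self.flush_size:
            self.flush()

    def flush(self) -> None:
        if not self.rows:
            return
        sql = build_multi_row_insert(self.table, self.columns, len(self.rows))
        params = [value for row in self.rows for value in row]
        self.execute(sql, params)
        self.rows = []
```

With `flush_size=3`, seven calls to `add` trigger two statements of three rows each, and a final `flush()` writes the remaining row. Fewer statements means fewer round trips and transactions, though the number of bytes written stays roughly the same.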
  11. The transactions-per-second rate has been lowered by inserting multiple rows at once
     • Does the data rate (Mbps) remain the same?
     • How to scale up the I/O rate further?
  12. Scaling up I/O Rate
     • As a consequence of our growth, we now:
        • have optimized the my.cnf configuration for InnoDB disk I/O
        • use faster disks
        • split ("shard") the SIP message capture across several MariaDB Servers
  13. Scaling up Insert Rate: Multi-Row Inserts
     • Instead of flushing inserted rows one by one into the database, OpenSIPS will now store rows in memory, and only flush them to DB when a certain number of rows have piled up in memory
     • The flushing of the rows will be done within a single SQL command
     • 1. On a constant rate of 1500 CPS pushed through the proxy, the load on the mysql daemon was:
        • OpenSIPS 1.6.4: 71.5 %
        • OpenSIPS 1.7.0: 35.1 % -> BOOST = ~50 % lower load on mysql
     • 2. In terms of raw CPS, the maximum number of calls per second that the proxy could handle was:
        • OpenSIPS 1.6.4: 1700 CPS
        • OpenSIPS 1.7.0: 3000 CPS -> BOOST = ~75 % higher CPS
     • http://www.opensips.org/About/PerformanceTests-InsertBuffering
  14. Decoupling Database and Application
     • The increase in OpenSIPS throughput from using multi-row inserts indicates a tight coupling between the database and the application
     • Decoupling the application from the database requires:
        • a non-blocking MariaDB client library
        • application re-architecture
     • The latest version of OpenSIPS has been re-architected to decouple the application from the database
     • This presentation shares our experience with the case of tight coupling between the database and the application
  15. High Availability
     • A critical application tightly coupled to a database requires High Availability of the database server
     • For High Availability we use MariaDB Galera Cluster
     • Galera Cluster:
        ➢ Synchronous multi-master cluster
        ➢ no data loss
        ➢ no slave lag
        ➢ no slave failover
        ➢ For MySQL/InnoDB
        ➢ 3 or more nodes needed for HA
        ➢ No single point of failure
  16. Call Detail Records
     • After the phone call is complete, telecommunications providers must store a Call Detail Record (CDR)
     • In our case, upon phone call completion the OpenSIPS application generates a corresponding record for CDR accounting
     • A database query for the predefined accounting table:
        • INSERT INTO acc …
  17. CDR Processing
     • Records from the acc table are processed to generate CDRs and aggregate data further
     • Traditional solution:
        • Queue records for processing
        • Works well on a small scale
     • [Flavio E. Goncalves: Building Telephony Systems with OpenSIPS 1.6]
  18. Do you expect a problem with this approach when the CDR processing rate increases as a consequence of our growth?
  19. Scaling Up RDBMS Queues
     • https://www.engineyard.com/blog/5-subtle-ways-youre-using-mysql-as-a-queue-and-why-itll-bite-you
     • https://www.eschrade.com/page/why-mysql-is-not-a-queue
  20. Is there a way to process records that scales better than a queue implemented in an RDBMS?
  21. Triggers!
     • Scales much better
     • For processing and aggregation, THINQ uses a trigger on the acc table
     • Upon each acc row INSERT, the trigger increments the corresponding rows in the billing table using ON DUPLICATE KEY UPDATE for a unique index on the columns used for aggregation, such as customer_id, carrier_id
     • https://www.xaprb.com/blog/2007/01/11/how-to-implement-a-queue-in-sql
  22. Next Scalability Limit
     • We use:
        • OpenSIPS multi-row inserts to scale up inserts from numerous calls
        • Triggers to scale up CDR processing
        • MariaDB Galera cluster for High Availability (HA)
           • Trigger support is straightforward
           • Not so trivial to support EVENTs
     • As a consequence of growth, these solutions hit another scalability limit:
        • We encountered an increasing rate of deadlocks
  23. Increasing Rate of Deadlocks
     • Like:
     • Retry logic is of little or no help here as the deadlock may happen again
     • Consolidating writes to a single Galera Cluster node is of no help either
  24. Let us review our aggregation approach
     • The trigger:
          CREATE TRIGGER aggregate BEFORE INSERT ON acc
          FOR EACH ROW BEGIN
            INSERT INTO billing.aggregates
              (date, customer_id, carrier_id, calls, ...)
            VALUES
              (NEW.date, NEW.customer_id, NEW.carrier_id,
               IF(NEW.sip_code=200,1,0), # count completed calls
               ...)
            ON DUPLICATE KEY UPDATE
              calls = calls + VALUES(calls), ...;
          END
     • The tables:
          CREATE TABLE `acc` (
            `id` INT NOT NULL AUTO_INCREMENT,
            `date` DATE,
            `customer_id` INT,
            `carrier_id` INT,
            `sip_code` INT,
            ...
            PRIMARY KEY (`id`)
          );
          CREATE TABLE `billing`.`aggregates` (
            `id` INT NOT NULL AUTO_INCREMENT,
            `date` DATE,
            `customer_id` INT,
            `carrier_id` INT,
            `calls` INT,
            PRIMARY KEY (`id`),
            UNIQUE KEY `uk_customer_carrier` (`date`,`customer_id`,`carrier_id`)
          );
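To make the effect of the trigger on the slide above concrete, here is a small in-memory model of what `INSERT ... ON DUPLICATE KEY UPDATE calls = calls + VALUES(calls)` does to the aggregates. It is a sketch of the semantics only, not database code; the function name `aggregate_calls` and the dictionary row format are assumptions made for illustration.

```python
from typing import Dict, Iterable, Mapping, Tuple

# Mirrors the unique key (date, customer_id, carrier_id) on billing.aggregates.
AggregateKey = Tuple[str, int, int]


def aggregate_calls(acc_rows: Iterable[Mapping[str, object]]) -> Dict[AggregateKey, int]:
    """Model the per-row trigger: one upsert into the aggregates per acc row."""
    aggregates: Dict[AggregateKey, int] = {}
    for row in acc_rows:
        key = (str(row["date"]), int(row["customer_id"]), int(row["carrier_id"]))
        completed = 1 if row["sip_code"] == 200 else 0  # count completed calls
        # First row for a key behaves like the INSERT; later rows behave like
        # the ON DUPLICATE KEY UPDATE branch and add to the existing counter.
        aggregates[key] = aggregates.get(key, 0) + completed
    return aggregates
```

Every acc row touches exactly one aggregate row, so a multi-row INSERT into acc touches aggregate rows in the order in which its rows arrive. That ordering matters for the deadlocks discussed on the following slides.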
  25. Is it clear why these deadlocks started to happen as a consequence of our growth?
  26. What is causing deadlocks?
     • Concurrent multi-row INSERT queries may cause a deadlock when two transactions touch the same sets of data in a different order
     • Such as locking index rows in an opposite order
     • [Diagram] UNIQUE KEY INDEX ROWS: customer-1, carrier-1; customer-1, carrier-2; customer-1, carrier-3; customer-1, carrier-4; customer-1, carrier-5
     • [Diagram] (1) TRANSACTION:
        • Call 1: customer-1, carrier-1
        • Call 2: customer-1, carrier-3
        • Call 3: customer-1, carrier-2
        • Call 4: customer-2, carrier-1
        • …
        • Call 100: customer-2, carrier-1
     • [Diagram] (2) TRANSACTION:
        • Call 1: customer-1, carrier-5
        • Call 2: customer-1, carrier-3
        • Call 3: customer-1, carrier-4
        • Call 4: customer-1, carrier-1
        • …
        • Call 100: customer-6, carrier-1
  27. Why did deadlocks present a problem?
     • A deadlocked query on Galera Cluster is like a huge transaction
        • All later INSERT queries are waiting for it to complete
        • or to be rolled back
     • An increasing rate of deadlocks creates backpressure for our OpenSIPS application
        • Due to the tight coupling between the application and the database server
     • [Chart: Impact of Huge Transaction; series: Huge Transaction, Slave Lag; Trx in master 24 secs, Trx in slave 9 secs]
  28. Is there a way to avoid such deadlocks?
  29. Locking in the same order
     • When modifying different sets of rows in the same table, do those operations in a consistent order each time [MySQL Reference Manual]
     • Application developers can eliminate all risk of enqueue deadlocks by ensuring that transactions requiring multiple resources always lock them in the same order [Steve Adams]
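The lock-ordering rule quoted above can be checked with a simplified model: two transactions can only block each other in a cycle if they acquire some pair of shared rows in opposite order. The sketch below is an illustration under that assumption. It ignores InnoDB gap locks and Galera certification, and the function names and the (customer, carrier) tuples are hypothetical stand-ins for the unique index rows in the slide 26 diagram.

```python
from itertools import combinations
from typing import Dict, Hashable, List, Sequence, Tuple


def first_positions(keys: Sequence[Hashable]) -> Dict[Hashable, int]:
    """Map each key to the position at which the transaction first locks it."""
    positions: Dict[Hashable, int] = {}
    for index, key in enumerate(keys):
        positions.setdefault(key, index)
    return positions


def opposite_order_pairs(
    tx1: Sequence[Hashable], tx2: Sequence[Hashable]
) -> List[Tuple[Hashable, Hashable]]:
    """Return pairs of shared keys that the two transactions lock in opposite order."""
    pos1, pos2 = first_positions(tx1), first_positions(tx2)
    shared = sorted(set(pos1) & set(pos2))
    return [
        (a, b)
        for a, b in combinations(shared, 2)
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    ]


# The first four calls of each transaction from the slide 26 diagram,
# written as (customer, carrier).
TX1 = [(1, 1), (1, 3), (1, 2), (2, 1)]
TX2 = [(1, 5), (1, 3), (1, 4), (1, 1)]
```

For the unordered batches, `opposite_order_pairs(TX1, TX2)` reports the pair `((1, 1), (1, 3))`: transaction 1 locks carrier-1 before carrier-3, while transaction 2 locks them the other way round. Once both batches are sorted by the index columns, `opposite_order_pairs(sorted(TX1), sorted(TX2))` returns an empty list.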
  30. We need ordered sets of rows
     • Since the OpenSIPS application is open source, we can modify the INSERT statement generated by the OpenSIPS application
     • [Diagram] UNIQUE KEY INDEX ROWS: customer-1, carrier-1; customer-1, carrier-2; customer-1, carrier-3; customer-1, carrier-4; customer-1, carrier-5
     • [Diagram] (1) TRANSACTION:
        • Call 1: customer-1, carrier-1
        • Call 2: customer-1, carrier-2
        • Call 3: customer-1, carrier-3
        • Call 4: customer-2, carrier-1
        • …
        • Call 100: customer-2, carrier-1
     • [Diagram] (2) TRANSACTION:
        • Call 1: customer-1, carrier-3
        • Call 2: customer-1, carrier-4
        • Call 3: customer-1, carrier-5
        • Call 4: customer-1, carrier-7
        • …
        • Call 100: customer-6, carrier-9
  31. We would like to delegate row ordering to MariaDB, for example:
     • 1. Insert unordered data into a temporary table tmp
     • 2. Insert into acc select from tmp order by customer_id, carrier_id
     • Is there a better way?
  32. Avoiding Deadlocks
     • We order INSERT rows according to the index columns on-the-fly
        • by modifying the prepared statement generated by the OpenSIPS application for the acc table INSERT query
     • With UNION ALL, the modified prepared statement creates a set, which is then ordered by the unique index columns
     • A use case for the MariaDB UNION ALL optimization
     • OpenSIPS prepared statement:
          insert into acc (
            method, customer_id, carrier_id, callid, sip_code,
            sip_reason, time, duration, setuptime, created
          ) values
            (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)      -- row 1
          , (?,?,?,?,?,?,?,?,?,?)                -- row 2
          , …
            (?,?,?,?,?,?,?,?,?,?)                -- row 100
     • Modified prepared statement (derived table: select without from):
          insert into acc (
            method, customer_id, carrier_id, callid, sip_code,
            sip_reason, time, duration, setuptime, created
          )
            (select ? AS method, ? AS customer_id, ? AS carrier_id,
                    ? AS callid, ? AS sip_code, ? AS sip_reason, ? AS time,
                    ? AS duration, ? AS setuptime, ? AS created)   -- row 1
          union all
            (select ?,?,?,?,?,?,?,?,?,?)         -- row 2
          union all
            …
            (select ?,?,?,?,?,?,?,?,?,?)         -- row 100
          order by customer_id, carrier_id
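The rewrite shown on the slide above is mechanical, so the statement text can be generated for any batch size. The helper below is a sketch that only mirrors the shape of the modified prepared statement. It is not the OpenSIPS patch, and the function name `build_ordered_insert` and its parameters are hypothetical.

```python
from typing import Sequence


def build_ordered_insert(
    table: str, columns: Sequence[str], n_rows: int, order_by: Sequence[str]
) -> str:
    """Build INSERT ... (select ...) union all ... order by, as on the slide."""
    if n_rows < 1:
        raise ValueError("n_rows must be at least 1")
    unknown = [name for name in order_by if name not in columns]
    if unknown:
        raise ValueError(f"order_by columns not in column list: {unknown}")
    # Only the first select names its columns; the rows that follow in the
    # UNION ALL inherit those names, so ORDER BY can refer to them.
    first = "(select " + ", ".join(f"? AS {name}" for name in columns) + ")"
    other = "(select " + ", ".join("?" for _ in columns) + ")"
    selects = [first] + [other] * (n_rows - 1)
    return (
        f"insert into {table} ({', '.join(columns)}) "
        + " union all ".join(selects)
        + " order by "
        + ", ".join(order_by)
    )
```

For example, `build_ordered_insert("acc", ["customer_id", "carrier_id"], 2, ["customer_id", "carrier_id"])` yields one statement with two `select` branches joined by `union all` and a trailing `order by customer_id, carrier_id`. The parameter binding order is unchanged from the original multi-row INSERT, which is what makes the rewrite possible without touching the rest of the application.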
  33. Three use cases for ColumnStore/InfiniDB
     • 1. Conventional use case: analytics on the Call Detail Records
        • The CDRs are wide rows with 50-100 columns, while most analytics queries need just a few of those columns
        • Works best for approximately incremental data, such as time series
  34. 2. Data Aggregation with ColumnStore
     • How many calls happened during the last hour?
        • A considerable problem here is that this number may change, as Call Detail Records for the past hour may arrive with a delay
     • Under those conditions, with InfiniDB/ColumnStore we are able to repeat simple SQL queries for data aggregation
        • These SQL queries work well, since the latest data (for the few selected columns) fit in the memory of the ColumnStore/InfiniDB cluster nodes
        • In a traditional row-based MariaDB Server this simple approach would require prohibitively more memory to fit whole InnoDB rows
  35. 3. Redundancy for Critical Data
     • As statistical findings tolerate limited data losses, ACID compliance of the MariaDB ColumnStore (the InfiniDB legacy) is often overlooked
     • In contrast, the third use case requires both products
     • [Diagram, slide by Shane K Johnson, MariaDB] Telecommunications (IP telephony, SaaS); Database (Hybrid): Transactions, Analytics
        • Transactional: Capture call detail records; Charge by call/message; Generate bills
        • Analytical: Monitor usage; Identify peak periods; Estimate costs; Self-service analytics
  36. Double-Entry Accounting
     • Since the Middle Ages, accounting has used the double-entry system, where at least two accounting entries are required to record each financial transaction
     • In telecommunications, these financial transactions are a part of the Call Detail Record
     • In 1494, Fra Luca Bartolomeo de Pacioli published a book on the double-entry system of accounting
  37. Double-Entry Accounting for CDRs
     • Thanks to the ACID compliance of MariaDB ColumnStore/InfiniDB transactions, thinQ was able to implement double-entry accounting for Call Detail Records in practice
     • The ColumnStore/InfiniDB transactions enable us to verify/audit customers' billing done through Call Detail Records stored in MariaDB Server
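One way to picture the verification step above is a reconciliation of per-key totals computed independently by the two pipelines. The sketch below is only an assumption about what such an audit could look like; the slides do not describe the actual procedure, and the function name `reconcile` and its input format are hypothetical.

```python
from typing import Dict, Hashable, Tuple


def reconcile(
    primary_totals: Dict[Hashable, int],
    audit_totals: Dict[Hashable, int],
    tolerance: int = 0,
) -> Dict[Hashable, Tuple[int, int]]:
    """Return keys whose totals differ by more than tolerance between two stores."""
    discrepancies: Dict[Hashable, Tuple[int, int]] = {}
    for key in set(primary_totals) | set(audit_totals):
        primary = primary_totals.get(key, 0)  # e.g. totals from MariaDB Server
        audit = audit_totals.get(key, 0)      # e.g. totals from ColumnStore
        if abs(primary - audit) > tolerance:
            discrepancies[key] = (primary, audit)
    return discrepancies
```

An empty result means both entries agree for every key. A non-empty result lists each disagreeing key with the pair of totals, including keys that appear in only one of the two stores.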
  38. Traditional InfiniDB/CS Data Processing Pipeline
     • [Diagram "#6 Load New Data with Minimal Impact", repeated three times on the slide; labels: Operational Source Data (OLTP, Files/XML, Log Files); ETL; Staging Area (Staging or ODS); High-speed Load Utility; Data Warehouse (Data Warehouse and Metadata Management); Users (Ad-Hoc, Dashboards, Reports, Notifications)]
     • InfiniDB
     • HA Staging Area for processed data local to InfiniDB (e.g. costly HA storage)
     • HA Staging Area for raw data (e.g. data streams)
  39. Heterogeneous Redundancy Assures Scaling
     • Our experience with using both MariaDB Server and MariaDB ColumnStore/InfiniDB proved crucial in solving business challenges
        • as two distinct data processing pipelines built with two different technologies encountered scalability limits at different loads
     • Similar to having both mechanical and hydraulic braking in a car
        • Combining mechanical drum brakes with hydraulic brakes offers backup braking support in case the car's hydraulic system fails
  40. • To assure resilience, infrastructure is geo-redundant
     • Our call data is collected in geo-distributed data centers, which complicates analytics
        • Note the extra cost of large data transfers between data centers
     • According to the InfiniDB Concepts Guide, User and Performance Modules can be separated out in different data centers and geographic locations
        • We are pleased that MariaDB ColumnStore retained this feature
     • The beauty of this is that the analytical data aggregation queries are executed locally, with only small aggregated data transferred between data centers
  41. Tips for Geo-Distributed MariaDB ColumnStore
     • Three combined UM/PM nodes: PM3 in a different geo-location than PM1 & PM2
     • Watch for idle TCP/IP connections dropped by data center firewalls
        • Implement a keep-alive ping, such as periodic execution of a test data aggregation query
     • Watch for automatic round-robin distribution of queries:
        • First query execution is fast (0.3s), logged on the PM1 node
        • Second execution is fast (0.3s), logged on the PM2 node
        • Third execution is slow (18s), logged on the PM3 node in a data center separated by 25 ms RTT from PM1 & PM2
     • David Thompson (MariaDB VP) kindly provided a workaround for round-robin query distribution:
        • Change the Columnstore.xml ExeMgr IP addresses to 127.0.0.1 on all three nodes
     • With these, the geo-distributed MariaDB ColumnStore system operates stably
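The keep-alive ping mentioned above can be as simple as a loop that runs a cheap query at a fixed interval and records how long each run took. This is a generic sketch, not the THINQ setup: `run_query` is a hypothetical callable that executes the test aggregation query over the connection to be kept open, and the interval must be chosen below the firewall's idle timeout, which the slides do not state.

```python
import time
from typing import Callable, List


def keep_alive(
    run_query: Callable[[], object],
    interval_s: float,
    rounds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Run the test query `rounds` times, pausing `interval_s` between runs.

    Returns the measured latency of each run in seconds.
    """
    latencies: List[float] = []
    for i in range(rounds):
        started = time.monotonic()
        run_query()  # hypothetical: executes the test data aggregation query
        latencies.append(time.monotonic() - started)
        if i < rounds - 1:
            sleep(interval_s)
    return latencies
```

Logging the returned latencies also helps to spot the round-robin pattern described on the slide, where every third execution is much slower than the other two.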
  42. We look forward to the new remote mcsimport capabilities of the ColumnStore Bulk Write SDK
  43. New ColumnStore Data Processing Pipeline
     • Improvements in the data processing pipeline provided by the remote mcsimport
     • [Diagram "#6 Load New Data with Minimal Impact", repeated on the slide; labels: Operational Source Data (OLTP, Files/XML, Log Files); ETL; Staging Area (Staging or ODS); High-speed Load Utility; Data Warehouse (Data Warehouse and Metadata Management); Users (Ad-Hoc, Dashboards, Reports, Notifications)]
     • Remote HA Bulk Data Loaders
     • MariaDB ColumnStore
     • HA Staging Area for raw data (e.g. data streams)
  44. vs. Traditional InfiniDB/CS Data Processing Pipeline
     • [Diagram "#6 Load New Data with Minimal Impact", repeated three times on the slide; labels: Operational Source Data (OLTP, Files/XML, Log Files); ETL; Staging Area (Staging or ODS); High-speed Load Utility; Data Warehouse (Data Warehouse and Metadata Management); Users (Ad-Hoc, Dashboards, Reports, Notifications)]
     • InfiniDB
     • HA Staging Area for processed data local to InfiniDB (e.g. costly HA storage)
     • HA Staging Area for raw data (e.g. data streams)
  45. Online Schema Change while Streaming
     • Tight coupling of streaming applications to the CS schema complicates schema changes
     • Redundant applications for data streaming enable schema changes without data loss
     • Prepare the new application (e.g. mcsimport job.xml) for the new schema
        • e.g. describe new columns as <DefaultColumn colName="col7"/>
     • Stop the old application (mcsimport cron job)
     • Provide enough buffer (staging area) for the data
     • Use the ColumnStore function (it takes time):
          select calonlinealter('alter table foo add column col7 int;');
          alter table foo add column col7 int comment 'schema sync only';
     • Start the new application (mcsimport cron job)
  46. High Availability with ColumnStore Bulk Write SDK
     • By their nature, data streaming applications run continuously
     • Redundant applications could increase data streaming uptime, since if one application fails, a second application would still be running
     • How do you implement HA/failover between data streaming applications using the bulk write SDK remotely?
     • MariaDB developers provide functions to view and clear table locks remotely
     • In the case of MariaDB Server, a transaction is rolled back upon client failure
     • Perhaps the MariaDB Platform X3 may implement a similar behavior for ColumnStore
  47. More Open Source Benefits
     • Some ColumnStore features are documented as open source code
     • A failed cpimport may result in locks that cannot be cleared with cleartablelock. Andrew Hutchings (MariaDB CS Lead) pointed to one useful option documented in such a way:
        • If your ColumnStore installation is running fine now these locks can be removed using a hidden cleartablelock option, '-l'. For example: /usr/local/mariadb/columnstore/bin/cleartablelock -l 1
        • The downside is if the table really does exist and there was data to rollback, it cannot be rolled back any more. This is why we don't really publish this option.
     • I used this option for a table with reference data, such as customers
     • https://groups.google.com/d/msg/mariadb-columnstore/B0fDukIgUzM/FUGBiZR7AgAJ
  48. MariaDB ColumnStore Summary
     • Most analytics queries read only a few of the table columns
     • Works best for approximately incremental data, such as time series
     • Repeated aggregation works well since the data could fit in memory
     • Features often overlooked:
        • ACID transactions enable heterogeneous redundancy for critical data
        • Geo-distributed cluster works stably
        • While streaming, you may change the schema without data loss
        • With caution, you may use the hidden cleartablelock option
        • Latest bulk write SDK enables remote data upload
  49. Conclusion
     • Wise past technology choices (MariaDB/Galera and InfiniDB) provided THINQ with a consolidated roadmap for future upgrades
     • [Diagram] MariaDB TX 3.0: MariaDB Server 10.3, MariaDB MaxScale 2.2, InnoDB/MyRocks
     • [Diagram] MariaDB AX 2.0: MariaDB Server 10.2, MariaDB MaxScale 2.2, ColumnStore 1.2
     • [Diagram] MariaDB Platform X3: MariaDB MaxScale 2.3; MariaDB Server 10.3 with InnoDB/MyRocks; MariaDB Server 10.3 with ColumnStore 1.3
