Mashing the Data
Real-Time Replication from MySQL to Google Cloud Datastore
Ingredients
● MySQL
● NodeJS
● ZongJi
● Google Cloud Datastore
There are two types of DBAs:
1) DBAs that do backups
2) DBAs that will do backups
MySQL
● Most-used open-source DB; second place overall after Oracle (but almost equal)*
● Since 1995
● Currently at version 5.7 (5.7.16 in Oct’16)
● Several forks - MariaDB, Percona
● Several storage engines, most used is InnoDB
● NDB Cluster and Master-Master Replication for HA
* According to http://db-engines.com/en/ranking
A SQL query walks into a bar and sees two tables.
He walks up to them and asks, "Can I join you?"
MySQL replication
● Master - Slave(s)
● Slaves can in turn act as Masters (Master->Slave->Slave->...->Slave)
○ log_slave_updates
● Only data-modifying queries are logged (Create, Update, Delete; not Reads)
● 2 ½ types of replication
○ Statement Based (SBR) -> the binary log records queries (UPDATE … SET …), which are then replayed on the slave
○ Row Based (RBR) -> the binary log directly records the values of each affected row before and after the change is applied
○ Mixed -> the binary log records a mix of SBR and RBR (the default is SBR, but for certain statements and storage engines, logging automatically switches to row-based)
Q: Why do you never ask SQL people to help you move your furniture?
A: They sometimes drop the table.
MySQL replication (cont’d)
● SBR is good when changes affect lots of rows (e.g. for 1k modified rows, only a few bytes are sent across the wire)
● SBR has problems when master and slave are inconsistent or when queries are not deterministic (e.g. UPDATE … SET … LIMIT 100)
● RBR is good at maintaining consistency (as every changed row is replicated)
● RBR can be problematic when many rows are changed with a single statement (lots of traffic over the network)
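The size trade-off between the two formats can be illustrated with a toy model. The sketch below is not how MySQL encodes its binary log: the "orders" table, the statement, and the JSON row images are all made up for illustration, and the byte counts only hint at the order-of-magnitude difference.

```javascript
// Toy model only: real binlog events use a compact binary encoding, not JSON.
// A hypothetical "orders" table with 1000 rows on the master.
const rows = Array.from({ length: 1000 }, (_, i) => ({ id: i + 1, status: 'new' }));

// SBR: one event carrying the statement text, regardless of rows affected.
const statementEvent = "UPDATE orders SET status = 'done' WHERE status = 'new'";

// RBR: one before/after image per changed row.
const rowEvents = rows.map((row) => ({ before: row, after: { ...row, status: 'done' } }));

const sbrBytes = Buffer.byteLength(statementEvent);
const rbrBytes = Buffer.byteLength(JSON.stringify(rowEvents));

console.log('SBR payload bytes:', sbrBytes);
console.log('RBR payload bytes:', rbrBytes);
```

The statement payload stays the same size however many rows match, while the row-image payload grows linearly with the number of changed rows.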
Google Cloud Datastore
What is GCD
● NoSQL document database
● Automatic scaling
● High performance
● Flexible storage
GCD (cont’d)
● Balance of strong and eventual consistency
○ Entity lookups by key and ancestor queries always receive strongly consistent data
○ Other queries are eventually consistent
● Encryption at rest
○ Encrypts all data before it is written to disk
● Querying of data through GQL
○ Similar to “classic” SQL; e.g. SELECT * FROM myKind WHERE myProp >= 100 AND myProp < 200 or SELECT * FROM myKind ORDER BY myProp DESC LIMIT 100
● By default all properties are indexed; composite indexes are supported (a bit more work to enable them though)
Our Setup
Setup
[Diagram: MySQL Master, MySQL Slave, NodeJS App, Google Cloud Datastore; connector labels: SBR, RBR, Google Cloud Node modules]
Details about NodeJS App
● Uses ZongJi (https://github.com/nevill/zongji - MySQL binlog listener)
// Connect to MySQL as a replication client
var ZongJi = require('zongji');
var zongji = new ZongJi(config.database);
// Route each event type to a common handler (doSomething is defined further down)
zongji.on('binlog', function (evt) { doSomething('binlog', evt); });
zongji.on('query', function (evt) { doSomething('query', evt); });
zongji.on('writerows', function (evt) { doSomething('insert', evt); });
zongji.on('updaterows', function (evt) { doSomething('update', evt); });
zongji.on('deleterows', function (evt) { doSomething('delete', evt); });
NodeJS (cont’d)
zongji.start({
  startAtEnd: true, // only receive events that occur after startup
  includeSchema: {"yourDBhere": true, "yourOtherDBHere": true}, // or config.monitor
  includeEvents: ['tablemap', 'writerows', 'updaterows', 'deleterows', 'query', 'rotate']
});
var doSomething = function (type, event) {
  // event has a rows attribute containing every modified row
  // it also has a tableMap containing table metadata (most important: the table name)
};
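A possible body for doSomething is sketched below. It assumes that a row event exposes tableId, a tableMap keyed by that id with a tableName field, and that update events carry {before, after} pairs in rows; these should be checked against the ZongJi version in use. The helper name toChanges and the sample event are hypothetical.

```javascript
// Hypothetical helper: flattens one row event into a list of change records.
// Assumed event shape (verify against your ZongJi version):
//   event.tableId, event.tableMap[tableId].tableName, event.rows
function toChanges(type, event) {
  const meta = event.tableMap[event.tableId];
  return event.rows.map((r) => ({
    table: meta.tableName,
    type: type,
    // Update events are assumed to carry { before, after } pairs.
    row: type === 'update' ? r.after : r,
  }));
}

// Hand-made event standing in for what the binlog listener would deliver.
const sampleUpdate = {
  tableId: 42,
  tableMap: { 42: { parentSchema: 'yourDBhere', tableName: 'users' } },
  rows: [{ before: { id: 1, name: 'Ann' }, after: { id: 1, name: 'Anna' } }],
};

console.log(toChanges('update', sampleUpdate));
```

Keeping this step as a pure function makes it possible to test the mapping without a running MySQL server.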
NodeJS (last one, I promise)
var sendToDataStore = function (namespace, idfldname, row) {
  // Build the entity key from the namespace argument and the row's id column
  var k = datastore.key([namespace, row[idfldname]]);
  datastore.save({key: k, data: row}, function (err, res) {
    if (err) console.log("ERROR", err);
    else console.log("OK", JSON.stringify(res));
  });
};
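sendToDataStore only covers saves. One way to also propagate deletes is sketched below, with the Datastore client passed in as a parameter so the logic can be exercised against a stub. It assumes the client offers key, save, and delete as in the @google-cloud/datastore Node module, and that save and delete return promises when no callback is given; applyChange, the change record shape, and the stub are hypothetical.

```javascript
// Hypothetical sketch: apply one change record to a Datastore-like client.
// `client` is assumed to provide key(path), save(entity) and delete(key).
function applyChange(client, kind, idField, change) {
  const key = client.key([kind, change.row[idField]]);
  if (change.type === 'delete') {
    return client.delete(key);
  }
  // Inserts and updates both become a save of the full row.
  return client.save({ key: key, data: change.row });
}

// In-memory stub standing in for the real client, for illustration only.
function makeStubClient() {
  const store = new Map();
  return {
    store: store,
    key: (path) => path.join('/'),
    save: (entity) => Promise.resolve(store.set(entity.key, entity.data)),
    delete: (key) => Promise.resolve(store.delete(key)),
  };
}

const stub = makeStubClient();
applyChange(stub, 'users', 'id', { type: 'insert', row: { id: 1, name: 'Ann' } });
applyChange(stub, 'users', 'id', { type: 'delete', row: { id: 1, name: 'Ann' } });
console.log('entities left in stub:', stub.store.size);
```

With the real client, the returned promises would need error handling equivalent to the callback in sendToDataStore.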
Demo Time
In case the demo does not work
Thank you!
