How to Sync Tens of Millions of Browsers and Sleep Well at Night
Rafał Furmański & Piotr Olchawa
Presenters
Rafał Furmański, Engineering Manager
Project Manager, Software Engineer, Big Data enthusiast and certified Cassandra developer.
Rafał has 10+ years of experience in programming.
After work: addicted volleyball player.
Piotr Olchawa, Software Engineer
Piotr is a Software Engineer at Opera, working at the intersection of backend and SysOps.
He has over 4 years of experience in programming. He is a big fan of everything that’s extreme:
rock climbing, hackathons, public speaking.
Outline
■ About Opera and Sync
■ Problems with Cassandra and first encounter with Scylla
■ Migration process and results
■ Automated repairs with scylla-cli
■ Scylla proxy & shard awareness
About Opera and Sync
Opera Browsers
The chosen gateway to the Web
for over 350 million people
About Opera
■ Founded in 1995 in Norway
■ HQ in Oslo
■ Branches in Poland, Sweden and China
■ Listed on NASDAQ
■ We make browsers & apps
● Desktop:
■ Opera
■ Opera GX
● Mobile:
■ Opera Mini
■ Opera for Android
■ Opera Touch
■ Opera News
About Opera
■ Opera has pioneered many concepts
found in the major browsers today
■ We continue to introduce unique features
in our products
Opera syncs
■ Favorite sites on the Speed Dial
■ Bookmarks
■ Open tabs from all devices
■ Browsing history
■ Passwords
■ Browser preferences
About Opera Sync
Opera Sync - Architecture overview
Opera Sync - infrastructure/software
■ Deployed on bare metal boxes in 2 datacenters:
● Backend - 2x10
● Database - 2x13
■ On each backend host:
● Debian Stretch
● Docker containers:
■ uWSGI (Python/Django App)
■ Nginx
■ Celery workers
■ RabbitMQ
■ statsd
■ Configuration/Deployment: Ansible & Docker Swarm
■ Monitoring: Graphite/Grafana + Nagios + PagerDuty
Opera Sync - example model and queries
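from cassandra.cqlengine import columns
from cassandra.cqlengine.models import Model  # assumed base class; the slide does not show the import

class Bookmark(Model):
    user_id = columns.Text(partition_key=True)  # one partition per user
    version = columns.BigInt(primary_key=True, clustering_order='ASC')  # clustering column
    id = columns.Text(primary_key=True)  # clustering column
    parent_id = columns.Text()
    position = columns.Bytes()
    name = columns.Text()
    ctime = columns.DateTime()
    mtime = columns.DateTime()
    deleted = columns.Boolean(default=False)
    folder = columns.Boolean(default=False)
    specifics = columns.Bytes()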
Query 1: Get all bookmarks of user 'Adam' from version=5  # version == precise timestamp
Query 2: Change/remove bookmark of user 'Adam' with version=5 and id='6'
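To make the access pattern concrete, here is a minimal in-memory sketch of the two queries. It is not the Sync backend code: `BookmarkStore` and its methods are hypothetical names, the store is a plain dictionary, and "from version=5" is assumed to mean an inclusive lower bound. It only illustrates why a per-user partition ordered by (version, id) turns Query 1 into a range scan and Query 2 into a lookup by full key.

```python
import bisect
from collections import defaultdict


class BookmarkStore:
    """Toy in-memory stand-in for the bookmark table (illustration only)."""

    def __init__(self):
        # user_id -> rows sorted by (version, id), mimicking one partition
        # per user ordered by the clustering columns.
        self._partitions = defaultdict(list)

    def upsert(self, user_id, version, bookmark_id, **fields):
        """Insert a row, or merge `fields` into the row with the same key."""
        rows = self._partitions[user_id]
        keys = [(v, i) for v, i, _ in rows]
        pos = bisect.bisect_left(keys, (version, bookmark_id))
        if pos < len(rows) and keys[pos] == (version, bookmark_id):
            rows[pos][2].update(fields)
        else:
            rows.insert(pos, (version, bookmark_id, dict(fields)))

    def get_updates(self, user_id, from_version):
        """Query 1: all rows of one user with version >= from_version."""
        rows = self._partitions.get(user_id, [])
        versions = [v for v, _, _ in rows]
        return rows[bisect.bisect_left(versions, from_version):]


store = BookmarkStore()
store.upsert("Adam", 3, "1", name="Opera")
store.upsert("Adam", 5, "6", name="Scylla")
store.upsert("Adam", 7, "2", name="Docs")

# Query 1: everything for Adam starting at version 5.
updates = store.get_updates("Adam", 5)

# Query 2: change (here: soft-delete) the row with version=5 and id='6'.
store.upsert("Adam", 5, "6", deleted=True)
```

Both operations touch a single user's partition, which is what keeps them cheap in a partitioned database.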
Problems with Cassandra
■ We started with Cassandra 2.1 and were immediately hit by:
● [CASSANDRA-9935] Repair fails with RuntimeException
● [CASSANDRA-10689] java.lang.OutOfMemoryError: Direct buffer memory
● [CASSANDRA-10697] Leak detected while running offline scrub
● [CASSANDRA-8558] Deleted row still can be selected out
● [CASSANDRA-8446] Lost writes when using lightweight transactions
● [CASSANDRA-8280] Crash on inserting data over 64K into indexed strings
● [CASSANDRA-8067] NullPointerException in KeyCacheSerializer
● [CASSANDRA-9681] Memtable heap size grows and GC pauses are triggered
Problems with Cassandra
■ Bugs, bugs, bugs…
■ Very high p95/p99 read/write latencies
■ Long GC pauses (!!!)
■ Insane CPU usage
■ Failing Gossip/Binary protocols
■ Restarts without specific reason
■ Problems with bootstrapping new nodes
■ Never-ending repairs
Our “solutions”
■ Add more and more C* nodes!
■ Tune every piece of C*/Java config
■ Seek help from C* gurus
■ [SYNC-1146] Cron job to restart C* periodically (sic!)
Our journey with Scylla
■ September 2015: First encounter at Cassandra Summit
■ July 2018: First Scylla cluster & benchmarks
■ August 2018: Decision to migrate
■ 13 May 2019: Decommissioning of the last Cassandra node
Initial benchmarks
■ setup: 3 bare metal nodes in the cluster
■ tool: cassandra-stress
■ keyspace: sync, table: bookmark, time: 10 minutes
■ mixed workload: 50% GetUpdates / 50% Commit
Migration process
1. Make django-cassandra-engine connect to more than 1 database
2. Prepare 2x3 node Scylla Cluster (with monitoring)
3. Update backend to be connection-aware
   Bookmark.objects.using(connection='scylla').filter(...)
4. Move a few test users to Scylla (me and coworkers)
5. Make all new users use Scylla
6. Slowly migrate all existing users from Cassandra to Scylla
   a. decommission nodes from Cassandra cluster
   b. add decommissioned nodes to Scylla cluster
7. Disconnect Cassandra and make Scylla the default database engine
8. Cleanup
Django’s settings.py
DATABASES = {
    'cassandra': {...},
    'scylla': {
        'ENGINE': 'django_cassandra_engine',
        'NAME': 'sync',
        'HOST': SCYLLA_DC_HOSTS[DC],
        'USER': SCYLLA_USER,
        'PASSWORD': SCYLLA_PASSWORD,
        'OPTIONS': {
            'replication': {
                'strategy_class': 'NetworkTopologyStrategy',
                'Amsterdam': 3,
                'Ashburn': 3
            },
            'connection': {...}
        }
    }
}
Determining user’s connection
from cassandra.cqlengine.query import ContextQuery

def get_user_store(user_id):
    connection = UserStore.maybe_get_user_connection(user_id)  # from cache
    if connection is not None:  # We know exactly which connection to use
        with ContextQuery(UserStore, connection=connection) as US:
            return US.objects.get(user_id=user_id)
    else:  # We have no clue which connection is correct for this user
        try:
            with ContextQuery(UserStore, connection='cassandra') as US:
                user_store = US.objects.get(user_id=user_id)
        except UserStore.DoesNotExist:
            with ContextQuery(UserStore, connection='scylla') as US:
                user_store = US.objects.get(user_id=user_id)
        user_store.cache_user_connection()
        return user_store
Migration script
Requirements:
■ Ability to move user data from Cassandra to Scylla (and back)
■ Consistency check after migrating
■ Concurrent execution is a must
■ Measure everything:
● Number of migrated users
● Migration time (with distribution)
● Errors with reasons
● Failed migrations
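The concurrency and measurement requirements can be sketched with the standard library alone. This is not the actual migration script: `run_migrations` and the `migrate_user` callback are hypothetical names, a thread pool stands in for whatever worker model was really used, and the metrics are kept in a plain dictionary instead of being sent to statsd.

```python
import time
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_migrations(user_ids, migrate_user, max_workers=4):
    """Migrate users concurrently and collect basic metrics."""
    stats = {"migrated": 0, "failed": 0, "errors": Counter(), "durations": []}

    def timed(user_id):
        # Run one migration and report (error name or None, elapsed seconds).
        started = time.monotonic()
        try:
            migrate_user(user_id)
            return None, time.monotonic() - started
        except Exception as exc:
            return type(exc).__name__, time.monotonic() - started

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Results are aggregated in the calling thread, so no locking is needed.
        for error, duration in pool.map(timed, user_ids):
            stats["durations"].append(duration)
            if error is None:
                stats["migrated"] += 1
            else:
                stats["failed"] += 1
                stats["errors"][error] += 1
    return stats
```

The `durations` list is what a time distribution (for example percentiles) would be computed from, and `errors` groups failures by reason.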
Migration script
Algorithm:
1. Pick free user from Cassandra DB (check if not already being migrated) and
mark as picked for migration
2. Set user_store.migration_pending = True (with TTL!)
3. Copy all the data to Scylla DB
4. Perform consistency check
5. Remove leftovers from Cassandra (and clear the connection cache)
6. Set user_store.migration_pending = False
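The steps above can be sketched as a single function. This is a simplified illustration under stated assumptions, not the real script: both databases are plain dictionaries, `pending` maps a user to an expiry timestamp that plays the role of the TTL, and all names (`migrate_user`, `MigrationError`, `cache`) are hypothetical.

```python
import time


class MigrationError(Exception):
    """Raised when the copied data does not match the source."""


def migrate_user(user_id, source, target, pending, cache, ttl=3600, now=time.time):
    """Move one user's rows from `source` to `target`; return False if skipped."""
    # 1. Skip users another worker has already picked (flag not yet expired).
    if pending.get(user_id, 0) > now():
        return False
    # 2. Mark the migration as pending. The expiry acts like a TTL, so a
    #    crashed worker cannot lock the user out forever.
    pending[user_id] = now() + ttl
    # 3. Copy all the data.
    rows = source.get(user_id, [])
    target[user_id] = list(rows)
    # 4. Consistency check: the copy must equal the original.
    if target[user_id] != rows:
        del target[user_id]
        pending.pop(user_id, None)
        raise MigrationError(f"mismatch for user {user_id}")
    # 5. Remove leftovers from the source and clear the connection cache.
    source.pop(user_id, None)
    cache.pop(user_id, None)
    # 6. Clear the pending flag.
    pending.pop(user_id, None)
    return True
```

Passing the databases in as arguments also covers the "and back" requirement: swapping `source` and `target` migrates a user in the opposite direction.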
Challenges during migration
■ Timeouts and Unavailables in Cassandra
■ Migrating huge accounts takes some time
■ User is cut off from Sync during the migration period
■ Synchronization of concurrent processes
Migration results
■ Reduced number of nodes: from 32 (a year ago) to 26 (now) to 8 (planned)
■ Faster node bootstrap time (hours instead of days)
■ Huge drops in latency
■ No more sleepless nights!
Automated repairs with scylla-cli
scylla-cli overview
■ Console script for
● Checking status of the cluster
● Performing range repairs
■ Connects to Scylla API on each
host via SSH tunnel (or direct)
■ Written in Python
■ Available on PyPI:
$ pip install scylla-cli
Why repair with scylla-cli?
■ It works with Scylla Open Source
■ Performs repairs only on the primary range of a Scylla node (in discrete
steps, node by node)
■ Performs advanced repair techniques (subrange repair)
■ Scheduled repairs - what and when (specific node, table)
■ Built-in retry mechanism
■ Real time repair progress and ETA
■ Works better on a busy cluster than regular nodetool repair
Example repair usage:
$ scli repair sync session --dc=Amsterdam
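As an illustration of the subrange idea, the sketch below splits one token range into equal-sized pieces that could be repaired one at a time. It is not scylla-cli's implementation: `split_token_range` is a hypothetical helper, and it assumes a non-wrapping range (start < end).

```python
def split_token_range(start, end, steps):
    """Split the token range (start, end] into up to `steps` contiguous subranges."""
    if steps < 1 or end <= start:
        raise ValueError("need steps >= 1 and start < end")
    width = end - start
    # Integer arithmetic keeps the bounds exact even for 64-bit tokens.
    bounds = [start + (width * i) // steps for i in range(steps + 1)]
    # Drop empty pieces, which appear when steps exceeds the range width.
    return [(lo, hi) for lo, hi in zip(bounds, bounds[1:]) if lo < hi]


# Example: four subranges of a small range.
subranges = split_token_range(0, 100, 4)
```

Repairing small subranges one after another keeps each repair step short, which makes retries and progress reporting straightforward.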
Shard awareness
Scylla per-node CPU shard awareness
■ “Gains by Using Scylla-Specific Drivers” - over 2x latency decrease:
(Scylla Summit 2018 - Piotr Jastrzębski)
■ Cassandra native protocol extension
■ Achieved by per-node CPU connections
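For intuition, a shard-aware client has to map a partition's token to the CPU shard that owns it. The sketch below follows the token-to-shard mapping as I understand it from Scylla's protocol extension notes (bias the signed token to unsigned, drop some most significant bits, scale to the shard count). Treat the function name and the default `ignore_msb=12` as assumptions; a real client learns both parameters from the server.

```python
def shard_of(token, nr_shards, ignore_msb=12):
    """Map a signed 64-bit token to a shard index in [0, nr_shards)."""
    mask = 2**64 - 1
    # Shift the signed token range [-2^63, 2^63) into unsigned [0, 2^64).
    biased = (token + 2**63) & mask
    # Discard the most significant bits, emulating 64-bit overflow.
    biased = (biased << ignore_msb) & mask
    # Scale the 64-bit value down to the shard count.
    return (biased * nr_shards) >> 64
```

With 48 shards per server, as in the Sync cluster, every token lands on exactly one of 48 shard indices.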
Shard-awareness for Sync
Rationale
■ 48 shards per Scylla server - potential performance improvement
Obstacles
■ No Python driver support
■ 300 uWSGI + Celery workers per host
● 13 DBs * 48 shards * 300 workers = ~187,000 connections
● Port numbers only go up to 65535
The solution
■ Proxy Scylla client/server (gocqlproxy)
1 connection / worker, and 1 connection / host-shard
(~300 workers + ~600 shards = ~900 connections)
■ Simplified protocol with just one message type
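The connection counts on this slide can be checked with a short calculation. The function names are illustrative only; the inputs (13 database hosts, 48 shards each, 300 workers per backend host) come from the slide.

```python
def direct_connections(db_hosts, shards_per_host, workers):
    """Every worker opens one connection to every shard of every DB host."""
    return db_hosts * shards_per_host * workers


def proxied_connections(db_hosts, shards_per_host, workers):
    """Workers connect once to the local proxy; the proxy connects once per shard."""
    return workers + db_hosts * shards_per_host


direct = direct_connections(13, 48, 300)    # 187200
proxied = proxied_connections(13, 48, 300)  # 300 + 624 = 924
```

The direct figure is roughly three times the 65535 port limit, while the proxied figure stays under a thousand.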
gocqlproxy overview
gocqlproxy implementation - driver and proxy
cassandra/proxy_session.py:
class ProxyConnection(DefaultConnection):
    # (...)
    def send_msg(self, msg, *args, **kwargs):
        # (...)
        proxied_msg = ProxiedMessage(msg, routing_key)
        return super().send_msg(proxied_msg, ...)

cassandra/protocol.py:
class ProxiedMessage(_MessageType):
    opcode = 0xF0
    # (...)
    def send_body(self, f, protocol_version):
        message_bytes = encode_message(self.message)
        write_longstring(f, message_bytes)
        write_longstring(f, self.routing_key)

proxy.go:
frameWriter := &writeProxiedFrame{
    head:      nestedHead,
    frameData: nestedFramer.rbuf,
}
// find the appropriate host/shard to forward the frame to:
partitionKey := query.clientFramer.readBytes()
serverConn, err := query.session.pickHost(partitionKey)
// send the frame to the chosen server/shard:
serverConn.exec(context.TODO(), frameWriter, nil)
// return the response to client (use client's stream id)
clientFramer.writeHeader(response, outerHeader.stream)
clientFramer.wbuf = append(clientFramer.wbuf, frameData...)
clientFramer.finishWrite()
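The body of the proxied message can be illustrated with a standalone sketch. It assumes the layout suggested by `send_body` above: the original message followed by the routing key, each written as a CQL long string (a 4-byte big-endian length and then the bytes). The helper names below are hypothetical and independent of the driver; only the two-field layout is taken from the slide.

```python
import io
import struct


def write_longstring(buf, data):
    """Write a 4-byte big-endian length followed by the raw bytes."""
    buf.write(struct.pack(">i", len(data)))
    buf.write(data)


def read_longstring(buf):
    """Read one length-prefixed byte string written by write_longstring."""
    (length,) = struct.unpack(">i", buf.read(4))
    return buf.read(length)


def encode_proxied_body(message_bytes, routing_key):
    """Client side: wrap the original message together with its routing key."""
    buf = io.BytesIO()
    write_longstring(buf, message_bytes)
    write_longstring(buf, routing_key)
    return buf.getvalue()


def decode_proxied_body(body):
    """Proxy side: recover the nested message and the routing key."""
    buf = io.BytesIO(body)
    message_bytes = read_longstring(buf)
    routing_key = read_longstring(buf)
    return message_bytes, routing_key


body = encode_proxied_body(b"QUERY", b"adam")
```

Because the routing key travels next to the untouched message, the proxy can pick a host and shard without parsing the nested CQL message itself.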
Shard-aware Sync and gocqlproxy – results
■ Production:
● A working prototype of gocqlproxy has been enabled and has run stably for a few days
● We can use shard-awareness with ~900 connections instead of ~187,000
■ Local synthetic benchmarks - measured latency decreases:
cluster-wide approximate latencies, averaged over 75-second test runs,
shown as 'non-shard-aware→shard-aware' with the relative decrease
(100% * (before - after) / before)

       read [μs]             write [μs]
avg    580→480   (~17%)      570→470   (~18%)
p95    1000→980  (~2%)       1000→980  (~2%)
p99    2000→1000 (~50%)      1900→1000 (~47%)
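The percentages follow directly from the formula on the slide. A small check, with `latency_decrease` as an illustrative name:

```python
def latency_decrease(before_us, after_us):
    """Relative latency decrease in percent: 100 * (before - after) / before."""
    return 100.0 * (before_us - after_us) / before_us


# (before, after) pairs in microseconds, taken from the results above.
read_results = {"avg": (580, 480), "p95": (1000, 980), "p99": (2000, 1000)}
write_results = {"avg": (570, 470), "p95": (1000, 980), "p99": (1900, 1000)}

read_decrease = {k: round(latency_decrease(*v)) for k, v in read_results.items()}
write_decrease = {k: round(latency_decrease(*v)) for k, v in write_results.items()}
```

The largest gain is at p99, where latency roughly halves, while p95 barely moves.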
Take away
■ Download Opera Browser
■ django-cassandra-engine
■ scylla-cli
■ Scylla-proxy:
● gocqlproxy
● Python driver
Thank you
Stay in touch
Any questions?
Rafał Furmański
rfurmanski@opera.com
r4fek
Piotr Olchawa
polchawa@opera.com
BugsKillPeople

Weitere ähnliche Inhalte

Was ist angesagt?

Eventually, Scylla Chooses Consistency
Eventually, Scylla Chooses ConsistencyEventually, Scylla Chooses Consistency
Eventually, Scylla Chooses ConsistencyScyllaDB
 
Using ScyllaDB with JanusGraph for Cyber Security
Using ScyllaDB with JanusGraph for Cyber SecurityUsing ScyllaDB with JanusGraph for Cyber Security
Using ScyllaDB with JanusGraph for Cyber SecurityScyllaDB
 
How Workload Prioritization Reduces Your Datacenter Footprint
How Workload Prioritization Reduces Your Datacenter FootprintHow Workload Prioritization Reduces Your Datacenter Footprint
How Workload Prioritization Reduces Your Datacenter FootprintScyllaDB
 
Developing Scylla Applications: Practical Tips
Developing Scylla Applications: Practical TipsDeveloping Scylla Applications: Practical Tips
Developing Scylla Applications: Practical TipsScyllaDB
 
ScyllaDB @ Apache BigData, may 2016
ScyllaDB @ Apache BigData, may 2016ScyllaDB @ Apache BigData, may 2016
ScyllaDB @ Apache BigData, may 2016Tzach Livyatan
 
Introducing Scylla Manager: Cluster Management and Task Automation
Introducing Scylla Manager: Cluster Management and Task AutomationIntroducing Scylla Manager: Cluster Management and Task Automation
Introducing Scylla Manager: Cluster Management and Task AutomationScyllaDB
 
Scylla’s Journey Towards Being an Elastic Cloud Native Database
Scylla’s Journey Towards Being an Elastic Cloud Native DatabaseScylla’s Journey Towards Being an Elastic Cloud Native Database
Scylla’s Journey Towards Being an Elastic Cloud Native DatabaseScyllaDB
 
Seastar Summit 2019 Keynote
Seastar Summit 2019 KeynoteSeastar Summit 2019 Keynote
Seastar Summit 2019 KeynoteScyllaDB
 
Writing Applications for Scylla
Writing Applications for ScyllaWriting Applications for Scylla
Writing Applications for ScyllaScyllaDB
 
Scylla Summit 2016: Compose on Containing the Database
Scylla Summit 2016: Compose on Containing the DatabaseScylla Summit 2016: Compose on Containing the Database
Scylla Summit 2016: Compose on Containing the DatabaseScyllaDB
 
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion RecordsScylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion RecordsScyllaDB
 
Seastar / ScyllaDB, or how we implemented a 10-times faster Cassandra
Seastar / ScyllaDB,  or how we implemented a 10-times faster CassandraSeastar / ScyllaDB,  or how we implemented a 10-times faster Cassandra
Seastar / ScyllaDB, or how we implemented a 10-times faster CassandraTzach Livyatan
 
Scylla Summit 2019 Keynote - Avi Kivity
Scylla Summit 2019 Keynote - Avi KivityScylla Summit 2019 Keynote - Avi Kivity
Scylla Summit 2019 Keynote - Avi KivityScyllaDB
 
Scylla Summit 2018: How We Made Large Partition Scans Over Two Times Faster
Scylla Summit 2018: How We Made Large Partition Scans Over Two Times FasterScylla Summit 2018: How We Made Large Partition Scans Over Two Times Faster
Scylla Summit 2018: How We Made Large Partition Scans Over Two Times FasterScyllaDB
 
A glimpse of cassandra 4.0 features netflix
A glimpse of cassandra 4.0 features   netflixA glimpse of cassandra 4.0 features   netflix
A glimpse of cassandra 4.0 features netflixVinay Kumar Chella
 
Scylla: 1 Million CQL operations per second per server
Scylla: 1 Million CQL operations per second per serverScylla: 1 Million CQL operations per second per server
Scylla: 1 Million CQL operations per second per serverAvi Kivity
 
Event Streaming Architectures with Confluent and ScyllaDB
Event Streaming Architectures with Confluent and ScyllaDBEvent Streaming Architectures with Confluent and ScyllaDB
Event Streaming Architectures with Confluent and ScyllaDBScyllaDB
 
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond CassandraScylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond CassandraScyllaDB
 
Back to the future with C++ and Seastar
Back to the future with C++ and SeastarBack to the future with C++ and Seastar
Back to the future with C++ and SeastarTzach Livyatan
 
Running a DynamoDB-compatible Database on Managed Kubernetes Services
Running a DynamoDB-compatible Database on Managed Kubernetes ServicesRunning a DynamoDB-compatible Database on Managed Kubernetes Services
Running a DynamoDB-compatible Database on Managed Kubernetes ServicesScyllaDB
 

Was ist angesagt? (20)

Eventually, Scylla Chooses Consistency
Eventually, Scylla Chooses ConsistencyEventually, Scylla Chooses Consistency
Eventually, Scylla Chooses Consistency
 
Using ScyllaDB with JanusGraph for Cyber Security
Using ScyllaDB with JanusGraph for Cyber SecurityUsing ScyllaDB with JanusGraph for Cyber Security
Using ScyllaDB with JanusGraph for Cyber Security
 
How Workload Prioritization Reduces Your Datacenter Footprint
How Workload Prioritization Reduces Your Datacenter FootprintHow Workload Prioritization Reduces Your Datacenter Footprint
How Workload Prioritization Reduces Your Datacenter Footprint
 
Developing Scylla Applications: Practical Tips
Developing Scylla Applications: Practical TipsDeveloping Scylla Applications: Practical Tips
Developing Scylla Applications: Practical Tips
 
ScyllaDB @ Apache BigData, may 2016
ScyllaDB @ Apache BigData, may 2016ScyllaDB @ Apache BigData, may 2016
ScyllaDB @ Apache BigData, may 2016
 
Introducing Scylla Manager: Cluster Management and Task Automation
Introducing Scylla Manager: Cluster Management and Task AutomationIntroducing Scylla Manager: Cluster Management and Task Automation
Introducing Scylla Manager: Cluster Management and Task Automation
 
Scylla’s Journey Towards Being an Elastic Cloud Native Database
Scylla’s Journey Towards Being an Elastic Cloud Native DatabaseScylla’s Journey Towards Being an Elastic Cloud Native Database
Scylla’s Journey Towards Being an Elastic Cloud Native Database
 
Seastar Summit 2019 Keynote
Seastar Summit 2019 KeynoteSeastar Summit 2019 Keynote
Seastar Summit 2019 Keynote
 
Writing Applications for Scylla
Writing Applications for ScyllaWriting Applications for Scylla
Writing Applications for Scylla
 
Scylla Summit 2016: Compose on Containing the Database
Scylla Summit 2016: Compose on Containing the DatabaseScylla Summit 2016: Compose on Containing the Database
Scylla Summit 2016: Compose on Containing the Database
 
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion RecordsScylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
Scylla Summit 2022: How to Migrate a Counter Table for 68 Billion Records
 
Seastar / ScyllaDB, or how we implemented a 10-times faster Cassandra
Seastar / ScyllaDB,  or how we implemented a 10-times faster CassandraSeastar / ScyllaDB,  or how we implemented a 10-times faster Cassandra
Seastar / ScyllaDB, or how we implemented a 10-times faster Cassandra
 
Scylla Summit 2019 Keynote - Avi Kivity
Scylla Summit 2019 Keynote - Avi KivityScylla Summit 2019 Keynote - Avi Kivity
Scylla Summit 2019 Keynote - Avi Kivity
 
Scylla Summit 2018: How We Made Large Partition Scans Over Two Times Faster
Scylla Summit 2018: How We Made Large Partition Scans Over Two Times FasterScylla Summit 2018: How We Made Large Partition Scans Over Two Times Faster
Scylla Summit 2018: How We Made Large Partition Scans Over Two Times Faster
 
A glimpse of cassandra 4.0 features netflix
A glimpse of cassandra 4.0 features   netflixA glimpse of cassandra 4.0 features   netflix
A glimpse of cassandra 4.0 features netflix
 
Scylla: 1 Million CQL operations per second per server
Scylla: 1 Million CQL operations per second per serverScylla: 1 Million CQL operations per second per server
Scylla: 1 Million CQL operations per second per server
 
Event Streaming Architectures with Confluent and ScyllaDB
Event Streaming Architectures with Confluent and ScyllaDBEvent Streaming Architectures with Confluent and ScyllaDB
Event Streaming Architectures with Confluent and ScyllaDB
 
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond CassandraScylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
 
Back to the future with C++ and Seastar
Back to the future with C++ and SeastarBack to the future with C++ and Seastar
Back to the future with C++ and Seastar
 
Running a DynamoDB-compatible Database on Managed Kubernetes Services
Running a DynamoDB-compatible Database on Managed Kubernetes ServicesRunning a DynamoDB-compatible Database on Managed Kubernetes Services
Running a DynamoDB-compatible Database on Managed Kubernetes Services
 

Ähnlich wie How Opera Syncs Tens of Millions of Browsers and Sleeps Well at Night

Introduction to apache kafka
Introduction to apache kafkaIntroduction to apache kafka
Introduction to apache kafkaSamuel Kerrien
 
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...DataWorks Summit/Hadoop Summit
 
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScyllaDB
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1Ruslan Meshenberg
 
SamzaSQL QCon'16 presentation
SamzaSQL QCon'16 presentationSamzaSQL QCon'16 presentation
SamzaSQL QCon'16 presentationYi Pan
 
Speed up R with parallel programming in the Cloud
Speed up R with parallel programming in the CloudSpeed up R with parallel programming in the Cloud
Speed up R with parallel programming in the CloudRevolution Analytics
 
Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015Robbie Strickland
 
How to achieve no compromise performance and availability
How to achieve no compromise performance and availabilityHow to achieve no compromise performance and availability
How to achieve no compromise performance and availabilityScyllaDB
 
Scylla db deck, july 2017
Scylla db deck, july 2017Scylla db deck, july 2017
Scylla db deck, july 2017Dor Laor
 
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them All
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them AllScylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them All
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them AllScyllaDB
 
Chicago Kafka Meetup
Chicago Kafka MeetupChicago Kafka Meetup
Chicago Kafka MeetupCliff Gilmore
 
Migrating Data Pipeline from MongoDB to Cassandra
Migrating Data Pipeline from MongoDB to CassandraMigrating Data Pipeline from MongoDB to Cassandra
Migrating Data Pipeline from MongoDB to CassandraDemi Ben-Ari
 
Stream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and KafkaStream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and KafkaDataWorks Summit
 
Scaling Cassandra for Big Data
Scaling Cassandra for Big DataScaling Cassandra for Big Data
Scaling Cassandra for Big DataDataStax Academy
 
Type safe, versioned, and rewindable stream processing with Apache {Avro, K...
Type safe, versioned, and rewindable stream processing  with  Apache {Avro, K...Type safe, versioned, and rewindable stream processing  with  Apache {Avro, K...
Type safe, versioned, and rewindable stream processing with Apache {Avro, K...Hisham Mardam-Bey
 
Cassandra REST API with Pagination TEAM 15
Cassandra REST API with Pagination TEAM 15Cassandra REST API with Pagination TEAM 15
Cassandra REST API with Pagination TEAM 15Akash Kant
 
Cassandra To Infinity And Beyond
Cassandra To Infinity And BeyondCassandra To Infinity And Beyond
Cassandra To Infinity And BeyondRomain Hardouin
 
Scio - Moving to Google Cloud, A Spotify Story
 Scio - Moving to Google Cloud, A Spotify Story Scio - Moving to Google Cloud, A Spotify Story
Scio - Moving to Google Cloud, A Spotify StoryNeville Li
 
Tweaking perfomance on high-load projects_Думанский Дмитрий
Tweaking perfomance on high-load projects_Думанский ДмитрийTweaking perfomance on high-load projects_Думанский Дмитрий
Tweaking perfomance on high-load projects_Думанский ДмитрийGeeksLab Odessa
 
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...ScyllaDB
 

Ähnlich wie How Opera Syncs Tens of Millions of Browsers and Sleeps Well at Night (20)

Introduction to apache kafka
Introduction to apache kafkaIntroduction to apache kafka
Introduction to apache kafka
 
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
 
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1
 
SamzaSQL QCon'16 presentation
SamzaSQL QCon'16 presentationSamzaSQL QCon'16 presentation
SamzaSQL QCon'16 presentation
 
Speed up R with parallel programming in the Cloud
Speed up R with parallel programming in the CloudSpeed up R with parallel programming in the Cloud
Speed up R with parallel programming in the Cloud
 
Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015
 
How to achieve no compromise performance and availability
How to achieve no compromise performance and availabilityHow to achieve no compromise performance and availability
How to achieve no compromise performance and availability
 
Scylla db deck, july 2017
Scylla db deck, july 2017Scylla db deck, july 2017
Scylla db deck, july 2017
 
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them All
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them AllScylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them All
Scylla Summit 2022: ScyllaDB Rust Driver: One Driver to Rule Them All
 
Chicago Kafka Meetup
Chicago Kafka MeetupChicago Kafka Meetup
Chicago Kafka Meetup
 
Migrating Data Pipeline from MongoDB to Cassandra
Migrating Data Pipeline from MongoDB to CassandraMigrating Data Pipeline from MongoDB to Cassandra
Migrating Data Pipeline from MongoDB to Cassandra
 
Stream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and KafkaStream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and Kafka
 
Scaling Cassandra for Big Data
Scaling Cassandra for Big DataScaling Cassandra for Big Data
Scaling Cassandra for Big Data
 
Type safe, versioned, and rewindable stream processing with Apache {Avro, K...
Type safe, versioned, and rewindable stream processing  with  Apache {Avro, K...Type safe, versioned, and rewindable stream processing  with  Apache {Avro, K...
Type safe, versioned, and rewindable stream processing with Apache {Avro, K...
 
Cassandra REST API with Pagination TEAM 15
Cassandra REST API with Pagination TEAM 15Cassandra REST API with Pagination TEAM 15
Cassandra REST API with Pagination TEAM 15
 
Cassandra To Infinity And Beyond
Cassandra To Infinity And BeyondCassandra To Infinity And Beyond
Cassandra To Infinity And Beyond
 
Scio - Moving to Google Cloud, A Spotify Story
 Scio - Moving to Google Cloud, A Spotify Story Scio - Moving to Google Cloud, A Spotify Story
Scio - Moving to Google Cloud, A Spotify Story
 
Tweaking perfomance on high-load projects_Думанский Дмитрий
Tweaking perfomance on high-load projects_Думанский ДмитрийTweaking perfomance on high-load projects_Думанский Дмитрий
Tweaking perfomance on high-load projects_Думанский Дмитрий
 
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
 

Mehr von ScyllaDB

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLScyllaDB
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...ScyllaDB
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...ScyllaDB
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
How Opera Syncs Tens of Millions of Browsers and Sleeps Well at Night

  • 1. How to Sync Tens of Millions of Browsers and Sleep Well at Night Rafał Furmański & Piotr Olchawa
  • 2. Presenters Rafał Furmański, Engineering Manager Project Manager, Software Engineer, Big Data enthusiast and certified Cassandra developer. Rafał has 10+ years of experience in programming. After work: addicted volleyball player. Piotr Olchawa, Software Engineer Piotr is a Software Engineer working at Opera between backend and SysOps. He has over 4 years of experience in programming. He is a big fan of everything that’s extreme: rock climbing, hackathons, public speaking.
  • 3. Outline ■ About Opera and Sync ■ Problems with Cassandra and first encounter with Scylla ■ Migration process and results ■ Automated repairs with scylla-cli ■ Scylla proxy & shard awareness
  • 5. Opera Browsers The chosen gateway to the Web for over 350 million people About Opera
  • 6. About Opera ■ Founded in 1995 in Norway ■ HQ in Oslo ■ Branches in Poland, Sweden and China ■ Listed on NASDAQ ■ We make browsers & apps ● Desktop: ■ Opera ■ Opera GX ● Mobile: ■ Opera Mini ■ Opera for Android ■ Opera Touch ■ Opera News
  • 7. About Opera ■ Opera has pioneered many concepts found in the major browsers today ■ We continue to introduce unique features in our products
  • 8. Opera syncs ■ Favorite sites on the Speed Dial ■ Bookmarks ■ Open tabs from all devices ■ Browsing history ■ Passwords ■ Browser preferences About Opera Sync
  • 9. Opera Sync - Architecture overview
  • 10. Opera Sync - infrastructure/software ■ Deployed on bare metal boxes in 2 datacenters: ● Backend - 2x10 ● Database - 2x13 ■ On each backend host: ● Debian Stretch ● Docker containers: ■ uWSGI (Python/Django App) ■ Nginx ■ Celery workers ■ RabbitMQ ■ statsd ■ Configuration/Deployment: Ansible & Docker Swarm ■ Monitoring: Graphite/Grafana + Nagios + PagerDuty
  • 12. Opera Sync - example model and queries

        class Bookmark(Model):
            user_id = columns.Text(partition_key=True)
            version = columns.BigInt(primary_key=True, clustering_order='ASC')
            id = columns.Text(primary_key=True)
            parent_id = columns.Text()
            position = columns.Bytes()
            name = columns.Text()
            ctime = columns.DateTime()
            mtime = columns.DateTime()
            deleted = columns.Boolean(default=False)
            folder = columns.Boolean(default=False)
            specifics = columns.Bytes()

        Query 1: Get all bookmarks of user 'Adam' from version=5 (version == precise timestamp)
        Query 2: Change/remove bookmark of user 'Adam' with version=5 and id='6'
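The two queries on this slide follow from the key layout: `user_id` selects the partition, and `(version, id)` orders the rows inside it. The sketch below models that layout in plain Python, without any Cassandra or Scylla connection. The `BookmarkTable` class and its method names are hypothetical, and "from version=5" is assumed to mean `version >= 5`.

```python
import bisect
from collections import defaultdict


class BookmarkTable:
    """In-memory stand-in for the bookmark table (illustration only)."""

    def __init__(self):
        # partition key (user_id) -> list of ((version, id), row),
        # kept sorted by the clustering key, like rows inside a partition
        self._partitions = defaultdict(list)

    def upsert(self, user_id, version, bookmark_id, **fields):
        """Insert a row, or update it if the full primary key already exists."""
        rows = self._partitions[user_id]
        key = (version, bookmark_id)
        keys = [k for k, _ in rows]
        pos = bisect.bisect_left(keys, key)
        if pos < len(rows) and rows[pos][0] == key:
            rows[pos][1].update(fields)
        else:
            rows.insert(pos, (key, dict(fields)))

    def get_updates(self, user_id, from_version):
        """Query 1: all rows of one user with version >= from_version."""
        rows = self._partitions.get(user_id, [])
        keys = [k for k, _ in rows]
        # (from_version,) sorts before every (from_version, id) pair
        start = bisect.bisect_left(keys, (from_version,))
        return [dict(row, version=k[0], id=k[1]) for k, row in rows[start:]]

    def delete(self, user_id, version, bookmark_id):
        """Query 2 (remove): address a single row by its full primary key."""
        rows = self._partitions.get(user_id, [])
        self._partitions[user_id] = [
            (k, r) for k, r in rows if k != (version, bookmark_id)
        ]
```

Because rows are kept in clustering order, Query 1 is a single contiguous slice of one partition, which is what makes "give me everything since version N" cheap for this model.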
  • 14. Problems with Cassandra ■ We started with Cassandra=2.1 and immediately got hit by: ● [CASSANDRA-9935] Repair fails with RuntimeException ● [CASSANDRA-10689] java.lang.OutOfMemoryError: Direct buffer memory ● [CASSANDRA-10697] Leak detected while running offline scrub ● [CASSANDRA-8558] Deleted row still can be selected out ● [CASSANDRA-8446] Lost writes when using lightweight transactions ● [CASSANDRA-8280] Crash on inserting data over 64K into indexed strings ● [CASSANDRA-8067] NullPointerException in KeyCacheSerializer ● [CASSANDRA-9681] Memtable heap size grows and GC pauses are triggered
  • 15. Problems with Cassandra ■ Bugs, bugs, bugs… ■ Very high p95/p99 read/write latencies ■ Long GC pauses(!!!) ■ Insane CPU usage ■ Failing Gossip/Binary protocols ■ Restarts without specific reason ■ Problems with bootstrapping new nodes ■ Never-ending repairs
  • 16. Our “solutions” ■ Add more and more C* nodes! ■ Tune every piece of C*/Java config ■ Seek help from C* gurus ■ [SYNC-1146] Cron job to restart C* periodically (sic!)
  • 17. Our journey with Scylla
        September 2015: First encounter (Cassandra Summit)
        July 2018: First Scylla Cluster & Benchmarks
        August 2018: Decision to migrate
        13 May 2019: Decommissioning of last Cassandra Node
  • 18. Initial benchmarks ■ setup: 3 bare metal nodes in the cluster ■ tool: cassandra-stress ■ keyspace: sync, table: bookmark, time: 10 minutes ■ mixed workload: 50% GetUpdates / 50% Commit
  • 20. Migration process
        1. Make django-cassandra-engine connect to more than 1 database
        2. Prepare 2x3 node Scylla Cluster (with monitoring)
        3. Update backend to be connection-aware:
               Bookmark.objects.using(connection='scylla').filter(...)
        4. Move a few test users to Scylla (me and coworkers)
        5. Make all new users use Scylla
        6. Slowly migrate all existing users from Cassandra to Scylla
           a. decommission nodes from Cassandra cluster
           b. add decommissioned nodes to Scylla cluster
        7. Disconnect Cassandra and make Scylla the default database engine
        8. Cleanup
  • 21. Django’s settings.py

        DATABASES = {
            'cassandra': {...},
            'scylla': {
                'ENGINE': 'django_cassandra_engine',
                'NAME': 'sync',
                'HOST': SCYLLA_DC_HOSTS[DC],
                'USER': SCYLLA_USER,
                'PASSWORD': SCYLLA_PASSWORD,
                'OPTIONS': {
                    'replication': {
                        'strategy_class': 'NetworkTopologyStrategy',
                        'Amsterdam': 3,
                        'Ashburn': 3
                    },
                    'connection': {..}
                }
            }
        }
  • 22. Determining user’s connection

        def get_user_store(user_id):
            connection = UserStore.maybe_get_user_connection(user_id)  # from cache
            if connection is not None:
                # We know exactly which connection to use
                with ContextQuery(UserStore, connection=connection) as US:
                    return US.objects.get(user_id=user_id)
            else:
                # We have no clue which connection is correct for this user
                try:
                    with ContextQuery(UserStore, connection='cassandra') as US:
                        user_store = US.objects.get(user_id=user_id)
                except UserStore.DoesNotExist:
                    with ContextQuery(UserStore, connection='scylla') as US:
                        user_store = US.objects.get(user_id=user_id)
                user_store.cache_user_connection()
                return user_store
  • 23. Migration script Requirements: ■ Ability to move user data from Cassandra to Scylla (and back) ■ Consistency check after migrating ■ Concurrent execution is a must ■ Measure everything: ● Number of migrated users ● Migration time (with distribution) ● Errors with reasons ● Failed migrations
  • 24. Migration script Algorithm: 1. Pick free user from Cassandra DB (check if not already being migrated) and mark as picked for migration 2. Set user_store.migration_pending = True (with TTL!) 3. Copy all the data to Scylla DB 4. Perform consistency check 5. Remove leftovers from Cassandra (and clear the connection cache) 6. Set user_store.migration_pending = False
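The six steps of the algorithm can be sketched as follows. This is a minimal illustration under loud assumptions: both databases are replaced by an in-memory `Store` class, the `migration_pending` flag is kept as an expiry timestamp on the source store, and every name (`Store`, `mark_pending`, `migrate_user`, `connection_cache`) is hypothetical rather than taken from the actual migration script. In a real cluster the claim in steps 1-2 would also have to be atomic across concurrent processes, which this toy version does not address.

```python
import time


class Store:
    """Toy stand-in for one database (Cassandra or Scylla)."""

    def __init__(self):
        self.rows = {}     # user_id -> list of row dicts
        self.pending = {}  # user_id -> expiry timestamp of the migration flag

    def mark_pending(self, user_id, ttl_seconds, now=None):
        """Claim a user for migration; fails while an unexpired claim exists."""
        now = time.time() if now is None else now
        expires = self.pending.get(user_id)
        if expires is not None and expires > now:
            return False
        self.pending[user_id] = now + ttl_seconds
        return True

    def clear_pending(self, user_id):
        self.pending.pop(user_id, None)


def migrate_user(user_id, source, target, connection_cache, ttl_seconds=3600):
    """Move one user's rows from source to target; returns True on success."""
    # Steps 1-2: claim the user. The TTL releases the claim if this process dies.
    if not source.mark_pending(user_id, ttl_seconds):
        return False
    try:
        # Step 3: copy all the data.
        target.rows[user_id] = [dict(r) for r in source.rows.get(user_id, [])]
        # Step 4: consistency check before anything is deleted.
        if target.rows[user_id] != source.rows.get(user_id, []):
            del target.rows[user_id]
            return False
        # Step 5: remove leftovers and forget the cached connection choice.
        source.rows.pop(user_id, None)
        connection_cache.pop(user_id, None)
        return True
    finally:
        # Step 6: clear the migration flag.
        source.clear_pending(user_id)
```

Swapping `source` and `target` gives the "and back" direction listed in the requirements, since nothing in the function is specific to either database.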
  • 25. Challenges during migration ■ Timeouts and Unavailables in Cassandra ■ Migrating huge accounts takes some time ■ User is cut off from Sync during the migration period ■ Synchronization of concurrent processes
  • 26. Migration results ■ Reduced number of nodes: from 32 (a year ago) to 26 (now) to 8 (next) ■ Faster node bootstrap time (days vs hours) ■ Huge drops in latency ■ No more sleepless nights!
  • 28. scylla-cli overview ■ Console script for ● Checking status of the cluster ● Performing range repairs ■ Connects to Scylla API on each host via SSH tunnel (or direct) ■ Written in Python ■ Available on PyPI: $ pip install scylla-cli
  • 29. Why repair with scylla-cli? ■ It works with Scylla Open Source ■ Performs repairs only on the primary range of a Scylla node (in discrete steps, node by node) ■ Performs advanced repair techniques (subrange repair) ■ Scheduled repairs - what and when (specific node, table) ■ Built-in retry mechanism ■ Real time repair progress and ETA ■ Works better on a busy cluster than regular nodetool repair Example repair usage: $ scli repair sync session --dc=Amsterdam
  • 31. Scylla per-node CPU shard awareness ■ “Gains by Using Scylla-Specific Drivers” - over 2x latency decrease: (Scylla Summit 2018 - Piotr Jastrzębski) ■ Cassandra native protocol extension ■ Achieved by per-node CPU connections
  • 32. Shard-awareness for Sync Rationale ■ 48 shards per Scylla server - potential performance improvement Obstacles ■ No Python driver support ■ 300 uwsgi + celery workers per host ● 13 DBs * 48 shards * 300 workers = ~187000 connections ● Port range up to 65535 The solution ■ Proxy Scylla client/server (gocqlproxy) 1 connection / worker, and 1 connection / host-shard (~300 workers + ~600 shards = ~900 connections) ■ Simplified protocol with just one message type
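The connection arithmetic on this slide can be checked with a few lines of Python. The function and parameter names are made up for illustration; the inputs (13 database hosts, 48 shards per host, 300 workers per backend host) are the figures quoted on the slide.

```python
def connections_without_proxy(db_hosts, shards_per_host, workers):
    """Every worker opens one connection to every shard on every database host."""
    return db_hosts * shards_per_host * workers


def connections_with_proxy(db_hosts, shards_per_host, workers):
    """Workers connect once to the proxy; the proxy connects once to every shard."""
    return workers + db_hosts * shards_per_host


direct = connections_without_proxy(db_hosts=13, shards_per_host=48, workers=300)
proxied = connections_with_proxy(db_hosts=13, shards_per_host=48, workers=300)
print(direct)   # 187200
print(proxied)  # 924
```

The direct figure is far above the 65535 port limit mentioned on the slide, while the proxied figure grows as workers plus shards rather than workers times shards.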
  • 34. gocqlproxy implementation - driver and proxy

        cassandra/proxy_session.py:

            class ProxyConnection(DefaultConnection):
                # (...)
                def send_msg(self, msg, *args, **kwargs):
                    # (...)
                    proxied_msg = ProxiedMessage(msg, routing_key)
                    return super().send_msg(proxied_msg, ...)

        cassandra/protocol.py:

            class ProxiedMessage(_MessageType):
                opcode = 0xF0
                # (...)
                def send_body(self, f, protocol_version):
                    message_bytes = encode_message(self.message)
                    write_longstring(f, message_bytes)
                    write_longstring(f, self.routing_key)

        proxy.go:

            frameWriter := &writeProxiedFrame{
                head:      nestedHead,
                frameData: nestedFramer.rbuf,
            }
            // find the appropriate host/shard to forward the frame to:
            partitionKey := query.clientFramer.readBytes()
            serverConn, err := query.session.pickHost(partitionKey)
            // send the frame to the chosen server/shard:
            serverConn.exec(context.TODO(), frameWriter, nil)
            // return the response to client (use client’s stream id)
            clientFramer.writeHeader(response, outerHeader.stream)
            clientFramer.wbuf = append(clientFramer.wbuf, frameData...)
            clientFramer.finishWrite()
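To make the wrapping concrete, here is a standalone sketch of the body layout that `ProxiedMessage.send_body` appears to produce: the encoded inner message followed by the routing key. It assumes each part is written as a CQL native protocol [long string], that is, a 4-byte big-endian length followed by the raw bytes. The helpers below are local reimplementations for illustration, and `wrap_proxied` / `unwrap_proxied` are hypothetical names, not functions from the driver or from gocqlproxy.

```python
import io
import struct


def write_longstring(f, data: bytes) -> None:
    """Write a 4-byte big-endian length, then the bytes themselves."""
    f.write(struct.pack(">i", len(data)))
    f.write(data)


def read_longstring(f) -> bytes:
    """Read back one length-prefixed byte string."""
    (length,) = struct.unpack(">i", f.read(4))
    return f.read(length)


def wrap_proxied(message_bytes: bytes, routing_key: bytes) -> bytes:
    """Driver side: body = [inner message][routing key]."""
    buf = io.BytesIO()
    write_longstring(buf, message_bytes)
    write_longstring(buf, routing_key)
    return buf.getvalue()


def unwrap_proxied(body: bytes):
    """Proxy side: recover the inner message and the key used to pick a shard."""
    buf = io.BytesIO(body)
    message_bytes = read_longstring(buf)
    routing_key = read_longstring(buf)
    return message_bytes, routing_key
```

With this layout the proxy never has to parse the inner message: it reads the routing key, picks the host and shard, and forwards the inner bytes unchanged.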
  • 35. Shard-aware Sync and gocqlproxy – results
        ■ Production:
          ● We’ve enabled a working prototype of gocqlproxy, running stable, for a few days
          ● We can use shard-awareness with 900 connections instead of 180000
        ■ Local synthetic benchmarks - measured latency decreases: cluster-wide approximated latencies averaged over 75-second test runs, shown as ‘non-shard-aware→shard-aware’ with the decrease computed as 100% * (before-after) / before

                 read [μs]           write [μs]
          avg    580→480 (~17%)      570→470 (~18%)
          p95    1000→980 (~2%)      1000→980 (~2%)
          p99    2000→1000 (~50%)    1900→1000 (~47%)
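The percentages in the benchmark figures follow directly from the formula quoted on the slide, 100% * (before - after) / before. A short check, with the latency pairs copied from the slide and the dictionary layout chosen only for this example:

```python
def percent_decrease(before, after):
    """Relative latency decrease in percent: 100 * (before - after) / before."""
    return 100.0 * (before - after) / before


# (before, after) latencies in microseconds, as reported on the slide
latencies = {
    ("avg", "read"): (580, 480),
    ("avg", "write"): (570, 470),
    ("p95", "read"): (1000, 980),
    ("p95", "write"): (1000, 980),
    ("p99", "read"): (2000, 1000),
    ("p99", "write"): (1900, 1000),
}

decreases = {
    key: round(percent_decrease(before, after))
    for key, (before, after) in latencies.items()
}
```

The rounded values match the slide: the averages and especially p99 improve noticeably, while p95 barely moves.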
  • 37. Take away ■ Download Opera Browser ■ Django-cassandra-engine ■ Scylla-cli ■ Scylla-proxy: ● gocqlproxy ● Python driver
  • 38. Thank you Stay in touch Any questions? Rafał Furmański rfurmanski@opera.com r4fek Piotr Olchawa polchawa@opera.com BugsKillPeople

Editor's notes

  1. 350M -> monthly active users
  2. Shard-awareness for python-driver?
     (+) no proxy required
     (-) non-obvious logic to implement (shard number calculation)
     (-) likely too many connections anyway
     gocqlproxy:
     (+) simple: most responsibilities sit in the proxy and are already implemented in gocql
     (+) connection numbers, similarly to twemproxy
     (-) non-standard protocol changes, both in the driver and in the proxy
  3. Without proxy: every worker has one connection per shard (about 300 workers x 600 shards = almost 200k).
     With proxy: every worker has just one connection to the single-process proxy (about 300 connections), and the proxy has one connection per shard (about 600 connections): 300 + 600 = 900.
     For M workers and N shards: M+N connections instead of M*N.
  4. Messages are wrapped as (message, routing_key).
     Idea #1: routing_key extracted in the proxy “transparently”; this needs parsing logic to be implemented.
     Idea #2: routing_key duplicated by the driver; the proxy can use it directly to find the right shard. Simple to implement: just (1) add a message type, (2) pass the message through to the server in the proxy. Remember that the stream id needs to come from the “outer”/“wrapping” message.
  5. Promising results in multiple-container, single-machine clusters.
     Up to 50% latency decreases (p99) when alternating between shard-aware/non-aware tests.
     Little performance improvement in production (except decreased connection counts).
     Perhaps IPC overhead is negligible in Sync for some reason?