Dubsmash grew faster than any app before it. This presentation was given at PyConDE 2016 in Munich by Daniel Taschik, Co-Founder and CTO of Dubsmash. It gives insight into the evolution of Dubsmash's infrastructure, which the team scaled from 0 to over 100 million users in 12 months.
It covers the challenges along the way and the technologies and services used. Find out more on our tech blog at https://tech.dubsmash.com
5. The Start
Backend
• Django-powered backend for content management
• web-based Dubloader to add sounds
• deployed on Heroku
Content Delivery
• sound files in S3
• meta information in JSON file in S3
• files served via Cloudfront CDN
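The content delivery setup above can be sketched as follows: the backend writes one JSON metadata document next to the sound files, and clients resolve every sound to a CDN URL without calling the backend. This is only an illustration. The CDN host, the key layout, the `.m4a` extension, and the field names are assumptions; the slides do not specify any of them.

```python
import json

# Hypothetical CDN host and key layout; not taken from the slides.
CDN_BASE = "https://cdn.example.com"


def build_manifest(sounds):
    """Serialize sound metadata into the JSON document that would be stored in S3."""
    entries = [
        {
            "id": s["id"],
            "title": s["title"],
            # Assumed layout: one object per sound under sounds/<id>.m4a
            "url": f"{CDN_BASE}/sounds/{s['id']}.m4a",
        }
        for s in sounds
    ]
    return json.dumps({"version": 1, "sounds": entries}, sort_keys=True)


def sound_urls(manifest_json):
    """Client side: map sound IDs to CDN URLs using only the manifest."""
    return {e["id"]: e["url"] for e in json.loads(manifest_json)["sounds"]}
```

With a layout like this, listing and downloading sounds never touches the Django backend, which would fit a Dubloader seeing very few requests while the CDN carries the traffic.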
Metrics
• Dubloader with < 100 req/min
• >500 TB! of traffic in January 2015
8. New Features: Registration & Search
User registration
• API based on REST framework
• Django user model
• store user’s most liked sounds
• push notifications for new content
Server-side Sound search
• new Django-based service
• search via ElasticSearch using Haystack
• Celery-based indexing on RQ
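The split between searching and background indexing can be illustrated with a toy in-memory inverted index. This is not Elasticsearch or Haystack, and the class and method names are hypothetical. It only shows the idea that index jobs are queued and processed off the request path, so a new sound becomes searchable once the indexer has run.

```python
from collections import defaultdict


class SoundSearch:
    """Toy search service: queued index jobs plus an inverted index (illustrative only)."""

    def __init__(self):
        self._index = defaultdict(set)  # token -> set of sound IDs
        self._pending = []              # queued (sound_id, title) index jobs

    def enqueue(self, sound_id, title):
        """Called when a sound is added; only records an index job."""
        self._pending.append((sound_id, title))

    def run_indexer(self):
        """Process all queued jobs, as a background worker would. Returns the job count."""
        count = len(self._pending)
        for sound_id, title in self._pending:
            for token in title.lower().split():
                self._index[token].add(sound_id)
        self._pending.clear()
        return count

    def search(self, query):
        """Return the sorted IDs of sounds whose titles contain every query token."""
        tokens = query.lower().split()
        if not tokens:
            return []
        hits = set.intersection(*(self._index.get(t, set()) for t in tokens))
        return sorted(hits)
```

Keeping indexing out of the request path means a burst of new content slows down how quickly sounds become searchable, not the search requests themselves.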
Metrics
• 100,000 registrations within first 24h
• >20,000 requests per minute on search service
10. DubTalk
Social Graph Service
• friend relations on platform
• Django
• TitanDB on Cassandra
• later DynamoDB
DubTalk Service
• group & video management
• Django
Service Communication
• Async via Celery on RabbitMQ
• Sync via internal HTTPS API
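The two communication styles can be contrasted in a small standard-library sketch. It does not use Celery or RabbitMQ: `queue.Queue` stands in for the broker, a thread stands in for the worker, and a plain method call stands in for the internal HTTPS API. The `SocialGraph` class and its methods are hypothetical names for illustration.

```python
import queue
import threading


class SocialGraph:
    """Stand-in for the Social Graph Service (in-memory, illustrative only)."""

    def __init__(self):
        self._friends = {}

    def add_friendship(self, a, b):
        self._friends.setdefault(a, set()).add(b)
        self._friends.setdefault(b, set()).add(a)

    def are_friends(self, a, b):
        return b in self._friends.get(a, set())


def start_worker(tasks, graph):
    """Consume friendship events from the queue, as a task worker would."""
    def loop():
        while True:
            a, b = tasks.get()
            graph.add_friendship(a, b)
            tasks.task_done()

    threading.Thread(target=loop, daemon=True).start()


graph = SocialGraph()
tasks = queue.Queue()  # stands in for the message broker
start_worker(tasks, graph)

# Async path: the producer enqueues the event and does not wait for a result.
tasks.put(("daniel3", "sarah"))
tasks.join()  # only to make this demo deterministic; a real producer would not block

# Sync path: the caller needs the answer immediately, so it asks directly.
can_send = graph.are_friends("daniel3", "sarah")
```

The usual trade-off is that the async path tolerates a slow or briefly unavailable consumer, while the sync path is reserved for questions whose answer the caller cannot proceed without.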
Metrics
• > 50,000 requests per minute on both services
• > 150,000,000 videos stored
12. Large Scale Problems
Favorited sounds outgrew our PostgreSQL database
• > 1,000,000,000 favorited sounds
• simple data model & access pattern
• Premium-7 instance with 120 GB RAM and 1 TB disk
dtaschik@unic0rn:~/dubsmash$ heroku pg:table-size -a dubsmash
name       | size
-----------+--------
users_favs | 158 GB

dtaschik@unic0rn:~/dubsmash$ heroku pg:index-size -a dubsmash
name                    | size
------------------------+--------
users_favs_username_key | 132 GB

ID | username | sound_id
---+----------+---------
1  | daniel3  | Dzdcjc
2  | sarah    | 3jGYzH
Let’s make it a new service!
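A rough calculation from the sizes above, followed by a sketch of the access pattern. The slides do not say which data store the new Favs service used, so the `FavStore` class below is a hypothetical in-memory model of a key-value layout (username as key, set of sound IDs as value), not the actual implementation. The calculation assumes decimal gigabytes and uses the stated lower bound of one billion rows, so the per-row figures are upper bounds.

```python
from collections import defaultdict

ROWS = 1_000_000_000  # lower bound from the slide: more than 1 billion favorited sounds
TABLE_GB = 158        # users_favs table size
INDEX_GB = 132        # users_favs_username_key index size
GB = 10**9            # assumption: decimal gigabytes

table_bytes_per_row = TABLE_GB * GB / ROWS  # at most ~158 bytes per favorite
index_bytes_per_row = INDEX_GB * GB / ROWS  # at most ~132 bytes per index entry
index_to_table = INDEX_GB / TABLE_GB        # the index is ~84% the size of the table


class FavStore:
    """Hypothetical key-value model: username -> set of favorited sound IDs."""

    def __init__(self):
        self._by_user = defaultdict(set)

    def add(self, username, sound_id):
        self._by_user[username].add(sound_id)

    def remove(self, username, sound_id):
        self._by_user[username].discard(sound_id)

    def list(self, username):
        # .get avoids creating empty entries for unknown users
        return sorted(self._by_user.get(username, set()))
```

Every operation is keyed by a single username and needs no joins, which is the kind of access pattern that maps naturally onto a partitioned key-value store.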
13. Dubsmash Service Landscape
[Architecture diagram. Components: Cloudfront CDN, S3 sound storage, Router, Monolith, Auth, Graph, DubTalk, Favs, many more. Data stores: relational DB, caching, NoSQL.]
16. Let’s say it with video!
Thank you!
daniel@dubsmash.com | daniel3 | @dtaschik
Editor's Notes
Dubsmash is a product with global scale. 100m users in 192 countries have created over 1.5bn videos in total.
The most famous ones are Jimmy Fallon, Hugh Jackman, Arnold Schwarzenegger or Jennifer Lopez, and recently emma dickson?
Connect, create, communicate with family and friends.
But it started as a very simple, 3-screen prototypical creation tool.
The cave: building Dubsmash out of a 50 sqm basement office in Berlin.