© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Miguel Alvarado, VP of Data Analytics
Alan Zawari, Senior Engineer, Content Services
December 1, 2016
Content and Data Platforms at Vevo
Rebuilding and Scaling from Zero in One Year
SVR308
About Us
• Miguel Alvarado, VP of Data Analytics
@djmalvarado
miguelalvarado
miguel.alvarado@vevo.com
• Alan Zawari, Senior Engineer, Content Services
@alanzawari
alanzawari
skilledDeveloper
alan.zawari@vevo.com
What to Expect from the Session
• Learn About Vevo and Engineering @ Vevo
• Content Services
• What is content services?
• Rearchitecting content services from the ground up
• How AWS Lambda functions fit into the picture
• Data Services
• What are data services?
• Building a data platform from scratch
• Using Amazon Kinesis as the central data nervous system
What Is Vevo?
Vevo is the world’s leading all-
premium music video and
entertainment platform with over
19 billion monthly views globally.
Vevo delivers a personalized and
expertly curated experience for
audiences to explore and discover
music videos, exclusive original
programming, and live
performances from the artists they
love on mobile, web, and
connected TV.
The Scale of Vevo
PROPERTY OF VEVO LLC PRIVATE & CONFIDENTIAL
The evolution of vevo
Before (2015)
The evolution of vevo
After (2016)
Engineering @ Vevo
• Vevo Engineering 1.0 (2009-15):
• Hybrid hosting (Rackspace + AWS)
• Megaservices
• .NET!
• No continuous delivery, no tests
• Loads of technical debt
• Vevo Engineering 2.0 (2016):
• 100% AWS + Kubernetes
• Microservices + Nanoservices (Lambdas)
• Go, Scala, Java, Node.js
• Continuous delivery, TDD, BDD
• Rewrote most of the stack
• We’re hiring!
What Is Content Services?
• Infrastructure that allows music artists to deliver video to
their audience:
• Artist and video metadata
• Video ingestion
• Video encoding
• Publishing to our own platforms and partners
• Providing APIs for our client apps
What Are Data Services?
• Data services are the collection of services and
infrastructure that encompasses the Vevo Data Platform
• The Vevo Data Platform powers two main things:
• Vevo’s “Smart Consumer Experiences” in the form of
personalization and recommendations
• Analytics for all Vevo product and business groups
• The Data Services team comprises platform engineers
and data scientists
Rearchitecting Content
Services
Old Architecture
• A giant monolithic service responsible for everything:
• Authentication
• Search
• Playlists
• Artists
• Videos and streams
• Recommendation
• More
• .NET/SQL Server stack
Old Architecture (simplified)
New Content Architecture
• Microservice/stream architecture
• Small independent services + Amazon Kinesis
• Different technology stacks: Node.js, Java, Go
• Services don’t talk to each other directly
• Every service has its own event stream (bus)
• Services can consume each other’s stream events
• Each stream has its own JSON schema
• The whole thing cannot go down
• Continuous delivery
• Cost effective
New Architecture
Developing New Architecture
• Old architecture is in production and running
• New architecture is in progress
• We wanted to feed new architecture with live data
• And get both running simultaneously
• Slowly switch traffic over to new architecture
• Connect both worlds without changing the old code?
Bridging New Architecture to the Old
• Project Mexit
• Runs on a recurring basis
• Queries old API for recent metadata changes (like a client)
• Emits the changes to new-architecture Amazon Kinesis
streams
• Fault tolerant: Stores last successful timestamp
• AWS Lambda + Amazon DynamoDB
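The fault-tolerance bullet above can be sketched as a checkpointed polling run. This is a minimal illustration, not Vevo's code: `runOnce` and the four `deps` functions are hypothetical names, and they are written synchronously to keep the sketch short (real calls to the old API, Amazon Kinesis, and Amazon DynamoDB would be asynchronous). The checkpoint only advances after a change has been emitted successfully, so a failed run is retried from the last good timestamp.

```javascript
// Sketch of one Mexit-style run. All names here are hypothetical.
// deps.loadCheckpoint()      -> ISO timestamp of the last successful emit
// deps.fetchChangesSince(ts) -> array of { id, updatedAt } from the old API
// deps.emit(change)          -> publishes one change; throws on failure
// deps.saveCheckpoint(ts)    -> persists the new timestamp
function runOnce(deps) {
  var since = deps.loadCheckpoint();
  // Process oldest first so the checkpoint only ever moves forward.
  var changes = deps.fetchChangesSince(since).slice().sort(function (a, b) {
    return a.updatedAt < b.updatedAt ? -1 : a.updatedAt > b.updatedAt ? 1 : 0;
  });
  var emitted = 0;
  for (var i = 0; i < changes.length; i++) {
    try {
      deps.emit(changes[i]);
    } catch (err) {
      // Stop at the first failure; the checkpoint stays at the last success,
      // so the next scheduled run picks up from there.
      break;
    }
    emitted++;
    deps.saveCheckpoint(changes[i].updatedAt);
  }
  return emitted;
}
```

With this shape, delivery is at-least-once: a change may be emitted again after a partial failure, so downstream consumers should tolerate duplicates.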
Bridging New Architecture to the Old
[Diagram: Live Prod Data]
Mexit DataDog Dashboard
How We Used Lambdas
• Scheduled tasks (Cron job)
• Database triggers
• User-facing services
• Other use cases
How We Used Lambdas
• Scheduled tasks (Cron job)
• Read artist/video metadata changes and:
• Update Amazon Elasticsearch Service index
• Stream changes to Amazon Kinesis (Project Mexit)
• Cache warming: Keep top artist images in the cache
• Release new videos based on startDate (Project Releasr)
• Polling every 5 sec
• Lambda schedule event: rate(1 minute)
• Long running Lambda (i.e., 5 min timeout)
• 5-sec intervals
Releasr, 5-Sec Recurring Task
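var processing = false; // guards against overlapping runs

// Placeholder (not on the slide): release videos whose startDate has passed.
function processItems() {
  processing = true;
  // ... do the work, then clear the flag when it completes ...
  processing = false;
}

module.exports.handler = function (event, context, callback) {
  // Assumption: the slide does not show where LAMBDA_TIMEOUT is defined;
  // here it is taken from the time left in this invocation (in ms).
  var LAMBDA_TIMEOUT = context.getRemainingTimeInMillis();
  // MAIN TIMER - runs every 5 seconds
  var timer = setInterval(function () {
    if (!processing) {
      processItems();
    }
    // else: still processing, come back on the next tick
  }, 5000);
  // Finish the Lambda one second before it would time out
  setTimeout(function () {
    clearInterval(timer);
    console.log("Finished processing at", new Date().toISOString());
    callback();
  }, LAMBDA_TIMEOUT - 1000);
};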
How We Used Lambdas (cont.)
• Database triggers
• Send user likes to Amazon Kinesis (Project Dartmouth)
• Export user likes to Amazon S3/Amazon Redshift
• Cross-account Amazon DynamoDB replication (Project Fargo)
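As an illustration of the database-trigger pattern above, the sketch below turns a DynamoDB Streams event into like/unlike messages that could then be put on a stream. The record fields (`eventName`, `dynamodb.NewImage`, `dynamodb.OldImage`, and the `{ S: ... }` attribute encoding) follow the DynamoDB Streams event format; the function name `toLikeEvents` and the attribute names are assumptions, since the slides do not show the table layout.

```javascript
// Convert a DynamoDB Streams event into like/unlike messages.
// Assumed table attributes: user_id, entity_type, entity_id (all strings).
function toLikeEvents(event) {
  var out = [];
  (event.Records || []).forEach(function (record) {
    var action;
    var image;
    if (record.eventName === 'INSERT') {
      action = 'LIKE';
      image = record.dynamodb.NewImage;
    } else if (record.eventName === 'REMOVE') {
      action = 'UNLIKE';
      image = record.dynamodb.OldImage;
    } else {
      return; // MODIFY is ignored in this sketch
    }
    if (!image) {
      return; // the stream view type may not include item images
    }
    out.push({
      user_id: image.user_id.S,
      entity_type: image.entity_type.S,
      entity_id: image.entity_id.S,
      action: action
    });
  });
  return out;
}
```

Mapping INSERT to LIKE and REMOVE to UNLIKE is one possible convention; a table that keeps a "liked" flag per item would instead inspect MODIFY records.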
How We Used Lambdas (cont.)
• User-facing services
• Project Susa, Vevo link shortening and social interaction
tracking
• Pure serverless!
• Consists of nanoservices:
• Shorten
• Expand
• Event-Publisher: DB trigger to capture/publish events
• AWS Lambda, Amazon API Gateway, and Amazon DynamoDB
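A rough sketch of what the Shorten and Expand nanoservices do, under stated assumptions: the in-memory `links` object stands in for the Amazon DynamoDB table, the hashing scheme for short codes is invented for illustration, and collisions between codes are ignored. The object returned by `expand` is shaped like an API Gateway Lambda proxy response.

```javascript
var crypto = require('crypto');

// In-memory stand-in for the link table (the slides name Amazon DynamoDB).
var links = {};

// "Shorten": store the full URL under a short code and return the code.
function shorten(url) {
  // Hypothetical scheme: first 7 hex characters of the URL's SHA-256 hash.
  var code = crypto.createHash('sha256').update(url).digest('hex').slice(0, 7);
  links[code] = url;
  return code;
}

// "Expand": look up the code and build a redirect-style response.
function expand(code) {
  var url = links[code];
  if (!url) {
    return { statusCode: 404, body: 'Not found' };
  }
  return { statusCode: 302, headers: { Location: url } };
}
```

In the architecture described above, the Event-Publisher would observe writes to the table rather than being called from these functions, which keeps Shorten and Expand free of publishing logic.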
How We Used Lambdas (cont.)
• Project Susa: Creating a short link
[Diagram. Components: Client, REST API (API Gateway), Lambda functions l0, l1, and l3, Auth API v2 (decode token), Video Metadata API v2, Amazon Kinesis bus (Core Events), data consumers. Labeled steps: create a new short link, authentication, get YouTube URL, store link and parameters, record a ‘Share’ event, response.]
How We Used Lambdas (cont.)
• Project Susa: Clicking on a short link
[Diagram. Components: REST API (API Gateway), Lambda functions l0 and l2, Amazon Kinesis bus (Core Events), data consumers. Labeled steps: click on short link, retrieve full URL, record a ‘Click’ event, redirection response.]
How We Used Lambdas (cont.)
• Project Susa: Lambda/API Gateway Scalability
[Chart: traffic spike at 80X normal traffic]
How We Used Lambdas (cont.)
• Other use cases:
• Sending data to third-party ML providers (Project Dartmouth)
• Slack integration (on-demand cache buster):
Building Data Services from
Scratch
Old World
• No Data team
• No Vevo Data Platform
• No first-party data
• No data science
• No personalization or recommendations
• Used third-party comScore DAX for analytics
• No continuous delivery
Old World
Data Science Leapfrog:
Project Dartmouth
• 5 ML companies A, B, C, D, E
• Power the Feed on iOS
• Real-time event collection
• Real-time recommendations
• Goal: improve swipe/click
ratio
Project Dartmouth and Event Collection POC
Current Data Services Architecture
Endo Cross-Platform SDK
Amazon Kinesis at the Heart
Service-to-Service Contracts
• Based on JSON schemas, considered Avro and Protocol
Buffers
• All entities that are shared via Amazon Kinesis need a
schema
• Full payloads can be passed or just notification with the
ID of new or modified entity
• Messages should be dropped into Amazon Kinesis at
the time of creation or modification
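To make the contract rules above concrete, here is a minimal sketch that checks a like event against its schema and then builds the parameters for a Kinesis PutRecord call, either with the full payload or as a notification carrying only the entity reference. It is an illustration, not Vevo's implementation: `validate` is a hand-rolled check covering only the `required`, string `type`, `minLength`, and `enum` keywords (a real service would more likely use a JSON Schema validator library), `likeSchema` is a trimmed copy of the deck's like-event schema, and `buildRecord`, the stream name, and the choice of partition key are assumptions.

```javascript
// Trimmed copy of the like-event schema, limited to the keywords checked below.
var likeSchema = {
  required: ['user_id', 'entity_type', 'entity_id', 'action'],
  properties: {
    user_id: { type: 'string', minLength: 1 },
    entity_type: { type: 'string', enum: ['USER', 'PLAYLIST', 'VIDEO', 'ARTIST'] },
    entity_id: { type: 'string', minLength: 1 },
    action: { type: 'string', enum: ['LIKE', 'UNLIKE'] }
  }
};

// Returns a list of error strings; an empty list means the event is valid.
function validate(schema, event) {
  var errors = [];
  var has = function (name) {
    return Object.prototype.hasOwnProperty.call(event, name);
  };
  (schema.required || []).forEach(function (name) {
    if (!has(name)) {
      errors.push('missing: ' + name);
    }
  });
  Object.keys(schema.properties || {}).forEach(function (name) {
    if (!has(name)) {
      return;
    }
    var rule = schema.properties[name];
    var value = event[name];
    if (rule.type === 'string' && typeof value !== 'string') {
      errors.push('not a string: ' + name);
      return;
    }
    if (rule.minLength !== undefined && value.length < rule.minLength) {
      errors.push('too short: ' + name);
    }
    if (rule.enum && rule.enum.indexOf(value) === -1) {
      errors.push('not allowed: ' + name);
    }
  });
  return errors;
}

// Builds PutRecord parameters (StreamName, PartitionKey, Data).
// fullPayload = true sends the whole event; false sends only the entity reference.
function buildRecord(streamName, event, fullPayload) {
  var errors = validate(likeSchema, event);
  if (errors.length > 0) {
    throw new Error('schema violation: ' + errors.join(', '));
  }
  var body = fullPayload
    ? event
    : { entity_type: event.entity_type, entity_id: event.entity_id };
  return {
    StreamName: streamName,
    PartitionKey: event.entity_id, // same entity -> same shard, so order is kept
    Data: JSON.stringify(body)
  };
}
```

Notification-only messages keep records small but force consumers to fetch the entity themselves; full payloads avoid that round trip at the cost of larger records.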
Central Repo/Sample JSON Schema
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "id": "http://schemas.vevo.com/streams/like/1.0/like-event.json#",
  "type": "object",
  "properties": {
    "user_id": {
      "type": "string",
      "minLength": 1
    },
    "entity_type": {
      "type": "string",
      "enum": ["USER", "PLAYLIST", "VIDEO", "ARTIST"]
    },
    "entity_id": {
      "type": "string",
      "minLength": 1
    },
    "action": {
      "type": "string",
      "enum": ["LIKE", "UNLIKE"]
    }
  },
  "required": ["user_id", "entity_type", "entity_id", "action"]
}
Spark for Most Data Processing
Amazon Redshift for Analytics
Current Data Services Architecture (recap)
Lessons
Lessons Learned
• Lambda (and resource) deployment can be challenging
• Used the Serverless Framework
• Integrated with our CI/CD framework
• Lambda throttled invocations
• Watched it and increased concurrent invocations per account
• Lambda cold start issue
• Kept it warm by frequent invocations (every 5 min.)
• Standardized what goes on stream
• A central place for schemas
• A central place for error messages
Lessons Learned (cont.)
• Don’t try to boil the ocean all at once
• Use real user data to make decisions
• Reuse existing technology as much as possible rather than building from scratch
• If you’re serious about analytics, build your own platform
Thank you!
Remember to complete
your evaluations!

(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 

AWS re:Invent 2016: Content and Data Platforms at Vevo: Rebuilding and Scaling from Zero (SVR308)

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Miguel Alvarado, VP of Data Analytics Alan Zawari, Senior Engineer, Content Services December 1, 2016 Content and Data Platforms at Vevo Rebuilding and Scaling from Zero in One Year SVR308
  • 2. About Us • Miguel Alvarado, VP of Data Analytics @djmalvarado miguelalvarado miguel.alvarado@vevo.com • Alan Zawari, Senior Engineer, Content Services @alanzawari alanzawari skilledDeveloper alan.zawari@vevo.com
  • 3. What to Expect from the Session • Learn About Vevo and Engineering @ Vevo • Content Services • What is content services? • Rearchitecting content services from the ground up • How AWS Lambda functions fit into the picture • Data Services • What are data services? • Building a data platform from scratch • Using Amazon Kinesis as the central data nervous system
  • 4. What Is Vevo? Vevo is the world’s leading all-premium music video and entertainment platform with over 19 billion monthly views globally. Vevo delivers a personalized and expertly curated experience for audiences to explore and discover music videos, exclusive original programming, and live performances from the artists they love on mobile, web, and connected TV.
  • 6. PROPERTY OF VEVO LLC PRIVATE & CONFIDENTIAL The evolution of Vevo: Before (2015)
  • 7. The evolution of Vevo: After (2016)
  • 9. Engineering @ Vevo • Vevo Engineering 1.0 (2009-15): • Hybrid hosting (Rackspace + AWS) • Megaservices • .Net! • No continuous delivery, no tests • Loads of technical debt • Vevo Engineering 2.0 (2016): • 100% AWS + Kubernetes • Microservices + Nanoservices (Lambdas) • Go, Scala, Java, Node.js • Continuous delivery, TDD, BDD • Rewritten most of stack • We’re hiring!
  • 10. What Is Content Services?
  • 11. What Is Content Services? • Infrastructure that allows music artists to deliver video to their audience: • Artist and video metadata • Video ingestion • Video encoding • Publishing to our own platforms and partners • Providing APIs for our client apps
  • 12. What Are Data Services?
  • 13. What Are Data Services? • Data services are the collection of services and infrastructure that encompasses the Vevo Data Platform • The Vevo Data Platform powers two main things: • Vevo’s “Smart Consumer Experiences” in the form of personalization and recommendations • Analytics for all Vevo product and business groups • The Data Services team comprises platform engineers and data scientists
  • 15. Old Architecture • A giant monolithic service responsible for everything: • Authentication • Search • Playlists • Artists • Videos and streams • Recommendation • More • .NET/SQL Server stack
  • 17. New Content Architecture • Microservice/stream architecture • Small independent services + Amazon Kinesis • Different technology stacks: Node.js, Java, Go • Services don’t talk to each other directly • Every service has its own event stream (bus) • Services can consume each other’s stream events • Each stream has its own JSON schema • The whole thing cannot go down • Continuous delivery • Cost effective
  • 19. Developing New Architecture • Old architecture is in production and running • New architecture is in progress • We wanted to feed new architecture with live data • And get both running simultaneously • Slowly switch traffic over to new architecture • Connect both worlds without changing the old code?
  • 20. Bridging New Architecture to the Old • Project Mexit • Runs on a recurring basis • Queries old API for recent metadata changes (like a client) • Emits the changes to new-architecture Amazon Kinesis streams • Fault tolerant: Stores last successful timestamp • AWS Lambda + Amazon DynamoDB
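The bridge described on this slide can be sketched roughly as follows. This is a minimal illustration, not Vevo's actual Mexit code: the function name `runBridgeOnce` and the injected dependencies `fetchChangesSince`, `checkpointStore`, and `emit` are hypothetical stand-ins for the old API client, a DynamoDB-backed timestamp record, and a Kinesis producer. The point it shows is the fault-tolerance rule: the stored timestamp only advances after every change has been emitted.

```javascript
// Hypothetical sketch of a Mexit-style polling bridge. The three dependencies
// are injected so the logic can run without AWS access; in a real deployment
// they would wrap the old API, a DynamoDB item, and a Kinesis stream.
async function runBridgeOnce({ fetchChangesSince, checkpointStore, emit }) {
  // Resume from the last successful run (0 means "never ran before").
  const since = (await checkpointStore.get()) || 0;
  const changes = await fetchChangesSince(since);

  let newest = since;
  for (const change of changes) {
    // If emit throws, the function rejects and the checkpoint stays untouched.
    await emit(change);
    newest = Math.max(newest, change.updatedAt);
  }

  // Persist the checkpoint only after every change was emitted, so a failed
  // run is retried from the same timestamp on the next invocation.
  if (newest > since) {
    await checkpointStore.set(newest);
  }
  return { emitted: changes.length, checkpoint: newest };
}
```

Because a failed run is retried from the old timestamp, some changes may be emitted more than once, so downstream consumers in a design like this would need to tolerate duplicates.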
  • 21. Bridging New Architecture to the Old Live Prod Data
  • 23. How We Used Lambdas • Scheduled tasks (Cron job) • Database triggers • User-facing services • Other use cases
  • 24. How We Used Lambdas • Scheduled tasks (Cron job) • Read artist/video metadata changes and: • Update Amazon Elasticsearch Service index • Stream changes to Amazon Kinesis (Project Mexit) • Cache warming: Keep top artist images in the cache • Release new videos based on startDate (Project Releasr) • Polling every 5 sec • Lambda schedule event: rate(1 minute) • Long running Lambda (i.e., 5 min timeout) • 5-sec intervals
  • 25. Releasr, 5-Sec Recurring Task
      // processing, processItems, and LAMBDA_TIMEOUT are not shown on the slide;
      // they are assumed to be defined elsewhere in the module.
      module.exports.handler = function (event, context, callback) {
        // MAIN TIMER - runs every 5 seconds
        var timer = setInterval(function () {
          if (!processing) {
            processItems();
          } else {
            // still processing, come back later!
          }
        }, 5000);

        // finish the Lambda one second before it times out
        setTimeout(function () {
          clearInterval(timer);
          console.log("Finished processing at ", new Date().toISOString());
          callback();
        }, LAMBDA_TIMEOUT - 1000);
      };
  • 26. How We Used Lambdas (cont.) • Database triggers • Send user likes to Amazon Kinesis (Project Dartmouth) • Export user likes to Amazon S3/Amazon Redshift • Cross-account Amazon DynamoDB replication (Project Fargo):
  • 27. How We Used Lambdas (cont.) • User-facing services • Project Susa, Vevo link shortening and social interaction tracking • Pure serverless! • Consists of nanoservices: • Shorten • Expand • Event-Publisher: DB trigger to capture/publish events • AWS Lambda, Amazon API Gateway, and Amazon DynamoDB
  • 28. How We Used Lambdas (cont.) • Project Susa: Creating a short link • [Diagram; labels: Client, REST API (API Gateway), Lambdas l0, l1, l3, Auth API v2 (Decode Token), API v2 (Video Metadata), Core, Events, Amazon Kinesis bus, Data Consumers; steps: Authentication, Get YouTube URL, Create a new short link, Store link and parameters, Record a ‘Share’ event, Response]
  • 29. How We Used Lambdas (cont.) • Project Susa: Clicking on a short link • [Diagram; labels: REST API (API Gateway), Lambdas l0, l2, Core, Events, Amazon Kinesis bus, Data Consumers; steps: Click on short link, Retrieve full URL, Record a ‘Click’ event, Redirection response]
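The two Susa flows above (create a short link, then click it) can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: the slides do not say how short codes are generated, so a base-62 counter is used; the `Map` stands in for the DynamoDB table, and the `events` array stands in for the events the deck describes as published via a database trigger.

```javascript
// Illustrative only: a counter-based base-62 code generator (an assumption,
// not the scheme used by Project Susa).
const ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

function toBase62(n) {
  if (n === 0) return ALPHABET[0];
  let out = "";
  while (n > 0) {
    out = ALPHABET[n % 62] + out;
    n = Math.floor(n / 62);
  }
  return out;
}

function createShortener() {
  const links = new Map(); // stands in for the DynamoDB table
  const events = [];       // stands in for the published event stream
  let nextId = 1;
  return {
    // "Shorten" nanoservice: store the link and record a share event.
    shorten(url) {
      const code = toBase62(nextId++);
      links.set(code, url);
      events.push({ type: "SHARE", code });
      return code;
    },
    // "Expand" nanoservice: look up the full URL and record a click event.
    expand(code) {
      const url = links.get(code);
      if (url === undefined) return null;
      events.push({ type: "CLICK", code });
      return url;
    },
    events,
  };
}
```

In the serverless setup the deck describes, each of `shorten` and `expand` would be its own Lambda function behind API Gateway; keeping them in one object here only serves to make the sketch self-contained.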
  • 30. How We Used Lambdas (cont.) • Project Susa: Lambda/API Gateway Scalability Spike: 80X normal traffic
  • 31. How We Used Lambdas (cont.) • Other use cases: • Sending data to third-party ML providers (Project Dartmouth) • Slack integration (on-demand cache buster):
  • 32. Building Data Services from Scratch
  • 33. Old World • No Data team • No Vevo Data Platform • No first-party data • No data science • No personalization or recommendations • Used third-party comScore DAX for analytics • No continuous delivery
  • 36. Project Dartmouth • 5 ML companies A, B, C, D, E • Power the Feed on iOS • Real-time event collection • Real-time recommendations • Goal: improve swipe/click ratio
  • 37. Project Dartmouth and Event Collection POC
  • 38. Current Data Services Architecture
  • 40. Amazon Kinesis at the Heart
  • 41. Service-to-Service Contracts • Based on JSON schemas (Avro and Protocol Buffers were also considered) • All entities that are shared via Amazon Kinesis need a schema • Either the full payload or just a notification with the ID of the new or modified entity can be passed • Messages should be dropped into Amazon Kinesis at the time of creation or modification
  • 43. Central Repo/Sample JSON Schema
      {
        "$schema": "http://json-schema.org/draft-04/schema#",
        "id": "http://schemas.vevo.com/streams/like/1.0/like-event.json#",
        "type": "object",
        "properties": {
          "user_id": { "type": "string", "minLength": 1 },
          "entity_type": { "type": "string", "enum": ["USER", "PLAYLIST", "VIDEO", "ARTIST"] },
          "entity_id": { "type": "string", "minLength": 1 },
          "action": { "type": "string", "enum": ["LIKE", "UNLIKE"] }
        },
        "required": ["user_id", "entity_type", "entity_id", "action"]
      }
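To make the contract concrete, the sketch below checks an event against the constraints of the sample like-event schema by hand. It is not a general JSON Schema validator and is not taken from the talk; the function name `validateLikeEvent` and the error strings are hypothetical. A producer could run a check like this before putting a record on its stream.

```javascript
// Hand-written check mirroring the constraints in the sample like-event
// schema. Illustrative only; a real service would more likely use a full
// JSON Schema validator library.
const ENTITY_TYPES = ["USER", "PLAYLIST", "VIDEO", "ARTIST"];
const ACTIONS = ["LIKE", "UNLIKE"];

function validateLikeEvent(event) {
  if (typeof event !== "object" || event === null || Array.isArray(event)) {
    return ["event must be an object"];
  }
  const errors = [];
  // user_id and entity_id: required strings with minLength 1.
  for (const field of ["user_id", "entity_id"]) {
    if (typeof event[field] !== "string" || event[field].length < 1) {
      errors.push(field + " must be a non-empty string");
    }
  }
  // entity_type and action: required, restricted to the enum values.
  if (!ENTITY_TYPES.includes(event.entity_type)) {
    errors.push("entity_type must be one of " + ENTITY_TYPES.join(", "));
  }
  if (!ACTIONS.includes(event.action)) {
    errors.push("action must be one of " + ACTIONS.join(", "));
  }
  return errors; // an empty array means the event satisfies the contract
}
```

For example, `validateLikeEvent({ user_id: "u1", entity_type: "VIDEO", entity_id: "v1", action: "LIKE" })` returns an empty array, while an event with a missing `user_id` returns one error message.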
  • 43. Spark for Most Data Processing
  • 44. Amazon Redshift for Analytics
  • 45. Current Data Services Architecture (recap)
  • 47. Lessons Learned • Lambda (and resources) deployment could be challenging • Used serverless framework • Integrated with our CI/CD framework • Lambda throttled invocations • Watched it and increased concurrent invocations per account • Lambda cold start issue • Kept it warm by frequent invocations (every 5 min.) • Standardized what goes on stream • A central place for schemas • A central place for error messages
  • 48. Lessons Learned (cont.) • Don’t try to boil the ocean all at once • Use real user data to make decisions • Reuse existing technology where possible instead of building it yourself • If you’re serious about analytics, build your own platform
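The cold-start lesson (keeping a Lambda warm with an invocation every 5 minutes) can be sketched as a small wrapper. This is an assumption about how such a setup might look, not code from the talk: it presumes the keep-warm trigger is a scheduled CloudWatch Events rule, whose events carry `source: "aws.events"`, and the names `isWarmUpEvent` and `withWarmUp` are hypothetical.

```javascript
// Assumption: keep-warm invocations come from a scheduled CloudWatch Events
// rule, so their payload has source "aws.events".
function isWarmUpEvent(event) {
  return Boolean(event) && event.source === "aws.events";
}

// Wraps a callback-style Lambda handler so warm-up pings return immediately
// without running the real work.
function withWarmUp(handler) {
  return function (event, context, callback) {
    if (isWarmUpEvent(event)) {
      return callback(null, { warmed: true });
    }
    return handler(event, context, callback);
  };
}

// Hypothetical business handler; in a Lambda module this would be assigned
// to module.exports.handler.
const handler = withWarmUp(function (event, context, callback) {
  callback(null, { statusCode: 200, body: "ok" });
});
```

A function whose real work is itself triggered by a schedule would need a different marker for warm-up pings (for example, a custom field in the rule's input), since this check treats every scheduled event as a ping.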