Ilai Malka from Nielsen at AWS Community Day TLV, December 2019 (https://awscommunitydaytelaviv2019.splashthat.com/):
Scheduling big data workloads is challenging. It's extra challenging when running on Serverless infrastructure.
At Nielsen Marketing Cloud, we've built a system that uploads 250 billion events per day to partner ad platforms, running on Serverless infrastructure (AWS Lambda and OpenFaaS).
Creating a 'scheduler' for this system required:
1. Rate-limiting to prevent flooding partner platforms.
2. High utilization to keep costs low.
3. Careful bottleneck management to keep the system humming.
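To make the first requirement concrete, here is a minimal sketch of a token-bucket rate limiter, one common way to cap how fast requests are sent to a partner platform. It is an illustration only: the talk does not say which algorithm the NMC scheduler uses, and the class name `TokenBucket`, its parameters, and the injectable `clock` are all hypothetical.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`.

    Hypothetical helper for illustration; not the scheduler from the talk.
    """

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum tokens the bucket can hold
        self.clock = clock               # injectable so tests can control time
        self.tokens = float(capacity)    # start with a full bucket
        self.last = clock()

    def try_acquire(self, n=1):
        """Take `n` tokens if available; return False instead of blocking."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill according to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

A caller would check `try_acquire()` before each upload and delay or requeue the work when it returns False. Note that a bucket held in one process only limits that process; with many concurrent function instances, the token state would have to live in a shared store, which this sketch does not cover.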
https://www.linkedin.com/in/ilai-malka-93b06172/
https://twitter.com/IlaiMalka
#Nielsen #NielsenMarketingCloud #AWSCommunityDay #Serverless
2. What You’ll Hear About
• Our data pipeline
• Why Serverless?
• Problems we had to solve
- Cost
- Rate Limiting
3. About me
My Post about Serverless:
https://medium.com/nmc-techblog/going-serverless-c334ae242ca6
NMC Tech Blog:
https://medium.com/nmc-techblog
Ilai Malka
Big Data Developer
4. Segmentation Upload To Networks Run Campaigns
About Nielsen Marketing Cloud (NMC)
10 Billion Profiles 140 Ad Networks 999+ Campaigns
5. Segmentation Upload To Networks Run Campaigns
About Nielsen Marketing Cloud (NMC)
10 Billion Profiles 140 Ad Networks 9999 Campaigns 250 Billion Events
24. Key takeaways
• Serverless is the next revolution
• Serverless has built-in scalability + shorter time to market
• Cost is linear in compute power
• Incentive to optimize: optimization = cost savings
• Cost formula is not straightforward -> find the right memory setting with a tool
• Costs can get out of control -> add alerts
• Hybrid solution = scalability + low costs
• The rest of the world doesn't use serverless, so we need to avoid flooding them
• Find the right bottleneck and solve it
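As a rough illustration of why the cost formula is not straightforward: AWS Lambda bills per request plus per GB-second (memory times duration), and more memory usually shortens the duration, so the cheapest setting is often not the smallest one. The sketch below is not the tool the talk refers to. The function names are hypothetical, the default prices are assumptions that should be checked against current AWS pricing for your region, and the durations in the usage example are made up.

```python
def lambda_cost(invocations, avg_duration_ms, memory_mb,
                price_per_gb_second=0.0000166667,     # assumed price, verify for your region
                price_per_million_requests=0.20):     # assumed price, verify for your region
    """Estimate the cost in USD of a batch of Lambda invocations.

    Ignores the free tier and billing-granularity rounding.
    """
    gb_seconds = invocations * (avg_duration_ms / 1000.0) * (memory_mb / 1024.0)
    request_cost = invocations / 1_000_000 * price_per_million_requests
    return gb_seconds * price_per_gb_second + request_cost


def cheapest_memory(invocations, measured_ms_by_memory):
    """Pick the memory size (MB) with the lowest estimated cost.

    `measured_ms_by_memory` maps memory size to a measured average duration.
    """
    return min(
        measured_ms_by_memory,
        key=lambda mb: lambda_cost(invocations, measured_ms_by_memory[mb], mb),
    )


# Made-up measurements: doubling memory from 128 MB more than halves the runtime.
measurements = {128: 1000, 256: 450, 512: 300}
best = cheapest_memory(1_000_000, measurements)
```

With these made-up numbers the 256 MB setting comes out cheapest, because its shorter runtime more than offsets the higher per-second price. Real measurements per memory setting are needed before drawing any conclusion.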