1. We connect your dots
Ads in the cloud
Andrea Fiore
Managing Director
Copyright 2012 DotAndMedia – www.dotandmedia.com
2. We connect your dots
Dot&Ads is our multichannel ad-serving system used by several
leading publishers in Italy.
3. We connect your dots
Dot&Ads delivers more than
7 billion impressions per month
on a 24/7 service
4. We connect your dots
How we use AWS:
We use EC2, Auto Scaling and Elastic Load Balancing to
deliver ads alongside our main infrastructure.
Scripts monitor our local farm and absorb peaks
automatically: when needed, they change the DNS
entries in Route 53 so that more traffic is
directed to the AWS load balancers.
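
The peak-absorbing logic described above can be sketched roughly as follows. This is a minimal illustration, not the actual DotAndMedia scripts: it assumes that traffic is split with Route 53 weighted records, and the threshold, the weights, the record names and both helper functions (`aws_weight`, `build_change_batch`) are hypothetical. The dictionary it builds follows the `ChangeBatch` shape accepted by the Route 53 `ChangeResourceRecordSets` API.

```python
def aws_weight(local_load: float, threshold: float = 0.7, max_weight: int = 100) -> int:
    """Return the share of traffic (0..max_weight) to send to AWS.

    local_load is the utilisation of the local farm in [0, 1] (hypothetical metric).
    Below the threshold all traffic stays local; above it the AWS share
    grows linearly until the local farm is fully loaded.
    """
    if local_load <= threshold:
        return 0
    excess = min(local_load, 1.0) - threshold
    return round(max_weight * excess / (1.0 - threshold))


def build_change_batch(name, local_target, aws_target, weight_aws,
                       max_weight=100, ttl=60):
    """Build a Route 53 ChangeBatch with two weighted CNAME records."""
    def record(set_id, target, weight):
        return {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": name,
                "Type": "CNAME",
                "SetIdentifier": set_id,  # distinguishes the weighted records
                "Weight": weight,
                "TTL": ttl,               # short TTL so changes propagate quickly
                "ResourceRecords": [{"Value": target}],
            },
        }

    return {
        "Changes": [
            record("local", local_target, max_weight - weight_aws),
            record("aws", aws_target, weight_aws),
        ]
    }


# Example with made-up host names: the local farm is at 85% utilisation.
batch = build_change_batch(
    "ads.example.com.",
    "farm.example.com.",
    "my-elb.example.amazonaws.com.",
    aws_weight(0.85),
)
```

In a real script the resulting `batch` would be submitted to Route 53, for instance through an AWS SDK call that takes a hosted zone ID and a `ChangeBatch`; that step is left out here so the sketch runs without credentials.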
5. We connect your dots
Amazon
CloudWatch
6. We connect your dots
The Earthquake Case Study
• We experienced a traffic peak of 60 Mbit/s (+66%) after an
earthquake in Northern Italy (June 2012);
• Half of that peak was automatically diverted to our EC2 infrastructure,
avoiding service interruptions and delays in response.
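
As a rough back-of-the-envelope check of these figures, the sketch below derives the implied normal traffic level. It assumes that "+66%" is measured against the usual level; the slide does not say whether "half of that peak" means half of the total 60 Mbit/s or half of the extra traffic, so both readings are computed. The function name is hypothetical.

```python
def peak_breakdown(peak_mbit: float = 60.0, increase: float = 0.66) -> dict:
    """Split a traffic peak into baseline and extra traffic (all in Mbit/s)."""
    baseline = peak_mbit / (1.0 + increase)  # usual traffic before the peak
    extra = peak_mbit - baseline             # traffic added by the event
    return {
        "baseline": baseline,
        "extra": extra,
        "half_of_peak": peak_mbit / 2.0,     # reading 1: half of the total
        "half_of_extra": extra / 2.0,        # reading 2: half of the increase
    }


figures = peak_breakdown()
for label, value in figures.items():
    print(f"{label}: {value:.1f} Mbit/s")
```

Under these assumptions the usual traffic is about 36 Mbit/s, so diverting half of the total peak (30 Mbit/s) would leave the local farm below its normal load.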
7. We connect your dots
AWS Pros for our business:
• Scale up and down at will;
• 24/7 service uptime;
• Pay only for what we use and need: as a start-up we used AWS
EC2 to grow bandwidth, computation and hardware progressively;
• Shorter time to market for tests and experiments;
• Static files (Flash files, videos, images, JavaScript libraries) are
stored on S3 and distributed via CloudFront;
• Monitoring tools.
8. We connect your dots
BigData and MapReduce
• We produce about 8 billion log entries that
have to be processed to count distinct
browsers and other variables;
• We decided to use the Hadoop framework
and MapReduce to complete the task.
9. We connect your dots
MapReduce Steps
10. We connect your dots
MapReduce Steps
• In the Map phase, the data is parsed to find the
key/value pairs matching the search;
• Then a partition function assigns those pairs to
the reducers, trying to distribute them uniformly;
• After a comparison phase in which the pairs are
sorted, a Reduce function iterates through the
data, producing zero or more results;
• Finally, an output writer writes the results (e.g.
to local storage or to S3).
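
The steps above can be imitated in a single Python process to make the data flow concrete. This is only a toy sketch, not Hadoop code: the tab-separated log format (campaign, browser ID) and all function names are assumptions made for illustration, and the reducer keeps every value of a key in memory, which a real job over billions of entries would avoid.

```python
from itertools import groupby
from operator import itemgetter
from zlib import crc32


def map_phase(line):
    """Parse one log line and emit (campaign, browser_id) pairs."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) >= 2:  # malformed lines emit nothing
        yield fields[0], fields[1]


def partition(key, num_reducers):
    """Assign a key to a reducer; hashing spreads keys roughly uniformly."""
    return crc32(key.encode("utf-8")) % num_reducers


def reduce_phase(key, values):
    """Emit the number of distinct browsers seen for one campaign."""
    yield key, len(set(values))


def run_job(lines, num_reducers=2):
    # Map and partition: each emitted pair goes to exactly one reducer bucket.
    buckets = [[] for _ in range(num_reducers)]
    for line in lines:
        for key, value in map_phase(line):
            buckets[partition(key, num_reducers)].append((key, value))

    # Sort each bucket, then reduce every run of pairs sharing a key.
    results = []
    for bucket in buckets:
        bucket.sort()
        for key, pairs in groupby(bucket, key=itemgetter(0)):
            results.extend(reduce_phase(key, (value for _, value in pairs)))

    # Output writer stand-in: return the results instead of writing part-files.
    return sorted(results)


sample_log = ["c1\tb1", "c1\tb2", "c1\tb1", "c2\tb3", "malformed line"]
distinct_browsers = run_job(sample_log)
```

Because all pairs with the same key land on the same reducer, each reducer can count distinct values for its keys independently of the others.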
11. We connect your dots
MapReduce Steps
[Diagram. Labels: Corporate data center; Log retrieving; Application Logs (S3 bucket); Amazon Elastic MapReduce; MR Results (S3 bucket); Reporting UI]
12. We connect your dots
How we use EMR
• Our frontends save application logs on S3;
• A script consolidates them into bigger files and moves
them to the data repository bucket on S3;
• Through a dedicated UI, our users can query the
logs, drilling down by several dimensions/filters;
• Then a script prepares and executes a job on EMR;
• When the job is completed, another script
collects all the part-files produced by
MapReduce and adds the column names.
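
The last step could look roughly like the sketch below. It is not the original script: it assumes the part-files have already been downloaded from S3 into a local directory, that each one ends with a newline, and that the output is tab-separated; the function name and the column names in the example are hypothetical.

```python
import shutil
from pathlib import Path


def merge_part_files(job_dir, out_path, columns, sep="\t"):
    """Concatenate MapReduce part-files into one file with a header row.

    Returns the number of part-files merged.
    """
    # Sort by name so part-00000, part-00001, ... are merged in order.
    parts = sorted(Path(job_dir).glob("part-*"))
    with open(out_path, "w", encoding="utf-8", newline="") as out:
        out.write(sep.join(columns) + "\n")  # the column names come first
        for part in parts:
            with open(part, "r", encoding="utf-8", newline="") as src:
                shutil.copyfileobj(src, out)  # stream without loading whole files
    return len(parts)
```

A call such as `merge_part_files("job_output", "report.tsv", ["campaign", "distinct_browsers"])` would then produce a single file that a reporting UI can load directly.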
13. We connect your dots
See also
• The Hadoop project:
– http://hadoop.apache.org/
• Apache Hadoop 1.0.3 Tutorial:
– http://hadoop.apache.org/docs/r1.0.3/mapred_tutorial.html
• Another MapReduce tutorial:
– http://code.google.com/intl/it/edu/parallel/mapreduce-tutorial.htm
• The new Hadoop Model: YARN
– http://hadoop.apache.org/docs/current/
14. We connect your dots
Contact us:
info@dotandmedia.com
www.dotandmedia.com
Thank you