Data Science in the Newsroom
Geetu Ambwani
Principal Data Scientist
geetu.ambwani@huffingtonpost.com
What is the Huffington Post?
● Founded: May 2005
● Ranking among digital-only news websites: 1
● Cross-platform monthly unique visitors: over 187 million
● Articles per day: over 500
● International editions: 15
● Bloggers: over 100,000
News Industry - Trends
HuffPost has consistently been an innovator in the digital publishing space.
● Massive blogging network: more than 100K bloggers across the globe
● Google Site Rank
● Biggest social publisher
News Industry - Challenges
How Can Data Help?
● Ad campaigns
● International editions
● Social media promotion
● Editors
● User experience
● Blog moderators
● Reporters
● HuffPost Studio
Content Lifecycle
Creation | Distribution | Consumption
Content Creation: How Can Data Help?
● Tools to surface and discover trends across different parts of the web
● Content enhancement with multimedia (images, slideshows, videos) chosen by semantic matching
● Optimizing headlines and images (RobinHood platform)
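The slides do not say how semantic matching is done. As a minimal sketch of the idea, assuming each media asset carries a text caption, an article can be compared with every caption using bag-of-words cosine similarity. The function names and the sample captions below are hypothetical, and a production system would likely use richer text representations.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_media(article_text, media_captions):
    """Return media ids sorted from most to least similar to the article."""
    article_vec = Counter(tokenize(article_text))
    scored = [
        (cosine_similarity(article_vec, Counter(tokenize(caption))), media_id)
        for media_id, caption in media_captions.items()
    ]
    # Sort by descending score; ties are broken by media id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [media_id for _, media_id in scored]


# Hypothetical captions, invented for this example.
captions = {
    "img_election": "voters line up at a polling station on election day",
    "img_recipe": "a bowl of pasta with tomato sauce",
}
ranking = rank_media("Election day turnout: voters crowd polling places", captions)
print(ranking)
```

The election image shares several tokens with the article while the recipe image shares none, so the election image ranks first.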
Content Gap: Production Versus Consumption
Content Consumption: How Can Data Help?
Know Your Audience
● User cohorts:
○ Social-traffic readers and front-page clickers consume different content
○ Desktop versus mobile consumption
● Recommendations and personalization
● Can we use data to inform product design and interface?
○ Rearrange share buttons based on traffic origin (Facebook vs Pinterest)
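One way to picture the share-button idea is a small function that moves the button for the visitor's referring network to the front. This is only a sketch: the default button order, the referrer-to-network table, and the function name are all assumptions made for illustration, not details from the talk.

```python
from urllib.parse import urlparse

# Hypothetical default order and referrer mapping (not from the talk).
DEFAULT_BUTTONS = ["facebook", "twitter", "pinterest", "email"]
REFERRER_TO_NETWORK = {
    "facebook.com": "facebook",
    "m.facebook.com": "facebook",
    "pinterest.com": "pinterest",
    "t.co": "twitter",
}


def order_share_buttons(referrer_url, buttons=DEFAULT_BUTTONS):
    """Return the share buttons with the referring network's button first."""
    host = urlparse(referrer_url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    network = REFERRER_TO_NETWORK.get(host)
    if network not in buttons:
        # Unknown or missing referrer: keep the default order.
        return list(buttons)
    return [network] + [b for b in buttons if b != network]


print(order_share_buttons("https://www.pinterest.com/pin/123/"))
print(order_share_buttons(""))
```

A visitor arriving from Pinterest sees the Pinterest button first, while direct traffic keeps the default order.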
Content Distribution: Can Data Help?
● People’s attention is increasingly concentrated on social streams
○ More traffic reaches publishers from social than from any other source
● Are distributed platforms the new home page?
○ Facebook Instant Articles, Apple News, Snapchat Discover, Google AMP
○ Messenger bots
● You need to be where your audience is:
○ Identify the content mix that is maximally engaging on each external platform
○ Can we use data to seed these distribution networks (Facebook HuffPost Pages, Snapchat Discover)?
Content Distribution: Can Data Help?
● HuffPost produces 1000 articles a day. Which of these do we promote?
● Article PVs (page views) follow a very skewed distribution of success
○ Only 1% of our articles get more than 100k PVs
● Content performs differently on different networks
● Can we predict in advance which articles will get traction, so that we can:
■ Optimally seed multiple distribution channels (Facebook HP Pages, Snapchat Discover)
■ Target them for premium, high-value ads to maximize revenue
■ Populate recommendation widgets
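The skew described above can be quantified with two simple statistics: the fraction of articles above a page-view threshold, and the share of all traffic earned by the top slice of articles. The sketch below uses made-up numbers (99 ordinary articles and one hit) purely for illustration; the function names are hypothetical.

```python
def share_above(pageviews, threshold):
    """Fraction of articles whose page views exceed the threshold."""
    if not pageviews:
        return 0.0
    return sum(1 for pv in pageviews if pv > threshold) / len(pageviews)


def top_share_of_traffic(pageviews, top_fraction):
    """Fraction of all page views earned by the top `top_fraction` of articles."""
    total = sum(pageviews)
    if total == 0:
        return 0.0
    # Always count at least one article in the top slice.
    k = max(1, int(len(pageviews) * top_fraction))
    return sum(sorted(pageviews, reverse=True)[:k]) / total


# Made-up data: 99 ordinary articles and one hit.
pvs = [1_000] * 99 + [250_000]
hit_rate = share_above(pvs, 100_000)
top_1pct = top_share_of_traffic(pvs, 0.01)
print(f"articles above 100k PVs: {hit_rate:.0%}")
print(f"traffic from top 1% of articles: {top_1pct:.0%}")
```

In this toy data set a single article out of a hundred accounts for most of the traffic, which is the kind of imbalance that makes picking articles to promote worthwhile.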
Content Distribution: Can Data Help?
Challenges
● The histogram of traffic is highly skewed
● The very act of promoting something causes a bump in traffic
● Data normalization: how long do we want to wait before predicting?
● Very imbalanced data set
Our Approach
● Random Forest classifier
● Multiple success criteria
● Historical examples of positive (+) and negative (-) articles, with downsampling
● Different normalization thresholds
● Feature engineering: traffic growth ratios, initial organic social traffic per minute, distinct referrers
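The feature engineering and downsampling steps can be sketched in plain Python. Everything specific here is an assumption made for illustration: the event schema (minute since publish, referrer, organic-social flag), the way the growth ratio is defined, and the function names. The resulting rows would then be fed to a classifier such as a Random Forest, which is not shown.

```python
import random


def early_features(events, window_minutes):
    """Build features from page-view events in the first `window_minutes`.

    `events` is a list of (minute_since_publish, referrer, is_organic_social)
    tuples; this schema is an assumption made for the example.
    """
    window = [e for e in events if e[0] < window_minutes]
    half = window_minutes / 2
    first_half = sum(1 for e in window if e[0] < half)
    second_half = len(window) - first_half
    return {
        # Traffic in the second half of the window relative to the first.
        # Adding 1 to both counts avoids division by zero.
        "growth_ratio": (second_half + 1) / (first_half + 1),
        "organic_social_per_minute": sum(1 for e in window if e[2]) / window_minutes,
        "distinct_referrers": len({e[1] for e in window}),
    }


def downsample_negatives(rows, labels, ratio=1.0, seed=0):
    """Keep all positives and a random subset of negatives.

    `ratio` is the number of negatives kept per positive.
    """
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    keep = min(len(negatives), int(len(positives) * ratio))
    rng = random.Random(seed)
    chosen = sorted(positives + rng.sample(negatives, keep))
    return [rows[i] for i in chosen], [labels[i] for i in chosen]


# Made-up events for one article.
events = [
    (0, "facebook.com", True),
    (3, "google.com", False),
    (6, "facebook.com", True),
    (8, "twitter.com", True),
    (12, "facebook.com", True),  # outside the 10-minute window
]
feats = early_features(events, window_minutes=10)

# Made-up labels: 1 successful article among 10.
rows, labels = list(range(10)), [1] + [0] * 9
X, y = downsample_negatives(rows, labels, ratio=2.0)
print(feats, len(X), sum(y))
```

Counting only organic social traffic is one way to keep the promotion bump mentioned under Challenges out of the features, and varying `window_minutes` corresponds to trying different normalization thresholds.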
Slackbot for the social promotion team
● 20% lift in PVs per predicted article
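The slides do not say how the lift was measured. A common definition, assumed here, is the relative difference between the mean PVs of articles the model flagged and the mean PVs of a comparison group. The numbers below are made up so that the arithmetic is easy to follow.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def lift(treated, baseline):
    """Relative increase of the treated group's mean over the baseline mean."""
    base = mean(baseline)
    if base <= 0:
        raise ValueError("baseline mean must be positive")
    return (mean(treated) - base) / base


# Made-up page views, chosen so the lift comes out to 20%.
baseline_pvs = [10_000, 12_000, 8_000]    # mean 10,000
predicted_pvs = [12_000, 13_000, 11_000]  # mean 12,000
result = lift(predicted_pvs, baseline_pvs)
print(f"lift: {result:.0%}")
```

Because promotion itself raises traffic, the choice of comparison group matters a great deal when interpreting a figure like this.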
Conclusion
A Data Driven Newsroom today means
● More than just keeping track of clicks and shares
● Using predictive analytics to drive product and content placement
Machine Learning will be a key driver for success with the advent of distributed content
Thanks!
MachineLearning@HuffPost

Geetu Ambwani, Principal Data Scientist, Huffington Post at MLconf NYC - 4/15/16
