Data Science Challenges
and Impact @Lazada
Big Data & Analytics Innovation Summit Singapore 2018
#1 Shopping
Site in SEA
145,000 sellers
3,000 brands
Lazada Data
Data App Devs expose, integrate, platform-ize
Data Scientists explore, prepare, model
Data Engineers collect, store, maintain
Start from bottom up
Considerations and
Challenges
How much business
input/overriding?
Trade-off: Manual human input vs. automated algorithms
Necessary to some extent, but harmful if overdone
Technically, manual input and rules are difficult to maintain
How much business
input/overriding?
Example: Manual override of product ranking on the site
Allows category managers to incorporate their domain
knowledge (e.g., new product releases, trending, etc.)
Nonetheless, too much manual overriding reduced metrics
Conducted A/B tests to find the optimal level of manual overriding
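One way to picture the trade-off is a single weight that controls how much of the final ranking score comes from manual input. The sketch below is illustrative only: the `Product` fields, the linear blend, and the `override_weight` parameter are assumptions, not Lazada's actual ranking logic. Each weight value could serve as one arm of an A/B test.

```python
from dataclasses import dataclass


@dataclass
class Product:
    product_id: str
    model_score: float         # hypothetical score from a ranking model, in [0, 1]
    manual_boost: float = 0.0  # hypothetical boost set by a category manager, in [0, 1]


def rank_products(products, override_weight):
    """Sort products by a linear blend of model score and manual boost.

    override_weight is the share of the final score controlled by manual
    input: 0.0 is fully algorithmic, 1.0 is fully manual.
    """
    if not 0.0 <= override_weight <= 1.0:
        raise ValueError("override_weight must be between 0 and 1")

    def final_score(p):
        return (1 - override_weight) * p.model_score + override_weight * p.manual_boost

    return sorted(products, key=final_score, reverse=True)


# Made-up catalog for illustration.
catalog = [
    Product("established-bestseller", model_score=0.9),
    Product("new-release", model_score=0.3, manual_boost=1.0),
    Product("long-tail-item", model_score=0.5),
]

# Compare the ordering under different levels of manual overriding.
for weight in (0.0, 0.3, 0.6):
    order = [p.product_id for p in rank_products(catalog, weight)]
    print(weight, order)
```

With no overriding the boosted new release ranks last; as the weight grows it climbs to the top, which is the behaviour an experiment would need to check against conversion metrics.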
How fast is “too fast”?
Trade-off: Development speed vs. production stability
You can move faster without building tooling/abstractions, code
reviews, automated testing, repaying technical debt, documentation
But in the long run, they save time and effort
FB: “Move fast and break things” -> “Move fast with stable infra”
How fast is “too fast”?
[Figure: Development effort over the long run. Axis labels: Project size, Effort. Chart labels: Quick POC; Dev, dev, dev; Moar features!; Automation, testing, tooling, clear tech debt; Environment in place; Production; Dev Speed; Stability; Less effort and faster =); More effort and slower =(]
How fast is “too fast”?
Example: 8-person team, 10 problems, mostly focused on delivery
In the first two years, the team achieved a lot and proved our worth
Nonetheless, as we matured and had to maintain more production
code, investing in iteration speed and code quality had high ROI
How to set priorities with
business?
Trade-off: Short-term vs. long-term
Business understands best what is needed, though can be overly
focused on day-to-day ops and near term goals
Data science is aware of the latest research and can innovate, but
risks being detached from business needs
How to set priorities with
business?
Example: Timeboxed skunkworks resulting in POCs
Data leadership sponsored some POCs that were hacked together
in 2 – 4 weeks—some eventually made it into production
Nonetheless, the focus is on research and innovation that can be
applied to improve the online shopping experience
Development and
Impact
Automated Review QC
[Figure: automated review QC flow. Labels: Product Review API; Data sources; Input and post-processing; Rule-based (Keywords, Spam Characteristics); Model-based (Spam Classification, General Classification); Manual QC; Audit; Review API]
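The combination of rule-based checks, a model-based classifier, and a manual QC fallback can be sketched roughly as follows. Everything specific here is an assumption made for illustration: the keyword list, the rule names, the thresholds, and especially `model_spam_probability`, which is a toy stand-in for a trained classifier rather than the model used in production.

```python
SPAM_KEYWORDS = {"whatsapp", "click here", "free gift"}  # hypothetical keyword list


def rule_based_flags(review_text):
    """Return the names of the (hypothetical) rules that a review trips."""
    text = review_text.lower()
    flags = []
    if any(keyword in text for keyword in SPAM_KEYWORDS):
        flags.append("keyword")
    if "http://" in text or "https://" in text:
        flags.append("contains_url")
    if len(text.split()) < 2:
        flags.append("too_short")
    return flags


def model_spam_probability(review_text):
    """Toy stand-in for a trained spam classifier.

    Returns the share of all-caps words, purely so the example runs
    without any ML dependency.
    """
    words = review_text.split()
    if not words:
        return 1.0
    shouting = sum(1 for w in words if w.isupper() and len(w) > 1)
    return shouting / len(words)


def review_qc(review_text, reject_above=0.8, approve_below=0.2):
    """Route a review to 'reject', 'approve', or 'manual_qc'."""
    flags = rule_based_flags(review_text)
    if flags:
        return "reject", flags
    probability = model_spam_probability(review_text)
    if probability >= reject_above:
        return "reject", ["model"]
    if probability <= approve_below:
        return "approve", []
    # Uncertain cases fall back to human reviewers.
    return "manual_qc", ["model_uncertain"]


for text in (
    "Great phone, fast delivery and good battery",
    "Click here for a FREE GIFT",
    "BEST PRODUCT ever made",
):
    print(review_qc(text), "<-", text)
```

Only the uncertain middle band reaches human reviewers in this sketch, which is one way automated QC could free manual effort for harder cases.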
Overall results
Significant manpower cost savings (5-figures monthly)
Existing workforce can be diverted to difficult-to-automate tasks
Reduced lead-time before reviews are live on site
Product Ranking
Ranking affects what appears on top
Ranking is different from recommendation
[Figure: product ranking pipeline. Data sources: Web Tracker (JavaScript), Mobile Tracker (Adjust), 3rd Party (e.g., ZenDesk, SurveyGizmo); Product, Seller, and Transaction data. Ingestion and storage: Kafka Queues, Bulk Loaders (Spark), Hadoop. Processing: Data Exploration + Data Preparation + Feature Engineering + Modelling (Spark). Output: Product rankings, Manual Boosting (Django) by Category Managers, Local Validation, A/B Testing on user devices (split traffic and measure outcomes)]
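Splitting traffic for an A/B test is often done by hashing a stable user identifier, so that the same user always sees the same ranking variant. The sketch below shows that general technique; the function name, the experiment name, and the two-variant setup are assumptions, and the slides do not say how Lazada assigns traffic.

```python
import hashlib
from collections import Counter


def assign_variant(user_id, experiment, variants=("control", "treatment")):
    """Deterministically map a user to a variant of a named experiment.

    Hashing the experiment name together with the user ID keeps the
    assignment stable per user while staying independent across experiments.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


# Hypothetical experiment name and user IDs.
counts = Counter(
    assign_variant(f"user-{i}", "ranking-v2") for i in range(10_000)
)
print(counts)
```

Because the assignment depends only on the inputs, no lookup table is needed to keep a returning user in the same group.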
Overall results
Better ranking improved conversion (3 – 8%) and revenue per
session (5 – 20%)
Introducing new products improved new product engagement
(CTR increased 30 – 80%; add-to-cart increased 20 – 90%)
Emphasizing product quality had neutral to positive outcomes
(reduced return rate; increased product net promoter score)
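Percentage improvements like these are relative lifts between a control and a treatment group. As a rough sketch of how such an outcome might be measured, the code below computes a relative lift and a two-sided two-proportion z-test using only the standard library. The counts are made up for illustration and are not Lazada's data, and the choice of test is an assumption.

```python
import math


def relative_lift(control_rate, treatment_rate):
    """Relative change of the treatment rate over the control rate."""
    return (treatment_rate - control_rate) / control_rate


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up numbers: sessions and conversions per group.
control_conversions, control_sessions = 1_000, 20_000
treatment_conversions, treatment_sessions = 1_060, 20_000

lift = relative_lift(
    control_conversions / control_sessions,
    treatment_conversions / treatment_sessions,
)
z, p_value = two_proportion_z_test(
    control_conversions, control_sessions,
    treatment_conversions, treatment_sessions,
)
print(f"lift={lift:.1%} z={z:.2f} p={p_value:.3f}")
```

In this made-up case a 6% relative lift is not statistically significant at the usual 5% level, which illustrates why sample size matters when reading such ranges.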
Key takeaways
There is no single best answer to the challenges raised—it
depends on the maturity stage of the team and organization
Data science > Coding + Machine Learning—many other
activities contribute greatly to the final impact
Thank you!
eugene.yan@lazada.com
Our culture: http://bit.ly/datascienceculture
How we rank products: http://bit.ly/how-lazada-ranks-products

Mature dropshipping via API with DroFx.pptx
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
 
Call Girls In Doddaballapur Road ☎ 7737669865 đŸ„” Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 đŸ„” Book Your One night StandCall Girls In Doddaballapur Road ☎ 7737669865 đŸ„” Book Your One night Stand
Call Girls In Doddaballapur Road ☎ 7737669865 đŸ„” Book Your One night Stand
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Call Girls In Attibele ☎ 7737669865 đŸ„” Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 đŸ„” Book Your One night StandCall Girls In Attibele ☎ 7737669865 đŸ„” Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 đŸ„” Book Your One night Stand
 
Call Girls In Bellandur ☎ 7737669865 đŸ„” Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 đŸ„” Book Your One night StandCall Girls In Bellandur ☎ 7737669865 đŸ„” Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 đŸ„” Book Your One night Stand
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 

Data Science Challenges and Impact at Lazada (Big Data and Analytics Innovation Summit Singapore 2018)

  • 1. Data Science Challenges and Impact @Lazada Big Data & Analytics Innovation Summit Singapore 2018
  • 2. #1 Shopping Site in SEA; 145,000 sellers; 3,000 brands
  • 3. Lazada Data: Data App Devs expose, integrate, platform-ize; Data Scientists explore, prepare, model; Data Engineers collect, store, maintain. Start from bottom up.
  • 5. How much business input/overriding? Trade-off: Manual human input vs. automated algorithms. Necessary to some extent, but harmful if overdone. Technically, manual input and rules are difficult to maintain.
  • 6. How much business input/overriding? Example: Manual override of product ranking on the site. Allows category managers to incorporate their domain knowledge (e.g., new product releases, trending products). Nonetheless, too much manual overriding reduced metrics. Conducted A/B tests to find the optimal level of manual overriding.
  • 7. How fast is “too fast”? Trade-off: Development speed vs. production stability. You can move faster by skipping tooling/abstractions, code reviews, automated testing, repayment of technical debt, and documentation. But in the long run, these save time and effort. FB: “Move fast and break things” -> “Move fast with stable infra”
  • 8. How fast is “too fast”? [Chart: Development effort over the long run. Axes: Project size, Effort. Labels: Quick POC; Moar features!; Dev, dev, dev; Automation, testing, tooling, clear tech debt; Environment in place; Production; Dev Speed; Stability; Less effort and faster =); More effort and slower =(]
  • 9. How fast is “too fast”? Example: 8-person team, 10 problems, mostly focused on delivery. In the first two years, the team achieved a lot and proved our worth. Nonetheless, as we matured and had to maintain more production code, investing in iteration speed and code quality had high ROI.
  • 10. How to set priorities with business? Trade-off: Short-term vs. long-term. Business understands best what is needed, though it can be overly focused on day-to-day ops and near-term goals. Data science is aware of the latest research and can innovate, but risks being detached from business needs.
  • 11. How to set priorities with business? Example: Timeboxed skunkworks resulting in POCs. Data leadership sponsored some POCs that were hacked together in 2–4 weeks; some eventually made it into production. Nonetheless, the focus is on research and innovation that can be applied to improve the online shopping experience.
  • 15. Overall results: Significant manpower cost savings (5-figures monthly). Existing workforce can be diverted to difficult-to-automate tasks. Reduced lead-time before reviews are live on site.
  • 19. [Architecture diagram, labels only] Web Tracker (JavaScript); Mobile Tracker (Adjust); 3rd Party (e.g., ZenDesk, SurveyGizmo); Kafka Queues; Bulk Loaders (Spark); Hadoop; Hadoop; Data Exploration + Data Preparation + Feature Engineering + Modelling (Spark); Manual Boosting (Django); Local Validation; A/B Testing; Product; Seller; Transaction; Product rankings; Split traffic and measure outcomes; (Category Managers); (User devices)
  • 20. Overall results: Better ranking improved conversion (3–8%) and revenue per session (5–20%). Introducing new products improved new product engagement (CTR increased 30–80%; add-to-cart increased 20–90%). Emphasizing product quality had neutral to positive outcomes (reduced return rate; increased product net promoter score).
  • 21. Key takeaways: There is no single best answer to the challenges raised; it depends on the maturity stage of the team and organization. Data science > Coding + Machine Learning: many other activities contribute greatly to the final impact.
  • 22. Thank you! eugene.yan@lazada.com | Our culture: http://bit.ly/datascienceculture | How we rank products: http://bit.ly/how-lazada-ranks-products
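
Slides 6 and 19 mention A/B tests that split traffic and measure outcomes, for example to compare levels of manual overriding. The deck does not say which statistical test was used, so the following is only a minimal sketch that assumes a standard two-proportion z-test on conversion rate. The function name and all session and conversion counts are hypothetical and chosen for illustration; they are not Lazada data.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rate of variant B against control A.

    Returns (relative_lift, z_statistic, two_sided_p_value).
    Assumes independent sessions and samples large enough for the
    normal approximation to hold.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    relative_lift = (p_b - p_a) / p_a
    return relative_lift, z, p_value


# Hypothetical counts: control keeps the current override level,
# the variant reduces manual overriding.
lift, z, p_value = two_proportion_z_test(
    conv_a=500, n_a=10_000,   # control: conversions, sessions
    conv_b=540, n_b=10_000,   # variant: conversions, sessions
)
print(f"relative lift={lift:.1%}, z={z:.2f}, p={p_value:.3f}")
```

With counts like these the observed lift would not be statistically significant at the usual 5% level, which is one reason such tests are typically run until a sample size fixed in advance is reached.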