
Building Personalized Data Products - From Idea to Product


Scout24 is a leading operator of digital marketplaces specializing in the real estate and automotive sectors in Germany and other selected European countries. We provide the best experience to our users on our two well-known and popular brands, ImmobilienScout24 and AutoScout24, by offering highly personalized services. This talk is about how we leverage AWS infrastructure to experiment rapidly with state-of-the-art ML algorithms, which enables us to move quickly from idea through prototype to product. We illustrate this with a concrete use case: our Smart Consumer Notification engine.



  1. Building Personalized Data Products: From Idea to Product | AWS Pop-up Loft Berlin | October 15th, 2018 | Stephen Wilson & Sebastian Bolz | www.scout24.com
  2. Outline
     Building Personalized Data Products | Stephen Wilson & Sebastian Bolz
     • Personalisation and Relevance at Scout24
     • Data Science Toolbox
     • Innovation and Experimentation
     • Modelling with Personal Analytics Clusters (PAC)
     • Production Architecture
  3. Personalisation and Relevance at Scout24
  4. Personalisation & Relevance at Scout24
     • Strategically important
     • Drives sessions & revenue
  5. Personalisation and Relevance at Scout24
     • Deliver relevance through personalisation
     • Greater relevance → Better experience → Increase in trust and perceived quality
     • Anticipate needs → Productive engagement
     • Lead identification models to automatically identify target groups
  6. Architectural Overview of Smart Notifications
  7. Data Science Toolbox
  8. Data Science Toolbox
  9. Data Science Toolbox
     • Personal Analytics Clusters:
       − Fully automated
       − Configurable
       − Data + Notebooks persisted to S3
       − Powerful development environment
     • EC2 + Docker:
       − Use-case-specific Docker images
     • SageMaker:
       − Still early days for us (experimentation, HackWeek, etc.)
  10. Innovation and Experimentation
  11. Use case: Lead prediction model
      • Business need: we would like to automatically predict whether someone is a lead, based on their behaviour on our platform
      • Machine learning problem:
        − Scoring model (predict probability)
        − Similar to CTR models
        − Many users, only a small number of whom engage
  12. Research
      • Look at how other companies do it:
        − Logistic regression (Google, Facebook, Bing)
      • Read widely: academic papers
      • Kaggle competitions in similar problem domains:
        − Factorisation Machines
      • Peer discussions in the team
      • Technical white paper from Criteo:
        − FM implementation in production
  13. Experimentation
      • Personal Analytics Clusters:
        − Explore, play, experiment
        − Seamless sampling of production data from our data lake (S3)
        − Try things out quickly, fail fast, make changes, iterate
  14. Modelling with PAC
  15. Modelling with PAC: Data Preparation
      • We use the One Scout User Database (web tracking)
      • Encode the business logic that determines whether a user is a lead
      • Select lead users and their corresponding events from S3
      • Sample an equivalent number of non-lead users for a balanced dataset
      • Disregard very common events and very infrequent ones
      • One-hot encode
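The data preparation steps on slide 15 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' code: the event names, the rule "a user is a lead if they triggered any event in lead_events", and the cut-offs min_users and max_share are all hypothetical. The label-defining events are kept out of the features so that the label does not leak into the input.

```python
import random
from collections import Counter


def build_balanced_dataset(user_events, lead_events, min_users=2, max_share=0.9, seed=0):
    """Turn {user_id: [event, ...]} into a balanced, one-hot encoded dataset.

    lead_events is a set of event names; it stands in for the business rule
    (hypothetical): a user counts as a lead if they triggered any of them.
    """
    lead_set = {u for u, ev in user_events.items() if lead_events & set(ev)}
    leads = sorted(lead_set)
    non_leads = sorted(u for u in user_events if u not in lead_set)

    # Sample as many non-leads as there are leads to balance the classes.
    rng = random.Random(seed)
    sampled = rng.sample(non_leads, min(len(leads), len(non_leads)))
    users = leads + sampled

    # Count in how many selected users each (non-label) event occurs.
    doc_freq = Counter(e for u in users for e in set(user_events[u]) - lead_events)

    # Drop very infrequent events (< min_users) and very common ones (> max_share).
    vocab = sorted(
        e for e, n in doc_freq.items()
        if n >= min_users and n / len(users) <= max_share
    )
    index = {e: i for i, e in enumerate(vocab)}

    # One-hot encode: one column per remaining event, 1 if the user triggered it.
    X, y = [], []
    for u in users:
        row = [0] * len(vocab)
        for e in user_events[u]:
            if e in index:
                row[index[e]] = 1
        X.append(row)
        y.append(1 if u in lead_set else 0)
    return X, y, vocab


# Hypothetical tracking data, for illustration only.
events = {
    "u1": ["search", "expose_view", "calculator", "contact_request"],
    "u2": ["search", "expose_view", "calculator", "contact_request"],
    "u3": ["search", "expose_view"],
    "u4": ["search", "calculator"],
    "u5": ["search", "rare_event"],
    "u6": ["search"],
}
X, y, vocab = build_balanced_dataset(events, lead_events={"contact_request"})
```

In this toy data, "search" is dropped because every user triggers it and "rare_event" is dropped because at most one selected user does, which mirrors the "disregard very common and very infrequent events" step.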
  16. Modelling with PAC: Training
      • We used a TensorFlow implementation of Factorisation Machines:
        − https://github.com/geffy/tffm
      • PAC provides flexibility to try out different approaches:
        − Data from 1 year
        − 6 months
        − 1 month
        − 15 days
      • Easy to compare performance and monitor training times
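For readers unfamiliar with Factorisation Machines, the sketch below shows how a second-order FM turns a one-hot feature vector into a probability. It does not use the tffm API from the slide and covers scoring only, not training. The parameter values are made up for illustration; in practice they would be learned. The fast version relies on the standard identity that the pairwise interaction sum equals half of (square of sum minus sum of squares) per latent dimension, and the naive version is included only to check it.

```python
import math
from itertools import combinations


def fm_score(x, w0, w, V):
    """Second-order FM raw score, computed in O(n * k).

    x: feature values, w0: bias, w: linear weights,
    V: one k-dimensional factor vector per feature.
    """
    n, k = len(x), len(V[0])
    score = w0 + sum(w[i] * x[i] for i in range(n))
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        score += 0.5 * (s * s - s_sq)
    return score


def fm_score_naive(x, w0, w, V):
    """Same score via the explicit sum over all feature pairs (for checking)."""
    n = len(x)
    score = w0 + sum(w[i] * x[i] for i in range(n))
    for i, j in combinations(range(n), 2):
        dot = sum(a * b for a, b in zip(V[i], V[j]))
        score += dot * x[i] * x[j]
    return score


def sigmoid(z):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical parameters for three one-hot features and two latent factors.
w0 = -1.0
w = [0.5, -0.2, 0.1]
V = [[0.1, 0.3], [0.2, -0.1], [-0.4, 0.2]]
x = [1, 0, 1]  # the user triggered features 0 and 2

raw = fm_score(x, w0, w, V)
probability = sigmoid(raw)
```

The factor vectors let the model score interactions between events that rarely co-occur in the training data, which is one reason FMs are a common choice for sparse one-hot inputs.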
  17. Modelling with PAC: Local dev + Run in Cloud
  18. Production Architecture
  19. Production Architecture: Model Training Pipeline
      OSUD (S3) → Data Preparator → One Hot Encoder → Model Trainer → Latest Model (S3)
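The stage names on slide 19 suggest a linear pipeline whose output is handed over through a "latest model" object. A minimal sketch of that shape follows, with loud caveats: a plain dict stands in for the S3 bucket, the key name models/latest.json is invented, and the trainer is a trivial placeholder (share of leads per event), not a Factorisation Machine. Only the stage order is taken from the slide.

```python
import json

fake_s3 = {}  # stand-in for an S3 bucket: key -> bytes (assumption, no AWS calls)


def prepare(raw_rows):
    """Data Preparator: keep only users that have at least one event."""
    return [r for r in raw_rows if r["events"]]


def one_hot(rows):
    """One Hot Encoder: one column per distinct event."""
    vocab = sorted({e for r in rows for e in r["events"]})
    X = [[1 if e in r["events"] else 0 for e in vocab] for r in rows]
    y = [r["is_lead"] for r in rows]
    return X, y, vocab


def train(X, y, vocab):
    """Placeholder trainer: share of leads among users who triggered each event."""
    weights = {}
    for j, e in enumerate(vocab):
        labels = [y[i] for i in range(len(X)) if X[i][j] == 1]
        weights[e] = sum(labels) / len(labels)
    return {"vocab": vocab, "weights": weights}


def run_pipeline(raw_rows, store, key="models/latest.json"):
    """Run all stages and overwrite the 'latest model' object."""
    X, y, vocab = one_hot(prepare(raw_rows))
    model = train(X, y, vocab)
    store[key] = json.dumps(model).encode("utf-8")
    return key


# Hypothetical input rows, for illustration only.
rows = [
    {"user": "u1", "events": ["calculator", "expose_view"], "is_lead": 1},
    {"user": "u2", "events": ["expose_view"], "is_lead": 0},
    {"user": "u3", "events": [], "is_lead": 0},
]
key = run_pipeline(rows, fake_s3)
loaded = json.loads(fake_s3[key])
```

Writing to one fixed key is a design assumption here: it would let the scoring side always read the same location without knowing when or how the model was retrained.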
  20. Production Architecture: Smart Notifications
      Diagram labels: Advisor Subscriptions (Redis) OSUD (DynamoDB) Brain Matching DB (Aurora) Latest Model (S3)
      Step shown: UserIDs
  21. Production Architecture (same diagram). Step shown: Event
  22. Production Architecture (same diagram). Step shown: Cross-device IDs
  23. Production Architecture (same diagram). Step shown: Is user subscribed
  24. Production Architecture (same diagram). Step shown: Get user event history
  25. Production Architecture (same diagram). Steps shown: Compute score, Publish score, Push notification if score > threshold
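Slides 20 to 25 build up the request flow one step at a time: an event arrives, cross-device IDs are resolved, the subscription is checked, the user's event history is fetched, a score is computed and published, and a notification is pushed if the score exceeds a threshold. The sketch below follows that order with in-memory dicts in place of Redis, DynamoDB and Aurora. Which real component owns which step is not stated in the transcript, so the store names, the toy scoring function and the threshold of 0.8 are all assumptions.

```python
def handle_event(event, stores, score_fn, threshold):
    """Process one tracking event; return the score, or None if nothing was scored."""
    # 1. Resolve the device that produced the event to a user (cross-device IDs).
    user_id = stores["cross_device"].get(event["device_id"])
    if user_id is None:
        return None

    # 2. Only subscribed users are eligible for notifications.
    if user_id not in stores["subscribed"]:
        return None

    # 3. Fetch the user's event history and compute the score.
    history = stores["history"].get(user_id, [])
    score = score_fn(history)

    # 4. Publish every score; push a notification only above the threshold.
    stores["published"].append((user_id, score))
    if score > threshold:
        stores["pushed"].append(user_id)
    return score


def toy_score(history):
    """Hypothetical stand-in for the trained model."""
    return min(1.0, 0.3 * history.count("calculator"))


# In-memory stand-ins for the data stores (assumption, for illustration only).
stores = {
    "cross_device": {"phone-1": "u1", "laptop-1": "u1", "phone-2": "u2", "phone-3": "u3"},
    "subscribed": {"u1", "u2"},
    "history": {
        "u1": ["calculator", "calculator", "calculator"],
        "u2": ["search"],
        "u3": ["calculator", "calculator", "calculator"],
    },
    "published": [],
    "pushed": [],
}

results = [
    handle_event({"device_id": d, "name": "expose_view"}, stores, toy_score, threshold=0.8)
    for d in ("laptop-1", "phone-2", "phone-3")
]
```

In this run u1 is scored and notified, u2 is scored but stays below the threshold, and u3 is skipped because they are not subscribed, even though their history would score highly.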
  26. Results
      • A/B test:
        − Group A is shown notifications based on a rule-based trigger
        − Group B is shown notifications based on the output of the mortgage lead model
      • Group B outperformed the rule-based trigger group:
        − 90% fewer notifications issued than in the trigger group
        − 27% better click rate than the trigger group
        − 21% lower rejection rate than the trigger group
        − 7× more leads from the same number of notifications
      • Outcome: ML-based scoring models deliver on relevance and usefulness
  27. Fitting it all together
      • Seamless delivery of services and content
      • Relevant and useful
      • Reuse
  28. Contact
      Thank you for your attention!
      ImmobilienScout24 GmbH, Andreasstraße 10, 10243 Berlin
      Stephen Wilson | Phone +49 30 243 01-1686 | stephen.wilson@scout24.com | www.scout24.com
      Sebastian Bolz | Phone +49 30 24301-1228 | sebastian.bolz@scout24.com | www.scout24.com
