
Hotel or Taxi? "Sorting hat" for travel expenses with AWS ML infrastructure


To automate the boring task of submitting travel expenses, we developed an ML model for classifying receipts. Using AWS EC2, Lambda, S3, SageMaker and Rekognition, we evaluated different ways of training the model and serving predictions, as well as different modelling approaches (classical ML vs. deep learning).

Published in: Technology

  1. 1. Hotel or Taxi? "Sorting hat" for travel expenses with AWS ML infrastructure. BERLIN, 18.OCT 2018 MICHAEL PERLIN
  2. 2. www.innoq.com SERVICES Strategy & technology consulting Digital business models Software architecture & development Digital platforms & infrastructures Knowledge transfer, coaching & trainings Big data & machine learning FACTS ~130 employees Privately owned Vendor-independent OFFICES Monheim Berlin Offenbach Munich Hamburg Zurich CLIENTS Finance Telecommunications Logistics E-Commerce Fortune 500 SMBs Startups
  3. 3. Agenda • The value of machine learning • The problem we've solved • AWS infrastructure for training • How deep learning works • How we run it in production
  4. 4. The value of ML (aka success stories)
  5. 5. Self-driving cars Image from: pxhere.com
  6. 6. Automatic translation Screenshot from: deepl.com
  7. 7. Image classification Images from: wikipedia.org ok danger
  8. 8. The travelling consultant problem
  9. 9. Travel expenses
  10. 10. Travel expenses Screenshot from: haufe.de
  11. 11. Travel expenses
  12. 12. Travel expenses: Can we simplify this? ← Enter data ← Submit scan with the same data
  13. 13. • Export ~5K receipts + data entered • Use ML to extract document class, VAT rate, date, price... • = Save clicking and typing Travel expenses
  14. 14. Training model in AWS infrastructure
  15. 15. Two phases 1. Training your model with available data 2. Using your model for new data (Inference)
  16. 16. Training ? Category Bus Flight Taxi
  17. 17. AWS Rekognition
  18. 18. AWS Rekognition: limited to 50 words
  19. 19. Requirements for training Classical ML • Commodity hardware • Libs • IDE Deep learning • GPU-powered hardware • Libs • IDE
  20. 20. Instances for training
  21. 21. Training environment • EC2 Instance with bare Linux • Install libraries • Configure GPU usage • Install Jupyter • Add self-signed certificates • Go! Option 1
  22. 22. Training environment • EC2 Instance with an AMI from the Marketplace containing pre-installed and pre-configured libraries • Add self-signed certificates • Go! Option 2
  23. 23. Training environment Option 3
  24. 24. How deep learning works
  25. 25. Terms: Artificial Intelligence, Machine Learning, Deep Learning
  26. 26. Training ? Category Bus Flight Taxi
  27. 27. Training ? Category Bus Flight Taxi Day Travelcard Start Date(2) End Ticket(2) ... (50 words) Lufthansa Your Flight(4) Trip Payment Ticket ... (200 words) Heathrow Taxi(3) Services(2) Walton VISA DEBIT ... (100 words)
  28. 28. Training: [the three word lists from slide 27] TF = frequency of "Ticket" in the second document: 3/50 = 0.06; IDF = fraction of documents containing "Ticket": 500/5000 = 0.1; replace "Ticket" in the second vector with TF/IDF = 0.06/0.1 = 0.6
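The TF/IDF arithmetic on this slide can be reproduced in a few lines of Python. Note this is the slides' simplified variant: production TF-IDF (e.g. scikit-learn's TfidfVectorizer) applies a logarithm to the inverse document frequency, which this sketch deliberately omits. The toy documents are constructed only to match the slide's numbers.

```python
# Reproducing the slide's numbers: "Ticket" occurs 3 times in a 50-word
# receipt, and appears in 500 of 5000 receipts overall.
doc = ["Ticket"] * 3 + ["word"] * 47             # 50 words, "Ticket" x3
corpus = [["Ticket"]] * 500 + [["word"]] * 4500  # 5000 receipts

def tf(word, document):
    # term frequency: share of the document's words that are `word`
    return document.count(word) / len(document)

def df(word, documents):
    # document frequency: share of documents containing `word`
    return sum(word in d for d in documents) / len(documents)

score = tf("Ticket", doc) / df("Ticket", corpus)
print(score)  # 0.06 / 0.1 = 0.6 (up to float rounding)
```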
  29. 29. Training: Category ? / Bus / Flight / Taxi; each receipt is now a row of TF/IDF values [matrix of sample values]
  30. 30. Training: the categories are one-hot encoded: Bus = (1 0 0), Flight = (0 1 0), Taxi = (0 0 1)
  31. 31. Training: input matrix → transformation → output matrix
  32. 32. Training: input matrix × matrix with arbitrary values × another matrix with arbitrary values = computed output matrix
  33. 33. Training: computed output matrix vs. true output matrix → error
  34. 34. Training: the matrix values are changed a bit → new computed output matrix
  35. 35. Training: computed output matrix vs. true output matrix → error2
  36. 36. Training: adjust the transformation matrices based on error vs. error2
  37. 37. Training: input matrix × adjusted transformation matrices = new computed output matrix
  38. 38. Training: computed output matrix vs. true output matrix → error3
  39. 39. Training: adjust the transformation matrices based on error2 vs. error3
  40. 40. Training • Iterate over all the data until the error stops shrinking • The result of the adjustments is the trained model • Now it can be deployed into production
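The "change a value a bit, keep the change if the error shrinks" loop from the preceding slides can be sketched in plain Python. This is a random-search toy on made-up matrices, not what frameworks actually do (they compute gradients and use backpropagation), but it shows the same iterate-compare-adjust cycle:

```python
import random

def matmul(A, B):
    # plain-Python matrix product
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def error(computed, true):
    # sum of squared differences between computed and true output matrices
    return sum((c - t) ** 2
               for crow, trow in zip(computed, true)
               for c, t in zip(crow, trow))

random.seed(0)
X = [[0.1, 0.0, 0.3, 0.0],   # 3 receipts x 4 TF/IDF values (made-up numbers)
     [0.0, 0.6, 0.0, 0.2],
     [0.0, 0.0, 0.9, 0.1]]
Y = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # true one-hot categories

# two transformation matrices starting with arbitrary values
W1 = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
W2 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]

err0 = error(matmul(matmul(X, W1), W2), Y)
best = err0
for _ in range(2000):
    W = random.choice((W1, W2))
    r, c = random.randrange(len(W)), random.randrange(len(W[0]))
    old = W[r][c]
    W[r][c] += random.uniform(-0.1, 0.1)      # change a value a bit
    e = error(matmul(matmul(X, W1), W2), Y)
    if e < best:
        best = e                              # error shrank: keep the change
    else:
        W[r][c] = old                         # error grew: revert

print(err0, "->", best)  # the error shrinks over the iterations
```

The adjusted matrices W1 and W2 are the "trained model"; saving them is what deploying to production starts from.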
  41. 41. Training: iterate: input matrix × transformation matrices = computed output matrix; compare with the true output matrix; errorX vs. errorX+1
  42. 42. Training: the whole adjustment loop is covered by frameworks!
  43. 43. Training Covered by frameworks!
  44. 44. Training frameworks: network architecture; hyperparameters ("learning rate", ...)
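Why the "learning rate" hyperparameter matters can be seen on a one-parameter toy problem (an illustration, not part of the talk): gradient descent shrinks the error only if each step is small enough, and a too-large rate makes the iterates overshoot and diverge.

```python
def minimize(learning_rate, steps=50):
    # gradient descent on f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
    w = 0.0
    for _ in range(steps):
        w -= learning_rate * 2 * (w - 3)
    return w

print(minimize(0.1))  # ends close to the minimum at w = 3
print(minimize(1.1))  # step too large: each update overshoots further
```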
  45. 45. Two phases 1. Training your model with available data 2. Using your model for new data (Inference) DONE
  46. 46. How we run it in production
  47. 47. Inference • General approach: load the model saved by training, feed in the input, get the output • Even cross-language works, i.e. a model trained with Python can be used in a Java application • Usually works on commodity hardware
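A minimal sketch of that load-feed-get-output cycle, with hypothetical weights for a tiny linear model serialized as JSON. JSON is used here only to make the cross-language point concrete (any language, e.g. a Java service, can parse it); real deployments typically use the training framework's own portable model format.

```python
import json

# hypothetical per-category weight vectors, serialized as a trained
# model might be (all numbers are made up)
blob = json.dumps({"Bus":    [0.9, 0.1, 0.0],
                   "Flight": [0.1, 0.8, 0.2],
                   "Taxi":   [0.0, 0.2, 0.9]})

model = json.loads(blob)  # in production: read from a file or S3

def predict(vector):
    # score each category as a dot product and return the best one
    scores = {cat: sum(w * x for w, x in zip(ws, vector))
              for cat, ws in model.items()}
    return max(scores, key=scores.get)

print(predict([0.0, 0.1, 0.9]))  # → Taxi
```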
  48. 48. Package, Build and Deploy, Option 1: package inference code, trained model and dependent libs with a web framework into a docker container; deploy to a container scheduler of your choice: EKS, ECS, OpenShift, Giant Swarm...
  49. 49. Package, Build and Deploy, Option 2: zip inference code, trained model and dependent libs; deploy to AWS Lambda
  50. 50. Application flow: receipt scan lands in an S3 bucket; an OCR service on ECS writes content.json; the Lambda reads it, runs inference, and adds metadata ({class: Bus}); the application reads the metadata
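The flow on this slide can be stubbed end-to-end in a few lines. S3 and the metadata store are stand-in dicts and the classifier is a stand-in rule; a real Lambda would use boto3 and take the bucket/key from its triggering event. All names and values here are made up.

```python
import json

# "S3" and the metadata store stubbed as dicts
bucket = {"receipts/0815/content.json":
          '{"words": ["Day", "Travelcard", "Ticket"]}'}
metadata = {}

def classify(words):
    # stand-in for the trained model from the training section
    return "Bus" if "Travelcard" in words else "Taxi"

def handler(event):
    key = event["key"]
    content = json.loads(bucket[key])      # read the OCR output from "S3"
    category = classify(content["words"])  # run inference
    metadata[key] = {"class": category}    # attach the result as metadata
    return metadata[key]

print(handler({"key": "receipts/0815/content.json"}))  # → {'class': 'Bus'}
```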
  51. 51. Takeaway(s)
  52. 52. Deep learning is accessible
  53. 53. Thank you! Questions? www.innoq.com innoQ Deutschland GmbH Krischerstr. 100 40789 Monheim am Rhein Germany +49 2173 3366-0 Ohlauer Str. 43 10999 Berlin Germany Ludwigstr. 180E 63067 Offenbach Germany Kreuzstr. 16 80331 München Germany Gewerbestr. 11 CH-6330 Cham Switzerland +41 41 743 01 11 Albulastr. 55 8048 Zürich Switzerland innoQ Schweiz GmbH Michael Perlin Michael.Perlin@innoq.com +49 178 7818063 @ttzt_mp
