
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath


Offline and stream processing of big data sets can be done with tools such as Hadoop, Spark, and Storm, but what if you need to process big data at the time a user is making a request? Vespa (http://www.vespa.ai) allows you to search, organize and evaluate machine-learned models from e.g. TensorFlow over large, evolving data sets with latencies in the tens of milliseconds. Vespa is behind the recommendation, ad targeting, and search at Yahoo, where it handles billions of daily queries over billions of documents.



  1. Big data × Real time: The open big data serving engine; store, search, rank and organize big data at user serving time.
  2. Big data maturity levels
     Latent: Data is produced but not systematically leveraged. Examples: Credit card transaction data is stored for audit purposes; movie streaming events are logged.
     Analysis: Data is used to inform decisions made by humans. Examples: Statistics on credit card fraud are gathered to create policies for flagging fraudulent transactions; lists of movies popular with various user segments are compiled to inform curated recommendation lists.
     Learning: Data is used to learn automated decisions disconnected from direct action. Examples: Fraudulent credit card transactions are automatically flagged; lists of movie recommendations per user segment are automatically generated.
     Acting: Automated data-driven decisions are made in real time. Examples: Fraudulent credit card transactions are automatically blocked; personalized movie recommendations are computed when needed by that user.
  3. Closer look: Acting
     Acting: Automated data-driven decisions are made in real time. Examples: Fraudulent credit card transactions are automatically blocked; personalized movie recommendations are computed when needed by that user.
     Two types:
     • Decisions can be made by considering a single data item: streaming, or stateless model evaluation.
     • Decisions need to consider many data items: big data serving.
  4. Big data serving: What is required?
     • Real-time actions: Find data and make inferences in tens of milliseconds.
     • Real-time knowledge: Handle data changes at high continuous rates.
     • Scalable: Handle large request rates over big data sets.
     • Always available: Recover from hardware failures without human intervention.
     • Online evolvable: Change schemas, logic, models and hardware while online.
     • Integrated: Data feeds from Hadoop, learned models from TensorFlow, etc.
  5. Introducing Vespa: an open source platform for big data serving
     • Like Hadoop: Developed at Yahoo for search, now for all big data serving cases.
     • Open source: Visit the new site at http://vespa.ai
     • Big data: Makes the big data serving features available to everyone.
  6. Vespa at Oath / Yahoo
     Oath: Tumblr, TechCrunch, Huffington Post, Aol, Engadget, Gemini, News, Sports, Finance, Mail, etc.
     Hundreds of Vespa applications …
     … serving over a billion users,
     … over 200,000 queries per second,
     … over billions of content items.
  7. Vespa is a platform for low latency computations over large, evolving data sets:
     • Search and selection over structured and unstructured data
     • Relevance scoring: NL features, advanced ML models, TensorFlow etc.
     • Query time organization and aggregation of matching data
     • Real-time writes at a high sustained rate
     • Live, elastic and auto-recovering stateful content clusters
     • Processing logic container (Java)
     • Managed clusters: one to hundreds of nodes
     Typical use cases: text search, personalization / recommendation / targeting, real-time data display.
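     A minimal sketch of what query-time search, grouping and aggregation looks like against a running Vespa application, using its HTTP search API. The endpoint, schema ("song") and field names here are illustrative assumptions, not taken from the slides:

         # Minimal sketch: querying Vespa's HTTP search API with YQL.
         # Endpoint, schema ("song") and field names are illustrative assumptions.
         import requests

         VESPA_ENDPOINT = "http://localhost:8080/search/"

         # The YQL selects matching documents; the trailing grouping clause
         # aggregates matches by artist at query time.
         yql = (
             'select * from sources * where album contains "wonderwall" '
             "| all(group(artist) each(output(count())))"
         )

         response = requests.get(
             VESPA_ENDPOINT,
             params={
                 "yql": yql,
                 "hits": 10,          # number of hits to return
                 "timeout": "100ms",  # bound the latency of the request
             },
         )
         response.raise_for_status()

         for hit in response.json()["root"].get("children", []):
             print(hit.get("relevance"), hit.get("fields", {}).get("track"))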
  8. Case study: Zedge
     "The primary motivations for Zedge to use Vespa are 1) to simplify search and recommender systems for the Zedge Android and iOS apps, both for serving (reduce the amount of custom code to maintain) and for processing/indexing (reduce the need for big data jobs by calculating more on the fly with tensors in Vespa), and 2) to accelerate innovation for content discovery; e.g. it is easier to improve ranking with machine learning using Vespa in combination with TensorFlow than with our custom-code recommender systems. An added bonus so far has been that more people understand both search and recommender systems due to the overall reduction in complexity." - Amund Tveit, Zedge VP of Data
     (Chart: 2017 Worldwide Download Leaders)
  9. Comparisons
     • Vespa: Focused on big data serving; large scale, efficient, ML models.
     • Elasticsearch: Focused on analytics; log ingestion, visualization etc.
     • Solr: Focused on enterprise search; handling document formats etc.
     • Relational databases: Transactions, but hard to scale; no IR, no relevance.
     • NoSQL stores: Easier to scale, but no transactions; no IR, no relevance.
     • Hadoop / Cloudera / Hortonworks: Big data, but not for serving.
  10. Vespa and Elasticsearch use cases (diagram): Elasticsearch targets analytics, Vespa targets big data serving, and both cover text search, relevance, grouping and aggregation.
  11. Analytics vs. big data serving (analytics characteristics first in each pair):
     • Response time in low seconds vs. response time in low milliseconds
     • Low query rate vs. high query rate
     • Time series, append only vs. random writes
     • Down time and data loss acceptable vs. HA, no data loss, online redistribution
     • Massive data sets (trillions of docs) are cheap vs. massive data sets are more expensive
     • Analytics GUI integration vs. machine learning integration
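     To make the "random writes at a high sustained rate" side concrete, here is a minimal sketch of a real-time write through Vespa's /document/v1 API. The namespace ("music"), document type ("song") and fields are illustrative assumptions:

         # Minimal sketch: a real-time write through Vespa's /document/v1 API.
         # Namespace ("music"), document type ("song") and fields are illustrative assumptions.
         import requests

         doc_url = "http://localhost:8080/document/v1/music/song/docid/up-and-up"

         # Each write becomes visible to queries almost immediately; there is no
         # offline batch rebuild step.
         response = requests.post(
             doc_url,
             json={
                 "fields": {
                     "artist": "Coldplay",
                     "album": "A Head Full of Dreams",
                     "track": "Up&Up",
                 }
             },
         )
         response.raise_for_status()
         print(response.json())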
  12. Vespa architecture
  13. Scalable low latency execution: how to bound latency (architecture diagram)
     Components: container nodes handling queries and application-package components, an admin & config cluster, and content nodes; deploying an application package pushes configuration, components and ML models; queries are scatter-gathered across core shards on the content nodes.
     Techniques: 1) parallelization, 2) moving execution to the data nodes, 3) prepared data structures (indexes etc.).
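     As a toy illustration of the scatter-gather step only (not Vespa's actual implementation; shard names and result shapes are made up), the sketch below fans a query out to several content shards in parallel and merges the partial results by score:

         # Toy illustration of scatter-gather over content shards.
         # Not Vespa's implementation; shard names and result shapes are made up.
         from concurrent.futures import ThreadPoolExecutor

         SHARDS = ["content-0", "content-1", "content-2", "content-3"]

         def query_shard(shard: str, query: str, hits: int):
             """Stand-in for running the query on one shard; returns (score, doc) pairs."""
             # A real system would issue a network call to the content node here.
             return [(hash((shard, query, i)) % 1000 / 1000.0, f"{shard}/doc{i}")
                     for i in range(hits)]

         def scatter_gather(query: str, hits: int = 10):
             # Scatter: query every shard in parallel, so latency is bounded by the
             # slowest shard rather than the sum over all shards.
             with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
                 partials = pool.map(lambda s: query_shard(s, query, hits), SHARDS)
             # Gather: merge the partial lists and keep the globally best hits.
             merged = sorted((hit for partial in partials for hit in partial), reverse=True)
             return merged[:hits]

         print(scatter_gather("wonderwall"))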
  14. Amdahl's law: speedup = 1 / (s + p / N)
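     A quick worked example of the formula, with s the serial fraction of the work, p = 1 - s the parallelizable fraction and N the number of nodes; it shows why the serial part bounds the achievable speedup:

         # Amdahl's law from the slide: speedup = 1 / (s + p / N), with p = 1 - s.
         def speedup(s: float, n: int) -> float:
             p = 1.0 - s
             return 1.0 / (s + p / n)

         # Even with a small serial fraction, speedup flattens as nodes are added:
         for n in (1, 10, 100, 1000):
             print(n, round(speedup(0.05, n), 1))
         # 1 -> 1.0, 10 -> 6.9, 100 -> 16.8, 1000 -> 19.6; the limit is 1/s = 20.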
  15. SLA: Latency 100 ms @ 95th percentile; throughput 500 qps. Increased resources can be utilized to potentially increase the quality of returned results.
  16. Inference in Vespa
     • Tensor data model: multidimensional collections of numbers in queries, documents and models.
     • Tensor math expresses all common machine-learned models with join, map, reduce.
     • TensorFlow and ONNX integration: deploy TensorFlow and ONNX models (SciKit, Caffe2, PyTorch etc.) directly on Vespa.
     • The Vespa execution engine is optimized for repeated execution of models over many data items, and for running many inferences in parallel.
  17. <application package>/models/
     search music {
         rank-profile song inherits default {
             first-phase {
                 expression {
                     0.7 * nativeRank(artist, album, track) +
                     0.1 * tensorflow(tf-model-dir) +
                     0.1 * onnx(onnx-model-file, output) +
                     0.1 * xgboost(xgboost-model-file)
                 }
             }
         }
     }
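     The tf-model-dir referenced above is a directory inside the application package; a TensorFlow model is typically exported there as a SavedModel so the rank profile can evaluate it. A minimal sketch of such an export, where the model architecture and paths are illustrative assumptions:

         # Minimal sketch: exporting a TensorFlow model into the application package
         # so a rank profile can reference it. Model and paths are illustrative assumptions.
         import tensorflow as tf

         # A tiny network standing in for a real ranking model.
         model = tf.keras.Sequential([
             tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
             tf.keras.layers.Dense(1),
         ])

         # Export as a SavedModel under the application package's models/ directory,
         # mirroring the tf-model-dir used in the rank profile above.
         tf.saved_model.save(model, "application/models/tf-model-dir")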
  18. A TensorFlow graph (Placeholder → matmul with Weights_1 → add Weights_2 → relu) expressed as Vespa tensor math:
     map(
         join(
             reduce(
                 join(Placeholder, Weights_1, f(x,y)(x * y)),
                 sum, d1
             ),
             Weights_2, f(x,y)(x + y)
         ),
         f(x)(max(0, x))
     )
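     Read from the inside out, this is a single dense layer: join with multiplication followed by reduce-sum over the shared dimension d1 is a matrix multiplication, the second join is the bias add, and the final map is the ReLU. A rough NumPy equivalent, with shapes chosen only for illustration:

         # Rough NumPy equivalent of the tensor expression above (shapes are assumptions).
         import numpy as np

         placeholder = np.random.rand(1, 20)   # input, with d1 = 20
         weights_1 = np.random.rand(20, 64)    # dense layer weights
         weights_2 = np.random.rand(64)        # bias

         # join(..., f(x,y)(x * y)) then reduce(..., sum, d1)  ==  matmul over d1
         hidden = placeholder @ weights_1
         # join(..., f(x,y)(x + y))  ==  bias add
         hidden = hidden + weights_2
         # map(..., f(x)(max(0, x)))  ==  relu
         output = np.maximum(0.0, hidden)
         print(output.shape)  # (1, 64)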
  19. Vespa recap
     Making the best use of big data often implies making decisions in real time. Vespa is the only open source platform optimized for such big data serving. Available at https://vespa.ai
     Quick start: run a complete application (on a laptop or AWS) in 10 minutes: http://docs.vespa.ai/documentation/vespa-quick-start.html
     Tutorial: make a scalable blog search and recommendation engine from scratch: http://docs.vespa.ai/documentation/tutorials/blog-search.html
