Exploring MongoDB & Elasticsearch: Better Together

An Open Talk at DeveloperWeek Austin 2017 by Kimberly Wilkins (@dba_denizen), Principal Engineer - Databases at ObjectRocket. Featuring new use cases like Bitcoin, AI, IoT, and all the cool things.

Published in: Technology

  1. Exploring MongoDB and Elasticsearch. DeveloperWeek Austin 2017. Kimberly Wilkins, Principal Engineer, Databases. @dba_denizen, /wilkinskimberly. www.objectrocket.com
  2. Current Areas of Interest
     • NoSQL – MongoDB, Elasticsearch, etc.
     • Streaming, real-time analytics
     • AR/VR/MR – augmented, virtual, and mixed reality technologies
     • Machine learning – deep learning
     • Cryptocurrencies, blockchain
     • Teaching, helping, raising up others
  3. MongoDB & Elasticsearch: Better Together? Yes!
  4. Overview
     • Definitions
     • Current versions
     • Features
     • Architectural basics
     • Use cases: Best, Worst, Together Squirrel
  5. Why Do It? The blue data highway… bulging at the seams.
  6. So Many Forms… As Many Impacts. New technologies, new industries, new uses…
  7. Data is Coming From Everywhere: sensors, IoT
  8. Data is Coming From Everywhere. “Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…” - Dan Ariely, Duke University
  9. Remember
     • Hold the data
     • Find the data fast
     • Stream the data between data stores
     • Process the data along the way
     • Analyze the data
     • Understand where the data comes from
  10. Why?
      • Faster, more flexible development
      • Lower cost (hardware, software, deployment)
      • Performance (faster writes, faster reads)
      • Developers (“schemaless”, cool toys)
      • More devs than DBAs, DevOps, SREs…
      • Variety of NoSQL technologies
  11. MongoDB & Elasticsearch: Better Together? Yes!
  12. MongoDB
      “MongoDB (from humongous) is a free and open-source cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with schemas.” – straight from Wikipedia
      • #1 NoSQL
      • #5 Overall
  13. Features: MongoDB
      • Document store
      • Collections vs. tables; document or ObjectIds
      • Easy for developers – more devs than DBAs and Ops
      • Flexible data types
      • Unstructured & structured data
      • De-normalized
      • Duplicate data is OK
      • Index intersections, partial indexes, aggregation pipelines
      • $lookup improvements coming in 3.6 (*Nov) – single db call; updating arrays
      • Scales vertically or horizontally – sharding
  14. MongoDB Architectural Basics
      • Faster, more flexible development
      • Built-in replication via replica sets
      • HA/DR throughout stack, components
      • Scaling via sharding
      • DR via use of multiple data centers
      • Delayed and/or hidden secondaries
      • https://www.objectrocket.com/files/objectrocket-for-mongodb-white-paper.pdf
  15. Basic MongoDB Architecture. [Diagram: a single replica set with one primary and two secondaries, connected by heartbeats.]
  16. 3.2 Sharded Cluster. [Diagram: client drivers connect to the MongoS tier (router, several MongoS processes); the MongoD tier holds the replica sets for Shard 1, Shard 2, and Shard 3, each with one primary and two secondaries; the config servers (metadata: Config 1, Config 2, Config 3) form a replica set.]
  17. MongoDB Architecture - Advanced
      • Multiple storage engine options
      • HA/DR throughout stack, components
      • Scaling via sharding
      • DR via use of multiple data centers, delayed/hidden
      • Percona Server edition - has features from MongoDB Enterprise edition* (security)
  18. Best Use Cases
      • User data - games, chat, social media
      • Mobile analytics, engagement/campaigns
      • Aggregation summaries
      • Product catalogs
      • Inventory management
      • Shopping carts
      • Content management systems - Sitecore 1000 x
  19. Elasticsearch
  20. Elasticsearch
      “Elasticsearch is a distributed, JSON-based search and analytics engine designed for horizontal scalability, maximum reliability, and easy management.” – straight from the Elastic.co website
  21. Best Use Cases
      ● Cluster - a collection of Elasticsearch nodes of various roles
        ↳ Nodes - Elasticsearch processes that perform one or more roles
          ● Roles are: master, data, ingest, coordinating-only (client)
          ● Nodes can operate in any combination of roles, or all of them
        ↳ Indexes - a collection of data (like databases/collections)
          ● Can be combined in queries with wildcards and aliases
          ● Fields in an index have an unchangeable data type (mapping)
        ↳ Shards - slices of the index data
          ● Unlike many databases, automatically constructed (not key based)
          ● A replica is just a read-only copy of a shard
        ↳ Segments - Lucene’s chunk of data
          ● Automatically built as data is indexed
          ● Docs are not deleted, just marked as deleted (can be optimized/merged)
        ↳ Documents - a JSON entry in the index
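To make the index, shard, replica, and mapping terms above concrete, here is a minimal sketch of a create-index request body built as a plain Python dict. `number_of_shards` and `number_of_replicas` are real index settings; the helper name and the field names (`message`, `created_at`, `location`) are made up for illustration. The `mappings` layout assumes the typeless form used by newer Elasticsearch releases; the 5.x releases discussed in this talk nest `properties` under a mapping type name instead.

```python
import json


def build_index_body(primary_shards, replicas, field_types):
    """Build a create-index request body (hypothetical helper, not an Elasticsearch API)."""
    return {
        "settings": {
            # Number of primary shards: fixed when the index is created.
            "number_of_shards": primary_shards,
            # Number of replica copies kept for each primary shard.
            "number_of_replicas": replicas,
        },
        "mappings": {
            # Each field gets one data type; an existing field's type cannot be changed later.
            "properties": {name: {"type": t} for name, t in field_types.items()}
        },
    }


body = build_index_body(
    3, 1, {"message": "text", "created_at": "date", "location": "geo_point"}
)
print(json.dumps(body, indent=2))
```

The dict would be sent as the JSON body of a `PUT /<index-name>` request; the index name is left open here on purpose.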
  22. Elasticsearch vs. Elastic Stack
      • Don’t be confused!
      • The open source Elastic Stack is a suite of tools/apps associated with and working in conjunction with Elasticsearch to complete a variety of analytics tasks.
  23. Elastic Stack Ecosystem
  24. Basic Elastic Architecture
      • 3 nodes, 1 replica, 1 master – fewer nodes, more resources per node, each shard performs better
      • 3 nodes, 2 replicas, 1 master – more nodes, needs more HW resources but increases search performance for the index and improves redundancy
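The trade-off between one and two replicas comes down to simple arithmetic, sketched below. Both function names are hypothetical. The second check relies on the fact that Elasticsearch does not place two copies of the same shard on one node, so every copy of a shard needs its own data node.

```python
def total_shard_copies(primary_shards, replicas):
    """Total shards the cluster must host: each primary plus its replica copies."""
    return primary_shards * (1 + replicas)


def can_assign_all_copies(data_nodes, replicas):
    """True if there are enough data nodes to host a primary and all of its replicas."""
    return data_nodes >= 1 + replicas


# Example with an assumed 5 primary shards on 3 data nodes.
for replicas in (1, 2):
    print(
        replicas,
        "replica(s):",
        total_shard_copies(5, replicas),
        "shard copies, all assignable:",
        can_assign_all_copies(3, replicas),
    )
```

More replicas mean more copies that can serve searches and survive node loss, at the cost of the extra disk and memory each copy needs.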
  25. Best Use Cases
      • Full and fuzzy text searches (**true strength: speed)
      • Geo and range related searches
      • Visualizing data – with other Elastic Stack components (Kibana)
      • Logging and log analysis xsplunkx
      • Scraping and combining public data sources
      • Event and data metrics
  26. Geo Queries – Social Media – Near Me
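A "near me" search of this kind could be expressed with Elasticsearch's `geo_distance` query. The sketch below only builds the request body; the helper name, the `location` field (assumed to be mapped as `geo_point`), the default distance, and the sample coordinates are all assumptions for illustration.

```python
import json


def near_me_query(lat, lon, distance="5km", field="location"):
    """Build a search body that keeps only documents within `distance` of a point."""
    return {
        "query": {
            "bool": {
                # A filter clause does not affect scoring; it only includes or excludes documents.
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        field: {"lat": lat, "lon": lon},
                    }
                }
            }
        }
    }


# Roughly downtown Austin, used purely as sample coordinates.
query = near_me_query(30.2672, -97.7431, distance="2km")
print(json.dumps(query, indent=2))
```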
  27. Visualization with Kibana
  28. Visualization with Kibana
      Comparison: MongoDB vs. Elastic (Elasticsearch)
      • Purpose
        MongoDB: general-purpose document store DB, server-side scripts, some aggregation pipelines; OLTP = good, reporting = not as good; simple = good, complex = good, very complex = not as good
        Elasticsearch: full-text search engine, fuzzy text search, geo near, keyword, real-time analytics, indexer, distributed, Java-based with Lucene under the covers
      • Versions
        MongoDB: current version 3.4.10 (*Halloween!); recommended 3.4.8 or 3.4.9
        Elasticsearch: current version 5.6.1, September 18, 2017 (*new, kinks from the 5.5.3 release from September 11, 2017); recommended and available 5.5.1, July 25, 2017
      • Schema
        MongoDB: schemaless **#! Structured, unstructured, semi-structured
        Elasticsearch: schemaless **#! Structured, unstructured, semi-structured
      • Document format
        MongoDB: JSON, BSON docs
        Elasticsearch: JSON
      • Scaling
        MongoDB: sharding to scale
        Elasticsearch: sharding/nodes to scale
      • High availability
        MongoDB: HA via replica sets (1 primary, 2 secondaries, or more with quorum)
        Elasticsearch: HA via replica sets (1 master, x replicas)
      • Indexes
        MongoDB: limited index intersection v2.6+, very large indexes still ehh
        Elasticsearch: 1 query can use multiple indexes
      • Strengths
        MongoDB: great general-purpose NoSQL db, for processing, filtering during query & data retrieval
        Elasticsearch: processing via index builds, stores in multiple versions; great at indexing; great at searching big datasets
  29. Now Combine Them: like tacos and tequila
  30. Combining – in general
      • Database has many indexes or very large indexes
      • Data has lots of arrays, and queries need many different $and clauses on a field with an array as a value
      • SPEED up fuzzy and/or full text searches – ‘chicken’ ex. db.articles.find({ $text: { $search: "chi" } })
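The ‘chicken’ example can be sketched side by side as plain query documents. As far as I know, MongoDB's `$text` operator matches whole (stemmed) words, so a search for "chi" would not be expected to find "chicken"; Elasticsearch can cover that gap with a prefix-style or fuzzy query. The helper names and the `body` field are hypothetical; `$text`/`$search`, `match` with `fuzziness`, and `match_phrase_prefix` are existing query features.

```python
def mongo_text_filter(term):
    """MongoDB filter document for a text-index search (requires a text index)."""
    return {"$text": {"$search": term}}


def es_prefix_query(field, prefix):
    """Elasticsearch body matching documents whose field contains a word starting with `prefix`."""
    return {"query": {"match_phrase_prefix": {field: prefix}}}


def es_fuzzy_query(field, term):
    """Elasticsearch body tolerating small typos in `term` (edit distance chosen automatically)."""
    return {"query": {"match": {field: {"query": term, "fuzziness": "AUTO"}}}}


print(mongo_text_filter("chicken"))      # whole-word search in MongoDB
print(es_prefix_query("body", "chi"))    # partial word, as typed so far
print(es_fuzzy_query("body", "chiken"))  # misspelled word
```

In a combined setup MongoDB would stay the system of record, and only the search box would go to Elasticsearch.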
  31. MongoDB & Elasticsearch +
      • Elasticsearch: primarily a search engine; scalable, distributed; horizontal scaling; JSON; schemaless*; based on Lucene; support for Python, JS, .NET, Scala, Perl, PHP, Ruby; 3rd-party product integration
      • Kafka, others: primarily for streaming, for moving data between data stores; used with other components and data techs to create near-real-time and very-near-real-time event analytics; append only; horizontal scaling; JSON; schemaless*; parallel processing; 3rd-party product integration
      • MongoDB: primarily OLTP; scalable, distributed; vertical or horizontal scaling; binary JSON; schemaless*; rapid prototyping; event logging; social media; content management; user data and actions; NOT in-depth analysis
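One common way to combine the two stores is to copy MongoDB documents into an Elasticsearch index through the bulk API. The sketch below only builds the newline-delimited payload; the function name, the `articles` index, and the sample documents are assumptions, and no particular sync tool from the talk is implied. Note that 5.x releases also expect a `_type` in the action line (or in the request path).

```python
import json


def to_bulk_ndjson(index_name, mongo_docs):
    """Turn MongoDB-style documents into a bulk-API payload (one action line plus one source line each)."""
    lines = []
    for doc in mongo_docs:
        source = dict(doc)  # copy so the caller's document is not modified
        # `_id` is document metadata in Elasticsearch, so it moves to the action line.
        doc_id = str(source.pop("_id"))
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # default=str covers values such as ObjectId or datetime that json cannot encode natively.
        lines.append(json.dumps(source, default=str))
    # The bulk API requires the payload to end with a newline.
    return "\n".join(lines) + "\n"


docs = [{"_id": 1, "title": "Tacos"}, {"_id": 2, "title": "Tequila"}]
payload = to_bulk_ndjson("articles", docs)
print(payload)
```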
  32. MongoDB & Elasticsearch @ObjectRocket – Currently
      • MongoDB metrics
      • Centralized logging
      • MongoDB data visualization
      • Network monitoring
      • Website search
      • Business metrics
      • Elasticsearch metrics
  33. Potential New Use 1 – Bitcoin Time Interval Tracking
      Bitcoin ticker data: interval tracking and analysis…
      • MongoDB: simple and complex queries; aggregations at any stage
      • Elasticsearch: speed up queries (faster results); store frequent queries for re-use via indexes
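As a rough sketch of interval tracking on the MongoDB side, the aggregation pipeline below groups ticker documents into hourly buckets with open, high, low, and close prices. The pipeline stages and operators (`$match`, `$sort`, `$group`, `$year`, `$hour`, `$first`, `$last`, and so on) exist in MongoDB; the function name and the document fields `symbol`, `ts`, and `price` are assumptions about how the ticker data might be stored.

```python
def hourly_interval_pipeline(symbol, ts_field="ts", price_field="price"):
    """Build an aggregation pipeline that summarizes ticks per hour."""
    ts = "$" + ts_field
    price = "$" + price_field
    return [
        {"$match": {"symbol": symbol}},
        # Sort by time first so $first/$last pick the opening and closing tick of each bucket.
        {"$sort": {ts_field: 1}},
        {
            "$group": {
                "_id": {
                    "year": {"$year": ts},
                    "month": {"$month": ts},
                    "day": {"$dayOfMonth": ts},
                    "hour": {"$hour": ts},
                },
                "open": {"$first": price},
                "high": {"$max": price},
                "low": {"$min": price},
                "close": {"$last": price},
                "ticks": {"$sum": 1},
            }
        },
        # $group does not guarantee output order, so sort the buckets chronologically.
        {"$sort": {"_id.year": 1, "_id.month": 1, "_id.day": 1, "_id.hour": 1}},
    ]


pipeline = hourly_interval_pipeline("BTC-USD")
print(pipeline)
```

With a driver such as PyMongo this list could be passed to a collection's `aggregate()` method; the bucketed results could then be indexed into Elasticsearch for fast repeated lookups.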
  34. Potential New Use 1 cont’d – Bitcoin Time Interval Tracking
  35. Potential New Use 2 – Cryptocurrency Platform/Trading
      • Cryptocurrency trading platform - ex. tribeca
      • node.js – v7.8 or higher
      • MongoDB database – for persistence, aggregations
      • Elasticsearch – the ‘need for speed’: rapid-fire executions required, sub-millisecond trades & cancellations
  36. Potential New Use 3 – Social Media App Searching
      • Searching large social media apps for frequently searched items – popular quarterbacks & receivers on fantasy football sites, wines in comments
      • MongoDB’s $text operator is special - it cannot be used more than once in a query; no use with $nor, etc.
        ex. db.comments.find({ $and: [ { $text: { $search: "win" } }, { $text: { $search: "red" } } ] }) – WON’T WORK in MongoDB on its own, but combining MongoDB with Elasticsearch covers this kind of search.
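On the Elasticsearch side, requiring several terms at once is straightforward: a `bool` query with one `match` clause per term under `must`. The sketch below builds such a body; the helper name and the `text` field are hypothetical, and it is meant as one possible way to express the two-term comment search, not as the approach used in the talk.

```python
def es_all_terms_query(field, terms):
    """Build a search body that only matches documents containing every term in `field`."""
    return {
        "query": {
            "bool": {
                # Every clause under `must` has to match; each one also contributes to the score.
                "must": [{"match": {field: term}} for term in terms]
            }
        }
    }


query = es_all_terms_query("text", ["wine", "red"])
print(query)
```

The matching document IDs returned by Elasticsearch could then be used to fetch the full comments from MongoDB.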
  37. Potential New Use 4 – Machine Learning, Deep Learning
  38. Potential New Use 4 – Machine Learning, Deep Learning
      Architecture and Streaming Platform – Jay Kreps
      • Apps/DBs -> data in
      • Aggregations at any stage
      • Further queries
      • Faster queries via ES
      • Results back into DBs
      • Algorithms applied
      • Endless … Limitless …
      Device events, time series, event logs, AR/VR/MR
  39. Links
      • MongoDB to analyze cryptocurrency price swings and intervals: https://medium.com/@serbanmihai/aggregate-mongodb-data-with-node-js-and-mongoose-cryptocurrency-financial-time-series-ae739b4c9485
      • MongoDB with node.js – cryptocurrency trading platform: https://github.com/michaelgrosner/tribeca
      • Arctic, MongoDB, and Python – cryptocurrency database: https://mxbu.github.io/logbook/2017/06/04/use-arctic-to-create-cryptocurrency-database/
      • AI, ML, DL - Jay Kreps article on architecture and streaming platform for AI, deep learning, database pipelines, models, events, etc.: https://www.oreilly.com/ideas/apache-kafka-and-the-four-challenges-of-production-machine-learning-systems
  40. We are Hiring! Join a dynamic and innovative team! objectrocket.com/careers
  41. Consultations Available: sales@objectrocket.com. View Customer Stories: objectrocket.com/customers/. Trial & Migrations always free: objectrocket.com
  42. Thank You! DeveloperWeek Austin 2017. Kimberly Wilkins, Principal Engineer, Databases. @dba_denizen, /wilkinskimberly