SlideShare ist ein Scribd-Unternehmen logo
1 von 24
CONFIDENTIALCONFIDENTIALCONFIDENTIALCONFIDENTIAL
Geo Searches for Health Care Pricing Data
Robert Stewart
Senior Architect, Castlight Health
rstewart@castlighthealth.com
@wombatnation
1
CONFIDENTIALCONFIDENTIALCONFIDENTIALCONFIDENTIAL
Castlight Health
The Business and Technical Problems
Initial Solution
MongoDB, Geo Haystack Index and SSDs
Replica Set Flipping
2
 Hosted web and mobile applications providing
unbiased information on health care cost and quality
 Customers are employers and health plans
 Founded in 2008, raised $181 million in VC funding
 #1 on Wall Street Journal’s list of “Top 50 Venture-
Backed Companies” for 2011
 Hiring!
Castlight Health
3
Home Page
4
Search Results
5
Business Problem
6
 Support searches for
 Prices for a procedure performed by any in-network provider in a
geographical area
 Prices for all procedures performed by a single provider
 Sub-second response, even if returning data on
thousands of prices
 Need a very fast geo index
 Rate count doubled in last 3 months to 600 million
 Major rate updates monthly
 Difficult to index data to ensure sequential reads
 Sometimes lots of random reads
Technical Problems
7
Pricing Retrieval Architecture
8
User
Castlight
Web Browser
Mobile Web
Browser
Native Mobile
Application
Castlight Web
App
Castlight Mobile
Web App
Proxy Service
Search Service
Pricing Service
Prices
Initial Solution
9
 Store pricing data in MySQL
 When Pricing Service starts, create two in-memory
indexes and cache most of the rates
 55 GB JVM Heap with lots of GC tuning
 20-minute service startup time to build indexes
 3 hours for background caching of most rates
 Trouble Brewing:
 Total rates growing quickly
 Rolling restart becoming unacceptably slow
 If rates not in Java or MySQL cache, retrieval was very slow
CONFIDENTIALCONFIDENTIALCONFIDENTIALCONFIDENTIAL
Enter the Mongo
10
Geo Indexes
11
 Tried standard geo 2D indexes in MongoDB
 Too slow for my use case
 Geo Haystack index
 Conceptually similar
 From docs.mongodb.org
 “A haystack index is a special index that is optimized to return
results over small areas. Haystack indexes improve performance
on queries that use flat geometry.”
Mercator Projection with 10 degree grid
12
Geo Haystack
13
 We chose degrees long-lat for x-y coordinate system
 25 miles is our default search radius
 Roughly 0.5 degrees in middle of the US
db.priceables_1.ensureIndex(
{ loc: "geoHaystack", pm: 1 },
{ bucketSize: 0.5 })
db.runCommand(
{ geoSearch: "priceables_1",
near: [-122.4, 37.79],
maxDistance: 0.5,
search: { pm: 6757 },
limit: 50000 })
 maxDistance calculated using great circle algorithm
Geo Haystack Pros
14
 Very fast when retrieving many documents in a
relatively small search radius
 Great when you also need to apply a secondary filter
 Compound 2dsphere index in Mongo 2.4 has even better support
Geo Haystack Cons
15
 Supports only one extra filter in index
 SERVER-2979
 A bug if unindexed query on only the second part of
the key
 SERVER-8645
> db.priceables_1.find({pm: 6757})
error: { "$err" : "assertion src/mongo/db/geo/haystack.cpp:178" }
 Second part of index can’t have an array value
 Location part of key can’t be null
SSDs
16
 For uncached data on HDD, Geo Haystack was twice as
fast as custom Java geo index and MySQL
 Still close to 1 minute for big queries with full data set
 Death by random read
 Tested with a $200 Samsung SSD
 Typical query dropped to 20 millis
 Big query only about 150 millis
Random 4k block reads, 5 GB file, 16 threads
Mongoperf on SSDs
17
Env SSD Read Ops/s Read MB/s
Prod Samsung 200GB SLC 74k 288
QA VM Samsung 200GB SLC 30k 117
Dev Samsung 830 256GB SATA MLC 47k 183
Env SSD Write Ops/s Write MB/s
Prod Samsung 200GB SLC 1074 289
QA VM Samsung 200GB SLC 405 196
Dev Samsung 830 256GB SATA MLC 438 210
Sequential write of the 5 GB file
 Requirements
 Major price updates monthly
 Minor updates more frequently
 Huge bulk loads with no impact on active replica set
 I/O bound, not CPU bound
Low Impact Pricing Updates
18
 Two replica sets
 Lowered cost with two SSDs on each pricing server
 scp compressed files from QA to passive replica set
 Protip: to compress and uncompress
tar cvf - pricing | pigz > ~/pricing.tgz
pigz -dc pricing.tgz | tar xvf -
 Page in index and data
 db.runCommand({ touch: "priceables_1", index: true, data: true })
 Pricing Service operation to atomically flip
Replica Set Flipping Solution
19
Replica Set Architecture
20
Physical Servers
Replica
Sets
prodpricing1
prodpricing2
Server pricing1
mongod 28001
primary
mongod 28002
secondary
Server pricing2
mongod 28001
secondary
mongod 28002
primary
Server db1
mongod 28001
arbiter
Server db2
mongod 28002
arbiter
 Obviously, increased cost, but only for SSDs
 Recently added caching of remote pricing lookups
 TTL collections
 Cache is lost during a flip
 But, usually flip late at night
 Cache eviction time is only a few hours
Replica Set Flipping Drawbacks
21
 Geo search speed with cold cache acceptable
 Geo search speed with warm cache awesome
 Pricing Service startup down to a few seconds
 No production impact for major rate updates
 Lowered risk for minor rate updates
Overall Results
22
Summary
23
 Geo Haystack Index great for …
 Retrieving lots of documents in a constrained search area
 Geo searches with a secondary filter
 SSDs great for …
 Random reads
 Reducing need for lots of complex indexes
 Replica set flipping great for …
 Instant swap of large amounts of data
 Primarily, if not solely, read only
 Trading cost for operational flexibility
CONFIDENTIALCONFIDENTIALCONFIDENTIALCONFIDENTIAL
Q & A
24

Weitere ähnliche Inhalte

Was ist angesagt?

Big Data and Data Virtualization
Big Data and Data VirtualizationBig Data and Data Virtualization
Big Data and Data VirtualizationKenneth Peeples
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Denodo
 
Performance Acceleration: Summaries, Recommendation, MPP and more
Performance Acceleration: Summaries, Recommendation, MPP and morePerformance Acceleration: Summaries, Recommendation, MPP and more
Performance Acceleration: Summaries, Recommendation, MPP and moreDenodo
 
Denodo Data Virtualization - IT Days in Luxembourg with Oktopus
Denodo Data Virtualization - IT Days in Luxembourg with OktopusDenodo Data Virtualization - IT Days in Luxembourg with Oktopus
Denodo Data Virtualization - IT Days in Luxembourg with OktopusDenodo
 
Powering Self Service Business Intelligence with Hadoop and Data Virtualization
Powering Self Service Business Intelligence with Hadoop and Data VirtualizationPowering Self Service Business Intelligence with Hadoop and Data Virtualization
Powering Self Service Business Intelligence with Hadoop and Data VirtualizationDenodo
 
Denodo DataFest 2016: What’s New in Denodo Platform – Demo and Roadmap
Denodo DataFest 2016: What’s New in Denodo Platform – Demo and RoadmapDenodo DataFest 2016: What’s New in Denodo Platform – Demo and Roadmap
Denodo DataFest 2016: What’s New in Denodo Platform – Demo and RoadmapDenodo
 
Making Big Data Analytics with Hadoop fast & easy (webinar slides)
Making Big Data Analytics with Hadoop fast & easy (webinar slides)Making Big Data Analytics with Hadoop fast & easy (webinar slides)
Making Big Data Analytics with Hadoop fast & easy (webinar slides)Yellowfin
 
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...Denodo
 
Anzo Smart Data Lake 4.0 - a Data Lake Platform for the Enterprise Informatio...
Anzo Smart Data Lake 4.0 - a Data Lake Platform for the Enterprise Informatio...Anzo Smart Data Lake 4.0 - a Data Lake Platform for the Enterprise Informatio...
Anzo Smart Data Lake 4.0 - a Data Lake Platform for the Enterprise Informatio...Cambridge Semantics
 
Scalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4j
Scalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4jScalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4j
Scalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4jNeo4j
 
Data Marketplace - Rethink the Data
Data Marketplace - Rethink the DataData Marketplace - Rethink the Data
Data Marketplace - Rethink the DataDenodo
 
Big Data Fabric 2.0 Drives Data Democratization
Big Data Fabric 2.0 Drives Data DemocratizationBig Data Fabric 2.0 Drives Data Democratization
Big Data Fabric 2.0 Drives Data DemocratizationCambridge Semantics
 
MongoDB at Agilysys: A Case Study
MongoDB at Agilysys: A Case StudyMongoDB at Agilysys: A Case Study
MongoDB at Agilysys: A Case StudyMongoDB
 
Harnessing the Power of Big Data at Freddie Mac
Harnessing the Power of Big Data at Freddie MacHarnessing the Power of Big Data at Freddie Mac
Harnessing the Power of Big Data at Freddie MacDataWorks Summit
 
Take your Data Management Practice to the Next Level with Denodo 7
Take your Data Management Practice to the Next Level with Denodo 7Take your Data Management Practice to the Next Level with Denodo 7
Take your Data Management Practice to the Next Level with Denodo 7Denodo
 
Using Cloud Automation Technologies to Deliver an Enterprise Data Fabric
Using Cloud Automation Technologies to Deliver an Enterprise Data FabricUsing Cloud Automation Technologies to Deliver an Enterprise Data Fabric
Using Cloud Automation Technologies to Deliver an Enterprise Data FabricCambridge Semantics
 
Top 5 Considerations for a Big Data Solution
Top 5 Considerations for a Big Data SolutionTop 5 Considerations for a Big Data Solution
Top 5 Considerations for a Big Data SolutionDataStax
 
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical DemonstrationMaximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical DemonstrationDenodo
 
Secure Your Data with Virtual Data Fabric (ASEAN)
Secure Your Data with Virtual Data Fabric (ASEAN)Secure Your Data with Virtual Data Fabric (ASEAN)
Secure Your Data with Virtual Data Fabric (ASEAN)Denodo
 

Was ist angesagt? (20)

Big Data and Data Virtualization
Big Data and Data VirtualizationBig Data and Data Virtualization
Big Data and Data Virtualization
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)
 
Performance Acceleration: Summaries, Recommendation, MPP and more
Performance Acceleration: Summaries, Recommendation, MPP and morePerformance Acceleration: Summaries, Recommendation, MPP and more
Performance Acceleration: Summaries, Recommendation, MPP and more
 
Denodo Data Virtualization - IT Days in Luxembourg with Oktopus
Denodo Data Virtualization - IT Days in Luxembourg with OktopusDenodo Data Virtualization - IT Days in Luxembourg with Oktopus
Denodo Data Virtualization - IT Days in Luxembourg with Oktopus
 
Powering Self Service Business Intelligence with Hadoop and Data Virtualization
Powering Self Service Business Intelligence with Hadoop and Data VirtualizationPowering Self Service Business Intelligence with Hadoop and Data Virtualization
Powering Self Service Business Intelligence with Hadoop and Data Virtualization
 
Denodo DataFest 2016: What’s New in Denodo Platform – Demo and Roadmap
Denodo DataFest 2016: What’s New in Denodo Platform – Demo and RoadmapDenodo DataFest 2016: What’s New in Denodo Platform – Demo and Roadmap
Denodo DataFest 2016: What’s New in Denodo Platform – Demo and Roadmap
 
Making Big Data Analytics with Hadoop fast & easy (webinar slides)
Making Big Data Analytics with Hadoop fast & easy (webinar slides)Making Big Data Analytics with Hadoop fast & easy (webinar slides)
Making Big Data Analytics with Hadoop fast & easy (webinar slides)
 
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
 
Anzo Smart Data Lake 4.0 - a Data Lake Platform for the Enterprise Informatio...
Anzo Smart Data Lake 4.0 - a Data Lake Platform for the Enterprise Informatio...Anzo Smart Data Lake 4.0 - a Data Lake Platform for the Enterprise Informatio...
Anzo Smart Data Lake 4.0 - a Data Lake Platform for the Enterprise Informatio...
 
Scalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4j
Scalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4jScalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4j
Scalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4j
 
Data Marketplace - Rethink the Data
Data Marketplace - Rethink the DataData Marketplace - Rethink the Data
Data Marketplace - Rethink the Data
 
Big Data Fabric 2.0 Drives Data Democratization
Big Data Fabric 2.0 Drives Data DemocratizationBig Data Fabric 2.0 Drives Data Democratization
Big Data Fabric 2.0 Drives Data Democratization
 
MongoDB at Agilysys: A Case Study
MongoDB at Agilysys: A Case StudyMongoDB at Agilysys: A Case Study
MongoDB at Agilysys: A Case Study
 
Harnessing the Power of Big Data at Freddie Mac
Harnessing the Power of Big Data at Freddie MacHarnessing the Power of Big Data at Freddie Mac
Harnessing the Power of Big Data at Freddie Mac
 
TESTING IN BIG DATA WORLD
TESTING IN BIG DATA  WORLDTESTING IN BIG DATA  WORLD
TESTING IN BIG DATA WORLD
 
Take your Data Management Practice to the Next Level with Denodo 7
Take your Data Management Practice to the Next Level with Denodo 7Take your Data Management Practice to the Next Level with Denodo 7
Take your Data Management Practice to the Next Level with Denodo 7
 
Using Cloud Automation Technologies to Deliver an Enterprise Data Fabric
Using Cloud Automation Technologies to Deliver an Enterprise Data FabricUsing Cloud Automation Technologies to Deliver an Enterprise Data Fabric
Using Cloud Automation Technologies to Deliver an Enterprise Data Fabric
 
Top 5 Considerations for a Big Data Solution
Top 5 Considerations for a Big Data SolutionTop 5 Considerations for a Big Data Solution
Top 5 Considerations for a Big Data Solution
 
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical DemonstrationMaximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration
 
Secure Your Data with Virtual Data Fabric (ASEAN)
Secure Your Data with Virtual Data Fabric (ASEAN)Secure Your Data with Virtual Data Fabric (ASEAN)
Secure Your Data with Virtual Data Fabric (ASEAN)
 

Ähnlich wie MongoDB San Francisco 2013:Geo Searches for Healthcare Pricing Data presented by Robert Stewart, Castlight Health

Geo Searches for Health Care Pricing Data with MongoDB
Geo Searches for Health Care Pricing Data with MongoDBGeo Searches for Health Care Pricing Data with MongoDB
Geo Searches for Health Care Pricing Data with MongoDBRobert Stewart
 
Geo Searches for Health Care Pricing Data with MongoDB
Geo Searches for Health Care Pricing Data with MongoDBGeo Searches for Health Care Pricing Data with MongoDB
Geo Searches for Health Care Pricing Data with MongoDBRobert Stewart
 
Best Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon RedshiftBest Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon RedshiftSnapLogic
 
IBM Insight 2013 - Aetna's production experience using IBM DB2 Analytics Acce...
IBM Insight 2013 - Aetna's production experience using IBM DB2 Analytics Acce...IBM Insight 2013 - Aetna's production experience using IBM DB2 Analytics Acce...
IBM Insight 2013 - Aetna's production experience using IBM DB2 Analytics Acce...Daniel Martin
 
Yellowbrick MicroStrategy webcast
Yellowbrick MicroStrategy webcastYellowbrick MicroStrategy webcast
Yellowbrick MicroStrategy webcastYellowbrick Data
 
Augmenting Amdahl's Second Law for Cost-Effective and Balanced HPC Infrastruc...
Augmenting Amdahl's Second Law for Cost-Effective and Balanced HPC Infrastruc...Augmenting Amdahl's Second Law for Cost-Effective and Balanced HPC Infrastruc...
Augmenting Amdahl's Second Law for Cost-Effective and Balanced HPC Infrastruc...Arghya Kusum Das
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Querying Druid in SQL with Superset
Querying Druid in SQL with SupersetQuerying Druid in SQL with Superset
Querying Druid in SQL with SupersetDataWorks Summit
 
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...Amazon Web Services
 
How to Achieve Fast Data Performance in Big Data, Logical Data Warehouse, and...
How to Achieve Fast Data Performance in Big Data, Logical Data Warehouse, and...How to Achieve Fast Data Performance in Big Data, Logical Data Warehouse, and...
How to Achieve Fast Data Performance in Big Data, Logical Data Warehouse, and...Denodo
 
RedisConf17 - Lyft - Geospatial at Scale - Daniel Hochman
RedisConf17 - Lyft - Geospatial at Scale - Daniel HochmanRedisConf17 - Lyft - Geospatial at Scale - Daniel Hochman
RedisConf17 - Lyft - Geospatial at Scale - Daniel HochmanRedis Labs
 
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...Daniel Hochman
 
20181116 Massive Log Processing using I/O optimized PostgreSQL
20181116 Massive Log Processing using I/O optimized PostgreSQL20181116 Massive Log Processing using I/O optimized PostgreSQL
20181116 Massive Log Processing using I/O optimized PostgreSQLKohei KaiGai
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Hybrid solutions – combining in memory solutions with SSD - Christos Erotocritou
Hybrid solutions – combining in memory solutions with SSD - Christos ErotocritouHybrid solutions – combining in memory solutions with SSD - Christos Erotocritou
Hybrid solutions – combining in memory solutions with SSD - Christos ErotocritouJAXLondon_Conference
 
Can data virtualization uphold performance with complex queries?
Can data virtualization uphold performance with complex queries?Can data virtualization uphold performance with complex queries?
Can data virtualization uphold performance with complex queries?Denodo
 
At the core you will have KUSTO
At the core you will have KUSTOAt the core you will have KUSTO
At the core you will have KUSTORiccardo Zamana
 
Getting Started with Amazon Redshift
 Getting Started with Amazon Redshift Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
Hw09   Hadoop Based Data Mining Platform For The Telecom IndustryHw09   Hadoop Based Data Mining Platform For The Telecom Industry
Hw09 Hadoop Based Data Mining Platform For The Telecom IndustryCloudera, Inc.
 

Ähnlich wie MongoDB San Francisco 2013:Geo Searches for Healthcare Pricing Data presented by Robert Stewart, Castlight Health (20)

Geo Searches for Health Care Pricing Data with MongoDB
Geo Searches for Health Care Pricing Data with MongoDBGeo Searches for Health Care Pricing Data with MongoDB
Geo Searches for Health Care Pricing Data with MongoDB
 
Geo Searches for Health Care Pricing Data with MongoDB
Geo Searches for Health Care Pricing Data with MongoDBGeo Searches for Health Care Pricing Data with MongoDB
Geo Searches for Health Care Pricing Data with MongoDB
 
Best Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon RedshiftBest Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon Redshift
 
IBM Insight 2013 - Aetna's production experience using IBM DB2 Analytics Acce...
IBM Insight 2013 - Aetna's production experience using IBM DB2 Analytics Acce...IBM Insight 2013 - Aetna's production experience using IBM DB2 Analytics Acce...
IBM Insight 2013 - Aetna's production experience using IBM DB2 Analytics Acce...
 
Yellowbrick MicroStrategy webcast
Yellowbrick MicroStrategy webcastYellowbrick MicroStrategy webcast
Yellowbrick MicroStrategy webcast
 
Augmenting Amdahl's Second Law for Cost-Effective and Balanced HPC Infrastruc...
Augmenting Amdahl's Second Law for Cost-Effective and Balanced HPC Infrastruc...Augmenting Amdahl's Second Law for Cost-Effective and Balanced HPC Infrastruc...
Augmenting Amdahl's Second Law for Cost-Effective and Balanced HPC Infrastruc...
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Querying Druid in SQL with Superset
Querying Druid in SQL with SupersetQuerying Druid in SQL with Superset
Querying Druid in SQL with Superset
 
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
 
How to Achieve Fast Data Performance in Big Data, Logical Data Warehouse, and...
How to Achieve Fast Data Performance in Big Data, Logical Data Warehouse, and...How to Achieve Fast Data Performance in Big Data, Logical Data Warehouse, and...
How to Achieve Fast Data Performance in Big Data, Logical Data Warehouse, and...
 
RedisConf17 - Lyft - Geospatial at Scale - Daniel Hochman
RedisConf17 - Lyft - Geospatial at Scale - Daniel HochmanRedisConf17 - Lyft - Geospatial at Scale - Daniel Hochman
RedisConf17 - Lyft - Geospatial at Scale - Daniel Hochman
 
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...
 
FPGA-enhanced Bioinformatics @ NECST
FPGA-enhanced Bioinformatics @ NECSTFPGA-enhanced Bioinformatics @ NECST
FPGA-enhanced Bioinformatics @ NECST
 
20181116 Massive Log Processing using I/O optimized PostgreSQL
20181116 Massive Log Processing using I/O optimized PostgreSQL20181116 Massive Log Processing using I/O optimized PostgreSQL
20181116 Massive Log Processing using I/O optimized PostgreSQL
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Hybrid solutions – combining in memory solutions with SSD - Christos Erotocritou
Hybrid solutions – combining in memory solutions with SSD - Christos ErotocritouHybrid solutions – combining in memory solutions with SSD - Christos Erotocritou
Hybrid solutions – combining in memory solutions with SSD - Christos Erotocritou
 
Can data virtualization uphold performance with complex queries?
Can data virtualization uphold performance with complex queries?Can data virtualization uphold performance with complex queries?
Can data virtualization uphold performance with complex queries?
 
At the core you will have KUSTO
At the core you will have KUSTOAt the core you will have KUSTO
At the core you will have KUSTO
 
Getting Started with Amazon Redshift
 Getting Started with Amazon Redshift Getting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
Hw09   Hadoop Based Data Mining Platform For The Telecom IndustryHw09   Hadoop Based Data Mining Platform For The Telecom Industry
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
 

Mehr von MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

Mehr von MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Kürzlich hochgeladen

The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 

Kürzlich hochgeladen (20)

The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 

MongoDB San Francisco 2013:Geo Searches for Healthcare Pricing Data presented by Robert Stewart, Castlight Health

  • 1. CONFIDENTIALCONFIDENTIALCONFIDENTIALCONFIDENTIAL Geo Searches for Health Care Pricing Data Robert Stewart Senior Architect, Castlight Health rstewart@castlighthealth.com @wombatnation 1
  • 2. CONFIDENTIALCONFIDENTIALCONFIDENTIALCONFIDENTIAL Castlight Health The Business and Technical Problems Initial Solution MongoDB, Geo Haystack Index and SSDs Replica Set Flipping 2
  • 3.  Hosted web and mobile applications providing unbiased information on health care cost and quality  Customers are employers and health plans  Founded in 2008, raised $181 million in VC funding  #1 on Wall Street Journal’s list of “Top 50 Venture- Backed Companies” for 2011  Hiring! Castlight Health 3
  • 6. Business Problem 6  Support searches for  Prices for a procedure performed by any in-network provider in a geographical area  Prices for all procedures performed by a single provider  Sub-second response, even if returning data on thousands of prices
  • 7.  Need a very fast geo index  Rate count doubled in last 3 months to 600 million  Major rate updates monthly  Difficult to index data to ensure sequential reads  Sometimes lots of random reads Technical Problems 7
  • 8. Pricing Retrieval Architecture 8 User Castlight Web Browser Mobile Web Browser Native Mobile Application Castlight Web App Castlight Mobile Web App Proxy Service Search Service Pricing Service Prices
  • 9. Initial Solution 9  Store pricing data in MySQL  When Pricing Service starts, create two in-memory indexes and cache most of the rates  55 GB JVM Heap with lots of GC tuning  20-minute service startup time to build indexes  3 hours for background caching of most rates  Trouble Brewing:  Total rates growing quickly  Rolling restart becoming unacceptably slow  If rates not in Java or MySQL cache, retrieval was very slow
  • 11. Geo Indexes 11  Tried standard geo 2D indexes in MongoDB  Too slow for my use case  Geo Haystack index  Conceptually similar  From docs.mongodb.org  “A haystack index is a special index that is optimized to return results over small areas. Haystack indexes improve performance on queries that use flat geometry.”
  • 12. Mercator Projection with 10 degree grid 12
  • 13. Geo Haystack 13  We chose degrees long-lat for x-y coordinate system  25 miles is our default search radius  Roughly 0.5 degrees in middle of the US db.priceables_1.ensureIndex( { loc: "geoHaystack", pm: 1 }, { bucketSize: 0.5 }) db.runCommand( { geoSearch: "priceables_1", near: [-122.4, 37.79], maxDistance: 0.5, search: { pm: 6757 }, limit: 50000 })  maxDistance calculated using great circle algorithm
  • 14. Geo Haystack Pros 14  Very fast when retrieving many documents in a relatively small search radius  Great when you also need to apply a secondary filter  Compound 2dsphere index in Mongo 2.4 has even better support
  • 15. Geo Haystack Cons 15  Supports only one extra filter in index  SERVER-2979  A bug if unindexed query on only the second part of the key  SERVER-8645 > db.priceables_1.find({pm: 6757}) error: { "$err" : "assertion src/mongo/db/geo/haystack.cpp:178" }  Second part of index can’t have an array value  Location part of key can’t be null
  • 16. SSDs 16  For uncached data on HDD, Geo Haystack was twice as fast as custom Java geo index and MySQL  Still close to 1 minute for big queries with full data set  Death by random read  Tested with a $200 Samsung SSD  Typical query dropped to 20 millis  Big query only about 150 millis
  • 17. Random 4k block reads, 5 GB file, 16 threads Mongoperf on SSDs 17 Env SSD Read Ops/s Read MB/s Prod Samsung 200GB SLC 74k 288 QA VM Samsung 200GB SLC 30k 117 Dev Samsung 830 256GB SATA MLC 47k 183 Env SSD Write Ops/s Write MB/s Prod Samsung 200GB SLC 1074 289 QA VM Samsung 200GB SLC 405 196 Dev Samsung 830 256GB SATA MLC 438 210 Sequential write of the 5 GB file
  • 18.  Requirements  Major price updates monthly  Minor updates more frequently  Huge bulk loads with no impact on active replica set  I/O bound, not CPU bound Low Impact Pricing Updates 18
  • 19.  Two replica sets  Lowered cost with two SSDs on each pricing server  scp compressed files from QA to passive replica set  Protip: to compress and uncompress tar cvf - pricing | pigz > ~/pricing.tgz pigz -dc pricing.tgz | tar xvf -  Page in index and data  db.runCommand({ touch: "priceables_1", index: true, data: true })  Pricing Service operation to atomically flip Replica Set Flipping Solution 19
  • 20. Replica Set Architecture 20 Physical Servers Replica Sets prodpricing1 prodpricing2 Server pricing1 mongod 28001 primary mongod 28002 secondary Server pricing2 mongod 28001 secondary mongod 28002 primary Server db1 mongod 28001 arbiter Server db2 mongod 28002 arbiter
  • 21.  Obviously, increased cost, but only for SSDs  Recently added caching of remote pricing lookups  TTL collections  Cache is lost during a flip  But, usually flip late at night  Cache eviction time is only a few hours Replica Set Flipping Drawbacks 21
  • 22.  Geo search speed with cold cache acceptable  Geo search speed with warm cache awesome  Pricing Service startup down to a few seconds  No production impact for major rate updates  Lowered risk for minor rate updates Overall Results 22
  • 23. Summary 23  Geo Haystack Index great for …  Retrieving lots of documents in a constrained search area  Geo searches with a secondary filter  SSDs great for …  Random reads  Reducing need for lots of complex indexes  Replica set flipping great for …  Instant swap of large amounts of data  Primarily, if not solely, read only  Trading cost for operational flexibility

Hinweis der Redaktion

  1. File system was ext4 with mount flags defaults,noatime,discard