SlideShare ist ein Scribd-Unternehmen logo
1 von 10
Downloaden Sie, um offline zu lesen
Search data store for the world's largest
                            biometric identity system


                    Regunath Balasubramanian         Shashikant Soni
                      regunathb@gmail.com      soni.shashikant@gmail.com
                       twitter @regunathb




CONFIDENTIAL: For limited circulation only                                 Slide 1
India
● 1.2 billion residents
   ● 640,000 villages, ~60% lives under $2/day
   ● ~75% literacy, <3% pays Income Tax, <20% banking
   ● ~800 million mobile, ~200-300 mn migrant workers

● Govt. spends about $25-40B on direct subsidies
   ● Residents have no standard identity document
   ● Most programs plagued with ghost and multiple identities causing
     leakage of 30-40%




                                                                        Slide 2
Aadhaar
● Create a common ‘national identity’ for every ‘resident’
   ●Biometric backed identity to eliminate duplicates
   ●‘Verifiable online identity’ for portability
● Applications ecosystem using open APIs
   ●Aadhaar enabled bank account and payment platform
   ●Aadhaar enabled electronic, paperless KYC (Know Your
     Customer)




                                                             Slide 3
Search Requirements
● Multi-attribute query like:
   name contains ‘regunath’ AND city = ‘bangalore’ AND
   address contains ‘J P Nagar’ AND YearOfBirth = ……


● Search 1.2B resident data with photo, history
   ●35Kb - Average record size
● Response times in milliseconds
● Open scale out


                                                         Slide 4
Why MongoDB
● Auto-sharding
● Replication
● Failover
   … Essentially an AP (slaveOk) data store in CAP parlance

● Evolving schema
● Map-Reduce for analysis
● Full text search
   ●Compound (or) multi-keys


                                                              Slide 5
Design

               { _id:123456789, name: ‘abcde’, year:1980, ….. }
    MongoDB         2

                                             Search API                                  Client App
                                                                  Name=‘abcde’
    Solr            1
                                                                  Address=‘some place’
  Indexes     Name: ‘abcde’                                       Year= 1980
              Address: ‘some place’
              year: 1980



● Read/Search
   ●Sharded Solr indexes for search
   ●Keyed document read from MongoDB
● Write
   ●Eventual consistency (across data sources) driven by
    application
   ●Composite MongodDB-Solr app persistence handler                                                   Slide 6
Implementation and Deployment
   ● Start - 4M records in 2 shards
   Current - 250M records in 8 shards ( 8 x ~2 TB x 3 replicas)
   ● Performance , Reliability & Durability
      ●SlaveOk
      ●getLastError, Write Concern: availability vs durability
          j = journaling
          w = nodes-to-write
   ● Replica-sets / Shards – how?
            RS 1                RS 1              RS 1
            Rs 2                                  RS 2              RS 2

Primary
                     Config 1          Config 2          Config 3
Secondary

Arbiter               Router           Router            Router
                                                                           Slide 7
Monitoring and Troubleshooting
● Monitoring tools evaluated
   ●MMS
   ●munin
● Manual approach - daily ritual
   ●RS, DB, config, router - health and stats
● Problem analysis stats
   ●mongostat, iostat, currentOps, logs
   ●Client connections
● Stats for storage, shards addition
   ●Data file size
   ●Shard data distribution
   ●Replication
                                                Slide 8
Key Learnings on MongoDB
● Indexing 32 fields
   ●Compound indexes
   ●Multi-keys indexes
       {…"indexes" : [{ "email":"john.doe@email.com", "phone":"123456789“ }] }
       db.coll.find ({ "indexes.email" : "john.doe@email.com" })
   ●Indexes use b-tree
   ●Many fields to index
   ●Performs well upto 1-2M documents
   ●Best if index fits in memory
● Data replication, RS failover
   ●Rollback when RS goes out of sync
       Manual restore (physical data copy)
       Restarting a very stale node
                                                                            Slide 9
Questions?



                    Regunath Balasubramanian               Shashikant Soni
                      regunathb@gmail.com            soni.shashikant@gmail.com
                       twitter @regunathb




CONFIDENTIAL: For limited circulation only                                       Slide 10

Weitere ähnliche Inhalte

Was ist angesagt?

Analysis on Zomato
Analysis on ZomatoAnalysis on Zomato
Analysis on ZomatoAdarsh NJ
 
A project on impact of post sales service on customer satisfaction at hero honda
A project on impact of post sales service on customer satisfaction at hero hondaA project on impact of post sales service on customer satisfaction at hero honda
A project on impact of post sales service on customer satisfaction at hero hondaBabasab Patil
 
From zero to zomato in five years
From zero to zomato in five yearsFrom zero to zomato in five years
From zero to zomato in five yearsManohar Gupta
 
Dabur Summer Internship
Dabur Summer InternshipDabur Summer Internship
Dabur Summer Internshipjkn6301
 
5 trillion economy reality to india
5 trillion economy reality to india5 trillion economy reality to india
5 trillion economy reality to indiaVanshGupta62
 
India frozen food market
India frozen food marketIndia frozen food market
India frozen food marketRenub Research
 
Zomato: An Overview of its Globalization Strategies
Zomato: An Overview of its Globalization StrategiesZomato: An Overview of its Globalization Strategies
Zomato: An Overview of its Globalization StrategiesKashyap Shah
 
Zomato: Transforming the Global Restaurant Business
Zomato: Transforming the Global Restaurant BusinessZomato: Transforming the Global Restaurant Business
Zomato: Transforming the Global Restaurant BusinessJeffrey Funk Business Models
 
A project on mc donalds
A project on mc donaldsA project on mc donalds
A project on mc donaldsProjects Kart
 
Business Plan - Chatpata Tiffin
Business Plan - Chatpata TiffinBusiness Plan - Chatpata Tiffin
Business Plan - Chatpata TiffinKartik Jain
 
Business idea for Finilite
Business idea for Finilite Business idea for Finilite
Business idea for Finilite Abhishek Kumar
 
Customer awareness @ sbi mutual fund project report mba marketing
Customer awareness @ sbi mutual fund project report mba marketingCustomer awareness @ sbi mutual fund project report mba marketing
Customer awareness @ sbi mutual fund project report mba marketingBabasab Patil
 
fundamentalanalysisofFMCGsector.docx
fundamentalanalysisofFMCGsector.docxfundamentalanalysisofFMCGsector.docx
fundamentalanalysisofFMCGsector.docxKeshavKumar985749
 

Was ist angesagt? (20)

Analysis on Zomato
Analysis on ZomatoAnalysis on Zomato
Analysis on Zomato
 
Zomato presentation
Zomato presentationZomato presentation
Zomato presentation
 
A project on impact of post sales service on customer satisfaction at hero honda
A project on impact of post sales service on customer satisfaction at hero hondaA project on impact of post sales service on customer satisfaction at hero honda
A project on impact of post sales service on customer satisfaction at hero honda
 
From zero to zomato in five years
From zero to zomato in five yearsFrom zero to zomato in five years
From zero to zomato in five years
 
Zomato
ZomatoZomato
Zomato
 
ITC Dark Fantacy
ITC Dark FantacyITC Dark Fantacy
ITC Dark Fantacy
 
Dabur Summer Internship
Dabur Summer InternshipDabur Summer Internship
Dabur Summer Internship
 
5 trillion economy reality to india
5 trillion economy reality to india5 trillion economy reality to india
5 trillion economy reality to india
 
Marketing strategy of zomato
Marketing strategy of zomatoMarketing strategy of zomato
Marketing strategy of zomato
 
India frozen food market
India frozen food marketIndia frozen food market
India frozen food market
 
Zomato: An Overview of its Globalization Strategies
Zomato: An Overview of its Globalization StrategiesZomato: An Overview of its Globalization Strategies
Zomato: An Overview of its Globalization Strategies
 
Zomato
Zomato Zomato
Zomato
 
Analysis of FMCG sector in India
Analysis of FMCG sector in IndiaAnalysis of FMCG sector in India
Analysis of FMCG sector in India
 
Zomato: Transforming the Global Restaurant Business
Zomato: Transforming the Global Restaurant BusinessZomato: Transforming the Global Restaurant Business
Zomato: Transforming the Global Restaurant Business
 
A project on mc donalds
A project on mc donaldsA project on mc donalds
A project on mc donalds
 
Business Plan - Chatpata Tiffin
Business Plan - Chatpata TiffinBusiness Plan - Chatpata Tiffin
Business Plan - Chatpata Tiffin
 
Business idea for Finilite
Business idea for Finilite Business idea for Finilite
Business idea for Finilite
 
Customer awareness @ sbi mutual fund project report mba marketing
Customer awareness @ sbi mutual fund project report mba marketingCustomer awareness @ sbi mutual fund project report mba marketing
Customer awareness @ sbi mutual fund project report mba marketing
 
Tiffin treat complete project
Tiffin treat complete projectTiffin treat complete project
Tiffin treat complete project
 
fundamentalanalysisofFMCGsector.docx
fundamentalanalysisofFMCGsector.docxfundamentalanalysisofFMCGsector.docx
fundamentalanalysisofFMCGsector.docx
 

Andere mochten auch

MongoDB at eBay
MongoDB at eBayMongoDB at eBay
MongoDB at eBayMongoDB
 
Aadhaar at 5th_elephant_v3
Aadhaar at 5th_elephant_v3Aadhaar at 5th_elephant_v3
Aadhaar at 5th_elephant_v3Regunath B
 
Hadoop at aadhaar
Hadoop at aadhaarHadoop at aadhaar
Hadoop at aadhaarRegunath B
 
Knowledge Engines and AI – Applications Beyond Gaming
Knowledge Engines and AI – Applications Beyond GamingKnowledge Engines and AI – Applications Beyond Gaming
Knowledge Engines and AI – Applications Beyond GamingMecklerMedia
 
AADHAR SECURE TRAVEL IDENETITY
AADHAR SECURE TRAVEL IDENETITYAADHAR SECURE TRAVEL IDENETITY
AADHAR SECURE TRAVEL IDENETITYPratik Vipul
 
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBay
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBayStoring eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBay
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBayMongoDB
 
Introduction To HBase
Introduction To HBaseIntroduction To HBase
Introduction To HBaseAnil Gupta
 
Step by-step presentation on digital payments
Step by-step presentation on digital paymentsStep by-step presentation on digital payments
Step by-step presentation on digital paymentsMahantesh Biradar
 
India - A Cashless Economy (NPCI/UPI)
India - A Cashless Economy (NPCI/UPI)India - A Cashless Economy (NPCI/UPI)
India - A Cashless Economy (NPCI/UPI)Aravind Krishnaswamy
 
ppt on aadhar card project
ppt on aadhar card projectppt on aadhar card project
ppt on aadhar card projectPooja Verma
 

Andere mochten auch (11)

MongoDB at eBay
MongoDB at eBayMongoDB at eBay
MongoDB at eBay
 
Aadhaar at 5th_elephant_v3
Aadhaar at 5th_elephant_v3Aadhaar at 5th_elephant_v3
Aadhaar at 5th_elephant_v3
 
Hadoop at aadhaar
Hadoop at aadhaarHadoop at aadhaar
Hadoop at aadhaar
 
Knowledge Engines and AI – Applications Beyond Gaming
Knowledge Engines and AI – Applications Beyond GamingKnowledge Engines and AI – Applications Beyond Gaming
Knowledge Engines and AI – Applications Beyond Gaming
 
AADHAR SECURE TRAVEL IDENETITY
AADHAR SECURE TRAVEL IDENETITYAADHAR SECURE TRAVEL IDENETITY
AADHAR SECURE TRAVEL IDENETITY
 
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBay
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBayStoring eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBay
Storing eBay's Media Metadata on MongoDB, by Yuri Finkelstein, Architect, eBay
 
Three Big Data Case Studies
Three Big Data Case StudiesThree Big Data Case Studies
Three Big Data Case Studies
 
Introduction To HBase
Introduction To HBaseIntroduction To HBase
Introduction To HBase
 
Step by-step presentation on digital payments
Step by-step presentation on digital paymentsStep by-step presentation on digital payments
Step by-step presentation on digital payments
 
India - A Cashless Economy (NPCI/UPI)
India - A Cashless Economy (NPCI/UPI)India - A Cashless Economy (NPCI/UPI)
India - A Cashless Economy (NPCI/UPI)
 
ppt on aadhar card project
ppt on aadhar card projectppt on aadhar card project
ppt on aadhar card project
 

Ähnlich wie Mongo for aadhaar

Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...
Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...
Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...Alexey Zinoviev
 
MongoDB and Web Scrapping with the Gyes Platform
MongoDB and Web Scrapping with the Gyes PlatformMongoDB and Web Scrapping with the Gyes Platform
MongoDB and Web Scrapping with the Gyes PlatformMongoDB
 
Cacheconcurrencyconsistency cassandra svcc
Cacheconcurrencyconsistency cassandra svccCacheconcurrencyconsistency cassandra svcc
Cacheconcurrencyconsistency cassandra svccsrisatish ambati
 
An introduction to MongoDB by César Trigo #OpenExpoDay 2014
An introduction to MongoDB by César Trigo #OpenExpoDay 2014An introduction to MongoDB by César Trigo #OpenExpoDay 2014
An introduction to MongoDB by César Trigo #OpenExpoDay 2014OpenExpoES
 
An introduction to MongoDB
An introduction to MongoDBAn introduction to MongoDB
An introduction to MongoDBCésar Trigo
 
Graph databases in computational bioloby: case of neo4j and TitanDB
Graph databases in computational bioloby: case of neo4j and TitanDBGraph databases in computational bioloby: case of neo4j and TitanDB
Graph databases in computational bioloby: case of neo4j and TitanDBAndrei KUCHARAVY
 
Zeotap: Moving to ScyllaDB - A Graph of Billions Scale
Zeotap: Moving to ScyllaDB - A Graph of Billions ScaleZeotap: Moving to ScyllaDB - A Graph of Billions Scale
Zeotap: Moving to ScyllaDB - A Graph of Billions ScaleScyllaDB
 
Zeotap: Moving to ScyllaDB - A Graph of Billions Scale
Zeotap: Moving to ScyllaDB - A Graph of Billions ScaleZeotap: Moving to ScyllaDB - A Graph of Billions Scale
Zeotap: Moving to ScyllaDB - A Graph of Billions ScaleSaurabh Verma
 
Bringing code to the data: from MySQL to RocksDB for high volume searches
Bringing code to the data: from MySQL to RocksDB for high volume searchesBringing code to the data: from MySQL to RocksDB for high volume searches
Bringing code to the data: from MySQL to RocksDB for high volume searchesIvan Kruglov
 
There is Javascript in my SQL
There is Javascript in my SQLThere is Javascript in my SQL
There is Javascript in my SQLPGConf APAC
 
Postgrtesql as a NoSQL Document Store - The JSON/JSONB data type
Postgrtesql as a NoSQL Document Store - The JSON/JSONB data typePostgrtesql as a NoSQL Document Store - The JSON/JSONB data type
Postgrtesql as a NoSQL Document Store - The JSON/JSONB data typeJumping Bean
 
Using Spring with NoSQL databases (SpringOne China 2012)
Using Spring with NoSQL databases (SpringOne China 2012)Using Spring with NoSQL databases (SpringOne China 2012)
Using Spring with NoSQL databases (SpringOne China 2012)Chris Richardson
 
Back to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLBack to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLMongoDB
 
Back to Basics Webinar 1 - Introduction to NoSQL
Back to Basics Webinar 1 - Introduction to NoSQLBack to Basics Webinar 1 - Introduction to NoSQL
Back to Basics Webinar 1 - Introduction to NoSQLJoe Drumgoole
 
MongoDB Versatility: Scaling the MapMyFitness Platform
MongoDB Versatility: Scaling the MapMyFitness PlatformMongoDB Versatility: Scaling the MapMyFitness Platform
MongoDB Versatility: Scaling the MapMyFitness PlatformMongoDB
 
Joker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data ScientistJoker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data ScientistAlexey Zinoviev
 
FOSDEM 2019: M3, Prometheus and Graphite with metrics and monitoring in an in...
FOSDEM 2019: M3, Prometheus and Graphite with metrics and monitoring in an in...FOSDEM 2019: M3, Prometheus and Graphite with metrics and monitoring in an in...
FOSDEM 2019: M3, Prometheus and Graphite with metrics and monitoring in an in...Rob Skillington
 
Social Analytics with MongoDB
Social Analytics with MongoDBSocial Analytics with MongoDB
Social Analytics with MongoDBPatrick Stokes
 
What Kiwi.com Has Learned Running ScyllaDB and Go
What Kiwi.com Has Learned Running ScyllaDB and GoWhat Kiwi.com Has Learned Running ScyllaDB and Go
What Kiwi.com Has Learned Running ScyllaDB and GoScyllaDB
 

Ähnlich wie Mongo for aadhaar (20)

MongoDB Basics Unileon
MongoDB Basics UnileonMongoDB Basics Unileon
MongoDB Basics Unileon
 
Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...
Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...
Python's slippy path and Tao of thick Pandas: give my data, Rrrrr...
 
MongoDB and Web Scrapping with the Gyes Platform
MongoDB and Web Scrapping with the Gyes PlatformMongoDB and Web Scrapping with the Gyes Platform
MongoDB and Web Scrapping with the Gyes Platform
 
Cacheconcurrencyconsistency cassandra svcc
Cacheconcurrencyconsistency cassandra svccCacheconcurrencyconsistency cassandra svcc
Cacheconcurrencyconsistency cassandra svcc
 
An introduction to MongoDB by César Trigo #OpenExpoDay 2014
An introduction to MongoDB by César Trigo #OpenExpoDay 2014An introduction to MongoDB by César Trigo #OpenExpoDay 2014
An introduction to MongoDB by César Trigo #OpenExpoDay 2014
 
An introduction to MongoDB
An introduction to MongoDBAn introduction to MongoDB
An introduction to MongoDB
 
Graph databases in computational bioloby: case of neo4j and TitanDB
Graph databases in computational bioloby: case of neo4j and TitanDBGraph databases in computational bioloby: case of neo4j and TitanDB
Graph databases in computational bioloby: case of neo4j and TitanDB
 
Zeotap: Moving to ScyllaDB - A Graph of Billions Scale
Zeotap: Moving to ScyllaDB - A Graph of Billions ScaleZeotap: Moving to ScyllaDB - A Graph of Billions Scale
Zeotap: Moving to ScyllaDB - A Graph of Billions Scale
 
Zeotap: Moving to ScyllaDB - A Graph of Billions Scale
Zeotap: Moving to ScyllaDB - A Graph of Billions ScaleZeotap: Moving to ScyllaDB - A Graph of Billions Scale
Zeotap: Moving to ScyllaDB - A Graph of Billions Scale
 
Bringing code to the data: from MySQL to RocksDB for high volume searches
Bringing code to the data: from MySQL to RocksDB for high volume searchesBringing code to the data: from MySQL to RocksDB for high volume searches
Bringing code to the data: from MySQL to RocksDB for high volume searches
 
There is Javascript in my SQL
There is Javascript in my SQLThere is Javascript in my SQL
There is Javascript in my SQL
 
Postgrtesql as a NoSQL Document Store - The JSON/JSONB data type
Postgrtesql as a NoSQL Document Store - The JSON/JSONB data typePostgrtesql as a NoSQL Document Store - The JSON/JSONB data type
Postgrtesql as a NoSQL Document Store - The JSON/JSONB data type
 
Using Spring with NoSQL databases (SpringOne China 2012)
Using Spring with NoSQL databases (SpringOne China 2012)Using Spring with NoSQL databases (SpringOne China 2012)
Using Spring with NoSQL databases (SpringOne China 2012)
 
Back to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLBack to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQL
 
Back to Basics Webinar 1 - Introduction to NoSQL
Back to Basics Webinar 1 - Introduction to NoSQLBack to Basics Webinar 1 - Introduction to NoSQL
Back to Basics Webinar 1 - Introduction to NoSQL
 
MongoDB Versatility: Scaling the MapMyFitness Platform
MongoDB Versatility: Scaling the MapMyFitness PlatformMongoDB Versatility: Scaling the MapMyFitness Platform
MongoDB Versatility: Scaling the MapMyFitness Platform
 
Joker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data ScientistJoker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data Scientist
 
FOSDEM 2019: M3, Prometheus and Graphite with metrics and monitoring in an in...
FOSDEM 2019: M3, Prometheus and Graphite with metrics and monitoring in an in...FOSDEM 2019: M3, Prometheus and Graphite with metrics and monitoring in an in...
FOSDEM 2019: M3, Prometheus and Graphite with metrics and monitoring in an in...
 
Social Analytics with MongoDB
Social Analytics with MongoDBSocial Analytics with MongoDB
Social Analytics with MongoDB
 
What Kiwi.com Has Learned Running ScyllaDB and Go
What Kiwi.com Has Learned Running ScyllaDB and GoWhat Kiwi.com Has Learned Running ScyllaDB and Go
What Kiwi.com Has Learned Running ScyllaDB and Go
 

Mehr von MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

Mehr von MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Mongo for aadhaar

  • 1. Search data store for the world's largest biometric identity system Regunath Balasubramanian Shashikant Soni regunathb@gmail.com soni.shashikant@gmail.com twitter @regunathb CONFIDENTIAL: For limited circulation only Slide 1
  • 2. India ● 1.2 billion residents ● 640,000 villages, ~60% lives under $2/day ● ~75% literacy, <3% pays Income Tax, <20% banking ● ~800 million mobile, ~200-300 mn migrant workers ● Govt. spends about $25-40B on direct subsidies ● Residents have no standard identity document ● Most programs plagued with ghost and multiple identities causing leakage of 30-40% Slide 2
  • 3. Aadhaar ● Create a common ‘national identity’ for every ‘resident’ ●Biometric backed identity to eliminate duplicates ●‘Verifiable online identity’ for portability ● Applications ecosystem using open APIs ●Aadhaar enabled bank account and payment platform ●Aadhaar enabled electronic, paperless KYC (Know Your Customer) Slide 3
  • 4. Search Requirements ● Multi-attribute query like: name contains ‘regunath’ AND city = ‘bangalore’ AND address contains ‘J P Nagar’ AND YearOfBirth = …… ● Search 1.2B resident data with photo, history ●35Kb - Average record size ● Response times in milliseconds ● Open scale out Slide 4
  • 5. Why MongoDB ● Auto-sharding ● Replication ● Failover … Essentially an AP (slaveOk) data store in CAP parlance ● Evolving schema ● Map-Reduce for analysis ● Full text search ●Compound (or) multi-keys Slide 5
  • 6. Design { _id:123456789, name: ‘abcde’, year:1980, ….. } MongoDB 2 Search API Client App Name=‘abcde’ Solr 1 Address=‘some place’ Indexes Name: ‘abcde’ Year= 1980 Address: ‘some place’ year: 1980 ● Read/Search ●Sharded Solr indexes for search ●Keyed document read from MongoDB ● Write ●Eventual consistency (across data sources) driven by application ●Composite MongodDB-Solr app persistence handler Slide 6
  • 7. Implementation and Deployment ● Start - 4M records in 2 shards Current - 250M records in 8 shards ( 8 x ~2 TB x 3 replicas) ● Performance , Reliability & Durability ●SlaveOk ●getLastError, Write Concern: availability vs durability  j = journaling  w = nodes-to-write ● Replica-sets / Shards – how? RS 1 RS 1 RS 1 Rs 2 RS 2 RS 2 Primary Config 1 Config 2 Config 3 Secondary Arbiter Router Router Router Slide 7
  • 8. Monitoring and Troubleshooting ● Monitoring tools evaluated ●MMS ●munin ● Manual approach - daily ritual ●RS, DB, config, router - health and stats ● Problem analysis stats ●mongostat, iostat, currentOps, logs ●Client connections ● Stats for storage, shards addition ●Data file size ●Shard data distribution ●Replication Slide 8
  • 9. Key Learnings on MongoDB ● Indexing 32 fields ●Compound indexes ●Multi-keys indexes  {…"indexes" : [{ "email":"john.doe@email.com", "phone":"123456789“ }] }  db.coll.find ({ "indexes.email" : "john.doe@email.com" }) ●Indexes use b-tree ●Many fields to index ●Performs well upto 1-2M documents ●Best if index fits in memory ● Data replication, RS failover ●Rollback when RS goes out of sync  Manual restore (physical data copy)  Restarting a very stale node Slide 9
  • 10. Questions? Regunath Balasubramanian Shashikant Soni regunathb@gmail.com soni.shashikant@gmail.com twitter @regunathb CONFIDENTIAL: For limited circulation only Slide 10