NoSQL matters in
Catchoom Recognition Service

                                                    David Arcos
                           david.arcos@catchoom.com | @DZPM
                                     catchoom.com | @catchoom
1) Introduction
         2) What did we need?
         3) How we built it
         4) Advantages of NoSQL
         5) Cool uses of NoSQL
         6) Limits
         7) Conclusion




David Arcos | @DZPM               catchoom.com | @catchoom
Hi! I'm David Arcos

     - Python/Django developer (>4yr)

     - Web backend, distributed systems,
     databases, scalability, security

     - Team leader at Catchoom

     - You can follow me at @DZPM



Catchoom technology recognizes an
      object by searching through a large
      collection of images in a fraction of a
      second.

      Catchoom targets application
      developers and integrators.


Our customers are leaders in Augmented Reality




Visual Recognition:

      “Identify an object in front of the camera by comparing it
      to a huge collection of reference images”




Examples of recognized objects:

      - CD/DVD and book covers
      - Newspapers and magazines
      - Logos and brands
      - Posters
      - Packaged goods
      - Monuments and places




Catchoom Recognition Service:

       - Cloud-based Visual Recognition (SaaS)
       - RESTful API to integrate
       - “Add VR features to your app/platform”




- Small team of 4 developers, working with Scrum




2) What did we need?




Minimum requirements:

      - a public API for the end users to perform Visual
      Recognition

      - a private API for the customer to manage the
      Collections and get statistics

      - a nice website for the customer, providing the
      functionality of both APIs

Must be flexible:

      - A customer who does Augmented Reality, and
      needs a 3D model (binary format) in the item

      - Another one who needs just the item id


      - Our data model needs to allow everything
      (structured and unstructured data)

Must be reliable:

      - Images or data should never be lost

      - Avoid single points of failure

      - We need redundancy




Must be very fast:

      “Layar has been using Catchoom’s Visual Search technology since the
      launch of Layar Vision, allowing users to quickly view the AR content placed
      on top of images by just pointing their camera to the image.

      We’ve benchmarked Catchoom’s technology in 2011 against 3 of their main
      competitors and found they had the best results both on speed and on
      successful matches (including lowest false positives)”

      Dirk Groten – CTO of Layar




3) How we built it




Technology stack:
       - Development: Python, Django, Tornado, Gevent
       - Deployed using: Supervisord, Nginx, gunicorn, Fabric
       - AWS: EC2, S3, ELB




The Panel:

      - typical customer portal:

          - manage your Collections, run Visual Recognition

          - get usage statistics

          - and configure the payment method :)


Mobile apps:

      - for Android, iOS

      - use the Visual Recognition API

      - the code will be published




Data models:

      - Collection: a set of items. Has at least one token.

      - Item: has at least one Image. Has metadata.

      - Image: you want several images if the item has different
      sides, logos, flavours...

      - Token: for authenticating the requests.
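The relationships above can be written down as plain Python classes. This is only a sketch under assumptions: the class layout and the field names (image_id, item_id, tokens, accepts) are hypothetical and not taken from the Catchoom code base. Only the "at least one" rules come from the slide.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Image:
    # One reference picture of an item (a side, a logo, a flavour...).
    image_id: str


@dataclass
class Item:
    item_id: str
    images: List[Image]
    # Free-form payload: may be JSON-like data or a binary blob.
    metadata: Any = None

    def __post_init__(self) -> None:
        if not self.images:
            raise ValueError("an Item needs at least one Image")


@dataclass
class Collection:
    name: str
    tokens: List[str]
    items: List[Item] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.tokens:
            raise ValueError("a Collection needs at least one token")

    def accepts(self, token: str) -> bool:
        # A request is authenticated by presenting one of the tokens.
        return token in self.tokens
```

Keeping metadata untyped mirrors the flexibility requirement: one customer may store a 3D model there, another nothing at all.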

Components:

      - the platform is highly modular

      - “Do one thing, and do it well”

      - components exchange JSON messages

      - optimized hardware settings


      - Frontend: gets the API request

      - Extractor: extracts the visual points

      - Collector: message exchange

      - Searcher: looks for matches

Required NoSQL features:

      - key-value storage
      - cache
      - message lists
      - message pub/sub
      - real-time analysis

      What servers have we chosen?

      - Redis, for all of the features above

      - and a filesystem: S3

4) Advantages of NoSQL




Performance:

      - We can't afford to write to disk or to query slow databases

      - Using Redis, everything stays in memory

      - One Visual Recognition query takes just 300 ms




Scalability:

      - Need to scale different components, separately

      - Load balancing using Redis Lists:
         BLPOP: Remove and get
         the first element in a list,
         or block until one is available


      - But focus on the bottlenecks!
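A minimal sketch of the BLPOP pattern, assuming a redis-py client (for example redis.Redis()) is passed in as client. The helper names enqueue and next_job and the JSON payload format are made up for illustration; rpush and blpop are the redis-py methods for the RPUSH and BLPOP commands.

```python
import json


def enqueue(client, queue, message):
    # RPUSH appends to the tail, so BLPOP (which reads the head) sees FIFO order.
    client.rpush(queue, json.dumps(message))


def next_job(client, queue, timeout=5):
    # BLPOP blocks until an element is available or the timeout (seconds) expires.
    # Each element is delivered to exactly one of the waiting workers.
    reply = client.blpop(queue, timeout=timeout)
    if reply is None:      # timed out, nothing to do
        return None
    _key, raw = reply      # redis-py returns a (list name, value) pair
    return json.loads(raw)
```

With this shape, scaling one component means starting more worker processes that loop on next_job against the same list; no separate load balancer is involved.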

Unstructured data: query

      - A query object has many optional parameters
         - each component can add/remove fields dynamically
         - the schema changes between versions

      - Can't fit in a SQL table

      - We model the query in Redis as JSON
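As a rough illustration of a schema-less query object, the sketch below stores the whole query as one JSON string under one key. It assumes a redis-py client passed in as client; the "query:" key prefix and the helper names are hypothetical, while set and get are the redis-py methods for SET and GET.

```python
import json


def save_query(client, query_id, query):
    # The whole query is one JSON string under one key; no schema to migrate.
    client.set("query:" + query_id, json.dumps(query))


def load_query(client, query_id):
    raw = client.get("query:" + query_id)
    return None if raw is None else json.loads(raw)


def update_query(client, query_id, add=None, remove=()):
    # A component adds or removes fields without knowing the full schema.
    query = load_query(client, query_id) or {}
    query.update(add or {})
    for name in remove:
        query.pop(name, None)
    save_query(client, query_id, query)
    return query
```

Note that update_query is a plain read-modify-write and is not atomic; that is acceptable here only under the assumption that a single component works on a given query at a time.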


Unstructured data: metadata

      - Metadata is optional and unstructured: it can be anything from JSON
      to a binary blob

      - Can't fit in a SQL table, and would be too slow

      - Serve the data from Redis, and use S3 as a backup

      - Warning: in the future, if we have huge metadata files,
      Redis will run out of memory. We'll improve this approach

Availability:

      - Avoid single points of failure. Replicate everything!

      - Replicating a SQL server is painful

      - Redis instances configured as Master/Slave
         - When the master dies:
            - promote a slave to be the new master
            - reconfigure the other slaves to use this new master
         - Redis Sentinel automates this (still in beta)

5) Cool uses of NoSQL




Do real-time calculations:

      - Usage statistics
         - total, monthly, daily, hourly
         - per image, item or collection

      - Metric monitoring for internal use
         - response times, queue size, etc

      - QoS: enforce rate limiting
         - max hits per minute
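One way such counters and limits could look, as a sketch only: it assumes a redis-py client passed in as client, and the key layouts ("hits:..." and "rate:...") and function names are invented for this example. incr and expire are the redis-py methods for INCR and EXPIRE. The limiter is a simple fixed-window scheme, which is an assumption and not necessarily what Catchoom runs.

```python
import time


def record_hit(client, collection, now=None):
    # One counter per granularity; INCR creates a missing key starting from 0.
    t = time.gmtime(now)
    for fmt in ("%Y-%m", "%Y-%m-%d", "%Y-%m-%dT%H"):
        client.incr("hits:%s:%s" % (collection, time.strftime(fmt, t)))
    client.incr("hits:%s:total" % collection)


def allow_request(client, token, max_per_minute, now=None):
    # Fixed-window limiter: one counter per token and per minute.
    now = time.time() if now is None else now
    key = "rate:%s:%d" % (token, int(now // 60))
    count = client.incr(key)
    if count == 1:
        client.expire(key, 120)  # let old windows disappear on their own
    return count <= max_per_minute
```

Reading a statistic is then a single GET on the matching key, so the numbers are available in real time without any batch job.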

Sorted Sets:

      - To create indexes and filters

      - For example, “Most recognized images” (sorted by hits)

      - Updating the Sorted Set needs no reconsolidation:
           ZADD: Add one or more members to a sorted set,
           or update its score if it already exists
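A sketch of the "most recognized images" index, assuming a redis-py client (version 3.0 or later, because of the zincrby argument order) passed in as client. The key name "images:by_hits" and the function names are hypothetical. The sketch uses ZINCRBY instead of the ZADD quoted above, because it increments an existing score by one; ZADD would set an absolute score.

```python
def count_recognition(client, image_id):
    # ZINCRBY bumps the score in place; the set stays ordered, no rebuild needed.
    return client.zincrby("images:by_hits", 1, image_id)


def most_recognized(client, limit=10):
    # Highest scores first; returns (member, score) pairs.
    return client.zrevrange("images:by_hits", 0, limit - 1, withscores=True)
```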




Cache:

      - Redis can be used as a cache in place of memcached

      - Cache everything:
         - Sessions, metadata, etc

      - ...although the website is internal: no bottleneck here
          - Better focus on optimizing other stuff!


Volatile data:

      - Redis can set an expiration time on a key

      - Very easy for:
         - implementing timeouts
         - removing old queries
         - adding temporary capping
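Removing old queries, for instance, might be as small as the sketch below. It assumes a redis-py client passed in as client; the "query:" key prefix, the function name and the default of 300 seconds are illustrative values, not figures from the talk. The ex argument of redis-py's set maps to the EX option of SET.

```python
import json


def remember_query(client, query_id, query, ttl_seconds=300):
    # SET ... EX stores the value and its time-to-live in a single command;
    # Redis deletes the key by itself once the TTL has elapsed.
    client.set("query:" + query_id, json.dumps(query), ex=ttl_seconds)
```

No cleanup job is needed: a query that nobody picked up simply disappears when its time is up.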




Messages:

      - Redis implements pub/sub and lists.

          - Publish/Subscribe to a channel
             - all components get the message
             - use it for monitoring

          - List: push/pop messages
             - only one component gets the message
             - use the blocking versions for load balancing
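The publish/subscribe side could be sketched as follows, assuming a redis-py client passed in as client. The function names and the JSON encoding are assumptions made for this example; publish, pubsub, subscribe and listen are the redis-py calls. Unlike a list, where one consumer takes each element, every subscriber of the channel receives each event.

```python
import json


def broadcast(client, channel, event):
    # PUBLISH: every current subscriber of the channel receives the event.
    return client.publish(channel, json.dumps(event))


def listen(client, channel):
    # Generator that yields decoded events for as long as the connection lives.
    pubsub = client.pubsub()
    pubsub.subscribe(channel)
    for message in pubsub.listen():
        if message["type"] == "message":  # skip subscribe confirmations
            yield json.loads(message["data"])
```

A monitoring process can iterate over listen(client, "some-channel") without the other components knowing it exists, which is what makes pub/sub a good fit for monitoring.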

6) Limits




Django apps compatibility:

      - we use Django and several contrib and external apps.
         - (“Standing on the shoulders of giants”)

      - but there is no support for NoSQL in the Django ORM

      - dropping SQL is not an option!

      - we use MySQL, with South for migrations.

7) Conclusion




Summary:

      - We use a combination of SQL and NoSQL

      - Using NoSQL was necessary to meet the requirements

      - There are a lot of different uses for NoSQL




Recommendations:

      - There is no silver bullet

      - Use the best tool for each task

      - But avoid unneeded complexity!

      - Try Redis. Don't do a migration, just add it to your stack


Thanks for attending!

      - Our beta will be ready soon.
      Get a free trial at http://catchoom.com

      - Contact me at
      david.arcos@catchoom.com

      - Questions?

How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 

NoSQL matters in Catchoom Recognition Service

  • 1. NoSQL matters in Catchoom Recognition Service David Arcos david.arcos@catchoom.com | @DZPM catchoom.com | @catchoom
  • 2. 1) Introduction 2) What did we need? 3) How we built it 4) Advantages of NoSQL 5) Cool uses of NoSQL 6) Limits 7) Conclusion
  • 3. Hi! I'm David Arcos - Python/Django developer (>4yr) - Web backend, distributed systems, databases, scalability, security - Team leader at Catchoom - You can follow me at @DZPM
  • 4. Catchoom technology recognizes an object by searching through a large collection of images in a fraction of a second. Catchoom targets application developers and integrators.
  • 5. Our customers are leaders in Augmented Reality
  • 6. Visual Recognition: “Identify an object in front of the camera by comparing it to a huge collection of reference images”
  • 7. Examples of recognized objects: - CD/DVD and book covers - Newspapers and magazines - Logos and brands - Posters - Packaged goods - Monuments and places
  • 8. Catchoom Recognition Service: - Cloud-based Visual Recognition (SaaS) - RESTful API to integrate - “Add VR features to your app/platform”
  • 9. - Small team of 4 developers, doing Scrum
  • 10. 1) Introduction 2) What did we need? 3) How we built it 4) Advantages of NoSQL 5) Cool uses of NoSQL 6) Limits 7) Conclusion
  • 11. Minimum requirements: - a public API for the final users to perform Visual Recognition - a private API for the customer to manage the Collections and get statistics - a nice website for the customer, providing the functionality of both APIs
  • 12. Must be flexible: - One customer does Augmented Reality and needs a 3D model (binary format) in the item - Another one needs just the item id - Our data model needs to allow everything (structured and unstructured data)
  • 13. Must be reliable: - Images or data should never be lost - Avoid single points of failure - We need redundancy
  • 14. Must be very fast: “Layar has been using Catchoom’s Visual Search technology since the launch of Layar Vision, allowing users to quickly view the AR content placed on top of images by just pointing their camera to the image. We’ve benchmarked Catchoom’s technology in 2011 against 3 of their main competitors and found they had the best results both on speed and on successful matches (including lowest false positives)” Dirk Groten – CTO of Layar
  • 15. 1) Introduction 2) What did we need? 3) How we built it 4) Advantages of NoSQL 5) Cool uses of NoSQL 6) Limits 7) Conclusion
  • 16. Technology stack: - Development: Python, Django, Tornado, Gevent - Deployed using: Supervisord, Nginx, gunicorn, Fabric - AWS: EC2, S3, ELB
  • 17. The Panel: - typical customer portal: - manage your Collections, run Visual Recognition - get usage statistics - and configure the payment method :)
  • 20. Mobile apps: - for Android, iOS - use the Visual Recognition API - the code will be published
  • 21. Data models: - Collection: a set of items. Has at least one token. - Item: has at least one Image. Has metadata. - Image: you want several images if the item has different sides, logos, flavours... - Token: for authenticating the requests.
  • 22. Components: - the platform is highly modular - “Do one thing, and do it well” - they pass JSON messages - optimized hardware settings
  • 23. - Frontend: gets the API request - Extractor: extracts the visual points - Collector: message exchange - Searcher: looks for matches
  • 24. Required NoSQL features: - key-value storage - cache - message lists - message pub/sub - real-time analysis What servers have we chosen?
  • 25. Required NoSQL features: - key-value storage - cache - message lists - message pub/sub - real-time analysis
  • 26. Required NoSQL features: - key-value storage - cache - message lists - message pub/sub - real-time analysis - and Filesystem:
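The data model above can be sketched as plain Python classes. This is only an illustration of the stated constraints (at least one token per Collection, at least one Image per Item, optional free-form metadata); the class and field names are assumptions, not the actual Catchoom models.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Image:
    image_id: str


@dataclass
class Item:
    item_id: str
    images: List[Image]              # several images for sides, logos, flavours...
    metadata: Optional[Any] = None   # optional: JSON-like value or binary blob

    def __post_init__(self) -> None:
        if not self.images:
            raise ValueError("an Item needs at least one Image")


@dataclass
class Collection:
    name: str
    tokens: List[str]                # tokens authenticate the API requests
    items: List[Item] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.tokens:
            raise ValueError("a Collection needs at least one token")
```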
  • 27. 1) Introduction 2) What did we need? 3) How we built it 4) Advantages of NoSQL 5) Cool uses of NoSQL 6) Limits 7) Conclusion
  • 28. Performance: - Can't afford to write to disk or query slow databases - With Redis, everything stays in memory - One Visual Recognition query takes just 300 ms
  • 29. Scalability: - Need to scale each component separately - Load balancing using Redis Lists: BLPOP (“Remove and get the first element in a list, or block until one is available”) - But focus on the bottlenecks!
  • 30. Unstructured data: query - A query object has many optional parameters - each component can add/remove fields dynamically - the schema changes between versions - Can't fit in a SQL table - We model the query in Redis as JSON
  • 31. Unstructured data: metadata - Metadata is optional and unstructured; it can be anything from JSON to a binary blob - Can't fit in a SQL table, and would be too slow - Serve the data from Redis, and use S3 as a backup - Warning: in the future, if we have huge metadata files, Redis will run out of memory. We'll improve this approach
  • 32. Availability: - Avoid single points of failure. Replicate everything! - Replicating a SQL server is painful - Redis instances configured as Master/Slave - When the master dies: - promote a slave to be the new master - reconfigure the other slaves to use this new master - Redis Sentinel does this (beta)
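A minimal sketch of load balancing with a Redis List, assuming a redis-py style client passed in as `r` (for example `redis.Redis()`). The queue key and the function names are hypothetical. Because BLPOP hands each element to exactly one waiting client, starting more worker processes spreads the jobs across them.

```python
import json

QUEUE = "queue:extractor"  # hypothetical key name


def enqueue(r, job):
    """Producer: append a JSON-encoded job to the tail of the shared list."""
    r.rpush(QUEUE, json.dumps(job))


def work_once(r, handle, timeout=5):
    """Worker: block until a job is available (or the timeout expires).

    Returns whatever `handle` returns, or None if no job arrived in time.
    """
    popped = r.blpop(QUEUE, timeout=timeout)
    if popped is None:          # timed out, nothing to do
        return None
    _key, raw = popped          # BLPOP returns a (key, value) pair
    return handle(json.loads(raw))
```

A worker would typically call `work_once` in a loop; scaling one component then means running more copies of that loop.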
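One way to picture the query object, assuming a redis-py style client `r`: the query is a JSON string under a single key, and each component reads it, adds or removes fields, and writes it back. The key layout and helper names are made up for illustration, and the read-modify-write shown here is not atomic.

```python
import json


def save_query(r, query_id, query):
    """Store the whole query as one JSON string."""
    r.set("query:%s" % query_id, json.dumps(query))


def load_query(r, query_id):
    """Return the query dict, or None if the key does not exist."""
    raw = r.get("query:%s" % query_id)
    return None if raw is None else json.loads(raw)


def annotate(r, query_id, **fields):
    """A component adds its own fields; no schema migration is involved."""
    query = load_query(r, query_id) or {}
    query.update(fields)
    save_query(r, query_id, query)
    return query
```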
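A sketch of serving metadata from Redis with a durable copy elsewhere. It assumes a redis-py style client `r` and two caller-supplied functions, `store_backup` and `fetch_backup`, that stand in for S3 writes and reads. Falling back to the backup on a Redis miss is an assumption of this sketch; the slide only says S3 is used as a backup.

```python
def put_metadata(r, item_id, blob, store_backup):
    """Write the durable copy first, then the in-memory copy."""
    store_backup(item_id, blob)               # e.g. an S3 upload
    r.set("metadata:%s" % item_id, blob)


def get_metadata(r, item_id, fetch_backup):
    """Serve from Redis; on a miss, reload from the backup and re-cache."""
    key = "metadata:%s" % item_id
    blob = r.get(key)
    if blob is None:
        blob = fetch_backup(item_id)          # e.g. an S3 download
        if blob is not None:
            r.set(key, blob)
    return blob
```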
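The two failover steps can be written out by hand as below. This is only a sketch of the manual procedure that Sentinel automates, not of how Sentinel itself works. It assumes redis-py style clients, where `slaveof()` with no arguments sends SLAVEOF NO ONE; the function name is hypothetical.

```python
def fail_over(new_master, other_slaves, host, port):
    """Manual failover.

    `new_master` and `other_slaves` are client objects for the surviving
    instances; `host` and `port` are the address of `new_master`.
    """
    new_master.slaveof()               # SLAVEOF NO ONE: promote to master
    for slave in other_slaves:
        slave.slaveof(host, port)      # replicate from the new master
```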
  • 33. 1) Introduction 2) What did we need? 3) How we built it 4) Advantages of NoSQL 5) Cool uses of NoSQL 6) Limits 7) Conclusion
  • 34. Do real-time calculations: - Usage statistics - total, monthly, daily, hourly - per image, item or collection - Metric monitoring for internal use - response times, queue size, etc - QoS: enforce rate limiting - max hits per minute
  • 35. Sorted Sets: - To create indexes and filters - For example, “Most recognized images” (sorted by hits) - Updating the Sorted Set needs no reconsolidation: ZADD (“Add one or more members to a sorted set, or update its score if it already exists”)
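A sketch of both ideas with plain counters, assuming a redis-py style client `r`. Usage statistics become one INCR per scope and time bucket, and rate limiting is a fixed one-minute window built from INCR plus EXPIRE. The key layout and function names are invented for the example.

```python
from datetime import datetime


def stat_keys(collection, item, image, when):
    """One counter key per scope (collection/item/image) and time bucket."""
    buckets = [
        "total",
        when.strftime("%Y-%m"),          # monthly
        when.strftime("%Y-%m-%d"),       # daily
        when.strftime("%Y-%m-%dT%H"),    # hourly
    ]
    scopes = [
        "collection:%s" % collection,
        "item:%s" % item,
        "image:%s" % image,
    ]
    return ["hits:%s:%s" % (s, b) for s in scopes for b in buckets]


def record_hit(r, collection, item, image, when):
    """Increment every counter; a pipeline would batch these round trips."""
    for key in stat_keys(collection, item, image, when):
        r.incr(key)


def allow_request(r, token, max_per_minute, now):
    """Fixed-window rate limit: count the hits in the current minute."""
    key = "rate:%s:%s" % (token, now.strftime("%Y-%m-%dT%H:%M"))
    count = r.incr(key)
    if count == 1:
        r.expire(key, 60)   # the counter removes itself after the window
    return count <= max_per_minute
```

Reading a statistic is then a single GET on the matching key, with no aggregation query.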
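A sketch of a “most recognized images” index, assuming a redis-py style client `r`. It uses ZINCRBY, a sibling of the ZADD command quoted above, to bump the score on every hit, so the ranking is always current and never needs rebuilding. The key name and helper names are hypothetical.

```python
RANKING = "images:by_hits"  # hypothetical key name


def record_recognition(r, image_id):
    """Add 1 to the image's score, creating the member if needed."""
    return r.zincrby(RANKING, amount=1, value=image_id)


def most_recognized(r, n=10):
    """Top-n images by hits, highest first, as (member, score) pairs."""
    return r.zrevrange(RANKING, 0, n - 1, withscores=True)
```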
  • 36. Cache: - Redis can stand in for memcached as a cache backend (it does not speak the memcached protocol itself) - Cache everything: - Sessions, metadata, etc - ...although the website is internal: no bottleneck here - Better to focus on optimizing other stuff!
  • 37. Volatile data: - Redis can set an expiration time for a value - Very easy for: - implementing timeouts - removing old queries - adding temporary capping
  • 38. Messages: - Redis implements pub/sub and lists. - Publish/Subscribe to a channel - all components get the message - use it for monitoring - List: push/pop messages - only one component gets the message - use the blocking versions for load balancing
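A sketch of “removing old queries” through expiration, assuming a redis-py style client `r`. The value is written together with a time to live, so Redis deletes it without any cleanup job. The key layout, the helper name, and the one-hour default are assumptions.

```python
import json


def save_temporary_query(r, query_id, query, ttl_seconds=3600):
    """Store the query and let Redis drop it after `ttl_seconds`."""
    r.set("query:%s" % query_id, json.dumps(query), ex=ttl_seconds)
```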
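A sketch of the pub/sub side, assuming a redis-py style client `r`. Every subscriber of the channel receives each published event, which suits monitoring; the List approach delivers each message to a single consumer instead. The channel name and helper names are hypothetical.

```python
import json

CHANNEL = "monitoring"  # hypothetical channel name


def publish_event(r, event):
    """Broadcast an event to every component subscribed to the channel."""
    return r.publish(CHANNEL, json.dumps(event))


def decode_event(message):
    """Decode one redis-py pub/sub message dict.

    Subscribe/unsubscribe confirmations have a different "type" and are
    skipped by returning None.
    """
    if message is None or message.get("type") != "message":
        return None
    return json.loads(message["data"])


# Subscriber loop with redis-py (not run here, it needs a server):
#   p = r.pubsub()
#   p.subscribe(CHANNEL)
#   for message in p.listen():
#       event = decode_event(message)
```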
  • 39. 1) Introduction 2) What did we need? 3) How we built it 4) Advantages of NoSQL 5) Cool uses of NoSQL 6) Limits 7) Conclusion
  • 40. Django apps compatibility: - we use Django and several contrib and external apps. - (“Standing on the shoulders of giants”) - but there is no support for NoSQL in the Django ORM - dropping SQL is not an option! - we use MySQL, with South migrations.
  • 41. 1) Introduction 2) What did we need? 3) How we built it 4) Advantages of NoSQL 5) Cool uses of NoSQL 6) Limits 7) Conclusion
  • 42. Summary: - We use a combination of SQL and NoSQL - Using NoSQL was necessary to meet the requirements - There are a lot of different uses for NoSQL
  • 43. Recommendations: - There is no silver bullet - Use the best tool for each task - But avoid unneeded complexity! - Try Redis. Don't do a migration, just add it to your stack
  • 44. Thanks for attending! - Our beta will be ready soon. Get a free trial at http://catchoom.com - Contact me at david.arcos@catchoom.com - Questions?

Editor's notes

  1. Looks easy?
  2. (timestamps, the image index, debug info...)
  3. Efficiency. Totals, per month, per day, per image, per item, per collection. Response times, queue size. Redis can replace memcached as a cache. Avoid hitting the db.
  4. Efficiency. No need to consolidate.