This document discusses streaming data between Confluent Cloud and MongoDB Atlas. It provides an overview of MongoDB Atlas and its fully managed database capabilities in the cloud. It then demonstrates how to stream data from a Python generator application to MongoDB Atlas using Confluent Cloud and its connectors. The document promotes using MongoDB Atlas as a turnkey database as a service solution and shows how it can be integrated with Confluent Cloud for streaming data workflows.
3. Agenda
MongoDB Atlas
MongoDB in the Confluent Cloud
Demo
Diagram: MongoDB Atlas (AWS us-east, Virginia) -> Confluent Cloud connectors -> MongoDB Atlas (AWS eu-west, Ireland)
4. MongoDB Adoption Continues to Grow
DB-Engines Rankings: Fastest Growing Database over the past decade
Worldwide Activations
Most Wanted Database: 4 Years Straight (2020 Stack Overflow Developer Survey)
155,000,000+ MongoDB Downloads
1,500,000+ Online Education Course Registrations
1,750,000+ MongoDB Atlas Clusters
1,000+ Technology and Services Partners
24,800+ Customers Across All Industries
5. MongoDB & MongoDB Atlas
Self-hosted MongoDB: You install, patch, maintain, scale, etc.
Turn-key modern database: MongoDB as a service plus a whole lot more…
6. At the core is the database for modern applications
● Transactional guarantees at a global scale
● Intuitive and flexible data model
● Unique data distribution capabilities
● MongoDB Query Language (MQL) is built for nearly any workload
Distributed
Intuitive & Flexible Data Model
Transactional
7. We fully manage it for you in the cloud
● Fully managed database lifecycle with MongoDB Atlas
● Multi-cloud, available in ~80 regions across AWS, GCP, Azure
● Sophisticated security controls and next-gen end user privacy
● Autopilot features such as auto-scale, performance advisor, and more
● Built-in data access, movement, manipulation services for rapid application development
8. Interactive data visualization for MongoDB data
● MongoDB Charts is the fastest and easiest way to create visualizations of MongoDB data
● Share, embed, and collaborate on live data
● Support for the richness of the document model, including nested and hierarchical data
9. Integrated full-text search capabilities
● Atlas Search allows you to implement full-text search on top of your data in the cloud with no need to replicate your data elsewhere and no additional systems to learn or manage
● Atlas Search queries use the MongoDB Query Language
10. Tier, query and analyze your data using MQL
● Auto-archive aged data into Atlas Data Lake
● Blend, query, and analyze the structured / unstructured data in your cloud object storage using the MongoDB Query Language
● Support for federated queries means you can submit a single query and analyze operational data in MongoDB Atlas alongside your data in S3
13. Not ready for “-as-a-service”?
https://www.confluent.io/resources/confluent-platform-reference-architecture-mongodb/
14. Thank you
Rob Walters | MongoDB
robert.walters@mongodb.com
https://www.linkedin.com/in/robwaltersprofile/
https://developer.mongodb.com/community/forums/c/connectors-integrations
Editor's notes
As you may already be aware, Confluent Cloud is a public cloud offering by Confluent that provides Kafka as a service.
MongoDB has a cloud-based offering as well, called MongoDB Atlas. By the end of today’s session you will have a good understanding of MongoDB Atlas, how it can add value to your MongoDB applications, and how to use the MongoDB Atlas Source and Sink connectors within Confluent Cloud for a complete cloud-based solution.
Later in this presentation we will run through a demo that will show you how to leverage the MongoDB Atlas Source and Sink connectors in Confluent Cloud to move data between two geographically distributed MongoDB clusters.
Let me start by discussing MongoDB and the huge success it has been and continues to be, with over 155M downloads and more than 24,800 customers from industries all over the world. It has been recognized in the Stack Overflow Developer Survey as the most wanted database 4 years in a row. MongoDB has a great partnership with Confluent, providing enterprise-scale database capabilities for Kafka solutions. MongoDB is a natural fit for Kafka due to its flexible data model, horizontal scale, and enterprise-class security.
Today I’m going to be discussing MongoDB Atlas. It is important to note that MongoDB and MongoDB Atlas are the same database engine that you use to power your back end today. MongoDB is the self-hosted database engine that you download, install, and configure. MongoDB Atlas not only includes the MongoDB database engine but is also a turn-key cloud-based application platform that provides many value-added features out of the box, such as full-text search, chart visualization, online archiving, and deep integration with our Realm mobile database. This allows you to focus on the business problem you're trying to solve rather than worrying about infrastructure provisioning and maintenance.
For those who haven’t looked at MongoDB in a while a lot has changed.
From a database engine perspective, we added ACID-compliant transactions starting with version 4.0 and have evolved the MongoDB Query Language, enabling teams to ask a wide range of questions of their data and making it suitable for nearly any workload across an organization. Whether you are building a high-traffic website or the next-generation IoT solution that leverages time-series data, MongoDB is a general-purpose database that lets you build these applications quickly and securely.
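As a rough illustration of the multi-document transactions mentioned above, here is a minimal Python sketch. It assumes a pymongo-style client is passed in; the `Stocks.Holdings` namespace, the `Quantity` field, and the function name are hypothetical and are not part of the demo. Multi-document transactions need a replica set or sharded cluster, which Atlas clusters provide.

```python
def transfer_quantity(client, from_id, to_id, qty):
    """Move `qty` units between two documents in one transaction.

    `client` is expected to behave like pymongo.MongoClient.
    The Stocks.Holdings namespace is a hypothetical example.
    """
    holdings = client["Stocks"]["Holdings"]
    # A session scopes the transaction; leaving the inner `with` block
    # normally commits, and an exception aborts the transaction.
    with client.start_session() as session:
        with session.start_transaction():
            holdings.update_one(
                {"_id": from_id}, {"$inc": {"Quantity": -qty}}, session=session
            )
            holdings.update_one(
                {"_id": to_id}, {"$inc": {"Quantity": qty}}, session=session
            )
```

Either both updates become visible or neither does, which is the guarantee the note refers to.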
With MongoDB Atlas we deliver the database as a fully managed cloud service in nearly 80 regions across AWS, GCP, and Azure.
The entire database lifecycle is fully managed. That means continuous availability, monitoring, backup, automation, upgrades — we take care of all of it for our customers.
Atlas comes with defaults that ensure data security and makes it very easy for users to turn on additional optional security features for further peace of mind. And we’re leading the industry with features such as client-side field level encryption (FLE), which ensures end user privacy. FLE works by ensuring that encryption and decryption of your data occur only where you need them: on the client. All data is transmitted and stored in MongoDB encrypted. Since your data is never unencrypted at any point after it leaves your client, you have an added level of protection when leveraging resources like public cloud vendors for infrastructure needs.
Atlas also comes equipped with what we’re calling autopilot features such as auto-scale, index suggestions, and schema suggestions. This helps our customers optimize their resource usage and their usage of the database, with minimal or no effort on their part. This is an area where we will be continuing to invest to further differentiate ourselves from the competition.
We know data by itself isn’t valuable; it is the querying and visualization that add value to your business.
MongoDB Charts is part of Atlas and is designed to work natively with the richly structured data in MongoDB, which can contain nested and hierarchical data. This means you don’t lose any data fidelity like you would if you were flattening the documents to work with most SQL-based data visualization tools. With MongoDB Charts, it’s incredibly easy to share, embed, and collaborate on the live data in MongoDB Atlas.
Another service that is part of Atlas is Atlas Search.
Search is such an integral part of nearly every application.
Atlas Search is built on Apache Lucene, which powers many search platforms today. With Atlas Search, teams can build rich search functionality on top of their data in the cloud without having to learn, deploy, and maintain a separate search technology or the middleware to move data between systems.
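Because Atlas Search queries are expressed as an aggregation stage, a search can be sketched as an ordinary pipeline. The following is a minimal Python sketch; the index name `default`, the field name `company_name`, and the helper name are assumptions for illustration only.

```python
def search_pipeline(query, path, index="default", limit=10):
    """Build an aggregation pipeline that runs a full-text search.

    $search has to be the first stage of the pipeline.
    `index` and `path` are placeholders for a real search index and field.
    """
    return [
        {"$search": {"index": index, "text": {"query": query, "path": path}}},
        {"$limit": limit},
    ]


pipeline = search_pipeline("mongo", "company_name")
# With pymongo and an Atlas cluster that has a search index defined:
#   results = collection.aggregate(pipeline)
```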
And finally, increasingly organizations are moving their data to data lakes built on cloud object storage and often using it as a staging ground for analytics.
Atlas Data Lake allows teams to query and analyze the structured and unstructured data in those cloud object stores using the MongoDB Query Language.
Atlas Data Lake also supports federated queries, which means teams can submit a SINGLE query and analyze the live data in MongoDB Atlas alongside the data in their cloud object stores. So if you have JSON, BSON, CSV, TSV, Avro, ORC, or Parquet files, or even data sunk from Kafka topics, you can query them in place without the complexity, cost, and time of data ingestion and transformation.
Before we get to our end-to-end demo using Confluent Cloud, let’s take a quick tour of MongoDB Atlas.
New Cluster Dialog
Database access
Network access
Connect (paste in cmd shell, connect)
Performance
Data / Network security tab
Metrics / Real Time - Collections - $match { "Quantity" : { $gte : 5 } }, $sort on Country - Profiler / Perf Advisor / Online Archive
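The match-and-sort step in the tour notes above can be sketched as an aggregation pipeline. This is a minimal Python sketch: the `Quantity` and `Country` field names come from the notes, while the ascending sort direction, the helper names, and the sample documents are assumptions. A small pure-Python equivalent is included only to show what the two stages do.

```python
def tour_pipeline(min_quantity=5):
    """Pipeline: keep documents with Quantity >= min_quantity, sort by Country."""
    return [
        {"$match": {"Quantity": {"$gte": min_quantity}}},
        {"$sort": {"Country": 1}},  # 1 = ascending (assumed direction)
    ]


def apply_locally(docs, min_quantity=5):
    """Pure-Python illustration of the same filter and sort."""
    kept = [d for d in docs if d["Quantity"] >= min_quantity]
    return sorted(kept, key=lambda d: d["Country"])


# Hypothetical sample documents
sample = [
    {"Country": "US", "Quantity": 7},
    {"Country": "IE", "Quantity": 5},
    {"Country": "DE", "Quantity": 2},
]
# With pymongo: collection.aggregate(tour_pipeline())
```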
Today we are going to show you how to move data in and out of MongoDB from one cluster in Virginia in the US to another MongoDB cluster in Ireland
Python app writing to Atlas (source) -> Confluent Cloud -> Atlas (sink)
We already created the Atlas clusters, and in the interest of time we created the Confluent Cloud Kafka cluster and have this demo up and running.
Let’s take a look at how it is set up.
CMD Shell->Python application
CMD Shell->Mongo SOURCE query Stocks.StockData
CONFLUENT CLOUD->Connectors, SOURCE CONNECTOR, SINK CONNECTOR
CMD Shell->Mongo SINK
MongoSH Download web link
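The Python application in the demo steps above is not shown in these notes. A minimal sketch of what such a generator might look like follows; the ticker symbols, field names, and price model are all hypothetical, and only the `Stocks.StockData` namespace comes from the notes.

```python
import itertools
import random
from datetime import datetime, timezone

SYMBOLS = ["AAA", "BBB", "CCC"]  # hypothetical tickers


def stock_ticks(seed=None):
    """Yield an endless stream of synthetic stock documents."""
    rng = random.Random(seed)
    prices = {s: 100.0 for s in SYMBOLS}
    while True:
        symbol = rng.choice(SYMBOLS)
        # Random walk, floored at 1.0 so prices stay positive
        prices[symbol] = round(max(1.0, prices[symbol] + rng.uniform(-1, 1)), 2)
        yield {
            "symbol": symbol,
            "price": prices[symbol],
            "tx_time": datetime.now(timezone.utc),
        }


def take(n, seed=None):
    """Return the first n documents from the generator."""
    return list(itertools.islice(stock_ticks(seed), n))


# With pymongo and a real Atlas connection string (placeholder ATLAS_URI):
#   from pymongo import MongoClient
#   coll = MongoClient(ATLAS_URI)["Stocks"]["StockData"]
#   for doc in stock_ticks():
#       coll.insert_one(doc)
```

Each inserted document would then be picked up by the source connector and forwarded through Kafka to the sink cluster.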
If you’re not ready for the cloud or you have an existing on-prem application, feel free to use the MongoDB Connector for Apache Kafka: download it from Confluent Hub and install it into Kafka Connect. Here is a reference architecture to help you with deployment.
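For the self-managed route, a source and sink pair for the MongoDB Connector for Apache Kafka might be configured roughly as sketched below in Python. The connector names, connection strings, and topic prefix are placeholders. The property names are the ones the connector documents, but check them against the connector version you install. The fully managed Confluent Cloud connectors use a different set of property names.

```python
import json


def source_config(uri, database, collection, prefix):
    """Kafka Connect config for a MongoDB source connector."""
    return {
        "name": "mongo-source",  # placeholder connector name
        "config": {
            "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
            "connection.uri": uri,
            "database": database,
            "collection": collection,
            "topic.prefix": prefix,
            # Emit only the changed document, not the full change event
            "publish.full.document.only": "true",
        },
    }


def sink_config(uri, database, collection, topic):
    """Kafka Connect config for a MongoDB sink connector."""
    return {
        "name": "mongo-sink",  # placeholder connector name
        "config": {
            "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
            "connection.uri": uri,
            "topics": topic,
            "database": database,
            "collection": collection,
        },
    }


def source_topic(prefix, database, collection):
    """The source connector publishes to <prefix>.<database>.<collection>."""
    return f"{prefix}.{database}.{collection}"


topic = source_topic("demo", "Stocks", "StockData")
src = source_config("mongodb+srv://<virginia-cluster>/", "Stocks", "StockData", "demo")
snk = sink_config("mongodb+srv://<ireland-cluster>/", "Stocks", "StockData", topic)
print(json.dumps(src, indent=2))
```

Each dictionary can be serialized to JSON and submitted to the Kafka Connect REST API's `/connectors` endpoint.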