MongoDB has been successfully used by PetroCloud for their industrial IoT platform since they reengineered their stack in 2013, moving from multiple languages and MySQL to a unified JavaScript and JSON stack with MongoDB at its core. MongoDB's flexibility, high availability, and ease of use have supported PetroCloud's growth to over 3,500 devices generating 250,000 daily events. Recently, PetroCloud migrated their MongoDB deployment to the MongoDB Atlas cloud database service to reduce management overhead as their data and usage grew.
2. Who am I? Where do I work?
Customers
● Luis Lobo Borobia
○ Senior Software Engineer @ PetroCloud since 2013
● PetroCloud
○ Real-Time Oil Field Automation and Security Platform
○ Started in 2012 by Lance White, our CEO
○ From the ground up, an Industrial IoT company
○ Today we are the market leader and our customers are all
household names in Oil and Gas, and Electric utilities
3. What will we be talking about?
- Industrial IOT implementation example
- MySQL -> MongoDB -> MongoDB Atlas
- New tool on the block! DA-Tracer
4. Some history
How we started?
● 2012
● Python
● PHP
● MySQL
● Some devices
● Small AWS deployment
5. Not ready for our vision
● Hard to connect to different devices and protocols
● Too many languages - few developers
● Static web application, server side rendered
We knew we had to re-engineer and rethink our stack, and by the end
of 2013 we started the process
6. Key factors during our re-engineering process
● Integration
○ Connect/integrate any kind of devices (hardware agnostic)
○ Different protocols
● One Language to rule them all
○ JavaScript replaced Python, PHP, SQL
○ JSON for all our data structures
7. Key factors (continued…)
● People
○ AngularJS and Node.js were trending, and we wanted to attract talent
● Platform
○ Build a robust Platform to support our business
○ Browser based, Event driven, Real Time application
○ Integration of Video with Industrial Processes
○ Easy and robust database, Javascript friendly
8. And the database engine that won was...
MongoDB!
● High Availability
● Almost zero maintenance
● JSON and Javascript friendly
● Schema(less) flexibility
9. How did we re-engineer a working
production system?
● Parallel system
● Iterative process
● One feature at a time
● Cut off
11. MongoDB: Rock Solid
● Always available and up
● The service was never down
● … except in one particular situation
○ Love your MongoDB Cloud Manager Backup!
■ https://docs.cloudmanager.mongodb.com/
tutorial/nav/backup-deployments/
12. And then, we moved to MongoDB Atlas!
What does MongoDB Atlas offer?
● Support!
● Dedicated instances instead of shared ones
● Security compliance
● Performance analysis
● Ability to expand with a snap
● Database infrastructure management
○ No need to worry about it
○ Who knows how to do it better than
MongoDB!
○ Security
○ Encryption
Why?
● We are 10 developers and one
dev-ops
● We have our own infrastructure
to care about: more than 3.5K
industrial remote devices
generating 250K events a day
● Tips on moving to Atlas
○ Drivers
○ A/B system
14. Schema(less) and
stats
● Event Model
○ Date
○ Name
○ Equipment Type
○ Data JSON
● Event collection
○ 234 million documents
○ Average of 250K events per
day, 7.5M events a month
○ … and growing!
○ About 640 GB local storage
○ Index size 29GB
● Devices
○ Around 3.5K different devices
{
  "id": "5ba5049e1d69b32a09094000",
  "eventName": "cameras/motion",
  "description": "Motion occurred on Jones Camera",
  "equipment": "54257bbee6066fa8504ac000",
  "data": {
    "video": "Motion.1.1537541248",
    "media": {
      "images": [],
      "thumbs": [],
      "videos": []
    }
  },
  "eventDate": "2018-09-21T14:47:48.242Z",
  "flagged": false,
  "dismissed": false,
  "deleted": false,
  "createdAt": "2018-09-21T14:47:58.558Z",
  "updatedAt": "2018-09-21T14:48:01.316Z",
  "version": "1.11.2"
}
15. Volume = hard to manage data and trace
devices
- Thousands of devices
- Tons of app servers
- Tons of logs to tail
… So what now?!
16. Change Streams... What the luck!
● Change Streams
○ Change streams allow applications to access real-time data
changes without tailing the oplog.
○ Applications subscribe to all data changes on a single
collection, a database, or an entire deployment, and
immediately react to them
○ https://docs.mongodb.com/manual/changeStreams/
17. Change Streams... What the luck!
● Some sample code
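const collection = db.collection('event');
const changeStream = collection.watch();
// Option 1: pull changes one at a time, like a cursor
const next = await changeStream.next();
// Option 2: subscribe to the 'change' event instead
// (pick one style per stream; recent Node.js drivers reject mixing them)
changeStream.on('change', console.log);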
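The sample above watches every change on the collection. As a minimal sketch of how such a stream could be narrowed, assuming we only care about certain operation types and a given `eventName`, the helper below builds a `$match` pipeline to pass to `watch()`. The function names `buildChangeFilter` and `watchEvents` are hypothetical and not part of the talk or of the driver; `collection` is assumed to be a MongoDB Node.js driver collection.

```javascript
// Build an aggregation pipeline that narrows a change stream.
// `eventName` mirrors the field in the sample event document.
function buildChangeFilter({ eventName, operationTypes = ['insert'] } = {}) {
  const match = { operationType: { $in: operationTypes } };
  if (eventName) {
    // Change events expose the affected document under `fullDocument`.
    match['fullDocument.eventName'] = eventName;
  }
  return [{ $match: match }];
}

// `collection` is assumed to expose the driver's watch(pipeline, options).
function watchEvents(collection, filter, onChange) {
  const changeStream = collection.watch(buildChangeFilter(filter), {
    // Ask the server to attach the current document to update events too.
    fullDocument: 'updateLookup',
  });
  changeStream.on('change', onChange);
  return changeStream; // the caller should eventually call close()
}
```

For example, `watchEvents(db.collection('event'), { eventName: 'cameras/motion' }, console.log)` would log only newly inserted camera motion events.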
21. Wrapping up
- Industrial IOT Implementation
- Re Engineering Process
- Python, PHP, SQL to Javascript,
JSON
- Migration to MongoDB Atlas
- DA Tracer
- What’s next?
- Check Repo README.md
- Your Pull Request is welcome!
Contact:
Luis Lobo Borobia
Twitter: @luislobo
GitHub: @luislobo
Link to DATracer:
https://github.com/luislobo/da-tracer
AWS diagrams made using
https://cloudcraft.co
Thanks!
Editor's notes
Hi, my name is Luis Lobo Borobia, I have been working for PetroCloud since August 2013.
PetroCloud is a Real-Time Oil Field Automation and Security Platform, focused primarily in Oil and Gas, and Electric utility verticals! There are other verticals in the works as we speak.
It all started in 2012. Lance White had some oilfields he needed to monitor, but their locations were not convenient at all.
He wanted to see the people working on them, as well as check some values from the tanks.
He then contacted a friend, Martin Apesteguia, who was a co-worker from a previous company, and started the project.
Later on, PetroCloud was born as a company.
Later on, in August 2013, Martin called me to join the project.
At that time PetroCloud had its very first customers, and we were a very small start-up, with 4 developers.
A long time has passed (I know, just 5 years, but that is like... an eternity for Software Projects, right?) and now we are a solid company.
We had a big capital injection in December 2017, and we keep growing each month.
So, to give you some heads up, we will be talking about:
Industrial IOT implementation example, where we were, where we are now
Moving from MySQL to MongoDB and recently to MongoDB Atlas
And about a new tool I developed for tracing MongoDB operations
Back in the day, we were working with Python inside our Router, PHP for our Web Application, and MySQL as our database engine.
At that time, the current concepts of IOT, mesh networks, edge devices, were not something that everyone talked about.
We knew what we wanted to do, and tried to combine the right hardware elements with the right software components to get our job done.
And we succeeded to a point, where we were happy and started having our first customers.
But we knew that the software stack we had would not scale or be flexible enough for us to grow into what we had envisioned.
With our current stack, it was hard to connect to different devices and protocols
We had too many languages for the few developers, and if we were to grow, we would prefer that any developer could work at any level of our software stack, at least be good at reading it.
Our current application was a Static web application, server side rendered, not very flexible.
We knew we had to re engineer and rethink our stack and by the end of 2013 we started the process
Key factors during our re engineering process
# Integration
Basically, for us to grow in this field, we need to *integrate with whatever device* or feature our customers need. That’s how we create new features: our customers drive our business. We need to be able to handle all the different protocols used by industrial devices.
# One language to rule them all
Also, we were developing in four languages: Python, Javascript, PHP and SQL. We were really small, we wanted new talent. Our software stack was PHP based. Nothing wrong with that, I still love PHP!, but it was using Code Igniter.
Yeah, I hear you. Not the best MVC based framework around, but it worked for us at the beginning.
We wanted to focus on developing in one single language, so that anyone in the team could jump into Front-End development, Backend-API development, Device development. And that language was Javascript, and the data format was JSON.
Our chosen solutions were and still are: MongoDB as our platform database, Node.js with Sails.js as our platform API and RCU software, and finally Angular JS (we are migrating to Angular) as our Frontend development framework.
Key factors (continued…)
# People
In the end, it's all about the people. Most of the developers we were in contact with at the time were looking at Angular JS opportunities, and Node.js was the new kid on the block. We wanted to be on a platform that allowed us to attract great developers, and to be appealing. And let's be honest, would you go back and develop apps in Code Igniter?
### Platform
Finally, we wanted to build a platform, not just a "web page". For that, we wanted an independent API, a responsive SPA web application, our RCU (Remote Computing Unit) software, and video job processing.
At that time, I had previous experience with some NoSQL databases, and I always liked these features of MongoDB:
- High Availability is super easy with Replica Sets
- Almost zero maintenance (compared to other database infrastructures)
- JSON and Javascript friendly
- Schema(less) flexibility
# Parallel System
We moved our PHP application into maintenance mode. That was the live app our users had.
Then we started developing in parallel the new Node.js based API, with a new front end built on top of Angular.js.
We started creating first the new skeleton and then, in an iterative process, creating the different features we had in our legacy PHP application.
One of the things that we implemented in this new API and Frontend was the extensive use of Websockets.
When we matched the same features we had in our app, we rolled it over, and removed access to the old system. We kept it for a while, just in case there were any events that needed to be accessed but at some point we finally killed it.
All new development was done in the new software stack.
After some (painful) months, we ended up migrating our system from PHP, Python, MySQL, to a platform 100% javascript based.
Choosing a Node.js framework at that time was not that easy. I've always liked MVC frameworks. I was delighted working with Yii Framework, one of the best PHP frameworks I know. And I wanted something similar for our API.
After some research, we chose Sails.js. It has an ORM, called Waterline, a nice structure to work on, and it let us focus on developing the software solution rather than thinking about "how to do the things right in Javascript".
Waterline has a very nice adapter for MongoDB, and it really works great for us. We have been contributors for that project ever since, improving both Sails and their MongoDB adapters when possible/necessary.
I can say that these last years have been, technologically, some of the most fun of my life. We have a solution that can drive tons of disparate devices from different brands, all integrated into a single Platform.
MongoDB has been such a great tool for us, we think of it as one of the most important parts of our solution.
You know why? Because we almost never hear about it!
In the last 4 years, the only time it's been down, was because of human error. Actually, because of one error.
Actually because of me. Yeah. I messed up with the database, at the beginning.
The good thing is that one thing I always pushed forward is having a great backup mechanism. Even paranoid.
Since we were happy MongoDB Cloud Manager users, we were back in business really quickly.
Also, because of how our infrastructure is designed, we store events at the edge when the platform is unreachable.
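To illustrate the idea of storing events at the edge while the platform is unreachable, here is a minimal store-and-forward sketch. It is not PetroCloud's RCU code: the class name `EdgeEventBuffer` and the injected `send` function are assumptions made for illustration, and a real device would persist the queue to disk rather than keep it in memory.

```javascript
class EdgeEventBuffer {
  constructor(send) {
    this.send = send;      // async function that delivers one event
    this.pending = [];     // events waiting for the platform, oldest first
    this.flushing = false; // guards against overlapping flushes
  }

  // Queue the event, then try to deliver everything that is pending.
  async report(event) {
    this.pending.push(event);
    return this.flush();
  }

  // Deliver pending events in order, stopping at the first failure.
  // Resolves to true only when this call emptied the queue.
  async flush() {
    if (this.flushing) return false; // another flush is draining the queue
    this.flushing = true;
    try {
      while (this.pending.length > 0) {
        try {
          await this.send(this.pending[0]);
        } catch {
          return false; // platform unreachable: keep events for later
        }
        this.pending.shift(); // drop the event only after it was delivered
      }
      return true;
    } finally {
      this.flushing = false;
    }
  }
}
```

Calling `flush()` on a timer or when connectivity returns would replay the stored events in their original order.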
One of the things we wanted to have since our re-engineering process was to have a (near) real-time system.
Today, if an event happens, your event panel populates with a very basic event that is augmented as more information about it
arrives.
For example, we get a tank level high event. You can immediately see the data about it, but the video footage
takes a little longer to process, so you first get the data, then the event is augmented with media information that
comes from the cameras around the area where the tank is located.
This is possible because the information flows from our remote RCUs to the platform, and from there to the browser
and to the job services that process our snapshots and videos.
All of this is built into our different device, API and front end components.
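As a rough illustration of that augmentation step, the sketch below merges late-arriving media into an event that was already stored. The field names follow the sample event document from the slides; the function name `augmentEvent` and the shape of the `media` argument are assumptions, and the real platform also pushes the update to the browser over websockets, which is not shown here.

```javascript
// Merge late-arriving media into an event without mutating the original.
function augmentEvent(event, media, now = new Date()) {
  const current = (event.data && event.data.media) || {};
  const merged = {};
  for (const kind of ['images', 'thumbs', 'videos']) {
    // Keep what the event already has and append the new items.
    merged[kind] = [...(current[kind] || []), ...(media[kind] || [])];
  }
  return {
    ...event,
    data: { ...event.data, media: merged },
    updatedAt: now.toISOString(),
  };
}
```

Returning a new object keeps the basic event intact, so the first version can be shown immediately and replaced when the augmented one is ready.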
Because of the nature of our business, we have so many different devices reporting data, we really need a very flexible schema.
MongoDB excels at it. We have a mixture between flexible and fixed schema.
With Sails.js, specifically Waterline, you can design your Models to have a structure. And that is what we do. We have a specific Event structure. But we also have our device reported data into one attribute that can be anything.
In yellow you can see attributes that are part of the schema of our event collection, and in bright green the ones that are not defined in our event collection schema. What is in data is flexible and depends on the equipment type.
From sensors readings: temperature, oil levels, audio noise decibels, to information about videos and snapshots.
It lets us accommodate any situation, as needed.
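The mix of fixed and flexible schema can be sketched in plain JavaScript as follows. This is not the Waterline model used in the platform: which attributes are fixed is an assumption based on the Event Model slide (date, name, equipment, data), and the names `FIXED_FIELDS` and `validateEvent` are hypothetical.

```javascript
// Fixed part of the event schema: field name -> expected typeof.
const FIXED_FIELDS = {
  eventName: 'string',
  equipment: 'string',
  eventDate: 'string',
};

// Returns a list of problems; an empty list means the event is acceptable.
function validateEvent(event) {
  const errors = [];
  for (const [field, type] of Object.entries(FIXED_FIELDS)) {
    if (typeof event[field] !== type) {
      errors.push(`${field} must be a ${type}`);
    }
  }
  if (typeof event.eventDate === 'string' && Number.isNaN(Date.parse(event.eventDate))) {
    errors.push('eventDate must be a parseable date');
  }
  // `data` is the flexible part: any object, shaped by the equipment type.
  const data = event.data;
  if (data === null || typeof data !== 'object' || Array.isArray(data)) {
    errors.push('data must be an object');
  }
  return errors;
}
```

A camera event and a tank level event would both pass this check even though their `data` payloads have nothing in common.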
The second I heard of Change Streams I said that! Or something similar...
I could not believe my eyes... That is something I would have liked to have so many years ago!
We have close to a thousand RCUs, about 3.5K industrial devices/features managed, all sending information about what's
relevant to them. We use websockets to keep all of this updated. But...
I always wanted to have a tool during the development process that would allow me to see any change in the database
as soon as it happens, and be able to filter it... It wasn't possible at the time... until now!
## MongoDB Tracer
Let me introduce a tiny, super, very alpha version of a tool I've been working on over the last few weeks.
I had the idea of building it since the second I read about Change Streams, but never had the time/excuse to build it.
So, thank you MongoDB .local, you were the perfect excuse.
So here it is: a tool that will show you the documents as they come in right from the database, as a tracer tool, a
real-time tool that is user friendly, with filters and all the bells and whistles.
Let me show how it works.
[demo]
(Open up webapp)
First and foremost, remember, this is an alpha version, bear with me.
Now, let's get into it:
- we need to provide a connection URI to the database we want
- Once you do that, you need to select a collection.
- Let's add this 'event' collection.
- As you can see... nothing happens... gosh...
- But that is intended! We need to start feeding events now.
- I have built this simple web poll, that will send events to the same database DA-Tracer is connected to.
- I encourage you to go to https://poll.luislobo.xyz and answer the poll so that we can start seeing some results
- And all of this is right from the database
- It uses Sails.js, WebSockets, Vue
- And the best of it… It’s open source under an MIT license.