4. • Founded in 1973
• Leading provider of higher education for working adults
• Parent company of
– University of Phoenix
– Apollo Global
– Carnegie Learning
– College of Financial Planning
– Institute for Professional Development
• Educates over 350,000 students per year
6. • Scalability. We were unable to scale our current system to
support the anticipated number of users and volume of
content, both of which would increase significantly as we added
applications to the platform.
• Technology Fit. Much of the data targeted for the platform
was semi-structured and thus not a natural fit for relational
databases.
• Experience. We have experience developing and maintaining
traditional databases, with defined processes, but we don’t
know how many of these skills will transfer to MongoDB.
8. Implementing a new repository solution introduces
new areas of need, such as:
• Planning and deploying a solution
• Defining operational procedures
• Designing object models
• Choosing a MongoDB client and frameworks
• Measuring effectiveness
9. Columns: Conference | 10gen Training | 10gen Consulting | Lab Env.
Run Book (Deploy): X X X
Run Book (Maintenance): X X X
Object Model: X X X X
Measure Effectiveness: X
Java Client: X
14. Configuration and Results
• A: Clients on Same Machine
– Typical Response Time: 0-1.7 ms
– Maximum Throughput: 9,000 queries/sec. CPU-bound.
– Typical CPU Utilization: 100%
• B: Clients and MongoDB on Separate Amazon EC2 Instances
– Typical Response Time: 1.2-8.5 ms
– Maximum Throughput: 12,000 queries/sec
– Typical CPU Utilization: 80%
• C: Clients and MongoDB in Separate Availability Zones, but within One Amazon EC2 Region
– Typical Response Time: 1.2-10.6 ms
– Maximum Throughput: 12,200 queries/sec
– Typical CPU Utilization: 85%
– Approximately the same response time, throughput, and CPU utilization as Configuration B.
• D: Clients and MongoDB in Different Amazon EC2 Regions
– Typical Response Time: 85.6-87.3 ms
– Maximum Throughput: 1,600 queries/sec
– Typical CPU Utilization: 2%. Very low; EC2 instance was unstressed.
– The east coast-west coast network was the bottleneck in this configuration; the EC2 instances were not stressed. Response times were much higher than when instances were located within a single Amazon EC2 region (configurations B & C).
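As a rough illustration of how figures like the response times and throughput above can be collected, the following Python sketch times repeated calls to a query function and reports the response-time range and the queries per second. It is not the harness we used: `run_query` is a hypothetical stand-in for a real MongoDB call, and a single-threaded loop like this would understate the maximum throughput of a server driven by many concurrent clients.

```python
import time


def run_query():
    """Hypothetical stand-in for one database query (assumption, not a real MongoDB call)."""
    return sum(range(100))


def measure(query, iterations=1000):
    """Time each call; return the latency range in ms and the queries per second."""
    latencies_ms = []
    start = time.perf_counter()
    for _ in range(iterations):
        t0 = time.perf_counter()
        query()
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    elapsed = time.perf_counter() - start
    return {
        "min_ms": min(latencies_ms),
        "max_ms": max(latencies_ms),
        "queries_per_sec": iterations / elapsed,
    }


stats = measure(run_query)
print(stats)
```

Running the same loop from clients placed in each of the four configurations would give comparable numbers per configuration, which is the comparison the slide summarizes.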
15. Primary
• Data-driven data model
• Data-driven deployment architecture
• Hybrid deployments are possible (cloud, on-premises)
• High latency between EC2 regions
• At 85% CPU, MongoDB behavior changes
Secondary
• Operations/Developer/DBA trained
• Roadmap Development/operations/
We decided to look for a solution with a better technological fit, and the team developed a short list of potential solutions. While we strongly favored a solution that was already in-house, we added MongoDB to the short list because our research indicated that it might provide excellent query performance with less investment in software licenses and hardware than other solutions. Our primary concern with MongoDB, however, was that we had no hands-on experience with it. Apollo management tasked the Forward Engineering group within IT (my team) with assessing MongoDB. We responded with an evaluation process designed to determine, rigorously but quickly, whether it would suit our needs.
At the outset, our mission was somewhat loosely defined: to learn about MongoDB and to determine its suitability as a data store. Nevertheless, we identified specific areas of focus.
Our problem was “We didn’t know anything about MongoDB.” We have a saying in Forward Engineering: fail fast. So how do we fill in the gaps quickly? We engage the experts and the community, and we get our hands dirty in a lab environment.
We reached out to our Operations team to help stand up a large farm and automate it. Our Architecture Standards Group requires high availability and disaster recoverability. (Keep in mind, our standard for success in Forward Engineering is the 80% rule.)
The most fundamental difference between Oracle and MongoDB is the data model. The scope was limited to course offerings: what courses are offered and who is associated with each course (student, faculty).
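To make the data-model difference concrete, here is a minimal Python sketch. It assumes a hypothetical three-table relational layout (courses, people, enrollments) and a hypothetical document shape in which the participants are embedded in each course offering. The table and field names are illustrative only, not our actual schema.

```python
from collections import defaultdict

# Hypothetical relational rows: three tables linked by keys.
courses = [{"course_id": "C100", "title": "Intro to Databases"}]
people = [
    {"person_id": "P1", "name": "Ana"},
    {"person_id": "P2", "name": "Raj"},
]
enrollments = [
    {"course_id": "C100", "person_id": "P1", "role": "student"},
    {"course_id": "C100", "person_id": "P2", "role": "faculty"},
]


def to_documents(courses, people, enrollments):
    """Fold the joined rows into one document per course offering."""
    names = {p["person_id"]: p["name"] for p in people}
    by_course = defaultdict(list)
    for e in enrollments:
        by_course[e["course_id"]].append(
            {"person_id": e["person_id"], "name": names[e["person_id"]], "role": e["role"]}
        )
    return [
        {"_id": c["course_id"], "title": c["title"], "participants": by_course[c["course_id"]]}
        for c in courses
    ]


course_docs = to_documents(courses, people, enrollments)
print(course_docs)
```

In the relational form, answering "who is in this course" requires joining the three tables; in the document form, the same answer is read from a single course document.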
The most important analysis was looking at our queries. There were several tables, many joins, and several indexes. We found that more than 75% of the queries were about finding which people are in which classes:
Query 1. Find courses for a given student
Query 2. Find users in a given course
Query 3. Find courses for a faculty member
And so on.
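The three queries above can be sketched against an embedded-document model. This is an illustrative Python sketch over in-memory dictionaries, not our production code, and the field names (`participants`, `person_id`, `role`) are assumptions made for the example.

```python
# Hypothetical course-offering documents with participants embedded.
course_docs = [
    {"_id": "C100", "participants": [
        {"person_id": "P1", "role": "student"},
        {"person_id": "P2", "role": "faculty"},
    ]},
    {"_id": "C200", "participants": [
        {"person_id": "P1", "role": "student"},
        {"person_id": "P3", "role": "faculty"},
    ]},
]


def courses_for_person(docs, person_id, role):
    """Queries 1 and 3: ids of courses where the person holds the given role."""
    return [
        d["_id"]
        for d in docs
        if any(p["person_id"] == person_id and p["role"] == role for p in d["participants"])
    ]


def users_in_course(docs, course_id):
    """Query 2: ids of everyone associated with one course."""
    for d in docs:
        if d["_id"] == course_id:
            return [p["person_id"] for p in d["participants"]]
    return []


student_courses = courses_for_person(course_docs, "P1", "student")
faculty_courses = courses_for_person(course_docs, "P3", "faculty")
c100_users = users_in_course(course_docs, "C100")
```

Against a real MongoDB collection with this shape, the first lookup would correspond to a filter of roughly the form `{"participants": {"$elemMatch": {"person_id": "P1", "role": "student"}}}`, with no joins involved.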