Migrating to MongoDB: Best Practices

  1. Migrating to MongoDB: Best Practices. Muthu Chinnasamy, Senior Solutions Architect
  2. Agenda • Project Team • Schema Design • Application Integration • Data Migration Options • Ops considerations
  3. Why MongoDB • Rich documents • Promote business agility • Achieve higher scalability • Lower budget strain compared to RDBMS
  4. RDBMS to MongoDB – Success stories
  5. Project Team
  6. Organizing for Success – Stakeholders • Key to success: involve all key stakeholders for the application: line of business, developers, data architects, DBAs, systems administrators, security
  7. Organizing for Success – Project Charter • Develop a project charter: define business and technical objectives; define timelines and responsibilities; monitor progress and address any issues
  8. Organizing for Success – Help needed? • Partner services and resources available from MongoDB: community support; web-based training to build skills and proficiency; support and consulting services
  9. Schema Design
  10. Definitions (RDBMS term → MongoDB term) • Table → Collection • Row → Document • Column → Field • Index → Index • JOIN → Embedded Document or Reference
  11. Document Model Benefits (RDBMS vs. MongoDB). MongoDB document:
      {
        _id : ObjectId("4c4ba5e5e8aabf3"),
        employee_name : "Dunham, Justin",
        department : "Marketing",
        title : "Product Manager, Web",
        report_up : "Neray, Graham",
        pay_band : "C",
        benefits : [
          { type : "Health", plan : "PPO Plus" },
          { type : "Dental", plan : "Standard" }
        ]
      }
  12. Schema Design – Blogging Platform
  13. Schema Design - Indexing • Compound Indexes • Unique indexes • Array Indexes • Text Indexes • Geospatial indexes • Sparse Indexes
  14. Schema Design – For more details • Afternoon session: Data modeling deep dive • Google: "6 rules of thumb for MongoDB schema design" • Google: "MongoDB compound index optimization"
  15. Application Integration
  16. Drivers & Ecosystem • MongoDB API is implemented as methods • Java, Ruby, Python, Perl • Morphia • MEAN Stack
  17. Developer Efficiency • RDBMS: rigid schema; object-relational impedance?; alter a 2TB table to modify a column? • MongoDB: dynamic schema; MongoDB APIs are classes and packages; modify code to use MongoDB APIs
  18. Data Migration
  19. Data Migration (diagram: Application, Source Database)
  20. Data Migration – Can you have downtime? (timeline diagram) • Application view: Available, Degraded, Down, Available • Database states: Master, Exporting, Importing, Master • Time: T1, T2, T3
  21. Data Migration – mongoimport
      $ mongoimport --db test --collection customers < customers.json
      connected to: 127.0.0.1
      2014-11-26T08:36:47.509-0800 imported 1000 objects
      $ mongo
      MongoDB shell version: 2.6.5
      connecting to: test
      > db.customers.findOne()
      {
        "_id" : 363862536,
        "first_name" : "Landon",
        "last_name" : "Moore",
        "created_date" : ISODate("2010-03-02T22:48:35Z"),
        "is_active" : true,
        "phone" : [
          { "type" : "Work", "number" : "683-560-1311" },
          { "type" : "Other", "number" : "437-849-4219" }
        ],
        "address" : {
          "street_number" : 14,
          "street" : "Granite",
          "street_type" : "Way",
          "city" : "New Jersey",
          "zip_code" : 96881
        },
        "company" : "Example"
      }
  22. Data Migration – ETL tools (diagram: Source Database, ETL)
  23. Data Migration – Hadoop (diagram: Source Database, four jobs)
  24. App Driven Migration (diagram: Application, Source Database)
  25. Data Migration – Options (diagram: Source Database, Snapshot, Application) • Application Managed • Continuous Sync • Batch Migration
  26. Case Study
  27. Case Study: Shutterfly uses MongoDB to safeguard over 6 billion images served to millions of customers • Problem: 6B images, 20TB of data; brittle code base on top of Oracle database (hard to scale, add features); high SW and HW costs • Why MongoDB: JSON-based data model; agile, high performance, scalable; alignment with Shutterfly’s services-based architecture • Results: 80% cost reduction; 900% performance improvement; faster time-to-market; dev. cycles in weeks vs. tens of months
  28. Shutterfly – Original Data Store (Oracle) • Metadata stored in XML blobs • App responsible for content of blob • Table (Photo ID | XML Blob): 1 | <xml><meta-data>…</xml>; 2 | <xml><meta-data>…</xml>; 3 | <xml><meta-data>…</xml>
  29. Schema Migration – Initial (XML blob as stored in Oracle)
      <?xml version="1.0" encoding="utf16"?>
      <votes>
        <voteItem user="00000000" vote="1" />
        <voteItem user="11111111" vote="1" />
        <voteItem user="22222222" vote="1" />
      </votes>
  30. Schema Migration – Phase 1 (the XML blob from slide 29 is stored unchanged as a string in the "data" field of a MongoDB document)
      {
        _id : "site/the3colbys/3326/_votes",
        "V" : 0,
        "cD" : "Thu Sep 23 2010 20:38:54 GMT-0700 (PDT)",
        "wD" : "Thu Sep 23 2010 20:38:54 GMT-0700 (PDT)",
        "md5" : "71199d82ee730f271feface722a74d30",
        "data" : "<?xml version=\"1.0\" encoding=\"utf16\"?> <votes> <voteItem user=\"00000000\" vote=\"1\" /> <voteItem user=\"11111111\" vote=\"1\" /> <voteItem user=\"22222222\" vote=\"1\" /> </votes>"
      }
  31. Schema Migration – Phase 2 (the XML held in "data" in Phase 1 is replaced by a native "votes" sub-document keyed by user)
      {
        _id : "site/the3colbys/3326/_votes",
        "V" : 0,
        "cD" : "Thu Sep 23 2010 20:38:54 GMT-0700 (PDT)",
        "wD" : "Thu Sep 23 2010 20:38:54 GMT-0700 (PDT)",
        "md5" : "71199d82ee730f271feface722a74d30",
        "votes" : { "00000000" : 1, "11111111" : 1, "22222222" : 1 }
      }
  32. Data Migration – Application driven (diagram: Application, Source Database, with steps 1–5 marked) 1. Request for photo 2. Try to read from MongoDB 3. If cache miss, read from Oracle 4. Translate document & write to MongoDB 5. Return to client
  33. Ops Considerations
  34. Replica Sets – No-downtime maintenance • Replica sets provide ops agility & HA • Database upgrades • Hardware swaps/maintenance • Maintenance operations • Automatic failover
  35. MongoDB Management Service (MMS): Cloud Managed MongoDB • 1. Automation: Provision, Upgrade, Scale • 2. Backups: Continuous Backup, Point-in-Time Recovery • 3. Monitoring: Alerts
  36. Defense in Depth Security Architecture • Authentication: Database, LDAP, Kerberos, x.509 Certificates • Authorization: Built-in Roles, User-Defined Roles, Field-Level Redaction • Auditing: Admin operations, Queries • Encryption: Network (SSL), Disk (partner solutions)
  37. Help available from MongoDB • MongoDB Enterprise Advanced: The best way to run MongoDB in your data center • MongoDB Management Service (MMS): The easiest way to run MongoDB in the cloud • Production Support: In production and under control • Development Support: Let’s get you running • Consulting: We solve problems • Training: Get your teams up to speed
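
The application-driven migration on slide 32 (try MongoDB first, fall back to Oracle on a miss, translate, write back, return) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Shutterfly's implementation: plain Python dicts stand in for both databases, and the names `ReadThroughMigrator` and `wrap_legacy_blob` are hypothetical.

```python
from typing import Callable, Dict, Optional

Document = Dict[str, object]


class ReadThroughMigrator:
    """Serve reads from the target store, back-filling from the source on a miss.

    `source` and `target` are dict stand-ins (assumptions) for the legacy
    database and MongoDB; `translate` converts a legacy record to a document.
    """

    def __init__(self,
                 source: Dict[str, str],
                 target: Dict[str, Document],
                 translate: Callable[[str, str], Document]) -> None:
        self.source = source
        self.target = target
        self.translate = translate

    def get(self, key: str) -> Optional[Document]:
        doc = self.target.get(key)          # step 2: try the new store first
        if doc is not None:
            return doc
        legacy = self.source.get(key)       # step 3: miss, read the old store
        if legacy is None:
            return None                     # unknown in both stores
        doc = self.translate(key, legacy)   # step 4: translate ...
        self.target[key] = doc              # ... and write to the new store
        return doc                          # step 5: return to the client


def wrap_legacy_blob(key: str, blob: str) -> Document:
    """Hypothetical translation: keep the legacy blob as-is in a 'data' field."""
    return {"_id": key, "data": blob}


if __name__ == "__main__":
    oracle = {"photo-1": "<xml><meta-data>a</meta-data></xml>"}
    mongo: Dict[str, Document] = {}
    migrator = ReadThroughMigrator(oracle, mongo, wrap_legacy_blob)
    print(migrator.get("photo-1"))   # first read migrates the record
    print("photo-1" in mongo)        # later reads are served from the target
```

With this pattern, records move to the new store the first time they are requested, so frequently read data migrates first and untouched data can be picked up later by a batch job.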

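The step from Phase 1 to Phase 2 of the schema migration (slides 30 and 31) amounts to parsing the XML held in the "data" field and storing it as a native "votes" sub-document. A minimal sketch using only the Python standard library is shown below; the function names are hypothetical, and the slides do not say how the actual conversion was implemented.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict

# Matches a leading XML declaration such as <?xml version="1.0" encoding="utf16"?>.
_XML_DECLARATION = re.compile(r"^\s*<\?xml[^>]*\?>")


def votes_xml_to_map(xml_text: str) -> Dict[str, int]:
    """Turn <votes><voteItem user=... vote=... /></votes> into {user: vote}."""
    # Drop the declaration: the text is already a decoded Python string, so
    # the encoding it names no longer applies.
    body = _XML_DECLARATION.sub("", xml_text, count=1)
    root = ET.fromstring(body)
    return {item.get("user"): int(item.get("vote"))
            for item in root.iter("voteItem")}


def phase1_to_phase2(doc: dict) -> dict:
    """Return a copy of a Phase 1 document with 'data' replaced by 'votes'."""
    migrated = {key: value for key, value in doc.items() if key != "data"}
    migrated["votes"] = votes_xml_to_map(doc["data"])
    return migrated


if __name__ == "__main__":
    phase1 = {
        "_id": "site/the3colbys/3326/_votes",
        "V": 0,
        "data": '<?xml version="1.0" encoding="utf16"?> <votes> '
                '<voteItem user="00000000" vote="1" /> '
                '<voteItem user="11111111" vote="1" /> '
                '<voteItem user="22222222" vote="1" /> </votes>',
    }
    print(phase1_to_phase2(phase1))
```

User IDs are kept as strings so that leading zeros survive the conversion.
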
Editor's notes

  1. Rich documents: unrelenting growth in new data sources; growing user loads. Promote agility: improve developer efficiency; improve time to market of developed features. Achieve higher scalability: horizontal scaling; commercial hardware; cloud friendly. Lower budget strain: much lower TCO; do more with fewer resources.
  2. Edmunds – Billing, online advertising, user data (Oracle); MetLife – Single view of 100M+ customers and 70 systems in 90 days; Cisco – Analytics, social networking (various); Salesforce – Real-time analytics (various); Expedia – Special travel offers in real time; Adobe – Digital experience management platform; Shutterfly – Developed nearly a dozen projects on MongoDB storing more than 20TB of data (Oracle); Craigslist – Archive data migration (MySQL); MTV – Centralized content management (various)
  3. Agility and flexibility: the data model supports business change; rapidly iterate to meet new requirements. Intuitive, natural data representation: eliminates the ORM layer; developers are more productive. Reduces the need for joins and disk seeks: programming is simpler; performance is delivered at scale.
  4. Let's continue the comparison between the relational and document models with the example of a blogging platform. The relational version has five tables (category, article, user, comments and tags), and the application relies on the RDBMS to join all five in order to build a blog entry. In MongoDB, all of the blog data is aggregated within a single document, linked by a single reference to a user document containing the authors of both the blog and the comments. From a performance and scalability perspective, the aggregated document can be accessed in a single call to the database, rather than having to JOIN multiple tables to respond to a query.
  5. During schema design, think about how we query our data. Some NoSQL databases are little more than key/value stores, so you may be able to ingest data quickly but can't do anything other than primary key lookups, which is a huge backward step coming from the relational world. MongoDB, on the other hand, has a rich query model enabled by extensive indexing. Indexes can be defined as secondary indexes on any key or array within the document. MongoDB indexing will be familiar to DBAs: B-tree indexes, secondary indexes. As with a relational DB, indexes are the single biggest tunable performance factor: define indexes by identifying common queries; use MongoDB explain to ensure index coverage; use the MongoDB profiler to log all slow queries. The index types listed on the slide include text search and geospatial. Array indexes let you index each element of an embedded array. For example, in a document describing a product, each of the categories the product can be classified under can be included in an array and indexed, giving a major performance boost when users search by those classifications. This sort of flexibility gives MongoDB the ability to run complex queries quickly.
  6. MongoDB has idiomatic drivers for the most popular languages, with over a dozen developed and supported by MongoDB and 30+ community-supported drivers. The MongoDB API is implemented as methods within the API of a specific programming language, as opposed to a completely separate language like SQL. Coupled with MongoDB’s document model and its affinity with the data structures used in object-oriented programming, this makes integration with applications very simple.
  7. (Repeats note 6.)
  8. Easy-to-use tool. Supports CSV, TSV and JSON formats. Useful if the source data is in the same format as the target. May not be suitable for large data sets. Does not transform data.
  9. Pentaho & Informatica have partnerships with MongoDB. GUI-based tools. Mapping and workflows that transform data and change the schema along the way. Can handle different sources. Stable, robust, scalable migrations for large, complex data sets. Limitations around nesting.
  10. Hadoop as an ETL system. MapReduce (MR) or Oozie to transform data. Combine, merge, build the data set. MR can write directly to MongoDB using the MongoDB-Hadoop (M-H) connector. Possible to do updates to augment an initial bulk load. Programmer friendly, so almost no limitation on target transformations.
  11. The app talks to both source and target. Rather than one big bulk transfer, trickle changes. Business logic is inside the code, so modifications may be validated using rules before writing to MongoDB.
  12. Customers often use a combination of these three options.
  13. Key challenges: time to market, cost, performance, scalability. Solution: simple API; OSS software & simple hardware; reduce complexity & partition data; clustered system.
  14. What kinds of tasks? Provisioning. Any topology, at scale, with the click of a button. Upgrades. In minutes, with no downtime. Scale. Add capacity without taking your application offline. Continuous Backup. Customize to meet your recovery goals. Point-in-time Recovery. Restore to any point in time, because disasters aren’t scheduled. Performance Alerts. Monitor 100+ system metrics and get custom alerts before your system degrades.
  15. What We Sell We are the MongoDB experts. Over 1,000 organizations rely on our commercial offerings, including leading startups and 30 of the Fortune 100. We offer software and services to make your life easier: MongoDB Enterprise Advanced is the best way to run MongoDB in your data center. It’s a finely-tuned package of advanced software, support, certifications, and other services designed for the way you do business. MongoDB Management Service (MMS) is the easiest way to run MongoDB in the cloud. It makes MongoDB the system you worry about the least and like managing the most. Production Support helps keep your system up and running and gives you peace of mind. MongoDB engineers help you with production issues and any aspect of your project. Development Support helps you get up and running quickly. It gives you a complete package of software and services for the early stages of your project. MongoDB Consulting packages get you to production faster, help you tune performance in production, help you scale, and free you up to focus on your next release. MongoDB Training helps you become a MongoDB expert, from design to operating mission-critical systems at scale. Whether you’re a developer, DBA, or architect, we can make you better at MongoDB.
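
Note 8 points out that mongoimport does not transform data, so relational rows usually need to be reshaped into documents before a batch import. The sketch below shows one way this might be done in Python, producing one JSON document per line (the layout mongoimport reads by default). The input row layout, the column names and the function names are assumptions made for illustration; only the target document shape follows the customer document on slide 21.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List


def build_customer_documents(customer_rows: Iterable[dict],
                             phone_rows: Iterable[dict]) -> List[dict]:
    """Embed each customer's phone rows and address columns in one document.

    Both inputs are lists of dicts, e.g. as produced by csv.DictReader or a
    database cursor; the column names used here are hypothetical.
    """
    phones: Dict[object, List[dict]] = defaultdict(list)
    for row in phone_rows:
        # What a JOIN on customer_id would return becomes an embedded array.
        phones[row["customer_id"]].append(
            {"type": row["type"], "number": row["number"]})

    documents = []
    for row in customer_rows:
        documents.append({
            "_id": row["id"],
            "first_name": row["first_name"],
            "last_name": row["last_name"],
            "phone": phones.get(row["id"], []),
            "address": {                      # flat columns become a sub-document
                "street": row["street"],
                "city": row["city"],
                "zip_code": row["zip_code"],
            },
        })
    return documents


def to_json_lines(documents: Iterable[dict]) -> str:
    """Serialize documents as newline-delimited JSON, one document per line."""
    return "\n".join(json.dumps(doc) for doc in documents)


if __name__ == "__main__":
    customers = [{"id": 363862536, "first_name": "Landon", "last_name": "Moore",
                  "street": "Granite", "city": "New Jersey", "zip_code": 96881}]
    phone_numbers = [
        {"customer_id": 363862536, "type": "Work", "number": "683-560-1311"},
        {"customer_id": 363862536, "type": "Other", "number": "437-849-4219"},
    ]
    print(to_json_lines(build_customer_documents(customers, phone_numbers)))
```

The printed lines could be saved to a file such as customers.json and loaded with the mongoimport command shown on slide 21.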