Speeding Up Your Rails Application With MongoDB
Phil Cowans, CTO, Songkick.com
MongoUK, June 18th 2010

SONGKICK: THE WORLD’S LARGEST CONCERT DATABASE
Current version launched in June 2009.
Data from over 85 ticket vendors.
Lots of data:
  • Over 100,000 upcoming events worldwide.
  • Over 1.5 million events in total.
  • Over 100,000 setlists.
  • Thousands of photos, posters and videos.
Built using Ruby and Rails.

Thanks To: Dan Lucraft (@danlucraft), Matt Wynne (@mattwynne) ...and the rest of the Songkick engineering team.

Phil Cowans
phil@songkick.com
@philcowans
http://www.songkick.com/users/phil



Editor’s Notes

  1. Thank you for inviting me to speak at this event. My name is Phil Cowans, and I work for a company called Songkick, where I’m chief technical officer. We’re a small company based near Old Street, and this talk is about how we use MongoDB in our production systems. Before I go on, I should say that I had very little direct involvement in writing the software I’m about to describe. Please do feel free to ask questions, but if they get too in-depth I’ll probably have to refer you to the talented members of my development team who actually did the hard work of putting this together. Our use of MongoDB is fairly simple, but it’s practical and we’ve built some interesting software to support our use case.
  2. Before we get into the tech, I’ll just give you a quick overview of what we do as a company. Songkick is a website about live music. We help our users find out about upcoming concerts by sending them personalised email alerts, and we maintain an archive of information about what’s happened in the past, including photos, videos, reviews and setlists, going back to the 1950s. The site has existed in its present form for almost exactly a year, and we’ve now got well over 1.5 million concerts and festivals in the database. Our technology platform is Ruby on Rails, running on Linux and using the standard combination of MySQL, Apache etc.
  3. Here’s a typical page on Songkick (show demo): the artist page for Sonic Youth. You can see there’s a lot of information here: upcoming events, similar artists, photos, videos, past events and so on. There are similar pages for individual events, venues, cities etc., and user profiles. There’s a lot of data on the site. We want to give our users the best experience possible, which means that as well as being well designed and easy to use, the pages need to be fast. Internally we set ourselves the goal of making things fast enough that the user sees something happen, i.e. the page starts to render, no more than 1 second after clicking the link. This means we need to spend as little time as possible computing the HTML that we send to the user.
  4. Every event has a row of information like this. You can see that we try very hard to make the process of finding gigs as rich as possible. Every gig has headliners, supporting artists, venues, attendees, reviews, photos, videos, posters, setlists and tickets! This is great: all of this information is really useful to our users. But all this metadata comes at a cost.
  5. This is the underlying representation of the data in the raw, normalised form we store in MySQL. There are 12 tables here (in fact this is simplified, so data from more tables goes into the event row), so populating the event row from this format involves several multi-table joins. It also requires a fair amount of business logic to decide exactly what to display, and in what format. This sort of thing happens dozens of times per page, so it’s simply not possible to hit the MySQL database every time the page is rendered. The nice thing is that we don’t have to: this data is presented in almost exactly the same way for all users who see it (give or take a few modifications such as highlighting which of their friends are going), and appears in the same format on the artist page, the venue page etc. We can therefore precompute the exact data required to display this fragment and cache it ready for when the page needs to be displayed.
  6. Like many frameworks, Rails uses a Model-View-Controller paradigm. Typically this means that the view accesses model objects, which encapsulate rows from a relational database. Rails does support fragment caching, i.e. it can check for a cached version of a bit of markup before executing the code to render it. Out of the box this isn’t quite what we want, for three reasons. We wanted to be able to pre-populate the cache. We didn’t want to be constrained to caching fragments of completed markup, for a number of reasons. And Rails’ built-in cache expiry is inflexible, which was a big problem.
  7. Consider a ‘document’ view of the above fragment: this is all of the data needed to render the fragment we saw before. Exactly what goes here is flexible. It can be pre-computed HTML, which is clearly fastest to render. Or it can be a data view as shown here, which needs more work to transform it to HTML, but has the advantage that it can be reused in places where there are similar but not identical representations of the same object, and allows some customisation based on parameters only known at render time (such as the user’s location). A data view also doesn’t need to be expired every time the visual design changes, which does happen fairly frequently.
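
The slide's document is not reproduced in these notes, so the following is only a rough sketch of what a data-view document and its render-time customisation might look like. The field names, the `EVENT_ROW_DOC` constant and the `render_event_row` helper are all assumptions made for illustration, not Songkick's actual schema or code.

```ruby
require 'cgi'

# Hypothetical data-view document for one event row (field names are assumed).
EVENT_ROW_DOC = {
  'event_id'   => 42,
  'headliners' => ['Sonic Youth'],
  'venue'      => { 'name' => 'Example Hall', 'city' => 'London' },
  'attendees'  => %w[alice bob]
}.freeze

# Render the shared document, customising only with data known at render time
# (here: which of the current user's friends are attending).
def render_event_row(doc, friends: [])
  going = doc['attendees'] & friends
  html = "<li class=\"event\">#{CGI.escapeHTML(doc['headliners'].join(', '))} at " \
         "#{CGI.escapeHTML(doc['venue']['name'])}, #{CGI.escapeHTML(doc['venue']['city'])}"
  unless going.empty?
    names = going.map { |name| CGI.escapeHTML(name) }.join(', ')
    html += " <span class=\"friends\">Friends going: #{names}</span>"
  end
  html + '</li>'
end
```

The same stored document serves every user; only the small friends highlight is computed per request.
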
  8. This is exactly the sort of thing that MongoDB is good at. Specifically: It is schema-less, which is great for our denormalized data, which changes a lot. (Schema-less databases are a great fit with dynamic languages.) It is pretty quick. It stores most or all of our database in RAM. It supports sharding (or is close to supporting it anyway). It seems more mature than some.... It has a fully supported Ruby driver. (With responsive IRC and developers.) We’d also used it for some internal apps (to store and analyse web traffic stats), so we were familiar with it.
  9. Architecturally, it looks like this. The Ruby classes wrapping the document representation of the page or page fragment are called presenters. The idea is that the view should be able to take the output of the presenter and construct the markup with little or no transformation logic. The presenters pull data directly from Mongo, only falling back to the models and the MySQL database if no precomputed version is available. We aren’t fully there yet, but the advantage of this approach is that we can use a mixture of models and presenters in the views as appropriate, building presenters where the code is more stable and there’s more need for high performance.
  10. This is a schematic view of the presenter behind our event row. There’s a method for each of the pieces of data shown in the HTML, so the view can just call these and substitute in the result. These methods call the models, so they may result in complex SQL.
  11. We call the MongoDB collections which hold this data ‘silos’, and we’ve built a library to make it really easy to convert an existing class to use them. Ruby’s flexibility when it comes to metaprogramming really helps here. At the top of the class we include Silo::Store, which brings in the utility functions, and define the collection name and a lambda which constructs the ID for a given instance of the presenter (here the primary key for the underlying event object). This is used as the document’s key in MongoDB. Once we’ve done this it’s simply a matter of adding silo_method statements to indicate which of the presenter’s methods should be persisted in the silo. When we call the title or image_count methods, we’ll now look in MongoDB first, and only execute the method itself if no answer is returned. If the method is executed, the result will be stored in the silo for next time.
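
The slide code is not included in these notes, so here is a minimal sketch of how a read-through mixin of this kind could be written with Ruby metaprogramming. Only the names `Silo::Store`, `silo_method`, `title` and `image_count` come from the talk. The `silo` configuration call, the `BACKEND` hash (an in-memory stand-in for a MongoDB collection) and the `Event` struct are assumptions, and the real library may work quite differently.

```ruby
module Silo
  # In-memory stand-in for MongoDB: { collection => { id => { field => value } } }
  BACKEND = Hash.new { |h, name| h[name] = Hash.new { |c, id| c[id] = {} } }

  module Store
    def self.included(base)
      base.extend(ClassMethods)
    end

    module ClassMethods
      attr_reader :silo_name, :silo_id_builder

      # Assumed configuration call: collection name plus a lambda
      # that builds the document key for a presenter instance.
      def silo(name, id:)
        @silo_name = name
        @silo_id_builder = id
      end

      # Wrap an existing method so that it checks the silo first and
      # stores the computed result on a miss.
      def silo_method(name)
        original = instance_method(name)
        define_method(name) do
          doc = Silo::BACKEND[self.class.silo_name][self.class.silo_id_builder.call(self)]
          key = name.to_s
          return doc[key] if doc.key?(key)
          doc[key] = original.bind(self).call
        end
      end
    end
  end
end

# Hypothetical model stand-in; the real presenter wraps ActiveRecord models.
Event = Struct.new(:id, :title, :photos)

class EventRowPresenter
  include Silo::Store
  silo :event_rows, id: ->(presenter) { presenter.event.id }

  attr_reader :event, :computations

  def initialize(event)
    @event = event
    @computations = 0 # counts how often a method body actually runs
  end

  def title
    @computations += 1
    event.title
  end
  silo_method :title

  def image_count
    @computations += 1
    event.photos.size
  end
  silo_method :image_count
end
```

With this sketch, calling `title` on a second presenter for the same event reads the stored value instead of running the method body again.
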
  12. Getting stuff into and out of the cache is of course the easy bit. The hard bit is knowing when to expire the cache. This is the list of reasons why the silo for an event row may become invalid. There are a lot of them, such as media being added, users saying they’re going, venues changing name, artists changing name, new artists being added to the lineup and so on, some of which are several steps removed from the event object itself. All of this is quite complex, so we tried to find a way to make it easy to express the expiry rules.
  13. Rails supports ‘observers’: hooks into the ActiveRecord ORM to trigger actions when database objects are created, updated or destroyed. We could use this to expire the cache and repopulate it with new values as appropriate. The silo helper methods are able to reflect on the presenter classes, so they know which methods need to be run, which helps keep things clean. There are however two big problems. Firstly, this encourages us to organise the expiry rules by the model which triggers the change, rather than by the presenter being expired. This seems counter-intuitive to us. Secondly, and most importantly, Rails’ observers are synchronous. We don’t want to speed up page rendering just to slow down requests which mutate the data (and these can be very slow indeed if a lot of things have to be regenerated as a result).
  14. Fortunately, we’d already implemented asynchronous observers for other reasons. Every create, update and destroy action on specific models generates an event which is published to a RabbitMQ message queue. Multiple consumers can listen to these events and act as appropriate. We use this for all sorts of things: sending welcome emails, updating activity feeds, populating users’ calendars, resizing uploaded images and so on. We have a nice Domain Specific Language, again built using Ruby metaprogramming, which makes it really easy to define message consumers in an expressive way.
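
To illustrate the publish-then-consume pattern described here, the sketch below uses Ruby's built-in `Queue` as an in-process stand-in for RabbitMQ. The `publish_change` and `start_consumer` helpers and the message shape are hypothetical; a real deployment would publish through a RabbitMQ client and run consumers in separate processes.

```ruby
# In-process stand-in for a RabbitMQ queue (assumption for illustration only).
CHANGES = Queue.new

# Hypothetical publisher: a model callback would call this after a write.
# Pushing onto the queue returns immediately, so the request is not slowed
# down by whatever the consumers do with the message.
def publish_change(action, model, id)
  CHANGES << { action: action, model: model, id: id }
end

# Hypothetical consumer: handles messages on a background thread until it
# receives the :stop sentinel.
def start_consumer(handled)
  Thread.new do
    while (message = CHANGES.pop) != :stop
      handled << message
    end
  end
end

handled = []
consumer = start_consumer(handled)
publish_change(:create, :attendance, 1)
publish_change(:update, :venue, 2)
CHANGES << :stop
consumer.join
```
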
  15. The daemons which handle silo expiry and regeneration are called ‘silovators’. This is a schematic of part of the code representing the expiry rules for the event listing presenter. Each block listens for a particular action on a particular object, so for example, when an Attendance is created, which happens when a user says he or she is going to a concert, we know to regenerate the silo for that event to update the list of attending users. These can be specialised to look for changes to specific fields as appropriate.
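
The schematic from the slide is not reproduced in these notes. As a hedged sketch, expiry rules of this shape could be expressed with a small class-level DSL like the one below. The `Silovator` base class, the `on` method, the `fields:` option and the message format are all assumed names, and `REGENERATED` simply records which silos would be rebuilt instead of touching MongoDB.

```ruby
class Silovator
  Rule = Struct.new(:action, :model, :fields, :handler)

  def self.rules
    @rules ||= []
  end

  # Assumed DSL entry point: register a block for an action on a model,
  # optionally restricted to changes in specific fields.
  def self.on(action, model, fields: nil, &handler)
    rules << Rule.new(action, model, fields, handler)
  end

  # Run every rule that matches one change message.
  def self.handle(message)
    rules.select { |rule| matches?(rule, message) }
         .map { |rule| rule.handler.call(message[:record]) }
  end

  def self.matches?(rule, message)
    return false unless rule.action == message[:action] && rule.model == message[:model]
    rule.fields.nil? || (rule.fields & message.fetch(:changed, [])).any?
  end
end

class EventListingSilovator < Silovator
  # Records which event silos would be regenerated (stand-in for real work).
  REGENERATED = []

  # A user said they are going: refresh that event's attendee list.
  on(:create, :attendance) do |attendance|
    REGENERATED << [:event, attendance[:event_id]]
  end

  # A venue was renamed: refresh every event held there.
  on(:update, :venue, fields: [:name]) do |venue|
    venue[:event_ids].each { |id| REGENERATED << [:event, id] }
  end
end
```

Grouping the rules in one class per presenter keeps the expiry logic organised by what is being expired rather than by which model changed.
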
  16. That’s basically it. We observe changes to the underlying MySQL database and, via RabbitMQ and the silovators, pre-generate the appropriate data and put it in MongoDB. At render time, the presenter grabs the data and does a minimal amount of work to convert it to the HTML which is sent to the user. There are a few more bits and pieces. We’ve had to deal with issues such as locking to prevent concurrency problems between multiple silovator back-ends, and bulk expiration when design changes make cached HTML fragments invalid. We’ve also moved towards caching more HTML and less data, so we have built out support for post-processing the HTML after retrieval from the silo to customise it for a specific user. Finally, we’re big on test-driven development, so we’ve put together some tools to make it easy to test the expiry rules.
  17. Thanks to Dan and Matt, who built most of this software and put together an earlier version of this talk which I’ve plagiarised, and everyone else on the Songkick engineering team.
  18. I’m Phil Cowans and you can contact me at these places. The code I’ve just described isn’t quite ready for widespread distribution, but if you’re interested get in touch and we should be able to share it with you. Thank you very much for listening, and I’d be happy to attempt to answer any questions.