Lanyrd.com
Building Lanyrd
Simon Willison
BrightonPy, 9th August 2011
http://lanyrd.com/sgptt
Lanyrd.com

Definitive database of professional events and speakers
• Social event recommendation
• Comprehensive speaker profiles
• Archive of slides, notes and video
A brief history
Casablanca!
  August 2010
• Aug 31st, 11:22: Launch! (1 linode)
• Aug 31st, 12:41: Unlaunch
• Aug 31st, 12:54: Read only mode
• Aug 31st, 14:15: DB server (2 linodes)
• Sep 1st: Limit 50 on dashboard
• Sep 1st: disable-dashboard setting
• Sep 3rd: dConstruct (and Twitter bot)
• Sep 4th: TechCrunched (read only :( )
• Sep 5th: 3 large EC2 + 1 RDS
• Sep 6th: Downgrade to 3 small EC2
December (photo: @niqui)
• Dec 8: Calacanis + Scoble at the same time!
 • Upgrade to next size of RDS
 • (Sometimes scaling vertically does the job)
• Jan 26th: Solr powered dashboard
 • Replicated to 2, then 3 servers
• lanyrd.com / badges.lanyrd.net
• Load balancer (nginx), HTTP cache (varnish)
• 3 × app server (django/mod_wsgi)
• Database (MySQL RDS)
• Search: 1 master + 2 slaves (solr)
• Redis (data structures + message queue)
• Logging (MongoDB)
• 2 × worker (celery)
Solr + Haystack
[Screenshot: the Apache Solr project home page ("Welcome to Solr"); its news list runs from "Solr Joins Apache Incubator" (17 January 2006) to "Solr 3.2 Released" (May 2011)]

                         What Is Solr?
• More Like This
• Faceting
• Stored (non-indexed) fields
• Highlighting
• Spelling Suggestions
• Boost
[Screenshot: the Haystack project home page]

Find the needle you're looking for.

Search doesn't have to be hard. Haystack lets you write your search code once and choose the search engine you want it to run on. With a familiar API that should make any Djangonaut feel right at home and an architecture that allows you to swap things in and out as you need to, it's how search ought to be.

Haystack is BSD licensed, plays nicely with third-party apps without needing to modify the source, and supports Solr, Whoosh and Xapian.

Get started
1. Get the most recent source.
2. Add haystack to your INSTALLED_APPS.
3. Create search_indexes.py files for your models.
4. Set up the main SearchIndex via autodiscover.
5. Include haystack.urls in your URLconf.
6. Search!
Model-oriented search

• Define search_indexes.py (like
  admin.py) for your application
• Hook up default haystack search views
• Write a quick search.html template
• Run ./manage.py rebuild_index
[Screenshot: Lanyrd search results for "django", 3 results found. Facets on the right filter by type (Sessions), topic (NoSQL, Django, Cassandra) and place (United States, Oregon, Portland, Santa Clara, California). Current filters: TYPE: Sessions, TOPIC: NoSQL, PLACE: United States. Results shown: "NoSQL and Django Panel" and "Step Away From That Database" (DjangoCon US 2010), "Apache Cassandra in Action" (Strata 2011)]
from haystack import indexes

class BookIndex(indexes.SearchIndex):
    # Main document field, rendered from a template
    text = indexes.CharField(document=True, use_template=True)
    speakers = indexes.MultiValueField()
    topics = indexes.MultiValueField()

    def prepare_speakers(self, obj):
        # Only authors that are linked to a user account
        return [a.user.t_id for a in obj.authors.exclude(
            user=None
        ).select_related('user')]

    def prepare_topics(self, obj):
        return list(obj.topics.values_list('pk', flat=True))
search/indexes/books/book_text.txt
{{ object.title }}
{{ object.tagline }}
{% for author in object.authors.all %}
   {{ author.display_name }}
   {{ author.user.t_screen_name }}
{% endfor %}
{% for topic in object.topics.all %}
   {{ topic.name_en }}
{% endfor %}
Staying fresh
• Search engines usually don’t like accepting
  writes too frequently
  • RealTimeSearchIndex for low traffic sites
• ./manage.py update_index --age=6 (hours)
 • Uses index.get_updated_field()
• Roll your own (message queue or similar...)
Replication

             Solr Master


Solr Slave   Solr Slave    Solr Slave
Smarter indexing

from django.db import models

class Article(models.Model):
    needs_indexing = models.BooleanField(
        default=True, db_index=True
    )
    ...
    def save(self, *args, **kwargs):
        # Any save marks the row as needing a re-index
        self.needs_indexing = True
        super(Article, self).save(*args, **kwargs)
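from haystack import site

# index_pending is an illustrative wrapper name; the slide shows only the body
def index_pending(model):
    index = site.get_index(model)
    updated_pks = []

    objects = index.load_all_queryset().filter(
        needs_indexing=True
    )[:100]
    if not objects:
        return

    for obj in objects:
        updated_pks.append(obj.pk)
        index.update_object(obj)

    # Clear the flag only on the rows that were just indexed
    index.load_all_queryset().filter(
        pk__in=updated_pks
    ).update(needs_indexing=False)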
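A minimal, self-contained sketch of the same batch pattern, with plain dictionaries standing in for database rows and a list standing in for the search index. The names `drain_index_queue`, `rows` and `indexed` are hypothetical and are not part of Haystack or of Lanyrd's code.

```python
def drain_index_queue(rows, indexed, batch_size=100):
    """Index up to batch_size flagged rows, then clear their flags.

    rows: list of dicts with 'pk' and 'needs_indexing' keys (stand-in for a table).
    indexed: list collecting the pks sent to the index (stand-in for Solr).
    Returns the number of rows processed in this batch.
    """
    batch = [r for r in rows if r["needs_indexing"]][:batch_size]
    updated_pks = []
    for row in batch:
        indexed.append(row["pk"])      # stand-in for index.update_object(row)
        updated_pks.append(row["pk"])
    # Clear the flag only for the rows handled in this batch
    done = set(updated_pks)
    for row in rows:
        if row["pk"] in done:
            row["needs_indexing"] = False
    return len(updated_pks)


rows = [{"pk": i, "needs_indexing": True} for i in range(250)]
indexed = []
batch_sizes = []
while True:
    n = drain_index_queue(rows, indexed)
    if n == 0:
        break
    batch_sizes.append(n)
```

Run from a periodic job, each call sends at most one small batch of writes to the search engine, which fits the earlier point that search engines dislike frequent writes.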
nginx + Solr replication trick

upstream solrmaster {
  server 10.68.43.214:8080;
}
upstream solrslaves {
  server 10.68.43.214:8080;
  server 10.193.138.80:8080;
  server 10.204.143.106:8080;
}
server {
  listen 8983;
  location /solr/update {
    proxy_pass http://solrmaster;
  }
  location /solr/select {
    proxy_pass http://solrslaves;
  }
}
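The routing rule in this config can be expressed as a small function. This is only an illustrative sketch in Python, not a model of how nginx matches locations in general. `pick_upstream` is a hypothetical name, the addresses are the ones from the slide, and the rotation imitates round-robin balancing across the read pool.

```python
import itertools

SOLR_MASTER = ["10.68.43.214:8080"]
SOLR_SLAVES = ["10.68.43.214:8080", "10.193.138.80:8080", "10.204.143.106:8080"]

_slave_cycle = itertools.cycle(SOLR_SLAVES)


def pick_upstream(path):
    """Return the backend for a request path, or None if no location matches."""
    if path.startswith("/solr/update"):
        return SOLR_MASTER[0]          # writes always go to the master
    if path.startswith("/solr/select"):
        # reads rotate across all three servers (the master also serves reads)
        return next(_slave_cycle)
    return None


writes = [pick_upstream("/solr/update") for _ in range(3)]
reads = [pick_upstream("/solr/select?q=django") for _ in range(4)]
```

The application talks to a single port (8983 on the slide) and never needs to know which Solr server is the master.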
[Screenshot: the "Your contacts' calendar" dashboard on Lanyrd (yours: 24, contacts: 182). "We've found 182 conferences your Twitter contacts are interested in." Each listing has Attend / Track buttons and a summary such as "4 contacts tracking" or "1 contact speaking and 3 contacts tracking"]
# Original implementation
import datetime

twitter_ids = [11134, 223455, 33221, ...] # fetch from Twitter

attendees = Attendee.objects.filter(
    user__t_id__in=twitter_ids
).filter(
    conference__start_date__gte=datetime.date.today()
)
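# Current implementation
from haystack.query import SearchQuerySet

twitter_ids = [11134, 223455, 33221, ...] # fetch from Twitter

sqs = SearchQuerySet()
sqs = sqs.models(Conference)
# join() needs strings, so convert the numeric IDs first
or_string = ' OR '.join(str(t_id) for t_id in twitter_ids)
sqs = sqs.narrow('attendees:(%s)' % or_string)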
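To make the filter string concrete, here is a small stand-alone sketch of how it could be built and what it looks like for three IDs. `attendees_filter` is a hypothetical helper name; the `int()` conversion is an added safeguard (an assumption, not something shown on the slide) so that only numeric IDs reach the query.

```python
def attendees_filter(twitter_ids):
    """Build a filter string such as 'attendees:(1 OR 2 OR 3)'."""
    if not twitter_ids:
        raise ValueError("at least one Twitter ID is required")
    # Force each ID through int() so only numbers end up in the query
    or_string = " OR ".join(str(int(t_id)) for t_id in twitter_ids)
    return "attendees:(%s)" % or_string


query = attendees_filter([11134, 223455, 33221])
```

The resulting string is what gets passed to `narrow()`, turning the contact lookup into a single search-engine filter instead of a large SQL `IN` query.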
Redis
[Screenshot: the Redis project home page]

Redis is an open source, advanced key-value store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets and sorted sets.

Redis 2.2.10 is the latest stable version.
simonw-follows:{144,21345,12328...}
europython-attendees:{344,21345,787...}

contact_ids = redis.sinter(
  'simonw-follows',
  'europython-attendees'
)
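What the set intersection computes can be sketched with Python's built-in sets. This is a stand-in for illustration, not the Redis client: `store` and the local `sinter` function are hypothetical, and the members are just the IDs visible on the slide.

```python
store = {
    "simonw-follows": {144, 21345, 12328},
    "europython-attendees": {344, 21345, 787},
}


def sinter(store, *keys):
    """Return members common to every named set; a missing key counts as empty."""
    sets = [store.get(key, set()) for key in keys]
    if not sets:
        return set()   # simplification: with no keys, just return an empty set
    return set.intersection(*sets)


contact_ids = sinter(store, "simonw-follows", "europython-attendees")
```

With one set per user ("who I follow") and one per event ("who is attending"), "which of my contacts are going" becomes a single intersection.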
[Screenshot: the EuroPython 2011 event page on Lanyrd ("The European Python Conference", 19–26 June 2011, Florence, Italy), showing "You're speaking at this event", 97 people attending, 80 people tracking, 119 speakers, and the topics Django, Plone, Pyramid, Python and Twisted]
Celery
[Screenshot: the Celery project home page, announcing Celery 2.2]

Distributed Task Queue

Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well.

The execution units, called tasks, are executed concurrently on one or more worker servers using multiprocessing, Eventlet, or gevent. Tasks can execute asynchronously (in the background) or synchronously (wait until ready).

The recommended message broker is RabbitMQ, but limited support for Redis, Beanstalk, MongoDB, CouchDB, and databases (using SQLAlchemy or the Django ORM) is also available.

Celery is easy to integrate with Django, Pylons and Flask, using the django-celery, celery-pylons and Flask-Celery add-on packages.
Tasks?

• Anything that takes more than about 200ms
 • Updating a search index
 • Resizing images
 • Hitting external APIs
 • Generating reports
Trivial example
• Fetch the content of a web page
  import urllib
  from celery.task import task

  @task
  def fetch_url(url):
      return urllib.urlopen(url).read()

  >>> result = fetch_url.delay('http://cnn.com/')
  >>> html = result.wait()
[Screenshots: the "Python and MongoDB tutorial" session page (Andreas Jung, EuroPython 2011, 20th June 2011) with its "Add coverage to this session" URL box, followed by the "Add coverage" form. A SlideShare URL is submitted, and the form shows a link title, a choice of coverage type (Link, Write-up, Slides, Video, Audio, Sketch notes, Transcript, Handout, Liveblog, Photos, Notes) and a coverage preview from SlideShare]
The task itself...
• Tries using http://embed.ly/ to find a
  preview
• Fetches the HTTP headers and first 2048
  bytes
• If HTML, attempts to extract the <title>
• If other, gets the file type and size from
  headers
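The last three steps can be sketched with the standard library alone. This is a hedged illustration rather than Lanyrd's actual task: `describe_link` and `TitleParser` are hypothetical names, the embed.ly lookup is left out, and the HTTP fetch is replaced by passing in the headers and the first bytes directly.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.title_parts.append(data)


def describe_link(headers, first_bytes):
    """headers: dict of response headers; first_bytes: start of the body."""
    content_type = headers.get("Content-Type", "").split(";")[0].strip().lower()
    if content_type == "text/html":
        parser = TitleParser()
        # Only look at the first 2048 bytes, as described above
        parser.feed(first_bytes[:2048].decode("utf-8", errors="replace"))
        title = " ".join("".join(parser.title_parts).split())
        return {"kind": "html", "title": title or None}
    size = headers.get("Content-Length")
    return {
        "kind": "file",
        "type": content_type or None,
        "size": int(size) if size and size.isdigit() else None,
    }


page = describe_link(
    {"Content-Type": "text/html; charset=utf-8"},
    b"<html><head><title>Python and\n MongoDB tutorial</title></head>",
)
pdf = describe_link(
    {"Content-Type": "application/pdf", "Content-Length": "48213"}, b"%PDF-1.4"
)
```

Reading only the headers and a small prefix keeps the task fast even when the link points at a large file.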
Behind the scenes...
ar = enhance_link.delay(url)
poll_url = '/working/%s/' % signed.dumps({
    'task_id': ar.task_id,
    'on_done_url': on_done_url,
})
if 'ajax' in request.POST:
    return render_json(request, {
       'ok': True,
       'poll_url': poll_url,
    })
else:
    return HttpResponseRedirect(poll_url)
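The `signed.dumps` call packs the task ID and the return URL into a string that can safely travel through the browser. As a rough standard-library illustration of that idea (not the actual `signed` module, whose format may differ), the sketch below signs a JSON payload with HMAC. `sign_dumps`, `sign_loads`, `SECRET` and the sample values are all hypothetical.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"not-the-real-secret"  # assumption: a key known only to the server


def _b64(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(payload):
    return _b64(hmac.new(SECRET, payload.encode("ascii"), hashlib.sha256).digest())


def sign_dumps(obj):
    """Serialise obj to a URL-safe string with a signature appended."""
    payload = _b64(json.dumps(obj, sort_keys=True).encode("utf-8"))
    return payload + "." + _signature(payload)


def sign_loads(token):
    """Return the original object, or raise ValueError if it was tampered with."""
    payload, _, sig = token.rpartition(".")
    try:
        expected = _signature(payload)
    except UnicodeEncodeError:
        raise ValueError("bad signature")
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        raise ValueError("bad signature")
    return json.loads(_unb64(payload).decode("utf-8"))


data = {"task_id": "abc123", "on_done_url": "/coverage/done/"}
token = sign_dumps(data)
poll_url = "/working/%s/" % token
```

Because the token is signed, the polling view can trust the `on_done_url` it finds inside instead of treating it as arbitrary user input.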
And when it’s done...

from celery.backends import default_backend

...
task_id = request.REQUEST.get('id', '')
result = default_backend.get_result(task_id)
Configuration
# Carrot / Celery: queue uses Redis
CARROT_BACKEND = "ghettoq.taproot.Redis"
BROKER_HOST = "10.11.11.11" # redis server
BROKER_PORT = 6379
BROKER_VHOST = "6"

# Task results stored in memcached, so they can
# expire automatically
CELERY_RESULT_BACKEND = "cache"
CELERY_CACHE_BACKEND = "memcached://10.11.11.12:11211;..."
Tricks
Phantom load testing
• Deploy a new architecture on a brand new EC2 cluster
• Leave your existing site on the old cluster
• Invisibly link to the new stack from an <img width=1 height=1> element on your live site (not for very long though)
• (Sensible alternative: find a way to replay log files)
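For the log-replay alternative, a first step is pulling the requests back out of an access log. A minimal sketch, assuming logs in the common/combined format. `replayable_paths` is a hypothetical helper, the sample lines are made up for illustration, and actually sending the requests to the new cluster is left out.

```python
import re

# Matches the request part of a log line, e.g. "GET /path HTTP/1.1"
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"')


def replayable_paths(lines):
    """Yield the paths of GET requests only; replaying writes would be unsafe."""
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group("method") == "GET":
            yield match.group("path")


# Made-up sample log lines
sample = [
    '1.2.3.4 - - [04/Sep/2010:10:00:00 +0000] "GET /calendar/ HTTP/1.1" 200 5120',
    '1.2.3.4 - - [04/Sep/2010:10:00:01 +0000] "POST /login/ HTTP/1.1" 302 0',
    '5.6.7.8 - - [04/Sep/2010:10:00:02 +0000] "GET /search/?q=django HTTP/1.0" 200 2048',
]
paths = list(replayable_paths(sample))
```

Feeding these paths to the new stack at the original request rate gives realistic read traffic without touching the live site.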
cache_version
[Screenshot: the Lanyrd "Django conferences" page, listing upcoming events (EuroPython 2011, DjangoCon US 2011, PyCON FR 2011, PyCon DE 2011) with a sidebar of Django coverage counts (videos, slide decks, audio clips, write-ups, handouts, notes)]
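from django.db import models
from django.db.models import F

class Conference(models.Model):
    ...
    cache_version = models.IntegerField(default=0)

    def save(self, *args, **kwargs):
        # Every save bumps the version, so fragments cached under the old one go stale
        self.cache_version += 1
        super(Conference, self).save(*args, **kwargs)

    def touch(self):
        # Bump the version with a single UPDATE, without calling save()
        Conference.objects.filter(pk=self.pk).update(
            cache_version=F('cache_version') + 1
        )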
{% cache 36000 conf-topics conference.pk conference.cache_version %}
  <ul class="tags inline-tags meta">
    {% for topic in conference.topics.all %}
      <li><a href="{{ topic.get_absolute_url }}">{{ topic }}</a></li>
    {% endfor %}
  </ul>
{% endcache %}
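Why this works: the version number is part of the cache key, so bumping it means the next lookup misses and the fragment is re-rendered. Nothing is ever deleted explicitly. The sketch below shows the idea with a plain dict standing in for memcached; the key format and the helper names are illustrative assumptions, not the keys Django's `{% cache %}` tag actually generates.

```python
cache = {}  # stand-in for memcached


def fragment_key(name, pk, version):
    """Build a cache key that changes whenever the version changes."""
    return "%s:%s:%s" % (name, pk, version)


def cached_fragment(name, pk, version, render):
    """Return the cached fragment, calling render() only on a cache miss."""
    key = fragment_key(name, pk, version)
    if key not in cache:
        cache[key] = render()
    return cache[key]
```

Entries stored under old versions are never read again. In a real cache they expire on their timeout (36000 seconds in the template above) or get evicted.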
Bulk invalidation
from django.db.models import F

topic.conferences.all().update(
  cache_version = F('cache_version') + 1
)
Signing
Pass data through an untrusted
source with confidence that it
  hasn't been tampered with
Signing uses
• "Unsubscribe" links in emails
  • lanyrd.com/un/ImN6VyI.ii0Hwm7p71DEcGfaVzziQaxeuu
• ?redirect_to=URL protection
• Signed cookies
  • "You are logged in as simonw" without hitting the database
Signing in Django 1.4
from django.core import signing
signing.dumps({"foo": "bar"})
signing.loads(signed_string)
response.set_signed_cookie(key, value...)
request.get_signed_cookie(key)
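The idea behind signing can be shown with the standard library alone: append an HMAC of the payload, and refuse any token whose HMAC does not match. This is a sketch of the principle only. It is not Django's token format, and `SECRET`, `sign` and `unsign` are made-up names for illustration.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"not-a-real-secret"  # hypothetical key; a real one must stay private


def sign(data):
    """Serialise data and append an HMAC so tampering can be detected."""
    payload = base64.urlsafe_b64encode(json.dumps(data).encode("utf-8"))
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest().encode("ascii")
    return (payload + b"." + sig).decode("ascii")


def unsign(token):
    """Return the original data, or raise ValueError if the token was altered."""
    payload, _, sig = token.encode("ascii").rpartition(b".")
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest().encode("ascii")
    if not hmac.compare_digest(sig, expected):  # constant-time comparison
        raise ValueError("signature does not match")
    return json.loads(base64.urlsafe_b64decode(payload).decode("utf-8"))
```

The payload is readable by anyone who holds the token. Signing protects against modification, not against disclosure.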
Hashed static asset filenames in S3/CloudFront
global.js → global.ed81d119.js
cdn.lanyrd.net/js/global.ed81d119.js
Benefits
• Far-future expiry headers
 • Cache-Control: max-age=315360000
 • Expires: Fri, 18 Jun 2021 06:45:00 GMT
• Guaranteed updated CSS in IE
• Deploy new assets in advance of application
• Old versions stick around for rollbacks
./manage.py push_static

• Minifies JavaScript and CSS
• Renames files to include sha1(contents)[:6]
• Pushes all assets to S3
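The renaming step might look something like this. It is a guess at the approach, not the real `push_static` command: the function name `hashed_name` is hypothetical, and minification and the S3 upload are left out.

```python
import hashlib
import os


def hashed_name(filename, contents, length=6):
    """Insert the first `length` hex digits of sha1(contents) before the extension."""
    digest = hashlib.sha1(contents).hexdigest()[:length]
    root, ext = os.path.splitext(filename)
    return "%s.%s%s" % (root, digest, ext)
```

Because the name depends only on the contents, an unchanged file keeps its URL between deploys and a changed file gets a new one. That is what makes far-future expiry headers safe.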
Profiling and debugging production systems
UserBasedExceptionMiddleware
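import sys

from django.views.debug import technical_500_response

class UserBasedExceptionMiddleware(object):
    def process_exception(self, request, exception):
        # Superusers get the full debug traceback page, even with DEBUG = False;
        # returning None for everyone else falls through to the normal 500 page
        if request.user.is_superuser:
            return technical_500_response(request, *sys.exc_info())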
mysql-proxy

• Very handy lua-customisable proxy for all
  of your MySQL traffic
• Worst documented software ever
• log.lua - logs out ALL queries
 • https://gist.github.com/1039751
django_instrumented
• (Unreleased) code I wrote for Lanyrd
• Collects various runtime stats about the
  current request, stashes a profile JSON in
  memcached
• Writes out the profile UUID as part of the
  HTML
• A bookmarklet to view the profile
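django_instrumented is unreleased, so the following is not its code. It is a hypothetical sketch of the pattern the bullets describe: collect events during a request, store them as JSON under a UUID, and emit that UUID in the page. A dict stands in for memcached, and every name here is invented for illustration.

```python
import json
import time
import uuid

profile_store = {}  # stand-in for memcached


class RequestProfile(object):
    """Collects runtime stats for one request."""

    def __init__(self):
        self.id = uuid.uuid4().hex
        self.started = time.time()
        self.events = []

    def record(self, kind, detail):
        """Note one event, e.g. a SQL query or a cache lookup."""
        self.events.append({"kind": kind, "detail": detail})

    def finish(self):
        """Stash the profile as JSON and return an HTML comment carrying its ID."""
        profile_store[self.id] = json.dumps({
            "duration": time.time() - self.started,
            "events": self.events,
        })
        return "<!-- profile: %s -->" % self.id
```

A bookmarklet could then read the ID out of the page and request the stored JSON from a staff-only URL.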
mongodb logging

• Super-fast inserts, log everything!
• Capped collections
• Structured queries
• Ask me about it in a few months
For the future...

• Much better profiling, monitoring and alerts
• Varnish in front of everything
• Replicated MySQL for analytics + upgrades
Questions?
Thank you!
http://lanyrd.com/sgptt

Building Lanyrd

  • 1. Lanyrd.com Building Lanyrd Simon Willison BrightonPy, 9th August 2011 http://lanyrd.com/sgptt
  • 2. Lanyrd.com Definitive database of professional events and speakers
  • 3. Lanyrd.com Definitive database Social event recommendation of professional events Comprehensive speaker profiles and speakers Archive of slides, notes and video
  • 6. • Aug 31st, 11:22: Launch! (1 linode) • Aug 31st, 12:41: Unlaunch • Aug 31st, 12:54: Read only mode • Aug 31st, 14:15: DB server (2 linodes) • Sep 1st: Limit 50 on dashboard • Sep 1st: disable-dashboard setting
  • 7. • Sep 3rd: dConstruct (and Twitter bot) • Sep 4th: TechCrunched (read only :( ) • Sep 5th: 3 large EC2 + 1 RDS • Sep 6th: Downgrade to 3 small EC2
  • 8. December photo: @niqui
  • 9. • Dec 8: Calacanis + Scoble at the same time! • Upgrade to next size of RDS • (Sometimes scaling vertically does the job)
  • 10. • Jan 26th: Solr powered dashboard • Replicated to 2, then 3 servers
  • 11. lanyrd.com badges.lanyrd.net Load balancer (nginx) HTTP cache (varnish) Database (MySQL RDS) app server app server app server (django/mod_wsgi) (django/mod_wsgi) (django/mod_wsgi) search master search slave search slave Redis (data structures + (solr) (solr) (solr) message queue) logging worker worker (MongoDB) (celery) (celery)
  • 13. apache > lucene > solr Search the site with Solr Search Main Wiki Powered by Lucid Imagination Last Published: Sat, 04 Jun 2011 12:23:42 GMT About Welcome Who We Are Welcome to Solr Documentation PDF Resources What Is Solr? Related Projects Get Started News May 2011 - Solr 3.2 Released March 2011 - Solr 3.1 Released 25 June 2010 - Solr 1.4.1 Released 7 May 2010 - Apache Lucene Eurocon 2010 Coming to Prague May 18-21 10 November 2009 - Solr 1.4 Released 20 August 2009 - Solr's first book is published! 18 August 2009 - Lucene at US ApacheCon 09 February 2009 - Lucene at ApacheCon Europe 2009 in Amsterdam 19 December 2008 - Solr Logo Contest Results 03 October 2008 - Solr Logo Contest 15 September 2008 - Solr 1.3.0 Available 28 August 2008 - Lucene/Solr at ApacheCon New Orleans 03 September 2007 - Lucene at ApacheCon Atlanta 06 June 2007: Release 1.2 available 17 January 2007: Solr graduates from Incubator 22 December 2006: Release 1.1.0 available 15 August 2006: Solr at ApacheCon US 21 April 2006: Solr at ApacheCon 21 February 2006: nightly builds 17 January 2006: Solr Joins Apache Incubator What Is Solr?
  • 14. More Like This Faceting Stored (non-indexed) fields Highlighting Spelling Suggestions Boost Find the needle you're looking for. Download Documentation Search doesn't have to be hard. Haystack lets you write your search code Sprinting to 1.1-final Posted on 2010/11/16 by Daniel once and choose the search engine you want it to run on. With a familiar API Though this site has sat out of that should make any Djangonaut feel right at home and an architecture that date, there has been a lot of work put into Haystack 1.1. As allows you to swap things in and out as you need to, it's how search ought of writing, there are eight issues to be. blocking the release. I aim to have those down to zero by the end of the week. Haystack is BSD licensed , plays nicely with third-party app without needing to modify the source and supports Solr , Whoosh and Xapian . Once those eight are done, I will be releasing 1.1-final. The RC process really didn't do much Get started last time and this release has been a long time in coming. This 1. Get the most recent source. release will feature: 2. Add haystack to your INSTALLED_APPS. 3. Create search_indexes.py files for your models. Vastly improved faceting 4. Setup the main SearchIndex via autodiscover. Whoosh 1.X support! 5. Include haystack.urls to your URLconf. Document & field boost 6. Search! support
  • 15. Model-oriented search • Define search_indexes.py (like admin.py) for your application • Hook up default haystack search views • Write a quick search.html template • Run ./manage.py rebuild_index
  • 16.
  • 17. add a conference you are signed in as simonw, do you want to sign out? calendar conferences coverage profile search Search We found 3 results for “django” FILTER BY django Search type Sessions 3 Your current filters are… TYPE: Sessions TOPIC: NoSQL PLACE: United States Clear all filters FILTER BY topic NoSQL and Django Panel EVENT DjangoCon US 2010 NoSQL 3 TIME 9th September 2010 09:00-10:00 SPEAKERS Jacob Burch Django 2 Cassandra 1 Step Away From That Database EVENT DjangoCon US 2010 TIME 8th September 2010 11:20-12:00 FILTER BY SPEAKERS Andrew Godwin place Apache Cassandra in Action United States 3 EVENT Strata 2011 Multnomah 2 TIME 1st February 2011 13:30-17:00 Oregon 2 SPEAKERS Jonathan Ellis Portland 2 Santa Clara 1 California 1
  • 18. class BookIndex(indexes.SearchIndex): text = indexes.CharField(document=True, use_template=True) speakers = indexes.MultiValueField() topics = indexes.MultiValueField() def prepare_speakers(self, obj): return [a.user.t_id for a in obj.authors.exclude( user = None ).select_related('user')] def prepare_topics(self, obj): return list(obj.topics.values_list('pk', flat=True))
  • 19. search/indexes/books/ book_text.txt {{ object.title }} {{ object.tagline }} {% for author in object.authors.all %} {{ author.display_name }} {{ author.user.t_screen_name }} {% endfor %} {% for topic in object.topics.all %} {{ topic.name_en }} {% endfor %}
  • 20. Staying fresh • Search engines usually don’t like accepting writes too frequently • RealTimeSearchIndex for low traffic sites • ./manage.py update_index --age=6 (hours) • Uses index.get_updated_field() • Roll your own (message queue or similar...)
  • 21. Replication (diagram): Solr Master → Solr Slave, Solr Slave, Solr Slave
  • 22. Smarter indexing
        from django.db import models

        class Article(models.Model):
            needs_indexing = models.BooleanField(
                default=True, db_index=True
            )
            ...

            def save(self, *args, **kwargs):
                # Every save flags the row for the next indexing run
                self.needs_indexing = True
                super(Article, self).save(*args, **kwargs)
  • 23.
        # Fragment from inside an indexing function (hence the bare return)
        index = site.get_index(model)
        updated_pks = []
        # Index at most 100 flagged objects per run
        objects = index.load_all_queryset().filter(
            needs_indexing=True
        )[:100]
        if not objects:
            return
        for obj in objects:
            updated_pks.append(obj.pk)
            index.update_object(obj)
        # Clear the flag only on the rows that were just indexed
        index.load_all_queryset().filter(
            pk__in=updated_pks
        ).update(needs_indexing=False)
  • 24. nginx + Solr replication trick
        upstream solrmaster {
            server 10.68.43.214:8080;
        }
        upstream solrslaves {
            server 10.68.43.214:8080;
            server 10.193.138.80:8080;
            server 10.204.143.106:8080;
        }
        server {
            listen 8983;
            location /solr/update {
                proxy_pass http://solrmaster;
            }
            location /solr/select {
                proxy_pass http://solrslaves;
            }
        }
  • 25. Screenshot of the Lanyrd dashboard, "Your contacts' calendar" (24 contacts, 182 conferences): "We've found 182 conferences your Twitter contacts are interested in."
        Listed events include: Café Scientifique: Exploring the dark side of star formation with the Herschel Space Observatory (United Kingdom / Brighton, 21st June 2011; topics Astronomy, Science; 4 contacts tracking) and Usability Professionals' Association – International Conference (United States / Atlanta, 21st–24th June 2011; topics Usability, User Experience; 1 contact speaking and 3 contacts tracking).
  • 26.
        # Original implementation
        twitter_ids = [11134, 223455, 33221, ...]  # fetch from Twitter
        attendees = Attendee.objects.filter(
            user__t_id__in=twitter_ids
        ).filter(
            conference__start_date__gte=datetime.date.today()
        )
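  • 27.
        # Current implementation
        from haystack.query import SearchQuerySet

        twitter_ids = [11134, 223455, 33221, ...]  # fetch from Twitter
        sqs = SearchQuerySet()
        sqs = sqs.models(Conference)
        # str.join needs strings, so convert the numeric IDs first
        or_string = ' OR '.join(str(t_id) for t_id in twitter_ids)
        sqs = sqs.narrow('attendees:(%s)' % or_string)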
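The OR clause passed to `narrow()` above can be sketched as a standalone helper, which makes it easy to check without a running Solr. This is only an illustration: `build_attendee_filter` is a hypothetical name, and the `attendees` field name is taken from the slide.

```python
def build_attendee_filter(twitter_ids, field="attendees"):
    """Build a filter string such as 'attendees:(1 OR 2 OR 3)'.

    Hypothetical helper for illustration; not part of Haystack or Lanyrd.
    """
    if not twitter_ids:
        raise ValueError("need at least one Twitter ID")
    # str.join only accepts strings, so convert each numeric ID first.
    or_string = " OR ".join(str(t_id) for t_id in twitter_ids)
    return "%s:(%s)" % (field, or_string)
```

The result would be handed to `sqs.narrow(...)` in place of the hand-built string.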
  • 28. Redis
  • 29. Redis website (screenshot): Redis is an open source, advanced key-value store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets and sorted sets. Redis 2.2.10 is the latest stable version.
  • 31. Screenshot of the Lanyrd page for EuroPython 2011 (The European Python Conference, 19–26 June 2011, Florence in Italy; http://ep2011.europython.eu/, @europython, #europython; short URL lanyrd.com/ccdpc): "You're speaking at this event", 97 people attending, 80 people tracking, 119 speakers. Topics: Django, Plone, Pyramid, Python, Twisted.
  • 33. Celery website (screenshot): Celery, Distributed Task Queue. Background Processing, Distributed, Asynchronous/Synchronous, Concurrency, Periodic Tasks, Retries.
        Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well. The execution units, called tasks, are executed concurrently on a single or more worker servers using multiprocessing, Eventlet, or gevent. Tasks can execute asynchronously (in the background) or synchronously (wait until ready). Celery is used in production systems to process millions of tasks a day.
        Celery is written in Python, but the protocol can be implemented in any language. It can also operate with other languages using webhooks. The recommended message broker is RabbitMQ, but limited support for Redis, Beanstalk, MongoDB, CouchDB, and databases (using SQLAlchemy or the Django ORM) is also available. Celery is easy to integrate with Django, Pylons and Flask, using the django-celery, celery-pylons and Flask-Celery add-on packages.
        News: Celery 2.2 released (by @asksol on 2011-02-01), with a great number of new features, including Jython, eventlet and gevent support; users of Django must also upgrade to django-celery 2.2. Celery 2.1.1 bugfix release (by @asksol on 2010-10-14); users of Django must also upgrade to django-celery 2.1.1.
  • 34. Tasks? • Anything that takes more than about 200ms • Updating a search index • Resizing images • Hitting external APIs • Generating reports
  • 35. Trivial example • Fetch the content of a web page
        import urllib
        from celery.task import task

        @task
        def fetch_url(url):
            return urllib.urlopen(url).read()

        >>> result = fetch_url.delay('http://cnn.com/')
        >>> html = result.wait()
  • 36. Screenshot of a Lanyrd session page: "Python and MongoDB tutorial", a session at EuroPython 2011 (Italy / Florence, 19th–26th June 2011) by Andreas Jung, CEO, ZOPYX Ltd. Date 20th June 2011, time 14:30–18:30 CET; session hash tag #sftzh; short URL lanyrd.com/sftzh; topics MongoDB, Python.
        Abstract: MongoDB is the new star of the so-called NoSQL databases. Using Python with MongoDB is the next logical step after having used Python for years with relational databases. This talk will give an introduction into MongoDB and demonstrate how MongoDB can be used from Python.
        The "Add coverage to this session" box (a URL to coverage such as videos, slides, podcasts, handouts, sketchnotes, photos etc.) contains http://www.slideshare.net/ajung/python-mo
  • 37. Screenshot of the "Add coverage" form for the Python and MongoDB tutorial (EuroPython 2011, Italy / Florence, 19th–26th June 2011). URL: http://www.slideshare.net/ajung/python-mongo-dbtrainingeurop... Link title: Python and MongoDB tutorial. Type of coverage: Link, Audio, Liveblog, Write-up, Sketch notes, Photos, Slides, Transcript, Notes, Video, Handout. Coverage preview, from SlideShare: Python mongo db-training-europython-2011.
  • 38. The task itself... • Tries using http://embed.ly/ to find a preview • Fetches the HTTP headers and first 2048 bytes • If HTML, attempts to extract the <title> • If other, gets the file type and size from headers
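The last two steps of the task can be sketched with the standard library alone. This is a rough illustration under stated assumptions, not Lanyrd's actual task: `extract_title` and `describe_link` are hypothetical names, the embed.ly lookup and the HTTP fetch are left out, and the regular expression is a simplification rather than a real HTML parser.

```python
import re

# Simplified <title> matcher; assumes the tag appears in the first chunk.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(head_bytes, encoding="utf-8"):
    """Return the <title> text found in the first 2048 bytes, or None."""
    text = head_bytes[:2048].decode(encoding, errors="replace")
    match = TITLE_RE.search(text)
    if match is None:
        return None
    # Collapse newlines and repeated spaces inside the title.
    return " ".join(match.group(1).split())


def describe_link(content_type, content_length, head_bytes):
    """Summarise a link from its headers plus the first bytes of the body."""
    mime = content_type.split(";")[0].strip().lower()
    if mime == "text/html":
        return {"kind": "html", "title": extract_title(head_bytes)}
    # Anything else: report the file type and size from the headers only.
    return {"kind": "file", "type": mime, "size": content_length}
```

Reading only the first 2048 bytes keeps the task cheap even when the link points at a large video or PDF.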
  • 39. Behind the scenes...
        ar = enhance_link.delay(url)
        poll_url = '/working/%s/' % signed.dumps({
            'task_id': ar.task_id,
            'on_done_url': on_done_url,
        })
        if 'ajax' in request.POST:
            return render_json(request, {
                'ok': True,
                'poll_url': poll_url,
            })
        else:
            return HttpResponseRedirect(poll_url)
  • 40. And when it’s done...
        from celery.backends import default_backend
        ...
        task_id = request.REQUEST.get('id', '')
        result = default_backend.get_result(task_id)
  • 41. Configuration
        # Carrot / Celery: queue uses Redis
        CARROT_BACKEND = "ghettoq.taproot.Redis"
        BROKER_HOST = "10.11.11.11"  # redis server
        BROKER_PORT = 6379
        BROKER_VHOST = "6"
        # Task results stored in memcached, so they can
        # expire automatically
        CELERY_RESULT_BACKEND = "cache"
        CELERY_CACHE_BACKEND = "memcached://10.11.11.12:11211;..."
  • 43. Phantom load testing • Deploy a new architecture on a brand new EC2 cluster • Leave your existing site on the old cluster • Invisibly link to the new stack from an <img width=1 height=1> element on your live site (not for very long though) • (sensible alternative: find a way to replay log files)
  • 45. Screenshot of the Lanyrd "Django conferences" topic page. Events: EuroPython 2011 (on now; Italy / Florence, 19th–26th June 2011; Django, Plone, Pyramid, Python, Twisted); DjangoCon US 2011 (United States / Portland, 6th–8th September 2011; Django, Open Source, Python); PyCON FR 2011 (France / Rennes, 17th–18th September 2011; Django, Python); PyCon DE 2011 (October). 1 Django event is looking for participants.
        Django coverage: 52 videos (most recent added 3 weeks ago), 52 slide decks (4 hours ago), 3 audio clips (1 week ago), 27 write-ups (1 week ago), 11 handouts (18 hours ago), 3 notes (10 hours ago).
  • 46.
        from django.db import models
        from django.db.models import F

        class Conference(models.Model):
            ...
            cache_version = models.IntegerField(default=0)

            def save(self, *args, **kwargs):
                self.cache_version += 1
                super(Conference, self).save(*args, **kwargs)

            def touch(self):
                # Bump the version in a single UPDATE, without calling save()
                Conference.objects.filter(pk=self.pk).update(
                    cache_version=F('cache_version') + 1
                )
  • 47.
        {% cache 36000 conf-topics conference.pk conference.cache_version %}
        <ul class="tags inline-tags meta">
          {% for topic in conference.topics.all %}
            <li><a href="{{ topic.get_absolute_url }}">{{ topic }}</a></li>
          {% endfor %}
        </ul>
        {% endcache %}
  • 48. Bulk invalidation
        from django.db.models import F

        topic.conferences.all().update(
            cache_version=F('cache_version') + 1
        )
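Why bumping `cache_version` invalidates the fragment can be shown with a plain dictionary standing in for memcached. This is a minimal sketch, not Django's cache machinery: `VersionedCache` and its key layout are hypothetical and only mimic the idea of putting the version number into the cache key.

```python
class VersionedCache(object):
    """Toy cache whose keys include a version number (illustration only)."""

    def __init__(self):
        self._store = {}

    def get_or_compute(self, fragment, pk, version, compute):
        # A new version yields a new key, so the old entry is never read
        # again; in memcached it would simply expire or be evicted.
        key = "%s:%s:%s" % (fragment, pk, version)
        if key not in self._store:
            self._store[key] = compute()
        return self._store[key]
```

Nothing is ever deleted explicitly: incrementing the version is enough to make every reader miss the stale entry and re-render.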
  • 50. Pass data through an untrusted source with confidence that it hasn't been tampered with
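The idea can be sketched with an HMAC from the standard library. This is an assumption-laden illustration, not the signing module used by Lanyrd or Django: `sign`, `unsign` and `BadSignature` are hypothetical names here, and the token layout (base64 payload, a dot, base64 signature) is chosen only for the example.

```python
import base64
import hashlib
import hmac
import json


class BadSignature(Exception):
    """Raised when a token's signature does not match its payload."""


def _b64(data):
    # URL-safe base64 without padding, so tokens can live in URLs.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _mac(payload, secret):
    return _b64(hmac.new(secret, payload.encode("ascii"), hashlib.sha256).digest())


def sign(obj, secret):
    """Serialise obj to JSON and append an HMAC-SHA256 signature."""
    payload = _b64(json.dumps(obj, sort_keys=True).encode("utf-8"))
    return payload + "." + _mac(payload, secret)


def unsign(token, secret):
    """Return the original object, or raise BadSignature if tampered with."""
    payload, _, signature = token.rpartition(".")
    try:
        expected = _mac(payload, secret)
    except UnicodeEncodeError:
        raise BadSignature("payload is not valid base64 text")
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("ascii")):
        raise BadSignature("signature does not match")
    return json.loads(_unb64(payload).decode("utf-8"))
```

The payload is readable by anyone who holds the token; signing only guarantees it has not been altered, so secrets should not be placed in it.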
  • 51. Signing uses • "Unsubscribe" links in emails • lanyrd.com/un/ImN6VyI.ii0Hwm7p71DEcGfaVzziQaxeuu • ?redirect_to=URL protection • Signed cookies • "You are logged in as simonw" without hitting the database
  • 52. Signing in Django 1.4
        from django.core import signing

        signing.dumps({"foo": "bar"})
        signing.loads(signed_string)
        response.set_signed_cookie(key, value...)
        request.get_signed_cookie(key)  # signed cookies are read from the request
  • 53. Hashed static asset filenames in S3/CloudFront
  • 54. global.js global.ed81d119.js cdn.lanyrd.net/js/global.ed81d119.js
  • 55. Benefits • Far-future expiry headers • Cache-Control: max-age=315360000 • Expires: Fri, 18 Jun 2021 06:45:00 GMT • Guaranteed updated CSS in IE • Deploy new assets in advance of application • Old versions stick around for rollbacks
  • 56. ./manage.py push_static • Minifies JavaScript and CSS • Renames files to include sha1(contents)[:6] • Pushes all assets to S3
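The renaming step can be sketched as follows. This is not the actual `push_static` command: `hashed_name` is a hypothetical helper, and the digest length is a parameter because the bullet above says six characters while the example filename on the earlier slide shows eight.

```python
import hashlib
import os.path


def hashed_name(filename, contents, length=6):
    """Insert the first `length` hex digits of sha1(contents) before the extension.

    Hypothetical helper for illustration; `contents` must be bytes.
    """
    digest = hashlib.sha1(contents).hexdigest()[:length]
    root, ext = os.path.splitext(filename)
    return "%s.%s%s" % (root, digest, ext)
```

Because the name depends only on the file's bytes, an unchanged file keeps its URL across deploys and a changed file gets a fresh one, which is what makes the far-future expiry headers safe.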
  • 57. Profiling and debugging production systems
  • 58. UserBasedExceptionMiddleware
        from django.views.debug import technical_500_response
        import sys

        class UserBasedExceptionMiddleware(object):
            def process_exception(self, request, exception):
                if request.user.is_superuser:
                    return technical_500_response(request, *sys.exc_info())
  • 59. mysql-proxy • Very handy lua-customisable proxy for all of your MySQL traffic • Worst documented software ever • log.lua - logs out ALL queries • https://gist.github.com/1039751
  • 60. django_instrumented • (Unreleased) code I wrote for Lanyrd • Collects various runtime stats about the current request, stashes a profile JSON in memcached • Writes out the profile UUID as part of the HTML • A bookmarklet to view the profile
  • 62. mongodb logging • Super-fast inserts, log everything! • Capped collections • Structured queries • Ask me about it in a few months
  • 63. For the future... • Much better profiling, monitoring and alerts • Varnish in front of everything • Replicated MySQL for analytics + upgrades