SlideShare ist ein Scribd-Unternehmen logo
1 von 25
Architecture of PBS.org
 DCPython - June 7, 2011
PBS is…

• PBS is a national federation of independently owned and
  operated public television stations and producers
   – Each with their own management and development resources


• 1500+ highly trafficked websites:
   –   http://www.pbs.org/
   –   http://www.pbs.org/nova/
   –   http://pbskids.org/
   –   http://pbskids.org/sesame/
   –   http://video.pbs.org/

• Enterprise services/APIs
PBS is not!



• Radio is easy… We do television!




• Or any of the other ~200 local stations.
What we do

• Technology leadership within public
  broadcasting community
• Distribution of national programming content
• Services to local stations
• Core application development. Yeah!!!
A few of our sites
History of PBS.org




      Early 1990’s: Hand rolled static html
       Late 1990’s: Hand crafted static html + CGI!
    Most of 2000’s: Zope/Plone CMS generated static html
          2008-10: Django generated static html
Launched Oct 2010: Django all the way
COVE API

• Contains the metadata for all PBS videos online
  including pointers to streaming video
• Needed to be:
   – Secure
   – Fast
   – Scalable
COVE API – Technology Stack

•   Amazon Elastic Cluster Computing (EC2)
•   Amazon Relational Database Service (RDS)
•   Linux
•   Python
•   Django
•   Piston for REST API
COVE API - Architecture
                         Internet

                    Elastic Load Balancer

 Auto Scale Array
         App Server 1      …        App Server N

                         HA Proxy

                                                      S3
           RDS Master               RDS Slave 11
                                     RDS Slave
                                      RDS Slave 1   Backups

App Sync Server
COVE API – Management Tools

• Amazon Web Service Console
• RightScale
• Splunk
COVE API – Interesting Stuff

• Easy to load test
  – Duplicate environment for several days
• Easy to scale
  – Autoscale array grows automatically
• Easy to upgrade
  – Each server built from vanilla base
COVE API – Lessons learned
• Use normalized data for administration and de-
  normalized data for API
COVE API – Lessons learned
• Piston is fine, but lacks flexibility without
  significant customization
   – TastyPie?
• JSON is probably good enough
• Don’t get fancy with your endpoints
• Stick to REST principles
• Don’t get fancy with your authentication
   – Use OAuth2 or simple token
PBS.org and Merlin API

• PBS.org
   – Slim, fast layer
   – Pulls data from Merlin API
   – Uses memcache extensively
   – Currently Django, but could be anything (Flask?)

• Merlin API
  – Aggregate content from distributed CMSes
  – Expose via standardized API
  – Power PBS.org and more
Merlin API – Technology stack

•   Python       • Amazon Web Services (“cloud”)
•   Django         – EC2
•   MySQL          – RDS - Relational Database Service
                      – ELB - Elastic Load Balancing
•   Piston
                          – Cloudfront CDN
•   Solr                      – S3Storage
•   Celery
•   RabbitMQ
Data flow




RSS Feed   Standardized
Ingestor       API
Merlin API architecture

   API Endpoint – Django Piston

          Search service             Indexing service
         Django-haystack                   Solr

     Data layer – MySQL (RDS)

Administration      Feed ingestion
Django admin           Celery
Merlin API server topology

          Internet



     Elastic Load Balancer


Autoscaling     App #N
                 App #N                Solr   Master
      array       App #N     Celery
                    App #n            Index   DB RDS


               S3 backups
Merlin API – Management Tools

• Amazon Web Service Console
• RightScale
• Splunk
API - Piston/Haystack/Solr
class WebObjectIndexHandler(BaseHandler):
    ...
    def get_queryset(self):
        ...
        return PistonSearchQuerySet().models(*models)

from haystack.query import SearchQuerySet
class PistonSearchQuerySet(SearchQuerySet):
    ...
    def __getitem__(self, k):
        ...
        return [IndexSerializer(i) for i in
super(PistonSearchQuerySet, self).__getitem__(k)]
Feed ingestor - Celery
from celery.decorators import task, periodic_task

@periodic_task(run_every=timedelta(seconds=300))
def update_webobject_states():
    ...
solr_visible = WebObject.children.filter(visible=True)
solr_visible = solr_visible.exclude(
flag__api_visible=True, available__isnull=True)
    ...
    updated = solr_visible.update(visible=False,
is_indexed = False)
    ...
signals.bulk_update.send('tasks.update_webobject_states')
Merlin API - Lessons learned

• Memcached was not necessary
• Denormalized search data via Solr index is much faster
  than querying database
• Asynchronous task delegation is awesome
• Celery prone to memory leaks
• App server array for easy horizontal scaling
   – Even if not autoscaling, increase min servers
• Never trust data you don’t control (validate!)
Resources

•   http://lucene.apache.org/solr/
•   http://haystacksearch.org/
•   http://celeryproject.org/
•   http://celeryproject.org/docs/django-celery/
•   http://aws.amazon.com/
PBS Developer Community

• Dedicated to making open.PBS the industry
  standard in open development communities.

                  http://open.pbs.org/
                  https://github.com/pbs

                  open@pbs.org
Questions?

  Drew Engelson
  drew@engelson.net
  http://tomatohater.com



  Edgar Roman
  emroman@pbs.org

Weitere ähnliche Inhalte

Was ist angesagt?

Local Testing and Deployment Best Practices for Serverless Applications - AWS...
Local Testing and Deployment Best Practices for Serverless Applications - AWS...Local Testing and Deployment Best Practices for Serverless Applications - AWS...
Local Testing and Deployment Best Practices for Serverless Applications - AWS...Amazon Web Services
 
Designing Fault Tolerant Applications on AWS - Janakiram MSV
Designing Fault Tolerant Applications on AWS - Janakiram MSVDesigning Fault Tolerant Applications on AWS - Janakiram MSV
Designing Fault Tolerant Applications on AWS - Janakiram MSVAmazon Web Services
 
(DVO305) Turbocharge YContinuous Deployment Pipeline with Containers
(DVO305) Turbocharge YContinuous Deployment Pipeline with Containers(DVO305) Turbocharge YContinuous Deployment Pipeline with Containers
(DVO305) Turbocharge YContinuous Deployment Pipeline with ContainersAmazon Web Services
 
Building Serverless Web Applications - DevDay Austin 2017
Building Serverless Web Applications - DevDay Austin 2017Building Serverless Web Applications - DevDay Austin 2017
Building Serverless Web Applications - DevDay Austin 2017Amazon Web Services
 
(DVO308) Docker & ECS in Production: How We Migrated Our Infrastructure from ...
(DVO308) Docker & ECS in Production: How We Migrated Our Infrastructure from ...(DVO308) Docker & ECS in Production: How We Migrated Our Infrastructure from ...
(DVO308) Docker & ECS in Production: How We Migrated Our Infrastructure from ...Amazon Web Services
 
Amazon Elastic Beanstalk
Amazon Elastic BeanstalkAmazon Elastic Beanstalk
Amazon Elastic BeanstalkEberhard Wolff
 
Transforming Software Development
Transforming Software DevelopmentTransforming Software Development
Transforming Software DevelopmentAmazon Web Services
 
AppScale Talk at SBonRails
AppScale Talk at SBonRailsAppScale Talk at SBonRails
AppScale Talk at SBonRailsChris Bunch
 
Deploy, Manage, and Scale Your Apps with OpsWorks and Elastic Beanstalk
Deploy, Manage, and Scale Your Apps with OpsWorks and Elastic BeanstalkDeploy, Manage, and Scale Your Apps with OpsWorks and Elastic Beanstalk
Deploy, Manage, and Scale Your Apps with OpsWorks and Elastic BeanstalkAmazon Web Services
 
Building Serverless Web Applications - May 2017 AWS Online Tech Talks
Building Serverless Web Applications  - May 2017 AWS Online Tech TalksBuilding Serverless Web Applications  - May 2017 AWS Online Tech Talks
Building Serverless Web Applications - May 2017 AWS Online Tech TalksAmazon Web Services
 
AppScale @ LA.rb
AppScale @ LA.rbAppScale @ LA.rb
AppScale @ LA.rbChris Bunch
 
Stop Worrying about Prodweb001 and Start Loving i-98fb9856 (ARC201) | AWS re:...
Stop Worrying about Prodweb001 and Start Loving i-98fb9856 (ARC201) | AWS re:...Stop Worrying about Prodweb001 and Start Loving i-98fb9856 (ARC201) | AWS re:...
Stop Worrying about Prodweb001 and Start Loving i-98fb9856 (ARC201) | AWS re:...Amazon Web Services
 
Alfresco WCM For High Scalability
Alfresco WCM For High ScalabilityAlfresco WCM For High Scalability
Alfresco WCM For High ScalabilityAlfresco Software
 
Active Cloud DB at CloudComp '10
Active Cloud DB at CloudComp '10Active Cloud DB at CloudComp '10
Active Cloud DB at CloudComp '10Chris Bunch
 
從劍宗到氣宗 - 談AWS ECS與Serverless最佳實踐
從劍宗到氣宗  - 談AWS ECS與Serverless最佳實踐從劍宗到氣宗  - 談AWS ECS與Serverless最佳實踐
從劍宗到氣宗 - 談AWS ECS與Serverless最佳實踐Pahud Hsieh
 
AppScale + Neptune @ HPCDB
AppScale + Neptune @ HPCDBAppScale + Neptune @ HPCDB
AppScale + Neptune @ HPCDBChris Bunch
 
Appscale at CLOUDCOMP '09
Appscale at CLOUDCOMP '09Appscale at CLOUDCOMP '09
Appscale at CLOUDCOMP '09Chris Bunch
 
Auto scaling with Ruby, AWS, Jenkins and Redis
Auto scaling with Ruby, AWS, Jenkins and RedisAuto scaling with Ruby, AWS, Jenkins and Redis
Auto scaling with Ruby, AWS, Jenkins and RedisYi Hsuan (Jeddie) Chuang
 

Was ist angesagt? (20)

Local Testing and Deployment Best Practices for Serverless Applications - AWS...
Local Testing and Deployment Best Practices for Serverless Applications - AWS...Local Testing and Deployment Best Practices for Serverless Applications - AWS...
Local Testing and Deployment Best Practices for Serverless Applications - AWS...
 
Designing Fault Tolerant Applications on AWS - Janakiram MSV
Designing Fault Tolerant Applications on AWS - Janakiram MSVDesigning Fault Tolerant Applications on AWS - Janakiram MSV
Designing Fault Tolerant Applications on AWS - Janakiram MSV
 
(DVO305) Turbocharge YContinuous Deployment Pipeline with Containers
(DVO305) Turbocharge YContinuous Deployment Pipeline with Containers(DVO305) Turbocharge YContinuous Deployment Pipeline with Containers
(DVO305) Turbocharge YContinuous Deployment Pipeline with Containers
 
Building Serverless Web Applications - DevDay Austin 2017
Building Serverless Web Applications - DevDay Austin 2017Building Serverless Web Applications - DevDay Austin 2017
Building Serverless Web Applications - DevDay Austin 2017
 
(DVO308) Docker & ECS in Production: How We Migrated Our Infrastructure from ...
(DVO308) Docker & ECS in Production: How We Migrated Our Infrastructure from ...(DVO308) Docker & ECS in Production: How We Migrated Our Infrastructure from ...
(DVO308) Docker & ECS in Production: How We Migrated Our Infrastructure from ...
 
Amazon Elastic Beanstalk
Amazon Elastic BeanstalkAmazon Elastic Beanstalk
Amazon Elastic Beanstalk
 
Transforming Software Development
Transforming Software DevelopmentTransforming Software Development
Transforming Software Development
 
AppScale Talk at SBonRails
AppScale Talk at SBonRailsAppScale Talk at SBonRails
AppScale Talk at SBonRails
 
Deploy, Manage, and Scale Your Apps with OpsWorks and Elastic Beanstalk
Deploy, Manage, and Scale Your Apps with OpsWorks and Elastic BeanstalkDeploy, Manage, and Scale Your Apps with OpsWorks and Elastic Beanstalk
Deploy, Manage, and Scale Your Apps with OpsWorks and Elastic Beanstalk
 
Building Serverless Web Applications - May 2017 AWS Online Tech Talks
Building Serverless Web Applications  - May 2017 AWS Online Tech TalksBuilding Serverless Web Applications  - May 2017 AWS Online Tech Talks
Building Serverless Web Applications - May 2017 AWS Online Tech Talks
 
AppScale @ LA.rb
AppScale @ LA.rbAppScale @ LA.rb
AppScale @ LA.rb
 
Stop Worrying about Prodweb001 and Start Loving i-98fb9856 (ARC201) | AWS re:...
Stop Worrying about Prodweb001 and Start Loving i-98fb9856 (ARC201) | AWS re:...Stop Worrying about Prodweb001 and Start Loving i-98fb9856 (ARC201) | AWS re:...
Stop Worrying about Prodweb001 and Start Loving i-98fb9856 (ARC201) | AWS re:...
 
Alfresco WCM For High Scalability
Alfresco WCM For High ScalabilityAlfresco WCM For High Scalability
Alfresco WCM For High Scalability
 
Active Cloud DB at CloudComp '10
Active Cloud DB at CloudComp '10Active Cloud DB at CloudComp '10
Active Cloud DB at CloudComp '10
 
從劍宗到氣宗 - 談AWS ECS與Serverless最佳實踐
從劍宗到氣宗  - 談AWS ECS與Serverless最佳實踐從劍宗到氣宗  - 談AWS ECS與Serverless最佳實踐
從劍宗到氣宗 - 談AWS ECS與Serverless最佳實踐
 
AppScale + Neptune @ HPCDB
AppScale + Neptune @ HPCDBAppScale + Neptune @ HPCDB
AppScale + Neptune @ HPCDB
 
Appscale at CLOUDCOMP '09
Appscale at CLOUDCOMP '09Appscale at CLOUDCOMP '09
Appscale at CLOUDCOMP '09
 
Amazon ECS Deep Dive
Amazon ECS Deep DiveAmazon ECS Deep Dive
Amazon ECS Deep Dive
 
Auto scaling with Ruby, AWS, Jenkins and Redis
Auto scaling with Ruby, AWS, Jenkins and RedisAuto scaling with Ruby, AWS, Jenkins and Redis
Auto scaling with Ruby, AWS, Jenkins and Redis
 
Managing Your Cloud Assets
Managing Your Cloud AssetsManaging Your Cloud Assets
Managing Your Cloud Assets
 

Ähnlich wie Architecture at PBS

DCPython: Architecture at PBS (Jun 7, 2011)
DCPython: Architecture at PBS (Jun 7, 2011)DCPython: Architecture at PBS (Jun 7, 2011)
DCPython: Architecture at PBS (Jun 7, 2011)Drew Engelson
 
NetflixOSS for Triangle Devops Oct 2013
NetflixOSS for Triangle Devops Oct 2013NetflixOSS for Triangle Devops Oct 2013
NetflixOSS for Triangle Devops Oct 2013aspyker
 
A 60-mn tour of AWS compute (March 2016)
A 60-mn tour of AWS compute (March 2016)A 60-mn tour of AWS compute (March 2016)
A 60-mn tour of AWS compute (March 2016)Julien SIMON
 
Agile Deployment using Git and AWS Elastic Beanstalk
Agile Deployment using Git and AWS Elastic BeanstalkAgile Deployment using Git and AWS Elastic Beanstalk
Agile Deployment using Git and AWS Elastic BeanstalkAmazon Web Services
 
Amazon Web Services - Elastic Beanstalk
Amazon Web Services - Elastic BeanstalkAmazon Web Services - Elastic Beanstalk
Amazon Web Services - Elastic BeanstalkAmazon Web Services
 
Journey Towards Scaling Your Application to 10 million users
Journey Towards Scaling Your Application to 10 million usersJourney Towards Scaling Your Application to 10 million users
Journey Towards Scaling Your Application to 10 million usersAmazon Web Services
 
Agile Deployment using Git and AWS Elastic Beanstalk
Agile Deployment using Git and AWS Elastic BeanstalkAgile Deployment using Git and AWS Elastic Beanstalk
Agile Deployment using Git and AWS Elastic BeanstalkAmazon Web Services
 
Journey Towards Scaling Your Application to Million Users
Journey Towards Scaling Your Application to Million UsersJourney Towards Scaling Your Application to Million Users
Journey Towards Scaling Your Application to Million UsersAdrian Hornsby
 
Scaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersScaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersAmazon Web Services
 
04 integrate entityframework
04 integrate entityframework04 integrate entityframework
04 integrate entityframeworkErhwen Kuo
 
AWS and Serverless with Alexa
AWS and Serverless with AlexaAWS and Serverless with Alexa
AWS and Serverless with AlexaRory Preddy
 
Evolving IGN’s New APIs with Scala
 Evolving IGN’s New APIs with Scala Evolving IGN’s New APIs with Scala
Evolving IGN’s New APIs with ScalaManish Pandit
 
AWS as platform for scalable applications
AWS as platform for scalable applicationsAWS as platform for scalable applications
AWS as platform for scalable applicationsRoman Gomolko
 
What is Amazon Web Services & How to Start to deploy your apps ?
What is Amazon Web Services & How to Start to deploy your apps ?What is Amazon Web Services & How to Start to deploy your apps ?
What is Amazon Web Services & How to Start to deploy your apps ?Sébastien ☁ Stormacq
 
Scaling up to Your First 10 Million Users
Scaling up to Your First 10 Million UsersScaling up to Your First 10 Million Users
Scaling up to Your First 10 Million UsersAmazon Web Services
 
Rails as iOS Application Backend
Rails as iOS Application BackendRails as iOS Application Backend
Rails as iOS Application Backendmaximeguilbot
 
Serverless Web Apps using API Gateway, Lambda and DynamoDB
Serverless Web Apps using API Gateway, Lambda and DynamoDBServerless Web Apps using API Gateway, Lambda and DynamoDB
Serverless Web Apps using API Gateway, Lambda and DynamoDBAmazon Web Services
 

Ähnlich wie Architecture at PBS (20)

DCPython: Architecture at PBS (Jun 7, 2011)
DCPython: Architecture at PBS (Jun 7, 2011)DCPython: Architecture at PBS (Jun 7, 2011)
DCPython: Architecture at PBS (Jun 7, 2011)
 
NetflixOSS for Triangle Devops Oct 2013
NetflixOSS for Triangle Devops Oct 2013NetflixOSS for Triangle Devops Oct 2013
NetflixOSS for Triangle Devops Oct 2013
 
A 60-mn tour of AWS compute (March 2016)
A 60-mn tour of AWS compute (March 2016)A 60-mn tour of AWS compute (March 2016)
A 60-mn tour of AWS compute (March 2016)
 
Agile Deployment using Git and AWS Elastic Beanstalk
Agile Deployment using Git and AWS Elastic BeanstalkAgile Deployment using Git and AWS Elastic Beanstalk
Agile Deployment using Git and AWS Elastic Beanstalk
 
Amazon Web Services - Elastic Beanstalk
Amazon Web Services - Elastic BeanstalkAmazon Web Services - Elastic Beanstalk
Amazon Web Services - Elastic Beanstalk
 
Journey Towards Scaling Your Application to 10 million users
Journey Towards Scaling Your Application to 10 million usersJourney Towards Scaling Your Application to 10 million users
Journey Towards Scaling Your Application to 10 million users
 
[Jun AWS 201] Technical Workshop
[Jun AWS 201] Technical Workshop[Jun AWS 201] Technical Workshop
[Jun AWS 201] Technical Workshop
 
DevOpsCon Cloud Workshop
DevOpsCon Cloud Workshop DevOpsCon Cloud Workshop
DevOpsCon Cloud Workshop
 
Agile Deployment using Git and AWS Elastic Beanstalk
Agile Deployment using Git and AWS Elastic BeanstalkAgile Deployment using Git and AWS Elastic Beanstalk
Agile Deployment using Git and AWS Elastic Beanstalk
 
DevOps on AWS
DevOps on AWSDevOps on AWS
DevOps on AWS
 
Journey Towards Scaling Your Application to Million Users
Journey Towards Scaling Your Application to Million UsersJourney Towards Scaling Your Application to Million Users
Journey Towards Scaling Your Application to Million Users
 
Scaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersScaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million Users
 
04 integrate entityframework
04 integrate entityframework04 integrate entityframework
04 integrate entityframework
 
AWS and Serverless with Alexa
AWS and Serverless with AlexaAWS and Serverless with Alexa
AWS and Serverless with Alexa
 
Evolving IGN’s New APIs with Scala
 Evolving IGN’s New APIs with Scala Evolving IGN’s New APIs with Scala
Evolving IGN’s New APIs with Scala
 
AWS as platform for scalable applications
AWS as platform for scalable applicationsAWS as platform for scalable applications
AWS as platform for scalable applications
 
What is Amazon Web Services & How to Start to deploy your apps ?
What is Amazon Web Services & How to Start to deploy your apps ?What is Amazon Web Services & How to Start to deploy your apps ?
What is Amazon Web Services & How to Start to deploy your apps ?
 
Scaling up to Your First 10 Million Users
Scaling up to Your First 10 Million UsersScaling up to Your First 10 Million Users
Scaling up to Your First 10 Million Users
 
Rails as iOS Application Backend
Rails as iOS Application BackendRails as iOS Application Backend
Rails as iOS Application Backend
 
Serverless Web Apps using API Gateway, Lambda and DynamoDB
Serverless Web Apps using API Gateway, Lambda and DynamoDBServerless Web Apps using API Gateway, Lambda and DynamoDB
Serverless Web Apps using API Gateway, Lambda and DynamoDB
 

Mehr von Public Broadcasting Service (10)

Cloud Orchestration is Broken
Cloud Orchestration is BrokenCloud Orchestration is Broken
Cloud Orchestration is Broken
 
Pycon2013
Pycon2013Pycon2013
Pycon2013
 
Simplified Localization+ Presentation
Simplified Localization+ PresentationSimplified Localization+ Presentation
Simplified Localization+ Presentation
 
PBS Localization+ API Webinar
PBS Localization+ API WebinarPBS Localization+ API Webinar
PBS Localization+ API Webinar
 
Mobile Presentation at PBS TECH CON 2011
Mobile Presentation at PBS TECH CON 2011Mobile Presentation at PBS TECH CON 2011
Mobile Presentation at PBS TECH CON 2011
 
PBS Presentation at AWS Summit 2012
PBS Presentation at AWS Summit 2012PBS Presentation at AWS Summit 2012
PBS Presentation at AWS Summit 2012
 
I've Got a Key to Your API, Now What? (Joint PBS and NPR API Presentation Giv...
I've Got a Key to Your API, Now What? (Joint PBS and NPR API Presentation Giv...I've Got a Key to Your API, Now What? (Joint PBS and NPR API Presentation Giv...
I've Got a Key to Your API, Now What? (Joint PBS and NPR API Presentation Giv...
 
SQL Injection Defense in Python
SQL Injection Defense in PythonSQL Injection Defense in Python
SQL Injection Defense in Python
 
PBS Tech Con 2011 API Workshop
PBS Tech Con 2011 API WorkshopPBS Tech Con 2011 API Workshop
PBS Tech Con 2011 API Workshop
 
Fall2010 producer summit_openpbs_final
Fall2010 producer summit_openpbs_finalFall2010 producer summit_openpbs_final
Fall2010 producer summit_openpbs_final
 

Kürzlich hochgeladen

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 

Kürzlich hochgeladen (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 

Architecture at PBS

  • 1. Architecture of PBS.org DCPython - June 7, 2011
  • 2. PBS is… • PBS is a national federation of independently owned and operated public television stations and producers – Each with their own management and development resources • 1500+ highly trafficked websites: – http://www.pbs.org/ – http://www.pbs.org/nova/ – http://pbskids.org/ – http://pbskids.org/sesame/ – http://video.pbs.org/ • Enterprise services/APIs
  • 3. PBS is not! • Radio is easy… We do television! • Or any of the other ~200 local stations.
  • 4. What we do • Technology leadership within public broadcasting community • Distribution of national programming content • Services to local stations • Core application development. Yeah!!!
  • 5. A few of our sites
  • 6. History of PBS.org Early 1990’s: Hand rolled static html Late 1990’s: Hand crafted static html + CGI! Most of 2000’s: Zope/Plone CMS generated static html 2008-10: Django generated static html Launched Oct 2010: Django all the way
  • 7. COVE API • Contains the metadata for all PBS videos online including pointers to streaming video • Needed to be: – Secure – Fast – Scalable
  • 8. COVE API – Technology Stack • Amazon Elastic Cluster Computing (EC2) • Amazon Relational Database Service (RDS) • Linux • Python • Django • Piston for REST API
  • 9. COVE API - Architecture Internet Elastic Load Balancer Auto Scale Array App Server 1 … App Server N HA Proxy S3 RDS Master RDS Slave 11 RDS Slave RDS Slave 1 Backups App Sync Server
  • 10. COVE API – Management Tools • Amazon Web Service Console • RightScale • Splunk
  • 11. COVE API – Interesting Stuff • Easy to load test – Duplicate environment for several days • Easy to scale – Autoscale array grows automatically • Easy to upgrade – Each server built from vanilla base
  • 12. COVE API – Lessons learned • Use normalized data for administration and de- normalized data for API
  • 13. COVE API – Lessons learned • Piston is fine, but lacks flexibility without significant customization – TastyPie? • JSON is probably good enough • Don’t get fancy with your endpoints • Stick to REST principles • Don’t get fancy with your authentication – Use OAuth2 or simple token
  • 14. PBS.org and Merlin API • PBS.org – Slim, fast layer – Pulls data from Merlin API – Uses memcache extensively – Currently Django, but could be anything (Flask?) • Merlin API – Aggregate content from distributed CMSes – Expose via standardized API – Power PBS.org and more
  • 15. Merlin API – Technology stack • Python • Amazon Web Services (“cloud”) • Django – EC2 • MySQL – RDS - Relational Database Service – ELB - Elastic Load Balancing • Piston – Cloudfront CDN • Solr – S3Storage • Celery • RabbitMQ
  • 16. Data flow RSS Feed Standardized Ingestor API
  • 17. Merlin API architecture API Endpoint – Django Piston Search service Indexing service Django-haystack Solr Data layer – MySQL (RDS) Administration Feed ingestion Django admin Celery
  • 18. Merlin API server topology Internet Elastic Load Balancer Autoscaling App #N App #N Solr Master array App #N Celery App #n Index DB RDS S3 backups
  • 19. Merlin API – Management Tools • Amazon Web Service Console • RightScale • Splunk
  • 20. API - Piston/Haystack/Solr class WebObjectIndexHandler(BaseHandler): ... def get_queryset(self): ... return PistonSearchQuerySet().models(*models) from haystack.query import SearchQuerySet class PistonSearchQuerySet(SearchQuerySet): ... def __getitem__(self, k): ... return [IndexSerializer(i) for i in super(PistonSearchQuerySet, self).__getitem__(k)]
  • 21. Feed ingestor - Celery from celery.decorators import task, periodic_task @periodic_task(run_every=timedelta(seconds=300)) def update_webobject_states(): ... solr_visible = WebObject.children.filter(visible=True) solr_visible = solr_visible.exclude( flag__api_visible=True, available__isnull=True) ... updated = solr_visible.update(visible=False, is_indexed = False) ... signals.bulk_update.send('tasks.update_webobject_states')
  • 22. Merlin API - Lessons learned • Memcached was not necessary • Denormalized search data via Solr index is much faster than querying database • Asynchronous task delegation is awesome • Celery prone to memory leaks • App server array for easy horizontal scaling – Even if not autoscaling, increase min servers • Never trust data you don’t control (validate!)
  • 23. Resources • http://lucene.apache.org/solr/ • http://haystacksearch.org/ • http://celeryproject.org/ • http://celeryproject.org/docs/django-celery/ • http://aws.amazon.com/
  • 24. PBS Developer Community • Dedicated to making open.PBS the industry standard in open development communities. http://open.pbs.org/ https://github.com/pbs open@pbs.org
  • 25. Questions? Drew Engelson drew@engelson.net http://tomatohater.com Edgar Roman emroman@pbs.org