RightScale Conference Santa Clara 2011: When getting started with a new technology, it’s helpful to hear the war stories and successes of those who have gone before us. We’re excited that several RightScale customers will share their experiences of how they have achieved agility in the cloud.
2. 2
What’s my background?
• Backend Software Engineer / Scalability Expert
• Founded a few companies all run on AWS
• Consultant helping startups move to AWS
• Joined Clicker.com as Director of Operations
• Clicker.com Acquired by CBS Interactive
6. 6
Framework for Operations
• Frameworks all the rage in software development
• MVC – Rails, Symfony, Kohana, Spring
• MVP – Google Web Toolkit
• MTV – Django
7. 7
Framework for Operations
• Frameworks all the rage in software development
• Build your infrastructure the same way
• RightScripts are the Models (where all the logic goes)
• Server Templates are the Views (ties all the logic together into a server)
• RightScale is the Controller (launches & manages servers)
8. 8
Framework for Operations
• Frameworks all the rage in software development
• Build your infrastructure the same way
• Create reusable, standardized components that can be shared
10. 10
Framework for Operations
• Frameworks are popular in web development
• Build your infrastructure the same way
• Create reusable components that can be shared
• Simplify long-term maintenance
11. 11
Standardization of Operations
• Consistent way of doing things
• What works in one cloud can work in another cloud
• rs_tag --add app:role=memcache
• rs_tag --query app:role=memcache
• Clone Deployments, Server Templates, RightScripts
13. 13
Standardization of Operations
• Consistent way of doing things
• What works in one cloud can work in another cloud
• Clone Deployments, Server Templates, RightScripts
• Commoditization of Infrastructure
• Use Private or Public Clouds – it’s all the same
• Simplify migrations
• Reduction in Technical Debt
• Less technical debt than rolling out your own custom solution
• Less of a problem if someone leaves the company
16. 16
Evolution of Operations
• Makes sense to Engineers
• Turn engineers into DevOps
• Don’t silo your ops team from your engineering team
• Reduces Engineering bottlenecks
• Uses a modern approach to operations
• Teaches skills necessary for modern software development
• Operational Insurance
• More people who can fix things reduces liability
• Distribution of responsibility (Human HA?)
18. 18
What have we learned?
• RightScale gives you the tools
• Use templates as a starting point
• Build for High Availability
• Visit http://highscalability.com/
19. 19
What have we learned?
Check out the “Roll your own Server Templates” session for more!
What I want to share with everyone is “Why RightScale?” You’re going to find a few “competing” vendors at the CloudExpo next door, but I challenge anyone to find one as versatile as RightScale. So what I’m going to cover are three compelling justifications for RightScale that I don’t think get enough attention. These are some of the reasons why we went with RightScale and why we’re pushing our agenda throughout the CBS Interactive organization. But before I get into things, let me share a little bit about my background.
I’m an active backend software engineer with experience developing in everything from Ruby to Java to C++. I got my start on AWS earlier than most. In 2006 some friends and I had just founded a startup at the same time EC2 entered private beta. We scored an invite and built our entire product around it. I ended up founding two more companies, which were also based on EC2, before moving into a consulting role helping startups build their products on AWS. One of those was clicker.com, where I joined as Director of Operations. In March of this year, we were acquired by CBS Interactive. Very exciting times for me!
Who is CBS Interactive? We’re the online division of CBS, the broadcast network. We’re a top 10 global web property and the largest premium online content network. I’m sure you’ve heard of some of the brands we own: CNET, Last.fm, TV.com, CBS Sports, and 60 Minutes, to name a few. It’s where a lot of very successful internet properties end up, including clicker.com.
From my technical perspective, here is why RightScale makes sense to us. I’ll go into these in detail, but here’s the gist of it. RightScale is:
• A framework for operations
• A platform for standardization
• The evolution of operations
Frameworks have been all the rage in web development for the last 10 years, but they’ve taken their sweet time to reach the world of operations. Ruby has Rails, Python has Django, PHP has Symfony and Kohana, but what has ops had? Not much.
Frameworks help you break your application into more manageable components. Rather than spend a lot of time focused on the plumbing, you can get to work right away building your product. They establish a set of naming conventions and coding standards to facilitate long-term management and extensibility of your application. Of course, the sign of any good framework is that it doesn’t get in your way.
Let me take this concept of an MVC framework and apply it to RightScale:
• Keep your configuration logic in RightScripts (kind of like models)
• Use Server Templates to tie all your RightScripts together (kind of like views)
• Let RightScale launch, scale, and manage servers across all your clouds (kind of like a controller)
Now, at CBS Interactive we already use CFEngine extensively, which is somewhat redundant with RightScripts. That’s not a problem because, like any good framework, RightScale is extensible. We can just create a simple RightScript to install CFEngine and continue to do things our way. RightScale still acts as the main cloud controller launching servers, but CFEngine takes care of configuring them. It all works together in perfect harmony.
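To make the analogy concrete, here is a minimal Python sketch of how the three roles fit together. Every name in it (`Script`, `Template`, `Controller`, the helper functions) is hypothetical and chosen only for illustration; it models the idea and is not the RightScale API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical names throughout: this models the MVC analogy only,
# it does not call or imitate the RightScale API.

@dataclass
class Script:
    """'Model' role: one reusable unit of configuration logic."""
    name: str
    run: Callable[[Dict[str, str]], None]

@dataclass
class Template:
    """'View' role: an ordered list of scripts that together define a server."""
    name: str
    boot_scripts: List[Script] = field(default_factory=list)

class Controller:
    """'Controller' role: launches servers from templates and tracks them."""

    def __init__(self) -> None:
        self.servers: List[Dict[str, str]] = []

    def launch(self, template: Template, cloud: str) -> Dict[str, str]:
        server = {"template": template.name, "cloud": cloud}
        for script in template.boot_scripts:
            script.run(server)  # each script configures the server in order
        self.servers.append(server)
        return server

def install_cfengine(server: Dict[str, str]) -> None:
    # Stand-in for a script that hands configuration over to CFEngine.
    server["config_manager"] = "cfengine"

def set_memcache_role(server: Dict[str, str]) -> None:
    server["role"] = "memcache"

template = Template(
    "memcache-node",
    [Script("install cfengine", install_cfengine),
     Script("set role", set_memcache_role)],
)
controller = Controller()
server = controller.launch(template, cloud="ec2")
print(server)
```

The point of the split is that a script such as `install_cfengine` can be reused in any number of templates, while the controller stays unaware of what the scripts actually do.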
REUSABLE COMPONENTS
By designing things in a framework-like fashion, we are building up a library of reusable components that can be shared across our organization.
If you look up at the slide, you’ll see some of the things we’ve built templates for. It reads like a lexicon of buzzwords for everything a startup is using these days. We have written our own Server Templates and RightScripts for all of these, and we can share them throughout CBS. They are fully reusable.
LONG-TERM MAINTENANCE
Frameworks help us facilitate the long-term management and extensibility of an application. For example, take jQuery, the JavaScript framework. Because we build our frontend UI at TV.com using jQuery, we can build a cross-browser-compliant website much more easily and quickly than if we had to start from scratch. We don’t worry about new browsers because the jQuery community will fix any compatibility issues that arise before those browsers hit general release.
Well, RightScale is like jQuery, but for cloud management. They abstract away all the dirty inconsistencies between cloud providers and expose one clean interface. Since we’re not in the business of developing a cloud management platform, we decided we shouldn’t be building one ourselves:
• We let RightScale take care of all public and private cloud integrations
• We let them figure out how to innovate in the cloud space
All the while, we’re focusing on our core objective: building an even more successful brand.
I talked a little bit already about how RightScale offers standardization by way of creating a framework. Here’s a tangible example. jQuery lets us design a UI so that it works in any browser. You know how in jQuery you can use the $ function to look up any element in the DOM with a particular id? It’ll work in all the major browsers and we don’t need to do any special hacks. It just works.
Let me draw a parallel. By designing our infrastructure on RightScale, we get the same effect: it’ll work in any cloud that RightScale supports. On RightScale we can “id” our servers by using tags. This is ideal for service discovery: finding all servers within a deployment that have a particular role. On the command line we run “rs_tag” and pass in a few arguments that specify what this server does, like memcache or Hadoop data node. Then we can run the “rs_tag” command again and query by tag to find those servers.
EC2 supports something similar and also has command-line tools, but the problem is that it only works on EC2. By using RightScale we get one behavior that works consistently everywhere, such as on Rackspace or in a private cloud. No hacking needed.
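The tag-then-query pattern can be sketched in a few lines of Python. This is an in-memory stand-in written for illustration only: `TagRegistry` and the server names are hypothetical, and the sketch does not call rs_tag or any RightScale service. Only the tag format `app:role=memcache` comes from the slide.

```python
from collections import defaultdict
from typing import Dict, List, Set

class TagRegistry:
    """Hypothetical in-memory registry illustrating tag-based service discovery."""

    def __init__(self) -> None:
        self._servers_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def add(self, server: str, tag: str) -> None:
        """Attach a tag to a server (conceptually what adding a tag does)."""
        self._servers_by_tag[tag].add(server)

    def query(self, tag: str) -> List[str]:
        """Return every server carrying the tag, regardless of which cloud it runs in."""
        return sorted(self._servers_by_tag.get(tag, set()))

registry = TagRegistry()
# Server names are made up; note that they live in different clouds.
registry.add("cache-1.ec2", "app:role=memcache")
registry.add("cache-2.rackspace", "app:role=memcache")
registry.add("data-1.ec2", "app:role=hadoop_datanode")

memcache_servers = registry.query("app:role=memcache")
print(memcache_servers)
```

An application server that needs its cache pool only has to ask for the role; it never needs to know which cloud each cache node was launched in.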
COMMODITIZATION OF INFRASTRUCTURE
Now, we can take this a step further. We can essentially commoditize our infrastructure so that we don’t have to care as much where it’s located. Instead of building for Amazon, we only worry about building for RightScale. We let them act as the translation layer between all the various cloud API endpoints.
A real-world problem at CBS is that we have datacenters spread out across the entire world, but each has a very specialized configuration. An effort is underway to consolidate these datacenters and modernize our infrastructure. Unfortunately, this is a major undertaking because everything needs to be built by hand from the ground up in the new datacenters. And most of CBS is not yet on RightScale.
Now we’ve found a smarter way of doing it. Instead of rebuilding from the ground up the old way, we’re doing it on RightScale. This will make migrations much simpler down the road and means we aren’t bound to a particular datacenter out of fear of relocating. If you think about it, it can give us a lot of bargaining power when it comes time to renegotiate contracts.
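As a rough illustration of what a translation layer buys you, here is a small Python sketch of the adapter idea. All classes and names (`CloudDriver`, `Ec2Driver`, `PrivateCloudDriver`, `CloudManager`) are hypothetical, and the drivers return placeholder strings instead of calling any real cloud API.

```python
from abc import ABC, abstractmethod
from typing import Dict

class CloudDriver(ABC):
    """Hypothetical per-cloud adapter hidden behind one common interface."""

    @abstractmethod
    def launch(self, template: str) -> str:
        """Launch a server from the named template and return a server id."""

class Ec2Driver(CloudDriver):
    def launch(self, template: str) -> str:
        # A real driver would call the provider's API here.
        return f"ec2:{template}"

class PrivateCloudDriver(CloudDriver):
    def launch(self, template: str) -> str:
        return f"private:{template}"

class CloudManager:
    """The translation layer: callers name a cloud, never a provider API."""

    def __init__(self, drivers: Dict[str, CloudDriver]) -> None:
        self._drivers = drivers

    def launch(self, cloud: str, template: str) -> str:
        return self._drivers[cloud].launch(template)

manager = CloudManager({"ec2": Ec2Driver(), "private": PrivateCloudDriver()})

# Migrating means launching the same template somewhere else.
before = manager.launch("ec2", "web-frontend")
after = manager.launch("private", "web-frontend")
print(before, after)
```

Because the template name is the only thing the caller supplies, moving a workload between clouds changes one argument and leaves the server definition untouched.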
REDUCTION IN TECHNICAL DEBT
My goal is to help us reduce both the amount of operational technical debt we carry and the rate at which we acquire it. This starts with maintaining less of the code ourselves. It also means we need to get in the habit of constantly rebuilding and provisioning servers so that we keep the gears greased and the parts moving.
We have so much technical debt at CBS partially because we’re very successful. We’ve been around longer than most of the tools used in DevOps today. We didn’t have a choice but to build them ourselves the first time around, but now that better tools exist we cannot afford to continue maintaining the ones we have, because they’re not essential to driving our core business forward. This is why we’re switching. As an engineer myself, I want to stay current on technology; that’s what’s important to move my career forward.
MAKES SENSE TO ENGINEERS
Software engineers understand frameworks. Rubyists love talking about keeping things DRY. We like simplicity and consistency. This is what we get with RightScale.
But here’s something else we get which doesn’t get enough attention: a more technically savvy software engineering team. There’s no reason we need to silo our ops teams from our engineers any more. In fact, I’d argue that engineers who don’t adapt to the new paradigm of DevOps will find it harder and harder to get work. Engineers should be learning how to build their systems in cloud environments so that they write better code and take into consideration the design aspects of our infrastructure. What’s more, I see engineers who are responsible for configuring and managing their own deployments take steps to ensure that those deployments are easier to deploy and maintain. This is because they have to eat their own dog food.
REDUCES BOTTLENECKS
Since our engineers at TV.com are more involved in operations, we’ve eliminated a lot of the common bottlenecks. We have about 20 engineers working on individual projects that get released on a weekly basis. We have over 40 standalone in-house applications or services that we built and manage. We have 2 people in operations and no bottlenecks in ops.
This is possible because we’ve trained our engineers on how to use RightScale and build their own Server Templates. We empower them and teach them the sysadmin skills they need so that they can take matters into their own hands. Our engineers can perform many tasks that were once reserved for admins. They can solve performance problems. When a server crashes, they can relaunch it. When load spikes and PagerDuty calls them, they can go in, figure out why, and resolve it. All the while, I’m sleeping pretty well because I don’t have to do everything! Life’s good =)
OPERATIONAL INSURANCE
RightScale can also be thought of as a bit of insurance. The more people in our organization who know how our product operates on both the frontend and the backend, the better off we are. We’re more resilient to what I call “human” failure. If I get run over by a bus or win the lottery, things will continue to move forward.
In the future, I envision our infrastructure looking more like this: a federation of clouds. It’s an infrastructure broken into smaller divisions (regions) that have some degree of internal autonomy. Within CBS, we’ll be able to let business units choose where they want to be and how they want to scale out their technology. They’ll be able to use public clouds like EC2 or private clouds that we control. We want business units within CBS to be as agile as we were as a startup. We’re at the tip of the iceberg. Before we were acquired, we were a cash-strapped startup. Now we’re at CBS Interactive, so we’re looking to go big and build a more future-proof infrastructure.
We’ve learned a lot of things along the way. The most important is that RightScale is just another tool in the tool chest. RightScale alone is not a silver bullet for 100% uptime. We still need to build for fault tolerance and high availability, and embrace frequent failures for some subset of our systems. I like to say we operate in a constant state of failure. We test everything all the time. This is why, when the Amazonpocalypse hit in April 2011, we got away unscathed while many others did not. I recommend checking out the highscalability.com blog for inspiration.
Also check out the session “Roll your own Server Templates” with Darryl Eaton and myself later this afternoon to hear more about my experiences. That about wraps it up.