Austin Cloud Users Group - August 23rd, 2011

copperegg
Austin CUG - August 23rd, 2011
(presented by Eric Anderson)
anderson@copperegg.com

Wednesday, August 24, 11

About Us
CopperEgg
• Founded spring 2010
• Super real-time monitoring and analytics

About me (Eric Anderson)
• SysAdmin - Centaur - 1999-2007
• 1400 compute nodes, ~50-100 file servers, ~200 misc systems, hundreds of TBs
• Software Engineer - StorSpeed - 2007-2010
• built distributed file system cache for NAS acceleration product
• Co-Founder/COO - CopperEgg - 2010-Present


Why Cloud?
Important Differences:

All reliable and business-worthy systems need something like this:

Physical:
• Physical security
• Redundant power
• Redundant AC
• Redundant & fast network
• Peak hardware
• Spare equipment
• Physical space (storage of spare stuff too)
• People to manage physical infrastructure
• Hardware repairs

Cloud:
• Redundant infrastructure
• Multi-AZ, Regions, storage, etc
• Resilient Applications
• Designed for failure
• Performance measurement
• Automatic failover/recovery
• Security of your infrastructure
• Monitoring - up/down/status
• Visibility into system as a whole
• Don’t rely on cloud vendor!
• Delayed, inaccurate


Why Cloud? (for CopperEgg)
Why did we go cloud?
• Needed to get building fast
• We didn’t know what we needed
• Just-in-time scaling
• Keep costs low and still provide awesome service levels
• Easy deployment for developers
• Test different scenarios, try new setups, etc
• We use it for everything!
• code repositories, tickets, email, phone, alerting, etc


What we were building
Storage analytics product
• visualize network attached storage in real-time
• massive amounts of data
• analyzing 10 billion ops/day in beta, in real-time
• super real-time (seconds vs minutes)
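To put the 10 billion ops/day figure in per-second terms, here is a rough back-of-the-envelope calculation. It assumes the load is spread evenly over the day, which the slides do not state; real traffic would peak higher than this average.

```python
# Average ingest rate implied by 10 billion ops/day.
# Assumption: load is uniform across the day (peaks will be higher).
OPS_PER_DAY = 10_000_000_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

ops_per_second = OPS_PER_DAY / SECONDS_PER_DAY
print(f"{ops_per_second:,.0f} ops/sec on average")  # 115,741 ops/sec on average
```

So "real-time" here means keeping up with roughly a hundred thousand operations every second, on average.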

Requirements:
• highly available
• super responsive
• gobble large amounts of analytics data in real-time
• historical data for 2 yrs
• great UI


Where we started
[Diagram: starting platform: SimpleDB]

Bad:
• Outgrew it before we outgrew it
• Slow!

So then what?


Amazon RDS to save the day!
[Diagram: platforms tried so far: SimpleDB, RDS]

Good:
• Faster than SimpleDB
• Could scale the storage
Bad:
• Realized it still would not handle our dataset
• Inserts were too slow
So then what?


MySQL on EC2 to save the day!
[Diagram: platforms tried so far: SimpleDB, RDS, MySQL on EC2]

Good:
• Faster than RDS
• Increased insert performance
• Using some cheats to get the insert rate up
Bad:
• Insert performance still not good enough
So then what?


MySQL on Rackspace Cloud
[Diagram: platforms tried so far: SimpleDB, RDS, MySQL on EC2, MySQL on Rackspace Cloud]

Good:
• Faster than Amazon (CPU)
• Seemed cheaper
Bad:
• No easy way to scale across different zones or regions
• No way to expand storage per instance (whole instance only - costly!)
• Then we got the bill: they charge for data transfer between instances - OUCH
So then what?


Back to Amazon!
[Diagram: platforms tried so far: SimpleDB, RDS, MySQL on EC2, MySQL on Rackspace Cloud; current: EC2, EBS, MongoDB]

Why did we move back?
• Lots of great services: S3, EC2, EBS, Route 53, ELB (we use all of these)
• Even more: SQS, SES, etc
• Multiple regions and availability zones
• Scale-as-you-need: storage, memory, cpu, redundancy
• Documentation

We’re still happy with this (9 months and running).


What’s this NoSQL thing?
Realized maybe MySQL was not the best choice
• How about a NoSQL database?
• So we tested and measured every one we thought was worth looking at:
• Redis
• Tokyo Tyrant, Kyoto Cabinet
• Cassandra
• MongoDB
• etc, etc, etc (there are a lot)


MongoDB won
MongoDB won the award - why?
• Redundant
• Scalable
• Persistent data-store
• Handles large amounts of data
• Awesome user community
• Vendor support
• Open source
• Lots of momentum


Where are we now?
Needed a way to monitor our site:
• Requirements:
• Know right away when problems occur
• See into the performance of the system
• See historical trends as we grow the business
• Super real-time product needs super real-time monitoring
• Not satisfied with existing solutions
• slow updates (1m or 5m is way too slow - not real-time)
• not ‘cloud friendly’
• pain to maintain
• some are pricey


Not real-time?
Then what *is* real-time?
• Smallest amount of time you can comfortably have poor service before someone notices and changes their behavior.
• Example:
• Web site can only be slow/unavailable for a few seconds before people leave
• Email can be slow for tens of seconds before people get grumpy (or less depending on the people!)
• Twitter - well, we’ll leave that one for you to decide

So, if seconds is the yardstick for measuring poor performance,
why do we monitor every 1 or 5 minutes?


CPU Usage: 5min sampling
[Chart: CPU usage, y-axis 0 to 100, x-axis 5:00 PM to 5:05 PM]

Here’s what a 5 minute sample provides
• Doesn’t look like much is happening
• Users should not be complaining, right?


CPU Usage: 1min sampling
[Chart: CPU usage, y-axis 0 to 100, x-axis 5:00 PM to 5:05 PM in 1-minute steps]

Same data - 1 minute sample
• Looks like there was some kind of cpu activity at 5:01pm - 5:02pm
• Still no issue though - right?


CPU Usage: 5 second sampling
[Chart: CPU usage, y-axis 0 to 100, x-axis 5:00 PM to 5:05 PM in 1-minute steps]

Same data - 5s sampling
• Becomes clear there was something happening:
• between 5:01:10pm - 5:01:25pm
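The effect in the three charts can be reproduced with a small sketch. The numbers are made up for illustration (a 10% baseline with a 100% spike from 5:01:10 to 5:01:25); the actual values behind the charts are not in the slides. It also assumes the monitor averages all readings within each interval, which is only one way a sampler might behave.

```python
from statistics import mean

def make_cpu_trace():
    """One CPU reading per second for 5:00:00-5:04:59 PM (300 readings).

    Hypothetical data: 10% baseline, 100% during 5:01:10-5:01:25.
    """
    trace = [10.0] * 300
    for second in range(70, 85):  # seconds 70..84 = 5:01:10 up to 5:01:25
        trace[second] = 100.0
    return trace

def downsample(trace, interval):
    """Average consecutive readings into buckets of `interval` seconds."""
    return [mean(trace[i:i + interval]) for i in range(0, len(trace), interval)]

trace = make_cpu_trace()
five_min = downsample(trace, 300)  # one point for the whole window
one_min = downsample(trace, 60)    # five points
five_sec = downsample(trace, 5)    # sixty points

print(max(five_min))  # 14.5  - looks like nothing happened
print(max(one_min))   # 32.5  - a bump, but no obvious problem
print(max(five_sec))  # 100.0 - the CPU was pegged for 15 seconds
```

With averaging, the same 15-second spike shows up as 14.5% at 5-minute resolution and 32.5% at 1-minute resolution. A monitor that takes a single point reading per interval instead of an average could miss the spike entirely.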


So we rolled our own
RevealCloud
• Turns out a lot of people agreed with us
• Highlights:
• Built on our super real-time analytics engine
• Updates in seconds vs minutes
• Easy to install, no config required
• Great looking and usable interface
• Works anywhere (public/private cloud, VM, bare metal)


Questions


Demo


Demo Screenshots

