The MEW Workshop is now established as a leading national event dedicated to distributed high-performance scientific computing. The principal objective is to encourage close contact between the research communities of the Mathematics, Chemistry, Physics and Materials Programmes of EPSRC and the major vendors.
6. Application is accessed from:
• Embedded Azure Scheduler, not a Head Node
• All runtimes: Parametric Sweep, MPI, Cluster SOA, Excel
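A parametric sweep runs the same computation independently over a range of input values, so the tasks can be fanned out across workers with no coordination between them. A minimal single-machine sketch (using Python's multiprocessing pool as a stand-in; this is not the Azure scheduler API, and `simulate` is a hypothetical task):

```python
# Illustrative sketch of a parametric sweep: the same task is run once
# per parameter value, and the independent runs are spread over workers.
from multiprocessing import Pool

def simulate(param):
    """Stand-in for one sweep task, e.g. a model run at one input value."""
    return param, param ** 2  # hypothetical per-parameter result

if __name__ == "__main__":
    sweep = range(1, 11)               # the parameter space to sweep
    with Pool(processes=4) as pool:    # workers stand in for cluster nodes
        results = dict(pool.map(simulate, sweep))
    print(results[10])                 # result for parameter value 10
```

Because no task depends on another, the sweep scales by simply adding workers, which is what makes this runtime such a natural fit for on-demand cloud capacity.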
7. • National services / tightly coupled apps
• Departmental / mixed apps
• Ad hoc apps and queries
• Profitable services
• Internet scale
8. [Four charts plotting average usage against provisioned compute over time, one per workload pattern:]

On and off workloads (e.g. batch job):
• Over-provisioned capacity is wasted
• Time to market can be cumbersome

Growing fast:
• Successful services need to grow/scale
• Keeping up with growth is a big IT challenge
• Complex lead time for deployment

Unpredictable bursting:
• Unexpected/unplanned peak in demand
• Sudden spike impacts performance
• Can't over-provision for extreme cases

Predictable bursting:
• Services with micro-seasonality trends
• Peaks due to periodic increased demand
• IT complexity and wasted capacity
9. Compute instance sizes and pricing:

Instance Size | CPU         | Memory  | Storage  | I/O Performance | Cost Per Hour
Extra Small   | 1.0 GHz     | 768 MB  | 20 GB    | Low             | $0.04
Small         | 1.6 GHz     | 1.75 GB | 225 GB   | Moderate        | $0.12
Medium        | 2 x 1.6 GHz | 3.5 GB  | 490 GB   | High            | $0.24
Large         | 4 x 1.6 GHz | 7 GB    | 1,000 GB | High            | $0.48
Extra Large   | 8 x 1.6 GHz | 14 GB   | 2,040 GB | High            | $0.96
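Since compute is billed per instance-hour, total cost is simply instances x hours x rate. A small calculator over the table above (the `monthly_cost` helper and the 720-hour month are illustrative choices, not part of the pricing model):

```python
# Hourly rates copied from the instance-size table (USD per instance-hour).
RATES = {
    "Extra Small": 0.04,
    "Small": 0.12,
    "Medium": 0.24,
    "Large": 0.48,
    "Extra Large": 0.96,
}

def monthly_cost(size, instances, hours=720):
    """Compute is billed per instance-hour: instances x hours x rate."""
    return instances * hours * RATES[size]

# e.g. 4 Small instances running for a 720-hour month
print(round(monthly_cost("Small", 4), 2))
```

This pay-per-hour shape is what makes the "on and off" and bursting workloads above economical: you stop paying the moment the instances are shut down.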
11. Windows Azure Platform
Compute | Storage | Management | CDN
• “Operating system in the cloud”
• “Middleware in the cloud”
• “Relational database in the cloud”
12. 1) The user submits a job through the web UI (Web Role).
2) The job is added to the Table for future access.
3) The job is divided into tasks; the tasks are put in a Queue.
4) The workers (Worker Roles) get the tasks from the queue and process them.
5) Each worker posts the results of its computation in a Blob.
6) The different outputs are assembled to produce the final result.
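The six steps above can be sketched in miniature. This single-process sketch uses Python's stdlib queue and a plain dict as stand-ins for Azure Queue and Blob storage (the doubling computation in the worker is a hypothetical placeholder):

```python
# Minimal single-process sketch of the queue/worker pattern above.
import queue

task_queue = queue.Queue()   # stands in for the Azure Queue
blob_store = {}              # stands in for Blob storage (task id -> result)

def submit_job(job):
    """Steps 1-3: split the job into tasks and enqueue them."""
    for i, task in enumerate(job):
        task_queue.put((i, task))

def worker():
    """Steps 4-5: drain the queue, process each task, post the result."""
    while not task_queue.empty():
        task_id, data = task_queue.get()
        blob_store[task_id] = data * 2   # hypothetical per-task computation
        task_queue.task_done()

def assemble():
    """Step 6: gather the per-task outputs into the final result."""
    return [blob_store[i] for i in sorted(blob_store)]

submit_job([1, 2, 3])
worker()
print(assemble())   # [2, 4, 6]
```

In the real architecture many Worker Role instances drain the same queue concurrently, which is why the queue (rather than direct calls) sits between submission and processing: it decouples the producers from however many workers happen to be running.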
13. [Architecture diagram comparing three deployments:]
• Hadoop as a Service: a low-cost, multi-node Azure cluster (Master running the JobTracker and NameNode/HDFS; Slaves running TaskTrackers and DataNodes for MapReduce).
• Hadoop on premises: the same Master/Slave layout (JobTracker, TaskTrackers, NameNode, DataNodes, HDFS, MapReduce).
• Data warehouse: SQL Server, archive storage, PDW, Analysis Services and MS BI.
Data moves between the Hadoop clusters and the warehouse via SQOOP, with HIVE providing query access.
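The MapReduce flow that the JobTracker/TaskTracker components coordinate at cluster scale can be sketched in one process. This toy word count shows the map, shuffle and reduce phases; in a real Hadoop job the mapped pairs and reduce groups would be distributed across DataNodes rather than held in local variables:

```python
# Toy word count illustrating the map -> shuffle -> reduce phases.
from collections import defaultdict

def map_phase(line):
    """Map: emit a (key, 1) pair for every word in the input line."""
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}

lines = ["big data big cluster", "big job"]
mapped = [pair for line in lines for pair in map_phase(line)]
print(reduce_phase(shuffle(mapped)))   # {'big': 3, 'data': 1, 'cluster': 1, 'job': 1}
```

Because mappers and reducers only communicate through the shuffle, the same program scales from this toy to the multi-node Azure cluster in the diagram without changing the map or reduce logic.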
Key Points: “On / Off” or Batch Job; Growing Fast; Unpredictable Bursting; Predictable Bursting.

Script: We’ve told you what Windows Azure is and what cloud is. The next question is “what workloads fit public cloud?”

The first workload is an “on / off” workload. For example, we have a customer, Risk Metrics, that runs risk analysis for hedge funds. The big challenge for hedge funds is acting quickly. You want to look at market trends, do some analysis, and buy or sell based on what the analysis stated. They are doing a bunch of this in the cloud with us. They will come in and book 10,000 to 50,000 machines for a month, a week, or a few hours to do their analysis, then go back to their clients and make recommendations. This model makes complete sense because you don’t have to buy a bunch of machines that you will never use. They never sit empty: you come in, use what you want, and turn it off.

Another example is an application that is growing quickly. Typically startups get into this, but other companies run into it too. You release an application for a particular set of your customers and you think they are the only ones who would want to use it or be interested in it. And then you find lots and lots of people using it. What do you do in this case? You can’t wait to buy servers, set them up, and manage them. You can come to the cloud! You come in and provision capacity as you need it. There are a bunch of startups basing their business on us. They don’t want to worry about the hardware, managing hardware, patching; we do it for you so you can focus on the business model and writing good software. You can focus on the customers and the intellectual property aspect of the business. We do the operations for you, we manage all of that, and it works really well.

The third example is unpredictable bursting. Let’s say you sell sporting goods for Spain’s soccer team and they win the World Cup. It’s not really expected, and suddenly you have all these people showing up at your store. They show up and want to buy a jersey or a football/soccer ball that day. If your site is down, you’re done; they aren’t coming back. In those situations the cloud can be invaluable. One of our more interesting customers broadcasts games via the internet. For online broadcasting of soccer, the customer saw an insane increase in demand during the quarter- and semi-finals. They couldn’t believe how easy it was to scale up. They had originally booked X number of instances and then they needed 3X, and all they had to do was make small changes while we took care of all the hardware on the back end. All those new machines took six minutes to come online, and they were amazed by that; that’s an option that’s available to you.

The last thing is predictable bursting; let’s use a salary or payroll example. In the US, on the 1st and 15th of every month people are going to show up to see what their paycheck looks like. For the rest of the month there is minimal interest, like 2%-5%, but on the 1st and the 15th it suddenly goes up to 80%-90%. So you can have one or two servers running on premises for the average demand, and for all those spikes you can go to the cloud.

These are, typically, the four most important workloads we are seeing. What I recommend, as you start thinking about the cloud, is to think about an application that fits one of these patterns (the batch processing one is typically the best) and think about how you can test it in the cloud. You could run the application on premises and in the cloud at the same time. That’s a very good way for you to see if the cloud is right for you.

Click: The next important question is how do you pay for this?