This presentation, given on the Software Architecture course at the University of Brunel, discusses the interplay between architecture and design: how the designer and the architect are really different roles, and ones that often have competing goals.
57. If speed is going to be a problem… then evolutionary design alone may not work.
58. We need to seed the design with clever architectural decisions to ensure we get the speed we need
59. We use patterns and frameworks that increase speed, but often at the expense of a clear programming model, i.e. these patterns can obfuscate the implementation.
60. How it fits together: a project timeline [diagram: Set-up → Architecture → Design (push patterns, reuse, etc.) → Amend Architecture]
64. Load balancing is a great way to get horizontal scalability without affecting the programming model much!
65. But it is limited by the time taken for a single request, i.e. it allows us to handle more load, but does not allow us to reduce the time taken to execute a single process.
66. What if we want to make a single (possibly long-running) process faster? Let's look at an example.
67. Sum the numbers between 1 and 10,000,000: long total = 0; for (long n = 1; n <= 10_000_000; n++) { total += n; } A service has a hard limit on how fast it can execute this, defined by its clock speed. Load balancing won't speed this up.
73. But we can also scale out: i.e. lots of normal hardware linked together.
74. Scaling out is cost-effective: much more cost-effective per teraflop than scaling up.
75. But it adds complexity to the programming model
76. Key Point: the role of an architect is to construct an application that performs, and that often means a distributed system. Distributed systems can be much harder to manage than those that run on a single machine. Why?
82. We can't ignore 'wire' time. We need to make sure programmers think about it.
83. The complexity of shared memory has been moved from the hardware domain to the software domain. Simple operations, like accessing another object, are now more time-consuming.
84. But let's step back a little and look at why we need to distribute processing.
85. Sum the numbers between 1 and 10,000,000: long total = 0; for (long n = 1; n <= 10_000_000; n++) { total += n; } A service has a hard limit on how fast it can execute this.
88. Parallel execution on a grid [diagram: the server splits the code + data into four parts, sends one part to each of four grid nodes, and receives the results]. Processing time is roughly ¼ of the synchronous case, a 4× speed-up (assuming processing time >> wire time).
89. Grid computing solves the processing problem: we can do complex computations very quickly by doing them in parallel.
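The summation slide above can be sketched in ordinary Java. This is a minimal illustration of the grid idea, using local threads to stand in for grid nodes: the range is split into four parts, each part is summed by a separate worker, and the partial results are combined. All class and variable names here are illustrative, not from the original deck.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSum {
    // The work done by one "grid node": sum the range [from, to] inclusive.
    static long sumRange(long from, long to) {
        long total = 0;
        for (long n = from; n <= to; n++) total += n;
        return total;
    }

    public static void main(String[] args) throws Exception {
        long limit = 10_000_000L;
        int nodes = 4; // four grid nodes, as in the diagram
        ExecutorService grid = Executors.newFixedThreadPool(nodes);
        List<Future<Long>> results = new ArrayList<>();

        // "Send code + data": each node gets a quarter of the range.
        long chunk = limit / nodes;
        for (int i = 0; i < nodes; i++) {
            long from = i * chunk + 1;
            long to = (i == nodes - 1) ? limit : (i + 1) * chunk;
            results.add(grid.submit(() -> sumRange(from, to)));
        }

        // "Receive result": combine the partial sums.
        long total = 0;
        for (Future<Long> r : results) total += r.get();
        grid.shutdown();
        System.out.println(total); // 50000005000000
    }
}
```

Note that the total overflows a 32-bit int, which is why the slide's `int total` has to become a `long`.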
93. ...but on the grid... [diagram: the Start → DB → Format → Present flow, with the synchronous "loop n times" step replaced by asynchronous Grid Runner invocations and a Grid Callback]
94. Note how this muddles the design: a simple loop has become a set of distributed invocations and asynchronous callbacks.
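That muddling can be made concrete. The sketch below, using local threads in place of real grid nodes and made-up work in place of the real computation, shows how the simple loop turns into a fan-out of asynchronous invocations whose "Format/Present" step has to move into a callback:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class GridCallback {
    public static void main(String[] args) {
        ExecutorService grid = Executors.newFixedThreadPool(4);

        // What was "loop n times" becomes n asynchronous grid invocations.
        // (part * 10 is a stand-in for the real per-iteration work.)
        List<CompletableFuture<Long>> calls =
            List.of(1L, 2L, 3L, 4L).stream()
                .map(part -> CompletableFuture.supplyAsync(() -> part * 10, grid))
                .collect(Collectors.toList());

        // The "Format / Present" steps move into a callback that fires
        // only once every invocation has completed.
        CompletableFuture.allOf(calls.toArray(new CompletableFuture[0]))
            .thenRun(() -> {
                long total = calls.stream()
                                  .mapToLong(CompletableFuture::join)
                                  .sum();
                System.out.println("Result: " + total);
            })
            .join();

        grid.shutdown();
    }
}
```

Even in this toy form the control flow is inverted: the caller no longer drives the loop, the completion of the grid calls does.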
95. Grids allow us to process computations quickly through parallelism. But this leads to the problem of getting fast enough access to the data we want to operate on.
97. Why is the database a bottleneck? [diagram: the server sends code to four grid nodes; every node fetches its data from the same single database]
99. We can scale up by using memory, not disk: memory access is much faster than disk access.
101. But single-machine data stores don't scale, even if they are in memory [diagram: four grid nodes all fetching data from one in-memory database, labelled BOTTLENECK]
102. What we need is a distributed data source. Welcome to the world of distributed caching.
103. Distributed caching solves this problem by splitting the data over all servers [diagram: five servers, each holding 1/5 of the data, serving four clients]. This is parallel processing for data access: data requests are split across multiple machines.
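The essential mechanism behind this split is key-based partitioning: each key is owned by exactly one server, chosen deterministically from the key itself. A minimal sketch, with plain `HashMap`s standing in for the five cache servers (real caching products use more sophisticated schemes such as consistent hashing, so membership changes don't reshuffle every key):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionedCache {
    static final int SERVERS = 5;

    // One map per server stands in for the five cache nodes.
    @SuppressWarnings("unchecked")
    static final Map<String, String>[] nodes = new Map[SERVERS];
    static {
        for (int i = 0; i < SERVERS; i++) nodes[i] = new HashMap<>();
    }

    // Each key is owned by exactly one node, chosen by hashing the key.
    static int ownerOf(String key) {
        return Math.floorMod(key.hashCode(), SERVERS);
    }

    static void put(String key, String value) {
        nodes[ownerOf(key)].put(key, value);
    }

    static String get(String key) {
        return nodes[ownerOf(key)].get(key);
    }

    public static void main(String[] args) {
        for (String k : List.of("trade:1", "trade:2", "trade:3", "trade:4", "trade:5"))
            put(k, "data-for-" + k);

        // A lookup goes straight to the one node that owns the key;
        // lookups for different keys hit different machines in parallel.
        System.out.println(get("trade:3")); // data-for-trade:3
    }
}
```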
104. Now we have removed the data bottleneck [diagram: (1) the server sends code to four grid nodes, (2) receives the results; each node reads from a data fabric of six partitions of 1/6 each]
105. This gives us access to a fast shared memory across multiple machines
106. We are now massively parallel, with lightning-fast data access.
107. But we can get faster than this. How? [diagram: same topology as before: grid nodes reading from a six-partition data fabric]
108. Superimpose the compute and data fabrics into one entity [diagram: a single fabric of six nodes, each holding 1/6 of the data; the server (1) sends code to it and (2) receives the result]
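The win from superimposing the two fabrics is data affinity: instead of a grid node pulling data across the wire, the code is shipped to the node that already owns the relevant partition and runs against local memory. A hypothetical sketch of the idea, reusing the simple hash-partitioning scheme from earlier (all names here are illustrative):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class ColocatedCompute {
    static final int NODES = 6;

    // Each map stands in for the local store of one fabric node.
    @SuppressWarnings("unchecked")
    static final Map<String, Long>[] partitions = new Map[NODES];
    static {
        for (int i = 0; i < NODES; i++) partitions[i] = new HashMap<>();
    }

    static int ownerOf(String key) {
        return Math.floorMod(key.hashCode(), NODES);
    }

    static void put(String key, long value) {
        partitions[ownerOf(key)].put(key, value);
    }

    // "Send code" to the node that already holds the data: the function
    // runs against that node's local partition, so the (possibly large)
    // data never crosses the wire; only the code and the result do.
    static long invoke(String key, Function<Long, Long> code) {
        Map<String, Long> local = partitions[ownerOf(key)];
        return code.apply(local.get(key));
    }

    public static void main(String[] args) {
        put("position:42", 1_000L);
        long result = invoke("position:42", value -> value * 2);
        System.out.println(result); // 2000
    }
}
```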
Your design is always being tested. Encapsulate different concerns so that changes can be made independently of one another.
The subtle point here is that you must direct the application's progress through others.
In an agile team, stand-ups are a good opportunity to establish vision, e.g. broad patterns such as layered separation, or finer-grained patterns/components such as a centralised workflow engine.