This document discusses key considerations for architecting Azure solutions, including:
- Software architecture focuses on the fundamental organization of a system, including its components, relationships, and design principles.
- Idempotency is important to address problems like messages being processed multiple times if a worker role fails. Transaction IDs can be used to prevent duplicate processing.
- Latency in Azure is never zero, and bandwidth and other resources have real costs that must be accounted for in the architecture.
- Service Bus enables secure and reliable messaging across hybrid cloud/on-premises applications and networks.
2. Famous Last Words… “It is a very humbling experience to make a multimillion-dollar mistake, but it is also very memorable….” (Fred Brooks - “Mythical Man-Month” p.47)
4. Software architecture is the fundamental organization of a system, embodied in its components, their relationships to each other and the environment, and the principles governing its design and evolution.
5. Architecture forces [Diagram. Labels: Stakeholders (key people), Quality Attributes, Constraints, Principles, Community Experience, Patterns & Anti-patterns, Technology, Architect, Architecture (a “deliverable”). Arrow labels: “Produce”, “Is an input”.]
10. Messages Process At Least Once
- A “Debit bank account $100” message is placed on the queue (balance = $1000).
- Worker role reads the message.
- Balance is debited $100 (balance = $900).
- Worker role is torn down before the message can be deleted.
- 3 minutes later, the message reappears on the queue.
- Worker role reads the message.
- Balance is debited $100 again (balance = $800).
- Message is deleted from the queue.
- Chaos ensues. Customer calls bank.
[Diagram: web roles and worker roles behind load balancers (LB), connected through queue storage.]
11. Solving the Idempotency Problem
- A “Debit bank account $100” message is sent with a transaction ID (balance = $1000).
- Worker role reads the message, checks that the transaction ID is not present, and writes the transaction ID with state ‘Started’ to the ‘Replay Log’.
- Balance is debited $100 (balance = $900).
- Worker role is torn down before the message can be deleted.
- 3 minutes later, the message reappears on the queue.
- Worker role reads the message and checks the transaction ID. It is present in state ‘Started’.
- A compensating message is written to another queue.
- Message is deleted from the queue.
- The compensating message is processed.
[Diagram: web roles and worker roles behind load balancers (LB), with two queues and a storage table that the worker roles query.]
15. Service Bus
- Provides secure messaging and connectivity across different network topologies.
- Enables hybrid applications that span on-premises and the cloud.
- Enables various communication protocols and patterns for developers to engage in reliable messaging.
- Fallacy: “Topology doesn’t change”.
22. Don’t assume specific instances [Diagram: web role (IIS) and worker role (service) instances, each on its own Windows kernel with a TCP/IP stack, NIC driver, and virtual NIC, at virtual IPs 1.1.1.2, 1.1.1.3, and 1.1.1.4, behind an NLB driver at virtual IP 1.1.1.1.]
30. The network is homogeneous
It isn’t, but it’s abstracted, unless of course you use Azure Connect:
- Quickly connect on-premises computers with the cloud, no networking configuration required.
- Supports standard IP protocols; secured using end-to-end IPsec.
- Integrated with the Windows Azure Service Model; all role types supported.
31. Deployment view
- Consider extra-small instances for development.
- Test whether you can use instances smaller than medium for production.
32. Cost considerations
- You pay whenever you’re deployed (there is no “shelving”).
- Shutdown doesn’t help (keep CPUs running..)
IEEE 1471: recommended practice for architecture description of software-intensive systems.
Software architecture is the collection of the fundamental decisions about a software product/solution designed to meet the project's quality attributes (i.e. requirements). The architecture includes the main components, their main attributes, and their collaboration (i.e. interactions and behavior) to meet the quality attributes. Architecture can and usually should be expressed in several levels of abstraction (depending on the project's size). If an architecture is to be intentional (rather than accidental), it should be communicated. Architecture is communicated from multiple viewpoints to cater to the needs of the different stakeholders.
Architectural decisions are global and tied to quality attributes.
Design decisions are local and tied to functionality.
Fallacies of distributed computing: Peter Deutsch (first 7 in 1994) and James Gosling (last in 1997).
Slide Objective
Introduce idempotency.
Speaking Notes
An idempotent function is one where “F of X equals F of F of X”. More simply put, you get the same result no matter how many times you call the function.
This is important for us in this situation because the approach that Windows Azure takes to queues:
- Read the message and hide it
- Delete the message on completion
is vulnerable to executing the same message twice. The pattern guarantees that each message will be processed to completion AT LEAST once.
Notes
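The read-and-hide, delete-on-completion pattern described above can be sketched with a toy in-memory queue. This is only an illustration under stated assumptions: `ToyQueue`, `visibility_timeout`, and `run_demo` are hypothetical names, not the Windows Azure storage API, and time is passed in explicitly (in seconds) instead of using a real clock.

```python
class ToyQueue:
    """Hypothetical in-memory queue with read-and-hide semantics."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [body, time at which it is visible]
        self._next_id = 0

    def put(self, body):
        self._messages[self._next_id] = [body, 0.0]
        self._next_id += 1

    def get(self, now):
        """Return (id, body) of a visible message and hide it, or None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout  # hide the message
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        self._messages.pop(msg_id, None)


def run_demo():
    queue = ToyQueue(visibility_timeout=180)  # 3 minutes, as on the slide
    queue.put({"op": "debit", "amount": 100})
    balance = 1000

    # First worker reads and debits, then is torn down before deleting.
    msg_id, body = queue.get(now=0)
    balance -= body["amount"]

    # While hidden, no other worker can see the message.
    assert queue.get(now=60) is None

    # After the timeout the message reappears and is processed again.
    msg_id, body = queue.get(now=180)
    balance -= body["amount"]
    queue.delete(msg_id)
    return balance


print(run_demo())
```

The message is never lost, which is the at-least-once guarantee, but the non-idempotent debit is applied twice.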
Slide Objective
Understand the need for idempotent operations with a simple example.
Speaking Notes
There are a number of reasons why you may not call deleteMessage, including:
- Your code fails.
- There is a hardware failure, effectively killing your worker role code.
- The worker role is instructed to stop, and you have no code implemented to do that in a timely fashion.
For this reason, you should design the tasks your worker role does to be idempotent, which basically means that you should be able to do the same task twice without it having an adverse effect on the system state.
For example, if the worker is sent the name of a picture stored in blob storage that it should resize, it does not matter how many times it resizes the image; the outcome is the same. This is idempotent. If a worker is decrementing a balance, it DOES matter how many times this occurs. This is NOT idempotent.
Notes
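The resize and debit examples can be checked directly against the “F of X equals F of F of X” definition. The functions below are hypothetical stand-ins written for this sketch: `resize` models an image only by its dimensions, and `debit` models an account only by its balance.

```python
def resize(image_size, target=(64, 64)):
    """Resizing to a fixed target gives the same result however often it runs."""
    return target


def debit(balance, amount=100):
    """Each call lowers the balance again, so repeats change the outcome."""
    return balance - amount


def is_idempotent(f, x):
    """Check f(f(x)) == f(x) for one sample input (not a proof)."""
    return f(f(x)) == f(x)


print(is_idempotent(resize, (1024, 768)))
print(is_idempotent(debit, 1000))
```

A check on a single input can only disprove idempotency, not establish it, but it is enough to separate these two cases.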
Slide Objective
Understand a way to make the previous example idempotent, or at least robust against multiple message processing.
Speaking Notes
The approach in this slide is to use a Replay Log.
This allows us to track whether we have seen a message before and, if so, take a different course of action.
There will be various levels of remedial action we may choose to take based on the contents of our replay log:
- We may be able to fully compensate for the previous failure.
- We may need human intervention.
Notes
Some good (older) posts on creating idempotent web services. Worth reading:
http://blogs.msdn.com/b/ramkoth/archive/2004/03/12/88423.aspx
http://blogs.msdn.com/b/ramkoth/archive/2004/03/13/88778.aspx
Good MSDN Magazine article that touches on these topics. It also discusses MSMQ and WCF, which take a different approach because they are transactional:
http://msdn.microsoft.com/en-us/magazine/cc663023.aspx
Discussion on dealing with situations where processing time may inadvertently exceed the invisibility timeout:
http://blog.smarx.com/posts/deleting-windows-azure-queue-messages-handling-exceptions
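A minimal sketch of the replay log approach, using plain dictionaries and lists in place of table and queue storage. The names `process`, `replay_log`, `compensation_queue`, and `crash_before_delete` are hypothetical, and the ‘Completed’ state is an assumption added for this sketch; the slide only names the ‘Started’ state.

```python
def process(message, replay_log, account, compensation_queue,
            crash_before_delete=False):
    """Handle one debit message. Returns True if the caller may delete it."""
    tx_id = message["tx_id"]
    state = replay_log.get(tx_id)

    if state is None:
        # First sighting: record the transaction before touching the balance.
        replay_log[tx_id] = "Started"
        account["balance"] -= message["amount"]
        if crash_before_delete:
            return False  # simulated teardown: message is never deleted
        replay_log[tx_id] = "Completed"  # assumed state, not on the slide
        return True

    if state == "Started":
        # A previous attempt did not finish. Do not debit again; hand the
        # case to a compensating process (which may need human intervention).
        compensation_queue.append({"tx_id": tx_id, "reason": "incomplete attempt"})
        return True

    # Already completed: a plain duplicate, safe to delete.
    return True


def run_demo():
    replay_log, compensation_queue = {}, []
    account = {"balance": 1000}
    message = {"tx_id": "tx-1", "amount": 100}

    first = process(message, replay_log, account, compensation_queue,
                    crash_before_delete=True)
    # The message reappears after the invisibility timeout.
    second = process(message, replay_log, account, compensation_queue)
    return account["balance"], first, second, compensation_queue


print(run_demo())
```

The second delivery leaves the balance untouched and routes the decision to the compensation queue, where the remedial action can range from automatic reconciliation to human review.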