Public Cloud Workshop



  1. Public Cloud Computing Workshop
     Amer Ather, Netflix Cloud Performance Architect
  2. What is a Cloud
     ● Abstraction of underlying IT resources
     ● On-demand resource provisioning via a self-service layer
     ● API-driven infrastructure
     ● Cloud is not just virtualization. Virtualization is one of many technologies that a cloud uses to manage physical infrastructure.
     ● A cloud can span multiple geographical locations. Cloud capabilities can be set up for public or private access.
     [Figure: example regions us-west-2, us-east-1, eu-west-1]
  3. Public Cloud Computing
     ● Cloud computing enables companies or individuals to consume compute resources like a utility rather than building their own.
     ● Compute services are hosted on public cloud providers' (Amazon, Azure, Google, ...) infrastructure instead of in the customer's own data centers.
  4. Cloud Computing Benefits
     ● Elasticity of compute resources
     ● Pay-per-use
     ● Self-service, on-demand provisioning
     ● Cloud API and integration
     ● Managed services
     ● Economy of scale
     ● Tiered pricing model
     ● Resilience via Availability Zones and Regions
     ● Gives rise to immutable infrastructure
     ● No more hardware debugging. Terminate the bad instance and provision a new one.
  5. Types of Cloud Computing
     ● Infrastructure-as-a-Service (IaaS)
       ○ Customers launch VMs (Virtual Machines) in public cloud managed infrastructure.
       ○ The customer manages infrastructure via a self-service interface (web, API, CLI) over the Internet. Infrastructure components include: VMs, machine images, DNS, storage, networking, patching, monitoring, security, etc.
     ● Platform-as-a-Service (PaaS)
       ○ Hides the complexity of managing infrastructure.
       ○ The cloud provider handles capacity provisioning: VM launch, load balancing, auto-scaling, patching, monitoring, etc.
       ○ Targeted at developers. Developers simply upload the code and the cloud provider does the rest.
       ○ Examples: AWS Elastic Beanstalk, Google App Engine
  6. Types of Cloud Computing (cont.)
     ● Software-as-a-Service (SaaS)
       ○ Application hosting in the cloud.
       ○ Customers access services via a web interface over the Internet.
       ○ SaaS providers use a subscription model.
       ○ Examples: Salesforce.com, Dropbox, Gmail, Flickr, ...
     ● Function-as-a-Service (FaaS)
       ○ Serverless computing. No infrastructure to maintain or pay for.
       ○ New compute paradigm. The application is built from bite-sized pieces of business logic.
       ○ A function is a single-purpose block of code performing a single task.
       ○ Functions run on public cloud infrastructure.
       ○ You pay for the amount of time the code is running (billed to the nearest 100 ms).
       ○ Functions are ephemeral; they run on demand in response to an event.
       ○ Examples: AWS Lambda, Google Cloud Functions
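The pay-per-run-time model for functions can be illustrated with a small calculation. This is only a sketch: it assumes run time is rounded up to the next 100 ms increment, and the price per increment is a made-up number in arbitrary integer units, not a real provider rate.

```python
import math

BILLING_INCREMENT_MS = 100  # increment size taken from the slide


def billed_ms(duration_ms, increment_ms=BILLING_INCREMENT_MS):
    """Round a run time up to the next billing increment (assumed rule)."""
    if duration_ms <= 0:
        return 0
    return math.ceil(duration_ms / increment_ms) * increment_ms


def invocation_cost(duration_ms, price_per_increment):
    """Cost of one invocation; price_per_increment is a hypothetical rate."""
    increments = billed_ms(duration_ms) // BILLING_INCREMENT_MS
    return increments * price_per_increment


# A 250 ms run is billed as three 100 ms increments.
print(billed_ms(250), invocation_cost(250, price_per_increment=2))
```

Under this assumption, shaving a function from 101 ms to 100 ms halves its bill, which is why short, single-purpose functions fit the model well.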
  7. Public Cloud Managed Services (AWS)
     ● CloudFormation: Templates to model and provision cloud infrastructure
     ● RDS: Hosted database solution (SQL, NoSQL): MySQL, Oracle, DynamoDB
     ● Data Lake: Store structured/unstructured data on S3 that can be used to run ad hoc queries or be ingested into a warehouse or Hadoop/Spark clusters for analytics
     ● ElastiCache: Object (memcached) and key-value pair (Redis) caching engine
     ● Amazon Elasticsearch: Server log and full-text search for near-real-time analytics
     ● Redshift: Data warehousing service
     ● EMR: On-demand Hadoop cluster for big data processing
     ● Amazon IoT: Connect devices to the cloud and use AWS services
     ● Elastic Container Service: Container (Docker) orchestration in the public cloud
     ● AWS Lambda: Run functions in response to events from cloud services
     ● Elastic File System: Managed NFS service
     ● Amazon Kinesis: Collect and analyze streaming data for real-time insight (comparable to Kafka)
     ● Amazon SageMaker: Build, train, and deploy machine learning models at cloud scale
  8. Cloud Native Application
     ● Application written with the cloud in mind
     ● Stateless and self-healing
     ● Supports data sharding
     ● CI/CD and DevOps
     ● Red/Black deployment
     ● Uses a microservice or serverless architecture, if possible
     ● Leverages public cloud APIs to build new features quickly
     ● Auto scaling and health checks
     Some open source projects that scale well in the cloud: Hadoop/Spark, Machine Learning, NoSQL, memcached, Redis, Elasticsearch, Kafka
  9. Microservices
     ● An alternative to traditional monolithic application architecture
     ● Loosely coupled service-oriented architecture (SOA) with bounded contexts
     ● Massively scalable due to loose coupling, a stateless model and sharded data
     ● Decomposition of a single application into a suite of small services, each implementing a different set of business logic
     ● Each service runs independently and interacts with the others through an open protocol (API)
     ● HTTP/REST and gRPC are common APIs used for service interaction
     ● Common payloads used for data exchange: JSON, XML, Protocol Buffers
     ● Forces the design of clear interfaces
  10. Microservices (cont.)
      ● Each service is independently built, deployed, upgraded and scaled.
      ● Lower learning curve for a new team member due to bounded context
      ● Services may be written in different languages: Java, Go, Python, Node.js, etc.
      ● Services can be deployed as a web container (Tomcat) in a VM or in Docker
      ● Each service is free to select any datastore technology (Redis, memcached, Elasticsearch, Cassandra, MongoDB, etc.) that suits its use case
      ● A stateful cached datastore can be built from replicated ephemeral instances
  11. Monolithic vs. Microservices Architecture
  12. Netflix Microservices Architecture (Netflix OSS)
      ● DevOps CI/CD tooling: Spinnaker
      ● Config mgmt.: Edda (Archaius)
      ● Discovery: Eureka, Prana
      ● Routing: Zuul, Ribbon
      ● Observability: Hystrix, Atlas
      ● Ephemeral datastores: Dynomite, Memcached, Priam, Cassandra
      ● Orchestration: Auto-scaling Groups (AWS), Titus (Netflix PaaS using Mesos, Docker), Elastic Container Service (AWS)
      ● Build environment: Java (majority), Groovy, Scala, Python, Ruby, PHP, Node.js
      ● Policy conformance: Simian Army, Chaos Monkey, Conformity Monkey, Janitor Monkey
      Further resources:
      ● spigo: open source software that simulates Netflix-style microservices and interactions
      ● Microservices with Spring Cloud: online course on building microservices with Netflix OSS
      ● Deep dive into Netflix Microservices
  13. Monolithic vs. Microservices Architecture (cont.)

      | Traditional Data Center Architecture | Cloud Architecture (microservices) |
      |---|---|
      | Monolithic and centralized | Decomposed and decentralized |
      | Design for predictable scalability | Design for elastic scale |
      | Relational database | Polyglot persistence (mix of data storage engines) |
      | Strong consistency | Eventual consistency |
      | Shared dataset | Sharded dataset |
      | Serial and synchronized processing | Parallel and async processing |
      | Design to avoid failures | Design for failure |
      | Infrequent and slower updates | Frequent small updates (more features) |
      | Manual management | Self-management (DevOps, CI/CD pipeline) |
      | Failures may result in an outage | Immutable infrastructure |
  14. REST Web API
      ● A RESTful API exposes data as resources on which clients operate
      ● All client actions on a resource are represented by HTTP CRUD methods:
        ○ POST: Create | GET: Read | PUT: Update | DELETE: Delete
      ● The response is returned in JSON. HTTP status codes (2xx, 3xx, 4xx, 5xx) are returned with the response
      ● The URL is a unique identifier that describes the resource in the application.
      ● A simple client (curl) can be used to invoke REST methods in the application
      ● Each request/response is stateless. The client maintains state and is responsible for providing it so the server can fulfill the request
      [Diagram: clients (website, mobile, partner integration, third-party apps) send HTTP transactions to the API gateway / edge services, which call the backend services. Caption: well-defined interaction with clients and front-end service]
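The mapping from HTTP methods to CRUD operations and status codes can be sketched without any web framework. The following is a minimal in-memory illustration, not a real server: the `ItemStore` class, the `/items` path and the exact status codes chosen (201 for create, 204 for delete) are assumptions made for this example.

```python
import json


class ItemStore:
    """In-memory stand-in for a backend service exposing /items (hypothetical)."""

    def __init__(self):
        self._items = {}
        self._next_id = 1

    def handle(self, method, path, body=None):
        """Dispatch one stateless request and return (status, JSON text)."""
        parts = [p for p in path.split("/") if p]
        if not parts or parts[0] != "items" or len(parts) > 2:
            return 404, json.dumps({"error": "unknown resource"})
        item_id = parts[1] if len(parts) == 2 else None

        if method == "POST" and item_id is None:  # Create
            item_id = str(self._next_id)
            self._next_id += 1
            self._items[item_id] = body
            return 201, json.dumps({"id": item_id, "data": body})
        if item_id is None:
            return 405, json.dumps({"error": "method not allowed"})
        if item_id not in self._items:
            return 404, json.dumps({"error": "not found"})
        if method == "GET":  # Read
            return 200, json.dumps({"id": item_id, "data": self._items[item_id]})
        if method == "PUT":  # Update
            self._items[item_id] = body
            return 200, json.dumps({"id": item_id, "data": body})
        if method == "DELETE":  # Delete
            del self._items[item_id]
            return 204, ""
        return 405, json.dumps({"error": "method not allowed"})


store = ItemStore()
print(store.handle("POST", "/items", {"name": "demo"}))
print(store.handle("GET", "/items/1"))
```

Every call carries everything the handler needs (method, URL, body), which is the stateless property described on the slide; the only state lives in the resource store itself.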
  15. Continuous Integration and Deployment (CI/CD)
  16. Server Virtualization - Evolution (courtesy Brendan Gregg)
  17. Cloud Instance
      ● A Virtual Machine (VM) hosted in a public cloud is called a cloud instance
      ● A hypervisor (Xen, KVM) is used for virtualizing physical machine hardware
      ● A VM or guest is bound to a subset of physical resources
      ● Multiple operating systems (Windows, Linux, Solaris) can run concurrently on the same physical machine
      ● Intel hardware-assisted virtualization technologies (VT-x, VT-d, VPID, EPT, SR-IOV) have narrowed the performance gap between VM and bare metal
  18. Cloud Instance Families (AWS)
      AWS offers a wide range of cloud instance families with varying processing capabilities: https://aws.amazon.com/ec2/instance-types/
      Purchasing options:
      ● Bare-metal (most expensive)
      ● Dedicated
      ● On-demand
      ● Reserved
      ● Spot (least expensive)

      | Instance Family | Purpose |
      |---|---|
      | T2 | Burstable Performance |
      | C5 | Compute Optimized |
      | R4 | Memory Optimized |
      | I3 | Storage Optimized (SSD) |
      | D2 | Dense Storage Optimized (HDD) |
      | M5 | General Purpose |
      | X1 | Large Hardware Configuration |
      | P3 | GPU General Purpose |
      | G3 | Graphics Intensive |
      | F1 | FPGA (custom hardware) |
  19. Cloud Instance Types (AWS)
      AWS offers cloud instances with different hardware configurations within an instance family: https://aws.amazon.com/ec2/instance-types/

      | Instance Type | vCPU | Memory (GB) | Storage | Network (Mbps) | Comment |
      |---|---|---|---|---|---|
      | nano, micro, small | 1 | 0.5 - 2.0 | Net only | 300 | Burstable, T2 only |
      | medium | 2 | 4 | Net only | 300 - 700 | T2 and M3 only |
      | large | 2 | 3 - 15 | Direct/Net | 500 - 700 | All families except D2, M4, X1 |
      | xlarge | 4 | 16 - 32 | Direct/Net | 700 - 1000 | All families except X1, P2 |
      | 2xlarge | 8 | 32 - 60 | Direct/Net | 1000 - 2000 | All families except X1, P2 |
      | 4xlarge | 16 | 30 - 122 | Direct/Net | 10,000 | All families except T2, X1, P2 |
      | 8xlarge | 32 | 60 - 244 | Direct/Net | 10,000 | All families except M4, T2, X1 |
      | 10xlarge | 40 | 256 | Net only | 20,000 | M4 only |
      | 16xlarge | 64 | 256 - 488 | Direct/Net | 20,000 | R4, M4, I3, X1 only |
      | 32xlarge | 128 | 1,952 | Direct/Net | 20,000 | X1 only |
  20. Cloud Instance Features (AWS)
      Amazon instances support features that improve compute, IO and network performance:
      ● CPU Types: Various generations of Intel processors and models: Ivy Bridge, Sandy Bridge, Broadwell, Haswell, Skylake
      ● Enhanced Networking: Low-latency, high-throughput networking using an SR-IOV PCIe NIC. A native driver (sriov, ena) runs inside the VM and has direct access (DMA) to the NIC hardware. Without Enhanced Networking, a virtualized Xen driver is used, which is prone to higher latencies.
      ● Ephemeral Direct-Attached Storage: Some instance families offer direct-attached SSD, HDD and NVMe storage. Direct-attached storage is called ephemeral because its life span is limited to the instance life span. Once the instance is terminated, ephemeral storage is also lost. NVMe storage is available in the I3 family, which provides access to storage using a native driver (nvme) that runs inside the VM and has direct access (DMA) to storage via SR-IOV. Without the SR-IOV extension for storage, SSD and HDD are accessed via a virtualized Xen driver that is prone to higher IO latencies.
      ● EBS-Optimized Network Storage: EBS-optimized instances have a dedicated network link for accessing EBS network storage. This allows instances to keep storage traffic separate from application network traffic.
      ● Burstable Performance: AWS offers burstable CPU, IO and network performance to achieve higher than baseline performance for a short period of time. This is ideal for bursty workloads that need higher (burst) performance only briefly. You pay a fraction of the price of fixed performance to achieve the burst.
  21. Spot Instances (AWS)
      ● Spare compute capacity in the AWS public cloud is sold via a bidding system.
      ● Spot instances get a steep discount (up to 90%) compared to on-demand prices
      ● Spot instances can be taken away at 2 minutes' notice whenever AWS needs the capacity back or the spot price has increased.
      ● Short-running jobs or interruptible applications are good candidates to run on spot instances.
      ● A spot instance has an option to hibernate when it is about to be terminated due to spot price changes. When capacity is available again, the application resumes where it was paused.
        ○ During hibernation, you pay for storage cost only.
  22. Cloud Image (AMI)
      [Diagram: contents of a cloud image]
      ● Ubuntu base AMI
      ● Java: GC and thread dump logging
      ● Tomcat: application servlet, base server, platform, interface jars for dependent services
      ● Atlas monitoring
      ● Optional: Apache front end, memcached, non-Java apps
      ● Healthcheck, status servlets, JMX interface
  23. Auto Scaling in Public Cloud (Elasticity)
      ● Microservice architecture allows each service to be scaled independently of other services
      ● An AWS Auto Scaling Group (ASG) is a group of instances of the same type running the same service. An ASG can:
        ○ Scale instances up/down to meet varying demand on the service
        ○ Replace unhealthy or terminated instances
        ○ Monitor AWS system and application metrics periodically via the Amazon CloudWatch service to trigger scaling events
        ○ Be set up easily with a single policy that manages instance capacity via the Target Tracking feature:
          ■ Target Tracking acts like a thermostat that strives to keep the metric close to a desired value.
      Netflix uses a predictive auto scaling policy that scales up early in anticipation of load and scales down slowly to avoid causing resource shortages. It takes public holidays and big public events into account.
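The thermostat behaviour of target tracking can be approximated with a simple proportional rule. This is a sketch under stated assumptions, not the algorithm AWS actually runs: it assumes the tracked metric (for example, average CPU utilization) scales inversely with the number of instances, and the function name and size limits are hypothetical.

```python
import math


def desired_capacity(current_instances, metric_value, target_value,
                     min_size=1, max_size=100):
    """Estimate the instance count that brings the metric back to target.

    Assumes the metric is spread evenly across instances, so doubling the
    fleet halves the per-instance value (a simplifying assumption).
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    raw = math.ceil(current_instances * metric_value / target_value)
    # Clamp to the group's configured bounds.
    return max(min_size, min(max_size, raw))


# 10 instances at 80% CPU with a 50% target: scale out to 16.
print(desired_capacity(10, 80, 50))
# 10 instances at 25% CPU with a 50% target: scale in to 5.
print(desired_capacity(10, 25, 50))
```

A predictive policy like the one described for Netflix would feed a forecast metric into the same kind of rule ahead of time, and apply a slower rate limit on the scale-in path.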
  24. Bootstrapping a Cloud Instance
      ● AWS offers a metadata service that publishes instance metadata plus custom user-supplied data and scripts, all of which can be fetched at instance launch time.
      ● An application or configuration script can use instance attributes such as instance ID or type, public hostname, IP address, AMI ID, AZ, etc. to configure the instance at launch time. Use cases:
        ○ Launch an instance and have it register itself with a DNS service
        ○ Launch an instance from a "Golden Image" and install additional patches/software on it
        ○ Run an automated test bed that performs different tests depending on instance type
      ● AWS metadata service is hosted at:
      ● cloud-init executes the user-supplied data script during the first boot cycle of the instance.
  25. Cloud Storage (AWS)
      ● Elastic Block Storage (network block storage): Network storage optimized for IO throughput and low latency.
        ○ IO1: SSD-backed network storage with bounded IO latency (most expensive)
        ○ GP2: SSD-backed network storage for lower IO latency
        ○ ST1: Magnetic-disk-backed storage for higher IO throughput
        ○ SC1: Magnetic-disk-backed storage for moderate IO throughput (least expensive)
      ● Ephemeral Storage: High-performance direct-attached storage. Data is lost on instance termination. Comes in a variety of flavors:
        ○ Magnetic disk (attached to D2, H1 instance families)
        ○ SSD (attached to I2, R3, ... instance families)
        ○ NVMe (attached to I3 instance family)
      ● EFS (managed NFS service): Shared storage that can be accessed concurrently by hundreds of cloud instances spanning multiple AZs
  26. S3 Object Storage (AWS)
      ● Manages data as objects
      ● Each object is self-identifying and discoverable because it includes metadata and a globally unique identifier
      ● Most cost-efficient and scalable method of storing data in the public cloud
      ● The flat model keeps it scalable and searchable even when the object count reaches the trillions
      ● AWS offers an API to interface with S3 objects
      ● Instances with ephemeral storage periodically back up data to S3 buckets.
      ● Ability to scale to millions of operations/sec and GBs of throughput
      Netflix cloud-native storage is built around ephemeral instances and storage.
      [Diagram: Cassandra nodes in us-east-1c, us-east-1d and us-east-1e back up to S3]
  27. Cloud Networking (AWS)
      Virtual Private Cloud (VPC):
      ● Instances are launched into an isolated VPC, a virtual network with an Internet Gateway already configured.
      ● A VPC resembles a data center, with full control over virtual networks.
      ● The default VPC spans all AZs with one subnet in each. You are free to define more subnets.
      ● A VPC has a /16 CIDR block (65k IP addresses). A subnet mask of /20 allows 4096 IPs per subnet.
      ● Security is applied at the instance level (security group) and the subnet level (ACL)
      Public and Private IP Address:
      ● New instances are assigned randomly generated public and private IP addresses. In a non-default VPC, only a private IP address is assigned
      ● The private IP address of an instance is mapped to its public IP via NAT.
      ● The private IP is used inside the AWS cloud within the same region.
      ● The public IP is used for Internet and AWS inter-region traffic
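The address arithmetic behind the /16 and /20 figures can be checked with Python's standard `ipaddress` module. The 10.0.0.0/16 range below is an arbitrary private block picked for illustration, not a value from the slides.

```python
import ipaddress

# Hypothetical VPC range; any /16 gives the same counts.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Split the VPC into /20 subnets.
subnets = list(vpc.subnets(new_prefix=20))

print(vpc.num_addresses)         # 2**(32-16) = 65536 addresses in the VPC
print(len(subnets))              # 2**(20-16) = 16 subnets of size /20
print(subnets[0].num_addresses)  # 2**(32-20) = 4096 addresses per subnet
print(subnets[1])                # second subnet starts 4096 addresses later
```

The same call with a different `new_prefix` shows the trade-off between the number of subnets and the addresses available in each.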
  28. Cloud Availability (Failover)
      Higher availability or service failover requires the same IP address to be assigned to a newly launched instance after an instance is terminated.
      Elastic IP (EIP):
      ● An EIP is a permanent public IP address that can be assigned to a running instance in any AZ.
      ● Masks the failure of an instance. No DNS propagation delay, because the IP address is persistent
      ● An EIP is owned by the account. There is a small charge for unused EIPs per account.
      ● Can be automated with a script that allocates the EIP to a running instance
      Elastic Network Interface (ENI):
      ● A virtual NIC that can be attached to a running instance in addition to the primary NIC (eth0)
      ● An ENI is per subnet, so you must create an ENI for each subnet in which you plan to run an instance
      ● An ENI has associated properties: private IP, EIP, security group, etc.
      ● When an ENI is attached to a new instance, all ENI properties migrate with it.
      ● Useful for redirecting traffic or configuring a separate network for administration and backup
  29. Cloud Availability (Elastic Load Balancer)
      ● A Load Balancer (LB) distributes incoming network traffic across a group of cloud instances
      ● The LB performs periodic health checks and stops sending traffic to unhealthy instances
      ● The AWS ASG (Auto Scaling Group) and LB work together. The ASG is responsible for replacing a bad or terminated instance, and the LB's job is to resume traffic to healthy instances.
      ● The LB supports features such as SSL termination, sticky sessions, idle connection timeout, connection draining, etc.
      ● Instances behind the LB need a private IP address only
      ● The Classic LB runs at the TCP layer (layer 4) and thus uses TCP ports to direct traffic
      ● The Application LB (ALB) makes routing decisions at the application layer (layer 7).
      ● Unlike the Classic LB, one ALB can route traffic to multiple services
      ● The ALB supports HTTP/HTTPS and thus has more context and flexibility in routing traffic
      ● The ALB supports content-based routing, which allows traffic to be routed based on URL
      ● The ALB supports dynamic port mapping, which allows load balancing across two containers (Docker) of the same service running on the same instance. Without the ALB, you may need two instances to load balance the same service.
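Content-based routing combined with health checks can be sketched in a few lines. This is a simplified model, not how an ALB is implemented: it assumes longest-prefix matching on the URL path and a single round-robin counter, and the `PathRouter` class, path prefixes and instance names are all hypothetical.

```python
from itertools import count


class PathRouter:
    """Route a URL path to a healthy instance of the matching service."""

    def __init__(self, rules):
        # rules maps a path prefix to the instances serving it.
        self._rules = rules
        self._healthy = {i for group in rules.values() for i in group}
        self._counter = count()

    def mark(self, instance, healthy):
        """Record the result of a health check for one instance."""
        if healthy:
            self._healthy.add(instance)
        else:
            self._healthy.discard(instance)

    def route(self, path):
        """Return an instance for the path, or None if nothing can serve it."""
        matches = [prefix for prefix in self._rules if path.startswith(prefix)]
        if not matches:
            return None
        prefix = max(matches, key=len)  # most specific rule wins
        candidates = [i for i in self._rules[prefix] if i in self._healthy]
        if not candidates:
            return None
        # Round-robin across the healthy instances only.
        return candidates[next(self._counter) % len(candidates)]


router = PathRouter({"/api/": ["api-1", "api-2"], "/img/": ["img-1"]})
print(router.route("/api/users"), router.route("/api/users"))
router.mark("api-1", healthy=False)
print(router.route("/api/users"))
```

One router instance serves both the `/api/` and `/img/` services, which mirrors the point that a single ALB can front multiple services while a port-based layer 4 balancer cannot distinguish them by URL.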
  30. AWS Route 53 - DNS Service
      ● Self managed DNS service in the AWS cloud
      ● You can register and park your domain name (cloudperf.net) with the Route 53 service
      ● You own any subdomain, such as techblog.cloudperf.net.
      ● Instead of relying on the dynamically assigned names of cloud instances, give them custom names
      ● Route 53 supports health checks and various routing policies: Latency Routing, Failover Routing, Geolocation Routing
  31. Containers (OS Virtualization)
      ● Lightweight virtualization supported by the operating system (kernel) to create isolated user-space instances
      ● No hypervisor is required!
      ● A container can run instructions native to the CPU without any special interpretation
      ● The goal is to create an application execution environment that mimics a standard Linux install without requiring a separate kernel
  32. Docker Container
      ● Open source container service, similar to LXC
      ● Application-centric instead of machine-centric view
      ● More nimble than a VM due to its smaller footprint
      ● Portable across data centers and public clouds
      ● Immutable. Changes to a container image are lost on termination.
      ● Supports open standard libraries: libcontainer, libswarm, ...
      ● Containers are created from a read-only template called an image
        ○ Simple templates (Dockerfile or Docker Compose) are used for building Docker images for single- and multi-container applications.
      ● All required dependencies (code, runtime, system tools, libraries, etc.) are baked into the container, which allows the software to run the same way regardless of environment
  33. Docker Image
      A Docker image is built from multiple layers:
      ● Base: boot file system. Unmounted after the container is booted
      ● rootfs: Can be any Linux distro (Ubuntu, RedHat, ...). Mounted as a read-only root file system
      ● Union mount: Docker uses a union mount to add more read-only file systems (called images) on top of the root file system.
      ● Container image: When a container image is launched, it is mounted as a read-write file system. This is where the application/process inside the Docker container runs.
      Docker images are stored in a public or private registry from which they can be downloaded and run on the cluster
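The layering idea (read-only layers below, one writable layer on top) can be modelled with `collections.ChainMap`, which looks keys up through a stack of dictionaries and sends all writes to the first one. This is an analogy only and does not reflect how Docker's storage drivers work; the file paths and contents are invented for the example.

```python
from collections import ChainMap

# Read-only image layers, modelled as path -> content dictionaries.
rootfs = {"/bin/sh": "shell from base distro", "/etc/os-release": "Ubuntu"}
app_layer = {"/app/server.jar": "application v1"}


def start_container():
    """Return (view, writable_layer) for a fresh container."""
    writable = {}
    # Lookups search writable first, then app_layer, then rootfs.
    return ChainMap(writable, app_layer, rootfs), writable


view, writable = start_container()
view["/etc/os-release"] = "patched"  # the write lands in the top layer only

print(view["/etc/os-release"])    # the container sees its own change
print(rootfs["/etc/os-release"])  # the image layer is untouched

# A new container from the same image starts without the change.
fresh_view, _ = start_container()
print(fresh_view["/etc/os-release"])
```

Because only the top dictionary is ever modified, discarding it on termination restores the original image, which is the immutability property the Docker slides describe.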
  34. Container (Docker) Orchestration
      ● An orchestration framework is required to manage the fleet of cloud instances on which Docker containers are deployed
      ● A container orchestration framework abstracts the infrastructure and presents the entire fleet of instances, or cluster, as a single deployment target.
      ● Container orchestration typically involves container scheduling, deployment, replication, scaling, monitoring, management, and failover
      ● Public cloud container services:
        ○ Amazon ECS
        ○ Azure Container Service
        ○ Google Container Engine
      ● Popular container orchestration frameworks:
        ○ Kubernetes
        ○ Mesos
        ○ Docker Swarm
        ○ CoreOS Fleet
  35. Cloud Monitoring and Resource Tagging (AWS)
      CloudWatch
      ● AWS service for monitoring AWS resource metrics to gain visibility into resource utilization and performance
      ● Applications can also store custom metrics and logs in CloudWatch
      ● Metrics can be polled to set an alarm or alert when a threshold is met.
      Tagging
      ● AWS allows cloud resources to be tagged using a key and an (optional) value.
      ● Allows companies to track resource usage internally across departments (sales, marketing) and for billing.
      ● Tags can also be used to identify cloud resources used in prod and test environments
      ● Tagged cloud resources can be searched and filtered to perform an action on them as a group.
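Selecting resources by tag so that an action can be applied to them as a group can be sketched with plain dictionaries. The resource records, IDs and tag names below are invented for illustration and do not come from any AWS API response.

```python
def filter_by_tags(resources, **wanted):
    """Return resources whose tags contain every wanted key/value pair."""
    return [
        r for r in resources
        if all(r.get("tags", {}).get(k) == v for k, v in wanted.items())
    ]


# Hypothetical inventory of tagged cloud resources.
fleet = [
    {"id": "i-001", "tags": {"env": "prod", "dept": "sales"}},
    {"id": "i-002", "tags": {"env": "test", "dept": "sales"}},
    {"id": "i-003", "tags": {"env": "prod", "dept": "marketing"}},
    {"id": "vol-004", "tags": {}},
]

# Everything in production, regardless of department.
print([r["id"] for r in filter_by_tags(fleet, env="prod")])
# Only production resources that belong to sales.
print([r["id"] for r in filter_by_tags(fleet, env="prod", dept="sales")])
```

The same filter supports both uses named on the slide: grouping by `dept` for cost tracking and grouping by `env` to separate prod from test.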
  36. Multi-Tier Cloud Security (AWS)
      ● AWS Security Keys: used for cloud resource provisioning
      ● Key Pair: public/private key pair to authenticate SSH login to a cloud instance
      ● IAM Users/Groups: given limited access to cloud resources by attaching a policy that lists the cloud resources a user/group can provision or access.
      ● IAM Roles: allow assigning temporary security credentials to an application running on a cloud instance or mobile device for access to AWS services and resources. Examples: InstanceProfile, AssumeRole API
      ● Security Group (SG): implements security (firewall) at the cloud instance
      ● Access Control List (ACL): implements security at the network level
      ● AWS CloudTrail Service: event history of account activity for security analysis, resource charge tracking and troubleshooting.
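A policy that limits a user or group to specific resources is a JSON document. The sketch below builds one as a Python dictionary; the bucket name is a placeholder and the choice of read-only S3 actions is an assumption made for the example, not something the slides prescribe.

```python
import json


def read_only_bucket_policy(bucket_name):
    """Build a policy document granting read access to one S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Listing applies to the bucket, reading to its objects.
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }


# "example-bucket" is a placeholder name.
policy = read_only_bucket_policy("example-bucket")
print(json.dumps(policy, indent=2))
```

Anything not listed in an `Allow` statement is denied by default, which is what makes attaching a narrow policy an effective way to give limited access.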
  37. Multi-Tier Cloud Security (AWS) - Cont.
      ● Meets several compliance requirements (financial, healthcare, govt.)
      ● DDoS mitigation
      ● Data encryption in transit (TLS) and at rest for better data privacy
      ● Highly secure AWS data centers
      ● Multi-Factor Authentication for privileged accounts
      ● Integration with a corporate directory using AWS Directory Service to easily migrate directory-aware on-premises workloads