CloudStack Best Practices in PPTV
DeanWei
About Me
• OPS Architect at PPTV
• 3 years of experience in software development and design
• 6 years of experience as a technical consultant (infrastructure
  architecture design, integration, solutions, capacity
  planning, and performance tuning) for top insurance
  companies (AIG, ASR, ACE, Fortis, SNS REAAL, Chubb, GEL, SBI)
• 1 year of experience in ASP (Application Service Provider)
  platform architecture design, security, performance
  analysis and optimization, and operations
• Current focus: automated operations architecture,
  cloud platform building, large-scale distributed system
  operations, performance analysis and optimization,
  continuous delivery, and system performance tuning
• SINA WEIBO (DeanWei): http://weibo.com/deanw
Agenda
• Why Cloud?
• What is CloudStack?
• How to Build?
Overview
• Why Use Cloud?
• Why CloudStack?
• What is CloudStack?
• How to Build a Cloud-Based Infrastructure Platform?
• CloudStack Best Practices in PPTV
   o Deployment Architecture
   o Network Considerations and Design
   o Storage Considerations and Design
   o Service Offering Considerations and Design
   o Troubleshooting Best Practices
   o Performance Tuning
Background And Challenge
The Original Infrastructure Provisioning Processes

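[Flow diagram; steps listed row by row as labeled on the slide]
Row 1: App Ops requests resources | IDC looks up the CMDB | IDC initializes the OS | IDC installs the VM software | IDC creates the VM
Row 2: Monitoring team updates Zabbix monitoring | App Ops updates the CMDB | App Ops installs the application | App Ops installs the middleware | App Ops initializes the VM
Row 3: Tools adjusts the release configuration | Change-control approval and application go-live | Migrate to the environment | Re-cable and migrate to the production environment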
Problems
 A.   Requires a large number of people
 B.   Involves a large number of manual steps
 C.   Servers are built one at a time
 D.   No self-service
 E.   Not available out of the box
 F.   Not elastic
 G.   Path dependence
 H.   Long build times
 I.   Many points of failure
Five Characteristics of Clouds
 A. On-Demand Self-Service
 B. Scalable
 C. Resource Pooling
 D. Rapid Elasticity
 E. Measured Service

• Cloud technology can solve our current problems!
Cloud-based Infrastructure Provisioning Processes
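[Flow diagram; steps as labeled on the slide]
 1. App Ops requests an application environment
 2. Ops opens the Services UI (verify the resource allocation)
 3. Ops picks the application's most recent snapshot template (choose the application template and resource size)
 4. Select the available resources (which resources are available and when to use them)
 5. Press "Start"
 6. Resources are allocated and registered automatically (resource allocation, automatic VM creation, monitoring registration, etc.)

Diagram labels: ERP, CRM, app; APP, App1, APP2

 o Provisioned when needed
 o Out of the box
 o Parallel building
 o Self-service
 o One button for all
 o Elastic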
Cloud Still Requires Architectural Design

• Cloud computing isn't a magical solution: apps need to be
  able to scale out

• Design your architecture with the end in mind

• Make your infrastructure easily replicable
Popular Cloud Software Platform
Why CloudStack?
• Open source: Apache License 2.0
• Proven by existing CloudStack users, with a good track record
• Very easy to install and get up and running
• Fewer man-hours for implementation
• Easy to integrate and customize
• Matches our requirements at this stage
What is CloudStack?

• Open source Infrastructure as a Service (IaaS) solution
• Programmable data center orchestrator
• Hypervisor agnostic
• Supports scalable storage (Ceph, Swift, NFS)
• Supports complex enterprise networking (e.g., firewall, load balancer, VPN, VPC, ...)
• Multi-tenant
Core Components
• Hosts
   o Servers onto which services will be provisioned
• Primary Storage
   o VM disk storage
• Cluster
   o A grouping of hosts and their associated storage
• Pod
   o Collection of clusters in the same failure boundary
• Network
   o Logical network associated with service offerings
• Secondary Storage
   o Template, snapshot and ISO storage
• Zone
   o Collection of pods, network offerings and secondary storage
• Management Server Farm
   o Management and provisioning tasks

[Figure: nested diagram of Zone > CloudStack Pod > Cluster > Host > VM, with Network, Primary Storage and Secondary Storage]
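
The containment relationships listed above can be sketched as a small data model. This is only an illustration of the hierarchy (zone, pod, cluster, host), not CloudStack's own classes; every class, field and example name below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    name: str


@dataclass
class Cluster:
    name: str
    primary_storage: str  # VM disk storage associated with the cluster
    hosts: List[Host] = field(default_factory=list)


@dataclass
class Pod:
    name: str
    clusters: List[Cluster] = field(default_factory=list)


@dataclass
class Zone:
    name: str
    secondary_storage: List[str] = field(default_factory=list)  # templates, ISOs, snapshots
    pods: List[Pod] = field(default_factory=list)

    def host_names(self) -> List[str]:
        """Walk zone -> pod -> cluster -> host and collect the host names."""
        return [h.name for p in self.pods for c in p.clusters for h in c.hosts]


# Hypothetical example: one zone, one pod, one cluster, two hosts.
zone = Zone(
    name="zone1",
    secondary_storage=["secondary1"],
    pods=[Pod("pod1", [Cluster("cluster1", "primary1", [Host("host1"), Host("host2")])])],
)
print(zone.host_names())
```

The model makes the scoping visible: primary storage hangs off a cluster, while secondary storage hangs off the zone.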
Two Types of Storage
Primary Storage
•    Stores disk volumes for VMs in a cluster
•    Configured at cluster level
•    Close to hosts for better performance
•    Each cluster has at least one primary storage
•    Requires high IOPS (can be expensive)

Secondary Storage
•    Stores all templates, ISOs and snapshots
•    Configured at zone level
•    A zone can have one or more secondary storages
•    High-capacity, low-cost commodity storage

[Figure: L3 switch above Pod 1 and Secondary Storage; Pod 1 has an L2 switch and Cluster 1 with Host 1, Host 2 and Primary Storage]
Deployment Architecture

• Hypervisor is the basic unit of scale.
• Cluster consists of one or more hosts of the same hypervisor.
• All hosts in a cluster have access to shared (primary) storage.
• Pod is one or more clusters, usually with L2 switches.
• Availability Zone has one or more pods and has access to secondary storage.
• One or more zones represent a cloud.

[Figure: Internet and Management Server Cluster above Zone 1; Zone 1 has an L3 switch, Pods 1 to N and Secondary Storage; Pod 1 has an L2 switch and Clusters 1 to N; Cluster 1 has Host 1, Host 2 and Primary Storage]
Software Architecture
[Figure: CloudStack software architecture; labels listed from top to bottom]
• Clients: UI, Cloud Portal, CLI, Other Clients
• Management Server REST API: OAM&P API, End User API, EC2 API, Other APIs, Pluggable Service API Engine
• ACL & Authentication: Accounts, Domains, and Projects; ACL, limits checking
• Security Adapters, Account Management Connectors
• Services API: Console Proxy Management, Template Access, HA, Usage Calculations, Additional Services
• Orchestration Engine: drives long-running VM operations; syncs between managed resources and the DB; generates events
• Plugin API: Deployment Planning, Network Gurus, Network Elements, Hypervisor Gurus
• Cluster Management, Resource Management, Job Management, Alert & Event Management, Database Access
• Message Bus, Event Bus, Usage Server, DB
• Resource API: Hypervisor Resources, Network Resources, Storage Resources, Image Resources, Snapshot Resources
Data And Control Flow

• Management Servers control all resources, both virtual and physical
• SSVMs are deployed to transfer data between zones
• CPVMs are deployed to transfer VNC console traffic
• VRs are deployed for traffic into the public Internet
• The Management Server is never in the data path

[Figure: a cloud of Data Centers 1, 2 and 3 connected over the Internet; Data Center 1 hosts the Management Server; each data center runs a VR, an SSVM and a CPVM; templates, ISOs and snapshots are transferred between SSVMs]
How to build A Cloud-based infrastructure Platform?

• An infrastructure management platform consists of:
   o Provisioning
   o Configuration Management
   o Services Orchestration
   o Monitoring and Alerting
• How to build?
   o Architecture
       - A programmable infrastructure architecture
   o Open source toolchains
An Infrastructure Management Platform Consists Of

• Provisioning
   o Installation of operating systems and other software
• Configuration Management
   o Sets the parameters for servers; can specify initial parameters
• Services Orchestration
   o Automates tasks across systems
• Monitoring and Alerting
   o Records errors and the health of the infrastructure
   o Alert services
A Programmable Infrastructure Architecture
Open Source Provisioning Tools


Tool                                      Year Started   License   Installation Targets
Kickstart                                 ?              GPL       Most .deb- and RPM-based Linux distros
Cobbler (plus koan for PXE boot of VMs)   2007           GPL       Red Hat, OpenSUSE, Fedora, Debian, Ubuntu
Spacewalk                                 2008           GPL       Fedora, CentOS
Crowbar                                   2011           Apache    (Bare-metal provisioning)
Open Source Configuration Management Tools

Tool       Year Started   Language   License   Client/Server
Cfengine   1993           C          Apache    Yes
Chef       2009           Ruby       Apache    Chef Solo: No; Chef Server: Yes
Puppet     2004           Ruby       GPL       Yes
Salt       2011           Python     Apache    Yes
Open Source Monitoring Tools


Tool              License   Type of Monitoring                             Collection Methods
Cacti / RRDTool   GPL       Performance                                    SNMP, syslog
Nagios            GPL       Availability                                   SNMP, TCP, ICMP, IPMI, syslog
Zabbix            GPL       Availability, performance and more             SNMP, TCP/ICMP, IPMI, synthetic transactions
Zenoss            GPL       Availability, performance, event management    SNMP, ICMP, SSH, syslog, WMI
Open Source Automation/Orchestration Tools

Tool                  Year Started   Language   License   Client/Server   Support Organization
Capistrano            2006           Ruby       MIT       Yes             None
ControlTier/RunDeck   2010           Java       Apache    Yes             DTO Solutions
Func                  2007           Python     GPL       Yes             Fedora Project
MCollective           2009           Ruby       Apache    Yes             PuppetLabs
Salt                  2011           Python     Apache    Yes             SaltStack Inc. ?
Provisioning Activity Flow And Open Source Tools




Provisioning Activity   Scope                                               Open Source Tools
Command and Control     Application services orchestration and management   Services Portal, ControlTier
Configuration           System configuration                                Puppet, Zabbix
Bootstrapping           VM image launch, OS install                         Cobbler, CloudStack
Automated Tools Chain in PPTV

Stage                    Tools
Generate Images          BoxGrinder
Bootstrapped Image       Cobbler/CloudStack
Provision                Cobbler/CloudStack/Koan
Configuration            Puppet
Services Orchestration   ControlTier/Zabbix agent
Monitoring               Zabbix, Cacti
CMDB                     CMDBuild/RackTables
Cloudstack In PPTV
• CloudStack version: 3.0.2
• Hypervisor: KVM
• Host OS: CentOS 6.2
• KVM guest OS: CentOS 5.8
• Multiple management servers deployed in the multi-line/BGP IDC
• Deployed to all core IDCs and used for non-VOD business
• More than 150 hosts
• Primary storage: local storage
• Secondary storage: local NFS server and GlusterFS
• Network: Basic Network
• Monitoring: Zabbix
• System configuration management: Puppet
• Services orchestration management: ControlTier/Services Portal
• Patches for performance, integration and stability
• Workarounds for some issues
Deployment Architecture

BGP/Multi-line Management Farm

IDC                     Zone        Notes
BGP IDC                 BGP Zone    Hosts the Management Server
Shenyang Telecom IDC    SYCB Zone
Shanghai Telecom IDC    SHTB Zone
Guangzhou Telecom IDC   GZTB Zone
Chengdu Telecom IDC     CDTB Zone
Beijing Netcom IDC      BJCB Zone
Management Server Deployment Architecture


[Figure: User API and Admin API requests pass through a Load Balancer to Management Server1 and Management Server2; the servers use a MySQL database that is replicated to a slave; the management servers control the infrastructure resources in Zone1, Zone2 and Zone3]
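
Scripts that call the User or Admin API through the load balancer have to sign each request. Below is a minimal sketch of the request-signing scheme described in the CloudStack developer documentation (HMAC-SHA1 over the sorted, lowercased query string, then base64). The function name and the API key and secret values are placeholders, and the encoding details should be checked against the CloudStack version in use.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_request(params: dict, secret_key: str) -> str:
    """Return a base64-encoded HMAC-SHA1 signature for the given API parameters."""
    # URL-encode each value and sort the pairs by lowercased parameter name.
    pairs = sorted(
        ((str(k), quote(str(v), safe="")) for k, v in params.items()),
        key=lambda kv: kv[0].lower(),
    )
    query = "&".join(f"{k}={v}" for k, v in pairs)
    # The string to sign is the whole query string in lowercase.
    digest = hmac.new(
        secret_key.encode("utf-8"), query.lower().encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder credentials; real keys come from the CloudStack account.
params = {"command": "listZones", "apiKey": "EXAMPLE-API-KEY", "response": "json"}
signature = sign_request(params, "EXAMPLE-SECRET-KEY")
print(signature)
```

The signature is appended to the request as the `signature` parameter (URL-encoded). Because the parameters are sorted before signing, the caller does not need to care about their order.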
Network Considerations And Design

• Use Basic Network
• Custom network offering for the basic network (DHCP only)
• Disable iptables for performance (requires modifying the source code)
• Disable Security Groups
• Multi-zone design for primary storage performance
Storage Considerations And Design

• Use local storage
• Each cluster maps to a single host
• Primary Storage
   o Each local disk serves only one VM instance
   o Back up VM instances as templates on a schedule
   o Use the shared storage type
   o Separate application data and log data onto the root volume and data volume
• Secondary Storage
   o Local NFS server
       - Back up data using inotify and rsync
       - Network card bonding
       - Uplink to 10G
       - Manual failover
   o GlusterFS over NFS

[Figure: L3 switch above Pod 1 and Secondary Storage; Pod 1 has an L2 switch and Cluster 1 with Host 1 and Primary Storage]
Services Offering Considerations And Design

• Disable HA
• Each disk offering is bound to a specific disk
• Each compute offering is bound to a specific host and disk
Provisioning Processes Best Practices

A. Install the host OS with Cobbler
B. Install the CloudStack agent and apply system settings with Puppet
C. Install and configure monitoring with Puppet
D. The services orchestration system triggers scripts to register the host with CloudStack
E. The services orchestration system triggers a script to generate disk
   offerings and compute offerings for the host
F. The services orchestration system registers the host in the CMDB
G. The host goes live
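
Step E can be sketched as a function that builds the API parameter sets for one host. This is not PPTV's actual script: the tag naming scheme, offering names and sizes are hypothetical, and it assumes the `createDiskOffering` and `createServiceOffering` API commands with their `tags`, `hosttags` and `offerha` parameters. It only builds the parameters and does not call the API.

```python
def build_offering_params(host, disks, cpu_number, cpu_speed_mhz, memory_mb, disk_size_gb):
    """Build one disk offering and one compute offering per (host, disk) pair."""
    requests = []
    for disk in disks:
        tag = f"{host}-{disk}"  # hypothetical tag naming scheme
        requests.append({
            "command": "createDiskOffering",
            "name": f"disk-{tag}",
            "displaytext": f"Data disk on {tag}",
            "disksize": disk_size_gb,
            "tags": tag,  # storage tag binds the offering to one disk
        })
        requests.append({
            "command": "createServiceOffering",
            "name": f"compute-{tag}",
            "displaytext": f"Compute on {tag}",
            "cpunumber": cpu_number,
            "cpuspeed": cpu_speed_mhz,
            "memory": memory_mb,
            "hosttags": host,  # host tag binds the offering to one host
            "tags": tag,
            "offerha": "false",  # HA is disabled in this design
        })
    return requests


# Hypothetical host with two local disks.
reqs = build_offering_params("host1", ["sdb", "sdc"], 4, 2000, 8192, 500)
for r in reqs:
    print(r["command"], r["name"])
```

For the tags to take effect, the matching host tag and storage tag also have to be set on the host and on its storage pool when they are registered.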
Troubleshooting Best Practices

• Analyze log files
   o Management log: /var/log/cloud/management/
   o Agent log: /var/log/cloud/agent/
   o Adjust the log4j level for debugging
• Source code
• Data models
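
When analyzing the logs, a first pass is often just filtering by severity. The sketch below assumes a log4j-style line layout of "date time LEVEL [logger] message". The layout, the logger names and the sample lines are illustrative assumptions, not copied from a real CloudStack log.

```python
import re

# Assumed layout: "<date> <time> <LEVEL> [logger] message"
LEVEL_RE = re.compile(r"^\S+ \S+ (DEBUG|INFO|WARN|ERROR|FATAL)\b")


def filter_by_level(lines, levels=("WARN", "ERROR", "FATAL")):
    """Return the lines whose log level is in `levels`."""
    keep = []
    for line in lines:
        match = LEVEL_RE.match(line)
        if match and match.group(1) in levels:
            keep.append(line.rstrip("\n"))
    return keep


# Illustrative sample lines.
sample = [
    "2012-09-22 10:00:00,123 DEBUG [cloud.agent] (Thread-1) ping\n",
    "2012-09-22 10:00:01,456 ERROR [cloud.vm] (Job-2) Unable to start VM\n",
    "java.lang.RuntimeException: continuation line without a level\n",
]
for hit in filter_by_level(sample):
    print(hit)
```

To scan a real file, pass an open file object, for example `filter_by_level(open(path))` with a log file from the management log directory listed above.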
Performance Tuning

• BIOS settings for KVM hosts
   o For Dell PowerEdge servers:
      A. Set the Power Management Mode to Maximum Performance.
      B. Set the CPU Power and Performance Management Mode to Maximum Performance.
      C. Processor Settings: set Turbo Mode to enabled.
      D. Processor Settings: set C States to disabled.
Performance Tuning (contd)

• CloudStack tuning
   o NFS server tuning
       - Use NFSv4
       - Mount options: noatime,nodiratime,noacl,data=writeback,commit=15
       - IDE/SATA parameters
       - NIC and TCP/IP
       - Use GlusterFS
   o Management server tuning
       - Increase the number of worker processes
       - Turn off stats collectors
       - Tune the allocation algorithm
       - Tune the direct agent load size
       - MySQL DB tuning
       - JVM tuning
           - Heap size tuning
           - Use the CMS GC algorithm
Performance Tuning (contd)
• KVM tuning
   o CPU
       - Disable KSM on the KVM host
       - Disable tickless mode in the KVM guest
       - Pin CPUs on the KVM host
   o Memory
       - THP on the KVM host
             echo 'yes' > /sys/kernel/mm/redhat_transparent_hugepage/khugepaged/defrag
             echo 'always' > /sys/kernel/mm/redhat_transparent_hugepage/enabled
             echo 'never' > /sys/kernel/mm/redhat_transparent_hugepage/defrag
   o Network performance issue in CentOS 6.2
       - Workaround: blacklist vhost-net. Edit /etc/modprobe.d/blacklist-kvm.conf and
         include vhost-net.

• Linux kernel parameter tuning
   o TCP buffer tuning
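
After applying the THP settings listed above, it helps to check which value is active. The kernel marks the active choice in these sysfs files with square brackets (for example `always madvise [never]`). The helper below is a small sketch that parses such a string; the function name is hypothetical, and the exact set of choices varies by kernel.

```python
def active_thp_mode(text: str) -> str:
    """Return the bracketed (active) value from the contents of a THP sysfs file."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active value found in {text!r}")


# Sample file contents; on a host, read them from the sysfs paths used above.
print(active_thp_mode("always madvise [never]"))  # prints "never"
print(active_thp_mode("[yes] no"))                # prints "yes"
```

On a KVM host the input would come from reading `/sys/kernel/mm/redhat_transparent_hugepage/enabled` or one of the `defrag` files written by the echo commands.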
Q&A

08448380779 Call Girls In Civil Lines Women Seeking Men
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 

CloudStack Best Practice in PPTV

  • 1. CloudStack Best Practices in PPTV. DeanWei
  • 2. About Me: OPS Architect at PPTV
      • 3 years of experience in software development and design
      • 6 years of experience as a technical consultant (infrastructure architecture design, integration, solutions, capacity planning, and performance tuning) for top insurance companies (AIG, ASR, ACE, Fortis, SNS REAAL, Chubb, GEL, SBI)
      • 1 year of experience in ASP (Application Service Provider) platform architecture design, security, performance analysis and optimization, and operations
      • Current focus: automated operations architecture, cloud platform building, large-scale distributed system operations, performance analysis and optimization, continuous delivery, and system performance tuning
      • Sina Weibo (DeanWei): http://weibo.com/deanw
  • 3. Agenda: Why Cloud? What is CloudStack? How to Build?
  • 4. Overview
      • Why use cloud?
      • Why CloudStack?
      • What is CloudStack?
      • How to build a cloud-based infrastructure platform?
      • CloudStack best practices in PPTV
          o Deployment architecture
          o Network considerations and design
          o Storage considerations and design
          o Service offering considerations and design
          o Troubleshooting best practices
          o Performance tuning
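  • 6. The Original Infrastructure Provisioning Processes (flow diagram)
      • App Ops requests resources
      • IDC looks up the CMDB
      • IDC initializes the OS
      • IDC installs the VM software
      • IDC creates the VM
      • App Ops initializes the VM
      • App Ops installs middleware
      • App Ops installs the application
      • App Ops updates the CMDB
      • Monitoring team updates Zabbix monitoring
      • Tools team adjusts the release configuration
      • Change-control approval; application goes live
      • Migrate to the environment
      • Re-cable and migrate to the production environment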
  • 7. Problems
      A. Ties up a large number of people
      B. A large number of manual steps
      C. Servers are built one at a time
      D. No self-service
      E. Not usable out of the box
      F. Non-elastic
      G. Path dependence
      H. Long build time
      I. Many points of failure
  • 8. Five Characteristics of Clouds
      A. On-demand self-service
      B. Scalable
      C. Resource pooling
      D. Rapid elasticity
      E. Measured service
      Cloud technology can solve our current difficulties!
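  • 9. Cloud-based Infrastructure Provisioning Processes (flow diagram): provisioned when needed
      • App Ops requests an application environment
      • Ops opens the Services UI
      • Ops picks the application's most recent snapshot template
      • Ops selects the available resources
      • Diagram notes: verify the resource allocation; choose the application template and resource size; which resources are available and when to use them
      • Press "Start"
      • Resources are allocated automatically and the app is registered (resource allocation, automatic VM creation, monitoring registration, etc.)
      • Example applications in the diagram: ERP, CRM, App1, App2
      • Benefits:
          o Out of the box
          o Parallel building
          o Self-service
          o One button for all
          o Elastic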
  • 10. Cloud Still Requires Architectural Design
      • Cloud computing isn't a magical solution: apps need to be able to scale out
      • Design your architecture with the end in mind
      • Make your infrastructure easily replicable
  • 12. Why CloudStack?
      • Open source: Apache 2.0
      • CloudStack users: it is proven and has a good track record
      • Very easy to install and get up and running
      • Fewer man-hours for implementation
      • Easy to integrate and customize
      • Matches our requirements at this stage
  • 13. What is CloudStack?
      • Open source Infrastructure as a Service (IaaS) solution
      • Programmable data center orchestrator
      • Hypervisor agnostic
      • Supports scalable storage (Ceph, Swift, NFS)
      • Supports complex enterprise networking (e.g. firewall, load balancer, VPN, VPC…)
      • Multi-tenant
  • 14. Core Components
      • Hosts: servers onto which services will be provisioned
      • Primary Storage: VM disk storage
      • Cluster: a grouping of hosts and their associated storage
      • Pod: collection of clusters in the same failure boundary
      • Network: logical network associated with service offerings
      • Secondary Storage: template, snapshot, and ISO storage
      • Zone: collection of pods, network offerings, and secondary storage
      • Management Server Farm: management and provisioning tasks
      • Diagram labels: VM, Host, Network, Primary Storage, Cluster, Secondary Storage, CloudStack Pod, Zone
  • 15. Two Types of Storage
      • Primary Storage
          o Stores disk volumes for VMs in a cluster
          o Configured at cluster level
          o Close to hosts for better performance
          o Each cluster has at least one primary storage
          o Requires high IOPS (can be expensive)
      • Secondary Storage
          o Stores all templates, ISOs, and snapshots
          o Configured at zone level
          o A zone can have one or more secondary storages
          o High-capacity, low-cost commodity storage
      • Diagram labels: L3 switch, Pod 1, L2 switch, Cluster 1, Host 1, Host 2, Primary Storage, Secondary Storage
  • 16. Deployment Architecture
      • The hypervisor is the basic unit of scale
      • A cluster consists of one or more hosts running the same hypervisor
      • All hosts in a cluster have access to shared (primary) storage
      • A pod is one or more clusters, usually with L2 switches
      • An availability zone has one or more pods and has access to secondary storage
      • One or more zones represent a cloud
      • Diagram labels: Internet, Management Server Cluster, Zone 1, L3, Pod 1 … Pod N, L2, Cluster 1 … Cluster N, Host 1, Host 2, Primary Storage, Secondary Storage
  • 17. Software Architecture (diagram)
      • Clients: Cloud Portal, UI, CLI, other clients
      • Management Server APIs: REST API, OAM&P API, End User API, EC2 API, other APIs, Pluggable Service API Engine
      • ACL & Authentication, Security Adapters, Console Proxy Management, Template Access, Connectors
      • Account Management: accounts, domains, and projects; ACL and limits checking
      • Services API, Plugin API, DB
      • Deployment Planning, HA, Network Gurus, Network Elements, Hypervisor Gurus, Usage Calculations, Additional Services
      • Orchestration Engine: drives long-running VM operations; syncs between managed resources and the DB; generates events
      • Cluster Management, Resource Management, Job Management, Alert & Event Management, Database Access
      • Message Bus, Event Bus, Usage Server
      • Resource API: Hypervisor, Network, Storage, Image, and Snapshot Resources
  • 18. Data And Control Flow
      • Management Servers control all resources, both virtual and physical
      • SSVMs are deployed to transfer data (templates, ISOs, snapshots) between zones
      • CPVMs are deployed to transfer VNC console traffic
      • A VR is deployed for traffic into the public internet
      • The Management Server is never in the data path
      • Diagram labels: Cloud, Data Center 1, Data Center 2, Data Center 3, Management Server, VR, CPVM, SSVM, Internet
  • 19. How to Build a Cloud-based Infrastructure Platform?
      • An infrastructure management platform consists of:
          o Provisioning
          o Configuration management
          o Services orchestration
          o Monitoring and alerting
      • How to build?
          o Architecture: a programmable infrastructure architecture
          o Open source toolchains
  • 20. What an Infrastructure Management Platform Consists Of
      • Provisioning: installation of operating systems and other software
      • Configuration management: sets the parameters for servers; can specify initial parameters
      • Services orchestration: automates tasks across systems
      • Monitoring and alerting: records errors and the health of the infrastructure; alert services
  • 22. Open Source Provisioning Tools (tool: year started; license; installation targets)
      • Kickstart: ?; GPL; most .deb and RPM based Linux distros
      • Cobbler (plus koan for PXE boot of VMs): 2007; GPL; Red Hat, OpenSUSE, Fedora, Debian, Ubuntu
      • Spacewalk: 2008; GPL; Fedora, CentOS
      • Crowbar: 2011; Apache; (bare metal provisioning)
  • 23. Open Source Configuration Management Tools (tool: year started; language; license; client/server)
      • Cfengine: 1993; C; Apache; yes
      • Chef: 2009; Ruby; Apache; Chef Solo: no, Chef Server: yes
      • Puppet: 2004; Ruby; GPL; yes
      • Salt: 2011; Python; Apache; yes
  • 24. Open Source Monitoring Tools (tool: license; type of monitoring; collection methods)
      • Cacti / RRDTool: GPL; performance; SNMP, syslog
      • Nagios: GPL; availability; SNMP, TCP, ICMP, IPMI, syslog
      • Zabbix: GPL; availability, performance, and more; SNMP, TCP/ICMP, IPMI, synthetic transactions
      • Zenoss: GPL; availability, performance, event management; SNMP, ICMP, SSH, syslog, WMI
  • 25. Open Source Automation/Orchestration Tools (tool: year started; language; license; client/server; support organization)
      • Capistrano: 2006; Ruby; MIT; yes; none
      • ControlTier/RunDeck: 2010; Java; Apache; yes; DTO Solutions
      • Func: 2007; Python; GPL; yes; Fedora Project
      • MCollective: 2009; Ruby; Apache; yes; PuppetLabs
      • Salt: 2011; Python; Apache; yes; SaltStack Inc. ?
  • 26. Provisioning Activity Flow And Open Source Tools (diagram: provisioning activity and the tools used for it)
      • Application services orchestration and management (command and control): ControlTier, Services Portal
      • Configuration (system configuration): Zabbix, Puppet
      • Bootstrapping (VM image launch, OS install): CloudStack, Cobbler
  • 27. Automated Tool Chain in PPTV
      • Generate images: BoxGrinder
      • Bootstrapped image: Cobbler/CloudStack
      • Provision: Cobbler/CloudStack/Koan
      • Configuration: Puppet
      • Monitoring: Zabbix, Cacti
      • Services orchestration: ControlTier/Zabbix agent
      • CMDB: CMDBuild/RackTables
  • 28. CloudStack In PPTV
      • CS version: 3.0.2
      • Hypervisor: KVM
      • Host OS: CentOS 6.2
      • KVM guest OS: CentOS 5.8
      • Multiple management servers are deployed in the multi-line/BGP IDC
      • Deployed to all core IDCs and used for the non-VOD business
      • More than 150 hosts
      • Primary storage: local storage
      • Secondary storage: local NFS server and GlusterFS
      • Network: Basic Network
      • Monitoring: Zabbix
      • System configuration management: Puppet
      • Services orchestration management: ControlTier/Services Portal
      • Patches for performance, integration, and stability
      • Workarounds for some issues
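  • 29. Deployment Architecture (diagram)
      • BGP/multi-line management farm: Management Server in the BGP IDC
      • BGP IDC: BGP Zone
      • Shenyang Telecom IDC: SYCB Zone
      • Shanghai Telecom IDC: SHTB Zone
      • Guangzhou Telecom IDC: GZTB Zone
      • Chengdu Telecom IDC: CDTB Zone
      • Beijing Netcom IDC: BJCB Zone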
  • 30. Management Server Deployment Architecture (diagram)
      • User API and Admin API requests go through a load balancer
      • The load balancer fronts Management Server1 and Management Server2
      • MySQL with replication to a slave
      • The management servers control infrastructure resources in Zone1, Zone2, and Zone3
  • 31. Network Considerations And Design
      • Use Basic Network
      • Custom network offering for Basic Network (DHCP only)
      • Disable iptables for performance (requires modifying the source code)
      • Disable security groups
      • Multi-zone design for primary storage performance
  • 32. Storage Considerations And Design
      • Use local storage
          o Each cluster maps to one host
      • Primary Storage
          o Each local disk serves only one VM instance
          o VM instances are backed up as templates on a schedule
          o Uses the shared storage type
          o Application data and log data are separated onto the root volume and the data volume
      • Secondary Storage
          o Local NFS server
          o Data is backed up with inotify and rsync
          o Network card bonding
          o Uplink to 10G
          o Manual failover
          o GlusterFS over NFS
      • Diagram labels: L3 switch, Pod 1, L2 switch, Cluster 1, Host 1, Primary Storage, Secondary Storage
  • 33. Service Offering Considerations And Design
      • Disable HA
      • Each disk offering is bound to a specific disk
      • Each compute offering is bound to a specific host and disk
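One way to picture "one offering per host and disk" is to generate the API parameters for each host from a script. The sketch below is only an illustration, not PPTV's actual tooling: it assumes the binding is done with host tags and storage tags, and the parameter names (`hosttags`, `tags`, `storagetype`, and so on) follow the CloudStack `createServiceOffering` and `createDiskOffering` API calls as commonly documented, so they should be checked against the API reference for the version in use. The host name, tag naming scheme, and sizes are hypothetical.

```python
from urllib.parse import urlencode


def offerings_for_host(host_name, disk_tags, cpu_number, cpu_mhz, memory_mb, disk_gb):
    """Build request parameters for one compute offering and one disk offering per disk.

    Assumption: the host carries the host tag "host-<name>" and each local
    disk (storage pool) carries one of the storage tags in disk_tags.
    """
    compute = {
        "command": "createServiceOffering",
        "name": f"compute-{host_name}",
        "displaytext": f"Compute offering pinned to {host_name}",
        "cpunumber": cpu_number,
        "cpuspeed": cpu_mhz,
        "memory": memory_mb,
        "storagetype": "local",
        "hosttags": f"host-{host_name}",  # restricts placement to the tagged host
        "tags": disk_tags[0],             # storage tag for the root disk (assumed)
        "offerha": "false",               # HA is disabled in this design
    }
    disks = [
        {
            "command": "createDiskOffering",
            "name": f"disk-{tag}",
            "displaytext": f"Data disk on {tag}",
            "disksize": disk_gb,
            "tags": tag,                  # restricts the volume to the tagged pool
        }
        for tag in disk_tags
    ]
    return compute, disks


def to_query(params):
    """Render parameters as a sorted query string (signing is left out of this sketch)."""
    return urlencode(sorted(params.items()))


if __name__ == "__main__":
    compute, disks = offerings_for_host("kvm01", ["kvm01-sdb", "kvm01-sdc"], 4, 2000, 8192, 500)
    print(to_query(compute))
    for disk in disks:
        print(to_query(disk))
```

A real call would also need the API key and request signature, which are omitted here.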
  • 34. Provisioning Processes Best Practices
      A. Install the host OS with Cobbler
      B. Install the CS agent and apply system settings with Puppet
      C. Install and configure monitoring with Puppet
      D. The services orchestration system triggers scripts to register the host with CS
      E. The services orchestration system triggers a script to generate disk offerings and compute offerings for the host
      F. The services orchestration system registers the host in the CMDB
      G. The host goes live
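The steps A to G above form a strictly ordered pipeline, which can be sketched as a small runner that stops at the first failure and reports how far it got. This is a minimal sketch under stated assumptions: the step names are hypothetical, and each step only records a message where a real setup would call Cobbler, Puppet, the CloudStack API, or the CMDB.

```python
from typing import Callable, List, Tuple

# A step is a name plus an action that takes the host name.
Step = Tuple[str, Callable[[str], None]]


def run_pipeline(host: str, steps: List[Step]) -> List[str]:
    """Run the steps in order; stop at the first failure and say which step broke."""
    completed: List[str] = []
    for name, action in steps:
        try:
            action(host)
        except Exception as exc:
            raise RuntimeError(
                f"step '{name}' failed for {host}; completed so far: {completed}"
            ) from exc
        completed.append(name)
    return completed


def build_steps(log: List[str]) -> List[Step]:
    """Placeholder steps mirroring A to G; each one only appends to the log."""
    def record(label: str) -> Callable[[str], None]:
        def action(host: str) -> None:
            log.append(f"{label}: {host}")
        return action

    return [
        ("install_os", record("install host OS with Cobbler")),
        ("configure_agent", record("install CS agent and system settings with Puppet")),
        ("configure_monitoring", record("install and configure monitoring with Puppet")),
        ("register_in_cloudstack", record("register host in CloudStack")),
        ("create_offerings", record("generate disk and compute offerings")),
        ("register_in_cmdb", record("register host in CMDB")),
        ("go_live", record("host goes live")),
    ]


if __name__ == "__main__":
    log: List[str] = []
    print(run_pipeline("kvm01", build_steps(log)))
    print("\n".join(log))
```

Keeping the order in one list makes it easy to rerun from a failed step once the cause is fixed.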
  • 35. Troubleshooting Best Practices
      • Analyse log files
          o Management log: /var/log/cloud/management/
          o Agent log: /var/log/cloud/agent/
          o Adjust the log4j level for debugging
      • Source code
      • Data models
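A first pass over the management or agent logs is often just counting log levels and pulling out the error lines. The sketch below assumes log4j-style lines in which the level appears as a standalone word (DEBUG, INFO, WARN, ERROR, FATAL); the actual layout depends on the log4j configuration in use, and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: the log4j pattern prints the level as a standalone upper-case word.
LEVEL_RE = re.compile(r"\b(DEBUG|INFO|WARN|ERROR|FATAL)\b")


def count_levels(lines):
    """Count how many lines carry each log level; lines without a level are skipped."""
    counts = Counter()
    for line in lines:
        match = LEVEL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def select(lines, levels=("ERROR", "FATAL")):
    """Return the lines whose first recognised level is one of the given levels."""
    picked = []
    for line in lines:
        match = LEVEL_RE.search(line)
        if match and match.group(1) in levels:
            picked.append(line.rstrip("\n"))
    return picked


if __name__ == "__main__":
    # Made-up sample lines; a real run would read a file under /var/log/cloud/.
    sample = [
        "2012-09-01 10:00:00,001 DEBUG [cloud.agent] (thread-1) ping",
        "2012-09-01 10:00:01,002 ERROR [cloud.vm] (thread-2) Failed to start VM",
        "2012-09-01 10:00:02,003 WARN [cloud.storage] (thread-3) slow pool",
    ]
    print(count_levels(sample))
    print(select(sample))
```

To scan a real file, pass an open file object in place of the sample list, since both functions only iterate over lines.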
  • 36. Performance Tuning: BIOS settings for the KVM host (for Dell PowerEdge servers)
      A. Set the Power Management Mode to Maximum Performance.
      B. Set the CPU Power and Performance Management Mode to Maximum Performance.
      C. Processor Settings: set Turbo Mode to enabled.
      D. Processor Settings: set C States to disabled.
  • 37. Performance Tuning (contd): CS tuning
      • NFS server tuning
          o Use NFSv4
          o Mount options: noatime,nodiratime,noacl,data=writeback,commit=15
          o IDE/SATA parameters
          o NIC and TCP/IP
          o Use GlusterFS
      • Management server tuning
          o Increase the number of worker processes
          o Turn off stats collectors
          o Tune the allocation algorithm
          o Tune the direct agent load size
          o MySQL DB tuning
      • JVM tuning
          o Heap size tuning
          o Use the CMS GC algorithm
  • 38. Performance Tuning (contd): KVM tuning
      • CPU
          o Disable KSM on the KVM host
          o Disable tickless mode in the KVM guest
          o Pin CPUs on the KVM host
      • Memory
          o THP on the KVM host:
            echo 'yes' > /sys/kernel/mm/redhat_transparent_hugepage/khugepaged/defrag
            echo 'always' > /sys/kernel/mm/redhat_transparent_hugepage/enabled
            echo 'never' > /sys/kernel/mm/redhat_transparent_hugepage/defrag
      • Network performance issue in CentOS 6.2
          o Workaround: blacklist vhost-net. Edit /etc/modprobe.d/blacklist-kvm.conf and include vhost-net.
      • Linux kernel parameter tuning
          o TCP buffer tuning
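After writing the THP settings, it helps to verify that they actually took effect on every host. The sketch below assumes the sysfs files mark the active value with square brackets (for example `[always] never`), which is how these files usually read; the directory and file names are the ones from the echo commands on the slide, while the helper names are hypothetical.

```python
from pathlib import Path

# Values the slide writes, keyed by path relative to the THP sysfs directory.
EXPECTED = {
    "khugepaged/defrag": "yes",
    "enabled": "always",
    "defrag": "never",
}


def active_setting(sysfs_text):
    """Return the bracketed (active) value from a sysfs line, or None if there is none."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def mismatches(expected, actual):
    """Return {name: (wanted, found)} for every setting that differs."""
    return {
        name: (wanted, actual.get(name))
        for name, wanted in expected.items()
        if actual.get(name) != wanted
    }


def read_thp(base="/sys/kernel/mm/redhat_transparent_hugepage"):
    """Read the active value of each expected file; missing files map to None."""
    result = {}
    for name in EXPECTED:
        path = Path(base) / name
        result[name] = active_setting(path.read_text()) if path.exists() else None
    return result


if __name__ == "__main__":
    problems = mismatches(EXPECTED, read_thp())
    print("THP settings OK" if not problems else f"THP mismatches: {problems}")
```

Run on a host without that sysfs directory, the script reports every setting as a mismatch with `None` as the found value.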
  • 39. Q&A