UCS Security and High Availability Configuration Guide

UCS Security
www.silantia.com
 System Policies
 High Availability
 System Events
 SNMP
 Firmware
 TAC Information

System Policies

Overview of High Availability

High Availability
 Two fabric interconnects and two IOMs per chassis give each blade two data paths.
 Clustering the fabric interconnects requires the same UCS Manager version and the same FI model on both.
 Clustering runs over the L1 and L2 ports on each fabric interconnect. These ports are not configurable.
 The L1 and L2 ports are 1000BaseTX and use straight-through Cat6 cable.
 The ports are pre-configured to run LACP and CDP.
 The links form an 802.3ad bond managed by the underlying OS.

High Availability
 Cisco UCS Manager controller:
   Distributed application that runs on both the primary and subordinate UCS Manager instances
   Each instance is represented by a node ID
   Runs as a separate process on Cisco NX-OS
   Defines the running mode of the UCS Manager processes
 Cisco NX-OS:
   Starts all Cisco UCS Manager processes
   Monitors and restarts UCS Manager processes

High Availability
 Local storage:
   NVRAM and flash store static data
   Read and written by the local Cisco UCS Manager instance
   Replicated when both nodes are up
 Chassis EEPROM:
   Serial EEPROM stores state data
   Up to three chassis have their EEPROM written with state information in two partitions
   Read and written by both chassis management controllers
   Used to help Cisco UCS Manager determine the state of the cluster

Viewing and Changing Management HA
  connect local-mgmt
  dc101-A# sh cluster extended-state
  Cluster Id: 0x898942147f8311e2-0x8af9547feeed8104
  Start time: Sun May 26 18:36:30 2013
  Last election time: Sun May 26 18:36:33 2013
  A: UP, PRIMARY
  B: UP, SUBORDINATE
  A: memb state UP, lead state PRIMARY, mgmt services state: UP
  B: memb state UP, lead state SUBORDINATE, mgmt services state: UP
  heartbeat state PRIMARY_OK
  INTERNAL NETWORK INTERFACES:
  eth1, UP
  eth2, UP
  HA READY
  Detailed state of the device selected for HA storage:
  Chassis 1, serial: FOX1450H4JK, state: active
  dc101-A#

 In this output, eth1 and eth2 under INTERNAL NETWORK INTERFACES are the L1 and L2 ports, and the device selected for HA storage is the chassis serial EEPROM.
 To change the management lead, use cluster lead or cluster force.
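
The HA checks visible in this output (node roles, internal links, and the HA READY line) can be scripted once the text has been captured. The following is a minimal Python sketch, assuming the output has already been saved as a string and that its line layout matches the sample on this slide. The function names parse_cluster_state and is_healthy are hypothetical and are not part of any Cisco tool.

```python
import re

# Abbreviated copy of the slide's output; the layout is assumed to be stable.
SAMPLE = """\
Cluster Id: 0x898942147f8311e2-0x8af9547feeed8104
A: UP, PRIMARY
B: UP, SUBORDINATE
A: memb state UP, lead state PRIMARY, mgmt services state: UP
heartbeat state PRIMARY_OK
INTERNAL NETWORK INTERFACES:
eth1, UP
eth2, UP
HA READY
"""


def parse_cluster_state(text):
    """Extract node roles, internal link states, and HA readiness."""
    state = {"nodes": {}, "links": {}, "ha_ready": False}
    for raw in text.splitlines():
        line = raw.strip()
        node = re.fullmatch(r"([AB]): (\w+), (\w+)", line)
        link = re.fullmatch(r"(eth\d+), (\w+)", line)
        if node:
            state["nodes"][node.group(1)] = (node.group(2), node.group(3))
        elif link:
            state["links"][link.group(1)] = link.group(2)
        elif line == "HA READY":
            state["ha_ready"] = True
    return state


def is_healthy(state):
    """True when both nodes are UP, one is PRIMARY, one is SUBORDINATE,
    every internal link is UP, and the output contained HA READY."""
    roles = sorted(role for _, role in state["nodes"].values())
    nodes_up = all(status == "UP" for status, _ in state["nodes"].values())
    links_up = bool(state["links"]) and all(
        status == "UP" for status in state["links"].values()
    )
    return (
        state["ha_ready"]
        and nodes_up
        and links_up
        and roles == ["PRIMARY", "SUBORDINATE"]
    )


if __name__ == "__main__":
    print(is_healthy(parse_cluster_state(SAMPLE)))
```

The detailed "memb state" lines are deliberately ignored by the pattern, so only the short per-node summary lines feed the health check.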

High Availability (split brain issues)
 Partition in space:
   Occurs when the private network fails (no path from L1 to L1 and L2 to L2)
   There is a risk of two active management nodes (active-active)
   Both nodes are demoted to subordinate and a quorum race begins
   The node that claims the most resources wins
 Partition in time:
   Occurs when a node boots alone in the cluster
   The node compares its database version against the serial EEPROM and discovers that its version number is lower than the current database version
   There is a risk of applying an old configuration to UCS components
   This node will not become the active management node
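
The two rules on this slide can be written down as small decision functions. This is only an illustrative Python sketch, not how UCS Manager implements them: it assumes database versions are comparable integers and that "resources claimed" can be expressed as a count per node. Both function names are hypothetical.

```python
def may_become_primary(local_db_version, eeprom_db_version):
    """Partition in time: a node that boots alone must not lead the
    cluster if its database is older than the version recorded in the
    chassis serial EEPROM."""
    return local_db_version >= eeprom_db_version


def quorum_winner(claims):
    """Partition in space: the node that claims the most resources wins.

    claims maps a node ID to the number of resources it claimed.
    Returns None for an empty input or a tie, because the slide does
    not say how ties are broken.
    """
    ranked = sorted(claims.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


if __name__ == "__main__":
    # A node with a stale database stays out of the primary role.
    print(may_become_primary(local_db_version=41, eeprom_db_version=42))
    # Node B claimed more resources, so it wins the race.
    print(quorum_winner({"A": 1, "B": 2}))
```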

System Events

Fault severity
Severity    Description
Critical    A service-affecting condition that requires immediate corrective action. This severity might indicate that the managed object is out of service and its capability must be restored.
Major       A service-affecting condition that requires urgent corrective action. This severity might indicate a severe degradation in the capability of the managed object and that its full capability must be restored.
Minor       A non-service-affecting fault condition that requires corrective action to prevent a more serious fault from occurring.
Warning     A potential service-affecting fault that currently has no significant effects in the system.
Condition   An informational message about a condition, possibly independently insignificant.
Info        A basic notification or informational message, possibly independently insignificant.
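
When faults are exported and processed in a script, the ordering in this table is what matters for triage. A minimal Python sketch follows, assuming faults arrive as (severity, description) pairs. The numeric values are arbitrary and only encode the order of the table; they are not Cisco fault codes, and the example fault descriptions are invented.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Values are assumptions; only their relative order comes from the table.
    INFO = 1
    CONDITION = 2
    WARNING = 3
    MINOR = 4
    MAJOR = 5
    CRITICAL = 6


# Per the table, only these two levels describe service-affecting conditions.
SERVICE_AFFECTING = {Severity.CRITICAL, Severity.MAJOR}


def triage(faults):
    """Return (severity, description) pairs sorted most severe first."""
    return sorted(faults, key=lambda fault: fault[0], reverse=True)


if __name__ == "__main__":
    faults = [
        (Severity.MINOR, "example minor fault"),
        (Severity.CRITICAL, "example critical fault"),
        (Severity.INFO, "example informational message"),
    ]
    for severity, description in triage(faults):
        flag = "service-affecting" if severity in SERVICE_AFFECTING else "other"
        print(severity.name, flag, description)
```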

Fault states
State      Description
Active     A fault was raised and is currently active.
Cleared    A fault was raised but did not reoccur during the flap interval. The condition that caused the fault has been resolved, and the fault has been cleared.
Flapping   A fault was raised, cleared, and then raised again within a short time interval, known as the flap interval.
Soaking    A fault was raised and then cleared within the flap interval. Because this may be a flapping condition, the fault severity remains at its original active value, but this state indicates that the condition that raised the fault has cleared.
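
One reading of this table is a small state machine: a fault is Active when raised, moves to Soaking when the condition clears, becomes Flapping if it is raised again within the flap interval, and becomes Cleared if the interval passes without reoccurrence. The Python sketch below encodes only those transitions. The event names are hypothetical, and anything the table does not describe (for example, what follows Flapping) is left unmodeled rather than guessed.

```python
# (current state, event) -> next state; event names are illustrative only.
TRANSITIONS = {
    ("active", "condition_cleared"): "soaking",
    ("soaking", "condition_raised"): "flapping",
    ("soaking", "flap_interval_elapsed"): "cleared",
}


def next_state(state, event):
    """Return the state that follows, or raise if the table is silent."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"transition not modeled: {state} + {event}") from None


def run(events, state="active"):
    """Apply a sequence of events to a fault that starts out active."""
    for event in events:
        state = next_state(state, event)
    return state


if __name__ == "__main__":
    print(run(["condition_cleared", "flap_interval_elapsed"]))
    print(run(["condition_cleared", "condition_raised"]))
```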

System Events settings
Admin tab -> Faults, Events and Audit Log -> Settings

SNMP
 All SNMP versions are supported: v1, v2c, and v3.
 A username and password are configurable on the device for SNMP version 3.
 All SNMP transactions use the cluster IP address as the source IP address.
 Admin tab -> Communication Management -> Communication Services -> SNMP

Firmware
 UCSM, IOM, and fabric interconnect upgrade
 The following steps are done under Equipment -> Firmware Management -> Update/Activate Firmware:
   Activate the new Cisco UCS Manager image
   Activate the new I/O module image
   Activate the new image on the subordinate fabric interconnect
   Manually fail over the primary role to the fabric interconnect that has already been upgraded
     This step is done from the command line with the following command:
     UCS-A (local-mgmt) # cluster {force primary | lead {a | b}}
   Verify that the data path has been restored
   Activate the new image on the former primary fabric interconnect
 Note: During the fabric interconnect upgrade each blade loses one path, but the other path remains available, so UCS fabric failover and/or VMware NIC teaming should keep traffic flowing.
 Activating the IOM image does not reboot the IOM. The IOM reboots and upgrades when its connected fabric interconnect reboots and is upgraded.
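
The order of the upgrade steps matters, because the primary fabric interconnect must not be upgraded until the other one is upgraded and has taken over. A runbook script can enforce that order with a simple check. The Python sketch below is illustrative only: the step names are hypothetical labels for the steps on this slide, and the code does not talk to UCS Manager.

```python
# Hypothetical labels for the steps listed on the slide, in order.
UPGRADE_STEPS = [
    "activate_ucsm",
    "activate_iom",
    "activate_subordinate_fi",
    "failover_to_upgraded_fi",
    "verify_data_path",
    "activate_former_primary_fi",
]


def next_step(completed):
    """Return the next step to run, or None when the upgrade is finished.

    Raises ValueError if the completed steps are not a prefix of the
    expected sequence, which means a step was skipped or reordered.
    """
    completed = list(completed)
    if completed != UPGRADE_STEPS[: len(completed)]:
        raise ValueError("steps were completed out of order")
    if len(completed) == len(UPGRADE_STEPS):
        return None
    return UPGRADE_STEPS[len(completed)]


if __name__ == "__main__":
    done = ["activate_ucsm", "activate_iom", "activate_subordinate_fi"]
    print(next_step(done))
```

Keeping the sequence in one list means the check and the runbook cannot drift apart.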

Firmware
 Host firmware packages
   Group adapter, BIOS, board controller, and storage controller firmware into a single entity that can then be used in a service profile.
 Management firmware packages
   Set of CIMC images for different kinds of blades.
 When either package is applied to a service profile that is already associated, it triggers a maintenance task. The firmware updates are applied according to how that task is scheduled.

TAC Information
 Go to the Admin tab, click All, and then click “Collect TAC specific information”.

TAC Information
  cisco-ucspe# connect local-mgmt
  cisco-ucspe(local-mgmt)# show tech-support
    chassis    Chassis
    fex        FEX (fabric-extender) Module
    server     Rack Server
    ucsm       UCSM
    ucsm-mgmt  UCSM Management (excludes fabric interconnect)

 Example arguments for the chassis option: chassis 1 cimc 2, chassis 1 iom 1
