Detailed outline of the Information Technology Infrastructure Library (ITIL), a set of practices for IT service management (ITSM) that focuses on aligning IT services with business needs, for software companies
5. IT Infrastructure Library
• A set of books describing a code of best practice for IT Service provision
• UK Government
• First edition – Late 80’s
• Revised in 2000
• Non-proprietary
• Platform independent
9. ITIL Publications
Planning to Implement Service Management
The Business Perspective
Application Management
ICT Infrastructure Management
Security Management
Service Support
Service Delivery
[Figure: the ITIL publication framework, spanning from The Business to The Technology: Planning to Implement Service Management; Service Management (Service Support, Service Delivery); The Business Perspective; Applications Management; ICT Infrastructure Management; Security Management]
12. Mission
To identify, control and audit the information
required to manage IT services by defining and
maintaining a database of controlled items, their
status, lifecycles and relationships and any
information needed to manage the quality of IT
services cost effectively
14. Objectives
• Identify & record management information
• Account for all IT assets & configurations
• Control the information in the database
• Ensure that information reflects reality
• Provide a basis for management
• Provide status of components
18. How low do you go???
• CI Levels
– Lowest level of independent change
– Who are you and what are you doing?
– Information value vs collection effort
20. Key Terms
• Relationship
– Primary
– Secondary
• Baseline
– Snapshot of a CI at a time or stage
• Variant
– A baseline with minor differences
• Model, Version and Copy numbers
– Type
– Unique
– Version / Copy
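The terms above can be made concrete with a small data model. This is a minimal sketch, not an ITIL-defined schema: the class and field names (`ConfigurationItem`, `Relationship`, `take_baseline`), the identifiers and the relationship label are all illustrative assumptions.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Relationship:
    """A link from one CI to another (labels are hypothetical)."""
    kind: str        # e.g. "is connected to"
    target_id: str   # unique identifier of the related CI


@dataclass
class ConfigurationItem:
    """One controlled item as it might be recorded in a CMDB."""
    ci_id: str           # unique identifier
    model: str           # type of item
    version: str
    copy_number: int = 1
    status: str = "Ordered"
    relationships: list = field(default_factory=list)


def take_baseline(ci: ConfigurationItem) -> ConfigurationItem:
    """Return a snapshot of the CI as it is right now (a baseline)."""
    return deepcopy(ci)


server = ConfigurationItem("SRV-0001", model="rack server", version="1.0")
server.relationships.append(Relationship("is connected to", "NET-0007"))

baseline = take_baseline(server)
server.version = "1.1"  # a later change does not alter the snapshot
print(baseline.version, server.version)
```

Because the baseline is a deep copy, later changes to the live record leave the snapshot intact, which is what makes it usable as a reference point.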
21. Life Cycles
• Stages in the life of a CI
• Allow CIs to be moved and tracked
Ordered
Delivered
Set up
Installed
Withdrawn
Maintenance
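One way to let CIs "be moved and tracked" is to record which stage changes are allowed and keep a history of each move. The stage names below come from the slide; the transition table itself is an assumption made for illustration, since the slide does not say which moves are permitted.

```python
# Allowed moves between lifecycle stages. The transitions are an
# illustrative assumption, not something the slide specifies.
ALLOWED_TRANSITIONS = {
    "Ordered": {"Delivered"},
    "Delivered": {"Set up"},
    "Set up": {"Installed"},
    "Installed": {"Maintenance", "Withdrawn"},
    "Maintenance": {"Installed", "Withdrawn"},
    "Withdrawn": set(),
}


def move_ci(history: list, new_stage: str) -> list:
    """Return the stage history extended by new_stage, if the move is allowed."""
    current = history[-1]
    if new_stage not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current!r} to {new_stage!r}")
    return history + [new_stage]


history = ["Ordered"]
for stage in ("Delivered", "Set up", "Installed", "Maintenance"):
    history = move_ci(history, stage)
print(history)
```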
23. Stages in Configuration Management
• Identification
• Control
• Status Accounting
• Verification / Audit
24. Identification
• Logical
– What items need to be recorded?
– What do we need to know about them?
• Physical
– Marking items that are under Configuration
Management control
25. Basic Principles
• CIs must be uniquely identified
• Prominent & clearly visible
• Meaningful naming
• Copy numbers must be catered for
• Cater for growth
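A possible naming scheme that follows these principles is sketched below. The label format (`TYPE-serial-Ccopy`) and the names `make_ci_label` and `CiRegistry` are hypothetical; any scheme that is unique, meaningful and leaves room for growth would serve equally well.

```python
def make_ci_label(ci_type: str, serial: int, copy_number: int = 1) -> str:
    """Build a label such as 'PC-000042-C01' (the format is an assumption)."""
    # Six digits for the serial and two for the copy leave room for growth.
    return f"{ci_type.upper()}-{serial:06d}-C{copy_number:02d}"


class CiRegistry:
    """Refuses to register the same label twice, so every CI stays unique."""

    def __init__(self) -> None:
        self._labels = set()

    def register(self, label: str) -> str:
        if label in self._labels:
            raise ValueError(f"duplicate CI label: {label}")
        self._labels.add(label)
        return label


registry = CiRegistry()
first_copy = registry.register(make_ci_label("pc", 42, copy_number=1))
second_copy = registry.register(make_ci_label("pc", 42, copy_number=2))
print(first_copy, second_copy)
```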
26. Control
• Information in the CMDB
– Access
– Changes
– Adding new items
• To achieve control
– Agree and freeze CI specification
– Only allow changes through change management
27. Status Accounting
• Uses lifecycles and attributes
• Records and reports on
– Current data
– Historical data
28. Verification & Audit
• Does the CMDB reflect reality?
• Accuracy is improved by
– Active rather than passive CMDB
– Automatic updating
– Integration with other processes
– Automatic checks
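An automatic check of whether the CMDB reflects reality can be as simple as comparing recorded versions with what a discovery scan finds. The sketch below assumes both sides are available as dictionaries mapping a CI identifier to a version string; the function name `audit_cmdb` and the sample data are made up.

```python
def audit_cmdb(recorded: dict, discovered: dict) -> dict:
    """Compare recorded CI versions with what was actually found."""
    recorded_ids, discovered_ids = set(recorded), set(discovered)
    return {
        # In the CMDB but not found in reality.
        "missing": sorted(recorded_ids - discovered_ids),
        # Found in reality but never registered.
        "unregistered": sorted(discovered_ids - recorded_ids),
        # Present on both sides with different versions.
        "mismatched": sorted(
            ci for ci in recorded_ids & discovered_ids
            if recorded[ci] != discovered[ci]
        ),
    }


recorded = {"SRV-1": "1.0", "SRV-2": "2.0", "PC-9": "1.2"}
discovered = {"SRV-1": "1.0", "SRV-2": "2.1", "PC-10": "1.0"}
report = audit_cmdb(recorded, discovered)
print(report)
```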
37. Benefits
• Accurate information & documentation on CIs
• Control of valuable CIs
• Legal Obligations
• Financial & expenditure planning
• Registration of Software Changes
• Contingency planning
• Improving Release Management
• Improved security
• Trending data
38. Costs
• Staff costs
– Initial audit
– Management
• HW & SW identification & Level of control
• Number of users who have access
• Need for tailoring
• Diversity & quality of information
• Level of integration
39. Possible Problems
• Incorrect CI level
• Emergency changes
• Over-ambitious schedules
• Circumvention of procedures
• Manual systems
• Over expectation
• Isolated implementation
• Difficult without Change Management
• Difficult to cost justify
• No operational use of the system
41. Mission
To manage all changes that could impact on IT’s
ability to deliver services through a formal,
centralised process of approval, scheduling and
control to ensure that the IT Infrastructure stays
aligned to business requirements with minimum risk
42. Objectives
• Manage the process of:
– Requesting changes
– Assessing changes
– Authorising changes
– Implementing changes
• Prevent unauthorised changes
• Minimise disruption
• Ensure proper research and relevant input
• Coordinate build, test and implementation
43. Scope
Hardware
System Software
Communications Equipment and Software
‘Live’ Application Software
All documentation, plans and procedures relevant to the
running, support and maintenance of live systems
Environmental Equipment
45. Key Terms and Roles
• Request for Change (RFC)
– Contains all necessary information to make the change
• Change Advisory Board (CAB)
– Assesses resource requirements and impact
– Advises the Change Manager
• CAB Emergency Committee (CAB/EC)
– Urgent changes
– 1-3 senior staff
46. Key Terms and Roles (cont.)
• Forward Schedule of Change (FSC)
– Details of approved changes & dates
• Projected Service Availability (PSA)
– Best time for change to be implemented
• Change Model
– Pre-defined path
• Standard Change
– Pre-authorised change
48. Normal Change Procedure
[Flowchart; boxes and decisions listed in order]
• Initiate Change
• Filter Requests (or Reject)
• Initial Priority
• Urgent?
– Yes: To urgent procedure
• Change Model?
– Yes: To Change Model procedure
• Decide Category
– Minor: Authorises and schedules change. Report action to CAB
– Significant: Circulates RFCs to CAB members
– Major: Refers RFC upwards. IT Director decides then passes to CAB for actioning
• Assess impact and resources. Confirm priority and schedule
• Authorised? (Yes / No)
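The decision points of the Normal Change Procedure can be sketched as a small routing function. The pairing of each category with its route (minor, significant, major) is an assumption about how the flowchart boxes connect, and `route_rfc` is a hypothetical name rather than part of any tool.

```python
def route_rfc(urgent: bool, has_change_model: bool, category: str) -> str:
    """Return the next step for an RFC that has passed initial filtering."""
    if urgent:
        return "To urgent procedure"
    if has_change_model:
        return "To Change Model procedure"
    # Assumed pairing of category and route.
    routes = {
        "minor": "Authorise and schedule change; report action to CAB",
        "significant": "Circulate RFC to CAB members",
        "major": "Refer RFC upwards; IT Director decides, then passes to CAB",
    }
    key = category.lower()
    if key not in routes:
        raise ValueError(f"unknown category: {category!r}")
    return routes[key]


print(route_rfc(urgent=False, has_change_model=False, category="Significant"))
```

Checking urgency and the change model before the category mirrors the order of the decisions on the slide.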
49. Normal Change Implementation Procedure
[Flowchart, continued from the Normal Change procedure; boxes and decisions listed in order]
• Build change, Testing & back out Plans
• Independent Testing
• Co-ordinates implementation
• Working?
– No: Back out / Refer back to CAB
– Yes: Monitor/Review Change
• Successful?
– Yes: Close
– No: To Start
• Other labels on the chart: Failure, Update Documentation
50. Urgent Change Procedure
[Flowchart; boxes and decisions listed in order]
• Urgent CAB or CAB/EC meeting
• Assess impact, resource requirements and urgency
• Urgent?
– If not: To normal procedure
• Urgently prepares the change
• Time for test?
– Yes: Urgent Testing
• Co-ordinate implementation
• Other labels on the chart: Failure
53. Benefits to Business
• Greater IT & business alignment
• Higher availability
• Increased productivity
• More communication – greater trust
• IT can handle more changes
• Balance between need for change & potential impact
54. Financial Benefits
• More accurate forecasting
• Better quality decisions
• Reduction in amount of rework
55. IT Benefits
• Easier to meet SLAs
• Fewer change failures
• Back out plans – easier restore
• Valuable input for problem & availability
• Increased productivity of IT Staff
59. Mission
To take a holistic view of a change to an IT Service
and ensure that all aspects of a release, both
technical and non-technical, are considered
together
60. Why Release Management?
• Large or critical hardware roll-outs
• Major software roll-outs
• Bundling or batching related sets of changes
[Figure: In-house applications; “Other” software; Utility Software; System Software; Hardware; Specifications; Assembly Instructions; User Manuals]
62. Key Concepts
• Release
– Collection of authorised changes
– Major / minor / emergency
• Definitive Hardware Store (DHS)
– Storage of Hardware spares
• CMDB
– Definitions of planned releases
– Records of CIs impacted by release
– Information about the target environment
63. Key Concepts
• Definitive Software Library (DSL)
– Physical secure storage
– Source code & Original media
• Build Management
– Controlled environment
– Compiled on dedicated “build hardware”
• Release Policy
– Roles, responsibility & content
– Form part of initial planning
• Release Unit
– Components released together
64. Release Units
• Systems, suites, programs and modules
• Factors affecting the level of release
– Number and extent of changes
– Number of changes that can be managed
– Available resources and time
– Ease of implementation
– Complexity of the release
65. Release Units
[Figure: release unit hierarchy]
• IT Infrastructure: System 1, System 2, System 3
• System 2: Suite 2.1, Suite 2.2, Suite 2.3
• Suite 2.2: Program 2.2.1, Program 2.2.2, Program 2.2.3
• Program 2.2.2: Module 2.2.2.1, Module 2.2.2.2, Module 2.2.2.3
66. Development Releases - and
• Managed by development
• Must not affect live services
• Should not require production resources
• Customer agreement obtained
• Usage covered in SLAs
• Must not replace live systems
• Must be licensed
67. Normal Release - Full
• All components built, tested, distributed &
implemented together
• Better integrated testing
• Easier to detect & rectify problems
• Complex & will require more resources
68. Normal Release – Delta
• Partial release
• Contains only new or changed items
• Not as stable as full releases
• Authorisation of a delta release depends on:
– Size of a full release compared to the delta
– Urgency of required facilities
– Number of changes already made
– Potential business impact
– Available resources
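Since a delta release "contains only new or changed items", its content can be derived by comparing the candidate build with what is live. The sketch below assumes both are dictionaries mapping a component name to a version string; the component names and versions are invented.

```python
def delta_release(live: dict, candidate: dict) -> dict:
    """Return the components that are new or whose version has changed."""
    return {
        name: version
        for name, version in candidate.items()
        if live.get(name) != version  # absent from live, or a different version
    }


live = {"app": "1.0", "lib": "2.3", "db": "5.1"}
candidate = {"app": "1.1", "lib": "2.3", "db": "5.1", "report": "1.0"}
delta = delta_release(live, candidate)
print(delta)
```

A full release would ship all of `candidate`; comparing `len(delta)` with `len(candidate)` gives a rough feel for the first authorisation factor listed above.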
69. Normal Release – Package
• Combination of release units
• Reduces number and frequency of releases
• Better integration and testing
• Less old or incompatible software
• Could result in delays to fixes or enhancements
• Greater potential for disruption
71. Urgent Releases
• Disruptive and error prone
• Often used to bypass Change Management
• Controls are essential
– Use software from the DSL
– Software must be replaced through the DSL
– Must follow Change Management
– CMDB must be updated
– Version control
– Testing and documentation
– Give notice
72. Back-Out Plan
• Documents actions that will restore service
• Still part of change
• Two approaches
– A full reversal of release
– Contingency plans to restore as much as possible
• Should be verified and tested
76. Release Planning
• Agreeing release content
• Planning phases of releases
• Produce schedule
• Assess hardware at target site
• Plan resource requirements
• Obtain quotes if upgrades are required
• Produce back out plans
• Develop quality plan
• Plan acceptance of support groups
77. Designing, Building & Configuring
• Components assembled in controlled process
• All components of release should be under
Configuration control
78. Testing & Release Acceptance
• Before going to live
• Types of testing
– Functional testing
– Operational testing
– Performance testing
– Integration testing
– Testing & back out plans
• Final acceptance & sign off – part of Change
• Rejection treated as failed change
80. Communication, Preparation &
Training
• Support staff & customers
• Training
• Parallel working
• Involvement in acceptance process
• Rollout planning meetings
81. Distribution & Installation
• Distribution
– Equipment reaches destination in time & intact
– Secure Storage Areas
– Checked against relevant documentation
– Final check before implementation
• Installation
– Functional checks of equipment
– Automate deployment
– Installation routines
– Include check of target
– User checklists?
82. Normal Flow of Software
[Flowchart, with the CMDB shown alongside the flow; boxes and decisions listed in order]
• Software ordered / Software developed and supplied
• Acceptance checks
• OK?
– No: Rectification Action
• Software placed in DSL
• Final approval
• Package built in test environment
• Operational acceptance testing
• OK? (Yes / No)
• Build in live environment
• Distribute to live environment
• Implemented on live environment
84. Business Benefits
• Minimum disruption
• Better quality of service
• Fewer & less frequent releases
• Effective scheduling of users for testing
• Overall reduction in business risk
• Business knows what to expect & can plan
85. Financial Benefits
• Assets more controlled
• Less time & resources spent on rework
• More responsive to revenue producing
opportunities
• Prevention of duplication of activities
86. IT Benefits
• Consistent quality of releases
• Centralised control
• Improved quality and control of changes
• Effective planning of staff activities
• Number of regressions is reduced
• Easier detection of unauthorised and incorrect
versions
• Less blame shifting
87. Costs
• Storage costs
• Build, test and archive environments
• Secure equipment stores
• Software distribution tools
• Network bandwidth
• Telecommunications
• Staff and training
88. Problems
• Circumvention of procedures
• Emergency fixes
• Distribution of builds directly from development
• Uncoordinated implementation of Software and
Hardware
• Resources not available for testing
• Test results are invalid
• Process is seen to be unclear or bureaucratic
91. The Service Desk
Structure not a process
• Drive & improve service to the business
• Single point of contact
– Advice
– Guidance
– Rapid restoration to service
92. Role of the Service Desk
• Supports the incident & problem management
function
• Provide a central point of contact
– Preventing the same incident being reported to
different people over & over
– Preventing the loss of incidents
– Preventing technical people being disrupted
– Preventing unnecessary work if the error is already known
93. Objectives
• Single point of contact for reporting of incidents
• Accurately record information about incidents
• Co-ordinate activities to restore service to normal
• Support the incident & problem management functions
• Provide management information
• Provide support & advice to business
95. Service Desk Functions
• Log Incident
• Pre-scan phase
– Not Known Error
– Proper procedures have been followed
– Required supporting evidence is complete & present
• Incident Management
• Service Desk remains responsible
• Responsible for escalation
• Regularly feeds back to user
96. Service Desk & Change Management
• Log Changes & cross reference to problems
• Issue change schedules
• Monitor & track changes & assist with escalation
• Inform users of change once complete & update
change schedules
97. Common Features of Service Desks
• A single point of contact for all users
• A central log of all incidents
• Each incident uniquely numbered and date/time stamped
• Diagnostic scripts and other aids
• Configuration Management Support Tools
• Known Error Lists
• An impact coding system
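Two of these features, a central log and unique numbering with a date/time stamp, can be sketched in a few lines. The `INC` prefix, the field names and the class name `IncidentLog` are assumptions made for illustration.

```python
from datetime import datetime, timezone
from itertools import count


class IncidentLog:
    """A central log that numbers and time-stamps every incident."""

    def __init__(self) -> None:
        self._numbers = count(1)   # never reuses a number
        self.incidents = []

    def log(self, description: str, impact: str) -> dict:
        record = {
            "number": f"INC{next(self._numbers):07d}",
            "logged_at": datetime.now(timezone.utc),
            "description": description,
            "impact": impact,   # value from an impact coding system
        }
        self.incidents.append(record)
        return record


desk_log = IncidentLog()
first = desk_log.log("Cannot print from finance PC", impact="low")
second = desk_log.log("Mail server unreachable", impact="high")
print(first["number"], second["number"])
```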
98. Common Features of Service Desks
• Automatic escalation procedures based on impact, priority
and elapsed time
• Telephone and electronic mail communication with all
support staff
• Interface to Service Level Agreements
• Regular progress reporting
• Classification of incidents at call closure
• Regular management summaries of calls received and
resolved
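Automatic escalation on elapsed time can be sketched as a threshold lookup. For brevity the example keys the threshold on priority alone, and the thresholds themselves are invented; a real desk would take them from its Service Level Agreements.

```python
from datetime import timedelta

# Hypothetical escalation thresholds per priority (1 = most pressing).
ESCALATE_AFTER = {
    1: timedelta(minutes=30),
    2: timedelta(hours=2),
    3: timedelta(hours=8),
}


def needs_escalation(priority: int, elapsed: timedelta) -> bool:
    """True once an open incident has waited at least its threshold."""
    return elapsed >= ESCALATE_AFTER[priority]


open_incidents = [
    {"number": "INC0000001", "priority": 1, "elapsed": timedelta(minutes=45)},
    {"number": "INC0000002", "priority": 3, "elapsed": timedelta(hours=1)},
]
to_escalate = [
    incident["number"]
    for incident in open_incidents
    if needs_escalation(incident["priority"], incident["elapsed"])
]
print(to_escalate)
```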
100. Local Service Desk
• Local desk meeting local needs
• Support staff also local
• Becomes impractical with multiple locations
• Several local desks – operational standards
• Common processes across all locations
101. Local Service Desk
[Figure: local users contact the Service Desk (first line support), which draws on Third Party Support, Network & Operations Support, Application Support and Desktop Support]
102. Centralised Service Desk
[Figure: Customer Sites 1, 2 and 3 contact a single Service Desk, backed by second line support: Third Party Support, Network & Operations Support, Application Support and Desktop Support]
103. Virtual Service Desk
[Figure: the Paris, Sydney, Cape Town, London, Toronto and Durban Service Desks and a Third Party Supplier Service Desk form one Virtual Service Desk around shared Service Management Database(s); local users and remote users (User Site 1 to User Site ‘n’) are linked by LAN, modem, telephone and fax]
104. Virtual Desks
• Physical location immaterial
• Used for global organisation
• Benefits include
– Reduced operational costs
– Consolidated management overview
– Improved usage of available skills
– Knowledge sharing
• Onsite assistance still required
105. Outsourcing
• Have outsourcers use your Service Desk tool
• Keep ownership of management information
• Ensure suitably skilled staff
• Request details of staff
• Monitor value for money
• Check supplier dependencies
• Ensure deliverables are clearly understood
108. Technically Unskilled Staff
• Centralised Service Desks
• Emphasis on interpersonal skills
• Large call volumes, little support
• Administrates and coordinates calls
• Relies on diagnostic scripts and other tools
• Technical staff are not distracted or demotivated
• No in-depth support
• Potential job satisfaction is high
109. Technically Skilled Staff
• Lower call volumes, greater support
• Longer call times
• May become too involved in technical aspects
• Job satisfaction issues
• Customer satisfaction issues
• Peak time staffing issues
• Familiarity breeds contempt
110. Expert Staff
• Resolve all calls
• Staff are more important than procedures
• Will play the role of technical departments
112. Definition of an Incident
Any event which is not part of the standard operation of a
service and which causes, or may cause, an interruption
to, or reduction in, the quality of that service
Includes
• New services
• Automatically registered events
113. Mission
To minimise the impact of service disruptions
to the business by restoring that service
through effective management of incidents
114. Scope
• Inputs
– Incident details from service desk
– Configuration details
– Matched incidents, problems & known errors
– Resolution details
– RFC
• Outputs
– RFC for resolution
– Resolved & closed incidents
– Communication to Customers
– Management information
115. Objectives
• Restoration of service as quickly as possible
• Ensure timely resolution of all incidents
• Identify trends that may assist in incident
resolution
• Assist problem management in identifying trends
117. Incident Handling
• Service Desk owns Incidents
• Progress reporting
• Incident Lifecycles
– New
– Accepted / Assigned
– Scheduled
– WIP
– On Hold / Waiting
– Resolved
– Closed
118. Levels of Support
• 1st line Support
– Service Desk
• 2nd Line Support
– Incident Management
• 3rd Line Support
– Specialist Group
119. Key Concepts (cont.)
• Ownership & Communication
– Monitor status against open Incidents
– Incidents passed between support groups
– Affected users are kept informed
– Check for similar Incidents
– Incidents that are likely to exceed SLA times
• Escalation
– Functional Escalation
– Hierarchical Escalation
122. Key Concepts (cont.)
• Incidents, Problems and Known Errors
– Incidents are events or occurrences that degrade or
disrupt a Service
– A problem is the underlying cause, not yet diagnosed,
of one or more incidents
– Known Errors are
• Problems that have been diagnosed and have not yet been
rectified
• Problems that have been diagnosed and for which a resolution
or circumvention exists
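The link between these three record types is usually exploited by matching a new incident against known errors, so that an existing circumvention can be offered at once. The sketch below uses a deliberately naive match (same CI plus a symptom phrase in the description); the record fields and sample data are hypothetical.

```python
KNOWN_ERRORS = [
    {
        "id": "KE-001",
        "ci": "MAIL-01",
        "symptom": "mailbox full",
        "workaround": "archive old mail",
    },
]


def match_known_error(incident: dict, known_errors: list):
    """Return the first known error matching the incident, or None."""
    description = incident["description"].lower()
    for known_error in known_errors:
        same_ci = known_error["ci"] == incident["ci"]
        if same_ci and known_error["symptom"] in description:
            return known_error
    return None


incident = {"ci": "MAIL-01", "description": "User reports Mailbox full warning"}
match = match_known_error(incident, KNOWN_ERRORS)
print(match["workaround"] if match else "no known error; investigate")
```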
124. Reasons for classification
• Identifying the service the Incident is
related to
• Associate the Incident with the SLA
• Selecting the most suitable support team
• Indication of the impact and/or severity
• Match Incidents to Known Errors
• Determine a reporting structure
[Flowchart: Incident detecting & recording, then Initial classification & support; a Service Request goes to the Service Request Procedure]
• Incident detecting & recording
– Service Desk
– System Monitoring tools
– Capturing base & initial data
• Initial classification & support
– Diagnostic Scripts
– Known Error Database
– Skill Levels
– Knowledge base and/or expert software
125. [Flowchart: Incident detecting & recording; Initial classification & support; Service Request (to Service Request Procedure); Investigation & diagnosis; Resolution & recovery; Incident Closure; with Ownership, monitoring, tracking and communication shown alongside]
• Support group accepts assignment
• Advise if work around can be provided
• Attempt resolution
• Record all details
126. Ownership, monitoring, tracking and communication
• Monitor status against open Incidents
• Incidents passed between support groups
• Affected users are kept informed
• Check for similar Incidents
• Incidents likely to exceed SLA times
• Escalation
127. [Flowchart: Incident detecting & recording; Initial classification & support; Service Request (to Service Request Procedure); Investigation & diagnosis; Resolution & recovery; Incident Closure; with Ownership, monitoring, tracking and communication shown alongside]
128. [Flowchart: investigation and diagnosis of an Incident. Data sources: CMDB; Incident, Problem & KE databases; diagnostic data, system dumps and journals. Steps: Support staff allocation; Basic fact gathering; Enquiries on historical data; Diagnostic data search; Diagnosis/Circumventions? (Y/N); Allocate further support; Escalation threshold exceeded (Y/N); Liaise with Problem Management to create Problem or Known Error record where appropriate; Incident Closure. Records kept: an Incident progress summary (who, when; results; correlations; dumps, IDs etc.; diagnosis and resolution/circumvention action; what, why, when) and a free format text record]
130. Mission
To minimise the disruption of IT services by
organising IT resources to resolve problems,
preventing them from recurring and recording
information that will improve the way in which
IT deals with problems, resulting in higher levels
of availability and productivity.
131. Scope
• Reactive
– Solving problems in response to incidents
• Proactive
– Solving problems before incidents occur
The main goal of Problem Management is the detection of
the underlying causes of an incident and their subsequent
resolution and prevention
132. Objectives
• Identify, manage and resolve problems
• Prevent recurrence of problems
• Reduce the number and severity of problems
• Minimise impact to business
• Ensure right level of staff
• Record & manage information
• Ensure vendor compliance when resolving problems
134. Problem vs Error Control
• Problem Control
– Transforms Problems into Known Errors
• Error Control
– Resolving Known Errors via the Change Management process
137. Problem Identification
• Initial support could not match the Incident to a
known problem
• Analysis of Incidents
• Analysis of IT infrastructure
• Significant or Major Incidents
138. Problem Classification
• Impact
– Direct effect on the business
• Urgency
– The measure of business criticality based on impact and business
need
• Priority
– The order in which a series of items should be addressed
– P=I x S x U
139. Defining Priority
Priority – sequence in which an Incident or Problem
needs to be resolved
Impact – measure of the business criticality of
an incident
Severity – what is the effect on the infrastructure /
resources?
Urgency – extent to which the resolution of a Problem
or error can bear delay
Priority = Impact x Severity x Urgency (P = I x S x U)
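A short worked example of the priority formula follows. The 1 to 3 rating scale, and the convention that a higher product means more pressing, are assumptions made for illustration; the slides give only the formula.

```python
def priority_score(impact: int, severity: int, urgency: int) -> int:
    """P = I x S x U, with each factor rated 1 (low) to 3 (high) by assumption."""
    for name, value in (("impact", impact), ("severity", severity), ("urgency", urgency)):
        if value not in (1, 2, 3):
            raise ValueError(f"{name} must be 1, 2 or 3, got {value!r}")
    return impact * severity * urgency


# Hypothetical problems rated as (impact, severity, urgency).
problems = {
    "mail outage": (3, 2, 3),
    "slow printer": (1, 1, 2),
}
ranked = sorted(
    problems,
    key=lambda name: priority_score(*problems[name]),
    reverse=True,   # highest product first
)
print(ranked)
```

Because the factors are multiplied, one low rating pulls the overall score down sharply, which is worth keeping in mind when choosing the rating scale.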
140. Investigation & Diagnosis
• Diagnosis of root cause
• Update of problem record
• May reclassify at closure
• Methods of problem Analysis
149. Proactive Problem Management
• Identifying and resolving problems before
incidents occur
• Activities include:
– Trend Analysis
– Targeting support action
– Providing information to business
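Trend analysis can start from something as simple as counting incidents per CI and flagging the ones that recur. The sketch below assumes incident records carry a `ci` field, and the threshold of three is arbitrary.

```python
from collections import Counter


def recurring_cis(incidents: list, threshold: int = 3) -> list:
    """Return CIs with at least `threshold` incidents, most frequent first."""
    counts = Counter(incident["ci"] for incident in incidents)
    return [ci for ci, n in counts.most_common() if n >= threshold]


incidents = [
    {"ci": "MAIL-01"}, {"ci": "PRN-02"}, {"ci": "MAIL-01"},
    {"ci": "MAIL-01"}, {"ci": "SRV-07"}, {"ci": "MAIL-01"},
]
candidates = recurring_cis(incidents)
print(candidates)
```

CIs returned here are candidates for a problem record, so that the underlying cause can be investigated before further incidents occur.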