Problem management is typically defined as an aggregated process that analyses issues within an organisation and identifies the causes of adverse events and situations.
A key element is how a major incident is handled, as this is one of the most crucial processes for an enterprise. A major incident, meaning one with significant negative business consequences, needs to be handled with a well-defined process, yet existing methodologies do not currently define such a process clearly.
This course addresses how an enterprise, with a focus on IT, needs to handle the major incident process, which covers those outages and failures that are on the immediate horizon of any enterprise.
It also covers dealing with problems within an organisation in a generic fashion, including supporting methodologies and processes.
2. ProblemManagementFoundation
What does this course provide?
In a nutshell...
Develop an optimum strategy for dealing with mission-critical data
centre environments and IT assets based on consequence analysis.
Describe all components that are important for high availability in the
IT landscape and how to effectively set up the IT environment to be
geared for operational safety.
Apply relevant processes from the various industry frameworks.
Understand the various tools available to assist in IT crisis management
and to optimise the Major Incident process.
Become effective and efficient at IT crisis management and introduce
cost optimisation into the business through time optimisation.
Design a highly reliable and scalable IT architecture and learn what is
required to make the IT environment reliable and robust.
Understand the landscape of IT Risk and introduce real information
security abilities across all areas and disciplines.
Prioritise your company's competencies (which drive all business
decisions).
Identify and mitigate the risks associated with Major Incidents.
Search for workarounds during the initial investigations that can
reduce the client's pain.
Measure and audit the actions for missing safeguards/controls.
Course outline
1. Overview, introduction to crisis management
What is problem management?
Entities involved in problem management
ITIL’s incident and problem management
What is a Major Incident?
Vital Business Functions (VBF)
What does this course provide?
2. Perceptions, strategies for dealing with a crisis
Making Information Technology (IT) visible
Transparency – the importance of being ugly
3. Significant Havoc in Technology, the statistics and theory of a crisis
"How Complex Systems Fail" by Richard Cook
Incident iceberg
Activities and outcomes
4. Engineering to avoid a crisis, building and operating systems to deal with a crisis
Fail-over, resilience and redundancy
Documentation
Correct implementation
5. Tiger Teams, creating structured operations for dealing with a crisis
History and background from space flight
Teams: echo, whisky, delta, romeo, bravo, alpha
6. The Major incident lifecycle, timelines and attributes of a crisis
Analogy to riding a bicycle
Lifecycle Diagram
The importance of time
Metrics and measurement
Detecting a negative event
Diagnosis
o Checklists
o Crime scene, Genchi Genbutsu, recording, visualization, prevailing conditions
o Changes
Repair, Restore, Recover, Resolution
Workaround
o Firefighting
7. Mission Control, establishing a Crisis Management Operations Centre (CMOC)
Best practices for the CMOC
The three level CMOC
8. Communications
Escalations
9. Crisis management control points
WAR rooms
Technical Observation Post (TOP)
10. Tools
Hardware, software, process
11. Analysing a major incident
Incident consequence analysis
Lessons learnt, After Action Review (AAR)
Classification
Problem solving tips
12. Simulation
Training
Testing
13. Budget
Cost categories
14. The IT risk landscape, measuring and quantifying associated risks
The risk matrix, areas and disciplines
SWOT
Lights, camera, action
Controls
Decisions
15. Continuous improvement, changing the impact and consequences of future events
Deming wheel
Toyota Production System (TPS)
Prioritisation using the Pareto principle
The CM 101 course
Ronald brings strategic direction and in-depth expertise to infrastructure and technical architecture. He has a successful track record of solving a variety of business challenges related not only to country-wide telecommunications but also to Information Technology. He is passionate about applying suitable technical strategies to resolve crucial business-related problems.
He has been involved in a large number of corporate implementations in South Africa. These include communications infrastructure such as legacy TDM environments, VSAT and carrier Ethernet, especially where high availability is required. Skilled in understanding requirements and strategy, he is able to support business functions in a converged environment, having worked with video conferencing and VoIP systems for close to 20 years.
He has a solid grounding in Information Security, having both trained in and managed operational IT security functions. He has a demonstrated ability to define and document enterprise-wide policies and to conduct IT risk assessments, including the use of costing models. Ronald is also highly skilled in reviewing standardised best practices, assessing and recommending technologies to support a company’s needs, developing products to deliver on these requirements, and measuring them via service level agreements.
With proven leadership abilities and the capacity to handle a staff complement, he is driven, innovative and a critical thinker. He has also demonstrated a superior ability to mentor and manage technical staff and external service providers. As such, he has participated in the delivery of customer solutions to remote sites and new premises.
Ronald focuses on service management opportunities within the telecommunications and financial user base, especially related to IT crisis management and more specifically the major incident process.
Refer to https://lnkd.in/esKjtHm
How you deal with a crisis is more important than it happening!