This document provides an overview of incident management strategies and UCAS' approach. It defines what an incident is, discusses why incidents occur, and outlines the aims of incident management as preventing incidents, rapidly detecting and responding to those that do happen, and understanding their root causes to adapt processes. The document then details UCAS' security strategy, which incorporates incident management into protecting assets, detecting issues, and reducing vulnerabilities and impacts. It also outlines features of an effective incident response process like clear procedures, communication plans, and review processes.
1. Some ideas on managing incidents
Wasn’t expecting that! Now what?
Andy Gibbs
Enterprise Architect - Security
2. Agenda
Introduction to UCAS
The nature of incidents: what, why, how
Aims of incident management
UCAS’ strategy and approach
Shared experiences
3. UCAS is Unique!
We are the national centralised organisation
processing applications to higher education in the UK.
An intermediary in an ever changing multi-£billion market.
5. Our customers
Circa 800,000 applicants each year
Circa 600,000 placed
4 million applications, in over 6,000 registered centres,
to 387 universities & colleges & 1200 schools.
This includes UK & international schools, agents and advisers
from over 100 countries.
6. Our challenge
Protect vital stakeholder information
Deliver our services in a secure, reliable and operationally stable manner
8. What is an incident anyway?
An event or occurrence that has an unexpected and adverse
effect on normal circumstances:
• Business assets (including digital assets)
• Services, outputs or deliverables
• Operational processes
• Resource levels
• Assurance levels (integrity, quality, reliability etc)
It will usually require special treatment to resolve:
• Additional resources (time, money, people, equipment)
• Emergency processes
• Skill-sets
9. Appropriate incident response
Too little, too late:
• data breach
• business loss
• fines
• reputational damage
• harm
Too early, too great:
• unnecessary £
• wasted resources
• distraction from normal business
• panic & alarm
10. Why do incidents occur?
Organisations take risks
Processes can and will go wrong
Things break
People are human – we all make mistakes
Carelessness, naivety, distractions, misunderstanding
Those with malicious, malevolent or criminal intent
Defences are not perfect
INCIDENTS ARE INEVITABLE
PLAN and RESPOND ACCORDINGLY
11. Aims of incident management
Prevent incidents from occurring
Rapidly detect and respond to incidents when they do occur
Contain the situation and minimise business disruption
Quickly identify the issue and who/what has been affected
Inform and update affected parties of situation and action plan
Take prompt and co-ordinated remedial action to re-instate
Understand root-cause and adapt accordingly
Minimise the cost and disruption of handling incidents
13. Protect - managing your risk
Risk has three components:
• threat
• vulnerability
• impact
Eliminate any one of the above and you have no risk!
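One way to read the claim above is to treat risk as the product of the three components, so that a zero in any one of them gives zero risk overall. The sketch below is only an illustration: the multiplicative form, the 0.0 to 1.0 scales, the function name `risk_score` and the sample figures are all assumptions, not something the slides specify.

```python
def risk_score(threat: float, vulnerability: float, impact: float) -> float:
    """Multiplicative risk model (an assumption): each input is scaled 0.0 to 1.0."""
    for name, value in (
        ("threat", threat),
        ("vulnerability", vulnerability),
        ("impact", impact),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return threat * vulnerability * impact


# Hypothetical figures: a likely threat, a moderate weakness, a severe impact.
baseline = risk_score(threat=0.8, vulnerability=0.5, impact=0.9)

# Removing the vulnerability (for example by patching) drives the score to zero,
# even though the threat and the potential impact are unchanged.
patched = risk_score(threat=0.8, vulnerability=0.0, impact=0.9)
```

In practice a component can rarely be reduced all the way to zero, so a model like this is more useful for comparing where a reduction buys the most.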
14. Threat
Understand your threat landscape
• Participate in Security Groups
JANET UK-SECURITY forum
Cyber Security Information Sharing Partnership (CiSP) (NCSC)
LinkedIn Information Security Forum
• Media and Press
• Threat advisories and reports
15. Vulnerability
A vulnerability is an incident waiting to happen!
Identify and reduce your areas of vulnerability:
• Technical - vulnerability advisories
vulnerability scans and pen tests
• People - active awareness and education programmes
• Financial - contingency planning, budgeting, reserves
16. Reduce Impact
Reducing impact should an incident occur
• Regular backup of data
• Build in resilience
• Responsive incident management process
• Business continuity plan - Test
17. Detect
Keep employees vigilant
• Raise and maintain awareness
• Encourage and reward people for reporting
• Make reporting issues easy
Manual / passive detection alone is not sufficient
- pro-actively monitor for incidents
21. Incident response - features
Clear incident process
• Common point of incident co-ordination
• Agreed lines of command and control
• Multiple ways of communicating
Triage process
Duty Incident Manager (+ deputies)
Incident team + resolver groups
Nominated deputies for all critical roles – no SPOF
Run-books to support incident handling and recovery
Communications – internal and external stakeholders, media and press
Escalation processes when necessary
Review – root cause + long-term fix
Incident closure
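The "no SPOF" point above lends itself to a simple automated check: every critical role should have at least two distinct named people. The following is a minimal sketch under that assumption; the function name, role names and people are hypothetical.

```python
def roles_without_cover(roster: dict[str, list[str]]) -> list[str]:
    """Return roles with fewer than two distinct named people (primary plus deputy)."""
    return sorted(role for role, people in roster.items() if len(set(people)) < 2)


# Hypothetical roster: role -> people able to perform it.
roster = {
    "Duty Incident Manager": ["Alex", "Sam"],
    "Communications Lead": ["Priya"],
    "Service Desk Lead": ["Jo", "Jo"],  # the same person listed twice is still one person
}
gaps = roles_without_cover(roster)
```

Here `gaps` lists the roles that still depend on a single person and so need a nominated deputy.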
22. Adapt
Always establish root cause if possible – Why? Why? Why?
Do we have an underlying problem?
Identify lasting / long-term solution
Understand risks of maintaining any ‘sticking-plaster’ fixes
Learn from the experience
Adapt people, processes and technology
Record the outcome
Communicate to affected stakeholders
Introduction
Andy Gibbs
Enterprise Architect – specialising in the area of Security.
I’ve worked at UCAS for just over three years now,
. . . helping to choose the people, processes and technology
needed to protect UCAS’ vital information assets, systems and business services.
Introduction to UCAS
I expect most here already know who UCAS are and what we do
. . . But just a brief introduction for those who don’t
. . . and also to set the scene in terms of the operational and security challenges we handle
The nature of incidents: what, why, how
Since we’re here to discuss Incident Management I’d like to briefly touch upon what we mean by an ‘incident’ . . .
Aims of incident management
. . . and understand what we are trying to achieve by having a good incident management process in place.
UCAS’ strategy and approach
I’d like to share with you the way UCAS’ Incident management processes have evolved in line with UCAS’ security strategy . . .
Shared experiences
. . . and also share some practical experiences in handling incidents of all types
Standard Intro Slide – DON’T SHOW
Most people in UK know UCAS for our Undergraduate Admissions and Clearing scheme.
UCAS also provides admission services for Teacher Training, arts courses through the Conservatoires and Post-Graduate courses
UCAS is increasingly involved in providing guidance earlier in the learning cycle with UCAS Progress, which allows young people between 13 and 17 years old to search for courses across England and Wales,
. . . and now advice on apprenticeships.
This means UCAS is dealing with the best part of a million applicants each year, generating about 4 million applications to about 400 universities & colleges & 1200 schools.
This includes UK & international schools, agents and advisers from over 100 countries.
From an operational and security viewpoint that means we have a huge population of end-points over which we have little control of the technical configuration and use.
With our connections to universities, colleges & schools, we are connecting to some of the most tech-savvy, but also most technologically diverse, communities.
At any time we are holding
tens of millions of highly sensitive learner records containing contact details, personal attributes, educational achievements and aspirations, personal statements etc
At key times of year we hold exam results BEFORE they’ve been released to the learners
Commercially valuable sector reports, trends analysis etc
We also hold the corporate data about UCAS, our employees, customers, suppliers etc
We have a duty of care to our stakeholder groups to protect the information we hold about them or on their behalf
We strive to ensure our services are delivered in a secure, reliable and operationally stable manner. Failure to do so could have significant impact on UCAS’ reputation.
During our Confirmation and Clearing Processes, operational failure is NOT AN OPTION. Also true of our Admissions processes.
We have highly responsive Incident Management processes in place should any adverse event or circumstance threaten delivery of our services or jeopardise the security of our data.
Are we all speaking the same language?
Why is it important to understand the difference between an event and an INCIDENT?
It is important that we understand what we mean by an INCIDENT and know when we have one.
An incident will usually require special treatment and additional resources to resolve.
This may vary from, for example, applying a simple patch to invoking a full-blown crisis management process (obviously depending on severity)
Straw poll of Audience:
How many here have an Incident Management Process?
How many here have a formal Business Continuity Process?
Of those, how many here have rehearsed/tested their processes in the last year?
How many have an Incident Prevention Programme?
Getting Incident Response wrong either way can have a huge detrimental impact:
Premature or over-zealous response is expensive and may cause undue alarm or panic
Slow response may fail to address the issue effectively, resulting in loss, damage, harm
We live in an imperfect world
Business is a continual balance between risk and return
The bad guys are out there!
The ultimate aim of good incident management is to minimise the cost and disruption of handling incidents.
If we accept that to be the case, we have a choice – respond as best we can when incidents occur
OR
anticipate incidents and be prepared to manage them.
The best way is to actively avoid incidents in the first place.
Our Security Strategy has 4 Quadrants – PROTECT, DETECT, RESPOND and ADAPT
Incident management has a role to play in all 4
Be pro-active in reducing or eliminating THREATS and VULNERABILITIES; Aim to reduce potential IMPACT
It is often difficult to eliminate THREAT; bad weather, terrorists and viruses are out there!
But you can successfully reduce your exposure to these by reducing or removing your vulnerability.
And given that incidents will still occur, we can also pre-empt these and have risk reduction strategies in place.
Understand your threat landscape
UCAS are members of the JISC Security Group, CiSP (the Cyber Security Information Sharing Partnership, part of the National Cyber Security Centre) and the LinkedIn Cyber Security Forum
Threat advisories – 3rd party security vendors (e.g. Incapsula, Akamai, Symantec) publish regular threat landscape reports
Media and Press – The Register, SC Magazine, Cyber Daily, Info Security Alert, CSO Security Alert
Bloggers – Graham Cluley
Identify your areas of vulnerability
Technical - vulnerability scans and pen tests
People – active awareness and education programmes
Financial – planning, budgeting, reserves
Set up emergency finance booking codes (and clear authorities for use) to accommodate and record emergency spending – useful with insurance claims!
Reduce Impact
Good backup
Build in resilience – particularly for critical systems and services
Responsive incident management
Business Continuity Plans – These must be tested to be successful (come back to this later)
Provide bulletins, updates etc. on the intranet, Yammer etc.
Ensure staff know how to report incidents or suspicious activity
Encourage reporting
Make mechanisms for reporting easy – ServiceNow
Physical incidents can sometimes be glaringly obvious . . .
. . . but in the digital world the smoke isn’t so obvious . . .
. . . until it’s too late!
We use Splunk Cloud for our Security Information and Event Management (SIEM). This allows:
Secure log repository (forensic record of events leading up to an incident)
Event correlation
Analysis
Dashboard reports
Real-time alerting
We use ServiceNow in Security Operations for tracking incidents to successful closure
We will progressively integrate the capabilities of the two
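To make "event correlation" and "real-time alerting" concrete, here is a minimal sketch in plain Python of one common rule: flag a source that produces several failed logins within a short window. It is not Splunk's API or query language; the class name, the thresholds, the IP address and the timestamps are hypothetical, and events are assumed to arrive in time order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class FailedLoginCorrelator:
    """Flag a source that reaches a failure threshold within a time window."""

    def __init__(self, threshold: int = 5, window: timedelta = timedelta(minutes=10)):
        self.threshold = threshold
        self.window = window
        self._events = defaultdict(deque)  # source -> timestamps of recent failures

    def observe(self, source: str, timestamp: datetime) -> bool:
        """Record one failed login; return True if the source should raise an alert."""
        recent = self._events[source]
        recent.append(timestamp)
        # Drop failures that have aged out of the window.
        while recent and timestamp - recent[0] > self.window:
            recent.popleft()
        return len(recent) >= self.threshold


# Hypothetical sample: three failures from one address, one minute apart.
correlator = FailedLoginCorrelator(threshold=3, window=timedelta(minutes=5))
start = datetime(2024, 1, 1, 9, 0)
alerts = [
    correlator.observe("10.0.0.7", start + timedelta(minutes=offset))
    for offset in (0, 1, 2)
]
```

Only the third observation returns `True`. The same three failures spread over twenty minutes would never alert, which is the point of correlating events rather than counting them in isolation.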
Have a clear incident process laid down and well communicated
Common point of incident co-ordination
Clear invocation process - invoke
Triage – have a clear system of triage. Include a clear definition of incident classes (e.g. P0, P1, P2, P3, P4) based upon severity, complexity, required resources and/or communication levels
Containment
Agreed line of command and control
Multiple means of communication
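The triage step above could be sketched as a small classification function that maps a report onto the P0 to P4 classes. This is an illustration only: the fields of `IncidentReport`, the 1 to 5 severity scale and the escalation rules are assumptions, not UCAS' actual criteria.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    P0 = 0  # most severe
    P1 = 1
    P2 = 2
    P3 = 3
    P4 = 4  # least severe


@dataclass
class IncidentReport:
    severity: int           # 1 (minor) to 5 (critical); hypothetical scale
    services_affected: int  # number of business services impacted
    data_at_risk: bool      # could personal or sensitive data be exposed?


def triage(report: IncidentReport) -> Priority:
    """Assign an incident class using illustrative thresholds (assumptions)."""
    if not 1 <= report.severity <= 5:
        raise ValueError("severity must be between 1 and 5")
    # Start from severity alone: 5 -> P0, 4 -> P1, ... 1 -> P4.
    level = 5 - report.severity
    # Escalate one class for wide impact, and one for possible data exposure.
    if report.services_affected > 1:
        level -= 1
    if report.data_at_risk:
        level -= 1
    return Priority(max(level, 0))
```

For example, `triage(IncidentReport(severity=3, services_affected=2, data_at_risk=True))` returns `Priority.P0`, while a minor issue confined to one service with no data exposure stays at `Priority.P4`. Writing the rules down in this form makes the class definitions explicit and easy to review after each incident.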