Many IT organizations suffer from the nagging problems of Availability and Performance Management. In this presentation we will detail 10 Ways to Better Application-Centric Service Management, particularly with SAP environments.
2. The Nagging Problem of Availability &
Performance Management
[Diagram labels: IT Executives, IT Managers, End-Users, SAP Team]
• Transport Requests
• New Rollouts
• Monitoring / CCMS Alerts
• Access Requests
• Background Jobs
• Upgrades/Support Packs
• SLA & Performance Reports
• Trouble Tickets
Linh Nguyen ITConductor.com 2
3. It's Tough to Manage the Growing
Demands of an SAP Environment
• Technology is changing fast, and knowledge management
and resource demands are growing with it
• Daily repetitive and manual tasks are a growing burden that
takes time away from expert resources
• Project demands to meet new business requirements run as
parallel streams and require dedicated effort
• Company mergers, acquisitions, divestitures, and regulatory
requirements may mean additional application
environments
• Outsourced IT services blur the lines of responsibility,
deliverables, and accountability
• A move to cloud and/or service-oriented architecture may
introduce private, public, and/or hybrid virtual layers that
require additional effort to manage, including migration and
life cycle management tools
4. Grow Smarter with Application-Centric
Service Management & Automation
10 ways to Automate towards Smart Application Management
(1) 360-degree view of Application Environment
(2) Availability and Performance Monitoring
(3) Root-cause Analysis
(4) Time-synchronized Troubleshooting Context
(5) Service Impact Awareness
(6) Automated Admin Scripts and Jobs
(7) Self-healing Automated Recovery
(8) Digitized Complex IT Processes
(9) Synthetic Transaction Management
(10) Dynamic Service Level Management
6. 1. 360-degree view of Application
Environment
Think of your applications as part of a service
providing features and functions that together
complete business tasks or workflows
Take a top-down, application-centric view of the
technology components that make up the
services, e.g. SAP functions/modules, mobile
services, application layer, database layer
(though with SAP HANA these could be
combined), storage, network, security,
integration/interfaces
8. 2. Availability and Performance
Monitoring
Availability probes for as many layers as practical, starting
with the app and working down to components such as DB and hosts
Event management with log collection, filtering, and
analysis, covering syslogs, database logs/alerts, and workflow
event logs
Performance monitoring with metric collection (also
known as performance counters). Collect only what is
relevant, at smart intervals, and follow best practices for the
technology stack. SAP has thousands of
metrics, most of which don't significantly impact
performance or availability. Those of lesser impact, if
really needed (say, for capacity planning), should be
collected at longer intervals
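The idea of collecting high-impact metrics often and low-impact metrics rarely can be sketched as below. This is only an illustration: the metric names and intervals are hypothetical and are not taken from SAP or from any monitoring product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    interval_s: int  # collection interval in seconds


# Hypothetical metrics and intervals, for illustration only.
METRICS = [
    Metric("dialog_response_time", 60),       # high impact: every minute
    Metric("work_process_utilization", 60),   # high impact: every minute
    Metric("database_size", 3600),            # capacity planning: hourly
]


def due_metrics(elapsed_s, metrics=METRICS):
    """Return the names of metrics whose interval divides the elapsed time."""
    return [m.name for m in metrics if elapsed_s % m.interval_s == 0]
```

A collector loop would call `due_metrics` once per tick and query only the metrics it returns, so the hourly capacity metric adds no load on the other 59 ticks.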
10. 3. Root-cause Analysis
Availability, events/alerts, and performance metrics
are just data. They must fit into context to
enhance root-cause analysis. Think of them
individually as puzzle pieces: without context
they are mostly noise and don't really help.
Service- or application-centric monitoring enhances
the ability to fit data into context and thus provides
useful information about where, when, and how they fit
into the bigger picture. Seeing where the alert or
metric originates in a service tree greatly helps the
contextual analysis.
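As a minimal sketch of placing an alert in context, the snippet below finds the path from the top of a service tree down to the component that raised the alert. The tree contents and node names are hypothetical.

```python
# Hypothetical service tree: each key is a node, each value lists its children.
SERVICE_TREE = {
    "Order-to-Cash service": ["SAP application layer", "Database layer"],
    "SAP application layer": ["app-server-01", "app-server-02"],
    "Database layer": ["db-host-01"],
}


def path_to(node, root="Order-to-Cash service", tree=SERVICE_TREE):
    """Return the list of nodes from root down to node, or None if absent."""
    if root == node:
        return [root]
    for child in tree.get(root, []):
        sub = path_to(node, child, tree)
        if sub is not None:
            return [root] + sub
    return None
```

Calling `path_to("db-host-01")` shows that an alert on that host belongs to the database layer of the Order-to-Cash service, which is the context a bare alert lacks.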
12. 4. Time-synchronized Troubleshooting
Context
Contexts can be related or correlated if the data are
time-synchronized. If services comprise many
components, then the data collected from those
components should be available on the same time axis
Visual correlation: the human eye and mind are
naturally the most powerful analytical tools, so when
data are aligned, they can provide useful information
Statistical correlation: time series data normally form a
good basis for statistical analyses such as
averages and spike detection, which can indicate deviations
and point to the root cause more easily
Benchmarks & baselines: comparisons between
different periods: hourly, daily, weekly, monthly, etc.
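A rough sketch of the two ideas above: aligning two series on shared timestamps, and flagging values that deviate from a baseline period. The threshold of three standard deviations and the sample values are assumptions chosen for illustration, not a recommendation from the presentation.

```python
from statistics import mean, pstdev


def align(series_a, series_b):
    """Join two {timestamp: value} series on their shared timestamps."""
    common = sorted(series_a.keys() & series_b.keys())
    return [(t, series_a[t], series_b[t]) for t in common]


def spikes(samples, baseline, k=3.0):
    """Return (timestamp, value) pairs more than k standard deviations
    above the mean of the baseline values."""
    mu, sigma = mean(baseline), pstdev(baseline)
    return [(t, v) for t, v in samples if v > mu + k * sigma]


# Hypothetical response times in milliseconds.
baseline = [100, 110, 90, 105, 95]                 # e.g. same hour last week
current = [(0, 102), (1, 98), (2, 400), (3, 107)]  # (minute, value)
```

Here `spikes(current, baseline)` flags only the sample at minute 2, and `align` lets that timestamp be compared against another component's series for the same minute.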
14. 5. Service Impact Awareness
Service-centric monitoring allows top-down as well
as bottom-up propagation of events that impact related
components
Flexible service definitions can assign relationships as
well as rules/weights for how one component should impact another,
e.g. if a server is busy and unable to service requests for
10 minutes, then change its status to critical and
propagate the impact to the overall application service
as a warning: the overall service may still be available
but degraded, as other load-balanced servers can still
service other requests.
Role-based subscription determines who should get
notified depending on what's being impacted and its severity
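The server example above can be sketched as a pair of rules. The 10-minute limit comes from the slide; treating the service as critical only when every server is critical is an assumption added here to complete the rule.

```python
def server_status(unresponsive_minutes, critical_after=10):
    """A server becomes critical after being unable to serve requests
    for `critical_after` minutes (10 in the slide's example)."""
    return "critical" if unresponsive_minutes >= critical_after else "ok"


def service_status(server_statuses):
    """Propagate server states up to the load-balanced service.

    Assumption: the service is 'warning' (degraded) while at least one
    server is still ok, and 'critical' only when all servers are critical.
    """
    critical = sum(1 for s in server_statuses if s == "critical")
    if critical == 0:
        return "ok"
    if critical < len(server_statuses):
        return "warning"
    return "critical"
```

With two servers unresponsive for 12 and 0 minutes, the first is critical while the service as a whole is reported as a warning.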
15. (6) Automated Admin Scripts & Jobs
[Diagram labels: App Jobs, OS Scripts, DB Scripts]
16. 6. Automated Admin Scripts & Jobs
Centralize system administration scripts for systems,
databases, and applications
Cross-platform support
Manage batch execution schedules
Monitor execution logs
18. 7. Self-healing Automated Recovery
Proactive execution of housekeeping tasks as best-practice
problem prevention
Monitored thresholds can invoke auto-recovery actions
using known fixes for common issues, or simply notify
the associated person to execute those actions
Corrective actions should be logged for audit purposes
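A minimal sketch of the threshold, action, and audit loop described above. The metric name, the 90% threshold, and the `clear_temp_files` fix are hypothetical placeholders; a real fix would run an actual housekeeping script.

```python
from datetime import datetime, timezone

audit_log = []  # every corrective action or notification is recorded here


def clear_temp_files():
    """Hypothetical known fix; stands in for a real housekeeping script."""
    return "temp files cleared"


# Hypothetical rule table: metric name -> (threshold, known fix).
KNOWN_FIXES = {"filesystem_used_pct": (90.0, clear_temp_files)}


def check(metric, value, auto_recover=True, notify=print):
    """Run the known fix (or notify a person) when a threshold is crossed."""
    rule = KNOWN_FIXES.get(metric)
    if rule is None or value < rule[0]:
        return None
    threshold, fix = rule
    if auto_recover:
        outcome = fix()
    else:
        notify(f"{metric}={value} exceeds {threshold}; run {fix.__name__}")
        outcome = "notified"
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "metric": metric,
        "value": value,
        "outcome": outcome,
    })
    return outcome
```

Both paths write to `audit_log`, so the audit trail covers automatic fixes and hand-offs to a person alike.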
20. 8. Digitized Complex IT Processes
Knowledge is power: IT processes should be
captured in runbooks for documentation or automation
Complex environments are non-linear so workflows are
best used to capture processes and dependencies
Workflows should be repeatable, monitored and
managed down to the individual task level
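One way to sketch a non-linear workflow with per-task tracking is a dependency graph run in topological order, here using Python's standard `graphlib` module. The task names are hypothetical, and skipping tasks whose dependencies failed is an assumed policy.

```python
from graphlib import TopologicalSorter

# Hypothetical runbook: each task maps to the set of tasks it depends on.
WORKFLOW = {
    "notify_users": set(),
    "stop_application": {"notify_users"},
    "backup_database": {"stop_application"},
    "export_config": {"stop_application"},
    "apply_support_pack": {"backup_database", "export_config"},
    "start_application": {"apply_support_pack"},
}


def run_workflow(workflow, actions):
    """Run tasks in dependency order and record a status for each task."""
    status = {}
    for task in TopologicalSorter(workflow).static_order():
        # Assumed policy: skip a task unless all of its dependencies are done.
        if any(status.get(dep) != "done" for dep in workflow[task]):
            status[task] = "skipped"
            continue
        try:
            actions[task]()
            status[task] = "done"
        except Exception:
            status[task] = "failed"
    return status
```

The returned dictionary gives the task-level view: if `backup_database` fails, `export_config` still completes while `apply_support_pack` and `start_application` are reported as skipped.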
22. 9. Synthetic Transaction Management
Test critical processes on a frequent basis to ensure the
availability and performance service levels for them are
constantly monitored
When possible, simulate end-user experience from
various points of entry to the service or application
using robots or scripts that can be triggered centrally
but executed remotely
Baseline performance during different time periods for
trend analysis and exception-based alerting
Transaction-level monitoring
Integration with Performance Load Testing
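A synthetic transaction probe can be sketched as a timed call compared against a baseline. The `transaction` callable stands in for a scripted end-user step, and the 1.5x tolerance is an assumed value.

```python
import time


def run_probe(transaction, baseline_s, tolerance=1.5, clock=time.perf_counter):
    """Execute a scripted transaction once and compare it to its baseline.

    `transaction` is any callable simulating an end-user step (hypothetical);
    an alert is raised if it fails or runs slower than baseline * tolerance.
    """
    start = clock()
    try:
        transaction()
        available = True
    except Exception:
        available = False
    elapsed = clock() - start
    slow = available and elapsed > baseline_s * tolerance
    return {
        "available": available,
        "elapsed_s": elapsed,
        "alert": (not available) or slow,
    }
```

A central scheduler could trigger this probe on remote agents and pass a different `baseline_s` per time period, so that alerts fire only on exceptions to the expected response time.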
23. (10) Dynamic Service Level Management
[Diagram labels: Flexible Dashboard with Drilldown, Service Level Monitors, Service Level KPIs]
24. 10. Dynamic Service Level Management
Dashboards should support operational service level
monitoring and compliance
Service Level Agreements (SLAs) and Operational Level
Agreements (OLAs) should be proactively managed for
compliance
Automated and flexible report generation and delivery
Service desk integration with notifications and
interactive charts
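Proactive compliance management can be illustrated with a simple availability calculation that also reports how much downtime budget remains in the period. The 99.9% target and the figures in the usage note are hypothetical.

```python
def availability_pct(period_minutes, downtime_minutes):
    """Percentage of the period during which the service was available."""
    return 100.0 * (period_minutes - downtime_minutes) / period_minutes


def sla_report(period_minutes, downtime_minutes, target_pct):
    """Compare measured availability with the agreed target."""
    actual = availability_pct(period_minutes, downtime_minutes)
    allowed = period_minutes * (1 - target_pct / 100.0)
    return {
        "availability_pct": round(actual, 3),
        "compliant": actual >= target_pct,
        "downtime_budget_left_min": round(allowed - downtime_minutes, 1),
    }
```

For a 30-day month (43,200 minutes) with 30 minutes of downtime and a 99.9% target, the report shows the service is compliant with about 13 minutes of budget left, which is the kind of figure a dashboard can surface before the agreement is breached.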