1. Case Study: Sprint Monitors Its Mega-Network for
Voice/Video/Data Service Assurance with CA
Performance Management
Mike Peterson
Sprint
Wireless Core Planner
2. Terms of this Presentation
For Informational Purposes Only
© 2015 CA. All rights reserved. All trademarks referenced herein belong to their respective companies.
The content provided in this CA World 2015 presentation is intended for informational purposes only and does not form any type of warranty. The information provided by a CA partner and/or CA customer has not been reviewed for accuracy by CA.
3. ©2015 Sprint. This information is subject to Sprint policies regarding use and is the property of Sprint and/or its relevant affiliates and may contain restricted,
confidential or privileged materials intended for the sole use of the intended recipient. Any review, use, distribution or disclosure is prohibited without authorization.
#MoveForward
Case Study: Sprint Monitors Its Mega-Network for
Voice/Video/Data Service Assurance with CA
Performance Management
11/19/2015
4. Agenda
• General Business Case
• Size and Scope of Environment
• Groups
• Dashboards
• Scorecards
5. General Business Case
6. General Business Case
Network Planning and Performance
• Scale the network to exceed customer demand in the most cost-effective manner possible.
• Ensure the network operates at peak performance.
Challenges
• Report and forecast utilization on over 400,000 interfaces.
• Monitor and alarm on device components and interfaces for anomalous behavior.
• Provide clear and concise reporting.
• Respond to ad hoc reporting requests quickly.
• Develop ‘flow-based’ reporting and use Top N reporting sparingly.
• Keep up with evolving devices, designs and scale.
7. Size and Scope of Environment
8. Size and Scope of Environment
Components
• Over 40,000 physical locations.
• More than 50,000 devices and 2 million interfaces.
• A wide array of networking devices and manufacturers.
• Wireless Core accounts for 6% of the devices and interfaces.
Data Points
• 2.2 million components polled every polling cycle, 288 cycles per day.
• 4.4 trillion data points:
  • 91% of the 4.4 trillion is a combination of five-minute polled and calculated metrics retained for 45 days.
  • The remainder is hourly and daily roll-ups.
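The figures on this slide can be cross-checked with a little arithmetic. The sketch below uses only the numbers quoted above (2.2 million components, 288 cycles per day, 45-day retention, 91% of 4.4 trillion). The average number of stored values per component is an inference from those numbers, not a figure given in the presentation.

```python
# Back-of-the-envelope check of the polling figures quoted on the slide.
COMPONENTS = 2_200_000        # components polled per cycle (from the slide)
CYCLES_PER_DAY = 288          # polling cycles per day (from the slide)
RETENTION_DAYS = 45           # retention of five-minute data (from the slide)
TOTAL_DATA_POINTS = 4.4e12    # total data points (from the slide)
FIVE_MINUTE_SHARE = 0.91      # share held at five-minute resolution (from the slide)

minutes_per_cycle = 24 * 60 / CYCLES_PER_DAY
polls_per_day = COMPONENTS * CYCLES_PER_DAY
polls_retained = polls_per_day * RETENTION_DAYS
five_minute_points = TOTAL_DATA_POINTS * FIVE_MINUTE_SHARE

# Inferred, NOT stated in the presentation: average number of polled and
# calculated values stored per component per cycle.
implied_metrics_per_component = five_minute_points / polls_retained

print(f"polling interval: {minutes_per_cycle:.0f} minutes")
print(f"component polls per day: {polls_per_day:,}")
print(f"component polls retained for 45 days: {polls_retained:,}")
print(f"implied values per component per cycle: {implied_metrics_per_component:.0f}")
```

Read this way, 288 cycles per day corresponds to a five-minute polling interval, and the 45-day window implies on the order of 140 stored values per component per cycle, assuming the 91% share is spread evenly across all components.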
9. Size and Scope of Environment
Calculations and Notifications
• Three metric projections on Bits In and Bits Out.
• Baselines and standard deviations for utilization and component performance.
• Threshold notifications sent to email recipients and external monitoring systems.
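A baseline with a standard-deviation band is a common way to decide when a sample counts as anomalous. The sketch below is a generic illustration of that idea, not a description of how CA Performance Management computes its baselines; the function names, the 2-sigma band, and the sample values are all assumptions made for the example.

```python
from statistics import mean, stdev

def baseline_band(history, n_sigma=2.0):
    """Return (baseline, lower, upper) computed from past samples."""
    baseline = mean(history)
    spread = stdev(history)  # sample standard deviation
    return baseline, baseline - n_sigma * spread, baseline + n_sigma * spread

def is_anomalous(history, current, n_sigma=2.0):
    """True when the current sample falls outside the baseline band."""
    _, lower, upper = baseline_band(history, n_sigma)
    return current < lower or current > upper

# Hypothetical utilization samples (percent) for the same hour on previous days.
same_hour_history = [41.0, 43.5, 39.8, 42.2, 40.9, 44.1, 41.7]

normal_sample = is_anomalous(same_hour_history, 42.0)
spike_sample = is_anomalous(same_hour_history, 63.0)
print(normal_sample, spike_sample)
```

A check like this would sit in front of the notification step: only samples outside the band would be forwarded by email or to an external monitoring system.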
Users
• Application is available to all employees.
• Operations:
  • Viewing performance in near real time
  • Analyzing historical performance for RCA (Root Cause Analysis)
• Engineering:
  • Capacity planning utilizing trending
  • Performance metric trending
10. Groups
11. Groups
Challenges
• We have too many interfaces for a user to be able to effectively build reports by searching.
• We wanted to put most of our interfaces into at least one group.
• We needed to be able to group similar interfaces across 30+ sites.
• There are users that have no working knowledge of the Wireless Core.
• Downstream customers need groups that fit their view of the network.
• Management has stated a preference for contextual reporting, minimizing Top N.
12. Groups
Observations / Experience
• The true power of the platform is in groups.
• The flexibility of groups is a powerful feature, enabling limitless reporting combinations.
• We based our primary group organization around our sites. Each site has many interface groupings; the group structure is common across sites.
• Our groups use rules instead of manually selecting members.
• Rules can be very simple, containing only other groups.
• They can also be very complex and powerful by leveraging regular expressions.
• Outside of the site structure we can easily build other group containers by using references.
Because we are using text/pattern matching on interface descriptions, established interface naming and description conventions must be strictly followed.
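The dependency between rule-based groups and description conventions can be illustrated with a small sketch. Everything specific here is hypothetical: the `SITE|FUNCTION|free text` description format, the site codes, the group names, and the `build_groups` helper are invented for the example and are not the conventions or rule syntax used by Sprint or the product.

```python
import re
from collections import defaultdict

# Hypothetical convention: "<SITE>|<FUNCTION>|<free text>", e.g. "CHI1|RTR-FW|to chi1-fw01".
SITE = re.compile(r"^(?P<site>[A-Z]{3}\d)\|")
RULES = {
    "Router to Firewall": re.compile(r"^[A-Z]{3}\d\|RTR-FW\|"),
    "Router to Core": re.compile(r"^[A-Z]{3}\d\|RTR-CORE\|"),
}

def build_groups(interfaces):
    """Assign interfaces to (site, group) buckets by matching their descriptions.

    Returns the populated groups and the interfaces that matched no rule.
    """
    groups = defaultdict(list)
    unmatched = []
    for name, description in interfaces.items():
        site = SITE.match(description)
        hits = [group for group, rule in RULES.items() if rule.search(description)]
        if site is None or not hits:
            unmatched.append(name)
            continue
        for group in hits:
            groups[(site.group("site"), group)].append(name)
    return dict(groups), unmatched

interfaces = {
    "chi1-rtr01:ae1": "CHI1|RTR-FW|to chi1-fw01",
    "chi1-rtr01:ae2": "CHI1|RTR-CORE|to chi1-core01",
    "dal1-rtr01:ae1": "DAL1|RTR-FW|to dal1-fw01",
    "dal1-rtr02:xe-0/0/1": "uplink to firewall",  # breaks the convention
}
groups, unmatched = build_groups(interfaces)
print(groups)
print("not in any group:", unmatched)
```

The last interface shows why the conventions matter: a free-form description matches no rule, so the interface silently falls out of every group and out of every report built on those groups.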
13. Groups
• A site’s interface types or functions are identified and grouped.
14. Groups | By Site
• The groups within a site are grouped together as a site group.
15. Groups | By Interface Groupings
• Individual interface groupings are tied together across all of the sites.
16. Groups | By Platform
• Devices are grouped similarly to interfaces.
17. Groups | Example
• Sites are the primary groups containing the various interface functional types.
• Interface groups contain reference groups from the sites.
• Within the sites, top level groups will contain all associated interfaces. Subordinate level (aggregate) groups contain just the LAG/Bundle.
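The relationship between site groups and cross-site interface groups can be modeled as groups that hold references to other groups rather than copies of their members. The sketch below is only a conceptual model of that idea; the `Group` class, its fields, and the site and interface names are hypothetical and do not reflect the product's internal data model.

```python
from dataclasses import dataclass, field

@dataclass
class Group:
    name: str
    interfaces: list = field(default_factory=list)   # direct members
    references: list = field(default_factory=list)   # other Group objects

    def members(self):
        """Return every interface reachable through this group, sorted and de-duplicated."""
        found = set()
        visited = set()
        stack = [self]
        while stack:
            group = stack.pop()
            if id(group) in visited:      # guard against reference cycles
                continue
            visited.add(id(group))
            found.update(group.interfaces)
            stack.extend(group.references)
        return sorted(found)

# Per-site functional groups (in practice these would be populated by rules).
chi_fw = Group("CHI1 / Router to Firewall", ["chi1-rtr01:ae1"])
chi_core = Group("CHI1 / Router to Core", ["chi1-rtr01:ae2"])
dal_fw = Group("DAL1 / Router to Firewall", ["dal1-rtr01:ae1"])

# Site groups and a cross-site interface group built purely from references.
chi_site = Group("CHI1", references=[chi_fw, chi_core])
all_fw = Group("Router to Firewall (all sites)", references=[chi_fw, dal_fw])

print(chi_site.members())
print(all_fw.members())
```

Because the cross-site group only points at the site-level groups, a change to a site's rule set shows up in both views without the rules being duplicated.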
18. Dashboards
19. Dashboards
• We rely heavily on dashboards for our capacity planning activities.
• The dashboard views we favor the most are IM Table (Top) and IM Custom View Group Scorecard.
• The default Dashboards are excellent, especially Router, Interface and Group.
• 50% of our published dashboards are locked to a specific group.
• Trend graphs and tabular views offer unique but equally valuable insights.
• Users are allowed to edit any dashboard and save changes to only their account. They can also build their own without cluttering the Performance Manager menus.
20. Dashboards
21. Dashboards
• IM Table (Top) – Used both by Operations and Planning
  • Useful for data extractions (CSV)
22. Dashboards
23. Scorecards
24. Scorecards
• Scorecards have proven to be the most useful dashboard view.
• To fully realize their potential, a lot of time and care needs to be spent on group administration. Group hierarchy is key.
• Two types of scorecards are used:
  • Dynamic Context – Any group can be viewed;
  • Locked Context – Specific groups are chosen and locked to the dashboard. We use these for high profile interfaces.
• Scorecard strengths:
  • The ability to view at a summary level many interfaces at once;
  • The ability to drill deeper into the data;
  • Metric projections;
  • Identifying out-of-balance interface members in LAGs/Bundles.
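Spotting an out-of-balance LAG/Bundle member amounts to comparing each member's utilization with what is typical for the bundle. The sketch below shows one way to express that check outside the tool; it is not how the scorecard itself works. The function name, the 20% tolerance, and the sample values are assumptions, and the median is used as the reference so that a single hot member does not skew the comparison.

```python
from statistics import median

def out_of_balance(member_utilization, tolerance=0.20):
    """Return members whose utilization differs from the bundle median
    by more than `tolerance`, expressed as a fraction of the median."""
    if not member_utilization:
        return {}
    reference = median(member_utilization.values())
    if reference == 0:
        # An idle bundle: any member carrying traffic is out of balance.
        return {name: util for name, util in member_utilization.items() if util > 0}
    return {
        name: util
        for name, util in member_utilization.items()
        if abs(util - reference) / reference > tolerance
    }

# Hypothetical utilization (percent) of the four members of one bundle.
bundle = {"xe-0/0/0": 30.0, "xe-0/0/1": 32.0, "xe-0/0/2": 31.0, "xe-0/0/3": 67.0}
flagged = out_of_balance(bundle)
print(flagged)
```

In this example only the fourth member is flagged, which is the kind of pattern a summary-level scorecard makes visible when aggregate groups contain just the LAG/Bundle and its members.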
25. Scorecards
26. Scorecards
Editor's Notes
General Business Case – Capacity Management and Performance
Size and Scope of Environment
Value of Grouping – Pattern matching, network ‘slice’ (by site, interface groupings, vendor, flows, etc.)
Dashboards – IM Table (Top) (if time allows: MultiView, canned [interface, device, group])
Scorecards – Aggregates, sites, flows
Data Extraction – OpenAPI – Tableau
Planning – Place augments in time, with a capacity that allows for cost-effective incremental growth. Try to time capital expenditures to the last minute without EVER causing a constraint.
Reporting – Should be easy to understand and have relevant impact; the executive levels don’t live and breathe this stuff like engineers do.
Tracking flows from A to Z is especially useful during troubleshooting; it is very difficult to find issues in data paths using only Top N.
40,000 locations - Cell towers to aggregating POPs to regional data centers to national data centers
Devices – many devices unique to carriers (gateways, MME, etc.)
Linear regression / least squares – near (90 day), short (180 day) and long term (365 day) projections.
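A least-squares projection of this kind can be sketched in a few lines. The 90, 180 and 365 day horizons come from the note above; everything else is an assumption made for illustration, including the function names, the use of one daily sample as input, and the reading of each horizon as a look-ahead distance from the last sample. The source does not say how the product fits or windows its data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def project(days, values, horizons=(90, 180, 365)):
    """Extrapolate the fitted line `horizon` days past the last sample."""
    slope, intercept = fit_line(days, values)
    last_day = days[-1]
    return {h: intercept + slope * (last_day + h) for h in horizons}

# Hypothetical daily peak utilization (percent) growing 0.1 points per day.
days = list(range(30))
values = [40.0 + 0.1 * d for d in days]

projections = project(days, values)
for horizon, value in projections.items():
    print(f"{horizon:>3} days out: {value:.1f}%")
```

For capacity planning the useful output is usually the horizon at which the projected value crosses an upgrade threshold, which is what lets an augment be timed as late as possible without causing a constraint.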
Standard Dev & baseline for interface utilization, errors, discards, mem, CPU, etc.
Notifications – Email and SNMP traps
Operations – Fault and performance
Engineering – Planning and performance
Our experience with eHealth showed that building reports through searching had serious limitations. The first was the limit on how many elements could be selected; the other was the painful process of actually performing the search.
While using eHealth it became quite clear that we should try to ensure that as many elements as possible were contained in groups.
Group labeling and structure were going to be important if downstream customers were to be able to use CAPM. There are many open and closed source platforms that can collect SNMP; we’ve found the group administration to be the most powerful asset. During an internal demonstration, management was able to see the value immediately.
The design of our network in some ways informed us how to structure the groups. If it exists in the network it must be performing a function, therefore that function can be described and in turn that description can be the name of a group.
The big “Ah Ha” was when we figured out how groups can be reported in the scorecards. Assumptions that were made on group structure were in some ways wrong, and some of the group structures we built had to be scrapped.
Reference groups allow for new interface groupings to be built without having to replicate the rule sets. Example – We might have groups that contain router to firewall links at every site; if we need a group that contains those links in a top level group, all we have to do is build references.
IM Table (Top) is like a Swiss Army knife: just about every metric family is available to view, and it works equally well on interfaces and devices, performance or capacity.
(Assume we have an example of metric projection) – Even though management dislikes Top N reports, they still have value to operations (NOC).
We have other versions of this view that aren’t locked to groups, which allows extremely focused views of specific interface (or device) groupings.