Large-Scale Implementation of
IBM Tivoli Composite Application
Manager for WebSphere and
Response Time Tracking
Planning for performance of management infrastructure
Implementing with multiple servers
Performing mass update of agents
Budi Darmawan
Aleem Subhedar
Celena Tan
Howard Anglin
Huang Chuan
Rohit Dhall
ibm.com/redbooks Redpaper
International Technical Support Organization
Large-Scale Implementation of IBM Tivoli
Composite Application Manager for WebSphere
and Response Time Tracking
December 2007
Trademarks
The following terms are trademarks of the International Business Machines Corporation in the United States,
other countries, or both:
Redbooks (logo) ® ETEWatch® OMEGAMON®
pSeries® IBM® OS/400®
z/OS® IMS™ Rational®
AIX® Lotus Notes® Redbooks®
CICS® Lotus® Tivoli®
Database 2™ Monitoring On Demand® WebSphere®
DB2 Universal Database™ MVS™ Workplace™
DB2® Notes®
ETE™ Operating System/400®
The following terms are trademarks of other companies:
Oracle, JD Edwards, PeopleSoft, Siebel, and TopLink are registered trademarks of Oracle Corporation
and/or its affiliates.
Snapshot, and the Network Appliance logo are trademarks or registered trademarks of Network Appliance,
Inc. in the U.S. and other countries.
ITIL is a registered trademark, and a registered community trademark of the Office of Government
Commerce, and is registered in the U.S. Patent and Trademark Office.
Enterprise JavaBeans, EJB, Java, JavaBeans, JDBC, JMX, JNI, JRE, JVM, J2EE, Solaris, and all
Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or
both.
Microsoft, Outlook, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United
States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks of others.
Celena Tan is a Managing Consultant with IBM Software Group Services in
Australia. She has 14 years of experience in the IT field. She holds a Masters of
Technology from National University of Singapore and a Bachelor of Electrical
Engineering (Hons) from the University of Tasmania. Her areas of expertise
include ITCAM family products and Rational testing, change, and
configuration management products.
Howard Anglin is a Deployment Expert for ITCAM for WebSphere, ITCAM for Response
Time Tracking, and IBM Tivoli Monitoring in the United States. He has worked with
various large customers, and in his role as an IT Specialist he has resolved
deployment, integration, and performance issues. He has nine years of
experience in the software test and development field with emphasis on the
WebSphere Application Server. He holds a Bachelor of Science in Electrical
Engineering from Manhattan College, Riverdale, New York. Howard began his
career at IBM in the pSeries Hardware Group as a Test Engineer developing
automation solutions for the production line. He then transferred to the software
group.
Huang Chuan is a Senior Test Lead at the IBM China CSDL lab. He has five years
of experience in software development and over six years of experience in
software product testing. He has led the ITCAM for Response Time Tracking test
project for several releases. He holds a degree in Computer Science from the
University of Electronic Science and Technology of China.
Rohit Dhall is an IT Architect with GBS, IBM India. He has 10 years of IT
experience in technologies like client-server computing, Web-based
transactional systems, data warehousing, and data mining. His major expertise is
in designing, implementing, and tuning large-scale Internet banking, eMortgage,
and anti-money laundering solutions for the banking and financial sector. He is
EXIN ITIL® certified and also holds certification in Java and EJB™ from
Brainbench. His current interests include SOA and IBM Virtualization offerings.
Thanks to the following people for their contributions to this project:
Donna Martin, Noel Lewis, Tony Williams, Marco De Gregorio, Sushanto Pandit
IBM Software Group, Tivoli Software
John Horton
Author of the first edition of Large-Scale Implementation of IBM Tivoli Composite
Application Manager for WebSphere and Response Time Tracking, REDP-4162
Julie Czubik
International Technical Support Organization, Poughkeepsie Center
Become a published author
Join us for a two-week to six-week residency program! Help write an IBM
Redbook dealing with specific products or solutions, while getting hands-on
experience with leading-edge technologies. You will team with IBM technical
professionals, Business Partners, or customers.
Your efforts will help increase product acceptance and customer satisfaction. As
a bonus, you will develop a network of contacts in IBM development labs, and
increase your productivity and marketability.
Find out more about the residency program, browse the residency index, and
apply online at:
ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us!
We want our papers to be as helpful as possible. Send us your comments about
this Redpaper or other Redbooks® in one of the following ways:
Use the online Contact us review book form found at:
ibm.com/redbooks
Send your comments in an e-mail to:
redbook@us.ibm.com
Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400
1.1 Application management with IBM Tivoli
Computer-based applications are the lifeblood of modern enterprises. Most
business processes are driven by computer applications that promote
productivity, automate processing, and minimize human error. These
applications enable business users to focus on what must be done instead of
how to do it. However, as business processes rely more on these applications,
the applications become critical to the business: they must be available for
the business processes to run.
Most applications evolved from centralized applications typically managed by the
information technology (IT) department or mainframe-based applications, where
all the application layers are maintained from the central mainframe. Today,
applications tend to have multiple layers, often distributed across different
servers, different platforms, and different components. These applications are
called composite applications. This complicates the management of applications
on matters such as operational settings, problem determination, and
performance management.
Applications as a business-critical entity must be available with adequate
response time for users to perform their tasks. With application components
spread throughout the enterprise, problem determination and performance
management are typically complicated. There is no clear path for finding which
component faces the problem. Is it the database? A network problem? The
application server experiencing a bottleneck? A user machine stall? Sometimes,
these components even belong to different organizations.
Figure 1-1 shows a typical composite application. This is used by multiple users
through the Internet and intranet. It consists of multiple application layers, each
with its own abstraction level. Some of the applications have the original back
end in the mainframe transactions.
Figure 1-1 Composite application
Composite applications are regarded as the ultimate application management
challenge, as they span different application servers that communicate with each
other. This architecture enables modular application development, where
changes in a layer may not affect other layers, but introduces the complexity of
multiple components.
This paper demonstrates how to implement the IBM Tivoli Composite Application
Manager family of products in a large-scale environment. This chapter introduces
the IBM Tivoli product portfolio and shows where the IBM Tivoli Composite
Application Manager products fit.
1.1.1 IBM Tivoli systems management portfolio
IBM Tivoli product solutions are aligned towards an overall IBM IT Service
Management approach. Figure 1-2 shows the IBM IT Service Management
portfolio structure.
[Figure: layered diagram. IT Process Management Products: IT CRM & Business Management, Service Delivery & Support, Service Deployment, Information Management, Business Resilience. IT Service Management Platform: Change and Configuration Management Database. IT Operational Management Products: Business Application Management; Server, Network & Device Management; Storage Management; Security Management. Best Practices.]
Figure 1-2 IBM IT Service Management
This approach provides Information Technology Infrastructure Library (ITIL)-aligned
automation workflows. Future offerings will provide an open standard-based and
configuration management database-based solution, as well as a workflow
engine.
The operational management pillar shown in Figure 1-2 is divided into
software families. The availability solution addressed in business application
management and server, network, and device management can be viewed as an
integrated offering, as shown in Figure 1-3.
[Figure: stacked layers, top to bottom: Business Service Management; Orchestration and Provisioning; Event Correlation and Automation; Composite Application Management; Resource Monitoring. Security and Storage appear alongside the layers.]
Figure 1-3 IBM Tivoli software portfolio
As shown in Figure 1-3, the Tivoli software portfolio is divided into the following
components:
Resource monitoring
Measures and manages IT resource performance, including servers,
databases, and middleware.
Composite application management
Monitors and manages an application and its components, and understands
applications from the availability standpoint.
Event correlation and automation
Correlates and automates events or faults that are generated by resource
monitoring, application monitoring, or both to provide a concise root-cause
analysis of failure in the environment.
Orchestration and provisioning
Provides the ability to deploy or redeploy servers or components as
requested on demand to fulfill processing requirements, if the necessity
arises as indicated by the correlation engine.
Business service management
Provides a high-level view of business status as reflected by its underlying
monitoring components. The view is either in real time or based on a
service-level agreement.
1.1.2 IBM Tivoli Composite Application Manager solution
The IBM Tivoli Composite Application Manager family resides in the application
management pillar of the Tivoli software portfolio. The current application
management portfolio consists of the following products:
ITCAM for Response Time Tracking V6.1
ITCAM for Response Time V6.2
IBM Tivoli Composite Application Manager for service-oriented architecture
(SOA) V6.1
ITCAM for WebSphere V6.1
ITCAM for J2EE™ V6.1
ITCAM for Web Resources V6.2
ITCAM for CICS Transactions V6.1
ITCAM for IMS Transactions V6.1
IBM Tivoli OMEGAMON® XE for Messaging V6.0
Figure 1-4 shows the scope of composite application management.
[Figure: scope labels: Response Time Tracking, WebSphere performance, CICS/IMS transaction, Web Services calls, WBI messaging.]
Figure 1-4 Composite application management
Manage the overall composite application from the following perspectives:
Get the user side of response time and availability with ITCAM for Response
Time Tracking.
Get IBM WebSphere middleware performance and analyze in-depth resource
usage through ITCAM for WebSphere.
Manage messaging from IBM WebSphere Business Integration MQ Series
using IBM Tivoli OMEGAMON XE for IBM WebSphere Business Integration.
For more details, refer to Implementing IBM Tivoli OMEGAMON XE for
WebSphere Business Integration V1.1, SG24-6768.
Manage message flow in an SOA environment and collect metrics for Web
service calls using IBM Tivoli Composite Application Manager for
service-oriented architecture (SOA).
Provide the integration view with a mainframe-based, back-end application
such as Information Management System (IMS™) or Customer Information
Control System (CICS®) using ITCAM for IMS Transactions or ITCAM for
CICS Transactions.
1.2 Scope of and concerns relating to large-scale
implementation
This paper discusses large-scale implementation of IBM Tivoli Composite
Application Manager. It specifically provides information about the
implementation of ITCAM for WebSphere and ITCAM for Response Time
Tracking in large-scale environments. The discussion is about large-scale
implementation in distributed and mainframe environments, and includes the
following topics:
1.2.1, “Defining large-scale implementation”
1.2.2, “Concerns and considerations”
1.2.1 Defining large-scale implementation
There are several indications relating to large-scale implementation. These
indications are based on the following factors:
The number of application servers to be monitored
Each application server must have an agent installed to be monitored and
managed. With the number of application servers ranging from hundreds to
thousands, additional care must be taken to manage the deployment,
maintenance, and processing of the managing server.
The transaction rates on application servers
Transaction rates contribute to the overhead of the monitoring system, so
data collection must be balanced against system health. A large number of
transactions potentially requires more management server processing
capacity.
The number of network sites
The number of network sites typically corresponds to the potential bottlenecks
between the sites. The bottlenecks may be from production data, monitoring
data, or a security requirement such as a firewall.
The requirement for high availability or failover
This additional requirement, although not directly related to scale, is
typically mandatory for a large-scale implementation.
The existence of multiple managed spaces that a site must handle
Managed space is defined as a group of environments with a single
management database and a set of management server processes. Different
managed spaces are usually used to separate the production and
development environments. They are also used to prepare and test the
changes to the management environment.
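The factors listed above can be gathered into a simple checklist. The following Python sketch is illustrative only: the `Environment` fields mirror the factors in this section, but the threshold values (for example, 100 application servers or 500 transactions per second) are assumptions chosen for the example, not figures from this paper or from the products.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Environment:
    """Hypothetical description of a monitored environment."""
    app_servers: int            # application servers that need an agent
    peak_tx_per_second: float   # transaction rate across application servers
    network_sites: int          # sites separated by WAN links or firewalls
    needs_failover: bool        # high availability or failover requirement
    managed_spaces: int         # for example, production plus test


def large_scale_indicators(env: Environment) -> List[str]:
    """Return the large-scale indicators that apply to env.

    All thresholds below are illustrative assumptions.
    """
    indicators = []
    if env.app_servers >= 100:
        indicators.append("number of application servers")
    if env.peak_tx_per_second >= 500:
        indicators.append("transaction rate")
    if env.network_sites > 1:
        indicators.append("multiple network sites")
    if env.needs_failover:
        indicators.append("high availability or failover")
    if env.managed_spaces > 1:
        indicators.append("multiple managed spaces")
    return indicators


env = Environment(app_servers=450, peak_tx_per_second=120.0,
                  network_sites=3, needs_failover=True, managed_spaces=2)
for indicator in large_scale_indicators(env):
    print(indicator)
```

A checklist like this does not replace sizing. It only records which of the factors are in play so that the planning chapters can be read with the right concerns in mind.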
1.2.2 Concerns and considerations
Following is a list of concerns and considerations that are specific to a
large-scale environment:
Server size
As this is a large-scale implementation, sizing the servers to manage the
environments is critical. The placement, configuration, and specification of a
single server or multiple servers must be predetermined in order to avoid
bottlenecks in processing. This sizing must also take into consideration
special processing requirements such as debugging and troubleshooting and
data collection and recovery.
Deploying agents
The number of agents to deploy is too large for manual installation.
Deployment must be automated so that the agents can be installed and
implemented with minimal manual intervention. The automation must cover
initial deployment, fix pack implementation, and maintenance actions.
Security
This includes confidentiality support and firewall support.
– Confidentiality support secures information transfer between the agents
and the servers.
– Firewall support allows the sites to be secured, with management action
still flowing through in order to effectively manage the environment.
Reliability
Failover and fault tolerance must be maintained while monitoring
business-critical applications. Reliability must be addressed promptly
and ensured.
Maintenance
Changes do happen and, as with deployment, they must be applied to
both the servers and the agents. A large-scale implementation needs
special consideration on both sides. On the server side, the concern is
preserving monitoring and data collection with minimal downtime. On the
agent side, the concern is automating the deployment process with minimal
manual intervention and outage.
This paper deals with and addresses these concerns for ITCAM for Response
Time Tracking and ITCAM for WebSphere implementations.
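As a minimal sketch of the agent deployment and maintenance concern, the following Python example selects the hosts that still need an update and splits them into rollout waves. It is not taken from the ITCAM products: the host names, the version strings, and the helper functions `hosts_needing_update` and `plan_waves` are hypothetical, and a real rollout would call whatever software distribution tool the site uses.

```python
from typing import Dict, List


def hosts_needing_update(installed: Dict[str, str], target: str) -> List[str]:
    """Return hosts whose recorded agent level differs from the target level."""
    return sorted(host for host, level in installed.items() if level != target)


def plan_waves(hosts: List[str], wave_size: int) -> List[List[str]]:
    """Split a host list into fixed-size deployment waves."""
    if wave_size < 1:
        raise ValueError("wave_size must be at least 1")
    return [hosts[i:i + wave_size] for i in range(0, len(hosts), wave_size)]


# Hypothetical inventory: host name -> installed agent level.
inventory = {
    "was01": "6.1.0.0", "was02": "6.1.0.1", "was03": "6.1.0.0",
    "was04": "6.1.0.0", "was05": "6.1.0.1",
}
pending = hosts_needing_update(inventory, target="6.1.0.1")
for number, wave in enumerate(plan_waves(pending, wave_size=2), start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Working in waves limits how many application servers are touched at once, which keeps any outage small and makes a failed update easier to contain.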
1.3 Overview of IBM Tivoli Composite Application
Manager
This section explains the following topics:
1.3.1, “Understanding ITCAM for WebSphere”
1.3.2, “Understanding IBM Tivoli Composite Application Manager for
Response Time Tracking”
1.3.1 Understanding ITCAM for WebSphere
This section provides an overview of ITCAM for WebSphere. The discussion
includes the following topics:
“Features and functions”
“Components”
“Platforms supported”
For more information about ITCAM for WebSphere, visit the following Web site:
http://www.ibm.com/software/tivoli/products/composite-application-mgr-websphere/
Features and functions
ITCAM for WebSphere helps increase the performance and availability of
business-critical applications by providing facilities for real-time problem
detection, analysis, and repair. Correlation spanning Java 2 Platform, Enterprise
Edition (J2EE), Customer Information Control System, and Information
Management System, and diagnostics at the method level pinpoint code
problems to help resolve problems quickly and reduce support and operations
costs.
Today’s business processes often depend on a number of complex applications.
Although most businesses have traditional monitoring tools to manage individual
resources at a high level, many lack an integrated solution to automatically
monitor, analyze, and resolve problems at the service, transaction, application,
and resource levels. As a result, operations and development may take a long
time to identify, isolate, and fix composite application problems.
ITCAM for WebSphere is an application management tool that helps maintain
the availability and performance of on demand applications. It helps you to
quickly pinpoint, in real time, the source of bottlenecks in application code, server
resources, and external system dependencies. This product also provides
detailed reports that you can use to enhance the performance of your
applications. ITCAM for WebSphere provides in-depth, WebSphere-based
application performance analysis and a tracing facility.
ITCAM for WebSphere enables multiple levels of analysis to get a complete view
of the application, depending on the requirement. From production-level
monitoring to detailed heap and method debugging, it digs into Structured Query
Language (SQL) performance analysis without the need for database monitors. It
provides SQL information and information about calls that were made through
Java Database Connectivity (JDBC™). ITCAM for WebSphere provides a
composite status correlation for transactions that use Customer Information
Control System and Information Management System as the back end.
Components
ITCAM for WebSphere contains the following components:
Managing server
This acts as the central component that manages and administers the data
collectors. It stores that data in a relational database repository. A Web-based
application is provided to show monitoring results. This interface is also called
the visualization engine.
Data collector
This runs on the application server and collects performance information for
the managing server.
Tivoli Enterprise Monitoring Agent
This collects information that shows the status of the WebSphere Application
Server and sends it to the Tivoli Enterprise Monitoring Server for display on
the Tivoli Enterprise Portal. The Tivoli Enterprise Monitoring Agent is installed
on individual machines where data collectors reside. This component was
moved to IBM Tivoli Composite Application Manager for Web Resources in
Version 6.2.
Platforms supported
For a complete platform coverage list, refer to the following Web site:
http://publib.boulder.ibm.com/tividd/td/ITCAMWAS/prereq60/en_US/HTML/itcam6.html
Table 1-1 provides an overview of the platforms supported for ITCAM for
WebSphere V6.
Table 1-1 Platforms supported for ITCAM for WebSphere
Component                             Software
Managing server operating system      IBM AIX V5.2 and V5.3
                                      Solaris™ 8 and Solaris 9 (SPARC)
                                      Hewlett-Packard UNIX® (HP-UX) 11i 1
                                      Windows® 2000 Server or Advanced Server with Service Pack 4 (SP4)
                                      Windows 2003 Server Standard Edition/Enterprise Edition (SE/EE)
                                      Red Hat Enterprise Linux® (RHEL) 3.0 and 4.0
                                      SUSE Linux Enterprise Server (SLES) 8 and 9
Managing server database              IBM DB2® V8.1 Fix Pack 6 (FP6) or IBM DB2 V8.2
                                      Oracle® 8i SE R3 8.1.7, Oracle 9i SE R2 9.2, Oracle 10g
Managing server WebSphere             WebSphere Application Server V5.1.x or WebSphere Application Server V6.x
Data collector platform               AIX V5.2 and V5.3
                                      Solaris 8 and 9 SPARC
                                      HP-UX 11i 1
                                      Windows 2000 Server or Advanced Server with SP4
                                      Windows 2003 Server SE/EE
                                      RHEL 3.0 and 4.0
                                      SLES 8 and 9
                                      Red Flag Advanced Server (RFAS) 4.0 and 4.1 (xLinux)
                                      IBM Operating System/400® (OS/400®) V5.2 and V5.3
                                      IBM z/OS V1.4, V1.5, or V1.6
Customer Information Control System   V2.2, V2.3, and V3.1
Information Management System         V7.1, V8.1, and V9.1
1.3.2 Understanding IBM Tivoli Composite Application Manager
for Response Time Tracking
This section provides an overview of ITCAM for Response Time Tracking. It
discusses the following topics:
“Features and functions”
“Components”
“Platforms supported”
For more information about ITCAM for Response Time Tracking, visit the
following Web site:
http://www.ibm.com/software/tivoli/products/composite-application-mgr-rtt/
Features and functions
ITCAM for Response Time Tracking proactively recognizes, isolates, and
resolves transaction performance problems by using robotic and real-time
techniques. It is an end-to-end transaction management solution that monitors
user response time and helps you to visualize the transaction’s path through your
application systems, including the response time contributions of each step.
ITCAM for Response Time Tracking uses Application Response Measurement
(ARM) technology to track the response time of a distributed application.
Today’s business processes often depend on composite applications that span
Web servers, J2EE application servers, integration middleware, and mainframe
systems. Although most businesses have traditional monitoring tools to manage
individual resources, many lack an integrated solution to automatically monitor,
analyze, and resolve user response time problems. As a result, it may take a
long time to identify, isolate, and fix distributed transaction performance
problems.
ITCAM for Response Time Tracking enables you to follow the path of a user
transaction end-to-end across your business infrastructure. You can drill down to
each step the transaction takes as it travels across multiple systems, and
measure how each component of a transaction contributes to the overall
response time. The entire transaction analysis process is transparent to
customers and application developers. It collects transaction performance
through robot and browser simulation, in-depth J2EE server instrumentation, and
feedback from Customer Information Control System and Information
Management System.
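To make the idea of per-component contribution concrete, the following Python sketch computes each step's share of the overall response time. It is a simplified illustration, not the ARM API or the product's algorithm: the step names and elapsed times are invented, and the calculation assumes that the steps run one after another without overlapping.

```python
from typing import Dict


def contributions(step_ms: Dict[str, int]) -> Dict[str, float]:
    """Return each step's share of the total response time, in percent."""
    total = sum(step_ms.values())
    if total == 0:
        return {name: 0.0 for name in step_ms}
    return {name: round(100.0 * ms / total, 1) for name, ms in step_ms.items()}


# Hypothetical elapsed times in milliseconds for one user transaction.
# Assumption: the steps are sequential and do not overlap.
steps = {
    "web server": 40,
    "J2EE application server": 110,
    "messaging": 50,
    "CICS back end": 200,
}
shares = contributions(steps)
slowest = max(shares, key=shares.get)
for name, percent in shares.items():
    print(f"{name}: {percent}%")
print(f"largest contributor: {slowest}")
```

In this invented example the back-end step accounts for half of the response time, which is the kind of finding that tells an operator where to drill down first.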
ITCAM for Response Time Tracking feeds the Tivoli Enterprise Monitoring
Server to provide a comprehensive performance management solution on Tivoli
Enterprise Portal. This enables the development of custom monitoring
workspaces for managing enterprise applications.
Components
ITCAM for Response Time Tracking consists of the following components:
Management server
This acts as the central point of contact for ITCAM for Response Time
Tracking. It consists of a WebSphere-based J2EE application that performs
the management and administrative functions. The management server
stores data in a central database repository.
Store and Forward Agent
This relays traffic to and from the management agents. Typically, the Store
and Forward agent is used in a firewall environment. It consolidates the port
requirements for the connectivity.
Management agent
This performs the monitoring function. Typically, it investigates the
performance of the distributed application, depending on the management
components deployed on it. The components that you can deploy are:
– Generic Windows workstation
This allows deployment of IBM Rational® Robot to measure transaction
performance.
– Client Application Tracker
This uses IBM ETEWatch® scripts to collect performance information.
Default monitoring is available for measuring IBM Lotus® Notes® and
Microsoft® Outlook® performance.
– Synthetic Transaction Investigator (STI)
This performs Web-based transactions and measures the resulting
response time.
– Quality of Service monitoring agent
This collects information about user performance by acting as reverse
proxy between the user and the Web server.
– Java™ 2 Platform, Enterprise Edition (J2EE) monitoring agent
This instruments and collects performance information about J2EE-based
application servers such as WebSphere or WebLogic.
– Web Response Monitor component
– Rational Performance Tester
– Tomcat and JBoss monitoring component
– Generic Application Response Measurement (ARM) agent
This collects ARM events from a custom-instrumented application.
Tivoli Enterprise Monitoring Agent for Tivoli Enterprise Monitoring Server
This feeds data from the ITCAM for Response Time Tracking server to
display on the Tivoli Enterprise Portal.
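The relay role of the Store and Forward agent described above can be pictured with a small queue-based sketch. This is a conceptual illustration only and does not reflect the product's implementation or protocol: the class `StoreAndForwardRelay`, the `send` callback, and the behavior of holding messages while the upstream link is down are all assumptions made for the example.

```python
from collections import deque
from typing import Callable, Deque, List


class StoreAndForwardRelay:
    """Hypothetical relay: queue agent messages, forward them over one upstream link."""

    def __init__(self, send_upstream: Callable[[str], bool]) -> None:
        # send_upstream returns True when the server accepted the message.
        self._send_upstream = send_upstream
        self._pending: Deque[str] = deque()

    def accept(self, message: str) -> None:
        """Store a message from an agent, then try to forward the backlog."""
        self._pending.append(message)
        self.flush()

    def flush(self) -> int:
        """Forward queued messages in order; stop at the first failure."""
        sent = 0
        while self._pending and self._send_upstream(self._pending[0]):
            self._pending.popleft()
            sent += 1
        return sent

    def backlog(self) -> int:
        """Return the number of messages still waiting to be forwarded."""
        return len(self._pending)


delivered: List[str] = []
link = {"up": False}  # simulated state of the single upstream connection


def send(message: str) -> bool:
    """Simulated upstream send: succeeds only while the link is up."""
    if link["up"]:
        delivered.append(message)
    return link["up"]


relay = StoreAndForwardRelay(send)
relay.accept("agent-a: sample 1")
relay.accept("agent-b: sample 1")
print(relay.backlog())   # both messages are held while the link is down
link["up"] = True
relay.flush()
print(delivered)
```

Because every agent talks to the relay and only the relay talks upstream, a firewall between sites needs to allow just that one connection, which is the port consolidation the component list refers to.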
Platforms supported
For a complete platform coverage list, visit the following Web site:
http://publib.boulder.ibm.com/tividd/td/ITCAMRTT/prereq60/en_US/HTML/Version60.html
Table 1-2 provides an overview of the platforms supported for ITCAM for
Response Time Tracking V6.0.
Table 1-2 Platforms supported for ITCAM for Response Time Tracking
Component                             Software level
Management server operating system    Microsoft Windows 2000 Server with SP4
                                      Windows 2000 Advanced with SP4
                                      Windows 2003 Server SE or EE
                                      IBM AIX V5.2 or V5.3
                                      Solaris 9 or 10
                                      HP-UX 11i 1
                                      RHEL 3.0 or 4.0
                                      SLES 8 or 9
Management server database            Oracle 9i SE 9.2
                                      IBM DB2 V8.1 ESE with FP3+ (required for WebSphere Application Server V5.1.x)
                                      IBM DB2 V8.1 ESE with FP6a+ (required for WebSphere Application Server V6.x)
                                      IBM DB2 V8.2
Management server WebSphere           WebSphere Application Server V5.1.x or later versions
                                      WebSphere Application Server V6.0.1.x or later versions
Management agent platform             Windows 2000 Professional, Server or Advanced Server with SP4
                                      Windows 2003 Server SE or EE
                                      Windows XP Professional with SP1
                                      IBM AIX V5.2 or V5.3
                                      Solaris 9 or 10
                                      HP-UX 11i
                                      RHEL 3.0 or 4.0
                                      SLES 8 or 9
                                      RFAS 4.0 or 4.1 (xLinux)
                                      z/OS V1.4, V1.5, or V1.6
                                      OS/400 V5.2 or V5.3
1.4 Document organization
This paper discusses the following topics:
Before the implementation
Chapter 2, “Planning for ITCAM for WebSphere”, and Chapter 5,
“Planning for ITCAM for Response Time Tracking”, discuss the
planning and sizing considerations.
The implementation
Chapter 3, “Installing ITCAM for WebSphere”, and Chapter 6,
“Installing ITCAM for Response Time Tracking”, discuss
additional steps that are required, such as reliability and automation
considerations.
After the implementation
Chapter 4, “Maintenance of ITCAM for WebSphere”, and
Chapter 7, “Maintenance of ITCAM for Response Time Tracking”,
discuss maintenance considerations and operational concerns
relating to a large-scale implementation.
2.1 Planning considerations
This section discusses the following aspects pertaining to large-scale
implementations (see also 1.2.2, “Concerns and considerations”):
Understanding the product architecture
This allows you to make the correct decisions. Section 2.2, “Product
architecture of ITCAM for WebSphere”, describes the architecture
for ITCAM for WebSphere.
Sizing the servers
This is important to correctly acquire adequate servers and choose a sound
software configuration option. Section 2.3, “Deciding on the size of the
servers”, describes one approach.
Understanding the servers’ configuration options and agent deployment
This is discussed for ITCAM for WebSphere in 2.4, “Implementation options
for ITCAM for WebSphere”.
Planning for communication security
This is a mandatory step for an enterprise with business-critical and sensitive
information in a transaction environment. Section 2.5, “Communication and
security considerations”, discusses confidentiality and firewall
requirements.
Discussing reliability, failover, and disaster recovery issues
These are the other mandatory aspects pertaining to a critical business
process in a large enterprise. Section 2.6, “Reliability and high availability”,
discusses this.
2.2 Product architecture of ITCAM for WebSphere
This section discusses the product architecture of ITCAM for WebSphere. This
understanding is critical to plan and decide about the server configuration and
other implementation issues. See also IBM Tivoli Composite Application
Manager V6.1 Family Installation, Configuration, and Basic Usage, SG24-7151.
ITCAM for WebSphere V6.0 evolved from WebSphere Studio Application
Monitor (WSAM) and IBM Tivoli OMEGAMON XE for WebSphere. ITCAM for
WebSphere observes and reports on the health of Java 2 Platform, Enterprise
Edition-based applications. It tracks the progress of applications as they traverse
through Java 2 Platform, Enterprise Edition (J2EE) application servers,
middleware adapters and transports, and database calls, to back-end systems
such as Customer Information Control System (CICS) or Information
Management System (IMS) to extract business data or to invoke mainframe
business processes.
The tracking of applications produces request traces, where the events in a
request’s life are recorded and stored in a monitoring repository database.
ITCAM for WebSphere captures the CPU time and the elapsed time when events are
entered and exited, measuring as far down as the CPU consumed and the elapsed
time charged to individual methods in J2EE classes. The methods or events that
take the most time are flagged as the parts of an application that deserve
attention in runtime improvement studies and code optimization.
ITCAM for WebSphere does not require modification of any J2EE or mainframe
application code. Java Virtual Machine Tool Interface (JVMTI) interfaces and
primitives, along with WebSphere Performance Management Interface (PMI) and
z/OS System Management Facilities (SMF) type 120 records, are ITCAM for
WebSphere’s principal data sources. The monitoring data is collected and
analyzed to offer a wealth of information about the health of J2EE applications
and their servers.
Many system-level performance metrics are collected and reported about J2EE
application servers. The status of the servers and their resources, particularly at
vital checkpoints such as CPU utilization, memory usage, and the status of
internal components such as database connection pools, Java Virtual Machine
(JVM™) thread pools, Enterprise JavaBeans™ (EJB) usage, and request
processing statistics, are very important in locating real-time problems with J2EE
applications. ITCAM for WebSphere brings attention to these critical indicators
with real-time, graphical displays of their values and their trends over a span of
time.
ITCAM for WebSphere is a distributed performance monitoring application for
application servers. Its components are connected through IP network
communication. The central component of ITCAM for WebSphere, the managing
server, is its heart and brain. It collects and displays various performance
information from application servers.
The application servers run a component of ITCAM for WebSphere called the data
collector, a collecting agent that runs in the application server and sends
monitoring information to the managing server. The data collectors operate
independently of each other.
Chapter 2. Planning for ITCAM for WebSphere 19
Figure 2-1 shows the overall architecture of ITCAM for WebSphere.
[Figure components: browser interface; ITCAM for WebSphere managing server;
Web server; application servers with ITCAM for WebSphere data collectors;
Tivoli Enterprise Monitoring Server and Tivoli Enterprise Portal Server]
Figure 2-1 ITCAM for WebSphere architecture
The application monitor comprises the following main parts:
Managing server
A managing server comprises several Java-based components that provide
the environment to collect and present management data.
Data collector agent
A data collector agent runs on each monitored application server, whether
J2EE, Customer Information Control System (CICS), or Information
Management System (IMS), and communicates essential operational data to
the managing server. Unique sampling algorithms maintain low CPU and
network overhead, while providing application-specific performance
information.
2.3 Deciding on the size of the servers
The scale of the implementation determines the size of the servers to be used.
Sizing determines the hardware configuration and the implementation
considerations for the servers. This section discusses the following topics:
2.3.1, “Sizing parameters” on page 21
2.3.2, “Sizing estimation for ITCAM for WebSphere managing server” on
page 22
2.3.1 Sizing parameters
The following parameters must be considered before deciding on the size of the
servers:
The number of data collectors for ITCAM for WebSphere
This value assumes that the application servers run a similar load profile. If
the application servers have several load profiles, consider them in different
groups.
The transaction rate for application servers
The number of transactions executed per minute, multiplied by the number of
data collectors or monitoring agents, gives the total amount of transaction
information captured for a given period.
The complexity of a transaction
Transaction complexity is harder to quantify than the transaction rate, which
can be retrieved from the transaction data or the application log, and it
requires a more subjective assessment. The relative complexity of
transactions is determined by the number of method calls per transaction.
Typically, a complex transaction invokes around four to six times as many
methods as a simple transaction.
There are some product-specific parameters that affect sizing considerations.
These parameters are built to filter out unimportant or insignificant information
from the data that is collected. These parameters are:
– Data collection filter
– Sampling rate
– Monitoring level
– Listening policy mask
– Instrumentation level
2.3.2 Sizing estimation for ITCAM for WebSphere managing server
Specific to ITCAM for WebSphere, consider the following parameters for sizing:
Communication bandwidth
Memory size
Processing requirement
Database size
Important: Estimate the size of the ITCAM for WebSphere managing server for a
worst-case scenario, that is, with level 3 monitoring running on the highest
number of data collectors concurrently.
Communication bandwidth
Several communication traffic flows exist between the managing server and the
data collector. The communication traffic flows are:
Initial communication with the kernel to collect configuration information
This only happens in the initial connection when the data collector is started.
This configuration information consists of sending the configuration and
managing server Java archives.
Management information to modify data collection level, sampling interval, or
logging level from the kernel
This happens by request or when scheduled by Monitoring On Demand®.
The size of this communication is small and negligible.
Visualization engine requests for current active transactions
The impact of these requests depends on the following factors:
– The transaction rate and the average transaction response time that make
up the average number of in-flight transactions
– The number of concurrent Web console users who may request the
in-flight transaction information
Transaction information is streamed to the publish server as it happens
This is the largest contributor to network load. With the transaction rate
expressed in transactions per second, the bandwidth in bytes per second is:
– Monitoring in level 1: transaction rate x 353
– Monitoring in level 3: (transaction rate x 353) + (transaction rate x method
calls per transaction x 172)
As an illustration, we use a sample environment with a transaction rate of
3,000 requests per minute on level 1 and 300 requests per minute on level 3
monitoring. The average number of method calls is 500 per request. The
transaction bandwidth required is:
– Level 1 transaction load: (3,000 transactions / 60 seconds) x 353 = 17,650
bytes/sec
– Level 3 transaction load: (300 transactions / 60 seconds) x 353 + (300 / 60) x
500 x 172 = 431,765 bytes/sec
As this example shows, the majority of network usage is spent on level 3
analysis. In a real production environment, ITCAM for WebSphere runs on
level 1 for the majority of the time, so the communication requirement is
low. However, size the installation so that monitoring can occasionally be
raised to level 3 for problem determination purposes.
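The publish traffic formulas above can be expressed as a short calculation. The following is a minimal sketch, not part of the product: the function names are illustrative, and the constants are the 353-byte and 172-byte record sizes quoted in the formulas. Rates are passed in requests per minute and converted to a per-second rate, as in the illustration.

```python
# Record sizes quoted in the bandwidth formulas in the text.
REQUEST_BYTES = 353  # bytes streamed per request
METHOD_BYTES = 172   # bytes streamed per method call (level 3 only)


def level1_bandwidth(requests_per_minute: float) -> float:
    """Publish traffic in bytes per second at monitoring level 1."""
    return requests_per_minute / 60 * REQUEST_BYTES


def level3_bandwidth(requests_per_minute: float,
                     methods_per_request: float) -> float:
    """Publish traffic in bytes per second at monitoring level 3."""
    per_second = requests_per_minute / 60
    request_part = per_second * REQUEST_BYTES
    method_part = per_second * methods_per_request * METHOD_BYTES
    return request_part + method_part


if __name__ == "__main__":
    # Inputs from the sample environment described in the text.
    print(f"level 1: {level1_bandwidth(3000):,.0f} bytes/sec")
    print(f"level 3: {level3_bandwidth(300, 500):,.0f} bytes/sec")
```

Because the method-call term is multiplied by the number of methods per request, it dominates the level 3 result even at a much lower request rate.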
Memory size
Memory requirement is typically important for the following components:
Kernel
The memory size of the kernel is directly related to the number of data
collectors. The typical setting of 64 MB in the setenv.sh file may have to be
increased when there are more than 50 data collectors.
Publish server
The memory size is related to the number of transactions the publish server
has to process, with some consideration to the transaction complexity factor,
that is, the number of methods invoked. The publish server’s memory must be
adequate to handle the data size between garbage collector intervals. For
garbage collection per minute, you must accommodate a minute’s worth of
data. In the example provided in “Communication bandwidth” on page 22, the
total size of publish server memory for processing the load must be around
4.3 x 60 x (1.5) = 387 MB. Note that the base publish server was already
using around 100 MB of storage.
Archive agent
This requires a fraction of the memory used by the publish server, in
proportion to the sampling percentage applied by the publish server. The
archive agent uses somewhat more memory than the sampling percentage alone
suggests, because it performs Java Database Connectivity (JDBC) database calls.
Visualization engine memory size
This depends on the number of users who are connected and the activities
that they perform. Users are categorized into the following groups:
– Users monitoring the availability screens
– Users collecting performance reports
– Users monitoring in-flight threads
Modify the visualization engine’s memory size by using the WebSphere
Application Server administration console.
Memory sizes for ITCAM for WebSphere components are defined in the
setenv.sh file that is sourced by all overseer components.
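The publish server estimate quoted above (4.3 x 60 x 1.5) can be written as a small helper. This is a sketch under stated assumptions: the 4.3 figure is treated as an incoming data rate in MB per second, 60 as the garbage collection interval in seconds, and 1.5 as a headroom multiplier. The text gives the 1.5 factor without explaining it, so the parameter name is hypothetical. Whether the roughly 100 MB base footprint is added on top is not stated, so it is an optional parameter that defaults to zero.

```python
def publish_server_memory_mb(data_rate_mb_per_sec: float,
                             gc_interval_sec: float = 60.0,
                             headroom: float = 1.5,
                             base_mb: float = 0.0) -> float:
    """Estimate publish server memory in MB.

    The server must hold the data that arrives between two garbage
    collections; `headroom` (an assumed name for the 1.5 multiplier
    used in the text) scales that amount.
    """
    return data_rate_mb_per_sec * gc_interval_sec * headroom + base_mb


if __name__ == "__main__":
    # Values from the worked figure in the text: 4.3 x 60 x 1.5.
    print(f"{publish_server_memory_mb(4.3):.0f} MB")
```

Shortening the garbage collection interval reduces the estimate proportionally, at the cost of more frequent collections.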
Processing requirement
The processor requirement for ITCAM for WebSphere is directly related to the
transaction rate. The largest processor usage is for the following components:
Publish server: to process transaction data
Database engine: for interface to the database
Archive agent: to perform SQL calls
WebSphere Application Server: to process user requests
Database size
The typical database size requirement depends on:
The number of application server statistics
The transaction volume to be stored
The complexity of transaction
The duration to keep the data
Database table information that increases in size during ITCAM for WebSphere
execution is:
requests: number of requests x 353 bytes
methods: number of methods per request x number of requests in level 3 x 172 bytes
pmidata: number of data collectors x (3600/polling interval) x 73 bytes
serverstats: number of data collectors x (3600/polling interval) x 107 bytes
volumestats: number of data collectors x (3600/polling interval) x 74 bytes
memorydata: number of data collectors x (3600/polling interval) x 115 bytes
gcdata: number of data collectors x (3600/garbage collection interval) x 104
bytes
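The table growth formulas above can be combined into one estimate. The sketch below is illustrative only: the function and parameter names are assumptions, the row sizes are the byte counts listed above, and the factor 3600/interval is read as the number of samples per hour, so request counts are also passed per hour to keep the units consistent. The "methods" term assumes an average number of methods per level 3 request.

```python
def hourly_growth_bytes(data_collectors: int,
                        polling_interval_sec: float,
                        gc_interval_sec: float,
                        requests_per_hour: float,
                        l3_requests_per_hour: float,
                        methods_per_request: float) -> dict:
    """Estimated growth per hour, in bytes, for each repository table."""
    polls = 3600 / polling_interval_sec   # polling samples per hour
    collections = 3600 / gc_interval_sec  # garbage collections per hour
    return {
        "requests": requests_per_hour * 353,
        "methods": methods_per_request * l3_requests_per_hour * 172,
        "pmidata": data_collectors * polls * 73,
        "serverstats": data_collectors * polls * 107,
        "volumestats": data_collectors * polls * 74,
        "memorydata": data_collectors * polls * 115,
        "gcdata": data_collectors * collections * 104,
    }


if __name__ == "__main__":
    # Hypothetical environment: 100 data collectors polled every 60 seconds.
    growth = hourly_growth_bytes(100, 60, 60, 180_000, 18_000, 500)
    for table, size in growth.items():
        print(f"{table:12s} {size:>16,.0f} bytes/hour")
    print(f"{'total':12s} {sum(growth.values()):>16,.0f} bytes/hour")
```

Multiplying the hourly total by the retention period in hours gives a first estimate of the database size for the duration that the data is kept.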
2.3.3 Data collector overhead
Monitoring with ITCAM for WebSphere has overhead related to data collectors
running on a production WebSphere Application Server. The overhead is
minimal for data collectors running on level 1 monitoring. This is typically around
a 2–3% increase of CPU time with no notable memory or disk input/output (I/O)
requirement.
When the monitoring level is increased, the processing overhead of ITCAM for
WebSphere data collectors also increases, because ITCAM for WebSphere collects
more data from more sources. Level 2 monitoring typically generates around a
10% increase in processing usage, while level 3 monitoring generates around
25–30% overhead.
This means that level 2 or level 3 monitoring must be used sparingly in your
production environment. To change the monitoring level for purposes of problem
determination, schedule it to start and then step back to level 1 automatically in
order to reduce the impact on users.
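To see what these overhead figures mean for a given server, the following sketch projects CPU utilization at each monitoring level. It assumes that the percentages are relative increases in CPU time over the unmonitored baseline, which is an interpretation of the text. The function name and the baseline value are hypothetical.

```python
# Overhead ranges quoted in the text, as (low, high) fractions of baseline CPU.
OVERHEAD_RANGE = {
    1: (0.02, 0.03),  # level 1: around 2-3%
    2: (0.10, 0.10),  # level 2: around 10%
    3: (0.25, 0.30),  # level 3: around 25-30%
}


def projected_cpu(baseline_cpu_pct: float, level: int) -> tuple:
    """Return the (low, high) projected CPU utilization in percent."""
    low, high = OVERHEAD_RANGE[level]
    return baseline_cpu_pct * (1 + low), baseline_cpu_pct * (1 + high)


if __name__ == "__main__":
    baseline = 40.0  # hypothetical CPU utilization without monitoring
    for level in (1, 2, 3):
        low, high = projected_cpu(baseline, level)
        print(f"level {level}: {low:.1f}% to {high:.1f}%")
```

A server that is already heavily loaded leaves little room for the level 3 range, which is one more reason to schedule the automatic return to level 1.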
2.4 Implementation options for ITCAM for WebSphere
Depending on the size of your implementation, there are some considerations for
implementing ITCAM for WebSphere. This section discusses the following
topics:
2.4.1, “Designing the managing server” on page 25
2.4.2, “Deploying a large number of data collectors” on page 28
2.4.1 Designing the managing server
The ITCAM for WebSphere managing server consists of the following products:
IBM DB2 Universal Database™ Enterprise Server or Oracle database server
WebSphere Application Server
ITCAM for WebSphere managing server
Figure 2-2 shows the conceptual relationship between the components.
[Figure components: Kernel (KL), providing lookup, registration, recovery,
and configuration services; Publish Server (PS); Global Publish Server (SAM);
Archive Agent (AA); Message Dispatcher (MD); Polling Agent (PA);
Visualization Engine, providing administration, availability, problem
determination, and performance management services; OCTIGATE database.
Flows shown: snapshot traffic and publish traffic.]
Figure 2-2 ITCAM for WebSphere components
The following ITCAM for WebSphere components are displayed in Figure 2-2:
Kernels
These control the managing server. There are always two copies of kernels
running on an ITCAM for WebSphere managing server for redundancy and
failover. The kernels register components as they join the managing server,
periodically renew connections and registrations with components and data
collectors, and collect server and component availability information.
Publish servers
These receive application and system event data from the data collectors,
gather and compute request-level information about performance metrics
such as response times, and implement the trap monitoring and alerts
features.
Archive agents
These receive monitoring data from the publish servers and store the
monitoring data in ITCAM for WebSphere’s repository.
Global publish server
This collects information from the publish servers and correlates all parts and
pieces of multi-server requests, such as requests from J2EE servers to
execute Customer Information Control System (CICS) or Information
Management System (IMS) programs.
Message dispatcher
This is a conduit for messages from ITCAM for WebSphere using e-mail and
Simple Network Management Protocol (SNMP) facilities.
Polling agent
This collects data from Web servers for Apache 2.0 and later versions.
Visualization engine
This is a Web-based graphical user interface (GUI) with access to graphics,
ITCAM for WebSphere performance reports, real-time views of different slices
of monitoring data, ITCAM for WebSphere internal commands, and
event-driven functions. The visualization engine runs on a J2EE server such
as WebSphere Application Server.
Although ITCAM for WebSphere provides the facility to install all the components
in a single wizard, which is called embedded installation, individually installing
each component allows more flexibility in terms of verifying each component and
configuring them to suit your requirements. The considerations that you must
keep in mind when installing the components are:
Database
You can install the database locally on the managing server or on a separate
database server. ITCAM for WebSphere provides database configuration
scripts to assist with the configuration of a remote database.
Utilizing a remote database, regardless of whether it is a DB2 Universal
Database or an Oracle database, relieves the processing load on the
managing server. An environment with hundreds of data collectors generates
a large amount of data flowing into the database. This amount increases
considerably if the data collectors are set to run monitoring at level 2 or level
3, even for a short period of time.
A remote database allows database query processing and recording to be
processed using dedicated hardware, instead of sharing with the main
managing server that is already busy with processing the transaction
information.
WebSphere Application Server
The visualization engine of the managing server acts as the administration
console for ITCAM for WebSphere. The visualization engine is deployed on a
WebSphere Application Server JVM that resides in a standalone application
server or an application server that is a part of a network deployment
environment.
We recommend that you install the visualization engine on a separate
application server JVM that is not monitored by ITCAM for WebSphere data
collectors, especially in a network deployment environment. This reduces any
possible conflicts that may arise with respect to ITCAM for WebSphere. In
addition, if an issue does arise, problem determination will be somewhat
easier due to the separation.
ITCAM for WebSphere components
Configure the managing server to handle large amounts of data by adding
components, such as publish servers and archive agents. When publish servers
and archive agents are added, the managing server distributes the data among
them, and the data is written to the database more efficiently.
Another major consideration for the managing server is the split server
installation. With this option, the overseer processes of the managing
server, including the kernel, run on separate machines, which provides load
balancing and failover capabilities.
There are benefits to this type of configuration when there are hundreds of
data collectors providing data to the managing server. This type of setup not
only allows the managing server to handle more memory and disk space
usage, but also provides a failover capability. For more information about split
server installation, refer to 3.1, “Installing ITCAM for WebSphere managing
server” on page 34.
2.4.2 Deploying a large number of data collectors
A small number of ITCAM for WebSphere data collectors can be installed by
using the graphical installation and configuration wizard provided by the
product. When the task is to deploy hundreds of data collectors into an
environment, the graphical interface is no longer a good option. A
non-interactive, automated installation method, commonly known as silent
installation, is used instead.
Silent installation provides a means to deploy a large number of data
collectors efficiently. Information about the WebSphere environment must be
known ahead of time, because the installation reads it from a response file.
Incorrect information in the response file can result in a failed
installation.
When performing the silent installation of data collectors, the WebSphere
Application Server version must be taken into account, as V6 introduced the
usage of profiles. The response files for silent installation are different for various
versions of the WebSphere Application Server. In some cases, when two
versions of WebSphere Application Server are present, it is better to have two
separate master response files.
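One way to manage separate master response files is to keep one template per WebSphere Application Server version and fill it in for each target server. The sketch below illustrates only the idea. The keys in the templates (WAS_HOME, PROFILE_NAME, SERVER_NAME) are hypothetical placeholders, not actual ITCAM response file options, and the function name is an assumption.

```python
from string import Template

# Hypothetical master templates, one per WebSphere major version.
# The keys are placeholders for illustration, NOT real ITCAM options.
MASTER_TEMPLATES = {
    5: Template("WAS_HOME=$was_home\nSERVER_NAME=$server\n"),
    # V6 introduced profiles, so its template carries an extra value.
    6: Template("WAS_HOME=$was_home\nPROFILE_NAME=$profile\nSERVER_NAME=$server\n"),
}


def render_response_file(target: dict) -> str:
    """Fill in the master template that matches the target's version.

    Template.substitute() raises KeyError when a value is missing, so
    incomplete information is caught before any installation starts.
    """
    template = MASTER_TEMPLATES[target["was_major_version"]]
    return template.substitute(target)


if __name__ == "__main__":
    targets = [
        {"was_major_version": 5, "was_home": "/opt/WebSphere/AppServer",
         "server": "server1"},
        {"was_major_version": 6, "was_home": "/opt/IBM/WebSphere/AppServer",
         "profile": "AppSrv01", "server": "server1"},
    ]
    for target in targets:
        print(render_response_file(target))
```

Failing on a missing value at rendering time is deliberate: it surfaces incorrect or incomplete environment information before it can cause a failed installation on the target machine.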
Installing the Tivoli Enterprise Monitoring Agent (TEMA) is also an option of the
silent installation. Although Tivoli Enterprise Monitoring Agent can be installed
using silent installation, more configuration must be performed to connect to IBM
Tivoli Monitoring V6.1 Tivoli Enterprise Monitoring Server.
Mass automated installation is also possible by using a software distribution or
provisioning solution such as the IBM Tivoli Configuration Manager.
There is an additional consideration for deploying data collectors on a machine
that has multiple application servers installed. Consider installing a separate data
collector directory set for each application server, because applying a fix pack for
data collectors requires you to stop the application server. You have a more
flexible scheduling option with separate data collector installation for each
application server.
2.5 Communication and security considerations
Communication and security issues are vital to the inter-networked world that we
live in. Applications and their management infrastructure must be secured in
order to protect resources from unauthorized sources. This section discusses the
following planning considerations:
2.5.1, “Communication security” on page 29
2.5.2, “Firewall and port consideration” on page 30
2.5.1 Communication security
Communication security relates to the confidentiality of the information
transmitted over a network. Management information that is used by IBM Tivoli
Composite Application Manager products may contain details about application
processing internals. This requires the content of the management information to
be secured from being accessed by unauthorized sources.
WebSphere security
WebSphere security plays a significant role in a large-scale implementation. In
some cases, WebSphere security is not enabled during the test phase of an
implementation but is enabled in the production environment. This requires
certain additional considerations. For instance, the WebSphere user must have
the appropriate permissions to issue a wsadmin command.
The configuration of data collectors involves the use of Java Command
Language (JACL) scripts, and can fail when there is a permission problem.
If any of the application servers on which the data collector is installed has
WebSphere security enabled on it, the entire ITCAM for WebSphere
environment must have it enabled as well. This includes WebSphere security
being enabled on the ITCAM for WebSphere managing server.
Secure Sockets Layer communication
Secure communication between the managing server and the data collector is a
viable option if data must be encrypted during transmission. Secure Sockets
Layer (SSL) provides secure data transmission from the data collector to the
managing server and can be used to satisfy corporate security requirements.
Enabling SSL requires additional configuration on both the managing server
and the data collector. A certificate key generator is included with the
product. This key generator provides the facility to use custom-generated
keys.
A best practice is to complete the default installation of the managing server and
the data collector and then enable SSL for both. This isolates problems (that is,
whether the problem is caused by the basic installation or the SSL configuration).
2.5.2 Firewall and port consideration
Firewall and port issues arise when the data collectors are on a different site,
location, or subnet from the managing server. Problems such as name resolution
occur if the Domain Name System (DNS) is not set up correctly on either the
managing server or the data collectors. Routing problems occur if the Internet
Protocol (IP) addresses used belong to different subnets. The entire network
environment must be looked into in order to determine where a firewall, router, or
bridge may exist.
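Name resolution and port reachability can be checked from a data collector machine before installation. The following is a minimal sketch that uses only the Python standard library. The host name and port number in the usage section are placeholders, not product defaults; take the actual values from your managing server configuration.

```python
import socket


def resolves(hostname: str) -> bool:
    """Return True if the host name can be resolved to an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    A refused connection, a timeout, or a routing failure returns False,
    which is what a blocking firewall typically looks like from here.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    managing_server = "ms.example.com"  # placeholder host name
    kernel_port = 9122                  # placeholder port number
    print("DNS ok:", resolves(managing_server))
    print("port ok:", port_reachable(managing_server, kernel_port))
```

Run the same check in the opposite direction, from the managing server toward the data collector, because both sides require open ports.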
Figure 2-3 shows the communication port requirement for ITCAM for
WebSphere.
[Figure components: data collectors (DC), each with a command agent and an
event agent; a port consolidator; kernels KL1 and KL2; publish servers PS1
and PS2]
Figure 2-3 Communication port requirements
The managing server requires open ports for each kernel and publish server.
The data collector requires open ports for the command agent and the event
agent. The port consolidator requires a port to communicate to the managing
server. Use a single port consolidator to consolidate communication from
multiple data collectors.
A port consolidator is useful to limit the number of ports required for
communication between the data collector and the managing server. Port
consolidation is a viable option if there is a limit to the number of ports that can
be opened on the firewall. Additional configuration must be carried out on the
data collector, including the configuration of the data collector to go through the
port consolidator, and starting the port consolidator process.
2.6 Reliability and high availability
This section discusses reliability issues that relate to failover and disaster
recovery.
2.6.1 Failover and fault tolerance
Split server configuration for ITCAM for WebSphere or the clustering server for
ITCAM for Response Time Tracking consists of having two or more management
servers running on separate physical machines. If a hardware or software error
causes the server on one machine to cease functioning, the secondary server
can handle the entire load until the failing machine is recovered.
The switchover to the secondary managing server is not automatic. Manual
intervention must take place for the failover to be successful. Some ITCAM
for WebSphere components, such as the global publish server and the message
dispatcher, can run on only one managing server. They must therefore be
started on the secondary server if the primary server goes down.
2.6.2 Disaster recovery
There are three areas where a backup is necessary for disaster recovery with
respect to the IBM Tivoli Composite Application Manager server:
Database
Back up the database regularly so that the most up-to-date information can be
restored. Use the backup utility provided by the database.
WebSphere Application Server configuration for the server and agent or data
collectors
Back these up so that they can be restored in a disaster recovery scenario.
IBM Tivoli Composite Application Manager servers
Physical servers to host these must be planned for at the disaster recovery
site.