How do you monitor a Linux server? Which metrics matter when monitoring a server? How are metrics related to monitoring tools? What are the basic Linux server optimizations, and how do you apply them?
6. Tuning Introduction
Tuning is
● the process of finding bottlenecks in a system and tuning the operating system to eliminate them.
● about achieving balance between the different subsystems of an OS.
Tuning is not
● a “cookbook” approach.
● a matter of setting a few kernel parameters that will simply solve a problem.
7. Application Type
IO Bound
● An IO bound application requires heavy use of memory and the underlying storage system.
● It processes (in memory) large amounts of data.
● It uses CPU resources to make IO requests and then often goes into a sleep state.
CPU Bound
● CPU bound applications require the CPU for batch processing and/or mathematical calculations.
● Examples: high-volume web servers and any kind of rendering server.
8. Basic command line tools for monitoring
● Network monitoring (ping, tcpdump, netstat, nmap, traceroute, ...)
● Process monitoring (ps, pgrep, top, htop, …)
● Memory monitoring (vmstat)
● Disk monitoring (iostat)
9. Process monitoring
ps, pstree, pgrep - process monitoring
● To see every process on the system - standard syntax
○ $ ps -ef
● To see every process on the system - BSD syntax
○ $ ps aux
● Print a process tree:
○ $ ps -ejH
○ $ pstree
● Look up processes based on name and other attributes (the companion command pkill sends signals to them)
○ $ pgrep syslog
11. top - Load Average
● Every process that needs to be served enters a run queue before the kernel scheduler can allocate it to run on a CPU core.
● The load average is the average number of processes in that queue (running or waiting to run) over the last 1, 5, and 15 minutes. On Linux it also counts processes in uninterruptible sleep, typically waiting for I/O.
● The load average should not be much higher than the total number of CPU cores.
● If a server has four cores, four processes can be handled at the same time, so the load average should not stay above four.
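The rule of thumb above (load average compared with the number of cores) can be checked with a small script. This is only a sketch: load_per_core is a hypothetical helper name, and it relies on awk being installed.

```shell
#!/bin/sh
# load_per_core LOAD CORES
# Hypothetical helper: print the load average divided by the number of
# CPU cores, with two decimals. A result above 1.00 means more processes
# are queued than there are cores to serve them.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# On a live system the inputs could come from /proc/loadavg and nproc:
#   read -r load1 _ < /proc/loadavg
#   load_per_core "$load1" "$(nproc)"
```

A value that stays well above 1.00 for the 5 and 15 minute averages is usually more meaningful than a short spike in the 1 minute average.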
12. top - CPU Performance Parameters
us
Percentage of time the CPU spends handling processes in user mode.
sy
Percentage of time the CPU spends in kernel mode.
id
Percentage of time the processor spends in the idle loop.
wa
Percentage of time the processor spends waiting for noninterruptible I/O, such as requests to disks, hard-mounted NFS, and tape units.
hi
Percentage of time the processor spends handling hardware interrupts. A high value may indicate faulty hardware.
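These percentages can be approximated from two samples of the aggregate "cpu" line in /proc/stat. The sketch below assumes the usual field order of that line (user, nice, system, idle, iowait, irq, softirq, steal); cpu_pct is a hypothetical helper name, and the result will not match top exactly.

```shell
#!/bin/sh
# cpu_pct: read two (or more) "cpu ..." lines from stdin and print the
# share of time spent in each state between the last two samples.
# Assumed field order after the "cpu" label:
#   $2 user  $3 nice  $4 system  $5 idle  $6 iowait  $7 irq  $8 softirq  $9 steal
cpu_pct() {
    awk '$1 == "cpu" {
        n++
        # d[i] holds the difference between this sample and the previous one
        for (i = 2; i <= 9; i++) { d[i] = $i - prev[i]; prev[i] = $i }
    }
    END {
        if (n < 2) exit 1                 # need at least two samples
        total = 0
        for (i = 2; i <= 9; i++) total += d[i]
        if (total == 0) exit 1
        printf "us=%.1f sy=%.1f id=%.1f wa=%.1f hi=%.1f\n",
            100 * d[2] / total, 100 * d[4] / total, 100 * d[5] / total,
            100 * d[6] / total, 100 * d[7] / total
    }'
}

# Example on a live system (one-second sampling interval):
#   { head -n 1 /proc/stat; sleep 1; head -n 1 /proc/stat; } | cpu_pct
```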
13. top - Memory Usage
KiB Mem
Total amount of physical memory in KiB (1 KiB = 1024 bytes).
used
Total amount of RAM that is used for any purpose.
free
Total amount of RAM that is not used for anything.
buffers
Total amount of used memory that holds unstructured data, such as raw disk blocks and file system metadata.
cached Mem
Total amount of memory that is used to cache files that have recently been fetched from disk.
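The same figures are available in /proc/meminfo. A minimal sketch, assuming the MemTotal, MemFree, Buffers, and Cached fields are present; mem_summary is a hypothetical helper name and the optional file argument exists only to make the function easy to try on sample data.

```shell
#!/bin/sh
# mem_summary [FILE]
# Print total, free, buffer, and cache memory (values in KiB) from a
# meminfo-style file. FILE defaults to /proc/meminfo.
mem_summary() {
    awk '$1 == "MemTotal:" || $1 == "MemFree:" || $1 == "Buffers:" || $1 == "Cached:" {
        sub(":", "", $1)      # drop the trailing colon from the field name
        print $1, $2
    }' "${1:-/proc/meminfo}"
}

# Example on a live system:
#   mem_summary
```

Memory used for buffers and cache can be reclaimed by the kernel when applications need it, so a low "free" value alone does not mean the server is short of memory.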
14. Sysstat - disk monitoring
iostat - Report CPU statistics and input/output statistics for devices and partitions.
● Find out which I/O devices have been used intensively and what amount of I/O has been happening on these devices.
15. iostat
● %user - Show the percentage of CPU utilization that occurred while executing at the user level (application).
● %nice - Show the percentage of CPU utilization that occurred while executing at the user level with nice priority.
● %system - Show the percentage of CPU utilization that occurred while executing at the system level (kernel).
● %iowait - Show the percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.
● %steal - Show the percentage of time spent in involuntary wait by the virtual CPU or CPUs while the hypervisor was servicing another virtual processor.
● %idle - Show the percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request.
● tps - Indicate the number of transfers per second that were issued to the device. A transfer is an I/O request to the device. Multiple logical requests can be combined into a single I/O request to the device. A transfer is of indeterminate size.
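To find the device that has been used most intensively, the tps column can be compared across devices. The sketch below assumes the device report layout printed by iostat -d, with the device name in the first column and tps in the second; busiest_device is a hypothetical helper name.

```shell
#!/bin/sh
# busiest_device: read an "iostat -d" device report from stdin and print
# the device with the highest tps value, followed by that value.
# Assumption: column 1 is the device name and column 2 is tps.
busiest_device() {
    awk '
        NF == 0        { in_table = 0; next }   # a blank line ends a report
        $1 ~ /^Device/ { in_table = 1; next }   # header row starts a report
        in_table && $2 + 0 > max { max = $2 + 0; dev = $1 }
        END { if (dev != "") print dev, max }
    '
}

# Example on a live system (requires the sysstat package):
#   iostat -d | busiest_device
```

Without an interval argument, iostat reports averages since boot, so the result describes long-term usage rather than the current moment.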
16. Sysstat - memory monitoring
vmstat - Report virtual memory statistics
● Find out how much memory is free, buffered, and cached, whether the system is swapping, and how much block I/O is happening.
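One common use of vmstat is checking for swap activity. The sketch below sums the si (swapped in) and so (swapped out) columns of vmstat output; it locates them by name in the header row rather than assuming fixed positions. swap_activity is a hypothetical helper name.

```shell
#!/bin/sh
# swap_activity: read vmstat output from stdin and print the sum of the
# si and so columns over all sample rows.
swap_activity() {
    awk '
        $1 == "r" && $2 == "b" {      # column-name header row
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            next
        }
        # sample rows start with a number (the r column)
        si && so && $1 ~ /^[0-9]+$/ { in_sum += $si; out_sum += $so }
        END { printf "si=%d so=%d\n", in_sum, out_sum }
    '
}

# Example on a live system (5 samples, 2 seconds apart):
#   vmstat 2 5 | swap_activity
```

Values that stay above zero suggest the server is actively swapping, which usually points to a memory shortage.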
18. Definitions: what is a metric?
● The National Institute of Standards and Technology (NIST) defines metrics as: “Tools designed to facilitate decision-making and improve performance and accountability through collection, analysis and reporting of relevant performance-related data”
● More simply, a metric is a standard or system of measurement.
19. How to get metrics from server, app, infrastructure?
● Prometheus is an open source, metrics-based monitoring system.
● Prometheus does one thing and does it well: getting metrics.
● It does not try to solve problems outside of the metrics space, leaving those to other, more appropriate tools.
● Metrics from a server can be gathered with the prometheus-node-exporter package:
○ # apt install prometheus-node-exporter
○ $ curl localhost:9100/metrics
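The /metrics endpoint returns plain text with one sample per line, so a single value can be picked out with standard tools. A minimal sketch, assuming an unlabeled metric such as node_load1 (the 1 minute load average exposed by the node exporter); metric_value is a hypothetical helper name.

```shell
#!/bin/sh
# metric_value NAME
# Read Prometheus text-format metrics from stdin and print the value of
# the unlabeled metric NAME. Comment lines ("# HELP", "# TYPE") do not
# match because their first field is "#".
metric_value() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Example on a live system with the node exporter running:
#   curl -s localhost:9100/metrics | metric_value node_load1
```

Metrics with labels, written as name{label="value"}, would need a different match and are left out of this sketch.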
24. System Optimization Basics
● The Linux kernel offers a complex framework of settings for making your server behave in the best possible way.
● Linux performance tuning should be done by experts who know what they are doing.
● tuned: a tuning service that can adapt the operating system to perform better under certain workloads by setting a tuning profile.
25. Understanding the Role of the Linux Kernel
● The Linux kernel is the heart of the operating system
● It is the layer between the user who works with Linux from a shell environment and the hardware that
is available in the computer on which the user is working.
● The kernel performs many essential operating system tasks. One example is the scheduler, which makes sure that processes started on the operating system are handled by the CPU.
26. Analyzing What the Kernel Is Doing
● The dmesg utility (or journalctl --dmesg)
○ This utility shows the contents of the kernel ring buffer, an area of memory where the Linux kernel keeps its recent log messages.
● The /proc file system
○ It contains files with detailed, current status information on what is happening on your server.
○ Many of the performance-related tools mine the /proc file system for more information.
● The uname utility
○ $ uname -a
■ Print all kernel information
○ $ uname -r
■ Print the kernel release currently in use
27. /proc File System
● The key to Linux performance tuning is in the /proc file system.
● Many system utilities (including lscpu, top, ps, lsmod, and many more) get the information they show from the /proc file system.
29. Process ID (PID) directories
● Apart from the configuration files mentioned, /proc also contains the process ID (PID) directories.
● Every process that runs on Linux has a unique PID, and each process gets its own directory, /proc/<PID>, that describes its environment.
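Each PID directory contains a status file with "Key: value" lines such as Name, State, and VmRSS. The sketch below reads one field from it; proc_field is a hypothetical helper name, and its optional third argument (an alternative root directory instead of /proc) is only there so the function can be tried on sample data.

```shell
#!/bin/sh
# proc_field PID FIELD [ROOT]
# Print the value of FIELD from ROOT/PID/status. ROOT defaults to /proc.
proc_field() {
    # split each line on the colon and any whitespace that follows it
    awk -F':[ \t]*' -v key="$2" '$1 == key { print $2 }' "${3:-/proc}/$1/status"
}

# Examples on a live system ($$ is the PID of the current shell):
#   proc_field $$ Name
#   proc_field $$ VmRSS
```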
31. /proc/sys
● The key to optimizing Linux performance is the /proc/sys directory.
● In this directory, you’ll find tunables, divided into different categories:

Tunable   Explanation
kernel    The kernel interface. Contains many useful tunables.
net       The network interface. Contains many useful tunables.
fs        The interface to the virtual file system. Contains a few useful tunables, such as file-max, which specifies the maximum number of files that can be opened simultaneously.
vm        The virtual memory interface. Contains many useful tunables.
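Tunable names used by sysctl map directly onto paths below /proc/sys: the dots in the name become directory separators. A small sketch of that mapping; sysctl_to_path is a hypothetical helper name, and it assumes no component of the name contains a dot of its own.

```shell
#!/bin/sh
# sysctl_to_path NAME
# Print the /proc/sys path that corresponds to a sysctl tunable name.
sysctl_to_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Example on a live system: read the current value of fs.file-max
#   cat "$(sysctl_to_path fs.file-max)"
```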
34. How to change optimization parameters?
● Use echo to write the new value to the kernel tunable file (the change is lost at reboot).
○ # echo 1 > /proc/sys/net/ipv4/ip_forward
● Use sysctl -w to write the value to the kernel tunable (the change is lost at reboot).
○ # sysctl -w net.ipv4.ip_forward=1
● Use the /etc/sysctl.conf file to make a setting persistent, then load it with sysctl -p.
○ # vim /etc/sysctl.conf
■ net.ipv4.ip_forward=0
○ # sysctl -p
● sysctl -a: display all values currently available.
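Editing the configuration file by hand can also be scripted. The sketch below replaces any existing assignment of a key in a sysctl-style file and appends the new one; set_sysctl_conf is a hypothetical helper name, and the file to edit is passed explicitly so the function can be tried on a copy before touching /etc/sysctl.conf.

```shell
#!/bin/sh
# set_sysctl_conf KEY VALUE FILE
# Make FILE contain exactly one "KEY=VALUE" line: existing assignments of
# KEY are dropped, comments and other keys are kept, and the new
# assignment is appended at the end.
set_sysctl_conf() {
    key=$1 value=$2 file=$3
    tmp=$(mktemp) || return 1
    if [ -f "$file" ]; then
        awk -v key="$key" '{
            line = $0
            sub(/^[ \t]+/, "", line)          # ignore leading whitespace
            split(line, kv, /[ \t]*=/)        # kv[1] is the text before "="
            if (kv[1] != key) print           # keep lines for other keys
        }' "$file" > "$tmp"
    fi
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example (as root), followed by loading the file:
#   set_sysctl_conf net.ipv4.ip_forward 1 /etc/sysctl.conf && sysctl -p
```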