Linux menu

Wednesday, September 24, 2014

How To Monitor Server Performance In Linux and Unix II

Uptime

The first thing I'll do when I can get logged into a non-responsive system is do an "uptime". If the load is high, I know I need to start digging into things with the other tools. Uptime gives you 3 numbers which indicate the 1, 5, and 15 minute load averages. From this you can tell if the load is trending up, neutral, or going down:
guin:~$ uptime
 13:26:32 up 1 day, 16:52, 21 users,  load average: 0.00, 0.14, 0.15
guin:~$
On my laptop, the load is fairly low, but it is trending down (1 minute average of 0.00, 5 minute of 0.14, and over 15 minutes it's 0.15).
database:~ # uptime
 12:29pm  up 1 day 13:29,  1 user,  load average: 0.84, 0.82, 0.80
database:~ #
On this database server the load is somewhat low (it's a quad CPU box, so I wouldn't consider it saturated until it was around 4).

Dmesg

It's also useful to look at the bottom of the "dmesg" output. Usually it isn't particularly revealing, but in the case of hardware errors or the out of memory killer it can very quickly reveal a problem.

dmesg output format

[ time ] device name: message

[ time ] : number seconds from the boot time. If we see something like 34.23445 as time stamp, that indicates this message was created after 34 seconds of booting the machine.

device name: This will give us the hardware device name along with manufacture details.
message: This is actual information we have to look at, this message contain information about devices loaded information, if it's not loaded it will give us error etc.
Dmesg examples 1:

Example 1: Display all the devices drivers loaded in to kernel.
dmesg
The above command will give as all the hardware drivers loaded in to kernel, their status success or failed and even error message why they are failed.
Example2: Display hardware information related to Ethernet port eth0
dmesg | grep -i eth0
Clipped Output:
[ 0.000000] Initializing cgroup subsys cpuset [ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 3.5.0-25-generic (buildd@komainu) (gcc version 4.7.2 (Ubuntu/Linaro 4.7.2-2ubuntu1) ) #39-Ubuntu SMP Mon Feb 25 18:26:58 UTC 2013 (Ubuntu 3.5.0-25.39-generic 3.5.7.4)
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.5.0-25-generic root=UUID=9b1dca12-2bf6-47d3-ab45-929f85bec913 ro quiet splash vt.handoff=7
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] AMD AuthenticAMD
[ 0.000000] Centaur CentaurHauls
[ 0.000000] e820: BIOS-provided physical RAM map:
We can use grep command to filter any kind of information from dmesg command. Know more about grep command here.
To display USB stuff
dmesg | grep -i usb
Output:
[ 0.898683] ACPI: bus type usb registered
[ 0.898700] usbcore: registered new interface driver usbfs
[ 0.898707] usbcore: registered new interface driver hub
[ 0.898725] usbcore: registered new device driver usb
[ 1.057206] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 1.057248] ehci_hcd 0000:00:1a.0: new USB bus registered, assigned bus number 1
[ 1.071681] ehci_hcd 0000:00:1a.0: USB 2.0 started, EHCI 1.00
[ 1.071737] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[ 1.071742] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1

To display total memory available and shared memory details
dmesg | grep -i Memory
Output:
[ 0.000000] Memory: 8008776k/9697280k available (6722k kernel code, 1460428k absent, 228076k reserved, 6448k data, 932k init)
[ 0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
[ 0.003319] Initializing cgroup subsys memory
[ 0.497325] Freeing initrd memory: 15664k freed
[ 1.723922] Freeing unused kernel memory: 932k freed
[ 1.727869] Freeing unused kernel memory: 1460k freed
[ 1.730885] Freeing unused kernel memory: 1120k freed
Display if any bluetooth device is present or not
dmesg | grep -i bluetooth
Output:
[ 27.145978] Bluetooth: Core ver 2.16
[ 27.145998] Bluetooth: HCI device and connection manager initialized
[ 27.146001] Bluetooth: HCI socket layer initialized
[ 27.146003] Bluetooth: L2CAP socket layer initialized
[ 27.146009] Bluetooth: SCO socket layer initialized
[ 41.586902] Bluetooth: RFCOMM TTY layer initialized
[ 41.586909] Bluetooth: RFCOMM socket layer initialized
[ 41.586911] Bluetooth: RFCOMM ver 1.11
[ 42.003041] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
[ 42.003045] Bluetooth: BNEP filters: protocol multicast
Display tty information
   dmesg | grep -i tty
Output:
[ 0.000000] console [tty0] enabled
[ 34.210105] usb 2-1.1: GSM modem (1-port) converter now attached to ttyUSB0
[ 34.210198] usb 2-1.1: GSM modem (1-port) converter now attached to ttyUSB1
[ 34.210285] usb 2-1.1: GSM modem (1-port) converter now attached to ttyUSB2
[ 41.586902] Bluetooth: RFCOMM TTY layer initialized
[ 197.980023] option1 ttyUSB0: option_instat_callback: error -108

More interaction with dmesg:

We can clear dmesg logs from dmesg command output if required by using clear switch -c as shown below.
dmesg -c
Note: This command will clear dmesg logs from boot time to till that time. If you connect any hardware after this command execution, you can see these new messages in dmesg output.
accidentally deleted dmesg messages by running dmesg -c? Dont worry these messages are stored in /var/log/kern.log or /var/log/dmesg files.
Check latest dmesg logs by viewing content of these two files.
cat /var/log/kern.log
Output:
May 19 06:59:46 linuxnix kernel: [ 3047.124011] atkbd serio0: Unknown key released (translated set 2, code 0xab on isa0060/serio0).
May 19 06:59:46 linuxnix kernel: [ 3047.124015] atkbd serio0: Use 'setkeycodes e02b <keycode>' to make it known.
May 19 06:59:46 linuxnix kernel: [ 3047.382982] atkbd serio0: Unknown key pressed (translated set 2, code 0xab on isa0060/serio0).
May 19 06:59:46 linuxnix kernel: [ 3047.383000] atkbd serio0: Use 'setkeycodes e02b <keycode>' to make it known.
or
cat /var/log/dmesg
Output:
[ 38.115030] RPC: Registered named UNIX socket transport module.
[ 38.115034] RPC: Registered udp transport module.
[ 38.115036] RPC: Registered tcp transport module.
[ 38.115038] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 38.186630] FS-Cache: Loaded
[ 38.235731] init: munin-node main process (1082) terminated with status 127
[ 38.235754] init: munin-node main process ended, respawning
[ 38.300708] NFS: Registering the id_resolver key type
[ 38.300724] Key type id_resolver registered
[ 38.300725] Key type id_legacy registered
[ 38.300733] FS-Cache: Netfs 'nfs' registered for caching
One advantage about kern.log file is it will give you system time for each log it created instead of boot time. In our next posts we will see how to use other hardware related commands in detail.



Dmesg examples 2:

1. View the Boot Messages

By executing the dmesg command, you can view the hardwares that are detected during bootup process and it’s configuration details. There are lot of useful information displayed in dmesg. Just browse through them line by line and try to understand what it means. Once you have an idea of the kind of messages it displays, you might find it helpful for troubleshooting, when you encounter an issue.
# dmesg | more
Bluetooth: L2CAP ver 2.8
eth0: no IPv6 routers present
bnx2: eth0 NIC Copper Link is Down
usb 1-5.2: USB disconnect, address 5
bnx2: eth0 NIC Copper Link is Up, 100 Mbps full duplex
As we discussed earlier, you can also view hardware information using dmidecode.

2. View Available System Memory

You can also view the available memory from the dmesg messages as shown below.
# dmesg | grep Memory
Memory: 57703772k/60817408k available (2011k kernel code, 1004928k reserved, 915k data, 208k init)

3. View Ethernet Link Status (UP/DOWN)

In the example below, dmesg indicates that the eth0 link is in active state during the boot itself.
# dmesg  | grep eth
eth0: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem 96000000, IRQ 169, node addr e4:1f:13:62:ff:58
eth1: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem 98000000, IRQ 114, node addr e4:1f:13:62:ff:5a
eth0: Link up

4. Change the dmesg Buffer Size in /boot/config- file

Linux allows to you change the default size of the dmesg buffer. The CONFIG_LOG_BUF_SHIFT parameter in the /boot/config-2.6.18-194.el5 file (or similar file on your system) can be changed to modify the dmesg buffer.
The below value is in the power of 2. So, the buffer size in this example would be 262144 bytes. You can modify the buffer size based on your need (SUSE / REDHAT).
#  grep CONFIG_LOG_BUF_SHIFT  /boot/config-`uname -r`
CONFIG_LOG_BUF_SHIFT=18

5. Clear Messages in dmesg Buffer

Sometimes you might want to clear the dmesg messages before your next reboot. You can clear the dmesg buffer as shown below.
# dmesg -c

# dmesg 

6. dmesg timestamp: Date and Time of Each Boot Message in dmesg

By default the dmesg don’t have the timestamp associated with them. However Linux provides a way to see the date and time for each boot messages in dmesg in the /var/log/kern.log file as shown below.
klogd service should be enabled and configured properly to log the messages in /var/log/kern.log file.
# dmesg | grep "L2 cache" 
[    0.014681] CPU: L2 cache: 2048K

# grep "L2 cache" kern.log.1
Oct 18 23:55:40 ubuntu kernel: [    0.014681] CPU: L2 cache: 2048K

Vmstat

Next I will often run "vmstat 1", which prints out statistics every second on the system utilization. The first line is the average since the system was last booted:
denver-database:~ # vmstat 1
procs ---------memory---------- --swap-- --io--- -system-- -----cpu-----
 r  b swpd   free   buff  cache  si  so  bi  bo   in   cs  us sy id wa st
 0  0  116 158096 259308 3083748   0   0  47  39   30   58 11  8 76  5  0
 2  0  116 158220 259308 3083748   0   0   0   0 1706 4899 22 14 64  0  0
 1  0  116 158220 259308 3083748   0   0   0 276 1435 1490  4  2 93  0  0
 0  0  116 158220 259308 3083748   0   0   0   0 1502 1569  5  3 92  0  0
 0  0  116 158220 259308 3083748   0   0   0 892 1394 1529  2  1 97  0  0
 1  0  116 158592 259308 3083748   0   0   0 216 1702 1825  8  7 84  1  0
 0  0  116 158344 259308 3083748   0   0   0 368 1465 1461  8  7 84  0  0
 0  0  116 158344 259308 3083748   0   0   0 940 1992 2115  2  2 95  0  0
 0  0  116 158344 259308 3083748   0   0   0 240 1906 1982  6  7 87  0  0
The first thing I'll look at here is the "wa" column; the mount of CPU time spent waiting. If this is high you almost certainly have something hitting the disc hard.
If the "wa" is high, the next thing I'd look at is the "swap" columns "si" and "so". If these are much above 0 on a regular basis, it probably means you're out of memory and the system is swapping. Since RAM is around a million times faster than a hard drive (10ns instead of 10ms), swapping much can cause the system to really grind to a halt. Note however that some swapping, particularly swapping out, is normal.
Next I'd look at the "id" column under "cpu" for the amount of idle CPU time. If this is around 0, it means the CPU is heavily used. If it is, the "sy" and "us" columns tell us how much time is being used by the kernel and user-space processes.
If CPU "sy" time is high, this can often indicate that there are some large directories (say a user's "spam" mail directory) with hundreds of thousand or millions of entries, or other large directory trees. Another common cause of high "sy" CPU time is the system firewall: iptables. There are other causes of course but these seem to be the primary ones.
If CPU "us" is high, that's easy to track down with "top".

ps awwlx --sort=vsz

If there is swapping going on I like to look at the big processes via "ps awwlx --sort=vsz". This shows processes sorted by virtual sizes (which doesinclude shared libraries, but also counts blocks swapped out to disc).

Iostat

For systems where there is a lot of I/O activity (shown via the "bi" and "bo" being high, but "si" and "so" being low), iostat can tell you more about what hard drives the activity is happening on, and what the utilization is. Normally I will run "iostat -x 5" which causes it to print out updated stats every 5 seconds:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.64    0.00    3.95    0.30    0.00   90.11

Device: rrqm/s wrqm/s   r/s   w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda       0.00   9.60  0.60  2.40   6.40  97.60    34.67     0.01  4.80  4.80  1.44
I'll first look at the "%util" column, if it's approaching 100% then that device is being hit hard. In this case we only have one device, so I can't use this to isolate where the heavy activity might be happening, but if the database were on it's own partition that could help track it down.
"await" is a very useful column, it tells us how long the device takes to service a request. If this gets high, it probably indicates saturation.
Other information iostat gives can tell us if the activity is read-oriented or writes, and whether they are small or large writes (based on the sec/s sectors per second rate and the number of read/writes per second).

Iotop

This requires a very recent kernel (2.6.20 or newer), so this isn't something I tend to run very often: most of the systems I maintain are enterprise distros, so they have older kernels. RHEL/CentOS 3/4/5 are too old, Ubuntu Hardy doesn't have iotop, but Lucid does support it.
iotop is like top but it will show processes that are doing heavy I/O. However, often this may be a kernel process so you still may not be able to tell exactly what process is causing the I/O load. It's much better than what we had in the past though.

Top

In the case of high user CPU time, "top" is a great tool for telling you what is using the most CPU.

Linux TOP Command Examples With Screenshots:-


1. Top Command output:

First of all, let us understand what the output says. Top command displays a lot of information about the running system. But we need to under stand the meaning of different sections of this output:
Running by default, the top command displays output like this:
First few lines are horizontal showing summary about different system parameters, and following these are the processes and their attributes in columns.

1.1 Uptime and Load Averages:

At the top of top command is displayed the output similar to uptime command.
The fields display:
* current time
* the time your system is been up
* number of users logged in
* load average of 5, 10 and 15 minutes respectively.
This uptime display can be toggled with ‘l’ command.

1.2 Tasks:

The second line shows summary of tasks or processes. The processes can be in different states. It shows total number of the processes. Out of these, the processes can be running, sleeping, stopped or in zombie (zombie is the state of a process state, These process summary can be toggled with ‘t’ command.

1.3 CPU States:

Next is shown the CPU state. Here, %age of CPU time in different modes is shown. The meaning of different CPU times are:
* us, user: CPU time in running (un-niced) user processes
* sy, system: CPU time in running kernel processes
* ni, niced: CPU time in running niced user processes
* wa, IO wait: CPU time waiting for IO completion
* hi: CPU time serving hardware interrupts
* si: CPU time serving software interrupts
* st: CPU time stolen for this vm by the hipervisor.
This can be toggled with ‘t’ command.

1.4 Memory Usage:

Next two lines show memory usage, somewhat like ‘free’ command. 1st of these lines is for physical memory and the second for virtual memory (swap space).
The physical memory is displayed as: total available memory, used memory, free memory, and memory used for buffers
Similarly, swap reflects: total, used, free and cached swap space.
The memory can be toggled with ‘m’ command.

1.5 Fields/Columns:

Top columns
After these horizontal system properties and states, the processes are shown in columns. The different columns represent different properties discussed below.
By default, top shows these attributes associated with processes:
PID
The Process ID, to uniquely identify a processes.
USER
The effective user name of the owner of the processes.
PR
The scheduling priority of the process. Some values in this field are ‘rt’. It means that the process is running under real-time.
NI
The nice value of the process. Lower values mean higher priority.
VIRT
The amount of virtual memory used by the process.
RES
The resident memory size. Resident memory is the amount of non-swapped physical memory a task is using.
SHR
SHR is the shared memory used by the process.
S
This is the process status. It can have one of the following values:
D – uninterruptible sleep
R – running
S – sleeping
T – traced or stopped
Z – zombie
%CPU
It is the percentage of CPU time the task has used since last update.
%MEM
Percentage of available physical memory used by the process.
TIME+
The total CPU time the task has used since it started, with precision upto hundredth of a second.
COMMAND
The command which was used to start the process.
There are many other outputs which are not displayed by default which can display information about page faults, effective group and group ID of the process, and many more.

2. Interactive Commands:

We discussed earlier that the top command is interactive commands. Some of the commands we encountered in the last section. Here we explore these commands further.

2.1 ‘h': Help

First of all, you can press ‘h’ or ‘?’ to display the help menu for interactive commands.

2.2 ‘<ENTER>’ or ‘<SPACE>': Refresh Display

The top command by default refreshes after a certain interval (3 seconds). To refresh manually, user can press enter or space key.

2.3 A: Toggle Alternate Display Mode

This command switches between full-screen Mode and alternate-display mode. In alternate display mode, 4 windows are available:
1. Def
2. Job
3. Mem
4. Usr
Each of the 4 field groups has a unique separately configurable summary area and its own configurable task area. Only one of these 4 windows will be the current window. The current window is displayed on the top left corner.
Alternative display mode
You can switch between 4 windows with ‘a’ and ‘w’ keys. ‘a’ moves to next and ‘w’ to previous window. With ‘g’ command, you can enter a number to select the current window.
Window selection in alternative display

2.4 B: Toggle Bold Display

Some important information is shown in bold characters. This command toggles use of bold.
Bold off

2.5 ‘d’ or ‘s': Set Display Time interval

When ‘d’ or ‘s’ is pressed, you will be prompted to enter a value ( in seconds ) which will be set as display interval. If you enter 1 here, top will refresh every second.
Display time interval

2.6 ‘l’, ‘t’, ‘m': Toggle Load, Task, Memory Info

These will toggle load average, task/cpu status and mem info respectively as discussed.
Load average off
CPU summary off
Memory and swap usage off
All three displays off

2.7 ‘f': Field Management

This is used to chose what field you want to display on the output screen. The fields marked as * are selected.
Managing columns
‘<‘ and ‘>’
The ‘<‘ command moves the sorted field to the left and ‘>’ to the right

2.8 ‘R': Reverse Sort

Toggle Reverse/Normal sort order

2.9 ‘c': Toggle Command

Toggle full path of command that started the process and program name.
Full command path

2.10 ‘i': Idle Tasks

Toggle idle tasks.
Idle task display off

2.11 ‘V': Forest View

Toggle forest view mode.
Forest view of tasks

2.12 ‘Z': Change color map

Pressing the ‘Z’ key takes the user to a screen where the display color can be changed for top command. There are 8 task areas to chose from and 8 colors.
Customizing colors
The below screen shows full colored top view with all 4 screens on.
Colored display

2.13 ‘z': Toggle Color

Toggle color, i.e. turn on or off the colored display.

2.14 ‘x’ or ‘y’

Toggle highlights: ‘x’ sort field; ‘y’ running tasks. Depending upon your display settings, You might have to make the output colored in order to notice these highlights.
X and Y highlighted

2.15 ‘u': Processes of a User

Show processes for a particular user. You are prompted to enter the username. Blank will show for all users.
Processes of 'raghu' user

2.16 ‘n’ or ‘#': Number of tasks

Set maximum number of tasks displayed.
Setting number of tasks displayed

2.17 ‘k': Kill tasks

One of the most important commands of top. Used to send signals to tasks (Usually kill tasks).
Killing a task

2.18 ‘r': Renice

Renice a task to change the scheduling priority.

3. Command line options:

Most of these command line options are similar to the commands discussed above. Top output can be manipulated interactively with commands. But you can start top with some parameters set to your convenience with these options.

3.1 -b: Batch mode

The -b option starts top command in batch mode. It can be useful when you want to save the output in a file.

3.2 -c: Command/Program-name toggle:

As discussed in the above commands, this option will toggle from the last remembered state of command/program name display.

3.3 -d: Set delay interval

Set the delay interval for top (in seconds). For example:
$ top -d 1
will start the top command with 1 sec delay interval.

3.4 -i: idle process toggle

This option sets the top command with last remembered ‘i’ state reversed.

3.5 -n: Set number of iterations

With -n option, you can set the number of iterations after which top willl end.
$ top -n 3
will exit top automatically after 3 outputs.

3.6 -p: monitor specific PIDs

You can specify what PIDs you want to monitor with -p option. PID value 0 will be treated as process ID of top command itself.

3.7 -u or -U: username or UID

The process of a particular user can be viewed with these options. Username or UID can be specified to the option. The -p, -u and -U options are mutually exclusive. Only one of the options can be used at a time. You get error when you try to use any combination of these options:
- See more at: http://linoxide.com/linux-command/linux-top-command-examples-screenshots/#sthash.PCAZKla9.dpuf
$ top -p 28453 -u raghu
top: conflicting process selections (U/p/u) - See more at: http://linoxide.com/linux-command/linux-top-command-examples-screenshots/#sthash.PCAZKla9.dpuf

Munin


Munin is a great tool that tracks long-term system performance trends. However, it's not something you can start using when you have a performance problem. It's the sort of thing you should set up on all your systems so that you can build up the historic usage and have it available when you need it.
It will give you extensive stats about CPU, disc, RAM, network, and other resources, and allow you to see trends to determine an upgrade will be needed in the coming months, rather than that you needed to do one a few months ago. :-)

No comments: