Tuesday, November 6, 2012

Sar and Sysstat: Linux Performance Statistics with Ease

Whenever I perform any type of activity that requires me to look at historical system statistics such as load average, CPU utilization, I/O wait, or even memory usage, I usually skip the system monitoring applications like Nagios or Zenoss and start running the sar command. I'm not saying that sar completely replaces those tools, but sar is quick and dirty, and if all you want is some raw numbers from a certain time frame, sar is a great tool.

What is sar?

sar (System Activity Reporter) is a command that ships with the sysstat package. Sysstat is a collection of Unix tools used for performance monitoring; the package includes tools such as iostat, mpstat, pidstat, sadf, and sar.
Along with these real-time commands, sysstat installs a cron job that runs every 10 minutes and collects the system's performance information. Sar is the command you use to read the collected information.

Setting up sysstat

Below you will find instructions on how to install, configure, and use sysstat & sar. I personally run sysstat collection on all of the servers under my care, as the benefits of having this data far outweigh any reason you could think of for not installing it (I can't think of any).

Install and Configure

Installing sysstat is pretty simple, as it is in the repositories of most Linux distributions.

Installing with apt:

$ sudo apt-get install sysstat

Installing with yum:

$ sudo yum install sysstat

Enable sysstat collection

In order to enable sysstat collection we will need to edit the /etc/default/sysstat file (this path is Debian/Ubuntu-specific; Red Hat-based distributions enable collection as soon as the package is installed).
Edit the config file:
# vi /etc/default/sysstat
Find:
# Should sadc collect system activity informations? Valid values
# are "true" and "false". Please do not put other values, they
# will be overwritten by debconf!
ENABLED="false"
Modify to:
# Should sadc collect system activity informations? Valid values
# are "true" and "false". Please do not put other values, they
# will be overwritten by debconf!
ENABLED="true"
Once you have made the change, save and exit the file; from here on, every time the cron job runs, sysstat will collect system information.

Changing the collection interval (optional)

By default sysstat will collect data every 10 minutes; some people (like me) will want a shorter collection interval. To accomplish this you can simply modify the cron job that runs every 10 minutes.
Edit the cronjob file:
# vi /etc/cron.d/sysstat
Find:
# Activity reports every 10 minutes everyday
5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1
Modify to: 
# Activity reports every 5 minutes everyday
*/5 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1
Now you can save and exit the file and simply wait 5 minutes for the next run to verify that you are collecting data.

Keep sysstat log files longer than 1 week (optional)

By default sysstat will only retain its log files (historical performance statistics) for 7 days; personally, I prefer to keep these files around for at least 31 days. To keep these files longer, simply edit the /etc/sysstat/sysstat config file.
Edit the config file:
# vi /etc/sysstat/sysstat
Find:
# How long to keep log files (in days).
# Used by sa2(8) script
# If value is greater than 28, then log files are kept in
# multiple directories, one for each month.
HISTORY=7
Modify to:
# How long to keep log files (in days).
# Used by sa2(8) script
# If value is greater than 28, then log files are kept in
# multiple directories, one for each month.
HISTORY=31
Save and exit the file, and you will now keep 31 days of log files.

Accessing the Performance Statistics with sar

There are a metric ton of ways to get data out of sar; below are a few options that I use most often.

Access Previous Days Data

Before I start giving you examples of ways to extract performance statistic goodness via sar, I first want to cover sar's default output. By default sar will output the current day's statistics, depending on what options you give it; to get a previous day's data from sar you must find that day's log file and specify it with -f /path/to/file.
Example:
# sar -f /var/log/sysstat/sa04
The log files for sar are contained within the /var/log/sysstat or /var/log/sa directory, depending on your distribution's implementation. The sa log files have a bit of an interesting naming scheme that isn't used very often in Unix or Linux: the files end with a number that denotes the day. For example, sa04 is the file from the 4th day of the current month.
This ls listing may help explain it easier.
# ls -la sa[0-9]*
-rw-r--r-- 1 root root 254604 2012-07-02 00:00 sa01 << This file is for 2012-07-01
-rw-r--r-- 1 root root 254604 2012-07-03 00:00 sa02
-rw-r--r-- 1 root root 254604 2012-07-04 00:00 sa03
-rw-r--r-- 1 root root 254604 2012-07-05 00:00 sa04
-rw-r--r-- 1 root root 254604 2012-07-06 00:00 sa05
-rw-r--r-- 1 root root 254604 2012-07-07 00:00 sa06
-rw-r--r-- 1 root root 220044 2012-07-07 20:55 sa07 << This file is for 2012-07-07 (The current day)
-rw-r--r-- 1 root root 254604 2012-06-30 00:00 sa29 << This file is for 2012-06-29
-rw-r--r-- 1 root root 254604 2012-07-01 00:00 sa30 << This file is for 2012-06-30
** Note: Per the config file's comments, if you are keeping the log files longer than 28 days, the files will be placed in a subdirectory that denotes the month.
The -f flag can be used with all of the examples below to show data from a specific day.
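Since the -f flag comes up constantly, here is a small convenience sketch for pulling up yesterday's data without working out the file name by hand. It assumes GNU date (for the -d option), and the directory is the Debian-style default; swap in /var/log/sa on Red Hat-based systems.

```shell
# Build yesterday's two-digit day number and hand the matching sa file
# to sar. The command -v guard mirrors the style of sysstat's own cron
# entry and makes this a no-op on systems without sysstat installed.
LOGDIR=/var/log/sysstat
DAY=$(date -d yesterday +%d)
command -v sar > /dev/null && sar -f "$LOGDIR/sa$DAY" || true
```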

CPU Information

The following command prints out the collective CPU performance information:
# sar
Linux 3.2.0-26-generic (workstation) 07/07/2012 _x86_64_ (1 CPU)
01:35:02 PM CPU %user %nice %system %iowait %steal %idle
01:40:01 PM all 1.30 0.00 0.66 0.31 0.00 97.73
01:45:02 PM all 1.12 0.00 0.43 0.22 0.00 98.23
If you want to display the CPU information broken down rather than summarized you can use the -P flag.
# sar -P ALL
Linux 3.2.0-26-generic (workstation) 07/07/2012 _x86_64_ (1 CPU)
01:35:02 PM CPU %user %nice %system %iowait %steal %idle
01:40:01 PM all 1.30 0.00 0.66 0.31 0.00 97.73
01:40:01 PM 0 1.30 0.00 0.66 0.31 0.00 97.73
01:40:01 PM CPU %user %nice %system %iowait %steal %idle
01:45:02 PM all 1.12 0.00 0.43 0.22 0.00 98.23
01:45:02 PM 0 1.12 0.00 0.43 0.22 0.00 98.23
I only have 1 CPU on my test virtual machine, but if you have a multiprocessor machine the output will list more CPUs.

I/O Statistics

The -b flag will show the summarized I/O Statistics.
# sar -b
Linux 3.2.0-26-generic (workstation) 07/07/2012 _x86_64_ (1 CPU)
01:35:02 PM tps rtps wtps bread/s bwrtn/s
01:40:01 PM 0.88 0.04 0.84 1.50 8.67
01:45:02 PM 0.68 0.00 0.68 0.00 6.94
01:50:02 PM 0.67 0.00 0.67 0.00 6.83
01:55:01 PM 1.58 0.62 0.96 19.72 10.29

Disk Utilization

The -d flag will show the activity of your block devices; this output is similar to iostat's.
# sar -d
Linux 3.2.0-26-generic (workstation) 07/07/2012 _x86_64_ (1 CPU)
01:35:02 PM DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
01:40:01 PM dev8-0 0.32 0.75 4.34 16.00 0.03 92.34 21.64 0.69
01:40:01 PM dev252-0 0.56 0.75 4.34 9.05 0.05 86.69 12.26 0.69
01:40:01 PM dev252-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
01:45:02 PM dev8-0 0.24 0.00 3.47 14.25 0.03 113.81 26.41 0.64
01:45:02 PM dev252-0 0.43 0.00 3.47 8.00 0.04 98.55 14.86 0.64
01:45:02 PM dev252-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Paging Information

The -B flag will show the system's paging information; this is useful for determining whether your system is paging frequently.
# sar -B
Linux 3.2.0-26-generic (workstation) 07/07/2012 _x86_64_ (1 CPU)
01:35:02 PM pgpgin/s pgpgout/s fault/s majflt/s pgfree/s pgscank/s pgscand/s pgsteal/s %vmeff
01:40:01 PM 0.37 2.17 56.25 0.01 21.46 0.00 0.00 0.00 0.00
01:45:02 PM 0.00 1.73 34.75 0.00 14.49 0.00 0.00 0.00 0.00
01:50:02 PM 0.00 1.71 43.66 0.00 17.17 0.00 0.00 0.00 0.00

Memory Usage

This is very useful for figuring out what your memory utilization was historically.
**Note: Before freaking out that your memory is nearly completely utilized, please visit Linux Ate My RAM.
# sar -r
Linux 3.2.0-26-generic (workstation) 07/07/2012 _x86_64_ (1 CPU)
01:35:02 PM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact
01:40:01 PM 39912 205528 83.74 53080 98860 65720 10.03 73036 99284
01:45:02 PM 39912 205528 83.74 53080 98868 65724 10.03 73112 99192
01:50:02 PM 39852 205588 83.76 53084 98868 65720 10.03 73124 99196

Swap Usage

This option goes hand in hand with the memory usage option; you can use it to figure out when your system started swapping.
# sar -S
Linux 3.2.0-26-generic (workstation) 07/07/2012 _x86_64_ (1 CPU)
01:35:02 PM kbswpfree kbswpused %swpused kbswpcad %swpcad
01:40:01 PM 409180 416 0.10 196 47.12
01:45:02 PM 409180 416 0.10 196 47.12
01:50:02 PM 409180 416 0.10 196 47.12

Huge Pages Usage

The -H option will give you the historical huge pages usage; this is especially helpful for Oracle database servers.
# sar -H
Linux 3.2.0-26-generic (workstation) 07/07/2012 _x86_64_ (1 CPU)
01:35:02 PM kbhugfree kbhugused %hugused
01:40:01 PM 0 0 0.00
01:45:02 PM 0 0 0.00

Network Device Statistics

The -n option can show you network statistics. There are quite a few sub-options here, but the device statistics (DEV) have been the most useful for me.
# sar -n DEV
Linux 3.2.0-26-generic (workstation) 07/07/2012 _x86_64_ (1 CPU)
01:35:02 PM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
01:40:01 PM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00
01:40:01 PM eth0 0.54 0.39 0.04 0.05 0.00 0.00 0.00
01:45:02 PM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00
01:45:02 PM eth0 0.07 0.05 0.00 0.01 0.00 0.00 0.00
01:50:02 PM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00

If all else fails, get everything

The sar man page has even more usage examples than the above; if you have not found what you're looking for here, then try the man page for specifics. If you are in too much of a hurry to figure it all out, you can use sar -A to output all of the sysstat-collected data for that day; you may want to redirect that to a file, as it is quite a bit of data.
# sar -A
<too much to list here>
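A sketch of capturing that firehose to a dated file instead of the terminal (the file name is just an example; the guard keeps the line harmless on systems without sysstat):

```shell
# Redirect the full day's collected statistics into a timestamped
# text file for later review, e.g. sar-all-2012-07-07.txt.
command -v sar > /dev/null && sar -A > "sar-all-$(date +%F).txt" || true
```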

Can You Top This? 15 Practical Linux Top Command Examples

In this article, let us review 15 examples for Linux top command that will be helpful for both newbies and experts.

1. Show Processes Sorted by any Top Output Column – Press O

By default, the top command displays processes in the order of CPU usage. While top is running, press M (upper-case) to display processes sorted by memory usage as shown below.
Fig: Press M to sort by memory usage – Unix top command
To sort top output by any column, press O (upper-case O), which will display all the possible columns that you can sort by, as shown below.
Current Sort Field:  P  for window 1:Def
Select sort field via field letter, type any other key to return 

  a: PID        = Process Id              v: nDRT       = Dirty Pages count
  d: UID        = User Id                 y: WCHAN      = Sleeping in Function
  e: USER       = User Name               z: Flags      = Task Flags
  ........
While top is running, press R to reverse the sort order.

2. Kill a Task Without Exiting From Top – Press k

Once you've located a process that needs to be killed, press k, which will ask for the process ID and the signal to send. If you have the privilege to kill that particular PID, it will be killed successfully.

PID to kill: 1309
Kill PID 1309 with signal [15]: 
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1309 geek   23   0 2483m 1.7g  27m S    0 21.8  45:31.32 gagent
 1882 geek   25   0 2485m 1.7g  26m S    0 21.7  22:38.97 gagent
 5136 root    16   0 38040  14m 9836 S    0  0.2   0:00.39 nautilus

3. Renice a Unix Process Without Exiting From Top – Press r

Press r if you just want to change the priority of a process (and not kill it). This will prompt for the PID to renice; enter the PID and the new priority.

PID to renice: 1309
Renice PID 1309 to value: 
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1309 geek   23   0 2483m 1.7g  27m S    0 21.8  45:31.32 gagent
 1882 geek   25   0 2485m 1.7g  26m S    0 21.7  22:38.97 gagent

4. Display Selected User in Top Output Using top -u

Use top -u to display only a specific user's processes in the top command output.
$ top -u geek
While top is running, you can also press u, which will ask for the username as shown below.
Which user (blank for all): geek
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1309 geek   23   0 2483m 1.7g  27m S    0 21.8  45:31.32 gagent
 1882 geek   25   0 2485m 1.7g  26m S    0 21.7  22:38.97 gagent

Display Only Specific Processes with Given PIDs Using top -p

Use top -p as shown below to display specific PIDs.
$ top -p 1309,1882
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1309 geek   23   0 2483m 1.7g  27m S    0 21.8  45:31.32 gagent
 1882 geek   25   0 2485m 1.7g  26m S    0 21.7  22:38.97 gagent

5. Display All CPUs / Cores in the Top Output – Press 1 (one)

By default, top shows one CPU line with all the CPUs combined, as shown below.
top - 20:10:39 up 40 days, 23:02,  1 user,  load average: 4.97, 2.01, 1.25
Tasks: 310 total,   1 running, 309 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.5%us,  0.7%sy,  0.0%ni, 92.3%id,  6.4%wa,  0.0%hi,  0.0%si,  0.0%st
While top is running, press 1 (one) to break the CPU summary down and show details for each individual CPU on the system, as shown below.
top - 20:10:07 up 40 days, 23:03,  1 user,  load average: 5.32, 2.38, 1.39
Tasks: 341 total,   3 running, 337 sleeping,   0 stopped,   1 zombie
Cpu0  :  7.7%us,  1.7%sy,  0.0%ni, 79.5%id, 11.1%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.3%us,  0.0%sy,  0.0%ni, 94.9%id,  4.7%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2 :  3.3%us,  0.7%sy,  0.0%ni, 55.7%id, 40.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3 :  5.0%us,  1.0%sy,  0.0%ni, 86.2%id,  7.4%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu4  : 38.5%us,  5.4%sy,  0.3%ni,  0.0%id, 54.8%wa,  0.0%hi,  1.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.3%us,  0.7%sy,  0.0%ni, 97.3%id,  1.7%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  5.4%us,  4.4%sy,  0.0%ni, 82.6%id,  7.7%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8 :  1.7%us,  1.7%sy,  0.0%ni, 72.8%id, 23.8%wa,  0.0%hi,  0.0%si,  0.0%st

6. Refresh Unix Top Command Output On demand (or) Change Refresh Interval

By default, the top command updates its output every 3.0 seconds. When you want to update the output on demand, press the space bar.
To change the update frequency, press d in interactive mode and enter the time in seconds as shown below.
Change delay from 3.0 to: 10
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1309 geek   23   0 2483m 1.7g  27m S    0 21.8  45:31.32 gagent
 1882 geek   25   0 2485m 1.7g  26m S    0 21.7  22:38.97 gagent

7. Highlight Running Processes in the Linux Top Command Output – Press z or b

Press z or b, which will highlight all running processes as shown below.
Fig: Ubuntu Linux – top command highlights running processes

8. Display Absolute Path of the Command and its Arguments – Press c

Press c, which will show / hide the command's absolute path and arguments, as shown below.
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1309 geek   23   0 2483m 1.7g  27m S    0 21.8  45:31.32 /usr/sbin/gagent
 1882 geek   25   0 2485m 1.7g  26m S    0 21.7  22:38.97 /usr/sbin/gagent -l 0 -u pre

9. Quit Top Command After a Specified Number of Iterations Using top -n

Top displays its output continuously until you press q. If you would like to view only a certain number of iterations and have top exit automatically, use the -n option as shown below.
The following example will show 2 iterations of top output and then exit automatically.
$ top -n 2

10. Executing Unix Top Command in Batch Mode

If you want to execute the top command in batch mode, use the -b option as shown below.
$ top -b -n 1
Note: This option is very helpful when you want to capture the top command output to a readable text file.
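For example, a one-line sketch that saves a single snapshot to a file (the file name is arbitrary):

```shell
# -b runs top non-interactively, -n 1 exits after one iteration;
# the redirect captures the snapshot for later review.
top -b -n 1 > top-snapshot.txt
```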

11. Split Top Output into Multiple Panels – Press A

To display multiple views of top output in the terminal, press A. You can cycle through these windows by pressing a. This is very helpful when you want to sort the output in multiple windows by different top output columns.

12. Get Top Command Help from Command Line and Interactively

Get a quick command line option help using top -h as shown below.
$ top -h
        top: procps version 3.2.0
usage:  top -hv | -bcisS -d delay -n iterations [-u user | -U user] -p pid [,pid ...]
Press h while top is running to display help for the interactive top commands.
Help for Interactive Commands - procps version 3.2.0
Window 1:Def: Cumulative mode Off.  System: Delay 3.0 secs; Secure mode Off.

  Z,B       Global: 'Z' change color mappings; 'B' disable/enable bold
  l,t,m     Toggle Summaries: 'l' load avg; 't' task/cpu stats; 'm' mem info
  1,I       Toggle SMP view: '1' single/separate states; 'I' Irix/Solaris mode
  ..........

13. Decrease Number of Processes Displayed in Top Output – Press n

Press n in interactive mode, which prompts for a number and shows only that many processes. The following example will display only 2 processes at a time.
Maximum tasks = 0, change to (0 is unlimited): 2
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1309 geek   23   0 2483m 1.7g  27m S    0 21.8  45:31.32 gagent
 1882 geek   25   0 2485m 1.7g  26m S    0 21.7  22:38.97 gagent

14. Toggle Top Header to Increase Number of Processes Displayed

By default, top displays as many processes as fit in the window height. If you would like to see additional processes, you can eliminate some of the top header information.
The following is the default header information provided by top.
top - 23:47:32 up 179 days,  3:36,  1 user,  load average: 0.01, 0.03, 0.00
Tasks:  67 total,   1 running,  66 sleeping,   0 stopped,   0 zombie
Cpu(s):   0.7% user,   1.2% system,   0.0% nice,  98.0% idle
Mem:   1017136k total,   954652k used,    62484k free,   138280k buffers
Swap:  3068404k total,    22352k used,  3046052k free,   586576k cached
  • Press l – to hide / show the load average. 1st header line.
  • Press t – to hide / show the CPU states. 2nd and 3rd header line.
  • Press m – to hide / show the memory information. 4th and 5th line.

15. Save Top Configuration Settings – Press W

If you've made any of the interactive top configuration changes suggested in the above examples, you might want to save them for all future top sessions. Once you've saved the top configuration, the next time you invoke top all your saved configuration options will be used automatically.
To save the top configuration, press W, which will write the configuration file to ~/.toprc and display a write confirmation message as shown below.
top - 23:47:32 up 179 days,  3:36,  1 user,  load average: 0.01, 0.03, 0.00
Tasks:  67 total,   1 running,  66 sleeping,   0 stopped,   0 zombie
Cpu(s):   0.7% user,   1.2% system,   0.0% nice,  98.0% idle
Mem:   1017136k total,   954652k used,    62484k free,   138280k buffers
Swap:  3068404k total,    22352k used,  3046052k free,   586576k cached
Wrote configuration to '/home/ramesh/.toprc'

10 Useful Sar (Sysstat) Examples for UNIX / Linux Performance Monitoring

Using sar you can monitor the performance of various Linux subsystems (CPU, memory, I/O, etc.) in real time.
Using sar, you can also collect performance data on an ongoing basis, store it, and do historical analysis to identify bottlenecks.

Sar is part of the sysstat package.
This article explains how to install and configure the sysstat package (which contains the sar utility) and how to monitor the following Linux performance statistics using sar.
  1. Collective CPU usage
  2. Individual CPU statistics
  3. Memory used and available
  4. Swap space used and available
  5. Overall I/O activities of the system
  6. Individual device I/O activities
  7. Context switch statistics
  8. Run queue and load average data
  9. Network statistics
  10. Report sar data from a specific time
This is the only guide you’ll need for sar utility. So, bookmark this for your future reference.

I. Install and Configure Sysstat

Install Sysstat Package

First, make sure the latest version of sar is available on your system. Install it using any one of the following methods depending on your distribution.
sudo apt-get install sysstat
(or)
yum install sysstat
(or)
rpm -ivh sysstat-10.0.0-1.i586.rpm

Install Sysstat from Source

Download the latest version from sysstat download page.
You can also use wget to download it:
wget http://pagesperso-orange.fr/sebastien.godard/sysstat-10.0.0.tar.bz2

tar xvfj sysstat-10.0.0.tar.bz2

cd sysstat-10.0.0

./configure --enable-install-cron
Note: Make sure to pass the option --enable-install-cron. This does the following automatically for you; if you don't configure sysstat with this option, you have to do this ugly job yourself manually.
  • Creates /etc/rc.d/init.d/sysstat
  • Creates appropriate links from /etc/rc.d/rc*.d/ directories to /etc/rc.d/init.d/sysstat to start the sysstat automatically during Linux boot process.
  • For example, /etc/rc.d/rc3.d/S01sysstat is linked automatically to /etc/rc.d/init.d/sysstat
After the ./configure, install it as shown below.
make

make install
Note: This will install sar and the other sysstat utilities under /usr/local/bin.
Once installed, verify the sar version using “sar -V”. Version 10 is the current stable version of sysstat.
$ sar -V
sysstat version 10.0.0
(C) Sebastien Godard (sysstat  orange.fr)
Finally, make sure sar works. For example, the following gives the system CPU statistics 3 times (with 1 second interval).
$ sar 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:27:32 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:27:33 PM       all      0.00      0.00      0.00      0.00      0.00    100.00
01:27:34 PM       all      0.25      0.00      0.25      0.00      0.00     99.50
01:27:35 PM       all      0.75      0.00      0.25      0.00      0.00     99.00
Average:          all      0.33      0.00      0.17      0.00      0.00     99.50

Utilities part of Sysstat

Following are the other sysstat utilities.
  • sar collects and displays ALL system activities statistics.
  • sadc stands for “system activity data collector”. This is the sar backend tool that does the data collection.
  • sa1 stores system activities in a binary data file. sa1 depends on sadc for this purpose. sa1 runs from cron.
  • sa2 creates daily summary of the collected statistics. sa2 runs from cron.
  • sadf can generate sar report in CSV, XML, and various other formats. Use this to integrate sar data with other tools.
  • iostat generates CPU, I/O statistics
  • mpstat displays CPU statistics.
  • pidstat reports statistics based on the process id (PID)
  • nfsiostat displays NFS I/O statistics.
  • cifsiostat generates CIFS statistics.
This article focuses on sysstat fundamentals and sar utility.
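As a quick taste of sadf from the list above, the following sketch exports today's CPU data in a semicolon-separated format that spreadsheets and databases can ingest. It assumes sysstat is installed and that data has already been collected for today; the guard keeps the line harmless elsewhere.

```shell
# -d selects sadf's database-friendly output format; everything after
# the "--" separator is passed through to sar (-u = CPU statistics).
command -v sadf > /dev/null && sadf -d -- -u || true
```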

Collect sar statistics using a cron job – sa1 and sa2

Create a sysstat file under the /etc/cron.d directory that will collect the historical sar data.
# vi /etc/cron.d/sysstat
*/10 * * * * root /usr/local/lib/sa/sa1 1 1
53 23 * * * root /usr/local/lib/sa/sa2 -A
If you’ve installed sysstat from source, the default location of sa1 and sa2 is /usr/local/lib/sa. If you’ve installed using your distribution update method (for example: yum, up2date, or apt-get), this might be /usr/lib/sa/sa1 and /usr/lib/sa/sa2.
Note: To understand cron entries, read Linux Crontab: 15 Awesome Cron Job Examples.

/usr/local/lib/sa/sa1

  • This runs every 10 minutes and collects sar data for historical reference.
  • If you want to collect sar statistics every 5 minutes, change */10 to */5 in the above /etc/cron.d/sysstat file.
  • This writes the data to /var/log/sa/saXX file. XX is the day of the month. saXX file is a binary file. You cannot view its content by opening it in a text editor.
  • For example, if today is the 26th day of the month, sa1 writes the sar data to /var/log/sa/sa26.
  • You can pass two parameters to sa1: interval (in seconds) and count.
  • In the above crontab example: sa1 1 1 means that sa1 collects sar data 1 time with 1 second interval (for every 10 mins).

/usr/local/lib/sa/sa2

  • This runs close to midnight (at 23:53) to create the daily summary report of the sar data.
  • sa2 creates the /var/log/sa/sarXX file (note that this is different from the saXX file created by sa1). The sarXX file created by sa2 is an ASCII file that you can view in a text editor.
  • sa2 will also remove saXX files that are older than a week. So, write a quick shell script that runs every week to copy the /var/log/sa/* files to some other directory if you want to do historical sar data analysis.
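A minimal sketch of such an archive script (the directory locations are assumptions; adjust them for your layout and run it weekly from cron):

```shell
#!/bin/sh
# Copy this month's sysstat data files aside before sa2's weekly
# cleanup removes them. The sa?? glob matches the sa01..sa31 binary
# data files but not the sarXX ASCII reports.
SRCDIR=/var/log/sa
DSTDIR=/var/log/sa-archive/$(date +%Y-%m)
mkdir -p "$DSTDIR"
cp -p "$SRCDIR"/sa?? "$DSTDIR"/ 2> /dev/null || true
```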

II. 10 Practical Sar Usage Examples

There are two ways to invoke sar.
  1. sar followed by an option (without specifying a saXX data file). This will look for the current day's saXX data file and report the performance data recorded up to that point for the current day.
  2. sar followed by an option, additionally specifying a saXX data file using the -f option. This will report the performance data for that particular day (XX is the day of the month).
In all the examples below, we explain how to view certain performance data for the current day. To look at a specific day, add "-f /var/log/sa/saXX" at the end of the sar command.
Every sar command will have the following as the 1st line of its output.
$ sar -u
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)
  • Linux 2.6.18-194.el5PAE – Linux kernel version of the system.
  • (dev-db) – The hostname where the sar data was collected.
  • 03/26/2011 – The date when the sar data was collected.
  • _i686_ – The system architecture
  • (8 CPU) – Number of CPUs available on this system. On multi core systems, this indicates the total number of cores.
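One more option worth knowing up front, which item #10 in the list above refers to: any report can be narrowed to a time window with -s (start time) and -e (end time), given as HH:MM:SS on the 24-hour clock. A hedged sketch (it assumes data has been collected for that window; the command -v guard keeps the line harmless where sysstat is absent):

```shell
# CPU usage between 09:00 and 12:00 from today's data file only.
command -v sar > /dev/null && sar -u -s 09:00:00 -e 12:00:00 || true
```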

1. CPU Usage of ALL CPUs (sar -u)

This gives the cumulative real-time CPU usage of all CPUs. "1 3" reports every 1 second a total of 3 times. Most likely you'll focus on the last field, "%idle", to see the CPU load.
$ sar -u 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:27:32 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:27:33 PM       all      0.00      0.00      0.00      0.00      0.00    100.00
01:27:34 PM       all      0.25      0.00      0.25      0.00      0.00     99.50
01:27:35 PM       all      0.75      0.00      0.25      0.00      0.00     99.00
Average:          all      0.33      0.00      0.17      0.00      0.00     99.50
Following are few variations:
  • sar -u Displays CPU usage for the current day that was collected until that point.
  • sar -u 1 3 Displays real time CPU usage every 1 second for 3 times.
  • sar -u ALL Same as "sar -u" but displays additional fields.
  • sar -u ALL 1 3 Same as "sar -u 1 3" but displays additional fields.
  • sar -u -f /var/log/sa/sa10 Displays CPU usage for the 10th day of the month from the sa10 file.

2. CPU Usage of Individual CPU or Core (sar -P)

If you have 4 Cores on the machine and would like to see what the individual cores are doing, do the following.
"-P ALL" indicates that it should display statistics for ALL the individual cores.
In the following example under “CPU” column 0, 1, 2, and 3 indicates the corresponding CPU core numbers.
$ sar -P ALL 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:34:12 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:34:13 PM       all     11.69      0.00      4.71      0.69      0.00     82.90
01:34:13 PM         0     35.00      0.00      6.00      0.00      0.00     59.00
01:34:13 PM         1     22.00      0.00      5.00      0.00      0.00     73.00
01:34:13 PM         2      3.00      0.00      1.00      0.00      0.00     96.00
01:34:13 PM         3      0.00      0.00      0.00      0.00      0.00    100.00
"-P 1" indicates that it should display statistics only for the 2nd core (note that core numbering starts from 0).
$ sar -P 1 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:36:25 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:36:26 PM         1      8.08      0.00      2.02      1.01      0.00     88.89
Following are few variations:
  • sar -P ALL Displays CPU usage broken down by all cores for the current day.
  • sar -P ALL 1 3 Displays real time CPU usage for ALL cores every 1 second for 3 times (broken down by all cores).
  • sar -P 1 Displays CPU usage for core number 1 for the current day.
  • sar -P 1 1 3 Displays real time CPU usage for core number 1, every 1 second for 3 times.
  • sar -P ALL -f /var/log/sa/sa10 Displays CPU usage broken down by all cores for the 10th day of the month from the sa10 file.

3. Memory Free and Used (sar -r)

This reports the memory statistics. "1 3" reports every 1 second a total of 3 times. Most likely you'll focus on "kbmemfree" and "kbmemused" for free and used memory.
$ sar -r 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

07:28:06 AM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact
07:28:07 AM   6209248   2097432     25.25    189024   1796544    141372      0.85   1921060     88204
07:28:08 AM   6209248   2097432     25.25    189024   1796544    141372      0.85   1921060     88204
07:28:09 AM   6209248   2097432     25.25    189024   1796544    141372      0.85   1921060     88204
Average:      6209248   2097432     25.25    189024   1796544    141372      0.85   1921060     88204
Following are few variations:
  • sar -r
  • sar -r 1 3
  • sar -r -f /var/log/sa/sa10

4. Swap Space Used (sar -S)

This reports the swap statistics. "1 3" reports every 1 second a total of 3 times. If "kbswpused" and "%swpused" are at 0, your system is not swapping.
$ sar -S 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

07:31:06 AM kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
07:31:07 AM   8385920         0      0.00         0      0.00
07:31:08 AM   8385920         0      0.00         0      0.00
07:31:09 AM   8385920         0      0.00         0      0.00
Average:      8385920         0      0.00         0      0.00
Following are few variations:
  • sar -S
  • sar -S 1 3
  • sar -S -f /var/log/sa/sa10
Notes:
  • Use “sar -R” to identify number of memory pages freed, used, and cached per second by the system.
  • Use “sar -H” to identify the hugepages (in KB) that are used and available.
  • Use “sar -B” to generate paging statistics. i.e Number of KB paged in (and out) from disk per second.
  • Use “sar -W” to generate page swap statistics. i.e Page swap in (and out) per second.

5. Overall I/O Activities (sar -b)

This reports I/O statistics. "1 3" reports every 1 second a total of 3 times.
The following fields are displayed in the example below:
  • tps – Transactions per second (this includes both read and write)
  • rtps – Read transactions per second
  • wtps – Write transactions per second
  • bread/s – Blocks read per second (in sar, a block is a 512-byte sector)
  • bwrtn/s – Blocks written per second
$ sar -b 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:56:28 PM       tps      rtps      wtps   bread/s   bwrtn/s
01:56:29 PM    346.00    264.00     82.00   2208.00    768.00
01:56:30 PM    100.00     36.00     64.00    304.00    816.00
01:56:31 PM    282.83     32.32    250.51    258.59   2537.37
Average:       242.81    111.04    131.77    925.75   1369.90
Following are a few variations:
  • sar -b
  • sar -b 1 3
  • sar -b -f /var/log/sa/sa10
Note: Use "sar -v" to display the number of inode handlers, file handlers, and pseudo-terminals used by the system.
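Because bread/s and bwrtn/s are reported in 512-byte blocks (per the sar man page) rather than bytes, a conversion is sometimes handy. A hedged sketch; the helper name is my own invention:

```shell
# Hypothetical helper: convert sar's bread/s or bwrtn/s (512-byte
# blocks per second, per the sar man page) into KB per second
blocks_to_kb() {
  awk -v blocks="$1" 'BEGIN { printf "%.2f\n", blocks * 512 / 1024 }'
}

blocks_to_kb 2208.00   # the peak bread/s sample above -> 1104.00 KB/s
```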

6. Individual Block Device I/O Activities (sar -d)

To identify the activity of individual block devices (i.e., a specific mount point, LUN, or partition), use "sar -d".
$ sar -d 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:59:45 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
01:59:46 PM    dev8-0      1.01      0.00      0.00      0.00      0.00      4.00      1.00      0.10
01:59:46 PM    dev8-1      1.01      0.00      0.00      0.00      0.00      4.00      1.00      0.10
01:59:46 PM dev120-64      3.03     64.65      0.00     21.33      0.03      9.33      5.33      1.62
01:59:46 PM dev120-65      3.03     64.65      0.00     21.33      0.03      9.33      5.33      1.62
01:59:46 PM  dev120-0      8.08      0.00    105.05     13.00      0.00      0.38      0.38      0.30
01:59:46 PM  dev120-1      8.08      0.00    105.05     13.00      0.00      0.38      0.38      0.30
01:59:46 PM dev120-96      1.01      8.08      0.00      8.00      0.01      9.00      9.00      0.91
01:59:46 PM dev120-97      1.01      8.08      0.00      8.00      0.01      9.00      9.00      0.91
In the above example, "DEV" indicates the specific block device.
For example: "dev120-64" means a block device with 120 as the major number and 64 as the minor number.
The DEV column can display the actual device name (for example: sda, sda1, sdb1, etc.) if you use the -p option (pretty print) as shown below.
$ sar -p -d 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:59:45 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
01:59:46 PM       sda      1.01      0.00      0.00      0.00      0.00      4.00      1.00      0.10
01:59:46 PM      sda1      1.01      0.00      0.00      0.00      0.00      4.00      1.00      0.10
01:59:46 PM      sdb1      3.03     64.65      0.00     21.33      0.03      9.33      5.33      1.62
01:59:46 PM      sdc1      3.03     64.65      0.00     21.33      0.03      9.33      5.33      1.62
01:59:46 PM      sde1      8.08      0.00    105.05     13.00      0.00      0.38      0.38      0.30
01:59:46 PM      sdf1      8.08      0.00    105.05     13.00      0.00      0.38      0.38      0.30
01:59:46 PM      sda2      1.01      8.08      0.00      8.00      0.01      9.00      9.00      0.91
01:59:46 PM      sdb2      1.01      8.08      0.00      8.00      0.01      9.00      9.00      0.91
Following are a few variations:
  • sar -d
  • sar -d 1 3
  • sar -d -f /var/log/sa/sa10
  • sar -p -d
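If -p is not available, the devM-N labels in the first listing can be decoded by hand and matched against the major/minor columns of /proc/partitions. A hedged sketch; the helper name is my own:

```shell
# Hypothetical helper: split sar's devM-N device label into the
# "major minor" pair used by /proc/partitions and `ls -l /dev`
dev_to_majmin() {
  echo "$1" | sed 's/^dev//; s/-/ /'
}

dev_to_majmin dev120-64   # -> 120 64
```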

7. Display context switch per second (sar -w)

This reports the total number of processes created per second and the total number of context switches per second. "1 3" means samples are taken every 1 second, 3 times in total.
$ sar -w 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

08:32:24 AM    proc/s   cswch/s
08:32:25 AM      3.00     53.00
08:32:26 AM      4.00     61.39
08:32:27 AM      2.00     57.00
Following are a few variations:
  • sar -w
  • sar -w 1 3
  • sar -w -f /var/log/sa/sa10

8. Reports run queue and load average (sar -q)

This reports the run queue size and the load averages for the last 1, 5, and 15 minutes. "1 3" means samples are taken every 1 second, 3 times in total.
$ sar -q 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

06:28:53 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
06:28:54 AM         0       230      2.00      3.00      5.00         0
06:28:55 AM         2       210      2.01      3.15      5.15         0
06:28:56 AM         2       230      2.12      3.12      5.12         0
Average:            3       230      3.12      3.12      5.12         0
Note: The "blocked" column displays the number of tasks currently blocked and waiting for I/O to complete.
Following are a few variations:
  • sar -q
  • sar -q 1 3
  • sar -q -f /var/log/sa/sa10
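Since the report header shows the CPU count (8 CPUs above), one rough way to read the ldavg figures is per CPU. A hedged sketch; the helper name is my own:

```shell
# Hypothetical helper: divide a load average by the CPU count to get
# an approximate per-core load (the sample header above reports 8 CPUs)
load_per_cpu() {
  awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.2f\n", load / ncpu }'
}

load_per_cpu 2.00 8   # -> 0.25
```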

9. Report network statistics (sar -n)

This reports various network statistics: for example, the number of packets received (transmitted) through the network card, packet failure statistics, etc. "1 3" means samples are taken every 1 second, 3 times in total.
sar -n KEYWORD
KEYWORD can be one of the following:
  • DEV – Displays vital statistics for network devices (eth0, eth1, etc.)
  • EDEV – Display network device failure statistics
  • NFS – Displays NFS client activities
  • NFSD – Displays NFS server activities
  • SOCK – Displays sockets in use for IPv4
  • IP – Displays IPv4 network traffic
  • EIP – Displays IPv4 network errors
  • ICMP – Displays ICMPv4 network traffic
  • EICMP – Displays ICMPv4 network errors
  • TCP – Displays TCPv4 network traffic
  • ETCP – Displays TCPv4 network errors
  • UDP – Displays UDPv4 network traffic
  • SOCK6, IP6, EIP6, ICMP6, UDP6 are for IPv6
  • ALL – This displays all of the above information. The output will be very long.
$ sar -n DEV 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:11:13 PM     IFACE   rxpck/s   txpck/s   rxbyt/s   txbyt/s   rxcmp/s   txcmp/s  rxmcst/s
01:11:14 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:11:14 PM      eth0    342.57    342.57  93923.76 141773.27      0.00      0.00      0.00
01:11:14 PM      eth1      0.00      0.00      0.00      0.00      0.00      0.00      0.00
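In this sysstat version the DEV report shows rxbyt/s and txbyt/s in bytes per second, which is easier to read as megabits. A hedged sketch; the helper name is my own:

```shell
# Hypothetical helper: convert sar's rxbyt/s or txbyt/s (bytes per
# second in this sysstat version) into megabits per second
bytes_to_mbit() {
  awk -v bytes="$1" 'BEGIN { printf "%.2f\n", bytes * 8 / 1000000 }'
}

bytes_to_mbit 141773.27   # the eth0 txbyt/s sample above -> 1.13 Mbit/s
```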

10. Report Sar Data Using Start Time (sar -s)

When you view historic sar data from the /var/log/sa/saXX file using the "sar -f" option, it displays all the sar data for that specific day, starting from midnight.
Using the "-s hh:mi:ss" option, you can specify a start time. For example, if you specify "sar -s 10:00:00", it will display the sar data starting from 10 a.m. (instead of from midnight) as shown below.
You can combine the -s option with other sar options.
For example, to report the load average on the 23rd of this month starting at 10 a.m., combine the -q and -s options as shown below.
$ sar -q -f /var/log/sa/sa23 -s 10:00:01
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

10:00:01 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
10:10:01 AM         0       127      2.00      3.00      5.00         0
10:20:01 AM         0       127      2.00      3.00      5.00         0
...
11:20:01 AM         0       127      5.00      3.00      3.00         0
12:00:01 PM         0       127      4.00      2.00      1.00         0
Most sysstat versions also provide "-e hh:mi:ss" to limit the end time; if yours does not, you can get creative and use the head command as shown below.
For example, starting from 10 a.m., if you want to see 7 entries, pipe the above output to "head -n 10".
$ sar -q -f /var/log/sa/sa23 -s 10:00:01 | head -n 10
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

10:00:01 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
10:10:01 AM         0       127      2.00      3.00      5.00         0
10:20:01 AM         0       127      2.00      3.00      5.00         0
10:30:01 AM         0       127      3.00      5.00      2.00         0
10:40:01 AM         0       127      4.00      2.00      1.00         2
10:50:01 AM         0       127      3.00      5.00      5.00         0
11:00:01 AM         0       127      2.00      1.00      6.00         0
11:10:01 AM         0       127      1.00      3.00      7.00         2
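The head arithmetic above can be captured in a tiny helper. A hedged sketch, relying on sar printing three header lines (kernel line, blank line, column header) before the data rows; the function name is my own:

```shell
# Hypothetical helper: compute the "head -n" argument needed to show
# N sar entries (3 header lines precede the data rows)
head_count() {
  echo $(( $1 + 3 ))
}

head_count 7   # -> 10, matching the "head -n 10" used for 7 entries above
```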

Monday, November 5, 2012

SYSSTAT Howto: A Deployment and Configuration Guide for Linux Servers

I remember one of the first things that attracted me to computers--well, besides Pac-Man--was the blinking lights. Blue, amber, red, or green, I didn’t really care about the color, rather what they meant. I was curious about what exactly each light was reporting. I’m the type of fellow who likes to take things apart to learn what makes them tick.
It started with an altimeter that my father, who was in the Air Force, brought home. I was instantly curious about how it worked. It was a good thing it was a gift and not something that had to be put back on the plane, as springs and gears shot out shortly after I cracked it open. Like Humpty Dumpty, I was never able to get that altimeter back together again; however, it did increase my hunger to understand not just how something works, but how you figure out what’s going on.
SYSSTAT is a software application comprised of several tools that offers advanced system performance monitoring. It provides the ability to create a measurable baseline of server performance, as well as the capability to formulate, accurately assess and conclude what led up to an issue or unexpected occurrence. In short, it lets you peel back layers of the system to see how it’s doing... in a way it is the blinking light telling you what is going on, except it blinks to a file. SYSSTAT has broad coverage of performance statistics and will watch the following server elements:
  • Input/Output and transfer rate statistics (global, per device, per partition, per network filesystem and per Linux task / PID)
  • CPU statistics (global, per CPU and per Linux task / PID), including support for virtualization architectures
  • Memory and swap space utilization statistics
  • Virtual memory, paging and fault statistics
  • Per-task (per-PID) memory and page fault statistics
  • Global CPU and page fault statistics for tasks and all their children
  • Process creation activity
  • Interrupt statistics (global, per CPU and per interrupt, including potential APIC interrupt sources)
  • Extensive network statistics: network interface activity (number of packets and kB received and transmitted per second, etc.) including failures from network devices; network traffic statistics for IP, TCP, ICMP and UDP protocols based on SNMPv2 standards.
  • NFS server and client activity
  • Socket statistics
  • Run queue and system load statistics
  • Kernel internal tables utilization statistics
  • System and per Linux task switching activity
  • Swapping statistics
  • TTY device activity
(List source - http://pagesperso-orange.fr/sebastien.godard/features.html)

Scope

This article covers a brief overview of how the SYSSTAT utility works, initial configuration, deployment and testing on Linux based servers. It includes an optional system configuration guide for writing SYSSTAT data into a MySQL database. This article is not intended to be an in-depth explanation of the inner workings of SYSSTAT, nor a detailed manual on database storage operations.
Now... on to the interesting parts of SYSSTAT!

Overview

The SYSSTAT software application is composed of several utilities. Each utility has a specific function:
  • iostat reports CPU statistics and input/output statistics for devices, partitions and network filesystems.
  • mpstat reports individual or combined processor related statistics.
  • pidstat reports statistics for Linux tasks (processes) : I/O, CPU, memory, etc.
  • sar collects, reports and saves system activity information (CPU, memory, disks, interrupts, network interfaces, TTY, kernel tables, NFS, sockets etc.)
  • sadc is the system activity data collector, used as a backend for sar.
  • sa1 collects and stores binary data in the system activity daily data file. It is a front end to sadc designed to be run from cron.
  • sa2 writes a summarized daily activity report. It is a front end to sar designed to be run from cron.
  • sadf displays data collected by sar in multiple formats (CSV, XML, etc.) This is useful to load performance data into a database, or import them in a spreadsheet to make graphs.
(List source - http://pagesperso-orange.fr/sebastien.godard/documentation.html)
The four main components used in collection activities are sar, sa1, sa2 and cron. Sar is the system activity reporter. This tool displays interpreted results from the collected data. Sar is run interactively by an administrator via the command line. When a sar file is created, it is written into the /var/log/sa directory and named sar##. The ## is a numerical value that represents the day of the month (i.e. sar03 would be the third day of the month). The numerical value changes accordingly without system administrator intervention. There are many option flags to choose from when displaying data in a sar file, covering server operations such as CPU, network activity, NFS and sockets. These options can be viewed in the man pages of sar.
Sa1 is the internal mechanism that performs the actual statistical collection and writes the data to a binary file at specified times. Information is culled from the /proc directory, where the Linux kernel writes and maintains pertinent data while the operating system is running. Similar to sar, the binary file is written into /var/log/sa and named sa##. Again, the ## represents the day of the month (i.e. sa03 would be the third day of the month). Once more, the numerical value changes accordingly without system administrator intervention.
Sa2 is responsible for converting the sa1 binary file into a human readable format. Upon successful creation of the binary file sa## it becomes necessary to set up a cron task that will call the sa2 libraries to convert the sa1 binary file into the human-readable sar file. SYSSTAT utilizes the scheduled cron command execution to draw and record specified performance data based upon pre-defined parameters. It is not necessary to run the sa2 cron at the same time or as often as the sa1 cron. The sa2 function will create and write the sar file to the /var/log/sa directory.
How often SYSSTAT “wakes up” to record and what data is captured, is determined by your operational needs, regulatory requirements and purposes of the server being monitored. These logs can be rotated to a central logging server and stored for analysis at a later date if desired.

SYSSTAT Configuration

Now that you have the 40,000-foot overview of the components, onward to the nitty gritty of building out your SYSSTAT capabilities. The following is a suggested base configuration. You can tweak for your environment, but I will step through a traditional set up. My testing environment utilized a SUSE 10 Linux server.
As a sidebar, every Linux-based server I have come across, installed or worked with has had the SYSSTAT package deployed as part of the base server setup at installation. I would suggest, however, that you always look at the option to upgrade to the latest version of SYSSTAT. It will offer bug fixes and more performance monitoring elements, such as:
  • Autoconf support added.
  • Addition of a new command ("pidstat") aimed at displaying statistics for processes, threads and their children (CPU, memory, I/O, task switching activity...)
  • Better hotplug CPU support.
  • New VM paging metrics added to sar -B.
  • Added field tcp-tw (number of sockets in TIME_WAIT state) to sar -n SOCK.
  • iostat can now display the registered device name of device-mapper devices.
  • Timestamped comments can now be inserted into data files created by sadc.
  • XML Schema document added. Useful with sadf -x.
  • National Language Support improved: Added Danish, Dutch, Kirghiz, Vietnamese and Brazilian Portuguese translations.
  • Options -x and -X removed from sar. You should now use pidstat instead.
  • Some obsolete fields (super*, dquot* and rtsig*) were removed from sar -v. Added field pty-nr (number of pseudo-terminals).
(List source - http://pagesperso-orange.fr/sebastien.godard/features.html)

Create Scheduled SYSSTAT Monitoring

First things first--we need to tell our machine to record sar data. The kernel needs to be aware that it is to run SYSSTAT to collect metrics. Depending upon your distribution, installation creates a basic cron file named sysstat.cron in /etc/sysstat/ for SUSE, or /etc/cron.d/sysstat for Red Hat. If you use SUSE, it is recommended to create SYSSTAT’s cron job as a soft link in /etc/cron.d pointing to the sysstat.cron in /etc/sysstat/. I would also suggest creating several gathering times based on when a server has the potential to be more active. This will ensure collection of accurate statistics. What good is it to see that your server is never busy at 2 in the morning? Data collected only during off-peak hours would skew later analysis and has the potential to cause erroneous interpretation.
 #Crontab for sysstat
 #Activity reports culled and updated every 10 minutes every day
 */10 *   * * *     root  /usr/lib/sa/sa1
 #Update log sar reports every day at 2330 hours
 30 23 * * *      root  /usr/lib/sa/sa2 -A
After the softlink has been created, restart the cron daemon to allow it to reload the new assignment:
 # rccron restart
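As suggested above, several sa1 gathering times can be scheduled so sampling is denser when the server is busier. A hedged crontab sketch; the intervals are examples only, and paths should be adjusted for your distribution:

```
 #Denser sampling during weekday business hours
 */5 8-18 * * 1-5    root  /usr/lib/sa/sa1 1 1
 #Sparser sampling overnight
 */20 0-7,19-23 * * *    root  /usr/lib/sa/sa1 1 1
```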
Let's use a script to create a sar backup file and offload to a specified location:
 #!/bin/bash
 # Created 04-AUG-09 / Kryptikos

  # Script to create backup and rename with current hostname and date of sar file and offload to storage facility.

  # Cycle through directory once looking for pertinent sar files.

  for sarname in /var/log/sa/sar*
         do
           mv "$sarname" "/var/log/sa/${HOSTNAME}_$(basename "$sarname")_$(date +%Y%m%d).bkup"
         done

 # This section will need to be modified. It is a place holder for code to offload the designated sar backup
 # file just created to the localhost, a central logging host or database server. This is dependent upon your ops.
 # i.e. scp /var/log/<sarlogname> <username>@<host>:/<desired location>

 <insert some code to handle the transfer via scp / ssh / mv / nc / or other method>

  exit
SYSSTAT will now run and collect the sar log, rename it and then offload it to the location you prefer.
Well that’s great you say, but what if you don’t have time to comb through 30 days worth of sar reports and just need a quick snap shot? Another neat thing you can do with this tool is capture real time statistics of what is going on with your machine. For instance at the command line you can enter:
 # sar -n NFS 5 3
That will have SYSSTAT report all NFS activity at five-second intervals, three times, and report it back to the terminal.
 root@mymachine # sar -n NFS 5 3
 Linux 2.6.18 (mymachine)        08/04/2009

 02:50:39 PM    call/s retrans/s    read/s   write/s  access/s  getatt/s
 02:50:44 PM      0.00      0.00      0.00      0.00      0.00      0.00
 02:50:49 PM      0.00      0.00      0.00      0.00      0.00      0.00
 02:50:54 PM      0.00      0.00      0.00      0.00      0.00      0.00
 Average:         0.00      0.00      0.00      0.00      0.00      0.00
So you have your sar data recording, and you now know how to use it for real-time checking. You are still left with quite a bit of data to comb through. It may be that you don’t need to look through your data until there is a problem, and you’d like to trace back to the point in time when your server started having issues. The next section of the article covers an advanced configuration for storing the sar data for later retrieval.

MySQL Database Configuration

If you are running a server that does not carry a heavy user load, it is okay to keep the sar data local on the box. However, with the large volumes of system performance data collected from a Linux server farm running numerous applications, I would suggest establishing a database for storing the relevant SYSSTAT information. By utilizing a MySQL database, customized data may be reviewed at any time, allowing the creation of reports, including charts, that are more granular in nature. It also allows analysis of cross sections of pertinent SYSSTAT data from multiple servers at one time. This alleviates the need for an administrator/engineer to review individual sar log files line by line when attempting to troubleshoot or identify issues. The use of a database decreases the time required to locate and diagnose the root cause(s) of a server issue. This section covers database creation, setup and the methodology to import the recorded logs. It is recommended to install MySQL version 5.1 or later to take advantage of enhanced features and increased performance.

Start the MySQL Daemon

It stands to reason that to run the MySQL daemon you must have already installed MySQL. If you have not installed MySQL, now’s the perfect time to pause and grab the latest copy. You should be able to use your distribution’s package manager to install the database; if not, it is just as easy (and the method I typically use) to simply pull a copy from http://dev.mysql.com/downloads/mysql/5.1.html#downloads. Once you have MySQL installed you can start it with the following command:
 # <location of mysql>/bin/mysqld_safe --user=mysql --local-infile=1
The option --user= tells the daemon to run as user mysql (which must be a local POSIX account). It is not recommended to run the MySQL daemon as root. The option --local-infile=1 enables LOAD DATA LOCAL INFILE, which allows pushing tab-delimited, CSV and text files into a database from a file stored locally on the client host.
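Rather than passing --local-infile on every start, the same setting can live in the server option file. A hedged my.cnf sketch (the file's location varies by distribution):

```
 [mysqld]
 local_infile=1
```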

Create the SYSSTAT Database

Getting down to business now that the database is up and running, it is time to create the infrastructure we want to hang our sar data upon.
  1. Connect to the MySQL server from the command line:
    # <location of mysql>/bin/mysql --user=<username> -p
    Again, the user must exist on the MySQL server; it is not a POSIX user account. The -p prompts for a password to connect to the MySQL server.
  2. From the MySQL prompt check that the database does not already exist:
    mysql > SHOW databases;
  3. Create the database:
    mysql > CREATE DATABASE <name of database>;
  4. Grant privileges on the newly created database to a specified MySQL user account:
    mysql > GRANT ALL ON <name of database>.* TO '<mysql username>'@'localhost';
This grants the specified MySQL user full control over the database, but only when connecting from the localhost the MySQL daemon is running on. If you prefer to access it from other locations for administrative purposes, execute the additional command:
mysql > GRANT ALL ON <name of database>.* TO '<mysql username>'@'%';
It is possible to restrict access to certain networks or domains. For security, if a database is deployed, I would create a “workhorse” account to perform the upload. This workhorse account would only have the privileges needed to load data into the desired tables. In my case I chose to name my database “sysstat_collection”.
Before:
 +------------------------------+
 | Database                     |
 +------------------------------+
 | information_schema           | 
 | menagerie                    | 
 | mysql                        | 
 | test                         | 
 +------------------------------+
 4 rows in set (0.00 sec)
After:
 +------------------------------+
 | Database                     |
 +------------------------------+
 | information_schema           | 
 | menagerie                    | 
 | mysql                        | 
 | sysstat_collection           |
 | test                         | 
 +------------------------------+
 5 rows in set (0.00 sec)
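The restricted “workhorse” account mentioned above can be sketched in SQL. The account name and password here are examples only; note that LOAD DATA LOCAL INFILE requires the INSERT privilege on the target tables:

```sql
-- Hypothetical loader account with only the privilege needed
-- to push sar data into the collection database
CREATE USER 'sar_loader'@'localhost' IDENTIFIED BY 'example_password';
GRANT INSERT ON sysstat_collection.* TO 'sar_loader'@'localhost';
```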

Create the Necessary Tables

We now have our database, but we need to take it one more step. The database has to be “made-ready” to accept incoming sar data. This is done by building tables. Think of tables as a bookcase. Each block (table) will hold books (sar data). The easiest method to insert tables into your database is to create and utilize sql scripts. These scripts can be quickly invoked by the MySQL daemon and pushed inside the database. Each script should have a unique name that ends with the .sql extension. A basic SYSSTAT configuration would require 18 tables. I’ve written an example sql script for you to use:
Example SQL script: create_cpuutilization_table.sql
 # Created 04-AUG-09 / Kryptikos

  # Drop the CPU UTILIZATION table if it exists, then recreate it.

  DROP TABLE IF EXISTS cpuutilization;

  CREATE TABLE cpuutilization
 (
   hostname    VARCHAR(20),
   datestamp   DATE,
   time        VARCHAR(8),
   cpu         VARCHAR(3),
   pct_user    DECIMAL(10,2),
   pct_nice    DECIMAL(10,2),
   pct_system  DECIMAL(10,2),
   pct_iowait  DECIMAL(10,2),
   pct_steal   DECIMAL(10,2),
   pct_idle    DECIMAL(10,2)
 );
Breaking that down into understandable chunks, CPU utilization is one of the items SYSSTAT will record. SYSSTAT will stamp the kernel, hostname, date, time and then sar value in the string (see the output in the “real-time” example earlier in the article). I know what kernel version I am using so really all I am interested in is hostname (because I capture multiple servers to this database), date, time and cpu elements. Each sar value has its own elements. The quickest way to view this is to use the man pages of sar to see what values sar records.
Remember, thinking of a database table as a bookshelf: the above values hostname, datestamp, time, cpu, pct_user, pct_nice, etc. are the open shelves. You have to have the shelf before you can place a book. For each book type (sar element) you have, you need a shelf (table column).
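Once rows are loaded, the payoff is ad hoc queries. A hedged example against the table created above; the date value is illustrative:

```sql
-- Average I/O wait per host for one day, from the cpuutilization table
SELECT hostname, AVG(pct_iowait) AS avg_iowait
FROM cpuutilization
WHERE datestamp = '2009-08-04'
GROUP BY hostname;
```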

Deploy the Tables into the MySQL Database

The great thing about .sql scripts is they can be invoked directly by the MySQL daemon. It is not necessary to log into and obtain a MySQL prompt from the server. You can simply feed the script file to the daemon which will parse and execute the commands on your behalf. Lather, rinse and repeat for each table you wish to insert into your database, or create a bash script to feed the .sql scripts all at once. The following is the command structure to execute the table creation script:
# /usr/local/mysql/bin/mysql -u <user> -p -D <database> < create_cpuutilization_table.sql
Again, the -u tells the daemon to run the script as the specified MySQL user account (not POSIX). The user must have privileges on the database to allow modification. The -p again prompts for the user password, and the -D specifies which database you want to execute the script contents upon. The redirect sign “<” feeds the script to the daemon. You can bypass the password prompt and stream your password directly in by changing -p to --password=<passwd value>.
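Streaming the password with --password exposes it in ps output and shell history; a hedged alternative is a per-user ~/.my.cnf option file, restricted with chmod 600 (placeholders as above):

```
 [client]
 user=<mysql username>
 password=<passwd value>
```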

Loading SYSSTAT Logs Into the MySQL Database

Moving right along: once the cron job and backup script have run, it is necessary to format the data from the sar file to prepare it for loading into the SYSSTAT database. The elegance of sar is that it loads the data into tabulated columns in one large file and separates sections with blank lines. By utilizing the stream editor (sed) and translate (tr) capabilities, we can quickly parse the sar file into bite-size pieces ready to load into their respective tables. A few things are worth noting here. First, I recommend processing sar data before 0000 hours (midnight) to maintain time/date integrity. Second, in preparation I like to stash and load sar data from its own directory, say /var/log/sysstatdbprepare, or from the /tmp directory. In the environment I work in, I have numerous servers reporting and prefer one location where sar logs are stored.
The script listed below is an example of formatting and uploading data into a database. Variables inside the script should be changed to fulfill operational requirements. Spaces are included to increase legibility in this article:
 #!/bin/bash
 # Created 04-AUG-09 / Kryptikos

 # This script will parse the sar file and prepare/format the data to make it
 # available to upload into a MySQL database. It will then call and upload the
 # data into the selected MySQL database and its respective tables.

 # Set miscellaneous variables needed.
 DATESTAMP=$(date '+%Y-%m-%d')
 WORKDIR=/tmp/sysstatdbprepare
 FMATDIR=/tmp/sysstatdbformatted

 # Begin main.

 # Change into work directory.

 cd $WORKDIR

 # Start preprocessing formatting.

 for file in *;
 do

 # Prepare and format designated hosts' sar log files to be loaded into MYSQL database:

 sed -n "/proc/,/cswch/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_taskcreation.csv

 sed -n "/cswch/,/CPU/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_systemswitchingactivity.csv

 sed -n "/user/,/INTR/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_cpuutilization.csv

 sed -n "/INTR/,/CPU/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_irqinterrupts.csv

 sed -n "/i000/,/pswpin/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_inputactivityperprocperirq.csv

 sed -n "/pswpin/,/tps/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_swappingstatistics.csv

 sed -n "/tps/,/frmpg/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_iotransferrate.csv

 sed -n "/frmpg/,/TTY/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_memorystatistics.csv

 sed -n "/TTY/,/IFACE/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_ttydeviceactivity.csv

 sed -n "/IFACE/,/rxerr/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_networkstatistics.csv

 sed -n "/rxerr/,/call/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_networkstatisticserrors.csv

 sed -n "/call/,/scall/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_networkstatisticsnfsclientactvty.csv

 sed -n "/scall/,/pgpgin/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_networkstatisticsnfsserveractvty.csv

 sed -n "/pgpgin/,/kbmemfree/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_pagingstatistics.csv

 sed -n "/kbmemfree/,/dentunusd/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_memoryswapspaceutilization.csv

 sed -n "/dentunusd/,/totsck/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_inodefilekerneltable.csv

 sed -n "/totsck/,/runq-sz/ p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_queuelengthloadavgs.csv_networkstatisticssocket.csv

 sed -n "/runq-sz/,// p" "$file" | sed "$ d" | tr -s '[:blank:]' | sed -n '1h;2,$H;${g;s/ /,/g;p}' | sed '/Average:/ d' | sed "s/^/$HOSTNAME,$DATESTAMP,/" | sed '$d' > "$FMATDIR"/"$file"_queuelengthloadavgs.csv

 done

 # Kick off uploading formatted data into the MySQL database:

 # Change into the format directory.

 cd "$FMATDIR"

 # Pushing data into MySQL via the -e flag.

 for file in `dir -d *`;
 do

 /usr/local/mysql/bin/mysql -u <MySQLuser> --password=<password> -D <database> -e "LOAD DATA LOCAL INFILE '/tmp/sysstatdbformatted/${file}' INTO TABLE `echo $file | sed 's/\.csv//g' | awk -F_ '{print $2}'` FIELDS TERMINATED BY ',' IGNORE 1 LINES;"

 done
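The nested backticks that compute the table name inside the quoted SQL string do work, but they are easy to misread. One option is to derive the name first and interpolate a plain variable into the statement; the helper below is a sketch of my own (the `table_for` name is not from the script above):

```shell
#!/bin/sh
# Sketch only: derive the target table name from a formatted CSV
# filename before building the LOAD DATA statement.
table_for() {
  # e.g. "sa06_pagingstatistics.csv" -> "pagingstatistics"
  echo "$1" | sed 's/\.csv$//' | awk -F_ '{print $2}'
}

table_for sa06_pagingstatistics.csv
# -> pagingstatistics
```

With `table=$(table_for "$file")`, the `-e` string simplifies to `... INTO TABLE ${table} FIELDS TERMINATED BY ',' IGNORE 1 LINES;`.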

Wrap-Up and Overview

That’s it: just a few scripts and a little bit of time to set up automation that loads your data into a database. If the syntax in your scripts is correct, you should now be able to log into the MySQL server and see the data loaded into the tables. I tend to use MySQL Administrator (a GUI tool) to look at the databases and tables, as it is a bit quicker for spot checks.
The following points are suggested recommendations for implementing sysstat on your Linux server(s):
  • Schedule the cron job via a soft link in /etc/cron.d pointing to sysstat.cron in /etc/sysstat/.
  • Schedule multiple sa1 cron entries: record statistics more often during peak utilization and less often during off-peak hours.
  • Use a MySQL database to store collected data for a minimum of 30 - 45 days before purging records and restarting the storage process.
  • If the database option is not chosen: write sar files to a central logging server and rename them with the hostname and current date.
  • If the database option is not chosen: store the renamed sar files for 25 - 30 days until they are purged by the central logging server.
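The second recommendation above could look something like the /etc/cron.d fragment below. This is a sketch under assumptions: the peak window (08:00-18:59), the intervals, and the sa1 path are all mine; Debian-based systems ship sa1 in /usr/lib/sysstat/, while Red Hat-based systems use /usr/lib64/sa/, so adjust the path for your distribution.

```
# Example /etc/cron.d/sysstat entries -- peak hours assumed 08:00-18:59.
# Every 5 minutes during business hours:
*/5 8-18 * * * root /usr/lib/sysstat/sa1 1 1
# Every 20 minutes off-peak:
*/20 0-7,19-23 * * * root /usr/lib/sysstat/sa1 1 1
```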
sysstat can give you a wealth of information about what is going on with your server. It lets you watch historical trends of when your server is being utilized, how heavy the use is, and a host of other empirical data, and it helps you focus on root-cause analysis if your server suddenly has issues. You are limited only by your imagination as to how it can complement troubleshooting. If you do end up using a database, there are packages out there that will generate graphs for easier interpretation, or you could even write some PHP code and pull up the data in a web browser.
The thing I love about Linux is how I can keep breaking things apart, learn how they work, and then deploy them based on my needs. If you have suggestions for content, or have questions, please feel free to comment. Either way, I hope this helped a little bit in your environment.