
collectl performance monitoring tool – Explained

March 6, 2017


Collectl – An Awesome All-in-One Performance Analysis Tool for Linux

Collectl is a lightweight performance monitoring tool that can report interactively as well as log to disk. Most monitoring tools focus on a small set of statistics, format their output in only one way, or run either interactively or as a daemon but not both; collectl tries to do it all. You can choose to monitor any of a broad set of subsystems, which currently include buddyinfo, cpu, disk, inodes, infiniband, lustre, memory, network, nfs, processes, quadrics, slabs, sockets and tcp. Several of them can be monitored at the same time and reported in an easy-to-read format.

The following block diagram shows that collectl is much more than a tool that reads data from /proc and writes its results into a file or on the terminal.

Installation

To install collectl on Red Hat/CentOS:

# yum install collectl -y

To install on Debian-family distributions such as Ubuntu:

# apt-get install collectl -y

Once installed, run collectl with no arguments to see the current status of the server. By default it reports CPU, disk and network statistics, as below:

# collectl

Usage:

Collectl Subsystem Theory:

If you write data for CPUs and DISKs to a raw file and play it back with -sc, you will only see CPU data. If you play it back with -scm you will still only see CPU data since memory data was not collected. However, when used with -P, collectl will always honor the subsystems specified with this switch so in the previous example you will see CPU data plus memory data of all 0s. To see the current set of default subsystems, which are a subset of this full list, use -h.

You can also use + or - to add or subtract subsystems to/from the default values. For example, “-s-cdn+N” will remove cpu, disk and network monitoring from the defaults while adding network detail.

The default is “cdn”, which stands for CPU, Disk and Network summary data.
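To make the + and - rules concrete, here is a small shell sketch of how such a spec could be resolved against the default set. It is a simplified model for illustration only, not collectl's actual parser, and the function name resolve_subsys is hypothetical.

```shell
# Simplified model of the "+"/"-" subsystem syntax (NOT collectl's own parser).
# $1 = default subsystems (e.g. cdn), $2 = the text that follows -s (e.g. -cdn+N)
resolve_subsys() {
    result="$1"
    spec="$2"
    case "$spec" in
        [+-]*) ;;                              # relative spec: adjust the defaults
        *) printf '%s\n' "$spec"; return 0 ;;  # plain spec: replaces the defaults
    esac
    mode=""
    while [ -n "$spec" ]; do
        ch="${spec%"${spec#?}"}"   # first character of the spec
        spec="${spec#?}"           # remainder of the spec
        case "$ch" in
            +|-) mode="$ch" ;;     # switch between adding and removing
            *)
                if [ "$mode" = "+" ]; then
                    # add the letter unless it is already present
                    case "$result" in
                        *"$ch"*) ;;
                        *) result="$result$ch" ;;
                    esac
                else
                    # remove the letter (matching is case-sensitive: n is not N)
                    result=$(printf '%s' "$result" | tr -d "$ch")
                fi
                ;;
        esac
    done
    printf '%s\n' "$result"
}

resolve_subsys cdn -cdn+N   # prints N
resolve_subsys cdn +m       # prints cdnm
```

Upper- and lower-case letters are kept distinct because collectl uses lower case for summary data and upper case for detail data.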

Monitor CPU Usage:

To monitor a summary of CPU usage, use “-sc” with collectl:

# collectl -sc

To check each CPU individually, use “C”. It prints multiple lines per sample, one for each CPU.

# collectl -sC

To get both the summary and the per-CPU detail, use c and C together in a single command:

# collectl -scC

Monitor Memory

Use the m subsystem to check memory usage:

# collectl -sm

The M option gives further details about memory:

# collectl -sM

Check disk usage

The d and D options provide the summary and details on disk usage.
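Following the same pattern as the CPU and memory sections, a minimal sketch of the disk commands might look like this. The disk_report wrapper is a hypothetical helper, and the -c3 sample limit is added here only so that each command stops on its own after three samples.

```shell
# Hypothetical helper: print a few disk samples, summary first, then detail.
disk_report() {
    if ! command -v collectl >/dev/null 2>&1; then
        echo "collectl is not installed" >&2
        return 1
    fi
    collectl -sd -c3   # d: summary of disk activity, 3 samples
    collectl -sD -c3   # D: detail per disk, 3 samples
}
```

Run disk_report from a shell after sourcing the function. Without the wrapper, the two collectl lines can be typed directly at the prompt.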

Report multiple subsystems together

Say you want a report of CPU, memory and disk I/O together. In that case, combine the subsystem letters:

# collectl -scmd

Display time with the stats

To display the time on each line along with the measurements, use the T option. Options such as T are passed with the “-o” switch.

# collectl -scmd -oT

You could also display the time in milliseconds with “-oTm”.

Change sample count

Every row that collectl reports is a snapshot, or sample, and it takes these snapshots at regular intervals, say every 1 second. The i option sets the interval and the c option sets the sample count.

# collectl -c1 -sm

# collectl -sm -i5
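When the two options are combined, the run lasts roughly interval times count seconds. The sketch below only does that arithmetic and prints the matching command; run_seconds is a hypothetical helper, and it assumes whole-second intervals because shell arithmetic is integer-only.

```shell
# Rough wall-clock duration of a run: interval (seconds) x sample count.
# Integer-only; an approximation, not an exact figure reported by collectl.
run_seconds() {
    echo $(( $1 * $2 ))
}

interval=5
count=12
echo "collectl -sm -i$interval -c$count runs for about $(run_seconds "$interval" "$count") seconds"
# prints: collectl -sm -i5 -c12 runs for about 60 seconds
```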

More examples: to get top-like output,

# collectl --top

and to get vmstat-like output,

# collectl --vmstat