MetroCollect 2.3.4
MetroCollect gathers metrics from different sources:
- /proc/stat
- /proc/meminfo
- /proc/net/dev
Those metrics are organized hierarchically. For instance, /cfm/memory/cached represents the amount of cached memory on the system, while /cfm/network/eth0/rx/bytes represents the number of bytes received by eth0.
Most metrics have a unit attached and are expressed in units per second.
CPU metrics are read from the /proc/stat file. You may refer to the relevant man page on your system for a thorough description.
In summary, there are three types of CPU metrics:
Memory metrics are read from the /proc/meminfo file. You may refer to the relevant man page on your system for a thorough description.
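As a rough illustration of what reading this file involves, here is a minimal Python sketch that parses /proc/meminfo-style text into a dictionary. It is not MetroCollect's own code: the function name `parse_meminfo` and the sample text are made up for this example, and only the general `Name: value kB` line layout of the file is assumed.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are expressed in kB; a few (such as HugePages_Total)
    are plain counts and carry no unit.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts:
            values[name.strip()] = int(parts[0])
    return values


# Hypothetical sample; on a real system you would read /proc/meminfo instead.
SAMPLE = """\
MemTotal:       16384000 kB
MemFree:         1024000 kB
Cached:          4096000 kB
HugePages_Total:       0
"""

info = parse_meminfo(SAMPLE)
print(info["Cached"])
```

In this sketch, the `Cached` field would be the kind of value exposed under a name such as /cfm/memory/cached.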
Network metrics are read from the /proc/net/dev file. You may refer to the relevant man page on your system for a thorough description.
For each interface (either up or down), the number of bytes, packets and various errors are collected, for both reception ("rx") and transmission ("tx").
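The layout of /proc/net/dev can be sketched as follows. This is an illustrative Python parser, not MetroCollect's implementation: `parse_net_dev`, `RX_FIELDS`, `TX_FIELDS` and the sample text are names and data chosen for this example. It assumes the usual layout of two header lines followed by one line per interface with 8 receive and 8 transmit counters.

```python
# Column names as they appear in the /proc/net/dev header.
RX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "frame", "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "colls", "carrier", "compressed")


def parse_net_dev(text):
    """Return {interface: {"rx": {...}, "tx": {...}}} from /proc/net/dev text."""
    counters = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        iface, sep, rest = line.partition(":")
        if not sep:
            continue
        numbers = [int(n) for n in rest.split()]
        if len(numbers) < 16:
            continue  # unexpected layout, ignore the line
        counters[iface.strip()] = {
            "rx": dict(zip(RX_FIELDS, numbers[:8])),
            "tx": dict(zip(TX_FIELDS, numbers[8:16])),
        }
    return counters


# Hypothetical sample; on a real system you would read /proc/net/dev instead.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0:  500000     400    1    0    0     0          0         2   250000     300    0    0    0     0       0          0
"""

counters = parse_net_dev(SAMPLE)
print(counters["eth0"]["rx"]["bytes"])
```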
These are the same metrics as those provided by ethtool -S $iface for each network interface $iface. They provide very detailed information about each interface, but this information depends on the interface's driver. You may refer to the driver documentation for more details.
MetroCollect supports all drivers supported by ethtool, and it is optimized for ixgbe, igb and i40e.
When a number appears in a metric name, it is moved to the end of the name. For instance:
becomes
This program collects all available metrics and prints them on the standard output. For each metric, it shows the values read from the kernel at some time, the values read some time later, and the computed variation. That is because most kernel values are counters that can only increase, for example the number of bytes sent by eth0 or the number of interrupts serviced.
MetroCollectValues can take up to two arguments:
For example, ./MetroCollectValues 500 10 will read kernel values and print them every 500 milliseconds, and it will do so 10 times. The program will thus run for 5 seconds.
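The idea of a variation between two readings of a counter can be sketched in Python as below. This is only an illustration of the arithmetic, not the program's actual code: the helper names `variation` and `per_second` and the counter values are invented for the example.

```python
def variation(previous, current):
    """Difference between two readings of monotonically increasing counters."""
    return {name: current[name] - previous[name]
            for name in current if name in previous}


def per_second(delta, interval_ms):
    """Scale a variation measured over interval_ms milliseconds to one second."""
    return {name: value * 1000.0 / interval_ms
            for name, value in delta.items()}


# Two hypothetical readings taken 500 milliseconds apart.
earlier = {"/cfm/network/eth0/rx/bytes": 1_000_000}
later = {"/cfm/network/eth0/rx/bytes": 1_050_000}

delta = variation(earlier, later)
rates = per_second(delta, interval_ms=500)
print(delta, rates)
```

With these made-up readings, 50000 bytes were received during the 500 millisecond interval, which corresponds to 100000 bytes per second.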
This program collects all available metrics, computes statistics on a moving window and prints them on the standard output. For each metric, it shows the minimum, maximum, average and standard deviation of its variation.
MetroCollectStats can take up to four arguments:
By default, statistics are thus computed and printed every second.
For example, ./MetroCollectStats 50 5 0 4 will read kernel values every 50 milliseconds, compute and print stats every 5 samples (that is, every 250 milliseconds), and do so 4 times. The program will thus run for 1 second.
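The moving-window statistics can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the program's implementation: the function `window_stats` is hypothetical, the window is assumed to advance by (length - overlap) samples, and the population standard deviation is used because the source does not say which variant MetroCollect computes.

```python
import statistics


def window_stats(samples, length, overlap=0):
    """Compute min, max, average and standard deviation per moving window.

    Assumption: consecutive windows share `overlap` samples, so the window
    advances by (length - overlap) samples each time.
    """
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and less than the window length")
    step = length - overlap
    results = []
    for start in range(0, len(samples) - length + 1, step):
        window = samples[start:start + length]
        results.append({
            "min": min(window),
            "max": max(window),
            "average": statistics.fmean(window),
            "stddev": statistics.pstdev(window),  # population std (assumption)
        })
    return results


# Ten hypothetical variations of one metric, window of 5 samples, no overlap.
variations = [10, 20, 30, 40, 50, 5, 5, 5, 5, 5]
stats = window_stats(variations, length=5, overlap=0)
for entry in stats:
    print(entry)
```

With a sampling interval of 50 milliseconds, each of these windows would cover 5 x 50 = 250 milliseconds.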
This program collects all available metrics and writes them to a file.
MetroCollectFile can take up to three arguments:
./output.csv
For example, ./MetroCollectFile 0 20 maxspeed.csv will read kernel values continuously (no delay between readings) 20 times, and write the variation to the file ./maxspeed.csv.
First set up the Snap framework.
The default configuration for a MetroCollect Snap task is the following:
The parameters are (default values are given above):
- SendValues (type boolean): whether or not to send values to Snap
- SendStats (type boolean): whether or not to send statistics to Snap
- SamplingInterval (type int): delay in milliseconds between two readings of the kernel values
- ProcessingWindowLength (type int): moving window length (in number of samples), used to compute statistics
- ProcessingWindowOverlap (type int): moving window overlap (in number of samples), used to compute statistics; it must be less than the length of the window
- ConvertToUnitPerSecond (type boolean): whether or not to convert values to units per second where relevant (for example, if you prefer the number of bytes sent per second rather than the number of bytes sent per 100 milliseconds)
- UnchangedMetricTimeout (type int): by default, for performance reasons, constant metrics are downsampled and thus not sent to Snap every time. If the value changes, it is sent immediately. This parameter is the downsampling factor: how many null variations to ignore before sending the value to Snap again. If it is 0, no downsampling takes place; if it is -1, constant values are never sent
- MaxMetricsBuffer (type int): maximum number of metrics to send to Snap at once
- MaxCollectDuration (type int): maximum waiting time (in seconds) before sending metrics to Snap
By default, values are read from the kernel every 100 milliseconds, and statistics are computed every 10 samples (that is, every second) and sent to Snap. If a value has remained constant across 120 iterations (that is, for 2 minutes), zeros are sent to Snap.
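The behaviour described for UnchangedMetricTimeout can be sketched in Python as below. This is one reading of the description above, not the plugin's actual code: the class `UnchangedFilter` and its method are hypothetical, and the exact iteration on which a constant value is sent again is an assumption.

```python
class UnchangedFilter:
    """Decide whether a metric should be sent, downsampling constant metrics.

    Assumed semantics:
      timeout == 0 -> no downsampling, everything is sent
      timeout == -1 -> constant values (null variation) are never sent
      timeout == n -> n consecutive null variations are skipped, the next one is sent
    """

    def __init__(self, timeout):
        self.timeout = timeout
        self.skipped = {}  # metric name -> consecutive null variations skipped

    def should_send(self, name, variation):
        if variation != 0 or self.timeout == 0:
            self.skipped[name] = 0
            return True  # changed values are sent immediately
        if self.timeout < 0:
            return False
        count = self.skipped.get(name, 0) + 1
        if count > self.timeout:
            self.skipped[name] = 0
            return True  # enough null variations ignored, send again
        self.skipped[name] = count
        return False


# A metric that never changes, with a downsampling factor of 2.
flt = UnchangedFilter(timeout=2)
decisions = [flt.should_send("/cfm/memory/cached", 0) for _ in range(6)]
print(decisions)
```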
Collected metrics are described above.
For each metric, you can select which statistics you are interested in (shown below for the metric of aggregated CPU proportion of time spent in user mode; note the use of wildcards):
You can use wildcards to specify groups of metrics easily:
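To give an intuition of how a wildcard selects a group of metrics, here is a small Python sketch. It does not reproduce Snap's matching code or its task manifest syntax: the function `matches` is hypothetical, it assumes that `*` stands for exactly one element of the metric name, and the third metric name in the list is made up for the example.

```python
def matches(pattern, metric):
    """Return True if every element of `metric` equals the pattern element or the pattern element is '*'."""
    pattern_parts = pattern.strip("/").split("/")
    metric_parts = metric.strip("/").split("/")
    return len(pattern_parts) == len(metric_parts) and all(
        p == "*" or p == m for p, m in zip(pattern_parts, metric_parts)
    )


metrics = [
    "/cfm/memory/cached",
    "/cfm/network/eth0/rx/bytes",
    "/cfm/network/eth0/tx/bytes",  # illustrative name
]

# Select the received bytes of every interface.
selected = [m for m in metrics if matches("/cfm/network/*/rx/bytes", m)]
print(selected)
```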