AnyCollect  1.1.2
AnyCollect Usage

Please be careful about metric instantiation rules. They are detailed below in Metrics and substitution.

Metrics

AnyCollect can gather metrics from two different kinds of sources:

  • Reading files from the filesystem
  • Executing commands

Each of these sources has an array of expressions which are used to filter and match its contents. Each expression consists of a regex and an array of metric templates, which use the regex matches to form metrics.

Sources

Metrics are defined in a JSON file:

{
  "Files": [
    {
      "Paths": [
        ""
      ],
      "Expressions": [...]
    }
  ],
  "Commands": [
    {
      "Program": "",
      "Arguments": [
        ""
      ],
      "Expressions": [...]
    }
  ]
}

Metrics from reading file contents are specified in the top-level Files array. Each entry requires two fields:

  • Paths, an array of strings representing the paths of files to read
  • Expressions, an array of expressions (see below) used to match file contents

Paths support standard POSIX globbing (with wildcards, etc.). You can specify multiple paths to be matched against the same expressions.
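
To give a rough idea of how one glob pattern in Paths can expand to several files, the sketch below uses Python's standard glob module, whose wildcard rules are close to POSIX globbing. The directory layout, the file names and the expand_paths helper are made up for the illustration; AnyCollect performs the expansion itself and may differ in details.

```python
import glob
import os
import tempfile

def expand_paths(patterns):
    """Expand each glob pattern and return the sorted, de-duplicated matches."""
    matches = set()
    for pattern in patterns:
        matches.update(glob.glob(pattern))
    return sorted(matches)

# Build a throwaway directory tree so the example does not depend on the host.
root = tempfile.mkdtemp()
for name in ("eth0", "eth1", "lo"):
    os.makedirs(os.path.join(root, "net", name))
    with open(os.path.join(root, "net", name, "rx_bytes"), "w") as handle:
        handle.write("0\n")

# One pattern selects the two eth* files and leaves "lo" out.
found = expand_paths([os.path.join(root, "net", "eth*", "rx_bytes")])
relative = [os.path.relpath(path, root) for path in found]
print(relative)
```

Every file selected this way would then be read and matched against the same Expressions array.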

Metrics from command output are specified in the top-level Commands array. Each entry requires three fields:

  • Program, the main program to execute
  • Arguments, an array of additional arguments to give to the program
  • Expressions, an array of expressions (see below) used to match command output

Prior to execution, program and arguments are concatenated with spaces in between.

Expressions

An expression is defined by two fields:

  • Regex, a regular expression string (backslashes \ and quotes " should be escaped)
  • Metrics, an array of metric templates
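
The escaping rule for Regex is easy to get wrong by hand. As a small illustration, the following Python sketch lets the standard json module do the escaping: every backslash is doubled and every quote is prefixed with a backslash. The second pattern is hypothetical and only there to show a quote being escaped; the empty Metrics array is a placeholder.

```python
import json

# Patterns written with single backslashes, as a regex engine expects them.
patterns = [
    r"time=([\d.]+) (\w+)$",   # taken from the ping example in this document
    r'state="(\w+)"',          # hypothetical pattern containing quotes
]

for pattern in patterns:
    expression = {"Regex": pattern, "Metrics": []}
    encoded = json.dumps(expression)
    # Decoding the JSON text gives back the original pattern unchanged.
    assert json.loads(encoded)["Regex"] == pattern
    print(encoded)
```

The printed strings are what would go into the configuration file.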

A metric template is in turn defined in JSON by six fields:

  • Name, an array of strings representing the name of the metric
  • Value, a string representing the value of the metric
  • Unit, a string representing the value's unit
  • Tags, a map associating strings to strings adding metadata to a metric
  • ComputeRate, a boolean indicating whether the variation of the value, rather than the value, should be collected
  • ConvertToUnitsPerSecond, a boolean indicating whether the value should be converted to units per second
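
One plausible reading of the last two fields, sketched in Python: ComputeRate reports the difference between two consecutive readings, and ConvertToUnitsPerSecond divides that difference by the sampling interval. This is an assumption drawn from the field descriptions, not a statement of how AnyCollect implements it; the to_rate helper and the second reading are made up.

```python
def to_rate(previous, current, interval_seconds, per_second):
    """Return the variation between two readings, optionally per second."""
    delta = current - previous
    if per_second:
        # Assumption: "units per second" means dividing by the sampling interval.
        return delta / interval_seconds
    return delta

# Two made-up readings of the same counter, taken 60 seconds apart.
previous, current = 14574109, 14574709
print(to_rate(previous, current, 60, per_second=False))  # 600
print(to_rate(previous, current, 60, per_second=True))   # 10.0
```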

The regex is applied to each line of content. If a match is found, each metric template is filled in using variable substitution.

One expression may have more than one metric template to facilitate parsing: if a line contains multiple metrics, the whole line can be matched by the regex and each metric template will extract one metric from the regex match.
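
The matching process described above can be sketched in a few lines of Python. This is an illustration under assumptions, not AnyCollect's actual code: the substitute and collect helpers are hypothetical, the regex is searched anywhere in the line unless it is anchored, an unknown index expands to an empty string, and the second input line is made up to show a non-matching line being skipped.

```python
import re

def substitute(text, match):
    """Replace $0, $1, $2, ... with the regex submatch of the same index."""
    def lookup(token):
        index = int(token.group(1))
        if index > match.re.groups:
            return ""  # assumption: an unknown index expands to nothing
        return match.group(index) or ""
    return re.sub(r"\$(\d+)", lookup, text)

def collect(lines, expressions):
    """Apply every expression to every line and fill its metric templates."""
    metrics = []
    for line in lines:
        for expression in expressions:
            match = re.search(expression["Regex"], line)
            if match is None:
                continue  # the line does not match: no metric is produced
            for template in expression["Metrics"]:
                metrics.append({
                    "Name": [substitute(part, match) for part in template["Name"]],
                    "Value": substitute(template["Value"], match),
                    "Unit": substitute(template["Unit"], match),
                    "Tags": {key: substitute(value, match)
                             for key, value in template["Tags"].items()},
                })
    return metrics

# One regex, three templates: each template extracts one field of the line.
expressions = [{
    "Regex": r"^cpu (\d+) \d+ (\d+) (\d+)",
    "Metrics": [
        {"Name": ["cpu", "user"], "Value": "$1", "Unit": "jiffies", "Tags": {}},
        {"Name": ["cpu", "system"], "Value": "$2", "Unit": "jiffies", "Tags": {}},
        {"Name": ["cpu", "idle"], "Value": "$3", "Unit": "jiffies", "Tags": {}},
    ],
}]
lines = [
    "cpu 14574109 23322 24156706 688820875 542455 0 1102485 0 0",
    "intr 1234 5 6",  # made-up line that the regex does not match
]
metrics = collect(lines, expressions)
for metric in metrics:
    print(metric["Name"], metric["Value"], metric["Unit"])
```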

Metrics and substitution

String fields of a metric template (Name, Value, Unit and Tags) are subject to variable substitution:

  • $1, $2, $3, etc. will be replaced by the regex submatch with the same index. $0 is replaced by the whole match
  • for file content sources, $path_0, $path_1, $path_2, etc. will be replaced by the path component at the same index

Important notes after substitution:

  • A metric is identified by its Name and Tags fields: two metrics with the same Name and Tags are considered to represent the same thing, regardless of their Unit. This equivalence is used to compute rates from one iteration to the next. If two metrics are found to be equivalent during the same iteration, their values are added.
  • The Value field may be a simple mathematical expression. Integer and floating-point numbers are supported, as well as the operators +, -, *, /, % (modulo), ^ (power) and parentheses; some functions are also supported (sqrt, exp, log, cos, sin, tan, ...). If the expression is not valid, or cannot be converted to a number, the metric is dropped.
  • If any of the Name, Value, or Tags fields is empty, the metric is considered deficient and is dropped. The Unit field may be empty.
  • The Name field should only contain lower-case alphanumeric characters and underscores _. Upper-case letters will be converted to lower-case, and symbols will be replaced by underscores.
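
The notes above can be made concrete with a short Python sketch. It is a simplified model under stated assumptions, not the program's real evaluator: evaluate_value, normalize_name and merge are hypothetical helpers, only the six functions named above are supported, only the invalid-value and empty-name cases of a deficient metric are modelled, and the sample metrics are made up.

```python
import ast
import math
import operator
import re

_OPERATORS = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.Div: operator.truediv, ast.Mod: operator.mod, ast.Pow: operator.pow,
    ast.USub: operator.neg, ast.UAdd: operator.pos,
}
_FUNCTIONS = {name: getattr(math, name)
              for name in ("sqrt", "exp", "log", "cos", "sin", "tan")}

def _evaluate(node):
    """Evaluate a parsed arithmetic expression, rejecting anything else."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
        return _OPERATORS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPERATORS:
        return _OPERATORS[type(node.op)](_evaluate(node.operand))
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id in _FUNCTIONS and not node.keywords):
        return _FUNCTIONS[node.func.id](*[_evaluate(arg) for arg in node.args])
    raise ValueError("unsupported expression")

def evaluate_value(text):
    """Return the numeric value of text, or None if the metric should be dropped."""
    try:
        # "^" means power in the Value field, so map it to Python's "**".
        tree = ast.parse(text.strip().replace("^", "**"), mode="eval")
        return float(_evaluate(tree.body))
    except (SyntaxError, ValueError, TypeError, ZeroDivisionError, OverflowError):
        return None

def normalize_name(parts):
    """Lower-case each name part and replace other symbols with underscores."""
    return [re.sub(r"[^a-z0-9_]", "_", part.lower()) for part in parts]

def merge(metrics):
    """Add up the values of metrics that share the same Name and Tags."""
    totals = {}
    for metric in metrics:
        value = evaluate_value(metric["Value"])
        name = normalize_name(metric["Name"])
        if value is None or not name:
            continue  # invalid value or empty name: the metric is dropped
        key = (tuple(name), tuple(sorted(metric["Tags"].items())))
        totals[key] = totals.get(key, 0.0) + value
    return totals

metrics = [
    {"Name": ["net", "rx"], "Value": "100", "Tags": {"iface": "eth0"}},
    {"Name": ["net", "rx"], "Value": "2^3", "Tags": {"iface": "eth0"}},
    {"Name": ["net", "rx"], "Value": "n/a", "Tags": {"iface": "eth1"}},
]
totals = merge(metrics)
print(totals)
```

The first two metrics share Name and Tags, so their values are summed; the third has a Value that is not a number and is dropped.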

Examples

Please refer to the example config files in the example directory of this repo.

Basic file matching

Memory stats in /proc/meminfo

MemTotal: 949448 kB

Regex: ^(\w+):\s+(\d+) (\w)B$

Matches:

Index  Submatch
0      MemTotal: 949448 kB
1      MemTotal
2      949448
3      k

Metrics:

Template:

  • Name: ["memory", "$1"]
  • Value: "$2"
  • Unit: "$3B"
  • Tags: {}

Metric:

  • Name: ["memory", "MemTotal"]
  • Value: 949448
  • Unit: "kB"
  • Tags: {}

Multiple metrics on one line

CPU stats in /proc/stat

cpu 14574109 23322 24156706 688820875 542455 0 1102485 0 0

Regex: ^cpu (\d+) \d+ (\d+) (\d+)

Matches:

Index  Submatch
0      cpu 14574109 23322 24156706 688820875
1      14574109
2      24156706
3      688820875

Metrics:

Template:

  • Name: ["cpu", "user"]
  • Value: "$1"
  • Unit: "jiffies"
  • Tags: {}

Metric:

  • Name: ["cpu", "user"]
  • Value: 14574109
  • Unit: "jiffies"
  • Tags: {}

Template:

  • Name: ["cpu", "system"]
  • Value: "$2"
  • Unit: "jiffies"
  • Tags: {}

Metric:

  • Name: ["cpu", "system"]
  • Value: 24156706
  • Unit: "jiffies"
  • Tags: {}

Template:

  • Name: ["cpu", "idle"]
  • Value: "$3"
  • Unit: "jiffies"
  • Tags: {}

Metric:

  • Name: ["cpu", "idle"]
  • Value: 688820875
  • Unit: "jiffies"
  • Tags: {}

Command output matching

Get network latency with ping

$ ping -c 1 -W 1 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=61 time=2.74 ms
...

Regex: bytes from ([\d.]+): .* time=([\d.]+) (\w+)$

Matches:

Index  Submatch
0      bytes from 8.8.8.8: icmp_seq=1 ttl=61 time=2.74 ms
1      8.8.8.8
2      2.74
3      ms

Metrics:

Template:

  • Name: ["ping"]
  • Value: "$2"
  • Unit: "$3"
  • Tags: { "host": "$1" }

Metric:

  • Name: ["ping"]
  • Value: 2.74
  • Unit: "ms"
  • Tags: { "host": "8.8.8.8" }
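
For reference, here is a hedged sketch of what a complete configuration for this ping example might look like, built in Python so that the json module handles the regex escaping. The field names come from the sections above; the empty Files array and the false values for ComputeRate and ConvertToUnitsPerSecond are assumptions chosen for the illustration. The second half checks the regex against the sample output without actually running ping.

```python
import json
import re

regex = r"bytes from ([\d.]+): .* time=([\d.]+) (\w+)$"

config = {
    "Files": [],  # assumption: an empty array is accepted when no file is read
    "Commands": [{
        "Program": "ping",
        "Arguments": ["-c", "1", "-W", "1", "8.8.8.8"],
        "Expressions": [{
            "Regex": regex,
            "Metrics": [{
                "Name": ["ping"],
                "Value": "$2",
                "Unit": "$3",
                "Tags": {"host": "$1"},
                # Assumption: a latency is reported as-is, not as a rate.
                "ComputeRate": False,
                "ConvertToUnitsPerSecond": False,
            }],
        }],
    }],
}
print(json.dumps(config, indent=2))

# Check the regex against the sample output shown above.
output = [
    "PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.",
    "64 bytes from 8.8.8.8: icmp_seq=1 ttl=61 time=2.74 ms",
]
matches = [m.groups() for m in (re.search(regex, line) for line in output) if m]
print(matches)
```

Only the reply line matches; the PING header line is ignored because it does not contain "bytes from".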

AnyCollectValues Standalone interface

This program collects all available metrics and prints them on the standard output. For each metric, it shows its name, its value and its unit.

AnyCollectValues can take up to three arguments:

  • the sampling interval in seconds, that is, the time to wait between two readings of kernel values. Default is 1 second
  • the path to the JSON configuration file
  • how many times to report metrics. Default is 0, which means no limit

For example, ./AnyCollectValues 60 ./config.json 10 will read the config file ./config.json, and then collect metrics and print them every 60 seconds; it will do so 10 times. The program will thus run for 10 minutes.

Snap Configuration

Global configuration

In order for the AnyCollect plugin to be aware of its configuration before Snap launches a task (otherwise metrics won't be registered), the configuration file must be specified in snapteld global config:

---
control:
  plugins:
    collector:
      anycollect:
        all:
          ConfigFile: /etc/snap/anycollect.json

However, if you choose to send all metrics described by a configuration file, this global configuration is not needed. You can tell the plugin to send all metrics in the task file, either by setting SendAllMetrics to true or by requesting the metric /cfm/anycollect/*.

Configuration

First set up the Snap framework.

The default configuration for an AnyCollect Snap task is the following:

---
version: 1
schedule:
  type: "streaming"
max-failures: 10
workflow:
  collect:
    metrics:
      /cfm/anycollect/*: {}
    config:
      /cfm:
        ConfigFile: "/path/to/anycollect/config.json"
        SamplingInterval: 1
        SendAllMetrics: false
        # MaxMetricsBuffer: 0
        # MaxCollectDuration: 0
    publish:
      ...

Parameters

The parameters are (default values are given above):

  • ConfigFile (type string): path to AnyCollect's JSON configuration file
  • SamplingInterval (type int): delay in seconds between two readings of the kernel values
  • SendAllMetrics (type boolean): whether to send all metrics to Snap, ignoring the metrics requested in the task. This is a workaround: if the config file is modified and the Snap daemon is not restarted, Snap doesn't update its metric list and new metrics won't be sent
  • MaxMetricsBuffer (type int): maximum number of metrics to send to Snap at once
  • MaxCollectDuration (type int): maximum waiting time (in seconds) before sending metrics to Snap

Metrics

Collected metrics are described above.

Each metric template described in the configuration file will, after regex substitution, be transformed into a Snap metric. The metric name is converted into a metric namespace which you can filter in the Snap task file.

You can use wildcards to specify groups of metrics easily:

/cfm/anycollect/*: {}