Commit ef9ca05f authored by Nikolay Stanchev

Updates alerts documentation

parent 17ba3cb4
......@@ -114,6 +114,7 @@ topology_template:
condition:
threshold: 100 # requests have increased by at least 100
granularity: 120
+ aggregation_method: mean
resource_type:
flame_sfp: storage
flame_sf: storage-users
......@@ -132,6 +133,7 @@ topology_template:
condition:
threshold: -100 # requests have decreased by at least 100
granularity: 120
+ aggregation_method: mean
resource_type:
flame_sfp: storage
flame_sf: storage-users
......@@ -224,7 +226,7 @@ the format is still the same for consistency. Therefore, using `<measurement>.*`
* **threshold** -
* for **threshold** event type, this is the critical value the queried metric is compared to.
- * for **relative** event type, this is the critical value the difference (between the current metric value and the past metric value) is compared to.
+ * for **relative** event type, this is the critical value the difference (between the current aggregated metric value and the past aggregated metric value) is compared to.
* for **deadman** event type, this is the critical value the number of measurement points (received in InfluxDB) is compared to.
* **granularity** - period in seconds
......@@ -233,7 +235,7 @@ the format is still the same for consistency. Therefore, using `<measurement>.*`
* for **deadman** event type, this value specifies how long the span in time (in which the number of measurement points are checked) is
* **aggregation_method** - the function to use when querying InfluxDB, e.g. median, mean, etc. This value is only used when
- the event_type is set to **threshold**.
+ the event_type is set to **threshold** or **relative**.
* **resource_type** - provides context for the given event - key-value pairs for the global tags of the CLMC Information Model.
This includes any of the following: `"flame_sfp", "flame_sf", "flame_sfe", "flame_server", "flame_location"`.
......@@ -294,7 +296,7 @@ result of the comparison operation is true, an alert is triggered. For example:
"neq" : "not equal"
```
- * **relative** - A relative event type is an alert in which Kapacitor computes the difference between the current value of a metric and the value
+ * **relative** - A relative event type is an alert in which Kapacitor computes the difference between the current aggregated value of a metric and the aggregated value
reported a given period of time ago. The difference between the current and the past value is then compared against a given
threshold. If the result of the comparison operation is true, an alert is triggered. For example:
......@@ -308,6 +310,7 @@ threshold. If the result of the comparison operation is true, an alert is trigge
condition:
threshold: -100
granularity: 120
+ aggregation_method: mean
resource_type:
flame_sfp: storage
flame_sf: storage-users
......@@ -318,8 +321,8 @@ threshold. If the result of the comparison operation is true, an alert is trigge
- flame_sfemc
```
- This trigger specification will create an alert task in Kapacitor, which compares every **requests** value reported in
- measurement **storage** with the value received **120** seconds ago. If the difference between the current and the past
+ This trigger specification will create an alert task in Kapacitor, which compares the mean **requests** value reported in measurement **storage**
+ with the mean value received **120** seconds ago. If the difference between the current and the past
value is less than or equal to (comparison operator is **lte**) **-100**, an alert is triggered. Simply explained, an alert
is triggered if the **requests** current value has decreased by at least 100 relative to the value reported 120 seconds ago.
The queried value is contextualised for service function **storage-users** (using service function package **storage**)
......@@ -329,6 +332,7 @@ threshold. If the result of the comparison operation is true, an alert is trigge
* **aggregation_method** is not required here - the alert task compares the actual value that's being reported (stream mode)
* if **aggregation_method** is provided, it will be ignored
+ * if X is the current timestamp, the current aggregated value refers to the period {X - granularity; X} while the past aggregated value refers to the period {X - 2*granularity; X - granularity}
* **deadman** - A deadman event type is an alert in which Kapacitor computes the number of reported points in a measurement
for a given period of time. This number is then compared to a given threshold value. If less number of points have been
......@@ -352,7 +356,7 @@ For example:
This trigger specification will create an alert task in Kapacitor, which monitors the number of points reported in
measurement **storage** and having tag **sfp** set as **storage**. This value is computed every 60 seconds.
- If the number of reported points is less than **0** (no points have been reported for the last 60 seconds), an alert
+ If the number of reported points is less than or equal to **0** (no points have been reported for the last 60 seconds), an alert
will be triggered. Triggered alerts will be sent through an HTTP POST message to the URLs listed in the **implementation** section.
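
The deadman check described above can be sketched in a few lines of Python. This is only an illustration of the counting logic, not the Kapacitor task itself: the function name `deadman_alert`, its parameters and the half-open window `(now - granularity, now]` are assumptions made for the example.

```python
def deadman_alert(timestamps, now, granularity, threshold=0):
    """Return True if the number of points in the last `granularity` seconds
    is less than or equal to `threshold` (hypothetical helper)."""
    # Assumption: the window is (now - granularity, now].
    window_start = now - granularity
    count = sum(1 for t in timestamps if window_start < t <= now)
    return count <= threshold


# No point was reported in the last 60 seconds, so the alert fires.
silent = deadman_alert([10, 20], now=100, granularity=60)
# One point (t=70) falls inside the window, so the alert does not fire.
active = deadman_alert([10, 70], now=100, granularity=60)
```

Using "less than or equal to" matters here: a count can never be below 0, so a strict "less than 0" comparison would never trigger.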
*Notes*:
......
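
As a reading aid for the **relative** event type changes in this commit, the following Python sketch mimics the comparison of two consecutive aggregated windows. It is not the generated Kapacitor task: the names `window_mean` and `relative_alert` are hypothetical, the aggregation is fixed to the mean, and the window boundaries are assumed to be half-open, since the documentation only gives the periods {X - 2*granularity; X - granularity} and {X - granularity; X}.

```python
import operator
from statistics import mean


def window_mean(points, start, end):
    """Mean of the values whose timestamp t satisfies start < t <= end.
    Returns None when the window holds no points."""
    values = [value for t, value in points if start < t <= end]
    return mean(values) if values else None


def relative_alert(points, now, granularity, threshold, comparison=operator.le):
    """Compare (current window mean - past window mean) against `threshold`.
    `points` is a list of (timestamp, value) pairs; `comparison` defaults to
    "lte", as in the example trigger."""
    current = window_mean(points, now - granularity, now)
    past = window_mean(points, now - 2 * granularity, now - granularity)
    if current is None or past is None:
        # Assumption: no alert when either window is empty.
        return False
    return comparison(current - past, threshold)


# Past window (0, 120] has mean 510; current window (120, 240] has mean 405.
points = [(10, 500), (100, 520), (130, 400), (230, 410)]
dropped = relative_alert(points, now=240, granularity=120, threshold=-100)
```

Here the difference is 405 - 510 = -105, which is less than or equal to -100, so `dropped` is `True`.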