threshold: 100  # requests have increased by at least 100
threshold: 100  # requests have increased by at least 100
granularity: 120
granularity: 120
aggregation_method: mean
resource_type:
resource_type:
flame_sfp: storage
flame_sfp: storage
flame_sf: storage-users
flame_sf: storage-users
...
@@ -132,6 +133,7 @@ topology_template:
condition:
condition:
threshold: -100  # requests have decreased by at least 100
threshold: -100  # requests have decreased by at least 100
granularity: 120
granularity: 120
aggregation_method: mean
resource_type:
resource_type:
flame_sfp: storage
flame_sfp: storage
flame_sf: storage-users
flame_sf: storage-users
...
@@ -224,7 +226,7 @@ the format is still the same for consistency. Therefore, using `<measurement>.*`
* **threshold** -
* **threshold** -
* for **threshold** event type, this is the critical value the queried metric is compared to.
* for **threshold** event type, this is the critical value the queried metric is compared to.
* for **relative** event type, this is the critical value the difference (between the current metric value and the past metric value) is compared to.
* for **relative** event type, this is the critical value the difference (between the current aggregated metric value and the past aggregated metric value) is compared to.
* for **deadman** event type, this is the critical value the number of measurement points (received in InfluxDB) is compared to.
* for **deadman** event type, this is the critical value the number of measurement points (received in InfluxDB) is compared to.
* **granularity** - period in seconds
* **granularity** - period in seconds
...
@@ -233,7 +235,7 @@ the format is still the same for consistency. Therefore, using `<measurement>.*`
* for **deadman** event type, this value specifies how long the span in time (in which the number of measurement points are checked) is
* for **deadman** event type, this value specifies how long the span in time (in which the number of measurement points are checked) is
* **aggregation_method** - the function to use when querying InfluxDB, e.g. median, mean, etc. This value is only used when
* **aggregation_method** - the function to use when querying InfluxDB, e.g. median, mean, etc. This value is only used when
the event_type is set to **threshold**.
the event_type is set to **threshold** or **relative**.
* **resource_type** - provides context for the given event - key-value pairs for the global tags of the CLMC Information Model.
* **resource_type** - provides context for the given event - key-value pairs for the global tags of the CLMC Information Model.
This includes any of the following: `"flame_sfp", "flame_sf", "flame_sfe", "flame_server", "flame_location"`.
This includes any of the following: `"flame_sfp", "flame_sf", "flame_sfe", "flame_server", "flame_location"`.
...
@@ -294,7 +296,7 @@ result of the comparison operation is true, an alert is triggered. For example:
"neq" : "not equal"
"neq" : "not equal"
```
```
* **relative** - A relative event type is an alert in which Kapacitor computes the difference between the current value of a metric and the value
* **relative** - A relative event type is an alert in which Kapacitor computes the difference between the current aggregated value of a metric and the aggregated value
reported a given period of time ago. The difference between the current and the past value is then compared against a given
reported a given period of time ago. The difference between the current and the past value is then compared against a given
threshold. If the result of the comparison operation is true, an alert is triggered. For example:
threshold. If the result of the comparison operation is true, an alert is triggered. For example:
...
@@ -308,6 +310,7 @@ threshold. If the result of the comparison operation is true, an alert is trigge
condition:
condition:
threshold: -100
threshold: -100
granularity: 120
granularity: 120
aggregation_method: mean
resource_type:
resource_type:
flame_sfp: storage
flame_sfp: storage
flame_sf: storage-users
flame_sf: storage-users
...
@@ -318,8 +321,8 @@ threshold. If the result of the comparison operation is true, an alert is trigge
- flame_sfemc
- flame_sfemc
```
```
This trigger specification will create an alert task in Kapacitor, which compares every **requests** value reported in
This trigger specification will create an alert task in Kapacitor, which compares the mean **requests** value reported in measurement **storage**
measurement **storage** with the value received **120** seconds ago. If the difference between the current and the past
with the mean value received **120** seconds ago. If the difference between the current and the past
value is less than or equal to (comparison operator is **lte**) **-100**, an alert is triggered. Simply explained, an alert
value is less than or equal to (comparison operator is **lte**) **-100**, an alert is triggered. Simply explained, an alert
is triggered if the **requests** current value has decreased by at least 100 relative to the value reported 120 seconds ago.
is triggered if the **requests** current value has decreased by at least 100 relative to the value reported 120 seconds ago.
The queried value is contextualised for service function **storage-users** (using service function package **storage**)
The queried value is contextualised for service function **storage-users** (using service function package **storage**)
...
@@ -329,6 +332,7 @@ threshold. If the result of the comparison operation is true, an alert is trigge
* **aggregation_method** is not required here - the alert task compares the actual value that's being reported (stream mode)
* **aggregation_method** is not required here - the alert task compares the actual value that's being reported (stream mode)
* if **aggregation_method** is provided, it will be ignored
* if **aggregation_method** is provided, it will be ignored
* if X is the current timestamp, the current aggregated value refers to the period {X - granularity; X} while the past aggregated value refers to the period {X - 2*granularity; X - granularity}
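
The two windows described above can be illustrated with a minimal Python sketch. This is not how Kapacitor implements the task; it only mirrors the arithmetic. The function name `relative_alert`, the `(timestamp, value)` point format, the half-open window boundaries, and the handling of empty windows are all assumptions made for illustration. Only the `lte` and `neq` operator names appear in this document; the remaining keys in the table are assumed.

```python
import operator
from statistics import mean

# Comparison operator names. "lte" and "neq" appear in the documentation;
# the other keys are assumed by analogy.
COMPARATORS = {
    "lt": operator.lt,
    "lte": operator.le,
    "gt": operator.gt,
    "gte": operator.ge,
    "eq": operator.eq,
    "neq": operator.ne,
}


def relative_alert(points, now, granularity, threshold, comparison="lte"):
    """Return True if the relative condition holds (hypothetical helper).

    points: iterable of (timestamp_in_seconds, value) pairs.
    Current window: (now - granularity, now]
    Past window:    (now - 2*granularity, now - granularity]
    The boundary inclusivity is an assumption.
    """
    current = [v for t, v in points if now - granularity < t <= now]
    past = [v for t, v in points
            if now - 2 * granularity < t <= now - granularity]
    if not current or not past:
        # Assumption: without data in both windows no alert is raised.
        return False
    difference = mean(current) - mean(past)
    return COMPARATORS[comparison](difference, threshold)


# Example mirroring the trigger above: threshold -100, granularity 120, "lte".
requests_points = [(10, 500), (70, 520), (130, 400), (190, 380)]
alert = relative_alert(requests_points, now=240, granularity=120,
                       threshold=-100, comparison="lte")
```

In this example the past window has a mean of 510 and the current window a mean of 390, so the difference is -120 and the `lte` comparison against -100 holds.
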
* **deadman** - A deadman event type is an alert in which Kapacitor computes the number of reported points in a measurement
* **deadman** - A deadman event type is an alert in which Kapacitor computes the number of reported points in a measurement
for a given period of time. This number is then compared to a given threshold value. If fewer points have been
for a given period of time. This number is then compared to a given threshold value. If fewer points have been
...
@@ -352,7 +356,7 @@ For example:
This trigger specification will create an alert task in Kapacitor, which monitors the number of points reported in
This trigger specification will create an alert task in Kapacitor, which monitors the number of points reported in
measurement **storage** and having tag **sfp** set as **storage**. This value is computed every 60 seconds.
measurement **storage** and having tag **sfp** set as **storage**. This value is computed every 60 seconds.
If the number of reported points is less than **0** (no points have been reported for the last 60 seconds), an alert
If the number of reported points is less than or equal to **0** (no points have been reported for the last 60 seconds), an alert
will be triggered. Triggered alerts will be sent through an HTTP POST message to the URLs listed in the **implementation** section.
will be triggered. Triggered alerts will be sent through an HTTP POST message to the URLs listed in the **implementation** section.
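
The deadman check described above can be sketched in a few lines of Python. This is an illustration of the counting logic only, not Kapacitor's implementation; the function name `deadman_alert`, the `(timestamp, tags)` point format, and the window boundaries are assumptions.

```python
def deadman_alert(points, now, period, threshold=0, tags=None):
    """Return True if too few points arrived in the last `period` seconds.

    points: iterable of (timestamp_in_seconds, tags_dict) pairs (assumed format).
    tags:   tag key-value pairs a point must carry to be counted.
    The alert fires when the count is less than or equal to `threshold`.
    """
    tags = tags or {}
    count = sum(
        1
        for t, point_tags in points
        if now - period < t <= now
        and all(point_tags.get(k) == v for k, v in tags.items())
    )
    return count <= threshold


# Example mirroring the trigger above: 60-second period, threshold 0,
# only points tagged sfp=storage are counted.
storage_points = [(5, {"sfp": "storage"}), (100, {"sfp": "other"})]
silent = deadman_alert(storage_points, now=120, period=60,
                       threshold=0, tags={"sfp": "storage"})
```

Here the only point inside the last 60 seconds carries a different `sfp` tag, so the count is 0 and the condition holds.
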