diff --git a/docs/NotificationAPI-proposal.md b/docs/NotificationAPI-proposal.md new file mode 100644 index 0000000000000000000000000000000000000000..c3c908dc31f66bf0a076059103f93cdb18adeab9 --- /dev/null +++ b/docs/NotificationAPI-proposal.md @@ -0,0 +1,144 @@ +<!-- +// © University of Southampton IT Innovation Centre, 2018 +// +// Copyright in this software belongs to University of Southampton +// IT Innovation Centre of Gamma House, Enterprise Road, +// Chilworth Science Park, Southampton, SO16 7NS, UK. +// +// This software may not be used, sold, licensed, transferred, copied +// or reproduced in whole or in part in any manner or form or in or +// on any media by any person other than in accordance with the terms +// of the Licence Agreement supplied with the software, or otherwise +// without the prior written consent of the copyright owners. +// +// This software is distributed WITHOUT ANY WARRANTY, without even the +// implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR +// PURPOSE, except where stated in the Licence Agreement supplied with +// the software. +// +// Created By : Nikolay Stanchev +// Created Date : 09-08-2018 +// Created for Project : FLAME +--> + +# **FLAME - Integration of alerts, topics and handlers** + +#### **Authors** + +|Authors|Organisation| +|:---:|:---:| +|[Nikolay Stanchev](mailto:ns17@it-innovation.soton.ac.uk)|[University of Southampton, IT Innovation Centre](http://www.it-innovation.soton.ac.uk)| + + +#### Description + +This document outlines an internal proposal for the implementation of a CLMC Notification API - in relation to Kapacitor's alerts, +topics and handlers. + +#### Terminology + +1) Alert (a.k.a Task) - some work for Kapacitor to do periodically over time. Essentially, this is a query for Kapacitor to +execute and check the result of. If the result matches a given condition, an alert has to be fired. 
+ +2) Handler (a.k.a. Event Handler or Alert Handler) - a piece of software that is responsible for handling triggered alerts. Currently, +we are only considering the HTTP POST handler - an HTTP server (or simply a socket) that listens on a given URL for POST +messages. + +3) Topic (a.k.a. Event Name) - a namespace to which an alert publishes data and to which a handler subscribes for alert data. Topics are used +to decouple Alerts from Handlers and are created on demand - an alert that publishes to a non-existing topic will cause Kapacitor +to automatically create the topic, and likewise a handler that subscribes to a non-existing topic will cause Kapacitor to automatically +create the topic. + +#### Proposal + +After doing some extensive analysis on Kapacitor, I suggest that for managing **alerts**, we use task templates with +placeholders for MSP-specific values. This is feasible because, as we found out from the CLMC information model analysis, +everything apart from the ***ipendpoint*** identifier is already described in TOSCA by an MSP. 
Here is an example of what a simple +task template might look like: + +```tickscript +// Alert template ID - threshold_exceeded + +var db string + +var rp = 'autogen' // default value for the retention policy + +var measurement string + +var field string + +var whereCondition = 'TRUE' // default value is TRUE, hence no filtering of the query result + +var messageValue = 'TRUE' // default value is TRUE, as this is what SFEMC expects as a notification for an event rule + +var criticalValue float + +var alertPeriod = 60s // this value is read from TOSCA and is measured in seconds, default value is 60 seconds + +var topicID string + +batch + |query('SELECT mean(' + field + ') AS mean_value FROM "' + db + '"."' + rp + '"."' + measurement + '" WHERE ' + whereCondition) + .period(alertPeriod) + .every(alertPeriod) + |alert() + .crit(lambda: "mean_value" >= criticalValue) + .message(messageValue) + .topic(topicID) +``` + +And here is an example of what a configuration for the template above might look like: + +```json +{ + "db": {"type": "string", "value": "CLMCMetrics"}, + "rp": {"type": "string", "value": "autogen"}, + "measurement": {"type": "string", "value": "storage_sf_measurement"}, + "field": {"type": "string", "value": "service_delay"}, + "criticalValue": {"type": "float", "value": 10.0}, + "topicID": {"type": "string", "value": "storage_sf_delay_exceeded"}, + "whereCondition": {"type": "string", "value": "sf_package='storage' and sf='storage-users'"} +} +``` + +Alert configurations are received and managed by the CLMC service, with Kapacitor running in the background. Therefore, the CLMC +service must provide an API endpoint for receiving this data, e.g. /alerts/configuration. As a starting point, we might want to +provide templates for the three types that are offered when building alerts in Chronograf - *threshold*, *relative* and *deadman*. +These would then be adjusted/extended based on common use cases from the experiments. 
An example request to configure alerts might +look like this: + +**HTTP POST Request** +**Request URL** - ***http://clmc.flame.eu/alerts/configuration*** +**Request Body** - must contain a list of configuration objects (similar to the one defined above) along with the type of +template to use - +```json +[ + { + "type": "threshold_exceeded", + "configuration": { + "db": {"type": "string", "value": "CLMCMetrics"}, + "rp": {"type": "string", "value": "autogen"}, + "measurement": {"type": "string", "value": "storage_sf_measurement"}, + "field": {"type": "string", "value": "service_delay"}, + "criticalValue": {"type": "float", "value": 10.0}, + "topicID": {"type": "string", "value": "storage_sf_delay_exceeded"}, + "whereCondition": {"type": "string", "value": "sf_package='storage' and sf='storage-users'"} + } + } +] +``` + +The alerts configuration must be sent along with the TOSCA template so that the *topics* can be validated, i.e. so that we can check that there +is an alert publishing data to each of the TOSCA-specific events, and so that the period value for each event can be read (this point is open to discussion). + +For handling alerts defined with the aforementioned methodology, following the discussion with IDE and their stated preference, +we would automatically subscribe a well-known SFEMC endpoint handler, which will receive HTTP POST messages when alerts trigger, +to every TOSCA-specific notification event. + +In case we decide to also allow an MSP to provide MSP-specific alert handlers, the CLMC service will expose an +additional API endpoint - /alerts/configuration/\<topicID\>/handlers, which will allow an MSP to subscribe a +webhook handler to the given topic in a situation where the MSP wishes to receive alert notifications, too. 
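
The request body and the *topic* validation described above could be sketched as follows. This is only an illustration in Python and not part of the proposal itself: the helper names `build_alert_configuration` and `missing_topics` are hypothetical, the type mapping covers only the `string` and `float` types that appear in this document, and the list of TOSCA event names is assumed to have been extracted from the TOSCA template beforehand.

```python
import json

# Assumption: only the two type names that appear in this document are mapped.
_TYPE_NAMES = {str: "string", float: "float"}


def build_alert_configuration(template_type, values):
    """Wrap plain values into one entry of the /alerts/configuration request body."""
    configuration = {}
    for name, value in values.items():
        type_name = _TYPE_NAMES.get(type(value))
        if type_name is None:
            raise TypeError(
                f"unsupported value type for {name!r}: {type(value).__name__}"
            )
        configuration[name] = {"type": type_name, "value": value}
    return {"type": template_type, "configuration": configuration}


def missing_topics(tosca_events, alert_configurations):
    """Return the TOSCA event names that no alert configuration publishes to."""
    published = {
        entry["configuration"]["topicID"]["value"]
        for entry in alert_configurations
        if "topicID" in entry["configuration"]
    }
    return sorted(set(tosca_events) - published)


# Rebuild the example request body shown above from plain values.
body = [
    build_alert_configuration(
        "threshold_exceeded",
        {
            "db": "CLMCMetrics",
            "rp": "autogen",
            "measurement": "storage_sf_measurement",
            "field": "service_delay",
            "criticalValue": 10.0,
            "topicID": "storage_sf_delay_exceeded",
            "whereCondition": "sf_package='storage' and sf='storage-users'",
        },
    )
]

# The serialized list is what would be sent in the HTTP POST request.
print(json.dumps(body, indent=2))
```

Calling `missing_topics` with the event names from the TOSCA template and the list above would report any event that has no alert publishing to it, which is one possible form of the validation step; the actual check performed by the CLMC service remains to be decided.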
+ +#### Sequence diagram of the interactions between SFEMC, CLMC and MSP + + \ No newline at end of file diff --git a/docs/image/CLMC-Notifications-SequenceD-v4.png b/docs/image/CLMC-Notifications-SequenceD-v4.png new file mode 100644 index 0000000000000000000000000000000000000000..686e2e9cc4fc76432543acb0aa1f53faae2a71a6 Binary files /dev/null and b/docs/image/CLMC-Notifications-SequenceD-v4.png differ