Commit 6288eaf6 authored by Stephen Phillips

Merge branch '103-add-user-documentation-for-the-end-to-end-delay' into 'integration'

Extends user documentation for the end to end delay

See merge request FLAME/consortium/3rdparties/flame-clmc!72
parents f7413af8 5300765f
......@@ -73,9 +73,9 @@ There is a dedicated endpoint which starts an automated graph monitoring script,
constantly executing a full processing pipeline - build temporal graph, query for end-to-end delay, write results back in InfluxDB, delete
temporal graph. The pipeline uses the defined configuration to periodically build the temporal graph and query for the end-to-end delay
from all possible UEs to every deployed service function endpoint and writes the result back into a dedicated measurement in the time-series database (InfluxDB).
For more information on the graph monitoring pipeline, see the [graph RTT slides](https://owncloud.it-innovation.soton.ac.uk/remote.php/webdav/Shared/FLAME/Project%20Reviews/2nd%20EC%20Review%20(technical)/drafts/WP4_FLAME_Graph_RTT.pptx).
For more information on the graph monitoring pipeline, see the relevant section below.
* `POST http://<clmc-host>/clmc/clmc-service/graph/monitor`
* `POST http://platform/clmc/clmc-service/graph/monitor`
* Expected JSON body serving as the configuration of the graph monitoring script:
......@@ -89,7 +89,7 @@ For more information on the graph monitoring pipeline, see the [graph RTT slides
"<service function package>": {
"response_time_field": "<field measuring the service delay of a service function - as described above>",
"request_size_field": "<field measuring the request size of a service function - as described above>",
"response_size_field": "<field measuring the response size of a service function - as descirbed above>",
"response_size_field": "<field measuring the response size of a service function - as described above>",
"measurement_name": "<the name of the measurement which contains the fields above>"
},
...
......@@ -99,7 +99,7 @@ For more information on the graph monitoring pipeline, see the [graph RTT slides
* Example request with curl:
`curl -X POST -d <JSON body> http://<clmc-host>/clmc/clmc-service/graph/monitor`
`curl -X POST -d <JSON body> http://platform/clmc/clmc-service/graph/monitor`
* Example JSON body for the tomcat-based service described above:
......@@ -133,11 +133,11 @@ The configuration described above will start a graph monitoring process executin
in the measurement named **graph_measurements**, database **fms-sfc**. To stop the graph monitoring process, use the request ID received in
the response of the previous request:
`curl -X DELETE http://<clmc-host>/clmc/clmc-service/graph/monitor/75df6f8d-3829-4fd8-a3e6-b3e917010141`
`curl -X DELETE http://platform/clmc/clmc-service/graph/monitor/75df6f8d-3829-4fd8-a3e6-b3e917010141`
To view the status of the graph monitoring process, send the same request, but using a GET method rather than DELETE.
`curl -X GET http://<clmc-host>/clmc/clmc-service/graph/monitor/75df6f8d-3829-4fd8-a3e6-b3e917010141`
`curl -X GET http://platform/clmc/clmc-service/graph/monitor/75df6f8d-3829-4fd8-a3e6-b3e917010141`
Keep in mind that since this process is executing once in a given period, it is expected to see status **sleeping** in the response.
Example response:
......@@ -147,4 +147,95 @@ Example response:
"status": "sleeping",
"msg": "Successfully fetched status of graph pipeline process."
}
```
### Graph monitoring pipeline - technical details
In order for service graph-based monitoring to be possible, the network topology graph must be built with the relevant network link latencies.
This network graph can be created/updated/deleted by sending a POST/PUT/DELETE request to the **/clmc/clmc-service/graph/network** API endpoint:
```
curl -X POST http://platform/clmc/clmc-service/graph/network
curl -X PUT http://platform/clmc/clmc-service/graph/network
curl -X DELETE http://platform/clmc/clmc-service/graph/network
```
After the network graph is built, a graph monitoring process can execute the following steps:
1) Build a temporal graph for a particular service function chain
2) Query the temporal graph for round-trip-time
3) Write results in the time-series database (InfluxDB)
4) Clean up and delete the temporal graph
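The four steps above can be sketched as a single function. This is a minimal illustration and not the CLMC implementation: the four callables and the `pairs` argument are hypothetical placeholders for whatever code performs the HTTP requests described in the following sections. The `try`/`finally` is a design choice of this sketch, so that the temporal graph is deleted even if a query fails.

```python
def run_pipeline_once(build_graph, query_rtt, write_results, delete_graph, pairs):
    """Run one pass of the four pipeline steps.

    All four callables are hypothetical and supplied by the caller:
    build_graph() returns a graph identifier, query_rtt(graph_id, start, end)
    returns one result, write_results(results) stores the results and
    delete_graph(graph_id) removes the temporal graph.
    """
    graph_id = build_graph()  # step 1: build the temporal graph
    try:
        # step 2: query the round-trip time for every (startpoint, endpoint) pair
        results = [query_rtt(graph_id, start, end) for start, end in pairs]
        # step 3: write the results to the time-series database
        write_results(results)
    finally:
        # step 4: clean up, even when a query or the write raised an exception
        delete_graph(graph_id)
    return results
```

Passing the steps in as callables keeps the sketch independent of any HTTP library and makes the order of operations easy to check in isolation.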
#### Building a temporal graph
The temporal graph can be built by sending a POST request to the **/clmc/clmc-service/graph/temporal** API endpoint. The request body
follows the same format as the one used to start the automated graph monitoring script described above. The only difference is that the
**from** and **to** timestamps must be specified, which define the time window this temporal graph relates to - for example:
```json
{
"from": "<start of the time window, UNIX timestamp, e.g. 1549881060>",
"to": "<end of the time window, UNIX timestamp, e.g. 1550151600>",
"service_function_chain": "<SFC identifier>",
"service_function_chain_instance": "<SFC identifier>_1",
"service_functions": {
"<service function package>": {
"response_time_field": "<field measuring the service delay of a service function - as described above>",
"request_size_field": "<field measuring the request size of a service function - as described above>",
"response_size_field": "<field measuring the response size of a service function - as described above>",
"measurement_name": "<the name of the measurement which contains the fields above>"
},
...
}
}
```
`curl -X POST -d <JSON body> http://platform/clmc/clmc-service/graph/temporal`
The CLMC would then build the temporal graph in its graph database (Neo4j) and populate it with the time-series data valid for the defined time window.
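As an illustration, the request body shown above could be assembled programmatically. The sketch below is only a sketch under assumptions: the function name is hypothetical, the timestamps are emitted as JSON integers (the documentation does not state whether numbers or strings are expected), and the field and measurement names in the usage example are invented placeholders.

```python
import json


def build_temporal_graph_body(sfc, sfc_instance, service_functions, start, end):
    """Return the JSON body for a temporal graph request as a string.

    start and end are UNIX timestamps in seconds that delimit the time window.
    """
    if end <= start:
        raise ValueError("'to' must be later than 'from'")
    body = {
        "from": start,  # assumption: sent as an integer
        "to": end,      # assumption: sent as an integer
        "service_function_chain": sfc,
        "service_function_chain_instance": sfc_instance,
        "service_functions": service_functions,
    }
    return json.dumps(body)


# Usage with placeholder field and measurement names (hypothetical values).
example_body = build_temporal_graph_body(
    "fms-sfc",
    "fms-sfc_1",
    {
        "fms-storage": {
            "response_time_field": "example_response_time",
            "request_size_field": "example_request_size",
            "response_size_field": "example_response_size",
            "measurement_name": "example_measurement",
        }
    },
    1549881060,
    1550151600,
)
```

The resulting string can be passed to curl with `-d`, as in the command above.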
#### Querying the temporal graph
The temporal graph built in the previous step can be used to retrieve the end-to-end delay by sending a GET request to
**/clmc/clmc-service/graph/temporal/{uuid}/round-trip-time?startpoint={ue, cluster or switch}&endpoint={service function endpoint}**.
This endpoint requires the UUID of the temporal graph received in the response from the previous step, as well as the identifiers of a UE and a service function endpoint.
The query is thus configured to return the end-to-end delay from a particular UE (User Equipment) to a particular service endpoint deployed on the FLAME platform.
For example:
`curl -X GET "http://platform/clmc/clmc-service/graph/temporal/ac2cd21c-9c36-44ea-a923-51ca3f72bf7a/round-trip-time?startpoint=ue20&endpoint=fms-storage-endpoint"`
The automated graph monitoring process (described in the previous sections) executes this query for every possible pair of a UE and a service function endpoint to
ensure that all metrics are collected.
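A minimal sketch of how the query URLs for every UE and endpoint pair could be generated is given below. It only builds the URLs and sends nothing. The function names are hypothetical, and the host name `platform` is taken from the curl examples in this document.

```python
from itertools import product
from urllib.parse import quote, urlencode

# Base URL as used in the curl examples of this document.
BASE_URL = "http://platform/clmc/clmc-service"


def round_trip_time_url(graph_uuid, startpoint, endpoint):
    """Build the round-trip-time query URL for one startpoint/endpoint pair."""
    query = urlencode({"startpoint": startpoint, "endpoint": endpoint})
    return f"{BASE_URL}/graph/temporal/{quote(graph_uuid, safe='')}/round-trip-time?{query}"


def all_query_urls(graph_uuid, ues, endpoints):
    """Build one URL per (UE, service function endpoint) pair."""
    return [round_trip_time_url(graph_uuid, ue, sfe) for ue, sfe in product(ues, endpoints)]
```

Using `urlencode` rather than string concatenation means identifiers containing reserved characters are percent-encoded instead of corrupting the query string.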
#### Writing results in InfluxDB
The responses to the queries in the previous step contain metrics such as round-trip-time, network delay and service delay. These are returned in JSON format
and must then be converted to the InfluxDB line protocol format. An example line looks like:
```
graph_measurement,flame_server=DC3,flame_sfci=fms-sfc-1,flame_location=DC3,flame_sfe=fms-storage-second-endpoint,flame_sfp=fms-storage,flame_sfc=fms-sfc,flame_sf=fms-storage-ns,traffic_source=ue24 round_trip_time=0.029501264137931037,service_delay=0.0195,network_delay=0.005 1550499460000000000
```
This measurement line could then be reported to InfluxDB with a POST request to **/clmc/influxdb/write?db={SFC identifier}**:
`curl -X POST "http://platform/clmc/influxdb/write?db=fms-sfc" --data-binary <measurement line>`
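The conversion to line protocol could be sketched as follows. The structure of the JSON response is not described here, so this sketch assumes the caller has already extracted the tags and the metric values into two dictionaries; the function names are hypothetical. All fields are written as floats, which fits the delay metrics in the example line, and tags are sorted by key, which changes only their order.

```python
def escape(text, special):
    """Backslash-escape each character of `special` that occurs in `text`."""
    for ch in special:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as an InfluxDB line protocol string."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    # Commas and spaces are special in measurement names.
    head = escape(measurement, ", ")
    # Commas, equals signs and spaces are special in tag keys and tag values.
    for key, value in sorted(tags.items()):
        head += f",{escape(key, ',= ')}={escape(str(value), ',= ')}"
    # Every field value is written as a float (assumption of this sketch).
    field_part = ",".join(
        f"{escape(key, ',= ')}={float(value)!r}" for key, value in fields.items()
    )
    return f"{head} {field_part} {int(timestamp_ns)}"
```

The returned string is what would be sent as the `--data-binary` payload in the curl command above.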
#### Clean up
Once the temporal graph is no longer used, or the time window it relates to is no longer viable, it can be deleted with a DELETE
request to **/clmc/clmc-service/graph/temporal/{uuid}**. The UUID parameter is the same as in the round-trip time query request,
i.e. the UUID received when building the temporal graph. For example:
`curl -X DELETE http://platform/clmc/clmc-service/graph/temporal/ac2cd21c-9c36-44ea-a923-51ca3f72bf7a`
#### Summary
The graph monitoring process described at the beginning of this document automates the steps described above. Given a query period, e.g. 30 seconds,
the process executes the pipeline every 30 seconds over non-overlapping, contiguous time windows. For each time window, a temporal graph is built,
queried for end-to-end delay, the results are written to InfluxDB, and the graph is finally deleted.
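To make "non-overlapping, contiguous time windows" concrete, the sketch below generates such windows for a given query period. The function name is hypothetical, and how the actual monitoring process treats the window boundaries is an assumption: here each window ends exactly where the next one starts.

```python
def time_windows(start, period, count):
    """Yield `count` contiguous (from, to) windows of `period` seconds each.

    start is a UNIX timestamp in seconds; each window begins where the
    previous one ended, so windows never overlap and leave no gaps.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    for i in range(count):
        window_start = start + i * period
        yield window_start, window_start + period


# Three consecutive 30-second windows starting at an example timestamp.
windows = list(time_windows(1549881060, 30, 3))
```

Each `(from, to)` pair would become the **from** and **to** values of one temporal graph request.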