diff --git a/docs/Measuring-E2E-MS-Performance.md b/docs/Measuring-E2E-MS-Performance.md
index 9b9acc0728e6f3f1caf287a0f89b1e639992bf3b..ad3fa8de7a54c2925fd43e1b1df69821d20f0a11 100644
--- a/docs/Measuring-E2E-MS-Performance.md
+++ b/docs/Measuring-E2E-MS-Performance.md
@@ -50,19 +50,19 @@ Here, we list the assumptions we make for measuring and understanding E2E perfor
 
 * Network measurement - the assumption is that we have a measurement for the network path delays between service function routers, called **network_delays**, providing the following information:
 
-| path (tag) | source (tag) | target (tag) | delay | time |
+| path_ID (tag) | source_SFR (tag) | target_SFR (tag) | delay | time |
 | --- | --- | --- | --- | --- |
 | path identifier | source SFR | target SFR | e2e delay for the given path (ms) | timestamp of measurement |
 
-Here, the **path** tag value is the identifier of the path between two nodes (service routers) in the network topology obtained from FLIPS. The **source** tag value is the source service router for the identified path, while the **target** tag value is the target service router. The delay field value is the network end-to-end delay in milliseconds that a packet would experience when traversing the path between the two SFRs identified in the tag values.
+Here, the **path_ID** tag value is the identifier of the path between two nodes (service function routers) in the network topology obtained from FLIPS. The **source_SFR** tag value is the source service function router for the identified path, while the **target_SFR** tag value is the target service function router. The delay field value is the network end-to-end delay in milliseconds that a packet would experience when traversing the path between the two SFRs identified in the tag values.
 
 An example row would be:
 
-| path (tag) | source (tag) | target (tag) | delay | time |
+| path_ID (tag) | source_SFR (tag) | target_SFR (tag) | delay | time |
 | --- | --- | --- | --- | --- |
 | SFR-A---S1---S2---S3---SFR-B | SFR-A | SFR-B | 10 | 1525334761282000 |
 
-The semantics of the row is that a packet traversing the path from SFR-A (service router) through S1, S2, S3 (switches) to SFR-B (service router) will experience an averaged delay of 10ms.
+The semantics of the row is that a packet traversing the path from SFR-A (service function router) through S1, S2, S3 (switches) to SFR-B (service function router) will experience an averaged delay of 10 ms.
 
 * Request/Response path - the assumption is that a response will traverse the same network path as the request, but in reverse direction.
 
@@ -82,7 +82,7 @@ An example row would be:
 | ms-A.ict-flame.eu | ms-A-sf_INSTANCE | SFR-B | 27 | 1525334761282000 |
 
 The semantics of the row is that the response time for a service function instance with ID *ms-A-sf_INSTANCE* serving media service
-*ms-A.ict-flame.eu* and connected to the FLAME network through service router *SFR-B* will have an averaged response time of 27 ms.
+*ms-A.ict-flame.eu* and connected to the FLAME network through service function router *SFR-B* will have an averaged response time of 27 ms.
 
 ## E2E Model
 
@@ -176,7 +176,7 @@ The ultimate goal is to populate a new measurement, called **e2e_delays**, which
 | path_ID (tag) | source_SFR (tag) | target_SFR (tag) | FQDN (tag) | sf_instance (tag) | delay_forward | delay_reverse | delay_service | time |
 | --- | --- | --- | --- | --- | --- | --- | --- | --- |
 
-* *pathID* - tag ID used to identify the network path (bidirectional path identifier)
+* *path_ID* - tag ID used to identify the network path (bidirectional path identifier)
 * *source_SFR* - tag used to identify the source service function router (the start of the network path)
 * *target_SFR* - tag used to identify the target service function router (the end of the network path)
 * *FQDN*- tag used to identify the media service
@@ -191,10 +191,10 @@ Then we can easily query on this measurement to obtain different performance ind
 The aggregation process provides similar functionality to that of an INFLUX continuous query. During each sample period the process collects and averages network and service delay data for the last 10 seconds (for example).
 The executed queries are:
 
-* Network delays query - to obtain the network delay values and group them by their **path**, **source** and **target** identifiers:
+* Network delays query - to obtain the network delay values and group them by their **path_ID**, **source_SFR** and **target_SFR** identifiers:
 
 ```
-SELECT mean(delay) as "net_delay" FROM "E2EMetrics"."autogen"."network_delays" WHERE time >= now() - 10s and time < now() GROUP BY path, source, target
+SELECT mean(delay) as "net_delay" FROM "E2EMetrics"."autogen"."network_delays" WHERE time >= now() - 10s and time < now() GROUP BY path_ID, source_SFR, target_SFR
 ```
 * Media service response time query - to obtain the response time values of the media service instances and group them by **FQDN**, **sf_instance** and **sfr** identifiers:
 
@@ -203,7 +203,7 @@ SELECT mean(response_time) as "response_time" FROM "E2EMetrics"."autogen"."servi
 ```
 
 The results of the queries are then matched against each other on the **target** and **sfr** tag values (for *network_delays* and *service_delays* respectively):
-on every match of the **sfr** tag of the **service_delays** measurement with the **target** service router of the **network_delays** measurement, the rows are combined
+on every match of the **sfr** tag of the **service_delays** measurement with the **target_SFR** tag of the **network_delays** measurement, the rows are combined
 to obtain an **e2e_delay** measurement row, which is posted back to influx.
 
 Example:
@@ -214,13 +214,13 @@ Let's assume we have these results from the two queries:
 
 ```
 name: network_delays
-tags: path=SFR-A---SFR-B, source=SFR-A, target=SFR-B
+tags: path_ID=SFR-A---SFR-B, source_SFR=SFR-A, target_SFR=SFR-B
 time net_delay
 ---- ---------
 1524833145975682287 9.2
 
 name: network_delays
-tags: path=SFR-A---SFR-B, source=SFR-B, target=SFR-A
+tags: path_ID=SFR-A---SFR-B, source_SFR=SFR-B, target_SFR=SFR-A
 time net_delay
 ---- ---------
 1524833145975682287 10.3
@@ -248,37 +248,37 @@ The resulting row would then be posted back to influx in the **e2e_delays** meas
 
 ### Monitoring network delays
 
-Here, we describe the process of obtaining network delays between two service function routers in the network topology. CLMC retrieves network path delays between any two SFRs, see below (**SR** denotes a service router, **S** denotes a switch):
+Here, we describe the process of obtaining network delays between two service function routers in the network topology. CLMC retrieves network path delays between any two SFRs, see below (**SFR** denotes a service function router, **S** denotes a switch):
 
-
+
 
-SFR monitoring provides us with FIDs at each service router, which are bidirectional path IDs. From those, we derive the desired SR-SR network latencies. For instance, if we take the network graph example and analyse service router **SR3**. We would get 2 FIDs for this router - one for the path to reach SR2 and one for the path to reach SR1.
+SFR monitoring provides us with FIDs at each service function router, which are bidirectional path IDs. From those, we derive the desired SFR-SFR network latencies. For instance, if we take the network graph example and analyse service function router **SFR3**, we would get 2 FIDs for this router: one for the path to reach SFR2 and one for the path to reach SFR1.
 
-We assume that the FID for reaching *SR1* from *SR3* tells us the path goes through nodes *S3* and *S6*.
+We assume that the FID for reaching *SFR1* from *SFR3* tells us the path goes through nodes *S3* and *S6*.
 
-
+
 
-Hence, we accumulate the individual link delays to derive the full SR-SR delay for both forward and reverse direction.
+Hence, we accumulate the individual link delays to derive the full SFR-SFR delay for both the forward and reverse directions.
 
-delay_forward = SR3-S3 + S3-S6 + S6-SR1 = 12 + 3 + 3 = 18
-delay_reverse = SR1-S6 + S6-S3 + S3-SR3 = 1 + 5 + 10 = 16
+delay_forward = SFR3-S3 + S3-S6 + S6-SFR1 = 12 + 3 + 3 = 18
+delay_reverse = SFR1-S6 + S6-S3 + S3-SFR3 = 1 + 5 + 10 = 16
 
-Now, we assume that the FID for reaching *SR2* from *SR3* tells us the path goes through nodes *S4* and *S2*.
+Now, we assume that the FID for reaching *SFR2* from *SFR3* tells us the path goes through nodes *S4* and *S2*.
 
-
+
 
-Hence, we accumulate the individual link delays to derive the full SR-SR delay for both forward and reverse direction.
+Hence, we accumulate the individual link delays to derive the full SFR-SFR delay for both the forward and reverse directions.
 
-delay_forward = SR3-S4 + S4-S2 + S2-SR2 = 12 + 4 + 5 = 21
-delay_reverse = SR2-S2 + S2-S4 + S4-SR3 = 8 + 2 + 11 = 21
+delay_forward = SFR3-S4 + S4-S2 + S2-SFR2 = 12 + 4 + 5 = 21
+delay_reverse = SFR2-S2 + S2-S4 + S4-SFR3 = 8 + 2 + 11 = 21
 
 Overall, from this analysis, the following data will be reported to Influx in the **network_delays** measurement:
 
-| path (tag) | source (tag) | target (tag) | delay | time |
+| path_ID (tag) | source_SFR (tag) | target_SFR (tag) | delay | time |
 | --- | --- | --- | --- | --- |
-| SR3-SR1 | SR3 | SR1 | 18 | 1525334761282000 |
-| SR3-SR1 | SR1 | SR3 | 16 | 1525334761282000 |
-| SR3-SR2 | SR3 | SR2 | 21 | 1525334761282000 |
-| SR3-SR2 | SR2 | SR3 | 21 | 1525334761282000 |
+| SFR3-SFR1 | SFR3 | SFR1 | 18 | 1525334761282000 |
+| SFR3-SFR1 | SFR1 | SFR3 | 16 | 1525334761282000 |
+| SFR3-SFR2 | SFR3 | SFR2 | 21 | 1525334761282000 |
+| SFR3-SFR2 | SFR2 | SFR3 | 21 | 1525334761282000 |
 
 ### Monitoring media service response times
\ No newline at end of file
diff --git a/docs/image/network-SFR3-SFR1.png b/docs/image/network-SFR3-SFR1.png
new file mode 100644
index 0000000000000000000000000000000000000000..2c407b99ad43f59678dade42c34112bc38f73980
Binary files /dev/null and b/docs/image/network-SFR3-SFR1.png differ
diff --git a/docs/image/network-SFR3-SFR2.png b/docs/image/network-SFR3-SFR2.png
new file mode 100644
index 0000000000000000000000000000000000000000..ece0b4495be287c3b867aa6caced7317464d5e83
Binary files /dev/null and b/docs/image/network-SFR3-SFR2.png differ
diff --git a/docs/image/network-SR3-SR1.png b/docs/image/network-SR3-SR1.png
deleted file mode 100644
index 97cba1cc8d197111e4053d465783aea0c2659e00..0000000000000000000000000000000000000000
Binary files a/docs/image/network-SR3-SR1.png and /dev/null differ
diff --git a/docs/image/network-SR3-SR2.png b/docs/image/network-SR3-SR2.png
deleted file mode 100644
index 507da9475d419122ad126f59c272bed3772055c3..0000000000000000000000000000000000000000
Binary files a/docs/image/network-SR3-SR2.png and /dev/null differ
diff --git a/docs/image/network_graph.png b/docs/image/network_graph.png
index 25d62efb86edde1a6d2cfd0f05a08e1fa92c8342..587777c6f32bb3427faa75e69f1afd3dc17c9713 100644
Binary files a/docs/image/network_graph.png and b/docs/image/network_graph.png differ
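
Review note: the link-delay accumulation described in the "Monitoring network delays" section of the patched document can be sketched in Python as below. This is only an illustration and not part of the patch. The names `LINK_DELAYS`, `path_delay` and `network_delay_rows` are hypothetical, the per-link values are copied from the SFR3 to SFR1 worked example, and the `path_ID` format simply mirrors the example table; how CLMC actually obtains link delays from the FIDs is not shown in this diff.

```python
from typing import Dict, List, Tuple

# Hypothetical per-link delays in ms, keyed by (from_node, to_node).
# Values are taken from the SFR3 -> SFR1 worked example; links are directional.
LINK_DELAYS: Dict[Tuple[str, str], float] = {
    ("SFR3", "S3"): 12, ("S3", "S6"): 3, ("S6", "SFR1"): 3,   # forward direction
    ("SFR1", "S6"): 1, ("S6", "S3"): 5, ("S3", "SFR3"): 10,   # reverse direction
}


def path_delay(nodes: List[str], link_delays: Dict[Tuple[str, str], float]) -> float:
    """Sum the delays of consecutive links along the given node sequence."""
    return sum(link_delays[(a, b)] for a, b in zip(nodes, nodes[1:]))


def network_delay_rows(nodes: List[str],
                       link_delays: Dict[Tuple[str, str], float],
                       timestamp: int) -> List[dict]:
    """Build one row per direction, shaped like the network_delays table."""
    source, target = nodes[0], nodes[-1]
    path_id = f"{source}-{target}"  # assumed format, mirrors the example table
    reverse_nodes = list(reversed(nodes))
    return [
        {"path_ID": path_id, "source_SFR": source, "target_SFR": target,
         "delay": path_delay(nodes, link_delays), "time": timestamp},
        {"path_ID": path_id, "source_SFR": target, "target_SFR": source,
         "delay": path_delay(reverse_nodes, link_delays), "time": timestamp},
    ]


rows = network_delay_rows(["SFR3", "S3", "S6", "SFR1"], LINK_DELAYS, 1525334761282000)
for row in rows:
    print(row)
```

With the values above, the two rows carry delays of 18 ms (forward) and 16 ms (reverse), matching the first two rows of the **network_delays** table in the patch.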