diff --git a/docs/total-service-request-delay.md b/docs/total-service-request-delay.md
index d85798402eef43a2f29e2aa301c64ff194fa6769..c59743047188e3f49d3c272283da7a66de476077 100644
--- a/docs/total-service-request-delay.md
+++ b/docs/total-service-request-delay.md
@@ -16,7 +16,7 @@ If we ignore the OSI L6 protocol (e.g. HTTP, FTP, Tsunami) then we are modelling
 
 ```
 network_delay = latency + (time difference from start of the data to the end of the data)
-	      = latency + data_delay
+              = latency + data_delay
 ```
 
 ### Latency
@@ -61,7 +61,8 @@ let
 then
   data_size = (packet_size / packet_payload_size) * file_size
 or
-  data_size = (packet_size / packet_size - packet_header_size) * file_size
+  data_size = [packet_size / (packet_size - packet_header_size)] * file_size
+            = file_size * packet_size / (packet_size - packet_header_size)
 ```
 
 ### Measuring and Predicting
@@ -70,9 +71,12 @@ Bringing the above parts together we have:
 
 ```
 network_delay = latency + data_delay
-              = (distance * 5 / 1E9) + {[(packet_size / packet_size - packet_header_size) * file_size] * 8 / bandwidth * 1E6}
+              = (distance * 5 / 1E9) + {[file_size * packet_size / (packet_size - packet_header_size)] * 8 / (bandwidth * 1E6)}
+              = (distance * 5 / 1E9) + (8 / 1E6) * (file_size / bandwidth) * [packet_size / (packet_size - packet_header_size)]
 ```
 
+i.e. `file_size / bandwidth` with an adjustment to increase the size of the data transmitted because of the packet header and some unit factors.
+
 We want to be able to measure the `network_delay` and also want to be able to predict what the delay is likely to be for a given deployment.
 
 Parameter | Known / measured
@@ -149,7 +153,7 @@ Our service_delay equation would then just reduce to:
 
 ```
 service_delay = workload * f(benchmark, service function characteristics)
-	      = workload * service_function_scaling_factor / benchmark
+              = workload * service_function_scaling_factor / benchmark
 ```
 
 The `service_function_scaling_factor` essentially scales the `workload` number into a number of Megaflops. So for a `workload` in bytes the `service_function_scaling_factor` would be representing Megaflops/byte.
@@ -158,7 +162,50 @@ If we don't have a benchmark then the best we can do is approximate the benchmar
 
 ```
 service_delay = workload * f(benchmark, service function characteristics)
-	      = workload * service_function_scaling_factor / cpus
+              = workload * service_function_scaling_factor / cpus
+```
+
+Is this a simplification too far? It ignores the size of RAM for instance which cannot normally be included as a linear factor (i.e. twice as much RAM does not always give twice the performance). Not having sufficient RAM results in disk swapping or complete failure. Once you have enough for a workload, adding more makes no difference.
+
+## Conclusion
+
+The total delay is:
+
+```
+total_delay = forward_network_delay + service_delay + reverse_network_delay
 ```
-
-Is this a simplification too far? It ignores the size of RAM for instance which cannot normally be included as a linear factor (i.e. twice as much RAM does not always give twice the performance). Not having sufficient RAM results in disk swapping or complete failure. Once you have enough for a workload, adding more makes no difference.
\ No newline at end of file
+To measure or predict the `total_delay` we need:
+
+```
+total_delay = forward_latency + forward_data_delay + service_delay + reverse_latency + reverse_data_delay
+            = forward_latency
+              + {(8 / 1E6) * (request_size / bandwidth) * [packet_size / (packet_size - packet_header_size)]}
+              + request_size * service_function_scaling_factor / cpus
+              + reverse_latency
+              + {(8 / 1E6) * (response_size / bandwidth) * [packet_size / (packet_size - packet_header_size)]}
+```
+
+With:
+
+* forward_latency / s
+* request_size / Bytes
+* bandwidth / Mb/s (b = bit)
+* packet_size / Bytes
+* packet_header_size / Bytes
+* service_function_scaling_factor / Mflops/Byte
+* cpus (unitless)
+* reverse_latency / s
+* response_size / Bytes
+
+This calculation assumes:
+
+* there is no network congestion, i.e. the whole bandwidth is available
+* that the protocol (such as TCP) has no effect (see discussion of flow control above)
+* there is no data loss on the network
+* that the service delay is proportional to the `request_size`, i.e. that the service is processing the data in the request
+* that the service does not start processing until the complete request is received
+* that the amount of memory and disk on the compute resource is irrelevant
+* that the service delay is inversely proportional to the number of CPUs (and all CPUs are equal)
+* that the compute resource is invariable, 100% available and 100% reliable
+* that the distribution of API calls is constant and that the workload can be represented sufficiently by the average request size
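
The `total_delay` model assembled in this diff can be sanity-checked with a short script. A minimal sketch in Python; all function names and numeric parameter values below are invented for illustration, not taken from the document:

```python
# Sketch of the delay model from the diff above. All numeric values are
# illustrative assumptions, not measurements.


def network_delay(distance_m, data_size_bytes, bandwidth_mbps,
                  packet_size=1500, packet_header_size=40):
    # latency: ~5 ns per metre of fibre (distance * 5 / 1E9)
    latency = distance_m * 5 / 1e9
    # data_delay: bytes -> bits (x8), Mb/s -> b/s (x1E6), inflated by
    # the packet-header overhead factor
    overhead = packet_size / (packet_size - packet_header_size)
    data_delay = (8 / 1e6) * (data_size_bytes / bandwidth_mbps) * overhead
    return latency + data_delay


def service_delay(request_size_bytes, scaling_factor, cpus):
    # No-benchmark approximation: workload scaled by
    # service_function_scaling_factor, divided by the CPU count
    # (all CPUs assumed equal)
    return request_size_bytes * scaling_factor / cpus


def total_delay(distance_m, request_size, response_size, bandwidth_mbps,
                scaling_factor, cpus):
    # forward_network_delay + service_delay + reverse_network_delay
    return (network_delay(distance_m, request_size, bandwidth_mbps)
            + service_delay(request_size, scaling_factor, cpus)
            + network_delay(distance_m, response_size, bandwidth_mbps))


# Hypothetical deployment: 100 km link, 100 Mb/s, 1 MB request,
# 10 kB response, 2 CPUs
print(total_delay(100e3, 1e6, 10e3, 100, 1e-4, 2))
```

Note the asymmetry the model captures: the request and response travel the same link, but their data delays differ with their sizes, while the latency term depends only on distance.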