- FLAME CLMC Information Model Specification
- Authors
- Service Management and Control Decisions
- An Elementary Starting Point: The Static Configuration Scenario
- Where Next: Fast Variable Configuration Scenarios
- Information Model
- Media Service (https://gitlab.it-innovation.soton.ac.uk/mjb/flame-clmc/issues/2)
- Configuration (https://gitlab.it-innovation.soton.ac.uk/mjb/flame-clmc/issues/3)
- Monitoring (https://gitlab.it-innovation.soton.ac.uk/mjb/flame-clmc/issues/8)
- Information Security (https://gitlab.it-innovation.soton.ac.uk/mjb/flame-clmc/issues/25) (TBC Stephen Phillips)
- Data Subject (https://gitlab.it-innovation.soton.ac.uk/mjb/flame-clmc/issues/24) (TBC Stephen Phillips)
- Measurement Model
- General
- Temporal Measurements (TBC Simon Crowle)
- Spatial Measurements (TBC Simon Crowle)
- Decision Context
- Data Retention Policy
- Architecture
- General
- Integration with FLIPS Monitoring
- Measurements Summary
- Configuration
- Monitoring
- Capacity Measurements
- Platform Measurements
- Media Service Measurements
- Service Function Chain Measurements
- Surrogate Measurements
- Network Measurements
- VM Measurements
- Service Measurements
- Worked Usage Scenario - MPEG-DASH
- CLMC Use Case Scenario
- MISC Measurements and Further Questions
- Link Measurements
- Other Issues
FLAME CLMC Information Model Specification
© University of Southampton IT Innovation Centre, 2017
This document describes the configuration and monitoring specification for cross-layer management and control within the FLAME platform. All information measured by the CLMC aims to improve management and control decisions made by the platform and/or media service providers against defined performance criteria such as increased Quality of Experience and reduced cost.
Authors
Authors | Organisation |
---|---|
Michael Boniface | University of Southampton, IT Innovation Centre |
Simon Crowle | University of Southampton, IT Innovation Centre |
Service Management and Control Decisions
Service management decisions relate to processes for Service Request Management, Fault Management and Configuration Management. There are many possible management and control decisions and it is the purpose of the CLMC to provide decision makers with empirical knowledge to design and implement better policies. The FLAME architecture describes how the CLMC uses KPIs to measure performance and highlights examples of control policies such as shortest path routing to a SF and horizontal scaling of SFs in response to changes in workload. A Platform Provider and a Media Service Provider will have KPI targets that are different and also not independent of each other. For example, allocating all of the resources needed for an expected peak workload of a media service when it is submitted for orchestration would guarantee a performance level. However, the outcome would typically produce low utilisation and increased costs because the peak workload occurs for only a fraction of the overall service operation time. The solution is to provide greater flexibility by exploiting points of variability within the system in relation to constraints. Constraints are imposed by policy (e.g. a limit on resource allocation) and technology limitations (e.g. VM boot time, horizontal/vertical scaling, routing).
The management and control processes implemented by the FLAME platform define the decisions, variability and constraints. The detail for the implementation of orchestration, management and control is under discussion and the following is based on a best understanding of what was described in the FLAME architecture.
An Elementary Starting Point: The Static Configuration Scenario
The first scenario to consider is an entirely static configuration. In this scenario a media service provider defines explicitly the infrastructure resources needed for a media service. The static configuration is what is proposed by the adoption of the current TOSCA specification for the Alpha release. Here an MSP declares the resources needed to deliver a desired performance level (implicitly known to the MSP). In an extreme case, the approach results in a static infrastructure configuration where the MSP defines the entire topology including servers, links and resource requirements. This would include server locations (central, metro and edge DCs) and when the servers are needed. The most basic case is to deploy everything now for the lifetime of the media service. This full declaration would statically configure surrogates through the explicit declaration of servers and the software deployed on those servers.
In this case, the Platform Provider is responsible for allocating the requested resources to the media service provider for the lifetime of the media service. The performance of the service is entirely related to the knowledge of the MSP and the workload over time.
Even this simple example leads to important decisions:
D1: “How much infrastructure resource does a PP need from an IP?”
The infrastructure resource (slice) defines a topology of compute, storage and network resources allocated by an IP to a PP. The slice is used by the PP to resource media services. The PP will allocate proportions of the slice to media services within the lifecycle of such services. In most cases, a PP will need to define resource management policies that define rules for allocation of resources considering that multiple media services are contending for such resources.
The capacity of the slice and the distribution of the resources within the infrastructure is a planning decision made by the PP based on a prediction of media service demand. The allocation of a slice has cost implications as from an IP’s perspective resources are dedicated to a PP. Depending on the business model and cost structures, the slice allocation would typically become a fixed cost to the PP and revenue for the IP. The PP must now allocate the slice to MSPs in the context of KPIs designed to maximise revenue from the slice.
Issues related to this decision include:
- What are the temporal constraints on a slice? Is there a defined end time, recurring subscription or is a slice perpetual?
- How fine grained are temporal constraints considering the fact that an IP has resource scarcity at edge DCs in comparison to metro and central DCs?
- What are the states of a slice? What causes the state transition?
- Can a slice be modified and if so how can the slice change?
D1 CLMC outcome: a set of measurements describing an infrastructure slice.
D2: “How much infrastructure resource does an MSP need from a PP?”
Once the PP has a slice then media services can be orchestrated, managed and controlled within the slice. Here the PP must consider the MSP infrastructure resource requirements. In the Alpha release FLAME adopts the current TOSCA specification where MSPs declaratively define the server resources required for each SF. The PP has no understanding of how a media service will behave in response to the resource allocation as that knowledge is within the MSP. In TOSCA++ FLAME is exploring KPI-based media service specifications where resource management knowledge forms part of the platform’s responsibility.
Issues related to this decision include:
- What are the temporal constraints on resource requirements within a TOSCA specification?
- How fine grained are the temporal constraints considering that a media service includes a set of media components with temporal resourcing requirements? E.g. media component A needs resource on Monday and media component B resource on Tuesday.
- What are the spatial constraints associated with the resource requirements? Does an MSP specify the precise DC (or set of DCs) where the SF needs or can be deployed? In effect, if the MSP says where the SF needs to be deployed this encodes the surrogate policy directly within the media service definition.
- How much variability is there in routing rules? How much of this is implicit within the platform implementation (e.g. coincidental multicast features)?
D2 CLMC outcome: a set of measurements describing media service infrastructure requirements.
Where Next: Fast Variable Configuration Scenarios
Variable configuration identifies configuration state that can change in the lifetime of a media service. Variability in configuration state includes:
- Vertically up and down scaling SF resources (i.e. compute, storage, network IO)
- Horizontally up and down scaling SF resources (i.e. replication)
- Distributing SFs by location (i.e. placement of a VM on an edge DC)
- Routing traffic between SFs (i.e. load balancing algorithms)
- Adapting content (i.e. reducing the resolution of a video stream)
Each transition in state is a decision that has a time in the lifecycle (when is it implemented), a duration (how long does it take to implement), an actor (who is responsible) and an expected outcome.
General issues related to variable configuration include:
- What are the points of variability within the platform?
- How is variability configured, either through default platform policy or TOSCA templates?
- Where are we contributing innovation in variability? e.g. network + compute + service factors considered together
We now discuss key decisions associated with variable configuration.
D3: “When should resources be allocated to a media service”?
When a PP receives a request to orchestrate a media service the PP must decide when to allocate infrastructure resources. Allocation has a temporal dimension defining a start time and an end time. An allocation in the future can be seen as a commitment. Allocation is important for accounting purposes but even more important in situations of resource scarcity. In most public clouds, resources from an MSP perspective are assumed to be infinite and there’s little need to consider temporal constraints associated with resource allocations. As long as the MSP has budget to pay for the resources, public cloud providers will scale those resources as requested.
In FLAME we have resource scarcity and contention in edge DCs and therefore MSPs and the PP must find workable ways to negotiate allocation of resources over time. Different resource management policies can be considered.
- Allocate on request: PP allocates when the orchestration request is made. The PP would determine if sufficient infrastructure capacity exists considering the current commitments to other media services and if capacity is available then the resources would be allocated. This is a reservation for an MSP and is likely to result in underutilisation of the resources and increased costs for an MSP but may be needed to guarantee performance.
- Allocate on placement: PP allocates when the SFs are placed. The impact depends on the placement strategy, as placing SFs when the MS is requested has the same effect as allocating all on request. If placement is selective based on factors such as utilisation/demand then some cost reduction may be achieved at the risk that the resources might not be available. Note that placement does incur resource allocation to the MSP (e.g. storage and network I/O for ingest) but this is traded off against the potential to boot and respond to demand quickly.
- Allocate on boot: PP allocates when the SFs are booted, if resources are available. Here the VMs are placed with a defined resource that is allocated when the machine boots. The PP needs to decide if the machine can be booted according to the utilisation by other VMs deployed on the server.
- Best effort with contention ratio: PP does not make any attempt to allocate resources but does place based on a defined contention ratio. Here there’s a high risk that performance is degraded by others competing for the resources.
Some resource management constraints relate to peak usage rate, for example, 50M/s peak and 100G a month usage.
Issues related to this decision include:
- What is the resource management policy for Alpha?
- Do different policies apply for different types of infrastructure resources?
- How long does it take to allocate different types of infrastructure resources?
D3 CLMC outcome: a set of measurements describing an allocation of infrastructure to a media service or SF over a time period
D4: “Where can a SF be placed over a time period”?
When a media service template is submitted for orchestration the PP must determine where SFs can be placed. Placement of a SF results in a VM being deployed on a server ready to be booted. Placement uses storage resources associated with a VM and network resources for ingest of the VM/content but does not utilise resources such as cpu, memory and data I/O incurred when the VM is used.
In Alpha, where no KPIs are provided, placement is a spatial/temporal decision based on a function of the following measurements:
- infrastructure slice
- media service allocations
- SF server requirements
The outcome is a set of server options where an SF could be placed within a time period. This outcome is not related to the CLMC monitoring beyond CLMC measurements providing input to placement functions
D5: “Where is a SF best placed over a time period”?
The question of where it is best to place an SF is challenging and depends on responsibility for delivering KPIs. A PP may define a KPI to achieve a utilisation target of 80% for servers and place VMs on servers according to a utilisation measurement. An MSP may have a KPI to achieve a response time of 200 ms for 80% of requests and place VMs according to request rate and location measurements.
*The outcome is a decision on where to place a SF. There’s no change to system state at this point, just a decision to take an action now or in the future.*
D6: “When is a SF placed”?
The placement strategy is driven by KPIs such as response time. Placement takes time as the VMs and content must be transferred to a server. Placed VMs boot faster but they consume storage resources as a consequence.
A default PP strategy may be needed for the Alpha release. For example, a strategy could be to place and boot in a metro or central DC where there’s less scarcity, and then selectively place/boot VMs in edge DCs on demand. However, it’s not clear how such behaviour can be expressed in the TOSCA specification and how this relates to allocations. A default policy could be that the PP can place a SF on any compute node in the network where there are sufficient resources, with a guarantee that there will be at least one instance of a SF; it is then the PP’s decision to create surrogates rather than have an explicit definition as per the static configuration scenario above. This policy is sensible as it moves towards the situation where the PP manages services based on KPIs; however, it does require the PP to manage allocations over time in response to demand.
D6 CLMC outcome: VM configuration measurement updating state to “placed”
D7: “When is a SF booted?”
The booting strategy is driven by KPIs such as response time. VMs take time to boot. Booted VMs are available to serve requests routed to them immediately. When SFs are booted the VM consumes resources within the context of the applicable resource management policy (e.g. guaranteed allocation or with a contention ratio).
D7 CLMC outcome: VM configuration measurement updating state to “booted”
D8: “Which surrogate are requests routed to?”
An SFC may have multiple surrogate services booted and serving requests. A decision needs to be made on where to route requests. In a typical load balancing situation requests are routed using algorithms such as round robin and source-based. Routing to the closest surrogate may not deliver improved performance, especially if the surrogate is deployed on a resource constrained server and the NAP is experiencing a high level of demand. In many typical load balancing scenarios, the servers are homogeneous, network delay is not considered and requests are processed from a central point of access. In our scenario the server resources are heterogeneous, network delay is critical and requests enter from multiple points of access as defined by NAPs.
At this point it’s worth highlighting that we are considering E2E performance and that each step in an end-to-end process contributes to the overall performance. If we take latency (as a key benefit of the platform), the E2E latency is the sum of delays in networks and servers contributing to a content delivery process as shown in the diagram below:
If we know the average delay for parts of a process over a time period we have some indication of the best routing policy.
The variability factors that influence E2E latency include:
- Spatial/temporal demand
- Server placement and server resource allocation/contention over time
- Network routing and network resource allocation/contention over time
Issues related to this decision include:
- How are NAP routing decisions coordinated as requests are not sourced from a central point?
Information Model
This section provides an overview of the FLAME CLMC information model in support of service management and control decisions. The information model is designed to support the exploration and understanding of state and factors contributing to changes in state over time as shown in the primitive below:
The system (infrastructure, platform and media services) is composed of a set of configuration items that transition between different states during the lifecycle of the system. Configuration items of interest include significant components whose state changes influence the response of the system. In general, the information aims to support the process of:
- Identification of significant configuration items within the system
- Assertion of state using configuration measurements
- Measurement of response (monitoring measurements)
- Support for taking action (configuration measurements)
This process is implemented in accordance with information security and privacy constraints. The following sections provide an overview of key aspects of monitoring.
Media Service (https://gitlab.it-innovation.soton.ac.uk/mjb/flame-clmc/issues/2)
The FLAME architecture defines a media service as "An Internet accessible service supporting processing, storage and retrieval of content resources hosted and managed by the FLAME platform". A media service consists of one or more media components (also known as Service Functions) that together are composed to create an overall Service Function Chain. SFs are realised through the instantiation of virtual machines (or containers) deployed on servers based on resource management policy. Multiple VMs may be instantiated for each SF to create surrogate SFs, for example, to balance load and deliver against performance targets. Media services, SFCs, SFs, VMs, links and servers are all examples of configuration items.
Media services are described using a template structured according to the TOSCA specification (http://docs.oasis-open.org/tosca/TOSCA/v1.0/TOSCA-v1.0.html). A TOSCA template includes all of the information needed for the FLAME orchestrator to instantiate a media service. This includes all SFs, links between SFs and resource configuration information. The Alpha version of the FLAME platform is based on the current published TOSCA specification. Future developments will extend the TOSCA specification (known as TOSCA++) to meet FLAME requirements such as higher-level KPIs and location-based constraints.
The current TOSCA template provides the initial structure of the Media Service information model through specified service and resource configuration. Within this structure, system components are instantiated whose runtime characteristics are measured to inform management and control processes. Measurements relate to individual SFs as well as aggregated measurements structured according to the configured items within the system. Measurements are made by monitoring processes deployed with system components. The configured items provide the context for monitoring.
The media information model in relation to the high-level media service lifecycle is shown in the diagram below. The lifecycle includes processes for packaging, orchestration, routing and SF management/control. Each stage in the process creates context for decisions and measurements within the next stage of the lifecycle. Packaging creates the context for orchestration, orchestration creates the context for surrogate instantiation, and network topology management. In the diagram, the green concepts identify the context which can be used for filtering and queries whilst the yellow concepts are the measurement data providing runtime measurements.
The primary measurement point for a media service is a surrogate. A surrogate is an instantiation of a service function within a VM or container on a server. A surrogate exists within two main contexts: media service and virtual infrastructure. The media service context relates to the use of the surrogate within a service function chain designed to deliver content. The virtual infrastructure context relates to the host and network environment into which the surrogate is deployed. Deploying monitoring agents in different contexts and sharing information between contexts is a key part of cross-layer management and control.
The diagram highlights the need to monitor three views on a surrogate: network, host, and service. The acquisition of these different views together is a key element of the cross-layer information required for management and control. The measurements are captured by different processes running on servers but are brought together by common context allowing the information to be integrated, correlated and analysed. The surrogate can measure a service view related to the content being delivered such as request rates, content types, etc.; a VM can measure a virtual infrastructure view of a single surrogate; and the server view can measure an infrastructure view across multiple surrogates deployed on a server. These monitoring processes running on the server are managed by different stakeholders, for example, the platform operator would monitor servers, whereas the media service provider would monitor service-specific usage.
Not all information acquired will be aggregated and stored within the CLMC. The CLMC is not responsible for capturing every measurement point related to transferring bytes over the network. It's also not responsible for capturing every interaction between a user and a service. The key design principle is to acquire information from one context that can be used in another context. For example, instead of recording every service interaction an aggregate service usage metric (e.g. request rate/s) would be acquired and stored, and similar aggregation would be needed for infrastructure monitoring.
Configuration (https://gitlab.it-innovation.soton.ac.uk/mjb/flame-clmc/issues/3)
Configuration information describes the structure and state of the system over time. Each configuration item has a lifecycle that defines configuration states and the events that cause a transition between states. The following table gives examples of configuration items and states.
Configuration Item | Configuration States |
---|---|
Network | e.g. available, unavailable |
Physical Link | up, down, unknown |
Server | e.g. available, unavailable |
Port | up, down, unknown |
Service function package | published, unpublished |
Media service template | published, unpublished |
Service function chain | submitted, scheduled, starting, running, stopping, stopped, error |
Service function | starting, running, stopping, stopped, error |
Surrogate | placed, unplaced, booted, connected, error |
The state of configuration items needs to be defined
Describe the failure taxonomy
Monitoring (https://gitlab.it-innovation.soton.ac.uk/mjb/flame-clmc/issues/8)
Monitoring measures the behaviour of the system and system components over time, including metrics associated with usage and performance. Measurements are made within the context of a known configuration state. Usage monitoring information can include measurements such as network resource usage, host resource usage and service usage. Performance monitoring information can include measurements such as cpu/s, throughput/s, average response time and error rate.
Information Security (https://gitlab.it-innovation.soton.ac.uk/mjb/flame-clmc/issues/25) (TBC Stephen Phillips)
To be completed.
Data Subject (https://gitlab.it-innovation.soton.ac.uk/mjb/flame-clmc/issues/24) (TBC Stephen Phillips)
To be completed.
Measurement Model
General
The measurement model is based on a time-series model defined by the InfluxData TICK stack, called the line protocol. The protocol defines a format for measurement samples which together can be combined to create a series.
<measurement>[,<tag-key>=<tag-value>...] <field-key>=<field-value>[,<field2-key>=<field2-value>...] [unix-nano-timestamp]
Each series has:
- a name "measurement"
- 0 or more tags for measurement context
- 1 or more fields for the measurement values
- a timestamp.
The model is used to report both configuration and monitoring data. In general, tags are used to provide configuration context for measurement values stored in fields. The tags are structured to support queries by the KPIs and dimensions defined in the FLAME architecture.
Tags are automatically indexed by InfluxDB. Global tags can be automatically inserted by contextualised agents collecting data from monitoring processes. The global tags used across different measurements are a key part of the database design. Although InfluxDB is a schemaless database allowing arbitrary measurement fields to be stored (e.g. allowing for a media component to have a set of specific metrics), using common global tags allows the aggregation of measurements across time with a known context.
Although similar to SQL, InfluxDB is not a relational database and the primary key for all measurements is time. Schema design recommendations can be found here: https://docs.influxdata.com/influxdb/v1.4/concepts/schema_and_data_layout/
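As a minimal sketch of this structure (assuming a local InfluxDB 1.x instance and the influxdb-python client; the database name `clmc` and all tag/field values are illustrative, not part of this specification), the following writes one measurement point with a name, context tags, value fields and a nanosecond timestamp:

```python
import time
from influxdb import InfluxDBClient  # InfluxDB 1.x Python client (pip install influxdb)

# Illustrative database name and context tags -- not defined by this specification.
client = InfluxDBClient(host="localhost", port=8086, database="clmc")
client.create_database("clmc")

point = {
    "measurement": "service_response",              # the series name
    "tags": {                                        # indexed measurement context
        "sfc": "media_service_A",
        "sfc_i": "media_service_A_1",
        "sf_i": "mpegdash_1",
        "server": "edge_dc_1_server_3",
        "location": "DATACENTRE_1",
    },
    "fields": {"response_time": 0.21, "requests": 42},  # measurement values
    "time": time.time_ns(),                          # unix nano timestamp
}

client.write_points([point], time_precision="n")
```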
Temporal Measurements (TBC Simon Crowle)
Monitoring data must have time-stamp values that are consistent and synchronised across the platform. This means that all VMs hosting SFs should have a synchronised system clock, or at least (and more likely) a means by which a millisecond offset from the local time can be retrieved so that a 'platform-correct' time value can be calculated.
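As an illustration of deriving such a 'platform-correct' timestamp, the sketch below applies a measured clock offset before producing a nanosecond timestamp. It assumes the ntplib package and that some NTP server acts as the platform reference clock; the server names are hypothetical.

```python
import time
import ntplib  # pip install ntplib; assumes an NTP source acts as the platform reference clock

def platform_time_ns(reference_server="ntp.platform.example"):
    """Return a unix nano timestamp corrected by the offset between the
    local clock and the platform reference clock."""
    offset_s = ntplib.NTPClient().request(reference_server, version=3).offset
    return int((time.time() + offset_s) * 1e9)

# Example: stamp a measurement sample with the corrected time
print(platform_time_ns("pool.ntp.org"))
```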
Describe approaches to integrate temporal measurements, time as a primary key, etc.
Discuss precision
influx -precision rfc3339: the -precision argument specifies the format/precision of any returned timestamps. Here, rfc3339 tells InfluxDB to return timestamps in RFC3339 format (YYYY-MM-DDTHH:MM:SS.nnnnnnnnnZ).
Spatial Measurements (TBC Simon Crowle)
Location can be represented in two forms: labelled (tag) and numeric (longitude and latitude as decimal degrees). Note that the location label is likely to be a global tag.
Tag location
location | loc_long | loc_lat |
---|---|---|
DATACENTRE_1 | 0 | 0 |
A surrogate placed on a server has no means to obtain GPS coordinates but has a location label provided to it as server context. It reports zeros for longitude and latitude. In subsequent data analysis we can search for this SF by location label.
GPS coordinate location
location_label | location_long | location_lat |
---|---|---|
LAMP_1 | 50.842715 | -0.778276 |
Consider a SF that is a proxy for a user attached to a NAP running in street lamp post LAMP_1. Here we have knowledge both of the logical location of the service and of the fine-grained, dynamic position of the service user.
Note that tags are always strings and cannot be floats, therefore longitude and latitude will always be stored as measurement fields.
Discuss integrating and analysing location measurements
If tags are used then measurements of GPS coordinates will need to be translated into a tag-based approximation. For example, if a user device is tracking location information then for that to be combined with a server location the GPS coordinate needs to be translated.
Matching on tags is limited to exact matching and potentially spatial hierarchies (e.g. country.city.street). Using a coordinate system allows mathematical functions to be developed (e.g. proximity functions).
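A minimal sketch of such a proximity function (a plain haversine distance on decimal degrees; the coordinates reuse the LAMP_1 example above and the comparison point is hypothetical):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def is_within(lat1, lon1, lat2, lon2, radius_km):
    """Proximity predicate, e.g. 'is this user within 2 km of LAMP_1?'."""
    return haversine_km(lat1, lon1, lat2, lon2) <= radius_km

# Using the LAMP_1 coordinates from the table above against a nearby user position
print(is_within(50.842715, -0.778276, 50.8401, -0.7800, 2.0))
```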
Decision Context
Monitoring data is collected to support service design, management and control decisions resulting in state changes in configuration items. The link between decisions and data is through queries and rules applied to contextual information stored with measurement values.
Every measurement has a measurement context. The context allows time-based series to be created according to a set of query criteria which are then processed to calculate statistical data over the desired time period for the series. For example, in the following simple query the measurement is avg_response_time, the context is “service A” and the series are all of the data points from now minus 10 minutes.
find avg response time for service A over the last 10 minutes
To support this query the following measurement would be created:
serviceA_monitoring,service_id=(string) response_time=(float) timestamp
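A sketch of how this query could be issued against InfluxDB (assuming the influxdb-python client and a hypothetical clmc database):

```python
from influxdb import InfluxDBClient  # InfluxDB 1.x Python client

client = InfluxDBClient(host="localhost", port=8086, database="clmc")

# Average response time for service A over the last 10 minutes
result = client.query(
    "SELECT MEAN(response_time) FROM serviceA_monitoring "
    "WHERE service_id = 'serviceA' AND time > now() - 10m"
)
for point in result.get_points():
    print(point["mean"])
```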
In the FLAME architecture we discuss at length the relationship between KPIs and dimensions, and implementations based on OLAP. In the current CLMC implementation, KPIs are calculated from measurement fields and dimensions are encoded within measurement tags. This is a lightweight implementation that will allow a broad range of questions to be asked about the cross-layer information acquired.
Designing the context for measurements is an important step in the schema design. This is especially important when measurements from multiple monitoring sources need to be integrated and processed to provide data for queries and decisions. The key design principles adopted include:
- identify common context across different measurements
- where possible use the same identifiers and naming conventions for context across different measurements
- organise the context into hierarchies that are automatically added to measurements during the collection process
The following figure shows the general structural approach for two measurements, A and B. Data points in each series have a set of tags that share a common context and a specific context related to the measurement values.
The measurement model considers three monitoring views on a surrogate with field values:
- service: specific metrics associated within the SF (either media component or platform component)
- network: data usage TX/RX, latency, jitter, etc.
- host: cpu, storage, memory, storage I/O, etc
All of the measurements on a surrogate share a common context that includes tag values:
- sfc – an orchestration template
- sfc_i – an instance of the orchestration template
- sf_package – a SF type
- sf_i – an instance of the SF type
- surrogate – an authoritative copy of the SF instance, either a VM or a container
- server – a physical or virtual server for hosting VM or container instances
- location – the location of the server
By including this context with service, network and host measurements it is possible to support a range of temporal queries associated with SFCs. By adopting the same convention for identifiers it is possible to combine measurements across service, network and host to create new series that allow exploration of different aspects of the VM instance, including cross-layer queries.
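As a sketch of such a cross-layer query (the measurement and field names are taken from examples elsewhere in this document and are illustrative rather than normative), service request counts and host CPU usage can be grouped by the shared sfc_i/sf_i tags and combined client-side:

```python
from influxdb import InfluxDBClient  # InfluxDB 1.x Python client

client = InfluxDBClient(host="localhost", port=8086, database="clmc")

# Service view: requests per SF instance over the last hour
service = client.query(
    "SELECT SUM(requests) FROM mpegdash_service "
    "WHERE time > now() - 1h GROUP BY sfc_i, sf_i"
)

# Host view: mean CPU usage for the same instances, same window and grouping
host = client.query(
    "SELECT MEAN(cpu_usage_user) FROM cpu_usage "
    "WHERE time > now() - 1h GROUP BY sfc_i, sf_i"
)

# The shared sf_i tag keys the two series together for cross-layer exploration.
load = {tags["sf_i"]: next(points)["sum"] for (_, tags), points in service.items()}
cpu = {tags["sf_i"]: next(points)["mean"] for (_, tags), points in host.items()}
print({sf: (load.get(sf), cpu.get(sf)) for sf in set(load) | set(cpu)})
```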
Give a worked example across service and network measurements based on the mpeg-dash service
- Decide on the service management decisions and time scales
- Decide on the measurements of interest that are needed to make the decisions
- Decide how measurements are calculated from a series of one or more other measurements
- Decide on time window for the series and sample rate
- Decide on interpolation approach for data points in the series
Discuss specific tags
Data Retention Policy
Discuss what data needs to be kept and for how long in relation to decision making
Architecture
General
The monitoring model uses an agent based approach with hierarchical aggregation used as required for different time scales of decision making. The general architecture is shown in the diagram below.
To monitor a SF an agent is deployed on each of the surrogates implementing the SF. The agent is deployed by the orchestrator when the SF is provisioned. The agent is configured with:
- a set of input plugins that collect measurements from the three viewpoints of network, host and service
- a set of global tags that are inserted for all measurements made by the agent on the host.
- one or more output plugins for publishing aggregated monitoring data.
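To illustrate the role of the global tags (this is not Telegraf itself, only a sketch of what an agent's output amounts to; the tag values are hypothetical), an agent effectively wraps every raw metric it collects with the host's configured context before publishing it in line protocol:

```python
import time

# Hypothetical global tags configured once for the agent on this surrogate/host.
GLOBAL_TAGS = {
    "sfc": "media_service_A",
    "sfc_i": "media_service_A_1",
    "sf_i": "mpegdash_1",
    "server": "edge_dc_1_server_3",
    "location": "DATACENTRE_1",
}

def to_line_protocol(measurement, fields, extra_tags=None):
    """Render one sample in line protocol with the agent's global tags attached."""
    tags = {**GLOBAL_TAGS, **(extra_tags or {})}
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement},{tag_str} {field_str} {time.time_ns()}"

# A service input plugin reports raw values; the agent contextualises and publishes them.
print(to_line_protocol("mpegdash_service_perf", {"requests": 100, "response_time": 0.2}))
```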
Telegraf offers a wide range of integration with relevant monitoring processes.
- Telegraf existing plugins for common services; relevant plugins include:
- Network Response https://github.com/influxdata/telegraf/tree/release-1.5/plugins/inputs/net_response: could be used to perform basic network monitoring
- nstat https://github.com/influxdata/telegraf/tree/release-1.5/plugins/inputs/nstat : could be used to monitor the network
- webhooks https://github.com/influxdata/telegraf/tree/release-1.5/plugins/inputs/webhooks: could be used to monitor end devices
- procstat https://github.com/influxdata/telegraf/tree/release-1.5/plugins/inputs/procstat: could be used to monitor containers
- SNMP https://github.com/influxdata/telegraf/tree/release-1.5/plugins/inputs/snmp: could be used to monitor flows
- sysstat https://github.com/influxdata/telegraf/tree/release-1.5/plugins/inputs/sysstat: could be used to monitor hosts
Telegraf offers a wide range of integration for 3rd party monitoring processes:
- Telegraf AMQP: https://github.com/influxdata/telegraf/tree/release-1.5/plugins/inputs/amqp_consumer
- Telegraf HTTP JSON: https://github.com/influxdata/telegraf/tree/release-1.5/plugins/inputs/httpjson
- Telegraf http listener: https://github.com/influxdata/telegraf/tree/release-1.5/plugins/inputs/http_listener
- Telegraf Bespoke Plugin: https://www.influxdata.com/blog/how-to-write-telegraf-plugin-beginners/
The architecture considers hierarchical monitoring and scalability; for example, AMQP can be used to buffer monitoring information whilst InfluxDB can be used to provide intermediate aggregation points when used with the Telegraf input and output plugins.
Integration with FLIPS Monitoring
FLIPS offers a scalable pub/sub system for distributing monitoring data. The architecture is described in the POINT monitoring specification https://drive.google.com/file/d/0B0ig-Rw0sniLMDN2bmhkaGIydzA/view. Some observations can be made
- MOOSE and CLMC provide similar functions in the architecture, the CLMC will not have access to MOOSE but will need to subscribe to data points provided by FLIPS
- The APIs for Moly and Blackadder are not provided therefore it's not possible to critically understand the correct implementation approach for agents and monitoring data distribution
- Individual datapoints need to be aggregated into measurements according to a sample rate
- We may need to use the blackadder API for distribution of monitoring data, replacing messaging systems such as AMQP with all buffering and pub/sub deployed on the nodes themselves rather than a central service.
There are a few architectural choices. The first below uses moly as an integration point for monitoring processes via a Telegraf output plugin with data inserted into influx using a blackadder API input plugin on another Telegraf agent running on the CLMC. In this case managing the subscriptions to nodes and data points is difficult. In addition, some data points will be individual from FLIPS monitoring whilst others will be in line protocol format from Telegraf. For the FLIPS data points a new input plugin would be required to aggregate individual data points into time-series measurements.
The second (currently preferred) choice only sends line protocol format over the wire. Here we develop Telegraf input and output plugins for blackadder, benefiting from the scalable nature of the pub/sub system rather than introducing RabbitMQ as a central server. In this case the agent on each node would be configured with input plugins for service, host and network. We'd deploy a new Telegraf input plugin for FLIPS data points on the node's agent by subscribing to blackadder locally and then publish the aggregated measurement using the line protocol back over blackadder to the CLMC. FLIPS can still publish data to MOOSE as required.
The pub/sub protocol still needs some work as we don't want the CLMC to have to subscribe to nodes as they start and stop. We want the nodes to register with a known CLMC and then start publishing data to the CLMC according to a monitoring configuration (e.g. sample rate, etc). So we want a "monitoring topic" that nodes publish to and that the CLMC can pull data from. This topic is on the CLMC itself and not on the nodes. Reading the FLIPS specification it seems that this is not how the nodes currently distribute data, although this could be wrong.
Measurements Summary
Configuration
Decision Context | Measurement | Description |
---|---|---|
Capacity | host_resource | the compute infrastructure slice allocation to the platform |
Capacity | network_resource | the network infrastructure slice allocation to the platform |
Platform | topology_manager | specific metrics tbd |
Media Service | sfc_config | specific metrics tbd |
Media Service | sf_config | specific metrics tbd |
Media Service | vm_host_config | compute resources allocated to a VM |
Media Service | net_port_config | networking constraints on port on a VM |
Monitoring
Decision Context | Measurement | Description |
---|---|---|
Platform | nap_data_io | nap data io at byte, ip and http levels |
Platform | nap_fqdn_perf | fqdn request rate and latency |
Platform | orchestrator | specific metrics tbd |
Platform | clmc | specific metrics tbd |
Media Service | cpu_usage | vm metrics |
Media Service | disk_usage | vm metrics |
Media Service | disk_IO | vm metrics |
Media Service | kernel_stats | vm metrics |
Media Service | memory_usage | vm metrics |
Media Service | process_status | vm metrics |
Media Service | system_load_uptime | vm metrics |
Media Service | net_port_io | vm port network io and error at L2 |
Media Service | surrogate | service usage and performance metrics |
Capacity Measurements
Capacity measurements measure the size of the infrastructure slice available to the platform that can be allocated on demand to tenants.
Common tags
- slice_id – an identification id for the tenant infrastructure slice within OpenStack
host_resource
The host_resource measurement measures the wholesale host resources available to the platform that can be allocated to media services.
host_resource,slice_id,server_id,location cpu,memory,storage timestamp
network_resource
network_resource measures the overall capacity of the network available to the platform for allocation to tenants. There are currently no metrics defined for this in the FLIPS monitoring specification, although we can envisage usage metrics such as bandwidth being part of this measurement.
network_resource,slice_id,network_id, bandwidth,X,Y,Z timestamp
Platform Measurements
Platform measurements measure the configuration, usage and performance of platform components.
topology_manager
nap
nap measurements are the platform's view of IP endpoints such as user equipment and services. A NAP is therefore the boundary of the platform. The NAP also measures aspects of multicast performance.
NAP multicast metrics that require further understanding
Fields
- CHANNEL_AQUISITION_TIME_M
- CMC_GROUP_SIZE_M
- What is the group id for CHANNEL_AQUISITION_TIME_M and how can this be related to the FQDN of the content?
- What is the predefined time interval for CMC_GROUP_SIZE_M?
- How are multicast groups identified? i.e. "a request for FQDN within a time period"; what is the content granularity here?
NAP data usage measurement
nap_data_io,node_id,ip_version <fields> timestamp
Fields
- RX_BYTES_HTTP_M
- TX_BYTES_HTTP_M
- RX_PACKETS_HTTP_M
- TX_PACKETS_HTTP_M
- RX_BYTES_IP_M
- TX_BYTES_IP_M
- RX_BYTES_IP_MULTICAST_M
- TX_BYTES_IP_MULTICAST_M
- RX_PACKETS_IP_MULTICAST_M
- TX_PACKETS_IP_MULTICAST_M
NAP service request and response metrics
nap_fqdn_perf,<common_tags>,cont_nav=FQDN <fields> timestamp
Fields
- HTTP_REQUESTS_FQDN_M
- NETWORK_FQDN_LATENCY
clmc
tbd
Media Service Measurements
Media service measurements measure the configuration, usage and performance of media service instances deployed by the platform.
Service Function Chain Measurements
sfc_i_config
sfc_i_config,<common_tags>,state <fields> timestamp
sfc_i_monitoring
Aggregate measurement derived from VM/container measurements, most likely calculated using a continuous query over a specific time interval
sf_i_config
sf_i_config,<common_tags>,state <fields> timestamp
sf_i_monitoring
Aggregate measurement derived from surrogate measurements, most likely calculated using a continuous query over a specific time interval
surrogates
Aggregate measurement derived from surrogate measurements, most likely calculated using a continuous query over a specific time interval
surrogates,<common_tags>, placed, unplaced, booted, connected
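A sketch of the kind of continuous query referred to above (InfluxQL issued via the Python client; the source measurement name and the 1 minute interval are assumptions, not defined by this specification) that rolls surrogate-level response times up to the SF instance level:

```python
from influxdb import InfluxDBClient  # InfluxDB 1.x Python client

client = InfluxDBClient(host="localhost", port=8086, database="clmc")

# Aggregate surrogate-level measurements into sf_i_monitoring once per minute,
# preserving the common context tags so the series can still be filtered by SFC.
client.query(
    "CREATE CONTINUOUS QUERY sf_i_monitoring_cq ON clmc BEGIN "
    "SELECT MEAN(response_time) AS response_time INTO sf_i_monitoring "
    "FROM mpegdash_service_perf "
    "GROUP BY time(1m), sfc, sfc_i, sf_i "
    "END"
)
```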
Surrogate Measurements
Surrogate measurements measure the configuration, usage and performance of VM/Container instances deployed by the platform within the context of a media service.
Common tags
- sfc – an orchestration template
- sfc_i – an instance of the orchestration template
- sf_pack – a SF package identifier indicating the type and version of SF
- sf_i – an instance of the SF type
- surrogate – an authoritative copy of the SF instance, either a container or a VM
- server – a physical or virtual server for hosting node instances
- location – the location of the server
Network Measurements
net_port_config
Network port configuration is concerned with any network I/O allocations/constraints for network rx/tx.
net_port_config,<common_tags>,port_id,port_state <fields> timestamp
Possible fields (but these are not available from the FLIPS monitoring specification)
- RX_USAGE_CONSTRAINT
- TX_USAGE_CONSTRAINT
- RX_THROUGHPUT_CONSTRAINT
- TX_THROUGHPUT_CONSTRAINT
Specific tags
- port_state
- port_id
net_port_io
All net_port_io measurements are monitored by FLIPS.
net_port_io,<common_tags>,port_id PACKET_DROP_RATE_M, PACKET_ERROR_RATE_M, RX_PACKETS_M, TX_PACKETS_PORT_M, RX_BYTES_PORT_M, TX_BYTES_PORT_M timestamp
Specific tags
- port_id
Note that RX_PACKETS_M seems to have an inconsistent naming convention.
VM Measurements
VM measurements capture the host resources allocated to a service function deployed by the platform. All measurements have the common global tags listed above to allow the data to be sliced and diced according to dimensions.
vm_config
The resources allocated to a VM/Container
vm_res_alloc,<common_tags>,vm_state cpu,memory,storage timestamp
Specific tags
- vm_state
cpu_usage
https://github.com/influxdata/telegraf/blob/master/plugins/inputs/system/CPU_README.md
cpu_usage,<common_tags>,cpu cpu_usage_user,cpu_usage_system,cpu_usage_idle,cpu_usage_active,cpu_usage_nice,cpu_usage_iowait,cpu_usage_irq,cpu_usage_softirq,cpu_usage_steal,cpu_usage_guest,cpu_usage_guest_nice timestamp
disk_usage
https://github.com/influxdata/telegraf/blob/master/plugins/inputs/system/DISK_README.md
disk,<common_tags>,fstype,mode,path free,inodes_free,inodes_total,inodes_used,total,used,used_percent timestamp
disk_IO
https://github.com/influxdata/telegraf/blob/master/plugins/inputs/system/DISK_README.md
diskio,<common_tags>,name weighted_io_time,read_time,write_time,io_time,write_bytes,iops_in_progress,reads,writes,read_bytes timestamp
kernel_stats
https://github.com/influxdata/telegraf/blob/master/plugins/inputs/system/KERNEL_README.md
kernel,<common_tags> boot_time,context_switches,disk_pages_in,disk_pages_out,interrupts,processes_forked timestamp
memory_usage
mem,<common_tags> cached,inactive,total,available,buffered,active,slab,used_percent,available_percent,used,free timestamp
process_status
https://github.com/influxdata/telegraf/blob/master/plugins/inputs/system/PROCESSES_README.md
processes,<common_tags> blocked,running,sleeping,stopped,total,zombie,dead,paging,total_threads timestamp
system_load_uptime
https://github.com/influxdata/telegraf/blob/master/plugins/inputs/system/SYSTEM_README.md
system,<common_tags>,host load1,load5,load15,n_users,n_cpus timestamp
Service Measurements
_service_config
Each SF developed will measure service specific configuration.
Fields
Specific Tags
- service_state
_service_perf
Each SF developed will measure service specific usage and performance measurements.
<prefix>_service,<common_tags>,cont_nav,cont_rep,user <fields> timestamp
Fields (only examples as these are specific to each service)
- request_rate
- response_time
- peak_response_time
- error_rate
- throughput
Specific Tags
- cont_nav: the content requested
- cont_rep: the content representation requested
- user: a user profile classification
Worked Usage Scenario - MPEG-DASH
CLMC Use Case Scenario
The following scenario aims to verify two aspects:
- CLMC monitoring specification & data acquisition
- Support for initial decision making processes for FLAME (re)orchestration
The FLAME platform acquires a slice of the infrastructure resources (compute, RAM & storage [C1, C2, C3] and networking). A media service provider offers an MPEG-DASH service to end-users (via their video clients connected to NAPs on the FLAME platform). The service provider deploys surrogates of the MPEG-DASH service on all compute nodes [C1-C3]. All services (including NAPs) are monitored by the CLMC.
Over time a growing number of video clients use a MPEG-DASH service to stream movies on demand. As clients connect and make requests, the platform makes decisions and takes actions in order to maintain quality of service for the increasing number of clients demanding an MPEG-DASH service.
What are the possible criteria (based on metrics and analytics provided by the CLMC) that could be used to help the NAP make these decisions?
In this scenario what are the possible actions a NAP could take?
Platform actions
- Increase the resources available to MPEG-DASH surrogates
  - This may not be possible if resources are unavailable
  - Vertical scaling may not solve the problem (i.e., I/O bottleneck)
- Re-route client requests to other MPEG-DASH services
  - C1 – Closer to clients, but limited capability
  - C3 – Greater capability but further away from clients … note: NAP service end-point re-routing will need to take into account network factors AND compute resource availability related service KPIs; i.e., end-to-end performance
Service actions
- Lower overall service quality to clients… reduce overall resource usage
Goal: Explore QoE under two different resource configurations
KPI targets over a 1 hour period:
- Avg quality met: the ratio of average delivered quality out of requested quality
- Avg start up time: the average time taken before a video stream starts playing less than a threshold
- Avg video stalls: the percentage of stalls (dropped video segments that require re-sending) less than a threshold
Configuration Measurements
vm_res_alloc,<common_tags>,vm_state=placed cpu=1,memory=2048,storage=100G timestamp
vm_res_alloc,<common_tags>,vm_state=booted cpu=1,memory=2048,storage=100G timestamp
vm_res_alloc,<common_tags>,vm_state=connected cpu=1,memory=2048,storage=100G timestamp
net_port_config,<common_tags>,port_id=enps03,port_state=up RX_USAGE_CONSTRAINT=500G,TX_USAGE_CONSTRAINT=500G timestamp
mpegdash_service_config,service_state=running connected_clients=10 timestamp
Monitoring Measurements
mpegdash_service,<common_tags>,cont_nav=url,cont_rep=video_quality requests=100,response_time=200mS,peak_response_time=5s timestamp
cpu_usage,<common_tags>,cpu cpu_usage_user,cpu_usage_system timestamp
network_io,<common_tags>,port_id PACKET_DROP_RATE_M, PACKET_ERROR_RATE_M, RX_PACKETS_M, TX_PACKETS_PORT_M, RX_BYTES_PORT_M, TX_BYTES_PORT_M timestamp
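A sketch of how inputs to the KPIs above could be pulled from these measurements over the 1 hour window (the influxdb-python client, the clmc database and the requested_rep tag mentioned in the final comment are assumptions, not part of the scenario definition):

```python
from influxdb import InfluxDBClient  # InfluxDB 1.x Python client

client = InfluxDBClient(host="localhost", port=8086, database="clmc")

# Requests and average response time per delivered quality over the last hour,
# grouped by the cont_rep tag (the video quality representation).
per_quality = client.query(
    "SELECT SUM(requests) AS requests, MEAN(response_time) AS avg_response "
    "FROM mpegdash_service WHERE time > now() - 1h GROUP BY cont_rep"
)

for (_, tags), points in per_quality.items():
    p = next(points)
    print(tags["cont_rep"], p["requests"], p["avg_response"])

# The quality-met ratio could then be derived client-side by comparing delivered
# cont_rep values with the representation each client requested (e.g. a hypothetical
# requested_rep tag): matched_requests / total_requests over the window.
```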
Charts (not shown): start-up time delay; video stalls.
MISC Measurements and Further Questions
The following data points require further analysis
- CPU_UTILISATION_M: likely to be replaced by other metrics provided directly by Telegraf plugins
- END_TO_END_LATENCY_M: not clear what this measurement means, so needs clarification
- BUFFER_SIZES_M: needs clarification
- RX_PACKETS_IP_M: is this just the NAP or all nodes?
- TX_PACKETS_IP_M: is this just the NAP or all nodes?
The following fields need further analysis as they seem to relate to core ICN, most likely fields/measurements related to platform components
- FILE_DESCRIPTORS_TYPE_M
- MATCHES_NAMESPACE_M
- PATH_CALCULATIONS_NAMESPACE_M
- PUBLISHERS_NAMESPACE_M
- SUBSCRIBERS_NAMESPACE_M
The following fields relate to CID which I don't understand but jitter is an important metric so we need to find out.
- PACKET_JITTER_CID_M
- RX_BYTES_CID_M
- TX_BYTES_CID_M
Some questions
- Can a single value of jitter (e.g. avg jitter) be calculated from the set of measurements in PACKET_JITTER_CID_M message? What is the time period for the list of jitter measurements?
- What does CID mean? consecutive identical digits
Link Measurements
Links are established between VM/container instances; we need to discuss what measurements make sense. Also, the context for links could be between media services, therefore a link measurement should be within the platform context and NOT the media service context. We need a couple of scenarios to work this one out.
link_config
Link Tags
- link_name
- link_id
- source_node_id
- destination_node_id
- link_type
- link_state
link_perf
link perf is measured at the nodes, related to end_to_end_latency. Needs further work.
Other Issues
Trust in measurements
If the agent is deployed in a VM/container to which a tenant has root access then the tenant could change the configuration to fake measurements associated with network and host in an attempt to gain benefit. This is a security risk. Some ideas include:
- Deploy additional agents on hosts, rather than in the VMs, to measure network and VM performance. It could be hard to differentiate between the different SFs deployed on a host
- Generate a hash from the agent configuration file that's checked within the monitoring message. Probably too costly and not part of the telegraf protocol
- Use unix permissions (e.g. surrogates are deployed without root access to them)