Tanzu Observability (formerly known as VMware Aria Operations for Applications) collects internal metrics that are used extensively in the different dashboards of the Tanzu Observability Usage integration.

You can:

  • Clone and modify one of the Tanzu Observability Usage integration dashboards.
  • Create your own dashboard, query these metrics in charts, and create alerts for some of these metrics.
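
For example, a minimal sketch of a chart query over one of these internal metrics (the metric name comes from the list of persistent metrics below; substitute the metric you care about):

    ts(~collector.points.reported)

The same expression combined with a threshold, such as sum(ts(~collector.points.reported)) > 1000000, can serve as an alert condition; the threshold here is purely illustrative.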

Most internal metrics are ephemeral and cannot be converted to persistent metrics. The exceptions are the following internal metrics, which are persistent (a query sketch follows the list):

  • ~collector.*points.reported
  • ~externalservices.*.points
  • ~derived-metrics.points.reported
  • ~collector.*histograms.reported
  • ~derived-histograms.histograms.reported
  • ~collector.*spans.reported
  • ~query.metrics_scanned
  • ~proxy.points.*.received
  • ~proxy.histograms.*.received
  • ~proxy.spans.*.received
  • ~proxy.spanLogs.*.received
  • ~proxy.build.version
  • ~metric.global.namespace.*
  • ~histogram.global.namespace.*
  • ~counter.global.namespace.*
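
Because these metrics are persistent, you can query them over long time windows. A minimal sketch, assuming you want the hourly mean of the points the collector reports:

    align(1h, mean, ts(~collector.points.reported))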

Internal Metrics Overview

We collect the following sets of metrics.

  • ~alert* – a set of metrics that allows you to examine the effect of alerts on your service instance.
  • ~collector – metrics processed at the collector gateway to the service instance. Includes spans.
  • ~metric – total unique sources and metrics. You can compute the rate of metric creation from each source.
  • ~proxy – metric rate received and sent from each Wavefront proxy, blocked and rejected metric rates, buffer metrics, and JVM stats of the proxy. Also includes counts of metrics affected by the proxy preprocessor. See Monitor Wavefront Proxies and the query sketch after this list.
  • ~wavefront – set of gauges that track metrics about your use of the Tanzu Observability service.
  • ~http.api – namespace for looking at API request metrics.
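
For example, a minimal sketch that charts the per-second rate at which the proxies receive points, using the persistent ~proxy received counters listed earlier on this page (rate() is applied because these are counters):

    rate(ts(~proxy.points.*.received))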

If you have an AWS integration, metrics with the following prefix are available:

  • ~externalservices – metric rates, API requests, and events from AWS CloudWatch, AWS CloudTrail, and AWS Metrics+.

There’s also a metric you can use to monitor ongoing events and make sure the number does not exceed 1000:

  • ~events.num-ongoing-events – returns the number of ongoing events.
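
A minimal alert-condition sketch for watching this limit (the 950 threshold is illustrative; choose a value that gives you enough warning before you reach 1000 ongoing events):

    ts(~events.num-ongoing-events) > 950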

Useful Internal Metrics for Optimizing Performance

A small set of internal metrics can help you optimize performance and monitor your costs. This section highlights some things to look for; the exact steps depend on how you’re using the Tanzu Observability service and on the characteristics of your environment.

Our customer support engineers have found the following metrics especially useful.

~alert

  • ~alert.query_time.<alert_id> – Tracks the average time, in ms, that a specified alert took to run in the past hour.
  • ~alert.query_points.<alert_id> – Tracks the average number of points that a specified alert scanned in the past hour.
  • ~alert.checking_frequency.<alert_id> – Tracks how often a specified alert performs a check. See Alert States for details.
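
For example, a minimal sketch that charts how long a specific alert takes to run (1234567890 is a hypothetical alert ID; substitute the ID of your own alert):

    ts(~alert.query_time.1234567890)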

~collector

  • ~collector.points.reported, ~collector.histograms.reported, ~collector.tracing.spans.reported, ~collector.tracing.span_logs.reported, and ~collector.tracing.span_logs.bytes_reported – Valid metric points, histogram points, trace data (spans), or span logs that the collector reports to Tanzu Observability. These are billing metrics that you can look up on the Tanzu Observability Usage dashboard.
    Note: We have a corresponding direct ingestion metric for each metric. For example, corresponding to ~collector.points.reported we have ~collector.direct-ingestion.points.reported.
  • ~collector.points.batches, ~collector.histograms.batches, ~collector.tracing.spans.batches, and ~collector.tracing.span_logs.batches – Number of batches of points, histogram points, spans, or span logs received by the collector, either via the proxy or via the direct ingestion API. In the histogram context, the number of batches is the number of HTTP POST requests.
    Note: We have a corresponding direct ingestion metric for each metric. For example, corresponding to ~collector.spans.batches we have ~collector.direct-ingestion.spans.batches.
  • ~collector.points.undecodable, ~collector.histograms.undecodable, ~collector.tracing.spans.undecodable, and ~collector.tracing.span_logs.undecodable – Points, histogram points, spans, or span logs that the collector receives but cannot report to Tanzu Observability because the input is not in the right format.
    Note: We have a corresponding direct ingestion metric for each metric. For example, corresponding to ~collector.points.undecodable we have ~collector.direct-ingestion.points.undecodable.
  • ~collector.delta_points.tracing_red.reported, ~collector.histograms.tracing_red.reported, and ~collector.points.tracing_red.reported – Delta counters, histograms, and points derived as Tracing RED metrics that the collector receives.
    Note: We have a corresponding direct ingestion metric for each metric. For example, corresponding to ~collector.delta_points.tracing_red.reported we have ~collector.direct-ingestion.delta_points.tracing_red.reported.
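
For example, a minimal sketch of two chart queries that let you compare what the collector reports with what it cannot decode, which can help you spot malformed input (both metric names come from the rows above):

    sum(ts(~collector.points.reported))
    sum(ts(~collector.points.undecodable))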

~metric

  • ~metric.new_host_ids – Counter that increments when a new source= or host= is sent to Tanzu Observability.
  • ~metric.new_metric_ids – Counter that increments when a new metric name is sent to Tanzu Observability.
  • ~metric.new_string_ids – Counter that increments when a new point tag value is sent to Tanzu Observability.
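
For example, a minimal sketch of the per-second rate at which new metric names are created (rate() is applied because ~metric.new_metric_ids is a counter):

    rate(ts(~metric.new_metric_ids))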

~query

  • ~query.requests – Counter that tracks the number of queries a user made.

~http.api

  • ~http.api.v2.* – Monotonic counters, without tags, that align with the API endpoints and allow you to examine API request metrics. For example, ts(~http.api.v2.alert.{id}.GET.200.count) aligns with the GET /api/v2/alert/{id} API endpoint. Examine the ~http.api.v2. namespace to see the counters for specific API endpoints.
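
For example, a minimal sketch that turns the counter from the example above into a request rate (this reuses the counter name from that example; adjust the {id} segment to match how the counter is named for your endpoint):

    rate(ts(~http.api.v2.alert.{id}.GET.200.count))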

If several slow queries are executed within the selected time window, the Slow Query page can become long. Section links at the top left allow you to select a section. The links display only after you have scrolled down the page.