Monitor CloudWatch, CloudTrail, and Metrics+ with Tanzu Observability by Wavefront.

Amazon Web Services (AWS) is a collection of cloud-computing services that provide an on-demand computing platform. The Amazon Web Services integration allows you to ingest metrics directly from AWS. The Amazon Web Services built-in integration is part of the setup, but the additional steps in this document are needed to complete and customize integration setup.

You have to set up your Tanzu Observability by Wavefront account with the correct permissions.

Supported AWS Integrations

The AWS integration ingests data from many products and provides dashboards for each. See any integration page for a list of dashboards. The following products are of special interest to most customers:

  • CloudWatch – retrieves AWS metric and dimension data. Includes some metrics for Amazon Relational Database Service (RDS).
  • CloudTrail – retrieves EC2 event information and creates Tanzu Observability System events that represent the AWS events.
  • AWS Metrics+ – retrieves additional metrics using AWS APIs other than CloudWatch. Data include EBS volume data and EC2 instance metadata like tags. You can investigate billing data and the number of reserved instances. Be sure to enable AWS Metrics+ because doing so allows Tanzu Observability to optimize its use of CloudWatch and, as a result, saves money on CloudWatch calls.

CloudWatch Integration Details

Tanzu Observability retrieves AWS metric and dimension data from AWS services using the AWS CloudWatch API. The complete list of metrics and dimensions that can be retrieved from AWS CloudWatch is available at Amazon CloudWatch Metrics and Dimensions Reference. In addition, you can publish custom AWS metrics that can also be ingested by the CloudWatch integration.

Configuring CloudWatch Data Ingestion

You can configure which instances and volumes to ingest metrics from, which metrics to ingest, and the rate at which Tanzu Observability fetches metrics.

To configure CloudWatch ingestion:

  1. Log in to your Wavefront cluster and click Integrations on the toolbar.
  2. In the Featured section, click the Amazon Web Services tile.
  3. Click the Setup tab.
  4. In the Types column, click the CloudWatch link in the row of the integration you want to configure.
  5. Configure ingestion properties:
    • Instance and Volume Allow List fields – Add instances and volumes to an allow list by specifying EC2 tags, defined on the instances and volumes. The allow lists should be in JSON format, for example, {"organization":"yourcompany"}. The tags specified in the allow lists are OR’d. To use instance and volume allow lists, you must also add an AWS Metrics+ integration because the AWS tags are imported from the EC2 service. If you don’t specify any tags, Tanzu Observability imports metrics from all instances and volumes.
    • Metric Allow List field – Add metrics to an allow list by specifying a regular expression. Metric names consist of the actual metric name and an aggregation type. In the regular expression, you must use the actual metric names without the aggregation types. For example, in the following list of metric names:

      • aws.dynamodb.successfulrequestlatency.average
      • aws.dynamodb.successfulrequestlatency.maximum
      • aws.dynamodb.successfulrequestlatency.minimum
      • aws.dynamodb.successfulrequestlatency.samplecount
      • aws.dynamodb.successfulrequestlatency.sum

      Here, the actual metric name is aws.dynamodb.successfulrequestlatency, while average, maximum, minimum, samplecount, and sum are the aggregation types. When you create the regular expression, you must use only aws.dynamodb.successfulrequestlatency. For example, ^aws.dynamodb.successfulrequestlatency$.

      If you do not specify a regular expression, all CloudWatch metrics are retrieved.

    • Point Tag Allow List – Add custom AWS point tags to an allow list by specifying a regular expression. If you do not specify a regular expression, no point tags are added to metrics.

      Currently, custom point tags are supported only for AWS EC2 instances and volumes. To ingest the custom tags, you must first add the custom tags to the supported resources, and then add the tag keys to the Point Tag Allow List as a regular expression.

    • Service Refresh Rate – Number of minutes between requesting metrics. Default is 5.
    • Products – Allows you to filter the list of AWS products for which you want to collect metrics by using the CloudWatch integration. The default is All. Click Custom to see the list of AWS products and to filter them according to your needs.
  6. Click Update.
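The allow-list values from step 5 can be sanity-checked before you paste them into the Setup tab. The following is a minimal sketch, not part of the product: the helper names are hypothetical, and it assumes the metric regular expression is compared against the whole base name, which this page does not state explicitly.

```python
import json
import re

# Aggregation types listed in the Metric Allow List description above.
AGGREGATIONS = {"average", "maximum", "minimum", "samplecount", "sum"}


def base_metric_name(full_name: str) -> str:
    """Strip a trailing aggregation type, if present (hypothetical helper)."""
    head, _, tail = full_name.rpartition(".")
    return head if head and tail in AGGREGATIONS else full_name


def allow_list_json(tags: dict) -> str:
    """Serialize EC2 tags into the compact JSON form shown in the text."""
    return json.dumps(tags, separators=(",", ":"))


# Pattern from the text, with the dots escaped so they match literally.
pattern = re.compile(r"^aws\.dynamodb\.successfulrequestlatency$")

name = "aws.dynamodb.successfulrequestlatency.average"
print(allow_list_json({"organization": "yourcompany"}))
print(base_metric_name(name), bool(pattern.fullmatch(base_metric_name(name))))
```

Stripping the aggregation type first mirrors the rule that the regular expression must be written against the actual metric name only.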

How to Use the Metric Allow List and the Products List

By using the Metric Allow List and the Products option, you can select which services and metrics to monitor. To monitor all metrics for all services, leave the Metric Allow List empty and the Products option set to All.

How to Monitor All Metrics for Specific Services

If you want to monitor all of the ingested metrics for specific services, select these services from the Products list. For example, if you want to monitor Amazon Relational Database Service and Amazon DynamoDB:

  1. Expand the list of Products.
  2. Select Custom.
  3. Select the Amazon DynamoDB and Amazon Relational Database Service options.

How to Monitor Some of the Metrics for Specific Services

If you want to monitor only some of the metrics for specific services, select these services from the Products list and use a regular expression to specify the metrics that you want to monitor. For example, if you want to monitor aws.rds.activetransactions for Amazon Relational Database Service and aws.dynamodb.accountmaxreads for Amazon DynamoDB:

  1. In the Metric Allow List, enter a regular expression such as: aws.(rds.activetransactions|dynamodb.accountmaxreads).*
  2. Expand the list of Products.
  3. Select Custom.
  4. Select the Amazon DynamoDB and Amazon Relational Database Service options.
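To preview what the regular expression from step 1 would let through, you can try it against a few metric names locally. This is a rough sketch: the candidate list is illustrative, and using a full match is an assumption about how the allow list is evaluated.

```python
import re

# Pattern from step 1, used verbatim. The unescaped dots match any character,
# which is harmless here but slightly broader than a literal dot.
ALLOW = re.compile(r"aws.(rds.activetransactions|dynamodb.accountmaxreads).*")

candidates = [
    "aws.rds.activetransactions",
    "aws.dynamodb.accountmaxreads",
    "aws.rds.cpuutilization",
    "aws.dynamodb.successfulrequestlatency",
]

# Keep only the names that the whole pattern matches.
allowed = [m for m in candidates if ALLOW.fullmatch(m)]
print(allowed)
```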

How to Monitor Only the Metrics for a Service Which Is Not in the Products Lists

If you are ingesting metrics for a service that is not part of the Products list, and you want to monitor only the metrics for that service, leave the Products option set to All and use a regular expression in the Metric Allow List.

CloudWatch Sources and Source Tags

Tanzu Observability automatically sets each metric’s source field and adds source tags to each AWS source, as follows:

Metric Source Field

Tanzu Observability sets the value of the AWS metric source field by service:

  • EC2 - the value of the hostname, host, or name EC2 tags, if the tags exist and you have an EC2 integration. Otherwise, the source is set to the Amazon instance ID.
  • EBS - the Amazon instance ID of the EC2 instance the volume is attached to.
  • All other services - the value of the first CloudWatch dimension. The supported dimensions appear at the bottom of the Amazon service metric documentation topic. For example, see Amazon EC2 Dimensions.
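The per-service rules above can be read as a small decision function. The sketch below is illustrative only: the function name is hypothetical, and the order in which the hostname, host, and name tags are checked is an assumption, because the text does not say which tag wins when several exist.

```python
from typing import Mapping, Optional, Sequence, Tuple

# Tag keys named in the text; checking them in this order is an assumption.
SOURCE_TAG_KEYS = ("hostname", "host", "name")


def metric_source(
    service: str,
    instance_id: Optional[str],
    ec2_tags: Mapping[str, str],
    dimensions: Sequence[Tuple[str, str]],
) -> Optional[str]:
    """Pick a source value following the per-service rules described above."""
    if service == "ec2":
        for key in SOURCE_TAG_KEYS:
            if ec2_tags.get(key):
                return ec2_tags[key]
        return instance_id  # no usable tag: fall back to the instance ID
    if service == "ebs":
        return instance_id  # ID of the instance the volume is attached to
    # All other services: value of the first CloudWatch dimension.
    return dimensions[0][1] if dimensions else None


print(metric_source("ec2", "i-0abc", {"name": "web-1"}, []))
print(metric_source("s3", None, {}, [("BucketName", "logs")]))
```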

Source Tags

AWS sources are assigned source tags that identify their originating service following this pattern: ~integration.aws.<service>, for example, ~integration.aws.ec2.

CloudWatch Point Tags

Tanzu Observability adds the following point tags to CloudWatch metrics:

  • accountId - the Amazon account that reported the metric.
  • Region - The region in which the service is running. Added to EC2 and EBS metrics only.
  • CloudWatch dimensions. The dimensions vary by service. For example, for AWS S3, the BucketName dimension is added as a point tag.

CloudWatch Pricing

Standard AWS CloudWatch pricing applies each time Tanzu Observability requests metrics using the CloudWatch API. For pricing information, see AWS | Amazon CloudWatch | Pricing. After selecting a region, you can find the current expected price under Amazon CloudWatch API Requests. In addition, custom metrics have a premium price; see the Amazon CloudWatch Custom Metrics section of the pricing page. To limit cost, by default Tanzu Observability queries the API every 5 minutes. However, you can change the refresh rate, which will change the cost.

As an alternative to using the CloudWatch API for EC2 metrics, you can collect these metrics using a Telegraf collector on each AWS instance. In this case, to prevent Tanzu Observability from requesting those metrics from CloudWatch, set the Metric Allow List property to allow all metrics except EC2. For example:

^aws.(billing|instance|sqs|sns|reservedInstance|ebs|route53.health|ec2.status|elb|s3).*$
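A quick local check shows which names this expression keeps. The metric names below are illustrative examples rather than a list taken from the product, and case-sensitive matching is an assumption.

```python
import re

# Allow-list expression from the text, used verbatim.
ALLOW = re.compile(
    r"^aws.(billing|instance|sqs|sns|reservedInstance|ebs|route53.health"
    r"|ec2.status|elb|s3).*$"
)

names = [
    "aws.ec2.cpuutilization",      # general EC2 metric: filtered out
    "aws.ec2.statuscheckfailed",   # kept through the ec2.status alternative
    "aws.sqs.approximatenumberofmessagesvisible",
    "aws.elb.requestcount",
    "aws.rds.cpuutilization",      # rds is not in the alternation
]

kept = [n for n in names if ALLOW.match(n)]
print(kept)
```

Because this is an allow list, any service that is not named in the alternation (aws.rds.* in this sample) is filtered out along with the general EC2 metrics, so extend the expression with every service you still want to ingest.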

By default, the number of unique metrics that can be retrieved from CloudWatch is limited to 10K to cap the AWS CloudWatch bill.

Configuring CloudWatch Billing Metrics

The AWS Billing and Cost Management service sends billing metrics to CloudWatch. You configure AWS to produce aws.billing.* metrics by selecting the Receive Billing Alerts check box on the Preferences tab in the AWS Billing and Cost Management console:

[Screenshot: aws billing]

Tanzu Observability reports the single metric aws.billing.estimatedcharges. The source field and ServiceName point tag identify the AWS services. For the total estimated charge metric, source is set to usd and ServiceName is empty. Tanzu Observability also provides the point tags accountId, Currency, LinkedAccount, and Region. Billing metrics are typically reported every 4 hours.

AWS Usage Metrics

As part of the CloudWatch integration, we collect metrics that let you check whether throttling is happening and get the number of API calls.

  • aws.usage.throttlecount - Understand whether throttling is happening at the AWS end.
  • aws.usage.callcount.* - Get the number of API calls that go to AWS. If you know the Service Quota, you can easily calculate the usage percentage and trigger an alert if the percentage reaches a defined threshold.
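The percentage calculation mentioned for aws.usage.callcount.* is simple arithmetic. A minimal sketch follows; the function names, the quota value, and the 80 percent default threshold are all hypothetical.

```python
def usage_percent(call_count: float, quota: float) -> float:
    """Return the share of the service quota consumed, in percent."""
    if quota <= 0:
        raise ValueError("quota must be positive")
    return 100.0 * call_count / quota


def over_threshold(call_count: float, quota: float,
                   threshold_percent: float = 80.0) -> bool:
    """True when usage has reached the alerting threshold."""
    return usage_percent(call_count, quota) >= threshold_percent


# Hypothetical numbers: 450 calls observed against a quota of 600.
print(usage_percent(450, 600), over_threshold(450, 600))
```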

CloudTrail Events, Metrics, and Point Tags

We retrieve CloudTrail event information stored in JSON-formatted log files in an S3 bucket. The CloudTrail integration parses the files for all events that result from an operation that is not a describe, get, or list, and creates a Tanzu Observability System event.
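The filtering rule can be sketched as follows. This is not the integration's code: it assumes the standard CloudTrail log layout (a top-level Records array whose entries carry an eventName), assumes that "describe, get, or list" operations are recognized by name prefix, and uses hypothetical helper names and sample records.

```python
import json

# Read-only operations are skipped; matching by name prefix is an assumption.
READ_ONLY_PREFIXES = ("Describe", "Get", "List")


def mutating_events(log_text: str) -> list:
    """Return the CloudTrail records that are not describe/get/list calls."""
    records = json.loads(log_text).get("Records", [])
    return [
        r for r in records
        if not r.get("eventName", "").startswith(READ_ONLY_PREFIXES)
    ]


def event_title(record: dict) -> str:
    """Event name in the form shown in the Events browser."""
    return "AWS Action: " + record["eventName"]


# Hypothetical sample log with one read-only call.
SAMPLE_LOG = json.dumps({"Records": [
    {"eventName": "StartInstances", "eventSource": "ec2.amazonaws.com"},
    {"eventName": "DescribeInstances", "eventSource": "ec2.amazonaws.com"},
    {"eventName": "CreateTags", "eventSource": "ec2.amazonaws.com"},
]})

titles = [event_title(r) for r in mutating_events(SAMPLE_LOG)]
print(titles)
```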

In the Events browser the events are named AWS Action: <Operation> and have the event tag aws.cloudtrail.ec2. For example:

[Screenshot: aws start instance]

Starting with release 2018.22.x, we group AWS CloudTrail events by the minute and report the metrics. We also support several point tags that allow you to filter the events.

CloudTrail Metrics

Each metric name starts with aws.cloudtrail.event., followed by one of the EC2 operation names.

The EC2 operations include:

  • [Run|Start|Stop|Terminate|Monitor|Unmonitor]Instances
  • [Attach|Detach]Volume
  • DeleteNetworkInterface
  • AuthorizeSecurityGroupIngress
  • CreateSecurityGroup
  • RequestSpotInstances
  • CancelSpotInstanceRequests
  • ModifyInstanceAttribute
  • CreateTags
  • [Create|Delete]KeyPair
  • DeregisterImage

As a result, the metrics include, for example, aws.cloudtrail.event.Start or aws.cloudtrail.event.CreateTags.

In addition, the metric aws.cloudtrail.event.total-per-minute reports the per-minute count of all AWS API calls recorded by the AWS CloudTrail integration.
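Grouping events by the minute can be pictured as a counter keyed by minute and metric name. The sketch below is only an illustration: it assumes CloudTrail's eventTime format, assumes the metric suffix equals the eventName, and uses hypothetical sample records.

```python
from collections import Counter
from datetime import datetime

PREFIX = "aws.cloudtrail.event."


def per_minute_counts(records: list) -> Counter:
    """Count events per (minute, metric name), plus a per-minute total."""
    counts = Counter()
    for r in records:
        # Truncate the timestamp to the minute it falls in.
        minute = datetime.strptime(
            r["eventTime"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(second=0)
        counts[(minute, PREFIX + r["eventName"])] += 1
        counts[(minute, PREFIX + "total-per-minute")] += 1
    return counts


# Hypothetical records: two events in one minute, one in the next.
records = [
    {"eventName": "CreateTags", "eventTime": "2018-06-01T12:00:05Z"},
    {"eventName": "CreateSecurityGroup", "eventTime": "2018-06-01T12:00:40Z"},
    {"eventName": "CreateTags", "eventTime": "2018-06-01T12:01:10Z"},
]

counts = per_minute_counts(records)
for (minute, metric), value in sorted(counts.items()):
    print(minute.isoformat(), metric, value)
```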

Point Tags for Filtering CloudTrail Metrics

You can use the following point tags to filter the metrics.

Point tag | Description | Example
eventType | The type of event that generated the event record. | AwsApiCall, AwsServiceEvent
eventSource | The service that the request was made to. | ec2.amazonaws.com
Region | The AWS region that the request was made to. | us-east-2
accountId | The account ID that you specified when you set up the AWS CloudTrail integration. | User42
bucket | Bucket that you specified when you set up the AWS CloudTrail integration. | A random number

AWS Metrics+ Data

AWS Metrics+ are metrics retrieved using AWS metrics API calls other than CloudWatch. Unless otherwise indicated, Tanzu Observability sets the value of the AWS Metrics+ source field to the AWS instance ID. If an EBS volume is detached, its source field is set to the volume ID. The metrics include:

  • aws.instance.price - EC2 instances and how much they cost per hour. This metric includes the point tags availabilityZone, instanceID, instanceLifecycle, instanceType, and operatingSystem.
  • aws.reservedinstance.count - Number of reserved instances in each availability zone by each instance type. This metric includes the point tags availabilityZone, instanceID, instanceType, and operatingSystem. This metric appears only if your account has reserved instances.
  • EBS metrics - EBS metrics include the point tags instanceID, Region, State, Status, volumeId, and volumeType (see Amazon EBS Volume Types). The Status can be attached, detaching, or attaching. The State can be available (detached) or in-use (attached).
    • aws.ebs.volumesize - The volume size of the elastic block store.
    • aws.ebs.volumeiops - The volume I/O operations of the elastic block store.
  • SQS - AWS SQS metrics retrieved every minute from the SQS service.
    • aws.sqs.approximatenumberofmessagesnotvisible - The number of messages that are “in flight.” Messages are considered in flight if they have been sent to a client but have not yet been deleted or have not yet reached the end of their visibility window.
    • aws.sqs.approximatenumberofmessagesdelayed - The number of messages in the queue that are delayed and not available for reading immediately. This can happen when the queue is configured as a delay queue or when a message has been sent with a delay parameter.
    • aws.sqs.approximatenumberofmessages (aliased to the CloudWatch metric aws.sqs.approximatenumberofmessagesvisible) - The number of messages available for retrieval from the queue.
  • Pricing Metrics - Capture the current pricing of EC2 instances. These metrics are available as a preview and are subject to change. These metrics have the point tags instanceType, operatingSystem, Region, purchaseOption (All Upfront, Partial Upfront, No Upfront), leaseContractLength (1 or 3 years), and offeringClass (standard or convertible). The source field is set to the display name of the region. For example, if Region=us-west-2, then source=us west (oregon).
    • ~sample.aws.ec2.on-demand.price.hourly - The hourly price (in US$) of an on-demand instance.
    • ~sample.aws.ec2.reserved.price.upfront - The up-front payment (in US$) for a reservation. This metric reports 0 when purchaseOption is No Upfront.
    • ~sample.aws.ec2.reserved.price.hourly - The hourly payment (in US$) for a reservation. This metric reports 0 when the purchaseOption is All Upfront.
  • RDS Metrics - Give insight into Amazon Relational Database Service (RDS).
    • aws.rds.allocatedstorage - The amount of storage (in gigabytes) allocated for the database instance.
    • aws.rds.capacity - For Amazon Aurora only, RDS capacity.
    • aws.rds.backtrackconsumedchangerecords - For Amazon Aurora only, the number of change records stored for Backtrack.
  • Service Limit Metrics - Capture the current resource limits and usage for your AWS account. These metrics include the point tags Region and category.
    • aws.limits.<resource>.limit - The current limit for an AWS resource in a particular region.
    • aws.limits.<resource>.usage - The current usage of an AWS resource in a particular region.
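One way to compare the reserved pricing metrics listed above with the on-demand hourly price is to spread the up-front payment over the lease term. This amortization is a common back-of-the-envelope approach, not something the product computes; the function name and the sample prices are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760; leap days are ignored in this sketch


def effective_hourly_price(upfront: float, hourly: float,
                           lease_years: int) -> float:
    """Amortize the up-front payment over the lease and add the hourly fee."""
    if lease_years <= 0:
        raise ValueError("lease_years must be positive")
    return upfront / (lease_years * HOURS_PER_YEAR) + hourly


# Hypothetical Partial Upfront reservation: 876 USD up front, 0.05 USD per
# hour, 1-year lease.
price = effective_hourly_price(876.0, 0.05, 1)
print(round(price, 4))
```

With No Upfront, the first term is zero and the result equals the reserved hourly price; with All Upfront, the hourly term is zero.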