Datadog Concepts

From NovaOrdis Knowledge Base
Revision as of 01:12, 25 January 2022 by Ovidiu (talk | contribs) (→‎Security)
Jump to navigation Jump to search

Internal

Overview

Datadog is an observability platform that includes products for monitoring, alerting, metrics, dashboard, big logs, synthetics, user monitoring, CI/CD (how?). Datadog is API driven.

Route of a Metric from Application to Dashboard

Application (specialized library) → metricDogStatsD → Datadog Backend → metricDashboard

Datadog Metric Propagation.png

The metrics are generated by an application-level library, such as Micrometer. For more details, see Metric Lifecycle below. The Datadog agent annotates the metric with additional tags (cluster name, pod name, etc.)

Metrics

https://docs.datadoghq.com/metrics/

Metrics are numerical values that can track anything about your environment over time. Example: latency, error rates, user signups. Metric data is ingested and stored as a datapoint with a value and a timestamp. The timestamp is rounded to the nearest second. If there is more than one value with the same timestamp, the latest received value overwrites the previous one. A sequence of metrics is stored as a timeseries. There are standard metrics, such as CPU, memory, etc, but metrics specific to business can be defined. Those are custom metrics. Metrics can be visualized in dashboards, Metrics Explorer and Notebooks

Metric Name

Metric Tags

Metric Lifecycle

Metrics are created by application-level specialized libraries, such as Micrometer. For example, Micrometer creates measurements , which are semantically equivalent. The metric has a name, and it can optionally have one or more tags.

Metrics can be sent to Datadog from:

Metric Query

https://docs.datadoghq.com/metrics/#querying-metrics

 space-aggregation:metric.name{filter/scope>} by space-aggregation(grouping by tag).time-aggregation

 avg:system.io.r_s{app:myapp} by {host}.rollup(avg, 3600)

TO PROCESS:

Metric Types

https://docs.datadoghq.com/metrics/types/

Metric types determine which graphs and functions are available to use with the metric.

Count

A count metric adds up all the submitted values in a time interval. This would be suitable for a metric tracking the number of website hits, for example.

Rate

The rate metric takes the count and divides it by the length of the type interval (example: hits per second).

Gauge

A gauge metric takes the last value reported during the interval. This could be used to track values such as CPU or memory, where taking the last value provides a representative picture of the host's behavior during the time interval.

Histogram

A histogram reports five different values summarizing the submitted values: the average, count, median, 95th percentile, and max. This produces five different timeseries. This type of metric is suitable for things like latency, for which were is not enough to know the average value. Histograms allow you to understand how your data was spread out without recording every single data point.

Distribution

https://docs.datadoghq.com/metrics/distributions/

A distribution is similar to a histogram but it summarizes values submitted during a time interval across all hosts in your environment.

Set

Custom Metrics

https://docs.datadoghq.com/metrics/custom_metrics/

Events

https://docs.datadoghq.com/events/

Events are records of notable changes relevant for managing and troubleshooting IT operations, such as code deployments, service health, configuration changes or monitoring alerts. TO PROCESS:

Agent

The Datadog agent has a built-in StatsD server, exposed over a configurable port. It's written in Go.

Agent and Kubernetes

TO CONTINUE: https://docs.datadoghq.com/developers/dogstatsd/?tab=kubernetes#

DogStatsD

https://docs.datadoghq.com/developers/dogstatsd/

DogStatsD is a metrics aggregation service bundled with the Datadog agent. DogStatsD implements the StatsD protocol and a few extensions (histogram metric type, service checks, events and tagging).

DogStatsD accepts custom metrics,events and service checks over UDP and periodically aggregates them and forwards them to Datadog.

Monitor

When something goes wrong, a computer tells you about it. This is what a monitor is. There re different monitor types. The monitor has a query. The monitor has alert conditions.

Unified Service Tagging

There are three reserved tags: "env", "service", "version".

Dashboard

Dashboard

Metrics Explorer

Tool to browse arbitrary metrics, by selecting their name.

Notebook

Security

User

Service Account

API Key

https://docs.datadoghq.com/account_management/api-app-keys/#api-keys

An API key is required by the Datadog Agent to submit metrics and events to Datadog. The API keys are also used by other third-party clients, such as, for example, the Pulumi Datadog resource provider, which provisions infrastructure on the Datadog backend. API keys are unique to an organization. To see API Keys: Console → Hover over the user name at the bottom of the left side menu → Organization Settings → API Keys.

Application Key

https://docs.datadoghq.com/account_management/api-app-keys/#application-keys

Application keys, in conjunction with the organization’s API key, give users access to Datadog’s programmatic API. Application keys are associated with the user account that created them and by default have the permissions and scopes of the user who created them. To see or create Application Keys: Console → Hover over the user name at the bottom of the left side menu → Organization Settings → Application Keys.

API