Datadog Concepts: Difference between revisions
Revision as of 22:50, 24 March 2022
Internal
Overview
Datadog is an observability platform that includes products for monitoring, alerting, metrics, dashboards, big logs, synthetics, user monitoring and CI/CD (how?). Datadog is API-driven.
Organization
Route of a Metric from Application to Dashboard
Application (specialized library) → metric → DogStatsD → Datadog Backend → metric → Dashboard
The metrics are generated by an application-level library, such as Micrometer. For more details, see Metric Lifecycle below. The Datadog agent annotates the metric with additional tags (cluster name, pod name, etc.).
Metrics
Metrics are numerical values that can track anything about your environment over time. Examples: latency, error rates, user signups. Metric data is ingested and stored as a datapoint with a value and a timestamp. The timestamp is rounded to the nearest second. If there is more than one value with the same timestamp, the latest received value overwrites the previous one. A sequence of datapoints is stored as a timeseries. There are standard metrics, such as CPU, memory, etc., but business-specific metrics can also be defined. Those are custom metrics. Metrics can be visualized in dashboards, Metrics Explorer and Notebooks.
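The rounding and overwrite rule described above can be sketched as follows. This is a minimal illustration, not Datadog's implementation: the function name store_datapoints is hypothetical, and rounding halves upward is an assumption.

```python
import math


def store_datapoints(submissions):
    """Collapse (timestamp, value) submissions into one datapoint per second.

    Hypothetical helper: timestamps are rounded to the nearest second and a
    later submission for the same second overwrites the earlier one.
    """
    series = {}
    for timestamp, value in submissions:
        second = math.floor(timestamp + 0.5)  # nearest second, halves round up (assumption)
        series[second] = value                # latest received value wins
    return sorted(series.items())


submissions = [(10.4, 1.0), (9.6, 2.0), (11.2, 3.0)]
timeseries = store_datapoints(submissions)
```

Here the first two submissions both round to second 10, so only the value received last for that second is kept.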
Metric Name
Valid characters?
Metric Tags
Metric Lifecycle
Metrics are created by application-level specialized libraries, such as Micrometer. For example, Micrometer creates measurements, which are semantically equivalent to metrics. The metric has a name, and it can optionally have one or more tags.
Metrics can be sent to Datadog from:
- Datadog-supported integrations. More: https://docs.datadoghq.com/integrations/.
- Directly from the Datadog platform, for example counting errors showing up in the logs and publishing that as a new metric. More: https://docs.datadoghq.com/logs/logs_to_metrics/.
- Custom metrics generators.
- Datadog Agent, which automatically sends standard metrics such as CPU and disk usage.
Metric Query
space-aggregation:metric.name{filter/scope} by {space_aggregation_grouping_by_tag}.time-aggregation
avg:system.io.r_s{app:myapp} by {host}.rollup(avg, 3600)
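To make the structure of the query string concrete, here is a small sketch that assembles it from its parts. The function build_metric_query and its parameter names are hypothetical and only perform string formatting; nothing here calls Datadog. Using {*} for an empty scope is an assumption of this sketch.

```python
def build_metric_query(space_agg, metric, scope=None, group_by=None, rollup=None):
    """Assemble a metric query string from its elements (illustrative only)."""
    # Filter/scope: comma-separated tag-name:tag-value pairs, "*" when empty (assumption)
    scope_text = ",".join(f"{name}:{value}" for name, value in (scope or {}).items()) or "*"
    query = f"{space_agg}:{metric}{{{scope_text}}}"
    # Space aggregation grouping tags
    if group_by:
        query += " by {" + ",".join(group_by) + "}"
    # Time aggregation
    if rollup:
        method, seconds = rollup
        query += f".rollup({method}, {seconds})"
    return query


query = build_metric_query(
    "avg", "system.io.r_s", scope={"app": "myapp"}, group_by=["host"], rollup=("avg", 3600)
)
```

With these arguments the result is the same string as the example query above.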
TO PROCESS:
- https://docs.datadoghq.com/metrics/#querying-metrics
- https://docs.datadoghq.com/metrics/advanced-filtering
Metric Query Elements
Space Aggregator
Examples of space aggregators: avg, count, max, min, p50, p75, p95, p99, sum, 0-sum, 1-avg, 100-avg.
Metric Name in Metric Query
system.io.r_s
See Metric Name above.
Filter/Scope
The query metric values can be filtered based on tags. The {...} section contains a comma-separated list of tag-name:tag-value pairs. The filter, that is the list of tag-name:tag-value pairs, is said to scope the query. Example:
app:myapp, something:somethingelse
For more details on tags, see Tags.
Operations
Space Aggregator Tags
host
Rollup
rollup(avg, 3600)
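As a rough illustration of what a rollup such as rollup(avg, 3600) does, the sketch below groups datapoints into fixed 3600-second buckets and averages each bucket. The function rollup_avg is hypothetical, and aligning buckets to multiples of the interval is an assumption rather than documented Datadog behavior.

```python
from collections import defaultdict


def rollup_avg(datapoints, interval_seconds):
    """Average (timestamp, value) datapoints per fixed-size time bucket."""
    buckets = defaultdict(list)
    for timestamp, value in datapoints:
        # Bucket start is the timestamp rounded down to a multiple of the interval
        bucket_start = timestamp - (timestamp % interval_seconds)
        buckets[bucket_start].append(value)
    return [(start, sum(values) / len(values)) for start, values in sorted(buckets.items())]


points = [(0, 2.0), (1800, 4.0), (3600, 10.0)]
hourly = rollup_avg(points, 3600)
```

The first two points fall into the bucket starting at 0 and are averaged; the third point starts a new bucket.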
Other Examples
avg:myapp.smoketest.run_time{$cluster_name}/1000
Metric Types
Metric types determine which graphs and functions are available to use with the metric.
Count
A count metric adds up all the submitted values in a time interval. This would be suitable for a metric tracking the number of website hits, for example.
Rate
The rate metric takes the count and divides it by the length of the time interval (example: hits per second).
Gauge
A gauge metric takes the last value reported during the interval. This could be used to track values such as CPU or memory, where taking the last value provides a representative picture of the host's behavior during the time interval.
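The difference between count, rate and gauge can be illustrated by applying all three to the same values submitted during one interval. This is a simplified sketch with a hypothetical function name, not how the agent is implemented.

```python
def summarize_interval(values, interval_seconds):
    """Return what a count, a rate and a gauge would report for one interval."""
    if not values:
        raise ValueError("no values submitted during the interval")
    count = sum(values)                   # count: add up all submitted values
    return {
        "count": count,
        "rate": count / interval_seconds,  # rate: count divided by interval length
        "gauge": values[-1],               # gauge: last value reported
    }


summary = summarize_interval([1, 1, 3], interval_seconds=10)
```

For the values 1, 1, 3 submitted over a 10-second interval, the count is 5, the rate is 0.5 per second and the gauge is 3.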
Histogram
A histogram reports five different values summarizing the submitted values: the average, count, median, 95th percentile, and max. This produces five different timeseries. This type of metric is suitable for things like latency, for which it is not enough to know the average value. Histograms allow you to understand how your data was spread out without recording every single data point.
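The five summary values can be computed from a list of submitted values as sketched below. The function name and the dictionary keys are hypothetical, and the nearest-rank method used for the 95th percentile is an assumption; the percentile algorithm actually used by Datadog may differ.

```python
import statistics


def summarize_histogram(values):
    """Compute the five histogram summary values for one interval."""
    if not values:
        raise ValueError("no values submitted during the interval")
    ordered = sorted(values)
    # Nearest-rank 95th percentile: ceil(0.95 * n), in integer arithmetic (assumption)
    rank = (95 * len(ordered) + 99) // 100
    return {
        "avg": sum(ordered) / len(ordered),
        "count": len(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }


latencies = list(range(1, 21))  # 1, 2, ..., 20
histogram = summarize_histogram(latencies)
```

Each key would correspond to one of the five timeseries produced for the metric.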
Distribution
A distribution is similar to a histogram but it summarizes values submitted during a time interval across all hosts in your environment.
Set
Custom Metrics
Tags
Events
Events are records of notable changes relevant for managing and troubleshooting IT operations, such as code deployments, service health, configuration changes or monitoring alerts. TO PROCESS:
Agent
The Datadog agent has a built-in StatsD server, exposed over a configurable port. It's written in Go.
Agent and Kubernetes
TO CONTINUE: https://docs.datadoghq.com/developers/dogstatsd/?tab=kubernetes#
DogStatsD
DogStatsD is a metrics aggregation service bundled with the Datadog agent. DogStatsD implements the StatsD protocol and a few extensions (histogram metric type, service checks, events and tagging).
DogStatsD accepts custom metrics, events and service checks over UDP, periodically aggregates them and forwards them to Datadog.
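A metric can be submitted to DogStatsD as a plain-text UDP datagram. The sketch below formats a datagram using the StatsD name:value|type layout plus the DogStatsD tag extension (|#tag:value,...), and sends it with the standard library. The function names are hypothetical, and the host and port (127.0.0.1:8125) are assumptions that depend on how the agent is configured. In practice a client library would normally be used instead.

```python
import socket


def format_datagram(name, value, metric_type, tags=None):
    """Format one metric as a DogStatsD datagram, e.g. 'c' count, 'g' gauge, 'h' histogram."""
    datagram = f"{name}:{value}|{metric_type}"
    if tags:
        # DogStatsD tagging extension: comma-separated tag-name:tag-value pairs
        datagram += "|#" + ",".join(f"{key}:{val}" for key, val in tags.items())
    return datagram


def send_metric(datagram, host="127.0.0.1", port=8125):
    """Send the datagram over UDP; host and port are assumed defaults."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


datagram = format_datagram("myapp.requests", 1, "c", {"env": "dev", "service": "myapp"})
```

Calling send_metric(datagram) would then hand the metric to a locally running agent. Since UDP is connectionless, the call does not confirm that anything received it.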
Monitor
When something goes wrong, a computer tells you about it. This is what a monitor is. There are different monitor types. The monitor has a query. The monitor has alert conditions.
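The relationship between a monitor, its query and its alert conditions can be modeled as in the sketch below. This is a purely illustrative data structure, not the Datadog monitor API: the class, its fields and the single-threshold condition are all assumptions.

```python
import operator
from dataclasses import dataclass

# Supported comparison operators for the hypothetical alert condition
COMPARATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass
class Monitor:
    name: str
    query: str         # metric query whose result is evaluated
    comparator: str    # one of the keys of COMPARATORS
    threshold: float   # alert threshold

    def is_alerting(self, value):
        """Return True when the query result violates the alert condition."""
        return COMPARATORS[self.comparator](value, self.threshold)


monitor = Monitor(
    name="High read rate",
    query="avg:system.io.r_s{app:myapp} by {host}",
    comparator=">",
    threshold=100.0,
)
```

Evaluating monitor.is_alerting(150.0) returns True, while monitor.is_alerting(50.0) returns False.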
Unified Service Tagging
There are three reserved tags: "env", "service", "version".
Dashboard
Metrics Explorer
Tool to browse arbitrary metrics, by selecting their name.
Notebook
Security
User
Service Account
API Key
An API key is required by the Datadog Agent to submit metrics and events to Datadog. API keys are also used by third-party clients, such as the Pulumi Datadog resource provider, which provisions infrastructure on the Datadog backend. API keys are unique to an organization. To see API Keys: Console → Hover over the user name at the bottom of the left side menu → Organization Settings → API Keys.
To call the API, the client expects the environment variable DATADOG_API_KEY to be set.
Application Key
Application keys, in conjunction with the organization’s API key, give users access to Datadog’s programmatic API. Application keys are associated with the user account that created them and by default have the permissions and scopes of the user who created them. To see or create Application Keys: Console → Hover over the user name at the bottom of the left side menu → Organization Settings → Application Keys.
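A client that calls the Datadog HTTP API typically passes both keys as request headers. The sketch below reads the keys from the environment and builds the headers. DATADOG_API_KEY is the variable mentioned above; DATADOG_APP_KEY as the name of the application key variable is an assumption, and the function name is hypothetical.

```python
import os


def datadog_auth_headers(environ=os.environ):
    """Build authentication headers from environment variables."""
    api_key = environ.get("DATADOG_API_KEY")
    if not api_key:
        raise RuntimeError("DATADOG_API_KEY is not set")
    headers = {"DD-API-KEY": api_key}
    # The application key variable name is an assumption of this sketch
    app_key = environ.get("DATADOG_APP_KEY")
    if app_key:
        headers["DD-APPLICATION-KEY"] = app_key
    return headers


example_headers = datadog_auth_headers(
    {"DATADOG_API_KEY": "example-api-key", "DATADOG_APP_KEY": "example-app-key"}
)
```

Passing a plain dictionary instead of os.environ, as in the example, keeps the function easy to exercise without touching real credentials.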
API
Kubernetes Support
Understand this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  template:
    metadata:
      # Datadog Autodiscovery annotations; the "myapp" segment of each key
      # is assumed to be the name of a container in the pod spec
      annotations:
        ad.datadoghq.com/myapp.check_names: '["myapp"]'
        ad.datadoghq.com/myapp.init_configs: '[{"is_jmx": true, "collect_default_metrics": true}]'
        ad.datadoghq.com/myapp.instances: '[{"host": "%%host%%","port":"19081"}]'
    spec:
      [...]