Skip to main content

Monitors

Monitors allow you to define metric queries that trigger alerts based on specific conditions. The system periodically evaluates these queries and generates alerts when the results violate the defined conditions.

Key Components of a Monitor

Alert Name

A descriptive name that identifies the monitor. You can use template variables in the name using Prometheus templates syntax.

Query

A PromQL query that the system evaluates periodically. You can use any valid PromQL query syntax.

Conditions

Conditions specify the threshold values that trigger alerts. There are two severity levels:

  • Critical
  • Warning

You must define at least one condition, and you can set different thresholds for each severity level. The following operators are available for conditions:

  • Equal to
  • Not equal to
  • Above
  • Above or Equal to
  • Below
  • Below or Equal to

Message

A customizable message that accompanies alert notifications. You can enhance messages using template variables with Prometheus templates syntax.

For example, with a query like max by (job) (up), you can access the job name in your message using the template variable {{ $labels.job }}.

note

Oodle notifications automatically include the following information, so you don't need to add them to your message:

  • Alert name
  • Severity
  • Threshold
  • Metric Value (the current value when the alert fires)
  • Labels and Annotations

Notification Policy

Notification policies determine where alert notifications are sent. You can configure multiple notifiers as destinations for your alerts.

Monitor notification policy configuration

Each rule specifies which notifiers to trigger and can optionally be scoped by label conditions and severity.

How Routing Rules Work

Notification rules are evaluated top to bottom on a first-match basis. The first rule whose conditions match the alert fires its notifiers; subsequent rules are skipped. The last rule without any label conditions acts as the fallback and catches everything that didn't match an earlier rule.

Rules follow an If / Else If / Else pattern:

PositionPrefixDescription
First rule (with conditions)IfEvaluated first
Additional rules (with conditions)Else IfEvaluated only when preceding rules don't match
Last rule (no conditions)Always / ElseFallback — matches all remaining alerts

You can drag and drop rules to reorder them.

Label-Based Routing

Each rule can include one or more label conditions to match alerts. Conditions within a single rule are combined with AND logic. Available operators are:

  • is — exact match
  • is not — negated match
  • matches regex — regex match
  • does not match regex — negated regex match

Label values are auto-suggested from the monitor's query results, so you can pick from labels that actually appear in your data.

Example — Team-based routing:

Route alerts to the owning team's channel based on a team label:

RuleConditionNotifier
Ifteam is backendSlack #backend-alerts
Else Ifteam is frontendSlack #frontend-alerts
Else(none — fallback)Slack #ops-alerts

Example — Environment-based routing:

Escalate production alerts to your incident management tool while keeping non-production alerts in Slack:

RuleConditionNotifier
Ifenvironment is productionOpsGenie
Else(none — fallback)Slack #dev-alerts

Example — Combined conditions:

Conditions within a rule are AND'd together, so you can be very specific:

RuleConditionsNotifier
Ifteam is payments AND environment is productionPagerDuty
Else Ifenvironment is productionOpsGenie
Else(none — fallback)Slack #alerts

Severity-Based Routing

Within each rule, you can filter by alert severity using the for dropdown. The options are:

OptionMatches
allCritical, Warning, and No Data alerts
criticalOnly Critical alerts
warningOnly Warning alerts
no dataOnly No Data alerts

When you select a specific severity, you can add additional severity rows within the same rule to send different severities to different notifiers. For example, within a single rule you can send critical alerts to PagerDuty and warning alerts to Slack.

Notification Policies vs. Direct Notifiers

Each rule supports two action types, selectable via the dropdown next to the rule:

  • notify on — Send directly to one or more notifiers (Slack, OpsGenie, PagerDuty, etc.)
  • trigger policy — Delegate to a reusable Notification Policy that defines its own routing, grouping, and timing

Grouping

Grouping controls how notifications are consolidated. You can choose from three options:

  • By Monitor: Receive a single notification per monitor (default behavior)
  • By Labels: Receive a notification for each unique set of specified labels
  • Disabled: Receive a notification for each individual timeseries

For example, consider a monitor with this query that alerts when the up metric for any job is 0:

max by (job, instance) (up{})

Here's how different grouping options affect notifications:

  • By Monitor: You receive one notification that includes all affected jobs and instances
  • By Labels (with job specified): You receive one notification per affected job
  • Disabled: You receive one notification per affected timeseries (job-instance combination)

Notification Timings

These parameters control the timing of grouped notifications:

Group Wait

Group Wait defines how long to wait before sending the first notification for a new group of alerts. The default is 30 seconds.

Group Interval

Group Interval defines how long to wait before sending a notification for new alerts added to a group that has already sent notifications. The default is 5 minutes.

Repeat Interval

The repeat interval specifies how often notifications are resent for active alerts. For example, if you have an active alert for the timeseries up with job=foo and instance=bar, setting a repeat interval of 10 minutes means the notification will be sent every 10 minutes until the alert is resolved.

info

For example, with the up metric monitor grouped "By Monitor", consider a group wait of 30 seconds and a group interval of 5 minutes: If two jobs go down within the first 30 seconds, you'll receive one notification covering both jobs. If a third job goes down a minute later, its notification will be delayed by the 5-minute group interval.

Labels

Labels add metadata to alerts. You can use Prometheus templates syntax to create dynamic labels.

A common use case is tagging monitors with owner teams to identify responsible teams for each monitor.

Annotations

Annotations provide additional context for alerts, such as descriptions or runbook links. Like labels, annotations support Prometheus templates syntax.

Creating a Monitor

  1. Navigate to Alerts (Bell icon in sidebar)
  2. Click New Alert
  3. Configure the following settings:
    • Name: Enter a descriptive name for your monitor
    • Query: Specify the PromQL query to evaluate
    • Conditions: Define the alerting conditions
    • Message: (Optional) Add a message for alert notifications
    • Notification Policy: (Optional) Select routing rules for alerts
    • Grouping: (Optional) Configure how you want to group alerts. Additionally, you can configure various timing parameters by clicking on the Clock icon.
    • Labels: (Optional) Add metadata labels to the alerts
    • Annotations: (Optional) Include additional contextual information
  4. Click Save to create the monitor

Support

If you need assistance or have any questions, please reach out to us through: