Independent community resource — not affiliated with the official OpenClaw project.
Part 2 of 5: OpenClaw for Developers

Error & Uptime Monitoring with OpenClaw

Use cases — the six monitoring workflows

Error and uptime monitoring is some of the most time-sensitive work you do as an engineer. You need to know immediately when a production endpoint goes down, when a new error type appears in your error tracker, and whether last night's incidents have resolved. OpenClaw automates the daily check-in.

🟢 Endpoint health checks

Ping your URLs on a schedule, alert on non-200 or slow responses

🐞 Sentry error summary

Group and rank errors by frequency, flag new errors seen for first time

🔕 Alert de-duplication

Group similar errors to cut noise, surface root cause candidates

🚨 New error detection

Alert only when an error is seen for the first time, not every occurrence

☀️ On-call morning briefing

Single summary of overnight incidents, top errors, and uptime status

📉 Error trend tracking

Flag error groups that are growing week over week

Sentry API setup (4 steps)

To summarise errors and track uptime, you need a Sentry account with API access. The free tier is sufficient for most projects and includes 5,000 events per month.

Step 1: Generate an API token

Log in to Sentry, then navigate to Your avatar (top right) → Settings → API → Auth Tokens → Create New Token. Give it a name like "OpenClaw Monitoring".

Step 2: Set scopes

Select these read-only scopes (you don't need write access):

project:read — read project and issue data
event:read — read individual error events
org:read — read organisation metadata

Step 3: Find your organisation and project slugs

The slugs appear in your Sentry URL. For example, if your Sentry URL is https://sentry.io/organizations/my-company/projects/web-app/, your organisation slug is my-company and your project slug is web-app.
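If you want to sanity-check the slugs programmatically, they can be pulled straight out of the URL path. A minimal sketch, assuming your URL follows the sentry.io/organizations/&lt;org&gt;/projects/&lt;project&gt;/ shape shown above (the function name is illustrative):

```python
from urllib.parse import urlparse

def slugs_from_sentry_url(url: str) -> tuple[str, str]:
    """Extract (org_slug, project_slug) from a Sentry project URL."""
    # Expected path shape: /organizations/<org>/projects/<project>/
    parts = [p for p in urlparse(url).path.split("/") if p]
    org = parts[parts.index("organizations") + 1]
    project = parts[parts.index("projects") + 1]
    return org, project

# slugs_from_sentry_url("https://sentry.io/organizations/my-company/projects/web-app/")
# → ("my-company", "web-app")
```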

Step 4: Store in secrets

Create or update your secrets.env with the token and slugs:

SENTRY_TOKEN=sntrys_xxxxxxxxxxxx
SENTRY_ORG=your-org-slug
SENTRY_PROJECT=your-project-slug
💡 Keep tokens safe. Never commit secrets.env to version control. Store it locally or in a secrets management service, and reference it when running OpenClaw via openclaw run --secrets-file secrets.env.
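The secrets file is plain KEY=VALUE pairs, so its format is easy to reason about. A hypothetical sketch of a loader for that format — OpenClaw's own `--secrets-file` handling is the real mechanism; this just shows what gets parsed:

```python
def load_secrets(path: str) -> dict[str, str]:
    """Parse a KEY=VALUE secrets file; blank lines and # comments are ignored."""
    secrets = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            secrets[key.strip()] = value.strip()
    return secrets
```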

HTTP endpoint health checks

The uptime-checker agent pings your production and staging endpoints on a schedule and alerts if any return non-200 status, timeout, or respond too slowly.

Configuration

Add this agent to your AGENTS.md:

agents:
  uptime-checker:
    description: "Ping endpoints and alert on failures or slow responses"
    tools:
      - http-checker
    config:
      endpoints:
        - url: "https://yourapp.com/health"
          name: "App health endpoint"
          expected_status: 200
          timeout_ms: 3000
        - url: "https://yourapp.com/api/v1/ping"
          name: "API ping"
          expected_status: 200
          timeout_ms: 2000
          expected_body_contains: "pong"
        - url: "https://yourapp.com"
          name: "Homepage"
          expected_status: 200
          timeout_ms: 5000
      alert_on:
        - wrong_status_code
        - timeout
        - body_mismatch
        - response_time_ms_above: 4000
      state_file: "uptime_state.json"
      alert_after_consecutive_failures: 2
    output:
      format: markdown
      include: [endpoint_name, status, response_time_ms, checked_at, failure_reason]
      only_report_problems: true
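Under the hood, a check like this reduces to one HTTP request plus a handful of comparisons. A minimal Python sketch of that logic — the endpoint dict mirrors the config above, but the function names are illustrative, not OpenClaw's actual implementation, and the consecutive-failure tracking persisted in state_file is omitted:

```python
import time
import urllib.request
import urllib.error

def evaluate(status, elapsed_ms, body, endpoint, slow_above_ms=4000):
    """Return the list of failure reasons for one check result (empty = healthy)."""
    reasons = []
    if status != endpoint.get("expected_status", 200):
        reasons.append("wrong_status_code")
    if elapsed_ms > slow_above_ms:
        reasons.append("response_time_ms_above")
    expected = endpoint.get("expected_body_contains")
    if expected is not None and expected not in body:
        reasons.append("body_mismatch")
    return reasons

def check_endpoint(endpoint):
    """Fetch one endpoint and evaluate the response; network errors count as timeouts."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(endpoint["url"], timeout=endpoint["timeout_ms"] / 1000) as resp:
            body = resp.read().decode(errors="replace")
            elapsed_ms = (time.monotonic() - start) * 1000
            return evaluate(resp.status, elapsed_ms, body, endpoint)
    except urllib.error.URLError:
        return ["timeout"]
```

With alert_after_consecutive_failures: 2, an alert would fire only after two successive non-empty reason lists for the same endpoint.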

Key settings

endpoints — the list of URLs to check, each with its own expected status, timeout, and optional body match
alert_on — which failure conditions trigger an alert: wrong status code, timeout, body mismatch, or slow responses
alert_after_consecutive_failures: 2 — suppresses one-off blips; an alert fires only after two failed checks in a row
state_file — where the agent persists failure counts between runs
only_report_problems: true — produces no output when every endpoint is healthy

Sentry error group summarisation

The sentry-summariser agent pulls the top error groups from Sentry over a configurable time window, ranks them by frequency, and flags new errors (never seen before).

Configuration

agents:
  sentry-summariser:
    description: "Summarise top Sentry error groups from the last 24 hours"
    tools:
      - sentry-api
    config:
      token: "${SENTRY_TOKEN}"
      organisation: "${SENTRY_ORG}"
      project: "${SENTRY_PROJECT}"
      time_window_hours: 24
      top_n: 10
      new_errors_window_hours: 24
    output:
      format: markdown
      sections:
        - new_errors_first_seen
        - top_errors_by_events
        - growing_errors
      include_fields: [title, culprit, event_count, user_count, first_seen, last_seen, assignee, is_new]
    prompt: |
      For each new error (first_seen within window), write one sentence explaining what the error is
      and where in the code it occurs (culprit field). Mark it [NEW] in bold.
      For top errors, just list them with counts — no prose needed.
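In rough terms, the agent fetches the project's issues (Sentry exposes these at GET /api/0/projects/{org}/{project}/issues/, authenticated with a Bearer token) and then ranks them. A sketch of the ranking step, assuming issue dicts shaped like Sentry's response (count as a string, firstSeen as an ISO timestamp) — the function is illustrative, not the agent's real code:

```python
from datetime import datetime, timedelta, timezone

def summarise(issues, top_n=10, new_window_hours=24):
    """Rank issues by event count and flag ones first seen inside the window."""
    cutoff = datetime.now(timezone.utc) - timedelta(hours=new_window_hours)
    for issue in issues:
        first_seen = datetime.fromisoformat(issue["firstSeen"])
        issue["is_new"] = first_seen >= cutoff
    ranked = sorted(issues, key=lambda i: int(i["count"]), reverse=True)
    return {
        "new_errors_first_seen": [i for i in ranked if i["is_new"]],
        "top_errors_by_events": ranked[:top_n],
    }
```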

Output sections

new_errors_first_seen — error groups whose first_seen falls inside the window, marked [NEW] with a one-sentence explanation
top_errors_by_events — the top_n error groups ranked by event count
growing_errors — groups whose event count is trending upward versus the prior window

Alert de-duplication and noise reduction

Error tracking tools can be noisy if one underlying bug causes multiple distinct error signatures. The alert-deduplicator agent groups related errors by culprit (file/function) and error type, surfacing the root cause instead of the symptom list.

Configuration

agents:
  alert-deduplicator:
    description: "Group similar Sentry errors to reduce noise and surface root causes"
    tools:
      - sentry-api
    config:
      token: "${SENTRY_TOKEN}"
      organisation: "${SENTRY_ORG}"
      project: "${SENTRY_PROJECT}"
      time_window_hours: 24
    analysis:
      group_by_culprit: true
      group_by_error_type: true
      root_cause_hint: true
    output:
      format: markdown
      prompt: |
        Instead of listing every error individually, group them:
        - If 3+ errors share the same culprit (file/function), group them under "Possible root cause: [culprit]"
        - List the individual error types underneath, indented
        - Estimate the blast radius: how many users/events are affected by the group total
        This output replaces the raw Sentry list in the on-call briefing.

How it works

The agent queries Sentry for all errors in the window, clusters them by culprit and error type, and generates a grouped report. For example, if you have TypeError, AttributeError, and KeyError all originating from the same function in auth/middleware.js, the output flags that as a single root cause cluster with three error types — signalling to your on-call engineer to focus on that one location.
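The clustering itself is straightforward to sketch. Assuming issue dicts with culprit, type, count, and userCount fields (mirroring Sentry's issue payload), a grouped report could be built like this — a hypothetical illustration, not the agent's actual implementation:

```python
from collections import defaultdict

def cluster_by_culprit(issues, min_group_size=3):
    """Group issues sharing a culprit and flag clusters as root-cause candidates."""
    groups = defaultdict(list)
    for issue in issues:
        groups[issue["culprit"]].append(issue)
    clusters = []
    for culprit, members in groups.items():
        clusters.append({
            "culprit": culprit,
            "error_types": sorted({m["type"] for m in members}),
            # Blast radius: total events and users across the whole cluster
            "events": sum(int(m["count"]) for m in members),
            "users": sum(int(m["userCount"]) for m in members),
            "root_cause_candidate": len(members) >= min_group_size,
        })
    return sorted(clusters, key=lambda c: c["events"], reverse=True)
```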

Morning on-call briefing

The oncall-brief agent combines uptime status, new errors, and top error trends into a single morning report. Engineers read this at standup instead of checking three separate dashboards.

Configuration

agents:
  oncall-brief:
    description: "Daily on-call briefing combining uptime and error status"
    tools:
      - sentry-api
      - http-checker
    config:
      sentry:
        token: "${SENTRY_TOKEN}"
        organisation: "${SENTRY_ORG}"
        project: "${SENTRY_PROJECT}"
        time_window_hours: 12
      endpoints: "${MONITORED_ENDPOINTS}"
    prompt: |
      Generate a concise on-call morning briefing. Structure:

      ## 🟢/🔴 Uptime Status
      List all monitored endpoints. Use 🟢 if up, 🔴 if down or degraded.

      ## 🚨 New Errors (last 12h)
      List any errors seen for the first time. If none: "✓ No new error types overnight."

      ## 📊 Top Errors by Volume
      Top 3 errors by event count. One line each: [count] events — [title] — [culprit]

      ## 📈 Growing Issues
      Any errors that grew >50% vs the prior 12h window.

      Keep the entire briefing under 30 lines. Engineers read this in standup.

Typical output structure

## 🟢/🔴 Uptime Status
🟢 App health endpoint (98ms)
🟢 API ping (156ms)
🟢 Homepage (287ms)

## 🚨 New Errors (last 12h)
[NEW] TypeError: Cannot read property 'user' of undefined — auth/middleware.js:47 — 3 events, 3 users

## 📊 Top Errors by Volume
24 events — NetworkError: Failed to fetch — services/api.js:102
12 events — ValidationError: Invalid request body — routes/webhook.js:33
8 events — TimeoutError: Request exceeded 5000ms — services/database.js:156

## 📈 Growing Issues
ValidationError growing 125% (8 → 18 events vs yesterday)
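The growth figures in the briefing reduce to a percent-change comparison between the current window and the one before it. A minimal sketch, with the function name and threshold chosen for illustration:

```python
def growing_issues(previous_counts, current_counts, threshold_pct=50):
    """Flag error groups whose event count grew more than threshold_pct vs the prior window."""
    flagged = []
    for title, current in current_counts.items():
        previous = previous_counts.get(title, 0)
        if previous == 0:
            continue  # brand-new errors are reported separately as [NEW]
        growth_pct = (current - previous) / previous * 100
        if growth_pct > threshold_pct:
            flagged.append((title, previous, current, round(growth_pct)))
    return flagged
```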

HEARTBEAT templates

Use these three HEARTBEAT.md configurations to automate your monitoring schedule.

Every 5 minutes: uptime check

schedules:
  - id: uptime-check-5m
    agent: uptime-checker
    cron: "*/5 * * * *"
    output: slack://your-channel-webhook
    notify_on_failure: true

Weekday mornings: on-call briefing

schedules:
  - id: oncall-brief-morning
    agent: oncall-brief
    cron: "0 8 * * 1-5"  # 8 AM, Monday to Friday
    output: slack://your-channel-webhook
    notify_on_failure: true

Monday mornings: weekly error trend report

schedules:
  - id: weekly-error-trends
    agent: sentry-summariser
    cron: "0 9 * * 1"  # 9 AM, Mondays
    config:
      time_window_hours: 168  # 7 days
      top_n: 15
    output: slack://your-channel-webhook

Sample on-call briefing

Here's what a real morning on-call briefing looks like:

on-call-briefing.md — generated at 2026-03-25 08:00:00 UTC
## 🟢/🔴 Uptime Status
🟢 App health endpoint (45ms)
🟢 API v1 endpoint (123ms)
🟢 Homepage (201ms)
🟢 Documentation site (156ms)

## 🚨 New Errors (last 12h)
[NEW] TypeError: Cannot read property 'user' of undefined — app/middleware/auth.js:47 — 3 events, 3 users

## 📊 Top Errors by Volume
18 events — ValidationError: Invalid email format — app/validators/user.js:23
12 events — NetworkError: Failed to connect to payment service — app/services/payments.js:89
8 events — TimeoutError: Redis connection timeout — app/cache/redis.js:156

## 📈 Growing Issues
ValidationError growing 125% (8 → 18 events vs yesterday's window)
NetworkError stable (10 → 12 events, +20%)

Frequently asked questions

Does OpenClaw replace PagerDuty or OpsGenie?

No. OpenClaw's uptime checker is for visibility and reporting, not incident management. It can post to a Slack channel or write a file, but it doesn't have PagerDuty's on-call rotation scheduling, escalation policies, or phone/SMS alerting. Use it alongside, not instead of, a dedicated incident tool.

How often should I run endpoint health checks?

Every 5 minutes is a reasonable starting point for production endpoints; for internal services, every 15 minutes is usually enough. Sentry summarisation is different: running it every 5 minutes would generate excessive API calls, so hourly or daily is more appropriate.

Can OpenClaw monitor multiple Sentry projects?

Yes. Add multiple project slugs to the config and the agent will query each one. Output is grouped by project in the report.

What happens if my Sentry free tier is exceeded?

The Sentry API still responds but event data may be sampled or capped. OpenClaw will report whatever data the API returns. If you're hitting Sentry's free tier limits, the agent config has a note_if_data_may_be_sampled: true option that adds a warning to the output.