Monitoring

Deploy Monkey provides built-in monitoring for your servers and Odoo instances — tracking server resources, instance health, and worker usage.

Deploy Monkey collects server-level metrics every 5 minutes via SSH:

| Metric | Description |
| --- | --- |
| CPU Usage | Percentage of CPU being used |
| Load Average | System load (1m, 5m, 15m) |
| Memory Usage | RAM consumption (used %, used MB, total MB) |
| Disk Usage | Storage used (used %, used GB, total GB) |
| Containers | Running and exited container counts |

Metrics are available as raw samples (up to 14 days) and hourly rollups (up to 180 days).
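
To illustrate how 5-minute raw samples could be condensed into hourly rollups, here is a minimal Python sketch. The `Sample` shape, the `hourly_rollups` helper, and the choice of average and maximum as aggregates are assumptions made for illustration; the aggregates Deploy Monkey actually stores are not documented here.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone
from statistics import mean


@dataclass(frozen=True)
class Sample:
    """One raw server sample (hypothetical shape, CPU only for brevity)."""
    taken_at: datetime
    cpu_percent: float


def hourly_rollups(samples):
    """Group raw samples by the hour they fall in and summarise each bucket.

    Returns {hour_start: {"avg": ..., "max": ..., "count": ...}}.
    Average and max are assumed aggregates, not confirmed ones.
    """
    buckets = defaultdict(list)
    for sample in samples:
        # Truncate the timestamp to the start of its hour.
        hour = sample.taken_at.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(sample.cpu_percent)
    return {
        hour: {"avg": mean(values), "max": max(values), "count": len(values)}
        for hour, values in sorted(buckets.items())
    }
```

With one sample every 5 minutes, a full hour contributes 12 raw samples to each rollup row.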

Navigate to your server’s Monitoring tab to see:

  • Current metric values (CPU, RAM, Disk, Load)
  • Timeseries charts with range selection (1h, 6h, 24h, 7d)
  • Container status overview

Each instance is monitored with two signals:

  1. Container status — checked via SSH during the server metrics pass (is the Docker container running?)
  2. HTTP probe — checks https://domain/web/login every 5 minutes
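
The HTTP probe can be approximated with the Python standard library. This is a sketch under assumptions: the `login_url` and `probe` names, the 10-second timeout, and the rule that only 2xx/3xx responses count as "responding" are illustrative and not taken from Deploy Monkey itself.

```python
import time
import urllib.request


def login_url(domain):
    """Build the probe URL for an instance domain."""
    return f"https://{domain}/web/login"


def probe(domain, timeout=10.0, opener=urllib.request.urlopen):
    """Probe the login page once and return (ok, elapsed_ms).

    Assumption: HTTP errors, timeouts, and connection failures all count
    as a failing signal. `opener` is injectable so the function can be
    exercised without network access.
    """
    started = time.monotonic()
    try:
        with opener(login_url(domain), timeout=timeout) as response:
            ok = 200 <= response.status < 400
    except OSError:
        # urllib.error.URLError, HTTPError, and socket timeouts are all
        # subclasses of OSError.
        ok = False
    elapsed_ms = (time.monotonic() - started) * 1000.0
    return ok, elapsed_ms
```

The elapsed time returned here is the kind of value that could feed response-time statistics and a slow-response check.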

These signals combine into an overall health state:

| State | Meaning |
| --- | --- |
| Healthy | Container running AND HTTP responding |
| Degraded | One signal OK, one failing |
| Down | Both signals failing |
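
The state table above maps directly onto a small function. The `Health` enum and `health_state` name below are hypothetical; only the combination logic comes from the table.

```python
from enum import Enum


class Health(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    DOWN = "down"


def health_state(container_running: bool, http_ok: bool) -> Health:
    """Combine the two signals into one overall health state."""
    if container_running and http_ok:
        return Health.HEALTHY
    if container_running or http_ok:
        # Exactly one of the two signals is failing.
        return Health.DEGRADED
    return Health.DOWN
```

For example, a running container whose login page does not respond yields `Health.DEGRADED`.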

Instance health is rolled up hourly, giving you:

  • Uptime percentage over time
  • Average and max response times
  • Health state distribution (healthy/degraded/down counts per hour)
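
As a rough illustration of what one hourly health rollup might contain, the sketch below summarises a list of checks. The `rollup_hour` helper and its input shape are assumptions, as is the definition of uptime as the share of checks in the healthy state; the exact formula Deploy Monkey uses is not stated here.

```python
from collections import Counter
from statistics import mean


def rollup_hour(checks):
    """Summarise one hour of health checks.

    `checks` is a list of (state, response_ms) tuples, where state is
    "healthy", "degraded", or "down" and response_ms is None when the
    HTTP probe got no response.
    """
    if not checks:
        raise ValueError("at least one check is required")
    counts = Counter(state for state, _ in checks)
    times = [ms for _, ms in checks if ms is not None]
    return {
        # Assumption: only fully healthy checks count towards uptime.
        "uptime_percent": 100.0 * counts["healthy"] / len(checks),
        "avg_response_ms": mean(times) if times else None,
        "max_response_ms": max(times) if times else None,
        "counts": {s: counts[s] for s in ("healthy", "degraded", "down")},
    }
```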

Deploy Monkey tracks Odoo worker process activity to help you make scaling decisions.

| Metric | Description |
| --- | --- |
| Configured Workers | Worker count set in instance settings |
| Detected Processes | Total Odoo/Python processes in the container |
| Busy Workers | Processes that consumed CPU in the measurement window |
| Usage % | Busy workers ÷ configured workers × 100 |

Worker data is collected during the regular server metrics SSH pass — no additional connections needed.
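
The Usage % formula from the table above can be worked through in a few lines. This sketch assumes that "consumed CPU in the measurement window" is detected by comparing cumulative CPU time per process between two snapshots; the snapshot format and both function names are hypothetical.

```python
def busy_workers(before, after):
    """Count processes whose CPU time grew between two snapshots.

    `before` and `after` map pid -> cumulative CPU ticks. Processes that
    only appear in `after` are skipped because no baseline exists.
    """
    return sum(
        1
        for pid, ticks in after.items()
        if pid in before and ticks > before[pid]
    )


def usage_percent(busy, configured):
    """Busy workers / configured workers * 100, guarding against zero."""
    if configured <= 0:
        return 0.0
    return busy / configured * 100
```

With 4 configured workers and 2 busy processes the result is 50%. A value that stays near 100% suggests the configured worker count may be too low for the workload.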

Deploy Monkey evaluates alert rules against your metrics and health data:

| Alert | Default Threshold |
| --- | --- |
| CPU > threshold | 80% sustained |
| Memory > threshold | 85% sustained |
| Disk > threshold | 90% |
| High load | Load > 2× CPU cores |
| Instance down | Container + HTTP both failing |
| Instance degraded | One health signal failing |
| Slow responses | Response time > 5000 ms |

Each alert moves through three states:

  1. Open — threshold breached, notification sent
  2. Acknowledged — operator has seen it
  3. Resolved — condition recovered (auto or manual)
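
The default thresholds and the lifecycle above can be sketched as follows. Several details are assumptions: the metric key names, the use of the 1-minute load average for the high-load rule, and the set of allowed transitions (including open directly to resolved for automatic recovery). The "sustained" window for CPU and memory is not specified, so this sketch checks a single reading only.

```python
# Default thresholds from the table above; key names are hypothetical.
DEFAULT_THRESHOLDS = {
    "cpu_percent": 80.0,
    "memory_percent": 85.0,
    "disk_percent": 90.0,
    "response_ms": 5000.0,
}
LOAD_FACTOR = 2.0  # high load means load > 2x CPU cores

# Assumed lifecycle transitions; resolved alerts are final.
ALLOWED_TRANSITIONS = {
    "open": {"acknowledged", "resolved"},
    "acknowledged": {"resolved"},
    "resolved": set(),
}


def breached_rules(metrics, cpu_cores):
    """Return the names of rules breached by one metrics reading."""
    breached = [
        key
        for key, limit in DEFAULT_THRESHOLDS.items()
        if key in metrics and metrics[key] > limit
    ]
    if "load_1m" in metrics and metrics["load_1m"] > LOAD_FACTOR * cpu_cores:
        breached.append("high_load")
    return breached


def transition(current, new):
    """Move an alert to a new lifecycle state, rejecting invalid moves."""
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move alert from {current} to {new}")
    return new
```

On a 4-core server, a load of 9.5 breaches the high-load rule (9.5 > 8), while disk usage of exactly 90% does not breach the disk rule because the comparison is strict.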

Alerts are visible in the Alerts section of the control panel and as notifications in the bell menu.

| Data Type | Retention |
| --- | --- |
| Raw metric samples | 14 days |
| Hourly rollups | 180 days |
| Resolved alerts | 180 days |
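
For anyone exporting data before it ages out, the retention periods above can be expressed as a simple check. The `RETENTION` keys and the `is_expired` helper are hypothetical, and measuring a resolved alert's age from its resolution time is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Retention periods from the table above; key names are hypothetical.
RETENTION = {
    "raw_sample": timedelta(days=14),
    "hourly_rollup": timedelta(days=180),
    "resolved_alert": timedelta(days=180),
}


def is_expired(data_type, recorded_at, now):
    """Return True when a record is older than its retention period."""
    return now - recorded_at > RETENTION[data_type]
```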