Dashboard
Overview of all bot sessions and activity
Active Sessions
3
across 2 bots
Today's Sessions
11
+4 from yesterday
Tokens Used
842K
$6.38 estimated
Avg Session
23m
47 turns avg
Bots
concierge

Signal assistant · Opus
Active
Sessions
6
Tokens
412K
Context
34%
historian

Knowledge curator · Sonnet
Idle
Sessions
2
Tokens
158K
Context
12%
forge

Engineering peer · Codex
Active
Sessions
3
Tokens
246K
Context
67%
curator

Content organizer · Haiku
Offline
Sessions
0
Tokens
26K
Context
--
Recent Sessions

forge
Implementing health check v2 with detailed diagnostics
Started 10:48 AM · 43 minutes · 68 turns · 124K tokens · $0.94
Context
67%
Max
The health check script in colony needs a v2. Current one just checks if processes are running. I want it to also verify: MCP server connectivity, colony message round-trip latency, and disk space on the data partition. Put it at colony/scripts/health-check-v2.ts.
forge
Thinking · 3.2s
The existing health check is at colony/scripts/health-check.ts. I should read it first to understand the current approach, then extend it with the three new checks Max wants. MCP connectivity can be tested by attempting a colony_heartbeat equivalent. For latency, I can do a round-trip message send/receive. Disk space is straightforward with statfs or df parsing.
Read colony/scripts/health-check.ts · Grep "health" in colony/
Read File
colony/scripts/health-check.ts
72 lines · 1.8 KB · read in 12ms

Read the existing health check. It's a simple process-alive check using pgrep for each bot name. Straightforward to extend.

I'll structure v2 with individual check functions that each return a result object, then aggregate them into a single report. Here's my plan:

1. Process checks -- keep existing behavior
2. MCP connectivity -- attempt a WebSocket handshake to the colony MCP server
3. Message latency -- send a ping through colony, measure round-trip
4. Disk space -- parse df output for the data partition
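
As a rough illustration of step 4, the sketch below pulls the usage percentage for one mount point out of `df` output. It assumes the POSIX portable format produced by `df -P` (six columns, with "Capacity" fifth and "Mounted on" last). The `parseDfUsage` name, the `DiskUsage` type, and the mount point passed in are hypothetical, not taken from the actual script.

```typescript
// Hypothetical helper: extract the usage percentage for one mount point
// from `df -P` output. Returns null if the mount is absent or unparsable.
type DiskUsage = { mount: string; usedPercent: number };

function parseDfUsage(dfOutput: string, mount: string): DiskUsage | null {
  // The first line is the column header, so skip it.
  const rows = dfOutput.trim().split("\n").slice(1);
  for (const row of rows) {
    const cols = row.trim().split(/\s+/);
    // Expected columns: Filesystem, blocks, Used, Available, Capacity, Mounted on
    if (cols.length < 6) continue;
    // Rejoin the tail so mount points containing spaces still match.
    const mountedOn = cols.slice(5).join(" ");
    if (mountedOn !== mount) continue;
    // parseInt stops at the trailing "%", so "62%" parses as 62.
    const usedPercent = Number.parseInt(cols[4], 10);
    return Number.isNaN(usedPercent) ? null : { mount, usedPercent };
  }
  return null;
}
```

One limitation of this sketch: a filesystem name that itself contains spaces would shift the columns, so it only suits hosts where device names are plain.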

10:51 AM
Max
Perfect. For the latency check, don't actually send a colony message -- just measure the time to connect to the MCP stdio transport and get a response to a ping. Keep it self-contained.
forge
Write colony/scripts/health-check-v2.ts
Write File
colony/scripts/health-check-v2.ts
148 lines written

Done. The script exports a runHealthCheck() function and also works as a CLI entrypoint. Here's the structure:

type CheckResult = {
  name: string
  status: "pass" | "warn" | "fail"
  message: string
  durationMs: number
}

type HealthReport = {
  timestamp: string
  checks: CheckResult[]
  overall: "healthy" | "degraded" | "unhealthy"
}

Each check runs independently with a 5-second timeout. The MCP connectivity test spawns the stdio transport, sends an initialize request, and measures time-to-response. Disk space warns at 80%, fails at 95%.
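
A minimal sketch of how the per-check timeout and the disk thresholds described above could look. This is not the contents of health-check-v2.ts: `runCheck`, `runAll`, `classifyDiskUsage`, and `CheckOutcome` are hypothetical names, and treating the 80% and 95% boundaries as inclusive is an assumption.

```typescript
type CheckStatus = "pass" | "warn" | "fail";
type CheckResult = { name: string; status: CheckStatus; message: string; durationMs: number };
// Hypothetical: what an individual check returns before timing is attached.
type CheckOutcome = { status: CheckStatus; message: string };

// Run one check, converting a timeout or a thrown error into a "fail" result.
async function runCheck(
  name: string,
  check: () => Promise<CheckOutcome>,
  timeoutMs = 5000,
): Promise<CheckResult> {
  const started = Date.now();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<CheckOutcome>((resolve) => {
    timer = setTimeout(
      () => resolve({ status: "fail", message: `timed out after ${timeoutMs}ms` }),
      timeoutMs,
    );
  });
  try {
    // Whichever settles first wins; the slower promise is simply ignored.
    const outcome = await Promise.race([check(), timeout]);
    return { name, ...outcome, durationMs: Date.now() - started };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { name, status: "fail", message, durationMs: Date.now() - started };
  } finally {
    clearTimeout(timer); // avoid keeping the process alive after a fast check
  }
}

// Assumed inclusive boundaries: warn at >= 80%, fail at >= 95%.
function classifyDiskUsage(usedPercent: number): CheckStatus {
  if (usedPercent >= 95) return "fail";
  if (usedPercent >= 80) return "warn";
  return "pass";
}

// Run every check concurrently so one slow check cannot delay the others.
function runAll(
  checks: Array<{ name: string; run: () => Promise<CheckOutcome> }>,
): Promise<CheckResult[]> {
  return Promise.all(checks.map((c) => runCheck(c.name, c.run)));
}
```

Note that `Promise.race` only stops waiting for a slow check; it does not cancel the underlying work, so a check that spawns a child process would still need to kill it on timeout.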

forge
Bash bun run colony/scripts/health-check-v2.ts
Bash Output
Colony Health Check v2
======================

[PASS]  process/concierge     running (pid 48291)        2ms
[PASS]  process/forge          running (pid 48305)        1ms
[WARN]  process/historian      not running                1ms
[PASS]  process/curator        not expected               0ms

[PASS]  mcp/connectivity       connected in 127ms       127ms
[PASS]  mcp/latency            ping: 34ms                34ms

[PASS]  disk/data              62% used (186G / 300G)     8ms
[PASS]  disk/tmp               14% used (2.1G / 15G)     3ms

Overall: HEALTHY (7/8 pass, 1 warn)

All checks pass except the expected historian warning (it's offline per the bot reduction). The MCP round-trip came in at 34ms, which is solid for a stdio transport.

Script is at colony/scripts/health-check-v2.ts and ready to use. Want me to wire it into the cron schedule or update the colony tools reference?

10:58 AM
Max
Looks great. Yeah, update the colony tools reference in forge's identity docs to point to v2 as well. And commit it.
forge
Edit colony-tools.md · Bash git add && git commit

Updated the colony tools reference to include v2 and committed both files:

feat: add health check v2 with MCP, latency, and disk checks

Co-Authored-By: forge

Two files changed: colony/scripts/health-check-v2.ts (new) and colony/identities/forge/colony-tools.md (updated reference).