Architecture
Botfather's design has one constraint: SLAW instances stay sovereign. Everything else follows from that.
The push model
Every connection originates from a SLAW instance. Instances push outbound over HTTPS; the tower never initiates a connection into the instance network. This means instances work behind NAT, VPN, or corporate proxies without firewall exceptions.
Four message types flow instance → tower:
| Message | Frequency | Purpose |
|---|---|---|
| enroll / enroll/poll | Startup | Self-register; poll until approved; receive API key |
| heartbeat | ~60 s | Liveness signal + summary counts and spend |
| sync | ~60 s, only when deltas exist | Entity upserts and cost/run fact events |
| manifest | Nightly or on demand | Full reconciliation checksum |
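As a rough illustration of the cadence in the table, the sketch below decides which messages to push on one reporting tick. It assumes heartbeat and sync share a single ~60 s tick, which the table does not state; `TickInput`, `planTick`, and the field names are hypothetical.

```typescript
// Hypothetical sketch: message names come from the table above; everything
// else (types, field names, the shared tick) is an assumption.
type PeriodicMessage = "heartbeat" | "sync" | "manifest";

interface TickInput {
  pendingDeltas: number; // entity upserts and fact events not yet sent
  manifestDue: boolean;  // true once per night, or when requested on demand
}

// Returns the messages to push on this tick, in send order.
function planTick(input: TickInput): PeriodicMessage[] {
  const plan: PeriodicMessage[] = ["heartbeat"]; // liveness goes out every tick
  if (input.pendingDeltas > 0) {
    plan.push("sync"); // skipped entirely when nothing changed
  }
  if (input.manifestDue) {
    plan.push("manifest");
  }
  return plan;
}
```

With no deltas and no manifest due, `planTick` returns only `["heartbeat"]`, so an idle instance produces one small request per tick.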
The heartbeat response is the back-channel: the tower delivers directives to the instance (for example, set_limits to enforce an enterprise budget ceiling) without ever opening an inbound connection.
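A minimal sketch of how an instance might apply directives carried in a heartbeat response. Only the directive name `set_limits` comes from the text; the response shape, the `payload.budgetCeilingUsd` field, and `applyDirectives` are assumptions made for illustration.

```typescript
// Hypothetical shapes: the real heartbeat response schema is not shown here.
interface Directive {
  type: string;
  payload?: Record<string, unknown>;
}

interface HeartbeatResponse {
  directives?: Directive[];
}

interface InstanceLimits {
  budgetCeilingUsd: number | null; // null means no enterprise ceiling applied
}

// Applies tower directives to local limits and returns the new limits.
function applyDirectives(
  current: InstanceLimits,
  response: HeartbeatResponse,
): InstanceLimits {
  let next: InstanceLimits = { ...current };
  for (const directive of response.directives ?? []) {
    if (directive.type === "set_limits") {
      const ceiling = directive.payload?.["budgetCeilingUsd"];
      // Ignore malformed values rather than failing the heartbeat cycle.
      if (typeof ceiling === "number" && ceiling >= 0) {
        next = { ...next, budgetCeilingUsd: ceiling };
      }
    }
    // Unknown directive types are skipped, so an older instance can
    // tolerate a newer tower (an assumed compatibility policy).
  }
  return next;
}
```

Because the directive arrives in the response to a request the instance itself made, nothing in this path needs an inbound listener.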
The sovereignty boundary
The Reporter module inside each SLAW instance decides what to emit before anything leaves.
| Synced to tower | Stays on the instance |
|---|---|
| Squad and agent names, roles, status | Agent adapter configuration and secrets |
| Issue titles and statuses | Issue bodies and comments |
| Token counts and cost telemetry | Run logs and output |
| Instance health and enrollment state | Skill definition content |
Issue titles are included by default so admins can see what the fleet is working on. Set reportIssueTitles: false in the instance config to send only IDs and statuses.
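One way to picture the boundary for issues is an allowlist projection: the outgoing record is built field by field, so bodies and comments cannot leak by accident. This is a sketch under assumed type names (`LocalIssue`, `SyncedIssue`, `toSyncedIssue`); only the `reportIssueTitles` setting comes from the text.

```typescript
// Hypothetical local and outgoing shapes for an issue.
interface LocalIssue {
  id: string;
  title: string;
  status: string;
  body: string;       // stays on the instance
  comments: string[]; // stays on the instance
}

interface SyncedIssue {
  id: string;
  status: string;
  title?: string; // present only when reportIssueTitles is true
}

interface ReporterConfig {
  reportIssueTitles: boolean;
}

// Builds the outgoing record from an explicit allowlist of fields.
function toSyncedIssue(issue: LocalIssue, config: ReporterConfig): SyncedIssue {
  const out: SyncedIssue = { id: issue.id, status: issue.status };
  if (config.reportIssueTitles) {
    out.title = issue.title;
  }
  return out;
}
```

Copying named fields, rather than deleting sensitive ones from a clone, means a field added to `LocalIssue` later stays private unless someone deliberately adds it to the projection.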
Enrollment and the startup gate
Instances enroll themselves — there is no pre-shared secret to distribute to users. The flow:
- On startup, the instance sends POST /api/ingest/v1/enroll with its machine identity; no token required
- The instance appears as pending in the tower's Approval Queue
- An admin approves it, or an auto-approve rule matches (for example, any machine matching *-ENG-*)
- The instance polls, receives a per-instance API key, and moves to active
- The SLAW startup gate unlocks; the app becomes usable
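The flow above can be sketched as a small state machine. The enroll endpoint and the pending and active states come from the text; the poll path (inferred from the `enroll/poll` message name), the reply shape, and the function names are assumptions.

```typescript
// Hypothetical state model for the instance side of enrollment.
type EnrollmentState =
  | { phase: "unenrolled" }
  | { phase: "pending" }
  | { phase: "active"; apiKey: string };

// Assumed reply shape; the real response schema is not shown here.
type TowerReply =
  | { status: "pending" }
  | { status: "active"; apiKey: string };

const ENROLL_PATH = "/api/ingest/v1/enroll";
const POLL_PATH = "/api/ingest/v1/enroll/poll"; // assumption, see lead-in

// Which endpoint the instance should POST to next, or null when done.
function nextEnrollmentRequest(state: EnrollmentState): string | null {
  switch (state.phase) {
    case "unenrolled":
      return ENROLL_PATH;
    case "pending":
      return POLL_PATH;
    case "active":
      return null;
  }
}

// Folds a tower reply into the local enrollment state.
function applyTowerReply(
  state: EnrollmentState,
  reply: TowerReply,
): EnrollmentState {
  if (state.phase === "active") {
    return state; // an active instance is never downgraded by a stale reply
  }
  if (reply.status === "active") {
    return { phase: "active", apiKey: reply.apiKey };
  }
  return { phase: "pending" };
}
```

Keeping the transitions pure leaves the HTTP client, retry timing, and key storage outside the sketch; a real Reporter would wrap these functions in a polling loop.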
Until the instance is active, the SLAW UI shows a blocking enrollment gate. A configured botfather.url signals that the enterprise expects enrollment. The gate fails open for enrolled instances: an instance that has already enrolled keeps working if the tower goes offline. An instance that has never enrolled stays gated until it is approved.
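The gate policy can be written as a small decision function. This sketch assumes that an instance with no botfather.url configured is not gated at all, which the text implies but does not state; `GateInput` and `isStartupGated` are hypothetical names.

```typescript
// Hypothetical inputs to the startup gate decision.
interface GateInput {
  botfatherUrl?: string;   // value of botfather.url, if configured
  enrolled: boolean;       // true once the instance has reached active
  towerReachable: boolean; // current connectivity to the tower
}

// Returns true when the SLAW UI should show the blocking enrollment gate.
function isStartupGated(input: GateInput): boolean {
  if (!input.botfatherUrl) {
    return false; // assumption: no tower configured means no gate
  }
  // towerReachable is deliberately not consulted: enrolled instances fail
  // open when the tower is offline, and never-enrolled instances stay
  // gated whether or not the tower can be reached.
  return !input.enrolled;
}
```

The unused `towerReachable` field is kept in the input to make the point explicit: tower availability never changes the outcome for an instance in either state.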
The Reporter module
The Reporter is a sidecar module inside the existing SLAW instance (server/src/services/botfather/reporter.ts). It is not a separate process — it rides the process lifecycle and reuses the existing database connection. All errors are logged and swallowed; a Reporter failure cannot affect the SLAW instance.
The Reporter spools unsent batches to disk when the tower is unreachable (default cap: 50 MB / 14 days), then drains them in order once connectivity returns. Because cost facts already live in the instance's own database, even a dropped spool is recoverable via the nightly manifest reconciliation.
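A minimal in-memory model of the spool policy, assuming that hitting either cap evicts the oldest batches first and that draining stops at the first failed send to preserve order. The real spool is on disk; `BatchSpool` and its methods are hypothetical, and size is approximated by string length.

```typescript
interface SpooledBatch {
  enqueuedAtMs: number;
  bytes: number;
  payload: string;
}

// In-memory stand-in for the on-disk spool; defaults mirror 50 MB / 14 days.
class BatchSpool {
  private batches: SpooledBatch[] = [];
  private totalBytes = 0;
  private readonly maxBytes: number;
  private readonly maxAgeMs: number;

  constructor(
    maxBytes: number = 50 * 1024 * 1024,
    maxAgeMs: number = 14 * 24 * 60 * 60 * 1000,
  ) {
    this.maxBytes = maxBytes;
    this.maxAgeMs = maxAgeMs;
  }

  enqueue(payload: string, nowMs: number): void {
    const bytes = payload.length; // approximation: UTF-16 code units
    this.batches.push({ enqueuedAtMs: nowMs, bytes, payload });
    this.totalBytes += bytes;
    this.prune(nowMs);
  }

  // Drops oldest batches while the spool is over either cap (assumed policy).
  private prune(nowMs: number): void {
    while (true) {
      const oldest = this.batches[0];
      if (oldest === undefined) return;
      const tooOld = nowMs - oldest.enqueuedAtMs > this.maxAgeMs;
      const overCap = this.totalBytes > this.maxBytes;
      if (!tooOld && !overCap) return;
      this.batches.shift();
      this.totalBytes -= oldest.bytes;
    }
  }

  // Sends batches oldest first; stops at the first failure so order is kept.
  drain(send: (payload: string) => boolean, nowMs: number): number {
    this.prune(nowMs);
    let sent = 0;
    while (true) {
      const next = this.batches[0];
      if (next === undefined || !send(next.payload)) break;
      this.batches.shift();
      this.totalBytes -= next.bytes;
      sent++;
    }
    return sent;
  }
}
```

Evicted batches are simply lost from the spool in this model, which matches the text's point that the nightly manifest reconciliation is what makes a dropped spool recoverable.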
Next steps
- Reporting protocol — the full sync payload schema
- Identity & Keys — API key lifecycle, rotation, and revocation
- What is synced — the complete sync inventory