Production foundations acceptance

Use this checklist to decide whether the Operator Controls and Recovery slice is complete enough to move on to broader multi-node reliability testing.

Control-plane backup and disaster recovery

  • docs/control-plane-backup-and-dr.md is the team’s agreed contract for what lives in Postgres, in volumes, and in host-local state.
  • Operators can describe RPO/RTO for the Postgres tier using their own backup tooling (Kindling does not ship scheduled control-plane backups).
  • A documented restore ordering exists: Postgres → DSN files → kindling serve → validation (see Backup and restore).
  • Volume backup is not mistaken for control-plane backup (see Secrets and volumes).
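
The restore ordering in this section can be sketched as a small runner that executes each stage in sequence and stops at the first failure, so a later stage never runs against a dependency that has not been restored. The stage names follow the checklist; `run_restore`, `RESTORE_ORDER`, and the placeholder stage callables are hypothetical and are not part of Kindling.

```python
from typing import Callable, List, Tuple

# Stage order taken from the checklist:
# Postgres, then DSN files, then kindling serve, then validation.
RESTORE_ORDER = ["postgres", "dsn_files", "kindling_serve", "validation"]

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_restore(stages: List[Stage]) -> List[str]:
    """Run stages in order and return the names of the completed stages.

    Raises ValueError if the stages are not in the documented order, and
    RuntimeError at the first stage that reports failure.
    """
    names = [name for name, _ in stages]
    if names != RESTORE_ORDER:
        raise ValueError(f"expected order {RESTORE_ORDER}, got {names}")
    completed: List[str] = []
    for name, action in stages:
        if not action():
            raise RuntimeError(f"restore stage failed: {name}")
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Placeholder actions; a real drill would call the operator's own tooling here.
    stages = [(name, lambda: True) for name in RESTORE_ORDER]
    print(run_restore(stages))
```

Rejecting an out-of-order stage list up front is a deliberate choice in this sketch: it keeps a drill from starting kindling serve before Postgres and the DSN files are in place.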

Audit logging (cluster-global admin)

  • cluster_audit_events exists after migrate and receives rows for:
    • server.drain / server.activate
    • cluster.settings.update (changed persisted keys are listed in details)
    • auth.provider.update (no secrets in details)
  • Design doc docs/cluster-audit-events.md matches the implemented action names and redaction rules.
  • Operators can query recent events via SQL (read API/UI deferred per design).
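
One way to check the audit items above is to fetch recent rows from cluster_audit_events with your own SQL and pass them through a small verifier. The sketch below assumes each row has been reduced to an (action, details) pair; that row shape, the `SECRET_MARKERS` list, and both helper functions are assumptions made for illustration, and the authoritative action names and redaction rules remain those in docs/cluster-audit-events.md.

```python
from typing import Any, Dict, Iterable, List, Set, Tuple

# Action names listed in the checklist.
EXPECTED_ACTIONS = {
    "server.drain",
    "server.activate",
    "cluster.settings.update",
    "auth.provider.update",
}

# Hypothetical markers for secret-like key names; adjust to the real redaction rules.
SECRET_MARKERS = ("secret", "password", "token", "private_key")

# Assumed row shape: (action name, details mapping).
Event = Tuple[str, Dict[str, Any]]


def missing_actions(events: Iterable[Event]) -> Set[str]:
    """Return the expected action names that never appear in events."""
    seen = {action for action, _ in events}
    return EXPECTED_ACTIONS - seen


def secret_like_keys(details: Dict[str, Any]) -> List[str]:
    """Return sorted top-level keys in details whose names look like secrets."""
    return sorted(
        key for key in details
        if any(marker in key.lower() for marker in SECRET_MARKERS)
    )


if __name__ == "__main__":
    # Sample rows standing in for the result of an operator's SQL query.
    sample = [
        ("server.drain", {"server": "node-a"}),
        ("auth.provider.update", {"provider": "oidc"}),
    ]
    print("missing:", sorted(missing_actions(sample)))
    for action, details in sample:
        if action == "auth.provider.update":
            print("secret-like keys:", secret_like_keys(details))
```

The key-name check only inspects top-level keys, so it is a quick smoke test and not a substitute for reviewing the redaction rules in the design doc.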

Runbooks

Optional drill

  • Restore Postgres to a non-production environment and confirm Kindling starts and passes the multi-server validation checklist in docs/high-availability.md.