Inviolable Design Rules
These rules govern every architecture decision. If a proposal conflicts, the proposal is wrong.
These are not guidelines. Each rule exists because of a real incident: data loss, service outages, or config corruption.
| Rule | Statement | Why |
|---|---|---|
| R1 | dryad-litellm (port 4000) is the ONLY LiteLLM instance. | Dual gateways split model aliases and break Ralph loops. |
| R2 | All AI data on /mnt/ai_engine/. Nothing large on the OS drive. | OS drive fills within weeks with models + Docker data. |
| R3 | 16 GB RAM hard ceiling. Every container needs explicit mem_limit. | Runaway container OOM-kills everything including Ollama mid-inference. |
| R4 | One Redis (dryad-redis). Shared by all stacks. | Multiple Redis instances caused key conflicts and auth confusion. |
| R5 | System Caddy is the only reverse proxy (owns 80/443). | Two Caddy instances exist; always edit /etc/caddy/Caddyfile. |
| R6 | One MCP server per capability. No duplicates. | Duplicate tools cause the model to pick the wrong one. |
| R7 | free-coding-models only via ~/bin/pick-nim-model wrapper. | Direct calls bypass rate limiting, drain API quota. |
| R8 | Cluster hardware arrived. Talos and PXE configs in active development. | Previously blocked. Amended March 2026. |
| R9 | Cross-node calls use Tailscale IPs or MagicDNS. Never LAN IPs. | LAN IPs change. Tailscale IPs are stable and encrypted. |
| R11 | Cluster nodes have NO Tailscale. Access via madhatter subnet routes. | Talos is immutable; installing Tailscale requires custom image builds. |
| R12 | talosctl is the ONLY management tool for Talos. No SSH. | Talos has no SSH server by design. |
| R13 | secrets.yaml backed up to /mnt/backups AND offsite. | Losing secrets.yaml = full cluster rebuild. |
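
Two of the rules above lend themselves to automated checks: R3 (every container has an explicit `mem_limit`) and R9 (cross-node calls never use LAN IPs). The following is a minimal sketch of such a lint, not an existing tool in this setup. It assumes the Compose file has already been parsed into a Python dict, that Tailscale node addresses fall in the IPv4 range 100.64.0.0/10, and that MagicDNS names end in `.ts.net`. The function names, image names, and memory values are hypothetical.

```python
import ipaddress

# Assumption: Tailscale node IPv4 addresses come from the CGNAT range 100.64.0.0/10.
TAILSCALE_NET = ipaddress.ip_network("100.64.0.0/10")


def services_missing_mem_limit(compose: dict) -> list[str]:
    """Return the names of services that have no explicit mem_limit (R3)."""
    services = compose.get("services", {})
    return sorted(
        name for name, spec in services.items()
        if not (spec or {}).get("mem_limit")
    )


def is_allowed_cross_node_host(host: str) -> bool:
    """True for Tailscale IPv4 addresses or MagicDNS names, False otherwise (R9)."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: accept only names with the assumed MagicDNS suffix.
        return host.lower().rstrip(".").endswith(".ts.net")
    # Any IP outside the Tailscale range (LAN, loopback, IPv6) is rejected.
    return addr in TAILSCALE_NET


# Hypothetical parsed Compose data; only the mem_limit key matters here.
compose = {
    "services": {
        "dryad-litellm": {"image": "example/litellm", "mem_limit": "2g"},
        "dryad-redis": {"image": "redis"},
    }
}

print(services_missing_mem_limit(compose))            # ['dryad-redis']
print(is_allowed_cross_node_host("100.101.102.103"))  # True
print(is_allowed_cross_node_host("192.168.1.20"))     # False
```

A check like this only reports violations. It does not verify that the configured limits together stay under the 16 GB ceiling, which would need the limit strings parsed and summed.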