Home › LiteLLM Gateway

🤖 LiteLLM Gateway

Port 4000. Auth required. R1: never add a second instance.

Model Aliases

Alias	Backing Model	VRAM	Use Case
`qwen-coder-fast`	qwen2.5-coder:7b	4.7 GB	Default — Ralph loops, quick tasks
`qwen-coder-deep`	qwen2.5-coder:14b-instruct	9.0 GB (solo)	Deep coding, architecture
`deepseek-reason`	deepseek-r1:14b	9.0 GB (solo)	Reasoning, debugging
`embeddings`	nomic-embed-text	274 MB	RAG embeddings
`nim-deepseek`	nvidia/deepseek-v3.2 (NIM)	Cloud	VRAM-saturated fallback
`qwen3-coder-cloud`	qwen3-coder:480b-cloud	Cloud	Best for Ralph scaffolding + tool-use

Adding a New Model

ollama pull <model-name>
# Edit ~/dryad-brain/litellm-config.yaml — never cat-append
nano ~/dryad-brain/litellm-config.yaml
cd ~/dryad-brain && docker compose restart dryad-litellm
curl http://localhost:4000/v1/models -H "Authorization: Bearer sk-dryad-master" | jq ".data[].id"