LiteLLM Gateway
Port 4000. Auth required. R1: never add a second instance.
Model Aliases
| Alias | Backing Model | VRAM | Use Case |
|---|---|---|---|
| qwen-coder-fast | qwen2.5-coder:7b | 4.7 GB | Default: Ralph loops, quick tasks |
| qwen-coder-deep | qwen2.5-coder:14b-instruct | 9.0 GB (solo) | Deep coding, architecture |
| deepseek-reason | deepseek-r1:14b | 9.0 GB (solo) | Reasoning, debugging |
| embeddings | nomic-embed-text | 274 MB | RAG embeddings |
| nim-deepseek | nvidia/deepseek-v3.2 (NIM) | Cloud | VRAM-saturated fallback |
| qwen3-coder-cloud | qwen3-coder:480b-cloud | Cloud | Best for Ralph scaffolding + tool-use |
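
Any alias in the table can be passed as the `model` field of an OpenAI-compatible request to the gateway. The following is a minimal sketch, assuming the port and key that appear elsewhere on this page; `chat_payload` is a hypothetical helper name used for illustration, not part of LiteLLM.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build an OpenAI-style chat request body for one alias.
# jq handles the JSON escaping of the prompt text.
chat_payload() {
  local model="$1" prompt="$2"
  jq -n --arg model "$model" --arg prompt "$prompt" \
    '{model: $model, messages: [{role: "user", content: $prompt}]}'
}

# Example call (commented out; needs the gateway running on port 4000):
# chat_payload qwen-coder-fast "Write a hello-world in Go" |
#   curl -s http://localhost:4000/v1/chat/completions \
#     -H "Authorization: Bearer sk-dryad-master" \
#     -H "Content-Type: application/json" \
#     -d @- | jq -r '.choices[0].message.content'
```

Building the body with `jq -n` rather than string interpolation keeps quotes and newlines in the prompt from breaking the JSON.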
Adding a New Model
ollama pull <model-name>
# Edit ~/dryad-brain/litellm-config.yaml; never cat-append
nano ~/dryad-brain/litellm-config.yaml
cd ~/dryad-brain && docker compose restart dryad-litellm
curl http://localhost:4000/v1/models -H "Authorization: Bearer sk-dryad-master" | jq ".data[].id"
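
To turn the final check into a pass/fail step, the `/v1/models` response can be tested for one specific alias. This is a sketch, assuming the response has the `data[].id` shape that the `jq` filter above relies on; `has_alias` is a hypothetical helper, not a LiteLLM command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a /v1/models JSON body on stdin and
# exit 0 if the given alias is listed, non-zero otherwise.
has_alias() {
  local alias="$1"
  # jq -e maps a final output of `false` to exit status 1.
  jq -e --arg id "$alias" 'any(.data[]; .id == $id)' > /dev/null
}

# Example call (commented out; needs the gateway running on port 4000):
# curl -s http://localhost:4000/v1/models \
#   -H "Authorization: Bearer sk-dryad-master" |
#   has_alias qwen-coder-fast && echo "registered" || echo "missing"
```

If the alias is reported as missing after the restart, the usual first suspect is the edited entry in `litellm-config.yaml`.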