Agent Mesh — System Architecture

Multi-model agentic workflow · FastAPI + SSE · Provider-adapter routing · mesh.venacafe.dev

Diagram Components

  • Edge labels: HTTPS · SSE stream · POST /api/run · chat/completions · /messages · reads · refresh · executes · frozen run config · plan · optional · optional · always
  • Render Deployment: render.yaml · healthCheckPath: /health
  • FastAPI: uvicorn · main.py
  • Agent Pipeline: Sequential · text-in/text-out · no tools · each call routed via Provider Adapters → active LLM
  • Browser: mesh.venacafe.dev
  • Cloudflare: DNS · TLS edge
  • Render: Python web svc
  • HTTP Basic Auth: require_auth() · secrets.compare_digest · MESH_USER · MESH_PASS
  • API Routes: POST /api/run · GET /api/stream/{id} · /agents · /models · /providers
  • Workflow Engine: run_workflow() · 5-agent pipeline · cancel on disconnect
  • Task Store: In-memory dict · TTL eviction (10m) · events[] · cancelled
  • Provider Adapters: build_adapter() · per-call · Aggregator · Anthropic · OpenAI · normalize_model_id()
  • NousAuth: Reads Hermes auth.json · Token refresh via portal · asyncio.Lock · auto-expiry
  • SSE Stream: text/event-stream · 40ms poll · disconnect → cancel
  • Model Catalog: Live fetch + fallback · 190+ models · tier detection
  • Lifespan: Startup → model cache · make_run_config() snapshot
  • Hermes+ (Nous): inference-api.nous
  • OpenRouter: openrouter.ai/v1
  • Anthropic Direct: /v1/messages
  • OpenAI Direct: api.openai.com/v1
  • auth.json (local): ~\AppData\Local\hermes\
  • Nous Portal: portal.nousresearch.com
  • 🧠 Orchestrator: Claude Haiku 4.5 · JSON plan · 1024 tok · picks the team
  • 🔍 Researcher: Llama 3.1 8B · facts · 4096 tok · from training data
  • 💻 Coder: Qwen 2.5 Coder 32B · code · 4096 tok · no execution
  • 📊 Analyst: Llama 3.3 70B · reasoning · 4096 tok · no tools
  • ✍️ Writer: Claude Sonnet 4.5 · synthesizes · 8192 tok · final deliverable
  • Legend: Frontend (Browser) · Backend (FastAPI) · Cloud / Provider · Security (Auth) · SSE Stream · Local / External · Solid = data flow · Dashed = optional/async
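The HTTP Basic Auth node in the diagram names require_auth(), secrets.compare_digest, and the MESH_USER / MESH_PASS settings. A minimal sketch of how such a check might work is shown here, using only the standard library. The function name check_basic_auth and its signature are hypothetical; the actual require_auth() in main.py is not shown in this document and may differ.

```python
import base64
import binascii
import secrets


def check_basic_auth(header: str, expected_user: str, expected_pass: str) -> bool:
    """Validate an 'Authorization: Basic ...' header value (hypothetical helper)."""
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # Evaluate both comparisons before combining them, so a wrong username
    # does not short-circuit the password comparison.
    user_ok = secrets.compare_digest(user.encode(), expected_user.encode())
    pass_ok = secrets.compare_digest(password.encode(), expected_pass.encode())
    return user_ok and pass_ok
```

In a deployment like the one described, the expected values would presumably come from the MESH_USER and MESH_PASS environment variables. secrets.compare_digest is used instead of == to reduce timing side channels.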

Pipeline Pattern

  • Fixed DAG: Orchestrator → Specialists → Writer
  • Dynamic node selection: the orchestrator picks the team per task
  • Sequential execution with context chaining (600-char truncation)
  • Writer always runs last and receives the full, untruncated outputs
  • No tools: text-in/text-out only, no web/code/files/memory
  • Run config frozen at start: concurrency-safe across sessions
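
The context-chaining rule above can be sketched as follows. This is an illustration under assumptions, not the code behind run_workflow(): the names run_pipeline, build_context, and call_agent are hypothetical, and it assumes the 600-character limit applies to each prior agent's output when passed to the next specialist.

```python
from typing import Callable, Dict, List

CONTEXT_LIMIT = 600  # characters of each prior output shown to later specialists


def build_context(outputs: Dict[str, str], for_writer: bool) -> str:
    """Join prior outputs; truncate each one unless the reader is the writer."""
    parts = []
    for name, text in outputs.items():
        body = text if for_writer else text[:CONTEXT_LIMIT]
        parts.append(f"[{name}]\n{body}")
    return "\n\n".join(parts)


def run_pipeline(
    task: str,
    team: List[str],
    call_agent: Callable[[str, str, str], str],
) -> str:
    """Run the chosen specialists in order, then the writer on full outputs.

    call_agent(name, task, context) stands in for one routed LLM call.
    """
    outputs: Dict[str, str] = {}
    for name in team:
        context = build_context(outputs, for_writer=False)
        outputs[name] = call_agent(name, task, context)
    # The writer always runs last and sees every output untruncated.
    return call_agent("writer", task, build_context(outputs, for_writer=True))
```

Here team would be the list of specialists selected by the orchestrator's plan. Truncating only the specialist-to-specialist context keeps prompts short while the writer still works from complete material.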

Provider Routing

  • 4 adapter classes, per-call construction via build_adapter()
  • AggregatorAdapter: OpenRouter + Nous (OpenAI-compatible /chat/completions)
  • AnthropicDirectAdapter: native /v1/messages, content_block_delta parsing
  • OpenAIDirectAdapter: strips openai/ prefix, filters non-chat models
  • NousAuth reads Hermes CLI auth.json, auto-refreshes via portal OAuth
  • Model catalog: live fetch at startup, FALLBACK_MODELS on failure
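
A rough sketch of per-call adapter construction is given below. Only the three adapter class names, build_adapter(), normalize_model_id(), and the two endpoint paths come from the notes above. The BaseAdapter class, the provider keys ("openrouter", "nous", "anthropic", "openai"), and the constructor signature are assumptions made for illustration. Base URLs and HTTP calls are deliberately left out.

```python
class BaseAdapter:
    """Hypothetical common base; the real class hierarchy is not documented here."""

    path = "/chat/completions"  # OpenAI-compatible default

    def __init__(self, provider: str) -> None:
        self.provider = provider

    def normalize_model_id(self, model_id: str) -> str:
        return model_id


class AggregatorAdapter(BaseAdapter):
    """Serves OpenRouter and Nous through the OpenAI-compatible endpoint."""


class AnthropicDirectAdapter(BaseAdapter):
    path = "/v1/messages"  # native Anthropic endpoint


class OpenAIDirectAdapter(BaseAdapter):
    def normalize_model_id(self, model_id: str) -> str:
        # Catalog IDs such as "openai/<model>" lose the vendor prefix.
        return model_id.removeprefix("openai/")


# Provider keys are illustrative, not taken from the source.
_ADAPTERS = {
    "openrouter": AggregatorAdapter,
    "nous": AggregatorAdapter,
    "anthropic": AnthropicDirectAdapter,
    "openai": OpenAIDirectAdapter,
}


def build_adapter(provider: str) -> BaseAdapter:
    """Construct a fresh adapter for a single call."""
    try:
        adapter_cls = _ADAPTERS[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    return adapter_cls(provider)
```

Building a new adapter for every call means no adapter state is shared between runs, which fits the frozen run config described under Pipeline Pattern.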

Data Flow & Lifecycle

  • SSE events: token, status, connection, plan, stream_start/end, done
  • In-memory task store with TTL eviction (10 min after done)
  • Client disconnect sets cancelled flag → workflow self-terminates
  • Per-agent max_tokens: Orchestrator 1024 · Specialists 4096 · Writer 8192
  • Agent accent colors match the warm bistro canvas palette
  • Render auto-deploys from GitHub master on push
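
The task-store lifecycle above (an events list, a cancelled flag, eviction 10 minutes after completion) might look roughly like the sketch below. All names here (Task, TaskStore, sse_format, run_steps) are hypothetical, and the wire format with a named event plus a JSON data line is an assumption. The source only lists the event types.

```python
import json
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

TTL_SECONDS = 600  # evict 10 minutes after a task finishes


@dataclass
class Task:
    events: List[str] = field(default_factory=list)
    cancelled: bool = False
    done_at: Optional[float] = None


def sse_format(event: str, data: Any) -> str:
    """Serialize one Server-Sent Events frame with a named event."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


class TaskStore:
    def __init__(self) -> None:
        self.tasks: Dict[str, Task] = {}

    def create(self, task_id: str) -> Task:
        task = self.tasks[task_id] = Task()
        return task

    def evict_expired(self, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        expired = [
            tid for tid, t in self.tasks.items()
            if t.done_at is not None and now - t.done_at >= TTL_SECONDS
        ]
        for tid in expired:
            del self.tasks[tid]


def run_steps(task: Task, steps: List[str], now: Optional[float] = None) -> None:
    """Emit one status event per step; stop early once the task is cancelled."""
    for step in steps:
        if task.cancelled:  # set by the stream handler on client disconnect
            break
        task.events.append(sse_format("status", {"step": step}))
    if not task.cancelled:
        task.events.append(sse_format("done", {}))
    task.done_at = time.monotonic() if now is None else now
```

In this sketch the stream handler would poll task.events and set task.cancelled = True when the client disconnects, so the workflow stops at its next check rather than being killed from outside.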