Connect OmniDeploy to Claude Desktop
OmniDeploy is the first AI inference router with native Model Context Protocol (MCP) support. Add it as a tool inside Claude Desktop, Cursor, Cline, or any MCP client, and your agent can route inference, query pricing, and list providers programmatically.
Install mcp-proxy
OmniDeploy speaks MCP over HTTP/JSON-RPC 2.0. Most MCP clients today talk to servers over stdio, so mcp-proxy (open source) bridges the client's stdio to OmniDeploy's HTTP endpoint.
# pick one
npm install -g mcp-proxy
# or
pip install mcp-proxy
Edit your Claude Desktop config
On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
On Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "omnideploy": {
      "command": "mcp-proxy",
      "args": [
        "--transport", "streamablehttp",
        "https://omnideployservice.online/mcp"
      ],
      "env": {
        "MCP_AUTH_HEADER": "X-API-Key: <YOUR_OMNIDEPLOY_KEY>"
      }
    }
  }
}
⚠ Replace <YOUR_OMNIDEPLOY_KEY> with your real OmniDeploy key (starts with omni_live_). Don't have one? Get one in 30 seconds.
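If Claude Desktop already has other MCP servers configured, pasting the block above over the file would drop them. The following is a minimal sketch of merging the omnideploy entry into an existing config instead. The function names are hypothetical helpers, not part of OmniDeploy or Claude Desktop, and the sketch assumes the file paths and entry shape shown in this guide.

```python
import json
import os
import sys
from pathlib import Path

CONFIG_NAME = "claude_desktop_config.json"


def claude_config_path(platform: str = sys.platform) -> Path:
    """Return the config path listed in this guide for macOS or Windows."""
    if platform == "darwin":
        return Path.home() / "Library" / "Application Support" / "Claude" / CONFIG_NAME
    if platform == "win32":
        return Path(os.environ["APPDATA"]) / "Claude" / CONFIG_NAME
    raise RuntimeError("this guide only lists macOS and Windows config paths")


def add_omnideploy(config: dict, api_key: str) -> dict:
    """Return a copy of config with the omnideploy server added.

    Servers that are already configured are kept as they are.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["omnideploy"] = {
        "command": "mcp-proxy",
        "args": [
            "--transport", "streamablehttp",
            "https://omnideployservice.online/mcp",
        ],
        "env": {"MCP_AUTH_HEADER": f"X-API-Key: {api_key}"},
    }
    merged["mcpServers"] = servers
    return merged


def write_config(path: Path, api_key: str) -> None:
    """Read the config at path (if it exists), merge the entry, write it back."""
    current = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = add_omnideploy(current, api_key)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
```

Calling write_config(claude_config_path(), "<YOUR_OMNIDEPLOY_KEY>") would update the file in place; back up the original first, since the sketch overwrites it.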
Restart Claude Desktop and try it
Quit Claude Desktop completely, then reopen it. In a new chat, ask:
“What OmniDeploy tools do you have? Use route_inference to ask llama-3.1-8b-instant for a haiku about Mumbai.”
Or test the MCP endpoint directly
You don't need an MCP client at all: the server speaks plain JSON-RPC 2.0 over HTTP.
# Discover what tools OmniDeploy exposes
curl -X POST https://omnideployservice.online/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'

# Call route_inference (auth required for actual inference)
curl -X POST https://omnideployservice.online/mcp \
  -H "Content-Type: application/json" \
  -H "X-API-Key: <YOUR_OMNIDEPLOY_KEY>" \
  -d '{
    "jsonrpc":"2.0","id":2,"method":"tools/call",
    "params":{
      "name":"route_inference",
      "arguments":{
        "model":"llama-3.1-8b-instant",
        "messages":[{"role":"user","content":"Hello"}]
      }
    }
  }'
Cursor, Cline, Continue, Zed
All MCP-compatible clients accept the same JSON config. In Cursor: Settings → Model Context Protocol → Add server. In Cline (VS Code): edit cline_mcp_settings.json.
What your agent can do once connected
- Send a chat-style prompt. OmniDeploy picks the cheapest provider that meets your policy and returns cost and latency in the response metadata.
- Query live per-token pricing across all 13 providers, optionally filtered by model name. Useful for cost-aware agents.
- Enumerate every provider OmniDeploy can route to, along with the models each one supports.
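For a custom agent, the capabilities above are reachable with the same JSON-RPC 2.0 calls as the curl examples. The following is a rough sketch using only the Python standard library. The helper names are hypothetical, and it assumes the endpoint answers each POST with a single JSON object; only the tools/list and tools/call methods and the route_inference argument shape come from this guide.

```python
import json
import urllib.request

MCP_URL = "https://omnideployservice.online/mcp"


def build_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request object."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        payload["params"] = params
    return payload


def route_inference_request(model, prompt, request_id=2):
    """Build a tools/call request for route_inference with one user message."""
    arguments = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    params = {"name": "route_inference", "arguments": arguments}
    return build_request("tools/call", params, request_id)


def unwrap(response):
    """Return the 'result' member, or raise if the server sent a JSON-RPC error."""
    if "error" in response:
        err = response["error"]
        raise RuntimeError(f"JSON-RPC error {err.get('code')}: {err.get('message')}")
    return response["result"]


def post(payload, api_key=None, timeout=30.0):
    """POST a request to the MCP endpoint and return the unwrapped result.

    Assumption: the response body is one JSON object, not an event stream.
    """
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Per this guide, actual inference calls need the X-API-Key header.
        headers["X-API-Key"] = api_key
    req = urllib.request.Request(
        MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return unwrap(json.loads(resp.read().decode("utf-8")))
```

A typical flow would call post(build_request("tools/list")) first to read the tool names and input schemas, then post(route_inference_request("llama-3.1-8b-instant", "Hello"), api_key="<YOUR_OMNIDEPLOY_KEY>") to run a prompt.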
Works in any MCP client
Same JSON-RPC 2.0 endpoint, same three tools. Works in:
- Claude Desktop
- Cursor
- Cline (VS Code)
- Continue.dev
- Zed AI
- Custom agents
Don't have an OmniDeploy API key yet?
Provision one in 30 seconds with just an email address; no password is required.