LocaLLama MCP Server


An MCP Server that works with Roo Code/Cline.Bot/Claude Desktop to optimize costs by intelligently routing coding tasks between local LLMs, free APIs, and paid APIs.

Catalog only · STDIO

Overview

LocaLLama MCP Server is a tool designed to optimize costs by intelligently routing coding tasks between local LLMs and paid APIs, working with Roo Code, Cline.Bot, and Claude Desktop.

To use the server, clone the repository, install dependencies, configure your environment variables, and start the server. Integrate it with Cline.Bot or Roo Code for enhanced functionality.
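The setup steps above can be sketched as the following shell session. This is an illustrative sketch assuming a standard Node.js/npm workflow; the `.env.example` file and the `npm start` script are assumptions, not confirmed details of the repository.

```shell
# Sketch of the setup flow (assumes Node.js and npm are installed).
git clone https://github.com/Heratiki/locallama-mcp.git
cd locallama-mcp

# Install dependencies.
npm install

# Configure environment variables (assumes an example file is provided).
cp .env.example .env
# ...edit .env with your local LLM endpoints and API keys...

# Start the server (assumes a "start" script in package.json).
npm start
```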

Key features:

  • Cost & Token Monitoring Module for real-time data on API usage and costs.
  • Decision Engine that dynamically decides whether to use local or paid APIs based on cost and quality.
  • API Integration for seamless interaction with local LLMs and OpenRouter.
  • Fallback & Error Handling mechanisms to ensure reliability.
  • Comprehensive Benchmarking System for performance comparison.

Common use cases:

  1. Reducing costs by offloading tasks to local LLMs when appropriate.
  2. Integrating with Cline.Bot for enhanced coding assistance.
  3. Benchmarking local models against paid APIs for performance insights.
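The core cost/quality tradeoff made by a decision engine like the one described above can be sketched as a small routing function. This is a minimal illustration, not the server's actual API: the interface, function name, and thresholds are all hypothetical.

```typescript
// Hypothetical sketch of cost/quality routing; none of these names
// come from the locallama-mcp codebase.
interface TaskProfile {
  estimatedTokens: number;      // rough size of the coding task
  localQualityScore: number;    // benchmarked local-model quality (0..1)
  requiredQuality: number;      // minimum acceptable quality (0..1)
  paidCostPer1kTokens: number;  // paid-API price in USD, e.g. via OpenRouter
}

function chooseProvider(task: TaskProfile): "local" | "paid" {
  // Prefer the free local model whenever its benchmarked quality
  // meets the task's quality bar; otherwise pay for a stronger model.
  if (task.localQualityScore >= task.requiredQuality) {
    return "local";
  }
  return "paid";
}

// A routine task the local model handles well routes locally,
// avoiding the paid-API cost entirely.
const routing = chooseProvider({
  estimatedTokens: 1200,
  localQualityScore: 0.85,
  requiredQuality: 0.7,
  paidCostPer1kTokens: 0.002,
});
console.log(routing); // "local"
```

A real decision engine would also weigh the estimated token count against the paid price and fall back on errors, but the quality-gate shape above is the essential idea.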

Add to your AI client

Use these steps to connect LocaLLama MCP Server in Cursor, Claude, VS Code, and other MCP-compatible apps.

Cursor

Add this to your .cursor/mcp.json file in your project root, then restart Cursor.

.cursor/mcp.json

{
  "mcpServers": {
    "locallama-mcp-heratiki": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-locallama-mcp-heratiki"
      ]
    }
  }
}

Claude Desktop

Add this server entry to the mcpServers object in your Claude Desktop config, then restart the app.

~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows)

{
  "mcpServers": {
    "locallama-mcp-heratiki": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-locallama-mcp-heratiki"
      ]
    }
  }
}

Claude Code

Add this to your project's .mcp.json file. Claude Code will detect it automatically.

.mcp.json (project root)

{
  "mcpServers": {
    "locallama-mcp-heratiki": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-locallama-mcp-heratiki"
      ]
    }
  }
}

VS Code (Copilot)

Add this to your .vscode/mcp.json file. Requires the GitHub Copilot extension with MCP support enabled.

.vscode/mcp.json

{
  "servers": {
    "locallama-mcp-heratiki": {
      "type": "stdio",
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-locallama-mcp-heratiki"
      ]
    }
  }
}

Windsurf

Add this to your Windsurf MCP config file, then restart Windsurf.

~/.codeium/windsurf/mcp_config.json

{
  "mcpServers": {
    "locallama-mcp-heratiki": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-locallama-mcp-heratiki"
      ]
    }
  }
}

Cline

Open Cline settings, navigate to MCP Servers, and add this server configuration.

Cline MCP Settings (via UI)

{
  "mcpServers": {
    "locallama-mcp-heratiki": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-locallama-mcp-heratiki"
      ]
    }
  }
}

FAQ

Can I use LocaLLama with any local LLM?

Yes, it supports local LLMs served through runtimes such as LM Studio and Ollama.

Is there a cost associated with using LocaLLama MCP Server?

The server itself is free, but costs may arise from using paid APIs.

How do I configure the server?

Configuration is done through environment variables in the `.env` file.
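A `.env` file for this kind of setup might look like the fragment below. The variable names here are hypothetical placeholders chosen for illustration; check the repository's documentation for the actual keys it reads.

```
# Hypothetical .env sketch -- variable names are illustrative only.
LM_STUDIO_ENDPOINT=http://localhost:1234/v1
OLLAMA_ENDPOINT=http://localhost:11434
OPENROUTER_API_KEY=your-key-here
```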