Engineering · Mar 4, 2026 · Nicolai Schmid

MCP Architecture Explained: A Technical Deep Dive

A developer-grade walkthrough of the Model Context Protocol internals — JSON-RPC messaging, lifecycle management, transport layers, security model, and the three server primitives.

MCP is a two-layer protocol built on JSON-RPC 2.0. The data layer defines what servers expose (tools, resources, prompts) and how clients interact with them. The transport layer handles how messages move between client and server. Understanding both layers is essential for building reliable MCP applications.

This is the technical companion to our What Is MCP? overview. That post explains what MCP is and why it matters. This post explains how it works internally — the protocol stack, the message format, the lifecycle, and the security model. The target audience is developers who will build or debug MCP servers.

The protocol stack

The MCP specification defines a clear separation between two layers:

Data layer — The application-level protocol. Defines the message types (requests, responses, notifications), the primitives (tools, resources, prompts), and the lifecycle (initialization, operation, shutdown). This layer is transport-agnostic.

Transport layer — The mechanism for moving messages between client and server. Two transports are defined: stdio for local communication and streamable HTTP for remote communication. The transport layer is protocol-agnostic — it moves JSON-RPC messages without understanding their content.

This separation is deliberate. It means you can switch transports without changing your application logic, and you can evolve the protocol without changing the transport.

JSON-RPC 2.0 foundation

All MCP communication uses JSON-RPC 2.0. This is the same wire format used by the Language Server Protocol — and MCP explicitly cites LSP as an architectural precedent.

Three message types exist:

Requests

A request expects a response. Every request has a unique id, a method, and optional params:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": { "city": "Berlin" }
  }
}

Responses

A response carries the result of a request. It includes the same id as the request:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Weather in Berlin: 18°C, partly cloudy"
      }
    ]
  }
}

Error responses use the error field instead of result, with a code, message, and optional data.
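For example, a tools/call that names a tool the server does not expose could be rejected with a standard JSON-RPC error (the code -32602, "Invalid params", is defined by the JSON-RPC 2.0 spec; the message text is server-defined and illustrative here):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "error": {
    "code": -32602,
    "message": "Unknown tool: get_weathr"
  }
}
```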

Notifications

Notifications are fire-and-forget messages with no id and no expected response:

{
  "jsonrpc": "2.0",
  "method": "notifications/progress",
  "params": {
    "progressToken": "abc123",
    "progress": 0.5,
    "total": 1.0
  }
}

Progress notifications are one common use case. They allow long-running tool executions to report incremental progress to the client.
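The distinguishing feature is the missing id. A small helper in plain TypeScript (no SDK, shapes simplified for illustration) makes the rule concrete:

```typescript
// A JSON-RPC message with a method but no id is a notification:
// fire-and-forget, no response expected.
type JsonRpcMessage = {
  jsonrpc: "2.0";
  id?: number | string;
  method?: string;
  params?: unknown;
};

function isNotification(msg: JsonRpcMessage): boolean {
  return msg.method !== undefined && msg.id === undefined;
}

const progress: JsonRpcMessage = {
  jsonrpc: "2.0",
  method: "notifications/progress",
  params: { progressToken: "abc123", progress: 0.5, total: 1.0 },
};

console.log(isNotification(progress)); // true: the client will not reply
```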

Lifecycle management

Every MCP session follows a three-phase lifecycle, documented in the architecture overview:

Phase 1: Initialization

The client sends an initialize request with its protocol version and capabilities:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-03-26",
    "capabilities": {
      "roots": { "listChanged": true }
    },
    "clientInfo": {
      "name": "ChatGPT",
      "version": "1.0.0"
    }
  }
}

The server responds with its own capabilities — which primitives it supports, which features it offers:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2025-03-26",
    "capabilities": {
      "tools": { "listChanged": true },
      "resources": {},
      "prompts": {}
    },
    "serverInfo": {
      "name": "weather-server",
      "version": "1.0.0"
    }
  }
}

This is capability negotiation. The client and server agree on what features they both support. A client that does not support resources will not request them. A server that does not support prompts will not advertise them.

After the server responds, the client sends an initialized notification to confirm. The session is now active.
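That confirmation is itself a notification, so it carries no id:

```json
{
  "jsonrpc": "2.0",
  "method": "notifications/initialized"
}
```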

Phase 2: Operation

During the operation phase, the client can:

  • List tools (tools/list) — Discover available tools and their schemas
  • Call tools (tools/call) — Invoke a tool with arguments
  • List resources (resources/list) — Discover available resources
  • Read resources (resources/read) — Fetch resource content
  • List prompts (prompts/list) — Discover available prompt templates
  • Get prompts (prompts/get) — Retrieve a prompt with arguments filled in

The server can send notifications — progress updates, resource change notifications, tool list change notifications.
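Discovery is a single round trip. A tools/list response carries each tool's name, description, and input schema (the tool shown here is illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "tools": [
      {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "inputSchema": {
          "type": "object",
          "properties": {
            "city": { "type": "string" }
          },
          "required": ["city"]
        }
      }
    ]
  }
}
```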

Phase 3: Shutdown

Either side can initiate shutdown. The client sends a close signal (transport-dependent), and both sides clean up resources. For stdio, this means closing the subprocess. For HTTP, this means ending the session.

The three server primitives

The server concepts documentation defines three primitives. Each serves a distinct purpose.

Tools

Tools are the primary interaction mechanism. They represent functions the AI can invoke.

A tool definition includes:

  • name — Unique identifier within the server
  • description — Natural language description for AI reasoning
  • inputSchema — JSON Schema defining the expected parameters

import { z } from "zod"; // the TypeScript SDK uses Zod shapes for tool parameters

server.tool(
  "search_flights",
  "Search for flights between two airports on a given date",
  {
    origin: z.string().describe("Origin airport IATA code"),
    destination: z.string().describe("Destination airport IATA code"),
    date: z.string().describe("Departure date (YYYY-MM-DD)")
  },
  async ({ origin, destination, date }) => {
    // Call airline API, process results
    return { content: [{ type: "text", text: "..." }] };
  }
);

This example uses the TypeScript SDK. The handler receives validated arguments and returns structured content.

A critical principle: tools can have side effects. They call APIs, write data, execute operations. The AI must get user consent before invoking tools. This is enforced at the host level — the AI client asks the user before making the tool call.

Resources

Resources provide read-only context. They are identified by URIs and can contain text or binary data.

server.resource(
  "database-schema",
  "postgres://analytics/schema",
  { description: "The schema of the analytics database" },
  async (uri) => ({
    contents: [{
      uri: uri.href,
      mimeType: "text/plain",
      text: "CREATE TABLE events (id SERIAL, name TEXT, timestamp TIMESTAMPTZ)..."
    }]
  })
);

Resources do not have side effects. The AI reads them for context — a database schema, a knowledge base article, a configuration file. They inform the AI's reasoning but do not trigger actions.

Prompts

Prompts are reusable templates that guide AI behavior. They accept arguments and return structured messages.

server.prompt(
  "analyze-query",
  "Template for analyzing a database query",
  { query: z.string().describe("The SQL query to analyze") },
  async ({ query }) => ({
    messages: [{
      role: "user",
      content: { type: "text", text: `Analyze this SQL query for performance: ${query}` }
    }]
  })
);

Prompts are user-initiated — the AI does not autonomously select prompts. They are exposed in the client UI for the user to choose.

Transport mechanisms

stdio

For local servers. The host launches the server as a child process and communicates via standard input/output. Messages are newline-delimited JSON.
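The framing itself is trivial: one complete JSON document per line. A self-contained sketch (plain TypeScript, no SDK) shows the wire format:

```typescript
// stdio framing sketch: each JSON-RPC message is serialized onto its own
// line; the receiver splits the stream on newlines and parses each line.
function frame(messages: object[]): string {
  return messages.map((m) => JSON.stringify(m)).join("\n") + "\n";
}

const wire = frame([
  { jsonrpc: "2.0", id: 1, method: "tools/list" },
  { jsonrpc: "2.0", method: "notifications/initialized" },
]);

// Two messages, two lines on the pipe.
console.log(wire.trim().split("\n").length); // 2
```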

This is the default for Claude Desktop. You configure a server in claude_desktop_config.json, and Claude launches it as a subprocess when you start a conversation.
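A minimal entry might look like this (the server name and path are illustrative; the top-level mcpServers key is the documented format):

```json
{
  "mcpServers": {
    "weather": {
      "command": "node",
      "args": ["/path/to/weather-server/index.js"]
    }
  }
}
```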

Advantages: No network required. Low latency. Simple deployment (just a binary or script).

Limitations: Only works locally. Cannot be shared across machines or users.

Streamable HTTP

For remote servers. The client sends POST requests to an HTTP endpoint. The server responds with JSON-RPC messages. Long-running operations can use Server-Sent Events (SSE) for streaming.
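At the wire level, each client-to-server message is the body of an HTTP POST. A sketch using the standard Fetch API (the endpoint URL is illustrative; the dual Accept header lets the server answer with either a single JSON response or an SSE stream):

```typescript
// Streamable HTTP sketch: every client-to-server JSON-RPC message is the
// body of an HTTP POST to the server's MCP endpoint.
const body = JSON.stringify({ jsonrpc: "2.0", id: 1, method: "tools/list" });

const request = new Request("https://mcp.example.com/mcp", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    // The client advertises that it can handle either a plain JSON response
    // or a Server-Sent Events stream for long-running operations.
    "Accept": "application/json, text/event-stream",
  },
  body,
});

console.log(request.method); // POST
```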

This is what ChatGPT uses. You deploy your server to a URL, and ChatGPT connects to it over HTTPS.

Advantages: Works remotely. Shareable. Supports cloud deployment.

Limitations: Requires HTTPS. Higher latency than stdio. Requires infrastructure management.

Security model

The MCP specification defines several security principles:

User consent. Tools have side effects, so the host must get user approval before invoking them. This is not optional — it is a protocol-level requirement.

Data privacy. Servers should not expose data to clients or AI models without user understanding and consent. The user should know what data flows where.

Tool safety. Servers must not execute destructive operations without explicit user confirmation. A tool that deletes records should require a confirmation step.

LLM sampling controls. When servers request the AI to generate content (via the sampling capability), they must not use this to manipulate the user or exfiltrate data.

These principles are enforced at the host level. The MCP specification defines the rules; the AI client (ChatGPT, Claude, Cursor) enforces them.

Practical implications

Several architectural decisions follow from this design:

One client, one server. Each MCP client maintains a 1:1 connection with a server. If you need multiple servers, the host creates multiple clients. This simplifies the protocol — no multiplexing, no routing.

Servers are stateless between sessions. While a session is stateful (initialization, operation, shutdown), servers should not assume state persists between sessions. Each new connection starts fresh.

Capability negotiation prevents feature mismatches. A server that supports tools but not resources will not receive resource requests. A client that does not support progress notifications will not receive them. Both sides declare their capabilities upfront.
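A minimal sketch of the client-side check (the shapes here are illustrative, not the SDK's actual types):

```typescript
// Capability negotiation sketch: after initialize, the client only issues
// requests for primitives the server actually declared.
type ServerCapabilities = {
  tools?: object;
  resources?: object;
  prompts?: object;
};

function canUse(feature: keyof ServerCapabilities, caps: ServerCapabilities): boolean {
  return caps[feature] !== undefined;
}

const serverCaps: ServerCapabilities = { tools: { listChanged: true }, resources: {} };

console.log(canUse("tools", serverCaps));   // true
console.log(canUse("prompts", serverCaps)); // false: never send prompts/list
```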

Tools are the workhorse. In practice, most MCP servers expose tools and little else. Resources and prompts are useful but optional. If you are building your first MCP server, start with tools.

For a practical guide to building MCP tools with interactive widgets, see Building MCP Tools with Rich UIs. For a comparison with REST APIs, see MCP vs. REST APIs.

The path from here

If you want to build an MCP server:

  • Code-first: Start with the TypeScript SDK and the build server guide.
  • Visual builder: Use drio to build and deploy MCP tools without writing protocol code. The compiler handles the JSON-RPC layer, transport configuration, and lifecycle management.
  • Debug: Use the MCP Inspector to test your server's tools, resources, and prompts interactively.

The protocol is well-designed. The specification is clear. The challenge is not understanding MCP — it is building production-ready applications on top of it efficiently.

MCP's architecture is deliberately simple — two layers, three primitives, two transports. Simplicity is the foundation for reliability.