
2026-04-23 / 8 MIN READ

MCP server architecture: when to build one versus use a skill

A pattern library for MCP server architecture. Three transport choices, four real workload shapes, and the test for when MCP is the wrong tool.

The first MCP server I wrote was a 200-line Python script that did nothing a skill could not have done with 20 lines. I had read about MCP, got excited about the protocol, and reached for it before checking whether my workload actually needed a server. It did not. I threw the code away three days later and rewrote it as a skill.

This is the pattern library I use now to decide whether a job wants an MCP server, a skill, or neither. Three transports, four real workloads, and what each one's shape tells you.

MCP transport: stdio

  • When to pick it: local process, one client per server.
  • Typical workload: filesystem tools, local git operations, sandboxed scripts.
  • Cost profile: cheapest. No network, no auth, no serialization tax.

The pattern

MCP, the Model Context Protocol, is a standard for exposing tools and resources to an LLM over a stable interface. Anthropic published the spec in late 2024. By early 2026 it is the default way to give Claude access to external systems, supported across Claude Code, the Anthropic API, and most third-party clients. The protocol is transport-neutral. You can run an MCP server over stdio (local process), Streamable HTTP (remote, the current remote standard), or SSE (an older remote transport being phased out).

The pattern I keep seeing: people reach for MCP when they should have reached for a skill, and they reach for skills when they should have built an MCP server. The two are not interchangeable.

  • A skill is a packaged pattern that lives inside Claude Code. It auto-activates, runs in the main agent's context, and does not require a separate process.
  • An MCP server is a separate process or service that exposes tools and resources to any MCP-compatible client. It runs independently of the agent.

The question that decides between them: does this capability need to be shared across clients, across machines, or across teams? If yes, MCP. If no, skill.

Instance 1: the local file-transform job

A shop wants to run a particular JSON-to-CSV transform, with validation, on a known set of paths. They write an MCP server for it. The server has a transform tool that takes a path and returns a result. The server runs on the same machine as Claude Code. No other client uses it.

What the shape tells us: this is a skill. The MCP server added nothing except startup latency and a separate process to manage. A skill with a scripts/transform.py helper would have done the same work in a quarter of the code, with none of the protocol overhead.

How it resolved: rewrote as a skill. The SKILL.md describes when to invoke the transform. The helper script handles the work. The model shells out to the script when needed. Total lines of code dropped from roughly 200 to 40.
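What that helper looks like in practice: a sketch of a `scripts/transform.py`, using only the standard library. The field names and validation rule are illustrative assumptions, not the shop's actual schema.

```python
#!/usr/bin/env python3
"""Sketch of a skill helper: validate a JSON array, write it as CSV."""
import csv
import json
import sys

REQUIRED_FIELDS = ("id", "name", "amount")  # assumed schema, adjust to taste

def transform(json_path: str, csv_path: str) -> int:
    """Validate each record, write the valid rows to CSV, return row count."""
    with open(json_path) as f:
        records = json.load(f)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=REQUIRED_FIELDS)
        writer.writeheader()
        for i, rec in enumerate(records):
            missing = [k for k in REQUIRED_FIELDS if k not in rec]
            if missing:
                raise ValueError(f"record {i} missing fields: {missing}")
            writer.writerow({k: rec[k] for k in REQUIRED_FIELDS})
    return len(records)

if __name__ == "__main__":
    print(f"wrote {transform(sys.argv[1], sys.argv[2])} rows")
```

The model shells out to this when the SKILL.md triggers. No protocol, no process lifecycle, nothing to keep running.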

Instance 2: the shared team Supabase connection

A three-person team wants every agent on every machine to be able to query their Supabase project with the right service role and the right audit logging. Each person runs Claude Code on their own laptop. They do not want to distribute API keys in skills.

What the shape tells us: this is MCP. The capability is shared across multiple clients, the authentication is central (not duplicated per laptop), and the audit trail needs one collection point. A skill cannot give you that without putting the same key on every machine.

How it resolved: a Streamable HTTP MCP server hosted on the company's internal infrastructure. Each developer's Claude Code client connects to it with their own SSO-backed token. The server holds the Supabase service role. Every query is logged against the developer's token. No keys on laptops. Works.
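The audit layer is the part a skill cannot replicate. A minimal sketch of the idea, with `run_query` standing in for the real Supabase service-role call and an in-memory list standing in for the real log sink:

```python
# Sketch: every query is recorded against the caller's token before it
# runs. The caller token comes from the client's SSO-backed auth header.
import time

AUDIT_LOG = []  # stands in for a real, centralized log sink

def audited(query_fn):
    def wrapper(caller_token: str, sql: str):
        AUDIT_LOG.append({
            "ts": time.time(),
            "caller": caller_token,  # identifies the developer, not the server
            "query": sql,
        })
        return query_fn(sql)
    return wrapper

@audited
def run_query(sql: str):
    # placeholder for the actual service-role query against Supabase
    return {"rows": [], "sql": sql}
```

One collection point, one service-role key, per-developer attribution. That is the whole argument for the server.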

The Supabase MCP actually exists as a shipped integration. If your need matches it, use theirs rather than rolling your own.

Instance 3: the Playwright browser harness

A client wants agents to drive a real browser for QA work. Navigate, click, screenshot, assert. The browser itself is a heavy process, and multiple agents should be able to share it without each one spinning up its own Chromium.

What the shape tells us: this is MCP, and specifically it is a long-running stateful MCP. A skill cannot own a browser process across invocations. The browser needs to stay warm.

How it resolved: the official Playwright MCP server. It runs as a separate process, holds the browser context, and exposes navigate/click/evaluate tools. Multiple Claude Code sessions can connect to the same server and share the browser. The server handles the process lifecycle.

This is the canonical MCP use case. A protocol with state, shared across clients, with a heavy process that needs to persist between calls.

The three transports and when to pick each

MCP supports three transports. Most people pick stdio because the docs default to it. That is usually right, but not always.

stdio. The server is a subprocess of the client. No network. Communication over stdin and stdout as newline-delimited JSON-RPC. Best for local tools that only one client will ever use. Cheapest to operate. No auth, no serialization tax, no network debugging.
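To make "cheapest" concrete, here is a toy sketch of the stdio framing: newline-delimited JSON-RPC over stdin and stdout, no SDK. The two methods and the echo tool are illustrative; a real server would use the official SDK and implement the full MCP handshake and tool schemas.

```python
# Toy stdio server: one JSON-RPC request per line in, one response per
# line out. That is the whole transport: no sockets, no TLS, no auth.
import json
import sys

def handle(msg: dict) -> dict:
    """Return a JSON-RPC response for one request."""
    if msg.get("method") == "tools/list":
        result = {"tools": [{"name": "echo", "description": "echo input"}]}
    elif msg.get("method") == "tools/call":
        result = {"content": [{"type": "text",
                               "text": msg["params"]["arguments"]["text"]}]}
    else:
        return {"jsonrpc": "2.0", "id": msg.get("id"),
                "error": {"code": -32601, "message": "method not found"}}
    return {"jsonrpc": "2.0", "id": msg.get("id"), "result": result}

if __name__ == "__main__":
    for line in sys.stdin:
        if line.strip():
            print(json.dumps(handle(json.loads(line))), flush=True)
```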

Streamable HTTP. The server is a separate HTTP service. Clients connect over HTTP with streaming support. This is the current standard for remote MCP servers as of early 2026. Best for multi-client, multi-machine deployments. You pay the auth and networking tax but get horizontal scalability.

SSE. Server-Sent Events, the original remote MCP transport. Being phased out in favor of Streamable HTTP. If you are building new, use Streamable HTTP. If you are maintaining an older SSE server, the migration is straightforward but worth doing.

The honest truth: for 90 percent of the MCP servers a solo operator will write, stdio is the right answer. You are running a local tool on your own machine. Network anything is overkill.

Instance 4: the cross-team Notion search

A company wants every internal agent to be able to search their Notion workspace from any Claude Code session. The search needs fresh credentials, rate limiting against Notion's API, and a consistent caching layer.

What the shape tells us: this is MCP, Streamable HTTP specifically. Shared across teams, needs centralized auth, benefits from shared caching. A skill would require everyone to hold their own Notion token and hit Notion directly, which is fine for one person and fragile at five.

How it resolved: a company-hosted Streamable HTTP MCP server. It holds a workspace-level Notion integration token, handles rate limiting against Notion's API, and caches search results for five minutes to amortize the cost across team members. Every Claude Code client in the company connects to it.
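The shared cache is the piece that only exists because the server is central. A sketch of the five-minute TTL layer, with `fetch` standing in for the rate-limited Notion API call (names are illustrative):

```python
# Sketch: cache search results for a TTL window so one team member's
# query amortizes the API cost for everyone else.
import time

class TTLCache:
    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store = {}  # query -> (expires_at, result)

    def get(self, query):
        entry = self._store.get(query)
        if entry and entry[0] > time.monotonic():
            return entry[1]
        self._store.pop(query, None)  # drop expired entries lazily
        return None

    def put(self, query, result):
        self._store[query] = (time.monotonic() + self.ttl, result)

def search(query, cache, fetch):
    """Return cached results when fresh; otherwise fetch and cache."""
    hit = cache.get(query)
    if hit is not None:
        return hit
    result = fetch(query)  # the rate-limited upstream call
    cache.put(query, result)
    return result
```

A skill on five laptops would mean five caches and five rate-limit budgets. One server means one of each.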

What the pattern tells us

Three questions that decide for me now:

  1. Is this capability used by more than one client? If no, skill. If yes, keep reading.
  2. Does the capability need state that persists between calls? A browser, a connection pool, a cache. If yes, MCP. If no, still could be a skill with a helper script.
  3. Does the authentication need to be centralized? If yes, MCP. If no, skill.

Two or more "yes" answers: build an MCP server. Zero or one "yes": write a skill.
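The rubric is small enough to transcribe directly as code, counting yeses exactly as the rule above states:

```python
# The three questions as code: two or more yeses -> MCP server.
def mcp_or_skill(multi_client: bool,
                 persistent_state: bool,
                 central_auth: bool) -> str:
    yeses = sum([multi_client, persistent_state, central_auth])
    return "mcp server" if yeses >= 2 else "skill"
```

Run the four instances through it: the file transform scores zero, the Supabase connection and the Notion search score two, the browser harness scores two or three depending on your auth setup.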

How to spot the wrong choice early

MCP smells wrong when:

  • The server runs on the same machine as the client and only serves that one client.
  • The tools the server exposes are thin wrappers around local scripts.
  • The authentication is "read this env var," same as a skill would.
  • You find yourself distributing the server with the skill that uses it (they belong together as one skill).

Skill smells wrong when:

  • Three or more people end up with copies of the same API token in their local config.
  • The skill needs to hold state across invocations (it cannot).
  • The skill needs to serve more than one process at a time.
  • You are rate-limited by the downstream API and cannot coordinate across users.

The MCP vs skills vs tools decision log covers the sharper edges of this decision. Building your first Claude skill covers the other side.

For operators building production pipelines, the patterns here feed directly into the Operator's Stack curriculum and into the agent handbook that indexes all of this.

FAQ

Can I use an existing MCP server without writing my own?

Yes. The MCP ecosystem has growing coverage, including official servers for Supabase, Playwright, GitHub, Slack, Figma, and others. If your need matches an existing server, use it. Writing your own is for gaps in the ecosystem or for internal tools.

How do I register an MCP server with Claude Code?

Edit the MCP configuration in your Claude Code settings. For stdio servers, you specify a command and args. For remote servers, you specify the URL and any auth headers. The server appears as a tool source once connected.
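A sketch of what that looks like in a project-level `.mcp.json`, with one stdio server and one remote server. The server names, paths, and URL are placeholders, and exact field support can vary by client version:

```json
{
  "mcpServers": {
    "local-transform": {
      "command": "python",
      "args": ["servers/transform_server.py"]
    },
    "team-notion": {
      "type": "http",
      "url": "https://mcp.internal.example.com/notion",
      "headers": { "Authorization": "Bearer ${NOTION_MCP_TOKEN}" }
    }
  }
}
```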

What language should I write an MCP server in?

Whatever you are comfortable with. The SDK ships in Python and TypeScript first-class, and community implementations exist in Rust, Go, and others. For a one-person shop, pick the language where you can iterate fastest.

Do MCP servers have any security implications?

Yes. Any MCP server you run is giving the model access to the tools it exposes. Treat that access as you would treat giving a human an API key. Restrict scope, log actions, and never expose destructive operations without explicit confirmation gates.
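One way to sketch that confirmation gate inside a server, with hypothetical tool names: destructive tools refuse to run unless the call explicitly carries a confirmation flag.

```python
# Sketch: destructive tools return an error instead of acting unless
# the call explicitly opts in. Tool names are illustrative.
DESTRUCTIVE = {"delete_table", "drop_index"}

def call_tool(name: str, args: dict) -> dict:
    if name in DESTRUCTIVE and not args.get("confirm"):
        return {"error": f"{name} is destructive; re-call with confirm=true"}
    return {"ok": True, "tool": name}  # placeholder for the real dispatch
```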

Can I hot-reload an MCP server without restarting Claude Code?

stdio servers need a client restart. Remote servers can hot-reload on the server side without affecting connected clients, as long as the tool definitions do not change. Tool definition changes require reconnection.

// related

Claude Code Skills Pack

If you want to go deeper on agentic builds, this pack covers the patterns I use every day. File ownership, parallel agents, tool contracts.

View the pack