April 19, 2026

10 min read

How to Build an MCP Server: A Practical Guide for AI Agent Builders

What MCP Actually Is (Before You Build Anything)

The Model Context Protocol is an open standard, originally released by Anthropic in late 2024, that defines how AI models connect to external tools, data sources, and services. Think of it as a standardized plug format for AI agents. Instead of each application inventing its own way to wire an LLM to a database, a calendar, or an internal API, MCP gives everyone the same socket.

The protocol has three main concepts: servers expose capabilities (tools, resources, prompts), clients are the LLM-powered hosts that connect to those servers (Claude Desktop, Cursor, custom agents), and tools are callable functions the model can invoke during a conversation or task. When a user asks an MCP-connected agent to "check my open GitHub issues", the model doesn't just pull from training data — it calls a tool on an MCP server that queries the GitHub API and returns live results.

If you want the full protocol specification, Anthropic publishes it at modelcontextprotocol.io. But reading the spec and knowing what to actually build are different things. This guide focuses on the latter.

One thing worth clarifying upfront: MCP servers are not the same as regular REST APIs. They use a JSON-RPC 2.0 message format over either stdio (for local processes) or HTTP with Server-Sent Events (for remote servers). If you want to understand the architectural difference in more depth, the post on MCP vs API for AI agent builders covers that comparison well. For here, we'll assume you've decided MCP is the right approach and want to know how to ship one.

Choosing Your Transport: stdio vs HTTP/SSE

Before writing a single line of server code, you need to decide how your server will communicate. This choice shapes everything downstream.

stdio (local, in-process)

The simplest option. Your MCP server runs as a local process, and the client communicates over standard input/output. This is what you use when building tools that run on the same machine as the host application: a filesystem tool, a local database reader, or a code execution sandbox.

stdio servers are fast to build and trivially easy to test. They have no authentication surface to worry about. The downside is obvious: they only work locally. You cannot expose a stdio server to a remote agent or share it across a team.

HTTP with SSE (remote)

Remote MCP servers run as HTTP services. The client sends requests via POST, and the server streams responses back using Server-Sent Events. This is what you need when multiple clients need to access the same server, your tools need to run in the cloud, or you want to deploy your tools like any other service.

Remote servers require you to think about authentication. The MCP spec supports OAuth 2.0, and most production implementations use bearer tokens or API keys passed in headers. This is where most tutorials stop — they get the server running locally and leave deployment as "an exercise for the reader." We'll come back to that.

Setting Up a Basic MCP Server in Python

Python is currently the most common language for MCP server development, largely because Anthropic's official Python SDK is well-maintained and the FastMCP wrapper makes boilerplate minimal.

Install the SDK:

pip install mcp

Here's the minimal structure of an MCP server that exposes a single tool:

from mcp.server import Server
from mcp.types import Tool, TextContent
import mcp.server.stdio as stdio

server = Server("my-tool-server")

@server.list_tools()
async def list_tools():
    return [
        Tool(
            name="get_weather",
            description="Returns current weather for a city. Use this when the user asks about weather conditions in a specific location.",
            inputSchema={
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "get_weather":
        city = arguments["city"]
        # Replace with your actual weather API call
        return [TextContent(type="text", text=f"Weather in {city}: 22C, partly cloudy")]

if __name__ == "__main__":
    import asyncio
    asyncio.run(stdio.run(server))

That's the skeleton. Your tool implementation replaces the placeholder with real logic: API calls, database queries, file reads, whatever the agent needs to do.

If you prefer TypeScript, Anthropic also maintains a TypeScript SDK with equivalent patterns. The type safety helps when your tool schemas get complex.

Tool Design: Where Most MCP Servers Go Wrong

Getting the server running is the easy part. Designing tools that models actually use well is where most projects stall.

Write descriptions as if the model is making the decision

The LLM decides which tool to call based on the tool's description field. This means your description is a prompt in disguise. Vague descriptions produce unpredictable tool selection. Compare:

Bad: "Get user data"

Better: "Retrieve a user's profile including name, email, account status, and subscription tier by their user ID. Use this when you need to look up who a specific user is or check their account standing."

The second version tells the model when to use the tool, not just what it does. That distinction matters more than you might expect in multi-tool environments where the model has to choose between several options.

Keep tools atomic

A common mistake is building tools that do too much. A tool called manage_user that creates, updates, and deletes users based on an action parameter forces the model to construct more complex arguments and increases the chance of misuse. Three separate tools — create_user, update_user, delete_user — are easier for the model to reason about and safer to expose.

Return structured, useful output

Your tools can return text, JSON, images, or embedded resources. Prefer structured JSON over raw text wherever possible. When you return a blob like "User John Doe created successfully, ID 12345", the model has to parse that string. When you return {"id": 12345, "name": "John Doe", "status": "created"}, it can reliably extract fields without introducing a parse step that can fail.

Handle errors explicitly

Tools should return structured errors, not throw exceptions. The MCP spec allows tools to return an isError: true flag alongside error content. Use it. When a model calls a tool and gets back a clear error explaining what went wrong, it can retry with corrected parameters or surface a useful message. Silent failures or unhandled stack traces give the model nothing to work with.

Resources and Prompts: The Features Most People Skip

MCP servers can expose three types of capabilities: tools (callable functions), resources (data the model can read), and prompts (reusable prompt templates). Most tutorials only cover tools. Resources and prompts are worth understanding even if you don't use them immediately.

Resources are read-only data endpoints. If your agent needs to read a document, a configuration file, or a schema, exposing it as a resource is cleaner than hacking it into a tool. Resources have URIs, and MCP clients can subscribe to resource updates — useful for monitoring scenarios where you want the model notified when data changes.

Prompts let you package reusable prompt templates into the server itself. A server for a customer support system might expose a prompt called handle_refund_request that contains the exact wording and constraints the agent should follow. This keeps business logic in the server layer rather than scattered across whatever is prompting the model upstream.

Going Remote: Deployment Considerations

Deploying a remote MCP server is mostly standard web service work, but with a few protocol-specific quirks.

Session management

Unlike a REST API that is stateless by default, MCP has a concept of sessions. A client connects, initiates a session, and that session persists for the duration of the interaction. Your server needs to handle session lifecycle: initialization, capability negotiation, and clean shutdown. The official SDKs handle most of this automatically, but it matters when you're debugging connection issues or running behind a load balancer.

Authentication

The MCP specification's authorization model uses OAuth 2.1 for remote servers. In practice, many teams start with simpler API key authentication passed as a bearer token, which most MCP clients support natively. For internal tooling where you control both client and server, this is fine. For anything customer-facing or multi-tenant, invest in proper OAuth flow early rather than retrofitting it later.

Containerize from day one

Docker your MCP server before you deploy it anywhere. The dependency story for Python in particular gets messy fast — SDK versions, system libraries, conflicting packages. A Dockerfile that pins your exact environment saves debugging time. A minimal pattern:

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "server.py"]

Expose your HTTP transport port and deploy it like any other microservice — ECS, Cloud Run, a Kubernetes deployment, whatever your infrastructure already supports.

Testing Your MCP Server

Two testing strategies are worth keeping in your toolkit.

The first is the MCP Inspector, a browser-based tool from Anthropic that lets you connect to a running MCP server and manually invoke tools, browse resources, and inspect protocol messages. It's the fastest way to verify your server is behaving correctly before connecting it to an actual agent host. Run it with npx @modelcontextprotocol/inspector and point it at your server endpoint.

The second is unit testing your tool logic independently from the MCP layer. Your actual business logic — the database query, the API call, the file transform — should be a plain function you can test with pytest or Jest. Wrap it in the MCP tool decorator only after the core logic is solid. Trying to test tool behavior through an MCP connection adds latency and fragility to your test suite.

Common Production Pitfalls

A few patterns come up repeatedly in MCP server builds that cause real problems once they hit production.

Tool name collisions. If you connect an agent to multiple MCP servers and two servers expose a tool with the same name, behavior becomes unpredictable. Namespace your tool names: github_list_issues instead of list_issues. Do this even if you're only shipping one server today — it's much harder to rename tools after clients have adopted them.

Overly broad permissions. Because an agent, not a human, is calling your tools, it's easy to accidentally expose tools with destructive capabilities that the model will happily invoke. A tool that deletes records should require explicit confirmation or be gated behind a separate step. At minimum, log every tool invocation with the full arguments so you can audit what the model actually did.

Forgetting rate limits. MCP servers that proxy third-party APIs inherit those APIs' rate limits. If your agent runs in a loop — which agents do — it can exhaust rate limits in seconds. Add per-tool rate limiting in your server layer rather than relying on the upstream API to throttle gracefully. An agent that hits a 429 and gets no structured error to work from will often retry in an unhelpful loop.

Ignoring latency. Tools are called synchronously during model inference in most architectures. A tool that takes three seconds to respond stretches the entire interaction timeline. Profile your tools under realistic load. If a tool is consistently slow, consider caching, async patterns, or whether it belongs in the MCP layer at all versus being pre-fetched into context.

Where MCP Fits in a Broader Agent Architecture

MCP solves tool connectivity. It doesn't solve agent architecture. A well-designed agent system needs orchestration, memory, error recovery, and observability on top of whatever MCP provides.

For most production use cases, MCP servers are one component in a larger system. An orchestrator — LangGraph, a custom loop, or a framework from Genta's agent frameworks roundup — manages planning and decides when to call tools. The MCP server layer handles the actual tool execution. Observability sits around both layers to log what's happening and alert when things break.

If you're building the MCP server as part of a product where other teams or external clients will connect to it, treat it like a public API from day one: version it, document the schema, think about backwards compatibility. The MCP ecosystem is still moving fast, and protocol versions are not always backwards compatible across major SDK releases. Pinning your SDK version in requirements.txt and tracking the spec changelog saves painful surprises.

Building real agent infrastructure that holds up in production involves far more than the tool layer. If your team is working through the architecture decisions and wants to move faster, Genta works embedded with engineering teams to design and ship these systems — the kind of work where MCP server design is one piece of a larger puzzle that includes orchestration, evaluation, and deployment.

We’re Here to Help

Ready to transform your operations? We're here to help. Contact us today to learn more about our innovative solutions and expert services.

We’re Here to Help

Ready to transform your operations? We're here to help. Contact us today to learn more about our innovative solutions and expert services.

We’re Here to Help

Ready to transform your operations? We're here to help. Contact us today to learn more about our innovative solutions and expert services.

April 19, 2026

10 min read

How to Build an MCP Server: A Practical Guide for AI Agent Builders

What MCP Actually Is (Before You Build Anything)

The Model Context Protocol is an open standard, originally released by Anthropic in late 2024, that defines how AI models connect to external tools, data sources, and services. Think of it as a standardized plug format for AI agents. Instead of each application inventing its own way to wire an LLM to a database, a calendar, or an internal API, MCP gives everyone the same socket.

The protocol has three main concepts: servers expose capabilities (tools, resources, prompts), clients are the LLM-powered hosts that connect to those servers (Claude Desktop, Cursor, custom agents), and tools are callable functions the model can invoke during a conversation or task. When a user asks an MCP-connected agent to "check my open GitHub issues", the model doesn't just pull from training data — it calls a tool on an MCP server that queries the GitHub API and returns live results.

If you want the full protocol specification, Anthropic publishes it at modelcontextprotocol.io. But reading the spec and knowing what to actually build are different things. This guide focuses on the latter.

One thing worth clarifying upfront: MCP servers are not the same as regular REST APIs. They use a JSON-RPC 2.0 message format over either stdio (for local processes) or HTTP with Server-Sent Events (for remote servers). If you want to understand the architectural difference in more depth, the post on MCP vs API for AI agent builders covers that comparison well. For here, we'll assume you've decided MCP is the right approach and want to know how to ship one.

Choosing Your Transport: stdio vs HTTP/SSE

Before writing a single line of server code, you need to decide how your server will communicate. This choice shapes everything downstream.

stdio (local, in-process)

The simplest option. Your MCP server runs as a local process, and the client communicates over standard input/output. This is what you use when building tools that run on the same machine as the host application: a filesystem tool, a local database reader, or a code execution sandbox.

stdio servers are fast to build and trivially easy to test. They have no authentication surface to worry about. The downside is obvious: they only work locally. You cannot expose a stdio server to a remote agent or share it across a team.

HTTP with SSE (remote)

Remote MCP servers run as HTTP services. The client sends requests via POST, and the server streams responses back using Server-Sent Events. This is what you need when multiple clients need to access the same server, your tools need to run in the cloud, or you want to deploy your tools like any other service.

Remote servers require you to think about authentication. The MCP spec supports OAuth 2.0, and most production implementations use bearer tokens or API keys passed in headers. This is where most tutorials stop — they get the server running locally and leave deployment as "an exercise for the reader." We'll come back to that.

Setting Up a Basic MCP Server in Python

Python is currently the most common language for MCP server development, largely because Anthropic's official Python SDK is well-maintained and the FastMCP wrapper makes boilerplate minimal.

Install the SDK:

pip install mcp

Here's the minimal structure of an MCP server that exposes a single tool:

from mcp.server import Server
from mcp.types import Tool, TextContent
import mcp.server.stdio as stdio

server = Server("my-tool-server")

@server.list_tools()
async def list_tools():
    return [
        Tool(
            name="get_weather",
            description="Returns current weather for a city. Use this when the user asks about weather conditions in a specific location.",
            inputSchema={
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "get_weather":
        city = arguments["city"]
        # Replace with your actual weather API call
        return [TextContent(type="text", text=f"Weather in {city}: 22C, partly cloudy")]

if __name__ == "__main__":
    import asyncio
    asyncio.run(stdio.run(server))

That's the skeleton. Your tool implementation replaces the placeholder with real logic: API calls, database queries, file reads, whatever the agent needs to do.

If you prefer TypeScript, Anthropic also maintains a TypeScript SDK with equivalent patterns. The type safety helps when your tool schemas get complex.

Tool Design: Where Most MCP Servers Go Wrong

Getting the server running is the easy part. Designing tools that models actually use well is where most projects stall.

Write descriptions as if the model is making the decision

The LLM decides which tool to call based on the tool's description field. This means your description is a prompt in disguise. Vague descriptions produce unpredictable tool selection. Compare:

Bad: "Get user data"

Better: "Retrieve a user's profile including name, email, account status, and subscription tier by their user ID. Use this when you need to look up who a specific user is or check their account standing."

The second version tells the model when to use the tool, not just what it does. That distinction matters more than you might expect in multi-tool environments where the model has to choose between several options.

Keep tools atomic

A common mistake is building tools that do too much. A tool called manage_user that creates, updates, and deletes users based on an action parameter forces the model to construct more complex arguments and increases the chance of misuse. Three separate tools — create_user, update_user, delete_user — are easier for the model to reason about and safer to expose.

Return structured, useful output

Your tools can return text, JSON, images, or embedded resources. Prefer structured JSON over raw text wherever possible. When you return a blob like "User John Doe created successfully, ID 12345", the model has to parse that string. When you return {"id": 12345, "name": "John Doe", "status": "created"}, it can reliably extract fields without introducing a parse step that can fail.

Handle errors explicitly

Tools should return structured errors, not throw exceptions. The MCP spec allows tools to return an isError: true flag alongside error content. Use it. When a model calls a tool and gets back a clear error explaining what went wrong, it can retry with corrected parameters or surface a useful message. Silent failures or unhandled stack traces give the model nothing to work with.

Resources and Prompts: The Features Most People Skip

MCP servers can expose three types of capabilities: tools (callable functions), resources (data the model can read), and prompts (reusable prompt templates). Most tutorials only cover tools. Resources and prompts are worth understanding even if you don't use them immediately.

Resources are read-only data endpoints. If your agent needs to read a document, a configuration file, or a schema, exposing it as a resource is cleaner than hacking it into a tool. Resources have URIs, and MCP clients can subscribe to resource updates — useful for monitoring scenarios where you want the model notified when data changes.

Prompts let you package reusable prompt templates into the server itself. A server for a customer support system might expose a prompt called handle_refund_request that contains the exact wording and constraints the agent should follow. This keeps business logic in the server layer rather than scattered across whatever is prompting the model upstream.

Going Remote: Deployment Considerations

Deploying a remote MCP server is mostly standard web service work, but with a few protocol-specific quirks.

Session management

Unlike a REST API that is stateless by default, MCP has a concept of sessions. A client connects, initiates a session, and that session persists for the duration of the interaction. Your server needs to handle session lifecycle: initialization, capability negotiation, and clean shutdown. The official SDKs handle most of this automatically, but it matters when you're debugging connection issues or running behind a load balancer.

Authentication

The MCP specification's authorization model uses OAuth 2.1 for remote servers. In practice, many teams start with simpler API key authentication passed as a bearer token, which most MCP clients support natively. For internal tooling where you control both client and server, this is fine. For anything customer-facing or multi-tenant, invest in proper OAuth flow early rather than retrofitting it later.

Containerize from day one

Docker your MCP server before you deploy it anywhere. The dependency story for Python in particular gets messy fast — SDK versions, system libraries, conflicting packages. A Dockerfile that pins your exact environment saves debugging time. A minimal pattern:

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "server.py"]

Expose your HTTP transport port and deploy it like any other microservice — ECS, Cloud Run, a Kubernetes deployment, whatever your infrastructure already supports.

Testing Your MCP Server

Two testing strategies are worth keeping in your toolkit.

The first is the MCP Inspector, a browser-based tool from Anthropic that lets you connect to a running MCP server and manually invoke tools, browse resources, and inspect protocol messages. It's the fastest way to verify your server is behaving correctly before connecting it to an actual agent host. Run it with npx @modelcontextprotocol/inspector and point it at your server endpoint.

The second is unit testing your tool logic independently from the MCP layer. Your actual business logic — the database query, the API call, the file transform — should be a plain function you can test with pytest or Jest. Wrap it in the MCP tool decorator only after the core logic is solid. Trying to test tool behavior through an MCP connection adds latency and fragility to your test suite.

Common Production Pitfalls

A few patterns come up repeatedly in MCP server builds that cause real problems once they hit production.

Tool name collisions. If you connect an agent to multiple MCP servers and two servers expose a tool with the same name, behavior becomes unpredictable. Namespace your tool names: github_list_issues instead of list_issues. Do this even if you're only shipping one server today — it's much harder to rename tools after clients have adopted them.

Overly broad permissions. Because an agent, not a human, is calling your tools, it's easy to accidentally expose tools with destructive capabilities that the model will happily invoke. A tool that deletes records should require explicit confirmation or be gated behind a separate step. At minimum, log every tool invocation with the full arguments so you can audit what the model actually did.

Forgetting rate limits. MCP servers that proxy third-party APIs inherit those APIs' rate limits. If your agent runs in a loop — which agents do — it can exhaust rate limits in seconds. Add per-tool rate limiting in your server layer rather than relying on the upstream API to throttle gracefully. An agent that hits a 429 and gets no structured error to work from will often retry in an unhelpful loop.

Ignoring latency. Tools are called synchronously during model inference in most architectures. A tool that takes three seconds to respond stretches the entire interaction timeline. Profile your tools under realistic load. If a tool is consistently slow, consider caching, async patterns, or whether it belongs in the MCP layer at all versus being pre-fetched into context.

Where MCP Fits in a Broader Agent Architecture

MCP solves tool connectivity. It doesn't solve agent architecture. A well-designed agent system needs orchestration, memory, error recovery, and observability on top of whatever MCP provides.

For most production use cases, MCP servers are one component in a larger system. An orchestrator — LangGraph, a custom loop, or a framework from Genta's agent frameworks roundup — manages planning and decides when to call tools. The MCP server layer handles the actual tool execution. Observability sits around both layers to log what's happening and alert when things break.

If you're building the MCP server as part of a product where other teams or external clients will connect to it, treat it like a public API from day one: version it, document the schema, think about backwards compatibility. The MCP ecosystem is still moving fast, and protocol versions are not always backwards compatible across major SDK releases. Pinning your SDK version in requirements.txt and tracking the spec changelog saves painful surprises.

Building real agent infrastructure that holds up in production involves far more than the tool layer. If your team is working through the architecture decisions and wants to move faster, Genta works embedded with engineering teams to design and ship these systems — the kind of work where MCP server design is one piece of a larger puzzle that includes orchestration, evaluation, and deployment.

We’re Here to Help

Ready to transform your operations? We're here to help. Contact us today to learn more about our innovative solutions and expert services.

We’re Here to Help

Ready to transform your operations? We're here to help. Contact us today to learn more about our innovative solutions and expert services.

We’re Here to Help

Ready to transform your operations? We're here to help. Contact us today to learn more about our innovative solutions and expert services.

April 19, 2026

10 min read

How to Build an MCP Server: A Practical Guide for AI Agent Builders

What MCP Actually Is (Before You Build Anything)

The Model Context Protocol is an open standard, originally released by Anthropic in late 2024, that defines how AI models connect to external tools, data sources, and services. Think of it as a standardized plug format for AI agents. Instead of each application inventing its own way to wire an LLM to a database, a calendar, or an internal API, MCP gives everyone the same socket.

The protocol has three main concepts: servers expose capabilities (tools, resources, prompts), clients are the LLM-powered hosts that connect to those servers (Claude Desktop, Cursor, custom agents), and tools are callable functions the model can invoke during a conversation or task. When a user asks an MCP-connected agent to "check my open GitHub issues", the model doesn't just pull from training data — it calls a tool on an MCP server that queries the GitHub API and returns live results.

If you want the full protocol specification, Anthropic publishes it at modelcontextprotocol.io. But reading the spec and knowing what to actually build are different things. This guide focuses on the latter.

One thing worth clarifying upfront: MCP servers are not the same as regular REST APIs. They use a JSON-RPC 2.0 message format over either stdio (for local processes) or HTTP with Server-Sent Events (for remote servers). If you want to understand the architectural difference in more depth, the post on MCP vs API for AI agent builders covers that comparison well. For here, we'll assume you've decided MCP is the right approach and want to know how to ship one.

Choosing Your Transport: stdio vs HTTP/SSE

Before writing a single line of server code, you need to decide how your server will communicate. This choice shapes everything downstream.

stdio (local, in-process)

The simplest option. Your MCP server runs as a local process, and the client communicates over standard input/output. This is what you use when building tools that run on the same machine as the host application: a filesystem tool, a local database reader, or a code execution sandbox.

stdio servers are fast to build and trivially easy to test. They have no authentication surface to worry about. The downside is obvious: they only work locally. You cannot expose a stdio server to a remote agent or share it across a team.

HTTP with SSE (remote)

Remote MCP servers run as HTTP services. The client sends requests via POST, and the server streams responses back using Server-Sent Events. This is what you need when multiple clients need to access the same server, your tools need to run in the cloud, or you want to deploy your tools like any other service.

Remote servers require you to think about authentication. The MCP spec supports OAuth 2.0, and most production implementations use bearer tokens or API keys passed in headers. This is where most tutorials stop — they get the server running locally and leave deployment as "an exercise for the reader." We'll come back to that.

Setting Up a Basic MCP Server in Python

Python is currently the most common language for MCP server development, largely because Anthropic's official Python SDK is well-maintained and the FastMCP wrapper makes boilerplate minimal.

Install the SDK:

pip install mcp

Here's the minimal structure of an MCP server that exposes a single tool:

from mcp.server import Server
from mcp.types import Tool, TextContent
import mcp.server.stdio as stdio

server = Server("my-tool-server")

@server.list_tools()
async def list_tools():
    return [
        Tool(
            name="get_weather",
            description="Returns current weather for a city. Use this when the user asks about weather conditions in a specific location.",
            inputSchema={
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "get_weather":
        city = arguments["city"]
        # Replace with your actual weather API call
        return [TextContent(type="text", text=f"Weather in {city}: 22C, partly cloudy")]

if __name__ == "__main__":
    import asyncio
    asyncio.run(stdio.run(server))

That's the skeleton. Your tool implementation replaces the placeholder with real logic: API calls, database queries, file reads, whatever the agent needs to do.

If you prefer TypeScript, Anthropic also maintains a TypeScript SDK with equivalent patterns. The type safety helps when your tool schemas get complex.

Tool Design: Where Most MCP Servers Go Wrong

Getting the server running is the easy part. Designing tools that models actually use well is where most projects stall.

Write descriptions as if the model is making the decision

The LLM decides which tool to call based on the tool's description field. This means your description is a prompt in disguise. Vague descriptions produce unpredictable tool selection. Compare:

Bad: "Get user data"

Better: "Retrieve a user's profile including name, email, account status, and subscription tier by their user ID. Use this when you need to look up who a specific user is or check their account standing."

The second version tells the model when to use the tool, not just what it does. That distinction matters more than you might expect in multi-tool environments where the model has to choose between several options.

Keep tools atomic

A common mistake is building tools that do too much. A tool called manage_user that creates, updates, and deletes users based on an action parameter forces the model to construct more complex arguments and increases the chance of misuse. Three separate tools — create_user, update_user, delete_user — are easier for the model to reason about and safer to expose.

Return structured, useful output

Your tools can return text, JSON, images, or embedded resources. Prefer structured JSON over raw text wherever possible. When you return a blob like "User John Doe created successfully, ID 12345", the model has to parse that string. When you return {"id": 12345, "name": "John Doe", "status": "created"}, it can reliably extract fields without introducing a parse step that can fail.

Handle errors explicitly

Tools should return structured errors, not throw exceptions. The MCP spec allows tools to return an isError: true flag alongside error content. Use it. When a model calls a tool and gets back a clear error explaining what went wrong, it can retry with corrected parameters or surface a useful message. Silent failures or unhandled stack traces give the model nothing to work with.

Resources and Prompts: The Features Most People Skip

MCP servers can expose three types of capabilities: tools (callable functions), resources (data the model can read), and prompts (reusable prompt templates). Most tutorials only cover tools. Resources and prompts are worth understanding even if you don't use them immediately.

Resources are read-only data endpoints. If your agent needs to read a document, a configuration file, or a schema, exposing it as a resource is cleaner than hacking it into a tool. Resources have URIs, and MCP clients can subscribe to resource updates — useful for monitoring scenarios where you want the model notified when data changes.

Prompts let you package reusable prompt templates into the server itself. A server for a customer support system might expose a prompt called handle_refund_request that contains the exact wording and constraints the agent should follow. This keeps business logic in the server layer rather than scattered across whatever is prompting the model upstream.

Going Remote: Deployment Considerations

Deploying a remote MCP server is mostly standard web service work, but with a few protocol-specific quirks.

Session management

Unlike a REST API that is stateless by default, MCP has a concept of sessions. A client connects, initiates a session, and that session persists for the duration of the interaction. Your server needs to handle session lifecycle: initialization, capability negotiation, and clean shutdown. The official SDKs handle most of this automatically, but it matters when you're debugging connection issues or running behind a load balancer.

Authentication

The MCP specification's authorization model uses OAuth 2.1 for remote servers. In practice, many teams start with simpler API key authentication passed as a bearer token, which most MCP clients support natively. For internal tooling where you control both client and server, this is fine. For anything customer-facing or multi-tenant, invest in proper OAuth flow early rather than retrofitting it later.

Containerize from day one

Docker your MCP server before you deploy it anywhere. The dependency story for Python in particular gets messy fast — SDK versions, system libraries, conflicting packages. A Dockerfile that pins your exact environment saves debugging time. A minimal pattern:

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "server.py"]

Expose your HTTP transport port and deploy it like any other microservice — ECS, Cloud Run, a Kubernetes deployment, whatever your infrastructure already supports.

Testing Your MCP Server

Two testing strategies are worth keeping in your toolkit.

The first is the MCP Inspector, a browser-based tool from Anthropic that lets you connect to a running MCP server and manually invoke tools, browse resources, and inspect protocol messages. It's the fastest way to verify your server is behaving correctly before connecting it to an actual agent host. Run it with npx @modelcontextprotocol/inspector and point it at your server endpoint.

The second is unit testing your tool logic independently from the MCP layer. Your actual business logic — the database query, the API call, the file transform — should be a plain function you can test with pytest or Jest. Wrap it in the MCP tool decorator only after the core logic is solid. Trying to test tool behavior through an MCP connection adds latency and fragility to your test suite.

Common Production Pitfalls

A few patterns come up repeatedly in MCP server builds that cause real problems once they hit production.

Tool name collisions. If you connect an agent to multiple MCP servers and two servers expose a tool with the same name, behavior becomes unpredictable. Namespace your tool names: github_list_issues instead of list_issues. Do this even if you're only shipping one server today — it's much harder to rename tools after clients have adopted them.

Overly broad permissions. Because an agent, not a human, is calling your tools, it's easy to accidentally expose tools with destructive capabilities that the model will happily invoke. A tool that deletes records should require explicit confirmation or be gated behind a separate step. At minimum, log every tool invocation with the full arguments so you can audit what the model actually did.

Forgetting rate limits. MCP servers that proxy third-party APIs inherit those APIs' rate limits. If your agent runs in a loop — which agents do — it can exhaust rate limits in seconds. Add per-tool rate limiting in your server layer rather than relying on the upstream API to throttle gracefully. An agent that hits a 429 and gets no structured error to work from will often retry in an unhelpful loop.

Ignoring latency. Tools are called synchronously during model inference in most architectures. A tool that takes three seconds to respond stretches the entire interaction timeline. Profile your tools under realistic load. If a tool is consistently slow, consider caching, async patterns, or whether it belongs in the MCP layer at all versus being pre-fetched into context.

Where MCP Fits in a Broader Agent Architecture

MCP solves tool connectivity. It doesn't solve agent architecture. A well-designed agent system needs orchestration, memory, error recovery, and observability on top of whatever MCP provides.

For most production use cases, MCP servers are one component in a larger system. An orchestrator — LangGraph, a custom loop, or a framework from Genta's agent frameworks roundup — manages planning and decides when to call tools. The MCP server layer handles the actual tool execution. Observability sits around both layers to log what's happening and alert when things break.

If you're building the MCP server as part of a product where other teams or external clients will connect to it, treat it like a public API from day one: version it, document the schema, think about backwards compatibility. The MCP ecosystem is still moving fast, and protocol versions are not always backwards compatible across major SDK releases. Pinning your SDK version in requirements.txt and tracking the spec changelog saves painful surprises.

Building real agent infrastructure that holds up in production involves far more than the tool layer. If your team is working through the architecture decisions and wants to move faster, Genta works embedded with engineering teams to design and ship these systems — the kind of work where MCP server design is one piece of a larger puzzle that includes orchestration, evaluation, and deployment.

We’re Here to Help

Ready to transform your operations? We're here to help. Contact us today to learn more about our innovative solutions and expert services.

We’re Here to Help

Ready to transform your operations? We're here to help. Contact us today to learn more about our innovative solutions and expert services.

We’re Here to Help

Ready to transform your operations? We're here to help. Contact us today to learn more about our innovative solutions and expert services.