Cloud Blog: Build and Deploy a Remote MCP Server to Google Cloud Run in Under 10 Minutes

Source URL: https://cloud.google.com/blog/topics/developers-practitioners/build-and-deploy-a-remote-mcp-server-to-google-cloud-run-in-under-10-minutes/
Source: Cloud Blog
Title: Build and Deploy a Remote MCP Server to Google Cloud Run in Under 10 Minutes

Feedly Summary: Integrating context from tools and data sources into LLMs can be challenging, which impacts ease-of-use in the development of AI agents. To address this challenge, Anthropic introduced the Model Context Protocol (MCP), which standardizes how applications provide context to LLMs. Imagine you want to build an MCP server for your API to make it available to fellow developers so they can use it as context in their own AI applications. But where do you deploy it? Google Cloud Run could be a great option.
Drawing directly from the official Cloud Run documentation for hosting MCP servers, this guide shows you the straightforward process of setting up your very own remote MCP server. Get ready to transform how you leverage context in your AI endeavors!
MCP Transports
MCP follows a client-server architecture, and for a while, only supported running the server locally using the stdio transport.

https://modelcontextprotocol.io/introduction

MCP has evolved and now supports remote access transports: streamable-http and sse. Server-Sent Events (SSE) has been deprecated in favor of Streamable HTTP in the latest MCP specification but is still supported for backwards compatibility. Both transports allow MCP servers to run remotely.
With Streamable HTTP, the server operates as an independent process that can handle multiple client connections. This transport uses HTTP POST and GET requests.
The server MUST provide a single HTTP endpoint path (hereafter referred to as the MCP endpoint) that supports both POST and GET methods. For example, this could be a URL like https://example.com/mcp.
You can read more about the different transports in the official MCP docs.
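To make the single-endpoint rule concrete, the sketch below builds (but does not send) the kind of HTTP POST a client could issue to an MCP endpoint. It assumes the JSON-RPC 2.0 framing that MCP messages use; the helper name `build_mcp_post` and the example URL are illustrative and not part of any SDK. A real client would first send an `initialize` request, which is omitted here for brevity.

```python
import json
import urllib.request


def build_mcp_post(endpoint, method, params=None, request_id=1):
    """Build a POST request carrying one JSON-RPC message for an MCP endpoint.

    `build_mcp_post` is a hypothetical helper used only for illustration.
    """
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return urllib.request.Request(
        endpoint,
        data=json.dumps(message).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # The client accepts either a plain JSON reply or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
    )


# Build the request only; nothing is sent over the network here.
request = build_mcp_post("https://example.com/mcp", "tools/list")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Because every message goes to the same path, the client only needs to know one URL; the JSON-RPC `method` field selects the operation.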
Benefits of running an MCP server remotely
Running an MCP server remotely on Cloud Run can provide several benefits:

- Scalability: Cloud Run is built to rapidly scale out to handle all incoming requests. Cloud Run will scale your MCP server automatically based on demand.

- Centralized server: You can share access to a centralized MCP server with team members through IAM privileges, allowing them to connect to it from their local machines instead of all running their own servers locally. If a change is made to the MCP server, all team members will benefit from it.

- Security: Cloud Run provides an easy way to force authenticated requests. This allows only secure connections to your MCP server, preventing unauthorized access.

IMPORTANT: The security benefit is critical. If you don’t enforce authentication, anyone on the public internet can potentially access and call your MCP server. 
Prerequisites

Python 3.10+

uv (for package and project management; see the uv docs for installation)

Google Cloud SDK (gcloud)

Installation
Create a folder, mcp-on-cloudrun, to store the code for our server and deployment:

```
mkdir mcp-on-cloudrun
cd mcp-on-cloudrun
```

Let’s get started by creating a project with uv, a fast package and project manager for Python.

```
uv init --name "mcp-on-cloudrun" --description "Example of deploying a MCP server on Cloud Run" --bare --python 3.10
```

After running the above command, you should see the following pyproject.toml:

```
[project]
name = "mcp-on-cloudrun"
version = "0.1.0"
description = "Example of deploying a MCP server on Cloud Run"
requires-python = ">=3.10"
dependencies = []
```

Next, let’s create the additional files we will need: a server.py for our MCP server code, a test_server.py that we will use to test our remote server, and a Dockerfile for our Cloud Run deployment.

```
touch server.py test_server.py Dockerfile
```

Our file structure should now be complete:

```
├── mcp-on-cloudrun
│   ├── pyproject.toml
│   ├── server.py
│   ├── test_server.py
│   └── Dockerfile
```

Now that we have our file structure taken care of, let’s configure our Google Cloud credentials and set our project:

```
gcloud auth login
export PROJECT_ID=<your-project-id>
gcloud config set project $PROJECT_ID
```

Math MCP Server
LLMs are great at non-deterministic tasks: understanding intent, generating creative text, summarizing complex ideas, and reasoning about abstract concepts. However, they are notoriously unreliable for deterministic tasks, that is, tasks that have one, and only one, correct answer.
Giving LLMs deterministic tools (such as math operations) through MCP is one example of how tools can provide valuable context and improve how LLMs are used.
We will use FastMCP to create a simple math MCP server that has two tools: add and subtract. FastMCP provides a fast, Pythonic way to build MCP servers and clients.
Add FastMCP as a dependency to our pyproject.toml:

```
uv add fastmcp==2.6.1 --no-sync
```

Copy and paste the following code into server.py for our math MCP server:

```python
import asyncio
import logging
import os

from fastmcp import FastMCP

logger = logging.getLogger(__name__)
logging.basicConfig(format="[%(levelname)s]: %(message)s", level=logging.INFO)

mcp = FastMCP("MCP Server on Cloud Run")


@mcp.tool()
def add(a: int, b: int) -> int:
    """Use this to add two numbers together.

    Args:
        a: The first number.
        b: The second number.

    Returns:
        The sum of the two numbers.
    """
    logger.info(f">>> Tool: 'add' called with numbers '{a}' and '{b}'")
    return a + b


@mcp.tool()
def subtract(a: int, b: int) -> int:
    """Use this to subtract two numbers.

    Args:
        a: The first number.
        b: The second number.

    Returns:
        The difference of the two numbers.
    """
    logger.info(f">>> Tool: 'subtract' called with numbers '{a}' and '{b}'")
    return a - b


if __name__ == "__main__":
    # Cloud Run passes the port as a string in the PORT environment variable.
    port = int(os.getenv("PORT", 8080))
    logger.info(f"MCP server started on port {port}")
    # Could also use 'sse' transport, host="0.0.0.0" required for Cloud Run.
    asyncio.run(
        mcp.run_async(
            transport="streamable-http",
            host="0.0.0.0",
            port=port,
        )
    )
```
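Cloud Run tells the container which port to listen on through the `PORT` environment variable, and environment variables are always strings. The sketch below shows one way to turn that value into a validated integer before handing it to a server. The helper name `resolve_port` and the 8080 fallback are assumptions chosen to match this guide, not part of FastMCP.

```python
import os


def resolve_port(environ=os.environ, default=8080):
    """Return the port to listen on as an int (hypothetical helper).

    Falls back to `default` when PORT is unset or empty.
    """
    raw = environ.get("PORT")
    if not raw:
        return default
    port = int(raw)  # raises ValueError for non-numeric values
    if not 1 <= port <= 65535:
        raise ValueError(f"PORT out of range: {port}")
    return port


# Pass a plain dict instead of os.environ to try different settings.
print(resolve_port({"PORT": "9000"}))
print(resolve_port({}))
```

Accepting the environment as a parameter keeps the function easy to test without modifying the real process environment.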

Transport
We are using the streamable-http transport for this example because it is the recommended transport for remote servers, but you can still use sse if you prefer, as it remains supported for backwards compatibility.
If you want to use sse, change the transport argument in the mcp.run_async call at the end of server.py to transport="sse".
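The transport choice also affects the URL a client connects to. This guide's test script uses the `/mcp` path for streamable-http and notes `/sse` for the sse transport. A minimal sketch of that mapping follows; the helper `client_url` and the `DEFAULT_PATHS` table are illustrative names, and a server configured with a custom path would need a different value.

```python
# Endpoint paths as used in this guide; a server may be configured differently.
DEFAULT_PATHS = {
    "streamable-http": "/mcp",
    "sse": "/sse",
}


def client_url(base_url, transport="streamable-http"):
    """Join a base URL with the endpoint path for the given transport."""
    try:
        path = DEFAULT_PATHS[transport]
    except KeyError:
        raise ValueError(f"unsupported transport: {transport!r}") from None
    return base_url.rstrip("/") + path


print(client_url("http://localhost:8080"))
print(client_url("http://localhost:8080/", transport="sse"))
```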
Deploying to Cloud Run
Now let’s deploy our simple MCP server to Cloud Run.
Copy and paste the below code into our empty Dockerfile; it uses uv to run our server.py:

```
# Use the official Python lightweight image
FROM python:3.13-slim

# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

# Install the project into /app
COPY . /app
WORKDIR /app

# Allow statements and log messages to immediately appear in the logs
ENV PYTHONUNBUFFERED=1

# Install dependencies
RUN uv sync

EXPOSE $PORT

# Run the FastMCP server
CMD ["uv", "run", "server.py"]
```

You can deploy directly from source, or by using a container image.
For both options we will use the --no-allow-unauthenticated flag to require authentication.
This is important for security reasons. If you don’t require authentication, anyone can call your MCP server and potentially cause damage to your system.
Option 1 – Deploy from source 

```
gcloud run deploy mcp-server --no-allow-unauthenticated --region=us-central1 --source .
```

Option 2 – Deploy from a container image 
Create an Artifact Registry repository to store the container image.

```
gcloud artifacts repositories create remote-mcp-servers \
  --repository-format=docker \
  --location=us-central1 \
  --description="Repository for remote MCP servers" \
  --project=$PROJECT_ID
```

Build the container image and push it to Artifact Registry with Cloud Build.

```
gcloud builds submit --region=us-central1 --tag us-central1-docker.pkg.dev/$PROJECT_ID/remote-mcp-servers/mcp-server:latest
```

Deploy our MCP server container image to Cloud Run.

```
gcloud run deploy mcp-server \
  --image us-central1-docker.pkg.dev/$PROJECT_ID/remote-mcp-servers/mcp-server:latest \
  --region=us-central1 \
  --no-allow-unauthenticated
```
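The image reference used in the build and deploy commands follows the Artifact Registry pattern of location, project, repository, image, and tag. If you script these steps, a small helper can keep the two commands in sync. The sketch below is illustrative only; the function name `artifact_registry_image` and the project ID `my-project` are placeholders rather than anything from the gcloud tooling.

```python
def artifact_registry_image(project_id, repository, image,
                            location="us-central1", tag="latest"):
    """Compose a Docker image reference for Artifact Registry.

    Mirrors the layout used in this guide:
    LOCATION-docker.pkg.dev/PROJECT_ID/REPOSITORY/IMAGE:TAG
    """
    return f"{location}-docker.pkg.dev/{project_id}/{repository}/{image}:{tag}"


# "my-project" is a placeholder project ID.
print(artifact_registry_image("my-project", "remote-mcp-servers", "mcp-server"))
```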

Once you have completed either option, if your service has successfully deployed you will see a message like the following:

```
Service [mcp-server] revision [mcp-server-12345-abc] has been deployed and is serving 100 percent of traffic.
```

Authenticating MCP Clients
Since we specified --no-allow-unauthenticated to require authentication, any MCP client connecting to our remote MCP server will need to authenticate.
The official documentation, Host MCP servers on Cloud Run, provides more information on this topic depending on where you are running your MCP client.
For this example, we will run the Cloud Run proxy to create an authenticated tunnel to our remote MCP server on our local machines.
By default, the URL of Cloud Run services requires all requests to be authorized with the Cloud Run Invoker (roles/run.invoker) IAM role. This IAM policy binding ensures that a strong security mechanism is used to authenticate your local MCP client.
Make sure that you or any team members trying to access the remote MCP server have the roles/run.invoker IAM role bound to their IAM principal (Google Cloud account).
NOTE: The following command may prompt you to download the Cloud Run proxy if it is not already installed. Follow the prompts to download and install it.

```
gcloud run services proxy mcp-server --region=us-central1
```

You should see the following output:

```
Proxying to Cloud Run service [mcp-server] in project [<YOUR_PROJECT_ID>] region [us-central1]
http://127.0.0.1:8080 proxies to https://mcp-server-abcdefgh-uc.a.run.app
```

All traffic to http://127.0.0.1:8080 will now be authenticated and forwarded to our remote MCP server.
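For clients that cannot run the proxy, one alternative is to attach an identity token to each request yourself, which is roughly what the proxy does on your behalf. The sketch below is an assumption-laden illustration rather than the method this guide uses: it relies on the `gcloud auth print-identity-token` command and on Cloud Run accepting the token in an `Authorization: Bearer` header. The helper names and the service URL are placeholders.

```python
import subprocess
import urllib.request


def fetch_identity_token():
    """Ask the gcloud CLI for an identity token for the active account.

    Requires gcloud to be installed and logged in; not called in the demo below.
    """
    result = subprocess.run(
        ["gcloud", "auth", "print-identity-token"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def authenticated_request(url, token):
    """Build a request that carries the identity token as a bearer token."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Placeholder URL and token; swap in fetch_identity_token() for real use.
request = authenticated_request("https://mcp-server-abcdefgh-uc.a.run.app/mcp", "<identity-token>")
print(request.get_header("Authorization"))
```

The caller still needs the roles/run.invoker role; the token only proves who is calling.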
Testing the remote MCP server
Let’s connect to the remote MCP server with the FastMCP client at http://127.0.0.1:8080/mcp (note the /mcp at the end, as we are using the Streamable HTTP transport) and call the add and subtract tools.
Add the following code to the empty test_server.py file:

```python
import asyncio

from fastmcp import Client


async def test_server():
    # Test the MCP server using streamable-http transport.
    # Use "/sse" endpoint if using sse transport.
    async with Client("http://localhost:8080/mcp") as client:
        # List available tools
        tools = await client.list_tools()
        for tool in tools:
            print(f">>> Tool found: {tool.name}")
        # Call add tool
        print(">>> Calling add tool for 1 + 2")
        result = await client.call_tool("add", {"a": 1, "b": 2})
        print(f"<<< ✅ Result: {result[0].text}")
        # Call subtract tool
        print(">>> Calling subtract tool for 10 - 3")
        result = await client.call_tool("subtract", {"a": 10, "b": 3})
        print(f"<<< ✅ Result: {result[0].text}")


if __name__ == "__main__":
    asyncio.run(test_server())
```

NOTE: Make sure the Cloud Run proxy is running before you run the test script.
In a new terminal run:

```
uv run test_server.py
```

You should see the following output:

```
>>> Tool found: add
>>> Tool found: subtract
>>> Calling add tool for 1 + 2
<<< ✅ Result: 3
>>> Calling subtract tool for 10 - 3
<<< ✅ Result: 7
```
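Under the hood, a tool call like the ones above travels as a JSON-RPC message with the `tools/call` method, and the result comes back as a list of content items. The sketch below shows the approximate message shapes using only the standard library. The helper names are hypothetical and `sample_response` is hand-written for illustration, not captured from a real server.

```python
import json


def tools_call_message(name, arguments, request_id=1):
    """Build the JSON-RPC message for calling an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def text_results(response):
    """Collect the text items from a tools/call response."""
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "unknown error"))
    content = response["result"]["content"]
    return [item["text"] for item in content if item.get("type") == "text"]


# Hand-written example of what a reply to add(1, 2) could look like.
sample_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": "3"}], "isError": False},
}

print(json.dumps(tools_call_message("add", {"a": 1, "b": 2})))
print(text_results(sample_response))
```

This is why the test script reads `result[0].text`: the first content item of the reply holds the tool's return value as text.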

You’ve done it! You have successfully deployed a remote MCP server to Cloud Run and tested it using the FastMCP client.
Want to learn more about deploying AI applications on Cloud Run? Check out this blog from Google I/O to learn the latest on Easily Deploying AI Apps to Cloud Run!
Continue Reading

Host MCP servers on Cloud Run

MCP server for helping deploy applications to Cloud Run

AI Summary and Description: Yes

Summary: The text discusses the implementation of the Model Context Protocol (MCP) for enhancing the usability of large language models (LLMs) in AI applications. It provides a comprehensive guide on deploying an MCP server via Google Cloud Run, highlighting its scalability, security features, and the importance of authentication. The technical insights offered are particularly relevant for professionals involved in AI, cloud infrastructure, and security.

Detailed Description: The text outlines the following critical aspects related to integrating MCP into LLM applications:

- **Introduction to Model Context Protocol (MCP)**:
  - MCP standardizes how applications provide contextual information to LLMs, crucial for building effective AI agents.
  - The necessity arises from challenges in integrating context, impacting development ease.

- **Benefits of Remote MCP Server Deployment**:
  - **Scalability**: Cloud Run can automatically scale the MCP server based on demand, ensuring efficient resource usage.
  - **Centralization**: A single server can serve multiple users, allowing for updates that benefit all team members.
  - **Security**: The ability to enforce authenticated requests mitigates risks associated with unauthorized access.

- **Setting Up an MCP Server**:
  - Prerequisites include specific software such as Python and Google Cloud SDK.
  - Detailed commands for creating and structuring the server implementation using uv (a project manager) are provided.

- **Transport Mechanisms**:
  - The text describes client-server architecture and mentions that MCP now supports remote transport methods like streamable-http, enhancing its functionality.

- **Deployment Instructions**:
  - Guidance is offered for deploying the server on Google Cloud Run, emphasizing the need for authentication to secure server access.
  - Specific code snippets are shared for deploying via source or container images, ensuring clarity in the setup process.

- **Testing the MCP Server**:
  - Instructions for testing the server’s functionality utilizing add and subtract tools provide practical insights into the server’s utility.
  - Emphasizes the importance of confirming successful authentication before conducting tests.

- **Conclusion and Further Learning**:
  - Encourages further exploration of deploying AI models on Cloud Run, directing readers to external resources for deeper knowledge.

Overall, this text provides vital information for security and compliance professionals as it addresses potential security vulnerabilities inherent in cloud computing and emphasizes the importance of robust authentication when deploying AI solutions.