MCP: Make A Game-Changing API For Your AI

Introduction

If you’ve been working in the world of Generative AI for any amount of time, you will certainly have heard of agents. Though, to be fair, even if you haven’t, you will likely still have heard of them. About a year ago, agentic frameworks began to shake up the world of generative AI, and like RAG (retrieval-augmented generation) before them, they completely changed the way we saw and leveraged LLMs in a very short amount of time.

Now agents are everywhere, trying to do all sorts of clever things, but time stands still for no one, especially in the whiplash-inducing world of AI, and with each new wave of innovation comes a fresh set of acronyms, buzzwords, and challenges. Namely: if we have all these agents running around doing all sorts of increasingly complex tasks, how do we manage them? How do we ensure they can talk to each other? And how do we keep our systems safe whilst using them?

Enter the new flavour of the month: “MCP”.

MCP (Model Context Protocol) servers are emerging as a key architectural pattern for managing the increasingly complex demands of LLM-powered applications. They’re designed to bring structure, flexibility, and performance to environments where different agents and models need to be orchestrated, prompted, and scaled to meet ever more demanding requirements. And like RAG or agentic frameworks before them, they’re proving to be just as foundational – and just as game-changing.

In this post, we’ll unpack what MCP servers actually are, why they matter, and how they fit into the broader generative AI ecosystem. Whether you’re building, deploying, or just curious, there’s plenty to explore, so let's start with a fairly simple question.

Why do we need MCP?

Augmenting LLMs to boost their performance has been common practice for years. RAG is a popular way of augmenting LLM inputs by dynamically adding extra context, and tools have become a standard way of enhancing LLMs by giving them access to external data sources and the ability to run deterministic code.

There are loads of brilliant resources out there on how to get started with them (including some from Advancing Analytics, hint hint click click), so we won’t be going into the details of constructing them here. For now, let’s just take a simple tool which you can find in notebook 3.1 from our Advancing AI agents series. This tool allows an LLM to reliably solve the quadratic formula for a given set of input values by using deterministic Python code:

[Screenshot: the quadratic formula tool from notebook 3.1]
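
If you don't have the notebook to hand, a minimal sketch of such a tool, built with LangChain's @tool decorator, might look like the following (the function name and output format are illustrative, not the exact notebook code):

    from langchain_core.tools import tool

    @tool
    def solve_quadratic(a: float, b: float, c: float) -> str:
        """Solve ax^2 + bx + c = 0 deterministically and return the real roots."""
        if a == 0:
            return "Not a quadratic: 'a' must be non-zero."
        discriminant = b ** 2 - 4 * a * c
        if discriminant < 0:
            return "No real roots for these coefficients."
        root = discriminant ** 0.5
        return f"x = {(-b + root) / (2 * a)}, x = {(-b - root) / (2 * a)}"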

Imagine you've built this tool for an AI application which you are developing for your organisation. Your application is using LangChain as a framework, and so that is how you have built your tool to work - so far, so good. Then imagine that a colleague sees what you have been developing and says “wow, we really love how you’ve built that tool. We’d love an agent to call that regularly as part of an AI app we're making, can you help us out?”. After the flattery has subsided, you realise a couple of challenges:

  • How are you going to share your tool with them? 
    You can’t just zip it up and send it over, what if you make improvements and changes to it?
  • What if they fork it off your repo?
    Well then that’s making their solution more complex and introducing extra risks. And what if someone else comes along who wants to use it? This won’t scale well.
  • What if you host the tool somewhere, and then other teams can pull it down as needed?
    This could work, but it's still quite limited. You can store tools in central locations, such as Unity Catalog functions, but then you're just kicking the can down the road: different teams still need access to a central storage location, which may be reasonable in small teams and organisations but won't stay watertight for long... And what if someone outside of your organisation wants to use it? Are you going to open up your Catalog to external traffic?

So, how would you normally serve tools and functions to users in other applications? How about an API?

This is where an MCP server comes in.

How does MCP solve the problem?

Rather than your colleague importing your tool from somewhere and binding their LLM to your toolkit locally, as you might do with the LangChain bind_tools method or create_react_agent function, why not host your toolkit on a server and allow them to bind their LLM to that server instead? 

This is the crux of MCP. Rather than keeping tool usage siloed, it allows different teams, even different organisations, to develop and maintain tools which others are free to come along and use. So MCP servers essentially act as APIs specialised and standardised for AI consumption.
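
As a rough sketch of what that hosting could look like, here's the quadratic tool served via FastMCP from the official MCP Python SDK (the server name is made up, and the tool body is the same illustrative code as above):

    from mcp.server.fastmcp import FastMCP

    # A named MCP server that other teams' agents can discover and call
    mcp = FastMCP("quadratic-tools")

    @mcp.tool()
    def solve_quadratic(a: float, b: float, c: float) -> str:
        """Solve ax^2 + bx + c = 0 deterministically and return the real roots."""
        if a == 0:
            return "Not a quadratic: 'a' must be non-zero."
        discriminant = b ** 2 - 4 * a * c
        if discriminant < 0:
            return "No real roots for these coefficients."
        root = discriminant ** 0.5
        return f"x = {(-b + root) / (2 * a)}, x = {(-b - root) / (2 * a)}"

    if __name__ == "__main__":
        mcp.run()  # defaults to the stdio transport

Your colleague's agent can now discover and call solve_quadratic at runtime, without ever importing your code.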

As you might ping the OpenWeatherMap API, so too might an LLM call an OpenWeatherMap-managed MCP server. As your colleagues might ask you to run some queries against your organisation's data lake, so too might their copilot agent ask your organisation's MCP server to run a data lake query tool.

Bonus: MCP also offers an opportunity to solve one of the biggest asks we see from clients - web scraping! Rather than bots wrestling with those pesky human-centric web pages to access site information, wouldn't it be far more convenient to give visiting bots their own interface into your site's content?

But why stop there? Not only can you provide tools for agents to come along and use, but you can also provide prompts which have been specifically tailored with few-shot examples and Jinja-esque parametrisation for different use cases, or even documentation for agents to read before making their next step. In this fashion, MCP allows LLMs and agents to become even more modular, with prompts and tools provided on demand rather than all up front.
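
The same SDK makes this straightforward; as a hypothetical example, the server above could serve a parametrised, few-shot prompt alongside its tools (the prompt name and wording here are our own invention):

    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("quadratic-tools")

    @mcp.prompt()
    def solve_equation_prompt(equation: str) -> str:
        """A tailored prompt, fetched by agents on demand."""
        return (
            "You are a careful mathematician.\n"
            "Example: for x^2 - 3x + 2 = 0, the roots are x = 1 and x = 2.\n"
            f"Now solve the following equation, showing your working:\n{equation}"
        )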

So how is this happening?

The answer, as is so frequently the case with humanity’s greatest achievements, is standards! In this particular case, the MCP server provides public-facing endpoints which allow for anonymous request-response interactions. This means that, if you know you are talking to a server which follows the MCP standard, there is no need to know how it works internally or to build a set of custom connections for each new server. All you need to know is that you can provide a JSON payload of a set structure, and get back a predictable JSON response.
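
Concretely, every compliant server speaks the same JSON-RPC 2.0 shape; a tool call and its reply look roughly like this (shown here as Python dicts, with our running quadratic example as the payload):

    # Request: invoke a tool by name with structured arguments
    request = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {"name": "solve_quadratic", "arguments": {"a": 1, "b": -3, "c": 2}},
    }

    # Response: a predictable structure, whatever the server does internally
    response = {
        "jsonrpc": "2.0",
        "id": 1,
        "result": {"content": [{"type": "text", "text": "x = 2.0, x = 1.0"}]},
    }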

In this way, the MCP server could be providing tools or prompts which do just about anything - acting as modular processes which expose the functionality of a multitude of software or services to your agents. These servers can live anywhere: on your local machine, in a cloud service like Azure Functions or Databricks Apps, or even inside enterprise environments if you need an extra level of security for internal implementations. If your environment can run a server, it can support MCP.

[Diagram: agents connecting to MCP servers across different hosting environments]

This flexibility makes MCP a solid foundation for building AI applications which are both portable and interoperable at any scale.

Under the hood, client and server communicate using JSON-RPC messaging, traditionally carried over stdio (standard input/output) for locally run servers. As MCP develops, it also supports more advanced transport layers such as Streamable HTTP, a more modern method which supports persistent sessions and streaming responses. Although MCP interactions are stateless by default, using Streamable HTTP allows for session management and context preservation across multiple AI interactions.
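
From the client side, talking to the quadratic server sketched earlier over stdio might look like this with the official Python SDK (the server file name is our assumption):

    import asyncio

    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    # Launch the server as a subprocess and exchange JSON-RPC over stdin/stdout
    server = StdioServerParameters(command="python", args=["quadratic_server.py"])

    async def main():
        async with stdio_client(server) as (read, write):
            async with ClientSession(read, write) as session:
                await session.initialize()          # protocol handshake
                tools = await session.list_tools()  # discover what's on offer
                result = await session.call_tool(
                    "solve_quadratic", {"a": 1, "b": -3, "c": 2}
                )
                print(result.content)

    asyncio.run(main())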

A quick word about robots

Robots? Why are we talking about them?

Partially because everyone always uses the USB analogy for MCP, so I want to use a different one! But also because this isn't the first time an industry has faced the issue of standardisation, and it certainly won't be the last - and one case study for how the dawn of MCP might play out is ROS.

ROS (the Robot Operating System) is a framework used in the world of robotics to provide pre-built modules and building blocks which can be chained together to create complex robot control systems. It saves engineers time by providing the "plumbing" between black-box, open-source, community-driven modules which all fit together using a standardised publish/subscribe pattern.

Like any large open-source community, ROS has contributors of all sizes, from individual researchers to large corporations, and it has become an industry standard for robotic development.

This is where I can see MCP heading - becoming an ecosystem of servers and modules, developed at every level, which encourages good practices in agent development and enables cleaner, more reusable agentic frameworks. This could make the internet just as accessible and flexible for AI agents as REST APIs do for humans.

Security considerations

Like any new tool, MCP does introduce several security concerns that must be addressed to ensure safe and reliable interactions between clients and servers.

One major risk is token theft, where attackers gain access to stored or logged tokens and use them to impersonate legitimate users. To mitigate this, MCP implementations must use secure token storage and issue short-lived tokens. Refresh tokens should be rotated, especially for public clients.

MCP servers should also leverage HTTPS, and redirect URIs should be secure to prevent interception of communications. Authorisation code interception is also a threat - MCP clients must implement Proof Key for Code Exchange (PKCE) to prevent this.
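
For a flavour of what PKCE involves, here is a minimal sketch of generating the verifier/challenge pair using the spec's S256 method (real clients should lean on a maintained OAuth library rather than rolling their own):

    import base64
    import hashlib
    import secrets

    # The client keeps the verifier secret and sends only the challenge upfront
    code_verifier = base64.urlsafe_b64encode(
        secrets.token_bytes(32)
    ).rstrip(b"=").decode("ascii")

    code_challenge = base64.urlsafe_b64encode(
        hashlib.sha256(code_verifier.encode("ascii")).digest()
    ).rstrip(b"=").decode("ascii")

    # The challenge accompanies the authorisation request; the verifier is only
    # revealed when exchanging the authorisation code for a token, so an
    # intercepted code is useless on its own.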

Open redirection vulnerabilities can also lead users to malicious sites; to prevent this, redirect URIs must be pre-registered and validated. Confused deputy problems and token privilege escalation can likewise occur if tokens are misused across services, so MCP servers must validate token audiences and avoid passing tokens on to upstream APIs.

Fixes include strict adherence to OAuth 2.1 best practices, proper token validation, secure communication protocols, and careful client registration and consent handling. These measures collectively reduce the attack surface and enhance the overall security of MCP implementations.

When working with MCP, you should make sure you are familiar with the security recommendations and best practices provided by the community, which may change rapidly in this fast-moving space!

Conclusion

MCP is a powerful next step in the world of generative AI, and this is just the beginning! MCP is already starting to reshape how organisations think about and structure their AI and agentic solutions, but there is still a long way to go before it becomes the accepted norm. Nonetheless, momentum is picking up fast, with the MCP community already boasting:

  • 9 official SDKs
  • Over 1000 available servers
  • Over 70 compatible clients

So now is the perfect time to get learning and wrap your head around the core ideas of how this all works. We're currently working on some great follow-up content to help guide you through your journey of getting started with MCP, so make sure you keep an eye on our YouTube channel and blog feed for that!

But if you can't wait, then we recommend you take a look at the official MCP info and documentation pages to read about MCP in a bit more depth, and check out some of their great getting started tutorials!

Author

James Bentley