Foundation Models Explained: The Engines Behind Modern AI

Behind almost every AI product you have used recently sits a foundation model. The chat assistant that drafts your emails, the tool that summarises your meetings, the system that answers customer questions: all are typically built on top of one of these large, general-purpose models. Foundation models are the engines of modern artificial intelligence, yet the term is rarely explained clearly to the people making business decisions about AI.

This guide demystifies foundation models in plain language. We will explain what they are, why they represented such a shift in how AI is built, how a single broad model can be adapted to countless specific tasks, and what all of this means practically for your organisation. If you would like a broader grounding first, our plain-English guide to artificial intelligence sets the wider context.

What a foundation model is

A foundation model is a large AI system trained on a broad and diverse range of data so that it can be adapted to many different tasks rather than just one. The name, popularised in 2021 by researchers at Stanford, captures the idea well: it is a base that other applications are built upon. Instead of training a separate model from scratch for every problem, organisations start from a powerful general model and adapt it, which is far faster and cheaper than starting over each time.

The most familiar foundation models are large language models, which work with text and power today's chat assistants. You can read more about how those generate text in our guide to large language models. But foundation models are not limited to text. There are foundation models for images, audio, video, and code, and increasingly multimodal models that handle several of these at once, for example interpreting a picture together with a written question about it.

One base, many uses
A single foundation model can be adapted to hundreds of downstream tasks without retraining from scratch.
Source: Stanford HAI AI Index

Why foundation models changed everything

Before foundation models, building an AI system usually meant collecting a large, carefully labelled dataset for one narrow task and training a dedicated model on it. This was slow, expensive, and required specialist expertise for every new use case. A model that detected defects on a production line could not help with customer emails; each problem started from zero.

Foundation models broke that pattern. By training one very large model on broad data, researchers discovered that the same model could be adapted to a huge variety of tasks, often with only a little extra guidance. This is sometimes described as a shift from building many narrow tools to building one versatile platform. For businesses, it means the cost and effort of getting started with AI have fallen sharply, because you are adapting a ready-made engine rather than constructing one.

How a general model becomes specialised

There are several ways to adapt a foundation model to your needs, ranging from simple to involved. The lightest touch is prompting, where you simply give the model clear instructions and examples in plain language. A step further is retrieval, where you connect the model to your own documents so it answers using your specific information rather than only its general training. The most involved is fine-tuning, where the model is given additional training on examples from your domain so it adopts a particular style or specialism. Most businesses get a long way with prompting and retrieval alone, without ever needing to fine-tune.

Ways to adapt a foundation model
Approach      When it makes sense
Prompting     Quick wins where clear instructions and examples are enough
Retrieval     Answering from your own documents and current data
Fine-tuning   A consistent specialist style or domain at larger scale
Combination   Most production systems blend prompting and retrieval
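
To make the combination of prompting and retrieval concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a production design: the sample documents and the names DOCUMENTS, tokenize, retrieve, and build_prompt are all hypothetical, and the simple word-overlap ranking stands in for the more capable search a real system would use. No model is called; the sketch only shows how the text sent to a model might be assembled.

```python
import re

# Hypothetical knowledge base: in practice these would be your own documents.
DOCUMENTS = [
    "Refunds are available within 30 days of purchase with a receipt.",
    "Our support line is open Monday to Friday, 9am to 5pm.",
    "Standard delivery takes three to five working days.",
]


def tokenize(text):
    """Lowercase the text and return the set of words it contains."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question.

    Assumption: word overlap is a deliberately simple stand-in for
    whatever search method a real retrieval system would use.
    """
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents):
    """Combine plain-language instructions with the retrieved context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents))
    return (
        "You are a helpful support assistant. "
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_prompt("How long does delivery take?", DOCUMENTS)
print(prompt)
```

The instructions at the top of the prompt are the prompting step, and the context lines are the retrieval step. Swapping the documents changes what the model can answer from without any retraining.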

The leading foundation models in 2026

A handful of organisations build the most capable foundation models, and the field moves quickly. OpenAI's GPT-5 family, Anthropic's Claude models, and Google's Gemini are among the most widely used hosted models, with xAI's Grok also prominent. Alongside these sit open-weight foundation models that organisations can download and run themselves, including Meta's Llama, DeepSeek's models, Alibaba's Qwen, and Z.AI's GLM. The choice between hosted and self-run models involves real trade-offs in control, cost, and convenience, which we unpack in our guide to open versus closed AI models.

These frontier models are increasingly capable. The strongest now handle very large context windows, in some cases approaching a million tokens, meaning they can take in lengthy documents or large bodies of material in a single request. Their abilities are tracked on public benchmarks and leaderboards, which give a rough sense of relative strength across knowledge, reasoning, and coding.
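
As a rough illustration of what a context window means in practice, the Python sketch below estimates whether a document would fit in a single request. The figure of four characters per token is only a common rule of thumb for English text and is an assumption here, as are the function names and the space reserved for the answer. Real counts depend on each model's own tokenizer.

```python
import math

# Assumption: roughly four characters per token for English text.
# Actual token counts vary by model and by language.
CHARS_PER_TOKEN = 4


def estimate_tokens(text):
    """Return a rough token estimate based on character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(text, context_window_tokens, reserved_for_answer=1000):
    """Check whether the text, plus room for a reply, fits in the window."""
    return estimate_tokens(text) + reserved_for_answer <= context_window_tokens


# A made-up document of one million characters.
document = "word " * 200_000

print(estimate_tokens(document))             # estimated tokens in the document
print(fits_in_context(document, 1_000_000))  # a window of about a million tokens
print(fits_in_context(document, 128_000))    # a smaller, hypothetical window
```

An estimate like this is only good for a first check. For anything close to the limit, count tokens with the tool your chosen model provider supplies.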

Build on, not from scratch
Adapting an existing foundation model is far cheaper than training one, putting capable AI within reach of smaller organisations.
Source: Stanford HAI AI Index

What foundation models mean for your business

The single most important takeaway is that you almost never need to build a foundation model yourself. Training one requires enormous amounts of data and computing power, along with expertise that only a few organisations possess. Your opportunity lies in adapting these powerful engines to your specific problems, which is well within reach of an ordinary business. The skill that matters is not building AI, but choosing the right model and applying it sensibly.

This is why so many practical AI applications, from drafting assistants to customer service, are really foundation models adapted to a job. A customer support tool such as a WhatsApp AI chatbot is typically a foundation model connected to your knowledge base and given clear instructions. Because the underlying model is general, the same approach can be pointed at many different tasks across your organisation.

Choosing and governing your model

With many capable foundation models available, the decision comes down to matching capability, cost, speed, and data handling to your needs. The most powerful model is not always the right one, since smaller models are often faster and cheaper for routine work. Our guide to choosing the right AI model walks through this in detail. Whichever you pick, the same governance principles apply: be careful about what data you share, keep humans reviewing important outputs, and strengthen your data foundations so the model has good information to work with. If you would like help applying a foundation model to a real problem, explore our AI chatbot solution or get in touch.
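
The idea of matching capability, cost, speed, and data handling to your needs can be sketched as a simple filter. The Python example below is hypothetical throughout: the ModelOption fields, the candidate names, and every number are placeholders invented for illustration and do not describe any real model or price list.

```python
from dataclasses import dataclass


@dataclass
class ModelOption:
    name: str
    capability: int              # 1 (basic) to 5 (most capable); placeholder rating
    cost_per_request: float      # placeholder, arbitrary currency units
    seconds_per_request: float   # placeholder typical response time
    keeps_data_in_house: bool    # True if you run the model yourself


# Hypothetical candidates; none of these figures describe a real model.
OPTIONS = [
    ModelOption("large-hosted", 5, 0.050, 4.0, False),
    ModelOption("small-hosted", 3, 0.005, 1.0, False),
    ModelOption("self-run-open-weight", 3, 0.010, 2.0, True),
]


def choose_model(options, min_capability, max_seconds, needs_in_house_data=False):
    """Return the cheapest option that meets every requirement, or None."""
    suitable = [
        option
        for option in options
        if option.capability >= min_capability
        and option.seconds_per_request <= max_seconds
        and (option.keeps_data_in_house or not needs_in_house_data)
    ]
    if not suitable:
        return None
    return min(suitable, key=lambda option: option.cost_per_request)


# Routine work: modest capability is enough, so the cheaper model wins.
routine = choose_model(OPTIONS, min_capability=3, max_seconds=2.0)
print(routine.name)
```

Filtering on requirements first and then picking the cheapest survivor reflects the point above: the most powerful model is only the right choice when the task actually demands it.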

Frequently asked questions

What is the difference between a foundation model and a large language model?
A large language model is a foundation model that works with text. Foundation model is the broader term, covering models for images, audio, video, and code as well. Every large language model is a foundation model, but not every foundation model handles language.
Do I need to build my own foundation model?
Almost certainly not. Training a foundation model requires resources that only a few large organisations have. Your opportunity is to adapt an existing model to your needs through prompting, retrieval, or fine-tuning, which is far cheaper and within reach.
What does multimodal mean?
A multimodal foundation model can work with more than one type of input or output, such as text and images together. This lets it do things like describe a photo, read a chart, or answer a written question about a picture you provide.
How do I make a foundation model use my own information?
The common approach is retrieval, where the model is connected to your documents and data so it answers from your current, specific information rather than only its general training. This helps keep answers accurate and relevant without retraining the model.

References

  1. Stanford HAI. "AI Index Report." hai.stanford.edu.
  2. Anthropic. "Introducing Claude." anthropic.com.