AI Coding Assistants: What They Can and Can't Do

Few areas of artificial intelligence have generated as much excitement — or as much confusion — as AI coding assistants. Tools such as GitHub Copilot, Cursor, and Claude Code can write software from a plain-language description, and the demonstrations are genuinely striking. For a business owner, this raises an obvious and important question: can these tools build the software my business needs, and what happens to the cost and time of development if they can?

The honest answer is nuanced. AI coding assistants are remarkably capable and already changing how software is built, but they are assistants, not replacements for skilled developers. This article explains, without jargon, what they do well, where they fall short, and how to think about them if your business depends on software — whether that is a website, an app, or internal tools.

What an AI coding assistant actually does

At its simplest, an AI coding assistant takes a description of what you want — in plain English or in partial code — and produces the code to achieve it. It can write new features, explain existing code, find and fix bugs, translate between programming languages, and generate the tedious boilerplate that consumes so much developer time. It works alongside a developer in their editor, suggesting code as they type and responding to requests in conversation.

The underlying technology is the same family of large language models that power general AI assistants, trained on vast amounts of publicly available code. Because so much of programming follows recognisable patterns, these models become very good at producing plausible, often correct, code for common tasks. That is the source of both their power and their pitfalls, as we will see.

It is worth noting how varied these tools have become. Some live quietly inside a developer's editor, offering suggestions one line at a time. Others operate more like a conversational collaborator that can take on a whole task, work across many files, and report back. The more autonomous a tool is, the more it can accomplish in one go — but also the more carefully its output needs to be reviewed, because a small misunderstanding can propagate across a lot of code before anyone notices.

An assistant, not an author

These tools accelerate a skilled developer's work but still rely on human judgement to review and direct them

Source: Stanford HAI AI Index

What they do well

It is worth being specific about the genuine strengths, because they are substantial. The clearest win is speed on routine work. A great deal of programming is repetitive — standard functions, common patterns, connecting one system to another — and AI assistants handle this quickly, freeing developers to focus on the harder, more creative parts of a project.

They are also excellent teachers and explainers. A developer facing unfamiliar code can ask the assistant to explain what it does, and a less experienced one can learn faster by seeing how a task might be approached. For finding bugs, the assistant offers a fresh pair of eyes that can spot mistakes a tired developer has read past a dozen times. And for prototyping, meaning quickly building a rough version of an idea to see whether it is worth pursuing, they can be transformative, sometimes turning a day of work into an hour.

This last point deserves emphasis for business owners. The ability to stand up a rough working version of an idea in an afternoon changes how cheaply you can test concepts. A feature you were unsure about can be prototyped and shown to a few customers before you commit to building it properly. Used this way, AI assistants reduce the cost of being wrong, which encourages the kind of low-risk experimentation that helps a business learn what its customers actually want.

What this means for your costs and timelines

For a business, the practical effect is that a capable developer can get more done in less time, particularly on the routine portions of a project. This does not usually mean software becomes dramatically cheaper overnight, but it can mean faster delivery and more capacity from the same team. The savings are real, but they accrue to teams that already know how to direct and review the tool.

Where they fall short

The limitations are just as important to understand, and overlooking them is where businesses get burned. The most fundamental is that AI assistants do not truly understand your business, your users, or the bigger picture of what the software is for. They produce code that addresses the immediate request, but they cannot make the judgement calls — about architecture, trade-offs, and long-term consequences — that separate software that works in a demonstration from software that holds up in the real world.

They also make mistakes confidently. An assistant can produce code that looks correct, runs without obvious error, and is nonetheless subtly wrong — mishandling an edge case, introducing a security weakness, or behaving incorrectly under unusual conditions. Because the code looks polished, these errors are easy to miss without an experienced reviewer. This is why every reputable use of these tools keeps a skilled human firmly in control, reviewing what the assistant produces before it reaches real users.
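To make this concrete, here is a small illustrative sketch in Python. The bill-splitting scenario and both function names are invented for this example and do not come from any particular tool. The first function is the kind of draft that looks finished and runs without complaint; the second shows what a careful reviewer might change.

```python
def split_bill_naive(total_cents: int, people: int) -> list[int]:
    """Plausible first draft: divides evenly but silently drops leftover cents."""
    share = total_cents // people
    return [share] * people


def split_bill_reviewed(total_cents: int, people: int) -> list[int]:
    """Reviewed version: rejects invalid input and hands out the leftover cents."""
    if people <= 0:
        raise ValueError("people must be a positive whole number")
    share, leftover = divmod(total_cents, people)
    # The first `leftover` people pay one extra cent so the shares add up exactly.
    return [share + 1 if i < leftover else share for i in range(people)]


if __name__ == "__main__":
    print(sum(split_bill_naive(1000, 3)))     # 999: one cent has gone missing
    print(sum(split_bill_reviewed(1000, 3)))  # 1000
```

Splitting 1000 cents three ways with the first version produces shares that add up to 999, so a cent quietly disappears. Nothing crashes in that case, which is exactly why this kind of error slips past a quick glance.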

There is a longer-term concern as well: maintainability. Software is not written once and forgotten; it is read, changed, and extended for years. Code generated quickly without a guiding hand can become a tangle that is hard to understand later, even if it works at first. A developer who understands the whole system keeps it coherent over time — something an assistant, focused on the request in front of it, cannot do. The fastest way to build something is not always the cheapest way to own it.

Strengths and limits at a glance
Strong at	Weak at
Routine, repetitive code	Big-picture architecture decisions
Explaining unfamiliar code	Understanding your business context
Spotting certain bugs	Catching subtle errors in its own output
Rapid prototyping	Security-critical or novel problems

How their progress is measured

You may see coding assistants compared using a benchmark called SWE-bench, and it is worth knowing why this one is taken seriously. SWE-bench draws real bugs from real software projects and asks the model to fix them, then runs the project's own automated tests to check whether the fix genuinely works. Because the result is verified against working software rather than judged on appearance, it is far harder to game than benchmarks that reward plausible-looking answers.
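The idea of checking a fix against tests, rather than judging how it looks, can be sketched in a few lines of Python. This is a toy illustration only and not the real SWE-bench tooling: the test cases, the two candidate fixes, and every function name below are assumptions made up for the example.

```python
from typing import Callable

# Hypothetical stand-in for a project's own tests: (input, expected output) pairs.
PROJECT_TESTS = [
    ("  hello ", "hello"),
    ("world", "world"),
    ("", ""),
]


def is_resolved(candidate_fix: Callable[[str], str]) -> bool:
    """A fix counts only if every project test passes; looking right is not enough."""
    for given, expected in PROJECT_TESTS:
        try:
            if candidate_fix(given) != expected:
                return False
        except Exception:
            # A fix that crashes on a test input has not resolved the bug.
            return False
    return True


def plausible_fix(text: str) -> str:
    # Looks reasonable, but only trims spaces on the left-hand side.
    return text.lstrip()


def working_fix(text: str) -> str:
    # Trims spaces on both sides, which is what the tests require.
    return text.strip()


def resolved_rate(fixes: list[Callable[[str], str]]) -> float:
    """Share of candidate fixes that pass every test."""
    if not fixes:
        return 0.0
    return sum(is_resolved(fix) for fix in fixes) / len(fixes)


if __name__ == "__main__":
    print(is_resolved(plausible_fix))  # False
    print(is_resolved(working_fix))    # True
```

The first candidate would pass a casual read, yet it fails the first test because it leaves a trailing space. A score built this way counts only the fixes that survive every test.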

Scores on this benchmark have climbed steadily, which reflects real and rapid improvement — the tools are markedly more capable than they were a year or two ago. But even a strong SWE-bench score does not guarantee good results on your particular software, which has its own quirks and context the benchmark cannot capture. If you would like to understand how to read these numbers properly, our guide to common AI benchmarks explains SWE-bench and its peers in plain terms, and our broader piece on how AI benchmarks work covers the traps to avoid.

Verified, not guessed

SWE-bench checks each fix against a project's own tests, which makes it a more trustworthy measure of bug-fixing ability than benchmarks judged on appearance

Source: Artificial Analysis

How to use them responsibly in your business

If your business builds or maintains software, AI coding assistants are almost certainly worth adopting — with the right expectations. Treat them as a force multiplier for skilled developers, not as a way to skip having skilled developers. The pattern that works is simple: the assistant drafts, a human reviews and decides, and nothing reaches real users without that review. Used this way, the tools speed up delivery without quietly introducing risk.

Be especially careful with anything sensitive — payment handling, personal data, security — where a subtle mistake is costly. These areas demand the most human scrutiny, regardless of how confident the assistant sounds. And if you do not have technical staff of your own, the right move is rarely to have a non-developer assemble critical software from AI output; it is to work with people who can wield these tools with judgement.
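For teams that want to turn this rule into something checkable, a minimal sketch in Python might look like the following. Everything here is hypothetical: the `Change` record, the keyword list, and the choice of two reviewers for sensitive areas are assumptions for illustration, not a standard or a feature of any specific tool.

```python
from dataclasses import dataclass, field

# Hypothetical markers for sensitive areas; a real team would choose its own.
SENSITIVE_KEYWORDS = ("payment", "personal_data", "security")


@dataclass
class Change:
    files: list[str]
    human_approvals: list[str] = field(default_factory=list)


def touches_sensitive_area(change: Change) -> bool:
    """True if any changed file path mentions a sensitive keyword."""
    return any(
        keyword in path.lower()
        for path in change.files
        for keyword in SENSITIVE_KEYWORDS
    )


def can_ship(change: Change) -> bool:
    """Nothing ships without human review; sensitive changes need a second reviewer."""
    required = 2 if touches_sensitive_area(change) else 1
    # Count distinct reviewers so one person approving twice does not qualify.
    return len(set(change.human_approvals)) >= required


if __name__ == "__main__":
    draft = Change(files=["shop/payment/refunds.py"], human_approvals=["dana"])
    print(can_ship(draft))  # False: a payment change with only one reviewer
```

The point of the sketch is the shape of the rule, not the exact numbers: review is a condition of release, and the bar rises where a mistake would cost the most.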

It also helps to set realistic expectations with whoever builds your software. Ask not whether they use AI tools (most capable developers now do) but how they review what those tools produce, how they handle sensitive areas, and how they keep the codebase maintainable over time. A team that can answer those questions clearly is using AI as it should be used. The promise of these tools is real, but it is realised through good engineering practice, not in spite of it. For a wider view of the AI landscape, see our guide to what artificial intelligence is, and for the technology behind these assistants, our overview of large language models.

For everyday content and SEO work rather than software, the same principles of human review apply — our guide to content marketing for SEO shows how to keep quality high when AI helps with drafting.

Where the technology is heading

It is worth stepping back to see the direction of travel, because it shapes how you should plan. The clear trend is towards assistants that can take on larger, more self-contained pieces of work rather than just suggesting the next line of code. A tool that once completed a sentence may now attempt a whole task, working across many files and checking its own progress. This makes them more powerful, and it also raises the stakes of review: the bigger the chunk of work an assistant takes on, the more important it is that a skilled person verifies the result before it ships.

For a business owner, the sensible response is not to wait for the technology to settle — it will keep moving — but to build the right habits now. Adopt the tools where they clearly help, insist on human review for anything that reaches customers, and keep the people who understand your systems firmly in the loop. Businesses that pair capable tools with capable people are the ones that benefit; those that hope the tools will replace the people tend to learn an expensive lesson. The advantage compounds over time for teams that get the working relationship right.

None of this requires you to predict the future precisely. It only requires a steady principle: let the assistant accelerate the work, and let skilled judgement decide what counts as finished. That principle has held as the tools have grown more capable, and there is every reason to expect it to keep holding as they continue to improve.

Practical questions to ask before you rely on these tools

If you are commissioning software or working with a developer, a few plain questions will tell you whether AI tools are being used wisely. Ask how the work is reviewed before it reaches customers, and listen for a clear, confident answer rather than a vague reassurance. Ask how sensitive areas — anything touching payments, personal data, or security — are handled, since these deserve the most scrutiny. And ask how the resulting software will be kept understandable and changeable over the years to come, because the cost of owning software stretches far beyond the day it is first built.

A team that answers these questions easily is treating AI as a capable assistant within a disciplined process, which is exactly what you want. A team that cannot is more likely to be shipping fast-looking work that becomes a burden later. The goal is never to avoid these tools — they are genuinely valuable and here to stay — but to make sure they are wielded by people who understand both their strengths and their blind spots. That combination of powerful tools and sound judgement is what turns the promise of AI-assisted development into reliable software you can depend on.

Frequently asked questions

Can an AI coding assistant build my software without a developer?

Not safely for anything important. These tools accelerate skilled developers but cannot replace the judgement, architecture decisions, and review that reliable software requires. The best results come from a developer directing the tool.

Do these tools make software development cheaper?

They mostly make it faster, which can lower cost on routine work and increase what a team delivers. They rarely make complex software dramatically cheaper, because skilled review is still essential.

Is AI-generated code safe to use as-is?

Not without review. AI can produce code that looks correct but contains subtle bugs or security weaknesses. A skilled human should always check it before it reaches real users, especially for sensitive features.

What does the SWE-bench score tell me?

It measures how well a model fixes real software bugs, verified against the project's own tests. A strong score signals genuine capability, but it cannot guarantee good results on your specific codebase.

References

Stanford HAI, AI Index Report — hai.stanford.edu
Artificial Analysis, independent AI benchmarking — artificialanalysis.ai
