Why Your AI Vendor Failed You - And What to Ask the Next One
It probably wasn’t the engineering. Most failed AI vendor relationships fail in the contract phase, not in the build. Five questions that catch the structural failure before signing.

You hired an AI company. They built something. Six months in, it doesn’t work the way the slide deck promised. Demos look fine; production use breaks. You’re not alone, and the failure pattern is almost never the one the vendor explains. Most failed AI vendor stories don’t end with a buggy model. They end with a contract that was set up to ship a thing that demoed well rather than a system that ran in production. The repair work starts before the next signature, not after.
Why your AI project went wrong before the first commit
The vendor will tell you the project went wrong because of integration complexity, or data quality, or a shifting requirement, or a model that needed more tuning than expected. Sometimes that’s true. More often, the failure was structural and arrived in the contract. The spec was vague, the success criteria were qualitative, the payment schedule was tied to milestones the vendor controlled rather than operational outcomes the business cared about, and the post-deployment ownership of the system was left for “later.” The build was always going to ship something that looked right in a demo and broke in production, because that’s what the contract paid for.
This is the structural truth about AI vendor failure in 2026. RAND’s 2024 research on AI project failure points the same way: the root cause it reports most often is a misunderstanding or miscommunication of the problem the project is meant to solve, which sits upstream of engineering quality and which a vague contract locks in. The dominant failure mode isn’t a bad engineer. It’s a buyer who signed against the vendor’s standard SOW without rewriting the parts that should have been rewritten. Vendors don’t volunteer changes to their template that shift risk back to themselves. The buyer’s job is to require those changes. Most buyers don’t, because the buyer’s organisation hasn’t bought AI before, hasn’t learned what to ask for, and is under board pressure to start spending the budget that got approved last quarter.
The prevention pattern is short. Five questions, asked before signing, surface the structural problems early enough to either renegotiate or walk. The questions don’t replace technical due diligence. They precede it, because no amount of technical due diligence rescues a contract that was set up to fail.
Five questions that prevent a bad AI vendor experience
Each question targets a structural failure that gamgi has seen repeatedly in post-mortems of inherited or rescued AI projects.
1. “Show me a production deployment from the last twelve months. Talk me through what shipped on day one versus what shipped six months later.” The answer separates vendors who ship from vendors who demo. A vendor who can describe what shipped on day one, what shipped at month three, and what shipped at month six has actually operated a system in production. A vendor who answers in capabilities or features (“we built a chatbot that integrates with...” rather than “we shipped phase 1 in week 14, then iterated...”) usually hasn’t been responsible for the post-launch life of the system. The follow-up: who owned operations after handoff? If the answer is “the client’s team,” ask what training and runbook documentation was provided. The absence of either is a structural red flag.
2. “What’s the success metric, and how does the payment schedule track to it?” The vendor should be able to name the operational metric the system has to hit and tie payment milestones to outcomes the business cares about. Payment schedules pegged to vendor-controlled deliverables (UAT signoff, production deployment) are the industry standard, and they’re the standard precisely because they protect the vendor from cases where the system technically deploys but doesn’t move the metric. A serious vendor will negotiate a structure with a meaningful tail tied to the operational outcome. A vendor who insists on full payment at deployment is signalling they don’t expect to be measurable after that point.
3. “Who owns the system after deployment, and what does that look like in week 12 post-launch?” The buyer’s organisation has to name an owner. What the vendor asks the buyer in return is the diagnostic. A vendor who walks the buyer through what week 12 ownership actually requires (monitoring, retraining, escalation handling, prompt drift, model version updates) is being honest about the post-deployment work. A vendor who hand-waves it is setting up the handoff for failure. Naming an owner before the build starts is non-negotiable on the buyer’s side; a vendor who doesn’t demand that commitment isn’t protecting the project.
4. “Show me how you would fail this project.” The most diagnostic question on the list. A serious vendor can describe in concrete terms what would have to go wrong for the project to fail - and what controls they propose to put in place to prevent each failure. A vendor who answers that the project won’t fail hasn’t shipped enough AI to know that all projects fail somewhere. The failure-mode answer also tells the buyer what risks the vendor is most worried about, which is almost always more useful than what the vendor is most confident about.
5. “What happens to the system if we replace you?” The NIST AI Risk Management Framework treats vendor exit and data portability as first-class risk-management questions, and the framing here follows that. The vendor’s IP position, the documentation they leave behind, the credentials and infrastructure ownership, the data they retain or destroy: all of it should be answerable in one paragraph. If the answer is complicated or evasive, the buyer is committing to a future where switching vendors is expensive or impossible, which is itself a structural problem. Lock-in shouldn’t be invisible at signing; it should be priced.
Together these five questions describe how to evaluate AI vendor relationships before committing budget. None of them are technical. All of them are about the structural shape of the engagement. A vendor who handles all five well is rare and worth paying a premium for. A vendor who can’t handle three of them isn’t a bad vendor in the moral sense; they’re just a vendor whose contract is going to ship a thing that doesn’t survive contact with production.
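Question 2 is the easiest of the five to put into numbers. The sketch below is one hypothetical way to check how much of a contract’s value actually depends on the operational outcome. The milestone names, the percentages, and the helper names `outcome_tail_pct` and `amount_payable` are illustrative assumptions, not terms from any real SOW.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Milestone:
    name: str
    share_pct: int        # percentage of total contract value
    outcome_linked: bool  # True if paid only once the operational metric is hit


def outcome_tail_pct(schedule: list[Milestone]) -> int:
    """Percentage of contract value that depends on the operational outcome."""
    total = sum(m.share_pct for m in schedule)
    if total != 100:
        raise ValueError(f"milestone shares must sum to 100, got {total}")
    return sum(m.share_pct for m in schedule if m.outcome_linked)


def amount_payable(
    schedule: list[Milestone],
    contract_value: float,
    delivered: set[str],
    metric_hit: bool,
) -> float:
    """Amount owed: deliverable milestones pay on delivery,
    outcome-linked milestones pay only when the metric is hit."""
    payable_pct = sum(
        m.share_pct
        for m in schedule
        if (m.outcome_linked and metric_hit)
        or (not m.outcome_linked and m.name in delivered)
    )
    return contract_value * payable_pct / 100


# Hypothetical schedule; every figure here is an assumption for illustration.
SCHEDULE = [
    Milestone("kickoff", 20, outcome_linked=False),
    Milestone("uat_signoff", 30, outcome_linked=False),
    Milestone("production_deployment", 30, outcome_linked=False),
    Milestone("metric_at_week_12", 20, outcome_linked=True),
]

if __name__ == "__main__":
    delivered = {"kickoff", "uat_signoff", "production_deployment"}
    print("Outcome-linked tail (%):", outcome_tail_pct(SCHEDULE))
    print("Payable if metric missed:", amount_payable(SCHEDULE, 100_000, delivered, False))
```

A schedule whose outcome-linked tail comes out at zero is the “full payment at deployment” structure described in question 2. How large the tail should be is a negotiation question that this sketch does not answer.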
The case where two vendors had already mis-scoped the brief
A specific case helps locate where the five questions actually catch failures.
WA Center: the brief that two vendors had already mis-scoped. A multi-country education institution had spoken to two AI vendors before engaging gamgi. Both had proposed builds against the institution’s initial brief, which described “an AI-assisted platform for staff workflows.” Both proposals had a fixed-scope build, a payment schedule pegged to deployment milestones, and no specific named owner on the client side. Both would have shipped something that demoed but stalled, because the spec was vague, the success metric was unstated, and ownership was deferred.
gamgi’s audit caught it. The reframing established what was actually needed: a custom multi-role platform with three user types, two language contexts, an audit trail requirement, and specific integrations with the existing record system. The success metric was defined (case-intake-to-routing time), the owning operations lead was named in week one, and the payment structure tied tail payments to operational outcomes rather than deployment milestones. The system shipped to production and has been running continuously since. Full structural detail in the WA Center case study.
The pattern is consistent across the post-mortems gamgi has reviewed: every AI project that went wrong had at least one of the five structural failures visible in the proposal before the build started. None of those failures is technical, and each one surfaces if the buyer asks. The cheapest way to avoid the next bad AI vendor experience is to require the five-question disclosure as part of vendor evaluation.
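For buyers who keep their vendor shortlist in a spreadsheet or script, the five-question disclosure can be recorded as a simple checklist. The following is a minimal sketch under stated assumptions: the `Proposal` fields, the gap labels, and the `structural_gaps` helper are hypothetical names chosen for illustration, and a yes/no flag is a deliberate simplification of what is really a conversation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    vendor: str
    has_production_evidence: bool   # question 1
    payment_tied_to_metric: bool    # question 2
    owner_named: bool               # question 3
    failure_modes_described: bool   # question 4
    exit_terms_documented: bool     # question 5


# Maps each checklist field to the gap reported when it is missing.
CHECKS = {
    "has_production_evidence": "no production deployment evidence",
    "payment_tied_to_metric": "payment not tied to an operational metric",
    "owner_named": "no named owner on the buyer side",
    "failure_modes_described": "failure modes not described",
    "exit_terms_documented": "exit position not documented",
}


def structural_gaps(proposal: Proposal) -> list[str]:
    """Return the structural gaps visible in a proposal, in question order."""
    return [label for field, label in CHECKS.items() if not getattr(proposal, field)]


if __name__ == "__main__":
    # Hypothetical proposal: milestone-pegged payment and no named owner.
    example = Proposal(
        vendor="vendor_a",
        has_production_evidence=True,
        payment_tied_to_metric=False,
        owner_named=False,
        failure_modes_described=True,
        exit_terms_documented=True,
    )
    for gap in structural_gaps(example):
        print(gap)
```

A non-empty result is a prompt to renegotiate or walk. It is not a score, and a vendor with no gaps still needs technical due diligence.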
The full audit-first engagement shape, which embeds these questions on both sides of the table, is described in the audit-first process. For the wider question of how to evaluate AI partners across the European vendor mix, the framework in best AI agencies for European businesses is the right next read. And if you’ve inherited a stuck project rather than a fresh one, the diagnosis in pilots that don’t make it to production covers the rescue path. A structured audit reframes the brief before you sign the next vendor.
When the vendor isn’t actually the problem
The framing here assumes the vendor relationship was structurally set up to fail. There are cases where the vendor is fine and the failure is elsewhere:
- The buyer changed scope mid-build. Real changes happen and have to be absorbed, but if the spec drifted from version 1 to version 3 between weeks 4 and 12, the build was always going to come in different from the original brief. The vendor isn’t the problem here; the buyer’s scope discipline is.
- The business priority changed. A reorg, a strategy pivot, or a new CEO arrives. The project funded against last year’s strategy is now misaligned. This isn’t anyone’s fault, but blaming the vendor for the misalignment isn’t honest.
- The data foundation collapsed mid-project. A source system was deprecated, an upstream integration changed, the data the AI was trained against stopped flowing. Data plumbing is rarely in the vendor’s scope unless the contract says so. The fix is to scope data ownership explicitly into the project.
- The buyer had no time for the project. AI projects fail when the buyer doesn’t show up: for requirements sessions, for UAT, for post-deployment ownership. A vendor cannot drag a disengaged buyer across the line. The fix is buyer-side commitment, not vendor-side persuasion.
Key takeaways
- Most AI projects that go wrong fail in the contract phase, not the execution phase. The spec was vague, the success criteria qualitative, the payment schedule pegged to vendor-controlled milestones, and the post-deployment ownership left for later.
- Five questions surface the structural problems before signing: production-deployment evidence, success-metric-to-payment alignment, post-deployment ownership, failure-mode candour, and exit position.
- The “show me how you would fail this project” question is the most diagnostic. Vendors who can describe their failure modes have shipped enough AI to know the shape of failure; vendors who reassure haven’t.
- The vendor isn’t always the problem. Scope drift, strategy change, data foundation collapse, and disengaged buyers all account for project failures that have nothing to do with vendor quality.
- The cheapest way to avoid the next bad vendor relationship is to rewrite the vendor’s standard SOW before signing it. The five questions are the rewrite checklist.
The fastest way to find out which AI agency red flags actually apply to your shortlist is to run a structured audit before you commit to a vendor at all. Two weeks, fixed scope, fixed price. You leave with a vendor-evaluation framework written against your specific operational requirements, plus a portable brief that any future vendor has to deliver against. Most buyers discover the questions they were going to ask weren’t sharp enough to catch what they needed to catch.
Book your audit

