AI Readiness Checklist Before Deployment

09 May 2023 - 07 min read

Why a readiness checklist matters more than a governance memo

A governance policy on its own is a promise. A readiness checklist is what proves the promise holds the day the AI goes live. Most AI deployments that fail in production fail not because the model was wrong, but because basic operational questions were not answered before launch: who watches the system, how it gets turned off, what the fallback is when it breaks, and whether the data going in was actually fit for purpose.

This article is a working pre-deployment checklist. It pairs directly with the AI governance article, which defines the rules an organisation chooses to live by. The checklist below is what makes those rules enforceable in practice, so that the system that goes live on Friday does not quietly become someone else's problem on Monday.

Data readiness: is the input fit for the model

The first set of gates is about the data the model will see, not the model itself. Three checks tend to catch most of the avoidable problems.

The first is coverage and freshness. Does the data the model will use at inference actually represent the cases it will face in production? A model trained or prompted on six-month-old data, in a domain where the world has moved (new products, new pricing, new regulation), will be confidently wrong in ways that are hard to spot.
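A freshness check can be as small as a dated inventory of the sources the model reads. The sketch below is illustrative only: the source names, dates, and the 90-day threshold are assumptions, not values from any particular deployment.

```python
from datetime import date, timedelta


def stale_sources(last_updated: dict[str, date], today: date, max_age_days: int) -> list[str]:
    """Return the names of sources older than max_age_days, oldest first."""
    cutoff = today - timedelta(days=max_age_days)
    stale = [name for name, updated in last_updated.items() if updated < cutoff]
    return sorted(stale, key=lambda name: last_updated[name])


# Hypothetical inventory of the sources the model reads at inference time.
sources = {
    "pricing_table": date(2023, 4, 20),
    "product_catalogue": date(2022, 10, 1),
    "regulatory_faq": date(2023, 1, 15),
}

# The 90-day limit is an assumption; the right limit depends on how fast the domain moves.
print(stale_sources(sources, today=date(2023, 5, 9), max_age_days=90))
```

Anything the function returns is a source that needs either a refresh or a written decision that its age is acceptable.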

The second is noise and labels. For predictive models, are the labels reliable, or are they themselves the output of an inconsistent process upstream? For generative models with retrieval, is the source corpus clean, or does it include duplicates, outdated documents, and contradictory versions of the same policy?
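For a retrieval corpus, a first pass at the "clean or not" question can be automated. The sketch below assumes each document is a dictionary with hypothetical `id`, `title`, and `text` fields. It flags exact duplicates (after whitespace and case normalisation) and titles that appear with more than one distinct body, which is a cheap proxy for contradictory versions. It will not catch near-duplicates or documents that contradict each other under different titles.

```python
import hashlib
from collections import defaultdict


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so trivial differences do not hide duplicates."""
    return " ".join(text.lower().split())


def audit_corpus(docs: list[dict]) -> dict:
    """Group documents by content hash and by title to surface duplicates and conflicts."""
    ids_by_hash = defaultdict(list)
    hashes_by_title = defaultdict(set)
    for doc in docs:
        digest = hashlib.sha256(normalise(doc["text"]).encode("utf-8")).hexdigest()
        ids_by_hash[digest].append(doc["id"])
        hashes_by_title[normalise(doc["title"])].add(digest)
    duplicates = [ids for ids in ids_by_hash.values() if len(ids) > 1]
    conflicts = sorted(t for t, digests in hashes_by_title.items() if len(digests) > 1)
    return {"duplicates": duplicates, "conflicting_titles": conflicts}


# Illustrative corpus; the documents are invented for this example.
corpus = [
    {"id": "a1", "title": "Refund Policy", "text": "Refunds within 30 days."},
    {"id": "a2", "title": "Refund policy", "text": "refunds  within 30 days."},
    {"id": "a3", "title": "Refund Policy", "text": "Refunds within 14 days."},
    {"id": "b1", "title": "Shipping", "text": "Ships in 2 days."},
]
print(audit_corpus(corpus))
```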

The third is PII and confidentiality. Has someone walked through what data the model touches, and what can leave the organisation through the model's outputs or via the vendor's logs? "We assumed the model did not log this" is a recurring cause of post-launch incidents.

If these three are not answered in writing before launch, the system is not ready to deploy, regardless of how good the demo looks.

Performance readiness: baseline first, then improve

A model whose performance is not measured against a clear baseline cannot be said to be performing. The most common failure mode here is launching an AI feature without ever measuring what the human (or legacy system) was doing in the same workflow.

Two artefacts close this gap. First, a baseline metric for the task: the average handling time, the resolution rate, the document-review hours, the quote accuracy, before AI is added. Without it, "AI saved time" is a claim, not a measurement. Second, a validation set built from real organisational data, with an explicit acceptance criterion: how good does the model have to be on this set before it is allowed into production, and how is that judged.

The validation set is also what catches the worst category of regression: the model that gets better on average and worse on the small set of high-stakes cases. Sampling those cases deliberately into the validation set, instead of relying on average accuracy, is what protects the organisation from the headline-grade incident.
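One way to make that acceptance criterion explicit is to score the high-stakes slice separately and require both numbers to clear their thresholds. The sketch below is a minimal illustration; the `Case` structure, the labels, and the thresholds (0.85 overall, 0.95 on high-stakes cases) are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Case:
    expected: str
    predicted: str
    high_stakes: bool = False


def accuracy(cases: list[Case]) -> float:
    """Share of cases where the prediction matches the expected answer."""
    if not cases:
        raise ValueError("cannot score an empty set of cases")
    return sum(c.expected == c.predicted for c in cases) / len(cases)


def acceptance_report(cases: list[Case], min_overall: float, min_high_stakes: float) -> dict:
    """Pass only if both the overall and the high-stakes accuracy clear their thresholds."""
    critical = [c for c in cases if c.high_stakes]
    overall = accuracy(cases)
    high = accuracy(critical)  # raises if no high-stakes cases were sampled in
    passed = overall >= min_overall and high >= min_high_stakes
    return {"overall": overall, "high_stakes": high, "passed": passed}


# Illustrative validation set: good on average, weak where it matters.
validation = [Case("approve", "approve") for _ in range(8)] + [
    Case("escalate", "escalate", high_stakes=True),
    Case("escalate", "approve", high_stakes=True),
]
print(acceptance_report(validation, min_overall=0.85, min_high_stakes=0.95))
```

In this example the model clears the overall bar and still fails the gate, which is the situation an average-only metric would hide.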

Failure readiness: fallback, rollback, and the off switch

Every AI system in production will, at some point, be wrong, slow, or down. Three pre-launch decisions determine how expensive that moment becomes.

The fallback path is what happens when the AI cannot or should not respond. A queue back to a human agent. A simpler heuristic. A read-only legacy system. Designing the fallback before launch (not during the incident) is the difference between a five-minute degradation and a half-day outage.

The rollback plan is how the organisation returns the AI to a known-good state, fast, if behaviour changes for the worse. This includes pinning specific model versions where the vendor allows it, keeping the previous prompt or fine-tune available, and making sure the rollback can be triggered by someone on call without going through a release process. Vendors do update models silently. The rollback plan assumes that.

The off switch is the simplest and most often missing: a single, documented action that takes the AI feature out of the user-facing flow without taking the rest of the product down. If nobody can name who has the off switch and how to use it, the system is not ready.
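The fallback path and the off switch can share one small wrapper around the model call. The sketch below is a simplified illustration: `AIFeature`, `model_stub`, and `human_queue` are hypothetical names, and the flag is held in memory here, whereas a real deployment would more likely read it from shared configuration so that one documented action reaches every instance.

```python
from typing import Callable


class AIFeature:
    """Wraps a model call with an off switch and a fallback path."""

    def __init__(self, model_call: Callable[[str], str], fallback: Callable[[str], str]):
        self.model_call = model_call
        self.fallback = fallback
        self.enabled = True  # the off switch; in-memory only in this sketch

    def disable(self) -> None:
        """Take the AI out of the user-facing flow without stopping the product."""
        self.enabled = False

    def respond(self, request: str) -> tuple[str, str]:
        """Return (path, answer), where path is 'ai' or 'fallback'."""
        if not self.enabled:
            return "fallback", self.fallback(request)
        try:
            return "ai", self.model_call(request)
        except Exception:
            # Any model failure degrades to the fallback instead of an error page.
            return "fallback", self.fallback(request)


def model_stub(request: str) -> str:
    """Stand-in for a real model call; fails on purpose for one input."""
    if "timeout" in request:
        raise TimeoutError("model did not answer in time")
    return f"AI draft for: {request}"


def human_queue(request: str) -> str:
    """Stand-in fallback: hand the request to a human agent."""
    return f"Queued for a human agent: {request}"


feature = AIFeature(model_stub, human_queue)
print(feature.respond("refund question"))
print(feature.respond("timeout case"))
feature.disable()
print(feature.respond("refund question"))
```

Returning the path alongside the answer also makes it easy to count how often the fallback is used.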

Operational readiness: monitoring, ownership, and the people on call

Deploying AI without monitoring is no different from deploying any other production system without monitoring, and it ends the same way a few weeks later: problems are found by users before they are found by the team. Four things have to be in place.

A monitoring view that surfaces input drift (the distribution of incoming requests or data changing over time), output drift (the distribution of model responses changing over time), latency, error rates, and the rate of human override or correction. Override rates in particular are an early warning signal that the model is no longer doing what users want.
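The override signal is simple to compute once each AI output is logged with whether a human changed it. The sketch below is one minimal way to do it, assuming a rolling window over the most recent outputs; the class name, window size, and threshold are illustrative, not recommendations.

```python
from collections import deque


class OverrideRateMonitor:
    """Tracks the share of recent AI outputs that a human overrode or corrected."""

    def __init__(self, window: int, threshold: float):
        self.events = deque(maxlen=window)  # oldest events drop off automatically
        self.threshold = threshold

    def record(self, overridden: bool) -> None:
        self.events.append(overridden)

    def rate(self) -> float:
        return sum(self.events) / len(self.events) if self.events else 0.0

    def should_alert(self) -> bool:
        # Wait for a full window so a handful of early overrides does not page anyone.
        return len(self.events) == self.events.maxlen and self.rate() > self.threshold


monitor = OverrideRateMonitor(window=10, threshold=0.2)
for _ in range(10):
    monitor.record(False)
print(monitor.rate(), monitor.should_alert())

for _ in range(3):
    monitor.record(True)
print(monitor.rate(), monitor.should_alert())
```

The same pattern applies to error rates; drift in inputs and outputs needs a comparison of distributions, not a single rate.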

A named owner for the AI feature in production, with a clear escalation path. "The team that built it" is not an answer once the team has moved on to the next initiative. The owner is the person who gets paged when the override rate spikes at 2am.

An incident process that treats AI-specific incidents the same way a security or availability incident would be treated: written record, root cause, controls updated, lessons captured. Without this, the same incident class recurs every quarter.

A quality review cadence, monthly or quarterly depending on volume, where a sample of outputs is reviewed by domain experts. This catches the slow drift that monitoring alone misses, especially for generative outputs where "wrong" is a judgement call.

User readiness: the front line that interacts with the output

The last set of gates concerns the people who will actually use what the AI produces. Three checks usually decide whether the launch is felt as an upgrade or as overhead.

Have the front-line users been trained on the limits of the system, not just on its features? "It can hallucinate" is a fact they need to internalise, so that they treat its outputs as drafts to edit rather than answers to forward.

Is there a simple way to flag a wrong output, as part of the workflow, in fewer clicks than the workaround would take? Without it, the team learns to silently correct errors and the organisation never sees them.

And is the support model clear? When a user is unsure whether to trust an output, who do they ask, and how fast do they get an answer that lets them keep working? An AI rollout with no support model on day one is one where users either over-trust the output, under-trust it, or stop using it altogether.

Final takeaway

The point of a readiness checklist is not to slow AI down. It is to make the difference between a deployment that survives its first incident and one that quietly gets rolled back six weeks in, with the organisation no smarter about why. Governance defines what the organisation is willing to accept. The readiness checklist is what proves the organisation is operationally able to deliver on it.

The wider context, including how readiness gates fit into a real AI roadmap and a real operating model, is collected in the AI strategy and automation insights cluster. And when the question moves from "do we have a checklist" to "we have one and we now need someone to actually run it across multiple deployments without slowing the business down", that is exactly what my project management and digital strategy practice is built for.

- Haja Faniry
