
What AI Governance Cannot Automate
Donald Farmer, January 2026
Most organisations I see attempting to govern AI start out using the wrong tools for the job. They reach for familiar frameworks designed for conventional data flows, which execute fixed logic, and apply them to adaptive systems. In doing so, they create gaps that regulators and auditors will eventually expose. The problem is not that governance is lacking; the problem is that the governance in place addresses the wrong questions.
You can see this quite clearly with RAG systems. Retrieval-augmented generation has moved from in-house experiments to production faster than governance can adapt, yet, as we shall see, it has some unique requirements.
Two Questions, Often Confused
Discussions about governance often conflate two critical questions, treating them as a single concern when, in practice, they require separate capabilities.
- The first question concerns lineage: how information moves through the enterprise. Which systems originate data? What transformations occur along the way? Which downstream products depend on upstream sources? Lineage answers architectural questions and reveals dependencies that matter for change management.
- The second question concerns provenance, which may sound similar but is significantly different. Provenance documents the history of specific data elements. You can think of it as the chain of custody for a specific output. How was a particular value created? Who modified it, and when? What parameters governed those decisions?
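To make the distinction concrete, here is a minimal sketch of how the two might be represented. The field names are illustrative assumptions, not a standard schema.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class LineageEdge:
    """Architectural view: one hop in the flow of data between systems.
    Answers questions such as 'which products depend on this source?'"""
    source_system: str      # e.g. "crm_db" (hypothetical name)
    target_system: str      # e.g. "customer_360" (hypothetical name)
    transformation: str     # the process applied along the way

@dataclass
class ProvenanceRecord:
    """Chain of custody for one specific data element or output.
    Answers 'how was this value created, by whom, and under what settings?'"""
    element_id: str
    created_at: datetime
    created_by: str                                     # user or service account
    parameters: dict = field(default_factory=dict)      # settings that governed the step
    modifications: list = field(default_factory=list)   # (who, when, what) history
```

Lineage is a graph over systems; provenance is a log attached to individual elements. That is why the two require separate capabilities rather than one shared tool.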
As we move towards agentic AI, where systems act autonomously rather than simply producing outputs, provenance takes on even greater importance. Organisations that build lineage without provenance will find their governance incomplete when hard questions are asked.
The RAG-Specific Challenge
For many teams, neural networks and large language models are the classic black box, defying interpretation. Who knows what goes on in the mind of an LLM?
But the pipeline that fed the training data can be documented through lineage, while the specific data included (their timestamps, the users who curated them, and the criteria for selection) can all be recorded through provenance.
RAG systems compound these requirements because retrieval itself needs to be governed.
- Lineage should document which sources feed the knowledge base, through what processes, and which applications consume the output.
- Provenance should record, for any specific document (or any “chunk”) in the knowledge base, its source, when it entered the system, whether it was modified, and under what retrieval context it appeared.
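For a RAG knowledge base, the provenance record narrows to the chunk level. A sketch of what such a record might hold, with hypothetical field names to adapt to your own pipeline:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class ChunkProvenance:
    """Provenance for one chunk in a RAG knowledge base (illustrative fields)."""
    chunk_id: str
    source_document: str                    # origin file or URL
    ingested_at: datetime                   # when it entered the system
    modified_at: Optional[datetime] = None  # last modification, if any
    retrievals: list = field(default_factory=list)  # contexts in which it was surfaced

    def log_retrieval(self, query: str, when: datetime) -> None:
        """Record the retrieval context each time this chunk is surfaced."""
        self.retrievals.append({"query": query, "retrieved_at": when})
```

Logging each retrieval is what later lets you reconstruct why a particular answer cited a particular chunk.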
We also need to be aware that knowledge bases degrade over time. Source documents simply become out of date, and the accuracy and usefulness of retrieval correspondingly decline. So, governance for RAG should be able to distinguish failures of retrieval (where the system surfaces outdated or irrelevant content) from failures of generation (where the language model hallucinates or misinterprets).
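One practical consequence: a freshness check at retrieval time can separate the two failure modes before anyone blames the model. A minimal sketch, assuming each retrieved chunk carries its ingestion timestamp and that each domain sets a maximum acceptable age:

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(days=180)  # assumed freshness policy; set per domain

def split_by_freshness(chunks: list[dict], now: datetime) -> tuple[list[dict], list[dict]]:
    """Split retrieved chunks into fresh and stale by ingestion date.

    If an answer is wrong and any supporting chunk is stale, investigate
    retrieval (an outdated knowledge base) before suspecting generation."""
    fresh = [c for c in chunks if now - c["ingested_at"] <= MAX_AGE]
    stale = [c for c in chunks if now - c["ingested_at"] > MAX_AGE]
    return fresh, stale
```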
What Remains Human
Technical infrastructure can document lineage and record provenance. It can automate compliance checks and flag anomalies for review. What it cannot do is verify the outputs of models against business context, accept accountability for decisions, or exercise judgment about competing priorities. These functions remain, at least for now, human responsibilities. Governance structures should preserve them rather than assuming that automation will eventually absorb them.
The temptation to treat governance as a purely technical problem, solvable through better tooling, misses the point: someone still decides what counts as acceptable use, and that someone will answer when things go wrong.
So, the question facing data leaders is not whether to restrict AI but how to scale it in ways that satisfy boards, regulators, and the people who remain accountable.
I’ll be leading a full-day workshop on these questions at the Data Governance, AI Governance & Master Data Management Conference Europe on March 26th in London:
Governing AI: A Practical Framework for Data Leaders.
We’ll work through a five-component governance framework in detail, design federated structures that balance central oversight with domain authority, and address RAG-specific controls from ingestion through retrieval to output. Participants will leave with maturity assessment tools, tiered approaches to permissions and risk, and a clearer sense of how to communicate governance progress to boards and regulators.
If you’re building AI governance capabilities and finding that conventional data governance frameworks don’t quite fit, this is a chance to work through the problem with others facing the same challenges.
Want to hear more from Donald? He’s speaking at the Data Governance, AI Governance & Master Data Management Conference 2026 in London this March.
Find out more here: Data Governance & AI Governance Overview
View the Agenda: DG AIG MDM 26 Agenda


