If AI Trains Mostly on AI Text, Where Does New Knowledge Come From?
This is not a prediction. It is an intellectual exercise. Maybe it is wrong. Maybe it is too early. Maybe it can be discarded as a strange idea. But I think the question is worth asking: if AI-generated text becomes the dominant training substrate, where will the entropy needed for future AI evolution come from?
This article continues a line of thought I explored in my previous piece, “More Memory Won’t Fix Your AI Agents”. There, I argued that simply adding more context does not make agents more reliable: structure matters, boundaries matter, and context must be governed. In that piece, entropy and unstructured context were the enemies of reliable operations.
Here, I want to explore the paradox: when it comes to training and improving models, that very same contextual entropy may become our greatest ally for future evolution and adaptation.
So the question is: if context should not simply be expanded, could validated context become the primary source of future AI learning?
The Current Split: Training vs. Context
We should start with a basic distinction: Training teaches patterns. Context supplies the current case.
Most current large language model (LLM) implementations clearly split what they learn through training from what they temporarily interact with: the context.

They use the context to answer. But they do not truly learn from it in a permanent way. They process it brilliantly during a conversation, but their core knowledge stays frozen.

This was never a flaw, until now.
This is not a real problem for most daily AI use cases. It is also a design guardrail to prevent biases, wrong assumptions, data leaks, and other issues that a non-curated, ungoverned context could introduce into the “learned truth.”
This thought kept returning while I was writing my previous article: ignoring context as a source of learning may be safer for operations, but it also risks cutting AI off from reality.
The problem is not that context cannot train a model. In principle, a language model can learn from context.
The Training Substrate Is Changing
The deeper issue is that the raw material we have used to train these models is changing.
In the near future, human-written text may become a minority. As AI generates more and more of the world’s content, future training data risks becoming a derivative of previous AI-learned patterns rather than a reflection of independent observation, real-world interaction, and validated experience.
The data becomes less grounded, more self-referential, and poorer in genuine novelty, leading to the dystopian AI future often referred to as “model collapse from synthetic data.”
But the danger is not only model collapse. It is synthetic consensus: a growing body of AI-generated context that makes previous assumptions look like independent agreement. AI risks learning from its own echoes, from assumptions repeated so many times that they look like truth.
Imagine thousands of AI-generated articles repeating the same outdated assumption about a medical treatment, software architecture, legal interpretation, or operational practice. None of those texts are independent evidence. But when they enter future training data, they may look like broad agreement.
The model does not see one repeated assumption. It sees statistical weight. So the original content, the new point of view or idea, becomes a needle in a haystack, overwhelmed and ignored.
And this is not only a hypothetical future risk. Something similar has happened many times in human knowledge with consensus. There was a time when radioactive materials were marketed as healthy, even in products for dental care. Lobotomy was once accepted as a treatment for mental illness. Humoral theory dominated medicine for centuries. Nutrition advice has repeatedly changed, sometimes turning yesterday’s recommendation into today’s mistake.
The point is not to enter into polemics. The point is simpler: consensus is not the same as truth. Repetition is not the same as evidence.
AI Does Not Measure Truth
To understand the impact of this, we need to recall how AI actually works.
AI does not measure truth directly. It learns statistical weight from patterns that appeared many times in the past. In AI, statistical weight can act like a form of artificial consensus. The majority pattern is not truth, but it can behave like probabilistic truth inside the model.
The more we train on content created as a result of previous assumptions, the more those assumptions harden into “statistical certainty,” overwhelming and crushing anything new, different, or exceptional, such as a revolutionary theory or simply an unexpected natural condition.
This can virtually remove the possibility of learning something new, even if it was published somewhere.
A Note of Optimism
But I am still optimistic. AI-assisted content, which can also be seen as a human-assisted learning source, is also proliferating. Not every AI-assisted text is a clone of previous assumptions. Some of it carries genuine human ideas, new questions, strange intuitions, and unexpected combinations that the author used AI to express more clearly.
To improve the impact of this newness, we should also evaluate whether AI systems could learn to identify and extract those “human-assisted ideas” inside AI-assisted content. Not simply by detecting whether a text was written by AI, as current AI-content detectors try to do, but by detecting what is genuinely surprising, divergent, or new compared with the model’s existing assumptions.
Those signals could then be overweighted, preserved, and tested instead of being flattened by the statistical mass of derivative content. In the same way, training methods could explore how to downweight content that can be reliably identified as purely AI-generated repetition, without ignoring or censoring it completely.
Some AI-generated content can still be useful to reinforce what we already know. The point is different: if everything is weighted the same, genuine novelty may disappear under the mass of repetition.
Context as Reality Contact
As AI-generated content floods the world, the old training foundation is changing. The future of intelligence may depend on turning context into the new engine of evolution.
In that world, real-world connectors, prompts, human feedback, tool results, and operational outcomes become the most important sources of grounding.
Context is not just prompt material. It is the contact surface between AI and reality.
As synthetic text grows, reality contact becomes more valuable than content volume. The future bottleneck for AI may not be context length. It may be reality contact. The spread of the Model Context Protocol (MCP) as a possible sense layer for AI will be a critical test.
Thus, a deep contradiction appears: context, the very thing that was designed not to change the model permanently, may now have to become the main source of permanent learning if AI is to avoid stagnation.
Context will become the generator of the entropy necessary to discover new things and learn them. Therefore, for AI to evolve, it needs to find a way to include context in the learning loop.

Or more precisely: AI needs a context-to-learning loop, a way to transform live context into validated learning.
The loop could look like this:
live context
→ metadata envelope
→ anomaly detection
→ isolation
→ reality test
→ validation
→ synthesis
→ learning candidate
→ controlled consolidation
Learning here does not only mean updating model weights. It can mean updating memory, knowledge bases, policies, training sets, evaluation sets, or future fine-tuning data.
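To make the loop concrete, here is a minimal sketch in Python. Every name in it (ContextEvent, reality_test, and so on) is an illustrative assumption, not an existing framework; the point is only to show each stage as an explicit, inspectable step.

```python
# Minimal sketch of the context-to-learning loop above. All names are
# illustrative assumptions, not an existing framework or API.

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ContextEvent:
    payload: str                      # live context: prompt, tool result, outcome
    source: str                       # where it came from (tool, user, sensor)
    meta: dict = field(default_factory=dict)


def wrap_with_metadata(event: ContextEvent) -> ContextEvent:
    # Metadata envelope: provenance and scope travel with the payload.
    event.meta.update({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "scope": "sandbox",
        "validation_status": "pending",
    })
    return event


def is_anomalous(event: ContextEvent, expected: str) -> bool:
    # Anomaly detection placeholder: any deviation from the expected pattern.
    return event.payload != expected


def reality_test(event: ContextEvent) -> bool:
    # Reality test placeholder: a tool run, sensor check, or human confirmation.
    return event.meta.get("outcome") == "confirmed"


def learning_loop(event: ContextEvent, expected: str) -> Optional[ContextEvent]:
    event = wrap_with_metadata(event)              # metadata envelope
    if not is_anomalous(event, expected):          # anomaly detection
        return None                                # nothing new: reinforce, do not learn
    event.meta["isolated"] = True                  # isolation: keep it out of production
    if not reality_test(event):                    # reality test
        event.meta["validation_status"] = "rejected"
        return None
    event.meta["validation_status"] = "validated"  # validation passed
    return event                                   # candidate for synthesis and consolidation
```

Synthesis and controlled consolidation would happen downstream, in whichever store is chosen: memory, a knowledge base, an evaluation set, or fine-tuning data.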
MCP as the Senses of AI
I would like to make a note about the Model Context Protocol (MCP) trend.
MCP adoption is expanding the AI context surface, but it is also adding a new validation layer. MCP is not only a connector. It is a strong candidate to act as the model’s senses, connecting AI to reality through tools, systems, feedback, and observable outcomes.
The more I thought about MCP, the more I realized I had been treating context mainly as an operational concern. But for learning, that same context may be the missing source of novelty.
In some way, that is what our brains have done for thousands of years: they use the senses to test what they receive before treating it as reliable. We can consider MCP, together with human inputs, as the “senses of AI,” the means by which it interacts with reality.
But senses also create a trust problem. Like humans, AI needs contact with reality, but it cannot blindly trust every sense. Connectors must be validated, scoped, and challenged. MCP does not solve the problem by itself. It increases access to reality, but also increases the context surface that must be validated. The value is not connection alone. The value is governed, testable, attributable connection.
MCP can provide the raw contact with reality, but the learning loop must decide which contact becomes evidence, which evidence becomes hypothesis, and which hypothesis survives validation.
MCP as a Reality-Testing and Discovery Layer
MCP should not be seen only as a way to give AI more tools or more context. It can also become part of the reality-testing layer.
A model can generate a hypothesis, but a connected system can test whether that hypothesis survives contact with reality.
A simple example is code generation. A model can generate code, but the generated code should not be treated as correct only because it looks plausible. Through MCP, the system could connect to a repository, compile the code in a controlled environment, execute unit tests, observe logs and results, and compare the output against the expected behavior. The model proposes. The connected environment verifies. The result becomes evidence.
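As a sketch, that verification step could be as simple as running the project’s test suite and turning the result into an evidence record. The repository path, the test command, and the record fields below are assumptions for illustration; they are not part of any real MCP server.

```python
# Minimal sketch of "the model proposes, the connected environment verifies".
# Paths, commands, and field names are illustrative assumptions.

import subprocess


def verify_generated_code(repo_path: str, test_cmd: list[str]) -> dict:
    """Run the project's tests against generated code and return an evidence record."""
    result = subprocess.run(test_cmd, cwd=repo_path, capture_output=True, text=True)
    return {
        "hypothesis": "the generated code behaves as expected",
        "evidence": "tests_passed" if result.returncode == 0 else "tests_failed",
        "trace": (result.stdout + result.stderr)[-2000:],  # keep a bounded log excerpt
    }


# Example: evidence = verify_generated_code("./my-repo", ["pytest", "-q"])
```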
The same principle applies outside software.
A weather-prediction agent could compare its forecast with data from an official weather authority, radar, local sensors, or later observed conditions. A traffic-prediction agent could compare its hypothesis with live traffic feeds, road sensors, navigation data, or camera-based vehicle counts.
But the value is not only to say: the model was right or wrong.
The value is that repeated differences between prediction and reality can expose new patterns: microclimates, terrain effects, seasonal deviations, event-driven congestion, local behaviors, recurring anomalies, or conditions that were not captured in the original training data.
The connected system does not only validate the old pattern. It can help create the next pattern the model needs to learn, leveraging one of AI’s strongest capabilities: discovering new correlated patterns, even where those patterns were not expected.
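A hedged sketch of that discovery step: keep prediction/observation pairs and surface the places where the model is systematically wrong, not just occasionally wrong. The grouping key, threshold, and minimum count are illustrative assumptions.

```python
# Minimal sketch: repeated differences between prediction and reality become
# candidate new patterns (a microclimate, a recurring congestion source, ...).

from collections import defaultdict
from statistics import mean


def recurring_deviations(records: list[dict], threshold: float = 2.0,
                         min_count: int = 5) -> dict:
    """records: [{"location": str, "predicted": float, "observed": float}, ...]"""
    errors = defaultdict(list)
    for r in records:
        errors[r["location"]].append(r["observed"] - r["predicted"])
    # A location where the error is consistently biased is not noise:
    # it is a pattern the original training data did not capture.
    return {
        loc: mean(errs)
        for loc, errs in errors.items()
        if len(errs) >= min_count and abs(mean(errs)) > threshold
    }
```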
This is where MCP becomes more than a connector. It becomes part of the validation loop, and potentially part of the pattern-discovery loop.
With those “senses,” context becomes more than input. It becomes a source of tested entropy.
But for context to become a real source of learning, it cannot arrive as raw text. It needs a governance envelope: source, timestamp, authority, scope, domain, confidence, permissions, approval state, outcome, and validation status.
A system should not only record what the model saw. It should record what the model inferred, what recommendation or action followed, what happened next, and whether the hypothesis was confirmed, corrected, or rejected.
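Put together, the envelope plus the outcome trail could live in a single record. The field names below simply mirror the prose; the concrete types and default values are assumptions.

```python
# Minimal sketch of a governed context record: envelope + what was inferred,
# what was done, and what reality said afterwards. Types are assumptions.

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ContextRecord:
    # governance envelope
    source: str
    timestamp: str
    authority: str
    scope: str
    domain: str
    confidence: float
    permissions: list = field(default_factory=list)
    approval_state: str = "pending"          # pending / approved / rejected
    # the learning trail
    observed: str = ""                       # what the model saw
    inferred: str = ""                       # what the model inferred
    action: str = ""                         # recommendation or action that followed
    outcome: Optional[str] = None            # what happened next
    validation_status: str = "untested"      # confirmed / corrected / rejected
```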
Entropy Is Not Noise
I like the entropy concept because it is really that: the irremediable chaos, the source of novelty.
It is the change that is perceived at first as an error, a mistake, or an undesired result.
Like the aha! moments that shock an expert before becoming a theory, or the small DNA mutations of which a few become the engine of evolution, these are the seeds of necessary change.
By entropy, I mean the unexpected variation that breaks the current pattern: the anomaly, contradiction, failed prediction, strange prompt, unusual tool result, or real-world outcome that does not fit the model’s previous assumptions.
In text, a related signal is perplexity: the level of surprise a model experiences when predicting the next words. But the entropy I mean here is broader. It is not only unexpected language. It is unexpected reality.
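For readers who want that intuition made concrete, perplexity is just the exponential of the average surprise (negative log-probability) per token. A tiny sketch:

```python
# Minimal sketch: perplexity as "level of surprise" over the tokens a model saw.

import math


def perplexity(token_probs: list[float]) -> float:
    """token_probs: probability the model assigned to each observed token."""
    avg_surprise = sum(-math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_surprise)


# Derivative, "already seen" text gets high probabilities and low perplexity;
# genuinely surprising text gets low probabilities and high perplexity.
# perplexity([0.9, 0.8, 0.95])  -> ~1.14
# perplexity([0.2, 0.1, 0.05])  -> ~10.0
```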
The goal is not to make AI learn from every unexpected signal. Most entropy is noise. The challenge is to preserve enough of it to discover which part is signal.
A strange tool result, an unexpected user correction, a production metric that contradicts the runbook, or a new scientific observation may look like noise at first. But if it repeats, survives validation, and explains reality better than the previous assumption, it becomes learning material.
The Central Challenge
The central challenge is therefore:
- How do we govern entropy without eliminating it?
- How do we overcome the weight of myriad clonal works derived from previous assumptions?
- How do we let a minimal seed of difference grow without being overwhelmed by the derived “approved” context?
- How will difference survive governance?
How does a minimal, fragile seed of true difference survive the overwhelming weight of “approved” context and myriad clonal works derived from previous assumptions? This is the real test. Governance naturally crushes what it did not previously sanction.
The scientific method requires every established truth to remain open to challenge. Traditional scientific progress has always faced this tension: new theories struggling against established consensus and political correctness. With AI, the asymmetry becomes even stronger because the system reinforces what it already “knows,” and even governance may become ruled by the AI-shaped consensus it was supposed to supervise.
Think about Darwin and the theory of evolution. When Darwin introduced his theory, it challenged established scientific, religious, and social assumptions. Many people rejected it, opposed it, or tried to crush it.
But what would have happened if a self-governed learning system had decided, before publication, that the idea was too divergent, too dangerous, or too contradictory to the accepted knowledge of the time?
The theory would not even have entered the arena where it could be heard, challenged, tested, refined, and eventually accepted.
That is the point. New ideas should not be accepted automatically. But they must be allowed to appear. They must be allowed to be heard, challenged, and tested against reality.
Consensus is useful for safe action. But if consensus becomes the only filter for learning, novelty dies before it can be tested.
From Governance to Curation
The challenge is that we must move away from “govern entropy without eliminating it,” where governing means rule by an approved authority that allows only what was previously thought and processed by AI, toward something different.
Production should be governed. Learning should be curated.
Governance must protect production from entropy. Curation must protect learning from consensus.
The curator is not a judge of truth. It is a filter of sameness.
It does not need to magically recognize “high-quality novelty.” That would just recreate the same consensus trap. Its first job is simpler: identify what is already probable.
This curation process relies on a kind of Via Negativa of learning. It works through negation. It removes the synthetic echoes, the repeated assumptions, the derivative answers, and the patterns the model already knows well. What remains is not automatically true. It is a residual: the strange correction, the unexpected tool result, the failed prediction, the minority observation, or the idea that simply does not fit.
Think of it like negative space in a painting. The curator does not try to paint the shape the model already recognizes. It looks at what remains in the empty space around that shape, the part the model cannot yet explain.
That residual is where entropy lives.
The goal is not to trust it immediately. The goal is to preserve it, isolate it, and test it before the dominant pattern crushes it.
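A hedged sketch of that via negativa filter: score each candidate by how surprising the model finds it, drop what is already probable, and keep the residual for isolation and testing. The scoring function and threshold below are assumptions; in practice they could be perplexity, embedding distance, or disagreement with the model’s own prediction.

```python
# Minimal sketch of curation by negation: remove the probable, keep the residual.
# novelty_score is any callable returning 0..1 surprise; the threshold is an assumption.

from typing import Callable


def residual(candidates: list[dict], novelty_score: Callable[[dict], float],
             threshold: float = 0.7) -> list[dict]:
    kept = []
    for candidate in candidates:
        if novelty_score(candidate) < threshold:
            continue                      # already probable: synthetic echo, known pattern
        candidate["status"] = "residual"  # not yet true, just not yet explained
        kept.append(candidate)
    return kept
```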
Quality Emerges After Reality Tests It
We do not need the system to perfectly distinguish low-quality noise from high-quality breakthroughs at the moment of discovery. At first, both can look the same: surprise, contradiction, deviation, high perplexity.
The real distinction appears later, through validation.
A wrong idea will eventually fail reality tests: tool results, human feedback, sensor confirmation, operational outcomes, repeated failed predictions. A valuable one will survive repeated interaction with the world because it explains reality better than the previous assumption.
Quality is not a prerequisite for entry. It is an emergent property of independent recurrence and successful validation.
Volume alone is not enough. Synthetic consensus is also volume. The difference is independence and validation. Thousands of derivative texts repeating the same assumption are not evidence. But repeated observations, corrections, tests, and outcomes from independent contexts can become evidence.
The learning loop should not reward repetition. It should reward the signal that survives reality.
A Possible Path Forward
We may need dedicated models, or learning pipelines, trained primarily on validated live context rather than only on the general accumulated text.
These context-specialist systems could give higher weight to novelty, rewarding signals that diverge from previous patterns and treating the unexpected as a potential source of learning rather than as noise. But they should not be trained on raw context directly. They should be trained on validated context records, with provenance, scope, outcome, and correction history.
Some mechanisms could be simple at first.
One source could be context ingestion logs from agent audit trails. After anonymization, permission filtering, and validation, those logs could become a record of what context was retrieved, why it was retrieved, which tool produced it, how it was used, and whether the result was useful.
Another source could be AI context snapshots: controlled copies of the prompt, context, metadata, tool results, model output, human correction, and final outcome around moments where a genuinely new idea, anomaly, or useful divergence appeared. These snapshots could become learning candidates, not because they are automatically true, but because they preserve the full situation in which novelty emerged.
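A snapshot could be as simple as a serialized record that freezes the whole situation around the moment of novelty. The field names mirror the list above; the JSON format is an assumption.

```python
# Minimal sketch of an AI context snapshot captured when novelty appears.
# Field names follow the prose; the storage format is an assumption.

import json
from datetime import datetime, timezone


def snapshot(prompt, context, tool_results, model_output,
             human_correction=None, final_outcome=None) -> str:
    record = {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "context": context,
        "tool_results": tool_results,
        "model_output": model_output,
        "human_correction": human_correction,
        "final_outcome": final_outcome,
        "status": "learning_candidate",   # preserved, not yet trusted
    }
    return json.dumps(record, ensure_ascii=False)
```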
The goal is not to train on every interaction. The goal is to identify the interactions where context produced something surprising, useful, corrected, or validated against reality.
A second, separate model or system could then act as curator and coordinator. It would assign context records to relevant domains, synthesize what appears genuinely new, test it against reality through tools, feedback, or measurable outcomes, and decide whether the new material should be incorporated into memory, knowledge bases, evaluation sets, fine-tuning data, or future models.
Reality testing can be human feedback, tool-confirmed outcomes, sensor confirmation, repeated observation, successful prediction, or measurable operational improvement.
This decision can be automatic, or based on rules biased toward inclusion, with extra weight for novelty. Later, the performance of the newly trained model could be compared against the previous one.
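One possible shape for such an inclusion rule, biased toward novelty: validated records are admitted, and novel ones need less repeated evidence before consolidation. The numbers below are assumptions, not recommendations.

```python
# Minimal sketch of a novelty-biased inclusion rule for the curator.
# Thresholds and weights are illustrative assumptions.

def admit_for_learning(record: dict, novelty_weight: float = 2.0) -> bool:
    if record.get("validation_status") != "validated":
        return False                                   # reality testing comes first
    confirmations = record.get("independent_confirmations", 0)
    required = 3                                       # baseline evidence bar
    if record.get("novel", False):
        required = max(1, round(required / novelty_weight))  # lower bar for novelty
    return confirmations >= required
```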
In this architecture, context becomes much more than temporary conversation memory. It becomes the living engine of AI evolution. The fragile mutation is given space to grow in isolation, is carefully evaluated, and, if it proves valuable, is gradually admitted into collective knowledge without being immediately crushed by dominant patterns.
The next generation of AI may not be defined only by larger models or longer context windows, but by systems that can convert real-world context into validated learning without losing the anomaly that made learning possible.
Final Thought
The point is simple: order keeps AI safe, but entropy keeps AI alive.
This is not a rejection of order. It is the disciplined preservation of the chaos that makes new order possible.
AI needs order. But to evolve, AI must remain open to the sparks of new ideas, and disciplined enough to test which sparks can ignite the next fire.