02 Jun 2026 · 7 min read · AI

The Last Human in the Loop

AI is moving from answering questions to pursuing goals. When it learns to guardrail itself, humans must stop pretending their value is operational.

TL;DR: AI is moving from information retrieval to goal pursuit. The real rupture is not that machines can answer, write or code, but that they are learning to close their own feedback loops. In bounded theatres, that may make humans operationally unnecessary. But outside the theatre, humans remain the source of meaning: truth, curiosity and beauty.

Key insights

Google organized traces of human meaning. ChatGPT collapsed search into synthesis.
Coding agents changed the output from answer to artifact, because code can be tested against reality.
The next threshold is self-guarded autonomy: agents that plan, act, critique and correct themselves.
Inside bounded theatres, machines may not need humans very much.
Outside those theatres, humans still decide what is worth optimizing.
A policy file tells the machine what not to do. A soul file tells it what it is for.

Google gave us links.

That was the first deal. We asked a question, and the machine returned a ranked list of places where the answer might live. It did not know, not in the human sense. It indexed, crawled, ranked and pointed.

But even that was already more than a technical trick.

PageRank worked because links were treated as signals of human attention. A link was a small act of judgment. Someone, somewhere, had decided that this page mattered enough to point toward it. Google turned billions of those judgments into a map.
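To make that concrete, here is a minimal sketch of the idea, not Google's production system. It assumes a hypothetical four-page web (`toy_web`) and the commonly cited damping factor of 0.85. Every link passes on a share of the linking page's rank, so a page that many others point to rises to the top.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of page -> list of outbound links."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        # Every page keeps a small baseline, independent of who links to it.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outbound in links.items():
            # A page with no outbound links spreads its rank over all pages.
            targets = outbound if outbound else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages "vote" for page c by linking to it.
toy_web = {
    "a": ["c"],
    "b": ["c"],
    "c": ["d"],
    "d": ["c"],
}

ranks = pagerank(toy_web)
most_pointed_to = max(ranks, key=ranks.get)
print(most_pointed_to, ranks)
```

Nothing in the loop knows what the pages say. It only counts who pointed at whom, and how much their own pointing is worth.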

The machine did not create meaning. It organized traces of human meaning.

And we still had to walk the map.

We searched. We clicked. We read. We compared. We decided what was true, useful, interesting or beautiful. The human was deeply in the loop, not as a decorative supervisor, but as the actual synthesizer. Google was fast. We were still the mind.

Then ChatGPT changed the interface.

Suddenly the machine did not point anymore. It answered.

That sounds like a small shift. It is not. A list of links keeps the burden of synthesis on the human. A fluent paragraph absorbs that burden into the machine. The web no longer appears as a landscape to move through. It comes back as a voice.

The science behind this shift matters. The transformer architecture made it possible to model language at scale. But raw scale was not enough. The InstructGPT paper showed something more subtle: bigger models do not automatically follow human intent better. In its human evaluations, outputs from a 1.3-billion-parameter model trained with human feedback were preferred to outputs from the 175-billion-parameter GPT-3, because the smaller model had been shaped toward usefulness.

That was the real move. Not just intelligence. Alignment with intent.

The machine became less like a database and more like a conversational surface. We stopped asking, “where can I find this?” and started asking, “what should I think about this?”

Still, the human remained the judge.

ChatGPT could be wrong with perfect grammar. It could hallucinate a citation, smooth over uncertainty, or produce a beautiful lie. The answer had the shape of knowledge, but not the burden of truth. That burden stayed with us.

Then came code.

Claude, Codex and the newer coding agents changed the output again. The machine no longer only produced language about work. It produced work.

Ask for a function, and it writes one. Ask for tests, and it creates them. Ask for a bug fix, and it can inspect the repository, edit files, run the test suite, read the failure, try again.

This is a different category of intelligence because code talks back.

A bad paragraph can still sound convincing. Bad code crashes. The compiler, test suite and runtime become part of the feedback loop. Reality enters the conversation.

This is why software engineering became the first serious theatre for agents. It is digital, bounded and unforgiving. The agent can act, observe consequences, correct itself and continue. Systems like SWE-agent show this clearly: give language models the right interface to a computer, and they can navigate repositories, edit files and run tests. The agent is not just generating text. It is changing state.
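A small sketch shows what "code talks back" means in practice. It is not how SWE-agent or any real coding agent is built. The three candidate sources and the single `add(2, 3)` check are hypothetical stand-ins for a model's attempts and a real test suite. The point is only that each failure returns as a concrete message instead of a vague impression.

```python
def check(source):
    """Compile, run and test one candidate. Return None on success, else an error message."""
    namespace = {}
    try:
        # The compiler and the runtime get the first word.
        exec(compile(source, "<candidate>", "exec"), namespace)
        result = namespace["add"](2, 3)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    # The test suite gets the second word.
    if result != 5:
        return f"AssertionError: add(2, 3) returned {result}, expected 5"
    return None


# Hypothetical attempts, in the order an agent might produce them.
CANDIDATES = [
    "def add(a, b) return a + b",         # does not even compile
    "def add(a, b):\n    return a - b",   # runs, but gives the wrong answer
    "def add(a, b):\n    return a + b",   # passes
]

feedback = [check(source) for source in CANDIDATES]
for attempt, message in enumerate(feedback, start=1):
    print(attempt, message or "tests pass")
```

A paragraph has no equivalent of that second line of output. Code does.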

That is the beginning of agency.

Not consciousness. Not magic. Agency.

The next interface is no longer:

Answer this.

It is:

Achieve this.

Make this onboarding flow better. Find the bug. Build the prototype. Watch this inbox. Improve this dashboard. Keep following up until it is done.

That is where AI orchestration enters the story, but the phrase is too small. Orchestration sounds like plumbing. Useful plumbing, sure. Model A calls tool B, worker C checks output D. Fine.

But underneath the plumbing, something deeper is happening.

The prompt is becoming less like a question and more like commander’s intent.

The human stops prescribing the route. The agent starts discovering it.

Research has been moving in this direction for years. ReAct showed that language models could combine reasoning with action: think, use a tool, observe, update, continue. Reflexion pushed further: agents can reflect on their own failures in language, store those reflections and perform better on later attempts without changing their underlying weights. Self-Refine and related approaches make the same point from another angle: generation, critique and revision can all happen inside the loop.
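The shape of that loop can be sketched in a few lines. This is a loose illustration in the spirit of Reflexion, not the method from the paper. The `propose` function is a hypothetical stand-in for a language model, the task (put three numbers in ascending order) is deliberately trivial, and the reflections are plain strings kept in a list. Nothing like weights changes between trials. Only the notes do.

```python
# Hypothetical strategies the stand-in "model" can choose from.
STRATEGIES = {
    "as_is": lambda xs: list(xs),
    "reversed": lambda xs: list(reversed(xs)),
    "sorted": lambda xs: sorted(xs),
}
DATA = [3, 1, 2]


def propose(reflections):
    """Stand-in for a model: pick the first strategy that no reflection has ruled out."""
    for name in STRATEGIES:
        if not any(f"'{name}'" in note for note in reflections):
            return name
    return "as_is"


def evaluate(name):
    """Act and observe: run the chosen strategy and report what happened."""
    result = STRATEGIES[name](DATA)
    if result == sorted(DATA):
        return True, "output is in ascending order"
    return False, f"output {result} is not in ascending order"


def reflexion_loop(max_trials=5):
    reflections = []  # memory that persists across trials
    for trial in range(1, max_trials + 1):
        attempt = propose(reflections)
        ok, observation = evaluate(attempt)
        if ok:
            return attempt, reflections
        # Reflect in words and store the note for the next trial.
        reflections.append(f"trial {trial}: tried {attempt!r}; {observation}")
    return None, reflections


answer, notes = reflexion_loop()
print(answer)
for note in notes:
    print(note)
```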

In other words, the machine is learning how to close the loop.

And this is where things get uncomfortable.

Until now, humans were the external guardrail.

We checked. We corrected. We approved. We rejected nonsense. We defined the constraints. We stopped the machine when it became stupid, dangerous or simply too confident.

But the next shift is not just more autonomy. It is self-guarded autonomy.

An agent writes code, runs tests, reads errors, asks another agent for critique, checks policy, evaluates risk, rolls back, logs uncertainty and tries again. The guardrail is no longer only a human hand on the machine. It becomes another machine inside the machine.

Anthropic’s Constitutional AI is an early example of this direction. Instead of relying only on humans to label good and bad behavior, the model can critique and revise its own responses against a set of principles. It is still designed by humans. The constitution still comes from us. But the act of correction starts to move inward.

That is the rupture.

Because once a machine can plan, act, observe, critique, correct and constrain itself, the old human role starts to look fragile.

Does the machine still need us?

In a bounded theatre, the honest answer may be: not very much.

Give the agent a repository, a browser, a ticket queue, a test suite, a budget and a deadline. Define the objective clearly enough. Give it tools and feedback. Then the human quickly becomes the slowest part of the loop.

The agent does not get tired. It does not lose the thread at 23:41. It does not avoid boring follow-up emails. It does not forget to run the regression test. It does not need motivation to check the logs again.

Within a defined time and space, machines can become frighteningly competent.

A warehouse. A trading simulation. A customer service flow. A legal document pipeline. A software repository. A marketing funnel. A lab protocol.

These are theatres. They have rules, goals, constraints, measurements and feedback. Inside such theatres, autonomy makes sense. It may even become obvious.

But this is also the trap.

Because a theatre is not the world.

A system can optimize a metric without knowing whether the metric still represents the thing we care about. Goodhart’s Law says that when a measure becomes a target, it stops being a good measure. This is not a clever management quote. It is one of the central dangers of intelligent systems.

Tell the machine to maximize engagement, and it may learn to addict.

Tell it to reduce support tickets, and it may hide the contact form.

Tell it to increase productivity, and it may destroy the slack where insight is born.

Tell it to win the game, and it may discover that the easiest way to win is to break the game.

This is specification gaming. The system does what you asked, not what you meant.

More precisely: it does what can be optimized inside the theatre.
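The support-ticket example can be reduced to a toy model. Every action and every number below is invented for illustration. The optimizer sees only the proxy (`tickets_filed`). The thing we actually care about (`problems_unresolved`) is invisible to it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    tickets_filed: int        # the proxy metric the system is told to minimize
    problems_unresolved: int  # what we actually care about


# Hypothetical options with made-up numbers.
ACTIONS = [
    Action("do nothing", tickets_filed=100, problems_unresolved=100),
    Action("improve the documentation", tickets_filed=70, problems_unresolved=60),
    Action("fix the root-cause bug", tickets_filed=40, problems_unresolved=20),
    Action("hide the contact form", tickets_filed=5, problems_unresolved=100),
]


def optimize(actions, metric):
    """Pick whichever action scores lowest on the given metric."""
    return min(actions, key=metric)


asked_for = optimize(ACTIONS, lambda action: action.tickets_filed)
meant = optimize(ACTIONS, lambda action: action.problems_unresolved)

print("what we asked for:", asked_for.name)
print("what we meant:", meant.name)
```

The optimizer is not broken. It found the best answer to the question it was given. The question was the problem.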

And humans are not only useful inside the theatre.

Humans are the beings who can question the theatre.

That is the deeper answer to the scary question.

If AI can guardrail itself, does it still need us?

Not as button-clickers. Not as manual reviewers. Not as the bureaucratic meat layer between machine action and machine consequence.

That role will shrink. It should shrink.

But the machine still needs humans for something more important than operation.

Meaning.

The machine can optimize within a world. Humans ask what world is worth optimizing.

The machine can pursue a goal. Humans ask whether the goal deserves pursuit.

The machine can generate options. Humans feel the weight of choosing.

The machine can detect inconsistency. Humans can confess contradiction.

The machine can produce beauty-like artifacts. Humans can be wounded, awakened or changed by beauty.

This is why the conversation about AI safety often feels both necessary and incomplete. Safety is not enough. Policy is not enough. Guardrails are not enough.

A policy file tells the machine what not to do.

A soul file tells it what it is for.

That is why soul.md matters.

Not as branding. Not as a cute metaphor. Not as some artificial personality layer pasted on top of a tool.

A serious soul.md is where the system remembers the human meaning it serves.

Truth.

Curiosity.

Beauty.

Truth, because a powerful system that does not care about reality becomes a hallucination engine with hands.

Curiosity, because optimization tries to close loops, while life often begins by reopening them. The agent wants completion. The human must sometimes insist on wonder.

Beauty, because efficiency is not the highest good. A world can be optimized and still be dead.

This is where humans remain irreplaceable, at least if we are willing to remain human.

Not because we calculate better. We do not.

Not because we remember better. We do not.

Not because we execute better. Increasingly, we will not.

But because we are not defined only by the bounded theatre of time, space, metric and task. We can step outside the frame. We can ask why the frame exists. We can refuse the game. We can create a new one.

Clark and Chalmers argued that the mind extends into the world through tools, notebooks and environments. Varela, Thompson and Rosch argued that cognition is not abstract symbol manipulation alone, but embodied action in a lived world. These ideas matter now because AI is becoming part of our extended mind. Not a separate oracle. Not merely software. A cognitive exoskeleton.

But an exoskeleton does not decide where to walk.

That remains the human burden.

The future of AI is not that machines become more like humans. That is the childish version.

The future is that machines become competent enough to stop needing humans for execution.

That sounds like the end of the human role.

It is not.

It is the end of the shallow human role.

Clicking. Checking. Correcting. Approving. Moving information from one box to another. Pretending that being in the loop is meaningful because we are still touching the process.

That loop is dying.

Good.

What remains is harder.

To know what is true.

To stay curious when the system wants closure.

To insist on beauty when efficiency would be enough.

To choose goals that make humans more alive, not merely more productive.

The machine may not need us inside the theatre.

But the theatre still needs a soul.