AI’s New Obsession: Continuous Loops That Let Agents Improve Themselves

AI loops could turn agents into always-on workers. Here’s why continuous self-improving systems may reshape coding, cost and oversight.

In short

AI loops are emerging as the next step in agentic AI, with systems prompting and reviewing other systems continuously. The approach could improve software development, but it also raises major questions about cost, drift and oversight.

  • AI loops move agentic systems from one-off tasks to continuous background work.
  • The concept is most compelling for coding, maintenance and other repeatable optimization problems.
  • Supporters see major productivity gains; critics point to cost, drift and trust issues.
  • The trend is closely linked to the broader push for more test-time compute.

The latest buzz in artificial intelligence is not another chatbot interface or a flashy new model release. It is a concept borrowed from software engineering and given new life by agentic AI: loops. At a recent Meta @Scale conference appearance, Claude Code creator Boris Cherny made the case that continuously running AI systems that prompt, critique and improve other AI systems are not a passing fad. They may, instead, be the next major step in how software gets built.

Cherny’s comments land at a moment when the industry is moving rapidly from one-off prompts toward persistent, task-oriented agents. The idea is no longer just to ask a model for help and wait for an answer. The newer ambition is to let software keep working in the background, checking its own output, identifying weaknesses, generating fixes and handing those fixes back into the pipeline without constant human intervention.

That vision is increasingly being described as an AI loop: an agentic workflow where one model or sub-agent observes a task, another proposes changes, and the system repeats until a stopping condition is met. In practice, that can mean code reviews, architecture cleanup, removal of duplicated abstractions and other software-maintenance chores that are tedious for people but attractive to machines operating at scale.

For supporters, loops represent a powerful extension of the agent era. For skeptics, they raise obvious questions about reliability, drift and cost. Either way, the concept is gaining traction because it matches the direction the AI industry is already heading: more autonomy, more compute and more ambition to let systems do real work continuously rather than intermittently.

Why ‘loops’ are suddenly part of the AI conversation

In classic computing, loops are one of the simplest ideas in the field. A program repeats a set of steps until a condition says to stop. What is new in the current wave of AI is not the existence of loops, but the identity of the thing deciding what happens next.

Instead of a programmer hard-coding every branch, AI systems can now be used to determine whether the work is complete, whether more revision is needed or whether another pass through the system would improve the result. That shift is subtle in technical terms and enormous in practical implications.

Cherny argued that this progression is a natural extension of the move from hand-written software to agent-generated software. In his framing, the industry has already crossed one major threshold by allowing agents to write code. The next threshold is letting agents operate in cycles, with one agent generating work and another agent deciding whether further improvement is needed.

Cherny described the shift as a major step beyond agentic coding, saying the industry has moved from humans writing source code by hand to agents writing code, and now toward agents prompting other agents that then write the code.

That idea resonated because it neatly captures where many advanced AI teams are now experimenting. The goal is not just to build systems that answer questions well. It is to create systems that can keep making progress without being told every move to make.

What Claude Code’s creator was describing

At the conference, Cherny did not present loops as an abstract future theory. He talked about specific workflows he uses in his own work, where separate agents have continuous jobs with clear responsibilities.

One agent watching architecture

In one example, a model keeps looking for ways to improve the overall architecture of a codebase. Its job is not to write all the product code from scratch. Instead, it keeps scanning for structural improvements, refactors and opportunities to make the system cleaner or more maintainable.

Another agent hunting duplication

A second agent, in Cherny’s description, searches for repeated or overlapping abstractions that can be consolidated. That is a classic engineering task: spotting duplicated logic and replacing it with a single shared pattern. Because software is always changing, these checks do not end after one pass. They keep running and keep submitting proposed fixes.

In Cherny’s telling, those agents behave much like human contributors. They submit pull requests, wait for review and then move on to the next issue. The difference is that they do not need sleep, meetings or a fresh assignment every morning. If the system is designed correctly, they can remain on duty indefinitely.

How loops fit into the broader shift toward agentic AI

The AI industry has spent the last several years moving through stages of increasing autonomy. First came simple chat interfaces. Then came tools that could call functions, fetch data and use external applications. Now the focus is on agentic systems that can plan, execute and revise over time.

Loops push that model further by formalizing continuous improvement. Rather than having an agent complete one task and stop, the loop turns the agent into a persistent worker. It can keep checking progress, make adjustments and keep going until a target is reached or the budget runs out.

That matters because many useful business tasks are not one-shot problems. Software maintenance, security reviews, content cleanup, document processing and data quality work often require repeated passes. A looped agent can, in theory, handle those kinds of jobs with far less supervision than a traditional workflow.

It also changes the emotional posture of AI use. Most people who work with current assistants are still accustomed to managing them carefully: set the task, review the output, correct mistakes and move on. Loop-based systems ask for a different kind of trust, one that tolerates ongoing action behind the scenes.

Why the concept is not entirely new

Although the current branding around loops is new, the underlying idea has deep roots in computer science.

Recursion and repeated execution have long been part of programming education and software design. Computers have always been good at doing the same thing over and over again. What is changing is that the entity determining the repetition is no longer a deterministic piece of code alone; it can be an AI model making judgment calls.

That means today’s loops are less like a simple textbook while-loop and more like a dynamic workflow in which one model decides whether to continue, another model performs the next action and a controller system keeps the process bounded. In effect, the loop is a structured way to let AI supervise more AI.
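The three roles described above (a model that decides whether to continue, a model that performs the next action, and a controller that keeps the process bounded) can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: `run_loop`, `toy_worker` and `toy_judge` are hypothetical names, and the two "models" are plain functions standing in for real model calls.

```python
from typing import Callable, Tuple


def run_loop(
    work: Callable[[str], str],
    needs_more: Callable[[str], bool],
    draft: str,
    max_passes: int = 5,
) -> Tuple[str, int]:
    """Controller: repeat work() while the judge asks for more, up to max_passes."""
    passes = 0
    while passes < max_passes and needs_more(draft):
        draft = work(draft)
        passes += 1
    return draft, passes


# Toy stand-ins for the two models (assumptions for illustration only).
def toy_worker(text: str) -> str:
    # "Performs the next action": resolves one outstanding item per pass.
    return text.replace("TODO", "done", 1)


def toy_judge(text: str) -> bool:
    # "Decides whether to continue": more work is needed while a TODO remains.
    return "TODO" in text


result, passes = run_loop(toy_worker, toy_judge, "TODO one, TODO two")
print(result, passes)
```

The `max_passes` cap is what makes this different from an open-ended while-loop: even if the judge never says the work is finished, the controller stops.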

That structure is especially appealing in environments where tasks are not neatly finished after a single answer. If a model can identify that a project still has unresolved issues, it may be more efficient to keep it active than to restart the entire workflow from scratch.

The Ralph Loop and other practical tricks

One of the recurring patterns in this space is the so-called Ralph Loop, a playful name borrowed from the Simpsons character Ralph Wiggum. In practical terms, it works by summarizing what the model has done and asking whether the goal has been achieved.

The point of the technique is to reduce the chance that a model wanders aimlessly through a task for too long. If the system can periodically step back and ask whether progress has actually been made, it can bounce the model back into productive work or stop it when the job is done.

This kind of pattern is popular because it solves a familiar problem in AI operations: models can drift. They may continue generating plausible output long after they have stopped being useful. A loop with a built-in checkpoint can help catch that failure mode.
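A periodic checkpoint of the kind described above might look like the following sketch. It follows only the description given here (summarize the work so far, then ask whether the goal has been met) and should not be read as the canonical implementation of the Ralph Loop. The names `checkpointed_loop`, `toy_step`, `toy_summarize` and `toy_goal_met` are hypothetical, and the toy functions stand in for model calls.

```python
from typing import Callable, List, Tuple


def checkpointed_loop(
    step: Callable[[List[str]], str],
    summarize: Callable[[List[str]], str],
    goal_met: Callable[[str], bool],
    max_steps: int = 20,
    check_every: int = 3,
) -> Tuple[List[str], str]:
    """Run step() repeatedly; every check_every steps, summarize and test the goal."""
    history: List[str] = []
    for i in range(1, max_steps + 1):
        history.append(step(history))
        if i % check_every == 0:
            summary = summarize(history)
            if goal_met(summary):
                return history, "goal met"
    return history, "step limit reached"


# Toy stand-ins (assumptions for illustration only).
def toy_step(history: List[str]) -> str:
    return f"fixed issue {len(history) + 1}"


def toy_summarize(history: List[str]) -> str:
    return f"{len(history)} issues fixed"


def toy_goal_met(summary: str) -> bool:
    # The goal in this toy example: at least five issues fixed.
    return int(summary.split()[0]) >= 5


history, status = checkpointed_loop(toy_step, toy_summarize, toy_goal_met)
print(len(history), status)
```

Checking only every few steps keeps the overhead of the summary low, at the price of sometimes running a step or two past the point where the goal was reached.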

The compute question: why loops are tied to test-time scaling

Loops are not just a workflow design choice. They are also part of a much larger industry debate about test-time compute.

That debate centers on a simple idea: if models can improve their performance by thinking longer or using more internal computation, then problems that seem difficult at first may become solvable if enough compute is applied after the model receives a prompt.

OpenAI researcher Noam Brown recently argued that modern models can handle almost anything if enough compute is available. In that view, the question is not only whether a model is smart enough, but whether the system is willing to keep investing resources until the task is finished.

Loops make that philosophy operational. Instead of spending one burst of computation on a single response, the system spends compute across repeated cycles. Each cycle can refine the output a little more, test a little more and attempt to push the solution forward.

That is especially relevant in so-called hill-climbing tasks, where a system is trying to improve an answer or codebase incrementally. Many engineering problems are not about finding a perfect solution instantly. They are about making a series of small improvements until the result clears a desired threshold.
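Hill climbing itself is simple to express. The sketch below keeps a candidate, proposes a small change, and accepts the change only if a score improves, stopping once the score clears a threshold or the iteration cap is hit. Everything here is a toy assumption: the candidate is a single integer, `toy_score` rewards closeness to 10, and `toy_propose` alternates between stepping down and up. In a real loop the proposer would be a model and the score would come from tests or benchmarks.

```python
import itertools
from typing import Callable, Tuple


def hill_climb(
    candidate: int,
    score: Callable[[int], int],
    propose: Callable[[int], int],
    threshold: int,
    max_iters: int = 1000,
) -> Tuple[int, int]:
    """Accept a proposed change only when it improves the score."""
    best = score(candidate)
    for _ in range(max_iters):
        if best >= threshold:
            break  # result clears the desired threshold
        proposal = propose(candidate)
        proposal_score = score(proposal)
        if proposal_score > best:
            candidate, best = proposal, proposal_score
    return candidate, best


# Toy problem (assumption for illustration): move an integer toward 10.
def toy_score(x: int) -> int:
    return -((x - 10) ** 2)


_deltas = itertools.cycle([-1, 1])


def toy_propose(x: int) -> int:
    # Deterministic proposer: alternately try one step down, then one step up.
    return x + next(_deltas)


best_x, best_score = hill_climb(0, toy_score, toy_propose, threshold=0)
print(best_x, best_score)
```

Half of the proposals in this toy run are rejected, which mirrors the cost concern raised elsewhere in this article: every rejected pass still consumes compute.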

Why code quality tasks are a natural fit

Software engineering is a particularly good candidate for this approach because many improvements are measurable. Code can be linted, tested, benchmarked and reviewed. If an agent makes the code cleaner or more efficient without breaking it, the result is visible.

That creates an environment where repeated passes are useful rather than wasteful. An agent can inspect a repository, suggest a refactor, verify the change and then look for the next one. In theory, this can continue as long as there is value to extract from the codebase.

That same logic may extend to other domains where the quality of output can be checked systematically, such as security review, document correction or some forms of analytics.

The upside: autonomy, endurance and scale

Supporters of looped agents see a major opportunity in letting systems do work that does not fit into a finite, one-and-done prompt.

One obvious benefit is endurance. Human teams go offline, get distracted and move on to other priorities. A looped agent can keep working if it is properly monitored and if the task is well defined.

Another benefit is scale. If one agent can improve code architecture, another can hunt for duplication, and others can review tests, security and documentation, the effect can resemble a small always-on engineering workforce.

This is also where the commercial logic becomes important. The companies selling frontier models are effectively selling tokens and compute. Systems that keep running naturally consume more of both. That can make them attractive for the model providers, even if the economics are more complicated for customers.

Supporters argue that, with the right oversight and safeguards, continuously running agents could become powerful enough to justify the added cost because they can produce meaningful gains over time.

The downside: cost, drift and trust

The most obvious drawback of loops is cost. A chatbot session ends when the question is answered. A persistent agentic loop can keep spending tokens and compute for as long as the system allows it. There is no inherent ceiling unless one is imposed externally.

That means these systems can become expensive fast, especially if they are left to run continuously or are given broad mandates. In the wrong setup, a loop could burn through resources without delivering proportionate value.

There is also the risk of drift. The longer a model runs, the more likely it is to wander into unhelpful territory, over-optimize small details or chase changes that do not matter. A loop can amplify the very behavior teams are trying to control if there is no strong oversight.

Trust is the central issue. The more autonomous the system becomes, the more carefully humans need to define the boundaries. Teams must think about what the model is allowed to change, how often its work is reviewed and when the loop should be paused or shut down.

Classic AI failure modes do not disappear

Loops do not eliminate hallucination, bad judgment or misaligned incentives. They simply give those issues more time to surface. If a model misunderstands a task, a loop may repeat the mistake many times instead of once.

That is why practical deployments will likely require strong guardrails. These may include:

  • clear stopping rules
  • budget limits on token usage
  • automatic checks for regressions
  • human review of sensitive changes
  • audit trails for every action the agent takes

Without those protections, a loop becomes less like an intelligent assistant and more like a costly machine with a runaway process.
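The guardrails listed above can be combined in a single controller. The sketch below is one possible arrangement under stated assumptions, not a reference design: `Change`, `GuardedLoop` and the toy regression check are hypothetical, token counts are supplied by hand, and "human review" is modeled as simply logging the change instead of applying it.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class Change:
    description: str
    tokens_used: int
    sensitive: bool = False


@dataclass
class GuardedLoop:
    token_budget: int
    passes_regression_check: Callable[[Change], bool]
    max_changes: int = 10  # stopping rule: cap on applied changes
    tokens_spent: int = 0
    applied: List[Change] = field(default_factory=list)
    audit_log: List[str] = field(default_factory=list)

    def _log(self, message: str) -> str:
        # Audit trail: every decision is recorded.
        self.audit_log.append(message)
        return message

    def run(self, proposals: Iterable[Change]) -> str:
        for change in proposals:
            if len(self.applied) >= self.max_changes:
                return self._log("stop: change limit reached")
            if self.tokens_spent + change.tokens_used > self.token_budget:
                return self._log("stop: token budget exhausted")
            self.tokens_spent += change.tokens_used
            if not self.passes_regression_check(change):
                self._log(f"rejected (regression): {change.description}")
            elif change.sensitive:
                self._log(f"queued for human review: {change.description}")
            else:
                self.applied.append(change)
                self._log(f"applied: {change.description}")
        return self._log("stop: no more proposals")


# Toy data (assumptions for illustration only).
proposals = [
    Change("rename helper", 40),
    Change("edit auth config", 30, sensitive=True),
    Change("delete tests", 20),
    Change("big refactor", 50),
]
loop = GuardedLoop(
    token_budget=100,
    passes_regression_check=lambda c: "delete tests" not in c.description,
)
status = loop.run(proposals)
print(status)
print(loop.audit_log)
```

In this toy run the first change is applied, the second is held for review, the third is rejected, and the fourth never starts because it would exceed the budget. The budget check happens before any tokens are spent, so the ceiling is imposed externally and does not rely on the agent stopping itself.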

What this means for companies building with AI

If looped agents become standard, the practical implications for software teams could be significant.

Startups may use them to reduce engineering overhead, especially in maintenance-heavy environments. Larger companies may deploy them across internal codebases, support tooling, policy compliance and security monitoring. The common thread is repetitive work that benefits from continual revision.

But adoption will depend on more than technical capability. Companies will need budgeting discipline, observability tools and governance systems that can explain exactly what the agents are doing and why. The more continuous the loop, the more important that operational discipline becomes.

That may also change how teams evaluate model vendors. Raw benchmark scores will matter less than the ability to run safely for long periods, recover from mistakes and keep costs under control. In other words, continuous autonomy will likely be judged as much by operations as by model quality.

Where the industry goes next

The rise of loops suggests that AI development is entering a more industrial phase. The novelty of asking a model a question is wearing off. The harder and more valuable challenge is building systems that can keep improving over time with minimal supervision.

That is a major cultural shift. It moves AI from a tool you consult into a worker you delegate to, and then from a worker into a worker that manages other workers. The logic may sound unsettling, but it follows a broader pattern already visible across the field.

More compute, more autonomy and more automation are becoming the default direction of travel. Loops fit neatly into that trajectory because they convert short-lived model interactions into ongoing work pipelines.

Whether that becomes a mainstream way to build software will depend on a few critical questions: How much can be automated safely? How much oversight is required? And how expensive can the process be before the gains disappear?

For now, the answer from the AI world appears to be that the experiment is worth trying. If current models can already make repeated improvements and if future models get better at deciding when to stop, then looped systems may become one of the most important patterns in the next phase of AI adoption.

Key milestones in the rise of AI loops

Stage                    | What it looks like                                      | Why it matters
-------------------------|---------------------------------------------------------|---------------------------------------------
Manual coding            | Engineers write source code by hand                     | Human-only software production
Agentic coding           | AI writes code in response to prompts                   | Agents start taking on discrete tasks
Agent-to-agent workflows | One model prompts or reviews another                    | AI begins supervising AI
Continuous loops         | Agents keep running, revising and checking in background | Persistent autonomy and ongoing optimization

Why this matters beyond coding

Although the current discussion is centered on software engineering, the same structure could eventually show up in many other fields. Any environment with repeatable tasks, measurable outcomes and ongoing change could become fertile ground for continuous agent loops.

That includes internal operations, data cleanup, compliance workflows and some forms of research assistance. The more a task resembles a long-running optimization problem, the more attractive a loop becomes.

The key question is whether the industry can translate theoretical capability into dependable practice. If it can, loops may become one of the most important building blocks of enterprise AI. If it cannot, they may remain a clever but expensive experiment.

For now, the momentum is unmistakable. AI is moving from single-answer systems toward systems that can think, revise and collaborate with themselves over time. In that sense, loops are not just a technical trick. They are a signal that the next phase of artificial intelligence may be defined by persistence.
