Loop Engineering Totally 10x Hermes agents
Defines loop engineering and explains how it moves from prompting the agent to designing systems that prompt the agent for you.
Loop engineering makes Hermes agents self-drive projects by designing autonomous loops instead of writing prompts.
Summary
This AI LABS video dives into loop engineering and why it radically shifts how we build with agents like Hermes and Claude Code. The presenter contrasts loop engineering with traditional prompt engineering, showing how autonomous loops redefine who crafts the plan and who executes it. Key figures mentioned include OpenClaw’s creator and Boris, the creator of Claude Code, whose remarks at Anthropic’s developer conference point in the same direction. The video traces the evolution from Opus 4.5 enabling long-running tasks to Fable 5’s push toward more capable models, then explains how deterministic and non-deterministic loops differ in practice. A five-step workflow is laid out for building loops, with emphasis on context management, feedback quality, verification gates, termination rules, error handling, and cross-turn state management. The sponsor segment highlights Scrimba’s interactive Python courses, where the video player doubles as a code editor. Finally, the host showcases two loop types: deterministic loops for clearly defined goals (e.g., test-driven deployment monitoring with Hermes) and non-deterministic loops for open-ended tasks like UI design, incorporating an AI slop detector and adversarial verification. The takeaway is that autonomous loops are accelerating AI-powered development, with the user's domain knowledge remaining crucial for defining end goals. The video ends by pointing viewers to community skills and resources to start building loops with Hermes and Claude Code.
Key Takeaways
- Loop engineering shifts control from prompt writing to system design: the agent drives itself rather than the human writing prompts.
- Deterministic loops rely on a clear end goal and tests; Hermes can monitor production and fix issues automatically to pass all checks.
- Non-deterministic loops handle tasks like UI design; they use AI slop detectors and a verifier-builder adversarial loop to improve outputs.
- The five-step loop framework covers state checks, decision-making, action, feedback, and completion judgment, with emphasis on context, feedback quality, and termination conditions.
- Context management matters because chat history buries important instructions; external files track progress and keep the loop on track.
- Loop costs scale with token usage, so it’s important to balance loop depth with available compute and token budgets.
- Long-running capabilities emerged with Opus 4.5 and Fable 5, enabling tasks to unfold without heavy step-by-step scripting; Hermes adds self-evolving skills on top of that.
Who Is This For?
Essential viewing for AI engineers and developers who want to deploy autonomous agents at scale, especially those using Hermes or Claude Code, and who want practical guidance on building robust, self-sustaining loops.
Notable Quotes
"There's a new term going around and you might have already heard it. It's called loop engineering."
—Opening framing of loop engineering as a now-common concept.
"With loop engineering, the core idea is simple. You stop being the person who writes the prompt that drives the agent, and instead, you let the agent drive itself."
—Core definition of loop engineering vs traditional prompting.
"The model finds the right path by trying different things."
—Analogy to learning and iteration in loop optimization.
"One thing to keep in mind, though, since you're handing the job of figuring out the path over to the model instead of doing it yourself, loops get expensive in tokens."
—Practical cost consideration for using loops.
"Since Hermes is always running, it's a really good agent to implement this loop on."
—Highlighting Hermes as a continuously active agent.
Questions This Video Answers
- How do I start building autonomous loops with Hermes and Claude Code?
- What's the difference between deterministic and non-deterministic loops in AI agents?
- How do I manage context and state effectively in long-running AI loops?
- What is AI slop and how can I detect and fix it in UI automation tasks?
- Why did Opus 4.5 and Fable 5 change the capabilities of long-running AI tasks?
Loop Engineering, Hermes agent, Claude Code, OpenClaw, Boris Claude Code, Opus 4.5, Fable 5, Anthropic conference, AI slop detector, deterministic loop vs non-deterministic loop, looping workflow, goal-oriented automation
Full Transcript
There's a new term going around and you might have already heard it. It's called loop engineering. And just like every other hype term, everyone is talking about it like it's something new. It's not. But when you combine it with an always running agent like Hermes, it stops being hype. Most people who are trying to set these up are getting the loop right and missing the thing that actually makes it work. And if you already know there are two types of loops, there's a specific setup inside one of them that almost nobody is doing. Once you see it, the way you think about building with agents changes completely.
By the end of this video, you'll understand exactly what it is, and you'll have it running on Hermes and even Claude Code without you having to step in at all. With loop engineering, the core idea is simple. You stop being the person who writes the prompt that drives the agent, and instead, you let the agent drive itself. But to see why it's a shift in the first place, you've got to compare it to what came before. The skill that used to matter was prompt engineering, where all our focus went into writing the right series of instructions to drive the coding agent properly.
But loop engineering flips that around. Instead of writing the prompt yourself, you design the system that does the prompt engineering for you and drives the agent on its own. So, the focus moves away from crafting instructions and toward designing systems that run themselves. All of this started when the creator of OpenClaw said you shouldn't be prompting your coding agents anymore and that you should focus on designing loops that prompt the agent for you. And he's not the only one. Boris, who is the creator of Claude Code, also made the same claim at Anthropic's annual developer conference, where he said he doesn't prompt Claude anymore.
He's got loops running that prompt Claude, and it figures out for itself what needs to be done. So the question is, how do you get started with them? All of it comes down to how well you can set up the systems where you don't have to worry about prompting the agent at all. You define what you need and the agent does the rest. That's exactly where AI-powered development is heading. Before we get into how to actually build them, you need to be clear on what a loop is. A loop is basically a process where you define the end goal and the agent figures out the steps to reach it on its own.
It corrects itself along the way and works around problems until it reaches the goal you set. A few months ago, before models got capable enough to sustain long tasks, this wasn't possible. If you needed to build an app, you'd prompt the agent, monitor what it was doing, check the output yourself, find the issues, and reprompt to fix them. You were the loop. You were the part doing the error checking and course correcting between every step. That's what development still looks like for most people. And that's exactly what loop engineering is about to take off your plate.
Now, this might sound like a brand new concept, but loops have actually been around for a while. Cron jobs are a good example of a loop you've probably already seen. They're just tasks scheduled to run repeatedly and automatically without you having to trigger them each time. The only real difference is that a cron job runs at a fixed time. So, with loops in place, the work stops being about writing the prompt. Your agent's performance on a task comes down to how well you define the end goal. To some of you, this process will sound a lot like reinforcement learning.
If you haven't come across it, reinforcement learning is basically a way of training a model where you don't show it the right answers. Instead, you just tell it when it did well and when it didn't, and it gradually figures out how to get better on its own. The model finds the right path by trying different things. It gets a positive signal when it moves in the right direction and a negative one when it doesn't. The same idea applies here, except the model itself isn't what's being trained. Instead, the agent is working toward completing the task you want done, iterating on it in the same way a model would improve during training.
If it fails the loop you've put on, the agent doesn't mark the task as done. It tries again, keeps going, and corrects itself until it reaches the goal you set. Now, after hearing all this, you might wonder what's actually left for you to do if everything is becoming autonomous. But your role doesn't shrink. It gets more important because it's your domain knowledge and experience that define the end goal in the first place. And that ends up showing in everything you build and ship.
This is exactly why the push toward autonomous loops is only speeding up, and it's showing in every new feature that drops right now. Fable 5 is the clearest example yet. Anthropic dropped it even though they'd been calling for a slowdown in AI development because the models are getting capable at a pace that's hard to keep up with. And after releasing it for some time, they even pulled it. They built it for long and complex tasks, and it performs better the longer and more complex the task gets, which is basically the opposite of how models used to work.
This shift really started with Opus 4.5. Once that dropped, long-running tasks got dramatically better, and you didn't need to set agents up with carefully guided harnesses anymore, basically structured setups that walked the agent through each step. The focus moved instead toward preparing the project to run over the long term because the models are now capable enough to handle things on their own without much step-by-step handling. But the loop isn't the only thing that matters. You also need to structure your project in a way that lets the agent work on its own for a long time without you having to step in.
So, a lot of people have been building and open-sourcing systems for exactly this kind of setup. The Ralph loop was one of the first. It worked by setting the end goal and making sure the agent couldn't drift away from it. It did this through hooks, which are basically scripts that run automatically when something specific happens. So this script strictly prevents the agent from marking a task as done unless it has actually met the condition. But hooks are rigid. So Claude introduced its own goal command, which did the same thing but with more flexibility. Instead of a hard-coded check, it lets another model decide whether the task is actually finished.
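To make the difference between a rigid hook and a model-based check concrete, here is a minimal sketch of a completion gate. It is not the Ralph loop or the goal command itself: `completion_gate`, `hard_coded_check`, and `model_judge` are hypothetical names, and the judge is a stand-in for a call to a second model.

```python
from typing import Callable

def completion_gate(claims_done: bool, check: Callable[[], bool]) -> str:
    """Gate a 'done' claim behind an independent check.

    The agent's own claim is never enough; the check has the final word.
    """
    if not claims_done:
        return "continue"   # agent has not claimed completion yet
    if check():
        return "done"       # claim confirmed by the check
    return "rejected"       # claim blocked; agent is sent back to work

def hard_coded_check() -> bool:
    # Hypothetical rigid hook: a fixed condition, e.g. "all tests passed".
    tests_passed, tests_total = 7, 9
    return tests_passed == tests_total

def model_judge() -> bool:
    # Hypothetical flexible check: stand-in for asking another model
    # whether the goal was met. Here it simply returns a fixed verdict.
    return True

print(completion_gate(True, hard_coded_check))  # rejected
print(completion_gate(True, model_judge))       # done
```

Either check plugs into the same gate, which is the point: swapping a fixed condition for a judge changes how flexible the verdict is, not where it sits in the loop.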
We covered goal buddy 2 which built on that by having the agent track its progress in local files and define exactly what done looks like before it even starts so it always knows what it's working toward. The Hermes agent and OpenClaw were both built on the same philosophy. They take you out of the picture entirely and let the agent handle everything on its own. Now, if you want to build these loops, we've got a simple five-step system for you. And since there are two types of loops, some of those steps work a little differently, but we'll get into both types later on.
For now, we'll start in Claude Code, and later in the video, we'll look at how to do the same thing in the Hermes agent. The first step is checking what state the project is in. From that, the model decides what the next action should be. Then it acts on that decision and this is where the actual work happens. The agent calls tools, writes to files and runs commands to get the task done. Once that's finished, it gathers feedback to see what actually happened. And based on that, it decides whether the task is done or not.
This is also where the difference between prompt engineering and loop engineering becomes obvious. With prompt engineering, you're only ever controlling the decision step, while loop engineering handles all five together. Building a loop that works well means getting a handful of things right. And each one is there because of a specific problem it solves. The first is context management. You pay attention to what goes into the context on every turn because that's what determines what the agent actually knows at any given point. You can't rely on the chat context alone, even with context windows as large as a million tokens, basically how much the agent can hold in memory at once.
That's because as the conversation grows, your system prompt and instructions get buried under recent tool outputs. The agent's attention naturally pulls toward whatever is most recent. So the important stuff gets lost. That's why managing context matters so much. The next thing to get right is feedback quality. Feedback is what tells the agent how it did. And it's one of the most important signals in the whole system. It can take a lot of forms like the output of a test run or a screenshot of the UI it just built. And whatever form it takes, that's what the agent reads to figure out its next move.
Verification gates are what turn that feedback into a clear verdict. They're the checkpoints that tell the agent whether a task is actually done or not. You also need a termination condition, basically a rule that tells the loop when to stop. And this one has to be set explicitly. Otherwise, the agent either quits too early or keeps going without making real progress. The thing people most often overlook is error handling. You have to spell out what the model should do when a tool call fails. So, the system handles it cleanly instead of leaving things in a broken state that just creates more problems.
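A verification gate, an explicit termination condition, and tool-error handling can sit together in one small wrapper. The sketch below is an assumption-laden illustration: `run_with_guards`, `step`, and `gate` are hypothetical names, and a failed tool call is modeled as a `RuntimeError`.

```python
from typing import Any, Callable

def run_with_guards(
    step: Callable[[int], Any],    # one turn of work; may raise on tool failure
    gate: Callable[[Any], bool],   # verification gate: feedback -> verdict
    max_turns: int = 5,            # termination condition: hard turn limit
    max_tool_errors: int = 2,      # error handling: tolerated tool failures
) -> str:
    tool_errors = 0
    for turn in range(1, max_turns + 1):
        try:
            feedback = step(turn)
        except RuntimeError as err:
            # A failed tool call is counted and retried on the next turn
            # instead of leaving the loop in a half-finished state.
            tool_errors += 1
            if tool_errors > max_tool_errors:
                return f"aborted after {tool_errors} tool errors: {err}"
            continue
        if gate(feedback):
            return f"done on turn {turn}"
    return "stopped: turn limit reached"

def flaky_step(turn: int) -> int:
    # Hypothetical tool: fails once, then reports how many checks pass.
    if turn == 1:
        raise RuntimeError("tool call failed")
    return turn

print(run_with_guards(flaky_step, gate=lambda passed: passed >= 3))
# done on turn 3
```

The loop can only end in one of three named ways, so it never quits silently and never runs forever.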
And finally, you need to manage state across turns, basically keeping track of where the task is as the conversation grows. The context window can't hold everything forever, so you lean on external files that track information for the agent and let it keep working without losing the thread. One thing to keep in mind, though: since you're handing the job of figuring out the path over to the model instead of doing it yourself, loops get expensive in tokens. So, you need to be deliberate about when you actually use them. The more tokens a loop can work with, the better it tends to handle the task.
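One way to picture external state plus context management is a progress file that is reloaded every turn, with the instructions placed first so they are not buried under tool output. This is a sketch under assumptions: the file layout, `load_state`, `save_state`, and `build_context` are all hypothetical, not a format Hermes or Claude Code prescribes.

```python
import json
from pathlib import Path

def load_state(path: Path) -> dict:
    """Read progress from disk, or start fresh if no file exists yet."""
    if path.exists():
        return json.loads(path.read_text())
    return {"goal": "", "completed": [], "next": None}

def save_state(path: Path, state: dict) -> None:
    """Persist progress so a later turn (or a fresh session) can resume."""
    path.write_text(json.dumps(state, indent=2))

def build_context(
    instructions: str,
    state: dict,
    recent_outputs: list[str],
    keep_last: int = 2,
) -> str:
    """Rebuild the prompt each turn: instructions, then progress, then
    only the most recent tool outputs."""
    recent = recent_outputs[-keep_last:] if keep_last > 0 else []
    progress = "PROGRESS: " + json.dumps(state, sort_keys=True)
    return "\n".join([instructions, progress, *recent])
```

Trimming to the last few outputs also keeps token use per turn roughly flat, which matters given how quickly loop costs add up.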
But before we move forward, let's have a word from our sponsor, Scrimba. Most Python courses are just someone talking over slides. Scrimba is different. Their video player is the code editor, so you can pause anytime, edit the instructor's code directly, and see what happens. No tab switching, no copy pasting, just hands-on coding from the start. Their new Learn Python course caught my attention because instead of random exercises, you actually build something real. From day one, you're building payup, a fully functional expense splitting app, and every concept gets applied immediately. You start from absolute zero, no prior Python knowledge needed, and work through variables, strings, capturing user input, arithmetic operators, type conversion, data cleaning, and number formatting, all by building features for the app.
By the end, you've built a working project from scratch that proves you actually know Python. This is just part one of several that will become available over the coming weeks. And currently, it's totally free to access. Get started today with their free courses, and our users will get an extra 20% off on their pro plans. So click the link in the pinned comment or scan the QR code and start building today. As we mentioned, there are two types of loops. The first one is called the deterministic loop. You use it for tasks that have a clear definition of what done actually looks like.
That could be tests passing, code compiling successfully or anything like that. These loops are fairly straightforward to work toward because the end goal is clear. So the model knows exactly what it needs to do before it can call the task done. Since Hermes is always running, it's a really good agent to implement this loop on. We've created multiple workflows on it before and showed in our previous video how it handles a lot of our work on its own. The core of a deterministic loop is the clear definition of the end goal and for the apps you've hosted, that definition is your tests.
So, you can point the Hermes agent at any app you've deployed with test cases and have it monitor it for you. Now, if a change or a commit ends up breaking production, you can set up an automation on Hermes to catch it. The reason it works best here is that it comes with the self-evolving skills feature. So, it automatically creates and evolves skills based on the workflow, which keeps the health of the app in check. Once you've set up that monitoring automation, you can ask it to launch Claude Code in non-interactive mode, basically running it on its own without you having to drive it, and have it fix issues in a loop until all the test cases pass.
What it does from there is set up the automation workflow and load skills like the subagent-driven development skill and the GitHub PR workflow skill, which tell it how to manage the app on GitHub. It first identifies the issues that were breaking production, then launches Claude Code in non-interactive mode, which takes the tests and commits the changes once all of them pass. After it has run every test and fixed whatever was causing production to fail, it uses the GitHub CLI to commit the changes. The app ends up running without any failures because it has confirmed that all the checks for a successful deployment are in place.
If you like these breakdowns, subscribe to the channel, click the notification bell, and hit the hype button, too. On the channel, we post content that helps you learn new ways to optimize different processes in different businesses with AI. Your support, whether it's subscribing, the notification bell, or the hype button, helps us create more content like this and reach more people. It means a lot to us. Now the second type is the non-deterministic loop and these are tasks where you can't just set a clear rule to check whether the job is done the way you can with deterministic loops.
Because of that there's no clean way to verify the outcome. These are the kinds of things that we as humans can look at and judge for ourselves like building a UI or implementing a feature that needs a judgment call. So when you're working with a non-deterministic loop, the workflow is different. If you're applying AI to UI, you already know that it tends to fall back to the same patterns all the time. That's why we created a skill called AI slop detector, which holds all the instructions on how to avoid AI slop and lists the patterns that actually give it away.
And the reason we're using Hermes again is the self-evolving skills. If we still find AI slop in the UI after running the skill, the skill can update itself to incorporate that feedback directly. And that's exactly why we set this workflow up on Hermes. So, we asked Hermes to use the skill and check whether the UI has any of those patterns. If it does, it fixes them and launches Claude Code in non-interactive mode to run the skill and keep fixing what it finds until there's nothing left to fix. Another benefit we get out of Hermes is that the model reviewing the work is different from the one building it.
We were using the GPT models which are known to be among the best for code review. So the Claude models become the builder and the other agent becomes the verifier. That's what completes the adversarial loop where the two check each other's work. Once that loop ran, it generated a much better UI than the generic output the Opus models are putting out nowadays. And if you still spot any sign of AI slop in the UI after the agent loop has ended, you can just mention it and it will update the skill for you, strengthening the verifier you already have.
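The builder and verifier roles can be sketched as two separate callables that pass a draft back and forth. This is a toy model of the adversarial loop, not the AI slop detector skill: the pattern names in `SLOP_PATTERNS` are made up for illustration, and in practice `build` and `verify` would be two different models.

```python
from typing import Callable

# Hypothetical slop patterns; the real skill's list is not shown in the video.
SLOP_PATTERNS = frozenset({"purple gradient", "emoji headings"})

def adversarial_loop(
    draft: frozenset,
    build: Callable[[frozenset, frozenset], frozenset],  # builder revises the draft
    verify: Callable[[frozenset], frozenset],            # verifier lists issues
    max_rounds: int = 4,
) -> tuple[frozenset, frozenset, int]:
    """Alternate verifier and builder until no issues remain or rounds run out.

    Returns the final draft, any unresolved issues, and the rounds used.
    """
    issues = verify(draft)
    rounds = 0
    while issues and rounds < max_rounds:
        draft = build(draft, issues)
        issues = verify(draft)
        rounds += 1
    return draft, issues, rounds

first_draft = frozenset({"purple gradient", "emoji headings", "clear hierarchy"})
final, unresolved, rounds = adversarial_loop(
    first_draft,
    build=lambda draft, issues: draft - issues,   # builder drops flagged patterns
    verify=lambda draft: draft & SLOP_PATTERNS,   # verifier flags known patterns
)
print(sorted(final), rounds)  # ['clear hierarchy'] 1
```

Adding a newly spotted pattern to `SLOP_PATTERNS` is the toy equivalent of the skill updating itself: the verifier gets stricter without the builder changing.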
We've enhanced this skill to match multiple AI slop patterns that we and Hermes identified collectively. If you want to use this skill, you can get it from our community, AI LABS Pro. The link's going to be in the description. That brings us to the end of this video. If you'd like to support the channel and help us keep making videos like this, you can do so by using the Super Thanks button below. As always, thank you for watching and I'll see you in the next one.