32 Tricks to Level Up Claude Code in 16 Mins

Nate Herk | AI Automation | 00:16:15 | Apr 29, 2026

Chapters: 5

The video introduces Claude Code hacks that take you from a complete beginner to a power user, promising a progression from basic hacks to advanced techniques with the best tips saved for the end.

A practical, battle-tested set of Claude Code tricks from beginner to pro, focused on Claude Code workflows, context management, and cost-saving sub-agents.

Summary

Nate Herk packs 32 practical Claude Code hacks into this fast-paced guide, starting with beginner wins like running /init to auto-generate CLAUDE.md and setting up a terminal status line for context visibility. He emphasizes keeping context lean, using /context to spot token bloat, and using /compact and /clear to manage context across tasks. Voice input, plan mode, and treating Claude like a junior developer are highlighted as core productivity boosters. As the tips scale, Nate explores sub-agents, custom skills, and Haiku for cheaper parallel work, plus strategies to refresh CLAUDE.md and route details to external files. Advanced techniques include parallel Git worktrees, API endpoints over MCP servers, recurring tasks with /loop, and hosting on a VPS for always-on sessions. The closer covers safety with permissions, agent teams, and Context7 MCP to keep docs fresh, plus a free PDF guide in his free Skool community.

Key Takeaways

Run /init on existing projects to auto-generate a CLAUDE.md that maps architecture, conventions, and key files.
Use /statusline to display model, context percentage, and cost in the terminal for real-time awareness.
Enable the native /voice command to dictate code, with third-party dictation apps as a backup until the rollout is complete.
Keep context tight and task-focused; avoid dumping entire codebases into a conversation, since less noise improves Claude's performance.
Use /context to identify token bloat and /compact to shrink history while preserving critical items like API decisions.
Deploy sub-agents for parallel work so research, testing, and exploration run concurrently without cluttering the main thread.
Create custom skills in .claude/skills (e.g., techdebt.md, codereview.md) for repeatable, team-wide workflows that can be Git-backed and shared.

Who Is This For?

Developers already using Claude Code who want to accelerate workflows, manage context and costs, and scale with sub-agents, skills, and parallel sessions.

Notable Quotes

"Claude Code will scan your entire code base, your folders, your files, and it will generate a CLAUDE.md file, which is basically like a cheat sheet for that project."

—Describes the core first-step benefit of /init for any project.

"Use /statusline and tell Claude Code what you want to see. It basically generates a little script that sits at the bottom of the terminal, a mini dashboard for your session."

—Shows how to maintain visibility into context and costs during work.

"If you notice that Claude starts going down the wrong path, hit escape, correct course, and re-prompt."

—Practical tip for keeping outputs aligned with goals.

"Context7 MCP fixes that by having up-to-date live documentation injected into the conversation before coding."

—Highlights the reliability boost from Context7 MCP.

"Agent teams are like sub-agents but can talk to each other and share a task list for cohesive output."

—Introduces a more integrated, scalable collaboration model.

—Introduces a more integrated, scalable collaboration model.

Questions This Video Answers

How do I set up a CLAUDE.md for a new Claude Code project?
What is plan mode in Claude Code and when should I use it?
Can I run parallel Claude Code sessions without them interfering with each other?
What are the benefits of Context7 MCP for live documentation in coding sessions?
How can I implement sub-agents and agent teams to scale Claude Code workflows?

Claude Code, CLAUDE.md, /init, /status, /context, Haiku, MCP, sub-agents, custom-skills, plan-mode, context-management, token-optimization

Full Transcript

These are the Claude Code hacks that took me from a complete beginner to mass-producing workflows and building websites, apps, and AI agents in real time. So today we're going to be going from beginner hacks all the way to advanced power-user stuff, and the best ones are saved for the end.
All right, so starting off with our beginner hacks, number one is to run /init on every project. So if you've already got an existing project with files already in there, the first thing you should do is open it up and type /init. Claude Code will then scan your entire code base, your folders, your files, and it will generate a CLAUDE.md file, which is basically like a cheat sheet for that project. It'll map out your architecture, your conventions, and any key files that you have in there. So instead of having to re-explain your project every session, Claude will basically just contextualize everything and initialize everything and know what you're working with. And if you're starting a new project from scratch, then you can have Claude Code help you create that CLAUDE.md file yourself just by explaining what the goal of the project is, what tech stack you want to use, and any rules or key folders and files.
All right, number two is to set up a status line. So if you're working in the terminal, you can type /statusline and tell Claude Code what you want to see: your model, your context percentage, cost, whatever. It basically generates a little script that sits at the bottom of the terminal, so every single time you're talking to it, you can just see that status line. It's kind of like a mini dashboard for your session. So it's really helpful to always be able to see how much context you have left, so you can avoid context rot.
Hack number three is using voice input. So Claude Code just shipped a native /voice command, which means you can literally just talk to your terminal and have it code for you now.
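The "little script" behind the status line in hack number two boils down to turning session data into one line of text. Here is a minimal sketch of that idea in Python. The dictionary keys (`model`, `context_used_pct`, `cost_usd`) are hypothetical placeholders chosen for illustration, not a documented Claude Code schema, so the real script that Claude Code generates may read different fields.

```python
def format_status_line(session: dict) -> str:
    """Build a one-line dashboard from a session-data dictionary.

    The keys used here (model, context_used_pct, cost_usd) are hypothetical
    placeholders, not a documented Claude Code schema.
    """
    model = session.get("model", "unknown-model")
    context_pct = float(session.get("context_used_pct", 0.0))
    cost = float(session.get("cost_usd", 0.0))
    return f"{model} | context {context_pct:.0f}% | ${cost:.2f}"


# Example with made-up session values.
print(format_status_line({"model": "opus", "context_used_pct": 42.4, "cost_usd": 1.5}))
```

Missing keys fall back to neutral defaults, so the line still renders when a field is not available.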
It's still rolling out and will be out for everyone soon, but another good hack is to just use an app that lets you dictate anywhere. So if you want to see the tool that I use, you can check out the description. Now I can just talk and words will appear anywhere.
Hack number four is to keep your context small. So don't dump your entire code base into a conversation; only give Claude what it needs for the current task. So try to break big problems into small, focused steps. The less noise in the context window, the better Claude performs. It's simple, but a lot of people ignore this.
Hack number five is to use /context to find your token bloat. So if you do /context, you'll see exactly what's eating your tokens, whether that's system prompts or file contents, MCP servers, whatever it is. All of that gets broken down into percentages. So if your session feels a little bloated, you can actually investigate it, diagnose where the problem is, and then restructure.
Hack number six is to compact at 60% and also clear between tasks. So when your context hits around 60%, type /compact and Claude Code will compress your conversation history so you can keep going without losing important stuff. And something interesting is that you can actually do a /compact but tell it to keep certain things. Like, "Hey, /compact, but keep all of the API integration decisions and database schema." So Claude will automatically shrink everything down and preserve the stuff that you need to keep in there. And if you're actually going to switch to a completely different task and you don't need that conversation history, then use /clear to just wipe the slate clean so you're starting from a new conversation. But luckily you still have your CLAUDE.md, you still have all the other files, so it's not like you're actually starting from scratch.
So hack number seven is to always start in plan mode.
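The percentage breakdown from hack number five and the 60% rule from hack number six can be written down as a small decision helper. This is only a sketch of the reasoning, not how Claude Code works internally: the 200,000-token window and the token counts are assumed example numbers, and the function names are made up for illustration.

```python
def context_breakdown(tokens_by_source: dict[str, int], window_size: int) -> dict[str, float]:
    """Return each source's share of the context window as a percentage."""
    return {name: 100.0 * count / window_size for name, count in tokens_by_source.items()}


def next_action(used_pct: float, switching_tasks: bool, compact_at: float = 60.0) -> str:
    """Suggest /clear when switching tasks, /compact at the threshold, else continue."""
    if switching_tasks:
        return "/clear"
    if used_pct >= compact_at:
        return "/compact"
    return "continue"


# Assumed example numbers: a 200,000-token window and three token sources.
breakdown = context_breakdown(
    {"system prompt": 10_000, "file contents": 90_000, "MCP servers": 30_000},
    window_size=200_000,
)
print(breakdown)
print(next_action(sum(breakdown.values()), switching_tasks=False))
```

Switching tasks wins over the threshold check because a fresh conversation makes compaction unnecessary.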
So that means you can hit Shift+Tab to cycle between modes, or you can just choose it manually. And once Claude's in plan mode, it can still read, it can still research, but it won't actually change anything. So Claude will outline the steps, it will ask clarifying questions, and it will map out the approach before writing a single line of code, which has been shown to improve the quality. Now once you like the plan, you switch out of plan mode and tell it to execute, and this alone will dramatically reduce how many times you have to go back and correct Claude.
Hack number eight: we have to treat Claude like a junior developer, which means don't always give it direct commands like, "Write me a function that does X," but try to give it problems instead. So say, "How should we handle growth tracking?" and let it think through the approach, because when it makes its own assumptions and thinks through decisions, you can ask it to explain those. And this has also been shown to get better outputs, when Claude reasons through the problem first. So it's like plan mode, but now you're having it think a little bit deeper.
Okay, hack number nine is to make Claude ask questions. So a lot of times in plan mode it will do this natively, but you can actually tell it to invoke its AskUserQuestion tool. You can tell it, "Continuously ask me questions until you're 95% confident that you understand exactly what I need and exactly what you need to do." And once again, this alignment saves you from having to go back and forth with, you know, three or four rounds of revisions.
All right, hack number 10 is to build self-checking into the to-do lists. So you know how Claude makes a to-do list when it starts building? Well, you can actually build verification steps right into that list. So let's say one to-do is to build a website. The very next to-do could be to take a screenshot of the website and check that everything looks right.
And then maybe the next step is to use Chrome DevTools to open the browser and make sure that there are no actual errors in functionality. So you're now baking quality checks directly into the execution plan, so Claude isn't just building stuff and handing it to you for feedback, but now it's building something, checking it, making sure everything's good, and then getting your feedback. And another cool thing that I like to do here is say, "Don't move on to your next to-do until you're 95% confident that that to-do is good." Because it's AI, it's really hard to one-shot what you're looking for, but you'd rather have it one-shot 90% of the way there rather than 60 or 65%.
Okay, so those were our beginner hacks. Now let's step it up a little bit. These next ones are for the people who are already using Claude Code a little bit and want to move faster.
All right, so hack number 11 is to deploy sub-agents for parallel work. Try telling the main session to use sub-agents in your prompt when you're working on complex problems. Claude will spin up isolated sub-agents that each have their own context window. They can each be using their own model, and each agent works in parallel, which means the main thread stays clean while the sub-agents go research, write tests, or explore different approaches. When they're done, they all report back to that main agent with their findings. So it's like having a team of developers instead of just having one. And you can even pair this with the model hack for cheaper tokens, which means you can have all the sub-agents running on Haiku for simpler stuff and your main thread can stay on Opus.
All right, hack number 12 is to build custom skills. This means you can create reusable prompt files in your .claude/skills directory. So for example, you can have one skill called techdebt.md, which tells Claude exactly how to scan for technical debt.
Or you can have one called codereview.md, which knows exactly how to review your code base. And then all you have to do is invoke that skill in natural language or just use the / command directly, and it will run that entire workflow consistently every single time. You can even commit them to GitHub and your whole team can instantly use them as well. So you can automate your actual SOPs.
All right, so hack number 13 is something that I alluded to a little bit earlier, but that's basically just using Haiku for sub-agents. Because you can set the model for the sub-agents that you spin up, when you have simple tasks or you're processing a large amount of data, use Haiku. It's way cheaper and it still gets the job done. Specifically, if you need a sub-agent to go scrape a ton of different articles, read hundreds of thousands of tokens, and then just give Opus, your main agent, a small summary or the key highlights. It just doesn't really make sense to have such a heavy and expensive model reading hundreds of thousands of tokens if it just needs a few bits of information. And if you do this right, it can really keep your costs down without sacrificing quality where it matters.
All right, hack number 14 is to constantly be refreshing your CLAUDE.md file. Once there's a new discovery about your project, update the CLAUDE.md. Once you've made some new skills, update the CLAUDE.md. You want Claude to be logging new patterns, new gotchas, and any new conventions that it discovered during your session. So the next time you start it up, it already knows all of this. This will help prevent repeat mistakes and it will make Claude smarter about you, your business, your project, all that kind of stuff over time. But here's the catch: you don't want to let it bloat, because the CLAUDE.md file is basically the system prompt and it gets loaded into every single conversation. And everything in there is going to eat up your context window.
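The cost argument in hack number 13 is simple arithmetic: bulk reading happens on the cheap model, and only a short summary reaches the expensive one. The sketch below works through one example. The per-million-token prices are hypothetical round numbers picked for illustration, not real Haiku or Opus pricing, and the token counts are made up.

```python
def input_cost(tokens: int, price_per_million: float) -> float:
    """Cost in dollars of reading `tokens` input tokens at a per-million price."""
    return tokens * price_per_million / 1_000_000


# Hypothetical prices in dollars per million input tokens (NOT real pricing).
CHEAP_PRICE = 1.0
EXPENSIVE_PRICE = 15.0

ARTICLE_TOKENS = 300_000  # bulk material the sub-agent has to read
SUMMARY_TOKENS = 2_000    # short summary handed back to the main agent

# Option A: the expensive main model reads everything itself.
direct = input_cost(ARTICLE_TOKENS, EXPENSIVE_PRICE)

# Option B: a cheap sub-agent reads everything, the main model reads the summary.
delegated = input_cost(ARTICLE_TOKENS, CHEAP_PRICE) + input_cost(SUMMARY_TOKENS, EXPENSIVE_PRICE)

print(f"direct: ${direct:.2f}, delegated: ${delegated:.2f}")
```

Under these assumed numbers the delegated route costs a small fraction of the direct one, and the main agent's context window only grows by the size of the summary.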
So I try to keep mine simple and only put the most important information in there. I like to keep it between 150 and 200 lines max. If it starts getting longer than that, then it's time to trim some things down, which leads perfectly into the next hack.
Number 15 is to have CLAUDE.md route to other files. Because it potentially eats so many tokens, you want to keep it lean, but you do have a lot of information to put in there. What's cool is you can route it to different places. So you can have it link out to separate files for stuff like style guides or business context or reference docs. Just point to those files in the CLAUDE.md so Claude knows exactly where to look, and then you're also not wasting tokens on information that it doesn't always need. Because in its system prompt, it doesn't need to know the exact status of a certain project, but it does need to know exactly where to go look to find that information.
Hack number 16 is to exit early and re-ask. So if you notice that Claude starts going down the wrong path, don't just wait for it to finish. Hit escape, correct course, and then re-prompt. Every token that it spends going in the wrong direction is just wasted context. So steer tight and steer early. At the end of the day, it is AI.
Hack number 17 is to challenge outputs aggressively. So if Claude gives you something that's just okay, push back. Say, "Scrap that. Do a more elegant version." Or, "This isn't good enough. Try again with a completely different approach." Claude will often give you a dramatically better output on the second try when you set a higher bar, and now it knows what not to do. The key is, once it comes back with something better, tell it to update itself, whether that be the skill or the CLAUDE.md, so it doesn't make that sort of mistake again.
Hack number 18 is to use /rewind for quick undos.
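The 150-to-200-line budget for CLAUDE.md from hack number 14 is easy to check mechanically. Here is a minimal sketch; the function name and the default limit of 200 are illustrative choices based on the range mentioned above, not a Claude Code feature.

```python
from pathlib import Path


def check_line_budget(text: str, max_lines: int = 200) -> tuple[int, bool]:
    """Return (line_count, within_budget) for the given file contents."""
    line_count = len(text.splitlines())
    return line_count, line_count <= max_lines


# Only runs the check if a CLAUDE.md exists in the current directory.
path = Path("CLAUDE.md")
if path.exists():
    count, ok = check_line_budget(path.read_text(encoding="utf-8"))
    print(f"CLAUDE.md has {count} lines:", "within budget" if ok else "time to trim")
```

A check like this could run before a commit, so the file gets trimmed or split into linked reference files before it starts eating context in every conversation.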
So if you make a wrong turn, just try using /rewind and Claude will roll back to a previous point in the conversation without you having to start over. So it's super quick, super clean.
Hack number 19 is using hooks for notifications. So if you type /hooks, you can set up a notification hook, or you can just have Claude Code do this for you in completely natural language. So for example, what I like to do is when Claude finishes up a session or finishes a chat, it sends me an actual noise notification. Because now I can work on something else on my computer, or I can literally have 15 different sessions of Claude Code running, and if I hear that noise, I know that one of them is done and needs some more input from me.
Hack number 20 is using screenshots. Just remember that Claude can actually see, and this is a huge unlock, which means you can feed it error messages, or you can feed it, you know, inspiration websites. You can also do a really cool self-check loop, where you can say things like, "Take a screenshot of the website and tell me if the layout looks right." And it will literally screenshot it, analyze it visually, and tell you what's off. So if you remember one of the hacks from earlier where I said to have it check itself, when I'm building websites, I basically have it design the website, screenshot, and then implement new changes, and then do that again. And so it does like three passes of building and screenshots before it even gives me V1. And in that flow, the V1 that it gives me is so much better than the V1 that it used to give me.
Hack number 21 is to use Chrome DevTools. So Claude can open a browser, it can interact with an app, it can check the functionality of things, and so it's kind of like the screenshot loop, but instead of being for websites and design, it's for the actual functionality of apps and buttons. This is huge for front-end work, so definitely give it a try.
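For the notification hook in hack number 19, the end result is a settings fragment that runs a shell command when a session finishes. The sketch below builds such a fragment as JSON. The `hooks` / `Stop` / `type: command` layout is an assumption about the settings schema and should be checked against the current Claude Code hooks documentation, and the `afplay` command is a macOS-only example.

```python
import json


def stop_hook_settings(notify_command: str) -> dict:
    """Return a settings fragment that runs notify_command when a session stops.

    The "hooks" / "Stop" / "type": "command" layout is an assumption about
    the settings schema; verify it against the current Claude Code docs.
    """
    return {
        "hooks": {
            "Stop": [
                {"hooks": [{"type": "command", "command": notify_command}]}
            ]
        }
    }


# macOS example command; swap in whatever plays a sound on your system.
settings = stop_hook_settings("afplay /System/Library/Sounds/Glass.aiff")
print(json.dumps(settings, indent=2))
```

Asking Claude Code to set this up in natural language, as described above, avoids hand-editing the file at all; the sketch just shows roughly what ends up being configured.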
But this also means that it can do things like filling out forms and potentially reCAPTCHAs and stuff. And this is also huge because if there's not an explicit API somewhere, it can go in and manually do things. I think that it could also solve CAPTCHAs, but it's probably better if you're already signed in somewhere and all it has to do is navigate, click buttons, and fill things out.
Hack number 22 is to clone inspiration sites. So, you can take screenshots of sites that you really like, feed them to Claude, and say, "Make it look like this." Claude will recreate the design patterns without making it look like generic AI slop. And this is huge for front-end quality because you could also use the site as inspiration by taking some of the actual HTML styling and feeding that into Claude, too. So, yes, Claude could essentially clone a website, but what you want to do is take that as kind of a template and give it your own touch.
Okay, so now we're going to move on to some more advanced stuff. These hacks are for people who really want to push Claude Code to its limits, so let's go.
All right, hack number 23 is to run parallel sessions with Git worktrees. Normally, when you're working on a project, you've got it in one folder with all your files in it. So, if you want to run two different sessions in the same folder at the same time, they might overwrite each other's work. And that's where worktrees come in. So, think of a worktree as basically making a parallel copy of your project, except it's way more efficient than actually copying the folder. You just type in claude --worktree and then the feature name. So, Claude will then create an isolated workspace on its own branch. You can then open up another terminal and type in the same thing with a different feature name, and it will open up a different branch. So, now you can be working on the same project at the same time without having those coding agents step on each other.
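The same kind of isolation can also be set up by hand with plain Git. The sketch below only builds the command lists rather than running them, so nothing touches a real repository. The helper name and the sibling-directory layout are illustrative assumptions, and this is not a description of what Claude Code runs internally.

```python
def worktree_plan(feature: str, parent_dir: str = "..") -> dict[str, list[str]]:
    """Return the Git commands for one isolated feature workspace.

    Assumes the worktree lives in a sibling directory named after the feature.
    """
    path = f"{parent_dir}/{feature}"
    return {
        # New branch plus a separate working directory checked out on it.
        "create": ["git", "worktree", "add", "-b", feature, path],
        # Run from the main checkout once the feature work is done.
        "merge": ["git", "merge", feature],
        # Remove the extra working directory after merging.
        "cleanup": ["git", "worktree", "remove", path],
    }


for step, command in worktree_plan("feature-login").items():
    print(step, "->", " ".join(command))
```

Each feature gets its own directory and branch, so two sessions never edit the same files on disk, while both still share one repository history.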
You can have three, four, or five of these things going at once. And when you're done, you can have them just merge the branches back together just like you would with any other Git branch. So, all the work can save back to the main project, once again, without overwriting each other's files.
All right, hack number 24 is to use API endpoints instead of MCP servers. It depends on the situation, but here's what I mean by that. MCP servers are great because you can look at all the different tools and execute any of them. But they load their entire tool definitions into the context window. So, if you're tight on tokens, sometimes it's better to just use direct API endpoints instead. So, for example, let's say you're using Notion and you only actually need to be able to read one database. It makes no sense to show Claude how to do all of the other functions if for this specific project you only have to read one database. So, instead, just hardcode in that endpoint and now you're saving tons and tons of tokens.
All right, hack number 25 is to use /loop for recurring tasks. So, you can type, "Hey, every 5 minutes check in on the deployment." And Claude will rerun that prompt in that same session every 5 minutes unless you close out of that session. You can set it to monitor a PR, check error logs, poll a build, whatever. It runs in the background and then only interrupts you when something actually needs your attention. You can even set one-time reminders in natural language, like, "Remind me at 3:00 p.m. to check in with the team on X." Now, the only caveat here is that these loops will only last for 3 days. So, if you need a scheduled automation that's a little bit longer term, then you're going to want to use the desktop scheduled tasks. Although, the only difference here is that every time one of those tasks spins up, it's an individual session, so it doesn't have that context memory.
Number 26 is to host on a VPS for always-on sessions.
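To make the Notion example from hack number 24 concrete, here is a minimal sketch of a single hardcoded request that reads one database through Notion's database query endpoint. The database ID and token are placeholders, and the pinned `Notion-Version` value is an assumption that should be checked against Notion's current API documentation. The request is only built here, not sent.

```python
import json
import urllib.request


def build_notion_query(database_id: str, token: str, page_size: int = 10) -> urllib.request.Request:
    """Build (but do not send) a POST request that queries one Notion database."""
    url = f"https://api.notion.com/v1/databases/{database_id}/query"
    body = json.dumps({"page_size": page_size}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Notion-Version": "2022-06-28",  # assumed API version; check the docs
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
request = build_notion_query("YOUR_DATABASE_ID", "YOUR_NOTION_TOKEN")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

One small function like this costs a few hundred tokens of context at most, compared with loading every tool definition a full MCP server exposes.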
So, if you run Claude Code on a remote server, it'll stay running even when your laptop is closed, which means you can SSH in whenever you need to interact, or you can talk to it through Telegram anytime. This is perfect for long-running tasks where you don't want to babysit a local terminal.
Hack number 27 is that you can use remote control from your phone. So, this is a pretty new feature, but Claude Code now lets you control local sessions from your phone or any browser. You start a task at your desk and then you walk away and you can keep steering it from your phone. Your code never actually leaves your local machine; it's just that the remote connection is on your phone. So, you can start something heavy, go grab a coffee, go on a walk, and you can keep building from your pocket.
Hack number 28 is no-SQL data analytics. So, you can connect CLI tools like BigQuery's bq tool to Claude Code. And then you can just ask questions in plain English like, "What were our top 10 revenue sources last quarter?" And Claude will instantly translate that into the right query, run it, and then give you that answer. No SQL required. And this should work for any CLI-based tool.
Number 29 is ultrathink. When you need Claude Code to really think through a hard problem like architecture decisions, complex debugging, big refactors, or maybe it's just not giving you the right output after a couple of prompts, try using ultrathink. You literally just type the word and it will go all colorful, and this means it allocates the maximum thinking budget of around 32,000 tokens before Claude actually responds. So, don't use this for every simple fix, but absolutely use it if you're making decisions that might affect the entire system. Or, like I said, if after the first couple of tries it's not giving you what you want.
Hack 30 is to edit permissions for safe autonomy.
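For the plain-English analytics in hack number 28, what ends up running is an ordinary `bq query` invocation. The sketch below shows roughly what the "top 10 revenue sources last quarter" question could translate into. The table name, column names, and date range are hypothetical, and the command is only built and printed, not executed.

```python
import shlex


def bq_query_command(sql: str) -> list[str]:
    """Build a bq CLI invocation that runs a standard-SQL query and prints JSON."""
    return ["bq", "query", "--use_legacy_sql=false", "--format=prettyjson", sql]


# Hypothetical table, columns, and quarter boundaries, for illustration only.
sql = (
    "SELECT source, SUM(revenue) AS total_revenue "
    "FROM `my_project.sales.orders` "
    "WHERE order_date BETWEEN '2026-01-01' AND '2026-03-31' "
    "GROUP BY source ORDER BY total_revenue DESC LIMIT 10"
)
command = bq_query_command(sql)
print(shlex.join(command))
```

The value of the hack is that Claude writes and runs this kind of command for you; reading the generated SQL is still a cheap way to confirm it answered the question you asked.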
A lot of people, including myself, have shown on videos using --dangerously-skip-permissions to make sure that Claude can just run without asking for approval on every single step. And yeah, it's much faster, but it is called "dangerously skip permissions" for a reason. So, the smarter way to go about it is to go into your permissions and explicitly allow the commands that you know are safe. And then, explicitly deny anything that's destructive, like deletes or removes. And now you can actually get to the point where you have the same speed and autonomy as --dangerously-skip-permissions without it being very dangerous. And anything in the deny list is going to take priority over anything in the allow list.
Hack number 31 is to use agent teams. So, remember how we talked about sub-agents, being able to run agents in parallel that have fresh context but can't talk to each other? Agent teams are like that, but all of the agents can talk to each other. So, it gets really, really cool. They share a task list, they can communicate with each other, and they can even assign each other work. And you can actually talk to each of those individual agents instead of having to go through the main one and have the main one communicate with the sub-agents. These are a little bit more expensive and they do run longer, but they will give you a much more cohesive output for a big project.
Hack number 32 is Context7 MCP. This one's a game-changer. You can install the Context7 MCP server and then whenever you need information on current documentation, just prompt it to use that MCP server. The problem that it solves is that Claude's training data has a cutoff, which means sometimes it might suggest functions or APIs that have been renamed or deprecated or just don't even exist anymore.
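The rule from hack number 30 that the deny list beats the allow list can be modeled in a few lines. This sketch uses shell-style glob patterns from Python's `fnmatch` purely for illustration. It is not Claude Code's actual rule syntax or matcher, and the example patterns are made up.

```python
from fnmatch import fnmatch


def decide(command: str, allow: list[str], deny: list[str]) -> str:
    """Return "deny", "allow", or "ask" for a command, checking deny rules first."""
    if any(fnmatch(command, pattern) for pattern in deny):
        return "deny"  # deny always wins, even if an allow rule also matches
    if any(fnmatch(command, pattern) for pattern in allow):
        return "allow"
    return "ask"  # anything unlisted still needs manual approval


# Made-up example rules: broad allows, destructive commands denied.
allow_rules = ["npm run *", "git status", "rm *"]
deny_rules = ["rm *", "git push --force*"]

print(decide("npm run test", allow_rules, deny_rules))
print(decide("rm -rf build", allow_rules, deny_rules))
print(decide("curl example.com", allow_rules, deny_rules))
```

Checking deny first means a careless broad allow rule, like the `rm *` entry here, cannot accidentally unlock a destructive command.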
So, Context7 fixes that because it has up-to-date, version-specific technical documentation and live code examples from thousands of popular libraries that you probably need with a coding assistant, like Next.js, React, MongoDB, you name it. So, it's able to pull and read all the current documentation and then inject it into the conversation before Claude actually starts writing any code. So, it's basically one command to install, and from there, all of your coding agents are working with much more up-to-date information, and it's a huge quality improvement.
All right, so I know that we covered a ton of information in this video. So, what I did is I threw all of this into a PDF resource guide so that you can just come back and reference it whenever you want. That's available completely for free inside of my free Skool community. The link for that is down in the description. But that's going to do it for this one. So, if you guys enjoyed it or you learned something new, please give it a like. It definitely helps me out a ton. And as always, I appreciate you guys making it to the end of the video. I will see you all in the next one. Thanks, everyone.