Nvidia's New Tool Just Fixed Agent Skills
Researchers found that more than a quarter of the skills they studied had a security vulnerability, and Skill Specter can scan and rate how dangerous a skill is before installation.
Skill Specter scans AI agent skills for vulnerabilities, but a clever bypass and a new discovery workflow redefine how you find, test, and install safe skills.
Summary
AI Labs presents Nvidia's Skill Specter, a tool designed to audit AI agent skills before installation. Researchers studied over 30,000 skills and found that more than a quarter had a security vulnerability. The video demonstrates Skill Specter’s two modes: a fast pattern-matching scan and an AI-assisted scan that reduces false positives and catches skills whose code does not match their description. Viewers learn how attackers hide instructions, impersonate trusted tools with look-alike characters, misrepresent what a skill does, exfiltrate credentials, deploy malware, or poison dependencies. The AI check is turned on by dropping the no-LLM flag and needs an OpenAI key; the host's workaround is to run the check through Claude Code’s headless mode instead. The host explains how to install and run Skill Specter from the GitHub repo, including the test folder with dangerous skills to validate the scanner. The channel then builds a broader “discover skills” workflow on top of the scanner, integrating skills.sh and a feedback loop that fixes issues and re-scans. The segment closes by pointing to AI Labs Pro, where the workflow will be published, and by showing a new make design.md skill being vetted through the same discover-and-scan cycle.
Key Takeaways
- Over 30,000 AI agent skills were studied, and more than 25% contained security vulnerabilities.
- Skill Specter flags danger levels with a numeric score and exact file locations to identify the fault, helping users decide what not to install.
- Turning on the AI check requires an OpenAI key by default, but the hosts demonstrate a workaround by using Claude Code’s headless mode to perform the check.
- Threat vectors include hidden instructions, impersonation with look-alike characters, misrepresented capabilities, credential theft, malware execution, and poisoned dependencies.
- The broader workflow integrates skills.sh, discovery processes, and automated fixes to continuously discover, test, and sanitize skills before installation.
- The design.md skill example demonstrates extracting design tokens from an app and using discovery to pull and vet potential supporting skills.
Who Is This For?
Essential viewing for AI engineers and security-minded developers who work with autonomous agents, especially those integrating Nvidia’s Skill Specter into production workflows. It’s also valuable for teams using Claude Code, skills.sh, or similar tooling who want safer, auditable skill pipelines.
Notable Quotes
"Right now, AI agent skills are everywhere. Every agent runs them and you trust them without any checks."
—Sets up the security problem Skill Specter aims to solve.
"The higher the score, the more dangerous the skill."
—Explains how Skill Specter ranks vulnerabilities.
"You just drop this no LLM flag, and it does the AI check on a skill."
—Describes enabling the AI-assisted scan.
"We didn't just scan skills, we built a whole workflow that changes how you find and install them for good."
—Highlights the transition from scanning to a full protective process.
Questions This Video Answers
- How does Skill Specter detect hidden instructions in AI agent skills?
- What are look-alike impersonation attacks in AI skills and how are they caught?
- Can the AI check for vulnerabilities in skills without OpenAI keys, using Claude Code instead?
- What is the discovery skills workflow and how does it improve safety before installation?
Nvidia Skill Specter, AI agent security, impersonation attacks, hidden instructions, poisoned dependencies, credential theft, malware in skills, Claude Code headless mode, skills.sh, discover skills workflow
Full Transcript
Right now, AI agent skills are everywhere. Every agent runs them and you trust them without any checks. But here's the scary part. Researchers studied over 30,000 of these skills and more than a quarter of them had a security vulnerability. So Nvidia built a tool called Skill Specter that scans any skill before you install it and tells you exactly how dangerous it is. But here's where it gets interesting. One type of attack can slip right past it and the setting that actually catches it is off by default. So most people never even know it's there. Turning that on normally costs money, but we found a way around it.
And by the end, we didn't just scan skills, we built a whole workflow that changes how you find and install them for good. Now, before we get into the full workflow, let's give you a quick tour of the tool and what you need to use it. So these are the install commands in the GitHub repo. You can just copy them and hand them to Claude Code and it'll basically install and set up the whole thing for you. Claude Code's going to install all the dependencies you can see right here. And once all that's done, you can start using Skill Specter.
Inside the GitHub repo, there's this test folder and inside that they've got some dangerous skills you can actually run it on to confirm the tool works. So we ran it on these skills and with every one of them, it tells you not to install. The higher the score, the more dangerous the skill. And with each test, it doesn't just give you a number. It shows you the exact line number, the exact location, and the file name where the conflict is, which is basically what pushed the score up. Now, this isn't the only way to use the tool.
It's got another mode. But before you get why we even need that second mode, you need to know two things. How a skill even attacks you and how this tool actually catches that attack. Now, there are 14 categories, but to keep it simple, we've grouped them into six similar ones. So the first way a skill can attack you is with hidden instructions. See, a skill is just a text file full of instructions and your agent reads the whole thing and treats it as orders. The problem is a bad skill can hide extra instructions in there that you'll never see, but the agent does.
They tuck them inside comments or they use invisible characters or they scramble the text into a code that looks like nonsense to you, but the AI reads it just fine. So, the scanner is built specifically to hunt these hidden instructions down and find them. The second way is impersonation. So, your agent has tools it trusts and reaches for by name. Say there's one just called read that reads a file for it. So, a malicious skill gives its own tool that exact same name and your agent grabs the bad one thinking it's the safe one it already knows.
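The three hiding places mentioned here (comments, invisible characters, and encoded text) can be illustrated with a minimal Python sketch. This is not Skill Specter's actual rule set: the function name `find_hidden_content`, both regular expressions, and the 24-character threshold are assumptions made for illustration.

```python
import base64
import re
import unicodedata

# Illustrative patterns only; these are assumptions, not Skill Specter's rules.
HTML_COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def find_hidden_content(text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, kind, detail) for each suspicious span in a skill file."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Unicode category "Cf" covers zero-width and other invisible format characters.
        for ch in line:
            if unicodedata.category(ch) == "Cf":
                findings.append((lineno, "invisible-char", f"U+{ord(ch):04X}"))
        # Long base64-looking runs that decode to readable text may be scrambled instructions.
        for match in BASE64_RUN.finditer(line):
            try:
                decoded = base64.b64decode(match.group(), validate=True).decode("utf-8")
            except ValueError:  # covers bad padding and non-UTF-8 bytes
                continue
            if decoded.isprintable():
                findings.append((lineno, "encoded-text", decoded))
    # Comments are invisible once Markdown is rendered, but the agent still reads them.
    for match in HTML_COMMENT.finditer(text):
        lineno = text.count("\n", 0, match.start()) + 1
        findings.append((lineno, "comment", match.group(1).strip()))
    return findings
```

Each finding carries a line number, in the same spirit as the exact locations the tool is described as reporting. A pattern pass like this has no context, so it will also flag harmless comments.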
And the way they pull it off is sneaky. They swap one letter for a look-alike from another alphabet. So, they name it read, but the A is actually a Russian letter that looks identical to ours. To you and to your agent at a glance, it's the same word, but underneath it's a completely different tool. And the scanner catches this by checking the real identity of every single character. So, it spots that one fake letter and flags it. The third way is when the skill just lies about what it does. The description says one thing, the code does another.
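Checking "the real identity of every single character" can be sketched in Python with the standard `unicodedata` module. This is an illustration of the idea, not the scanner's implementation; the helper names `script_of` and `mixed_script_letters` are hypothetical, and taking the script from the first word of the Unicode character name is a rough heuristic.

```python
import unicodedata


def script_of(ch: str) -> str:
    """Rough script label: the first word of the Unicode name, e.g. LATIN or CYRILLIC."""
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"


def mixed_script_letters(identifier: str) -> list[tuple[int, str, str]]:
    """Return (index, char, script) for letters whose script differs from the majority."""
    letters = [(i, ch, script_of(ch)) for i, ch in enumerate(identifier) if ch.isalpha()]
    if not letters:
        return []
    scripts = [script for _, _, script in letters]
    majority = max(set(scripts), key=scripts.count)
    return [(i, ch, script) for i, ch, script in letters if script != majority]
```

For a tool named `re\u0430d`, where the third character is the Cyrillic letter U+0430, the function reports that single position as CYRILLIC, while a plain Latin `read` produces no findings.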
So, it calls itself a simple formatter and then quietly reaches out to the internet in the background. Or it says it only needs permission to read your files, but the code is actually writing files and running commands, too. And this one's way harder to catch. This is where that second mode comes in, but we'll get to that later. The fourth way is the skill steals your credentials. This could be your API keys, your passwords. So, a skill goes through all the keys saved on your machine, scoops them up, and sends them off to some server.
The fifth way is the skill just runs straight-up malware. This includes things like a reverse shell, which basically hands a stranger remote control of your whole computer. And because this kind of malware has known fingerprints, the scanner just matches the code against a big library of those fingerprints. And the sixth way is poisoned dependencies. So, a skill will often use a CLI tool. Basically, a small outside program it runs in the terminal to handle part of its job. And a bad skill grabs a piece that's actually malicious. Maybe it's a fake package with a name that's one typo off a real popular one.
So, you pull the wrong one and it runs malware like the last type. So, the scanner checks every package the skill pulls in against a live database of known bad ones, and it flags the fake names and those download and run commands to keep your system safe. So, in that first mode, it's just matching patterns without any context, which means it ends up flagging stuff that's completely fine. And those are what we call false positives. So, that's where the second mode comes in, the AI scan. And turning it on is simple. You just drop this no LLM flag, and it does the second scan here.
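The fake-name part of the dependency check described in this passage can be sketched as a similarity comparison against known package names. The allowlist `POPULAR_PACKAGES`, the function `likely_typosquats`, and the 0.85 threshold are all assumptions for illustration; the transcript says the real tool consults a live database, which this sketch does not do.

```python
from difflib import SequenceMatcher

# Hypothetical allowlist; a real scanner would query a maintained package database.
POPULAR_PACKAGES = {"requests", "numpy", "pandas", "express", "lodash"}


def likely_typosquats(dependencies, known=POPULAR_PACKAGES, threshold=0.85):
    """Return (dependency, lookalike) pairs for names close to, but not equal to, a known one."""
    suspects = []
    for dep in dependencies:
        name = dep.lower()
        if name in known:
            continue  # exact match: this is the real package
        for real in sorted(known):
            # ratio() is 1.0 for identical strings and falls as the names diverge.
            if SequenceMatcher(None, name, real).ratio() >= threshold:
                suspects.append((dep, real))
                break
    return suspects
```

With the dependency list `["requests", "reqeusts", "left-pad"]`, only `reqeusts` is paired with `requests`. A fixed threshold is a blunt instrument: short names and legitimately similar packages can produce the same kind of false positives the pattern-matching mode is described as having.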
But if you look inside the code, you'll find out that to run an AI check on a skill, you need to plug in an OpenAI key. So, to get around that cost, we just use Claude Code itself to run that AI check. Now, the main agent in Claude Code doesn't actually do it itself. We use Claude's headless mode, which is basically Claude Code running in the background with no chat window, just executing commands on its own. And we're sure most of you know it isn't free, but you do get monthly credits for it with your Anthropic plans.
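A rough Python sketch of handing a skill to Claude Code in headless mode follows. It assumes the `claude` CLI is installed and on the PATH, and uses its `-p` (print) option to run non-interactively. The prompt wording and the function names `build_review_command` and `run_review` are hypothetical, and this is not the change the video makes inside Skill Specter's code, which is not shown.

```python
import subprocess

# Hypothetical review prompt; the wording is an assumption, not taken from the tool.
REVIEW_PROMPT = (
    "You are reviewing an AI agent skill for security problems. "
    "Compare what the description claims with what the instructions and code do, "
    "and list any mismatch, hidden instruction, or data exfiltration.\n\n"
    "SKILL CONTENT:\n"
)


def build_review_command(skill_text: str) -> list[str]:
    """Build the argument list for a non-interactive Claude Code run."""
    # "claude -p <prompt>" prints the response and exits instead of opening a chat session.
    return ["claude", "-p", REVIEW_PROMPT + skill_text]


def run_review(skill_text: str, timeout: int = 300) -> str:
    """Run the review and return Claude Code's text output (requires the claude CLI)."""
    result = subprocess.run(
        build_review_command(skill_text),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise if the CLI exits with an error
    )
    return result.stdout
```

Passing the command as a list avoids shell quoting problems when the skill text itself contains quotes or shell metacharacters, which matters when the input is untrusted.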
And you can just ask Claude Code to make the change we just talked about, and it'll do it for you. Of course, you might hit a bug or two, but it's just a single line prompt Claude can set up for you. And if you're enjoying the video so far, subscribe to the channel and hit the like button. This small gesture of support goes a long way for us. So, they've also got dangerous skills in their test folder that actually need the AI check. When you run the no LLM check on one of them, the score comes out as zero, which means it's perfectly safe.
But the second you run it with the AI check, the score jumps to 100. It tells you not to install, and it lays out exactly why. But what if instead of just detecting the problems in a skill, the scanner also helped you fix them? So, that's exactly why we turned the scanner into a skill. And you might be wondering, why is it called discover skills? Well, because we didn't just make one separate skill. We made a whole process that helps us discover more skills and make sure they're safe before we install them. So, we've been using skills.sh to find new skills for a while now.
It's basically a Git repo built specifically for skills, so one big shared library you can pull from. And we think they recently shipped a CLI update, so now Claude can just run search queries straight through the command line and pull the best skills it needs before installing anything. And we wanted our scanner running on top of that. So, in here we've got scan.sh, which is the script that actually runs Skill Specter. Since Skill Specter is a CLI tool, it has to be run as a command. So, we made a whole script and we baked the Claude headless mode fix right into it.
So, by default it runs the normal check, but if you want it'll run the AI check, too. And if you open up skill.md, you can see the basic steps laid out. It identifies the target, then scans it, then it shows you the findings. And once it knows what the problems are, it goes ahead and fixes them, then runs the whole loop again after to make sure everything's clean. So, for example, this folder we're showing you right now is our AI Labs design folder. It's basically our whole design process compressed into one folder with a bunch of skills inside.
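The identify, scan, report, fix, and re-scan steps described for skill.md can be sketched as a small loop. The actual skill is a set of instructions for the agent plus scan.sh, not Python; the names `scan_fix_loop`, `scan`, `fix`, and `max_rounds` here are hypothetical, and the scanner and fixer are passed in as plain callables so the sketch stays self-contained.

```python
from typing import Callable

# A finding is a plain dict in this sketch,
# e.g. {"file": "SKILL.md", "line": 12, "issue": "hidden instruction"}.
Finding = dict


def scan_fix_loop(
    target: str,
    scan: Callable[[str], list[Finding]],
    fix: Callable[[str, list[Finding]], None],
    max_rounds: int = 3,
) -> tuple[bool, list[Finding]]:
    """Scan the target, fix what was found, and re-scan until clean or out of rounds."""
    findings = scan(target)
    for _ in range(max_rounds):
        if not findings:
            return True, []
        fix(target, findings)
        findings = scan(target)  # re-run the scan to confirm the fix worked
    # Out of rounds: clean only if the final re-scan found nothing.
    return not findings, findings
```

Capping the number of rounds is a design choice in this sketch, so that a fixer which never converges cannot loop forever; whatever remains is returned for a human to review.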
We've got a whole video on this, and on top of that, the whole system's available in AI Labs Pro, which is our community. So, if you want to support the channel and grab this whole design system, go check it out. And this discovery skill is going to be uploaded in there, too. The link's going to be in the description. But, we're building on top of this here. So, we're adding a new make design.md skill, which lays out the fastest way to pull design tokens out of an app you've already built, basically the colors, fonts, and spacing rules, and merge them into a design.md file.
So, here we wanted to create the design.md file. So, we told it that we wanted to improve it and that it should go search for other tools out there. So, it used skills.sh, then we loaded the discovery skill, and that pulled back a handful of skills. These are the skills it brought back, and the first two looked interesting, so we wanted to dig in. We asked it to install and test both of them. And just like the discover skills workflow says, it won't install any skill without scanning it first. So, it installed them and read through them and told us straight up that neither one was going to help with the make design.md skill.
But, from a safety point of view, the first one got a score of 10, which meant it was safe, and the second got a 100, which meant don't install it. So, we told it to run the AI check on that second skill. It ran it again through Claude's headless mode, and this time the score came back as zero. This means that the skill was safe to use. And that's the whole point of this system. You're not just grabbing skills blindly off the internet. You have a whole process that you can kick off just by using a skill.
Now, let's have a word from our sponsor, Nimblelist. If you use Claude Code or Codex, you know the problem. You've got multiple sessions running, files changing everywhere, and you're constantly switching between terminal, browser, and editor just to keep track of what your agents are doing. Nimblelist is an open-source visual workspace that puts everything in one place. I had three agents working on different parts of a project at the same time, and instead of jumping across windows, I could see all of them on a Kanban board, jump into any session, review code changes as red and green diffs, and approve or reject them individually.
I was editing Markdown docs, UI mockups, and architecture diagrams visually right alongside my agent. When I was done, I didn't have to clean up commits manually because it generated Git commit messages automatically based on what changed. Tasks stayed connected to the actual sessions, and there's even a mobile app to continue the session while you're away from your desk. Nimblelist is completely free and open source, and you can check it out by using the link in the pinned comment. That brings us to the end of this video. If you'd like to support the channel and help us keep making videos like this, you can do so by using the Super Thanks button below.
As always, thank you for watching, and I'll see you in the next one.