Open Source Friday - Welcome to Maintainer Month 2026

GitHub | 01:31:49 | May 2, 2026
Chapters: 7
Introduces May as maintainer month, highlights the calendar of streams and special events, and invites community participation.

Maintainer Month 2026 kicks off with Nick Tindall sharing how AutoGPT scales AI-assisted contribution while keeping project quality and ethics front and center.

Summary

GitHub’s Andrea Griffith hosts a deep-dive into Maintainer Month 2026, highlighting how maintainers can harness AI agents without losing control of project direction. Nick Tindall, founding AI engineer at AutoGPT, shares the origin story of AutoGPT, its monorepo, and how a flood of agent-driven PRs is managed at scale. The conversation covers practical techniques: using agents.md and skill-based prompts to guide AI contributions, enforcing PR templates, and integrating tests and CI signals (Codecov/Sentry) to keep quality high. Grading PRs becomes a system-wide effort, with automated checks like test plans and blocking criteria that require human sign-off when needed. The panel digs into licensing, CLA requirements, and the realities of agent-authored code, including how to detect machine-written text and triage effectively. A notable focus is on how to balance automation with human review to prevent burnout, while enabling first-time contributors and new maintainers to participate meaningfully. The discussion also teases practical tools, such as Autopilot, Reddit blocks, and cloud hosting, demonstrating how automation can save time on release notes, emails, and outreach. Throughout, Andrea emphasizes community-building via maintainer channels, private peering for smaller projects, and the importance of experimentation with new tools. The hour-plus session ends with a tease for next week’s intimate Maintainer Month gathering and ongoing opportunities to shape GitHub’s open source ecosystem.

Key Takeaways

  • AI-assisted PR triage can handle 150 open PRs by filtering for new contributors and ready-to-merge changes, dramatically reducing manual workload.
  • “We require them to put a test plan, and this is kind of crazy.” (Nick Tindall, testing PRs and how test plans steer AI behavior)
  • CLA signing is a practical gate: agents struggle with signing, so a CLA bot plus human interaction ensures compliance and accountability.
  • Centralized agent instructions (agents.md vs. skills) help agents discover and follow repo-specific rules, improving merge quality across back-end and front-end tasks.

Who Is This For?

Essential viewing for maintainers ready to embrace AI-powered contribution workflows, and for developers seeking practical, scalable patterns to keep PR quality high while welcoming first-time contributors.

Notable Quotes

"AutoGPT wasn’t actually trying to solve a problem at all. We started way before most of the other AI agents that you've heard of."
Nick explains the origin of AutoGPT as an exploratory loop, not a problem-specific product.
"We’re going to close your PR automatically if you don’t follow the PR template."
Illustrates strict but effective governance for agent-written contributions.
"We require them to put a test plan. And by writing the test plan, it realizes it needs to load the skill that will test the PR."
Shows how test plans drive automated testing and tool loading.
"CLA signing is a practical gate: agents are really bad at signing CLAs."
Highlights human-in-the-loop requirement for legal/identity checks.
"Code that’s unmerged is code unwritten."
Mantra on the importance of finalizing contributions.

Questions This Video Answers

  • How can you set up AI-assisted PR review without sacrificing quality in large repos?
  • What are best practices for integrating test plans into AI-generated PRs?
  • Why would a maintainer require CLA/Code of Conduct acknowledgments for AI contributors?
GitHub, Maintainer Month 2026, AutoGPT, AI agents, PR triage, software licensing, CLA, test plans, CI/CD, agents.md, block system
Full Transcript
[music] Hello my friends. Good morning, good afternoon, good evening, wherever in the world you are. Welcome to the month of May. I am not going to lie to you. It's a little bit kind of startling to think that basically we're at the midpoint of the year, pretty much. Um, today is May 1st. Several things happened this week. I'm sure all of you have been on this sort of roller coaster of emotions that was the month of April. I was ready for that month to be over, to be honest. But I thank you because you're here supporting open source. And this is Open Source Friday. We have a special edition today. I am Andrea, Andrea Griffith, a Colombian dev, in all places. And today we get to talk about my favorite month of the year. I think it is. It has to be. It's Maintainer Month. And what Maintainer Month is is a month where the good folks of open source programs at GitHub take the time to create an amazing invite-only experience for maintainers, as well as spaces for all of you to learn more about your favorite open source projects and your favorite maintainers. And so we're going to have a ton, a ton of streams this month. I'm going to grab the link for our Maintainer Month website so that you can see the full calendar. Uh, if you bear with me, I will bring you the actual calendar. So you can see there is a lot, a lot of streams this month. We have events besides the events that are happening regularly for, um, Open Source Fridays, but we also have special edition events like the one today, where these come from two places: either conversations that happened with maintainers, from just being part of the maintainer ecosystem and just being awesome and always giving folks at GitHub feedback. And, uh, my guest today is actually no stranger and is a friend of GitHub and has been just very generous, uh, with his time in sharing what he knows. So it's pretty cool because we've actually been able to invite him and ask him to sort of give us a rundown. He's actually doing this, but in a more intimate context for maintainers, later on this month. But I want you to see the website. So this is the Maintainer Month website. And when you come here you can click and see the entire schedule. Also, you have an option to subscribe to this calendar. So if you don't want to miss any of the events, which, like I said, there are events aplenty, right? Today we're having two Open Source Fridays; at one, my colleague Kedasha Kerr is going to have the folks from Remotion. How cool is Remotion? Plus all of our Rubber Duck events, all of our Open Source Fridays, which are always focused on open source, but now we're going to have extra special events. So if you yourself, uh, have an event that you think should be part of this celebration, this is an open source project and you can contribute to it. So if you are scrolling through, we did a lot of aesthetic improvements to the site. And when I say we, I mean you good people with an excellent design taste actually helped contribute to make the website better. So if you're looking for a project to contribute to, this is a good one. And if you're looking to submit your own project, uh, to appear in this fantastic schedule and calendar, please submit a PR. So you have the website now. I will share it a little bit later as well. But thank you for being here. I'm going to welcome you all. Thank you so much, Jumoku. Appreciate you for being here. Good morning from Ethiopia. Amazing. Uh, good morning, Fran. Am I an AI? I am not.
I wish, cuz then I will feel nothing. [laughter] Then it wouldn't hurt my feelings when everything is on fire and we're all suffering. So it's not. Um, welcome, welcome. And then Ravi, this question about migrating source code from GitLab to GitHub with a mirroring approach at the organization level. Oh my goodness. Actually my colleague just published a guide that's very comprehensive for Actions. But let me find you one before this is over. I'm going to put the bat signal out and get a colleague to go find one. So welcome everyone, welcome, welcome. So let's get into it because I think we have a lot to cover and I want to make sure that our guest gets the entire time that he deserves. So my guest today has been building AI agents before, I think, before AI agents were like fancy. And his title is actually a really good representation of that. Nicholas Tindall is the founding AI engineer at AutoGPT. And if you haven't checked it out, it is one of the most starred repos on GitHub. I think it has over 180,000 stars. I might be wrong with my stat there, but correct me if I'm wrong. Um, they were the first public example of an LLM-powered AI agent before, like, the buzzword caught on. And today we're going to talk of course about AutoGPT. Nicholas has been super generous and he's going to actually give us a demo. But we're also going to talk about what happens when AI agents start contributing to your repo, because we live in a completely different world now where contributions are both human and agentic and we have to optimize for that. And AutoGPT is doing a fantastic job at that. So let me, uh, start by bringing Nicholas onto the stage. Welcome, Nicholas. Nicholas, you had a rough night and you're here. I'm so grateful. Yeah, I had a very busy night. Didn't get a lot of sleep, but I'm glad to be here now. Um, glad to be on the pod. Glad to chat with y'all. Thank you, thank you. And by the way, um, you are joining us from Texas. You work remote. You have been with AutoGPT a couple of... well, a couple of years now. A couple of years, yeah. And you're actually going to give a version of this in our more intimate gathering for maintainers, which is coming up next week. Um, but I am so grateful that you were open to coming by Open Source Friday, because we talk a lot about projects and we share a lot about projects. And I think a big part of it is also being able to share best practices and what projects are actually actively doing. So it's like open source for open source in open source, very meta. So happy you're here. Thank you, thank you. Before we get started on the topic, I do want to know a little bit more about AutoGPT. Like, for those of us who maybe are newer to the project, can you tell us a little bit about what problem AutoGPT was trying to solve and, like, how did it become a thing? And then I'm going to share the repository so that we can get everyone in Open Source Friday to go there and get that star count to 200k today. Yeah, absolutely. So it's kind of an interesting thing. AutoGPT wasn't actually trying to solve a problem at all. We started way before most of the other AI agents that you've heard of, um, even existed. And our goal was just to see what happens when you put, uh, an LLM in a loop. Uh, it started out as experiments. There's a bunch of different types of experiments we were running. Uh, and the first version, we call it classic now.
Uh, we've done a little bit of work to revive it a little bit, um, and are starting to look at that. But classic was like an experiment into what happens when you put an AI agent into a loop. Uh, try and give it a plan and goals. Uh, and it kind of is the driving factor for what a lot of these agents like Claude Code or, uh, Copilot CLI became today. This is before the concept of an AI agent even existed, right? Like, we talked a lot about legal things, and that's where a lot of the terms like agent came out of. Uh, it's because of some of the legal discussions we had around, like, what kind of warnings do we have to put on this repo? Yes. It was crazy. Crazy times, very different. Um, oh, how the times have changed. I've shared the link to the repository, so I hope everyone can go by and take a look at it. Um, but so AutoGPT is a repository that's, like, loved by contributors, right? You have a massive funnel of PRs. I mean, I don't know if you can share some of those numbers. I know people can look at the repo and kind of get a... Um, right now we have about 150 open. That is wild. Which is a significant amount of work to keep down. Just from internal and external contributors, we use a lot of agents of different varieties to do our work. Everything from GitHub's Copilot to Claude Code to Codex to our internal tools, right? Uh, our Autopilot. We use a lot of agents, and, like, manual coding still too, of course. And so we get a ton of PRs. Um, and we have to manage them, and we do, and I'm going to tell you all a little bit how we do it today. Awesome. Awesome. Awesome. Uh, right now, like, I click on it and there are, yeah, there are 144, and obviously, like, the speed at which you're getting these contributions is not slowing down. If anything, it's accelerating, right? Oh, yeah. It's unbelievable. Before we get started on how you are taking the approach of reviewing these PRs and stuff, I think a lot of us are starting to kind of, like, build this muscle to know how a PR was submitted, if it was an agent or a human who wrote it. I don't know, you've been there since the beginning, since, like, before everyone had OpenClaw submitting all their PRs, or Claude Code was a thing, or the GitHub CLI was incredible. So, like, I don't know how you've been handling that yourself personally in the maintainership. Uh, and then we can talk about, like, the more technical things that you've done to make it better for everyone. Yeah, so for most of our team it's actually pretty easy to tell when the code is written by a human and when it was written by a bot. Um, we have a lot of things to make the bot-written PRs good, but you can tell, because they don't make spelling errors, for example. They don't, um, deviate from the templates that we give them to the same degree, right? There's lots of things that a person will do that is taking liberties that an AI agent won't, because of some of the rules we set up. Um, we've also, uh, gotten really, really good at identifying LLM-written text, uh, just as a team, because we interact with them so much across so many different agents, uh, and so many different models. It's just, you can read it and go, uh, a human probably didn't write this. It doesn't sound like a developer; you know, most developers are not the best writers, or aren't writers. So, oh my god, that is a telltale sign. Yeah. Yes. Yes.
Like, nobody is writing these dissertations every time they submit a PR, and some of these, yeah, some of these agentic generated things are definitely like that. So, a lot of maintainers and a lot of projects have struggled with what to do, because they don't want to silo, like, they don't want to close the door. They don't want to say... well, realistically, I don't think we can put the genie back in the bottle, friends. Like, this is it. Okay? Short of, like, some kind of major event that just, I don't know, takes the infrastructure out of the loop and we're unable to actively do it, I don't think we can go back to the times where we were not using AI assistants and working with LLMs. So, you and your project took a different approach to dealing with this type of PRs. Instead of just closing the PRs, right? You started doing things a little bit differently to actually improve the quality of those contributions. Can you talk me through that? Yeah, absolutely. So, the first thing we had to do was identify what the problems were. Mhm. We were noticing low quality contributions, but to us low quality was a very hard thing to define, right? And it will be to most projects. Um, because there's so many different ways a PR can be low quality. Uh, for example, it could not have tests. It could have tests that are bad. It could not do the thing you want, or it could do the thing you want perfectly but be unreadable code, or it could have 3,000 comments, right? Like, there's a ton of different ways it can be wrong. So, we started off by identifying the things that really bothered us. Things like not enough tests, right? We ship a lot of code, you need to have tests, right? Uh, tests that were wrong, right? That was something that was really bothering one of our front end engineers. He's spent a lot of time fixing up some tooling to make sure the tests we write are good. Uh, and then we also identified things that were not telling us why people were making changes. That was really bothering us. We were getting a lot of PRs that just had some code change without any real understanding of why. Mhm. And we had a bunch of other smaller problems that we fixed along the way, uh, that I'll get into, but those were, like, the really big things. We needed to identify what was going wrong. Cuz if you don't know what's wrong, you can't fix it, right? You got to have an engineering mindset for all this stuff. Once we knew what was wrong, it was pretty easy to start fixing it, right? Uh, we started by identifying all the different tools people used, right? We looked at it and said, okay, a lot of people are using Claude Code, a lot of people are using Cursor, a lot of people are using Copilot, and a lot of people are using OpenClaw, right? So, we [clears throat] tried to look into those tools and see what different ways those tools can take instructions, right? Almost all these tools take markdown files at various levels of the repo and read them, right? And they use that for instructions. Um, initially we thought our docs were sufficient, um, but it turned out the tools aren't going to go read your docs unless you tell them to. So, no amount of updating our contributor guidelines or, um, updating the docs pages was having any effect, right? We even have a whole wiki on our repo dedicated to how to work with our repo. But none of the AI agents were following that. Um, so we found, uh, we started with Claude. We made a bunch of Claude.md files. Okay. Right?
Cuz that was the main contributor of spam: people using Claude. Um, and we knew that because every Claude PR has code contributed by Claude. They all sound the same. Right, exactly. Um, so we added a bunch of Claude.md files and started using the tool ourselves, right? And got it to where, whenever I asked it to add a feature, I was happy with the output it would make, just by changing the markdown files, right? And putting them at various levels, learning that Claude will auto-read any markdown file in the folders it enters, right? We could control [clears throat] a lot of the behavior. So, if it entered our back end folder, we told it how to start the back end. We told it, if you're going to change the back end, you need to write tests and get to 80% coverage. Here's where you check coverage, or your PR won't be accepted. Don't even open it, right? There's a bunch of different things we did like that. Small micro changes over time, where we got to the point where I as a developer could tell Claude, here's the feature I want to make, and Claude would just start writing it and discover on its own, here's the changes I need to make to make this compatible. Mhm. Uh, AutoGPT uses, like, a block system, right? To add new features. For example, we may have blocks for posting to Reddit or Twitter, right? And that was one of the biggest things we were getting basically spam PRs for, our sloppy PRs, I'll say. PRs that were unmergeable, because they weren't following the guides. Well, for that guide we already had a really, really good documentation set up. And with Claude markdown files, you can just reference that doc file and it'll go read it as if it's an @-reference, right? So, that was the first step we did, right? Was build a little box to teach it how to interact with our code. Uh, that worked okay, but only for Claude, right? Other tools like Copilot, which became much more popular, and Codex and OpenClaw started contributing. Um, and we found that those tools ignored the Claude files, because they're not Claude. So, we centralized to an agents.md and pointed all our Claude files to read that, right? And all this is iterative over time. Uh, you can go see on our repo all these random files littered around that have seemingly obscure instructions on how to work with them. But they're all instructions that were built off us using them and testing it until we got the output we liked, right? So, we added that, and that was a really, really big improvement. Now we're getting agents to actually follow our rules around adding tests, um, making sure that they're implementing things correctly and avoiding problems. But we started realizing they don't know how to use the basic tools. Mhm. Stuff like Vercel, right? And Next, and how do I do this pattern correctly? And our front end engineer, Luis, uh, was starting to get really annoyed that all these PRs were coming in. So, what he did was he wrote a guide, and we added it to the repo as a skill. And every agent will go auto-discover this skill when it starts working on our repo, whether it wants to or not. All of these harnesses, uh, that's the term for, like, Claude Code, Copilot, Autopilot, ours, will all just discover this, right? That's, like, their default behavior. They'll go see this skill.
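To make the layered setup Nick describes concrete, here is a minimal sketch of what such instruction files might look like. The paths, commands, and most rules are illustrative assumptions, not AutoGPT's actual files; only the 80% coverage rule and the Storybook requirement come directly from the conversation.

```markdown
<!-- agents.md at the repo root: illustrative sketch, not AutoGPT's actual file -->
# Working in this repo

- Before touching anything under `backend/`, read `backend/agents.md`.
- Before touching anything under `frontend/`, read `frontend/agents.md`.

<!-- backend/agents.md -->
# Backend rules

- Start the backend locally before testing your change.
- Any backend change must ship with tests and reach 80% coverage on the
  changed code. Check coverage before opening a PR; if you are below the
  bar, do not open the PR.

<!-- frontend/agents.md -->
# Frontend rules

- Follow the existing Next.js patterns; do not introduce new libraries.
- Components in the dynamically validated folders need a Storybook story.
```

Claude Code auto-discovers per-directory files as it enters folders; harnesses that only read agents.md need the root file to point at the nested ones, which is the centralization step described above.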
And we set up triggers in these skills, as part of the skill's name or part of the skill's description, which says, "Trigger this when you need to add front-end tests, or trigger this when you need to write back-end tests." With even more descriptions of what to do and what not to do. So, for example, we wanted Storybook for certain types of components, right? Because we want to be able to validate it dynamically, without having to run the entire front-end. So, as part of the front-end thing, it says, "Write a Storybook test if your component is in one of these folders." And now Luis is a lot happier, because when he goes to run a PR from somebody, whether it's from me when I'm using Claude Code or Autopilot, or somebody outside, it's much higher quality. And you can put these files anywhere, and you should litter them extensively, but not make each individual one too long. Uh, each of these tools behaves a little differently. Mhm. So, for Claude, for example, you can basically put as many of these things down as you want. Just put one per directory forever if you want, right? Go in every directory level down, it'll discover the ones that matter and not import the ones that don't. Ones that use agents.md, you have to tell it about the sub ones, right? So, "Hey, here's this file, but also these ones exist." Right? That applies to tools like Codex, [clears throat] which don't do as much native discovery. And over time we started seeing these things get their own contributions, right? As if they're like a separate part of our monorepo. Give me a second, I got to take a drink. Yeah, of course. Interesting, interesting. So, it's like a system of sort of, uh, I look at it like modularity, like a progressive kind of disclosure, right? Where you're referencing files and sub files from different places. Um, I guess that structure is important to help the agents actually find what they need to find. Uh, so that's a very interesting finding, that you figure out what you should do in the agents.md versus what makes sense more as a skill. Yeah, that's a really big deal, more than you'd expect, cuz an agents.md file only happens in a certain directory. Mhm. But, for example, if you're writing back-end tests and you think about doing front-end stuff, a skill may load dynamically, right? So, if it's like, "Oh, we changed this back-end behavior, we need to make a front-end test for it," it's not going to know what directory to go look in for an agents.md file. Exactly. A skill can tell it that, right? And then it'll go check the right one without wasting a bunch of extra time. Right? A lot of maintainers, I think, are treating AI tools as, how do I say this? Slop generators, right? But, they don't have to be. It's basically somebody else paying for your compute if you set these things up right, you know? If they want to pay for Claude Code and contribute to our repo, they're more than welcome to. And if I set my skills up in my repo correctly for this, it'll work really well, right? So, we've done a lot of work, and now with most contributions we get, we can just say, "Okay, this probably works." Right? And we've built tools to validate that, that we'll probably get to in a little bit. That's amazing. I love that. I love that. And this is why, like, we really wanted to have this conversation with you, cuz I think a lot of us are like, the immediate instinct is to say the door is shut, and I'm not going to tax my team with reviewing all of these slop contributions.
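A hedged sketch of what a skill like the front-end testing guide might look like, using the SKILL.md-with-frontmatter convention these harnesses discover (the description doubles as the trigger). The name, wording, and steps are assumptions for illustration, not the actual skill from the repo.

```markdown
---
name: frontend-tests
description: >
  Trigger this skill when you need to add or change front-end tests,
  including Storybook stories for UI components.
---

# How to write front-end tests for this repo

1. Match the patterns of the existing tests; do not invent new helpers.
2. If the component lives in one of the dynamically validated folders,
   also write a Storybook story so the change can be checked without
   running the entire front end.
3. Run the tests locally and do not open the PR until they pass.
```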
But, the way you just put it is brilliant, because it's like, "You want to use your tokens and your compute to come and improve this project? Go for it, but I'm going to make it so you're contributing in a way that makes sense for us." Um, fantastic. So, tell me about some of the tools that you created. So, like, well, creating the skills and the agents, and I think that's constantly been growing and improving, right? Um, and then what were some of the other, um, measures that you took to make sure that the way that the PRs were coming in, for example, made sense? Yeah, so we made PR templates and used them extensively. Um, and used them quite, uh, forcefully, to be blunt. Um, we tell the agents if they don't follow the PR template, we're closing your PR. And the agents take that quite seriously, surprisingly. Um, when they're being tasked with doing something, they care quite a lot. Caring is maybe not the right word. We're not going to get into the intention of AI. Yeah, that's a whole other stream, friend. Yeah, I'd love to be back for that one. Uh, just a little round table chat for that. But, yes, we've, um, set it up so that the agents think if the PR does not follow this template exactly, we're going to close it automatically with zero hesitation. Very, like, forcefully and angrily. Right? So, that means that agents follow the template, um, which is good and bad, right? It's good that we know that they will follow the template, right? And we're certain of it. Um, it's bad in that people have stopped following the template. Oh, that's interesting. We made the template really complicated, right? And the agents will still follow it. Um, and you can see that a lot, and you can tell which agents are, like, capable by which ones get our template correct. Right? So, we don't actually close all these automatically. We have all the tooling to do so. Um, but we decided that was not as required as just telling the agent that we will do it, it turns out. And you can do things like that, right? You can experiment and do changes and say, "Oh, I actually don't need to do that." But, yeah, we basically tell it, if you don't follow, we're just going to close it. And it will follow it, but people don't follow it, because it's a really complicated template. But, as a maintainer, I don't really mind that, because if you don't follow the template, I know you're probably a person and I'm going to be kinder, you know? Fair. Fair, fair. There is something that you put, like, I don't know, is it a trick? The test plan trick for your template. Can you say more about that? Yeah, so we, um, we require them to put a test plan. And this is kind of crazy. And in combination with some of the skills we have, we have skills; one's called, uh, test PR. And as part of our test plan, it casually mentions testing the PR, which, whenever it reads the template, it'll go, "Oh, I need to write a test plan." And by writing the test plan, it realizes it needs to load the skill that will test the PR. And [clears throat] to test the PR, what it does is it opens up your browser using agent browser, which is a tool, uh, that it'll go install, with your permission, of course, right? We're not trying to install things on people's computers without their permission. Uh, it'll go and say, "Hey, I want to install agent browser to test this PR." And then it'll open up your browser and it'll launch the tooling and go, "This PR doesn't work. We're not opening it, because we failed the test plan."
Right? Cuz for us to look at the PR, it requires you to check all the boxes in the test plan. And an agent will go, "Okay, cool, I got to make a test plan." Make the test plan. And then, as it's making it, start testing it to say it did. Uh, and we've noticed that's decreased a lot of the bad PRs, right? We don't really get PRs that don't work anymore, so much as we get PRs that are maybe not aligned with what we want for our product, which is a lot harder to train. Uh, we've done some work on that, but I think that's something you kind of got to accept as open source, is just, this person wants it to go a slightly different way than we do. And that's okay. Right? But, having the test plan makes it run it. Yes. Yes, yes, yes. Yeah. Yeah, it reduces a lot of the sloppiness, uh, where it's just like, this doesn't work, right? Because, well, if it doesn't work, it's not going to open the PR, or it's going to keep iterating till it does, or it runs out of credits, which is not my problem. Hey, listen. And this is, I think, the purview of every single maintainer. Like, it's your project, you're maintaining it actively, it's your decision. So, uh, if you want to come and participate, you got to make it so it makes sense for the project. Um, and it shouldn't be personal, people. People get so upset. I hate it. And then we forget that there are actual real people. Like, there is an actual real Nicholas who is doing this. They're not doing it to ruin your day. They're doing it for, you know, the integrity of the project, and so it makes sense for the entire user base, not just one person. I can't remember who said this a long time ago. Something about there are not enough forks of projects. I can't remember who did it, but it makes so much sense, and I've done, like, a lot of the PRs that I submit, if I'm submitting to an open source project and I know it's something that's kind of a niche, I always write, "Listen, if this doesn't make sense, this can happily live in my fork forever and ever, and I will continue to use it, and I will find other ways to contribute to the project. But, just because it's what I want doesn't mean it's what's best for the project as a whole." So, don't be afraid to make your forks, people. Like, that's the beauty of open source. One of the tools we use internally is called Branchlet. Uh, it's made by a really cool open source guy. But, I started off using it, and then I made, like, 15 PRs to this guy's thing by hand. Wow. And I felt really bad. I was like, "Oh, here's a small feature I want to add. Oh, here's a small feature I want to add. Here's a small feature I want to add." And finally I messaged on the issues and I was like, "Okay, here's all these issues, here's all these things I added. I can make one big PR, or I can just have this live in a fork forever." Um, but I gave him a list of all the features I added. And he just went and used his agent to code it himself. And for me, that's no harm, no foul, right? Like, exactly. He can do it how he wants to do it. I just want the feature to work. And it's okay to have forks, right? It's encouraged. It's what makes code good and what makes GitHub so alive. Speaking of GitHub, of course it plays a part in these things that are happening. And boy, do we have a lot of work to do, and the team has been working really hard. I know definitely you are, you're part of our maintainer group, so you know these conversations are happening.
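GitHub reads pull request templates from .github/pull_request_template.md, so a template with the test-plan trick might look roughly like this sketch (the exact sections are assumptions, not AutoGPT's real template):

```markdown
<!-- .github/pull_request_template.md: illustrative sketch -->
## Why are you making this change?

<!-- PRs that do not explain the "why" will be closed automatically. -->

## What changed?

## Test plan

<!-- Check every box. PRs with an incomplete test plan are closed
     automatically, with zero hesitation. -->
- [ ] I wrote a test plan for this change and executed it against the PR
- [ ] New or changed behavior is covered by tests
- [ ] Coverage meets the required threshold for the directories I touched
```

The test-plan checkboxes are the hook: an agent that wants to truthfully check them realizes it has to load the skill that actually tests the PR.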
You're, like, under NDA, so you see things that people don't get to see. Um, you see the good, the bad, the ugly, right? Mhm. You know, we've been working a lot on features that limit interactions for new contributors, because that has been a great source of pain for maintainers too, right? A lot of people are, like, just delighted that they now have the power to contribute to projects that maybe wasn't their stack, or maybe it wasn't something that was within their expertise, period, but with the power of LLMs, now everyone can learn and contribute. So you track this stuff very closely. Uh, can you share a little bit about how you're thinking about these limitations? Just from your point of view and the maintainership of these projects. Yes, absolutely. So we track it very closely. I'm going to give you some, uh, quick tips for other maintainers on how to track this. Please. Use this website, maintainers.github.com. You got to go there. You got to sign up. You got to join the community if you're a maintainer of any project that's pretty big, you know. Um, big is, like, relative. Right. Uh, we have over 100,000 stars, so you don't have to be quite that big, you know. But maintainers.github.com is so, so useful, right? It gets you all the connections you want at GitHub for things. Um, and there's a project that exists around maintainers, which is called the tiny wins project. Uh, which basically, every single Friday, they come out [clears throat], or GitHub comes out, with new small wins for devs, right? Um, I've had a couple that I requested that I've started getting through. Um, but one of the ones that they're working on is how to help limit the number of interactions from new users. Mhm. Right? And I'm not going to tell too much about this, cuz NDAs, NDAs, you know. Don't know what's covered. Uh, but they're always so receptive. And if you're having issues with this, reach out, right? Like, if you're a project of a size that is getting these kinds of pull requests, there's a private community for help with peers, right? Like, that's where I learned about all this stuff, right? And where I share it. Um, for Maintainer Month, that's how I know what's happening. Um, but there are tons of things that they've added. Uh, stuff around security, stuff around what shows up. Disabling issues, for example, is now a thing you can do. Right? There's a thousand different things you can do. And they're adding more by the day. And I cannot stress enough how cool some of these things are. I can't talk about them. But [laughter] they're really cool. Yes, there is hope, friends. So definitely please join the maintainer community. Uh, you just got to log in and submit, and then you're part of this network of people, uh, that are not only participating actively in the way that the platform is shaped; it's for you. Like, honestly, this team cares so much about maintainers, and we want open source projects to stay on GitHub, and we want these projects to be able to use the platform the best possible way. So if you want your voice heard, and you're a maintainer of a project, come join this community. Besides, like, the amazing camaraderie, like, you have such a great bunch of people there, so it's really a great place to be. So join. Maintainer Month is a good month to join. Go by. I dropped the link in the comments and I'll put it on the show notes later as well. So thank you for that. Thank you for sharing that as well.
Um, let's talk a little bit about the way that you're triaging and filtering PRs. Because you use Autopilot to do this. So what does that look like in practice? Like, how does this work? In practice, it's very different depending on what we're trying to do. Right? So Autopilot is AutoGPT's built-in chat. Chat is, like, a big understatement of Autopilot. Um, Autopilot is like a virtual business partner, almost, right? It's not necessarily a code assistant, but it's more of, like, a thought partner that can handle real complex workflows. And we use it for stuff like, go pull all the PRs, right? All 150 of them. And go check which ones matter. Which one should I integrate next? And merge next, or which ones are ready, which ones aren't. And build me filters and stuff like that, right? That I can use dynamically. And it integrates with MCP and stuff like that via Linear, or it can just integrate with GitHub via our block system, right? It's pretty flexible. Uh, and it lets me go and see, okay, cool, I want to merge PRs from, like, new contributors, right? That's a big thing I do a lot of. Is to say, hey, I want to go look at new contributor PRs, uh, cuz I always like to help those ones merge faster, right? Cuz I like the idea of helping people have their first contribution on GitHub be good. Right? And waiting three months for a contribution is not a good feeling. No. Um, so we try really, really hard to keep the contribution time down if you're a first contributor. And if you have a PR that's open on our repo, there's probably a reason why we haven't replied, and I'm very sorry. Uh, so we use it to filter that, right? And I can just say, go get me all the PRs from first-time contributors, and it'll come back with a list, and then I can say prioritize it, and it will. That's amazing. That's a great use of the tool, using the tool to build the tool. I want to take a moment to share, because you actually brought us, like, a really fun offer from Autopilot, and I want to make sure we share it, cuz as people are listening to the things that you're doing, I'm sure they want to also try it. Can you tell us what this is? Like, what do I get? Yeah. Absolutely. So we are launching our cloud-hosted platform soon, right? Right now you can run it yourself on your computer. Um, but setting up Autopilot is, like, very complicated, because it's a tool that integrates with so many other things. There's a thousand APIs and keys you need to get, for things from, like, Google to GitHub to Linear to Twitter, right? A lot of these keys are also locked behind very, very complicated-to-get paywalls or, um, complicated systems. Like, to get a Gmail key, for example, uh, you have to go get, like, certifications to be able to log into Gmail. Right? Or you need to do it through pretty insecure means. Um, so the platform we host is going to be launching soon. Uh, we wanted to give the people who listen to this podcast a discounted launch price. And that's what this link is going to. It's the form to fill out so that when we launch, we can email you and you'll get that discounted price for the first month, and you can try it and see if you like it. Um, and of course you can always run Autopilot on your own computer. Local, always. Right? That's always the goal. With your own API keys. Our goal is to democratize AI for everyone, right?
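The "get me all the PRs from first-time contributors" filter Nick mentions can be approximated outside Autopilot with plain GitHub REST calls. A minimal Python sketch, assuming a GITHUB_TOKEN environment variable and relying on the author_association field GitHub attaches to every PR; this is not Autopilot, just the idea behind the filter:

```python
"""List open PRs from first-time contributors (illustrative sketch)."""
import os
import requests

REPO = "Significant-Gravitas/AutoGPT"  # assumption: the AutoGPT monorepo
API = f"https://api.github.com/repos/{REPO}/pulls"
HEADERS = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
           "Accept": "application/vnd.github+json"}

first_timers = []
page = 1
while True:
    # REST: GET /repos/{owner}/{repo}/pulls, 100 PRs per page
    resp = requests.get(API, headers=HEADERS,
                        params={"state": "open", "per_page": 100, "page": page})
    resp.raise_for_status()
    prs = resp.json()
    if not prs:
        break
    for pr in prs:
        # GitHub marks brand-new contributors on each PR object.
        if pr["author_association"] in ("FIRST_TIME_CONTRIBUTOR", "FIRST_TIMER"):
            first_timers.append((pr["number"], pr["title"]))
    page += 1

# Surface these first so a maintainer can prioritize them for review.
for number, title in sorted(first_timers):
    print(f"#{number}: {title}")
```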
Whether that is you as somebody on this podcast, me as a developer, or our parents who don't know a lot about AI but just have things they need done. Right? So our goal is to make it easy for everyone. Our cloud platform is more associated with making sure that anybody can use it, right? Cuz most developers can set up our tools locally. But it's kind of complicated to use Docker, Docker Compose. And time-consuming, honestly. Like, right now, who has time for that? Right. Exactly. Get it done. Yeah. Right. Well, that's very generous. Thank you to Autopilot for bringing that offer, uh, to us here. So if you're watching this, save that link, gh.io/autopilot, and drop your email address so they let you know once the cloud offering is up and you get a fun discount. Thank you for that. I wanted to make sure we mention that, because that is important. Thank you for having me. The tools, for sure. Yeah. Absolutely. Let's talk a little bit about the work that you're doing also with your contributors too. So, because you have a system where people are doing the same things, like creating blocks, uh, you created these really impressive guides to help people know exactly how to do it and what not to do. Can you say a little bit more about, um, that? Yeah, so this is actually one of the things that I thought was going to be so helpful initially. Mhm. But didn't really turn out to be super useful, and we spent a ton of time on these docs, until we added agent files to help people discover them. Even if people are coding by hand, they use agents to, like, discover how to explore the repo now. Mhm. Uh, so Autopilot works off a block-based system, right? So if you want to add support for Linear, which is a project management tool, you would make a block for it, right? That is a pretty complicated thing to do. It makes you code and, like, figure all that out. So what we've done is we set up a bunch of guides on how to do that. And those exist on agpt.co/docs. And I think that one is /platform/new_blocks. Don't quote me on that. Um, but it's a guide that we also taught the LLM how to use. I like that. We do that for almost all of our docs. A lot of them are embedded into the agent itself. So it can improve itself. So now, for example, and I'll give you an example I did, like, yesterday: um, we have our subreddit, AutoGPT. Um, it's not necessarily, like, a mainstream communication platform for us, but I noticed we're getting a lot of spam on it. So I told Autopilot, our agent, to go and make blocks for that. And it opened a PR against itself to improve it. To say, "Hey, okay, cool. Now we can post to Reddit, but we got spam on one of our release notes." Using these docs, we have made it to where the agent can improve itself pretty easily by just telling it the docs exist. And that's, I think, one of the coolest things about being open, is being able to say, "Hey, here's this thing I don't like." Uh, and I think OpenClaw does this really well, too. Is say, "Hey, I don't like this thing. Improve yourself and open a PR." Right? That's such an important facet of how we interact with these agents, and such a, like, differentiating factor of agents that are public, is like, "Hey, here are all these things I don't like about it. Fix it." Right? And the agents are competent enough now to do it themselves. And so we can have them do it. Which is really interesting. It took a lot of, uh, a lot of work to make docs like this. Um, but it works really well. That's great.
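To make the block idea concrete, here is a purely hypothetical sketch of what a block-style integration could look like. This is not AutoGPT's actual Block API; every name and the schema shape are invented for illustration, so consult the new_blocks guide for the real interface.

```python
"""Hypothetical sketch of a block-style integration; names are invented."""
from dataclasses import dataclass


@dataclass
class RedditPostInput:
    subreddit: str
    title: str
    body: str


@dataclass
class RedditPostOutput:
    post_url: str


class RedditPostBlock:
    """One block = one capability the agent platform can compose.

    Declared input/output schemas are what let a platform validate a
    block's wiring and chain it with other blocks.
    """
    input_schema = RedditPostInput
    output_schema = RedditPostOutput

    def run(self, data: RedditPostInput) -> RedditPostOutput:
        # A real implementation would call Reddit's API with stored
        # credentials; stubbed here because this is illustrative only.
        raise NotImplementedError("illustrative sketch, not a real block")
```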
So, you're adding a sort of customized-for-agents way of writing documentation, to where they can actually, like, improve on their own. Yeah. And it's written for people, but read by agents. Does that make sense? Yes. Yeah, which is a very different thing from, like, an agents.md file. Right? Those are written for agents, like, intended not to be really read by people. Whereas there are some things that are intended to be read by people primarily. So they have a lot more hand-holding along the way. Um, but your agent can still use them, right? There's no reason they can't. I love that. And we're going to take a look later as well, and maybe we can share the links as you're showing us, cuz I think people definitely want to see, like, what blocks look like, uh, in action. They're, like, thinking how they can apply this to their own practices. Um, let's talk a little bit about test coverage, because that's also a big part of the requirements that you have for contributions to your project. Uh, you have now skills in the repo for writing tests, right? Can you tell me a little bit more about skills for tests, like, that approach? It just feels like you are kind of training contributors to the standards of the project, which is brilliant. Uh, could you tell us a little bit about why, how? Yeah, absolutely. We use Codecov to check our test coverage. Uh-huh. And Codecov's awesome. They're Sentry now. Uh, and Sentry has really, really generously helped our repo with a lot of different stuff over time. And so has Codecov. But one of the things that Codecov does is it can break down your test coverage by directory or by project or by flag or whatever. Mhm. So, we have done a lot of work into saying, "I want more end-to-end tests to cover these things." Or, "I want more back-end tests that are just straight-up unit tests, right?" And each one of these things has a required percentage for your change. And a lot of people don't put the effort in to get the best results out of this. But what we have is, these are blocking changes, like, blocking requirements. So, whenever somebody wants to merge their PR and they're like, "Hey, why isn't this merged yet?" Or their agent checks on it after it opens it, after a couple minutes, it'll see, "Oh, I don't have enough tests. I cannot merge this." Uh-huh. Right? And there's just, like, another little trick along the way to make the AI realize, "I need to write tests," right? And so it's manually tested at this point, by the time it opened the PR. And now it's going to add test coverage, um, and when it goes to add the test coverage, it'll load the skills again. And it'll say, "Oh, cool. You need to add adversarial tests as well." Right? Because it starts trying to figure out how to get through all these different things and see, "Okay, I got to check all these little boxes." Um, so having, like, an extensive CI is very important for testing, you know, and making sure these agents behave. That's, uh, a really clever use of it. Like, that dynamically happening in the CI, and then the agent itself figures out what's missing, why it's not passing, then creates the test if it's a test that's missing. Uh, I appreciate that. Folks, take notes. We're going to talk about gotchas, too. You're going to give us, like, a list of things that you definitely lived through, like, the pain. Um, another topic that's really hot right now, especially because of agentic workflows happening in a lot of repositories, is licensing.
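Codecov's per-area, blocking gates are configured in codecov.yml. A hedged sketch of the kind of setup described; the flag name, path, and auto target are assumptions, while the 80% patch target is the rule from the conversation:

```yaml
# codecov.yml: illustrative sketch of blocking, per-area coverage gates
coverage:
  status:
    project:
      backend:
        flags: [backend]
        target: auto          # do not let overall backend coverage drop
    patch:
      backend:
        flags: [backend]
        target: 80%           # "80% on your change" rule from the talk
        informational: false  # blocking: a failing status blocks merge

flags:
  backend:
    paths:
      - backend/              # assumed path; adjust to your layout
```

Because the status check fails until the target is met, an agent that polls its own PR sees a red check, goes looking for why, and gets routed back into the testing skills.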
And that's something that you've also thought about a lot. And you require a CLA. It's an MIT project, but you require a CLA. Um, and if I understand it correctly, like, this is not a legal thing, but it's more a way for you to sort of signal. Like, I guess a little bit of both. But could you tell us why, and, yeah, how you came about that decision? So, our project's actually dual-licensed, depending on the folder. Okay. So, it's a little more complicated, but even for MIT projects, I think you should do this. Right? Uh, agents are really, really bad at signing CLAs. Like, really bad. So, we use a CLA license bot. Uh, they require you to click a button and type your name. Super simple. Okay. Um, and for ours, that covers our butt legally, but it also makes sure that they follow our code of conduct. What this does, though, and we have this as another check in our CI, is say a person interacted with this thing at some point. Because your agent's not going to know how to sign the CLA correctly. Right? It requires you to open an OAuth with GitHub, all kinds of stuff that require a browser. And for that, you have to have interacted manually with it. So, if your CLA is not signed after a week, we just close the PR. Right? You haven't interacted with it personally. And we'll close with a comment that says, "Sign CLA. Reopen when you're done." Smiley face. But that tells us you didn't interact with this PR. You as a human, you know, you let your agent go do it. And that's okay, but I think all projects should start doing this. This is one of the biggest gates that helps us identify the level of buy-in and whether or not it's worth fixing. Yes. And the CLA can be for anything, right? Like, it doesn't have to be a CLA. It could be a "you agree to the code of conduct." Right? It doesn't matter what your project makes it. Just make it a checkbox that requires you to have OAuth on a different page than GitHub. And that's, like, a huge thing. Agents are just really bad at that. And they're not going to get better, because people don't want agents to just run their browser logged into their GitHub account. Like, it's a bad idea. Yes, indeed. And most tools refuse to do it, right? Um, because they know that signing into random things with your GitHub account is a terrible idea. Um, but for large projects like ours, it helps us ensure that people are people. Right? And the bot accounts we can handle manually: for example, we have a tool called Auto, which is our internal name for Autopilot, um, that we have open PRs, and we can just bypass the CLA for those and treat them like the owners of the repo. But as somebody who's contributing, it's really, really helpful to know somebody's a person. And you don't want this to be a roadblock for anything, really, past the first time. Cuz when somebody's shown that they're a good contributor, we'll generally read their PRs. Mhm. They generally do it every other time, we've noticed. But having that little checkbox just being, like, "I'm a person. Actually, pinky promise," but not labeled that way, has been really helpful. That's very clever. I think I need to go back and look at my projects and do that, for, like, even the MIT ones. That makes complete sense. Um, there's obviously tooling that you're using day-to-day, like, for your PR triage or release notes, things like that. Autopilot, or Auto, like you call it internally.
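The "close unsigned-CLA PRs after a week" gate could be scripted against GitHub's REST API roughly like this Python sketch. The REST routes are real; the "cla: signed" label the CLA bot is assumed to apply is a placeholder for whatever signal your bot actually exposes:

```python
"""Close week-old PRs whose CLA is still unsigned (illustrative sketch)."""
import os
from datetime import datetime, timedelta, timezone
import requests

REPO = "owner/repo"  # placeholder
API = f"https://api.github.com/repos/{REPO}"
HEADERS = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
           "Accept": "application/vnd.github+json"}
WEEK = timedelta(days=7)

prs = requests.get(f"{API}/pulls", headers=HEADERS,
                   params={"state": "open", "per_page": 100}).json()
for pr in prs:
    opened = datetime.fromisoformat(pr["created_at"].replace("Z", "+00:00"))
    labels = {label["name"] for label in pr["labels"]}
    # Assumption: the CLA bot applies a "cla: signed" label once the human
    # has clicked through the OAuth flow and typed their name.
    if "cla: signed" in labels or datetime.now(timezone.utc) - opened < WEEK:
        continue
    n = pr["number"]
    requests.post(f"{API}/issues/{n}/comments", headers=HEADERS,
                  json={"body": "Sign the CLA and reopen when you're done. :)"})
    requests.patch(f"{API}/pulls/{n}", headers=HEADERS,
                   json={"state": "closed"})
```

The point of the gate is exactly what Nick says: the OAuth click cannot be done by an agent, so a signed CLA is evidence a human touched the PR at least once.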
Uh, a lot of those things, I guess, most developers don't really think about, like, tool territory type things. But there are other things that go into maintainership of projects that might not be, like, developer specific. So, like, what are some of the non-developer stuff that you've done to help, uh, manage the level of contributions that you get? Yeah, I can go ahead and try and show you all one. Yeah, let's see it. Yeah, let's see it. Going to go ahead and present, um, share my screen. So, one of the things we had to do recently is we made a release. Uh, we do it every Wednesday. Um, so I wanted to post about it to my Twitter and our subreddit. So, I said, "Hey, go pull the latest release." Right? And it goes and pulls all the details from it. Right? And then drafts it. Um, I never want to write these, right? We haven't posted them in months, because I don't want to write them. But with our tools, we can. Like, I was sitting and thinking about it, like, "Man, I never post these." I was like, "Wait, I can automate this with our tools." So, I set it up, uh, and you're getting a preview of our platform. So, uh, ignore some of the stuff going on, like my little cowboy emoji there. But it then drafts... uh, oh yeah, you see, demos, they never work. Uh, it drafts a bunch of stuff, right? And it's like, "Okay, cool, I like it." And it iterates on it, cuz I don't want to write this document, right? I don't want to go and draft it. So, then it goes, and we're about to open Twitter up. Let's see how bad this goes. And it goes and drafts it. And publishes it, right? What? All from your platform. So, you don't have to do that. That is very cool. And it does all the OAuth handling for Twitter, which is, like, a very complicated thing to do now, because their API is paid and there's a bunch of stuff you have to do. And then it'll also draft the Reddit one and go and just post it, right? And you can see it posts it, right? And I don't want to do these tasks, and I wouldn't do these tasks normally, you know? Like, it's just not something I'm interested in. Yeah, no, it's saving you time so you focus your time on the things that actually need your brain power, honestly. Exactly. And now that I've done that once, I can just go tell it, "Hey, just post it again." Right? And it'll do it, right? And using tools to do these things for you automatically is a really big deal. Um, we also [clears throat] use it for some of our non-dev tasks, too, like drafting our emails to some of our beta participants. Okay. Our PM John, for example, uh, saw the emails I was sending and said, "Nick, you're really bad at writing emails and you're even worse at designing them." And he's so nice. I make him sound like the, uh... He was giving you some honest, constructive feedback. Yeah, exactly. So, he used our tool to, like, rebrand them from the ground up, and it's just like, "Hey, it looks like this now. Can you change our email format to this?" And it took me 10 seconds. And John's a PM, he doesn't know how to code. But it took me 15 seconds, because all of our repo knew how to treat it. I told it how to treat our repo, and it knows how to render the HTML that an email will get. And so he was able to contribute as a PM and say, "Hey, here's this thing I want to do." And that's not worth your time doing, right? Cuz for me, it was, like, a huge effort to go do a bunch of design back and forth. For him, it's just, like, I just want to improve this small thing, and it took him 10 seconds.
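The first step of that demo, pulling the latest release so an agent can draft posts from it, maps to a single REST call. A minimal sketch, assuming a GITHUB_TOKEN and leaving out the drafting and posting that Autopilot handles:

```python
"""Fetch the latest release and print a rough post draft (sketch)."""
import os
import requests

REPO = "owner/repo"  # placeholder
resp = requests.get(
    f"https://api.github.com/repos/{REPO}/releases/latest",
    headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
             "Accept": "application/vnd.github+json"})
resp.raise_for_status()
release = resp.json()

# release["body"] holds the release notes; an agent (or a template) can
# turn this into a tweet-length summary and a longer Reddit post.
notes = (release["body"] or "")[:400]
draft = f"{release['name']} is out!\n\n{notes}...\n{release['html_url']}"
print(draft)
```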
And all I had to do was hit approve, cuz he could show me a picture of it, right? And now you have this fully automated sort of, like, release pipeline for these emails that look pretty, uh, and you don't have to bother with. I love that. What else are you handling, like, with these automations? Like, I'm thinking you probably get a lot of feature requests, uh, like, chatter. Yeah, we do a lot of automations like that. Um, some of them are more, like, small business oriented, right? Cuz we're technically a small business, fun fact. So, outreach and sales is a lot of it, uh, and that stuff our CEO uses. Uh, his parents actually use it a lot. And both of our CEO's parents are artists. Oh, yeah. Yeah. Um, so he grew up in a house filled with art, which is really cool. But they use it a lot for reaching out to potential clients and, like, prepping, cuz they both, uh, do very... like, they're painting artists. I don't know what the technical word for that is. But they both paint. Um, and they use it for client communications. I use it for managing my inbox, right? I get 3 million emails from, like, people, cuz my email's on GitHub so prominently, right? We're one of the bigger repos on GitHub. So, I get a ton of emails from that. Um, people are just going by and they're like, "Oh, yeah, we'll email this guy, um, for a potential client." And I use it for filtering all that, hooked up to our Gmail, and just so many different ways you can improve your life by handling the things you don't want to do that are, like, related to running a business or doing admin work for your business, right? Whether it's design, like John was doing, or managing your inbox, right? Our CEO uses it for a customer support agent. Right? So, it will receive all these emails, right? Read the customer support inbox and go, "Okay, cool. Here's the ones that matter. Here's the ones you need to manually reply to. Here's the ones I already know how to reply to. It's this problem we have, right? That's ongoing or whatever." Or it's somebody asking for access to the beta, I'll add them to that list. All kinds of things like that, right? Uh, and it's replaced so many different paid tools for us. That's been kind of surprising, right? There's all these SaaS products that exist, um, that, as a small business, can get really overwhelming to pay for. But when you have an AI agent that can just do the slice you need, and that's it, sometimes that's really great, right? It's just one product that solves this one problem. For example, we use Discord as our team. And we have one that takes all of the, like, feedback form submissions and just posts them, right? Tiny little agent. But to hook that up normally would be a big amount, [clears throat] like, a huge amount of work. You got to hook into Discord, you got to hook into, we use Tally. And, like, you got to build code to do that. Or you can say, "Hey, agent, when you get a new Tally thing in this spreadsheet, just put it here, in our Discord," right? Amazing. But there are a bunch of gotchas that I want to talk about before we run out of time. Yes, please. Let's talk about this, because you have shared a lot of the things that can make us work better and easier, but there are, I'm sure, technical gotchas and problems and things that you've been through and seen. So, please take it away with those. Yeah, I'm going to speed run these a little bit, cuz I think we're a little short on time.
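Reduced to its essentials, that Tally-to-Discord agent is one webhook call. A minimal Python sketch, assuming an incoming submission dict and a placeholder Discord webhook URL; how the submission arrives (polling the spreadsheet or a webhook of its own) is left out:

```python
"""Post a feedback-form submission to a Discord channel (sketch)."""
import requests

WEBHOOK_URL = "https://discord.com/api/webhooks/<id>/<token>"  # placeholder


def post_feedback(submission: dict) -> None:
    # Discord webhooks accept a simple JSON body with a "content" field.
    content = (f"New feedback from {submission.get('name', 'anonymous')}:\n"
               f"{submission['message']}")
    requests.post(WEBHOOK_URL, json={"content": content}).raise_for_status()


post_feedback({"name": "Ada", "message": "The new block editor is great!"})
```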
So, I'm going to say, first off, agents marking review threads that you comment on as fixed without fixing them is, like, a huge issue. Really? Um, yeah, there's a bunch of agents out there that will just mark it as resolved without actually acknowledging the fact that you opened it. Uh, we use Code Rabbit, we use Sentry, and a couple others for automatic reviews, for, like, security and stuff like that. Uh, like GitHub's security bots. And there's some tools that will just resolve everything. Um, so we require them to put a commit SHA in the reply before resolving it, or we don't merge it. Super simple, and you should require that for all your replies, uh, that are done by agents. And you can tell if they're done by agents, based off the talk we just had. Then we, uh, had a problem with GitHub's GraphQL API. You and a lot of other people. So, our problem was a little different than some other people's. Okay. Um, we used it too much. Not, uh, not like GitHub's down or anything like that, but we hit it a lot with agents. Yeah, we hit the limits. Um, but you can make a GitHub app. And you authenticate with the GitHub CLI using... yeah. And when you do that, it's a lot easier to get more requests, right? Cuz as an individual developer, you're just limited to a pretty small number of requests per hour. But as an app, you're allowed to get more, right? And you can request increases if you can prove that you're actually using it and not just, like, trying to clone GitHub as a whole, right? And that was something that was huge that we had no idea about. All of our tools were using the CLI to interact with GitHub, but we were hitting all these rate limits. So, that was a huge deal. Um, writing [clears throat] bad Claude files is worse than not writing a Claude file at all. That was a huge deal. We started off by adding Claude files to basically every directory, um, and then agent files by extension, but it turns out that that pollutes context and causes the agent to focus on that thing way more than it should. Um, so we found that having no files is sometimes a lot better. So, that was kind of surprising: if you're getting poor behavior, check what's in that file and make sure it's not just nonsense. Uh, and then, uh, of course, Dependabot. I don't know if you all use it; you should. It's a good security feature. Mhm. Dependabot is amazing. Um, but one thing you can do is set up an agent that goes and tests the PR. Right? Goes and checks, "Hey, what are all the different things that are in it?" And for that, you can say, "All right, here's these 15 dependency changes. They all got bumped 30 versions or whatever." Right? Some of these tools are released very quickly. What's all the breaking changes? And in Dependabot, it's not always easy to tell what changed between versions, right? Not everybody publishes a changelog or publishes good release notes, but agents can go read every change. And doing that's really been helpful for us. So, those are the more technical ones. There's then the social side of the gotchas you got to watch out for. Okay. I think these are, like, super super big, maintainers. You don't have to accept every PR. Say it louder for the people in the back. You don't [laughter] have to accept every PR. The effort for merging a PR with LLMs is asymmetric, right? You have to do a lot of work. You have to do the upkeep. And it's okay to close it and say, "I'm just going to do it myself." Mhm. Right?
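The commit-SHA-before-resolving rule can be enforced with a small CI check over a PR's review-comment replies. A hedged Python sketch using real REST routes; the repo, PR number, and the choice to fail the build are placeholders for however your CI wires this in:

```python
"""Fail CI if any review-thread reply lacks a commit SHA (sketch)."""
import os
import re
import requests

REPO, PR = "owner/repo", 1234  # placeholders
resp = requests.get(
    f"https://api.github.com/repos/{REPO}/pulls/{PR}/comments",
    headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
             "Accept": "application/vnd.github+json"})
resp.raise_for_status()

SHA = re.compile(r"\b[0-9a-f]{7,40}\b")
# Review comments with in_reply_to_id set are replies within a thread;
# each reply is expected to cite the commit that addressed the feedback.
missing = [c["html_url"] for c in resp.json()
           if c.get("in_reply_to_id") and not SHA.search(c["body"])]

if missing:
    print("Replies with no commit SHA (do not resolve or merge):")
    print("\n".join(missing))
    raise SystemExit(1)
```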
Then we had a problem with GitHub's GraphQL API. Us and a lot of other people, but our problem was a little different: we used it too much. Not that GitHub was down or anything; we run so many agents that we hit the rate limits. But you can create a GitHub App and authenticate the GitHub CLI through it, and when you do, it's a lot easier to get more requests. As an individual developer you're limited to a fairly small number of requests per hour, but as an app you're allowed more, and you can request increases if you can prove you're actually using it and not just trying to clone GitHub wholesale. That was huge for us, and we had no idea about it; all of our tools were using the CLI to talk to GitHub, and we kept hitting rate limits.

Next: writing bad CLAUDE.md files is worse than writing no file at all. That was a big one. We started off adding CLAUDE.md files, and agent files by extension, to basically every directory, but it turns out that pollutes the context and makes the agent focus on that material far more than it should. We found that having no file is sometimes much better. So if you're getting poor behavior, check what's in that file and make sure it isn't nonsense.

And then, of course, Dependabot. I don't know if you all use it; you should, it's a good security feature.

Dependabot is amazing.

One thing you can do is set up an agent that goes and tests each Dependabot PR. It checks what's actually in it: "All right, here are these 15 dependency changes, they each got bumped 30 versions," or whatever, because some of these tools release very quickly. What are all the breaking changes? With Dependabot it's not always easy to tell what changed between versions; not everybody publishes a changelog or good release notes. But agents can go read every change, and doing that has been really helpful for us.

Those are the more technical ones. Then there's the social side of the gotchas you have to watch out for.

Okay.

I think these are the really big ones. Maintainers: you don't have to accept every PR.

Say it louder for the people in the back.

You don't have to accept every PR. The effort around merging a PR in the LLM era is asymmetric: you have to do a lot of the work, and you have to do the upkeep. It's okay to close one and say, "I'm just going to do it myself." And if they like the idea, that's fine; the person from Branchlet I mentioned earlier did exactly this, and he was perfectly happy with it. Sometimes it's just not a good fit, and there's no reason you have to merge it.

But if you do say, "I'm not going to accept this PR, I'm going to do it myself," there are a couple of options. You can add co-authors on GitHub and say, "This person helped." For a lot of the people we've talked with, what matters is not that their code is merged. One, they want the thing they wanted fixed to actually get fixed; that's a big deal. And two, sometimes they want the contribution credit. Especially with a large project like ours, with around 800 contributors, adding another contributor is not a huge deal. We're not trying to gatekeep who's on the contributor list; if you have a good idea, you can contribute. Adding the co-author is basically free, and it really helps people. It feels good to be a co-author on a PR, even if the maintainer completely rewrote it.

Yep. A lot of the time you just want to make sure the idea is acknowledged: "I really like this idea, so I don't care how it gets done, because it's going to solve my problem, but it'd be nice if you said, hey, Nicholas, thanks for that idea."

Exactly. I've also seen people add your handle to the changelog. If it's not co-authoring, just making that recognition, adding someone to the contributor list, is so nice.

That is very nice. That's a really good tip.

100%, and I think it's really helped us acknowledge these people, because we wouldn't be where we are today without our community. We're huge, and we're huge because of the community that helped us become this. I like to say we built this community, but realistically they built us. And that's the same for every open source project: you can go to events and do marketing, but at the end of the day, the people who contribute to your project make your project.

100%.
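For reference, co-author credit on GitHub comes from `Co-authored-by` trailers at the end of a commit message; this is GitHub's documented convention, and the name and email below are placeholders. GitHub matches the co-author by that email, so it needs to be an address associated with their account (their GitHub noreply address also works):

```text
Fix pagination on the contributors page

Rewritten by the maintainer, crediting the original draft and idea.

Co-authored-by: Jane Contributor <jane@users.noreply.github.com>
```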
And then there are people who are just contributing in bad faith, unfortunately. Close those PRs. If somebody made a slop PR and generated it in ten seconds, close it. They didn't spend any time on it; why should you? That's how I feel about the PRs that don't sign our CLA, too. If they can't put in the effort to sign it, sorry. The exception is everybody who wants to sign but needs to go talk to their company first. They comment and say, "Hey, I'm working on signing this, please don't close my PR," and we won't. We'll wait however long it takes. But don't let this stuff turn something you love into something you loathe.

Wow.

It's a big deal. The slop, the flood of it, is real. We used to get a lot of PRs from Claude Code on the web, because AutoGPT appears at the top of the repo list whenever you connect it to GitHub if you've starred or forked the repo. People would attach their idea to our repo and open it as a PR because they didn't really understand how Claude Code worked. Closing that PR is okay. That's fine. You don't have to worry so much.

And for the PRs that do matter but aren't quite hitting the mark, I have to say it again: it's really okay to just add them as a co-author. They want to be recognized for the work they've done. It doesn't matter whether you include their work or just recognize them.

That's an excellent tip. Tell us about this, because you've taken such an incredible approach. We all build things to scratch our own itch, to fix our own problems; like you said, somebody contributes because they want the project to work better for them. And you do a lot of that, building tools for your specific problems. Can you tell us a little about the PR-testing tool at AutoGPT?

Yeah. It's a tool we built internally that clones your PR locally, tests it end to end through the UI, and uploads screenshots to the PR. We started by building it in OpenClaw, as just a thing on my computer, or technically a VM, because I don't think you should run OpenClaw on your computer directly.

Hello! Architecture and security. Nicholas with all the bangers today. My goodness, I'm getting so many quotables from this conversation.

We started with an OpenClaw bot: "Hey, will you start this thing? Will you run this PR? Will you take screenshots? Will you build a team of agents to do this?" And we built tools to do it automatically, because we realized we were spending so much time going, "I need to test this, but I don't want to clone it locally because that takes time, and it's a UI change on some random page that isn't really going to affect behavior. I just need to make sure they didn't mess up a class and throw off the layout." So we started by standing up the UI and having the agent go to that page.

While building that, we decided: this is really cool, but OpenClaw is kind of unpredictable. That was one of the problems with our classic AutoGPT, too. When you give an agent a sandbox and let it go do what it wants to do, sometimes it does what it wants to do.

Mhm.

So we swapped to the Claude SDK. And it's important to remember that none of these tools has to be the tool. You can try a new one and change later; we have LLMs that can write code and translate between tools faster than you can. Moving from OpenClaw to the Claude SDK was not hard, and it let us get a little more pipelined. Now Claude pulls down the code and spawns a bunch of different agents that review it. One reviews for product fit: it connects to our Linear roadmap and says, "This is completely against the roadmap; don't do this." If somebody wanted to add what was effectively a whole new product inside our repo, that agent would go, "This is just insane. We're not doing this." And we had ones checking test coverage and quality, which would go, "Okay, cool, I like these tests."

And at the end of a pipeline of eight different agents, each doing slightly different things with its own prompt, we had the UI reviewer. Its job was to stand up the entire stack, walk it top to bottom, and take screenshots using Agent Browser, the same tool we have your local LLM run whenever we convince it to test a PR on your machine, then upload them to the PR via Cloudflare. It says, "Here's the page I navigated to, and here's how it looks." For a lot of PRs, that's sufficient. Now, we're a client-facing app, so changes show up visually, but you can apply similar ideas elsewhere: your reviewer could say, "Write code to test this new feature of our library." If you sandbox that sufficiently (we put it in end-to-end sandboxes and wall it off so agents can't get at our keys), you can basically say, "I want all of this tested," and get a report back that says "this doesn't work" before you spend any time on it.

That leads into one of our gotchas: it can cost a fortune. So we don't run that tool very often. We used to run it on every single PR, but now we run it mostly on PRs that are very small or very large, because it's really good at both ends of the spectrum. I want it to go test everything; it may take two hours, but that's two hours I'm working on something else while it runs in the background. I can say, "Go test this PR," come back to a report that says the PR works fine, and then it's worth my time to clone it locally. And you can build all kinds of tools like that.
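The screenshot-and-report step could be sketched like this, using Playwright for Python against a locally running build. The routes are examples, and `upload_to_storage` is a hypothetical stand-in for the Cloudflare upload Nick describes; only the Playwright calls and the GitHub comments endpoint are standard:

```python
# Sketch: screenshot a few routes of a locally running PR build and post them
# back to the PR as one comment. upload_to_storage() is a hypothetical
# placeholder for an object-storage upload that returns a public URL.
import os

import requests
from playwright.sync_api import sync_playwright

ROUTES = ["/", "/dashboard", "/settings"]  # example pages worth eyeballing

def upload_to_storage(path: str) -> str:
    """Hypothetical: push the image somewhere public, return its URL."""
    raise NotImplementedError

def screenshot_and_comment(base_url: str, repo: str, pr_number: int) -> None:
    shots = []
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport={"width": 1280, "height": 800})
        for route in ROUTES:
            page.goto(base_url + route)
            path = f"shot{route.replace('/', '_') or '_home'}.png"
            page.screenshot(path=path, full_page=True)
            shots.append((route, upload_to_storage(path)))
        browser.close()

    body = "Automated UI pass:\n" + "\n".join(
        f"### `{route}`\n![{route}]({url})" for route, url in shots
    )
    # PR comments are created through the issues endpoint.
    requests.post(
        f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments",
        headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
        json={"body": body},
        timeout=30,
    ).raise_for_status()
```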
For example, we set up CI to fix itself. That turned out to be a security nightmare for a project this large, which we won't fully get into, but the idea is the process we and our agents already follow locally: you commit, you check CI, and it comes back with, "Hey, you failed these five things, go fix them." One thing we tested, and this is a gotcha with GitHub Actions worth talking about, was simply having Claude Code go fix it: here are the five failures, run Claude Code in GitHub Actions. That's actually a supported thing, and you can do it with Copilot too. GitHub Actions also has model inference now via GitHub Models, so you can tell a model, "Identify the problem here and comment on the PR." Those models are so good at diagnosing failed action runs, it's unbelievable. I'm so happy I never have to fight YAML again; I'm never writing a workflow by hand again.

Right, exactly.

We're planning to rewrite this on GitHub Models so it's more secure, because we were using Claude Code directly and authenticating through it, and that setup has some token-theft weirdness. But GitHub Actions has AI inference built in. Just use it: after a failure, say, "Here are the failures; if any CI job fails, figure out why and comment." That's really helpful. We actually turned it off, though, because we didn't love how much it commented. It turns out we're kind of bad coders, our CI fails a lot, and a bot that constantly tells you why your CI failed doesn't feel super awesome.
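As a sketch of that triage step, here is one way the loop could look, assuming the GitHub Models chat-completions endpoint and a token with access to it; both the endpoint and the example model id are assumptions, so check the current GitHub Models documentation before using this:

```python
# Sketch: summarize the tail of a failed CI log with GitHub Models and post
# the diagnosis as a PR comment. Endpoint and model id are assumptions taken
# from GitHub Models docs at the time of writing; verify before relying on it.
import os

import requests

HEADERS = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}

def diagnose_failure(log_tail: str) -> str:
    resp = requests.post(
        "https://models.github.ai/inference/chat/completions",
        headers=HEADERS,
        json={
            "model": "openai/gpt-4o-mini",  # example model id
            "messages": [
                {"role": "system",
                 "content": "You triage CI failures. Be terse and concrete."},
                {"role": "user",
                 "content": f"Why did this CI run fail?\n\n{log_tail}"},
            ],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

def comment_triage(repo: str, pr_number: int, log_tail: str) -> None:
    # Keep only the last few thousand characters of log to stay in context.
    body = "Automated CI failure triage:\n\n" + diagnose_failure(log_tail[-8000:])
    requests.post(
        f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments",
        headers=HEADERS, json={"body": body}, timeout=30,
    ).raise_for_status()
```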
That's the cool thing about all these tools: you don't have such a commitment to them anymore. You can say, "I'll experiment with this and try it." The Reddit posting we talked about earlier, where it automatically posts to Reddit: if I decide after two weeks that I don't like it, I just turn it off. There isn't so much time that has to be sunk into these things anymore, so you're not as committed to the processes.

And that ties into what I want to talk about next, which is mixing and matching these tools. I've talked about AutoGPT. I've talked about Claude, Copilot, Codex, Cursor. We use all of them; everyone on our team uses different ones, and you can mix and match them to do what you want. There are a lot of very generous open source tiers for these tools. Shout-out to CodeRabbit; I love CodeRabbit. For example, one of my favorite things with the Copilot CLI is telling it to work adversarially. I think the skill is /adversarial or something like that: it has Claude argue with Codex until they reach a conclusion on what a good plan is. Pitting these things against each other is so useful.

But if they stop being useful, stop using them. We try new tools all the time, and you don't need to keep using one that doesn't work for you. It's not like being a paying customer locked into a two-year contract. If you're open source, just stop using the tool. Half the time they want you using it for the advertising anyway, and if you stop, that's worse advertising for them, so maybe they'll make it better. Maybe you try it again later, or maybe it just doesn't fit your workflow. We tried a tool, I think it's called Graphite or Greptile or something like that, and it was really cool. It works really, really well for the OpenClaw repo, but it required a level of teaching and setup that we just don't have time for. They made it work because they adopted it from the beginning; our code base is huge, it didn't work for us, so we removed it.

One quick security note, since we're part of the Secure Open Source Fund: if you ever stop using a GitHub app, remove it from your authorized apps.

So serious.

Don't let an authorized app just hang around. It's one of those things you learn. Do a little audit right after this stream and see what you have; you'll be surprised.

No, this is so real, because you never review them. Logging in with GitHub has become such an innate thing that you don't even realize it.

Yeah. Always check. And there are a lot more permission layers now than there used to be; a lot of apps don't need read and write access. When you allow something to use your auth, you are responsible for that. Make sure you're only authorizing what's necessary to accomplish what you're trying to do. That's very, very important.
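Personal OAuth authorizations can only be reviewed in the web UI under Settings, then Applications, but for an organization you administer, the REST API can list which GitHub Apps are installed. A minimal audit sketch, assuming a token with org-admin access:

```python
# Sketch: list the GitHub Apps installed on an organization so stale ones can
# be spotted and removed. Requires a token with org-admin access.
import os
import sys

import requests

def list_org_installations(org: str) -> None:
    resp = requests.get(
        f"https://api.github.com/orgs/{org}/installations",
        headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
        timeout=30,
    )
    resp.raise_for_status()
    for inst in resp.json()["installations"]:
        # repository_selection is "all" or "selected"; prefer "selected".
        print(f"{inst['app_slug']}: repo access = {inst['repository_selection']}")

if __name__ == "__main__":
    list_org_installations(sys.argv[1])
```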
Oh my gosh, Nicholas, we've covered a ton of ground. You brought up a point about time, and I know we're going over, so thank you for being so gracious. I messaged you in the private chat, like, "Can you stay a little longer?"

Yeah, absolutely.

You brought up time, and I think we all fall into that sunk-cost trap with the hours we've invested: you don't want to drop something, even when it isn't working, because "we've been trying so hard to use this thing." I love the advice to just cut ties and be done if it's not right for your workflow or your project. But thinking about the economics of AI contributions: sometimes things cost you more time to review than they would have taken to build yourself. Can you share how you think about that?

Absolutely. There are a ton of tools that help with this. Some projects handle it by only letting you open issues, and the maintainers implement them. Or you're only allowed to open issues with a spec of the feature you want. We don't go that far, but I totally get why other projects do; it makes a ton of sense. What we do is allow anyone to open a PR, but you're not allowed to claim an issue. So four or five people may open PRs against the same issue, and we tell our tools, "Go find all the PRs related to this issue and pick the best one." We merge that one and try to add everybody else as contributors. Some submissions are pretty sloppy and usually don't get contributor credit, but if it's a good-faith effort to solve the problem, awesome.
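A crude first pass at that lookup can lean on GitHub's issue search. Below is a minimal sketch, assuming that text-matching the issue number in PR bodies is good enough as a starting point; picking the "best" PR is left to a human or a model, and this is not AutoGPT's internal tool:

```python
# Sketch: find open PRs that mention a given issue number so a reviewer (or
# an agent) can pick the best one. The "#<number>" text match is a blunt
# heuristic, not how AutoGPT's internal tooling actually works.
import os
import sys

import requests

def related_prs(repo: str, issue_number: int) -> list[dict]:
    query = f'repo:{repo} is:pr is:open "#{issue_number}"'
    resp = requests.get(
        "https://api.github.com/search/issues",
        params={"q": query, "per_page": 50},
        headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["items"]

if __name__ == "__main__":
    repo, issue = sys.argv[1], int(sys.argv[2])
    for pr in related_prs(repo, issue):
        print(f"#{pr['number']}  {pr['title']}")
```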
But there is so much asymmetric work in all of this that it's important to remember what actually takes time in your workflow. For us, coding is very fast, because we build agents and we know how to use them really well. For you, that may not be true. Getting the contribution in may matter more, or maybe you want to fix PRs manually, or have agents do it. Identify what's important in the contribution flow for what you want as a maintainer, and focus on that. I like having a stack of PRs I can pick through; for a lot of people, that's really overwhelming. Seeing 150 PRs every day is quite stressful for a lot of people, and a lot of the time that's all I do in a day: review PRs. It's a motto on our team that code that's unmerged is code unwritten, because if it's not merged, it's not usable, and if it's not usable, it may as well not be written. So spending a day reviewing PRs is a good use of time for us. For some people it isn't, because you may not have a whole team of people coding; you may be really overwhelmed. In that case, maybe only allowing issues is a good idea, and then you pick the issues you think are worth doing and do them yourself. Pick what you like to do as a maintainer and do that. That's how you keep the burnout away. It's easy to say, "AI is taking all of our jobs," and it is changing things a lot. But as the maintainer of a project, you have a lot of choice in how you maintain it. I like reviewing PRs; that's a thing I enjoy. Several members of my team, I think, would rather have a lot of bad things happen to them than review PRs. I'm trying to keep this family-friendly: they'd rather get sprayed by a skunk than review ten PRs, for example. And that's okay. They go pick issues and assign them to people instead. You have to work with what works for you, and you have to remember that you're the programmer and AI is a tool. The sharper you make an axe, the better it works; leave it dull and rusty, and when someone else picks it up, don't be surprised if they hack at your tree like a butcher for 45 minutes. You have to make the tools work well for you, and what works well is different for every project. I've told you a bit about our workflow, but every other project is different, and you have to acknowledge that.

Absolutely. Absolutely. These are all ideas. You've shared your triumphs and your pain points from the surge of agent-driven contributions, and you have a system now that works for you. If you maintain an open source project and you're watching this thinking, "You know what, this is a good idea," maybe you go back and make sure your agent files are what they're supposed to be, or find a way to automate things so more people can contribute. I love the story about your PM being able to contribute in a technical way by using these tools. It's so easy to pile on the negative that we sometimes miss how much positive there can be. Remember what Nicholas is telling you: you control the way you maintain your project. No agent is going to come in and maintain it for you unless you allow it to. You set it up the way you want, or you don't. You can keep reviewing PRs like Nicholas does, because he likes it.

Yeah, and you can turn off PRs entirely for your project. That's one of the things the Tiny Wins team did, and it was such a big deal. For the people who just want issues, that's perfect. That's awesome; you don't even have to deal with PRs, you just make the change yourself. There are projects like SQLite that don't accept external contributions at all, but they do accept bug reports. That's perfect for them. They're a team that's decided they don't want your code changes, and that's okay. It's their project.

Absolutely. Looking into the future of open source: everything is changing for our profession, everything is changing for open source. Where does open source go? We were thinking maybe in a year or two, everybody is going to be working with an agent. I have three different OpenClaw instances: one on hardware, an old and very locked-down Windows computer, and two in virtual sandboxes, and I'm adding a third today because I want to experiment with Hermes, which I haven't tried yet. I feel like everyone now has an agent, or the power of one, if they're using Copilot or the Copilot CLI. So where does open source go? What does that look like?

It's going to change really dramatically. Projects like ours are, I feel, at the forefront of what this looks like, because we are AI agents: OpenClaw, OpenCode, things like that. They experience these changes first, because people are excited about the tools and use them to build the tools themselves.
But there are a lot of projects that haven't yet experienced things like people who don't know how to code trying to make changes, because they want the project to be better for them. That's a whole suite of situations you're going to have to deal with as a maintainer. You can tell this person to run the tests, and they won't know what that means. There's some level of education required, and you'll have to figure out how to handle it. We've started handling it by teaching agents how to do these things. Our repo is pretty big, so it gets indexed in a lot of these training runs, which is really beneficial for us: every agent, Claude or GPT, knows about our project and knows the code base, because it's such a big project. Your project may not have that. So you need to set yourself up for success and decide how you want contribution to work.

Because it's really easy to make a new project now. Really easy. For example, I wanted Van Gogh backgrounds from museums to show up every day on my Mac. I had Claude build that in a day. If I wanted to, I could publish it to the App Store and it would work: it auto-starts, rotates the displays, puts the image on a nice little mat. And that's an app that already exists, one I didn't know existed. So you're going to see a lot of duplicates, cases where people have an idea and can now turn it into reality so quickly. As a maintainer, how do you deal with comparisons to these things? How do you deal with the fact that the thing you're building maybe…

Transcript truncated.
