GitHub has a fake star problem…

Theo - t3.gg | 00:36:51 | Apr 25, 2026
Chapters: 6
Explains how star counts became a key signal for adoption and investment, and why fake stars undermine trust in GitHub metrics.

GitHub stars are being gamed at scale, driving fake traction and misleading VC funding signals—and the fix won’t come from better moderation alone.

Summary

Theo’s deep dive into the fake star problem on GitHub highlights a chilling reality: stars have become a proxy for traction, adoption, and funding. He walks through the Star Scout findings, which estimate roughly 6 million suspected fake stars spread across about 18,600 repos, backed by a vast ecosystem for selling and trading stars. He connects this to how VCs use GitHub metrics, citing Redpoint Ventures’ seed and Series A benchmarks that equate star counts with fundraising potential. The video also dissects the fingerprints of manipulation, such as ghost accounts, low follower counts, and skewed fork-to-star ratios, demonstrating that raw star totals can be gamed.

On the policy side, GitHub’s enforcement is described as reactive and opaque: 90% of flagged repos were removed, but many fake accounts were left standing. Theo argues for a structural shift toward weighted popularity metrics and more robust signals like unique monthly contributor activity, package downloads, and contributor retention. He also discusses the broader implications for open source ecosystems, including the AI and crypto projects that dominate fake-star campaigns and the chilling effect this has on real developers and funding decisions.

Throughout, Theo blends critical commentary with practical observations from industry sources, investor benchmarks, and CMU-led research, making a provocative case for why star counts can no longer be trusted as traction indicators. He also points to a nuanced writeup by Awesome Agents as a credible synthesis of the data. The takeaway is clear: unless platforms, investors, and regulators overhaul how popularity is measured, the fake star economy will continue to distort discovery and funding in open source.

Key Takeaways

  • Star Scout identified ~6 million suspected fake GitHub stars across ~18,600 repositories, based on 2019–2024 data from 326 million stars and 6.7 billion events.
  • VCs reportedly rely on GitHub star counts for funding decisions, with Redpoint benchmarks showing seed at ~2,850 stars and Series A near ~5,000 stars.
  • A low fork-to-star ratio and high rates of zero followers/ghost accounts are strong indicators of manipulated stars (e.g., Flask vs. Union Labs patterns).
  • GitHub’s enforcement is reactive: thousands of fake accounts exist after takedowns, and there is no transparent public report on detection or remediation.
  • Experts propose moving to weighted metrics like monthly unique contributors, issue/PR activity, and usage telemetry to gauge real adoption rather than vanity stars.
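The "weighted metrics" idea in the last bullet can be made concrete. The sketch below is purely illustrative and not from the video: the function name, the input fields, and the weights are assumptions of mine. The design intent is the one the video argues for: discount stars heavily because they cost cents to buy, and weight up signals that require sustained, visible work to fake.

```python
def adoption_score(stars: int, monthly_contributors: int,
                   merged_prs_90d: int, downloads_30d: int) -> float:
    """Hypothetical weighted adoption score. Stars are heavily
    discounted because they can be purchased for cents each, while
    contributor and usage signals are weighted up because faking
    them requires ongoing, attributable activity."""
    return (0.01 * stars
            + 25.0 * monthly_contributors
            + 5.0 * merged_prs_90d
            + 0.001 * downloads_30d)
```

With these (made-up) weights, a 100k-star repo with no contributor or usage activity scores about 1,000, while a 5k-star repo with 40 monthly contributors, 300 merged PRs, and 2M monthly downloads scores about 4,550: the quieter but genuinely adopted project wins.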

Who Is This For?

This is essential viewing for open source developers, startup founders, and investors who rely on GitHub traction signals. It explains why star counts can mislead funding decisions and what better metrics to track instead.

Notable Quotes

"There are VCs that invest millions of dollars based on star counts. It's been this simple test to see what projects are being used and adopted, and we can't trust it anymore."
Sets up the core concern: star counts as the funding signal are unreliable.
"The governance here is reactive and asymmetric. GitHub removed 90% of repositories that were flagged by Star Scout, while only 57% of the flagged accounts were removed."
Highlights a gap between repo takedowns and account-level remediation.
"Star economy is a $50 problem with a $50 million consequence."
Captures the author’s closing synthesis of the financial impact.
"A project like Flask has 235 forks per thousand stars, while manipulated repos sit around 17 to 22; when nobody is forking a 157,000-star repo, nobody is using it."
Demonstrates how fork-to-star ratios serve as a practical fraud signal.
"You can fake a star count, but you can't fake a bug fix that saves someone's weekend."
A pragmatic reminder that code quality and contribution impact are harder to counterfeit.

Questions This Video Answers

  • How do fake GitHub stars influence startup funding and VC decisions?
  • What metrics should investors use instead of GitHub stars to assess open source traction?
  • What is Star Scout and what did it find about fake stars on GitHub?
  • How can GitHub improve detection and enforcement of fake star campaigns?
  • Why are fork-to-star and watcher-to-star ratios useful in spotting manipulation?
Tags: GitHub, Star Scout, Fake star economy, Open source funding, Venture capital, Weighted popularity metrics, GitHub policy
Full Transcript
GitHub fundamentally changed how we think about software. Not only did it provide a place for us to host our source code and get contributions from people all over the world, it kind of formed a community around those open source projects. Sure, there are things like issues and they have the weird kind of cringe communities tabs as well. But what made GitHub magical is the idea that you can collect and find these open source projects and give them a little star, maybe even fork it if you want to make changes. With this, there were a lot of useful things we got as a community. Generally speaking, when a project had a lot of stars, it tended to mean it was trusted and well respected amongst the community. And as the numbers got bigger, it meant the project was being adopted very heavily, which is why this metric is so valuable to people within the dev space and especially those outside of it trying to peer in. A lot of people use star counts to decide what tools to use, what projects to invest in, and what companies they can trust. Which is why this recent reporting about fake stars on GitHub is so terrifying. There are VCs that invest millions of dollars based on star counts. It's been this simple test to see what projects are being used and adopted, and we can't trust it anymore. Awesome Agents found what they believe are over 6 million fake stars and a VC funding pipeline alongside it that uses the popularity on GitHub as proof of traction. This is terrifying. On one hand, this is a classic instance of Goodhart's law, which is that when a measure becomes a target, it ceases to be a good measure. But on the other hand, this goes so much deeper. This felt like conspiratorial thinking, but the more I've looked into it, the more real it seems.
And I have so many thoughts on this, how it affects open source as a whole, and most importantly for me and the world I live in, how this changes the way that venture capitalists and different funds and firms think about their investments into open source technology. I am constantly in fear that capital is going to leave the open-source world and things like this are going to accelerate that happening. And if I can't make money from VCs anymore, I'm going to need to make a little bit from today's sponsor. Today's sponsor is an interesting one. It's a product I was super skeptical of. Well, being real, it was a whole category I was skeptical of. You don't know this about me. I'm a really fast typer. I type at over 160 words per minute. So, when I had to have my hand surgery last year, I was petrified. I did not want to lose my fast typing speeds because I knew that voice-to-text would just never be it. How is it going to know what file I'm trying to reference or all the weird variable names I use in all of my code? There's no way it'll be even close to as fast as I am typing. Or so I thought. And then I tried Whisper Flow and I liked it so much that I bugged them until they let me do a sponsor. And that's what today is. I just fell in love with Whisper Flow. Even though my hands work great now, doesn't matter. I still use it dozens of times a day. You've probably even seen me using it in videos. Like everybody constantly asks what I'm using. And now you know. Whisper Flow is available everywhere and it works for everything. Mac, Windows, Android, and iPhone all alike. Even apps like Excalidraw work perfectly with Whisper Flow. And see that it even knew Excalidraw and capitalized it correctly. What's even cooler is if you go back and make deletions and changes, it'll detect that and save those changes to your dictionary so it knows that it can use it. Since Whisper Flow runs on your operating system, you can use it anywhere, even your CLI.
And I found it super useful for this. Like I want to move over from SQLite to Postgres. Make sure that we remove the quip slop.sqlite file and see that it was able to use the screen reading capabilities to know that this is the specific thing I was talking about cuz the reference is right there. It's so good. It's all these little things that add up that make Whisper Flow an incredible tool to use. And I haven't even gotten to my favorite power user stuff. Hopefully enough of you guys sign up that they'll let me keep doing ads so I can show you the rest. Write better prompts and waste less time at soyv.link/wisperflow. In a world of charts that look like this, where the very new project OpenClaw now has more stars than React and Linux, and the chart looks like this, just a vertical line, it shows how much things are changing, and the fact that new projects can take off like that. And we have normies on GitHub. It changes the economics of a lot. I knew this was all over when I started seeing posts like this on Reddit. I'm new to GitHub and have lots to say. I don't give a [ __ ] about the [ __ ] code. I just want to download the stupid [ __ ] app and use it. Why is there code? Make a [ __ ] .exe file and give it to me. And it's because this project had an installation command to run and not an .exe download. I knew when I saw stuff like this, it was all over. And obviously on Twitter, we've seen even more. I'm sorry I have to say this, but GitHub needs a big red download button. Normies made it to GitHub, and normies making it to GitHub means that new projects have a much larger potential user base, and those user bases indicate trends breaking out past the dev world. We need to be realistic about AI right now. Most of these big AI things live within the dev world primarily. So, the things that break out are very impressive.
The fact that we have people using tools like Claude Code who aren't devs to build actual things for themselves and use their computer, that's a big deal. The fact that rappers are setting up OpenClaw to manage their email for them is a big deal. And venture capitalists who want to make money by investing early and then selling their shares later look at these things to know that stuff is going on, that trends are happening. And since they don't understand code either, since the VCs I'm talking about here arguably fall under this "I'm sorry, where's the download button" camp, they rely on these superficial things like the number of stars on GitHub. And as we know with Goodhart's law, as soon as a metric can be measured and is looked at, it stops being useful because people will game it. And that is where we get into the Awesome Agents article. They're claiming 6 million fake stars at 6 cents per click. A GitHub star costs 6 cents at the low end. A seed round unlocks $1 million to $10 million. The math is obvious, and thousands of repos are exploiting it. This investigation maps the full ecosystem from the peer-reviewed research quantifying the problem to the marketplaces selling the stars openly to the venture capital pipeline that converts star counts into funding decisions. We ran our own analysis on 20 repos using the GitHub API, sampling thousands of stargazer profiles to independently verify which projects show fingerprints of manipulation and which don't. The picture that emerges is a mature, professionalized shadow economy operating in plain sight. This started from a peer-reviewed study that was presented by researchers at Carnegie Mellon, North Carolina State University, and Socket. Their tool, Star Scout, analyzed 20 terabytes of GitHub metadata, 6.7 billion events, and 326 million stars from 2019 to 2024, and identified approximately 6 million suspected fake stars, distributed across 18,600 repos by roughly 301,000 accounts. Oh boy. Yeah, Socket is very legit.
They have been doing a lot of this type of research for a while now. This is real. The problem accelerated dramatically in 2024. By July, over 16% of all repos with 50 or more stars were involved in fake star campaigns, up from near zero before 2022. The researchers' detection proved accurate. 90% of flagged repos and 57% of flagged accounts have been deleted as of January 2025, confirming GitHub itself recognized these as illegitimate. AI and LLM repositories emerged as the largest non-malicious category of fake star recipients, ahead of blockchain and cryptocurrency projects in absolute volume, at 177,000 fake stars. The study notes that many of these are academic paper repositories or LLM-related startup projects. That is interesting that random academic papers are trying to boost themselves by getting a bunch of stars, or perhaps they're trying to make the accounts look more legit by starring those. Critically, 78 repos with detected fake star campaigns appeared on GitHub trending. That is very bad. I have struggled to get my stuff on GitHub trending historically. I think UploadThing ended up there once or twice. Create T3 App did very, very shortly. But despite the fact that a lot of our projects are big and have legit stars, like again Create T3 App with 28,800 stars, as far as I know these are vast majority legit if not entirely. But like goddamn, this was even hard to get into trending. But the fact that people purchasing stars managed to get into trending does absolutely prove that purchasing stars lets you game discovery on GitHub. Dagster did research in 2023 about this where they actually went and purchased stars from two vendors in order to study this phenomenon. They found services via basic Google search. A premium vendor, GitHub24, which is a registered German company, charged €85 for 100 stars and delivered reliably, with all 100 stars persisting after a month. A budget service sold 1,000 stars for $64, though only 75% survived.
This ecosystem spans dedicated websites, freelance platforms, exchange networks, and underground channels. At least a dozen active sites sell GitHub stars directly, like Social Plug, Buy Fans, and Boost Like. Cool. There's a bunch of scam sites if you want to use them. Not an ad. Don't use these. Seriously, these things should not [ __ ] exist. You have different tiers of accounts. The disposable ones are 3 cents to 10 cents per star. They take days. Mid-range is 20 to 50 cents and it takes one to two weeks and they have some actual history in the accounts. Then premium aged accounts, ones that have been on GitHub for a while, cost 80 to 90 cents per star. The delivery is a lot slower and more natural, but the quality of the accounts is much higher. On Fiverr, there are 24 active gigs selling GitHub promotion with packages from $5 for basic stars and forks to $25-plus for organic promotion. Many use obfuscated language to evade platform filters. Star exchange platforms like GitHub Starmate and Safe Star Exchange, both live and operational, enable free mutual starring through credit-based systems. Are you kidding? This is awful. I did not know this went this deep. The ecosystem extends beyond stars. At least seven open source tools on GitHub, like fake-git-history, commitbot, committer, and others, exist specifically to fabricate GitHub contribution graphs. Pre-built GitHub profiles with 5-year commit histories and Arctic Code Vault contribution badges will sell for approximately $5,000 on Telegram. Some vendors offer replacement guarantees: Followed advertises 30-day coverage, and premium services promise non-drop stars that survive GitHub's detection systems. Social Plug claims 3.1 million stars delivered across 53,000 clients and they offer a formal API for programmatic purchasing. There are WeChat groups with over a thousand members processing 20-plus repos, generating $3.4 to $4.4 million annually in promoter profits.
This is feeding into a bias I already have against GitHub, which is that they wanted to make a code platform that is also a community platform, like a social media type thing, but they have no insight or expertise at all in how to do that. They don't know how feeds work. They don't know how moderation works. They don't know how chat and community and harassment, all of these things, work. And I was very frustrated with this when we were dealing with massive spam waves on GitHub because they just never provided the tools we need to ban users, to lock down the comms channels, and do all the things you need to do to make the community spaces more viable and well-maintained. I'm a bit biased here. I put a lot of work into building Mod View for Twitch, which is, as far as I know, to this day the best moderation platform on a modern community product. This whole dashboard is fully customizable. You can put things wherever you want them. You can resize stuff. We have a ton of ways to control what's going on in chat. We have mod settings here where I can see what moderators are doing. I can show messages that are caught by AutoMod. I can configure AutoMod in order to block certain things from certain people. I can show how long deleted messages are shown and if they're shown at all. I can change chat pause behaviors. This is more for me as a user. But I also have a bunch of other options, like shield mode, which hard-restricts the chat, or sub-only chat, where only people who paid can chat, or emote-only, where they can't send messages, they can just send emotes. Follower-only, where they can only send messages if they've followed for a certain amount of time. And then slow mode, where people can only send a message every however many seconds. These are just some of the tools we had to build at Twitch in order to make a good moderation experience for the hardworking, very underappreciated people that keep platforms like Twitch safe, the moderators.
Of all of the features I just listed, there are zero comparable features anywhere in any dashboard on GitHub. If you want to limit how many issues a user can open, you're [ __ ] If you want to filter certain words from appearing in GitHub issues, you're [ __ ] You cannot do any of the things that you would want to do as a moderator of the platform on the platform because they just don't build any of it. So, it's not surprising to me that things like this, these star spamming platforms, aren't going to be meaningfully stopped, because GitHub doesn't know how to run a platform. They know how to run a place that holds your source code, kind of. I have very low faith GitHub will ever deal with this problem. As such, I think this will get worse way before it ever gets better. And I want to read more of this analysis because this is all very scary. They did a breakdown of what fake stargazers look like. Stargazers is the GitHub term for people who hit the star button. To move beyond reported statistics, we built a GitHub API analysis tool and ran it against 20 repos: projects flagged by Star Scout, fast growing AI repos from the Runa Capital ROSS index, and known organic baselines. For each repo, we sampled 150 stargazer profiles and measured account age, public repos, followers, and bio presence. The fingerprints of manipulation are unmistakable once you know what to look for. So we look at Flask stars. They have 71K stars on Flask. Median account age is 4,81 days. Zero public repos are 5.3% of the accounts they sampled. So the vast majority have repos. Only 10% have no followers. 1.3% are ghost accounts. Zero of them are suspicious accounts. A relatively low fork-to-star ratio, but not that low: it's 0.23, so that's 20% or so. Watcher-to-star ratio is about 0.03, so about 3%. That's fair. And the others aren't too far off. Even LangChain has a much higher median account age.
Organic repos are starred by devs who have been on GitHub for years, maintain their own projects, and follow other users. Ghost accounts have zero repos, zero followers, and no bio. They only make up 1% of a healthy project's stargazers. But on the manipulated repos in the blockchain world, 52% on Union Labs have zero followers, 59% on Shared, 81% on Free Domain, and 62% on Anoma. God, you crypto people, I hope you understand how the whole world you are surrounding yourself in is just full of [ __ ] scams and lurkers that aren't real. Like this is Yeah. 0.001, so 0.1% of Free Domain stargazers also bothered to hit watch. Insane. The accounts aren't obviously new. The median age is still over a thousand days for most of them. They pass simple young account filters, but they're empty. A third have zero repos. Half to four-fifths have zero followers and a quarter are complete ghosts. They are aged accounts purchased or farmed specifically for star campaigns. The fork-to-star ratio is a strong signal. Flask has 235 forks per thousand stars. I actually have been thinking about this ratio a lot because T3 Code has a crazy ratio here of 1.6k forks to 9k stars. That's almost a 20% ratio. Yeah, Flask is in a similar range in the 20ish%. Shared is 22 forks per thousand stars. Free Domain is 17. When nobody is forking a 157,000-star repo, nobody is using it. The watcher-to-star ratio tells the same story. Free Domain's 0.001 means that for every thousand people who starred the repo, just one actually watches for updates. Free Domain is worth isolating, though. 107,000 stars, only 168 watchers, and 2,600 forks. That's a watcher-to-star ratio 26 times lower than Flask's. 81% of the sampled stargazers have zero followers. This is a repo where almost nobody who starred it has any visible presence on GitHub. Union Labs is the most consequential case.
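The fingerprint checks described here (account age, public repos, followers, bio presence) are simple enough to sketch as code. This is an illustrative sketch, not the article's actual tool: the class, function names, and exact thresholds are my assumptions based on the patterns described, and the GitHub API fetch is left out so the classification logic stays self-contained.

```python
from dataclasses import dataclass

@dataclass
class StargazerProfile:
    """Fields sampled per stargazer, mirroring the analysis above."""
    account_age_days: int
    public_repos: int
    followers: int
    has_bio: bool

def classify_stargazer(p: StargazerProfile) -> str:
    """Rough buckets matching the described patterns: ghost accounts
    have zero repos, zero followers, and no bio; aged-but-empty
    accounts pass naive young-account filters yet are still empty."""
    if p.public_repos == 0 and p.followers == 0 and not p.has_bio:
        return "ghost"
    if p.public_repos == 0 and p.followers == 0:
        return "suspicious"
    return "organic"

def ghost_rate(profiles: list[StargazerProfile]) -> float:
    """Share of sampled stargazers that are complete ghosts.
    Healthy projects sit around 1%; manipulated repos hit 25%+."""
    ghosts = sum(1 for p in profiles if classify_stargazer(p) == "ghost")
    return ghosts / len(profiles)
```

In a real run you would populate the profiles from the GitHub REST API (list a repo's stargazers, then fetch each user to read `public_repos`, `followers`, `bio`, and `created_at`), sampling a few hundred per repo as the article did.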
It was ranked number one in Runa Capital's ROSS index for Q2 2025, a widely cited VC industry report identifying the hottest open-source startups, with 54x star growth and 74,300 stars. Our analysis found 32.7% zero-repo accounts, 52% zero-follower accounts, and a fork-to-star ratio of 0.052. This analysis flagged it with 47% suspected fake stars. An influential investment sourcing report that VCs rely on was topped by a project where nearly half the stars are likely artificial. That is insane. And now we get to what we're all here for: open-source AI projects. Raga AI, the median account is only 400 days. OpenAI FM, it's 116 days. Langflow is a little better in the 3K range, as well as Hermes Agent. I would be very surprised if Hermes Agent was fake. So, we'll see where this goes. Zero public repos for the stargazers they checked: 38% on Raga AI, 38% on OpenAI FM, 11% on Langflow, and 10% on Hermes Agent. Zero followers are 70% here, 66% for OpenAI FM, only 20% for Langflow. Still way higher than it should be, and 32% for Hermes Agent. I am biased. I like these guys. I suspect that there's a lot of new people here that are touching that, but it's still under half what these projects are. If we compare to the numbers here, zero followers for these blockchain repos is more than double there. But projects we trust are in the 10 to 12% range. So, a jump all the way up to 30% is scary, but I'm not going to read too hard into that, especially when the number of forks is still reasonable and the percentage that they deem suspicious is lower than others. Still good to know, though. I definitely know more people who share the Hermes Agent growth than are actually using it. So, yeah. And then ghost accounts. These first two are brutal with ghost accounts, but Hermes Agent is a lot less bad at only 6% versus 36% and 28% there. And the fork-to-star ratio is still quite telling. OpenAI FM having a 2.8 is very interesting here.
I'm curious why that's forked so heavily. Raga AI and OpenAI FM show clear manipulation signals. Raga AI has a 76.2% zero-follower account ratio and 28% are ghosts. Nearly identical to the blockchain pattern. OpenAI FM is the most extreme case in our data set: 66% suspicious accounts, 36% ghosts, and a median account age of just 116 days. Two-thirds of its stargazers are less than a year old with virtually no GitHub activity. The Star Scout analysis notes that it's likely third party bots, not OpenAI itself. Yeah. Langflow, flagged by Star Scout at 48% fake, showed clean metrics in our profile samples with a median age of 3,000 days and low ghost rates. Likely reflects improved account quality since the Star Scout scan. The 0.06 fork-to-star ratio is still notably low, lower than a quarter of Flask's, which suggests less genuine adoption relative to star count. For comparison, Hermes Agent looks relatively organic. Median age of 8 years, 6% ghosts, and the fork-to-star ratio is much better, despite accusations of astroturfing. The stargazer population is mostly real devs. The project's crypto-adjacent audience includes more casual GitHub users, which explains slightly elevated zero-follower rates, but the fundamental engagement pattern is legitimate. Cool. I'm happy that my bias here feels correct. So, how do stars actually become dollars? Why do we care so much? The connection between GitHub star counts and startup funding is not speculative. It's explicitly documented by investors themselves. Jordan Segall, who's a partner at Redpoint Ventures, published an analysis of 80 dev tool companies, showing that the median GitHub star count at seed financing was 2,850 and at Series A was 4,980. So, if T3 Code's at 9K, I really got to do my Series A, huh? I should think about that. He confirmed that many VCs write internal scraping programs to identify fast growing GitHub projects for sourcing, and the most common metric they look towards is stars. Yep, the numbers set an implicit target.
For $85 to $285 in budget stars, a startup can manufacture the 2,850-star median they need for a seed. And for between $1,000 and $4,500, they can hit the Series A range. And when you consider that these rounds raise $1 million to $10 million, the return on investment is between 3,500x and 117,000x. Runa Capital publishes the ROSS index quarterly, which is the Runa open source startup index, ranking the 20 fastest growing open source startups by GitHub star growth. Per TechCrunch, 68% of the indexed startups attracting investment do so at seed stage, with $169 million raised across the tracked rounds. GitHub itself, through its GitHub Fund partnered with M12, which is Microsoft's VC arm, invests $10 million annually in 8 to 10 open source companies at pre-seed and seed stages, based partly on platform traction. I hate that $10 million annually in investments doesn't really register as a meaningful number to me anymore. I've been too YC VC pilled, but that's like 10 companies, 10 mil, 10 of them get 1 mil each. But I would hope GitHub puts a little more effort into the GitHub Fund than to just give money to the things with a lot of stars. Okay, I know I'm biased cuz I'm an investor, but Lovable does not [ __ ] belong here. Lovable is raising at these numbers because Lovable's revenue growth is [ __ ] crazy. Lovable did not raise based on their stars on GitHub. They raised based on their [ __ ] unbelievable revenue. As of February 2026, the public number is $400 million a year in revenue. So yeah, them raising $200 million for their Series A is not that crazy when they're doing $400 mil a year in rev. So no, they do not belong here. That is a [ __ ] example. Pangolin had a thousand stars in January. They got into Y Combinator and they raised a $4.7 million seed. Browser Use: 50,000 stars in three months, Y Combinator W25 batch, $17 mil seed. Yeah, wonder why they raised that much. Wonder why they raised that much.
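The return-on-investment math above can be checked directly. A back-of-the-envelope sketch using the figures quoted in the video ($0.03 to $0.10 per budget star, Redpoint's ~2,850-star seed median, and $1M to $10M rounds); the function name is mine:

```python
def fake_star_roi(target_stars: int, price_per_star: float,
                  round_size: float) -> float:
    """Dollars raised per dollar spent on purchased stars."""
    cost = target_stars * price_per_star
    return round_size / cost

# Hitting the ~2,850-star seed median with budget stars:
low_cost = 2850 * 0.03   # about $85.50
high_cost = 2850 * 0.10  # about $285.00

# Worst case: most expensive budget stars, smallest round.
worst = fake_star_roi(2850, 0.10, 1_000_000)   # roughly 3,500x
# Best case: cheapest stars, largest round.
best = fake_star_roi(2850, 0.03, 10_000_000)   # roughly 117,000x
```

These two extremes bracket the "3,500x to 117,000x" range cited in the article, which is why the economics make this fraud so attractive.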
These are not star-to-funding pipelines in the way they are trying to indicate here. This is making me less worried about this problem because none of these are good examples. Browser Use was the hottest thing at the time. I had so many people asking me my thoughts on them. I did end up writing a check. I gave them a stipulation though. I told them, if they hire more than three people in their first full year, because it was 2025 winter, so that would have been about a year ago, I told them, I think they had three people at the time, I said, "You better end the year with less than six. Part of my deal giving you money is that you don't overspend by overhiring." And to their credit, they did. They kept the team very, very small. Still haven't used it personally, but yeah, they raised this much because the demand to get into this company that was going to potentially figure out how agents use browsers was really high. The star count was just one small thumbs up, like a checkbox on the sheet of reasons to invest. The hunger to get into Browser Use's round was insane. It's crazy that I am in the space enough, like I'm an investor in Lovable and Browser Use and I passed on Pangolin and LangChain. I have talked to all of these companies. I've been involved in these rounds. Never were we talking about [ __ ] GitHub stars during those conversations. So, I hate all of these examples. Dagster's Fraser Marlow, I believe, who led the fake star investigation, admitted directly himself, "In the run-up to the fundraising, I spent a fair amount of time preoccupied with GitHub stars. It is a very easy thing to worry about, especially when you're pre-revenue. You need something to show that you are growing fast and you'll take anything at that point." An academic paper in Organization Science provided rigorous statistical evidence that GitHub engagement correlates with startup funding outcomes.
Startups active on GitHub are 15 percentage points more likely to have raised a financing round. Yeah, probably. The incentive loop is self-reinforcing. VCs use stars as sourcing signals, so startups manipulate stars, so VCs see inflated traction, so VCs adopt star tracking, so more startups manipulate. Redpoint's own published benchmarks give startups an exact target to buy towards. The fork-to-star ratio is a simple detection heuristic. This is fun because I use the fork-to-star ratio for very different things. But apparently you can use it to measure how likely a thing is to be manipulated. Shared and Free Domain are at a 0.02 fork-to-star ratio, whereas things like Flask and LangChain are at 0.16, almost 20%. Huge difference. Any repo with a fork-to-star ratio below 0.05 and more than 10,000 stars warrants scrutiny. The watcher-to-star ratio is even more telling. Organic projects average 0.005 to 0.03. Free Domain is at 0.001. The ratios aren't perfect. Educational repos and curated lists naturally have low fork rates, but as a first-pass filter, they catch the most egregious cases that raw star counts miss entirely. There are lots of other things that can be faked as well. npm downloads, for example, are trivially inflatable. There's a lot of good examples of this. In particular, if you look at something like Svelte and view all-time downloads, you'll see these huge spikes where Svelte suddenly went from 370,000 installs a day or a week to 28 million. This is somebody [ __ ] around. Sorry for the flashbang. These numbers are very manipulatable. A dev named Andy Richardson demonstrated this by using a single Lambda function on the free tier to push his package introspection query to nearly a million downloads a week, surpassing legitimate packages like urql and MobX, with zero actual users. The CMU study found that of the repos with fake star campaigns, only 1.2% appeared in package registries, but of the 738 packages, 70% had zero dependent projects.
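The first-pass filter described here is simple enough to state as code. A minimal sketch: the function name is mine, and the thresholds are taken from the numbers quoted above (fork-to-star below 0.05 on a 10k+ star repo warrants scrutiny; organic watcher-to-star ratios run roughly 0.005 to 0.03). The specific repo numbers in the usage lines are illustrative profiles matching the Flask and Free Domain figures discussed.

```python
def suspicious_star_pattern(stars: int, forks: int, watchers: int) -> list[str]:
    """First-pass flags for manipulated popularity. Not proof of fraud:
    educational repos and curated lists fork rarely too, so flags only
    mean the repo deserves a closer look at its stargazers."""
    flags = []
    if stars > 10_000 and forks / stars < 0.05:
        flags.append("low fork-to-star ratio")
    if stars > 10_000 and watchers / stars < 0.005:
        flags.append("low watcher-to-star ratio")
    return flags

# Flask-like profile: ~235 forks and ~20 watchers per thousand stars.
print(suspicious_star_pattern(71_000, 16_700, 1_400))   # no flags
# Free-Domain-like profile: 107k stars, 2,600 forks, 168 watchers.
print(suspicious_star_pattern(107_000, 2_600, 168))     # both flags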
Yeah, VS Code marketplace extensions are similarly vulnerable. Researchers demonstrated that a thousand-plus installs of a fake extension could be done in 48 hours. Aqua found 1,283 extensions with known malicious dependencies totaling 229 million installs. X and Twitter promotion amplifies artificial GitHub virality through engagement pods: private groups where members agree to like, repost, and comment on each other's content. This doesn't actually work great. NBC News and Clemson University researchers identified a network of 686 X accounts that posted more than 130,000 times using LLM-generated content, some containing telltale artifacts like "dolphin here" from the uncensored Dolphin model that they employed. I noticed a few days ago this specific message appearing in a lot of replies I was getting. "None of the end just things any more abstract. Then suddenly Claudex by Blackbox AI helps keep it usable." "Anyways, I'm okay with the Claudex mode by Blackbox AI because I can't support something which is against the masses." "Blackbox Claudex users are a different breed in this data set." "What's with the Blackbox AI slop promotion going on everywhere in the comments of every big account lately?" "Blackbox would have at least let Claudex split the load. That's why I shifted to Claudex of Blackbox." "Blackbox Claudex uses Codex to verify." The sheer volume of these I have been getting, and it started like at the end of March and just got a shitload of these back to back to back. I was actually in talks about possibly having Blackbox as a sponsor and they were down to spend absurd amounts of money, which made me sus immediately. Then I went and tried the product and didn't love it, so I kind of ghosted them, and then all of this spam started to happen in my replies and I decided I'm thankful I didn't work with them. Yeah, I've seen this a lot. Oh, and Higgsfield, another sponsor I declined because they offered too much money and were sketchy.
Apparently, Higgsfield is also known for not paying out. I turned down life-changing money from Higgsfield because they sketched me out. I hope you guys know that I hold a way higher bar for sponsors than anyone reasonably should. I have literally thrown away millions of dollars by refusing brands like these, because I don't work with them if I don't actually think the product is useful and usable for my audience. And Higgsfield sketched me out. So I didn't do it, and I passed up an amount of money that is significantly more than I would have made in any two years at Twitch. Like, insane. I'll say it: it's a seven-digit number. And I turned that down because I don't [ __ ] sell scams to my audience. So talk all the [ __ ] you want about me and my sponsors. Anybody who took a Higgsfield deal or a Blackbox deal is fundamentally less trustworthy than me, and is doing it for way less money, because they have way less reach. So, yeah. Anybody who gives me [ __ ] for taking sponsors and being biased should know that I have thrown away more money than you will see in your [ __ ] lifetime in order to keep my set of sponsors companies that I actually respect. Isn't Higgsfield legit, though? Want to see the history of them being absolute [ __ ] scams? They purged accounts for a shitload of paying users. They paid people to advertise for them and specifically told them not to disclose that they were paid to do it. They got banned from Twitter for aggressively violating its terms of service. They are an absolute [ __ ] scam. That case documents cross-platform astroturfing at industrial scale: over 100 confirmed spam posts across 60-plus subreddits, combined with mass template DMs to content creators offering payment for promotion. Yep, I'm one of many who got these. And then we have the legal exposure that nobody talks about.
The FTC consumer review rule, effective October 21st, 2024, explicitly prohibits selling or buying fake indicators of social media influence generated by bots or fake accounts for commercial purposes, with penalties of up to $53,000 per violation. The FTC issued its first warning letters to 10 companies in December. A GitHub star purchase to promote a commercial product fits within this framework. The SEC precedent is more direct: HeadSpin's CEO was charged with wire fraud, which carries a maximum of 20 years in prison, and securities fraud, for inflating metrics to deceive investors out of $80 million. Yeah, you should be in jail for that. If you fake numbers so that you can scam investors out of money, that should be a federal crime. Absolutely. A compliance startup's founder faced charges for claiming $250,000 of monthly revenue when the actual revenue was only $250. Jesus [ __ ]. The SEC's message is clear: startup fundraisers cannot use the fake-it-till-you-make-it ethos to whitewash lying to investors. Yep. If a startup buys fake GitHub stars to inflate perceived traction during a fundraising round, and investors rely on those metrics to deploy capital, the wire fraud framework applies: using electronic communications to misrepresent material facts for financial gain. No one has been charged specifically for fake GitHub stars yet, but given the CMU research documenting the practice at scale and the FTC rule explicitly covering fake social influence metrics, it may only be a matter of time. I am excited for the first SEC and FTC investigations into fake GitHub stars. I would also encourage them to take some time looking at YouTubers who don't disclose advertising properly.
As one of the few people who does, and one of the people who gets more [ __ ] than any other tech YouTuber because I am better at disclosing than every other tech YouTuber, it is very frustrating that people seem to think I am this super biased person who is paid for all their [ __ ] opinions, when I am actually following the law and being transparent. So yeah, send the FTC after all of the YouTubers, myself included. I will come out clean, and half the other channels will get shut [ __ ] down because they're all lying. Just to emphasize this point, our space is not free of this problem. This is a post I made earlier this year, in January, because I noticed a significant amount of fake viewership on other tech YouTube channels. I won't out these specific creators because I'm not that guy, but you can find them pretty easily. Here's a handful of channels where they got 600,000-plus views and under 2,000 likes. Here's one with 1.4 million views, 400 likes, and 10 comments. This is fake viewership. This is a tech creator in our space just lying. Here's one with 4 million views, under 1,000 likes, and 36 comments. That is a 0.025% engagement rate. Not 2.5%, 0.025%. For comparison, on a video of mine with 70K views, I get 2,000 likes and 273 comments; that's roughly a 3.2% engagement ratio. 54K views, 1.7K likes, 440 comments: that's a 4% engagement rate. 93K views, 3.1K likes, 296 comments: 3.6% engagement. Literally 10 to 100 times higher, mostly 100 times higher, engagement rates. These creators are just lying. And then they go and show these numbers to brands and sell sponsorships to them. So if you're a brand interested in doing YouTube sponsorships and you're not already talking with me about it, you are just waiting to get scammed, because most of the people doing this right now are straight up [ __ ] lying about their viewership. There are plenty who aren't.
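The engagement arithmetic here is easy to reproduce. A quick sketch, assuming engagement is computed as (likes + comments) / views, which is the formula that best fits the ratios quoted in the video:

```python
def engagement_rate(views: int, likes: int, comments: int) -> float:
    """Engagement as (likes + comments) / views, expressed as a percent.

    The formula is an assumption about how the quoted rates were
    computed, not an official YouTube metric.
    """
    return 100 * (likes + comments) / views

# Numbers quoted in the video.
print(round(engagement_rate(4_000_000, 1_000, 36), 3))  # → 0.026 (the ~0.025% case)
print(round(engagement_rate(70_000, 2_000, 273), 1))    # → 3.2
print(round(engagement_rate(54_000, 1_700, 440), 1))    # → 4.0
```

The gap between the two groups is two orders of magnitude, which is why a single division is enough to spot the purchased views.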
I know for a fact that somebody like Primeagen is not doing this [ __ ], and you can go look at his numbers to see it. But please look at the numbers, or talk to somebody who knows what they're doing, before making these purchases, because I've seen people turn down working with me because I only get a measly 100 to 200,000 views and they could spend 2x more money and get a million views. It's just [ __ ]. Buying the optics of success so that you can sell them to other people is one of the most common scams in the world, and it happens in my space too. I would gladly support, and even be a key witness for, any company that sponsored these YouTubers and feels like it got scammed. If you want an expert witness for your case, feel free; I am more than happy to do it. I will burn all these [ __ ] bridges. I will not be the one to publicly announce that these YouTubers are scammers, but if you were scammed and you choose to pursue action, I will gladly support you in that effort. And if you or a friend happens to work at the FTC and you would like the names of these channels so that you can investigate them yourselves, let me know; I'm happy to provide all of my resources. And to all of the other tech YouTubers who are either watching this video directly or got sent a clip because you should be scared: you should be [ __ ] scared. You're scammers, and you're [ __ ] up my ability to land brands because they expect these inflated [ __ ] numbers. I'm not going to end you; you're lucky that I'm keeping your names blurred for now. You better cut the [ __ ] and DM me an apology if you don't want to be outed in the future. Stop doing this [ __ ]. You're [ __ ] it over for everyone. You're all on watch now. You're on notice. Be careful. Back to the GitHub problem that we're all here for, because now we're at GitHub's response. This will be fun.
GitHub's acceptable use policies explicitly prohibit inauthentic interactions like fake accounts and automated inauthentic activity, rank abuse such as automated starring or following, and the creation of, or participation in, secondary markets for the proliferation of inauthentic activity. These policies even specifically prohibit starring incentivized by cryptocurrency airdrops, tokens, credits, gifts, or other giveaways. Enforcement is reactive and asymmetric. GitHub removed 90% of the repositories flagged by StarScout while leaving 57% of the flagged accounts standing. The infrastructure for future campaigns largely remains intact. When Dagster published its investigation, fake star profiles were deleted within 48 hours, but only after public embarrassment, not proactive detection. Once again, if I have to go the public embarrassment route, I will. GitHub has never published an engineering blog post about its detection methods or enforcement statistics. No transparency reports exist for star manipulation. A VP of security operations told Wired only that they disabled user accounts in accordance with GitHub's acceptable use policies and declined to elaborate, though the comment was specifically about the Stargazers Ghost Network malware operation, not vanity metric manipulation. The CMU researchers recommend GitHub adopt a weighted popularity metric based on network centrality rather than raw star counts, a change that would structurally undermine the fake star economy. GitHub has not implemented it yet. Agreed. If the number were weighted by how many days old the starring accounts are, it would be much more powerful. So, what should VCs use instead of stars? Stars are a vanity metric; instead, they should track unique monthly contributor activity: anyone who creates an issue, comment, PR, or commit. Fewer than 5% of the top 10,000 projects ever exceed 250 monthly contributors, and only 2% sustain that across 6 months.
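That account-age idea can be sketched as a weighted star count. This is a minimal sketch, not the CMU proposal itself: the one-year full-weight cap is my own assumed parameter, and in practice the account creation dates would come from the GitHub API (the stargazers list plus each user's profile); here they're hard-coded for illustration.

```python
from datetime import date

def age_weighted_stars(stargazer_created: list[date], today: date,
                       full_weight_days: int = 365) -> float:
    """Weight each star by its account's age.

    A brand-new throwaway account contributes almost nothing; an account
    older than `full_weight_days` counts as a full star. The 365-day cap
    is an assumption for this sketch, not anything GitHub publishes.
    """
    score = 0.0
    for created in stargazer_created:
        age_days = (today - created).days
        score += min(max(age_days, 0) / full_weight_days, 1.0)
    return score

# Illustrative: three established accounts plus two day-old throwaways.
today = date(2026, 4, 25)
accounts = [date(2018, 1, 1), date(2020, 6, 1), date(2023, 3, 3),
            date(2026, 4, 24), date(2026, 4, 24)]
print(age_weighted_stars(accounts, today))  # ≈ 3.0 instead of a raw count of 5
```

A repo boosted by thousands of freshly registered accounts barely moves under this weighting, which is exactly why it would undermine the fake star economy.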
Oh boy, we have a lot more than that of people trying to contribute to T3 code right now. Jono Bacon recommends five metrics that correlate with real adoption: package downloads, issue quality, contributor retention, community discussion depth, and usage telemetry. Cool, we have really good numbers on all of that. The fork-to-star ratio in the analysis is the simplified first-pass filter: a healthy project is roughly 100 to 200 forks per thousand stars, and projects below 50 forks per thousand with high absolute counts deserve a closer look. As one commenter put it, you can fake a star count, but you can't fake a bug fix that saves someone's weekend. Yep. And there's a fundamental structural problem here: three dynamics make this self-reinforcing. First, the incentive loop. VCs use stars as a sourcing signal, startups manipulate the stars, VCs see inflated traction, more VCs adopt star tracking, more startups manipulate. Redpoint published benchmarks with these numbers, giving startups a price list for how many stars to buy. Second, the AI sector's specific vulnerability: a combination of extreme hype, crypto-adjacent funding models that reward token price over product quality, and a reviewer ecosystem on X that's populated partly by fabricated personas, creating a perfect environment for manufactured credibility. Yep, the analysis confirmed this: the repos with the worst manipulation signals were overwhelmingly blockchain- and crypto-adjacent AI projects. And then we have GitHub's enforcement asymmetry. Removing repos while leaving 57% of fake accounts intact preserves the labor force of the fake star economy while doing little to deter repeat offenders. Until GitHub makes significant structural changes, like weighted popularity metrics, account-level reputation scoring, or transparent enforcement reporting, the gap between star counts and genuine developer adoption will continue to widen.
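The unique monthly contributor metric is straightforward to compute from activity events. A sketch, assuming each event is recorded as an (author, month, kind) tuple where kind is one of the activity types named earlier (issue, comment, PR, or commit); the event data here is illustrative.

```python
from collections import defaultdict

def unique_monthly_contributors(events):
    """Map each "YYYY-MM" month to its count of distinct active authors.

    An author counts once per month no matter how many issues, comments,
    PRs, or commits they produce, so the number is hard to inflate with
    a single busy sockpuppet.
    """
    by_month = defaultdict(set)
    for author, month, _kind in events:
        by_month[month].add(author)
    return {month: len(authors) for month, authors in sorted(by_month.items())}

# Illustrative events, not real repo data.
events = [
    ("alice", "2026-03", "pr"), ("alice", "2026-03", "commit"),
    ("bob",   "2026-03", "issue"), ("carol", "2026-04", "comment"),
    ("bob",   "2026-04", "pr"),
]
print(unique_monthly_contributors(events))  # → {'2026-03': 2, '2026-04': 2}
```

Faking this metric requires sustained, plausible activity from many distinct accounts month after month, which is far more expensive than buying a batch of stars once.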
The fake star economy is a $50 problem with a $50 million consequence, and until platforms, investors, and regulators catch up, the market will keep paying the $50. Yep. Thank you to Awesome Agents for this awesome writeup. This is a very, very good writeup of a very chaotic situation; they sourced all of the right things and made something useful. I'm throwing them a follow, since it's clear their account isn't fake, and I would recommend you do as well. And if you still trust star counts, I hope you know better. Now, I need to make sure the world knows that my numbers aren't fake. So, if you can, hit that like button, and the sub button if you haven't yet. And until next time, peace, nerds.
