How to Build a Self-Improving Company with AI
Chapters
Discusses organizing companies like Roman legions with hierarchical, centralized command to project power, using the idea that many firms operate as nested hierarchies and that information flows through human conduits in the hierarchy.
Reimagine your company as a self-improving AI loop, not a hierarchical org, and learn how to turn every data source into an automated, token-driven, performance-boosting engine.
Summary
Y Combinator's talk, drawing on ideas from Diana and Jack Dorsey, argues that traditional top-down hierarchies can be rendered obsolete by AI-native organizations. The speaker suggests extracting and codifying a company’s domain knowledge (emails, Slack messages, Notion data) and building recursive AI loops that improve themselves with minimal human intervention. The example of an agent that can query databases, suggest introductions, and, through a monitoring agent, improve its own tooling shows how a system can learn from each failed query and deploy fixes overnight. The core architecture comprises a sensor layer (external data), a policy layer (rules and approvals), a tool layer (deterministic APIs and Gary’s code), a quality gate, and a learning loop that continuously refines the system. Recording everything, diarizing conversations to create usable intelligence, and treating software as disposable while preserving the “company brain” are key practical takeaways. The path forward emphasizes token burn over headcount, eliminating middle management in favor of IC-driven execution, and keeping humans on the edge for high-stakes decisions and real-world touchpoints. By making the entire organization legible to AI, YC founders can push toward truly self-improving companies and build on the roughly 5x revenue-per-employee gains the speaker reports seeing at demo day.
Key Takeaways
- Token usage becomes the primary efficiency metric; companies are likely to measure token burn per employee, with revenue per headcount rising as a result.
- Middle management becomes unnecessary for coordination when AI-native loops handle self-improvement and task execution.
- A self-improving company requires legibility: record everything (emails, Slack, recordings) and diarize the data to create usable AI breadcrumbs.
- An agent-native workflow can surpass traditional productivity gains by autonomously refining tools, databases, and indexes based on real-world queries.
- The company brain is the collection of data, skills, and knowhow; software is ephemeral, regenerated as the models improve while humans stay on the front lines for high-stakes interactions.
- Two essential human roles emerge: every team member becomes an IC (builder/operator), and a single named human remains responsible for decisive actions.
- Concrete examples include converting product analytics into an autonomous A/B-testing loop and triaging customer feedback with AI-driven prioritization and deployment.
Who Is This For?
Founders and engineers building AI-native startups or teams converting existing orgs into self-improving AI systems; ideal for those aiming to maximize productivity per employee and leverage token-based optimization.
Notable Quotes
"This AI loop is really about a set of recursive self-improving AI loops that run every single step without human intervention."
—High-level description of the self-improving architecture.
"If you can identify parts of your company that work like this and eliminate human supervisory capacity, you can just throw tokens at this problem and your company will get better."
—Core thesis on token-driven improvement.
"Everything needs to be recorded so that it can be legible to the AI. If it was not recorded, it did not happen to your intelligence."
—Data legibility requirement to enable AI understanding.
"Middle management is done. I just don’t think you need middle management for this coordination problem."
—Claim about organizational structure changes.
"You can write a new user manual every month and have it be self-improving, using recordings of office hours to update guidance."
—Practical example of keeping organizational knowledge current.
Questions This Video Answers
- How can I start turning my company into a self-improving AI loop today?
- What is diarization and why is it essential for AI-enabled organizational knowledge?
- Can token burn really replace headcount, and what metrics matter most for early-stage AI teams?
- How do you implement an AI agent that autonomously improves tools and databases in a real company?
- What are the best practices for recording and structuring data to build an AI-native org?
AI-native organization, self-improving AI loop, token economics, diarization, AGI-enabled dashboards, Gary’s code and deterministic tools, recording and data governance, middle management elimination, IC-focused org design, YC partner user manual evolution
Full Transcript
This is based a little bit off a talk Diana gave. There's a video up over the weekend which is super cool. Um Jack Dorsey was tweeting some stuff like two or three weeks ago that I thought was super cool and I've kind of um stolen a bunch of those ideas and shoved them into here. This talk is like pretty conceptual and high level about thinking about how to build companies. So the Roman legions were designed to project power over two continents or something from Rome at the center to like these people on Hadrian's Wall up in Scotland.
And the idea was um these nested hierarchies with consistent spans of control and you had like named individuals with spans of control to pass orders down and send information back up the hierarchy. And if you think about most companies today, they are organized like a Roman legion where human beings are the conduit for information flowing up and down. And so Jack Dorsey's tweet, which I thought was great, was it's like this underlying assumption that hierarchically organized companies are the way that we should be organizing like our economic units of value. And I think AI basically breaks that.
If you talk to people a year ago about how AI was useful, they talked about productivity, like co-pilots, making engineers 20% more productive, adding co-pilots to workflows, shipping more software. But I think that is actually a broken way of thinking about AI. That's like Pete had a great blog post. We're basically just like taking the old way of working and adding like a more powerful engine onto it. And instead of that, I think you can reimagine like what a company is and how it acts. And so as Gary's talking like he I genuinely believe can produce more code than an entire engineering team.
The thing that's really stuck with me is this idea of like extracting the domain knowledge from your company and defining it as like context or a set of skills or whatever you want to call it. But like this idea that there's domain knowledge or business knowledge or like some knowhow that's inside the heads of people and in Slack messages and in emails and in Notion. All of this like information together defines how your company works. And if you can make that legible, you suddenly can move from this hierarchical organization to a sort of intelligent AI-powered organization with AI-native software.
AI isn't, it's not something you bolt onto the side of a company. It's not like a tool you give to your engineers to make them more productive. But I think you can reimagine what a company is as a set of recursive self-improving AI loops. I think this is really, really, really important because when it gets there, I think the company starts to self-improve even when you're sleeping. So, let me give you an example. Diana talks about this as well, this AI loop. You start with like a sensor layer, which is like that's a fancy word, but really it might be like emails from your customers.
Might be support tickets, code changes, people canceling their subscription, product telemetry. It's like sensor data to get information from the outside world. And then a policy layer, decision layer, like rules about what you can do, what it has to ask a human permission for, what it must log. A tool layer, that's kind of Gary's skills and code. Like the tool layer is Gary's code. It's basically deterministic APIs, things like query my database or look at my calendar. Um, a set of tools that the AI can call. A quality gate, like that might be evalistic checks, safety filters, human review for high-risk stuff.
And then a learning mechanism. It's like your system interacts with the real world, picks up where it doesn't work, and loops back into the top again. And if you can run every single step of that without human intervention, or with minimal human intervention, your system gets better and better and better while you're sleeping. And I can give you actual examples of this that are live right now. We started with an agent that you can ask and it has deterministic tools to query our database. Pretty simple, like when did I last have office hours with this company?
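As a rough illustration of the loop just described (sensor, policy, tool, quality gate, learning), here is a minimal Python sketch. Every name in it (`Event`, `Loop`, `high_risk`, the tool registry keyed by event source) is an assumption made for illustration and is not taken from the talk; the quality gate is a trivial stand-in for the checks, filters, and human review mentioned.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Event:
    """Sensor layer: one piece of information from the outside world."""
    source: str          # e.g. "support_ticket", "email" (hypothetical labels)
    text: str
    high_risk: bool = False


@dataclass
class Loop:
    # Tool layer: deterministic functions, keyed by the event source they handle.
    tools: dict[str, Callable[[Event], str]]
    lessons: list[str] = field(default_factory=list)    # learning mechanism
    audit_log: list[str] = field(default_factory=list)  # what the policy must log

    def policy(self, event: Event) -> str:
        """Policy layer: log every event, escalate high-risk ones to a human."""
        self.audit_log.append(f"{event.source}: {event.text}")
        return "ask_human" if event.high_risk else "act"

    def quality_gate(self, result: str) -> bool:
        """Quality gate: placeholder check that the tool produced something."""
        return bool(result.strip())

    def run(self, event: Event) -> str:
        if self.policy(event) == "ask_human":
            return "escalated"
        tool = self.tools.get(event.source)
        result = tool(event) if tool else ""
        if self.quality_gate(result):
            return result
        # Learning mechanism: record the failure so a later pass can fix it.
        self.lessons.append(f"no usable result for {event.source!r}")
        return "failed"
```

Calling `run` with an event whose source has no registered tool returns `"failed"` and appends a lesson, which is the raw material a later improvement pass would work from.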
Then it got a little bit smarter, which was like for this company I'm doing office hours with right now, they need introductions for anyone in petrochemicals or something, and it could query the database in different ways and use RAG and all sorts of stuff to like come up with five relevant founders for you to meet. But again, this is a sidekick, right? This is an agent. This is last year's version of how AI is making me better as a group partner. It's making me 20 or 30% more effective.
The aha moment for me came when we put a monitoring agent on top of that, which looked at every single query every single YC employee was doing and saw when it worked and when it did not work. And when it did not work, it's like, oh, why not? What would have made this query work? Do we need different deterministic tools? Do we need to update the skills file? Do we need a different database view? Do we need a new index? And this literally happens overnight now: let's write the code, put in a merge request to the YC codebase, have an agent review it and merge it and deploy it.
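One way to picture the monitoring step is as a pass over a query log that groups failures by the kind of fix they point to. The sketch below is only that: the `QueryLog` record, the `error` labels, and the `FIX_FOR_ERROR` mapping are all hypothetical, and a simple lookup table stands in for the agent's judgment about why a query failed.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class QueryLog:
    user: str
    query: str
    succeeded: bool
    error: str = ""  # hypothetical failure label attached by the monitor


# Assumed mapping from failure label to the fix types named in the talk.
FIX_FOR_ERROR = {
    "no_tool": "add deterministic tool",
    "missing_context": "update skills file",
    "missing_view": "add database view",
    "slow_query": "add index",
}


def propose_fixes(logs: list[QueryLog]) -> list[tuple[str, int]]:
    """Count failed queries per proposed fix, most frequent first."""
    counts = Counter(
        FIX_FOR_ERROR.get(log.error, "needs human triage")
        for log in logs
        if not log.succeeded
    )
    return counts.most_common()
```

The ranked output would then feed whatever writes the merge request; failures with an unknown label fall through to a human rather than being guessed at.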
So when a human comes the next day to ask the same query, it will now succeed. For me, that was like the holy [ __ ] [ __ ] right? That's not just AI making you 20 or 30% more valuable. It is the AI going through this loop to figure out how to self-improve. And I think basically if you can identify parts of your company that work like this and eliminate as have the human and kind of a monitoring of supervisory capacity, you can just throw tokens at this problem and your company will get better. And so other examples might be if you have product analytics, having an agent go through your product analytics to figure out what part of your sales funnel is presenting the highest amount of friction, researching best practices, putting in place an A/B test, running it for a week, picking the best version, and deploying it.
Then doing that again and again and again for your product. Just have a self-optimizing like product loop. Or you do it with customer service queries. You have customer suggestions coming in and in and in. You triage it with a kind of, you have to have an agent which is like your chief product officer and your chief technology officer who make kind of judgment calls about, okay, this is a suggestion we just don't want to do, we'll discard it. But no, this is a suggestion which is now in line with our road map. Um, we can do it overnight. Let's write the code, let's deploy it, let's ship it to the customer without a human being involved. So I think if you can think about each part of your company as a self-improving like recursive AI loop, it becomes very, very different to this like hierarchically organized Roman legion of a company. So what? So like if you want to do this, what are the implications?
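Two steps of the product-analytics loop above are simple enough to sketch: finding the funnel step with the most friction and picking the better variant after a test. The function names and data shapes below are assumptions for illustration, and the winner is chosen by raw conversion rate with no significance test, which a real experiment would need.

```python
def worst_funnel_step(funnel: list[tuple[str, int]]) -> str:
    """Return the step with the largest relative drop-off from the step before it.

    `funnel` is an ordered list of (step_name, user_count) pairs.
    """
    worst_name, worst_drop = "", -1.0
    for (_, prev_users), (name, users) in zip(funnel, funnel[1:]):
        drop = 1 - users / prev_users if prev_users else 0.0
        if drop > worst_drop:
            worst_name, worst_drop = name, drop
    return worst_name


def pick_winner(results: dict[str, tuple[int, int]]) -> str:
    """Pick the variant with the highest conversion rate.

    `results` maps variant name to (conversions, visitors).
    """
    def rate(variant: str) -> float:
        conversions, visitors = results[variant]
        return conversions / visitors if visitors else 0.0

    return max(results, key=rate)
```

Running these two functions on a schedule, with an agent proposing the variants in between, is roughly the "again and again" loop the talk has in mind.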
One is like burn tokens, not headcount. We are seeing companies get to demo day with about 5x more revenue per employee than they did 18 months ago. And I think that's going to continue to series A and series B. And so I think you're going to be constrained on token usage, not on headcount really, really soon. The blunt measure now is just like measuring everyone's token usage, which is obviously like dumb and gameable at the extreme, but directionally I think is correct. We're in the phase of like what is possible right now and so everyone should be experimenting to the max to figure out what we can even do with this crazy new intelligence we have.
As soon as you turn it into a leaderboard and people get promoted or fired based on it, obviously it gets gamed, obviously that's dumb. But I think directionally figuring out who in your organization is token maxing, who is not is like a good way to think about which employees you should be spending your time with. I think middle management is done. I just don't think you need middle management for this coordination problem. I think AI should be doing it. And for me, there are two roles. Jack Dorsey has three. I actually don't like the third one, so I deleted it.
But there are two roles that really, really matter for me. I think everyone just has to be an IC now, a builder, an operator. And I think crucially having directly responsible individuals. To get anything done, I think you need a named human, not a committee, not a group of people, just a single person. And I think you can build companies based on ICs effectively. I think just middle management is over. So building this self-improving company, that's a dream. And by the way, I think like people are at the bleeding edge of this right now. I'd be interested to see where you all are, but it feels like people are like exploring the boundaries here. I'm not sure anyone has a truly self-improving company in every function.
I might be wrong. You might prove me wrong. What would I do? First of all, this is really, really important. I would make the entire organization legible to AI. What does that mean? It means you've got to record everything. Simplistically, all of our um partner emails. Now, if you email a YC partner, that email is in the YC database. Every Slack message, every DM, every office hour we've started recording for the last three or four months. Every single thing that happens, if it is recorded, it happened to the AI. If it did not get recorded, it did not happen to your intelligence.
You know what I mean? And so, I was talking with some founders over here um just now and we're having like really good conversations about their company, but every conversation I had, I was like, "Fuck, I need to be recording this conversation." Because some guy wanted an introduction to, I can't even remember who the introduction was now. Who was that? I was talking to someone about, and I promised you an introduction. Said yes. And I said, "Email me afterwards cuz I'm going to forget this. I'm going to talk to 20 people." Yeah. So, it needs to be on my phone or a clip or smart glasses or we deck out every room with like microphones.
But basically, everything needs to be recorded so that it can be legible to the AI. And then, as Gary talked about like diarization, you cannot pump in 100,000 hours worth of recordings into a context window. So, you have to diarize it. You have to basically aggregate it down, synthesize it into the important parts, and then give the AI breadcrumbs. It's like, okay, so here's an example. Who's read the user manual? The YC user manual. Hopefully, everyone in this room has at least opened the user manual at one point in time, right? Like, it's fine. It was written 5 to 10 years ago, most of it.
It's kind of out of date. So, Harj thought uh last weekend, since now we've got about 2,000 hours of recorded office hours in the last 3 months, why don't we regenerate the user manual? And so you can click, like you give it a set of instructions. You basically diarize it down, like categorize it into certain areas like fundraising, hiring, co-founder disputes, whatever. And then write me a new user manual. And by the end of the weekend, he had a 150-page user manual, which is dramatically better than the existing user manual. And now we can also update it every single month.
So our user manual becomes self-improving. Every new piece of advice we give, it's compared with the existing user manual and either incorporated or thrown away. So the user manual becomes this up-to-date living brain of the advice we give to founders. And obviously it doesn't stop as a user manual. You then pump it in as context to an AI agent and suddenly you can ask a super intelligent AI and get the combined wisdom of 16 YC partners in one, but only if it's legible. So you have to record everything. The second point is kind of the same, right?
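The "compared with the existing user manual and either incorporated or thrown away" step can be sketched as a merge function. This is a minimal illustration under stated assumptions: the manual is modeled as a dict of topic to advice strings, `merge_advice` and its `threshold` are made-up names, and plain string similarity from Python's `difflib` stands in for whatever model-based comparison is actually used.

```python
from difflib import SequenceMatcher


def merge_advice(
    manual: dict[str, list[str]],
    topic: str,
    advice: str,
    threshold: float = 0.8,  # assumed cutoff for "already covered"
) -> bool:
    """Add `advice` under `topic` unless a near-duplicate is already there.

    Returns True if the advice was incorporated, False if it was discarded.
    """
    entries = manual.setdefault(topic, [])
    candidate = advice.strip().lower()
    for existing in entries:
        similarity = SequenceMatcher(None, existing.lower(), candidate).ratio()
        if similarity >= threshold:
            return False  # already in the manual, throw it away
    entries.append(advice.strip())
    return True
```

String similarity only catches near-verbatim repeats; judging whether two differently worded pieces of advice say the same thing is the part that would need a model.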
Like if it creates an artifact that can self-improve, it's legible. If it doesn't, you throw it away. The third point then is that every function can generate, this used to say dashboards. It's not just dashboards. It's on-demand software. Codex 55 is now good enough. You can one-shot most simple, like most internal software dashboards you can one-shot to a pretty high level of quality. I tried it over the weekend on a bunch of our stuff. It's just unreal. So all of your internal operations teams should be sitting on this layer of like kind of intelligence understanding and then creating their own dashboards and their own workflows.
And I would see those as entirely disposable. I would very preciously store all the data. So as Gary said, he puts all of his emails in markdown. Never throw anything away, but then treat the software as ephemeral. You can generate it, you can regenerate it. The valuable part is like the comprehension inside people's heads of like this is how the function works. This is how we run a YC event. Whatever the software to actually run the event, you can generate for the event. You can throw it away. The models get smarter in a month or two.
Throw the software away. Give it your original set of instructions and regenerate the software. So I think the business context and skills are the valuable part. I think the software on top of it is ephemeral. So what are humans for in this world? I think basically we're talking about a company brain and I know a bunch of people in this room are building this but the bit in the middle, like all of your data, all of your emails, your DMs, the skills, the knowhow, that is like the company brain, and I think the humans sit around the edge of this interfacing with the real world.
So it's where this intelligence makes contact with reality. Human beings reach into places the models can't go yet. That might be like a conference. It might be a I'm trying to think of examples. I would say a phone call, but I think the AI can reach into phone calls pretty easily now. Um I think it's like novel situations, ethical considerations, high stakes moments, you know, it's like it's where the founder comes to us and is like thinking about breaking up with their co-founder, right? It's like those real high stakes, high emotion moments where you really want a human being.
I think that's where the human fits. For all of you, like sales conversations, I think that's a human being in the room for the next 20 years. So the humans live I think around the edge, and I'm over time and cool vision should bullhorn me. I will leave you this one question. If you were building your company today, would you start it in this shape? For most of you, you're small enough to build it right, and so I don't think you have any excuse. And I know there are a few of you who are in the process of ripping up and rebuilding your company.
So with that I will stop um and we'll hand over to Pete. Thank you for listening.