Shopify CEO Reveals Their Secret AI Developer

Shopify uses River in public Slack channels, enabling engineers to see how tasks are scoped, context loaded, and corrections made. The bigger point is that most AI work happens privately, which hides how teams actually think and learn.

Shopify’s River shows that public AI workflows, not private ones, unlock real organizational learning and scalable performance.

Summary

Nate B. Jones highlights Tobi Lütke’s Shopify experiment with River, the company’s public, Slack-based AI coding agent. In one 30-day stretch this spring, 5,938 employees used River across more than 4,400 Slack channels, and in a single week River opened 1,800 pull requests in the main monorepo. But the real story isn’t the usage stats; it’s the design choice to keep River’s conversations in public channels so the entire team can learn from each step. The key problem is visibility: private AI work hides valuable context, corrections, and reusable workflows, forcing each new hire to rediscover solutions from scratch. Jones argues that this apprenticeship gap widens as more thinking happens in private chats. He draws a parallel to manufacturing, citing John Deere and the effort to capture retiring workers’ tacit knowledge in machine learning systems, and stresses four parts of the work that should be visible: task, context, interaction, and review. Public, senior-led AI work becomes a powerful teaching tool, letting junior workers observe judgment and decision-making in action. Shopify’s approach includes a strict public-access rule (no DMs with River); Jones recommends declared Slack channels for reusable workflows and a push to turn single-use insights into organization-wide playbooks. He also suggests concrete metrics around learning and reuse, not just token counts, to measure real organizational AI progress. The overarching takeaway: deliberate constraints that enforce public collaboration can accelerate collective AI competence across a company.
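
The four visible parts named in the summary (task, context, interaction, review) could be captured as a small record when someone posts an example to a shared channel. The sketch below is only an illustration: the class name `PublicWorkflowRecord`, its fields, and the sample values are assumptions made here, not anything Shopify or Jones describes.

```python
from dataclasses import dataclass, field


@dataclass
class PublicWorkflowRecord:
    """One shared example of AI work, split into the four visible parts.

    Hypothetical structure for illustration; not taken from the video.
    """
    task: str     # what the person was trying to get done
    context: str  # what they gave the model, and what they left out
    interaction: list[str] = field(default_factory=list)  # prompts, pushback, redo requests
    review: str = ""  # what was accepted, rejected, verified, or rewritten, and why

    def missing_parts(self) -> list[str]:
        """Return the names of the parts that are still empty."""
        parts = {
            "task": self.task,
            "context": self.context,
            "interaction": self.interaction,
            "review": self.review,
        }
        return [name for name, value in parts.items() if not value]


record = PublicWorkflowRecord(
    task="Investigate a low-risk bug in a test suite",
    context="Failing test output and the relevant module; no customer data",
)
print(record.missing_parts())  # ['interaction', 'review']
```

A record that lists only the final answer would show up here as missing its interaction and review, which is the case the summary warns teaches the team almost nothing.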

Key Takeaways

  • River’s public-channel model makes tacit AI learnings visible, enabling thousands of engineers to reuse workflows rather than reinvent them.
  • Four-part visibility framework: task, context, interaction, and review—sharing all four drives shared taste and faster skill acquisition.
  • Senior leaders must demonstrate and narrate their AI-driven judgments in public channels to transfer expertise and close the apprenticeship gap.
  • Declared, public AI workspaces with channel-based rules (no DMs with River) encourage collaboration, reduce duplication, and scale learning across teams.
  • Measuring success shifts from raw AI usage to tangible learning metrics like the number of reusable workflows created, adopted, and retained as best practices.
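
The learning-and-reuse metrics in the last takeaway can be made concrete with a few counts. This is a minimal sketch under assumed data: the `SharedWorkflow` fields, the team names, and the "adoption rate" definition (workflows reused by at least one other team, divided by workflows created) are illustrative choices made here, not metrics defined in the video.

```python
from dataclasses import dataclass


@dataclass
class SharedWorkflow:
    """A workflow posted in a public AI channel (hypothetical fields)."""
    name: str
    author_team: str
    adopted_by_teams: set[str]  # teams that reused the workflow
    pinned: bool = False        # pinned because it changed how someone works
    retired: bool = False       # stale example that was retired


def learning_metrics(workflows: list[SharedWorkflow]) -> dict[str, float]:
    """Count creation, cross-team adoption, pins, and retirements."""
    created = len(workflows)
    # Only count adoption by a team other than the one that wrote the workflow.
    adopted = sum(1 for w in workflows if w.adopted_by_teams - {w.author_team})
    return {
        "created": created,
        "adopted_elsewhere": adopted,
        "pinned": sum(w.pinned for w in workflows),
        "retired": sum(w.retired for w in workflows),
        "adoption_rate": adopted / created if created else 0.0,
    }


workflows = [
    SharedWorkflow("call-prep brief", "sales", {"product", "finance"}),
    SharedWorkflow("roadmap critique", "product", {"sales"}, pinned=True),
    SharedWorkflow("read-only analysis", "finance", set()),
    SharedWorkflow("bug triage", "engineering", {"engineering"}, retired=True),
]
metrics = learning_metrics(workflows)
print(metrics["adopted_elsewhere"], metrics["adoption_rate"])  # 2 0.5
```

Duplicated effort that a public workflow prevented is harder to count this way, which matches the caveat in the video that the measure is worth attempting even though it is imprecise.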

Who Is This For?

Engineering leaders, AI product managers, and CTOs looking to scale AI adoption without sacrificing learning and collaboration. This is essential viewing for teams that want to turn individual AI experiments into repeatable, company-wide capabilities.

Notable Quotes

"River doesn't work in private. Every conversation an engineer has with River happens in a public Slack channel."
Illustrates the core design choice that makes learning visible rather than hidden.
"The workflow that worked yesterday gets rediscovered next week by the next person who builds the same thing from scratch because nobody told them it existed."
Demonstrates the apprenticeship gap caused by private AI work.
"The most valuable part of AI work is rarely the prompt. It's the surrounding habit."
Emphasizes process and culture over solo prompt engineering.
"Ask your senior people to run some nonsensitive work in public and equip them to do so. Make it easy the way River is easy at Shopify."
Practical path to transferring senior judgment into teachable public workflows.
"Constraints that are creative and careful shape incentives toward learning."
Key takeaway about designing the environment for collaboration and learning.
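
The constraint behind the first and last quotes, that the agent answers only in public channels, can be sketched as a small guard in front of an agent. Everything here is an assumption for illustration: the event is a plain dictionary whose `channel` and `channel_type` keys are loosely modelled on Slack message events (check the Slack documentation before relying on them), the channel IDs are made up, and nothing in the video describes how River actually enforces its rule.

```python
# Hypothetical allowlist of declared AI channels (IDs invented for this sketch).
DECLARED_CHANNELS = {"C_AI_WORKBENCH", "C_ENG_AGENTS"}


def should_agent_respond(event: dict) -> tuple[bool, str]:
    """Decide whether the agent may answer, and explain a refusal.

    Assumes `event` has a "channel" ID and a "channel_type" string,
    where "channel" means a public channel and anything else (direct
    messages, group DMs, private channels) is treated as not public.
    """
    if event.get("channel_type") != "channel":
        return False, "Agents only work in public channels. Please ask in a declared channel."
    if event.get("channel") not in DECLARED_CHANNELS:
        return False, "This channel is not a declared AI channel."
    return True, ""


print(should_agent_respond({"channel": "D123", "channel_type": "im"})[0])  # False
print(should_agent_respond({"channel": "C_AI_WORKBENCH", "channel_type": "channel"})[0])  # True
```

Returning a reason alongside the decision lets the agent point people to the right channel instead of silently ignoring them, which keeps the constraint from pushing work underground.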

Questions This Video Answers

  • How does Shopify use River to make AI work public across Slack channels?
  • What is the apprenticeship gap in AI and how can public workflows help close it?
  • What four parts should you share for effective public AI work: task, context, interaction, review?
  • What are practical channel rules to prevent sensitive data from leaking in AI-enabled teams?
  • How can senior leaders demonstrate AI-driven decision making in a public workflow?
Tags: Shopify, River AI assistant, AI in Slack, Public AI workflows, apprenticeship gap, visibility in AI, declarative Slack channels, Senior leadership in AI, measurement of AI learning

Full Transcript
So Shopify's internal coding agent is named River. In one 30-day stretch this spring, 5,938 Shopify employees used River across more than 4,400 Slack channels. In a single week, River opened 1,800 pull requests in Shopify's main monorepo for code. About one in every eight merged pull requests at Shopify comes from River today. Those are big numbers, and the numbers are what most people grabbed on to when Tobi Lütke, CEO of Shopify, posted about all of this earlier this month. Underneath those numbers, though, there's a design choice that is actually the story. River doesn't work in private. Every conversation an engineer has with River happens in a public Slack channel. Other engineers can scroll back through the thread. They can see how a senior engineer scoped the task, what context she loaded, where the agent got stuck, what she rejected, what she kept. That's the part that nobody is copying. Okay, let's get into the deeper meaning here. Most companies have a hidden AI problem and it has nothing to do with tooling. Your employees are using AI all day long. They're asking ChatGPT to rewrite emails. They're using Claude to reason through a tricky customer issue. They're running coding agents to inspect repos. They're getting Copilot to summarize 40-page docs in two minutes. They're quietly building small workflows that save them hours every single week. And almost all of it is happening in private: in private software, private windows. This is not a conversation about whether it's secure. It's about the fact that it's not shared. The good prompt disappears into one person's chat history. The clever correction stays inside one employee's browser tab. The workflow that worked yesterday gets rediscovered next week by the next person who builds the same thing from scratch because nobody told them it existed. That is a real thing. I have talked to Amazonians who will tell me that there are six, eight, 10 different vibe-coded tools inside the company for the same problem.
So individuals are getting smarter. The company is not. And that is the gap. Most companies have already bought the tools they need. So the problem at this point isn't necessarily tooling per se. It's visibility. I wrote about this same shift on the Substack last month about how the comprehension layer is what gets rewarded now, not the output layer. If you want the longer version of why all of this matters at the individual level, that's where you dig in. Now, we're going to keep moving at the team level because that's what this video is about today. Every quarter, your team rediscovers the same lessons. The workflow your best operator nailed last month is invisible to the new hire. They build it from scratch. For most of human history, the way we learned skilled work was by being near skilled workers. And I don't think that's changed. You watched how the senior person framed the problem, what they noticed, what they ignored. You picked up the bits that didn't show up in any training manual. You learned the craft from the process as much as from the finished product. Now, think about what happens when most of the actual thinking in the AI age happens in a private window. The junior employee never sees how the senior person instructs their agents. The new manager never watches an experienced operator verify an answer. The correction that made the workflow reusable stays invisible to everyone except the person who wrote it. Everyone is alone with their model, which means everyone has to rediscover the same lessons from scratch. And I'm calling this the apprenticeship gap, and it's getting wider every single quarter because more of your team's actual thinking is happening inside chat windows that nobody can see. 
One of the best examples I know of how hard it is to get implicit knowledge into digital systems, into the way we learn and understand, comes from the manufacturing era, and specifically from John Deere and from other similar companies that are used to building complex physical machine tooling. There is a whole generation of American workers who are used to working on complex tooling and manufacturing environments who are nearing retirement, and they're extraordinarily skilled, and they know things in their fingertips, literally, that they can't speak or express. It goes back to Polanyi's paradox. The work is more than we know. And there is an entire effort, and I know PMs who have done this, to grab their knowledge and figure out how to turn it into a machine learning algorithm before they retire. And there's no new people coming up in the manufacturing industry with the same knowledge in their fingertips because there's just fewer and fewer people stepping into factories. And so we need to find ways to digitize it. I think what's really interesting about that is that when you talk to a PM who's done that, and I have, you hear mostly how hard it is to get it right, that there is literally no way to get the full felt experience of someone's ability to turn a particular steel piece of machinery into an algorithm. You can get close, you can approximate, it's not perfect, which is why key bottlenecks in our supply chain actually are driven by a single person with extraordinary ability in their fingertips. And the world is built by people like this. So there's one person who knows how to paint the racing stripes on a Rolls-Royce. They're the only ones. There's one person somewhere in Oregon, I think, who's a machinist who knows how to test the quality on a particular type of Boeing screw. And they're the only ones. And by the way, if you think they retired and that's the issue with Boeing, that is not the issue with Boeing. That is a much longer conversation.
They are not the problem. But that's what I mean: that physical knowledge is hard to speak. It's hard to communicate. And we have the same thing in software. And I think it's really important to learn from our physical engineering counterparts when we think about solving these kinds of software problems. And that's why this video matters. So what does public AI work actually look like? Because just dumping every chat transcript into a Slack channel is not what I'm suggesting. It just pollutes the Slack. What you want to make visible is four parts of the work. One, the task. What was the person actually trying to get done? Two, the context. What did they tell the model? What did they paste in? What did they leave out? Three, the interaction. How did they prompt? What did the first answer look like? How did they push back? What did they ask the model to redo? And four, the review. What did the human accept? What did they reject? What did they verify manually and what did they rewrite and why? If you only share your final answer, the team learns almost nothing. If you share all of those four parts, the team starts to build a sense of shared taste. And shared taste is one of the tremendous bottlenecks in AI adoption. Right? Now, a prompt library doesn't fix this, right? A prompt library captures static instructions, but it misses all of that messy context. It misses the revisions.
It misses the moment when the model produced something that looked plausible and the human said, no, that's wrong for our customer, or no, that violates our tone, or no, that analysis skipped the constraint that actually matters here. And I do that a lot. In fact, when people look over my shoulder when I'm using AI, one of the things they notice is I say no to the model a lot, and I say no very quickly, and I say no based on a very rapid assessment of the quality of what the model is producing for me. And that tends to surprise people, and that's why it's so important to share these things. So the most valuable part of AI work is rarely the prompt. It's the surrounding habit. The prompt is the easy part to copy. The habit is what teaches us and helps us to learn. On the Substack, I broke down all four parts with a worked example for you, right? The task, the context, the interaction pattern, the review standard. So, if you want the full version, how this looks in practice, the link's in the description. Okay. The objection that tends to come up first when I describe this is privacy. And it's a very serious objection. So, let's be really honest about it. Your employees should not assume their private AI chats are going to just become company property. Now, I know on paper, most people have agreements that say whatever you type into the company AI is the company's. In reality, most people don't act that way. And if you were to say every one of your AI chats is default public, a lot of people would just stop using AI a lot. Like, that's just the reality. And you don't want to push good work underground. So, what I'm describing is the opposite of that. I'm describing declared spaces and declared rules. And that's the beauty of the Shopify example. Senior people running real work where the team can watch. Tobi does this himself with River in a public channel. The point of the channel is to make learning visible, full stop. You create declared channels in Slack.
A product team gets an AI workbench channel. A sales team gets a sanitized customer research workflow channel. A finance team gets a read-only analysis pattern channel. The engineering team can have public agent channels for certain classes of nonsensitive tasks. The boundary is the way this works well, right? Your team needs to know exactly what belongs in the public channel and what does not. Customer data has to stay private. HR has to stay private. Legal strategy has to stay private. So there's things that you have to keep private. But if you can prioritize everything around that and say, can we learn together in a public channel, you can get a tremendous amount of momentum. You should be able to draw up a workflow where you can say, I can put clinical decision support. I can put anonymized patient records. I can put treatment reasoning into a space that is public so that we can see the agent operate against it, but not in a way that discloses any PII or violates HIPAA. And that takes some work and it's not perfectly easy. But the alternative is that you take something like HIPAA that was intended to protect patient privacy in the US and it turns into a constraint on AI learning. And you don't want that. You want to be in a situation where you're thinking creatively about how to be compliant, but how to still expose relevant context that other people can learn from as far as how they interact with AI and reasoning models. So, the takeaway here is not make regulated work public in a non-compliant way. I'm not advocating that. The lesson is create a safe public surface for the parts of AI work that can teach without exposing protected information and lean into that as much as you can. There's a fuller treatment of the regulated industry version on the Substack. What a hospital IT team or a bank can actually put in a public channel without crossing a line. If you're operating in a highly regulated environment, I think that one's worth a read.
Okay, here's where this gets uncomfortable for a lot of us watching. The most important public AI work in your company has to come from senior people. In most companies, your senior people have the most valuable judgment and also the least visible process. They will write the final memo, but you don't know how they did it. They'll make the decision, but they don't tell you why they made that decision. They'll edit the strategy deck. They approve the customer plan. All the thinking happens offstage. With AI, that off-stage thinking can get even more hidden. A senior leader can use an agent to pressure test a plan, but never share it. Right? They rewrite a board update with the model. They don't talk about that. They compare scenarios. They identify risks in a road map. They critique a launch narrative. If all of that happens in private, your organization never gets to see how a strong operator actually uses AI. The fix is really clear here, right? Ask your senior people to run some nonsensitive work in public and equip them to do so. Make it easy the way River is easy at Shopify. Real work things that have stakes. A leader asking an agent to critique a launch plan in a team channel. A senior engineer using an agent to investigate a low-risk bug while narrating the review out loud. A sales leader showing how they turn account notes into a call prep brief with customer sensitive details stripped out. A product leader asking AI to find weak assumptions in a roadmap narrative. The junior person on the team doesn't copy the prompt anymore. They actually see the judgment in action. They see how senior people frame ambiguity, how much context is enough, how often the first answer is wrong. They watch a good operator push back. They learn that using AI well is active supervision, not passive consumption. Most AI training doesn't get close to this level of quality because training just tells people what the tool can do, right? 
Public senior workflows show people how capable individuals who are at the top of their craft actually use AI. And by the way, that is exactly what Tobi is modeling at Shopify as the CEO. He also considers himself an individual contributor, and he is deliberately putting his work in a public channel, allowing other people to ask questions of his agent, allowing other people to critique his choices in that channel and work with his agent as he works with the agent to shape results. Is it a little bit chaotic? Yes. Is he still the one who's telling the agent what to do? Also yes. But that open room format for the work he does with his agent allows him to teach and socialize what he wants to drive through the company in a way that nothing else does. So start with one declared channel per team. Write a pinned message at the top that says what the channel is for. Make sure that it's for reusable workflows and useful failures and for prompt revisions. And make sure that it's a default. Like one of the ways that River works at Shopify is that you cannot interact with River in a DM. It's not possible. Have those kinds of constraints, and as patterns repeat, and they will repeat, turn the repeated patterns into playbooks or into skills or into inputs for the next challenge that you're facing, because you can learn deliberately from the channel. And by the way, yes, you can use AI to brush through that channel and gather those lessons learned. It is one of the fastest ways to socialize real AI usage that I've ever found. And that means that your company actually starts to get smarter. The whole team starts to get smarter. Junior folks get smarter. Not because everyone has the same prompt, but because the organization now has a way to turn what one person learned into what the whole team can use. Total senior team investment to get the flywheel started. It's not that long, right?
Like you just have to be willing to do it and you have to be willing to comply with frankly a constraint that can feel a little bit binding. You can only interact with this agent in a public channel, right? Well, you got to be willing to do that. And then when you do that, you start to find out that you are effectively multiplying your time and impact in ways you didn't realize. And that's what Tobi concluded when he was actually writing out his whole reflection on apprenticeship in the age of AI. Last thing, what does it matter to measure in this particular environment? We've talked about things like token volume. They have their place. Tasks or workflows, they're definitely useful, but I think we need to shape some useful metrics around learning and reuse. How many reusable workflows did the team create in the last month based on a public channel where you interact with agents? How many got adopted by another person or another team? How many examples got pinned because they changed how somebody works? How often did a public workflow prevent duplicated effort somewhere else? That one's hard to measure, but it's worth trying. How many stale examples got retired? How many failures became better review rules? The best signal sometimes is not AI usage is up. The best signal sometimes is that the mistake is happening less often on our team. And that's how organizational learning looks and it's hard to measure, but it's worth trying to because it reflects reality better. So the practical question for the leaders out there isn't whether your team is using AI. Most of them are using it a lot already. The practical question is what AI work inside your company is making one person better while everybody else falls behind. Because if that work stays private, you're paying for the same lesson twice, three times, 10 times. That is the apprenticeship moment.
Either your senior people start running real work where the team can watch or every individual on your team is going to get faster while your company stays where it is. The companies that pick the ability to learn from one another are the ones that start to compound organizationally. So if you want to dig into this, if you want to look at what channel rules look like, get a playbook to get started for different industries, that's on the Substack. Deeper dive there. I just want to call out as we close here that the biggest thing that you can take away from this video is actually the power of constraints to drive this collaboration. I think it is undersold how this is getting facilitated at Shopify by the simple rule that agents never run in DMs in Slack. DMs are so popular in Slack and Slack has been fighting them from a product perspective for a while because they're demonstrably bad for teamwork, but they're very popular with individuals. They allow you to say I can get a response, I can just ping someone, etc. By insisting that agents only work in public channels, you are putting a binding constraint in favor of collaboration and learning. And so the larger lesson here, I'm not talking about Slack per se. The larger lesson is that constraints that are creative and careful shape incentives toward learning. And you should audit your environment and say where are we putting intentional constraints that individuals may find frustrating sometimes but that on the whole promote collective public learning for AI. That is the takeaway I want you to have. Cheers. I'll see you next time.
