Rubber Duck Thursdays
A lively, code-first hangout where GitHub folks showcase Copilot CLI, open-source projects, and live feature hacks like voice input on a to-do Electron app.
Summary
Rubber Duck Thursdays on the GitHub channel blends community storytelling with hands-on experimentation. The host guides viewers through live demos of GitHub Copilot CLI, forking repos, and running an interactive session inside a terminal to build features on a to-do Electron app called Todometer. The stream doubles as a global meet-and-share, with attendees from Pakistan, Nigeria, Zimbabwe, India, and beyond talking about their projects and learning moments. A big thread is the use of Context7 MCP servers to provide up-to-date docs to Copilot, and the host walks through plan-mode research, parallel sub-agents, and rubber-duck style reviews to critique outputs from different model families. Viewers get real-time exposure to adding voice input via Whisper and Foundry-hosted models, along with a live code-review workflow that saves a code-review.md file. The chat also circles around pricing shifts for Copilot, language support, and the broader value of open-source contributions and Open Source Fridays. The session closes with tips for local meetups, a recap of security-minded practices, and encouragement to keep experimenting with agent-based workflows despite the occasional glitches. The host’s energy and practical demos aim to demystify AI-assisted development and show how to integrate AI agents into real-world projects. You’ll leave with a set of concrete ideas to try, from CLI-first Copilot usage to rubber-duck style peer review of AI-generated changes.
Key Takeaways
- For Copilot CLI users, switching to plan mode and enabling the Context7 MCP server helps the agent fetch up-to-date documentation and code samples.
- The Todometer to-do app serves as a concrete testbed for building features like voice input via Whisper, using Foundry-hosted models and Azure endpoints.
- Rubber duck reviews can bring cross-model critique (e.g., Claude orchestrator with GPT-based review) to improve AI-generated plans.
- Open Source Fridays and Copilot Dev Days are practical entry points to learn agentic workflows and contribute to OSS communities.
- The stream demonstrates a full loop: research plan, implement with autopilot, run tests, and produce a code-review.md for quality gates.
- Pricing shifts (tokens/credits) for GitHub AI services are a live concern; viewers are encouraged to read the official changelog for June 1 changes.
- Security and proper guardrails remain essential when delegating tasks to AI agents, especially for production-grade code or sensitive data.
Who Is This For?
Essential viewing for developers and site reliability engineers curious about AI-assisted development, Copilot CLI workflows, and live experimentation with agentic features. It’s especially valuable for teams exploring OSS contributions and practical demos of voice input and code reviews.
Notable Quotes
"This is a weekly stream that we do every Thursday just to catch up, talk about GitHub, what's new, what's working, and connect with the community."
—Defines Rubber Duck Thursdays and the stream’s purpose.
"There’s so many projects building an upskill platform."
—Shows the breadth of community projects discussed.
"We’re going to fork this repo, we’re going to clone it, and we’re going to work on it from scratch."
—Preview of the hands-on Copilot CLI workflow.
"Transcribing doesn’t go through; we’ll debug this together and try two parallel models."
—Live debugging of a voice-input feature using Foundry Whisper and Azure endpoints.
"The rubber duck review is a built-in agent that allows the cross-model critique."
—Explains the concept of cross-model review in Copilot CLI.
Questions This Video Answers
- How does GitHub Copilot CLI work in plan mode versus autopilot mode for feature development?
- What is Context7 MCP and how does it improve AI suggestions in Copilot?
- Can you implement voice input in an Electron app using Whisper in Foundry, and what are the key steps?
- What are rubber duck reviews in AI-assisted development and when should you use them?
- Where can I learn more about Open Source Fridays and Copilot Dev Days for hands-on AI tooling?
GitHub Copilot CLI, Context7 MCP, Foundry Whisper, Todometer, Open Source Fridays, Rubber Duck Thursdays, AI in DevOps, Voice input in Electron apps, Code review automation
Full Transcript
Hello. Hi everyone. Good morning, good afternoon, good evening based on where you are joining us from. Good to see everyone. Hi Shirley. Good to see you. I've seen you on several of these live streams. So, good to see a familiar face. I'm going to remove the starting soon button. All right. Hey, how's everyone doing? Good to see you all. Hi. Is it Cotiso? Welcome to the show. Good to see you. Where are you joining in from? Where are you joining the live stream from? Let us know on the chat. This is streaming on LinkedIn, on Twitch, on YouTube.
So, just a good time to connect with the GitHub community. So, how's everyone doing? What's everyone working on? Hello. Hi, Manuel. All right, I see um Mike Pik is from East Coast AU here. Welcome to the show. Good to have you. Hey, Osman. All right. Great. Great. Hello. I see so many people coming in from LinkedIn. LinkedIn is one of our popular platforms. So, welcome, welcome, welcome. What's everyone working on? I see we have representation from Denmark. We have Usain from Pakistan. We have um Baba from France. Good, good, good. Mulovia. Oh, that's that's wonderful.
Hello from South Africa. We are from all over the world. So a good place to connect. I saw someone shared their GitHub um their GitHub profile. So if you also want to connect with each other, more than welcome to do so. But also let us know what are you experimenting with? Uh what are you building? What features are you excited about in the GitHub space? I would love to just talk about that. Okay. So um Ken says you're experimenting with agentic flows using GitHub copilot. You just created a smart car dashboard with my flows zero code.
Oh wow. That's that is fantastic, exciting, and a little bit scary at the same time. But it's it's a really good way to experiment with all this new cool AI shiny stuff. So wow. Um, Ken, what's your key takeaway from that project? It sounds like it's a big project. It's a smart card dashboard. So, that's quite a lot of work that went behind it. I'm curious, just let us know in the chat, how was the experience for you? What are some of your learning points and maybe a mistake that you made that you probably can share with all of us as a learning point?
But that's a really, really good use case. All right, I see more people from Pakistan. Hi Summit. Hi Osman. Good to have you all on the show. Hello Emmanuel from Nigeria. We have Prime Rose from Zimbabwe. Welcome. Welcome to the show. This is this is really good representation to be honest. Wow. From India. Hello from Brazil. What is everyone working on? I'm sure we all have exciting projects, scary projects, new technologies you're playing around with. So, what are you guys trying out? All right, Prosper from DRC. How's How's DRC? Oh, so correction smart car dashboard.
That's for your car. Yeah. So, I'm really interested in hearing about how that experience was like for you. Um, you know, what did you learn? What are some of the mistakes that you probably made in the process of building that that you can share with the community here today? All right. Um, I see we have another user here. I'm developing social gaming platform and uses GitHub as a single source of truth for your code for your files for CI/CD. That's interesting. If you have links you can share uh feel free to put them on the chat.
Lovely to connect with you and all the best in your project. Okay. All right. Yeah. So, keep keep the comments coming. Um Muhammad shares his LinkedIn in case you want to connect. This is a really nice place for you to just connect with people from different parts of the world. Australia, Brazil, fantastic. I want to see more projects. I mean, what is everyone building or let's talk about features, right? Um, we've had so many announcements from GitHub, GitHub Copilot, GitHub Copilot CLI. So, anyone who's excited about specific features that we can talk about, so you can just mention the feature.
Um, Naria is developing a traveling app. Fantastic. My comment is we have so many apps out there but this is the era of building personal software. So if for some reason an app is expensive or it doesn't really fit your exact needs, you can easily just spin up your own custom application. So Nares, I hope you're having a fun time building the traveling app. So, "I've been working on learning more about temporal anti-aliasing lately." Interesting. Not quite familiar with what that is. So probably you could just share like a short summary on the chat for everyone there for those of us who are not familiar with the topic, but it sounds exciting, right?
Great. Great. So many projects. Uh this is a DevOps architect using GitHub Copilot. Hm, that's interesting. Now AJ, I'm going to spotlight you there for a quick minute. Could you maybe give us some use cases exactly how you're using GitHub Copilot? I know that we've highlighted so much of how developers can use GitHub Copilot. So you are a DevOps architect. Maybe you can just um shed some light in terms of how exactly you are utilizing GitHub Copilot and most of the features that you are using. So AJ I'm going to come back to that um shortly.
All right. So Ken has shared a really good summary. I'm not going to be able to read it all, but it looks like you're having a blast there using agentic flows here for your um for your smart car dashboard. I think that's quite exciting. You've been coding. Oh, in the 80s. Oh, wow. Respect, sir. Respect. That's that's exciting. Um yeah, so curious, how does coding today compare with how you were coding back then? I mean that must be a really big shift. So good to see you at the forefront of experimenting with um emerging technology.
Okay, we have Lamiday who is developing um palok a messaging and search app. The search is to give you the right result or service you need mostly rendered by connections and people you know but didn't know their job or the service they render. All right. So, it's more of a connection for people based on their roles and probably something that you can uh benefit from. Sounds exciting. So, yeah. Good, good, good. Mhm. All right. We have a question here from Ali. Um, can you talk about open-source contributions? I am new and want to contribute to the opensource projects.
I think that's a really good target that you have there Ali. I mean right now contributing to open source is is really the way to build meaningful software really fast. So we do have several open-source um platforms. So let me just do a quick search here for you. Open source. But there are so many open-source projects right now on GitHub and uh I think we're all familiar with you know the whole scale uh as you're seeing more AI agents contribute to projects build projects. So I feel like the um the opportunity for you to get started with open source is really really um is is really really wide.
You do have a great opportunity there. So, I'm just looking through the YouTube channel and I'm just realizing that I haven't shared my screen. So, just give me a second to share my screen. Um, present screen. I'm going to share this screen. I'll pop it on stage. Then I'm going to pull this over here. Okay. Hope that is visible to everyone. Should have done this earlier. But yeah, I'm just trying to look um early because we do have so many resources, so many videos, so many live streams on open source. Um specifically, we do have Open Source Friday.
So Open Source Friday. It's a series, right? There's a playlist here that every single Friday we do have something new. Oops, just lost the link. Right. So, here is the playlist. Uh why does that keep happening? Okay. So, there's a playlist here that you can um hop every single Friday there's a new guest talking about a new open source project, new opportunities to contribute to open source, new learnings, some of the new challenges that open source maintainers are facing in the era of AI. So, I strongly suggest that you look through the GitHub account on YouTube and just find the Open Source Fridays um playlist.
This will give you a lot of insights from actual open source maintainers and you can find just a lot of exciting opportunities to contribute. So that's something you can look at. Okay. Um good so many comments. Okay. Macro 81 or L is asking what is Rubber Duck Thursday. So this is a weekly stream that we do every Thursday just to catch up, talk about GitHub, what's new, what's working, connect with the community. So if you have any questions, if you have any features you'd want to maybe see a demo of, this is the right space for you.
And as you can see, we're seeing so many people share projects they're working on, key learnings. So it's just a good 60 or so minutes every Thursday where we get to meet and talk about what's new in the world of GitHub. Okay, I'm seeing so many projects um building an upskill platform. Okay, that's fantastic. Oh, I'm so far up with this comment. All right, I am building a new logic on how AI is being built in the digital ecosystem to make them into agentic AI with security perfected. All right, security is a big and a key focus area.
So yes, as we jump into this AI train, security is not an afterthought. It's something that you kind of want to think about from the get-go. So yeah, that's quite a lot. Thank you all so much for sharing your projects. Keep them coming. Have interactions on the chat. Um so for this live stream uh there's a project that um I actually just learned about yesterday from my manager and we started talking about it and I thought to do a quick um demo we can just experiment with it. So how many have actually how many of you have actually used Todometer?
So I'll just pop that here. How many have seen or interacted with this application before? You can just let me know on the chat. Right. So, Todometer, it's a to-do app as it says, but it has a way for you to just track the progress at the very top here. You can bump items to uh complete them later. And at midnight, the list kind of resets. It drops everything that's completed and then it bumps up anything that you procrastinated or you pushed to do later back into the to-do list. So, it's it's just a cool app that I was just learning about for the first time yesterday and it's actually pretty popular on GitHub.
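The reset behaviour described here can be sketched in a few lines of TypeScript. This is a minimal illustration only: the `TodoItem` shape, the status names, and both function names are assumptions made for the example, not Todometer's actual data model or code.

```typescript
// Hypothetical item shape; the real Todometer data model may differ.
type Status = "pending" | "paused" | "completed";

interface TodoItem {
  text: string;
  status: Status;
}

// Midnight reset as described in the stream: completed items are dropped,
// and items that were bumped to "do later" return to the active list.
function resetAtMidnight(items: TodoItem[]): TodoItem[] {
  return items
    .filter((item) => item.status !== "completed")
    .map((item) =>
      item.status === "paused" ? { ...item, status: "pending" as Status } : item
    );
}

// Fraction of items completed, as a progress meter might show it.
// Returns 0 for an empty list to avoid dividing by zero.
function completedFraction(items: TodoItem[]): number {
  if (items.length === 0) return 0;
  const done = items.filter((item) => item.status === "completed").length;
  return done / items.length;
}

const today: TodoItem[] = [
  { text: "Host the live stream", status: "completed" },
  { text: "Prepare a demo", status: "paused" },
  { text: "Go out for a walk", status: "pending" },
];

const tomorrow = resetAtMidnight(today);
console.log(completedFraction(today));
console.log(tomorrow.map((item) => `${item.text}: ${item.status}`));
```

Keeping the reset as a pure function over the list makes it easy to test without waiting for a real clock to reach midnight.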
So, I'm going to drop um let me just pull this here. So, I'm going to drop this on the chat in case you haven't tried it out. You can, but on this live stream, we're just going to play around with it. We're going to use GitHub Copilot CLI to see how we can just interact with the application and maybe add a few features. So, that's what I had in mind for today's live stream. Let me look through the chat, see if we have any questions before we get started. So we're going to focus on GitHub Copilot CLI.
If you've not worked with GitHub Copilot CLI before, this is an agent that lives in your terminal connected to your GitHub subscription. So stay tuned as we're going to cover it in depth today. All right. So Mario is asking what are the useful features of GitHub? That's a good question. GitHub is such a broad platform. It's an entire ecosystem for developers, for builders. This is a platform where you can host your code. You can manage your deployment pipelines. You can now integrate AI and agentic workflows into your software development life cycle. So, Mari, I don't think we can exhaust all the features of GitHub, but I'm just going to briefly mention um the GitHub AI features.
So, that is GitHub Copilot. uh the ability for you to access a variety of models from different providers all in one subscription. That's a GitHub copilot subscription and you're able to integrate AI into different parts of your software development stages. So you can use AI for planning, for documentation, for testing, for implementation, for code review, etc. So there's a variety of features that you can utilize. So today I'm actually going to demo GitHub Copilot through the CLI. So we're not going to open VS Code. We're not going to open an IDE. We'll interact with our agent through the terminal.
And I don't know if you enjoy working from the terminal. I didn't up until a couple of months back when I tried out Copilot CLI and to my surprise, I actually enjoyed it. So I'm more of a visual person. I love seeing beautiful UIs, clicking on buttons, but there's just something about being able to work from the terminal. So, if that's your preference, then the CLI is definitely a tool that you should check out. Okay, so um punch the monkey interesting name there. So, what's your thoughts on um GitHub Copilot's usage-based migration from the request-based?
Um I mean I won't comment so much on this but um my understanding is that this is an attempt to just make the platform and the services more reliable to our existing users. So it's just a strategy to ensure that we are addressing some of the recent reliability issues that we've had so many people um get affected by. So this is just one of the strategies to see if we can have a more sustainable model to avail the very best models, the very best AI services to our users. So if you are interested in learning more about that, we do have the GitHub um so if you go to github.com/changelog in case you're hearing this for the first time.
Oh, so it's github.blog. Sorry, if you're hearing about this for the first time, if you go over to the GitHub changelog, you will find some information about um let's see, let's see. Let's see. It should have some information about usage-based metrics. Um the usage-based migrations, it's somewhere here. We have so many announcements coming from from GitHub recently so let me see message metrics okay that's not it okay okay token-based. Oh, okay. I'm not finding that. But uh the idea is uh from June 1st. Yeah. So from June 1st um GitHub will move from uh from we'll actually move to a token-based pricing model where you'll just be charged on the number of tokens that that you consume.
I'm sorry I'm not getting that link right now. based reviewing. Okay. Yeah. So, here you go. Um, yes. So, this is the announcement. I'm just going to pop this on the chat for you. So effective June 1st um your usage for copilot will be uh consuming GitHub AI credits. So I believe that as we approach June 1st we'll have a lot more resources trying to explain how this will work but this is a good starting point just to understand the rationale behind it. Why are we making this change from the current pricing model and what exactly is going to change.
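As a rough illustration of what "charged on the number of tokens you consume" means arithmetically, here is a minimal TypeScript sketch. The rate used below is a made-up placeholder, and `creditsConsumed` is a hypothetical helper; the stream does not state any actual rates, so check the official announcement for real numbers.

```typescript
// Hypothetical helper: converts a token count into credits at a flat rate.
// Real pricing may vary by model and by input vs. output tokens.
function creditsConsumed(tokens: number, creditsPerMillionTokens: number): number {
  return (tokens / 1e6) * creditsPerMillionTokens;
}

// Placeholder values for illustration only, not GitHub's actual rates.
const placeholderRate = 5; // credits per one million tokens (assumed)
const sessionTokens = [1_200_000, 800_000].reduce((sum, t) => sum + t, 0);

console.log(creditsConsumed(sessionTokens, placeholderRate));
```

The point of the sketch is only the shape of the calculation: under usage-based billing, cost scales with tokens consumed across sessions rather than with the number of requests.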
So I do recommend that you uh read that just to understand um what what will change for you. U Maria is also asking how many languages can integrate in GitHub. I will say think of a language most likely it can easily integrate in GitHub. I can't think of a single language that can't be supported. Um all right. Okay. So you're also hearing about Todometer for the first time. Okay. So, let's actually experiment. Let's try it out together. So, here's the repo. I posted it on the chat. Um, and what you can do is uh we're just going to fork this.
Let me just fork this. The link again is on the chat. So, you can you can follow along as we experiment with it. I also just learned about it yesterday. Uh so a short demo and I am excited to see how it kind of works. So I'm going to create a fork and then I have my terminal open here. We're just going to work on it from scratch. So first thing is I'm going to get into my labs directory. This is where I host all of my projects. And we're going to clone this project. So, I'll come over to code and uh in case you haven't seen this before, you can clone a project with um with GitHub CLI.
So, let's do that. I'm going to copy that link. I'll switch back to my terminal and let me just make this a bit bigger and then I'm going to type Copilot. I already have Copilot installed. Copilot CLI installed. So by typing copilot, I'm going to get into the copilot um CLI here. So you'll see I have Copilot GitHub Copilot already running. This is the version that I'm on. I'm on version 1.0.48. I believe this should be the very latest version. Um okay. Uh Malef is asking me to try and be slow. Am I moving too fast?
Sorry, sorry about that. Yeah. So, uh to recap, there's this application, Todometer. It's a meter-based to-do list for your desktop. It's an Electron app. So, we just want to experiment, try it out, maybe add a few features. So, that's what we're trying to do. I shared the link uh on the chat. We've just forked it and we now want to clone it using the GitHub CLI. So I'm just going to run that command here. So I'll do gh repo clone. We'll clone it right here. All right.
I have typed GitHub here to open Copilot in my CLI. So let's see what we have here. We should have the Todometer application. Yes, we do have it here. So, I'm going to exit. So, let me just exit the CLI. And then I'm going to get into my Todometer project, right? And then I'm going to open Copilot from in here. So, I'm going to open it inside this project. And then I am going to check if I am logged into the right account. So let me see the list of models that I have. Okay, that's fine.
So if you're seeing Copilot CLI for the first time, this is how you can interact with GitHub Copilot completely on the CLI. So you just type copilot, it will trigger Copilot CLI. Here you can see the list of models that you have access to as part of your subscription. So here I can type /model, hit enter, and you can navigate through the list of models that you have access to. So right here I'm going to stick to Haiku before we do any serious work. Um, so we've just cloned the project. I'm going to ask Copilot to install dependencies, build and run the application.
So while it does that I see a request to reshare the link. So this is the link github.com/cassidoo/todometer. I'll just reshare it on the chat in case you want to follow along. All right. So, Copilot CLI here is attempting to run some um some terminal commands. So, it's asking for my explicit permissions. I'm going to grant permission to the terminal here, and Copilot CLI is running npm install. So, it's simply going to read through the files. I'm sure it read through the readme and got the instructions on how to set up the project.
It's now attempting to build the application. So again, I'm going to hit yes, and then it should hopefully run the application for us. Okay. So there we go. It's running the app and the app just opened. Uh it's a different screen. So here we go. Uh, so let me just move this a bit. I hope this is visible, but I'll just bump everything up one more time. So there we have Todometer running. So it's set up the dependencies, everything that we need. Uh, let's test this out. I need to host Rubber Duck Thursday. All right, I need to prepare a demo on um Microsoft Foundry.
Okay, I can mark a task as completed. I can What did that do? I can remove it. Okay, post Rubber Duck live stream. Go out for a walk. All right. Mark as completed. Okay. So, it's a basic to-do app. We have the progress bar at the top. So, that kind of gives me a nice visual of uh how far I am along in terms of completing the items on my list. Okay, I can bump them up again. So yeah, basic basic to-do functionality. That's good. So what we'll do, let's try and work with Copilot CLI to add some features to this application.
So you see what we have so far. So maybe let me just pause on the chat. Just give me some ideas of what you'd want us to work on for this application. So you've seen how it works, the features it has. Let's let's explore that. And just to kick us off, I'm thinking that we could try and add um voice input. So, for now, you can only add in some text, but let me see if we could actually hack adding um a voice integration to this. So, let's let's go ahead and do that. Uh so, first thing I like to do is before I even start working on a new feature, I love to do some bit of research.
So have the agent do some bit of research about the project that I want to build or the feature that I want to integrate. So I'm going to check one of the MCP servers that I always double check if I have installed is Context7. So I'm going to use the /mcp command and then show to see the list of MCP servers that I have already installed. And I see that I have the Context7 MCP server installed. If you don't know about this, this is an MCP server that allows agents to um to to access up-to-date documentation.
So if you want the agent to work with certain services, libraries, packages, you you don't want it to rely on outdated information. You want it to pull in the most up-to-date versions, the most up-to-date documentation. So before anything, I sort of default to having the Context7 MCP server installed and enabled in in my Copilot CLI. So that's already set up. If you don't have it, you can easily add it by typing /mcp add and then that should give you a nice interface for you to add your different MCP servers. So that's how easy it is to add an MCP server and just extend the reach and capability of the Copilot agent.
So that's good. I have it installed. So that's fine. So I'm going to number one switch to a slightly more powerful model right now. You see I have the Let me do this. I hope you can all see this. Just confirm if you can see this clearly. I realize that the color contrast might not be the best, but so I'm going to switch to a slightly more powerful model. Let's work with um we can try Opus 4.5 or the default which is Opus 4.6. I can try that. I'll just set it to medium. I'll set it to medium reasoning effort and then I'm going to switch over to plan mode.
So we have different uh we have different modes so you can interact with the agent in plan mode. This way it won't write directly on your code files. It's just going to read uh your files and help you brainstorm. You can switch to autopilot or you can switch to the basic agent experience. So I want to be in plan mode and then I'm going to invoke the research agent. So this is a built-in custom agent that will run a deep research analysis on my topic of choice. Okay. So in this case I wanted to research um speech-to-text models to add voice input to this Electron app.
So, I'm going to just mention explicitly that it's an Electron app and I'll ask it to use Context7 to pull the latest the latest docs. So, in this case, I'm working in plan mode. I have invoked the built-in research agent. I have given it a topic. So we'll see Copilot here is now trying to reason over my request my prompt. So the user wants me to research speech-to-text models for adding voice input to their application. Let me plan the research and dispatch multiple parallel research sub-agents. So that's that's an interesting move. So the orchestrator agent here which is the Opus 4.6 model that we're using decides to break this down into smaller tasks and then just dispatch multiple agents to work in parallel.
So that's that's quite interesting. So it's asking for my approval to access these URLs. I'm going to approve for the remaining part of this session. I am okay with this um destination. So that's fine. So let's see. It's it's actually building a plan. So we can see it has broken down into some key research areas. So it's currently researching the state of the application, the Electron app that we have. It's uh okay. It's going to research on speech-to-text models and libraries that are suitable for Electron applications. It's going to do some research on the Web Speech API that's built into Chromium or Electron.
It's going to do uh it's going to have a separate research work stream for whisper.js and Whisper models for local speech recognition, alternative um speech-to-text libraries and the latest documentation from Context7 for these libraries. So it's setting up the different sub-agents that are running in parallel. And I believe if we use the /tasks command, we should be able to see all the different sub-agents that are running in parallel. So you can see the amount of time that each sub-agent is taking and you can you can as well just follow along in the progress.
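The orchestration pattern described here (split a research request into topics, run one sub-agent per topic in parallel, then merge the findings) can be sketched in TypeScript as below. This is a generic illustration, not Copilot CLI's implementation: `runSubAgent` is a hypothetical stub standing in for a model-backed agent, and the report format is invented for the example.

```typescript
interface Finding {
  topic: string;
  summary: string;
}

// Hypothetical sub-agent stub. A real sub-agent would call a model and tools;
// this one just returns a canned note so the sketch stays self-contained.
async function runSubAgent(topic: string): Promise<Finding> {
  return { topic, summary: `notes on ${topic}` };
}

// Orchestrator: dispatch every sub-agent at once, wait for all of them,
// then consolidate the findings into one report in the original topic order.
async function orchestrate(topics: string[]): Promise<string> {
  const findings = await Promise.all(topics.map((topic) => runSubAgent(topic)));
  return findings.map((f) => `## ${f.topic}\n${f.summary}`).join("\n\n");
}

// Research areas taken from the plan shown in the stream.
const researchTopics = [
  "current state of the Electron app",
  "speech-to-text models and libraries for Electron",
  "Web Speech API in Chromium/Electron",
  "Whisper models for local speech recognition",
  "latest library documentation via Context7",
];

orchestrate(researchTopics).then((report) => console.log(report));
```

`Promise.all` preserves input order regardless of which sub-agent finishes first, so the consolidated report stays stable even when the work streams take different amounts of time.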
So it's dispatched the first agent to explore our Electron application. Want to do some research on Whisper speech-to-text documentation, speech API. So for those on the chat what service would you actually use? So assume we didn't start with this research step. So the assumption here is that I don't know even where to start to build this feature. So I want the agent to do some bit of research on my behalf. Now for those in the chat, what would you use? Let me see the decision points and your rationale for your decisions and then let's compare that with what the agent comes up with from the research.
So let me know if you were to just implement this out of the box, what models, what APIs would you use for this? All right. So for those joining us, I see so many people joining. Um we are experimenting with uh the Todometer application. Again I am just going to reshare it in the chat in case you want to follow along. So it's simply just a to-do application that tracks your progress throughout the list. And this is this is the app right here. So we're trying to work with Copilot in the CLI to add a voice input feature.
So instead of just typing in a new item, we want to have some voice uh speech-to-text functionality. So add voice input. So we are working in plan mode. We have our built-in research agent here doing some bit of research. Um it's dispatched uh several sub-agents to run tasks in parallel and uh it's given each sub-agent a mandate, part of the plan. So you can see that the orchestrator um model here is trying to consolidate. So it's pulling in results from the different sub-agents that are running and then eventually the hope is that it's going to give us a consolidated final comprehensive report.
Okay, I see a question here. So appach Context7 creates document indexing kind of like spinning up a PostgreSQL with vector extension and then building your own MCP front end tab. Yeah, I don't think I quite understand the question because with Context7 here and I'll just okay um So let me pull that. So Context7 MCP server because I don't think I quite understand um the question there clearly. With Context7 you basically provide your LLMs with the up-to-date, version-specific documentation and code samples. So this way if you ask Copilot to implement a feature it's not just relying on its training data set.
So depending on the model's training cut-off date, it might have some outdated information and you don't want it to basically introduce code from versions that are already outdated, probably with some vulnerabilities that have been exposed. So you do want the agent to rely on more up-to-date information. So that's why we're using the Context7 MCP server. So you just clarify more. I guess I'm getting your question wrongly, but I wasn't able to understand clearly what what you meant by that. All right. Uh, good, good, Carlos. I I agree with you. This is amazing.
I'm really hoping we're going to hack it. So, I hope that by the time we're leaving this live stream, we do have um a working integration for that feature. So, hoping hoping we'll get to see that. All right. So, so you've been struggling, you have created this in your own small to-do app. You've been struggling with Danish language translation, huh? So, what uh what service are you using there for your for your to-do app? Right. So, just let us know what what services did you use as we wait for this research workload to to finish up.
Okay. All right. So it seems like I sort of answered your question. You're welcome. All items. Okay. So we're just waiting. You can see it's now pulling in results from the different work streams that it spun up across the sub-agents. Now it has comprehensive research from the seven sub-agent dispatches, and it's pulling everything together into a final report. So that's fine. Again, we are in plan mode, so we do not expect the agent to write any code or to make any changes to our application.
So it's purely in a read-only mode. It's going to go off and do its research. As you've seen, it's not asking a lot of questions. It's not asking what my preference is or what I want to use, simply because it's fully in research mode. It wants to go out and find as much information as it possibly can. So, as we wait for this to run, do we have anyone with an idea of a different feature that we could integrate into our application? We're working on voice input. Anyone else who might have a suggestion of something we can just try out?
Then we can spin up a different agent session and have it work on that as we wait on this agent. Okay. So: "I was first trying to get ChatGPT in audio mode to call my own MCP servers in my own agentic flow setup, but ChatGPT audio currently doesn't work in the same pipeline. So I've tried to use the new realtime model from OpenAI to see if it could solve my problem." What do you mean by it doesn't work in the same pipeline as the text chat pipeline? And I'm also curious.
Oh, and I'm also curious if you've tried this with a Whisper model from OpenAI, and if that worked. Quest, I'm sorry about that. I think there's someone just outside my door who's dragging something on the floor, so it's making that squeaking noise. I was hoping that the voice isolation feature would knock that off the live stream, but I'm really sorry that you're picking that up. Apologies for that. I should notify my neighbors whenever I'm having a live stream. So, the agent here is pulling together the final report.
Given the amount of time it's spending on this, I think it's going to produce a very thorough report with some code examples. So we'll just quickly look at what it comes back with. So yeah, the research is done. Let's look at the findings. It has saved the report into the session logs. So this is not yet saved in our current working directory; it is saved in our session logs. If we exit this session, we basically lose this report. So we're just going to stick here.
So we have the key findings, and it has some recommendations. We have react-speech-recognition: it doesn't have offline support, but it's the easiest to set up. We have Hugging Face Transformers. Okay. This is interesting. I expected to see more of the available language models, like Whisper models, from this research. So that's quite interesting. What I'll do here is steer it in a direction that I think we'll have success with. I only have access to Azure OpenAI, so I'll probably ask it to refine the research.
Actually, let me just do a quick follow-up and tell it that I want to use the OpenAI Whisper model hosted on Microsoft Foundry. "And after you update the report, can I get a rubber duck review?" Okay. So let's kick that off, and I'm going to explain what the rubber duck review is. But before that, let me look through the comments. Ah, so we have an idea here: can we add a search function for querying past days? Hm, a search function for querying past days. Okay, interesting. We can try that. So yeah, I'm going to spin up a new agent session there.
And maybe before we do that, just share a bit more detail on how you envision this working, and then I can feed that to the plan agent and have it build a plan around that implementation. So we see that the agent, based on that new prompt, is now going to research Microsoft Foundry. It's going to look at the Whisper API. It's going to look at the docs for some code samples. Okay. So we can see it actually looking through Microsoft Learn.
So, let's just give it a few minutes to update its research. I just haven't set up any accounts on most of these other platforms, so I think it's going to be much easier to work with Azure OpenAI, and Whisper is a pretty popular model for speech to text. Okay. So it's researching Foundry-hosted Whisper integration. Hopefully this won't take much time. But yeah, I like that idea. Just give a bit more detail and then we'll spin off a different agent session for that.
Ken: "My agent flow setup is written so I generate a Scrum kind of breakdown for my ideas: brief, idea, epic, feature." So this is your working-with-AI workflow. Is that what this is? You start with a brief, an idea, then have that as an epic, features, stories, and then tasks, all with AI included. All right. That's an interesting workflow. So in this case, I think I understand what you're sharing here. Since this research is automatically saved in the session logs, there is a built-in feature: if you do /share, this will allow you to export that research to a file that you can save in your working directory.
So that's exactly what we're going to do. Assuming this research completes, we're going to export the whole research and save it as a Markdown file. This way, even if we start new sessions, we can always feed that in as part of the context for the different agents that we work with. So that's something we're going to do as soon as the research is updated with our prompts. All right, so Azaratron online: "I'm just hoping that vibe coding doesn't load more flaws than working code." It is possible to have code that's not necessarily up to standard if you default to vibe coding.
Hence the recommendation: yes, inasmuch as you're vibe coding, do know your fundamentals, do know your basics, and be in a position where you can review the output from the models and judge whether you want to merge it into your code base. Of course there are scenarios where you're safe just vibe coding. For instance, if you're building a local productivity tool that will never leave your computer, you have some flexibility to assign that to an agent, have it work in autopilot, and give you a working output.
But then for more critical workloads, for code that will hit the internet, you do want to be very cautious. You want to have very strict quality gates and very strict safety gates, so that there is a human review on the output from these models. These models are getting insanely powerful and they are generating high-quality code, but they still have the ability to produce wrong code. Okay. So that is a situation you will always have to deal with: the agent producing something that doesn't work, that is not factual, or that is completely made up.
So it's up to you, with your know-how, your knowledge, and your experience, to be the judge of what you're getting as output from these language models. Okay. So we're just waiting for this to finish up our research. Yeah. So, AI can make mistakes. You can always count on that. That's the constant. AI will at some point make mistakes. Some mistakes are small and you can easily get away with them, but some, and I think we've heard stories, can be very detrimental. So yes, AI can and will make mistakes. All right.
Yeah. So you used test-driven development in your agent flows. Yes. So that's the other conversation point, where we have an industry-tested workflow: the traditional software development life cycle that we all know, where you start with brainstorming, then design, implementation, testing, and so on. At some point, when I started experimenting with these AI agents to write code, I used to try as much as possible to fit the AI agents into that traditional life cycle. For the distinct stages of building software, I'd try as much as possible to have an agent mimic that way of working.
So before even writing code, we'd have a brainstorming session. We'd have a design session. We have so many design tools. I think in the last live stream that we did together here, we worked with a tool called Pencil, where you give the agent (again, we worked with the Copilot CLI) access to a design tool, and it's able to build out the actual designs. That was pretty cool. So I tried as much as possible to fit these AI agents into that existing workflow. But I then realized that it will most likely change and evolve as we see more companies default to agent-native development workflows.
I believe that it's just a matter of time before we start realizing that there are more effective and efficient workflows in which agents deliver better results compared to the old software development life cycle. So I see our agent here is taking a bit of time. Yeah, this is quite slow. Let's see. So: "Vibe coding is too slow. It makes humans stay in the loop." Yeah, it depends. I mean, you should have a specific role for the humans in your team and for the agents in your team, with very clear boundaries.
So it depends on how you end up setting up your human-agent teams and the collaboration points that you have. I see it's just trying to deepen the integration research, so let me try and nudge it a bit: "Hey, we are running out of time. Can you wrap up?" Let's see what that does. Okay. There's a feature suggestion, and it looks like I forgot it: to add search. So it looks like the research is already done. It's just trying to do more and more. Right. I talked about a rubber duck review.
So in case you are not familiar with the concept of a rubber duck review, this is again a built-in agent that allows cross-model-family reviews. In this case our agent orchestrator is a model from the Claude family: we're working with Claude Opus 4.6 as the orchestrator agent. We've asked it to do some research and it has generated a report. Now, before you take that report and start building on top of it, we can ask Copilot CLI to invoke a rubber duck review. What this will do is that Copilot is going to request a review from a model that is from a different model family.
So in this case, it's going to ask for a review from a model from the GPT family. And I don't know if you'll be able to see that here. Okay, that task is already completed. But if our orchestrator agent is Claude Opus or any Claude model, then the rubber duck review is going to bring in an agent from the GPT family to provide some critique on the plan. It's going to iterate on the plan, and you're going to end up with strengths from both sides. So it's just a strategy, again, to improve the quality of the output that you get: have the first draft generated by a model from one family, and then bring in a review from a model from a different family.
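The selection rule behind a cross-family review can be sketched in a few lines. This is only an illustration of the idea, not the rubber duck agent's actual logic; the `MODEL_FAMILY` table and the `pick_reviewer` helper are assumptions, with model names borrowed from the stream.

```python
# Hypothetical family table; the mapping is an assumption for illustration.
MODEL_FAMILY = {
    "claude-opus-4.6": "claude",
    "gpt-5.5": "gpt",
}


def pick_reviewer(author_model: str, candidates: list[str]) -> str:
    """Return the first candidate whose family differs from the author's."""
    author_family = MODEL_FAMILY[author_model]
    for model in candidates:
        if MODEL_FAMILY[model] != author_family:
            return model
    raise ValueError("no reviewer from a different model family available")


reviewer = pick_reviewer("claude-opus-4.6", ["claude-opus-4.6", "gpt-5.5"])
print(reviewer)
```

Raising an error when no other family is available makes the failure visible, rather than silently letting a model review its own draft.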
So that's pretty cool. We can see that it has built a plan and completed the research. Oh, so I think that is why it took some time: in addition to the research, it went on to build the actual plan. So what we're going to do here is this. I don't work with autopilot much, but in this case, in the interest of time, let's just accept the plan as is, and then I am going to ask it to build on autopilot. This means that I have the option to enable all permissions.
Do this with extreme caution. You don't want to be doing this across all your projects, giving all agents full access to permissions. But in this case, in this case, I'm going to risk it. So I'm just going to enable autopilot with all permissions. So we don't expect the agent to now keep coming back to me asking, hey, do I use this? Do I run this command? It's just going to do what it needs to do to get this plan in place. So, it has restarted the application. Let me just pull it back here and hopefully we can see the changes as they come in live.
So we'll see it implementing the feature. Then I'm going to nudge it a little bit and tell it that I want to pass in my API keys in a .env file, just in case it's not obvious. I see some questions coming in. Okay, so: "The interesting thing about TDD is that it will see failures and try to update them, sometimes before I even ask; in some cases you have to interrupt." Yeah, that's a valid point, and I think my colleague is really big on experimenting with test-driven development in agentic workflows.
So I believe that's one of the topics that will probably be addressed in depth in one of these live streams. But yes, I do agree with your comment that you want agents to work from a set of guardrails. Agents are powerful, but they need to know what guardrails are there. And the thing with these models, and I don't know if you all have found this, it's an observation I've made, is that they can get very cheeky: if you ask them to create a test suite, they will basically design tests that are designed to pass.
So, I actually get excited when I see an agent trying to run tests and see some actually failing. that gives me the confidence that it's actually um thinking about testing in the right way because to some extent and I've seen this on several occasions where you'll see an agent just designing tests that will just all pass out of the box without really testing some edge cases. So yeah, if I see any tests failing in an agentic workflow, then that sort of gives me some bit of confidence in what the agent is doing. But again, you still need to be fair.
You still need to double-check, cross-check, and ensure that you're hitting the right quality gates. All right. So, Majid is asking here: how can Copilot improve code review speed and quality? That's a good question, and I'll probably best answer it by giving you an example. You can use Copilot out of the box as is, or you can bring in your existing quality standards, your existing business standards. Sorry, I think this agent is blocked somewhere. Yeah, there's an error that occurred. So as I was saying, the question is how Copilot can improve code review speed and quality. The secret here is in Copilot customizations. So I'll just quickly share a site, Awesome Copilot, in case you are not familiar with it. I'm just going to drop this in the chat.
This is where you want to go to understand how you can customize Copilot. By that I mean you can bring in your own coding standards, the quality bars that you want these agents to hit, and add them as custom instructions, add them as skills, or have custom agents that are built on top of your existing coding practices. This way, if you're having an agent review code, it's actually enforcing some of the policies you already have. So that's one way, and I think the best way at the moment, to get better-quality code reviews from AI agents, not just Copilot: the concept of customization.
So yeah, I invite you to have a look at this repository. It has community-contributed workflows, skills, instructions, and custom agents. I'm sure if we look through the custom agents, we should find one on code review. So you can see we could have easily pulled in a custom agent to work with the application that we're working on here. So yeah, it looks like it claims to have implemented the feature. We can see a summary of the changes that it made. If you want to quickly review the files that have been updated, you can use the /diff command and then go through the changed files one by one.
You can review what has been added and the changes that have been made. And this is a familiar interface: it's the diff view that you're familiar with already. We see that it provided a .env example file. So yeah, let's try that. Let me try setting it up. I am going to open this in a separate window because I'm going to paste in some API keys. To run a regular shell command in Copilot, you don't have to exit Copilot. You can simply start a prompt with an exclamation mark. So you just type an exclamation mark, and then in this case I want to open it with code-insiders.
So this will just open Insiders, which I will move over here to set up my .env. So I will provide this. Let's see. I want to see if we can get this to work before we leave. Now, I already have the model deployed. This is Foundry, right? If I go over to Build and scroll to the models, I should have a Whisper model deployed. So let's test this out. I have the endpoint here, so I'm going to paste that in. And I'm going to copy the key as well.
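For readers following along, here is a minimal sketch of reading an endpoint and a key from `.env`-style text and checking that both are present before the app starts. The variable names `WHISPER_ENDPOINT` and `WHISPER_API_KEY` are hypothetical (the stream does not show the real keys), and the app itself is an Electron project, so its actual loading code would look different.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Hypothetical variable names and placeholder values, for illustration only.
SAMPLE = """
# Foundry deployment settings
WHISPER_ENDPOINT=https://example-resource.example.invalid
WHISPER_API_KEY="not-a-real-key"
"""

config = parse_env(SAMPLE)
required = ("WHISPER_ENDPOINT", "WHISPER_API_KEY")
missing = [name for name in required if not config.get(name)]
print(missing)
```

Checking for missing values up front gives a clear startup error instead of a feature that silently does nothing.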
Okay, that's done. So, "how to use": I've done that. Let's now run npm run f. Okay, it started the application. No errors. Did it start? Okay, I have the app here. It added a mic emoji. I don't like that, to be honest. So let's test it. "Add a new feature." Transcribing. Transcribing. It seems to be stuck in a loop there. It says transcribing but we're not getting any output. From a design perspective, I'm not quite happy with the output, and in terms of basic functionality, it appears not to work. So I'm going to switch out of autopilot mode.
So I'm in the basic agent mode. Okay, we didn't capture any errors. So I'm going to tell it: hey, the transcribing doesn't go through; confirm that the integration works as it should. What I can also do: this is going to the base model that we're using, which is Claude Opus 4.6, so I will ask it to run this with both Opus 4.6 and GPT 5.5. This way it will try to troubleshoot in two parallel work streams, one with the Opus model and one with the GPT model.
So if we look at what's happening: "The user says transcription isn't working. Let me debug this first. Let me check if there's a .env file." Of course, the usual culprit, but I did provide that. So let's see. It's noticed a bit of an incompatibility. Oh, it actually implemented an Azure OpenAI endpoint, and I provided an Azure AI services endpoint. If you've used Foundry before, you have to be mindful of those two. So I did provide an Azure AI services endpoint. That's true. So let's see if it's able to find a way to fix this.
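The mismatch found here (code written for one endpoint type, configuration pointing at another) is the kind of thing a small startup check can catch before the UI hangs on "transcribing". The sketch below is an assumption-heavy illustration: the host suffixes are the ones commonly seen on Azure resources, and `endpoint_kind` and `check_endpoint` are hypothetical helpers, not part of any SDK or of the code the agent wrote.

```python
from urllib.parse import urlparse


def endpoint_kind(endpoint: str) -> str:
    """Classify an endpoint by host suffix (the suffix list is an assumption)."""
    host = urlparse(endpoint).hostname or ""
    if host.endswith(".openai.azure.com"):
        return "azure-openai"
    if host.endswith((".cognitiveservices.azure.com", ".services.ai.azure.com")):
        return "azure-ai-services"
    return "unknown"


def check_endpoint(endpoint: str, expected: str) -> None:
    """Fail fast with a readable message when the endpoint type is not the expected one."""
    kind = endpoint_kind(endpoint)
    if kind != expected:
        raise ValueError(f"expected a {expected} endpoint, got {kind}: {endpoint}")


# Placeholder resource names, for illustration only.
print(endpoint_kind("https://demo-resource.openai.azure.com/"))
print(endpoint_kind("https://demo-resource.cognitiveservices.azure.com/"))
```

Calling `check_endpoint(configured_url, "azure-openai")` at startup would turn the silent hang into an immediate, explainable error.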
So let's see if it will either ask me to provide the OpenAI endpoint or if it's going to rewrite the code. I'm actually interested in seeing the approach that it takes. Okay: "I've been experimenting with local models to hopefully be able..." Okay. Yeah, so many interesting comments, guys. I'm sorry. They're coming in from LinkedIn and from YouTube, and in case I miss your comment, just let me know. So I see a lot of comments, opinions, and experiences. That's good. "Interesting, but not everybody can have subscriptions to API keys or agentic models, nor do we have the high-end setups."
"So how can we use agents under limited resources?" That's a valid question. That's a key concern at the moment: the aspect of token optimization. Yes, these models are incredibly expensive. I do know that we have a lot of work streams experimenting with open-source models and with local models. But again, there's always that trade-off, right? The reality is that Opus is a really expensive model, but if you compare the quality of the output, you sort of understand why. But then we do have all these conversations around how you can improve the quality of these other, more affordable models.
So one example that I will share is what we did earlier when we requested a rubber duck review from the Copilot CLI. I think GitHub has actually published a report or a blog post where they mention that they tried the rubber duck agent plus Claude Sonnet 4, which is relatively cheap compared to Opus, and they got quality that is close to what Opus would give you. So in this case you're not paying for Opus, but you're getting near-Opus quality. These are some of the work streams that we have, both from GitHub and from the open-source community.
We're trying to see how we can improve the more affordable agents, equip them, and have strategies that will give us higher-quality outputs at more affordable rates. So, it's just restarted the application in a different window. I've pulled it back in. I don't know if it's done. Looks like it's still testing. Let's see if we can test it as well. I really don't like this emoji icon, but that's a problem that can be addressed later. "Test out the new voice input." Okay. All right. So, not yet. Uh, think upon the bug.
Let me check if React strict mode is still enabled. So, it looks like it's still working behind the scenes to troubleshoot. Really hoping that it's going to um address the issue. interesting conversations. Just trying to read through the chat. Okay. Okay. This Yeah, this is quite a lot. Um, yeah, that's that's a relatively old old machine. Old is gold anyway. Um, okay. Cool, cool, cool. So there's a circular dependency issue. so it's just trying to debug, trying to troubleshoot here, pinpoint the bug. It's discovered several issues. Not sure if they're all related to to the feature, but let's see.
The build succeeds. I wonder if there's a way for it to test the application out. If this were, say, a web application, it has access to the Playwright MCP, so I would assume it could easily test everything on its own in a browser. But this being a desktop app, I'm not sure there's a way to connect it. In case you've tried that, let me know. There may be an MCP server that we can explore. But it's asking. Okay: "I've launched the app." Let's try again. "Test out the new voice input feature."
Transcribing. Oh, yay. We got it to work. Okay, good job. Good job. So it's a team effort: GPT 5.5 and Opus 4.6. So, looks like that's okay. Try adding it. I can't believe that worked. Wow. And it's accurate. That's cool. So let's mark that as done. We've added the voice. Hm. Okay. So that's cool. I love that it has been able to integrate with our model hosted on Foundry. It was able to read through the documentation on its own and find out how to handle the integration. I just needed to come in and provide my API key and my endpoint, there were a few bugs that the model was able to work through, and it looks like we have working functionality.
So that's that's quite nice. Um, publish a rubber duck Thursday blog. Okay, cool. So it looks like it's it's working. Uh, so the last thing we're probably going to do is at this point we have not even opened this application in our IDE. I don't even know the folder structure. I don't know how the code is. I don't know how it was before. I don't know what the agents have introduced. So as I mentioned earlier, you can do a quick diff here just to review the files that have been added, have a glance at the changes that have been introduced.
So you can easily review all of these things right here, or alternatively you can use a built-in review agent. This is a code review agent that will analyze all the changes that have been made and then provide a comprehensive code review report based on the quality of the code that has been introduced. So I'm just going to run that, and I'll add some additional instructions to save the code review report in the root as code-review.md. Again, this is a built-in custom agent, so it has instructions to focus on code review specifically.
So we're just going to have that running, and it will produce a report that surfaces any issues, any security vulnerabilities, and any obviously bad code that has been introduced. Okay, any questions? And yeah, I appreciate you all having very good conversations in the chat. This is good. But yeah, if you've not worked with the Copilot CLI before, please do test it out. That is, if you enjoy working in the terminal. If this is not your preference, you can of course interact with Copilot in VS Code and across a variety of other IDEs and editors.
So you can have a really nice interface and then just interact with the agent. So that's that's cool. Um yeah, I think we've we managed to add one feature. I kind of hoped we'd have enough time to do more. So for instance, I'll just have to come back clean up on this icon. It's really really bugging me. Uh so that's something you can come back in and uh and and fix update. So, we can probably just work more on this on the next live stream that we have. Um, but yeah, what do you all think?
Any questions? I know it was slow. I've done research workflows before and they weren't as slow, so I'm not entirely sure why, but I'll do a couple more runs and see if it's my internet. I actually used it with medium reasoning effort, so that shouldn't be an issue. But yeah, I've done several research workflows and they didn't take as long. I guess it also depends on the scope. In this case, since I was doing the research in plan mode, I think that contributed to that stage being a bit slow, because it's combining effort towards the research with effort towards analyzing the project to create a plan to implement the new feature.
So I think combining the two probably made it that slow. Normally I start with research on its own and save that research file. Then I use that research to create a plan and save that plan, and then I pass that to an implementation agent. So there's a question here: if you wanted to introduce a local meetup group to getting started with agents like this, where would you recommend people start? So we have this GitHub Copilot Dev Days, the Copilot Dev Days page on Luma. We've had a series of GitHub Copilot Dev Days happening all over the world.
Let me just drop this. I don't know if this series is still ongoing, but this would be a good place to start if you can find a location close to you. We had so many events before. Yeah, you can see we had so many events. This is a series that GitHub just recently concluded. I believe this series is over. But the idea is that we bring all these communities together to talk about GitHub Copilot and do workshops. So this would be a really good place to start if you can find such events within your local community.
We have very many MVPs and many employees going around and doing hands-on workshops on GitHub Copilot. So that should give you some more insights and hands-on experience with Copilot. Right. So, an interesting comment here: "I'm curious to see which models are the best for real-time STT. I hope OpenAI's models will be better at Danish." To be honest, I think Whisper performs really, really well. But I haven't had a chance to try out different models, and I know we have a bunch. If I scroll over to Microsoft Foundry, I can filter down to the models and look for any speech-to-text models.
I think it would be fun to go through the list and see how they perform, especially on non-English languages. So we have speech to text and speech translation. The list is much shorter, but we do have GPT realtime, translate, and transcribe models, and we have some from Microsoft Research. So it's worth experimenting. I believe we also have a lot of open-source speech models. Unfortunately, I haven't really tested them all out. We could check popular benchmarks, but I do believe that these models are getting better.
So it's just a matter of time before we have very reliable and strong support for non-English languages. All right. So we see that we have a code review. We have a report ready. If I look through the directory, we do have a file called code-review.md. So that was added. Let's review the content of the file, code-review.md. So yeah, it actually surfaced a bunch of stuff. Issue one: memory leak, stream not cleaned up on component unmount. Okay, so this is good. This is really good. So before even pushing this code, I mean, I think this should be the default behavior.
This should be part of your workflow if you're using Copilot. Before you even push any AI-generated code, just run that built-in code review. It's going to surface some of the obvious problems in that code. So before you even look through the code yourself, I think this is a really good starting point. Back to the question that was asked earlier, on how you can use coding agents to make code review faster and more effective: I think this is one of those ways. Yeah, I think we're out of time. It's been an hour and a half, but that was a good exploration.
Ideally, the next move would be to log these as issues on the repo, and then I can assign them either to local agents or to the cloud agent on github.com and have some of these issues addressed before I even get the code on GitHub. Yeah. So, Ken emphasizes that you should use the Awesome Copilot repository. I'll show that briefly as the last thing. This is a resource that you want to fully utilize. As I said, it has community-contributed customizations. So if you want to work with agents, you can explore the available custom instructions depending on your framework, your language, your task.
Security, for instance: we should have a bunch of security-related custom instructions, right? So, security and OWASP instructions. This one here does a comprehensive code review based on the OWASP Top 10 vulnerabilities. So you can always bring in all these aspects of customization. You can customize Copilot at the repo level, at the organization level, and at the individual level. You can have hooks that are part of the agent loop itself. So as the agent is working, before it even exits the loop, you can work with hooks.
There's a new video on the GitHub channel on YouTube that explains what hooks are and how they work, and it's all in an attempt to give you more control over how these agents work, what tools they choose, what context they choose to pull in, and the quality of the output that you get. All right, so I hope this has been helpful. Thank you all so much for sticking around. I know we've gone 30 minutes over time, but this was actually fun. So I'm probably just going to save this, and we can pick up from there on the next live stream.
But if there's a specific topic you'd want us to cover, more than happy to adjust and address that. So, thank you all so much for joining the stream. If you'll be around for the next couple of hours, we do have this live stream across multiple time zones. So if you missed this completely, there's one that will start later in the day. So you can always just look through the GitHub channel and catch a live stream that's time zone friendly for you. So thank you all so much for your time. Enjoy the rest of your days, evenings, and enjoy building with AI.
Bye.