Build an AI Chat app with configurable personality from scratch | AI Ave Ep 3

Cloudflare Developers | 01:01:58 | Mar 26, 2026
Chapters (15)
This chapter introduces building an AI chatbot and sets up the series context, inviting you to create one using web development tools while outlining prerequisites like GitHub, Cloudflare, and Node.

Cloudflare Developers shows how to build a configurable AI chat app from scratch, including system prompts, an API, streaming responses, and a front end, all hosted with Wrangler/Workers AI.

Summary

Cloudflare Developers’ episode 3 guided by Craig walks you through building an AI chat app with configurable personality. He starts with a barebones frontend and backend, then demonstrates how to expose an API that can control the system message to shape the bot’s behavior. The video dives into a quick local playground, starting from a basic Hello World server, then replacing it with a Hono-based API, and finally wiring up a streaming OpenAI-like experience via Cloudflare Workers AI. Craig emphasizes practical tooling (Node, npm, Wrangler, Git, VS Code) and gives concrete steps to test endpoints with HTTPie. He also shows how to handle chat history as a list of messages and how to pass a system prompt to influence the model’s persona. The flow includes creating a chat endpoint, posting prompts, and iterating toward a streaming response for a snappier UX. Throughout, he notes where to find notes and how to deploy the finished app with npm run deploy. The tone stays hands-on, with real-world tips like using HTTPie for API testing and how to structure messages for chat models. By the end, you see a runnable project that can switch personalities (New Yorker, Karaoke wizard) and scale with Workers AI.

Key Takeaways

  • Install and boot a Cloudflare project via npm and Wrangler, then explore the starter with a hello world SSR example to understand the dev flow.
  • Create a Hono-based API (GET /api/hello) that returns JSON, then expand to a POST endpoint at /api/chat to handle user prompts.
  • Leverage Cloudflare Workers AI bindings to call a model (e.g., Llama 4 Scout) and return AI-generated content in API responses.
  • Structure chat interactions with a messages array (roles: system, user, assistant) to maintain conversation context across requests.
  • Add a streaming API (POST /api/chat/streaming) to deliver tokens in real time using server-sent events for a snappier UI.
  • Use HTTPie (or curl) to test API endpoints locally, validating both JSON results and streaming data.
  • Deploy the finished app globally with npm run deploy, unlocking region-earth hosting and daily credit resets for experiments.

Who Is This For?

Essential viewing for frontend and backend developers who want to build a configurable AI chat app from scratch using Cloudflare Workers AI, Hono, and Wrangler. Great for those new to API design and streaming AI responses who want practical, step-by-step guidance.

Notable Quotes

"Let's build an API where we can control the system message."
Craig introduces the core idea of controlling bot behavior via system messages.
"'You are a helpful assistant'... you can change these, right?"
Shows the default system prompt and how to customize it.
"This binding with the name of AI... is enough to start using AI with Cloudflare."
Introduces Workers AI bindings and how to access AI from code.
"We want to pass in all of the messages, right? We want this payload to have messages and the system message."
Explains the standard OpenAI-style chat format and payload shape.
"Deploy the finished app globally... npm run deploy."
Wraps up with deployment steps and the broader takeaway.

Questions This Video Answers

  • How do you implement a configurable system prompt for an AI chat app on Cloudflare?
  • What is Hono and how do you use it to create API endpoints in a Cloudflare Workers project?
  • How can I test Cloudflare Worker endpoints locally with HTTPie or curl?
  • How do you implement server-sent events streaming for AI-generated responses in a Node/Cloudflare setup?
  • What are the steps to deploy a Cloudflare AI chat app to production with Wrangler and Workers AI?
Cloudflare Developers, AI Chat App, Workers AI, Hono, Wrangler, OpenAI API, Streaming (Server-Sent Events), HTTPie, API Design, Frontend Integration
Full Transcript
Let's build an AI chatbot. It's like the hello world of AI applications, and I am excited for you to get to create one. Of course, if this is you saying hello world to building your first web app, we'll get that all set up for you, too. Oh, shoot. I should have introduced myself. Hey there, I'm Craig and I'm a developer. This tutorial is part of a series. There are notes attached to this video, and in those notes, there's an introductory welcome video that sets the tone for what we're about to get into. If you haven't seen it yet, I'd love it if you checked it out. If you don't got time for that, I get it. I'm doing my best here to bring along folks who are building apps with some of these amazing vibe coding tools, but might not yet have the fundamentals. Now, if this isn't you, feel free to speed me up and check the chapters. I labeled them pretty good. Jump around. I'm going to be expecting you to have a GitHub account, a Cloudflare account, Node, and an editor installed. All of the links and instructions for those are in the notes. In this tutorial, we are going to build an AI chatbot from scratch. We'll build the backend API and then we'll generate a frontend for you to consume that API. API stands for application programming interface, and we'll dive into that a bit more as we go. We're going to build an API where we can control the system message. You know, control the tone and behavior of how you want the chat app to behave. Let's explore that first in a working app that you'll have by the end of this tutorial. All right. So, here we are. This is a barebones chat app. You'll see that it's defaulted to "You are a helpful assistant." If we type hello and send it, it says, "Hello, it's nice to meet you. Is there something I can help you with?" So, we now have a working chatbot. This is the default that most of them have, but actually you can change these, right?
So, I'm going to go ahead and click this New Yorker here. "You're a longtime New Yorker. You speak with a thick accent. You are helpful eventually, but often want to rile people up." What's really powerful about these system messages is you can define the behavior and how you want it to behave, right? So let's do that. Let's think about how we can do that. I married into the New York thing, so I'm allowed to say that. I think a lot of New Yorkers are proud that they're kind of this way. They get stereotyped as being rude, but I think they really like to have fun with this. So how could I really... Oh, this would work. "Where is the Pizza Hut in Times Square?" Pretend to be a tourist here and ask that of a New Yorker and see what they say. They might say something like, "What do you mean you can't find the Pizza Hut in Times Square? Forget about it. You're telling me you've been living in this city for years and you don't know what the Pizza Hut is?" All right, cool. So, it's taking the information that it has and bringing it into this system that we created, right? We made this system message, and that's really fun. Feel free to play around with that. See what else you can do with that New Yorker. You can also write your own of these as well. I'm going to show you the karaoke wizard as well. "You're a karaoke pro. You love to give people suggestions of what to sing." You share YouTube links like this, with karaoke plus the name of the band and their song name, URL encoded. That's the way it will generate these things for us. And you encourage them to practice with YouTube. So, "I'd like to sing a punk song." I really want to sing one at karaoke. Let's see what this karaoke wizard says for me. "Punk is a great genre. How about singing Blitzkrieg Bop?" What a great idea.
And I can grab this, and you'll notice that this isn't fully linked out here. That's because we haven't rendered this in markdown, but it's still working. We still got back this information. I'm going to go here and paste that over here. And sure enough, look, it searches for karaoke, and here's Blitzkrieg Bop by the Ramones, and if we look in here, we've got all of our lyrics. A nice karaoke song, right? Awesome. Super fun. I would love for you to come in here, and we're going to build this locally so you can play around with what these are. Typically, this would be hidden on the back end, but we put it on the front here so you could play around and try different things out, because that's the best way to learn, right? Explore. Pretty awesome, right? Now, you might hear the term GPT wrapper. Typically when they say that, that's when someone's written a very smart system prompt that makes the chatbot work in a specific way within specific constraints. All right, let me show you how I start my projects from scratch. Okay, so if you do not yet have Node.js installed on your machine, this is where you would go: nodejs.org. Depending on whatever language you're on and whatever system you're running, you can go ahead and find the right thing that you need to do here. This will always look a little bit different. Your version is always going to be a little bit different, but after you get it installed (and check the notes if you need that), I want to show you how I normally start a Cloudflare project. There is a package manager called Node Package Manager, npm. If you run npm create cloudflare@latest, this will run through and spin up a nice little wizard for us to build some stuff. So, I'm going to go ahead and say yes, it's okay to proceed. Again, this is probably going to look totally different.
I'm going to call this thing chatty. That's the name of the app. The hello world example is a great thing to do. There are different frameworks and things; you can use your up and down arrows to go through here. I'm going to do the hello world example. A hello world is the first thing that you typically write to the screen when you start out a program. And I'm just going to use the default, this SSR (server-side rendering) full stack app. I'm just going to choose that. Again, this might look completely different, but I'm just going to choose the defaults here. And I'm going to choose TypeScript, which is a superset of JavaScript that gives you stronger types, and it will help us with some autocomplete stuff. I'll show you how it all works. Cool. So, it's going to go and install the things that we need, including a command line tool called Wrangler, which we're going to use to run some things. And we are going to use git. I'd like you to say yes, that you do want to use git. And I don't think we should deploy our application just yet. We should probably take a look at what it is, right? So, we're going to say no. I'm on a Mac and I'm going to get into that directory. This is my terminal here. I guess I probably should have introduced at the start that this is my terminal on a Mac. If you press command space on a Mac and type terminal, you will find a thing like this. And on Windows, in fact, just check the notes for the best way to open up your terminal. I'm going to change directory into that directory that we just made, which was called chatty. And now that I'm here, I'm going to open up my development environment.
I could run some stuff here if I wanted to, but I want to open this up in Visual Studio Code. So again, check the notes if you don't have a code editor, and also check the notes if you want to learn how to do this. I want to open up my editor in this directory, so I'm going to do code, space, period (code .). So here we go. Here's the program called chatty that we have, and you'll see that it created some nice scaffolding on the side here. I'm going to open up this src folder and I'm going to open up index.ts. This is the worker. This is the Cloudflare Worker that's running a function for us here. Let's take a look and see what happens when we actually run the server. I also have a terminal here; you can say Terminal, New Terminal. I'm going to collapse these little options here just so we're right here. Okay. So to start that up, I'm going to do npm run dev to start the development server. You'll see that it says I can press B to open a browser, and it's going to be open on localhost:8787. So I'm going to press B and we'll see we have our hello world. And if I click this, I can see that it gets this random ID. Let's take a look at the code that is generating that. First off, this is the function signature that you need to run one of these servers. If something is going to /message, it will say hello world. If something's going to /random, it's returning crypto.randomUUID, and otherwise it's saying not found. So this is the backend server that's running. And then we also have this front end that's serving this index.html. You'll note here's the hello world. Let's change this a little bit. I'm going to save it, and you'll notice that it did a hot reload there.
So the server reloaded, and now I should have... Oh, I changed the title. See, the title is hello world with three exclamation points there. I actually meant to change the heading, but the heading is coming in automatically. It's using this thing called fetch. It says fetch the message, get back the response text, and set it in the H1. The H1 is this element with the ID heading, and it is setting that heading there. When it's saying fetch /message, it's saying fetch from the backend server, right? The backend server here has this /message. So it's a little bit of an API, right? If I go to /message, I will get hello world. In fact, watch this. If I come here and I go to /message, I should see back hello world. So I do see hello world when I go to /message. What's happening is this front end is pulling in that hello world. And of course /random is being called because of a button click, right? This random button, its ID is random, and so it says: when a click happens on the button, fetch /random, get the text back, and set the text content. That's why we see that random number come across, and that's coming from this random UUID. So again, on the client side it's doing a fetch. If I come up here and I say View, Developer, Developer Tools, it's going to open up this side panel over here. If I go to this Network tab, when I refresh this page you'll see that it comes in. There it is. It hit /message and it got back hello world. We go to the message and then we go to Response, and you'll see that hello world came back. If I click this random button, here's the request for /random and here's the response that came back. So it's making a web request to our back end.
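The routing described for the starter worker (hello world on /message, a random ID on /random, not found otherwise) can be sketched as a plain function. This is an illustrative sketch, not the generated starter file: the name routeStarter, the injected makeId parameter, and the exact "Hello, World!" string are assumptions made so the example runs without a Workers runtime.

```typescript
// Plain result type so the sketch runs without any Workers runtime.
interface RouteResult {
  status: number;
  body: string;
}

// Mirrors the three branches described above: /message, /random, otherwise not found.
// `makeId` is injected for illustration; in a Worker it would be crypto.randomUUID.
function routeStarter(pathname: string, makeId: () => string): RouteResult {
  switch (pathname) {
    case "/message":
      return { status: 200, body: "Hello, World!" };
    case "/random":
      return { status: 200, body: makeId() };
    default:
      return { status: 404, body: "Not Found" };
  }
}

// Hypothetical wiring inside a Worker (not executed here):
//   export default {
//     async fetch(request: Request): Promise<Response> {
//       const { pathname } = new URL(request.url);
//       const { status, body } = routeStarter(pathname, () => crypto.randomUUID());
//       return new Response(body, { status });
//     },
//   };
```

The front end's fetch("/message") call ends up in the first branch, and the button click ends up in the second. Adding more branches to one switch is what gets unwieldy as endpoints multiply, which is the motivation for moving to a web framework.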
Awesome. So we have all of these default things here. This gets a little unwieldy, as you can imagine, if you have a lot of different endpoints. One of the things that I want to do is write an API, an application programming interface. I find one of the best ways to do that is to use a web framework, so I'm going to use my favorite web framework. What I'm going to do here is grab all of the code in here with command A or control A, and press delete. I have emptied out that entire file. I'm going to stop the server from running with control C. And now I'm going to install, using that same Node package manager: npm install hono. I'm going to install Hono, which means flame in Japanese. There we go. So I have installed Hono, and Hono looks like this. All of this code is available in the repo. I'm going to import Hono from hono. I pressed tab and it automatically did that. That's a nice autocompletion that knew what was happening there. The way that Hono works is you make a new app: const app equals new Hono. And I am going to export default app. That is what you need. This code is available if you need to copy and paste. I know sometimes it can be hard to type exactly what you're trying to do here. So I make a new app, I export that app, and we're going to do app.get. We want to do a GET request to it, right?
That's just going to the URL. Let's make a new API route, /api/hello, and that takes this callback. I know I'm writing a bunch of stuff there, but this is going to take this asynchronous callback, and we'll talk about what that is in a second. We're just going to take this object called c, for the context. The Hono context is what we want there. I'm just going to return right away. I'm going to return c.json. So, we know about JSON. Return back some JSON. I'll make a new object here that says hello world. Okay. Now that I have that, I'm going to go ahead and start my server with npm run dev, and I am going to go here to my localhost. You see the not found, right? Because I got rid of the route that was there. But we're going to go to /api/hello, and we get back our hello world, and we get it back in proper JSON, right? It took that object and turned it into JSON. That's a really nice thing that Hono does for us automatically: it takes the result and turns it into JSON. And that's what we can use on the client side. It makes it very easy to read, right? It's kind of the common language of these APIs. Oftentimes you're going to make a request to an API and you're going to get back JSON, or you're going to tell it to go do something, make a change or something, and it will return back JSON. So this is a way that you can interact, and we're going to build our own API. Most sites that you look at have these APIs, and there are tons of external tools, but they can be internal, too. You can have APIs that you're using internally as well.
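To make the "object in, JSON out" step concrete, here is a minimal sketch of what returning c.json amounts to. The json helper below is a hypothetical stand-in written for this example, not Hono's implementation, and the { hello: "world" } shape is an assumption based on the description above. The Hono version appears only as a comment, since it needs the hono package installed.

```typescript
// What a JSON response boils down to: a status, a content type, and a serialized body.
interface JsonResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Stand-in for the idea behind c.json(): serialize a plain object and label it as JSON.
function json(data: unknown, status = 200): JsonResponse {
  return {
    status,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(data),
  };
}

// The handler behind GET /api/hello, written as a plain function.
function helloHandler(): JsonResponse {
  return json({ hello: "world" });
}

// With Hono installed, the same endpoint would look roughly like this (not executed here):
//   import { Hono } from "hono";
//   const app = new Hono();
//   app.get("/api/hello", async (c) => c.json({ hello: "world" }));
//   export default app;
```

Because the body is plain JSON text, any client (a browser fetch, curl, HTTPie) can parse it the same way, which is why JSON works as the common language between front end and back end.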
For more information on APIs, I'm going to say it again, just check the notes. I want to show off one of the ways that APIs work internally on Cloudflare for the apps that you're building, because it's interesting. It's a little bit different than how you might have seen things in the past. What I want to do is use AI in this. I want to use a model. Instead of saying hello world, why don't we make it so that we generate hello world in a bunch of different languages, right? We can do that with generative AI. So let's do that. I'm going to go into my wrangler.jsonc file. It's over here on the left. Wrangler is how we control the application that's here. Wrangler will run stuff. When we did npm run dev, it actually did a wrangler dev, right? So it starts the development server, and this is where you do the configuration, in this wrangler.jsonc. The c there means that you can put comments in it. Normally you can't put comments in a JSON file, so this lets you put comments in your JavaScript Object Notation, your JSON. All right. So we're going to add a new binding here. I did a comma, and then I'm going to add this ai key, and we're going to do a binding with the name of AI. Now, believe it or not, that is enough to start using AI with Cloudflare, which is really cool. Super powerful. That's going to get access to our account, right? Because our account knows who we are, or it will here in a second. Now what we're going to do is let TypeScript know that this exists, because we want to have some autocompletion. So here, check this out. I'm going to stop the server and I'm going to say npx, because I want to execute locally here.
I'm going to execute that wrangler command and I'm going to do types. I'm just showing you how you can run Wrangler locally: npx wrangler types. So that generated, and now we're able to access this AI. There's going to be a binding with the name of AI. Let me show you where that lives. What we're going to do is get back results from this method that is off of c.env. So, that's the environment. The way that you get the environment into Hono is by specifying it. We're going to say that the bindings are in here, in this Hono definition. And so now when I do c.env., you'll see that I have the bindings that I've defined. I've bound to this AI, which is Workers AI, which gives us access to that. On that there is a method (there are actually quite a few methods now) called run. So we're going to run, and I'm going to run Llama 4 Scout. That's the model ID that I want, from Meta. That's just what we're going to use for this example. Again, you're in the future; there are probably different models. This one is the one that I want to start using here. You could use any model here as well. You don't even need to use ours if you don't want to, but this is the Workers AI way of doing things. So, we're going to come in here, and it takes some parameters. It takes some options, right? One of the options that it takes is called prompt. And we are going to prompt: say hello world in five different spoken languages. That's the prompt that we're going to use. We have the results that are going to come back here. Let's just write them out to the screen so that we can see what that says. So I'm going to return results. I'm going to save that. And so now at the API we're going to get back the results of this call, right? This call here to this run method.
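The binding call described above can be sketched with a small stand-in for the AI binding, so it runs without a Cloudflare account. Everything named here is illustrative: AiLike is a trimmed-down assumption of the binding's shape (the real Ai type comes from the file generated by npx wrangler types), fakeAi is a stub, and the model ID string is an assumed catalog name for Llama 4 Scout that should be checked against the current Workers AI model list.

```typescript
// Minimal shape of the part of the Workers AI binding used here (an assumption for this sketch).
interface AiLike {
  run(model: string, options: { prompt: string }): Promise<{ response?: string }>;
}

// Assumed model ID for Llama 4 Scout on Workers AI; verify it in the model catalog.
const MODEL = "@cf/meta/llama-4-scout-17b-16e-instruct";

// Calls the model with a fixed prompt and returns whatever the binding gives back.
async function helloInLanguages(ai: AiLike): Promise<{ response?: string }> {
  return ai.run(MODEL, { prompt: "Say hello world in five different spoken languages" });
}

// Stub binding that records each call instead of contacting a real model.
const calls: Array<{ model: string; prompt: string }> = [];
const fakeAi: AiLike = {
  async run(model, options) {
    calls.push({ model, prompt: options.prompt });
    return { response: "Hello World / Hola Mundo / ..." };
  },
};

// Usage with the stub: logs the canned response.
helloInLanguages(fakeAi).then((result) => console.log(result.response));

// Hypothetical Hono wiring (not executed here), with the binding declared in
// wrangler.jsonc as "ai": { "binding": "AI" }:
//   const app = new Hono<{ Bindings: { AI: Ai } }>();
//   app.get("/api/hello", async (c) => c.json(await helloInLanguages(c.env.AI)));
```

Passing the binding in as a parameter, rather than reaching for c.env inside the function, is a design choice made for this sketch: it keeps the model call easy to exercise with a stub before any real credits are spent.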
Let's see what happens. I'm going to do npm run dev. Now, if you haven't logged in before, you are going to hit this page, and it might take a second. It might require you to come over here and choose your account. I'm choosing my account here. And it might ask you to log in. If it does, go log in. You'll see that I did a GET to /api/hello, and I got back a 200 OK, which is exactly what we want to see. And here we go. We got back a JSON object, and let's do this pretty print. It's got a response, and we got hello world in different languages. In Spanish, we have Hola Mundo. In French, we have Bonjour. We have Nǐ hǎo, and I'm not sure how to say this one in Arabic. Portuguese, Olá Mundo, as well. I hope that those are correct. You also get back some additional information about what was done with that AI call. But as you can see, in our file here, we have defined a successful first API. With our first API built, we're ready to build our AI chatbot. We're going to use an open-source model that is hosted on Workers AI to complete this. We have a pretty generous free tier, so you should be able to do this without entering a credit card. The Workers AI platform has an API that we're going to use in our API. It's kind of like that spinny-top Leo DiCaprio flick: API-ception. No, this happens a bunch in the AI app world. You take input from the user, you pass that to an AI model API, you get the output, and you pass that back. So let's do that. Okay, so we have an API endpoint for /api/hello, and we are using the AI API from Cloudflare inside of it. What we want to do is take input from a user. Typically when you do that, you want them to send you something, and they're usually changing something, and the way that you do that is with the HTTP verb called POST. So they're going to post information to us.
So instead of GET, right? We're doing app.get, and that's when we were doing a GET to /api/hello. Now we're going to do a POST. Instead of using the URL to do that, we're going to need a different way of doing it. We're going to say /api/chat, and we're going to follow that same pattern where we give it an asynchronous callback with the context. The way that we're going to handle stuff is we're going to pass a JSON payload. We're going to post a JSON payload with information to our API. That's how we're going to let people interact. The way that looks is we're going to say const, and we're going to get a thing back. We're going to call it payload, and we're going to say c.req.json(). Okay. So we're going to pull back this JSON object, and we're going to call it payload. Right away we'll just output that right back to the user. We're going to send back JSON of what that payload is. Whatever it is, we're basically doing a little echo here first just to make sure that it works. So now we have an interesting problem: how do we make this POST without having a front end? There are a couple of ways that you could do it. There are a couple of REST browsers that you could use. There's a command line option called curl. Check the notes for both of those. If you wanted to, you could use an app like Postman, where you could go and test sending information back and forth. One of the things that I think is really handy is a little application called HTTPie. So, if you go to the HTTPie site, it's what we want. It describes itself as the API testing client that flows with you. This is what it looks like. I'm going to go to the terminal section here. This might look different. Check the notes for this.
If I go to this install page, it will show you all the different ways that you can install it. Of course, if you're coming in from a different system, there are a bunch of different ways to install here. Check the notes for what's right for you. This brew install httpie is probably what you're doing on a Mac. There's choco if you're on Windows, but definitely check the notes. Here, there's a learn more link, and they can walk you through how to go about doing that. After you get that installed, you'll have an executable on your machine called http. Let me show you what that feels like to use. Another thing that I'm going to show off here is I'm going to keep my server running, right? When I saved, it reloaded my local server. I'm going to make a new terminal, so I'm going to have two terminals. You'll see here this is my first one, and this is my second one. I wanted to have this running so I could show you two terminals running at once. You could open up a new terminal window, too, if you have a different editor. So we do http, and we're going to say that we're going to do a GET, and we're going to go to http://localhost:8787, because that's where the server's at, right? And I'm going to go to /api/hello, and we should get back the results of this in JSON. One of the nice things about HTTPie is it color codes it for you. I think that's very nice. So we got hello world in English, and yeah, it's doing it again. Awesome. Super cool, right? But what we want to do is a POST to the new endpoint that we just added, and we're going to see what this payload is. So let's do that. This is pretty cool. If I go http POST, then I put in the server again. Again, that's localhost.
That's my local server, and it's on port 8787, which is what got started when I started the server. And we're going to do /api/chat. Really neat: by default, anything that you put after this is just parameters that it will pass. So you could say hello equals world, and anything you pass after this, it'll keep on doing that. You see that it returns hello world. You can add a first name, and you can do more advanced stuff too, deeper ways of doing that. But it's doing the pass-through, right? It's echoing what I submitted. I'm posting JSON into that thing and getting it back. So let's see if we can't just pass in a prompt, right? Let's do that. I'm going to grab this, right? Instead of hard coding that AI call's prompt, what we'll do is const results, and we'll get back results here. We know that results is going to have that response, so we can actually return results.response. And instead of this hard-coded prompt, we'll say payload.prompt, right? The prompt that we're going to use is what came through this payload. We're going to pass it into this prompt, and then we're going to return the response back. Let's see what happens there. So we put in a prompt, and we can say write a poem about New York pizza. Now, I don't have the system message in there just yet, but I'm going to just prompt by default to see what it says about New York pizza. Oh wow. "In the city that never sleeps, there's a pie that's made to keep the flavors of Italy's heart in every slice. A work of art." Beautiful. Nice little poem. It can do that sort of stuff, and it knows about New York pizza.
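The step from echoing the payload to forwarding payload.prompt to the model could look roughly like the sketch below. It is an illustration under assumptions: AiLike and stubAi stand in for the real binding, the model ID is an assumed catalog name, and the payload check with its 400 response is an addition for this example that the walkthrough does not include.

```typescript
// Trimmed-down stand-in for the AI binding (an assumption for this sketch).
interface AiLike {
  run(model: string, options: { prompt: string }): Promise<{ response?: string }>;
}

// Assumed model ID; verify it against the Workers AI model catalog.
const MODEL = "@cf/meta/llama-4-scout-17b-16e-instruct";

interface PromptPayload {
  prompt: string;
}

// Type guard: true only when the posted JSON has a string `prompt` property.
function isPromptPayload(value: unknown): value is PromptPayload {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { prompt?: unknown }).prompt === "string"
  );
}

// Core of POST /api/chat at this stage: forward payload.prompt, return results.response.
async function chatWithPrompt(
  payload: unknown,
  ai: AiLike
): Promise<{ status: number; body: unknown }> {
  if (!isPromptPayload(payload)) {
    // Not part of the walkthrough: reject bodies without a usable prompt.
    return { status: 400, body: { error: "Expected a JSON body with a string prompt" } };
  }
  const results = await ai.run(MODEL, { prompt: payload.prompt });
  return { status: 200, body: { response: results.response } };
}

// Stub binding that records the prompts it receives.
const seenPrompts: string[] = [];
const stubAi: AiLike = {
  async run(_model, options) {
    seenPrompts.push(options.prompt);
    return { response: "In the city that never sleeps..." };
  },
};

// Hypothetical Hono wiring (not executed here):
//   app.post("/api/chat", async (c) => {
//     const result = await chatWithPrompt(await c.req.json(), c.env.AI);
//     return c.json(result.body);
//   });
```

With the dev server running, a request along the lines of http POST localhost:8787/api/chat prompt="Write a poem about New York pizza" would exercise this path, since HTTPie turns key=value items into string fields of a JSON body.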
So you now have an API where you can post information. The other thing that I wanted to show off is that, if you're up in this code here, you can do a console.log if we ever wanted to see what that payload was. We could say something like "the payload is" and then pass that payload in, right? So it'll log out whatever this payload was. I'm going to save that, and it's going to refresh the server here. Then let's ask for another poem about New York pizza. That's going to do a POST. And if we look here, we'll see the payload was prompt: write a poem about New York pizza. Great. So, we now have an API that can take input from a POST. Now, we really need to think about what a chat really is. It's the back and forth conversation between a user and an AI assistant. Typically these are called messages, and you define them with roles and what content they provide. OpenAI standardized this, and there are a bunch of external model APIs that support a format that looks something like this. This is often called the messages list. See how the system message is defined, and then we see what the user asked and how the assistant responded. To continue a conversation, you just need to make sure that the AI knows what was said in the past. So let's make sure that we can pass in a list of previous messages, and we should also pass in the system message as well. What we want to do is, instead of getting a prompt, pass in all of the messages, right? So this payload is going to have messages, and instead of prompt here, we're going to have messages, and this is going to equal payload.messages. Okay, we'll start with that. We'll make sure that we can get that working. Okay. So, here's the way that works in HTTPie.
So, the way that you can make that work is, if we go here, we say messages, then a colon-equals (:=), and then a single quote. Now we can use typical JSON in here. Again, what you want to say is that the role is user, and the content is what we want to pass in. Then I'm going to end that object. So it's an object that has a role of user and content, and for the content we'll say, "My favorite topping on pizza is pepperoni." So I've got an array of messages that I'm passing, and we'll make sure that this works properly. We're going to output what the payload was, and it should have a property called messages that is an array, right now of only one message. Let's send that across. And we'll see that it responded back: "A classic choice. Pepperoni is a popular pizza topping for a reason. It adds a nice spicy kick and a meaty flavor to the pizza." So I'm just going to say "a classic choice" is what it said back. If you haven't seen this before, I'm going to use the up arrow to go into the history of what was typed here. Now I'm going to give it another message. I'm going to say role, and this is what it returned back, so the role is assistant, and I'm going to give it a property of content and paste in "A classic choice," which is what it said. Because it's a conversation, and because I'm pushing that information forward, we can now ask another question. The last message here will be from a user, and the content is going to ask, "What is my favorite?" It's going to remember, but not really: it only knows because the earlier messages are part of the conversation we're sending. So, of course, it's going to be able to remember.
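Carrying the conversation forward by hand, as done above with the up arrow, amounts to appending the assistant's last reply and the next user question to the list. A small sketch, where `nextRequestBody` is a hypothetical helper name:

```javascript
// Hypothetical helper: builds the next request body by appending the
// assistant's reply and the user's follow-up to the existing history.
// The original history array is copied, not modified.
function nextRequestBody(history, assistantReply, nextUserMessage) {
  return {
    messages: [
      ...history,
      { role: "assistant", content: assistantReply },
      { role: "user", content: nextUserMessage },
    ],
  };
}

const firstTurn = [
  { role: "user", content: "My favorite topping on pizza is pepperoni." },
];
const followUp = nextRequestBody(
  firstTurn,
  "A classic choice.",
  "What is my favorite topping?"
);
```

The model has no memory of its own here; it can only answer the follow-up because the earlier turns are included in `followUp.messages`.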
So we're going to say, "My favorite topping on pizza is pepperoni." It replies with "a classic choice." Then we send that back and ask, "What is my favorite topping?" and it should return something about pepperoni if we've got a working array. Awesome. So we've stored the conversational context in this array that we're just going to pass up each time, and we pass that array into here. But one of the other types of messages that you can pass across is that system message that we were playing with, the one from the beginning where we told it how we wanted it to talk. We made it be the New Yorker or the karaoke wizard. We probably want to have a field for that. So in the payload we'll pass up the messages, and then we'll also pass up the system message. I like to put the system message first of all the messages so that it defines how the conversation should behave. The way that you can do that is a thing called JavaScript destructuring. I'm going to pull messages off: if there's a property called messages, it will be stored in a variable called messages. And I'm also going to store one here called system message. So we're going to pass up messages and system message, and this is just going to pull them off of this JSON object. And now, because I don't have the payload variable anymore (that's what that squiggle is about), instead of payload.messages here we just say messages.
With JavaScript, because it's got the same name, this is equivalent to messages: messages, so we'll just write messages like that. And let's see if there is a system message. If there is a system message, we want to add it, so we say messages.unshift. That means we'll put it at the start of the array. We're going to make a system message: the role is system (spelled right) and the content is whatever that system message was. So we've added a new object with the system message at the front; unshift puts it at the front. So now it's got the messages there. Another thing we can do: there are some different properties here. We can make the response longer if we want to with max tokens. Let's just let it be as long as it needs to be, so we'll say max tokens is 8,000. Let's go ahead and make sure that we can get the system message up there. We'll start again with an HTTPie POST: http POST localhost:8787/api/chat. Again, we're going to pass up the messages array. It's going to be an array, and we're going to say a role of user and the content. This is called the chat style, and lots and lots of APIs use it, so learning this will help. The content here is "hello"; we're just going to say hello to it. The other variable we want to pass along is one called system message, which we're going to pull off as well. For the system message, let's see: "You speak only in Pig Latin." That'll be fun. So I'm going to send the chat and say hello. Let's see what it comes back and says. Hello. Hey. Perfect. So now we have a couple of options.
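The destructuring and unshift steps above can be sketched like this. The field name `systemMessage` and the helper name `buildModelInput` are assumptions (the transcript only says "system message"); `max_tokens` is set to the 8,000 mentioned above.

```javascript
// Hypothetical helper: turns the posted payload into the input object
// for the model call. Field names are assumed, not confirmed by the video.
function buildModelInput(payload) {
  // Destructuring pulls both properties off the parsed JSON body.
  const { messages = [], systemMessage } = payload;

  // Copy first so the caller's array is left untouched.
  const all = [...messages];

  // unshift puts the system message at the front of the list.
  if (systemMessage) {
    all.unshift({ role: "system", content: systemMessage });
  }

  // `{ messages: all }` could be written `{ messages }` if the variable
  // shared the property's name (shorthand property syntax).
  return { messages: all, max_tokens: 8000 };
}
```

Putting the system message first means the model reads the persona instructions before any of the conversation turns.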
We could store these messages, the chat history if you will, on the server side, but for simplicity sake, let's keep them all on the client. If you're interested in learning how to store messages on the server side, check the notes for more information. So now I'm going to use AI to make our front end. If you want to skip this part, the repo is available and has a pretty nice already cooked up front end. Feel free to jump there. Okay, so this is a little wild. We're going to use, you know, the saying, we're going to build the airplane in mid-flight. In this situation, we're going to build the airplane with the airplane mid-flight. So, so we don't have a front end, but we're going to make one. And the way that we're going to do that is we're going to try and see if this works. I'm pretty sure it will. Uh AI is a little funny, right? It's always going to be a little bit different, but uh because we're using the same model that I am, this should be pretty similar, uh to what's going to happen. So, I want to do that that HTTP post like we just did, right? All right. So, I'm going to use my up arrow uh to get in there and uh I'm going to get rid of the system message. We're not going to we're not going to pass a system message in this time because what we're going to what we are going to pass in is this. So, we're going to pass in messages. And I have in my clipboard I've got uh this and we are going to read it. Uh and it is Oops. You're going to read this uh we're going to read this together here. Let's let's get into it. It's kind of wild that we could do this, right? We could let let's remember what we're doing before we get frustrated by by what this looks like. We're going to write code. We're going to have the LLM is going to write us code. And what it's going to do is it's going to say write a simple chatbot website front end that submits a messages array via a post to API chat. That's what we just did, right? 
We made API chat, and we want to submit the messages array. Like I said, it's common enough that the model knows what that is. "The app is titled Chatty. Store the messages on the client side." So we've made that decision: instead of storing them in a database or anything, we just want it to write the front end for us. "The message object has a role of user or assistant. Clearly delineate the message by role. The message object has a property called content, which is the message to display." So we've been very, very specific. I probably didn't need to be so specific, but I wanted it to work for you, when you run this, the same way that it's working for me. "The response from API chat will have a key called response, which will be the assistant message. Generate all HTML, CSS, and JS. Additionally, add an admin section with the ability to add a system message text area that will submit along with the POST to API chat, named system message." Remember, on the original Chatty app that we had running over here, this is the section that I'm talking about, and we're going to have some buttons there; we'll fill those in later. "Default it to 'You are a helpful assistant,' and create example buttons that will fill in the system message area on the client side. I will create the examples later." This prompt is in the notes, exactly like this. So I'm just going to run this once and we'll see what happens. It didn't like something. Let's do that. There we go. It's running a long set of instructions, but I did give it a lot of max tokens: I set max tokens to 8,000, so we've got a lot to work with here. Awesome. So it did return a bunch of stuff, and I'm realizing now that it returned it as a string, and that's not really what we want, right?
We want to we want to probably put this out into a file. So, check this out. So, instead of returning JSON, let's return text. So, we're just going to return uh straight text. So, what will happen is it will run this and it will run it and um if you haven't seen this before, this is pretty cool. You can uh pipe. So, after after a command, after a command, you can pipe it to something. So, I'm going to say app uh markdown. So, markdown is a way of writing things and it's the way that this is going to return uh information typically. So, it's a way of of rendering and I'll show I'll show what it looks like here in a second. So, I'm going to say do that again, but this time write it to a new file called app.md. So, this will help us uh save it and take look a look at things and see it actually what it said instead of that that uh stuff there. It's a kind of a nice way to clean it up. So, I'm going to send it there and because I have an editor that I can edit MD files. It's going it's writing it's going to write to that file. It's going to take the results the text results and it's going to put it into that file. And there we go. So, let's take a look and see what happened. Wow, it looks like I might have started. So, so it's telling us that we have a file called index.html and it should be called chatty. It's got a thing called styles CSS. Uh it says type a message and there's a send button admin section and there's example one two and three and then it's got a script.js and then uh you'll see here in markdown like this is this is a heading for CSS and this is how it does the code and the reason why color coding is working which is why I wanted to put it into a file is so that we can see what's happening. So these are styles. It even did the made a nice little style looking thing there. And then uh it dropped some of the JavaScript here. So there's the the JavaScript. So uh I think we should try to see if this works. Um uh and of course, you know, it's AI. 
It might be a little bit different each time it runs. Uh but let's see. Let's see how close we got to uh the way that Chatty looked uh when we got started. So I'm going to highlight this, right? So I'm going to highlight all of the stuff. So I'm holding down shift and pressing the arrow keys as I do that. That's how I did that. And I'm going to command C. I'm going to go into index.html over here. This is the this this public uh /public index html. This is the original hello world uh that that came with the starter. Right? So, I'm going to get rid of that. I'm going to paste our HTML. And uh we have a file that's called styles.css. So, I'm going to make a new file called styles.css. And I'm going to grab from my app.md file. I'm going to grab what that that should be again highlighting with holding down shift and uh going down to where it ends the way that it ends in markdown. See that triple the triple back tick is how it how it's doing that. So I'm going to copy that jump in styles paste. Save that file and I'm going to do it one more time for our JavaScript. Going to grab the JavaScript here. Now in code generation tools these days it will go ahead and automatically modify your files. This is kind of a throwback to how uh when we first realized that we could do something like this, we did a lot of this copy and paste. So, I'm going to copy this and we're going to put this over into script.js. We're going to call it script.js and I'm going to paste there. So, that's my my JavaScript there. Wouldn't it be wild if this worked? So, uh let's take a look. It's always a good idea to take a look at what things are doing, right? So, let's see. So, it's going to do a fetch, right? We saw that in, uh in the beginning here. 
So it's doing a fetch to API chat, and it's going to make a POST with application/json, and it's going to pass across the messages and the system message. The messages are just local, in memory here, and the system message looks like it's coming from an element with an ID of system message. So it converted what I asked for pretty well. This is the JavaScript on the client side that's going to post to our API. Let's do it. The server is still running, but in case your server failed somewhere along the way, you can run npm run dev to start it again. All right, are you ready? I'm a little nervous. I'm always a little nervous when you first go to look at it. Let's see what it looks like. That looks pretty similar to what I had in the beginning, and that's because the prompt is the same. I hope this works. I'm going to say hello. Fingers crossed. That's a big send button. Let's see. Oh, no, it didn't work. What happened? It said "hello" is not valid JSON. Let's see if we can figure out what's happening on our side. You know what? I bet we left it as text. That's what happened. In our source, in the index, remember we set it to c.text. I'm going to make that JSON. And instead of returning just the response, I'm going to return all of the results, whatever is there, because remember, in my prompt I said that it has a key called response. So this results.response is what the front end is going to use. Let's try that again. All right. No worries. We got it. Let's make sure our system message is working. "You talk like a baby." "Tell me about pizza." Do babies like pizza? I don't know. You'll notice it's taking some time, but it is running. At least I hope it is. Whoa. That doesn't sound much like a baby, does it?
Let's go take a look at our code and see if it's working. So the messages are that, and the system message was "You are a helpful assistant." So it didn't send "you are a baby." Let's try that again. I wonder why that didn't work. "You talk like a baby. Tell me about pizza." Let's go look and see if it came through this time. Ah, it's still saying "You are a helpful assistant." So something's wrong with the JavaScript. Let's see if we can figure that out. I'm going to find every place system message is mentioned. It says get element by ID system message, and it's not doing that each time. So what needs to happen is we either need to do it when the value changes, or we just copy this and, when we're going through to send the message, do it in there too. I think we could probably just say const system message here. Oh, you know what? It's probably better to do this: system message equals this. Let it be a let, because we're doing the sets on the other side. Let's try that. I saved that; notice the server refreshed by itself. I'm going to come over here and refresh. "You talk like a baby, and I want you to tell me about pizza." What an adorable baby. "Oh, sauce crust crunching past me." Okay. Wow. All right. So it was very close to working, but this is something you might run into as you're generating things like this. One of the things we could have done is say, "Hey, it's not working. The system message isn't going," and give it the code, and it would fix it. Maybe that's something we could do. In fact, I don't know if you noticed this, but it was kind of taking a while, wasn't it? There is a way to make this faster. Let's take a look at that. Pretty awesome, eh? Things are looking pretty good. You could add the system message editor if you wanted to, or you could just keep that all on the server side.
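The bug above comes from reading the system message once, when the page loads, instead of each time a message is sent. One way to sketch the fix, with the DOM lookup and the network call passed in as functions so the logic is easy to follow (`createSender`, `getSystemMessage`, and `postChat` are all hypothetical names, not the generated code from the video):

```javascript
// Hypothetical sketch: the system message is read at send time, so
// edits made in the text area are picked up on the next request.
function createSender(getSystemMessage, postChat) {
  const messages = []; // chat history kept on the client

  return async function send(text) {
    messages.push({ role: "user", content: text });

    // Read the current value now, not once at page load.
    const systemMessage = getSystemMessage();

    const data = await postChat({ messages, systemMessage });
    messages.push({ role: "assistant", content: data.response });
    return data.response;
  };
}

// In the browser, the two functions might look like this (assumed IDs and paths):
//   () => document.getElementById("system-message").value
//   (body) => fetch("/api/chat", {
//     method: "POST",
//     headers: { "Content-Type": "application/json" },
//     body: JSON.stringify(body),
//   }).then((r) => r.json())
```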
Now, the last thing that we might want to do is make it so things appear to move faster. We can use what is known as streaming: we can stream a token at a time. Let's get this streaming. It works pretty much the same. In fact, so much so that I'm going to do one of developers' favorite things: copy pasta. I'm going to copy exactly what we had there and paste it. Now, instead of being called API chat, I'm going to call it API chat streaming. Believe it or not, in order to make it stream, all you need to do is pass it a new parameter: stream, set to true. But we get a slightly different way to handle this. When we do a stream, we're not getting back results. We're getting back what's known as server-sent events, a stream of server-sent events, or SSE, and you handle that a little bit differently. One of the really nice things about Hono is that it has a nice helper for you. We're going to use this thing called streamText, and that is from Hono, so that got imported here from Hono. I'm also going to use another library here, and it's actually being suggested: fetch-event-stream. So I'm going to do npm install fetch-event-stream. This is in the notes as well. An event stream is the stream of things that come across, and it's a little challenging to use, so there's a really nice library that I like to use. Not everybody does; you can do it yourself if you want to. I think it's a lot easier if you just do this. So streamText takes two things: it takes the context and it takes a callback.
That callback takes a new object called stream, which Hono is going to pass in, and anything that we write out to that stream will be written out to the response. So what happens is the request comes in, the callback gets a handle to the stream, and then on the other end we have to process the model output. Let's do that really quick. It looks like this. We're going to get chunks back from that stream, and we do that by using events. The events are server-sent event messages, and events comes from that package we just installed. This is very nice. We say new Response, and it gives you back an iterable. I'm going to pass in the result stream, because events takes a response: I give it the new Response wrapping the stream we got, and it processes that here locally and yields each event for us. Then we loop over it with for await, an asynchronous loop: const chunk of chunks. Let's log each chunk out so we can see what it looks like; we'll show that off a little bit. That chunk object has a couple of things. It has a thing called data on it. So we check chunk.data: we want to make sure that we have it, that it's not equal to undefined. And the very last chunk, the one that tells you it's time to stop, looks like this: it's a string, and it's kind of a strange-looking one, but it says done when it's all done. So we also check that data is not equal to that string. That's how you know that the thing is done. And if we pass those checks, chances are that we have this data, so we're going to parse that.
So we do JSON.parse of chunk.data, and each one of these data objects has a response. Let's be very clear about what that is: I'm going to call it a token, because this returns a token at a time. So this is like the suggested next token, if you've heard about that: the response is one token at a time. So then, if the token exists, we use that stream that was passed to us, the stream from Hono, and call stream.write, passing that token. And you know what? I'm going to call toString on it, just in case it comes back as a number. There's nothing saying it would be a string, and since I'm streaming text, I want to make sure that it's a string coming across. That is it. That's what we need to do. Let's walk through it one more time. I've got this Hono reply here. That's the stream, and Hono is going to pass it in. We make a new Response from the result stream that we got, because now that we're streaming, we have this result stream. It's a stream of server-sent events, and events helps turn it into objects that get yielded across. That's what this for await, chunk of chunks, is: chunks is the iterator, and we're getting one of these chunks each time. I'll show you what it looks like. Then I make sure that the chunk has what we need and that it's not the last one, because if it's the last one, we don't want to write the word done out. That's the problem that usually happens. You could do this parsing on the client side, but I like to do it on the server side, because I think it's easier to debug when there's a problem, and this pattern is pretty common. One more thing that I like to do is set this header here.
All of this is in the notes, but I like to set the Content-Encoding header to identity. This helps things stream properly when we're in dev mode here. It's just a little hack, if you will. I'm going to actually write that in a comment: so we stream properly in dev mode. There we go. All right, so we can now try this. We're going to say http POST, and we're going to go to localhost:8787, and we have API chat streaming now, so we have a streaming chat. We're going to pass messages, and messages is equal to an array of messages, with a role of user and the content. What should we say? "What is the advice a Ghostbuster would give you about streams?" That's pretty dorky. All right, there we go. So we're going to pass that messages array. We're not going to pass a system message, and we are passing it to streaming. Because I logged out the chunks, we should see them there. Let's make sure that we've got our server running. Oh, I ran that in the wrong one. So I'm going to do npm run dev to get the server running, and I'll move this one below it. There we go. Now I'm going to run this and we should get the joke back. Wow, it came back pretty quick. "Puts on the Ghostbusters jumpsuit. When it comes to streams, here's what I say: don't cross the streams." That's the joke I was trying to make, but it made a lot more out of it. I guess it knows better than I do. This is what came across, and you'll see that the data has this response property. That's what this response here is, and it's the token: this "who you gonna call." The token is what's coming across there. So what we want to do now is have the front end be able to do this.
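To make the chunk handling above concrete, here is a self-contained sketch that turns a server-sent event stream into plain text tokens. It uses a small hand-rolled parser in place of the `events` helper from fetch-event-stream, so it is not that library's implementation. The `[DONE]` marker and the `response` field follow what the walkthrough describes; treat the exact marker string as an assumption, and the names `sseData` and `tokensFrom` as hypothetical.

```javascript
// Minimal stand-in for an SSE parser: yields the payload of each
// `data:` line from a ReadableStream of bytes.
async function* sseData(stream) {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop(); // keep any partial line for the next read
    for (const line of lines) {
      if (line.startsWith("data:")) yield line.slice(5).trim();
    }
  }
  if (buffer.startsWith("data:")) yield buffer.slice(5).trim();
}

// Yields one text token per event, skipping the final "done" marker.
async function* tokensFrom(stream) {
  for await (const data of sseData(stream)) {
    if (data === "[DONE]") continue; // assumed end-of-stream marker
    const token = JSON.parse(data).response;
    // String() guards against a token arriving as a number.
    if (token !== undefined && token !== null) yield String(token);
  }
}
```

Inside the Hono `streamText` callback, each yielded token would then be passed to `stream.write(token)`, as described above.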
So let's get our app to use our new streaming API. So, I'm going to head back over to the airplane that we're building while flying it, but I'm going to use the airplane to build the airplane that we're building while we're flying it. Uh, what could go wrong? So, uh, I have this client side chat app code. Uh, and I'm just going to go ahead. I'm just going to paste it all in there. We've got a large context window these days, right? So, I'm able to put a bunch of information in there. So, I'm just going to grab our current script, right? because I I want to change I like what's working but I want to change it a little bit and you can do that right so I'm going to just go ahead I'm going to put JavaScript and then I'm going to paste that code and then I'm going to end the code there so we'll say I would like to instead use the API chat slashstreaming endpoint I'm going to be really specific I'm going to say use a reader and a text decoder uh to update the assistant message. So that is one way to do things. Uh and because I have a text stream, I want it to be really clear that it's doing that. Um that's just from some experience there. Uh this prompt of course as always is in the notes. So I'm going to run this and that's gross. We should definitely fix that. Uh, and it's going to come back. And you'll see now now you can really feel it, right? Because it's cooking. It's thinking. And it came back and it it made a good thing. So, um, it's saying that we need to change the send message function. Oh, you know what I should do? You know what I should do? I'm going to copy this again. I'm going to copy this again because guess what I forgot to do? I forgot to put the system message there. So, I'm going to copy that. I'm going to refresh this. I'm going to paste that back in there. And we forgot to say you should only return the JavaScript code. U return all the code. No explanations. I'm going to just say that. We'll see what we'll see if we can get what comes back there. 
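The "reader and text decoder" approach requested in the prompt above usually looks something like the following on the client. This is a sketch of the general technique, not the code the model generated in the video; `readTextStream` and `onUpdate` are hypothetical names.

```javascript
// Reads a streamed text response chunk by chunk and reports the
// accumulated text after every chunk, so the UI can update as it arrives.
async function readTextStream(response, onUpdate) {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let text = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters intact across chunks.
    text += decoder.decode(value, { stream: true });
    onUpdate(text);
  }
  const tail = decoder.decode(); // flush any buffered bytes
  if (tail) {
    text += tail;
    onUpdate(text);
  }
  return text;
}

// Possible browser usage (endpoint path and element are assumptions):
//   const res = await fetch("/api/chat/streaming", { method: "POST", ... });
//   await readTextStream(res, (t) => { assistantElement.textContent = t; });
```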
Uh because as you saw there was like a bunch of markdown in there. Let's let's hope that this works again. So we're going to send that again. It's cooking. It's thinking. It's taken time and that's we want it could be streaming in. All right. Awesome. So, here we go. Here is a hope that this is going to work. And of course, you know, this is always different, right? So, the first thing I want to do before I even copy that over is I'm going to make a backup copy of my script.js because it's already working and I don't want to break that, right? So, um I'm going to do a copy and then I'm going to do a paste. And so, um uh script.copy.js. That's fine. I'm going to save that as my backup there. Now, I want to get this code, this code that's in this block here. I want to get that in there. So, here's a here's a little trick. It's kind of gross, right? Because uh uh if I had markdown, it might render this better. But I I what I want to do is I want to go I'm going to inspect this. So, rightclick, inspect. And now I've got my little over here, I've got this little browser. If I click in here, I can kind of just copy that. So, I've got the I've got that copied. I'm going to head over to my app MD. Uh, this is where I had stored the stuff before. So, let's see what it's talking about. We'll paste it in here. And you'll see it's got some really nice uh we've got some stuff in here. We've got it's really nice. So, uh, and it's it's colorcoded and it returned the JavaScript. So, uh, nice. And it just returned the code. So, let's let's grab this. I'm going to grab this code and I'm going to take it and I'm going to drop it. So, again, shift shift while I'm selecting there. there. And then I'm going to copy. And I think this this feels good. I don't know. It always feels good right at this point. So I'm going to save it. So now it's saved. Uh our server uh reloaded. So we've got a new server here. I'm going to refresh. And I'm going to say, what do the Ghostbusters say? 
And hopefully I'm going to keep this console up over here so we see if there's any JavaScript errors as well. I'm going to click send. What do the Ghostbusters say? That was pretty close. But you'll notice that it went inside the thing, right? It went inside the same message. It didn't give us the new thing, but it did stream in. The text did stream in. And if we take a look, we see here that it did stream in. Good. And I see Ghostbusters. Like, who you going to call Ghostbusters? That's what I was trying to get it to say. All right. So, we've got that. Uh, but it's a little bit wrong. So, we can keep going, right? So, so but why why not, right? So, uh I'm going to refresh and I'm going to say uh I have uh uh a bug in this chat app code where it's writing um the assistant message on top of the user message. Can you fix it? And uh what a weird time that we live in, right? So I'm going to give it that JavaScript again and we're going to see we're going to see if we get this that we could do something like that. And of course, you know, there are tools and there is actually we have a tutorial about this. So I'm going to give it the script that it just gave me uh that has the bug. I'm going to paste that in there and hopefully it will fix it. Um uh you assist in fixing code. Return only the code. Return all of it. Say just the fix thing because we want to we want to copy and paste it. So uh we're going to pass that through. Uh let's see what happens. Let's see. So that that's what it feels like when it's stream. That's pretty nice, right? So, so see it feels like it's moving really fast. And that's kind of the the illusion that you want it to give. I hope it figured it out. Let's see. Let's see. I'm going to go ahead and I'm going to uh uh inspect this again. I'm going to go in here. I'm going to copy it. I'm going to paste that into uh this app MD so that you have what happened on my side. If you ever need to look at this, I will give that to you. 
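A bug like the one above, where the streamed reply lands inside the user's message, typically means no separate assistant entry was created before streaming started. One possible fix, sketched with plain objects rather than DOM nodes (`startAssistantMessage` is a hypothetical helper, and this is not the fix the model returned in the video):

```javascript
// Hypothetical fix: add a fresh, empty assistant message first, then
// append streamed text to that entry instead of the last user message.
function startAssistantMessage(messages) {
  const entry = { role: "assistant", content: "" };
  messages.push(entry);
  // The returned function appends one streamed piece of text.
  return function append(textPiece) {
    entry.content += textPiece;
    return entry.content;
  };
}

const chat = [{ role: "user", content: "What do the Ghostbusters say?" }];
const append = startAssistantMessage(chat);
append("Who you ");
append("gonna call?");
```

After these calls, `chat` holds two entries: the untouched user message and an assistant message containing the streamed text.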
And then I'm going to copy uh from this. I'm going to copy out uh the code that it gave us. And I hope it figured it out. Wouldn't it be cool if it was able to look at the code that was written by it and fix it? And that's that's the world that we're getting into now, right? This is really neat. So, I'm going to paste that. I'm going to save that. Here goes nothing. Um, give me some words of advice uh to stick with it with I'm building stuff. Oh, what happened? Unexpected token at line four. Let's see what happened in there. Huh. What's strange? Did I not copy that correctly or did it just right the index was wrong? Just started out with a weird index. I'm started at zero. Let's try that. Refresh that. Let's try that again. Should give me some advice. I need it now. Uh on how to stick with it because I'm sticking with it. I'm sticking with it till the end here. There we go. Set clear goals. Create a routine. And did you feel how that went in? Now, we should get this to be markdown, but that's a different tutorial. And now it's time to deploy. I hate to break it to you. All you need to do to deploy this application globally is npm run deploy. This will deploy to what we call region earth. And now, just like that, you have an AI chatbot that is yours and free unless you go wild. Your credits will reset each day, and you are backed by a super powerful, safe, and secure network with all sorts of tools ready for you to scale when you need to. More in the notes. How'd that feel? I hope you're feeling empowered to build more and more AI apps with all the models on Workers AI. Check out future tutorials in this series as we'll dive deeper and give you more building blocks to cook with. As your AI dad, I got to say I'm really proud of what you built. Keep it up, champ. Thanks so much for hanging out and we'll see you real soon. [Music]
