Comfy UI Crash Course For Beginners
Introduces ComfyUI as a node-based interface for running diffusion models locally and contrasts it with simpler web UIs.
A practical, hands-on walk-through of building and tweaking ComfyUI pipelines for Stable Diffusion, with tips on models, prompts, upscaling, and fine-tuning.
Summary
Developedbyed delivers a beginner-friendly crash course on ComfyUI, showing how to wire up an SD 1.5 workflow from checkpoint to output and then scale up to SDXL. He begins by contrasting ComfyUI with simpler web UIs, arguing for the node-based flexibility it offers. The tutorial covers loading models (SD 1.5, SDXL, and fine-tunes like RealVisXL), connecting a CLIP text encoder for prompts, and adding a negative prompt for older models. He explains the key knobs (seed, steps, CFG, samplers such as DPM SDE, and schedulers such as Karras) and demonstrates how prompt tweaks and prompt-boosting via an LLM improve results. The video then shows saving workflows, swapping to SDXL, and experimenting with LoRAs and fine-tunes from CivitAI, emphasizing that vanilla models benefit greatly from these enhancements. A section on image-to-image, VAE encoding, and controlling denoise illustrates how to preserve or alter details. Upscaling with 4x Ultra Sharp, plus integrating Z Image Turbo with separate VAE and text encoder setups, demonstrates modularity and speed. The sponsor segment for Brevo is brief but clear, and the finale reinforces trying different fine-tunes and LoRAs for better results. Throughout, Ed emphasizes practical tips, model VRAM requirements, and the shift toward more capable, modular pipelines with newer diffusion tech.
Key Takeaways
- SD 1.5 can run on 4–6 GB of VRAM, while SDXL typically needs 8–12 GB, making GPUs like the RTX 3070 Ti viable for beginners.
- Using a CLIP text encoder (and an optional negative prompt) improves prompt understanding and image quality, especially on older models.
- DPM SDE with Karras scheduler is a strong default pairing for denoising; keep seed fixed or randomize for variety as desired.
- Fine-tunes like RealVisXL dramatically improve faces and hands over vanilla SDXL, and LoRAs offer lightweight style/character nudges without retraining the whole model.
- Upscaling via 4x Ultra Sharp and modular VAE/text-encoder setups (e.g., Z Image Turbo) can achieve significantly crisper results without changing the base model.
- Saving workflows in ComfyUI creates JSON files you can reload later, enabling quick swaps between SD 1.5, SDXL, and fine-tuned models.
- Image-to-image workflows require encoding a loaded image to a latent space with VAE to apply diffusion-based edits while preserving structure.
Who Is This For?
Aspiring AI artists and hobbyists who want hands-on control over Stable Diffusion pipelines in ComfyUI, plus developers curious about using fine-tunes, LoRAs, and modular upscaling for higher-quality outputs.
Notable Quotes
"This is going to be super exciting. I'm going to show you how we can hook up SD, SDXL, maybe Z image turbo."
—Introductory setup and model hooking promise; sets expectations for the walkthrough.
"What this clip text encode does is it's essentially the tokenizer which kind of breaks up the whole text you have here into something that the LLM can understand."
—Explains the role of the text encoder in converting prompts into usable embeddings.
"The real power lies in fine-tunes and LoRAs."
—Highlights why users should explore model refinements beyond vanilla releases.
"Upscaling with 4x Ultra Sharp… you just drop it in the right folder and restart ComfyUI."
—Practical tip for adding an external upscaler model to improve detail.
"The newer models are more modular, so we need to get our own VAE and our own text encoder."
—Notes on modular architecture and preparing components for Z Image Turbo workflow.
Questions This Video Answers
- How do I set up a ComfyUI workflow and move from SD 1.5 to SDXL in one project?
- What are LoRAs, and how do they improve Stable Diffusion results in ComfyUI?
- Which sampler and scheduler combinations work best for beginner-friendly prompts in ComfyUI?
- How do I use VAE encoders and decoders in ComfyUI for image-to-image workflows?
- How do I add upscaling models like 4x Ultra Sharp to a ComfyUI pipeline?
ComfyUI · Stable Diffusion 1.5 · Stable Diffusion XL · RealVisXL · LoRAs · VAE · CLIP Text Encoder · Karras scheduler · DPM SDE · 4x Ultra Sharp upscaling
Full Transcript
Oh boy, I have an exciting tutorial for you guys. I'm going to teach you how to use ComfyUI. If you haven't heard of it, it's essentially a node-based visual interface for running your own stable diffusion models locally on your own machine. And this is getting more and more popular as GPUs are getting more powerful, and you don't want to send all your data to these pesky cloud providers when we can just do it ourselves. So, it's going to be super exciting. I'm going to show you how we can hook up SD, SDXL, maybe Z image turbo.
What all these nodes mean as well, what's a clip text encode, what are the best settings to also use to kind of generate the best images. And we're also going to talk about upscaling and a bunch of other stuff like fine-tuning and LoRAs. So, strap yourself in and let's get into it. Before we set up ComfyUI, I quickly want to touch on some of the alternatives that you might encounter on the web because there's quite a couple of tools similar to it. And then also what it means to actually run these things. So, ComfyUI is probably the most popular one out of all of them.
Uh but then you have something like Fooocus as well and Automatic1111. And whilst these tools are great and they're really easy to use as well, the problem is they're not really flexible and it's just like a web UI with sliders. So, whilst you can just like get up and running with them, they don't really give you the flexibility of something like ComfyUI which is entirely node-based. So, you'd actually have to manually, you know, add a model and then hook up a clip text encoder to it, a sampler or a VAE decoder, and finally get your image.
So, whilst the learning curve is a bit steeper, I actually recommend you to go through this because you'll actually gain a much better understanding of how these systems work. So, stick to ComfyUI. And by the way, it's not just like images you can do, you can generate videos and 3D and audio and a bunch of other stuff. So, super super flexible. Now, what do you need to run these? Well, it depends on the model that you use. One of the most popular models out there, one of the earlier ones, is Stable Diffusion 1.5. You only need about 4 to 6 GB of VRAM to run this.
So, even if you go for a budget Nvidia card, you will be more than fine. So, I'm running an RTX 3070 Ti which is super old now, but this is super cheap. You can probably find it for like £100 or £200. And this gives you 8 GB. So, not too much, but I had zero problems running V1.5 and even Stable Diffusion XL which is a newer version. This requires around 8 to 12 GB, but again, I had no problem running this on an RTX 3070. So, but once you get to better and better models, the more VRAM it requires, right?
So, Flux 2 Klein is probably way better than the previous two that we just talked about, but this requires around 29 GB of VRAM. And now you are looking at, you know, an RTX 4090 at the minimum. And if you check the prices on that, guh. This gets quite up there, right? Around 2,000 or so. Now, you might be thinking, "Oh, I might just actually opt for getting a Mac cuz haven't the Macs started unifying the memory?" And you are right. So, the base M5 Pro now, the 16-inch, comes with 24 GB of VRAM.
And then you have 48 and 64 GB. So, you're like, "Pwah, why should I get an Nvidia card when I can just get one of these and probably load up even bigger models?" Uh the thing is, whilst this is true and you probably have a good time with one of these laptops, the inference speed is just not going to be there because you don't have CUDA. So, whilst you can load up the model, it's probably going to be 5 to 10 times slower than an Nvidia card. So, whilst both options are getting really good and, you know, Mac is really pushing now on the software side to kind of optimize and make inference a bit faster, if I had to recommend one, I'd still go the Nvidia route.
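If you are not sure how much VRAM your own card exposes, a quick PyTorch check settles it. This is a minimal sketch, assuming torch is installed (ComfyUI's bundled Python environment already ships with it):

```python
import torch  # available in ComfyUI's bundled Python environment

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB of VRAM")
else:
    print("No CUDA device found; generation will fall back to CPU and be much slower.")
```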
But, up to you. Up to you what you want to do. Before we have a lot of fun with ComfyUI, I want to thank today's sponsor Brevo. If you ever launch your own product, you know that writing the code is 50% of the work. The other 50% is emails and onboarding and being able to communicate with your users. That's where Brevo really comes in. This is an all-in-one platform for transactional emails, for email campaigns, automation flows as well. So, you don't even have to touch your keyboard and just set everything up so it just magically works.
And, you know, you can even send SMSs and WhatsApp messages. It's super fantastic. If there's one thing I highly recommend for developers is to use a highly trusted email service like Brevo for their marketing and transactional emails rather than writing your own. That's going to take a ton of time and guess what? 99% of the time it's not even going to end up in your users' inbox. Now, there's a bunch of other cool features as well. So, you can like import your own contacts here. You can kind of segment them and create lists as well.
So, certain groups get certain emails and then you can kind of test what works out. They have AI tools to help you with that as well. Again, it works with both transactional and also marketing emails. They have really nice templates. So, if you're building out anything, whether it's your next SaaS product, a course platform, or even a game, you will need quality email campaigns and transactional emails. So, check out Brevo by clicking the link in the description down below and you can use code at50 at checkout to get 50% off the first 3 months. Thanks so much, Brevo, for sponsoring this episode.
Let's get going. So, there's two really good places where you can download models. One of them is going to be Hugging Face. So, you just look up what you want to get like Stable Diffusion 1.5 and then you head over here to the files and versions and you want to look for the .safetensors files here. So, v1-5-pruned-emaonly. So, these are all the weights here. 4 GB, not too bad. So, you can download it there. And then the other one would be CivitAI. So, this one you're going to find a bunch of LoRAs, you're going to find a bunch of fine-tunes as well.
Uh but you might have this problem. I'm not going to tell you how to get through it, but if you know, you know. Now, getting ComfyUI installed is super simple. Hit download, download for Windows in my case, and just go through the installer. Just hit next. It's not going to be anything complicated. And once you have your model downloaded, simply head over to models here and I'm going to make this a bit bigger so you can see. And you want to go to checkpoints and then load up all the weights that you downloaded here. So, I have three.
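If you would rather grab the SD 1.5 weights from a script than from the browser, a minimal sketch with huggingface_hub looks like the following. The repo id is an assumption (use whichever Hugging Face mirror you found), and the target folder assumes a default ComfyUI install, so adjust the path to your own setup:

```python
from huggingface_hub import hf_hub_download  # pip install huggingface_hub

hf_hub_download(
    repo_id="stable-diffusion-v1-5/stable-diffusion-v1-5",  # assumed mirror of the SD 1.5 weights
    filename="v1-5-pruned-emaonly.safetensors",
    local_dir="ComfyUI/models/checkpoints",                 # adjust to your install location
)
```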
I downloaded V1.5, SDXL, and then RealVisXL which is a fine-tune of SDXL. Okay, let's hook up SD 1.5. For now, don't worry too much about the left panel here and what all these buttons do. It doesn't matter as much. It's more the nodes that are important. And you'll get more comfortable with, you know, all the little functionalities and shortcuts and whatnot. But I'll try to cover some really important ones as we go. So, for now, just double-click here on the screen and it gives you a little preview of all the different nodes that you can add.
So, let's load up a model. How do we do that? Well, you want to look for the load checkpoint here. So, let's click on that. That's going to be our diffusion model and I can drop it right here. So, that's one node here and you can think of it kind of as an input and an output, right? You have nodes that kind of you can combine them together like this. And then the final one here is going to be your output. So, it's kind of like a pipeline. Now, here, let's just pick SD 1.5 so you can kind of click the arrows.
If it doesn't show up here for you, just in case you had ComfyUI opened and then you dragged in the weights in the folder here, just restart the app and come back and then it should detect it automatically. All right. So, here is our model. And what we want to do is essentially put this through a clip text encoder. So, let's search for that. Clip text encode and let's drop it right here. And we essentially want to connect the yellow here with the yellow up there. So, what this clip text encode does is it's essentially the tokenizer which kind of breaks up the whole text you have here into something that the LLM can understand because it cannot understand the text you wrote here.
It needs to convert it to the embedding, to the numbers which are like vectors and matrices. So, it can do it based on like simple words or even like sentences. It entirely depends on the tokenizer. So, it takes that, it breaks it up into tokens, and then it's going to create the matrix and the embedding that the LLM can understand. Now, for older models, you would have to also add a negative prompt to kind of steer the diffusion model into the direction that you want. So, what you can do is let's drag out another one here and I'll add another clip text encode.
This would be your negative prompt. So, all the things you don't want your model to do. So, no blurry images. So, you can say no blur. Uh you can also say no anime style, cartoon, anything that you kind of want to steer the model away from. So, for now, we'll just keep it simple like this. We'll have the two clip text encodes. Awesome. And then we need our sampler. So, we're going to say K sampler. Let's create this as well. There we go. And we are going to hook up the positive here to the positive and the negative to the negative.
Also, that's why you're going to see stuff like, you know, if you remember the older models, especially with like Midjourney as well, if you add something like high quality, 8K, you know, professional photography, you'll get slightly better results. And that's like that's not needed anymore with the newer models. Okay, so there we go. And now that we have these two hooked up, we still need a couple of things. We need to hook up the model here, so let's grab the blue one and hook it up to the blue one. And we also need a latent image.
So, for this one, we are going to do text to image, so we can just do an empty latent image. So, if you don't know, with stable diffusion models, you essentially start from pure noise, and then from that pure noise, it goes step by step and essentially generates the image that you want. But we're not directly working with pixels here. It's a latent image space, which is like eight times smaller. So, again, you can just think of it as the noise that we start from. So, we'll hook it up here to the latent image as well.
And the batch size here controls how many images you want. We'll keep it at one. And the resolution's pretty important here. So, we want 512 by 512 for SD 1.5. If you go up to XL, that's going to be 1024. Just make sure you have a look at what your model supports. Okay, so there we go. We got that going. And we're almost there. Here at the end, let's take a VAE decoder. So, this basically turns the image from latent space back to an actual pixel image. So, we'll hook that up there.
And then here the VAE, let's also hook it up here at the end. And now we can either save the image or do a preview. Let's just do a preview image node here. Hook that up, and there we go. So, that should be good to go. Let's give this a shot by hitting Run up here at the top. As you can see, it goes right through like that. Let me make this a little bit bigger as well, so we can see the final output. And there we go, we get the image. Now, it's not too great looking.
It's got two tails, but it is what it is. So, let's talk a a little bit about all these settings here. So, we have seed, which is kind of the starting point of kind of how the noise is going to look. So, if you pick a different seed here, it's going to give you a different noise pattern at the beginning, giving you a bit of a variety. And by default, this is set to random. So, if I keep clicking, it's going to give you a random number every time. Now, we can right-click on this, and uh sorry, not right-click.
We can click here at the control after generation, and we can have it randomize, or we can do fixed as well. Let's switch it to fixed for now. So, if I run this, we should get the kind of the same looking image every time. Let's kind of change this prompt up as well. A guy with glasses sitting in a coffee shop sipping coffee. Okay, let's run this. Let's see what we get. There we go. The faces aren't that too nice looking, but it is what it is. So, if I run this again, it's going to be pretty much the same image because the seed is identical.
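For reference, the graph built so far can also be written out in ComfyUI's API (JSON) format and queued over HTTP instead of clicking Run. The sketch below is hedged: it assumes a local server on the default port 8188, the checkpoint filename from earlier, current node class names (these can shift between versions), the default Euler sampler settings, and a SaveImage node in place of the preview so the script leaves a file behind:

```python
import json
import urllib.request

# Keys are node ids; a value like ["4", 1] means "output slot 1 of node 4".
workflow = {
    "4": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "v1-5-pruned-emaonly.safetensors"}},
    "6": {"class_type": "CLIPTextEncode",   # positive prompt
          "inputs": {"clip": ["4", 1],
                     "text": "a guy with glasses sitting in a coffee shop sipping coffee"}},
    "7": {"class_type": "CLIPTextEncode",   # negative prompt
          "inputs": {"clip": ["4", 1], "text": "blurry, cartoon, anime"}},
    "5": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 512, "batch_size": 1}},
    "3": {"class_type": "KSampler",
          "inputs": {"model": ["4", 0], "positive": ["6", 0], "negative": ["7", 0],
                     "latent_image": ["5", 0],
                     "seed": 42,                 # any fixed value; this one is a placeholder
                     "steps": 20, "cfg": 8.0,    # ComfyUI defaults at this point in the video
                     "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0}},
    "8": {"class_type": "VAEDecode",
          "inputs": {"samples": ["3", 0], "vae": ["4", 2]}},
    "9": {"class_type": "SaveImage",
          "inputs": {"images": ["8", 0], "filename_prefix": "sd15"}},
}

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(req).read().decode())  # a prompt_id comes back once it is queued
```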
Now, here you have the steps. If you go for something really low like two or three, you're going to see it's going to look pretty bad. So, you know, you're basically denoising for each step. So, the fewer steps you have, the worse the image is going to look. So, for SD 1.5, you're looking somewhere between 20 to 30. So, I'll leave this at 25-ish for now. Now, you have something called a CFG. This kind of gives you the creative control, like how accurately it should follow your prompt here. A lower CFG would lead to more creativity, whilst a higher CFG will follow the prompt really strictly.
So, for SD 1.5, I recommend somewhere between like six and eight. I'll put this to six for now. Let's regenerate. And it looks a little bit better. The eyes are still quite wonky looking. And here's what I mean. So, if I go here and just add high quality, 4K, you're going to see that it's going to be the same image, but it's going to look a bit crisper. See it? It just looks a tad bit better. And this is why it's quite hard with these older models, cuz you really have to fine-tune your negative prompt here and your actual original text prompt quite a bit to get the results that you want.
Whereas with the newer ones, you have an LLM that's going to do kind of the heavy lifting with the intent behind it. Because I can have something like this, uh uh a guy and a cat sitting on a chair, right? Here for SD 1.5, it might get confused about the composition of like how is this going to look like? Is the cat sitting on the guy or is the guy sitting on the cat? What's going on? Whereas if you have an LLM here as the text encoder, it's going to know the intent behind it. It's going to know kind of what you mean, and it's just going to do a better job with the composition.
Even if you describe a scene, so if you say somewhere in Tokyo, right? It's not really going to know any information about it, or as much as an LLM would be able to provide. Okay. So, that's kind of the CFG. That's your creative control. A lower one gives you more creativity, a higher one follows the prompt more accurately. Then you have something called a sampler name and a scheduler. These essentially control how the denoising process happens. So, here you have something called Euler, which I think is the default. One that works really well is DPM SDE.
I really like this one. And for the scheduler, I want Karras. So, the scheduler acts more like, okay, how much noise are we removing after each step? So, you can think of it like sculpting, right? When you're sculpting, how much am I cutting off from the beginning? Am I doing a big chunk or am I refining it, you know, bit by bit? If you do simple, it essentially takes off chunks equally, right? Whereas something like Karras might do a bigger chunk at the beginning and then kind of refine it with slower cuts afterwards.
And then the sampler is essentially kind of the algorithm that we use for the denoising. And then the scheduler is kind of like my sculpting example, I guess. So, hope that makes sense. But a good default for it is DPM SDE and then Karras for your scheduler. And for the denoise here, we want to leave this full because we're essentially starting from full noise, right? We don't have an image-to-image setup where we start off with something and then we're refining it to something else. So, leaving it at one here means pure noise.
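In the API-format sketch from above, all of those knobs live on the KSampler node, so dialling in the recommended values is just a matter of updating its inputs. Note that ComfyUI's dropdown lists the DPM SDE sampler under an internal id, assumed here to be dpmpp_sde; check your own list:

```python
# Apply the recommended SD 1.5 settings to the KSampler node ("3") from the earlier sketch.
workflow["3"]["inputs"].update({
    "seed": 42,                   # any fixed value reproduces the same image
    "steps": 25,                  # 20-30 is the sweet spot for SD 1.5
    "cfg": 6.0,                   # lower = more creative, higher = follows the prompt strictly
    "sampler_name": "dpmpp_sde",  # the "DPM SDE" pick from the dropdown (assumed id)
    "scheduler": "karras",
    "denoise": 1.0,               # full denoise = pure text-to-image
})
```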
So, that's the result we got. But again, what we can do is kind of refine our negative and positive prompt here, and you're going to see that we get much better results. So, again, this is something that's called prompt boosting. So, if we go here, you can ask an LLM to do this for you. So, I asked it to give me a good negative. So, low-res, blurry, JPEG artifacts, grainy, noisy, bad anatomy, bad hands. You're kind of pushing and steering the model to give you a better image by adding that negative prompt. Again, I'm on the same fixed seed.
So, we can actually see if the quality gets any better with this negative. And then something like this is too vague as well. A guy drinking coffee in a cafe, sharp, high quality. So, I asked an LLM again, and it gave me this one. So, again, the newer models will give you something like this. A man sitting in a cozy cafe drinking a cup of coffee, relaxed expression, natural pose, warm ambient light, wooden table, soft window light, etc. So, we should see quite a bit of a better result. And there we go. Like much better composition already.
The image looks way way better in my opinion. The faces are still wonky, but it is what it is. It's an old model. What can you expect? Now, here's what's nice. Once you have this set up, you can simply control S and save this workflow. So, I can save SD 1.5, and done. I can close this up. I can come back in, and I have this saved. And if you look in the folders, it just essentially saves it as a JSON. Here we go. If we go to user default workflows, and there we go, SD 1.5.
I have two saved here. So, there we go, just a simple JSON file. Okay. Now, here's what's cool. We can change this up to SDXL by simply hitting the arrow key there. And look at that, we have SDXL. And I can see if it does any better. This is again a much bigger model, so it should perform way way better. And we need to change the value here to 1024 now, because that's kind of the native resolution it uses. So, make sure you use that and hit run again. And there we go. With SDXL, as you can see, we have much higher quality images.
The anatomy is also looking a tad bit better. But, you might still be disappointed with these, and that's completely fine. These models are all vanilla, okay? So, understand that just using SD 1.5 on its own, or SDXL on its own, isn't really the most effective way. And the real power lies in fine-tunes and LoRAs. And that's why, if you check CivitAI, those specific fine-tunes are going to do a much, much better job. And they kind of branch out in two directions, either anime fine-tunes, or kind of realistic fine-tunes. The best way I can describe it is a vanilla model like SDXL is trained on millions of images to gain a general understanding of the world and how to draw, you know, environments, and how to draw a person.
But, it's not going to do a fantastic job at almost any of them. Newer models, like Flux 2, are going to do a way better job. But, what a fine-tune does is retrain the model on a dataset focused on a specific thing, like characters, for example. So, it's going to do a way better job. So, a really popular SDXL fine-tune is RealVisXL. Okay, that's the name of it. So, let's give that a go. I'll select it here. I have that downloaded as well. And let's run it. This takes, again, like maybe two or three times longer than SD 1.5, but it still works just fine on a 3070.
And there we go. Running this, as you can see, gives a way, way better uh image here for faces and hands, as well. That looks way better than the vanilla SDXL. Now, if we want to save the image, this is just a preview, but we can add another node here at the end, and do save image. So, let's hook that up right there. We have the preview here. Let's make that a little bit smaller. And there we go. And what we can also do here, if you want to uh kind of delete this, you can just click on it, and do delete.
Or, what we can also do is click on it, and then add a reroute, like that. So, what this reroute allows us to do is kind of hook it up to two different things. So, we can preview the image, but also save the image here. Uh I'll just leave the kind of the default here, which is fine. And let's run this again. I guess the nice thing about uh save image is that also gives you a preview here. So, technically, you won't even need this node. So, let's just simply delete it. There we go. So, that's saved now.
Uh so, what we can do is let's open this up. So, we can do open image, which just opens it here in the new tab. I believe you can also go directly to the path, uh if I'm not mistaken. Or maybe not. I swear they had that option. But, it got saved here. Let me show you. If you go to documents, ComfyUI, so just find your ComfyUI folder, outputs, and there that's where it is. You have the temp here, which kind of generated everything that we had so far, but this gets all deleted as soon as we close ComfyUI.
Okay, let's also have a look on how we can do image to image. So, rather than providing an empty latent image here, let's delete that, and we are going to hook up a load image node. Now, let's take that image and hook it up here to the K sampler. And oh, damn, we cannot do it. Why? Again, this is a pixel image. We need to turn it into a latent image format. So, let's drag out the image here, and we want to encode it with VAE. So, VAE encode. There we go. We're passing the pixels in.
It's a latent image now. And now we can hook it up here. Cool. That's all you need. And now we just need to lower the denoise. Again, this is basically how much we want to modify the original image. If this is really high, we're essentially keeping almost the entire image. So, if I have the same prompt here, and I can say with his hands up above his head, let's run this. So, this is at 0.8. Oh, I also need to hook up the VAE here. So, rather than this going directly to the decoder here, let's kind of undo that, hook it up here, and then hook this up here again, like that.
So, it needs to go in there and in there, as well. My apologies, it's actually the complete opposite. If you do a high denoise, it's going to affect the image more. Whereas, if you do something like 0.1, 0.3, it's almost not going to be noticeable at all. So, maybe like texture or color shifts. But, there we go. With 0.856, we managed to get this guy to put his hands up, but the actual image looks quite a bit different. So, honestly, for image to image it's all about having a really good model now, to retain the same look and character.
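In the same API-format sketch as before, image-to-image only swaps the latent source and lowers the denoise. The image filename below is a placeholder and has to exist in ComfyUI's input folder:

```python
# Replace EmptyLatentImage with LoadImage -> VAEEncode, then feed that latent to the KSampler.
workflow["10"] = {"class_type": "LoadImage",
                  "inputs": {"image": "example.png"}}       # placeholder; lives in ComfyUI/input
workflow["11"] = {"class_type": "VAEEncode",
                  "inputs": {"pixels": ["10", 0], "vae": ["4", 2]}}
workflow.pop("5")                                            # drop the empty latent
workflow["3"]["inputs"]["latent_image"] = ["11", 0]
workflow["3"]["inputs"]["denoise"] = 0.8                     # higher = changes more of the source
```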
Next up, let's look at how we can do upscaling. So, let's head over here in Google, and I'm going to search up 4x Ultra Sharp. This is kind of a really good all-rounder, really popular upscaler. Uh so, let's head over here to download. And this is not going to be like super big, around 63 megabytes. And the place where you need to drop this is if you head to your folders, let's go up here to models, and then here you have a bunch of folders. So, what we want to look for is the upscale models.
So, here, upscale models, this is where we want to drop it. And again, as soon as this finishes downloading, make sure we restart ComfyUI. So, let's drag this in here, drop it in here. Again, it's not going to show up, so let's just close this up and reopen it. Okay, we are back in business. Let's get rid of the load image, and go back to the original setup we had, which is going to be just a simple empty latent image. So, let's grab that, drop it here. We'll hook it up to the latent image, change the resolution to 1024, hit okay, 1024 here, hit okay.
So, rather than saving the image, I'll just add a preview image here to the top, and a reroute. So, I'll drag this reroute out, and I'm going to do load upscale model. So, let's click on that. We'll put it down here. As you can see, it says 4x Ultra Sharp. It's already selected for us. Then, I'm also going to hook up an upscale image node. So, let's search for that. I'm going to pick the one using a model. Let's drop it here. And this is how it's going to work, right? So, this is going to go in here, upscale model to the upscale model.
The image is going to go down here. And here, we can add the preview again. So, we can kind of compare the two. Okay, cool. Let's make this bigger, as well. Put them side by side, and let's hit run. And there we go. Here are the results. If I put them side by side really quickly, there's way, way more detail here on the right side, especially in the sleeves. Again, it might be hard to see on YouTube. But, it's as simple as that: downloading the model, putting it in the right folder, and then adding a node, and you are good to go.
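Expressed the same way in API form, the upscale pass is just two extra nodes hanging off the decoded image. The model filename is an assumption; use whatever you dropped into models/upscale_models:

```python
# Upscale the decoded image (node "8") with the external upscaler, then save the larger version.
workflow["12"] = {"class_type": "UpscaleModelLoader",
                  "inputs": {"model_name": "4x-UltraSharp.pth"}}   # assumed filename
workflow["13"] = {"class_type": "ImageUpscaleWithModel",
                  "inputs": {"upscale_model": ["12", 0], "image": ["8", 0]}}
workflow["14"] = {"class_type": "SaveImage",
                  "inputs": {"images": ["13", 0], "filename_prefix": "sd15_upscaled"}}
```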
Okay, so there we go. That's kind of the general idea of how to use these models. I highly recommend you check out different fine-tunes, because they're going to give you quite different results. And just to briefly touch on LoRAs, you can experiment with those, as well. Those are low-rank adaptations. They're trained on smaller datasets, essentially, and act like a filter on top of either vanilla models or fine-tunes, as well. They just kind of nudge the model towards a specific style or character. So, you can have a LoRA specifically trained on pixel art, for example.
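If you want to wire one in, the Load LoRA node sits between the checkpoint and everything that consumes the model and CLIP outputs. In API form, with a hypothetical LoRA filename and placeholder strengths, that looks roughly like this:

```python
# Patch the model and CLIP with a LoRA before they reach the sampler and the prompts.
workflow["15"] = {"class_type": "LoraLoader",
                  "inputs": {"lora_name": "pixel-art-style.safetensors",  # hypothetical file
                             "strength_model": 0.8, "strength_clip": 0.8,
                             "model": ["4", 0], "clip": ["4", 1]}}
workflow["3"]["inputs"]["model"] = ["15", 0]   # KSampler now sees the LoRA-patched model
workflow["6"]["inputs"]["clip"] = ["15", 1]    # positive prompt uses the patched CLIP
workflow["7"]["inputs"]["clip"] = ["15", 1]    # negative prompt too
```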
Uh so, that's something really fun, as well. Those are not too large. Let me show you how we can use Z Image Turbo, as well. Again, these newer models are more modular, so we need to get our own VAE and our own text encoder. So, once you download it off Hugging Face, we can put the Z Image Turbo here in the diffusion models folder. The VAE is going to go in the VAE folder. You can search for ae.safetensors. And then, for the text encoder, I used Qwen 3 4B, which is going to go, again, in the text encoders folder.
And here is how it's going to look. So, you have your load clip, which is going to load up Qwen. All right. And then, you have your diffusion model here, so you can use load diffusion model. There it is, Z Image Turbo. So, we're not doing load checkpoint anymore. Now, these are separate. This is going to go in the positive and the negative, again. And then, for the K sampler here, we are going to pick way lower steps, so it's going to be way faster here. And these don't need as many steps as the older models, so we can get away with something like eight.
The CFG can go really low, to something like one. And we can use the same DPM SDE and Karras. That's going to be perfectly fine. Everything else is pretty much the same. So, let's try out a prompt here. We're going to say a cat walking in Tokyo here, with sakura trees in the background. And let's see how it does. So, let's run this prompt. I should also mention that since we now have an LLM essentially acting as our text encoder, we don't actually need a negative prompt anymore. So, you're going to find this in models like Flux as well.
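For orientation, here is a rough API-format sketch of that modular graph. Treat it as an assumption-heavy outline rather than a recipe: the model, text encoder, and VAE filenames are guesses at what you downloaded, the CLIPLoader type value is hypothetical (pick whichever Qwen / Z Image option your build lists), and the 1024 resolution is assumed, so check the model card. The negative prompt is simply left empty, and the whole thing can be queued over the same /prompt endpoint as before:

```python
z_image_workflow = {
    "1": {"class_type": "UNETLoader",                 # "Load Diffusion Model" in the UI
          "inputs": {"unet_name": "z_image_turbo.safetensors",   # assumed filename
                     "weight_dtype": "default"}},
    "2": {"class_type": "CLIPLoader",                 # loads the Qwen 3 4B text encoder
          "inputs": {"clip_name": "qwen_3_4b.safetensors",       # assumed filename
                     "type": "z_image"}},             # hypothetical id; check your dropdown
    "3": {"class_type": "VAELoader", "inputs": {"vae_name": "ae.safetensors"}},
    "4": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["2", 0],
                     "text": "a cat walking in Tokyo with sakura trees in the background"}},
    "5": {"class_type": "CLIPTextEncode",             # empty negative prompt
          "inputs": {"clip": ["2", 0], "text": ""}},
    "6": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},  # assumed resolution
    "7": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["4", 0], "negative": ["5", 0],
                     "latent_image": ["6", 0], "seed": 42, "steps": 8, "cfg": 1.0,
                     "sampler_name": "dpmpp_sde", "scheduler": "karras", "denoise": 1.0}},
    "8": {"class_type": "VAEDecode", "inputs": {"samples": ["7", 0], "vae": ["3", 0]}},
    "9": {"class_type": "SaveImage", "inputs": {"images": ["8", 0], "filename_prefix": "z_image"}},
}
```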
And check it out. That looks pretty cool to me. Way, way better results than SDXL and SD 1.5. So, hope you enjoyed this little tutorial. Let me know if you want to see more ComfyUI stuff. There's so much that I just kind of crammed into this one little tutorial here. But, hope you try it out. Let me know how it works and I'll catch you guys in the next one. Peace.