Gemini 3.1 PRO...pure disappointment
Chapters
The creator describes running nine tests to compare Gemini 3.1 Pro with Opus 4.6 and notes initial excitement based on benchmarks and pricing implications.
Gemini 3.1 Pro underwhelms vs Opus 4.6 in design prompts, 3D concepts, stock UI, and interactive demos across nine tests.
Summary
Developedbyed dives into Gemini 3.1 Pro through nine rigorous prompts, comparing it head-to-head with Opus 4.6. The tester starts with high hopes based on benchmarks and price-friendly quotas, but quickly exposes Gemini’s inconsistencies: strong in aesthetic prompts yet failing to deliver reliable, polished UI and interactive features. In several tests, Opus delivers smoother animations, usable 3D scenes, and fully fleshed-out stock analytics with real-time charts and news, while Gemini winds up with flat lighting, broken 3D interactions, and empty or “coming soon” panels. The creator highlights repeated tool-calling failures, problematic automatic dependency handling (npm vs bun), and a tendency to overcomplicate tasks instead of delivering one-shot, working results. A standout moment is Opus’s impressive circuit-board playground that instantly shows voltage, warnings about missing resistors, and interactive wiring, which Gemini struggles to match even after multiple prompts. The overall verdict: for design/UI, interactive demos, and practical features like stock dashboards, Opus outperforms Gemini 3.1 Pro by a large margin; Gemini’s promise of “nicer visuals” doesn’t translate to reliable, production-ready outputs across the tests. The tester ends by inviting viewer experiences and suggestions for other models to try.
Key Takeaways
- Gemini 3.1 Pro consistently struggles with one-shot prompts and tool calling, unlike Opus 4.6 which often delivers working results in seconds.
- In a landing-page prompt for Vite and React with dark/light mode, Gemini produced a prettier look but with a messy color/typography mix and a non-functional button, while Opus produced a tighter, more usable design.
- Stock-tracker UI: Opus offers a complete feature set (MACD chart, news, light/dark mode, price alerts) and usable interaction, whereas Gemini’s panels were mostly empty or marked coming soon.
- 3D prompts: Opus rendered a more convincing isometric room with interactive elements and lighting; Gemini’s attempt was dark, flat, and often misaligned, with broken 3D lighting and calibration.
- Circuit-board toybox: Opus built a fully interactive UI showing voltage changes, LED status, and real-time feedback; Gemini required five prompts and still produced an unusable result with jittery camera and non-functional pins.
- The author notes recurring issues with Gemini’s dependencies and tooling (trying to install npm, overcomplicating setup) versus Opus’s more seamless handling (bun-based workflows).
- Overall, for real-world usability and appealing yet reliable visuals, Opus 4.6 remains the stronger choice; Gemini 3.1 Pro falls short in core interactive scenarios and UI polish.
Who Is This For?
This is essential viewing for developers and product designers who compare AI design assistants for UI/UX tasks, 3D prompts, or stock/interactive dashboards. It explains concrete strengths and weaknesses of Gemini 3.1 Pro versus Opus 4.6 that matter in real-world workflows.
Notable Quotes
"I ran nine tests. I documented all of them and I want to give you kind of my personal experience with this model."
—Opening premise showing the structured testing approach.
"Opus far outperformed Gemini"
—Initial verdict after comparing multiple tests.
"Google Chrome crashed because Antigravity opened up the simulation tool"
—A concrete failure point highlighting instability with Gemini.
"missing resistor LED is connected without current limiting resistor. This could damage the LED"
—Impressive realism in Opus circuit-board test; Gemini misses key UI/UX cues.
"Gemini would just completely fail at it"
—Summary of tool-calling and prompt failures across prompts.
Questions This Video Answers
- how does Gemini 3.1 Pro compare to Opus 4.6 for UI design prompts
- what are the main UI/UX weaknesses of Gemini 3.1 Pro according to tests
- can Opus 4.6 generate interactive stock dashboards better than Gemini 3.1 Pro
- why do AI design tools struggle with tool calling and dependency management
- which AI model handles 3D prompts and lighting more effectively: Gemini 3.1 Pro or Opus 4.6
Gemini 3.1 Pro, Opus 4.6, AI design tools, 3D prompts, UI/UX generation, Stock dashboard UI, Circuit-board simulation, Tooling and dependencies, NPM vs Bun, Animation and lighting
Full Transcript
Hey there, how's it going? In the past three days, I ran numerous tests on Google's new Gemini 3.1 Pro model to see how it stacks up against the likes of Opus 4.6. So, I ran nine tests. I documented all of them and I want to give you kind of my personal experience with this model. So, initially I was really excited for it because when the benchmarks came out, holy, look at that. It pretty much aces all of the benchmarks, maybe besides one or two here, but completely stumps everyone else. And that was really exciting for me even like pricewise cuz we know that if you use uh you know something like Antigravity or Gemini CLI, if you get one of the Google subscriptions, basically the quota is quite good.
You don't really run into rate limits as quickly as something like Claude, right? You do Opus 4.6. If you're on like the $20 subscription, you can literally rate limit yourself in one or two prompts. So, it was really exciting. And Google was also really good at front-end uh design, so it would make really nice uh landing pages and pretty good SVG. So, 3.1 was really exciting. With the first test, I wanted to lean more into Gemini's strengths, which are creating pretty things, whether that's SVG or a landing page. So, this was the comparison between Opus 4.6 and Gemini 3.1 Pro.
And as you can see, the animation on the right here is far nicer uh in terms of visual clarity and how the animation unfolds as well. However, I do want to note one thing. This specific um prompt also had uh the idea of creating some sort of wind physics to blow these leaves. So, that's something that was here and it was supposed to loop seamlessly as well, which neither of them did. They all kind of faded out. However, Claude at least attempted to do the whole wind physics, whereas Gemini pretty much ignored it and went for the aesthetic.
But overall, if you really want to like compare final result, I do like Gemini 3.1's output so far. And this was okay. I was quite excited for this because I felt like, okay, running these tests, it's going to be a bit more, you know, close than I would have thought. But to my surprise, this was the last test that Gemini did well at. Everything else was an absolute disaster. Okay, I'll say this. Maybe the second test also went to Gemini, but that's about it. So, this was just a simple prompt of creating a Vite and React landing page.
Uh that was like for a creative studio with dark and light mode, high-end animations with Framer Motion uh and all the basic sections you would need like a hero, feature, demo, pricing and testimonials. So here are the results. Overall I do like the look of Gemini's a little bit better here and then this ugly gradient that it chose here. Now it kind of messed up the button here so it didn't get that right. Uh, but overall it just looks way prettier. The way it did the trusted um platforms here as well looks way nicer.
The font and everything that it chose here looks a bit ugly to me. Uh, but yeah, I'm quite happy. See, Opus kind of went pretty hardcore with the gradients everywhere, which just doesn't look as pretty. Uh, whereas, yeah, Gemini went for a more minimal look, but it shows the colors better overall, I feel like. Um, here it's just too much color. The cards don't really pop out. And what I like here is that Gemini also kept the kind of the color right even for the stars here. Uh, whereas Opus just picked Oh, stars are yellow, so I should pick that.
But that's about it. This is the last time uh Gemini did well because on the third test here which was testing pretty much how well it does design but also 3D by creating an isometric uh cozy room that's quite interactive as well. So you could click on the laptop or you could click on uh the mouse and kind of see information about it. So, the initial frame here, the Google one, Gemini, looks pretty cool, but the problem is that it's dark. It didn't do the lighting properly. And all of these prompts were the same as well.
And uh Opus just had much nicer animations. The cat here was stuck in the bed for some reason. Uh and it even added dark and light mode here, which we're going to see in a second. So, as you can see, the whole thing transitions. And look how the laptop actually still shines some light off. I think that's really cool. Um, whereas the Gemini one looks really, really flat. Even if you look at the desk here and the shading on the desk, it just doesn't look as nice. Um, and the only thing is I'd say no, I prefer Opus in every single way.
But let me know what you think. I wanted to try just one more 3D prompt at least uh but something that would also hook up to an external API like uh for example here it's kind of tracking flights throughout the planet. So that was the idea with this one. Uh but I made it simple for both models. So I told them to use dummy data for now uh that they can query from. So this is the result here. Opus rendered a pretty nice planet. I don't like the border around it. this big border that it did.
Uh, but Gemini, uh, didn't work at all and just rendered a blue sphere and put all the planes which are cones in this case for Gemini. Put them all in one spot. So something definitely messed up there with a coordinate system. Uh but yeah, that's all Gemini output here. Now Opus uh worked properly here. I don't like how it rendered the planes out. It rendered them kind of blue and hard to see, but it was functional. Again, it nailed it from one simple prompt. Um, and you could select the planes here as well, and it will give you details about it.
And look at that at the end there. Uh, Google Chrome crashed because Antigravity opened up the simulation tool where it takes like screenshots of why things are working and not working and it just crashed. So overall again with one single prompt, Opus far outperformed Gemini. For the next test I ran a prompt where I asked both models to create me an application to track stocks, also do stock analysis and view prices in real time. And these are the results. Now do note I also gave both models the option to either fetch the images for the ticker symbols or to make them themselves with SVG.
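Planes piling up in one spot is the kind of symptom that can appear when latitude/longitude pairs are used directly as scene coordinates instead of being projected onto the globe. The video does not show either model's code, so the following is only a minimal sketch of that projection, assuming a y-up sphere centered at the origin; the function name `lat_lon_to_xyz` and the axis convention are illustrative choices, not anything taken from the video.

```python
import math

def lat_lon_to_xyz(lat_deg, lon_deg, radius=1.0):
    """Map latitude/longitude in degrees to a point on a y-up sphere.

    Assumed convention (hypothetical): the north pole is +y, and
    latitude 0 / longitude 0 lands on the +x axis.
    """
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = radius * math.cos(lat) * math.cos(lon)
    y = radius * math.sin(lat)
    z = radius * math.cos(lat) * math.sin(lon)
    return (x, y, z)

# Dummy flight positions (made-up sample data) placed on a unit globe.
flights = {"FL001": (40.6, -73.8), "FL002": (51.5, -0.5)}
positions = {name: lat_lon_to_xyz(lat, lon) for name, (lat, lon) in flights.items()}
```

Distinct latitude/longitude pairs map to distinct points on the surface, so markers spread over the globe rather than collapsing to a single location.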
So that was the prompt. You had the choice either or. Now here's what the interesting part is. Opus here took 12 minutes and 23 seconds. Gemini took 18 minutes and 31 seconds. So you might be like, "Oh, Gemini took quite a while." But what's really impressive is that Opus took time to recreate all the SVGs for all these companies. So Apple, Microsoft, Nvidia, these are all SVGs. Now, now given, okay, some of them are not great, like the Amazon one looks trash, right? The Tesla one doesn't look good as well, but some of these look pretty good.
Uh, so it took extra time to do this while Gemini, you'd think they'd Google at least the ticker images and get those, but no, it just added a simple div with kind of a color that closely matches uh the company. So, so that's all it did. But overall, design-wise, Opus wins as well, uh, in my opinion. And the way it did the layout here is quite tight. didn't even optimize it for mobile as well. So, I was really disappointed to see Gemini just not looking as good at all. The charting was a bit hard to navigate around as well.
Whereas Opus felt like you could go up and down and just kind of move it. I'm just used to maybe, you know, apps like Webull and whatever, Robinhood, other stock apps that are really good. It felt really good to use it in Opus, whereas Gemini 3.1 Pro felt like the Yahoo Finance chart a little bit. But yeah, uh you could toggle between all of them. You can add them to watch list. So that worked on both just fine. As you can see, the search functionality works as well just fine on both. Uh but that was about it.
Um you could compare as well and also kind of see how the stocks move around. But both of these also had in their prompts that they should have a specific section where they show key statistics, uh the company's profile. They should also show news regarding the company and stuff like that. Gemini didn't really do any of it. As you can see on the right here, we have a panel called overview, financials, valuations, but most of these were empty or just saying coming soon, which I don't know why it did that and not just implement it.
Whereas on Opus, it did amazing. like my god. You have the MACD chart here, the key statistics, and you're going to see even like news section and even the light and dark mode, price alerts, just a bunch. It's pretty impressive uh how good Opus is. Uh but you're going to see here you can favorite charts as well. Look at that. Gives you valuation, technicals. Look how cool this is. This is crazy. News as well. And then look, this is the chart you have here. nothing to be implemented. It says whilst I prompted it to do it, they had the same prompt.
At this point, running all of these tests, I noticed that Gemini would have a really hard time one-shotting something. It will always run into some tool calling issues uh and just stumble and overthink overall in general. Um, for example, with this prompt, I asked it to programmatically recreate uh kind of like a couple of slides here explaining linear regression when it comes to machine learning. And the problem I ran into: with Opus, it would one-shot it, right? All these tests, it would one-shot and it would work. You'd run npm run dev, you'd have it fully working.
With Gemini, multiple times I've noticed it's trying to do a tool call and just failing, or it would try to run npm and it wouldn't find it on my system because I run bun, and then it would go in this like overly complicated mode trying to analyze why npm is not working, and it would try to install npm. Uh, it would again run into issues and it would just crash. It's pretty crazy. Whereas Opus, when I asked it to do it, and all of these tests, it's like, oh, you don't have npm installed, oh, but I see you have bun, so let's just run it through bun. And boom, it would be like 5 to 10 seconds, it would figure it out, whereas Gemini would just completely fail at it. So I had to install npm just so Gemini wouldn't have this issue anymore. But generally speaking, you'd start noticing these patterns where Gemini is like installing Tailwind v4 but then it tries to implement it with v3 and then nothing works and then it just overcomplicates things again.
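For readers unfamiliar with the MACD chart mentioned here: MACD is conventionally the difference between a fast and a slow exponential moving average of closing prices, with a signal line that is itself an EMA of that difference. The video does not show how Opus computed it, so this is only a minimal sketch of the standard 12/26/9 definition; seeding each EMA with the first value is one common convention and an assumption here, and the function names are illustrative.

```python
def ema(values, span):
    """Exponential moving average, seeded with the first value (assumed convention)."""
    alpha = 2 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out

def macd(closes, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line, histogram) for a list of closing prices."""
    fast_ema = ema(closes, fast)
    slow_ema = ema(closes, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal)
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram

# Made-up dummy closes; a real dashboard would pull these from a price feed.
closes = [100 + i * 0.5 for i in range(60)]
macd_line, signal_line, histogram = macd(closes)
```

On a steadily rising series the fast average sits above the slow one, so the MACD line is positive; on flat prices all three series stay at zero.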
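The fallback behaviour described for Opus (npm is missing, bun is present, so use bun) amounts to probing the PATH before choosing a runner. Neither model's actual logic is shown in the video, so this is only a rough sketch of the idea; the function names, the candidate order, and the injectable `which` parameter are all hypothetical.

```python
import shutil

def pick_package_manager(candidates=("npm", "bun", "pnpm", "yarn"), which=shutil.which):
    """Return the first candidate found on PATH, or None if none is installed.

    `which` is injectable so the lookup can be exercised without touching
    the real system; by default it is the standard-library shutil.which.
    """
    for name in candidates:
        if which(name):
            return name
    return None

def dev_command(manager):
    """Build the command that starts the dev script with the chosen manager."""
    return [manager, "run", "dev"]

manager = pick_package_manager()
if manager is None:
    print("No supported package manager found on PATH")
else:
    print("Would run:", " ".join(dev_command(manager)))
```

Checking what is already installed first avoids the detour of trying to install npm on a machine that is set up for bun.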
Um, and in this case as well, I was really hoping to see Gemini just kind of do better in the UI, but I just honestly didn't prefer it in this case. My final test was recreating a toy box that would let you experiment with circuit boards. uh you'd have an Arduino and a breadboard and you'd be able to hook up LEDs, resistors, push buttons, jump wires, stuff like that. I thought this was a really good test because it not only tests how good it does in terms of the UI, um but also there's quite a bit of logic behind this, right?
You'd have to know how the wires would transmit the electricity, right? how the resistor changes um the electricity the LED intakes right so a lot of little bits but also how it's presented to the user because this is something that's a bit difficult to intuitively make sense and what I mean by that, if I hook up a wire okay does the wire show the electricity flowing for example uh can I click on the LED and maybe see you know how much power goes through that so there's a lot of nuance to this that I wanted to see how it does.
And this is what like honestly kind of shocked me with Opus because in one shot it it did the whole thing. It had a smooth rotating camera as well. It also had challenges and guides. And look at that. When you click on an LED, it automatically put me in that view to easily put the pins in, right? Put me in this top down view, which I thought was was crazy. So there we go. It puts the LED there. There's the cathode, the anode that you can hook up to. You can grab the jumper wire. You can also color these.
And look how cool that is. Now, this is a little It hasn't done it perfectly because it shows the electricity flowing through it. But at the bottom here, you're going to notice this shows I like how See, this is what I mean by making it intuitive and easy to kind of reason about. Um, is it shows the digital pins and they all say that they're at zero voltage right now. So you can just tap one of these to activate it. And look at that. Now it's 5 volts. So it's on, right? So it's just the the UI is really nice.
It's intuitive. Um, but there we go. That's coming true. And I'm just hooking up ground there. I'm not adding a resistor because I wanted to see just kind of what it says. And look at that. I even get a warning. It's where I'm standing right here. But it basically says, uh, missing resistor. LED is connected without current limiting resistor. This could damage the LED. Okay, so it did that as well. And when you click here on the LED, it says active and it says it's getting the full 5 volts. So missing resistor, LED might burn out.
How sick is that? That is insane. I just never thought it would be this impressive. So you can change the LED colors as well. And then if I add a resistor here, boom, there we go. Goes down to 2 volts. So what can I say, guys? What can I say? It was really impressive. Now, when it came to Gemini, this took five reprompts. I'm sorry. I didn't even want to include this test at the end because I I didn't think it was fair anymore. Opus was one prompt. This was five prompts to just get compiling.
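The behaviour shown in the Opus demo (full 5 volts and a warning without a resistor, about 2 volts across the LED once one is added) matches the usual simplified LED model: the LED drops a roughly fixed forward voltage and the series resistor absorbs the rest. The app's real logic is not shown in the video, so this is only a sketch under stated assumptions; the 2.0 V forward voltage, the 20 mA rating, the 220 ohm example value, and the function name `led_status` are all assumptions, not values from the video.

```python
def led_status(supply_v, resistor_ohms=None, forward_v=2.0, max_current_a=0.020):
    """Simplified series LED model.

    Assumptions (not from the video): a fixed forward voltage of 2.0 V and
    a 20 mA current rating, which are typical ballpark figures for a small LED.
    """
    if supply_v <= forward_v:
        # Not enough voltage to turn the LED on in this simplified model.
        return {"led_v": supply_v, "current_a": 0.0, "warning": None}
    if not resistor_ohms:
        # Nothing limits the current, so the LED sees the whole supply voltage.
        return {
            "led_v": supply_v,
            "current_a": None,
            "warning": "Missing resistor: LED has no current limiting and may burn out",
        }
    # Ohm's law on the resistor, which drops whatever the LED does not.
    current = (supply_v - forward_v) / resistor_ohms
    warning = "Current exceeds LED rating" if current > max_current_a else None
    return {"led_v": forward_v, "current_a": current, "warning": warning}

without_resistor = led_status(5.0)
with_resistor = led_status(5.0, resistor_ohms=220)
```

With a 5 V pin and a 220 ohm resistor the model gives (5 - 2) / 220, roughly 13.6 mA, which is under the assumed rating, so no warning is raised.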
Uh, but the camera just was unusable, unfortunately. It was really jittery. And the wires, I assume these are, they were just stretching vertically all the way for no reason. So, this was kind of the experience with it here. So, when I click the LED, as you can see, it tried to do the same rotation as Opus. But look where my cursor is. My cursor is here. And the LED looks like it's up there for some reason there. But when you place it down, it does go in the correct spot. But look how the pins are positioned here as well.
See, they're not I guess they're touching, but it's Yeah, it's just I don't know. It just doesn't look as nice. But that's as far as I could go with Gemini. It hasn't really done any other pins here to decide. I don't have a ground I can connect to. I cannot turn these pins on to 5 volts and off 5 volts. It's just literally a brick. That's it. Nothing is functional here. And even though both got the same prompts, Gemini decided that the wire, nah, we'll skip on that. Wire is coming soon, guys. And if you noticed on the stock example as well, it did the same thing.
It just wouldn't want to go all the way for some reason and it just added those features as coming soon. I don't know why it does this. Uh but yeah, so that's kind of about it. I am just honestly really disappointed with Gemini. Uh, I honestly thought that we're going to go in 3.1 having a model that you just have to use because it just is going to be so amazing at designing stuff, but in my experience, um, it might nail the landing page design. It might nail a couple more SVG bits, but other than that, I would not rely on this at all.
So, let me know what your experience has been with it. I'm curious to to hear. And yeah, let me know what other uh models should I test. This is like a super fun thing. So, I'll catch you guys in the next one.