MongoDB Takes Over Embeddings — You Write Nothing

Jack Herrington | 00:08:10 | Jun 8, 2026

The video introduces vector embeddings in MongoDB, comparing traditional keyword search with vector search and showing how vector search yields more meaningful results for complex queries. It demonstrates how embeddings enable concept-based retrieval and teases integrating them with LLMs.

MongoDB now handles embeddings natively with a docs_autoembed vector index, delivering fast, concept-based search that outperforms simple keyword lookups.

Summary

Jack Herrington demonstrates how vector embeddings turn MongoDB into a powerful search engine. He uses a TanStack AI docs dataset to compare a traditional keyword search against a vector search, showing how phrases like “how do I use tools” return relevant results even when the exact terms aren’t present. The core idea is embedding a query into a high-dimensional vector and finding documents with nearby vectors using a specialized index. Herrington explains a typical setup: a docs_autoembed index with a type of autoEmbed that runs embeddings automatically on content, eliminating manual multi-part pipelines. He walks through the Voyage AI integration, describing how Voyage AI provides embeddings via an API key and how MongoDB ingests data and bulk-writes it, handling the embedding requests under the hood. The practical payoff is demonstrated by comparing the two search approaches side-by-side and noting how much cleaner the vector-based implementation is. He also ties embeddings to real-world use in LLM chats, where a searchDocs tool feeds the LLM with context from relevant documents. The video culminates with a reminder that embeddings enable “agentic memory” for LLMs, making vector search a foundational feature for modern databases. Herrington thanks MongoDB for support and points viewers to the GitHub repo for hands-on experimentation.
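The core idea in the summary (embed the query, then find documents whose vectors are nearby) can be sketched with toy two-dimensional vectors like the ones in the video's figure. This is only an illustration: the coordinates are invented, the `nearest` helper is hypothetical, and it does a brute-force scan with cosine similarity, whereas the video says MongoDB uses specialized indices over vectors with thousands of dimensions.

```typescript
// Toy 2-D "embeddings". The coordinates are made up for illustration;
// real embedding vectors have hundreds or thousands of dimensions.
type Embedded = { term: string; vector: number[] };

const embedded: Embedded[] = [
  { term: "tool", vector: [-0.8, 0.6] },
  { term: "server", vector: [-0.6, 0.8] },
  { term: "serverTool", vector: [-0.7, 0.7] },
  { term: "chat", vector: [0.7, -0.5] },
  { term: "bot", vector: [0.8, -0.6] },
  { term: "text", vector: [0.6, -0.7] },
];

// Cosine similarity: 1 means same direction, -1 means opposite direction.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force nearest neighbours: score every item, sort, keep the top k.
function nearest(query: number[], items: Embedded[], k: number): string[] {
  return items
    .map((item) => ({ term: item.term, score: cosineSimilarity(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((result) => result.term);
}

// A query vector that lands in the "tool" cluster.
const topThree = nearest([-0.8, 0.6], embedded, 3);
console.log(topThree); // ["tool", "serverTool", "server"]
```

The terms in the other cluster (chat, bot, text) score negative against this query, which is why they drop out of the results.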

Key Takeaways

A vector search uses embeddings to map queries and documents into high-dimensional coordinates, enabling concept-based retrieval.
Creating a docs_autoembed index with type autoEmbed lets MongoDB auto-generate embeddings for the content field from TanStack AI docs.
Voyage AI provides the embedding service via API key, allowing MongoDB to ingest data and fetch embeddings automatically during bulk writes.
The keyword search and vector search produce the same UI shape, but vector search is simpler to implement because it relies on the docs_autoembed index.
Embedding-based retrieval improves results for queries like 'how do I use tools' and supports LLM workflows by supplying relevant memories to the model.
Embeddings are portrayed as an essential feature for databases in enabling agentic memory and better LLM interactions.
The example app and GitHub repository are practical entry points for trying embeddings with MongoDB and TanStack AI.
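As a rough sketch of the index described in the second takeaway, the object below captures only what the video states: an index named docs_autoembed, of type vectorSearch, with an autoEmbed field on content. The `buildAutoEmbedIndex` helper and the interfaces are hypothetical, and the `name` / `type` / `definition.fields` layout is an assumption modeled on the shape that search index definitions usually take in the Node.js driver. A real autoEmbed field may well require further options (for example, which embedding model to use), so check the MongoDB documentation before relying on it.

```typescript
// Only the index name, the vectorSearch type, the autoEmbed field type and
// the "content" path come from the video. Everything else is an assumption.
interface AutoEmbedField {
  type: "autoEmbed";
  path: string;
}

interface VectorIndexSpec {
  name: string;
  type: "vectorSearch";
  definition: { fields: AutoEmbedField[] };
}

// Hypothetical helper that assembles the index description as plain data.
function buildAutoEmbedIndex(name: string, path: string): VectorIndexSpec {
  return {
    name,
    type: "vectorSearch",
    definition: {
      fields: [{ type: "autoEmbed", path }],
    },
  };
}

const indexSpec = buildAutoEmbedIndex("docs_autoembed", "content");
console.log(JSON.stringify(indexSpec, null, 2));
```

Keeping the definition as plain data makes it easy to print, review, and compare against the actual createIndex code in the GitHub repo.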

Who Is This For?

Developers and data engineers who want to add native embedding and vector search capabilities to MongoDB, especially those integrating TanStack AI or building LLM-backed chat tools. It’s a practical guide for moving beyond keyword search to semantic retrieval.

Notable Quotes

"vector search is a much better search. It's a conceptual search."

He contrasts vector search with simple keyword matching to show semantic retrieval.

"the docs_autoembed index... it's going to automatically do the embeddings on that content."

He explains how MongoDB handles embeddings automatically via a specialized index.

"Embeddings are becoming incredibly important features for your database."

He summarizes the strategic value of embedding-based search for modern databases.

Questions This Video Answers

How does MongoDB implement vector search with autoEmbed and docs_autoembed?
What is Voyage AI and how does it work with MongoDB embeddings?
How can I set up a bulk insert pipeline to auto-embed content in MongoDB?
What makes vector search better than keyword search for technical docs?
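The last question can be made concrete with a small sketch of the keyword side. The transcript describes the keyword search as escaping the query for use in a regex and matching it case-insensitively against the title or content. The version below runs over an in-memory array instead of MongoDB, and the `Doc` and `SearchHit` types, the sample documents, and the function names are all hypothetical.

```typescript
type Doc = { title: string; content: string };
type SearchHit = { title: string; snippet: string };

// Escape regex metacharacters so the query is matched as a literal string.
function escapeRegex(input: string): string {
  return input.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Case-insensitive literal match on title or content, mapped to a UI shape.
function keywordSearch(docs: Doc[], query: string): SearchHit[] {
  const pattern = new RegExp(escapeRegex(query), "i");
  return docs
    .filter((doc) => pattern.test(doc.title) || pattern.test(doc.content))
    .map((doc) => ({ title: doc.title, snippet: doc.content.slice(0, 80) }));
}

// Invented sample documents standing in for the TanStack AI docs.
const sampleDocs: Doc[] = [
  { title: "Server Tools", content: "Define a tool that runs on the server." },
  { title: "Chat", content: "Stream text from the model to a chat UI." },
];

console.log(keywordSearch(sampleDocs, "TOOL")); // one hit: "Server Tools"
console.log(keywordSearch(sampleDocs, "how do I use tools")); // []
```

The second call returns nothing because the whole phrase never appears verbatim in any document. That is the gap the video fills with vector search.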

MongoDB, Vector Search, AutoEmbed, docs_autoembed, Embeddings, Voyage AI, TanStack AI, LLM, Agentic Memory, Bulk Write

Full Transcript

Some things are just easier to show than to explain, so I'm gonna start off this video on how to do vector embeddings in MongoDB by just showing you why you'd want to do vector embeddings. Here's a very simple search app built on TanStack Start, and I've loaded into the MongoDB database all of the documentation for TanStack AI. I can just type in, for example, "tool" here, and on the left-hand side, we can see the results for a keyword search. All it's literally doing is looking for the string "tool". It is doing it case insensitively, but it's just looking for that in the documents, and when it finds it, that's a hit.

Now on the other side, we have a vector search, and that vector search is a much better search. It's a conceptual search. And yes, both searches are getting results right now. But if I were to say something along the lines of, for example, "How do I use tools?" Well, that's never gonna show up in the documentation as a single term like that. But vector search, hey, that gives me really good results. In fact, I can go so far as to say things like, "How do I give the LLM access to my data?" And it actually comes up with server tools as the second hit, which is really amazing.

So in this video I'm gonna show you how vector search works and how to use it really easily inside of MongoDB. And this video is brought to you by my friends over at MongoDB. Thank you so much for supporting the channel and videos like this. Let's get right into it.

All right, let's start off with how vector search actually works. I've got this little figure here in Draw.io, and it shows a little Cartesian graph that has different terms in it. For example, server and tool and serverTool, chat, bot, and text. And you'll notice that the terms that are related are close to each other in kind of groupings or clusters. And that clustering's really important because now when I have something like tool, I know that it's close conceptually to serverTool.
So if I were to, say, query on tool, I would find things like server and serverTool, but I might not find things like bot, chat, and text, which is good because what we're doing is we are comparing the embeddings and the distance between the embeddings.

So what's an embedding? An embedding is an array of coordinates, or in this case, vectors. Each one of these in this really simplified example has two coordinates. So for example, tool here would have a negative X value and then a positive Y value, and a similar one for server and a similar one for serverTool. So when I'm doing my search, I actually put the user's query into the embedding model. There's a specific machine learning model for this, and it gives me back a vector. And then I can find all the documents in the database that are close to that vector.

The way that it does it in the database is by comparing these coordinates using special indices to give us all of the documents that have the closest coordinates to the input coordinate really quickly. And the reason that they're using special indices is because these embeddings aren't ever an array of two numbers. They are an array of thousands of floating point numbers. Now there's been a lot of work put into that, and MongoDB has very good indices for doing this type of lookup really, really fast.

So now that we understand how vectors work and how a vector search works, let's talk about how to do it inside of MongoDB. The real magic happens over here in createIndex, where I'm attaching to my local MongoDB instance on localhost, and then I'm creating a new index. That index is named "docs_autoembed", and it is a type of vectorSearch. Within that index, there's a type of autoEmbed attached to the content. Every markdown file that I get from TanStack AI, I'm going to bring into Mongo as a document, and in that document, it's gonna have a content field. That content is gonna have the content of that markdown.
And by putting this index on there that says type "autoEmbed", it's going to automatically do the embeddings on that content. This is really cool because doing embeddings has been a very manual process. In fact, it often involved three different moving parts. One would be a vector database, one would be the database of the content, and then the third would be the embedding model. And what MongoDB has done is basically bring that kind of all together in one piece, although there is this Voyage AI system, which is related to MongoDB. We can take a look at that right now.

This is the Voyage AI site, and in order to use this service, you need to log in, create an account, and then get an API key. Then you can just give it over to MongoDB as part of the environment. MongoDB then does its part when you ingest all of the data into the database. Let's take a look at that in ingest.ts. Here's the code for creating all the documents. You can see that we have the content down there. That's the key that we're gonna use for that docs_autoembed index. Then we're using bulk write to just bulk write all of those entries into MongoDB, and then MongoDB is doing all of the work of firing all the requests off to Voyage AI, getting all the embeddings back, and doing that automatically, so auto-embedding. It's great.

We can take a look at these two different types of searching side by side. The keyword search is actually, well, four lines longer than the vector search. Let's go take a look at how the keyword search works. So it takes the query, and then it escapes a regex, and it looks for a title or content that matches that query. And then finally it converts all the documents that it finds into a form that's usable in the UI. The vector search has exactly the same shape for the input. But I gotta tell you, it's actually a lot easier in terms of implementation. We just use that docs_autoembed index that we created.
We give it a path to what we're looking at, in this case, the content, and then we give it the query. And the reason that you give it that Voyage 4 model now is because you actually have to go and take that query, run the embedding model on the query to get that embedding coordinate, and then that's what you use to do that indexed lookup to go and find the documents that match that query. After that, we do the coercion of the documents that we find into a shape that's good for the UI to display.

Now let's talk about one more thing, which is to use this embedding model inside of an LLM chat. If I go back to my test app, which is of course available to you for free on GitHub in the link in the description right down below, I can click on the chat link and have a TanStack AI built chat, and then I can ask it, for example, "How do I use server tools in TanStack AI?" And now the AI is doing a searchDocs tool request against the application. That searchDocs tool is in turn using the embeddings to go and find the right documents to inform the LLM so the LLM can then give us the right answer.

This is why embeddings are actually more important than ever, because LLMs need to do things like use agentic memory. Agentic memory is based on vector search, where you use a vector search to find the memories that are related to the query that the customer is putting in. So actually, embeddings and auto-embedding like this are becoming incredibly important features for your database.

All right, I hope this helps you understand more about embeddings and how they work inside of MongoDB. Thanks again to the folks over at MongoDB for supporting this video. In the meantime, if you have any questions or comments, be sure to put them in the comment section right down below. And if you like the video, please hit that like button. If you really like the video, hit the subscribe button and click on that bell, and you'll be notified the next time a new Blue Collar Coder comes out.