Prepare the Data | Build RAG Applications using MongoDB

📄️ 👐 Load the dataset

First, let's download the dataset for the lab. We'll use a subset of MongoDB's technical documentation as the source data for the documentation chatbot.

📄️ 👐 Chunk up the data

Since we are working with large documents, we first need to break them up into smaller chunks before embedding and storing them in MongoDB.
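As a rough illustration of this step, here is a minimal sketch of word-based chunking with overlap. The function name `chunk_text` and the `chunk_size` and `overlap` parameters are assumptions made for this example, not the lab's actual implementation, which may split on tokens or characters instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of at most `chunk_size` words.

    Consecutive chunks share `overlap` words so that context spanning a
    chunk boundary is not lost. (Hypothetical helper for illustration.)
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once this chunk has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

An overlap of roughly 10% of the chunk size is a common starting point; larger overlaps preserve more context at the cost of storing more redundant text.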

📄️ 👐 Generate embeddings

To perform vector search on the data, we need to embed it (that is, generate an embedding vector for each chunk) before ingesting it into MongoDB.
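The shape of this step can be sketched as follows. Note that `toy_embed` is only a stand-in (a hashed bag-of-words vector) so the example runs without downloading a model; it is not a real embedding model, and the field names `body` and `embedding` are assumptions for illustration. In practice, `embed_fn` would wrap whichever embedding model the lab uses.

```python
import hashlib
import math
from typing import Callable


def toy_embed(text: str, dims: int = 8) -> list[float]:
    """Stand-in embedder: hashed bag-of-words, L2-normalized.

    NOT a real embedding model; it only demonstrates the data shape.
    """
    vec = [0.0] * dims
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).digest()
        vec[digest[0] % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def embed_chunks(
    chunks: list[str],
    embed_fn: Callable[[str], list[float]] = toy_embed,
) -> list[dict]:
    """Pair each chunk with its embedding vector (hypothetical field names)."""
    return [{"body": chunk, "embedding": embed_fn(chunk)} for chunk in chunks]
```

Whatever model is used, every vector should have the same number of dimensions, since a vector index is defined for a fixed dimensionality.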

📄️ 👐 Ingest data into MongoDB

The final step in building a MongoDB vector store for the chatbot is to ingest the embedded article chunks into MongoDB.
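A minimal sketch of batched ingestion is shown below. It assumes each document is a dictionary that already contains the chunk text and its embedding, and it accepts any object exposing an `insert_many` method, such as a PyMongo collection. The helper name `ingest` and the `batch_size` parameter are hypothetical.

```python
from typing import Any


def ingest(collection: Any, docs: list[dict], batch_size: int = 100) -> int:
    """Insert documents in batches and return how many were sent.

    `collection` is assumed to expose `insert_many(list_of_dicts)`, as a
    PyMongo collection does. (Hypothetical helper for illustration.)
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    inserted = 0
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        collection.insert_many(batch)
        inserted += len(batch)
    return inserted


# With PyMongo, the collection could be obtained like this (the URI,
# database name, and collection name are placeholders):
#
#   from pymongo import MongoClient
#   collection = MongoClient("<connection string>")["<db>"]["<collection>"]
#   ingest(collection, docs)
```

Batching keeps each request to a manageable size when the dataset contains many chunks.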