Load the dataset
First, let's download the dataset for the lab. We'll use a subset of MongoDB's technical documentation as the source data for the documentation chatbot.
Chunk up the data
Since we are working with large documents, we first need to break them up into smaller chunks before embedding and storing them in MongoDB.
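As a rough illustration of the chunking step, here is a minimal sketch of fixed-size chunking with overlap. The lab may well use a different splitter (for example a token-based or recursive one); the function names, the `body` field, and the default sizes are assumptions made for this example only.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into character windows of chunk_size that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def chunk_document(doc: dict, text_field: str = "body", **kwargs) -> list[dict]:
    """Return one document per chunk, copying the other fields of the parent.

    `text_field` is a hypothetical field name; use whatever field holds the
    article text in your dataset.
    """
    chunks = chunk_text(doc[text_field], **kwargs)
    return [
        {**doc, text_field: chunk, "chunk_index": i}
        for i, chunk in enumerate(chunks)
    ]
```

The overlap keeps a little shared context between neighbouring chunks, so a sentence that straddles a boundary is not lost to both sides.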
Generate embeddings
To perform vector search on the data, we need to embed it (i.e. generate embedding vectors) before ingesting it into MongoDB.
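To make the shape of this step concrete, the sketch below attaches a vector to every chunk. It is not the lab's embedding model: `toy_embedding` is a deterministic stand-in based on hashing, so the example runs without downloading anything, and it carries no semantic meaning. In practice you would pass a real model's encode function as `embed_fn`. The `body` and `embedding` field names are assumptions.

```python
import hashlib
import math
from typing import Callable


def toy_embedding(text: str, dims: int = 8) -> list[float]:
    """Stand-in embedding: hash each token into one of `dims` buckets.

    Only for illustration; swap in a real embedding model for vector search.
    """
    vec = [0.0] * dims
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dims] += 1.0
    # Normalise to unit length so vectors are comparable by dot product.
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def embed_chunks(
    chunks: list[dict],
    embed_fn: Callable[[str], list[float]] = toy_embedding,
    text_field: str = "body",
    vector_field: str = "embedding",
) -> list[dict]:
    """Return copies of the chunks with an embedding vector added."""
    return [{**chunk, vector_field: embed_fn(chunk[text_field])} for chunk in chunks]
```

Storing the vector as a plain list of floats next to the chunk text keeps each document self-contained when it is later written to the database.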
Ingest data into MongoDB
The final step in building a MongoDB vector store for the chatbot is to ingest the embedded article chunks into MongoDB.
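The ingestion step could look roughly like the sketch below, which writes the embedded chunks in batches. The `ingest` helper, its parameters, and the batch size are assumptions for illustration; the only driver calls it relies on are PyMongo's `Collection.insert_many` and `Collection.delete_many`.

```python
from typing import Iterator


def batched(docs: list[dict], batch_size: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of docs, each with at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(docs), batch_size):
        yield docs[start:start + batch_size]


def ingest(collection, docs: list[dict], batch_size: int = 1000,
           clear_first: bool = False) -> int:
    """Insert docs into a PyMongo-style collection and return the count inserted.

    `collection` is expected to expose insert_many() and delete_many(),
    as a pymongo Collection does.
    """
    if clear_first:
        # Remove existing documents so reruns do not create duplicates.
        collection.delete_many({})
    inserted = 0
    for batch in batched(docs, batch_size):
        result = collection.insert_many(batch)
        inserted += len(result.inserted_ids)
    return inserted
```

With PyMongo installed, the collection would typically come from something like `MongoClient(uri)["db_name"]["collection_name"]`, where the URI, database name, and collection name are placeholders for your own values.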