👐 Generate embeddings

To perform vector search on our data, we need to embed it (i.e. generate embedding vectors) before ingesting it into MongoDB.

Fill in any <CODE_BLOCK_N> placeholders and run the cells under the Step 5: Generate embeddings section in the notebook to embed the chunked articles.

The answers for code blocks in this section are as follows:

CODE_BLOCK_4

Answer
embedding_model.encode(text)

CODE_BLOCK_5

Answer
doc["embedding"] = get_embedding(doc["body"])
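
To show how the two answers fit together, here is a minimal sketch of the embedding step. It makes several assumptions that are not from the notebook: `StandInEmbeddingModel` is a hypothetical placeholder that only mimics an `encode(text)` method so the example runs without downloading a model, and `split_docs` with its sample contents is an assumed shape for the chunked articles. In the notebook, `embedding_model` is the real model loaded in an earlier cell.

```python
import hashlib
from typing import Dict, List


class StandInEmbeddingModel:
    """Hypothetical stand-in that exposes an encode(text) method.

    It exists only so this sketch runs offline; it does not produce
    meaningful embeddings.
    """

    def __init__(self, dimensions: int = 8) -> None:
        self.dimensions = dimensions

    def encode(self, text: str) -> List[float]:
        # Derive deterministic placeholder values from a hash of the text.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return [byte / 255.0 for byte in digest[: self.dimensions]]


# Assumption: in the notebook this is the real embedding model.
embedding_model = StandInEmbeddingModel()


def get_embedding(text: str) -> List[float]:
    """Return the embedding for a piece of text as a plain list of floats."""
    embedding = embedding_model.encode(text)  # CODE_BLOCK_4
    # Some models return a NumPy array; convert it so the value can be
    # stored in a MongoDB document.
    return embedding.tolist() if hasattr(embedding, "tolist") else list(embedding)


# Assumed shape of the chunked articles: dicts with a "body" field.
split_docs: List[Dict] = [
    {"title": "Doc A", "body": "MongoDB stores data as documents."},
    {"title": "Doc B", "body": "Vector search finds similar embeddings."},
]

embedded_docs: List[Dict] = []
for doc in split_docs:
    doc["embedding"] = get_embedding(doc["body"])  # CODE_BLOCK_5
    embedded_docs.append(doc)

print(len(embedded_docs), len(embedded_docs[0]["embedding"]))
```

Each document keeps its original fields and gains an `embedding` field, which is what gets ingested alongside the text.
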
Caution

If embedding generation takes longer than 5 minutes, interrupt the cell and move on to the next step using only the documents that have been embedded up to that point.
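
One way to make interrupting safe is to collect documents as they are embedded, so a partial result is always available. The sketch below is an illustration, not the notebook's code: `embed_with_time_budget` and `fake_embed` are hypothetical names, and the 300-second default simply mirrors the 5-minute guideline above.

```python
import time
from typing import Callable, Dict, List


def embed_with_time_budget(
    docs: List[Dict],
    embed_fn: Callable[[str], List[float]],
    max_seconds: float = 300.0,
) -> List[Dict]:
    """Embed docs in order until finished, interrupted, or out of time."""
    embedded: List[Dict] = []
    start = time.monotonic()
    try:
        for doc in docs:
            if time.monotonic() - start > max_seconds:
                break  # Time budget exhausted: keep what we have so far.
            doc["embedding"] = embed_fn(doc["body"])
            embedded.append(doc)
    except KeyboardInterrupt:
        # Interrupting the cell lands here; already embedded docs are kept.
        pass
    return embedded


def fake_embed(text: str) -> List[float]:
    # Hypothetical placeholder for get_embedding, used only for this demo.
    return [float(len(text))]


docs = [{"body": "first chunk"}, {"body": "second chunk"}, {"body": "third"}]
done = embed_with_time_budget(docs, fake_embed, max_seconds=300.0)
print(f"Embedded {len(done)} of {len(docs)} documents")
```

The returned list contains only documents that have an `embedding` field, so it can be passed to the next step as-is.
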