馃摌 Semantic search in MongoDB
In MongoDB, you can semantically search through your data using MongoDB Atlas Vector Search.
To perform vector search on your data in MongoDB, you need to create a vector search index. An example of a vector search index definition looks as follows:
{
"fields":[
{
"type": "vector",
"path": "embedding",
"numDimensions": 1536,
"similarity": "cosine"
},
{
"type": "filter",
"path": "symbol"
},
...
]
}
In the index definition, you specify the path to the embedding field (path
), the number of dimensions in the embedding vectors (numDimensions
), and a similarity metric that specifies how to determine nearest neighbors (similarity
). You can also index filter fields that allow you to pre-filter on certain metadata to narrow the scope of the vector search.
Vector search in MongoDB takes the form of an aggregation pipeline stage. It always needs to be the first stage in the pipeline and can be followed by other stages to further process the semantic search results. An example pipeline including the $vectorSearch
stage is as follows:
[
{
"$vectorSearch": {
"index": "vector_index",
"path": "embedding",
"queryVector": [0.02421053, -0.022372592,...],
"numCandidates": 150,
"filter": {"symbol": "ABMD"},
"limit": 10
}
},
{
"$project": {
"_id": 0,
"Content": 1,
"score": {"$meta": "vectorSearchScore"}
}
}
]
In this example, you can see a vector search query with a pre-filter. The limit
field in the query definition specifies how many documents to return from the vector search.
The $project
stage that follows only returns documents with the Content
field and the similarity score from the vector search.