📘 Building the Hybrid Search Pipeline

In this section, we'll construct a hybrid search pipeline that combines the power of vector search for semantic similarity with the precision of full-text search. We'll use MongoDB's aggregation framework to create this pipeline.

3.1 Vector Search Stage

The first stage of our pipeline will be the vector search. This allows us to find books with synopses that are semantically similar to our query.


{
  $vectorSearch: {
    index: "books_synopsis_vector",
    path: "embeddings",
    queryVector: [0.1, -0.2, 0.3, ...], // Your query vector here
    numCandidates: 100,
    limit: 20
  }
}

Let's break this down:

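  • index: The name of the vector search index to query.
  • path: The document field that stores the embeddings.
  • queryVector: The vector representation of your search query. It must have the same number of dimensions as the indexed embeddings.
  • numCandidates: The number of nearest-neighbor candidates considered during the approximate search. It must be at least as large as limit; higher values tend to improve recall at the cost of latency.
  • limit: The maximum number of results to return from this stage.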
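
To make these parameters concrete, here is a minimal sketch of a helper that builds the stage and checks its arguments. The helper name buildVectorSearchStage and its default values are hypothetical; the index and field names are the ones used above. The three-element vector is only a placeholder, since a real query vector needs as many dimensions as the indexed embeddings.

```javascript
// Hypothetical helper: builds the $vectorSearch stage shown above.
function buildVectorSearchStage(queryVector, { numCandidates = 100, limit = 20 } = {}) {
  if (!Array.isArray(queryVector) || queryVector.length === 0) {
    throw new TypeError("queryVector must be a non-empty array of numbers");
  }
  // numCandidates may not be smaller than the number of results requested.
  if (numCandidates < limit) {
    throw new RangeError("numCandidates must be greater than or equal to limit");
  }
  return {
    $vectorSearch: {
      index: "books_synopsis_vector",
      path: "embeddings",
      queryVector,
      numCandidates,
      limit,
    },
  };
}

// Optional follow-up stage: expose the similarity score of each hit.
const projectVectorScore = {
  $project: { title: 1, score: { $meta: "vectorSearchScore" } },
};

// Placeholder vector for illustration only.
const stage = buildVectorSearchStage([0.1, -0.2, 0.3]);
console.log(JSON.stringify(stage, null, 2));
```

With a driver or mongosh, the two stages could then be passed as [stage, projectVectorScore] to aggregate() on the collection that holds the books; the collection itself is not named in this section.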

3.2 Text Search Stage

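Next, we'll add a text search stage to find books based on title and author matches. Note that $search, like $vectorSearch, must be the first stage of the pipeline it appears in, so the two stages can't simply be chained one after the other; each one starts its own pipeline (for example, a $unionWith sub-pipeline).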

{
  $search: {
    index: "books_text_index",
    compound: {
      should: [
        {
          text: {
            query: "your search query",
            path: "title",
            score: { boost: { value: 3 } }
          }
        },
        {
          text: {
            query: "your search query",
            path: "authors.name",
            score: { boost: { value: 2 } }
          }
        }
      ]
    }
  }
}

Key points:

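  • We're using a compound query with "should" clauses, so a book can match on title, author, or both, and each matching clause adds to its relevance score.
  • We search in both the title and authors.name fields.
  • The boost values (3 for title, 2 for author) multiply each clause's score, so title matches carry more weight than author matches.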
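
Since the same query string appears in both clauses, it can help to generate the stage from a single input. The sketch below assumes a hypothetical helper, buildTextSearchStage, and a hypothetical field-to-boost map; the index name, paths, and boost values are those from the stage above.

```javascript
// Hypothetical helper: builds the $search stage above from one query string.
function buildTextSearchStage(query, boosts = { title: 3, "authors.name": 2 }) {
  if (typeof query !== "string" || query.trim() === "") {
    throw new TypeError("query must be a non-empty string");
  }
  // One "text" clause per field, each weighted by its boost value.
  const should = Object.entries(boosts).map(([path, value]) => ({
    text: { query, path, score: { boost: { value } } },
  }));
  return { $search: { index: "books_text_index", compound: { should } } };
}

// Optional follow-up stage: expose the relevance score of each hit.
const projectSearchScore = {
  $project: { title: 1, "authors.name": 1, score: { $meta: "searchScore" } },
};

const textStage = buildTextSearchStage("your search query");
console.log(JSON.stringify(textStage, null, 2));
```

Keeping the boosts in one map makes it easy to experiment with different weights without editing the stage by hand.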