Using OpenAI
Take-home activity! Do it if you are following along at home. It won't be covered during the hands-on lab.
OpenAI is a company that develops AI models for natural language processing. They offer an API, the Embeddings API, that you can use to create embeddings for your documents.
To get embeddings using their API, you need to create an account and get an API key.
Create an OpenAI account and get an API key
To create an account, go to https://openai.com/ and click on the Log In button in the upper right corner. This will redirect you to the login page, where you'll have the option to sign up for their services.

Follow the instructions on the screen, and verify your email address.
Once you have an account, you can go to the API keys page to get an API key.
From there, click on the Create new secret key button.

You'll be prompted to give your key a name. You can call it "MongoDB Vector Search Demo." Click on the Create secret key button.
You will then be presented with your API key. Copy it and save it somewhere safe.

Make sure you copy this key somewhere as you'll need it later on, and you won't be able to see it again.
Now that you have an API key, you can use it to create embeddings for your documents.
Create embeddings for documents
To create embeddings for your documents, you can send a curl request to the OpenAI API:
```shell
OPENAI_API_KEY=<YOUR_API_KEY>

curl https://api.openai.com/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "The food was delicious and the waiter...",
    "model": "text-embedding-ada-002"
  }'
```
You can find more information about the API in the OpenAI documentation.
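The API returns a JSON document, and the vector itself lives under data[0].embedding. A minimal sketch of extracting it, assuming the response shape described in OpenAI's documentation (the vector values below are made-up placeholders, not real model output):

```typescript
// Sketch: the shape of an Embeddings API response, per OpenAI's docs.
// The embedding values below are illustrative placeholders only.
const sampleResponse = JSON.stringify({
  object: 'list',
  data: [{ object: 'embedding', index: 0, embedding: [0.01, -0.02, 0.03] }],
  model: 'text-embedding-ada-002',
  usage: { prompt_tokens: 8, total_tokens: 8 },
});

// The vector you want to store is under data[0].embedding
const parsed = JSON.parse(sampleResponse);
const vector: number[] = parsed.data[0].embedding;
```

A real response from text-embedding-ada-002 contains many more values in the embedding array, but the path to the vector is the same.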
Create embeddings for the books
To create the embeddings for the books in your collection, you would run this curl command (or use the Node.js library) for each book. This process is somewhat time-consuming, so we've already created the embeddings for you.
You can find the 1,536-dimensional vector in the embeddings field of each book document.
Because we already have the vectors for the books, we can use them with Vector Search.
Querying with vectors
To query the data, Vector Search needs to calculate the distance between the query vector and the vectors of the documents in the collection.
To do so, you first need to vectorize your query, using the same embedding model that was used for the documents.
- NodeJS/Express
- Java Spring Boot
In the library application, we've created a function that will vectorize your query for you. You can find it in the server/src/embeddings/openai.ts file.
```typescript
import OpenAI from 'openai';

const { EMBEDDING_KEY } = process.env;

let openai: OpenAI;

const getTermEmbeddings = async (text: string) => {
  // Create the client lazily, on first use
  if (!openai) {
    openai = new OpenAI({ apiKey: EMBEDDING_KEY });
  }
  const embeddings = await openai.embeddings.create({
    model: 'text-embedding-ada-002',
    input: text,
  });
  return embeddings?.data[0]?.embedding;
};

export default getTermEmbeddings;
```
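The vector returned by this function can then be passed to a $vectorSearch aggregation stage. A minimal sketch, assuming an Atlas Vector Search index on the embeddings field (the index name, numCandidates, and limit values here are illustrative placeholders, not the lab's actual configuration):

```typescript
// Sketch: using a query vector in a $vectorSearch aggregation stage.
// The index name ('vectorsearch') and the numeric values are assumptions
// for illustration; use the values defined for your own cluster.
const queryVector = [0.01, -0.02, 0.03]; // in practice: await getTermEmbeddings(query)

const vectorSearchStage = {
  index: 'vectorsearch',   // name of the Atlas Vector Search index
  path: 'embeddings',      // field that stores the document vectors
  queryVector,             // the vectorized search term
  numCandidates: 100,      // candidates scanned before ranking
  limit: 10,               // number of documents returned
};

const pipeline = [
  { $vectorSearch: vectorSearchStage },
  { $project: { title: 1, score: { $meta: 'vectorSearchScore' } } },
];

// The pipeline would then be passed to collection.aggregate(pipeline).
```

The $project stage is optional; it is shown here because surfacing the vectorSearchScore is a common way to inspect how close each match is.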
Configuring the application
In your server/.env file, you'll find a few variables that you can use to configure the application.
The first one is EMBEDDINGS_SOURCE. It tells the application where to get the embeddings from; you can set it to openai.
Now that you have an OpenAI API key, you can set the EMBEDDING_KEY variable to your API key.
```shell
EMBEDDINGS_SOURCE=openai
EMBEDDING_KEY=sk-...
```
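At startup, the application reads EMBEDDINGS_SOURCE to decide which embeddings provider to use. A simplified sketch of that selection logic (the provider functions and names here are stand-ins for illustration, not the lab's actual code):

```typescript
// Sketch: how EMBEDDINGS_SOURCE could drive provider selection at startup.
// Both provider functions below are hypothetical placeholders.
type EmbeddingFn = (text: string) => Promise<number[]>;

const openAIProvider: EmbeddingFn = async (text) => {
  // would delegate to getTermEmbeddings(text) in the real application
  return [];
};

const defaultProvider: EmbeddingFn = async (text) => [];

const selectProvider = (source?: string): EmbeddingFn =>
  source === 'openai' ? openAIProvider : defaultProvider;

const provider = selectProvider(process.env.EMBEDDINGS_SOURCE);
```

This mirrors the role the EmbeddingProvider interface plays in the Java version described below: one switch point, selected by configuration, behind which each provider implementation lives.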
To run semantic queries in Java, the application also needs to generate embeddings for your query. This is handled by the EmbeddingProvider interface. For OpenAI, the implementation is OpenAIEmbeddingProvider, which delegates the HTTP call to the Feign client:
```java
@FeignClient(
    name = "openai-embeddings",
    url = "${openai.base-url}",
    configuration = OpenAIFeignInterceptor.class
)
public interface OpenAIEmbeddingClient {

    @PostMapping(
        value = "/v1/embeddings",
        consumes = "application/json",
        produces = "application/json"
    )
    OpenAIEmbeddingResponse createEmbeddings(@RequestBody OpenAIEmbeddingRequest request);
}
```
Configuring the application
To use OpenAI, you only need to adjust two parameters in application.yml:
```yaml
embeddings:
  source: ${EMBEDDING_SOURCE}

openai:
  api-key: ${OPENAI_KEY:yourKey}
```
The embeddings.source property is what selects the provider at runtime. If you set it to openai, the OpenAI implementation will be used.
You can either edit these values in application.yml or export them as environment variables:
```shell
export EMBEDDING_SOURCE=openai
export OPENAI_KEY=sk-...
```
Running the application
Once the variables are set, start the application as usual:
```shell
mvn spring-boot:run
```
Once the application has started, make a call to searchBooks as described in the step Add Semantic Search to Your Application, and you will see the application calling the OpenAI implementation:
