Using OpenAI

info

Take-home activity! Do it if you are following along at home. It won't be covered during the hands-on lab.

OpenAI is a company that develops AI models for natural language processing. Its Embeddings API lets you create embeddings for your documents. The API is billed by usage, so check OpenAI's pricing before you send a large number of requests.

To get some embeddings using their API, you need to create an account and get an API key.

Create an OpenAI account and get an API key

To create an account, go to https://openai.com/ and click on the Log In button in the upper right corner. This will redirect you to the login page, where you'll have the option to sign up for their services.

The OpenAI signup page

Follow the instructions on the screen, and verify your email address.

Once you have an account, you can go to the API keys page to get an API key.

From there, click on the Create new secret key button.

The OpenAI API keys page

You'll be prompted to give your key a name. You can call it "MongoDB Vector Search Demo." Click on the Create secret key button.

You will then be presented with your API key. Copy it and save it somewhere safe.

The OpenAI API key

caution

Make sure you copy this key somewhere safe. You'll need it later on, and you won't be able to see it again.

Now that you have an API key, you can use it to create embeddings for your documents.

Create embeddings for documents

You can create an embedding for a piece of text by sending a request to the OpenAI API with curl. Replace <YOUR_API_KEY> with the key you just created.

OPENAI_API_KEY=<YOUR_API_KEY>
curl https://api.openai.com/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "The food was delicious and the waiter...",
    "model": "text-embedding-ada-002"
  }'

You can find more information about the API in the OpenAI documentation.
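The same request can be described in TypeScript. The following is a minimal sketch, not part of the library application: buildEmbeddingRequest and the EmbeddingRequest interface are hypothetical names introduced here for illustration. It only assembles the URL, headers, and body that the curl command sends, and does not make a network call.

```typescript
// Plain description of the HTTP request; nothing is sent from this code.
interface EmbeddingRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: mirrors the curl command shown above.
function buildEmbeddingRequest(apiKey: string, input: string): EmbeddingRequest {
  return {
    url: "https://api.openai.com/v1/embeddings",
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    // Same JSON payload as the -d argument of the curl command.
    body: JSON.stringify({ input, model: "text-embedding-ada-002" }),
  };
}
```

You could pass the resulting url, method, headers, and body to any HTTP client. The vector itself is returned in the data[0].embedding field of the JSON response.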

Create embeddings for the books

To create the embeddings for the books in your collection, you would run this curl command, or use the Node.js library, once for each book. This process is somewhat time-consuming, so we've already created them for you.

You can find the 1536-dimension vector in the embeddings field of each book.

Because we already have the vectors for the books, we can use them with Vector Search.

Querying with vectors

To query the data, Vector Search will need to calculate the distance between the query vector and the vectors of the documents in the collection.
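To make "distance between vectors" concrete, here is a minimal sketch of cosine similarity, one of the similarity functions an Atlas Vector Search index can be configured with (the others are euclidean and dotProduct). Vector Search computes this for you; the cosineSimilarity function below is a hypothetical helper for illustration only, and which function your index uses depends on its definition.

```typescript
// Hypothetical helper: cosine similarity between two equal-length vectors.
// Returns a value between -1 (opposite direction) and 1 (same direction).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Two vectors pointing in the same direction score 1, orthogonal vectors score 0, and opposite vectors score -1.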

To do so, you will need to vectorize your query. Use the same embedding model that was used for the documents, so that the query vector and the document vectors can be compared.

In the library application, we've created a function that will vectorize your query for you. You can find it in the server/src/embeddings/openai.ts file.

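import OpenAI from 'openai';

const { EMBEDDING_KEY } = process.env;

// The client is created lazily, on the first call.
let openai;

const getTermEmbeddings = async (text) => {
  if (!openai) {
    openai = new OpenAI({ apiKey: EMBEDDING_KEY });
  }
  // Request an embedding with the same model used for the books.
  const embeddings = await openai.embeddings.create({
    model: 'text-embedding-ada-002',
    input: text,
  });
  // The response holds one entry per input; return the vector of the first one.
  return embeddings?.data[0]?.embedding;
};

export default getTermEmbeddings;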
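Once the query is vectorized, the resulting array can be passed to a $vectorSearch aggregation stage. The following is a minimal sketch under stated assumptions, not the application's actual query code: buildVectorSearchPipeline is a hypothetical helper, and the index name, numCandidates, and limit values used with it are illustrative. Only the embeddings field name comes from this page.

```typescript
// Shape of a single $vectorSearch aggregation stage.
interface VectorSearchStage {
  $vectorSearch: {
    index: string;
    path: string;
    queryVector: number[];
    numCandidates: number;
    limit: number;
  };
}

interface VectorSearchOptions {
  index: string;         // name of the Atlas Vector Search index (assumption)
  path: string;          // document field that stores the vectors
  numCandidates: number; // nearest neighbors considered during the search
  limit: number;         // number of documents returned
}

// Hypothetical helper: builds an aggregation pipeline from a query vector.
function buildVectorSearchPipeline(
  queryVector: number[],
  options: VectorSearchOptions
): VectorSearchStage[] {
  return [
    {
      $vectorSearch: {
        index: options.index,
        path: options.path,
        queryVector,
        numCandidates: options.numCandidates,
        limit: options.limit,
      },
    },
  ];
}
```

With the Node.js driver, you could pass the returned array to collection.aggregate(), using the vector returned by getTermEmbeddings as queryVector.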

Configuring the application

In your server/.env file, you'll find a few variables that you can use to configure the application.

The first one is EMBEDDINGS_SOURCE. It tells the application where to get the embeddings from. You can set it to openai.

Now that you have an OpenAI API key, you can set the EMBEDDING_KEY variable to your API key.

EMBEDDINGS_SOURCE=openai
EMBEDDING_KEY=sk-...
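A missing or misspelled variable is easy to overlook, so it can help to validate the configuration at startup. The following is a minimal sketch, assuming the two variable names above; readEmbeddingConfig and EmbeddingConfig are hypothetical names, and how the library application actually reads its configuration is not shown on this page.

```typescript
interface EmbeddingConfig {
  source: string;
  apiKey: string | undefined;
}

// Hypothetical helper: validates the two variables from server/.env.
// Takes the environment as a parameter, so it can be called with process.env.
function readEmbeddingConfig(
  env: Record<string, string | undefined>
): EmbeddingConfig {
  const source = env["EMBEDDINGS_SOURCE"];
  const apiKey = env["EMBEDDING_KEY"];
  if (!source) {
    throw new Error("EMBEDDINGS_SOURCE is not set");
  }
  // The OpenAI source cannot work without an API key.
  if (source === "openai" && !apiKey) {
    throw new Error("EMBEDDING_KEY must be set when EMBEDDINGS_SOURCE=openai");
  }
  return { source, apiKey };
}
```

Calling readEmbeddingConfig(process.env) when the server starts would fail early with a clear message instead of failing on the first embedding request.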