Getting Started with OpenAI's Assistants API
OpenAI has once again shaken up the AI industry with the releases at their DevDay on November 6th. While there are many new features to try out, in my opinion the Assistants API was the biggest release:
An Assistant has instructions and can leverage models, tools, and knowledge to respond to user queries.
The Assistants API currently supports three types of tools: Code Interpreter, Retrieval, and Function calling.
As Sam Altman highlighted at DevDay, building these agentic features was possible before, but it often required significant engineering and third-party libraries, and frankly wasn't always that reliable. Now, by combining code interpreter, retrieval, and function calling, we can build AI agents directly with the GPT API.
In this guide, we'll look at how to get started with this new capability, based on the Assistants API documentation, including:
- Overview of the Assistants API
- Assistant 1: Code Interpreter
- Assistant 2: Knowledge Retrieval
Overview of the Assistants API
Before we jump into the code, let's first look at a high-level overview of building on the Assistants API, as there are several new components.
First, let's start with the steps and definitions to create an Assistant (a condensed code sketch follows the list):
- Defining an Assistant: An Assistant is a purpose-built AI that uses models, instructions, and tools.
- Creating a Thread: A Thread is a conversation flow initiated by a user, to which messages can be added, creating an interactive session.
- Adding Messages: Messages contain the text input by the user and can include text, files, and images.
- Running the Assistant: Finally, we run the Assistant to process the Thread, call certain tools if necessary, and generate the appropriate response.
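Put together, these four steps map onto four API calls. Here's a condensed sketch (the instructions and message are just placeholders; each step is covered in detail below):

from openai import OpenAI

client = OpenAI()

# 1. Define an Assistant with a model and instructions
assistant = client.beta.assistants.create(
    model="gpt-4-1106-preview",
    instructions="You are a helpful assistant.",
)

# 2. Create a Thread to hold the conversation
thread = client.beta.threads.create()

# 3. Add a user Message to the Thread
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="Hello!"
)

# 4. Run the Assistant on the Thread to generate a response
run = client.beta.threads.runs.create(
    thread_id=thread.id, assistant_id=assistant.id
)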
Assistant 1: Code Interpreter
Now that we have an overview of the steps & definitions, let's build a simple Assistant that uses Code Interpreter.
Before we build the Assistant, note that in order to use these new features, we import and instantiate OpenAI slightly differently than before:
# Requires the v1 SDK (pip install --upgrade openai)
from openai import OpenAI

client = OpenAI()  # reads your OPENAI_API_KEY from the environment
Step 1: Creating an Assistant
In this example, we'll build a machine learning tutor Assistant with the following instruction:
You are an assistant that helps with machine learning coding problems. Write, run, and explain code to answer questions.
You can also see we've got the `code_interpreter` tool enabled:
assistant = client.beta.assistants.create(
    name="ML Code Helper",
    instructions="You are an assistant that helps with machine learning coding problems. Write, run, and explain code to answer questions.",
    tools=[{"type": "code_interpreter"}],
    model="gpt-4-1106-preview"
)
Step 2: Create a Thread
Next, let's create a Thread for the Assistant as follows:
thread = client.beta.threads.create()
The nice thing about Threads is that they don't have a size limit, meaning you can pass as many Messages as you want.
If you recall, with the previous GPT-4 API, creating "conversations" was accomplished by appending `user` and `assistant` responses onto each other. This not only drove up API costs significantly, but you also quickly ran out of token limit space after a few exchanges...but now:
The API will ensure that requests to the model fit within the maximum context window, using relevant optimization techniques such as truncation.
If we print out the Thread, we can see it's currently empty:
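For example (the id shown here is a placeholder):

print(thread)
# Thread(id='thread_abc123', created_at=1699300000, metadata={}, object='thread')

So let's add some Messages to it.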
Step 3: Adding Messages to a Thread
We can now add Messages to our Thread. In this case, I'll ask a relatively common question that a student might ask about machine learning:
When I try to calculate the cost for my linear regression, I get a 'ValueError: operands could not be broadcast together with shapes (100,) (100,1)'. Here's the part where it fails: cost = (1/(2*m)) * np.sum(np.square(y_pred - y))
# User is asking for help with their Python code for a linear regression cost function
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="When I try to calculate the cost for my linear regression, I get a 'ValueError: operands could not be broadcast together with shapes (100,) (100,1)'. Here's the part where it fails: `cost = (1/(2*m)) * np.sum(np.square(y_pred - y))`. Can you help me figure out why this is happening?"
)
We can now see a new `ThreadMessage` object with the user's question:
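Printing the message shows the new object; the output looks roughly like this (ids are placeholders):

print(message)
# ThreadMessage(id='msg_abc123', role='user', content=[MessageContentText(...)], thread_id='thread_abc123', ...)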
Step 4: Run the Assistant
Now we're ready to create a Run, which will run the Assistant on the Thread to trigger responses and automatically call relevant tools.
This makes the Assistant read the Thread and decide whether to call tools or simply use the model to best answer the user's query. After deciding what tools to use, the Assistant appends Messages to the Thread with `role="assistant"`.
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
    instructions="Please explain the solution in a simple way so the user can learn from it."
)
We can see the status of the Run is initially queued and then goes through the Run lifecycle:
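For example, checking right after creating the Run will typically show:

print(run.status)  # 'queued' at first, then 'in_progress', and finally 'completed'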
Step 5: Display the Assistant's Response
Next, we can retrieve the Run to check if it's completed as follows:
run = client.beta.threads.runs.retrieve(
    thread_id=thread.id,
    run_id=run.id
)
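Since Runs execute asynchronously, a single retrieve may still show in_progress. In practice, you'll want to poll until the Run reaches a terminal state. Here's a minimal helper of our own (a convenience function, not part of the SDK):

import time

def wait_for_run(thread_id, run_id, poll_interval=1.0):
    # Poll the Run until it reaches a terminal state
    # (requires_action only occurs when using function calling)
    while True:
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
        if run.status in ("completed", "failed", "cancelled", "expired", "requires_action"):
            return run
        time.sleep(poll_interval)

run = wait_for_run(thread.id, run.id)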
After it's completed, let's go and retrieve the Assistant messages that have been appended to the Thread:
messages = client.beta.threads.messages.list(
    thread_id=thread.id
)
Next, let's display the role and content of the Assistant message:
assistant_response = messages.data[0].content[0].text.value
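To display the full exchange, note that Messages are returned newest-first, so a quick loop of our own can print each role and text in chronological order:

# Print each message's role and text, oldest first
for msg in reversed(messages.data):
    print(f"{msg.role}: {msg.content[0].text.value}")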
Assistant Response
You can fix the broadcasting error by ensuring that both `y_pred` and `y` have the same shape before you perform the subtraction. You can do this by either reshaping `y_pred` to be a two-dimensional array with one column or by flattening `y` to be a one-dimensional array.
Here are two examples that you can use to fix the error:
Option 1: Reshape `y_pred` to match `y`'s shape
y_pred = y_pred.reshape(-1, 1) # Now y_pred shape will be (100, 1), matching y
cost = (1 / (2 * m)) * np.sum(np.square(y_pred - y))
Option 2: Flatten `y` to match `y_pred`'s shape
y = y.flatten() # Now y shape will be (100,), matching y_pred
cost = (1 / (2 * m)) * np.sum(np.square(y_pred - y))
Choose the appropriate option based on whether `y_pred` or `y` is closer to the desired shape in your context. If `y_pred` is the output of a machine learning model, it might output predictions in either shape, and you should adjust the shape of `y` accordingly.
Let's apply Option 2 and calculate the cost using the corrected shapes, as `y` being a one-dimensional array is more common in practice.
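As a quick sanity check of Option 2 outside the Assistant, here's a reproduction with synthetic data (shapes chosen to match the question):

import numpy as np

m = 100
y = np.random.rand(m, 1)    # targets, shape (100, 1)
y_pred = np.random.rand(m)  # predictions, shape (100,)

y = y.flatten()  # Option 2: flatten y to shape (100,) to match y_pred
cost = (1 / (2 * m)) * np.sum(np.square(y_pred - y))
print(f"cost: {cost:.4f}")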
Not bad. Alright now that we've got the Code Interpreter tool working, let's look at the Knowledge Retrieval Assistant.
Assistant 2: Knowledge Retrieval
As highlighted in the Knowledge Retrieval tool documentation:
Retrieval augments the Assistant with knowledge from outside its model, such as proprietary product information or documents provided by your users.
If you've seen our previous tutorials on embeddings and vector search, you know this used to take several moving parts; now it's all done natively within the GPT API.
For this example, let's start with the classic AI app: chatting with PDFs.
Similar to our first assistant, we can follow these steps to accomplish this:
- Upload files for retrieval
- Create a retrieval Assistant
- Create a thread & add messages to it
- Run the assistant
- Display the response
Step 1: Upload files for retrieval
First, let's upload a PDF to OpenAI with the `purpose` set to `assistants`. For this example, we'll of course use the classic Attention Is All You Need paper:
# Upload a file with an "assistants" purpose
file = client.files.create(
    file=open("/content/attention.pdf", "rb"),
    purpose='assistants'
)
If we check the files section of the OpenAI platform, you can find your uploaded files listed there:
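You can also confirm the upload programmatically (the output will reflect whatever you've uploaded):

# List uploaded files to verify the PDF made it
for f in client.files.list().data:
    print(f.id, f.filename, f.purpose)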
Step 2: Create a Retrieval Assistant
Next, let's create a new Assistant with simple retrieval instructions. We'll need to pass `retrieval` in the `tools` parameter, and we can attach the uploaded file to the Assistant via `file_ids` as follows:
# Create the assistant with the uploaded file attached
assistant = client.beta.assistants.create(
    instructions="You are a knowledge retrieval assistant. Use your knowledge base to best respond to users' queries.",
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}],
    file_ids=[file.id]
)
Step 3: Create a thread & add messages
Next up, we'll create a new Thread as follows:
thread = client.beta.threads.create()
And then we can add messages and files to our Thread. In this case, I'll just ask it to summarize the abstract of the paper and pass in the `file.id`:
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Summarize the abstract of the paper.",
    file_ids=[file.id]
)
Step 4: Run the assistant
Now that we have the context of both the message and the file in our Thread, we can run the Thread with our Assistant as follows:
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
)
After you run it, it takes a minute or two for the run lifecycle to complete.
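Rather than waiting blindly, we can reuse the `wait_for_run` helper we sketched earlier:

run = wait_for_run(thread.id, run.id)
print(run.status)  # completed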
Step 5: Display the response
After the Run status is complete, we can retrieve the responses as follows:
messages = client.beta.threads.messages.list(
    thread_id=thread.id
)
Now, let's access the assistant response containing the abstract summary like this:
assistant_response = messages.data[0].content[0].text.value
print(assistant_response)
Which, as we can see, returns:
The abstract of the paper introduces the Transformer, a novel network architecture designed for sequence transduction tasks that is based solely on attention mechanisms and does not rely on recurrent or convolutional neural networks...
Note that these responses can also include annotations (such as file citations), although we'll cover that in a future article.
Summary: Getting Started with the Assistants API
In this guide, we looked at two of the built-in tools that Assistants can use: Code Interpreter and Knowledge Retrieval. Already, I can see retrieval augmented generation (RAG) is going to be significantly easier with this built-in embeddings & vector search capability.
Of course, this is a starter tutorial so it's important to note a few things:
- Code Interpreter & Retrieval: We can also combine these two tools to run Code Interpreter on specific data, for example a data visualization assistant for your CSV data.
- Function Calling: Assistants also have access to function calling, which can be used to connect the Assistant to external APIs or our own functions. This is a bigger topic, so we'll cover that in a dedicated article soon.
These next few weeks and months will certainly be interesting to watch as AI agents start to proliferate into our everyday lives.
As Sam Altman said about AI agents at DevDay...
The upsides of this are going to be tremendous.