MLQ Academy: PDF to Q&A Assistant using Embeddings & GPT-3
In this video tutorial, we're going to walk through a Colab notebook that shows you how to upload a PDF document and create a factual Q&A assistant from its text using GPT-3.
Specifically, we'll be using the OpenAI Embeddings API to retrieve document context relevant to the user's question, and then the Completions API to answer the question using that information. As OpenAI highlights:
OpenAI’s text embeddings measure the relatedness of text strings. The distance between two vectors measures their relatedness. Small distances suggest high relatedness and large distances suggest low relatedness.
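To make that concrete, here's a minimal sketch of measuring relatedness with embeddings. It assumes the legacy `openai` Python package (pre-1.0) and the `text-embedding-ada-002` model; the exact model and helper names in the notebook may differ. Because OpenAI embeddings are normalised to length 1, the dot product of two embedding vectors is equivalent to their cosine similarity, so a higher score means the texts are more related.

```python
# Minimal sketch: embedding two strings and scoring their relatedness.
# Assumes the legacy openai package (< 1.0) and text-embedding-ada-002.
import numpy as np
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder


def get_embedding(text: str, model: str = "text-embedding-ada-002") -> list[float]:
    result = openai.Embedding.create(model=model, input=text)
    return result["data"][0]["embedding"]


def vector_similarity(x: list[float], y: list[float]) -> float:
    # OpenAI embeddings are normalised to length 1, so the dot product
    # is equivalent to cosine similarity.
    return float(np.dot(np.array(x), np.array(y)))


a = get_embedding("What is the notice period in this contract?")
b = get_embedding("Either party may terminate with 30 days' written notice.")
print(vector_similarity(a, b))  # higher score = more related
```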
The steps we're going to walk through include (with a condensed code sketch of the full pipeline after the list):
- Install packages, import libraries, and set up our OpenAI API constants
- Upload the PDF, read the text, and prepare it for the Embeddings API
- Compute & load the document embeddings
- Calculate the document & query similarity to retrieve relevant context
- Construct a prompt that creates a factual Q&A bot with our user's question and relevant context from the PDF
- Answer the user's question with the Completions API
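Putting those steps together, here's a condensed sketch of the full pipeline. The PDF library (`PyPDF2`), the per-page chunking, and the completion model (`text-davinci-003`) are assumptions made for illustration; the Colab notebook may use different helpers and splitting logic.

```python
# Condensed, hypothetical sketch of the PDF -> embeddings -> Q&A pipeline.
# Assumes the legacy openai package (< 1.0) and PyPDF2 for text extraction.
import numpy as np
import openai
from PyPDF2 import PdfReader

openai.api_key = "YOUR_API_KEY"  # placeholder

EMBEDDING_MODEL = "text-embedding-ada-002"
COMPLETIONS_MODEL = "text-davinci-003"


def read_pdf(path: str) -> list[str]:
    """Extract text from each page and collect it into rough chunks."""
    reader = PdfReader(path)
    chunks = []
    for page in reader.pages:
        text = page.extract_text() or ""
        # One chunk per page keeps the example simple; the notebook may
        # instead split by paragraphs or token counts.
        if text.strip():
            chunks.append(text.strip())
    return chunks


def get_embedding(text: str) -> list[float]:
    result = openai.Embedding.create(model=EMBEDDING_MODEL, input=text)
    return result["data"][0]["embedding"]


def compute_doc_embeddings(chunks: list[str]) -> dict[int, list[float]]:
    return {i: get_embedding(chunk) for i, chunk in enumerate(chunks)}


def vector_similarity(x: list[float], y: list[float]) -> float:
    return float(np.dot(np.array(x), np.array(y)))


def most_relevant_chunks(question, doc_embeddings, chunks, top_n=3):
    """Rank document chunks by similarity to the question's embedding."""
    query_embedding = get_embedding(question)
    ranked = sorted(
        ((vector_similarity(query_embedding, emb), i) for i, emb in doc_embeddings.items()),
        reverse=True,
    )
    return [chunks[i] for _, i in ranked[:top_n]]


def answer_question(question, chunks, doc_embeddings) -> str:
    """Build a factual Q&A prompt from the most relevant context and answer it."""
    context = "\n\n".join(most_relevant_chunks(question, doc_embeddings, chunks))
    prompt = (
        "Answer the question as truthfully as possible using the provided context, "
        "and if the answer is not contained within the text below, say \"I don't know.\"\n\n"
        f"Context:\n{context}\n\nQ: {question}\nA:"
    )
    response = openai.Completion.create(
        model=COMPLETIONS_MODEL,
        prompt=prompt,
        temperature=0,
        max_tokens=300,
    )
    return response["choices"][0]["text"].strip()


chunks = read_pdf("example.pdf")
doc_embeddings = compute_doc_embeddings(chunks)
print(answer_question("What is this document about?", chunks, doc_embeddings))
```

Setting `temperature=0` and instructing the model to say "I don't know" when the answer isn't in the supplied context is what pushes the assistant toward factual answers rather than speculation.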
The Colab notebook is adapted from this OpenAI Cookbook on Question Answering using Embeddings, and in future videos, I'll show you how to take this out of Colab and build a simple web app using Streamlit that allows users to upload their own documents and create question-answering assistants.