Create and Deploy RAG Applications with Gradio
In this tutorial, we’ll demonstrate how to use Gradio to build an interactive Semantic Search and Question Answering app using Hugging Face embeddings, Upstash Vector, and LangChain. Users can enter a question, and the app will retrieve relevant information and provide an answer.
Important Note on Python Version
Recent Python versions may cause compatibility issues with torch, a dependency for Hugging Face models. Therefore, we recommend using Python 3.9 to avoid any installation issues.
Installation and Setup
First, we need to set up our environment and install the necessary libraries. Install the dependencies by running the following command:
Next, create a .env file in your project directory with the following content, replacing your_upstash_url and your_upstash_token with your actual Upstash credentials:
This configuration file will allow us to load the required environment variables.
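As a rough illustration of how a script might confirm these values are available, the sketch below checks that both settings are present before anything else runs. The variable names UPSTASH_VECTOR_REST_URL and UPSTASH_VECTOR_REST_TOKEN are assumptions (they are the names the Upstash Vector client conventionally reads), and the helper missing_upstash_settings is hypothetical. In the actual app, a package such as python-dotenv would typically load the .env file into the process environment first.

```python
import os

# Assumed variable names; adjust them if your .env file uses different keys.
REQUIRED_SETTINGS = ("UPSTASH_VECTOR_REST_URL", "UPSTASH_VECTOR_REST_TOKEN")


def missing_upstash_settings(env=None):
    """Return the required setting names that are absent or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_upstash_settings()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("Upstash credentials found.")
```

Failing early on a missing credential tends to give a clearer error than a failed HTTP request later in the script.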
Code
We will load our environment variables, initialize the Hugging Face embeddings model, set up Upstash Vector, and configure a Hugging Face Question Answering model.
Next, we will create sample documents, embed them using Hugging Face embeddings, and store them in Upstash Vector.
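The step described above might look roughly like the following sketch. It is not the tutorial's original code: the function name index_sample_documents, the model name, the default values, and the environment variable names are all assumptions chosen for illustration, and the credentials are passed explicitly rather than relying on any automatic lookup.

```python
import os


def index_sample_documents(texts, batch_size=100, embedding_chunk_size=200):
    """Embed the given texts and store them in Upstash Vector."""
    # Third-party imports are kept inside the function so that this module
    # can be imported even before the dependencies are installed.
    from langchain_community.vectorstores.upstash import UpstashVectorStore
    from langchain_core.documents import Document
    from langchain_huggingface import HuggingFaceEmbeddings

    # Assumed model name; use a model whose output dimension matches
    # the dimension of your Upstash Vector index.
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )
    # Assumed environment variable names for the Upstash credentials.
    store = UpstashVectorStore(
        embedding=embeddings,
        index_url=os.environ["UPSTASH_VECTOR_REST_URL"],
        index_token=os.environ["UPSTASH_VECTOR_REST_TOKEN"],
    )
    documents = [Document(page_content=text) for text in texts]
    store.add_documents(
        documents=documents,
        batch_size=batch_size,
        embedding_chunk_size=embedding_chunk_size,
    )
    return store
```

Calling index_sample_documents with a short list of strings would embed each string and upload the resulting vectors, returning the store for later similarity searches.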
When documents are inserted, they are first embedded using the Embeddings object. Many embedding models, such as the Hugging Face models, support embedding multiple documents at once, so documents can be batched and embedded in parallel for efficiency.
- The embedding_chunk_size parameter controls the number of documents processed in parallel when creating embeddings.
Once the embeddings are created, they are stored in Upstash Vector. To reduce the number of HTTP requests, the vectors are also batched when they are sent to Upstash Vector.
- The batch_size parameter controls the number of vectors included in each HTTP request when sending to Upstash Vector.
In the Upstash Vector free tier, there is a limit of 1000 vectors per batch.
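To make the effect of the two parameters concrete, here is a small pure-Python model of the batching described above. It is a simplified sketch, not the library's implementation: it assumes each document yields exactly one vector and that embedding chunks and upload batches are formed independently. The helper names chunked and plan_batches are hypothetical.

```python
def chunked(items, size):
    """Yield consecutive slices of items with at most size elements each."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def plan_batches(num_documents, embedding_chunk_size, batch_size):
    """Return (embedding_calls, upload_requests) for a configuration.

    Simplification: one vector per document, and vectors are batched
    for upload independently of the embedding chunks.
    """
    documents = list(range(num_documents))
    embedding_calls = sum(1 for _ in chunked(documents, embedding_chunk_size))
    upload_requests = sum(1 for _ in chunked(documents, batch_size))
    return embedding_calls, upload_requests


if __name__ == "__main__":
    calls, uploads = plan_batches(
        num_documents=450, embedding_chunk_size=200, batch_size=100
    )
    print(f"{calls} embedding calls, {uploads} upload requests")
```

Under this model, 450 documents with embedding_chunk_size=200 and batch_size=100 need three embedding calls and five upload requests. Larger values mean fewer round trips but more memory per call, and batch_size must stay within the per-batch limit of your plan.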
Now, we can set up a Question Answering model and the Gradio interface.
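One way to structure this step is sketched below. It is an illustrative outline under stated assumptions, not the tutorial's original code: answer_question and build_interface are hypothetical names, the retriever and the answer extractor are passed in as plain callables so the logic can be exercised without any model, and the interface title is a placeholder.

```python
def answer_question(question, retrieve, extract_answer, k=3):
    """Retrieve up to k context passages and extract an answer from them."""
    if not question.strip():
        return "Please enter a question."
    passages = retrieve(question, k)
    if not passages:
        return "No relevant documents were found."
    context = "\n".join(passages)
    return extract_answer(question, context)


def build_interface(retrieve, extract_answer):
    """Wrap answer_question in a simple text-in, text-out Gradio interface."""
    import gradio as gr  # imported lazily; requires the gradio package

    return gr.Interface(
        fn=lambda question: answer_question(question, retrieve, extract_answer),
        inputs="text",
        outputs="text",
        title="RAG Question Answering",  # placeholder title
    )
```

In a full app, retrieve could wrap the vector store's similarity_search and return the page_content of each hit, while extract_answer could wrap a Transformers question-answering pipeline and return its "answer" field. Calling launch() on the object returned by build_interface would then start the app.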
Running the App
After setting up the code, run your script to start the Gradio app. You will be presented with an interface where you can enter a question. The app will retrieve the most relevant information from the embedded documents and provide an answer based on the content.
Notes
- Deployment: To create a public link, set share=True in launch(). This will generate a public URL for your Gradio app. This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run gradio deploy from the terminal to deploy to Hugging Face Spaces.
- Batch Processing: The batch_size and embedding_chunk_size parameters allow you to control the efficiency of document processing and storage in Upstash Vector.
- Namespaces: Upstash Vector supports namespaces for organizing different types of documents. You can set a namespace while creating the UpstashVectorStore instance.