RAG with LangChain
How to use Exa’s integration with LangChain to perform RAG.
LangChain is a framework for building applications that combine LLMs with data, APIs and other tools. In this guide, we’ll go over how to use Exa’s LangChain integration to perform RAG with the following steps:
- Set up Exa’s LangChain integration and use Exa to retrieve relevant content
- Connect this content to a toolchain that uses OpenAI’s LLM for generation
Get Started
Pre-requisites and installation
Install the core OpenAI and Exa LangChain libraries
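A minimal install sketch, assuming the langchain, langchain-openai, and langchain-exa packages (the Exa integration is published separately from core LangChain):

```bash
pip install langchain langchain-openai langchain-exa
```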
Set the environment variables OPENAI_API_KEY and EXA_API_KEY for your OpenAI and Exa API keys respectively. You can get your Exa API key from the Exa dashboard.
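For example, a minimal sketch that sets both keys in-process (the placeholder values are assumptions; substitute your real keys, or export these variables in your shell instead):

```python
import os

# Both integrations read their keys from these environment variables.
os.environ["OPENAI_API_KEY"] = "<your-openai-api-key>"
os.environ["EXA_API_KEY"] = "<your-exa-api-key>"
```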
Use Exa Search to power a LangChain Tool
Set up a retriever tool using ExaSearchRetriever
This is a retriever that connects to Exa Search to find relevant documents via semantic search. First, import the relevant libraries and instantiate the ExaSearchRetriever.
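A sketch of the instantiation, assuming three results per query and Exa's highlights feature for snippet extraction (both parameter values are illustrative):

```python
from langchain_exa import ExaSearchRetriever

# Retrieve the top 3 results from Exa's semantic search,
# asking Exa to return relevant highlight snippets for each result.
retriever = ExaSearchRetriever(k=3, highlights=True)
```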
Create a prompt template (optional)
We use a LangChain PromptTemplate to define a template with placeholders for the URL and highlights parsed from each Exa result.
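One possible template, assuming each retrieved document is rendered as a small XML-style block (the exact wrapper tags are an assumption; any format the LLM can read works):

```python
from langchain_core.prompts import PromptTemplate

# Template applied to each retrieved document; {url} and {highlights}
# are filled in from the parsed Exa result in the next step.
document_prompt = PromptTemplate.from_template(
    """
<source>
    <url>{url}</url>
    <highlights>{highlights}</highlights>
</source>
"""
)
```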
Parse the URL and content from Exa results
We use a RunnableLambda to parse the URL and highlights attributes out of each Exa Search result, then pass them to the prompt template above.
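A sketch of that parsing step, assuming the retriever surfaces the result URL and highlights in each Document's metadata:

```python
from langchain_core.runnables import RunnableLambda

# Pull the url and highlights out of each Document's metadata and
# feed them into the per-document prompt template defined above.
document_chain = (
    RunnableLambda(
        lambda document: {
            "url": document.metadata["url"],
            "highlights": document.metadata.get("highlights", "No highlights"),
        }
    )
    | document_prompt
)
```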
Join Exa results and content for retrieval
Complete the retrieval chain by stitching together the Exa retriever, the parser, and a short lambda function. The lambda is crucial: it joins the results into a single string that is passed as context to the LLM in the next step.
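Putting the pieces together might look like this: .map() applies the per-document chain to every retrieved result, and the final lambda joins the rendered documents into one context string (using to_string(), since each rendered document is a prompt value):

```python
# retriever -> list of Documents -> list of rendered sources -> one string
retrieval_chain = (
    retriever
    | document_chain.map()
    | (lambda docs: "\n".join(doc.to_string() for doc in docs))
)
```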
Set up the rest of the toolchain including OpenAI for generation
In this step, we define the generation prompt with query and context template inputs, filled in from the user's question and the Exa Search results respectively. First, once again import the relevant libraries and components from LangChain's libraries.
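The imports for this step might look like:

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_openai import ChatOpenAI
```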
Then we define a generation prompt: the prompt template that is combined with context from Exa to perform RAG.
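A sketch of such a prompt, with {query} and {context} placeholders (the instruction wording here is an assumption; adjust it to your use case):

```python
generation_prompt = ChatPromptTemplate.from_template(
    """You are a helpful research assistant. Answer the user's query
using only the sources provided in the context.

Query: {query}

Context:
{context}
"""
)
```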
We set the generation LLM to OpenAI, then connect everything with a RunnableParallel. The generation prompt, containing the query and context, is passed to the LLM, and the response is parsed for a cleaner output.
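Roughly, the wiring could look like this: RunnableParallel feeds the raw user input to both branches, so the query passes through unchanged while the context comes from the retrieval chain, and StrOutputParser unwraps the model's message into a plain string:

```python
llm = ChatOpenAI()  # uses the library's default OpenAI chat model; set model= explicitly if you prefer

chain = (
    RunnableParallel(
        {
            "query": RunnablePassthrough(),  # the user's question, passed through as-is
            "context": retrieval_chain,      # Exa results rendered to a single string
        }
    )
    | generation_prompt
    | llm
    | StrOutputParser()
)
```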
Running the full RAG toolchain
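Invoking the chain with a query string runs retrieval and generation end to end (the query here is just an example):

```python
result = chain.invoke("What are the latest developments in battery technology?")
print(result)
```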
Optionally, stream the output of the chain
Instead of waiting for the full response, you can stream the output as it is generated. See the LangChain documentation to learn more about the .stream method and other options, including handling of chunks and how to think about further parsing the outputs.
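A minimal streaming sketch; chain.stream yields string chunks (thanks to StrOutputParser) as they are generated:

```python
for chunk in chain.stream("What are the latest developments in battery technology?"):
    print(chunk, end="", flush=True)
```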
As you can see, the generated output is enriched with the context from our Exa Search results!