Company Analyst
Example project using the Exa Python SDK.
What this doc covers
- Using Exa’s link similarity search to find related links
- Using the keyword search setting with Exa search_and_contents
In this example, we’ll build a company analyst tool that researches companies relevant to what you’re interested in. If you just want to see the code, check out the Colab notebook.
The code requires an Exa API key and an OpenAI API key. Get 1000 free Exa searches per month just for signing up!
Shortcomings of Google
Say we want to find companies similar to Thrifthouse, a platform for selling secondhand goods on college campuses. Googling “companies similar to Thrifthouse” doesn’t do a very good job: traditional search engines rely heavily on keyword matching, so in this case we get results about physical thrift stores. Hm, that’s not really what we want.
Let’s try again, this time searching on a description of the company by googling “community based resale apps.” This isn’t very helpful either; it just returns premade SEO-optimized listicles…
What we really need is neural search.
What is neural search?
Exa is a fully neural search engine built using a foundational embeddings model trained for webpage retrieval. It’s capable of understanding entity types (company, blog post, GitHub repo), descriptors (funny, scholastic, authoritative), and any other semantic qualities in a query. Neural search can be far more useful than traditional keyword-based search for complex queries like these.
Finding companies with Exa link similarity search
Let’s try Exa, using the Python SDK! We can use the find_similar_and_contents function to find similar links and get contents from each link. The input is simply a URL, https://thrift.house, and we set num_results=10 (this is customizable up to thousands of results in Exa).
By specifying highlights={"num_sentences":2} for each search result, Exa will also identify and return a two-sentence excerpt from the content that’s relevant to our query. This will allow us to quickly understand each website that we find.
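The call described above could look roughly like the sketch below. The helper name find_similar_companies, the main function, and the EXA_API_KEY environment variable are assumptions made for illustration; only find_similar_and_contents, num_results, and highlights come from the text. The client is passed in as an argument so the helper can be tried with a stand-in object.

```python
import os


def find_similar_companies(exa, url, num_results=10):
    """Return one dict per page that Exa considers similar to `url`.

    `exa` is expected to be an exa_py.Exa client (hypothetical helper,
    not part of the SDK).
    """
    response = exa.find_similar_and_contents(
        url,
        num_results=num_results,
        # Ask for a two-sentence highlight from each result's content.
        highlights={"num_sentences": 2},
    )
    return [
        {"title": r.title, "url": r.url, "highlights": r.highlights}
        for r in response.results
    ]


def main():
    # Imported here so the helper above can be exercised without the SDK.
    from exa_py import Exa  # pip install exa_py

    # EXA_API_KEY is an assumed environment variable name.
    exa = Exa(api_key=os.environ["EXA_API_KEY"])
    for company in find_similar_companies(exa, "https://thrift.house"):
        print(company["title"], company["url"])
```

Calling main() with a valid key set should print one title and URL per similar page; the actual results depend on Exa’s index at the time of the request.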
This is an example of the full first result:
And here are the 10 titles and URLs I got:
Looks pretty darn good! As a bonus specifically for company data, specifying category="company" in the SDK will search across a curated, larger companies dataset. If you’re interested in this, let us know!
Now that we have 10 companies we want to dig into further, let’s do some research on each of these companies.
Finding additional info for each company
Now let’s get more information by finding additional webpages about each company. To do this, we’ll run a keyword search on each company’s URL. We use keyword search because we want webpages that exactly match the company we’re inputting. We can do this with the search_and_contents function, specifying type="keyword" and num_results=5. This will give us 5 websites about each company.
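As a minimal sketch of this step, assuming exa is an already constructed exa_py.Exa client: the helper names research_company and research_companies are hypothetical, and text=True is added here as an assumption so that each result carries its page text for later summarization.

```python
def research_company(exa, company_url, num_results=5):
    """Keyword-search a company's URL and return the pages found.

    `exa` is expected to be an exa_py.Exa client (hypothetical helper,
    not part of the SDK).
    """
    response = exa.search_and_contents(
        company_url,
        type="keyword",           # exact-match style search, as in the text
        num_results=num_results,
        text=True,                # assumption: request full page text
    )
    return [
        {"title": r.title, "url": r.url, "text": r.text}
        for r in response.results
    ]


def research_companies(exa, company_urls):
    """Map each company URL to the list of pages found about it."""
    return {url: research_company(exa, url) for url in company_urls}
```

Returning plain dicts keyed by company URL keeps the search step separate from whatever consumes the content afterwards.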
Here’s an example of the first result for the first company, Rumie App. You can see that the first result is the content of the company’s own link.
Creating a report with LLMs
Finally, let’s create a summarized report that lists our 10 companies and gives us an easily digestible summary of each company. We can input all of this web content into an LLM and have it generate a nice report!
And we’re done! We’ve built an app that takes in a company webpage and uses Exa to
- Discover similar startups
- Find information about each of those startups
- Gather useful content and summarize it with OpenAI
Hopefully you found this tutorial helpful and are ready to start building your very own company analyst! Whether you want to generate sales leads or research competitors to your own company, Exa’s got you covered.