Recruiting Agent
What this doc covers
- Using Exa search with includeDomain to only retrieve search results from a specified domain
- Using Exa keyword search to find specific people by name
- Using excludeDomain to ignore certain low-signal domains
- Using Exa link similarity search to find similar websites
Introduction
In this tutorial, we use Exa to automate the process of discovering, researching, and evaluating exceptional candidates. If you just want to see the code, check out the Colab notebook.
Here’s what we’re going to do:
- Candidate research: Identify potential candidates and use Exa to find additional details, such as personal websites, LinkedIn profiles, and their research topics.
- Candidate evaluation: Evaluate candidates using an LLM to score their fit to our hiring criteria.
- Finding more candidates: Discover more candidates similar to our top picks.
This project requires an Exa API key and an OpenAI API key. Get 1000 Exa searches per month free just for signing up!
Initial Candidates
Suppose I’m building Simile, an AI startup for web retrieval.
My hiring criteria is:
- AI experience
- interest in retrieval, databases, and knowledge
- available to work now or soon
We start with 13 example PhD students recommended by friends. All I have is their name and email.
Information Enrichment
Now, let’s add more information about the candidates: current school, LinkedIn, and personal website.
First, we’ll define a helper function to call OpenAI — we’ll use this for many of our later functions.
We’ll ask GPT to extract the candidate’s school from their email address.
Now that we have their school, let’s use Exa to find their LinkedIn and personal website too.
Here, we’re passing in type="keyword"
to do an Exa keyword search because we want our results to have the exact name in the result. We also specify include_domains=['linkedin.com']
to restrict the results to LinkedIn profiles.
To now find the candidate’s personal website, we can use the same Exa query, but we want to also scrape the website’s contents. To do this, we use search_and_contents
.
We can also exclude some misleading websites with exclude_domains=['linkedin.com', 'github.com', 'twitter.com']
. Whatever’s left has a good chance of being their personal site!
Now that I have personal websites of each candidate, we can use Exa and GPT-4 to answer questions like:
- what are they doing now? Or what class year are they?
- where did they do their undergrad?
- what topics do they research?
- are they an AI researcher?
Once we have all of the page’s contents, let’s start asking some questions:
Candidate Evaluation
Next, we use GPT-4 to score candidates 1-10 based on fit. This way, we can use Exa to find more folks similar to our top-rated candidates.
Finally, let’s enrich our dataframe of people. We define a function enrich_row
that uses all the functions we defined to learn more about a candidate,and sort by score to get the most promising candidates.
Finding more candidates
Now that we know how to research candidates, let’s find some more! We’ll take each of the top candidates (score 7-10), and use Exa to find similar profiles.
Exa’s find_similar
,allows us to search a URL and find semantically similar URLs. For example, I could search ‘hinge.co’ and it’ll return the homepages of similar dating apps. In this case, we’ll pass in the homepages of our top candidates to find similar profiles.
Final stretch — let’s put it all together. Let’s find and add our new candidates to our original dataframe.
Alrighty, that’s it! We’ve just built an automated way of finding, researching, and evaluating candidates. You can use this for recruiting, or tailor this to find customers, companies, etc.
And the best part is that every time you use Exa to find new candidates, you can do more find_similar(new_candidate_homepage)
searches with the new candidates as well — helping you build an infinite list!
Hope this tutorial was helpful and don’t forget, you can get started with Exa for free :)