What this doc covers

1. Generating search queries for Exa using an LLM
2. Retrieving relevant URLs and their contents using Exa
3. Summarizing webpage contents using an LLM

In this example, we will build an LLM-based news summarizer with the Exa API to keep us up to date with the latest news on a given topic. We will use Exa to retrieve recent news articles and then feed the article contents to GPT-3.5 Turbo for summarization. This is a form of Retrieval-Augmented Generation (RAG).

The Jupyter notebook for this tutorial is available on Colab for easy experimentation. You can also check it out on GitHub, including a plain Python version if you want to skip to the complete product.

To play with this code, we just need an Exa API key and an OpenAI API key. Get 1000 free Exa searches per month just for signing up!

Setup

Python
# install Exa and OpenAI SDKs
!pip install exa_py
!pip install openai
Python
from google.colab import userdata # comment this out if you're not using Colab

EXA_API_KEY = userdata.get('EXA_API_KEY') # replace userdata.get(...) with your API key, or add your API key to Colab Secrets
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY') # replace userdata.get(...) with your API key, or add your API key to Colab Secrets
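
Outside Colab, `userdata` is not available. One possible alternative, sketched here under the assumption that the keys are exported as environment variables, is a small loader. The name `load_api_key` is hypothetical and not part of either SDK.

```python
import os


def load_api_key(name: str) -> str:
    """Return the API key stored in the environment variable `name`.

    Hypothetical helper for running this tutorial outside Colab.
    """
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first.")
    return key


# Usage (assumes both variables are exported in your shell):
# EXA_API_KEY = load_api_key("EXA_API_KEY")
# OPENAI_API_KEY = load_api_key("OPENAI_API_KEY")
```

Failing early with a clear message tends to be easier to debug than an authentication error from the first API call.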

Retrieving news with Exa

Let’s use the Exa neural search engine to search the web for relevant links to the user’s question.

First, we ask the LLM to generate a search engine query based on the question.

Python
import openai
from exa_py import Exa

openai.api_key = OPENAI_API_KEY
exa = Exa(EXA_API_KEY)

SYSTEM_MESSAGE = "You are a helpful assistant that generates search queries based on user questions. Only generate one search query."
USER_QUESTION = "What's the recent news in physics this week?"

completion = openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": USER_QUESTION},
    ],
)

search_query = completion.choices[0].message.content

print("Search query:")
print(search_query)
Search query:
Recent news in physics this week
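
Chat models do not always return a bare query; the reply may come back wrapped in quotation marks or spread over several lines. As a hedged sketch, we could normalize the text before handing it to Exa. The helper `clean_search_query` is hypothetical and not part of the OpenAI or Exa SDKs.

```python
def clean_search_query(raw: str) -> str:
    """Keep the first non-empty line and strip surrounding quotes and whitespace."""
    for line in raw.splitlines():
        candidate = line.strip().strip("\"'").strip()
        if candidate:
            return candidate
    return ""  # nothing usable in the model's reply


print(clean_search_query('"Recent news in physics this week"\n'))
```

An empty return value is a signal to retry the LLM call or fall back to the user's original question.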

Looks good! Now let’s put the search query into Exa. Let’s also use start_published_date to restrict the results to pages published in the last week. Notice that we set use_autoprompt=True, which lets the Exa API further optimize our search query. Exa queries work best when phrased in a particular way, and autoprompt handles that rephrasing for us.

Python
from datetime import datetime, timedelta

one_week_ago = (datetime.now() - timedelta(days=7))
date_cutoff = one_week_ago.strftime("%Y-%m-%d")

search_response = exa.search_and_contents(
    search_query, use_autoprompt=True, start_published_date=date_cutoff
)

urls = [result.url for result in search_response.results]
print("URLs:")
for url in urls:
    print(url)
URLs:
https://phys.org/news/2024-01-carrots-reveals-mechanics-root-vegetable.html?utm_source=twitter.com&utm_medium=social&utm_campaign=v2
https://phys.org/news/2024-01-astrophysicists-theoretical-proof-traversable-wormholes.html
https://gizmodo.com/proton-physics-strong-force-quarks-measurement-1851192840
https://www.nytimes.com/2024/01/24/science/space/black-holes-photography-m87.html
https://phys.org/news/2024-01-liquid-lithium-walls-fusion-device.html?utm_source=twitter.com&utm_medium=social&utm_campaign=v2
https://physics.aps.org/articles/v17/s13
https://phys.org/news/2024-01-validating-hypothesis-complex.html?utm_source=twitter.com&utm_medium=social&utm_campaign=v2
https://phys.org/news/2024-01-scientists-previously-unknown-colonies-emperor.html?utm_source=twitter.com&utm_medium=social&utm_campaign=v2
https://phys.org/news/2024-01-reveals-quantum-topological-potential-material.html?utm_source=twitter.com&utm_medium=social&utm_campaign=v2
https://phys.org/news/2024-01-shallow-soda-lakes-cradles-life.html?utm_source=twitter.com&utm_medium=social&utm_campaign=v2
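
The cutoff above depends on the local clock through `datetime.now()`. If reproducibility matters, one option is to compute it from an explicit, timezone-aware timestamp. The function `published_since` below is a hypothetical helper; it only builds the `YYYY-MM-DD` string that the tutorial passes as `start_published_date`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def published_since(days: int, now: Optional[datetime] = None) -> str:
    """Return the date `days` days before `now`, formatted as YYYY-MM-DD."""
    if days < 0:
        raise ValueError("days must be non-negative")
    # Default to the current time in UTC when no timestamp is supplied.
    now = now or datetime.now(timezone.utc)
    return (now - timedelta(days=days)).strftime("%Y-%m-%d")


fixed_now = datetime(2024, 1, 24, tzinfo=timezone.utc)
print(published_since(7, now=fixed_now))  # 2024-01-17
```

Passing `now` explicitly also makes the date logic easy to unit test.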

Now we’re getting somewhere! Exa gave our app a list of relevant, useful URLs based on the original question.
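
Several of the URLs above carry `utm_*` tracking parameters. If we want cleaner links, or want to avoid treating the same article as two different pages, a sketch along these lines could help. It uses only the standard library; `strip_tracking_params` and `unique_urls` are hypothetical helper names, and treating every `utm_`-prefixed parameter as removable is an assumption.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_tracking_params(url: str) -> str:
    """Drop query parameters whose names start with 'utm_'."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def unique_urls(urls):
    """Return cleaned URLs in their original order, without duplicates."""
    seen = set()
    cleaned = []
    for url in urls:
        canonical = strip_tracking_params(url)
        if canonical not in seen:
            seen.add(canonical)
            cleaned.append(canonical)
    return cleaned


print(strip_tracking_params(
    "https://phys.org/news/2024-01-liquid-lithium-walls-fusion-device.html"
    "?utm_source=twitter.com&utm_medium=social&utm_campaign=v2"
))
```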

By the way, we might be wondering what makes Exa special. Why can’t we just search with Google? Well, let’s take a look for ourselves at the Google search results. Google gives us the front pages of lots of news aggregators, but not the news articles themselves. And since we used Exa’s search_and_contents, our search also returns the webpage contents, so we can skip writing a web crawler and access the knowledge directly!

Python
results = search_response.results
result_item = results[0]
print(f"{len(results)} items total, printing the first one:")
print(result_item.text)
10 items total, printing the first one:

 Credit: CC0 Public Domain

Chopped carrot pieces are among the most universally enjoyed foods and a snacking staple—a mainstay of school lunchboxes, picnics and party platters year-round.
Now researchers from the University of Bath have uncovered the secret science of prepping the popular root vegetable and quantified the processes that make them curl up if left uneaten for too long.
Mechanical Engineering student Nguyen Vo-Bui carried out the research as part of his final-year studies, in the limited circumstances of COVID-19 lockdowns of 2021. The research paper, "Modelling of longitudinally cut carrot curling induced by the vascular cylinder-cortex interference pressure", is published in Royal Society Open Science.
Without access to labs, Nguyen aimed to identify the geometrical and environmental factors that have the most influence on carrots' longevity. Working in his kitchen, he characterized, analytically modeled and verified the aging of over 100 Lancashire Nantes carrot halves, cut lengthways, using finite-element (FE) models normally used in structural engineering.
The research team concluded that residual stresses and dehydration were the two key factors behind the curling behavior. The starchy outer layer of the carrot (the cortex) is stiffer than the soft central vein (also known as the vascular cylinder). When cut lengthwise, the two carrot halves curl because the difference in stress becomes unbalanced. Dehydration leads to further loss of stiffness, further driving the curling effect.
Their recommendations to manufacturers include handling carrots in cold, moist, airtight and humidity-controlled environments to protect their natural properties and increase their edible life span.
They say the study provides a methodology to predict the deformation of cut root vegetables, adding that the procedure is likely to apply to other plant structures. The study gives food producers a new mathematical tool that could be applied to the design of packaging and food handling processes, potentially reducing food waste.
One of the world's top crops by market value, carrots are known for their high production efficiency—but despite this, wastage is high. Around 25–30% of this occurs prior to processing and packaging—due to deformities, mechanical damage or infected sections. Fresh cut and minimally processed carrots are a convenient ready-to-use ingredient that make possible the use of carrots that might otherwise be discarded, reducing food waste.
Dr. Elise Pegg, a senior lecturer in Bath's Department of Mechanical Engineering, is one of the research paper authors and oversaw the study. She said, "We have mathematically represented the curl of a cut carrot over time, and showed the factors that contribute to curling.
"Our motivation was to look for ways to improve the sustainability of carrot processing and make them as long-lasting as possible. We have produced a methodology that a food producer could use to change their processes, reducing food waste and making packaging and transportation more efficient. Understanding the bending behavior in such systems can help us to design and manufacture products with higher durability.
"A question like this would normally be investigated from a biological perspective, but we have done this work using purely mechanical principles. I'm so pleased for Nguyen—it's a measure of his resourcefulness and dedication to produce such interesting research in a challenging situation."
Over the course of a week, the curl of the carrot halves increased—with the average radius of each carrot's curvature falling from 1.61m to 1.1m. A 1.32-times reduction in stiffness was also seen, correlating with the carrots drying out; on average, their weight fell by 22%.
Nguyen added, "This was interesting research—to apply mechanical principles to vegetables was surprising and fun.
"One of the big challenges was to devise an experiment that could be done in a lockdown setting, without access to normal labs and equipment. To now be in a position to have this work published in an academic journal and potentially be used by the food industry is really rewarding.
"This project has inspired me to continue my studies at the University of Bath and I now study residual stresses in porous ferroelectric ceramics for my Ph.D."
As well as having to use a suitcase to collect the 30kg of carrots the experiment demanded from a farmers' market, a further challenge was finding ways to use them afterward. Carrot cake, the Indian carrot dessert Gajar Ka Halwa, carrot pesto and many other dishes kept Nguyen and his flatmates fed for several days.

More information:
	Modelling of longitudinally cut carrot curling induced by the vascular cylinder-cortex interference pressure, Royal Society Open Science (2024). DOI: 10.1098/rsos.230420. royalsocietypublishing.org/doi/10.1098/rsos.230420

Citation:
	Why do carrots curl? Research reveals the mechanics behind root vegetable aging (2024, January 23)
	retrieved 24 January 2024
	from https://phys.org/news/2024-01-carrots-reveals-mechanics-root-vegetable.html


	 This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
	 part may be reproduced without the written permission. The content is provided for information purposes only.

Awesome! That’s really interesting, or it would be if we had bothered to read it all. But there’s no way we’re doing that, so let’s ask the LLM to summarize it for us.

Summarizing with GPT-3.5 Turbo

Python
import textwrap

SYSTEM_MESSAGE = "You are a helpful assistant that briefly summarizes the content of a webpage. Summarize the user's input."

completion = openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": result_item.text},
    ],
)

summary = completion.choices[0].message.content

print(f"Summary for {urls[0]}:")
print(result_item.title)
print(textwrap.fill(summary, 80))
Summary for https://phys.org/news/2024-01-carrots-reveals-mechanics-root-vegetable.html?utm_source=twitter.com&utm_medium=social&utm_campaign=v2:
Why do carrots curl? Research reveals the mechanics behind root vegetable aging
Researchers from the University of Bath have conducted a study on the curling
behavior of chopped carrot pieces. The study found that residual stresses and
dehydration were the main factors behind the curling effect. The starchy outer
layer of the carrot is stiffer than the soft central vein, and when cut
lengthwise, the difference in stress causes the carrot to curl. Dehydration
further contributes to the curling effect. The research provides recommendations
to manufacturers on how to handle carrots to increase their edible lifespan. The
study also offers a methodology that can be used to predict the deformation of
cut root vegetables and potentially reduce food waste. The findings have
implications for the design of packaging and food handling processes. Carrots
are a highly produced crop, but wastage is still high, with a significant amount
occurring before processing and packaging. The study was carried out by
Mechanical Engineering student Nguyen Vo-Bui during the COVID-19 lockdowns of
2021.
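
The carrot article is short, but longer pages can exceed the model's context window. A minimal sketch, assuming a simple character budget rather than an exact token count, is to clip the text before sending it. The helper `truncate_for_prompt` is hypothetical, and the 12,000-character default is an arbitrary illustrative value, not a documented limit.

```python
def truncate_for_prompt(text: str, max_chars: int = 12000) -> str:
    """Clip `text` to at most `max_chars` characters, cutting at whitespace."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if len(text) <= max_chars:
        return text
    clipped = text[:max_chars]
    # Prefer to cut at the last whitespace so we do not split a word.
    cut = max(clipped.rfind(" "), clipped.rfind("\n"))
    if cut > 0:
        clipped = clipped[:cut]
    return clipped.rstrip()


print(truncate_for_prompt("hello world again", max_chars=13))  # hello world
```

With this in place, the user message could be `truncate_for_prompt(result_item.text)` instead of the raw text.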

And we’re done! We built an app that translates a question into a search query, uses Exa to search for useful links and grab clean content from them, and summarizes that content to answer our question about the latest news, or whatever else we want to ask.

Because we filtered by publication date, we know the information is recent, and we have the source in front of us. We did all this with one Exa call and two LLM calls. No web scraping or crawling needed!
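
The walkthrough summarizes only the first result. To cover every article, one possible extension is a loop that takes the summarizer as a function argument, so the same code works with the OpenAI call from this tutorial or with a stub for offline testing. This is a sketch: `summarize_all` is a hypothetical helper, and it assumes each result exposes the `title`, `url`, and `text` attributes used earlier.

```python
from types import SimpleNamespace
from typing import Callable, Iterable, List, Tuple


def summarize_all(
    results: Iterable,
    summarize: Callable[[str], str],
) -> List[Tuple[str, str, str]]:
    """Return (title, url, summary) for every result that has text."""
    summaries = []
    for result in results:
        text = getattr(result, "text", None)
        if not text:
            continue  # skip pages with no extracted contents
        summaries.append((result.title, result.url, summarize(text)))
    return summaries


# Offline demo with stand-in results and a stub summarizer (no API calls).
fake_results = [
    SimpleNamespace(title="A", url="https://example.com/a", text="Long article text."),
    SimpleNamespace(title="B", url="https://example.com/b", text=""),
]
for title, url, summary in summarize_all(fake_results, lambda text: text[:12]):
    print(title, url, summary)
```

In the notebook, `summarize` would wrap the `openai.chat.completions.create` call shown above, and `results` would be `search_response.results`.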

Through Exa, we have given our LLM access to the entire Internet. The possibilities are endless.