Example project using the Exa Python SDK

Search with LLMs: Recent News Summarizer


In this example, we will build an LLM-based news summarizer app with the Exa API to keep us up to date with the latest news on a given topic.

This Jupyter notebook is available on Colab for easy experimentation. You can also check it out on GitHub, including a plain Python version if you want to skip to a complete product.

To play with this code, we first need an Exa API key and an OpenAI API key. Get 1000 Exa searches per month free just for signing up!

# install Exa and OpenAI SDKs
!pip install exa_py
!pip install openai
from google.colab import userdata # comment this out if you're not using Colab

EXA_API_KEY = userdata.get('EXA_API_KEY') # replace with your api key, or add to Colab Secrets
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY') # replace with your api key, or add to Colab Secrets
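
If we're not on Colab, one option is to read the keys from environment variables instead of Colab Secrets. This is only a minimal sketch: load_api_key is a hypothetical helper (not part of either SDK), and it assumes the keys were exported under the same names used above.

```python
import os


def load_api_key(name: str) -> str:
    """Return the API key stored in environment variable `name`, or raise."""
    key = os.environ.get(name, "").strip()
    if not key:
        # Fail early with a readable message instead of a confusing auth error later.
        raise RuntimeError(f"Set the {name} environment variable first.")
    return key
```

With this in place, EXA_API_KEY = load_api_key("EXA_API_KEY") and OPENAI_API_KEY = load_api_key("OPENAI_API_KEY") could replace the two userdata.get lines.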

First Approach (without Exa)

First, let's try building the app using only the OpenAI API. We will use GPT-3.5 Turbo (gpt-3.5-turbo) as our LLM. Let's ask it for the recent news, like we might ask ChatGPT.

import openai

openai.api_key = OPENAI_API_KEY

USER_QUESTION = "What's the recent news in physics this week?"

completion = openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": USER_QUESTION},
    ],
)

response = completion.choices[0].message.content
print(response)
One recent news in physics is that researchers at the University of Illinois have discovered a new topological state of matter. They created a material composed of interacting particles called quadrupoles, which can exhibit unique behavior in their electrical properties. This finding has the potential to pave the way for the development of new types of electronic devices and quantum computers.

Another interesting development is in the field of cosmology. The European Space Agency's Planck satellite has provided new insights into the early universe. By analyzing the cosmic microwave background radiation, scientists have obtained more accurate measurements of the rate at which the universe is expanding, which could challenge current theories of physics.

Additionally, scientists at CERN's Large Hadron Collider (LHC) have observed a rare phenomenon called charm mixing. They found that particles containing both charm and strange quarks can spontaneously transition between their matter and antimatter states. This discovery could contribute to our understanding of the puzzle of why the universe is primarily made of matter and why there is very little antimatter.

Oh no! Since the LLM is unable to use recent data, it doesn't know the latest news. It might tell us some information, but that info isn't recent, and we can't be sure it's trustworthy either since it has no source. Luckily, the Exa API allows us to solve these problems by connecting our LLM app to the internet. Here's how:

Second Approach (with Exa)

Let's use the Exa neural search engine to search the web for relevant links to the user's question.

First, we ask the LLM to generate a search engine query based on the question.

import openai
from exa_py import Exa

openai.api_key = OPENAI_API_KEY
exa = Exa(EXA_API_KEY)

SYSTEM_MESSAGE = "You are a helpful assistant that generates search queries based on user questions. Only generate one search query."
USER_QUESTION = "What's the recent news in physics this week?"

completion = openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": USER_QUESTION},
    ],
)

search_query = completion.choices[0].message.content

print("Search query:")
print(search_query)
Search query:
Recent news in physics this week
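
The query above came back clean, but a chat model may sometimes wrap its answer in quotes or add extra lines. A small, hedged sketch of tidying the reply before searching could look like this; clean_search_query is a hypothetical helper, and keeping only the first non-empty line is an assumption about what we want, not something Exa requires.

```python
def clean_search_query(raw: str) -> str:
    """Keep the first non-empty line of the model's reply, without surrounding quotes."""
    lines = [line.strip() for line in raw.splitlines() if line.strip()]
    if not lines:
        raise ValueError("The model returned an empty search query.")
    # Remove quote characters, backticks and spaces from both ends of the line.
    return lines[0].strip("\"'` ")
```

We could then write search_query = clean_search_query(completion.choices[0].message.content).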

Looks good! Now let's put the search query into Exa. Let's also use start_published_date to filter the results to pages published in the last week:

from datetime import datetime, timedelta

one_week_ago = (datetime.now() - timedelta(days=7))
date_cutoff = one_week_ago.strftime("%Y-%m-%d")

search_response = exa.search_and_contents(
    search_query, use_autoprompt=True, start_published_date=date_cutoff
)

urls = [result.url for result in search_response.results]
print("URLs:")
for url in urls:
    print(url)
URLs:
https://phys.org/news/2024-01-carrots-reveals-mechanics-root-vegetable.html?utm_source=twitter.com&utm_medium=social&utm_campaign=v2
https://phys.org/news/2024-01-astrophysicists-theoretical-proof-traversable-wormholes.html
https://gizmodo.com/proton-physics-strong-force-quarks-measurement-1851192840
https://www.nytimes.com/2024/01/24/science/space/black-holes-photography-m87.html
https://phys.org/news/2024-01-liquid-lithium-walls-fusion-device.html?utm_source=twitter.com&utm_medium=social&utm_campaign=v2
https://physics.aps.org/articles/v17/s13
https://phys.org/news/2024-01-validating-hypothesis-complex.html?utm_source=twitter.com&utm_medium=social&utm_campaign=v2
https://phys.org/news/2024-01-scientists-previously-unknown-colonies-emperor.html?utm_source=twitter.com&utm_medium=social&utm_campaign=v2
https://phys.org/news/2024-01-reveals-quantum-topological-potential-material.html?utm_source=twitter.com&utm_medium=social&utm_campaign=v2
https://phys.org/news/2024-01-shallow-soda-lakes-cradles-life.html?utm_source=twitter.com&utm_medium=social&utm_campaign=v2

Now we're getting somewhere! Exa gave our app a list of relevant, useful URLs based on the original question.
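
The one-week date filter used in the search above can be wrapped in a small helper so the time window is easy to change. This is a sketch under assumptions: published_since is a hypothetical name, and it uses UTC rather than the local time that datetime.now() returns.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def published_since(days: int, now: Optional[datetime] = None) -> str:
    """Return the date `days` days before `now`, formatted as YYYY-MM-DD."""
    if days < 0:
        raise ValueError("days must be non-negative")
    # Default to the current UTC time; a fixed `now` can be passed for testing.
    now = now or datetime.now(timezone.utc)
    return (now - timedelta(days=days)).strftime("%Y-%m-%d")
```

For example, start_published_date=published_since(7) would give the same one-week window as date_cutoff, and published_since(30) would widen it to a month.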

By the way, we might be wondering what makes Exa special. Why can't we just search with Google? Well, let's take a look for ourselves at the Google search results. Google gives us the front pages of lots of news aggregators, but not the news articles themselves. And since we used Exa's search_and_contents, our search came with the webpage contents, so we can use Exa to skip writing a web crawler and access the knowledge directly!

results = search_response.results
result_item = results[0]
print(f"{len(results)} items total, printing the first one:")
print(result_item.text)
10 items total, printing the first one:




 Credit: CC0 Public Domain
  
Chopped carrot pieces are among the most universally enjoyed foods and a snacking staple—a mainstay of school lunchboxes, picnics and party platters year-round.
Now researchers from the University of Bath have uncovered the secret science of prepping the popular root vegetable and quantified the processes that make them curl up if left uneaten for too long.
Mechanical Engineering student Nguyen Vo-Bui carried out the research as part of his final-year studies, in the limited circumstances of COVID-19 lockdowns of 2021. The research paper, "Modelling of longitudinally cut carrot curling induced by the vascular cylinder-cortex interference pressure", is published in Royal Society Open Science.
Without access to labs, Nguyen aimed to identify the geometrical and environmental factors that have the most influence on carrots' longevity. Working in his kitchen, he characterized, analytically modeled and verified the aging of over 100 Lancashire Nantes carrot halves, cut lengthways, using finite-element (FE) models normally used in structural engineering.
The research team concluded that residual stresses and dehydration were the two key factors behind the curling behavior. The starchy outer layer of the carrot (the cortex) is stiffer than the soft central vein (also known as the vascular cylinder). When cut lengthwise, the two carrot halves curl because the difference in stress becomes unbalanced. Dehydration leads to further loss of stiffness, further driving the curling effect.
Their recommendations to manufacturers include handling carrots in cold, moist, airtight and humidity-controlled environments to protect their natural properties and increase their edible life span.
They say the study provides a methodology to predict the deformation of cut root vegetables, adding that the procedure is likely to apply to other plant structures. The study gives food producers a new mathematical tool that could be applied to the design of packaging and food handling processes, potentially reducing food waste.
One of the world's top crops by market value, carrots are known for their high production efficiency—but despite this, wastage is high. Around 25–30% of this occurs prior to processing and packaging—due to deformities, mechanical damage or infected sections. Fresh cut and minimally processed carrots are a convenient ready-to-use ingredient that make possible the use of carrots that might otherwise be discarded, reducing food waste.
Dr. Elise Pegg, a senior lecturer in Bath's Department of Mechanical Engineering, is one of the research paper authors and oversaw the study. She said, "We have mathematically represented the curl of a cut carrot over time, and showed the factors that contribute to curling.
"Our motivation was to look for ways to improve the sustainability of carrot processing and make them as long-lasting as possible. We have produced a methodology that a food producer could use to change their processes, reducing food waste and making packaging and transportation more efficient. Understanding the bending behavior in such systems can help us to design and manufacture products with higher durability.
"A question like this would normally be investigated from a biological perspective, but we have done this work using purely mechanical principles. I'm so pleased for Nguyen—it's a measure of his resourcefulness and dedication to produce such interesting research in a challenging situation."
Over the course of a week, the curl of the carrot halves increased—with the average radius of each carrot's curvature falling from 1.61m to 1.1m. A 1.32-times reduction in stiffness was also seen, correlating with the carrots drying out; on average, their weight fell by 22%.
Nguyen added, "This was interesting research—to apply mechanical principles to vegetables was surprising and fun.
"One of the big challenges was to devise an experiment that could be done in a lockdown setting, without access to normal labs and equipment. To now be in a position to have this work published in an academic journal and potentially be used by the food industry is really rewarding.
"This project has inspired me to continue my studies at the University of Bath and I now study residual stresses in porous ferroelectric ceramics for my Ph.D."
As well as having to use a suitcase to collect the 30kg of carrots the experiment demanded from a farmers' market, a further challenge was finding ways to use them afterward. Carrot cake, the Indian carrot dessert Gajar Ka Halwa, carrot pesto and many other dishes kept Nguyen and his flatmates fed for several days.

More information:
	Modelling of longitudinally cut carrot curling induced by the vascular cylinder-cortex interference pressure, Royal Society Open Science (2024). DOI: 10.1098/rsos.230420. royalsocietypublishing.org/doi/10.1098/rsos.230420



Citation:
	Why do carrots curl? Research reveals the mechanics behind root vegetable aging (2024, January 23)
	retrieved 24 January 2024
	from https://phys.org/news/2024-01-carrots-reveals-mechanics-root-vegetable.html
	 

	 This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
	 part may be reproduced without the written permission. The content is provided for information purposes only.
	 

Awesome! That's really interesting, or it would be if we had bothered to read it all. But there's no way we're doing that, so let's ask the LLM to summarize it for us:

import textwrap

SYSTEM_MESSAGE = "You are a helpful assistant that briefly summarizes the content of a webpage. Summarize the user's input."

completion = openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": result_item.text},
    ],
)

summary = completion.choices[0].message.content

print(f"Summary for {urls[0]}:")
print(result_item.title)
print(textwrap.fill(summary, 80))
Summary for https://phys.org/news/2024-01-carrots-reveals-mechanics-root-vegetable.html?utm_source=twitter.com&utm_medium=social&utm_campaign=v2:
Why do carrots curl? Research reveals the mechanics behind root vegetable aging
Researchers from the University of Bath have conducted a study on the curling
behavior of chopped carrot pieces. The study found that residual stresses and
dehydration were the main factors behind the curling effect. The starchy outer
layer of the carrot is stiffer than the soft central vein, and when cut
lengthwise, the difference in stress causes the carrot to curl. Dehydration
further contributes to the curling effect. The research provides recommendations
to manufacturers on how to handle carrots to increase their edible lifespan. The
study also offers a methodology that can be used to predict the deformation of
cut root vegetables and potentially reduce food waste. The findings have
implications for the design of packaging and food handling processes. Carrots
are a highly produced crop, but wastage is still high, with a significant amount
occurring before processing and packaging. The study was carried out by
Mechanical Engineering student Nguyen Vo-Bui during the COVID-19 lockdowns of
2021.
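
Some pages are much longer than the carrot article, and chat models only accept a limited amount of input. One simple way to stay within that limit is to clip the page text before sending it. This is a rough sketch: clip_text is a hypothetical helper, and the 8000-character default is an arbitrary assumption standing in for a real token count.

```python
def clip_text(text: str, max_chars: int = 8000) -> str:
    """Return `text` unchanged if it fits, otherwise a prefix of at most `max_chars` characters."""
    if len(text) <= max_chars:
        return text
    clipped = text[:max_chars]
    # Back up to the last space so a word is not cut in half.
    head, sep, _ = clipped.rpartition(" ")
    return (head if sep else clipped).rstrip()
```

In the summarization call above, clip_text(result_item.text) could be passed as the user message instead of result_item.text.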

And we're done! We built an app that translates a question into a search query, uses Exa to search for useful links, uses Exa to grab clean content from those links, and summarizes the content to effortlessly answer our question about the latest news, or whatever else we want to ask.

We can be sure that the information is fresh, we have the source in front of us, and we did all this with just a few Exa queries and LLM calls, no web scraping or crawling needed!
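
To see how the steps fit together, here is a hedged sketch of the whole flow as one function. The names summarize_news and NewsItem are hypothetical, and the three steps are passed in as plain callables so the flow can be tried offline with stand-ins before wiring in the real OpenAI and Exa calls.

```python
from types import SimpleNamespace
from typing import Callable, Iterable, List, NamedTuple


class NewsItem(NamedTuple):
    title: str
    url: str
    summary: str


def summarize_news(
    question: str,
    make_query: Callable[[str], str],
    search: Callable[[str], Iterable],
    summarize: Callable[[str], str],
) -> List[NewsItem]:
    """Run question -> search query -> search results -> one summary per page."""
    query = make_query(question)
    items = []
    for result in search(query):
        text = getattr(result, "text", None)
        if not text:  # skip results that came back without page contents
            continue
        items.append(NewsItem(result.title, result.url, summarize(text)))
    return items


# Offline stand-ins (assumptions for illustration) so no API keys are needed.
fake_results = [
    SimpleNamespace(title="Carrots", url="https://example.com/a", text="Carrots curl."),
    SimpleNamespace(title="Empty", url="https://example.com/b", text=""),
]
demo = summarize_news(
    "What's the recent news in physics this week?",
    make_query=lambda q: q.rstrip("?"),
    search=lambda query: fake_results,
    summarize=lambda text: text.upper(),
)
print(demo)
```

In the notebook, make_query and summarize would wrap the two chat completion calls shown earlier, and search could be something like lambda q: exa.search_and_contents(q, use_autoprompt=True, start_published_date=date_cutoff).results.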

With Exa, we have empowered our LLM application with the Internet. The possibilities are endless.