Date: 7 June 2025

We’ve added a new livecrawl option called "preferred" that provides a more resilient approach to content fetching. This option attempts to crawl fresh content but gracefully falls back to cached results when live crawling fails.

The "preferred" option is now available in both the /contents and /search_and_contents endpoints.

What’s New

The new livecrawl: "preferred" option provides intelligent fallback behavior:

  • First: Attempts to crawl fresh content from the live webpage
  • If crawling succeeds: Returns the fresh, up-to-date content
  • If crawling fails but cached content exists: Returns cached content instead of failing
  • If crawling fails and no cached content exists: Returns the crawl error
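
The sequence above can be sketched as a small, self-contained model. This is only an illustration of the decision flow, not the service's actual implementation: `CrawlError`, `Result`, `fetch_preferred`, the `crawl` callable, and the dictionary cache are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


class CrawlError(Exception):
    """Hypothetical error raised when a live crawl fails."""


@dataclass
class Result:
    content: str
    source: str  # "live" or "cache"


def fetch_preferred(
    url: str,
    crawl: Callable[[str], str],
    cache: Dict[str, str],
) -> Result:
    """Model of livecrawl="preferred": try a live crawl, fall back to cache."""
    try:
        # Step 1: always attempt the live crawl first.
        return Result(crawl(url), "live")
    except CrawlError:
        # Step 2: the crawl failed, so look for a cached copy.
        cached = cache.get(url)
        if cached is not None:
            return Result(cached, "cache")
        # Step 3: no cached copy exists, so the crawl error propagates.
        raise
```

Passing the crawler and cache in as arguments keeps the sketch testable without network access; a real client would not expose these pieces.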

How It Differs from “Always”

The key difference between "preferred" and "always":

| Option | Crawl Fails + Cache Available | Crawl Fails + No Cache |
| --- | --- | --- |
| "preferred" | Returns cached content | Returns crawl error |
| "always" | Returns crawl error | Returns crawl error |

This makes "preferred" more resilient for production applications where you want fresh content when possible, but don’t want requests to fail when websites are temporarily unavailable.

If content freshness is critical and stale content is unacceptable, "always" may be the better choice.

When to Use “Preferred”

The "preferred" option is ideal when:

  • You want the freshest content available but need reliability
  • You’re building production applications that can’t afford to fail on crawl errors
  • Content freshness is important but not critical enough to fail the request
  • You’re crawling websites that might be occasionally unavailable

Complete Livecrawl Options Overview

Here are all four livecrawl options and their behaviors:

| Option | Crawl Behavior | Cache Fallback | Best For |
| --- | --- | --- | --- |
| "always" | Always crawls | Never falls back | Critical real-time data, willing to accept failures |
| "preferred" | Always crawls | Falls back on crawl failure | Fresh content with reliability |
| "fallback" | Only if no cache | Uses cache first | Balanced speed and freshness |
| "never" | Never crawls | Always uses cache | Maximum speed |
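
To make the four behaviors easier to compare, here is a rough model of how each option might choose between a live crawl and the cache. It is a sketch under stated assumptions rather than the actual server logic: `resolve`, `CrawlError`, the `crawl` callable, and the dictionary cache are hypothetical, and raising `LookupError` for "never" with an empty cache is an assumption, since that case is not described above.

```python
from typing import Callable, Dict, Tuple


class CrawlError(Exception):
    """Hypothetical error raised when a live crawl fails."""


def resolve(
    mode: str,
    url: str,
    crawl: Callable[[str], str],
    cache: Dict[str, str],
) -> Tuple[str, str]:
    """Return (content, source) for a given livecrawl mode."""
    cached = cache.get(url)

    if mode == "always":
        # Always crawls; a crawl failure propagates even if a cache entry exists.
        return crawl(url), "live"

    if mode == "preferred":
        # Always crawls, but falls back to the cache when the crawl fails.
        try:
            return crawl(url), "live"
        except CrawlError:
            if cached is None:
                raise
            return cached, "cache"

    if mode == "fallback":
        # Uses the cache first; crawls only when nothing is cached.
        if cached is not None:
            return cached, "cache"
        return crawl(url), "live"

    if mode == "never":
        # Never crawls. Behavior on a cache miss is an assumption of this sketch.
        if cached is None:
            raise LookupError(f"no cached content for {url}")
        return cached, "cache"

    raise ValueError(f"unknown livecrawl mode: {mode!r}")
```

Note that "always" and "preferred" differ only in the failure branch, which matches the comparison earlier in this entry.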

Migration Guide

If you’re currently using livecrawl: "always" but experiencing reliability issues:

# Before - fails when crawling fails
result = exa.get_contents(urls, livecrawl="always")

# After - more resilient with cache fallback
result = exa.get_contents(urls, livecrawl="preferred")

This change maintains your preference for fresh content while improving reliability.
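
If get_contents is called from many places, one way to make the switch a one-line change is to route calls through a thin wrapper. The following is a minimal sketch, not part of the SDK: `fetch_contents`, `ContentsClient`, `VALID_MODES`, and `DEFAULT_LIVECRAWL` are hypothetical names, and the only assumption about the client is that it exposes get_contents(urls, livecrawl=...) as in the snippet above.

```python
from typing import Any, Protocol, Sequence

# The four livecrawl options listed in this changelog entry.
VALID_MODES = ("always", "preferred", "fallback", "never")

# Single place to change when migrating from "always" to "preferred".
DEFAULT_LIVECRAWL = "preferred"


class ContentsClient(Protocol):
    """Anything with a get_contents method shaped like the snippet above."""

    def get_contents(self, urls: Sequence[str], **kwargs: Any) -> Any: ...


def fetch_contents(
    client: ContentsClient,
    urls: Sequence[str],
    livecrawl: str = DEFAULT_LIVECRAWL,
) -> Any:
    """Call client.get_contents with a validated livecrawl mode."""
    if livecrawl not in VALID_MODES:
        raise ValueError(f"unknown livecrawl mode: {livecrawl!r}")
    return client.get_contents(list(urls), livecrawl=livecrawl)
```

Here `client` would be the same `exa` object used in the before/after snippet. Individual call sites can still pass livecrawl="always" where a stale result is unacceptable.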