Date: 7 June 2025

We’ve added a new livecrawl option called "preferred" that provides a more resilient approach to content fetching. This option attempts to crawl fresh content but gracefully falls back to cached results when live crawling fails.
The "preferred" option is now available in both the /contents and /search_and_contents endpoints.

What’s New
The new livecrawl: "preferred" option provides intelligent fallback behavior:
- First: Attempts to crawl fresh content from the live webpage
- If crawling succeeds: Returns the fresh, up-to-date content
- If crawling fails but cached content exists: Returns cached content instead of failing
- If crawling fails and no cached content exists: Returns the crawl error
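The four cases above can be sketched as a small decision function. This is only an illustrative model, not the service's implementation: `FetchResult`, `resolve_preferred`, and the `"CRAWL_FAILED"` error string are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FetchResult:
    content: Optional[str]        # page text, if any was obtained
    source: str                   # "live", "cache", or "error"
    error: Optional[str] = None   # set only when source == "error"


def resolve_preferred(live: Optional[str],
                      cached: Optional[str],
                      crawl_error: str = "CRAWL_FAILED") -> FetchResult:
    """Hypothetical model of livecrawl: "preferred".

    `live` is the crawl result (None means the crawl failed);
    `cached` is the cached copy (None means nothing is cached).
    """
    if live is not None:
        # Crawl succeeded: return the fresh content.
        return FetchResult(live, "live")
    if cached is not None:
        # Crawl failed but a cached copy exists: fall back to it.
        return FetchResult(cached, "cache")
    # Crawl failed and nothing is cached: surface the crawl error.
    return FetchResult(None, "error", crawl_error)
```

For example, `resolve_preferred(None, "old copy")` returns the cached text with `source == "cache"`, while `resolve_preferred(None, None)` returns the error.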
How It Differs from “Always”
The key difference between "preferred" and "always" is what happens when a crawl fails:
Option | Crawl Fails + Cache Available | Crawl Fails + No Cache |
---|---|---|
"preferred" | Returns cached content | Returns crawl error |
"always" | Returns crawl error | Returns crawl error |
This makes "preferred" more resilient for production applications where you want fresh content when possible, but don’t want requests to fail when websites are temporarily unavailable.
If content freshness is critical and cached content is not acceptable, "always" might be the better choice.
When to Use “Preferred”
The "preferred" option is ideal when:
- You want the freshest content available but need reliability
- You’re building production applications that can’t afford to fail on crawl errors
- Content freshness is important but not critical enough to fail the request
- You’re crawling websites that might be occasionally unavailable
Complete Livecrawl Options Overview
Here are all four livecrawl options and their behaviors:

Option | Crawl Behavior | Cache Fallback | Best For |
---|---|---|---|
"always" | Always crawls | Never falls back | Critical real-time data, willing to accept failures |
"preferred" | Always crawls | Falls back on crawl failure | Fresh content with reliability |
"fallback" | Only if no cache | Uses cache first | Balanced speed and freshness |
"never" | Never crawls | Always uses cache | Maximum speed |
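To make the table concrete, here is a rough Python model of the four modes. It is a sketch under assumptions, not the actual service logic: `resolve` and its return shape are hypothetical, and two cases the table does not spell out ("fallback" when the crawl also fails, and "never" when nothing is cached) are assumed here to produce an error.

```python
from typing import Callable, Optional, Tuple

Result = Tuple[str, Optional[str]]  # (source, content); source is "live", "cache", or "error"


def resolve(mode: str,
            crawl: Callable[[], Optional[str]],
            cached: Optional[str]) -> Result:
    """Hypothetical model of the four livecrawl modes.

    `crawl` performs the live fetch and returns None on failure;
    `cached` is the cached copy, or None if nothing is cached.
    """
    if mode == "always":
        live = crawl()
        return ("live", live) if live is not None else ("error", None)
    if mode == "preferred":
        live = crawl()
        if live is not None:
            return ("live", live)
        return ("cache", cached) if cached is not None else ("error", None)
    if mode == "fallback":
        if cached is not None:
            return ("cache", cached)      # cache first, no crawl needed
        live = crawl()
        # Assumption: a failed crawl with no cache is an error.
        return ("live", live) if live is not None else ("error", None)
    if mode == "never":
        # Assumption: with no cached copy there is nothing to return.
        return ("cache", cached) if cached is not None else ("error", None)
    raise ValueError(f"unknown livecrawl mode: {mode!r}")
```

Passing `crawl` as a callable keeps the difference between "always crawls" and "only if no cache" visible: in this model, "fallback" and "never" do not invoke it at all when a cached copy exists.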
Migration Guide
If you’re currently using livecrawl: "always" but experiencing reliability issues:
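One way to picture the change is as a one-field edit to the request body. The sketch below only builds the JSON payload and does not call the API. The `livecrawl` key and its values come from this page; the `urls` field and the `migrate_livecrawl` helper are assumptions made for illustration, so check them against the API reference before use.

```python
import json


def migrate_livecrawl(payload: dict) -> dict:
    """Return a copy of a request body with "always" swapped for "preferred".

    Hypothetical helper: other livecrawl values are left untouched.
    """
    updated = dict(payload)  # shallow copy so the caller's dict is not mutated
    if updated.get("livecrawl") == "always":
        updated["livecrawl"] = "preferred"
    return updated


# Assumed request shape; only the "livecrawl" field is taken from this page.
before = {"urls": ["https://example.com"], "livecrawl": "always"}
after = migrate_livecrawl(before)

print(json.dumps(after, indent=2))
```

With this change, a failed crawl returns cached content when it exists instead of a crawl error; requests with no cached copy still fail as before.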