POST /api/scrape
curl --request POST \
  --url https://api.spidra.io/api/scrape \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --data '
{
  "urls": [
    {
      "url": "https://example.com"
    }
  ],
  "prompt": "Extract the main heading and first paragraph",
  "output": "json"
}
'
Example response:
{
  "status": "queued",
  "jobId": "550e8400-e29b-41d4-a716-446655440000",
  "message": "Scrape job has been queued. Poll /api/scrape/550e8400-e29b-41d4-a716-446655440000 to get the result."
}
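The response above is asynchronous: the job is queued and must be polled at /api/scrape/{jobId} until it completes. A minimal Python sketch of that flow follows; the helper names (`poll_url`, `submit_and_poll`) are our own, and the shape of the final polled result is not documented on this page, so the sketch simply returns whatever JSON the last poll yields.

```python
import json
import time
import urllib.request

API_BASE = "https://api.spidra.io/api"  # base URL from the curl example above

def poll_url(job_id: str) -> str:
    """Build the polling URL named in the queued-job response message."""
    return f"{API_BASE}/scrape/{job_id}"

def submit_and_poll(api_key: str, payload: dict,
                    interval: float = 2.0, timeout: float = 120.0) -> dict:
    """Submit a scrape job, then poll until it leaves the 'queued' state.

    The completed-job schema is not shown on this page, so the final
    polled JSON is returned as-is.
    """
    headers = {"Content-Type": "application/json", "x-api-key": api_key}
    req = urllib.request.Request(
        f"{API_BASE}/scrape",
        data=json.dumps(payload).encode(),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        job = json.load(resp)  # {"status": "queued", "jobId": ...}

    deadline = time.time() + timeout
    while time.time() < deadline:
        poll_req = urllib.request.Request(
            poll_url(job["jobId"]), headers={"x-api-key": api_key}
        )
        with urllib.request.urlopen(poll_req) as resp:
            result = json.load(resp)
        if result.get("status") != "queued":
            return result
        time.sleep(interval)
    raise TimeoutError(f"Job {job['jobId']} still queued after {timeout}s")
```

The poll interval and timeout are arbitrary defaults; tune them to how long your pages take to load and process.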

How It Works

  1. Load - Opens each URL in a real browser
  2. Execute - Runs your browser actions (clicks, scrolls, etc.)
  3. Solve - Automatically handles CAPTCHAs
  4. Process - Runs AI extraction (if prompt provided)

Browser Actions

Interact with pages before scraping:
{
  "urls": [{
    "url": "https://example.com/products",
    "actions": [
      {"type": "click", "selector": "#accept-cookies"},
      {"type": "wait", "value": 1000},
      {"type": "click", "selector": ".load-more-btn"},
      {"type": "scroll", "value": "50%"}
    ]
  }],
  "prompt": "List all product names and prices",
  "output": "json"
}

Available Actions

Action   Description        Example
click    Click an element   {"type": "click", "selector": "#btn"}
type     Type into input    {"type": "type", "selector": "#search", "value": "query"}
wait     Wait (ms)          {"type": "wait", "value": 2000}
scroll   Scroll page        {"type": "scroll", "value": "50%"}
Use browser DevTools (right-click → Inspect) to find CSS selectors.

Proxy and Geo-Targeting

Route requests through proxies to avoid detection or access geo-restricted content.
{
  "urls": [{"url": "https://amazon.de/dp/B123456"}],
  "prompt": "Extract the product price in euros",
  "output": "json",
  "useProxy": true,
  "proxyCountry": "de"
}
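Since an unsupported proxyCountry value would fail server-side, it can help to validate the code client-side before sending. A sketch, with the supported codes transcribed from the tables below and a hypothetical `with_proxy` helper:

```python
# Supported locations transcribed from the tables in this section.
REGIONS = {"global", "asia", "eu"}
COUNTRIES = {
    "us", "gb", "de", "fr", "jp", "au", "ca", "br", "in", "nl", "sg", "es",
    "it", "mx", "za", "ng", "ar", "be", "ch", "cl", "cn", "co", "cz", "dk",
    "eg", "fi", "gr", "hk", "hu", "id", "ie", "il", "kr", "my", "no", "nz",
    "pe", "ph", "pl", "pt", "ro", "sa", "se", "th", "tr", "tw", "ua", "vn",
}

def with_proxy(payload: dict, location: str) -> dict:
    """Return a copy of the payload with geo-targeted proxy options set."""
    code = location.lower()
    if code not in REGIONS | COUNTRIES:
        raise ValueError(f"Unsupported proxy location: {location!r}")
    return {**payload, "useProxy": True, "proxyCountry": code}
```

Note that the United Kingdom uses ISO code gb, not uk.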

Supported Locations

Regions: global (worldwide), asia, eu
Popular countries: us, gb, de, fr, jp, au, ca, br, in
Code  Country          Code  Country          Code  Country
us    United States    de    Germany          jp    Japan
gb    United Kingdom   fr    France           au    Australia
ca    Canada           br    Brazil           in    India
nl    Netherlands      sg    Singapore        es    Spain
it    Italy            mx    Mexico           za    South Africa
ng    Nigeria          ar    Argentina        be    Belgium
ch    Switzerland      cl    Chile            cn    China
co    Colombia         cz    Czech Republic   dk    Denmark
eg    Egypt            fi    Finland          gr    Greece
hk    Hong Kong        hu    Hungary          id    Indonesia
ie    Ireland          il    Israel           kr    South Korea
my    Malaysia         no    Norway           nz    New Zealand
pe    Peru             ph    Philippines      pl    Poland
pt    Portugal         ro    Romania          sa    Saudi Arabia
se    Sweden           th    Thailand         tr    Turkey
tw    Taiwan           ua    Ukraine          vn    Vietnam

Screenshots

Capture screenshots of scraped pages for debugging or archival.
{
  "urls": [{"url": "https://example.com"}],
  "prompt": "Extract page title",
  "screenshot": true,
  "fullPageScreenshot": true
}
Option                     Description
screenshot: true           Capture the visible viewport
fullPageScreenshot: true   Capture the entire scrollable page (requires screenshot: true)
Screenshot URLs are returned in the screenshots array of the response.
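Because fullPageScreenshot silently depends on screenshot being true, a small helper can enforce the pairing client-side. A sketch; `with_screenshots` is our own name, not part of the API:

```python
def with_screenshots(payload: dict, full_page: bool = False) -> dict:
    """Return a copy of the payload with screenshot capture enabled.

    fullPageScreenshot requires screenshot: true, so this helper always
    sets both together rather than letting them drift apart.
    """
    opts = {"screenshot": True}
    if full_page:
        opts["fullPageScreenshot"] = True
    return {**payload, **opts}
```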

Extract Content Only

Remove navigation, headers, footers, and sidebars before processing. Useful when you only want the main article or product content.
{
  "urls": [{"url": "https://blog.example.com/article"}],
  "prompt": "Summarize this article",
  "output": "json",
  "extractContentOnly": true
}

AI Mode

AI Mode uses intelligent browser automation. Instead of relying on CSS selectors that break when websites update, it understands page structure and adapts to layout changes.
{
  "urls": [{"url": "https://news.ycombinator.com"}],
  "prompt": "Extract the top 10 story titles with their scores",
  "output": "json",
  "aiMode": true
}
With AI Mode, you can use natural language selectors in actions:
{
  "urls": [{
    "url": "https://example.com/products",
    "actions": [
      {"type": "click", "selector": "Accept cookies button"},
      {"type": "click", "selector": "Load more button"}
    ]
  }],
  "prompt": "List all product names and prices",
  "output": "json",
  "aiMode": true
}
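In AI Mode the selector field carries a plain-English description of the target element rather than a CSS selector, so a payload builder can accept natural-language labels directly. A sketch, assuming a hypothetical `ai_scrape_payload` helper and click-only actions:

```python
def ai_scrape_payload(url: str, click_targets: list[str], prompt: str) -> dict:
    """Build an AI Mode payload whose actions are natural-language
    descriptions of elements to click, as in the example above."""
    return {
        "urls": [{
            "url": url,
            "actions": [
                {"type": "click", "selector": target}
                for target in click_targets
            ],
        }],
        "prompt": prompt,
        "output": "json",
        "aiMode": True,
    }
```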

See the AI Mode guide to learn more about AI-powered scraping.

Authentication

Scrape protected pages by providing session cookies:
{
  "urls": [{"url": "https://example.com/dashboard"}],
  "prompt": "Extract account details",
  "output": "json",
  "cookies": "session=eyJ...; auth_token=abc123..."
}
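The cookies field accepts the standard "name=value; name2=value2" format, which is easy to produce from a dict of session cookies. A minimal sketch; `cookie_header` is our own helper name:

```python
def cookie_header(cookies: dict[str, str]) -> str:
    """Serialize cookies into the 'name=value; name2=value2' format
    accepted by the cookies request field."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())
```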

See the Authenticated Scraping guide for how to obtain cookies and the supported formats.

Authorizations

x-api-key (string, in header, required)

Body

application/json

urls (object[], required)
Array of URLs to scrape (1-3 URLs per request). Required array length: 1-3 elements.

prompt (string, optional)
Optional LLM prompt for extracting or transforming the scraped content.

output (enum<string>, default: json)
Output format for the extracted content. Available options: json, markdown.

useProxy (boolean, default: false)
Enable stealth mode with proxy rotation to avoid detection.

proxyCountry (string, optional)
Country code (e.g., 'us', 'gb', 'de') or region ('global', 'asia', 'eu') for geo-targeted proxy routing. Requires useProxy: true.

aiMode (boolean, default: false)
Use AI-powered browser automation. When enabled, actions use natural language understanding instead of CSS selectors, making scraping more resilient to page layout changes.

cookies (string, optional)
Session cookies for authenticated scraping. Supports the standard format (name=value; name2=value2) or a raw Chrome DevTools paste.

screenshot (boolean, default: false)
Capture a screenshot of each page after scraping.

fullPageScreenshot (boolean, default: false)
Capture a full-page screenshot instead of just the viewport. Requires screenshot: true.

extractContentOnly (boolean, default: false)
Remove headers, footers, navigation, and other non-content elements from the scraped output.

Response

Job successfully queued

status (enum<string>)
Available options: queued.

jobId (string)
Unique job identifier for polling.

message (string)

deduplicated (boolean)
True if an identical request was made within the last 5 seconds; in that case the response returns the existing job's ID instead of creating a new job.