Submit a Batch Scrape Job

POST /api/batch/scrape
curl --request POST \
  --url https://api.spidra.io/api/batch/scrape \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --data '
{
  "urls": [
    "https://example.com/product/1",
    "https://example.com/product/2"
  ],
  "prompt": "Extract the product name, price, and availability",
  "output": "json"
}
'
Response:

{
  "status": "queued",
  "batchId": "f3a2b1c0-0000-0000-0000-000000000000",
  "total": 2
}

How It Works

Batch scrape jobs are asynchronous. Submitting returns a batchId immediately. Each URL is processed in parallel by independent workers.
  1. Submit — Send your URL list. Receive batchId in the response.
  2. Process — Each URL is opened in a real browser, CAPTCHAs solved, content extracted.
  3. Poll — Call GET /api/batch/scrape/{batchId} every 2–5 seconds until status is terminal.
Credits are reserved upfront when you submit. The final amount is reconciled per item once processing completes.
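The submit-then-poll flow can be sketched as follows. This is a minimal sketch: the terminal status names and the `get_status` transport callable are assumptions for illustration, not part of this reference (see the Get Batch Status endpoint for the authoritative status values).

```python
import time

# Assumed terminal statuses; check the Get Batch Status endpoint for the real set.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}

def poll_batch(get_status, batch_id, interval=3.0, max_attempts=60):
    """Poll GET /api/batch/scrape/{batchId} until the batch finishes.

    get_status(batch_id) is any callable that performs the GET request
    and returns the decoded JSON body.
    """
    for _ in range(max_attempts):
        body = get_status(batch_id)
        if body["status"] in TERMINAL_STATUSES:
            return body
        time.sleep(interval)  # the docs suggest polling every 2-5 seconds
    raise TimeoutError(f"batch {batch_id} did not reach a terminal status")

# Stubbed transport for illustration: two "processing" polls, then done.
replies = iter([{"status": "processing"}] * 2 + [{"status": "completed", "total": 2}])
result = poll_batch(lambda _id: next(replies), "f3a2b1c0-...", interval=0)
print(result["status"])  # completed
```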

Minimal Example

curl -X POST https://api.spidra.io/api/batch/scrape \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": ["https://example.com/page-1", "https://example.com/page-2"],
    "prompt": "Extract the headline and summary",
    "output": "json"
  }'
Response 202 Accepted:
{
  "status": "queued",
  "batchId": "f3a2b1c0-0000-0000-0000-000000000000",
  "total": 2
}
The Location response header is also set to /api/batch/scrape/{batchId} for convenience.
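For reference, the same submission can be assembled with Python's standard library. This is a sketch: `YOUR_API_KEY` is a placeholder, and the final `urlopen` call is left commented out so the request is built but not sent.

```python
import json
import urllib.request

payload = {
    "urls": ["https://example.com/page-1", "https://example.com/page-2"],
    "prompt": "Extract the headline and summary",
    "output": "json",
}
req = urllib.request.Request(
    "https://api.spidra.io/api/batch/scrape",
    data=json.dumps(payload).encode("utf-8"),
    headers={"x-api-key": "YOUR_API_KEY", "Content-Type": "application/json"},
    method="POST",
)
# resp = urllib.request.urlopen(req)  # expect 202 Accepted plus a Location header
# batch = json.load(resp)             # {"status": "queued", "batchId": ..., "total": 2}
```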

With Structured Output

Pass a schema to receive a consistent JSON shape for every item:
{
  "urls": [
    "https://shop.example.com/item/100",
    "https://shop.example.com/item/101"
  ],
  "prompt": "Extract the product details",
  "schema": {
    "type": "object",
    "required": ["name", "price"],
    "properties": {
      "name":      { "type": "string" },
      "price":     { "type": "number" },
      "currency":  { "type": ["string", "null"] },
      "available": { "type": ["boolean", "null"] }
    }
  }
}
When schema is provided, output is automatically forced to "json". Non-fatal schema issues are returned as schema_warnings in the submission response.
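As a sanity check before submitting, you can mirror the schema's constraints client-side. This is illustrative only, not a full JSON Schema validator; the API performs the authoritative validation server-side.

```python
def matches_schema(item):
    """Rough client-side mirror of the schema above: name/price required,
    currency/available nullable. Not a full JSON Schema validator."""
    if not all(key in item for key in ("name", "price")):
        return False
    return (isinstance(item.get("name"), str)
            and isinstance(item.get("price"), (int, float))
            and not isinstance(item.get("price"), bool)
            and isinstance(item.get("currency"), (str, type(None)))
            and isinstance(item.get("available"), (bool, type(None))))

print(matches_schema({"name": "Item 100", "price": 19.99, "currency": "EUR"}))  # True
print(matches_schema({"name": "Item 101"}))                                     # False: price missing
```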

With Proxy

{
  "urls": ["https://amazon.de/dp/B123", "https://amazon.de/dp/B456"],
  "prompt": "Extract price and availability",
  "output": "json",
  "useProxy": true,
  "proxyCountry": "de"
}

With Screenshots

{
  "urls": ["https://example.com"],
  "screenshot": true,
  "fullPageScreenshot": true
}
Screenshot URLs are returned in each item’s screenshotUrl field once processing is complete.

Request Body

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| urls | string[] | Yes | | URLs to scrape. 1–50 URLs per request. Must be http:// or https://. Private/internal IPs are rejected. |
| prompt | string | No | | AI extraction instruction applied to every URL in the batch. |
| output | "json" or "markdown" | No | "json" | Output format for extracted content. Automatically "json" when schema is set. |
| schema | object | No | | JSON Schema object that constrains the AI output. Validated before queuing; returns 422 if invalid. |
| useProxy | boolean | No | false | Route each URL through residential stealth proxies. |
| proxyCountry | string | No | | ISO country code ("us", "de", "gb") or region ("eu", "global"). Requires useProxy: true. |
| crawlerMode | string | No | "default" | Browser rendering mode: "default", "fast", or "ai". |
| extractContentOnly | boolean | No | false | Strip navigation, headers, and sidebars; keeps only the main content. |
| cookies | string | No | | Session cookies for authenticated pages. Never persisted to the database; passed ephemerally to the worker only. |
| screenshot | boolean | No | false | Capture a viewport screenshot of each page. |
| fullPageScreenshot | boolean | No | false | Capture the full scrollable page. Requires screenshot: true. |

Response

| Field | Type | Description |
|---|---|---|
| status | "queued" | Always "queued" on a successful submission. |
| batchId | string | UUID; use this to poll status and manage the batch. |
| total | number | Number of URLs accepted into the batch. |
| schema_warnings | string[] | Non-fatal schema issues (e.g., unsupported keywords). Only present if there are warnings. |

Errors

| Code | Reason |
|---|---|
| 400 | urls is missing, not an array, empty, or exceeds 50 items. |
| 401 | Missing x-api-key header. |
| 402 | Payment overdue; update your payment method. |
| 403 | Monthly credit limit reached. |
| 422 | One or more URLs are invalid, or schema is malformed. An errors array is returned with per-URL details. |
| 429 | More than 20 batch submissions per minute. |
Validation error example:
{
  "status": "error",
  "message": "Request validation failed. Fix the errors below and try again.",
  "errors": [
    "URL 2: \"ftp://example.com\" is not a valid URL — must use http or https",
    "URL 4: private and internal URLs are not allowed"
  ]
}
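A hypothetical client-side mapping of these status codes to handling strategies. The groupings below are a suggestion for illustration, not part of the API contract.

```python
def handle_status(code):
    """Map the documented response codes to a client action (illustrative)."""
    if code == 202:
        return "accepted"       # batch queued; start polling
    if code == 429:
        return "retry"          # rate limited (>20 submissions/min): back off
    if code in (400, 422):
        return "fix-request"    # bad urls array or malformed schema
    if code in (401, 402, 403):
        return "check-account"  # API key, billing, or credit-limit problem
    return "unexpected"

print(handle_status(429))  # retry
```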

See Also

Get Batch Status: poll for results
Batch Scraping Guide: full feature walkthrough

Authorizations

x-api-key (string, header, required)

Body

application/json

urls (string<uri>[], required)
URLs to scrape. 1–50 per request. Must be http:// or https://. Private/internal IPs are rejected. Required array length: 1–50 elements.

prompt (string)
AI extraction instruction applied to every URL in the batch.

output (enum<string>, default: json)
Output format. Automatically set to 'json' when schema is provided. Available options: json, markdown.

schema (object)
JSON Schema that constrains AI output shape. Validated before queuing — returns 422 if invalid.

useProxy (boolean, default: false)
Route each URL through residential stealth proxies. Usage is billed from your bandwidth quota.

proxyCountry (string)
ISO country code (e.g. 'us', 'de', 'gb') or region ('eu', 'global'). Requires useProxy: true.

crawlerMode (string, default: default)
Browser rendering mode: 'default', 'fast', or 'ai'.

extractContentOnly (boolean, default: false)
Strip navigation, headers, and sidebars — keep only the main content.

cookies (string)
Session cookies for authenticated pages. Never persisted — passed ephemerally to the worker.

screenshot (boolean, default: false)
Capture a viewport screenshot of each page.

fullPageScreenshot (boolean, default: false)
Capture the full scrollable page. Requires screenshot: true.

Response

202: Batch accepted and queued

status (enum<string>)
Available options: queued.

batchId (string<uuid>)

total (integer)

schema_warnings (string[])
Non-fatal schema issues. Only present if there are warnings.