batchId.
Use batch scraping when:
- You have a list of product, article, or listing URLs to extract
- You want one API call per dataset rather than managing dozens of individual jobs
- You need to retry only the URLs that failed without re-running the whole set
How It Works

Submit

Send a `POST /api/batch/scrape` request with your URL list and extraction options. You get a `batchId` back immediately — the job is queued.

Process

Spidra processes each URL independently using a real browser. CAPTCHA solving, proxy routing, and AI extraction all run per-item.

Poll

Call `GET /api/batch/scrape/{batchId}` every few seconds. The response includes live progress counters (`completedCount`, `failedCount`) and per-item results.

Quick Start
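A minimal submission sketch in Python, using only the standard library. The endpoint and `batchId` come from this page; the base URL, auth header, and body field names (`urls`, `output`) are assumptions for illustration:

```python
import json
import urllib.request

API_BASE = "https://api.spidra.com/api"  # placeholder base URL
API_KEY = "YOUR_API_KEY"


def build_batch_payload(urls, output="markdown", schema=None):
    """Assemble the body for POST /api/batch/scrape.

    Field names are assumed; a schema enforces JSON output
    (see Structured Output below).
    """
    payload = {"urls": list(urls), "output": output}
    if schema is not None:
        payload["schema"] = schema
    return payload


def submit_batch(urls, **options):
    """POST the batch and return the batchId of the queued job."""
    req = urllib.request.Request(
        f"{API_BASE}/batch/scrape",
        data=json.dumps(build_batch_payload(urls, **options)).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["batchId"]
```

Hold on to the returned `batchId` — it is the handle for polling, retrying, and cancelling.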
Polling Pattern

Batch jobs are asynchronous. Poll `GET /api/batch/scrape/{batchId}` every 2–5 seconds until `status` reaches a terminal value.
| `status` | Meaning |
|---|---|
| `pending` | Queued, no items have started yet |
| `running` | At least one item is being processed |
| `completed` | All items finished (some may have failed — check `failedCount`) |
| `failed` | The entire batch failed unexpectedly |
| `cancelled` | You cancelled it via `DELETE /api/batch/scrape/{batchId}` |
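The polling loop can be sketched as follows. The endpoint and status values come from this page; `fetch` stands in for whatever HTTP client you use to call `GET /api/batch/scrape/{batchId}`, and the helper names are illustrative:

```python
import time

# Statuses after which a batch can no longer change.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def is_terminal(status):
    """True once the batch has reached a terminal status."""
    return status in TERMINAL_STATUSES


def poll_batch(fetch, batch_id, interval=3.0):
    """Call GET /api/batch/scrape/{batch_id} via `fetch` every few
    seconds until the batch reaches a terminal status."""
    while True:
        batch = fetch(batch_id)
        if is_terminal(batch["status"]):
            return batch
        time.sleep(interval)
```

An interval in the documented 2–5 second range keeps you responsive without hammering the status endpoint.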
`completed` does not mean every URL succeeded. A batch is `completed` when all items have reached a terminal state (`completed` or `failed`). Always check `failedCount` and inspect individual item statuses.

Per-Item Results
Each item in the `items` array represents one URL:
| Field | Description |
|---|---|
| `uuid` | Unique ID for this batch item |
| `url` | The URL that was scraped |
| `status` | `pending`, `running`, `completed`, `failed`, or `cancelled` |
| `result` | Extracted content (object if JSON, string if markdown); `null` until completed |
| `error` | Error message if status is `failed`, otherwise `null` |
| `creditsUsed` | Credits consumed by this item; `0` for failed items |
| `startedAt` | When the worker picked up this item |
| `finishedAt` | When this item reached a terminal state |
| `screenshotUrl` | S3 URL if `screenshot: true` was set, otherwise `null` |
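Given the fields above, separating successes from failures is a small reduction over the `items` list. The field names come from the table; the sample data and helper name are invented:

```python
def summarize_items(items):
    """Partition batch items by terminal status and tally credits."""
    succeeded = [i for i in items if i["status"] == "completed"]
    failed = [i for i in items if i["status"] == "failed"]
    credits = sum(i["creditsUsed"] for i in items)
    return succeeded, failed, credits
```

Because failed items report `creditsUsed: 0`, the tally reflects only the work that actually completed.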
Structured Output
Pass a `schema` to enforce a specific output shape across all URLs in the batch. The AI will return JSON matching your schema for every item.

When `schema` is provided, `output` is automatically set to `"json"`. The schema is validated before the batch is queued — a 422 is returned if it is malformed.
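A request with a schema might look like the following sketch. The `schema` behavior is documented above; the product fields and the body shape are invented for illustration:

```python
# Hypothetical batch request enforcing one output shape for every URL.
batch_request = {
    "urls": [
        "https://shop.example.com/p/1",
        "https://shop.example.com/p/2",
    ],
    "schema": {  # invented example schema
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "price": {"type": "number"},
            "inStock": {"type": "boolean"},
        },
        "required": ["title", "price"],
    },
    # "output" can be omitted: providing "schema" implies "json".
}
```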
Structured Output Guide
Full guide on nested objects, arrays, nullable fields, and schema limits
Retrying Failed Items
When a batch completes with some failures, retry only those items — no need to re-run the whole batch. After a retry, the batch status returns to `running`, and you poll the same `batchId` until it completes again. Successfully completed items are never touched.
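A retry sketch, under stated assumptions: the `POST /api/batch/scrape/{batchId}/retry` route is a guess (see the Cancel & Retry reference for the real request shape), and the base URL is a placeholder. Only the item fields come from this page:

```python
import json
import urllib.request

API_BASE = "https://api.spidra.com/api"  # placeholder base URL


def failed_uuids(items):
    """Collect the uuids of items that ended in a failed state."""
    return [i["uuid"] for i in items if i["status"] == "failed"]


def retry_failed(batch_id, api_key):
    """Re-queue failed items; the retry route here is an assumption."""
    req = urllib.request.Request(
        f"{API_BASE}/batch/scrape/{batch_id}/retry",
        data=b"{}",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

After the retry call, re-enter the same polling loop against the original `batchId`.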
Cancelling a Batch
Cancel a running or pending batch to stop processing and refund credits for items that have not started yet.

Proxy & Geo-Targeting
Apply stealth proxy routing to every URL in the batch with `useProxy` and `proxyCountry`.
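For example, the two proxy fields slot into the request body; `useProxy` and `proxyCountry` come from this page, while the body shape and the ISO country-code format are assumptions:

```python
# Hypothetical geo-targeted batch request.
batch_request = {
    "urls": ["https://example.de/produkt/42"],
    "useProxy": True,      # route every item through stealth proxies
    "proxyCountry": "DE",  # assumed ISO 3166-1 alpha-2 format
}
```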
Stealth Mode & Geo-Targeting
Full country list, EU rotation, and billing details
Cookies & Authenticated Pages
Pass session cookies to scrape pages behind a login. Cookies are never stored — they are passed ephemerally to the worker and discarded after processing.

Authenticated Scraping
Full guide on obtaining and formatting cookies
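A sketch of attaching cookies to a batch request. The ephemeral-cookie behavior is documented above, but the `cookies` field name and its shape are assumptions (see the Authenticated Scraping guide for the real format), and the values are placeholders:

```python
# Hypothetical authenticated batch request.
batch_request = {
    "urls": ["https://app.example.com/dashboard"],
    "cookies": [  # assumed field: passed ephemerally, never stored
        {
            "name": "session_id",
            "value": "PLACEHOLDER",
            "domain": "app.example.com",
        },
    ],
}
```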
Submit a Batch
Full request reference
Get Batch Status
Polling and response shape
List Batches
See all your batch jobs
Cancel & Retry
Stop a batch or re-run failures

