POST /api/scrape

Example request:
curl --request POST \
  --url https://api.spidra.io/api/scrape \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --data '
{
  "urls": [
    {
      "url": "https://example.com"
    }
  ],
  "prompt": "Extract the main heading and first paragraph",
  "output": "json"
}
'
Example response (job queued):
{
  "status": "queued",
  "jobId": "550e8400-e29b-41d4-a716-446655440000",
  "message": "Scrape job has been queued. Poll /api/scrape/550e8400-e29b-41d4-a716-446655440000 to get the result."
}

How It Works

  1. Load - Opens each URL in a real browser
  2. Execute - Runs your browser actions (clicks, scrolls, etc.)
  3. Solve - Automatically handles CAPTCHAs
  4. Process - Runs AI extraction (if prompt provided)

Browser Actions

Interact with pages before scraping:
{
  "urls": [{
    "url": "https://example.com/products",
    "actions": [
      {"type": "click", "selector": "#accept-cookies"},
      {"type": "wait", "value": 1000},
      {"type": "click", "selector": ".load-more-btn"},
      {"type": "scroll", "value": "50%"}
    ]
  }],
  "prompt": "List all product names and prices",
  "output": "json"
}
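Rather than hand-writing JSON like the block above, the request body can be assembled programmatically. A sketch (the helper name is ours, not part of the API):

```python
def make_scrape_request(url, actions=None, prompt=None, output="json"):
    """Build the JSON body for POST /api/scrape (one URL entry; up to 3 allowed)."""
    entry = {"url": url}
    if actions:
        entry["actions"] = actions  # executed in order before scraping
    body = {"urls": [entry], "output": output}
    if prompt:
        body["prompt"] = prompt  # optional LLM extraction prompt
    return body

body = make_scrape_request(
    "https://example.com/products",
    actions=[
        {"type": "click", "selector": "#accept-cookies"},
        {"type": "wait", "value": 1000},
    ],
    prompt="List all product names and prices",
)
```

Serialize `body` with `json.dumps` and send it exactly as in the curl example at the top of this page.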

Available Actions

Action   Description       Example
click    Click an element  {"type": "click", "selector": "#btn"}
type     Type into input   {"type": "type", "selector": "#search", "value": "query"}
wait     Wait (ms)         {"type": "wait", "value": 2000}
scroll   Scroll page       {"type": "scroll", "value": "50%"}
Use browser DevTools (right-click → Inspect) to find CSS selectors.
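A typo in an action type or a missing field is easiest to catch before sending the request. A client-side check against the table above, as a sketch (the server's actual validation rules are not documented here):

```python
# Required field for each action type, per the table above.
ACTION_FIELDS = {
    "click": "selector",
    "type": "selector",  # "type" actions additionally need a "value" to type
    "wait": "value",
    "scroll": "value",
}

def validate_action(action: dict) -> None:
    """Raise ValueError if an action is unknown or missing a required field."""
    kind = action.get("type")
    if kind not in ACTION_FIELDS:
        raise ValueError(f"unknown action type: {kind!r}")
    if ACTION_FIELDS[kind] not in action:
        raise ValueError(f"{kind!r} action requires {ACTION_FIELDS[kind]!r}")
    if kind == "type" and "value" not in action:
        raise ValueError("'type' action requires a 'value' to type")
```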

Authentication (Optional)

Scrape protected pages by providing session cookies from your logged-in browser:
{
  "urls": [{"url": "https://crunchbase.com/company/stripe"}],
  "prompt": "Extract company details",
  "output": "json",
  "cookies": "authcookie=eyJ...; cf_clearance=2B08..."
}

How to Get Cookies

  1. Log into the target website in your browser
  2. Open DevTools (F12) → Application → Cookies
  3. Copy the relevant cookie names and values
  4. Format as name=value; name2=value2
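The name=value; name2=value2 string from step 4 can be produced from a dict of the copied cookies. A small sketch:

```python
def format_cookies(cookies: dict) -> str:
    """Join cookies into the 'name=value; name2=value2' header format."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())

# Truncated values, as in the example above.
cookie_header = format_cookies({
    "authcookie": "eyJ...",
    "cf_clearance": "2B08...",
})
```

Pass the resulting string as the top-level "cookies" field of the request body.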
Legal Responsibility: You are solely responsible for ensuring that your authenticated scraping complies with applicable laws and the target website’s Terms of Service. Only scrape content you’re authorized to access, and when uncertain, seek written permission from the platform. Cookies are processed transiently and never stored by Spidra.

Authorizations

x-api-key
  string, in header, required

Body (application/json)

urls
  object[], required
  Array of URLs to scrape (1-3 URLs per request).
  Required array length: 1-3 elements.

prompt
  string
  Optional LLM prompt for extracting or transforming the scraped content.

output
  enum<string>, default: json
  Output format for the extracted content.
  Available options: json, markdown

useProxy
  boolean, default: false
  Enable stealth mode with proxy rotation to avoid detection.

Response

Job successfully queued.

status
  enum<string>
  Available options: queued

jobId
  string
  Unique job identifier for polling.

message
  string
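The body constraints above (1-3 URLs, each entry with a "url", output limited to json or markdown, boolean useProxy) can be enforced client-side before sending. A sketch of such a check, based only on the documented schema:

```python
def validate_body(body: dict) -> None:
    """Check a request body against the documented /api/scrape constraints."""
    urls = body.get("urls")
    if not isinstance(urls, list) or not 1 <= len(urls) <= 3:
        raise ValueError("urls must be an array of 1-3 elements")
    for entry in urls:
        if not isinstance(entry, dict) or "url" not in entry:
            raise ValueError("each urls entry needs a 'url' field")
    if body.get("output", "json") not in ("json", "markdown"):
        raise ValueError("output must be 'json' or 'markdown'")
    if not isinstance(body.get("useProxy", False), bool):
        raise ValueError("useProxy must be a boolean")
```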