GET /scrape/{jobId}

Get Scrape Job Status
curl --request GET \
  --url https://api.spidra.io/api/scrape/{jobId} \
  --header 'x-api-key: <api-key>'
{
  "status": "active",
  "progress": {
    "message": "Processing content with AI...",
    "progress": 0.6
  },
  "result": null,
  "error": null
}

Polling Pattern

Scrape jobs are processed asynchronously: submitting a job returns a jobId immediately. Poll this endpoint every 2-5 seconds until status is completed or failed; any other status (waiting, active, or delayed) means the job is still in flight.
// Poll the job until it reaches a terminal status, then return its result.
async function waitForResult(jobId) {
  while (true) {
    const res = await fetch(`https://api.spidra.io/api/scrape/${jobId}`, {
      headers: { 'x-api-key': 'YOUR_API_KEY' }
    });
    const data = await res.json();

    if (data.status === 'completed') return data.result;
    if (data.status === 'failed') throw new Error(data.error);

    // Still waiting, active, or delayed: pause 3 s and try again.
    await new Promise(r => setTimeout(r, 3000));
  }
}
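If you prefer a bounded wait over an endless loop, the same pattern can be wrapped in a generic helper. This is a sketch, not part of the API: pollUntilDone and its options are our own names, and check stands in for the fetch call above.

```javascript
// Generic bounded poller: repeatedly calls check() (any async function
// resolving to the job JSON) until the job completes, fails, or the
// attempt budget runs out.
async function pollUntilDone(check, { intervalMs = 3000, maxAttempts = 100 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const data = await check();
    if (data.status === 'completed') return data.result;
    if (data.status === 'failed') throw new Error(data.error);
    await new Promise(r => setTimeout(r, intervalMs));
  }
  throw new Error(`Job not finished after ${maxAttempts} attempts`);
}
```

Taking check as a function also makes the loop easy to unit-test with a stub instead of a live endpoint.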

Status Values

| Status | Meaning |
| --- | --- |
| waiting | In queue, not started yet |
| active | Running right now |
| completed | Done, results are ready |
| failed | Something went wrong, check error |
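In code, the distinction that matters is terminal versus in-flight. A minimal sketch (the helper name is ours, not part of the API):

```javascript
// completed and failed end the job; waiting, active, and the schema's
// delayed status all mean "poll again".
const TERMINAL_STATUSES = new Set(['completed', 'failed']);

function isTerminal(status) {
  return TERMINAL_STATUSES.has(status);
}
```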

Response Structure

When status is completed, everything you need is inside result.
{
  "status": "completed",
  "progress": {
    "message": "Scrape completed successfully",
    "progress": 1
  },
  "result": {
    "content": "...",
    "data": [
      {
        "url": "https://example.com",
        "title": "Example Domain",
        "markdownContent": "...",
        "success": true,
        "screenshotUrl": null
      }
    ],
    "screenshots": [],
    "ai_extraction_failed": false,
    "stats": {
      "durationMs": 4200,
      "captchaSolvedCount": 0,
      "inputTokens": 312,
      "outputTokens": 84,
      "totalTokens": 396
    }
  },
  "error": null
}

result.content

This is the main output field. What it contains depends on whether you provided a prompt:
  • With prompt: the AI-extracted result, formatted according to output ("markdown" or "json")
  • Without prompt: the raw scraped page content as markdown
If AI extraction fails for any reason, content still returns the raw markdown as a fallback, and ai_extraction_failed is set to true so you can detect this.
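A sketch of reading content defensively, using the fallback flag described above (readContent is a hypothetical helper, not part of the API):

```javascript
// Returns the usable text plus a flag telling you whether the AI
// extraction fell back to raw markdown. Shape follows the response above.
function readContent(result) {
  return {
    text: result.content, // AI output, or raw markdown on fallback
    aiFallback: result.ai_extraction_failed === true,
  };
}
```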

result.data

An array with one entry per URL you submitted. Each entry contains:
| Field | Description |
| --- | --- |
| url | The URL that was scraped |
| title | The page title from the browser |
| markdownContent | The full raw scraped content for this URL as markdown. If you used forEach, this contains all the collected items formatted as ## Item 1, ## Item 2, etc. |
| success | true if the page was scraped successfully, false if it failed |
| screenshotUrl | URL to the screenshot on S3, or null if you did not request one |
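When you submit multiple URLs, some entries may succeed while others fail. A sketch of splitting them by the success flag (partitionPages is our name, not part of the API):

```javascript
// Split result.data into successfully scraped pages and failed ones.
function partitionPages(data) {
  return {
    ok: data.filter(entry => entry.success),
    failed: data.filter(entry => !entry.success),
  };
}
```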

result.stats

Timing and usage information for the job.
| Field | Description |
| --- | --- |
| durationMs | How long the whole job took in milliseconds |
| captchaSolvedCount | Number of CAPTCHAs that were automatically solved |
| inputTokens | Tokens sent to the AI model |
| outputTokens | Tokens returned from the AI model |
| totalTokens | Total tokens used (input + output) |
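For logging or usage tracking, the stats object condenses into a single line. A sketch (summarizeStats is a hypothetical helper):

```javascript
// Format a job's stats as one log line.
function summarizeStats(stats) {
  return `took ${stats.durationMs} ms, ${stats.totalTokens} tokens ` +
    `(${stats.inputTokens} in / ${stats.outputTokens} out), ` +
    `${stats.captchaSolvedCount} CAPTCHA(s) solved`;
}
```

With the example stats above this yields `took 4200 ms, 396 tokens (312 in / 84 out), 0 CAPTCHA(s) solved`.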

Failed Jobs

When status is failed, the error field contains the reason:
{
  "status": "failed",
  "error": "Failed to scrape https://example.com — net::ERR_NAME_NOT_RESOLVED"
}

Authorizations

x-api-key (string, header, required)

Path Parameters

jobId (string, required)

The job ID returned from POST /scrape

Response

Job status and results

status (enum<string>)

Current status of the scrape job. Available options: waiting, active, completed, failed, delayed

progress (object)

result (object)

Present only when status is 'completed'

error (string | null)

Error message if status is 'failed'