GET /crawl/job/{jobId}
Get Crawl Job Details
This endpoint returns the complete record for a crawl job: the original instructions you submitted, the job status, token consumption, and credit cost. It does not return the extracted page data. To get the actual content from each page, use GET /crawl//pages.

When to Use This Endpoint

Use this endpoint when you need to:
  • Inspect the crawl_instruction or transform_instruction that was used for a job
  • Check token and credit costs for accounting or reporting
  • Confirm job configuration before re-running an extraction with POST /crawl//extract
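The status field also makes this endpoint useful for lightweight polling between job submission and extraction. A minimal sketch in Python of the terminal-state check, assuming only the four documented status values (waiting, active, completed, failed); the function names and structure are illustrative, not part of the API:

```python
# Minimal status-polling helper. The four status values (waiting, active,
# completed, failed) come from this endpoint's documentation; the helper
# names below are illustrative, not API concepts.

TERMINAL_STATUSES = {"completed", "failed"}

def is_terminal(status: str) -> bool:
    """True once a job will no longer change state."""
    return status in TERMINAL_STATUSES

def should_keep_polling(job: dict) -> bool:
    """Decide from a fetched job record whether another GET is worthwhile."""
    return not is_terminal(job.get("status", ""))
```

For example, `should_keep_polling({"status": "active"})` returns True, while a completed or failed job returns False.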

Example Request

curl https://api.spidra.io/api/crawl/job/abc-123 \
  -H "x-api-key: YOUR_API_KEY"
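
The same request can be built from Python's standard library. A sketch; the job id "abc-123" and YOUR_API_KEY are placeholders, exactly as in the curl example:

```python
# Python equivalent of the curl example above, standard library only.
import json
import urllib.request

def build_job_request(job_id: str, api_key: str) -> urllib.request.Request:
    """Prepare the GET request for a crawl job's details."""
    url = f"https://api.spidra.io/api/crawl/job/{job_id}"
    return urllib.request.Request(url, headers={"x-api-key": api_key})

# Sending it (requires a valid key):
# with urllib.request.urlopen(build_job_request("abc-123", "YOUR_API_KEY")) as resp:
#     job = json.load(resp)
```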

Response Fields

Field                  Type     Description
id                     string   Unique job identifier
base_url               string   The starting URL that was crawled
crawl_instruction      string   The instruction used to decide which pages to follow
transform_instruction  string   The AI prompt used to extract data from each page
max_pages              integer  Maximum pages requested
pages_crawled          integer  Pages that were successfully processed
status                 string   Current status: waiting, active, completed, or failed
created_at             string   ISO 8601 timestamp when the job was created
updated_at             string   ISO 8601 timestamp of the last status update
input_tokens           number   Total input tokens consumed by the AI across all pages
output_tokens          number   Total output tokens generated by the AI across all pages
credits_used           number   Total credits charged for this job

Example Response

{
  "id": "abc-123",
  "base_url": "https://example.com/blog",
  "crawl_instruction": "Crawl all blog post pages",
  "transform_instruction": "Extract title, author, publish date, and the first 200 words of the body",
  "max_pages": 10,
  "pages_crawled": 8,
  "status": "completed",
  "created_at": "2025-12-17T15:00:00Z",
  "updated_at": "2025-12-17T15:03:42Z",
  "input_tokens": 14820,
  "output_tokens": 3210,
  "credits_used": 25
}
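
Token counts and credits in the response are job-wide totals, so per-page figures have to be derived. A small sketch using the example record above; the input field names match the documented response, but the derived metric names are illustrative, not API fields:

```python
# Derive per-page figures from a job record returned by this endpoint.
# Input keys match the documented response; the output keys
# (completion_ratio, credits_per_page, tokens_total) are illustrative.

def job_metrics(job: dict) -> dict:
    crawled = job["pages_crawled"]
    return {
        "completion_ratio": crawled / job["max_pages"] if job["max_pages"] else 0.0,
        "credits_per_page": job["credits_used"] / crawled if crawled else 0.0,
        "tokens_total": job["input_tokens"] + job["output_tokens"],
    }

example = {
    "max_pages": 10, "pages_crawled": 8,
    "input_tokens": 14820, "output_tokens": 3210, "credits_used": 25,
}
# job_metrics(example) -> completion_ratio 0.8, credits_per_page 3.125,
# tokens_total 18030
```
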
This endpoint returns job configuration and stats only. For the actual extracted content from each page, call GET /crawl//pages.

Authorizations

x-api-key (string, header, required)

Path Parameters

jobId (string, required)

Response

Job details

id                     string
base_url               string
crawl_instruction      string
transform_instruction  string
max_pages              integer
pages_crawled          integer
status                 string
created_at             string<date-time>
updated_at             string<date-time>
input_tokens           number
output_tokens          number
credits_used           number