GET /crawl/job/{jobId}
Get Crawl Job Details
This endpoint returns the complete record for a crawl job: the original instructions you submitted, the job status, token consumption, and credit cost. It does not return the extracted page data. To get the actual content from each page, use GET /crawl//pages.

When to Use This Endpoint

Use this endpoint when you need to:
  • Inspect the crawl_instruction or transform_instruction that was used for a job
  • Check token and credit costs for accounting or reporting
  • Confirm job configuration before re-running an extraction with POST /crawl//extract
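The status field also makes this endpoint useful for lightweight polling between job submission and extraction. A minimal sketch in Python of the terminal-state check, assuming only the four documented status values (waiting, active, completed, failed); the function names and structure are illustrative, not part of the API:

```python
# Minimal status-polling helper. The four status values (waiting, active,
# completed, failed) come from this endpoint's documentation; the helper
# names below are illustrative, not API concepts.

TERMINAL_STATUSES = {"completed", "failed"}

def is_terminal(status: str) -> bool:
    """True once a job will no longer change state."""
    return status in TERMINAL_STATUSES

def should_keep_polling(job: dict) -> bool:
    """Decide from a fetched job record whether another GET is worthwhile."""
    return not is_terminal(job.get("status", ""))
```

For example, `should_keep_polling({"status": "active"})` returns True, while a completed or failed job returns False.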

Example Request

curl https://api.spidra.io/api/crawl/job/abc-123 \
  -H "x-api-key: YOUR_API_KEY"
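
The same request can be built from Python's standard library. A sketch; the job id "abc-123" and YOUR_API_KEY are placeholders, exactly as in the curl example:

```python
# Python equivalent of the curl example above, standard library only.
import json
import urllib.request

def build_job_request(job_id: str, api_key: str) -> urllib.request.Request:
    """Prepare the GET request for a crawl job's details."""
    url = f"https://api.spidra.io/api/crawl/job/{job_id}"
    return urllib.request.Request(url, headers={"x-api-key": api_key})

# Sending it (requires a valid key):
# with urllib.request.urlopen(build_job_request("abc-123", "YOUR_API_KEY")) as resp:
#     job = json.load(resp)
```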

Response Fields

Field                  Type     Description
id                     string   Unique job identifier
base_url               string   The starting URL that was crawled
crawl_instruction      string   The instruction used to decide which pages to follow
transform_instruction  string   The AI prompt used to extract data from each page
max_pages              integer  Maximum pages requested
pages_crawled          integer  Pages that were successfully processed
status                 string   Current status: waiting, active, completed, or failed
created_at             string   ISO 8601 timestamp when the job was created
updated_at             string   ISO 8601 timestamp of the last status update
input_tokens           number   Total input tokens consumed by the AI across all pages
output_tokens          number   Total output tokens generated by the AI across all pages
credits_used           number   Total credits charged for this job

Example Response

{
  "id": "abc-123",
  "base_url": "https://example.com/blog",
  "crawl_instruction": "Crawl all blog post pages",
  "transform_instruction": "Extract title, author, publish date, and the first 200 words of the body",
  "max_pages": 10,
  "pages_crawled": 8,
  "status": "completed",
  "created_at": "2025-12-17T15:00:00Z",
  "updated_at": "2025-12-17T15:03:42Z",
  "input_tokens": 14820,
  "output_tokens": 3210,
  "credits_used": 25
}
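
Token counts and credits in the response are job-wide totals, so per-page figures have to be derived. A small sketch using the example record above; the input field names match the documented response, but the derived metric names are illustrative, not API fields:

```python
# Derive per-page figures from a job record returned by this endpoint.
# Input keys match the documented response; the output keys
# (completion_ratio, credits_per_page, tokens_total) are illustrative.

def job_metrics(job: dict) -> dict:
    crawled = job["pages_crawled"]
    return {
        "completion_ratio": crawled / job["max_pages"] if job["max_pages"] else 0.0,
        "credits_per_page": job["credits_used"] / crawled if crawled else 0.0,
        "tokens_total": job["input_tokens"] + job["output_tokens"],
    }

example = {
    "max_pages": 10, "pages_crawled": 8,
    "input_tokens": 14820, "output_tokens": 3210, "credits_used": 25,
}
# job_metrics(example) -> completion_ratio 0.8, credits_per_page 3.125,
# tokens_total 18030
```
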
This endpoint returns job configuration and stats only. For the actual extracted content from each page, call GET /crawl//pages.

Authorizations

x-api-key (string, header, required)

Path Parameters

jobId (string, required)

Response

Job details

id                     string
base_url               string
crawl_instruction      string
transform_instruction  string
max_pages              integer
pages_crawled          integer
status                 string
created_at             string<date-time>
updated_at             string<date-time>
input_tokens           number
output_tokens          number
credits_used           number