Run a new extraction on pages from a completed crawl job without re-crawling the site.

| Field | Type | Required | Description |
|---|---|---|---|
| `transformInstruction` | string | Yes | The extraction prompt to apply to every page from the source crawl. Maximum 5,000 characters. |
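
As a rough illustration of the body described above, the sketch below validates a prompt locally and serializes it. It assumes the body is sent as JSON and that the limit is counted in Unicode code points (what Python's `len` measures); neither is confirmed here. The helper name `build_extract_body` is hypothetical.

```python
import json

# Limit stated in the field table above.
MAX_INSTRUCTION_CHARS = 5000


def build_extract_body(transform_instruction: str) -> str:
    """Check the prompt locally, then return a JSON request body (assumed format)."""
    if not transform_instruction:
        raise ValueError("Missing required field: transformInstruction")
    if len(transform_instruction) > MAX_INSTRUCTION_CHARS:
        raise ValueError("transformInstruction must be 5000 characters or fewer")
    return json.dumps({"transformInstruction": transform_instruction})
```

Checking the length before sending avoids a round trip that would end in a 400 response.
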
You need the `jobId` of a completed crawl job. If you ran the crawl previously, this is the `id` field shown in your crawl history. You also need a `transformInstruction` describing what you want to extract.

Track the extraction by its `jobId`. Use the standard crawl endpoints to check progress and get results:

| Endpoint | Purpose |
|---|---|
| `GET /crawl/{jobId}` | Poll job status |
| `GET /crawl/{jobId}/pages` | Get extracted data per page |
| `GET /crawl/{jobId}/download` | Download results as ZIP |
| `POST /crawl/{jobId}/retry/{pageId}` | Retry a specific page |

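
Polling `GET /crawl/{jobId}` could look roughly like the sketch below. The HTTP call is left to a caller-supplied function, so no base URL or authentication scheme is assumed. It assumes the status response is an object with a `status` field that becomes `"completed"` when the job finishes. The names `wait_for_completion`, `fetch_status`, `interval_s`, and `max_polls` are hypothetical.

```python
import time
from typing import Callable


def wait_for_completion(
    fetch_status: Callable[[], dict],
    interval_s: float = 5.0,
    max_polls: int = 120,
) -> dict:
    """Call fetch_status until it reports status == "completed".

    fetch_status is expected to wrap GET /crawl/{jobId} and return the
    decoded response (assumed to be a dict with a "status" key).
    """
    for attempt in range(max_polls):
        job = fetch_status()
        if job.get("status") == "completed":
            return job
        # Sleep between polls, but not after the final attempt.
        if attempt < max_polls - 1:
            time.sleep(interval_s)
    raise TimeoutError(f"job not completed after {max_polls} polls")
```

Once this returns, the `/pages` and `/download` endpoints can be called for the results.
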
| Status | Error message | Cause |
|---|---|---|
| 400 | Source crawl job has not completed successfully | You called `/extract` before the source job finished. Wait for `status: "completed"`. |
| 400 | Missing required field: transformInstruction | The request body is missing the `transformInstruction` field. |
| 400 | transformInstruction must be 5000 characters or fewer | Your prompt exceeds the 5,000 character limit. |
| 403 | You have exceeded your monthly credit limit. | Not enough credits remaining. Check your usage at `GET /usage`. |
| 404 | Source crawl job not found | The `jobId` does not exist or does not belong to your account. |

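
One way a client might act on the errors above is to map each status and message to a next step. The sketch below does that with the messages from the table. The function name `classify_extract_error` and the returned labels are hypothetical, and matching on message text assumes the wording stays stable.

```python
def classify_extract_error(status: int, message: str) -> str:
    """Map an error response to a suggested next step (labels are illustrative)."""
    if status == 400 and "has not completed" in message:
        return "wait_for_source_job"   # poll the source job, then retry
    if status == 400:
        return "fix_request"           # missing or too-long transformInstruction
    if status == 403:
        return "check_usage"           # credit limit reached; see GET /usage
    if status == 404:
        return "check_job_id"          # unknown jobId or not owned by this account
    return "unexpected"
```

Only the first case is worth retrying automatically; the others need a change to the request or the account.
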
| Parameter | Description |
|---|---|
| `jobId` | The ID of the completed source crawl job to extract from. |
| `transformInstruction` | Extraction prompt to apply to all pages from the source crawl. Maximum 5,000 characters. |