Scrape pages, run browser actions, batch-process URLs, and crawl entire sites. All results come back as structured data ready to feed into your LLM pipelines or store directly.
All scrape jobs run asynchronously. run() submits a job and polls until it finishes. For manual control, use submit() and get() directly. Up to 3 URLs can be passed per request and are processed in parallel.
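Because a single request accepts at most 3 URLs, a longer list has to be split before it is submitted. The following is a minimal sketch of that step; `chunkUrls` and `MAX_URLS_PER_REQUEST` are hypothetical names introduced here for illustration and are not part of the SDK.

```javascript
// Hypothetical helper, not part of the Spidra SDK: split a list of URL
// strings into groups no larger than the per-request limit described above.
const MAX_URLS_PER_REQUEST = 3;

function chunkUrls(urls, size = MAX_URLS_PER_REQUEST) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const groups = [];
  for (let i = 0; i < urls.length; i += size) {
    // Each entry uses the { url } shape shown in the examples on this page.
    groups.push(urls.slice(i, i + size).map((url) => ({ url })));
  }
  return groups;
}
```

Each group could then be passed as the `urls` value of a separate request. For long lists, the batch API mentioned above may be the better fit.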
Fire-and-forget approach: submit a job immediately and poll on your own schedule.
    // Submit: returns immediately with a jobId
    const { jobId } = await spidra.scrape.submit({
      urls: [{ url: 'https://example.com' }],
      prompt: 'Extract the main headline',
    });

    // Check status at any time
    const status = await spidra.scrape.get(jobId);
    if (status.status === 'completed') {
      console.log(status.result.content);
    } else if (status.status === 'failed') {
      console.error(status.error);
    }
Job statuses: waiting · active · completed · failed
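The four statuses suggest a simple manual polling loop: keep checking while the job is `waiting` or `active`, return on `completed`, and throw on `failed`. The sketch below is one way to write it, not the SDK's own implementation. `waitForJob` and `sleep` are hypothetical names, and the status lookup is passed in as a function so the loop can be exercised without the SDK. The default interval and timeout are assumptions that mirror the defaults documented for run() on this page.

```javascript
// Sketch of a manual polling loop. `getStatus` stands in for a call such as
// () => spidra.scrape.get(jobId); it is injected so the loop has no SDK dependency.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function waitForJob(getStatus, { pollInterval = 3000, timeout = 120_000 } = {}) {
  const deadline = Date.now() + timeout;
  for (;;) {
    const job = await getStatus();
    if (job.status === 'completed') return job;
    if (job.status === 'failed') {
      throw new Error(`Job failed: ${job.error ?? 'unknown error'}`);
    }
    // 'waiting' and 'active' both mean the job is still in progress.
    if (Date.now() + pollInterval > deadline) {
      throw new Error(`Job did not finish within ${timeout} ms`);
    }
    await sleep(pollInterval);
  }
}
```

With the SDK, a call might look like `await waitForJob(() => spidra.scrape.get(jobId))`.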
Route through a residential proxy in a specific country for geo-restricted content or localized pricing.
    const job = await spidra.scrape.run({
      urls: [{ url: 'https://www.amazon.de/gp/bestsellers' }],
      prompt: 'List the top 10 products with name and price',
      useProxy: true,
      proxyCountry: 'de',
    });
Supported codes include us, gb, de, fr, jp, au, ca, br, in, nl, sg, es, it, mx, and 40+ more. Use "global" or "eu" for regional routing.
forEach finds a set of matching elements on the page and processes each one individually. Use it when you need to collect data from a list of items, paginate across pages, or click into each item’s detail page.
You don’t need forEach if all the data fits on a single page; a plain prompt is simpler and works just as well.
Use forEach when:
The list spans multiple pages and you need pagination
You need to click into each item’s detail page (navigate mode)
You have 20+ items and want consistent per-item AI extraction (itemPrompt)
Follow each element’s link to its destination page and capture content there. Best for product listings where full details are only on individual pages.
    {
      type: 'forEach',
      observe: 'Find all book title links in the product grid',
      mode: 'navigate',
      captureSelector: 'article.product_page',
      maxItems: 10,
      waitAfterClick: 800,
      itemPrompt: 'Extract title, price, star rating, and availability as JSON',
    }
Click each element, capture the content that appears (modal, drawer, or expanded section), then move on. Best for hotel room cards, FAQ accordions, or any UI where clicking reveals hidden content.
    {
      type: 'forEach',
      observe: 'Find all room type cards',
      mode: 'click',
      captureSelector: "[role='dialog']",
      maxItems: 8,
      waitAfterClick: 1200,
      itemPrompt: 'Extract room name, bed type, price per night, and amenities as JSON',
    }
Run extra browser actions on each item after navigating or clicking into it, before content is captured. Useful for scrolling below the fold or expanding collapsed sections.
    {
      type: 'forEach',
      observe: 'Find all book title links',
      mode: 'navigate',
      captureSelector: 'article.product_page',
      maxItems: 5,
      waitAfterClick: 1000,
      actions: [
        { type: 'scroll', to: '50%' },
      ],
      itemPrompt: 'Extract title, price, and full description as JSON',
    }
Use itemPrompt to extract fields from each item individually. Use the top-level prompt to filter, sort, or reshape the combined output. They can be used together.
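As a rough illustration of using both prompts together, the sketch below builds one request in which `itemPrompt` runs per item and the top-level `prompt` reshapes the combined output. This page does not show where a forEach step attaches to a request, so the per-URL `actions` array here is an assumption to check against the API reference. `buildBookScrape` is a hypothetical helper name, and the filter wording in `prompt` is only an example.

```javascript
// Assumption: forEach steps are passed in a per-URL `actions` array.
// This page does not confirm the placement; verify it before relying on it.
function buildBookScrape(url) {
  return {
    urls: [
      {
        url,
        actions: [
          {
            type: 'forEach',
            observe: 'Find all book title links in the product grid',
            mode: 'navigate',
            captureSelector: 'article.product_page',
            maxItems: 10,
            waitAfterClick: 800,
            // Applied to each item individually.
            itemPrompt: 'Extract title, price, star rating, and availability as JSON',
          },
        ],
      },
    ],
    // Applied once to the combined per-item output (example wording).
    prompt: 'Return only books priced under 20, sorted by price ascending',
  };
}
```

The resulting object could be passed to a call such as `spidra.scrape.run(buildBookScrape('https://example.com/books'))`.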
scrape.run(), batch.run(), and crawl.run() accept a second argument to control polling behavior.
    const job = await spidra.scrape.run(params, {
      pollInterval: 3000, // ms between status checks (default: 3000)
      timeout: 120_000,   // max wait in ms before throwing (default: 120000)
    });
Give Spidra a starting URL and instructions for which links to follow. It discovers pages automatically, extracts structured data from each one, and returns everything when the crawl is done.
    const job = await spidra.crawl.run({
      baseUrl: 'https://competitor.com/blog',
      crawlInstruction: 'Follow blog post links only, skip tag and category pages',
      transformInstruction: 'Extract the title, author, publish date, and a one-sentence summary',
      maxPages: 30,
      useProxy: true,
    });

    for (const page of job.result) {
      console.log(page.url, page.data);
    }
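Since the crawl result is iterated as a list of pages with `url` and `data` fields, one convenient way to store it is as JSON Lines, with one page per line. The helper below is a small sketch of that idea. `toJsonLines` is a hypothetical name, and it assumes each entry carries only the two fields used in the loop above.

```javascript
// Hypothetical helper: serialise crawl pages as JSON Lines (one page per line).
// Assumes each entry exposes the `url` and `data` fields used in the loop above.
function toJsonLines(pages) {
  return pages
    .map(({ url, data }) => JSON.stringify({ url, data }))
    .join('\n');
}
```

The returned string could be written to a file or passed to a downstream pipeline, for example `toJsonLines(job.result)`.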
Apply a new AI prompt to an existing completed crawl without fetching the pages again. Only transformation credits are charged.
    const { jobId: newJobId } = await spidra.crawl.extract(
      sourceJobId,
      'Extract only the product SKUs and prices as a CSV',
    );

    // Poll the new extraction job
    const result = await spidra.crawl.get(newJobId);