The Spidra n8n node lets you trigger scrape jobs, batch process URLs, and crawl entire websites as steps in any n8n workflow. No code required. Configure your extraction prompt, connect the output to whatever comes next, and you’re done.

Installation

In your n8n instance, go to Settings > Community Nodes and install:
n8n-nodes-spidra
Requires n8n 1.0 or higher. After installing, restart n8n for the node to appear in the editor.

Authentication

Add a new Spidra API credential and enter your API key. You can get your key from app.spidra.io under Settings > API Keys. If you are running a self-hosted Spidra instance, change the Base URL field to point at your server. The default is https://api.spidra.io/api.

Resources and operations

The node has five resources. Each one maps directly to the Spidra API.
  • Scrape: Run, Submit, Get Status
  • Batch Scrape: Run, Submit, Get Status, List, Cancel, Retry Failed
  • Crawl: Run, Submit, Get Status, Get Pages, Extract, History
  • Logs: List, Get
  • Usage: Get Stats

Run vs Submit + Get Status

Every resource that creates a job offers two ways to handle it.

Run submits the job and keeps the workflow waiting until results come back. This is the simplest option and works well for short jobs. You set a Max Wait Time (default 120 seconds). If the job finishes in time, the node outputs the full result. If it times out, the node outputs a { status: "timeout", jobId: "..." } response so you can chain a Get Status node and check progress later.

Submit returns the job ID immediately without waiting. Use this when you want to kick off a long job and check on it in a later step or a separate workflow run.
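The timeout branch above can be handled in a downstream Code node. This is a minimal sketch: the { status: "timeout", jobId } shape matches the node's documented timeout output, but the routing labels ("getStatus", "done") are illustrative names for this example, not part of the node.

```javascript
// Decide how to route a Run result: chain a Get Status node on timeout,
// or pass the finished result downstream.
function routeRunResult(output) {
  if (output.status === "timeout") {
    // Job exceeded Max Wait Time: check progress later with the jobId.
    return { next: "getStatus", jobId: output.jobId };
  }
  // Job finished within Max Wait Time: the node emitted the full result.
  return { next: "done", result: output };
}
```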

Scraping

Select Resource: Scrape and Operation: Run to scrape up to three URLs in one request. Add your URLs using the Add URL button. Each URL can include an optional Browser Actions JSON array for interactions like clicking, scrolling, or filling a form before the AI extraction runs. Set Output Format to JSON or Markdown. Use the Options collection to add an extraction prompt, a JSON schema for structured output, proxy settings, cookies, and screenshot capture. Extraction Prompt tells the AI what to pull from the page in plain English. Extraction Schema enforces an exact output shape and takes precedence over the prompt when both are set.
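A Browser Actions array might look like the sketch below. The action names and fields here ("click", "scroll", "fill") are assumptions for illustration only; check the Spidra API reference for the exact action schema your instance accepts.

```javascript
// Hypothetical Browser Actions for one URL: click a button, scroll,
// then fill a search field before extraction runs.
const browserActions = [
  { type: "click", selector: "#load-more" },
  { type: "scroll", direction: "down", amount: 1000 },
  { type: "fill", selector: "input[name='q']", value: "pricing" },
];
```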

Batch scraping

Select Resource: Batch Scrape to process a large list of URLs in one job. Add each URL as a separate line in the URLs field. The batch supports up to 50 URLs per job and processes them all in parallel. The same options available in Scrape (prompt, schema, proxy, cookies, screenshots) are available here too. If some items fail, use Retry Failed with the batch ID to re-queue only the failed URLs without re-running the ones that already completed. Use Cancel to stop a running batch and get credits refunded for anything that has not started yet.
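Conceptually, Retry Failed re-queues only the failed subset of a batch. The sketch below models that selection; the item shape ({ url, status }) is illustrative, not the node's exact output format.

```javascript
// Collect only the URLs whose batch items failed, leaving completed
// items untouched -- the same selection Retry Failed performs server-side.
function urlsToRetry(batchItems) {
  return batchItems
    .filter((item) => item.status === "failed")
    .map((item) => item.url);
}
```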

Crawling

Select Resource: Crawl to start from a URL and let Spidra discover and process pages on its own. Three fields are required:
  • Start URL: the root page the crawler starts from
  • Navigation Instruction: plain English instructions for which links to follow and which to skip
  • Extraction Instruction: what data to pull from each page the crawler visits
Under Options, set Max Pages to cap how many pages the crawl visits. Proxy and cookie options work the same as in Scrape. Once a crawl completes, use Get Pages with the job ID to retrieve signed download URLs for the raw HTML and Markdown of every crawled page. URLs expire after one hour. Use Extract to re-run AI extraction on a completed crawl with a new instruction, without re-crawling any pages. This only charges transformation credits.
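The three required crawl fields can be pictured as a plain object. The field names below are descriptive stand-ins for the node's UI fields, not the underlying API parameter names.

```javascript
// Sketch of a crawl configuration: a root URL, plain-English navigation
// rules, an extraction instruction, and a Max Pages cap under Options.
const crawlParams = {
  startUrl: "https://example.com/docs",
  navigationInstruction: "Follow documentation links; skip blog and changelog pages",
  extractionInstruction: "Extract the page title and a summary of the main content",
  options: { maxPages: 25 },
};
```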

Logs and Usage

Logs: List returns paginated scrape logs for your account. Filter by status (success, error, in progress) and search by URL or preset name using the Filters collection. Logs: Get returns the full detail of a single log entry including the AI extraction output. Usage: Get Stats returns credit usage, request counts, and bandwidth broken down by day or week. Choose the time window from the Time Range dropdown: Last 7 Days, Last 30 Days, or This Week.
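The Filters collection behaves roughly like the sketch below: a status match combined with a URL substring search. The log entry fields ({ status, url }) are illustrative, not the API's exact response shape.

```javascript
// Filter log entries by status and/or a URL search term, the way the
// Filters collection narrows Logs: List results.
function filterLogs(logs, { status, search } = {}) {
  return logs.filter(
    (log) =>
      (!status || log.status === status) &&
      (!search || log.url.includes(search))
  );
}
```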

Using as an AI tool

The Spidra node has usableAsTool enabled, which means you can connect it directly to an AI Agent node in n8n. The agent can call Spidra to fetch live web data as part of its reasoning without any additional setup on your end.

Error handling

Enable Continue On Fail on the node to prevent a single failed item from stopping the whole workflow. When enabled, errors are returned as { error: "..." } in the output and execution continues with the next item.
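A downstream Code node can split failed items from successes using that error shape. The sample `items` array below stands in for n8n's incoming items; the success-item fields are illustrative.

```javascript
// With Continue On Fail enabled, failed items carry { error: "..." } in
// their json payload while successes carry the scrape result.
const items = [
  { json: { url: "https://a.example", data: { title: "A" } } },
  { json: { error: "Navigation timeout" } },
];

const failures = items.filter((item) => item.json.error);
const successes = items.filter((item) => !item.json.error);
```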