## Installation
Add `spidra` to your list of dependencies in `mix.exs`:
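A minimal `deps` entry might look like this; the `:spidra` package name is taken from the text above, while the version constraint is a placeholder:

```elixir
def deps do
  [
    # Version is a placeholder — pin to the latest published release.
    {:spidra, "~> 0.1"}
  ]
end
```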
Then run `mix deps.get` in your terminal.
Get your API key from app.spidra.io under Settings → API Keys.
Store it as an environment variable — never hardcode it in source.
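One way to wire this up, sketched under the assumption that the SDK reads an `:api_key` from the `:spidra` application config (check the package docs for the actual key name):

```elixir
# config/runtime.exs
import Config

# Fails loudly at boot if SPIDRA_API_KEY is not set.
config :spidra, api_key: System.fetch_env!("SPIDRA_API_KEY")
```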
## Requirements
- Elixir 1.14 or later
- A Spidra API key (sign up free)
## Getting started
The SDK is organized into five modules: `Spidra.Scrape`, `Spidra.Batch`, `Spidra.Crawl`, `Spidra.Logs`, and `Spidra.Usage`.
## Scraping
All scrape jobs run asynchronously on the Spidra platform. `Spidra.Scrape.run/3` submits a job and polls until it finishes. If you need more control, use `submit/2` and `get/2` directly.

Up to 3 URLs can be passed per request, and they are processed in parallel.
### Basic scrape
Submit a scrape job and wait for results.

| Parameter | Type | Description |
|---|---|---|
| `urls` | list | Up to 3 URL maps. Each takes a `url` key and an optional `actions` list |
| `prompt` | string | AI extraction instruction |
| `output` | string | `"markdown"` (default) or `"json"` |
| `schema` | map | JSON Schema for a guaranteed output shape (use with `output: "json"`) |
| `use_proxy` | boolean | Route through a residential proxy |
| `proxy_country` | string | Two-letter country code, e.g. `"us"`, `"de"`, `"jp"` |
| `extract_content_only` | boolean | Strip navigation, ads, and boilerplate before AI extraction |
| `screenshot` | boolean | Capture a screenshot of the page |
| `full_page_screenshot` | boolean | Capture a full-page (scrolled) screenshot |
| `cookies` | string | Raw Cookie header string for authenticated pages |
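A minimal sketch of a blocking scrape. The parameter names come from the table above; the exact `run/3` argument order and the third (options) argument are assumptions:

```elixir
{:ok, result} =
  Spidra.Scrape.run(
    [%{url: "https://example.com/pricing"}],
    %{prompt: "List each plan name with its monthly price", output: "markdown"},
    []
  )

IO.inspect(result)
```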
### Fire-and-forget approach
Use `submit/2` and `get/2` when you want to manage polling yourself.

Scrape job statuses: `waiting` · `active` · `completed` · `failed`
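A sketch of manual polling. The returned job shape (`id`, `status`, `result`) and the `get/2` options argument are assumptions:

```elixir
{:ok, job} =
  Spidra.Scrape.submit(
    [%{url: "https://example.com"}],
    %{prompt: "Summarize the page in two sentences"}
  )

# Poll later, e.g. from a scheduled task.
{:ok, job} = Spidra.Scrape.get(job.id, [])

case job.status do
  "completed" -> IO.inspect(job.result)
  "failed" -> IO.puts("scrape failed")
  status -> IO.puts("still #{status}, check back shortly")
end
```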
### Structured JSON output
Pass a `schema` to enforce an exact output shape. Missing fields come back as `null` rather than hallucinated values.
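A sketch using a JSON Schema to pin the output shape; the call shape follows the earlier examples and is likewise an assumption:

```elixir
schema = %{
  "type" => "object",
  "properties" => %{
    "title" => %{"type" => "string"},
    "price" => %{"type" => "number"}
  },
  "required" => ["title", "price"]
}

{:ok, result} =
  Spidra.Scrape.run(
    [%{url: "https://example.com/product/42"}],
    %{prompt: "Extract the product title and price", output: "json", schema: schema},
    []
  )
```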
### Geo-targeted scraping
Pass `use_proxy: true` and a `proxy_country` code to route the request through a specific country. Useful for geo-restricted content or localized pricing.

Supported codes include `us`, `gb`, `de`, `fr`, `jp`, `au`, `ca`, `br`, `in`, `nl`, `sg`, `es`, `it`, `mx`, and 40+ more. Use `"global"` or `"eu"` for regional routing.
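For example, fetching localized pricing through a German proxy (same assumed call shape as above):

```elixir
{:ok, result} =
  Spidra.Scrape.run(
    [%{url: "https://example.com/pricing"}],
    %{prompt: "Extract the displayed prices and currency", use_proxy: true, proxy_country: "de"},
    []
  )
```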
### Authenticated pages
Pass `cookies` as a string to scrape pages that require a login session.
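A sketch with a raw Cookie header captured from a logged-in browser session (cookie names here are placeholders):

```elixir
cookies = "session_id=abc123; csrf_token=xyz789"

{:ok, result} =
  Spidra.Scrape.run(
    [%{url: "https://example.com/account"}],
    %{prompt: "Extract the current plan and renewal date", cookies: cookies},
    []
  )
```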
### Browser actions

Actions let you interact with the page before the scrape runs. They execute in order, and the scrape happens after all actions complete.

| Action | Required fields | Description |
|---|---|---|
| `click` | `selector` or `value` | Click a button, link, or any element |
| `type` | `selector`, `value` | Type text into an input or textarea |
| `check` | `selector` or `value` | Check a checkbox |
| `uncheck` | `selector` or `value` | Uncheck a checkbox |
| `wait` | `duration` (ms) | Pause for a set number of milliseconds |
| `scroll` | `to` (0–100%) | Scroll the page to a percentage of its height |
| `forEach` | `observe` | Loop over every matched element and process each one |

Use `selector` for a CSS selector or XPath. Use `value` for plain English; Spidra locates the element using AI.
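A sketch that clicks a button, waits, and scrolls before scraping. Field names come from the table; the `action:` key used to name each action is an assumption:

```elixir
actions = [
  # AI-located element via a plain-English value
  %{action: "click", value: "the Load more button"},
  %{action: "wait", duration: 1500},
  # Scroll to the bottom of the page
  %{action: "scroll", to: 100}
]

{:ok, result} =
  Spidra.Scrape.run(
    [%{url: "https://example.com/blog", actions: actions}],
    %{prompt: "List every article title on the page"},
    []
  )
```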
## Batch scraping
Submit up to 50 URLs in a single request. All URLs are processed in parallel. Each URL is a plain string.

Batch statuses: `pending` · `running` · `completed` · `failed` · `cancelled`
As with scraping, use `submit` and `get` on `Spidra.Batch` to submit a batch and poll for results.
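A sketch of a batch round-trip; the exact arities and the returned batch shape are assumptions:

```elixir
{:ok, batch} =
  Spidra.Batch.submit(
    ["https://example.com/a", "https://example.com/b", "https://example.com/c"],
    %{prompt: "Extract the page title"}
  )

{:ok, batch} = Spidra.Batch.get(batch.id)
IO.puts(batch.status)
```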
### Retry failed items
Re-queue only the items that failed; successful items are not re-run.
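A hypothetical sketch; the `retry` function name is a guess, not a documented call:

```elixir
# Hypothetical function name — check the module docs.
{:ok, batch} = Spidra.Batch.retry(batch.id)
```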
### Cancel a batch

Stops all pending items and refunds credits for unprocessed work.

### List past batches
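A hypothetical sketch of listing past batches; the function name and filter keys are guesses:

```elixir
# Hypothetical function name and options.
{:ok, batches} = Spidra.Batch.list(page: 1, limit: 20)
```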
## Crawling
Give Spidra a starting URL and instructions for which links to follow. It discovers pages automatically, extracts structured data from each one, and returns everything when the crawl is done.

| Parameter | Type | Description |
|---|---|---|
| `base_url` | string | Starting URL for the crawl |
| `crawl_instruction` | string | Which links to follow and which to skip |
| `transform_instruction` | string | What to extract from each page |
| `max_pages` | integer | Maximum number of pages to crawl |
| `use_proxy` | boolean | Route through a residential proxy |
| `proxy_country` | string | Two-letter country code, e.g. `"us"` |
| `cookies` | string | Raw Cookie header string for authenticated sites |
Use `submit` and `get` on `Spidra.Crawl` to start a crawl and poll for completion.
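A sketch of a crawl round-trip; the parameter names come from the table above, while the call shapes are assumptions:

```elixir
{:ok, crawl} =
  Spidra.Crawl.submit(%{
    base_url: "https://example.com/docs",
    crawl_instruction: "Follow links under /docs and skip anything under /blog",
    transform_instruction: "Extract the page title and a one-paragraph summary",
    max_pages: 25
  })

{:ok, crawl} = Spidra.Crawl.get(crawl.id)
```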
### Download crawled content
Returns signed S3 URLs for the raw HTML and Markdown of each crawled page. Links expire after 1 hour.

### Re-extract without re-crawling
Apply a new AI prompt to an existing completed crawl without fetching the pages again. Only transformation credits are charged.

## History and stats
### Logs
Every API scrape job is logged automatically. Access your full history with optional filters.

| Parameter | Type | Description |
|---|---|---|
| `status` | string | `"success"` or `"failed"` |
| `search_term` | string | Search by URL or prompt |
| `channel` | string | `"api"` or `"playground"` |
| `date_start` | string | ISO date — return logs on or after this date |
| `date_end` | string | ISO date — return logs on or before this date |
| `page` | integer | Page number (default: 1) |
| `limit` | integer | Results per page (default: 20) |
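A sketch of a filtered history query; the `list` function name is a guess, while the filter keys follow the table above:

```elixir
{:ok, logs} =
  Spidra.Logs.list(
    status: "failed",
    channel: "api",
    date_start: "2024-03-01",
    date_end: "2024-03-31",
    page: 1,
    limit: 20
  )
```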
### Usage statistics
Returns credit and request usage broken down by day or week.

| Range | Description |
|---|---|
| `"7d"` | Last 7 days, one row per day |
| `"30d"` | Last 30 days, one row per day |
| `"weekly"` | Last 7 weeks, one row per week |
## Python
Official Python SDK — async-first with sync wrappers. Works in scripts, Django, Flask, and Jupyter.
## .NET
Official .NET SDK — fully async, typed exceptions, JSON schema support. Requires .NET 8+.

