map[string]any to wrestle with — you get concrete types for everything, errors as values, and zero external dependencies. Just the standard library.
Installation
Get your API key from app.spidra.io under Settings → API Keys.
Store it as an environment variable — never hardcode it in source.
Getting started
The client is organized into five services: client.Scrape, client.Batch, client.Crawl, client.Logs, and client.Usage.
Scraping
Each scrape request can include up to three URLs and runs them in parallel. You tell the AI what to extract through a Prompt, and optionally lock the output shape with a Schema.
The quickest path is Run() — it submits the job, polls until it completes, and returns the full result:
For long-running jobs, call Submit() and Get() separately:
A job moves through waiting → active → completed (or failed).
ScrapeParams fields
| Field | Type | Description |
|---|---|---|
| URLs | []ScrapeURL | Up to 3 URLs. Each can carry its own Actions slice |
| Prompt | string | What to extract, in plain English |
| Output | string | "markdown" (default) or "json" |
| Schema | any | JSON Schema object — locks the output shape when using "json" |
| UseProxy | bool | Route through a residential proxy |
| ProxyCountry | string | Two-letter country code: "us", "de", "jp", etc. |
| ExtractContentOnly | bool | Strip nav, ads, and boilerplate before the AI sees the page |
| Screenshot | bool | Capture a viewport screenshot |
| FullPageScreenshot | bool | Capture a full-page (scrolled) screenshot |
| Cookies | string | Raw Cookie header string for pages behind a login |
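Putting a few of those fields together looks roughly like this. The structs here are local stand-ins re-declared from the table so the example compiles on its own; the real package defines the actual types (and `Actions` here is simplified to a string slice):

```go
package main

import "fmt"

// Local stand-ins for the SDK's types, with field names taken from
// the table above; the real package defines these itself.
type ScrapeURL struct {
	URL     string
	Actions []string // simplified stand-in; see Browser actions below
}

type ScrapeParams struct {
	URLs               []ScrapeURL
	Prompt             string
	Output             string
	UseProxy           bool
	ProxyCountry       string
	ExtractContentOnly bool
	Cookies            string
}

// pricingParams builds a request for a geo-targeted JSON scrape.
func pricingParams() ScrapeParams {
	return ScrapeParams{
		URLs:         []ScrapeURL{{URL: "https://example.com/pricing"}},
		Prompt:       "Extract each plan name and its monthly price",
		Output:       "json",
		UseProxy:     true,
		ProxyCountry: "de",
	}
}

func main() {
	p := pricingParams()
	fmt.Printf("%d URL(s), output=%s\n", len(p.URLs), p.Output)
}
```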
Enforcing an exact output shape
Pass a Schema when you need the output to match a specific structure. Fields the AI can’t find come back as null instead of guessed values:
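Since Schema is typed `any`, one natural way to express it in Go is a `map[string]any` holding a plain JSON Schema document. The property names below (`name`, `price`) are illustrative, not required by the API:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildSchema returns a plain JSON Schema document describing
// the exact shape the output must take.
func buildSchema() map[string]any {
	return map[string]any{
		"type": "object",
		"properties": map[string]any{
			"name":  map[string]any{"type": "string"},
			"price": map[string]any{"type": "number"},
		},
		"required": []string{"name", "price"},
	}
}

func main() {
	b, err := json.MarshalIndent(buildSchema(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```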
Geo-targeted scraping
Some sites return different prices or content based on the visitor’s location. Set UseProxy: true and pick a ProxyCountry to route through a residential IP in that region:
Supported codes include us, gb, de, fr, jp, au, ca, br, in, nl, sg, and 40+ more. Use "global" or "eu" for regional routing without pinning to a specific country.
Scraping pages behind a login
Pass your session cookies as a raw header string. The easiest way to grab this is to open browser devtools, log in, and copy the Cookie header from any authenticated request:
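Before handing a copied Cookie string to the scraper, it can be worth a quick sanity check that it parses the way the browser sent it. This stdlib-only helper round-trips the raw header through `net/http` and lists the cookie names:

```go
package main

import (
	"fmt"
	"net/http"
)

// cookieNames parses a raw Cookie header string — the same format
// you copy out of devtools — and returns the cookie names.
func cookieNames(raw string) []string {
	req, _ := http.NewRequest("GET", "https://example.com", nil)
	req.Header.Set("Cookie", raw)
	var names []string
	for _, c := range req.Cookies() {
		names = append(names, c.Name)
	}
	return names
}

func main() {
	raw := "session_id=abc123; csrf_token=xyz789"
	fmt.Println(cookieNames(raw)) // prints [session_id csrf_token]
}
```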
Browser actions
For pages that need interaction before you can extract anything — accepting cookies, typing into a search input, scrolling to trigger lazy loading — include an Actions slice on the URL. They run in order before the AI sees the page:
For selector, pass a CSS selector or XPath. If you’d rather describe the element in words, use value — Spidra will locate it using AI.
| Action | What it does |
|---|---|
| click | Click any element — CSS selector via selector, plain text via value |
| type | Type into an input or textarea |
| check | Check a checkbox |
| uncheck | Uncheck a checkbox |
| wait | Pause for duration milliseconds |
| scroll | Scroll to a percentage of page height (e.g. "80%") |
| forEach | Loop over every matched element and process each one |
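A sequence combining several of those actions might look like the sketch below. `Action` is a local stand-in whose exact field names may differ from the SDK's, but the table above implies a shape like this:

```go
package main

import "fmt"

// Action is a local stand-in; the exact field names in the SDK may
// differ, but the table above implies a shape like this.
type Action struct {
	Type     string // "click", "type", "wait", "scroll", ...
	Selector string // CSS selector or XPath
	Value    string // text to type, or a plain-text description for AI lookup
	Duration int    // milliseconds, used by "wait"
}

// searchActions accepts the cookie banner, runs a search, and
// scrolls to trigger lazy loading; steps run top to bottom.
func searchActions() []Action {
	return []Action{
		{Type: "click", Value: "Accept cookies button"}, // AI locates by description
		{Type: "type", Selector: "input[name=q]", Value: "mechanical keyboards"},
		{Type: "scroll", Value: "80%"},
		{Type: "wait", Duration: 1500},
	}
}

func main() {
	for i, a := range searchActions() {
		fmt.Printf("%d. %s\n", i+1, a.Type)
	}
}
```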
Controlling how long Run() waits
By default, Run() polls every 3 seconds and times out after 120 seconds. Pass PollOptions to override:
On timeout, Run() returns the result with Status: "timeout" rather than an error — the JobID is still there, so you can keep polling with Get() if you need to wait longer. The same options work on Batch.Run() and Crawl.Run().
Batch scraping
When you have a list of URLs to process, batch is the right tool. Submit up to 50 URLs in a single request and they run in parallel. Unlike the scraper, each URL here is a plain string — no per-URL actions. Each item moves through pending → running → completed (or failed); the batch as a whole follows the same lifecycle plus a cancelled state.
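The 50-URL cap means a longer list has to be split across several batch requests. A small stdlib-only helper for that, with the cap shown as a parameter:

```go
package main

import "fmt"

// chunk splits urls into groups of at most size, ready to submit
// as separate batch requests (the API caps each batch at 50 URLs).
func chunk(urls []string, size int) [][]string {
	var out [][]string
	for len(urls) > size {
		out = append(out, urls[:size])
		urls = urls[size:]
	}
	if len(urls) > 0 {
		out = append(out, urls)
	}
	return out
}

func main() {
	urls := make([]string, 120)
	for i := range urls {
		urls[i] = fmt.Sprintf("https://example.com/p/%d", i)
	}
	batches := chunk(urls, 50)
	fmt.Println(len(batches)) // 3 batches: 50 + 50 + 20
}
```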
For fire-and-forget, use Submit() and come back with Get():
Retrying failures
If some items fail, you can re-queue just those without touching the ones that already succeeded:

Crawling
Crawling is for when you need to cover a whole site or section, not just a handful of URLs. You give it a starting URL and instructions for which links to follow; it discovers pages on its own, extracts data from each one, and hands everything back when it’s done.

CrawlInstruction controls navigation — which links to follow, which to skip. TransformInstruction controls extraction — what to pull from each page. MaxPages is a cap so a crawl doesn’t run forever.
The same UseProxy, ProxyCountry, and Cookies options from the scraper all work here.
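Wiring those knobs together could look like this. `CrawlParams` is a local stand-in whose field names follow the prose above; the struct itself is re-declared only so the example is self-contained:

```go
package main

import "fmt"

// CrawlParams is a local stand-in wiring together the knobs
// described above; the real SDK defines the actual type.
type CrawlParams struct {
	URL                  string
	CrawlInstruction     string // which links to follow, which to skip
	TransformInstruction string // what to pull from each page
	MaxPages             int    // hard cap so the crawl cannot run forever
	UseProxy             bool
	ProxyCountry         string
	Cookies              string
}

// blogCrawl describes a crawl of a blog section: follow post
// links, skip index-style pages, extract three fields per page.
func blogCrawl() CrawlParams {
	return CrawlParams{
		URL:                  "https://example.com/blog",
		CrawlInstruction:     "Follow links to individual posts; skip tag and author pages",
		TransformInstruction: "Extract the title, author, and publish date",
		MaxPages:             100,
	}
}

func main() {
	p := blogCrawl()
	fmt.Println(p.MaxPages, "page cap")
}
```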
For fire-and-forget:
Downloading the raw content
Once a crawl is complete, you can get signed URLs to download the raw HTML and Markdown for every page that was visited. Links expire after an hour:

Re-extracting with a different prompt
Crawling is the expensive part. If you’ve already crawled a site and just want to pull out different information, you don’t have to crawl again — Extract() runs a new AI pass over the already-stored content and only charges transformation credits:
History and stats
Logs
Every scrape request your API key makes is logged automatically. You can query the full history and filter by status, URL, date range, or channel:

Usage statistics
Check how many requests and credits your account has consumed over a period:

"7d" gives one row per day for the past week. "30d" gives the last 30 days. "weekly" gives one row per week for the past seven weeks.
Error handling
All API errors are returned as typed error values. Use errors.As() to match the specific type you care about:
| Type | Status | When |
|---|---|---|
| *AuthenticationError | 401 | The API key is missing or invalid |
| *InsufficientCreditsError | 403 | No credits remaining on the account |
| *RateLimitError | 429 | Too many requests — back off |
| *ServerError | 500 | Unexpected server-side error |
| *SpidraError | any | Base type — all others embed this |
Every error type carries StatusCode int and Message string. *SpidraError is the base — if you only want one catch-all, match against that.
