The Go SDK wraps the Spidra API with fully typed request and response structs. No map[string]any to wrestle with — you get concrete types for everything, errors as values, and zero external dependencies. Just the standard library.
Each scrape request can include up to three URLs and runs them in parallel. You tell the AI what to extract through a Prompt, and optionally lock the output shape with a Schema.

The quickest path is Run(): it submits the job, polls until it completes, and returns the full result:

    job, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
        URLs: []spidra.ScrapeURL{
            {URL: "https://example.com/pricing"},
        },
        Prompt: "Extract all pricing plans with name, price, and included features",
        Output: "json",
    })
    if err != nil {
        return err
    }
    fmt.Println(job.Result.Content)

When you need more control — say you’re building a queue and want to poll on your own schedule — use Submit() and Get() separately:

    job, err := client.Scrape.Submit(ctx, spidra.ScrapeParams{
        URLs:   []spidra.ScrapeURL{{URL: "https://example.com"}},
        Prompt: "Extract the main headline",
    })
    if err != nil {
        return err
    }

    // Come back later and check
    status, err := client.Scrape.Get(ctx, job.JobID)
    if err != nil {
        return err
    }
    if status.Status == "completed" {
        fmt.Println(status.Result.Content)
    }

Jobs move through waiting → active → completed (or failed).
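
If you poll on your own schedule, the loop usually reduces to "call Get() until the status is terminal or the context expires". The following is a minimal sketch of that idea, not part of the SDK: isTerminal, pollUntilDone, and the get closure are hypothetical names, and get stands in for a wrapper around client.Scrape.Get that returns only the status string.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// isTerminal reports whether a job status from the lifecycle above is final.
func isTerminal(status string) bool {
	return status == "completed" || status == "failed"
}

// pollUntilDone calls get immediately and then once per interval until the
// job reaches a terminal status or ctx is done. get is a hypothetical
// stand-in for a closure around client.Scrape.Get.
func pollUntilDone(ctx context.Context, interval time.Duration, get func(context.Context) (string, error)) (string, error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		status, err := get(ctx)
		if err != nil {
			return "", err
		}
		if isTerminal(status) {
			return status, nil
		}
		select {
		case <-ctx.Done():
			// Give the caller the last status seen along with the reason.
			return status, ctx.Err()
		case <-ticker.C:
		}
	}
}

func main() {
	// Fake status source that walks through the lifecycle, one step per call.
	states := []string{"waiting", "active", "completed"}
	i := 0
	get := func(context.Context) (string, error) {
		if i >= len(states) {
			return "", errors.New("no more states")
		}
		s := states[i]
		i++
		return s, nil
	}

	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	status, err := pollUntilDone(ctx, 10*time.Millisecond, get)
	fmt.Println(status, err)
}
```

Passing a context with a deadline keeps the loop bounded even if a job never leaves the waiting or active state.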
Some sites return different prices or content based on the visitor’s location. Set UseProxy: true and pick a ProxyCountry to route through a residential IP in that region:

    job, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
        URLs:         []spidra.ScrapeURL{{URL: "https://www.amazon.de/gp/bestsellers"}},
        Prompt:       "List the top 10 products with name and price",
        UseProxy:     true,
        ProxyCountry: "de",
    })

Supported codes include us, gb, de, fr, jp, au, ca, br, in, nl, sg, and 40+ more. Use "global" or "eu" for regional routing without pinning to a specific country.
Pass your session cookies as a raw header string. The easiest way to grab this is to open browser devtools, log in, and copy the Cookie header from any authenticated request:

    job, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
        URLs:    []spidra.ScrapeURL{{URL: "https://app.example.com/dashboard"}},
        Prompt:  "Extract the monthly revenue and active user count",
        Cookies: "session=abc123; auth_token=xyz789",
    })

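
If the cookies already live in Go code (for example in a cookie jar) rather than in devtools, the raw header string can be assembled with the standard library. This is a sketch under assumptions: cookieHeader and exampleCookies are hypothetical helpers, not SDK functions, and the values are joined as-is without escaping or validation.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// cookieHeader joins cookies into the "name=value; name2=value2" form
// that the Cookies field in the example above uses.
func cookieHeader(cookies []*http.Cookie) string {
	parts := make([]string, 0, len(cookies))
	for _, c := range cookies {
		parts = append(parts, c.Name+"="+c.Value)
	}
	return strings.Join(parts, "; ")
}

// exampleCookies returns the two placeholder cookies from the example above.
func exampleCookies() []*http.Cookie {
	return []*http.Cookie{
		{Name: "session", Value: "abc123"},
		{Name: "auth_token", Value: "xyz789"},
	}
}

func main() {
	fmt.Println(cookieHeader(exampleCookies()))
}
```

The resulting string could then be passed as the Cookies value in ScrapeParams.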
For pages that need interaction before you can extract anything — accepting cookies, typing into a search input, scrolling to trigger lazy loading — include an Actions slice on the URL. They run in order before the AI sees the page:

    job, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
        URLs: []spidra.ScrapeURL{
            {
                URL: "https://example.com/products",
                Actions: []map[string]any{
                    {"type": "click", "selector": "#accept-cookies"},
                    {"type": "wait", "duration": 1000},
                    {"type": "scroll", "to": "80%"},
                },
            },
        },
        Prompt: "Extract all product names and prices visible on the page",
    })

For selector, pass a CSS selector or XPath. If you’d rather describe the element in words, use value instead and Spidra will locate it using AI.

| Action | What it does |
|---|---|
| click | Click any element: CSS selector via selector, or plain text via value |
| type | Type into an input or textarea |
| check | Check a checkbox |
| uncheck | Uncheck a checkbox |
| wait | Pause for duration milliseconds |
| scroll | Scroll to a percentage of page height (e.g. "80%") |
| forEach | Loop over every matched element and process each one |

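
Because actions are plain maps, a typo in a key only shows up at request time. One way to guard against that is a handful of small constructor functions. The sketch below is illustrative only: Action, Click, ClickText, Wait, and Scroll are hypothetical names that are not part of the SDK, and the map keys are the ones used in the example above.

```go
package main

import "fmt"

// Action is an alias, so []Action is the same type as []map[string]any
// and can be assigned directly to an Actions field of that type.
type Action = map[string]any

// Click targets an element by CSS selector or XPath.
func Click(selector string) Action {
	return Action{"type": "click", "selector": selector}
}

// ClickText targets an element by a plain-text description via "value".
func ClickText(description string) Action {
	return Action{"type": "click", "value": description}
}

// Wait pauses for the given number of milliseconds.
func Wait(ms int) Action {
	return Action{"type": "wait", "duration": ms}
}

// Scroll scrolls to a percentage of the page height, e.g. "80%".
func Scroll(to string) Action {
	return Action{"type": "scroll", "to": to}
}

func main() {
	actions := []Action{
		Click("#accept-cookies"),
		Wait(1000),
		Scroll("80%"),
	}
	for _, a := range actions {
		fmt.Println(a)
	}
}
```

The type alias (rather than a new named type) is deliberate: it avoids a conversion when handing the slice to a field declared as []map[string]any.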
On timeout, Run() returns the result with Status: "timeout" rather than an error — the JobID is still there so you can keep polling with Get() if you need to wait longer. The same options work on Batch.Run() and Crawl.Run().
When you have a list of URLs to process, batch is the right tool. Submit up to 50 URLs in a single request and they run in parallel. Unlike the scraper, each URL here is a plain string — no per-URL actions.

    batch, err := client.Batch.Run(ctx, spidra.BatchParams{
        URLs: []string{
            "https://shop.example.com/product/1",
            "https://shop.example.com/product/2",
            "https://shop.example.com/product/3",
        },
        Prompt:   "Extract product name, price, and whether it is in stock",
        Output:   "json",
        UseProxy: true,
    })
    if err != nil {
        return err
    }

    fmt.Printf("%d/%d completed\n", batch.CompletedCount, batch.TotalURLs)
    for _, item := range batch.Items {
        if item.Status == "completed" {
            fmt.Println(item.URL, item.Result)
        } else if item.Status == "failed" {
            fmt.Println("failed:", item.URL, item.Error)
        }
    }

Each item moves through pending → running → completed (or failed). The batch as a whole follows the same lifecycle plus a cancelled state.

For fire-and-forget, use Submit() and come back with Get():

    batch, err := client.Batch.Submit(ctx, spidra.BatchParams{
        URLs:   []string{"https://example.com/1", "https://example.com/2"},
        Prompt: "Extract the page title and meta description",
    })
    if err != nil {
        return err
    }

    // Later...
    result, err := client.Batch.Get(ctx, batch.BatchID)
    if err != nil {
        return err
    }
    fmt.Printf("%s: %d/%d done\n", result.Status, result.CompletedCount, result.TotalURLs)

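
Since a single batch request accepts at most 50 URLs, a longer list has to be split before submission. The sketch below shows one way to do that with plain slices. chunkURLs and maxBatchURLs are hypothetical names, not SDK identifiers; each resulting chunk would go into its own BatchParams.URLs.

```go
package main

import "fmt"

// maxBatchURLs is the per-request limit stated above.
const maxBatchURLs = 50

// chunkURLs splits urls into consecutive slices of at most size elements.
// The chunks share the backing array of urls; nothing is copied.
func chunkURLs(urls []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var chunks [][]string
	for start := 0; start < len(urls); start += size {
		end := start + size
		if end > len(urls) {
			end = len(urls)
		}
		chunks = append(chunks, urls[start:end])
	}
	return chunks
}

func main() {
	// Build 120 placeholder URLs to demonstrate the split.
	urls := make([]string, 120)
	for i := range urls {
		urls[i] = fmt.Sprintf("https://shop.example.com/product/%d", i+1)
	}
	for i, chunk := range chunkURLs(urls, maxBatchURLs) {
		fmt.Printf("batch %d: %d URLs\n", i+1, len(chunk))
	}
}
```

With 120 URLs this yields three chunks of 50, 50, and 20.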
Crawling is for when you need to cover a whole site or section, not just a handful of URLs. You give it a starting URL and instructions for which links to follow; it discovers pages on its own, extracts data from each one, and hands everything back when it’s done.

    job, err := client.Crawl.Run(ctx, spidra.CrawlParams{
        BaseURL:              "https://competitor.com/blog",
        CrawlInstruction:     "Follow links to blog posts only; skip tag pages, category pages, and the homepage",
        TransformInstruction: "Extract the post title, author name, publish date, and a one-sentence summary",
        MaxPages:             30,
        UseProxy:             true,
    })
    if err != nil {
        return err
    }
    for _, page := range job.Result {
        fmt.Println(page.URL, page.Data)
    }

CrawlInstruction controls navigation: which links to follow and which to skip. TransformInstruction controls extraction: what to pull from each page. MaxPages is a cap so a crawl doesn’t run forever.

The same UseProxy, ProxyCountry, and Cookies options from the scraper all work here.

For fire-and-forget:

    job, err := client.Crawl.Submit(ctx, spidra.CrawlParams{
        BaseURL:              "https://example.com/docs",
        CrawlInstruction:     "Follow all documentation pages",
        TransformInstruction: "Extract the page title and a short summary",
        MaxPages:             50,
    })
    if err != nil {
        return err
    }

    // Poll later
    status, err := client.Crawl.Get(ctx, job.JobID)
    if err != nil {
        return err
    }
    // status.Status: "waiting" | "active" | "running" | "completed" | "failed"
    fmt.Println(status.Status)

Crawling is the expensive part. If you’ve already crawled a site and just want to pull out different information, you don’t have to crawl again — Extract() runs a new AI pass over the already-stored content and only charges transformation credits:

    result, err := client.Crawl.Extract(
        ctx,
        completedJobID,
        "Extract product SKUs and prices as structured JSON",
    )
    if err != nil {
        return err
    }

    // This creates a new job; poll it like any other
    extracted, err := client.Crawl.Get(ctx, result.JobID)
    if err != nil {
        return err
    }
    fmt.Println(extracted.Status)

All API errors are returned as typed error values. Use errors.As() to match the specific type you care about:

    import (
        "errors"
        "log"
    )

    _, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
        URLs:   []spidra.ScrapeURL{{URL: "https://example.com"}},
        Prompt: "Extract the main headline",
    })
    if err != nil {
        var authErr *spidra.AuthenticationError
        var credErr *spidra.InsufficientCreditsError
        var rateErr *spidra.RateLimitError
        var srvErr *spidra.ServerError
        var apiErr *spidra.SpidraError

        // Match the specific types first; *SpidraError is the catch-all base.
        switch {
        case errors.As(err, &authErr):
            log.Fatal("invalid or missing API key")
        case errors.As(err, &credErr):
            log.Fatal("account is out of credits")
        case errors.As(err, &rateErr):
            log.Fatal("rate limited, slow down")
        case errors.As(err, &srvErr):
            log.Printf("server error, safe to retry: %s", srvErr.Message)
        case errors.As(err, &apiErr):
            log.Printf("api error %d: %s", apiErr.StatusCode, apiErr.Message)
        default:
            log.Fatal(err)
        }
    }


| Type | Status | When |
|---|---|---|
| *AuthenticationError | 401 | The API key is missing or invalid |
| *InsufficientCreditsError | 403 | No credits remaining on the account |
| *RateLimitError | 429 | Too many requests; back off |
| *ServerError | 500 | Unexpected server-side error |
| *SpidraError | any | Base type; all others embed this |

Every error type exposes StatusCode int and Message string. *SpidraError is the base — if you only want one catch-all, match against that.
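
Rate-limit and server errors are the two cases worth retrying, so a small backoff wrapper is a common companion to the switch above. The following sketch is self-contained and does not use the SDK: apiError is a hypothetical stand-in for the base error type (only the StatusCode and Message fields come from this page), retryable and withRetry are hypothetical helpers, and treating every status of 500 or above as transient is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// apiError is a hypothetical stand-in for the SDK's base error type.
type apiError struct {
	StatusCode int
	Message    string
}

func (e *apiError) Error() string {
	return fmt.Sprintf("api error %d: %s", e.StatusCode, e.Message)
}

// retryable treats 429 and statuses of 500 or above as transient (assumption).
func retryable(err error) bool {
	var apiErr *apiError
	if !errors.As(err, &apiErr) {
		return false
	}
	return apiErr.StatusCode == 429 || apiErr.StatusCode >= 500
}

// withRetry calls fn up to attempts times, doubling the delay after each
// transient failure. Non-transient errors are returned immediately.
func withRetry(attempts int, delay time.Duration, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil || !retryable(err) {
			return err
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return err
}

func main() {
	calls := 0
	err := withRetry(3, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			// Simulate two transient server failures before succeeding.
			return &apiError{StatusCode: 500, Message: "temporary failure"}
		}
		return nil
	})
	fmt.Println(calls, err)
}
```

In real code, fn would wrap a call such as client.Scrape.Run, and retryable would match the SDK's own error types with errors.As instead of the stand-in.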