map[string]any to wrestle with — you get concrete types for everything, errors as values, and zero external dependencies. Just the standard library.
Installation
Get your API key from app.spidra.io under Settings → API Keys.
Store it as an environment variable — never hardcode it in source.
Getting started
The client is organized into five services: client.Scrape, client.Batch, client.Crawl, client.Logs, and client.Usage.
Scraping
Each scrape request can include up to three URLs and runs them in parallel. You tell the AI what to extract through a Prompt, and optionally lock the output shape with a Schema.
The quickest path is Run() — it submits the job, polls until it completes, and returns the full result:
For long-running jobs, call Submit() and Get() separately:
A job moves through waiting → active → completed (or failed).
ScrapeParams fields
| Field | Type | Description |
|---|---|---|
| URLs | []ScrapeURL | Up to 3 URLs. Each can carry its own Actions slice |
| Prompt | string | What to extract, in plain English |
| Output | string | "markdown" (default) or "json" |
| Schema | any | JSON Schema object — locks the output shape when using "json" |
| UseProxy | bool | Route through a residential proxy |
| ProxyCountry | string | Two-letter country code: "us", "de", "jp", etc. |
| ExtractContentOnly | bool | Strip nav, ads, and boilerplate before the AI sees the page |
| Screenshot | bool | Capture a viewport screenshot |
| FullPageScreenshot | bool | Capture a full-page (scrolled) screenshot |
| Cookies | string | Raw Cookie header string for pages behind a login |
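Putting a few of those fields together looks roughly like this. The structs here are local stand-ins re-declared from the table so the example compiles on its own; the real package defines the actual types (and `Actions` here is simplified to a string slice):

```go
package main

import "fmt"

// Local stand-ins for the SDK's types, with field names taken from
// the table above; the real package defines these itself.
type ScrapeURL struct {
	URL     string
	Actions []string // simplified stand-in; see Browser actions below
}

type ScrapeParams struct {
	URLs               []ScrapeURL
	Prompt             string
	Output             string
	UseProxy           bool
	ProxyCountry       string
	ExtractContentOnly bool
	Cookies            string
}

// pricingParams builds a request for a geo-targeted JSON scrape.
func pricingParams() ScrapeParams {
	return ScrapeParams{
		URLs:         []ScrapeURL{{URL: "https://example.com/pricing"}},
		Prompt:       "Extract each plan name and its monthly price",
		Output:       "json",
		UseProxy:     true,
		ProxyCountry: "de",
	}
}

func main() {
	p := pricingParams()
	fmt.Printf("%d URL(s), output=%s\n", len(p.URLs), p.Output)
}
```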
Enforcing an exact output shape
Pass a Schema when you need the output to match a specific structure. Fields the AI can’t find come back as null instead of guessed values:
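Since Schema is typed `any`, one natural way to express it in Go is a `map[string]any` holding a plain JSON Schema document. The property names below (`name`, `price`) are illustrative, not required by the API:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildSchema returns a plain JSON Schema document describing
// the exact shape the output must take.
func buildSchema() map[string]any {
	return map[string]any{
		"type": "object",
		"properties": map[string]any{
			"name":  map[string]any{"type": "string"},
			"price": map[string]any{"type": "number"},
		},
		"required": []string{"name", "price"},
	}
}

func main() {
	b, err := json.MarshalIndent(buildSchema(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```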
Geo-targeted scraping
Some sites return different prices or content based on the visitor’s location. Set UseProxy: true and pick a ProxyCountry to route through a residential IP in that region:
Supported codes include us, gb, de, fr, jp, au, ca, br, in, nl, sg, and 40+ more. Use "global" or "eu" for regional routing without pinning to a specific country.
Scraping pages behind a login
Pass your session cookies as a raw header string. The easiest way to grab this is to open browser devtools, log in, and copy the Cookie header from any authenticated request:
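Before handing a copied Cookie string to the scraper, it can be worth a quick sanity check that it parses the way the browser sent it. This stdlib-only helper round-trips the raw header through `net/http` and lists the cookie names:

```go
package main

import (
	"fmt"
	"net/http"
)

// cookieNames parses a raw Cookie header string — the same format
// you copy out of devtools — and returns the cookie names.
func cookieNames(raw string) []string {
	req, _ := http.NewRequest("GET", "https://example.com", nil)
	req.Header.Set("Cookie", raw)
	var names []string
	for _, c := range req.Cookies() {
		names = append(names, c.Name)
	}
	return names
}

func main() {
	raw := "session_id=abc123; csrf_token=xyz789"
	fmt.Println(cookieNames(raw)) // prints [session_id csrf_token]
}
```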
Browser actions
For pages that need interaction before you can extract anything — accepting cookies, typing into a search input, scrolling to trigger lazy loading — include an Actions slice on the URL. They run in order before the AI sees the page:
For selector, pass a CSS selector or XPath. If you’d rather describe the element in words, use value — Spidra will locate it using AI.
| Action | What it does |
|---|---|
| click | Click any element — CSS selector via selector, plain text via value |
| type | Type into an input or textarea |
| check | Check a checkbox |
| uncheck | Uncheck a checkbox |
| wait | Pause for duration milliseconds |
| scroll | Scroll to a percentage of page height (e.g. "80%") |
| forEach | Loop over every matched element and process each one |
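A sequence combining several of those actions might look like the sketch below. `Action` is a local stand-in whose exact field names may differ from the SDK's, but the table above implies a shape like this:

```go
package main

import "fmt"

// Action is a local stand-in; the exact field names in the SDK may
// differ, but the table above implies a shape like this.
type Action struct {
	Type     string // "click", "type", "wait", "scroll", ...
	Selector string // CSS selector or XPath
	Value    string // text to type, or a plain-text description for AI lookup
	Duration int    // milliseconds, used by "wait"
}

// searchActions accepts the cookie banner, runs a search, and
// scrolls to trigger lazy loading; steps run top to bottom.
func searchActions() []Action {
	return []Action{
		{Type: "click", Value: "Accept cookies button"}, // AI locates by description
		{Type: "type", Selector: "input[name=q]", Value: "mechanical keyboards"},
		{Type: "scroll", Value: "80%"},
		{Type: "wait", Duration: 1500},
	}
}

func main() {
	for i, a := range searchActions() {
		fmt.Printf("%d. %s\n", i+1, a.Type)
	}
}
```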
Controlling how long Run() waits
By default, Run() polls every 3 seconds and times out after 120 seconds. Pass PollOptions to override:
On timeout, Run() returns the result with Status: "timeout" rather than an error — the JobID is still there, so you can keep polling with Get() if you need to wait longer. The same options work on Batch.Run() and Crawl.Run().
Batch scraping
When you have a list of URLs to process, batch is the right tool. Submit up to 50 URLs in a single request and they run in parallel. Unlike the scraper, each URL here is a plain string — no per-URL actions. Each item moves through pending → running → completed (or failed); the batch as a whole follows the same lifecycle plus a cancelled state.
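The 50-URL cap means a longer list has to be split across several batch requests. A small stdlib-only helper for that, with the cap shown as a parameter:

```go
package main

import "fmt"

// chunk splits urls into groups of at most size, ready to submit
// as separate batch requests (the API caps each batch at 50 URLs).
func chunk(urls []string, size int) [][]string {
	var out [][]string
	for len(urls) > size {
		out = append(out, urls[:size])
		urls = urls[size:]
	}
	if len(urls) > 0 {
		out = append(out, urls)
	}
	return out
}

func main() {
	urls := make([]string, 120)
	for i := range urls {
		urls[i] = fmt.Sprintf("https://example.com/p/%d", i)
	}
	batches := chunk(urls, 50)
	fmt.Println(len(batches)) // 3 batches: 50 + 50 + 20
}
```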
For fire-and-forget, use Submit() and come back with Get():
Retrying failures
If some items fail, you can re-queue just those without touching the ones that already succeeded:

Crawling
Crawling is for when you need to cover a whole site or section, not just a handful of URLs. You give it a starting URL and instructions for which links to follow; it discovers pages on its own, extracts data from each one, and hands everything back when it’s done.

CrawlInstruction controls navigation — which links to follow, which to skip. TransformInstruction controls extraction — what to pull from each page. MaxPages is a cap so a crawl doesn’t run forever.
The same UseProxy, ProxyCountry, and Cookies options from the scraper all work here.
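Wiring those knobs together could look like this. `CrawlParams` is a local stand-in whose field names follow the prose above; the struct itself is re-declared only so the example is self-contained:

```go
package main

import "fmt"

// CrawlParams is a local stand-in wiring together the knobs
// described above; the real SDK defines the actual type.
type CrawlParams struct {
	URL                  string
	CrawlInstruction     string // which links to follow, which to skip
	TransformInstruction string // what to pull from each page
	MaxPages             int    // hard cap so the crawl cannot run forever
	UseProxy             bool
	ProxyCountry         string
	Cookies              string
}

// blogCrawl describes a crawl of a blog section: follow post
// links, skip index-style pages, extract three fields per page.
func blogCrawl() CrawlParams {
	return CrawlParams{
		URL:                  "https://example.com/blog",
		CrawlInstruction:     "Follow links to individual posts; skip tag and author pages",
		TransformInstruction: "Extract the title, author, and publish date",
		MaxPages:             100,
	}
}

func main() {
	p := blogCrawl()
	fmt.Println(p.MaxPages, "page cap")
}
```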
For fire-and-forget:
Downloading the raw content
Once a crawl is complete, you can get signed URLs to download the raw HTML and Markdown for every page that was visited. Links expire after an hour:

Re-extracting with a different prompt
Crawling is the expensive part. If you’ve already crawled a site and just want to pull out different information, you don’t have to crawl again — Extract() runs a new AI pass over the already-stored content and only charges transformation credits:
History and stats
Logs
Every scrape request your API key makes is logged automatically. You can query the full history and filter by status, URL, date range, or channel:

Usage statistics
Check how many requests and credits your account has consumed over a period:

"7d" gives one row per day for the past week. "30d" gives the last 30 days. "weekly" gives one row per week for the past seven weeks.
Error handling
All API errors are returned as typed error values. Use errors.As() to match the specific type you care about:
| Type | Status | When |
|---|---|---|
| *AuthenticationError | 401 | The API key is missing or invalid |
| *InsufficientCreditsError | 403 | No credits remaining on the account |
| *RateLimitError | 429 | Too many requests — back off |
| *ServerError | 500 | Unexpected server-side error |
| *SpidraError | any | Base type — all others embed this |
Every error type carries StatusCode int and Message string. *SpidraError is the base — if you only want one catch-all, match against that.
