The Go SDK wraps the Spidra API with fully typed request and response structs. No map[string]any to wrestle with — you get concrete types for everything, errors as values, and zero external dependencies. Just the standard library.

Installation

go get github.com/spidra-io/spidra-go
Get your API key from app.spidra.io under Settings → API Keys. Store it as an environment variable — never hardcode it in source.

Getting started

import spidra "github.com/spidra-io/spidra-go"

client := spidra.New(os.Getenv("SPIDRA_API_KEY"))
If you need to override the base URL (for local dev or self-hosted) or set a custom HTTP timeout:
client := spidra.New(
    os.Getenv("SPIDRA_API_KEY"),
    spidra.WithBaseURL("http://localhost:4321/api"),
    spidra.WithTimeout(30 * time.Second),
)
Everything lives under client.Scrape, client.Batch, client.Crawl, client.Logs, and client.Usage.

Scraping

Each scrape request can include up to three URLs and runs them in parallel. You tell the AI what to extract through a Prompt, and optionally lock the output shape with a Schema. The quickest path is Run() — it submits the job, polls until it completes, and returns the full result:
job, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
    URLs: []spidra.ScrapeURL{
        {URL: "https://example.com/pricing"},
    },
    Prompt: "Extract all pricing plans with name, price, and included features",
    Output: "json",
})
if err != nil {
    return err
}

fmt.Println(job.Result.Content)
When you need more control — say you’re building a queue and want to poll on your own schedule — use Submit() and Get() separately:
job, err := client.Scrape.Submit(ctx, spidra.ScrapeParams{
    URLs:   []spidra.ScrapeURL{{URL: "https://example.com"}},
    Prompt: "Extract the main headline",
})
if err != nil {
    return err
}

// Come back later and check
status, err := client.Scrape.Get(ctx, job.JobID)
if err != nil {
    return err
}

if status.Status == "completed" {
    fmt.Println(status.Result.Content)
}
Jobs move through waiting → active → completed (or failed).
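If you manage the polling yourself, the loop is simple. This is a sketch using a hypothetical pollUntilDone helper of our own (not part of the SDK), with an injectable fetch function so the pattern runs without network access:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// pollUntilDone calls fetch until the job leaves the waiting/active
// states or the attempt budget runs out. Illustrative only; in real use
// fetch would wrap client.Scrape.Get and return status.Status.
func pollUntilDone(fetch func() (string, error), interval time.Duration, maxAttempts int) (string, error) {
	for i := 0; i < maxAttempts; i++ {
		status, err := fetch()
		if err != nil {
			return "", err
		}
		if status == "completed" || status == "failed" {
			return status, nil
		}
		time.Sleep(interval)
	}
	return "", errors.New("job did not finish in time")
}

func main() {
	// Simulate a job that reports waiting, then active, then completed.
	states := []string{"waiting", "active", "completed"}
	i := 0
	fetch := func() (string, error) {
		s := states[i]
		if i < len(states)-1 {
			i++
		}
		return s, nil
	}
	status, err := pollUntilDone(fetch, time.Millisecond, 10)
	fmt.Println(status, err) // completed <nil>
}
```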

ScrapeParams fields

Field                Type         Description
URLs                 []ScrapeURL  Up to 3 URLs; each can carry its own Actions slice
Prompt               string       What to extract, in plain English
Output               string       "markdown" (default) or "json"
Schema               any          JSON Schema object; locks the output shape when using "json"
UseProxy             bool         Route through a residential proxy
ProxyCountry         string       Two-letter country code: "us", "de", "jp", etc.
ExtractContentOnly   bool         Strip nav, ads, and boilerplate before the AI sees the page
Screenshot           bool         Capture a viewport screenshot
FullPageScreenshot   bool         Capture a full-page (scrolled) screenshot
Cookies              string       Raw Cookie header string for pages behind a login

Enforcing an exact output shape

Pass a Schema when you need the output to match a specific structure. Fields the AI can’t find come back as null instead of guessed values:
job, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
    URLs:   []spidra.ScrapeURL{{URL: "https://jobs.example.com/senior-engineer"}},
    Prompt: "Extract the job listing details",
    Output: "json",
    Schema: map[string]any{
        "type":     "object",
        "required": []string{"title", "company", "remote"},
        "properties": map[string]any{
            "title":      map[string]any{"type": "string"},
            "company":    map[string]any{"type": "string"},
            "remote":     map[string]any{"type": []string{"boolean", "null"}},
            "salary_min": map[string]any{"type": []string{"number", "null"}},
            "skills":     map[string]any{"type": "array", "items": map[string]any{"type": "string"}},
        },
    },
})
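Once a schema-locked job completes, you will usually want the result in a concrete Go type. A minimal sketch, assuming job.Result.Content carries the JSON as a string; the JobListing struct and parseListing helper are illustrations, not SDK types. Pointer fields stay nil when the API returns null for something it could not find:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// JobListing mirrors the schema in the request above.
type JobListing struct {
	Title     string   `json:"title"`
	Company   string   `json:"company"`
	Remote    *bool    `json:"remote"`
	SalaryMin *float64 `json:"salary_min"`
	Skills    []string `json:"skills"`
}

// parseListing decodes the schema-locked JSON output into the struct.
func parseListing(content string) (JobListing, error) {
	var l JobListing
	err := json.Unmarshal([]byte(content), &l)
	return l, err
}

func main() {
	// Sample payload shaped like the schema; in real use this would be
	// job.Result.Content.
	content := `{"title":"Senior Engineer","company":"Acme","remote":true,"salary_min":null,"skills":["Go","SQL"]}`
	l, err := parseListing(content)
	if err != nil {
		panic(err)
	}
	fmt.Println(l.Title, l.SalaryMin == nil) // Senior Engineer true
}
```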

Geo-targeted scraping

Some sites return different prices or content based on the visitor’s location. Set UseProxy: true and pick a ProxyCountry to route through a residential IP in that region:
job, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
    URLs:         []spidra.ScrapeURL{{URL: "https://www.amazon.de/gp/bestsellers"}},
    Prompt:       "List the top 10 products with name and price",
    UseProxy:     true,
    ProxyCountry: "de",
})
Supported codes include us, gb, de, fr, jp, au, ca, br, in, nl, sg, and 40+ more. Use "global" or "eu" for regional routing without pinning to a specific country.

Scraping pages behind a login

Pass your session cookies as a raw header string. The easiest way to grab this is to open browser devtools, log in, and copy the Cookie header from any authenticated request:
job, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
    URLs:    []spidra.ScrapeURL{{URL: "https://app.example.com/dashboard"}},
    Prompt:  "Extract the monthly revenue and active user count",
    Cookies: "session=abc123; auth_token=xyz789",
})
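If you already hold cookies as name/value pairs, joining them into the raw header format is a one-liner. The cookieHeader helper below is illustrative, not part of the SDK:

```go
package main

import (
	"fmt"
	"strings"
)

// cookieHeader joins name=value pairs into the raw Cookie header string
// the Cookies field expects.
func cookieHeader(pairs [][2]string) string {
	parts := make([]string, 0, len(pairs))
	for _, p := range pairs {
		parts = append(parts, p[0]+"="+p[1])
	}
	return strings.Join(parts, "; ")
}

func main() {
	h := cookieHeader([][2]string{{"session", "abc123"}, {"auth_token", "xyz789"}})
	fmt.Println(h) // session=abc123; auth_token=xyz789
}
```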

Browser actions

For pages that need interaction before you can extract anything — accepting cookies, typing into a search input, scrolling to trigger lazy loading — include an Actions slice on the URL. They run in order before the AI sees the page:
job, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
    URLs: []spidra.ScrapeURL{
        {
            URL: "https://example.com/products",
            Actions: []map[string]any{
                {"type": "click", "selector": "#accept-cookies"},
                {"type": "wait", "duration": 1000},
                {"type": "scroll", "to": "80%"},
            },
        },
    },
    Prompt: "Extract all product names and prices visible on the page",
})
For selector pass a CSS selector or XPath. If you’d rather describe the element in words, use value — Spidra will locate it using AI.
Action    What it does
click     Click any element; CSS selector via selector, plain text via value
type      Type into an input or textarea
check     Check a checkbox
uncheck   Uncheck a checkbox
wait      Pause for duration milliseconds
scroll    Scroll to a percentage of page height (e.g. "80%")
forEach   Loop over every matched element and process each one
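Putting the action list together, here is a sketch of an Actions slice that mixes a selector-located step with an AI-located one. The "text" key on the type step is an assumption (only type, selector, value, duration, and to are documented above):

```go
package main

import "fmt"

// searchActions builds an Actions slice for a search-then-scroll flow.
// The "value" key asks Spidra to locate the element from a plain-English
// description; "text" on the type step is our assumed key for the input.
func searchActions(query string) []map[string]any {
	return []map[string]any{
		{"type": "click", "value": "the search icon in the header"},
		{"type": "type", "selector": "input[name=q]", "text": query},
		{"type": "wait", "duration": 500},
		{"type": "scroll", "to": "100%"},
	}
}

func main() {
	actions := searchActions("mechanical keyboards")
	fmt.Println(len(actions)) // 4
}
```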

Controlling how long Run() waits

By default Run() polls every 3 seconds and times out after 120 seconds. Pass PollOptions to override:
job, err := client.Scrape.Run(ctx, params, spidra.PollOptions{
    PollInterval: 5 * time.Second,
    Timeout:      60 * time.Second,
})
On timeout, Run() returns the result with Status: "timeout" rather than an error — the JobID is still there so you can keep polling with Get() if you need to wait longer. The same options work on Batch.Run() and Crawl.Run().

Batch scraping

When you have a list of URLs to process, batch is the right tool. Submit up to 50 URLs in a single request and they run in parallel. Unlike the scraper, each URL here is a plain string — no per-URL actions.
batch, err := client.Batch.Run(ctx, spidra.BatchParams{
    URLs: []string{
        "https://shop.example.com/product/1",
        "https://shop.example.com/product/2",
        "https://shop.example.com/product/3",
    },
    Prompt:   "Extract product name, price, and whether it is in stock",
    Output:   "json",
    UseProxy: true,
})
if err != nil {
    return err
}

fmt.Printf("%d/%d completed\n", batch.CompletedCount, batch.TotalURLs)

for _, item := range batch.Items {
    if item.Status == "completed" {
        fmt.Println(item.URL, item.Result)
    } else if item.Status == "failed" {
        fmt.Println("failed:", item.URL, item.Error)
    }
}
Each item moves through pending → running → completed (or failed). The batch as a whole follows the same lifecycle plus a cancelled state. For fire-and-forget, use Submit() and come back with Get():
batch, err := client.Batch.Submit(ctx, spidra.BatchParams{
    URLs:   []string{"https://example.com/1", "https://example.com/2"},
    Prompt: "Extract the page title and meta description",
})
if err != nil {
    return err
}

// Later...
result, err := client.Batch.Get(ctx, batch.BatchID)
fmt.Printf("%s: %d/%d done\n", result.Status, result.CompletedCount, result.TotalURLs)

Retrying failures

If some items fail, you can re-queue just those without touching the ones that already succeeded:
result, err := client.Batch.Get(ctx, batchID)
if err != nil {
    return err
}

if result.FailedCount > 0 {
    if err := client.Batch.Retry(ctx, batchID); err != nil {
        return err
    }
}
To stop a running batch and get credits back for anything that hasn’t started:
if err := client.Batch.Cancel(ctx, batchID); err != nil {
    return err
}
To browse past batches:
page, err := client.Batch.List(ctx, 1, 20) // page, limit

for _, job := range page.Jobs {
    fmt.Printf("%s %s %d/%d\n", job.UUID, job.Status, job.CompletedCount, job.TotalURLs)
}

Crawling

Crawling is for when you need to cover a whole site or section, not just a handful of URLs. You give it a starting URL and instructions for which links to follow; it discovers pages on its own, extracts data from each one, and hands everything back when it’s done.
job, err := client.Crawl.Run(ctx, spidra.CrawlParams{
    BaseURL:              "https://competitor.com/blog",
    CrawlInstruction:     "Follow links to blog posts only — skip tag pages, category pages, and the homepage",
    TransformInstruction: "Extract the post title, author name, publish date, and a one-sentence summary",
    MaxPages:             30,
    UseProxy:             true,
})
if err != nil {
    return err
}

for _, page := range job.Result {
    fmt.Println(page.URL, page.Data)
}
CrawlInstruction controls navigation — which links to follow, which to skip. TransformInstruction controls extraction — what to pull from each page. MaxPages is a cap so a crawl doesn’t run forever. The same UseProxy, ProxyCountry, and Cookies options from the scraper all work here. For fire-and-forget:
job, err := client.Crawl.Submit(ctx, spidra.CrawlParams{
    BaseURL:              "https://example.com/docs",
    CrawlInstruction:     "Follow all documentation pages",
    TransformInstruction: "Extract the page title and a short summary",
    MaxPages:             50,
})

// Poll later
status, err := client.Crawl.Get(ctx, job.JobID)
// status.Status: "waiting" | "active" | "running" | "completed" | "failed"

Downloading the raw content

Once a crawl is complete, you can get signed URLs to download the raw HTML and Markdown for every page that was visited. Links expire after an hour:
result, err := client.Crawl.Pages(ctx, jobID)

for _, page := range result.Pages {
    fmt.Println(page.URL, page.Status)
    // page.HTMLURL     — download the raw HTML
    // page.MarkdownURL — download the cleaned Markdown
}

Re-extracting with a different prompt

Crawling is the expensive part. If you’ve already crawled a site and just want to pull out different information, you don’t have to crawl again — Extract() runs a new AI pass over the already-stored content and only charges transformation credits:
result, err := client.Crawl.Extract(
    ctx,
    completedJobID,
    "Extract product SKUs and prices as structured JSON",
)
if err != nil {
    return err
}

// This creates a new job — poll it like any other
extracted, err := client.Crawl.Get(ctx, result.JobID)

History and stats

history, err := client.Crawl.History(ctx, 1, 10) // page, limit
fmt.Printf("%d total crawl jobs\n", history.Total)

stats, err := client.Crawl.Stats(ctx)
fmt.Printf("%d all-time crawls\n", stats.Total)

Logs

Every scrape request your API key makes is logged automatically. You can query the full history and filter by status, URL, date range, or channel:
result, err := client.Logs.List(ctx, map[string]string{
    "status":     "failed",
    "searchTerm": "amazon.com",
    "dateStart":  "2024-01-01",
    "dateEnd":    "2024-12-31",
    "page":       "1",
    "limit":      "20",
})
if err != nil {
    return err
}

for _, entry := range result.Logs {
    fmt.Println(entry.URLs[0].URL, entry.Status, entry.CreditsUsed)
}
To get the full details of a single log — including the AI output from that job:
entry, err := client.Logs.Get(ctx, logUUID)
fmt.Println(entry.ResultData)

Usage statistics

Check how many requests and credits your account has consumed over a period:
result, err := client.Usage.Get(ctx, "30d") // "7d" | "30d" | "weekly"
if err != nil {
    return err
}

for _, row := range result.Data {
    fmt.Printf("%s: %d requests, %d credits\n", row.Date, row.Requests, row.Credits)
}
"7d" gives one row per day for the past week. "30d" gives the last 30 days. "weekly" gives one row per week for the past seven weeks.
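If you want an aggregate rather than per-row figures, summing is straightforward. UsageRow here is defined locally so the sketch compiles without the SDK; it mirrors the fields read above:

```go
package main

import "fmt"

// UsageRow is a local stand-in for one row of the usage response.
type UsageRow struct {
	Date     string
	Requests int
	Credits  int
}

// totalCredits sums credits across whatever period Usage.Get returned.
func totalCredits(rows []UsageRow) int {
	sum := 0
	for _, r := range rows {
		sum += r.Credits
	}
	return sum
}

func main() {
	rows := []UsageRow{
		{Date: "2024-06-01", Requests: 12, Credits: 48},
		{Date: "2024-06-02", Requests: 7, Credits: 21},
	}
	fmt.Println(totalCredits(rows)) // 69
}
```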

Error handling

All API errors are returned as typed error values. Use errors.As() to match the specific type you care about:
import "errors"

_, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
    URLs:   []spidra.ScrapeURL{{URL: "https://example.com"}},
    Prompt: "Extract the main headline",
})
if err != nil {
    var authErr *spidra.AuthenticationError
    var credErr *spidra.InsufficientCreditsError
    var rateErr *spidra.RateLimitError
    var srvErr  *spidra.ServerError
    var apiErr  *spidra.SpidraError

    switch {
    case errors.As(err, &authErr):
        log.Fatal("invalid or missing API key")
    case errors.As(err, &credErr):
        log.Fatal("account is out of credits")
    case errors.As(err, &rateErr):
        log.Fatal("rate limited — slow down")
    case errors.As(err, &srvErr):
        log.Printf("server error — safe to retry: %s", srvErr.Message)
    case errors.As(err, &apiErr):
        log.Printf("api error %d: %s", apiErr.StatusCode, apiErr.Message)
    default:
        log.Fatal(err)
    }
}
Type                       Status  When
*AuthenticationError       401     The API key is missing or invalid
*InsufficientCreditsError  403     No credits remaining on the account
*RateLimitError            429     Too many requests; back off
*ServerError               500     Unexpected server-side error
*SpidraError               any     Base type; all others embed this
Every error type exposes StatusCode int and Message string. *SpidraError is the base — if you only want one catch-all, match against that.