# Go

> Official Go SDK for Spidra.

The Go SDK wraps the Spidra API with typed request and response structs. Parameters and results are concrete types rather than a blanket `map[string]any` (free-form inputs such as `Schema` and browser `Actions` are the exceptions), errors are returned as values, and there are no external dependencies: just the standard library.

## Installation

```bash theme={null}
go get github.com/spidra-io/spidra-go
```

<Note>
  Get your API key from [app.spidra.io](https://app.spidra.io) under **Settings → API Keys**.
  Store it as an environment variable — never hardcode it in source.
</Note>

## Getting started

```go theme={null}
import (
    "os"

    spidra "github.com/spidra-io/spidra-go"
)

// Read the key from the environment rather than hardcoding it.
client := spidra.New(os.Getenv("SPIDRA_API_KEY"))
```

If you need to override the base URL (for local dev or self-hosted) or set a custom HTTP timeout:

```go theme={null}
client := spidra.New(
    os.Getenv("SPIDRA_API_KEY"),
    spidra.WithBaseURL("http://localhost:4321/api"),
    spidra.WithTimeout(30 * time.Second),
)
```

Everything lives under `client.Scrape`, `client.Batch`, `client.Crawl`, `client.Logs`, and `client.Usage`.

## Scraping

Each scrape request can include up to three URLs and runs them in parallel. You tell the AI what to extract through a `Prompt`, and optionally lock the output shape with a `Schema`.

The quickest path is `Run()` — it submits the job, polls until it completes, and returns the full result:

```go theme={null}
job, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
    URLs: []spidra.ScrapeURL{
        {URL: "https://example.com/pricing"},
    },
    Prompt: "Extract all pricing plans with name, price, and included features",
    Output: "json",
})
if err != nil {
    return err
}

fmt.Println(job.Result.Content)
```

When you need more control — say you're building a queue and want to poll on your own schedule — use `Submit()` and `Get()` separately:

```go theme={null}
job, err := client.Scrape.Submit(ctx, spidra.ScrapeParams{
    URLs:   []spidra.ScrapeURL{{URL: "https://example.com"}},
    Prompt: "Extract the main headline",
})
if err != nil {
    return err
}

// Come back later and check
status, err := client.Scrape.Get(ctx, job.JobID)
if err != nil {
    return err
}

if status.Status == "completed" {
    fmt.Println(status.Result.Content)
}
```

Jobs move through `waiting` → `active` → `completed` (or `failed`).
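If you poll on your own schedule, the loop usually looks something like the sketch below. It is a minimal, standard-library-only illustration: `fetchStatus` and `pollUntilDone` are hypothetical names, and the fake job in `demo` stands in for a real `client.Scrape.Get` call. The only things taken from the SDK are the status strings listed above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// fetchStatus stands in for a call such as client.Scrape.Get; it returns the
// job's current status string. (Hypothetical type, not part of the SDK.)
type fetchStatus func(ctx context.Context) (string, error)

// pollUntilDone calls fetch every interval until the job reaches a terminal
// status ("completed" or "failed") or ctx is cancelled.
func pollUntilDone(ctx context.Context, interval time.Duration, fetch fetchStatus) (string, error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		status, err := fetch(ctx)
		if err != nil {
			return "", err
		}
		switch status {
		case "completed":
			return status, nil
		case "failed":
			return status, errors.New("job failed")
		}

		// Still "waiting" or "active": wait for the next tick or cancellation.
		select {
		case <-ctx.Done():
			return status, ctx.Err()
		case <-ticker.C:
		}
	}
}

// demo drives pollUntilDone with a fake job that completes on the third check.
func demo() (string, int) {
	statuses := []string{"waiting", "active", "completed"}
	calls := 0
	fake := func(ctx context.Context) (string, error) {
		s := statuses[calls]
		calls++
		return s, nil
	}

	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	status, err := pollUntilDone(ctx, 10*time.Millisecond, fake)
	if err != nil {
		return err.Error(), calls
	}
	return status, calls
}

func main() {
	status, calls := demo()
	fmt.Printf("%s after %d checks\n", status, calls)
}
```

Passing a `context` with a deadline keeps the loop from running forever if a job never reaches a terminal state.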

### ScrapeParams fields

| Field                | Type          | Description                                                     |
| -------------------- | ------------- | --------------------------------------------------------------- |
| `URLs`               | `[]ScrapeURL` | Up to 3 URLs. Each can carry its own `Actions` slice            |
| `Prompt`             | `string`      | What to extract, in plain English                               |
| `Output`             | `string`      | `"markdown"` (default) or `"json"`                              |
| `Schema`             | `any`         | JSON Schema object — locks the output shape when using `"json"` |
| `UseProxy`           | `bool`        | Route through a residential proxy                               |
| `ProxyCountry`       | `string`      | Two-letter country code: `"us"`, `"de"`, `"jp"`, etc.           |
| `ExtractContentOnly` | `bool`        | Strip nav, ads, and boilerplate before the AI sees the page     |
| `Screenshot`         | `bool`        | Capture a viewport screenshot                                   |
| `FullPageScreenshot` | `bool`        | Capture a full-page (scrolled) screenshot                       |
| `Cookies`            | `string`      | Raw `Cookie` header string for pages behind a login             |

### Enforcing an exact output shape

Pass a `Schema` when you need the output to match a specific structure. Fields the AI can't find come back as `null` instead of guessed values:

```go theme={null}
job, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
    URLs:   []spidra.ScrapeURL{{URL: "https://jobs.example.com/senior-engineer"}},
    Prompt: "Extract the job listing details",
    Output: "json",
    Schema: map[string]any{
        "type":     "object",
        "required": []string{"title", "company", "remote"},
        "properties": map[string]any{
            "title":      map[string]any{"type": "string"},
            "company":    map[string]any{"type": "string"},
            "remote":     map[string]any{"type": []string{"boolean", "null"}},
            "salary_min": map[string]any{"type": []string{"number", "null"}},
            "skills":     map[string]any{"type": "array", "items": map[string]any{"type": "string"}},
        },
    },
})
```
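Once the output shape is fixed, you can decode it into a struct on your side. The sketch below assumes the extracted result is available to you as JSON bytes (this page doesn't state the exact type of `job.Result.Content`), and `JobListing` and `parseListing` are illustrative names. Pointer fields are one way to tell a `null` apart from a real `false` or `0`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// JobListing mirrors the schema above. Pointer fields distinguish JSON null
// (nil pointer) from a real value such as false or 0.
type JobListing struct {
	Title     string   `json:"title"`
	Company   string   `json:"company"`
	Remote    *bool    `json:"remote"`
	SalaryMin *float64 `json:"salary_min"`
	Skills    []string `json:"skills"`
}

// parseListing decodes a JSON document shaped like the schema.
func parseListing(raw []byte) (JobListing, error) {
	var l JobListing
	err := json.Unmarshal(raw, &l)
	return l, err
}

func main() {
	// Made-up sample payload for illustration, not real API output.
	raw := []byte(`{"title":"Senior Engineer","company":"Example Inc","remote":true,"salary_min":null,"skills":["Go","SQL"]}`)

	l, err := parseListing(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(l.Title, "at", l.Company)
	if l.SalaryMin == nil {
		fmt.Println("salary not listed")
	}
}
```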

### Geo-targeted scraping

Some sites return different prices or content based on the visitor's location. Set `UseProxy: true` and pick a `ProxyCountry` to route through a residential IP in that region:

```go theme={null}
job, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
    URLs:         []spidra.ScrapeURL{{URL: "https://www.amazon.de/gp/bestsellers"}},
    Prompt:       "List the top 10 products with name and price",
    UseProxy:     true,
    ProxyCountry: "de",
})
```

Supported codes include `us`, `gb`, `de`, `fr`, `jp`, `au`, `ca`, `br`, `in`, `nl`, `sg`, and [40+ more](/features/stealth-mode#country-targeting). Use `"global"` or `"eu"` for regional routing without pinning to a specific country.

### Scraping pages behind a login

Pass your session cookies as a raw header string. The easiest way to grab this is to open browser devtools, log in, and copy the `Cookie` header from any authenticated request:

```go theme={null}
job, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
    URLs:    []spidra.ScrapeURL{{URL: "https://app.example.com/dashboard"}},
    Prompt:  "Extract the monthly revenue and active user count",
    Cookies: "session=abc123; auth_token=xyz789",
})
```
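If your cookies live somewhere other than a copied header (a secrets store, environment variables), you may need to assemble the string yourself. This is a small sketch of one way to do it; `cookieHeader` is a hypothetical helper, not part of the SDK, and the values are the placeholders from the example above.

```go
package main

import (
	"fmt"
	"strings"
)

// cookieHeader joins name/value pairs into a raw Cookie header string.
// Pairs keep the order they are given in.
func cookieHeader(pairs [][2]string) string {
	parts := make([]string, 0, len(pairs))
	for _, p := range pairs {
		parts = append(parts, p[0]+"="+p[1])
	}
	return strings.Join(parts, "; ")
}

func main() {
	header := cookieHeader([][2]string{
		{"session", "abc123"},
		{"auth_token", "xyz789"},
	})
	fmt.Println(header) // session=abc123; auth_token=xyz789
}
```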

### Browser actions

For pages that need interaction before you can extract anything — accepting cookies, typing into a search input, scrolling to trigger lazy loading — include an `Actions` slice on the URL. They run in order before the AI sees the page:

```go theme={null}
job, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
    URLs: []spidra.ScrapeURL{
        {
            URL: "https://example.com/products",
            Actions: []map[string]any{
                {"type": "click", "selector": "#accept-cookies"},
                {"type": "wait", "duration": 1000},
                {"type": "scroll", "to": "80%"},
            },
        },
    },
    Prompt: "Extract all product names and prices visible on the page",
})
```

For `selector`, pass a CSS selector or XPath. If you'd rather describe the element in words, use `value` and Spidra will locate it using AI.

| Action    | What it does                                                            |
| --------- | ----------------------------------------------------------------------- |
| `click`   | Click any element — CSS selector via `selector`, plain text via `value` |
| `type`    | Type into an input or textarea                                          |
| `check`   | Check a checkbox                                                        |
| `uncheck` | Uncheck a checkbox                                                      |
| `wait`    | Pause for `duration` milliseconds                                       |
| `scroll`  | Scroll to a percentage of page height (e.g. `"80%"`)                    |
| `forEach` | Loop over every matched element and process each one                    |
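Since actions are plain maps, a typo in a key only shows up at runtime. One option is to wrap the keys in small constructor functions. The helpers below are hypothetical and not part of the SDK; they only use the keys that appear in the examples above (`type`, `selector`, `value`, `duration`, `to`).

```go
package main

import "fmt"

// Action is one browser step, in the same map form the Actions slice uses.
// It is a type alias, so []Action is identical to []map[string]any.
type Action = map[string]any

// Click targets an element by CSS selector or XPath.
func Click(selector string) Action { return Action{"type": "click", "selector": selector} }

// ClickText targets an element by a plain-text description.
func ClickText(description string) Action { return Action{"type": "click", "value": description} }

// Wait pauses for the given number of milliseconds.
func Wait(ms int) Action { return Action{"type": "wait", "duration": ms} }

// ScrollTo scrolls to a percentage of page height, such as "80%".
func ScrollTo(percent string) Action { return Action{"type": "scroll", "to": percent} }

func main() {
	actions := []Action{
		Click("#accept-cookies"),
		Wait(1000),
		ScrollTo("80%"),
	}
	for _, a := range actions {
		fmt.Println(a["type"])
	}
}
```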

### Controlling how long Run() waits

By default `Run()` polls every 3 seconds and times out after 120 seconds. Pass `PollOptions` to override:

```go theme={null}
job, err := client.Scrape.Run(ctx, params, spidra.PollOptions{
    PollInterval: 5 * time.Second,
    Timeout:      60 * time.Second,
})
```

On timeout, `Run()` returns the result with `Status: "timeout"` rather than an error — the `JobID` is still there so you can keep polling with `Get()` if you need to wait longer. The same options work on `Batch.Run()` and `Crawl.Run()`.

## Batch scraping

When you have a list of URLs to process, batch is the right tool. Submit up to 50 URLs in a single request and they run in parallel. Unlike the scraper, each URL here is a plain string — no per-URL actions.

```go theme={null}
batch, err := client.Batch.Run(ctx, spidra.BatchParams{
    URLs: []string{
        "https://shop.example.com/product/1",
        "https://shop.example.com/product/2",
        "https://shop.example.com/product/3",
    },
    Prompt:   "Extract product name, price, and whether it is in stock",
    Output:   "json",
    UseProxy: true,
})
if err != nil {
    return err
}

fmt.Printf("%d/%d completed\n", batch.CompletedCount, batch.TotalURLs)

for _, item := range batch.Items {
    if item.Status == "completed" {
        fmt.Println(item.URL, item.Result)
    } else if item.Status == "failed" {
        fmt.Println("failed:", item.URL, item.Error)
    }
}
```

Each item moves through `pending` → `running` → `completed` (or `failed`). The batch as a whole follows the same lifecycle plus a `cancelled` state.
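If your list is longer than the 50-URL limit, you can split it into groups and submit one batch per group. The `chunk` helper below is a hypothetical, standard-library-only sketch; the URLs are placeholders.

```go
package main

import "fmt"

// chunk splits urls into consecutive groups of at most size elements.
func chunk(urls []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var groups [][]string
	for start := 0; start < len(urls); start += size {
		end := start + size
		if end > len(urls) {
			end = len(urls)
		}
		groups = append(groups, urls[start:end])
	}
	return groups
}

func main() {
	// 120 placeholder URLs, split for a 50-URL-per-batch limit.
	urls := make([]string, 120)
	for i := range urls {
		urls[i] = fmt.Sprintf("https://shop.example.com/product/%d", i+1)
	}
	for i, group := range chunk(urls, 50) {
		fmt.Printf("batch %d: %d URLs\n", i+1, len(group))
	}
}
```

Each group can then be passed as the `URLs` field of its own `BatchParams`.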

For fire-and-forget, use `Submit()` and come back with `Get()`:

```go theme={null}
batch, err := client.Batch.Submit(ctx, spidra.BatchParams{
    URLs:   []string{"https://example.com/1", "https://example.com/2"},
    Prompt: "Extract the page title and meta description",
})
if err != nil {
    return err
}

// Later...
result, err := client.Batch.Get(ctx, batch.BatchID)
if err != nil {
    return err
}
fmt.Printf("%s: %d/%d done\n", result.Status, result.CompletedCount, result.TotalURLs)
```

### Retrying failures

If some items fail, you can re-queue just those without touching the ones that already succeeded:

```go theme={null}
result, err := client.Batch.Get(ctx, batchID)
if err != nil {
    return err
}

if result.FailedCount > 0 {
    if err := client.Batch.Retry(ctx, batchID); err != nil {
        return err
    }
}
```

To stop a running batch and get credits back for anything that hasn't started:

```go theme={null}
if err := client.Batch.Cancel(ctx, batchID); err != nil {
    return err
}
```

To browse past batches:

```go theme={null}
page, err := client.Batch.List(ctx, 1, 20) // page, limit
if err != nil {
    return err
}

for _, job := range page.Jobs {
    fmt.Printf("%s %s — %d/%d\n", job.UUID, job.Status, job.CompletedCount, job.TotalURLs)
}
```

***

## Crawling

Crawling is for when you need to cover a whole site or section, not just a handful of URLs. You give it a starting URL and instructions for which links to follow; it discovers pages on its own, extracts data from each one, and hands everything back when it's done.

```go theme={null}
job, err := client.Crawl.Run(ctx, spidra.CrawlParams{
    BaseURL:              "https://competitor.com/blog",
    CrawlInstruction:     "Follow links to blog posts only — skip tag pages, category pages, and the homepage",
    TransformInstruction: "Extract the post title, author name, publish date, and a one-sentence summary",
    MaxPages:             30,
    UseProxy:             true,
})
if err != nil {
    return err
}

for _, page := range job.Result {
    fmt.Println(page.URL, page.Data)
}
```

`CrawlInstruction` controls navigation — which links to follow, which to skip. `TransformInstruction` controls extraction — what to pull from each page. `MaxPages` is a cap so a crawl doesn't run forever.

The same `UseProxy`, `ProxyCountry`, and `Cookies` options from the scraper all work here.

For fire-and-forget:

```go theme={null}
job, err := client.Crawl.Submit(ctx, spidra.CrawlParams{
    BaseURL:              "https://example.com/docs",
    CrawlInstruction:     "Follow all documentation pages",
    TransformInstruction: "Extract the page title and a short summary",
    MaxPages:             50,
})
if err != nil {
    return err
}

// Poll later
status, err := client.Crawl.Get(ctx, job.JobID)
if err != nil {
    return err
}
fmt.Println(status.Status) // "waiting" | "active" | "running" | "completed" | "failed"
```

### Downloading the raw content

Once a crawl is complete, you can get signed URLs to download the raw HTML and Markdown for every page that was visited. Links expire after an hour:

```go theme={null}
result, err := client.Crawl.Pages(ctx, jobID)
if err != nil {
    return err
}

for _, page := range result.Pages {
    fmt.Println(page.URL, page.Status)
    // page.HTMLURL     — download the raw HTML
    // page.MarkdownURL — download the cleaned Markdown
}
```
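Fetching a signed URL is an ordinary HTTP GET. The sketch below uses only the standard library; `download` is a hypothetical helper, and `demo` serves a fake page from a local test server in place of a real signed link. Treating any non-200 response as an error is an assumption about how an expired link would show up.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// download streams the body at url into w and returns the number of bytes
// written. A non-200 response (for example an expired link) is an error.
func download(ctx context.Context, client *http.Client, url string, w io.Writer) (int64, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return 0, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("download failed: unexpected status %s", resp.Status)
	}
	return io.Copy(w, resp.Body)
}

// demo serves a fake page from a local test server and downloads it.
func demo() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "# Hello")
	}))
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	var buf strings.Builder
	if _, err := download(ctx, srv.Client(), srv.URL, &buf); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	body, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body)
}
```

In real use you would pass `page.HTMLURL` or `page.MarkdownURL` as the URL and an `*os.File` as the writer.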

### Re-extracting with a different prompt

Crawling is the expensive part. If you've already crawled a site and just want to pull out different information, you don't have to crawl again — `Extract()` runs a new AI pass over the already-stored content and only charges transformation credits:

```go theme={null}
result, err := client.Crawl.Extract(
    ctx,
    completedJobID,
    "Extract product SKUs and prices as structured JSON",
)
if err != nil {
    return err
}

// This creates a new job; poll it like any other
extracted, err := client.Crawl.Get(ctx, result.JobID)
if err != nil {
    return err
}
fmt.Println(extracted.Status)
```

### History and stats

```go theme={null}
history, err := client.Crawl.History(ctx, 1, 10) // page, limit
if err != nil {
    return err
}
fmt.Printf("%d total crawl jobs\n", history.Total)
stats, err := client.Crawl.Stats(ctx)
if err != nil {
    return err
}
fmt.Printf("%d all-time crawls\n", stats.Total)
```

## Logs

Every scrape request your API key makes is logged automatically. You can query the full history and filter by status, URL, date range, or channel:

```go theme={null}
result, err := client.Logs.List(ctx, map[string]string{
    "status":     "failed",
    "searchTerm": "amazon.com",
    "dateStart":  "2024-01-01",
    "dateEnd":    "2024-12-31",
    "page":       "1",
    "limit":      "20",
})
if err != nil {
    return err
}

for _, entry := range result.Logs {
    if len(entry.URLs) == 0 {
        continue // guard against indexing an empty slice
    }
    fmt.Println(entry.URLs[0].URL, entry.Status, entry.CreditsUsed)
}
```

To get the full details of a single log — including the AI output from that job:

```go theme={null}
entry, err := client.Logs.Get(ctx, logUUID)
fmt.Println(entry.ResultData)
```

## Usage statistics

Check how many requests and credits your account has consumed over a period:

```go theme={null}
result, err := client.Usage.Get(ctx, "30d") // "7d" | "30d" | "weekly"
if err != nil {
    return err
}

for _, row := range result.Data {
    fmt.Printf("%s: %d requests, %d credits\n", row.Date, row.Requests, row.Credits)
}
```

`"7d"` gives one row per day for the past week. `"30d"` gives the last 30 days. `"weekly"` gives one row per week for the past seven weeks.
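To turn the per-row figures into a total for the period, sum them. The sketch below uses a local `usageRow` struct as a stand-in for the SDK's row type, assuming integer `Requests` and `Credits` fields (the `%d` verbs above suggest that, but the exact types aren't shown here). The sample rows are made up.

```go
package main

import "fmt"

// usageRow is a local stand-in with the three fields the loop above reads.
type usageRow struct {
	Date     string
	Requests int
	Credits  int
}

// totals sums requests and credits across all rows.
func totals(rows []usageRow) (requests, credits int) {
	for _, r := range rows {
		requests += r.Requests
		credits += r.Credits
	}
	return requests, credits
}

func main() {
	// Made-up rows for illustration.
	rows := []usageRow{
		{Date: "2024-06-01", Requests: 12, Credits: 30},
		{Date: "2024-06-02", Requests: 8, Credits: 21},
	}
	req, cred := totals(rows)
	fmt.Printf("%d requests, %d credits\n", req, cred)
}
```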

## Error handling

All API errors are returned as typed error values. Use `errors.As()` to match the specific type you care about:

```go theme={null}
import (
    "errors"
    "log"
)

_, err := client.Scrape.Run(ctx, spidra.ScrapeParams{
    URLs:   []spidra.ScrapeURL{{URL: "https://example.com"}},
    Prompt: "Extract the main headline",
})
if err != nil {
    var authErr *spidra.AuthenticationError
    var credErr *spidra.InsufficientCreditsError
    var rateErr *spidra.RateLimitError
    var srvErr *spidra.ServerError
    var apiErr *spidra.SpidraError

    switch {
    case errors.As(err, &authErr):
        log.Fatal("invalid or missing API key")
    case errors.As(err, &credErr):
        log.Fatal("account is out of credits")
    case errors.As(err, &rateErr):
        log.Fatal("rate limited — slow down")
    case errors.As(err, &srvErr):
        log.Printf("server error — safe to retry: %s", srvErr.Message)
    case errors.As(err, &apiErr):
        log.Printf("api error %d: %s", apiErr.StatusCode, apiErr.Message)
    default:
        log.Fatal(err)
    }
}
```

| Type                        | Status | When                                |
| --------------------------- | ------ | ----------------------------------- |
| `*AuthenticationError`      | 401    | The API key is missing or invalid   |
| `*InsufficientCreditsError` | 403    | No credits remaining on the account |
| `*RateLimitError`           | 429    | Too many requests — back off        |
| `*ServerError`              | 500    | Unexpected server-side error        |
| `*SpidraError`              | any    | Base type — all others embed this   |

Every error type exposes `StatusCode int` and `Message string`. `*SpidraError` is the base — if you only want one catch-all, match against that.
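Because server errors are described as safe to retry, you may want to wrap calls in a retry loop with exponential backoff. The sketch below is standard-library-only: `serverError` is a local stand-in for a retryable type such as `*spidra.ServerError`, and `retry` is a hypothetical helper, not something the SDK provides.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// serverError is a local stand-in for a retryable error type.
type serverError struct{ Message string }

func (e *serverError) Error() string { return e.Message }

// retry runs fn up to attempts times, doubling the delay after each
// retryable failure. Non-retryable errors are returned immediately.
func retry(ctx context.Context, attempts int, base time.Duration, fn func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		var srvErr *serverError
		if !errors.As(err, &srvErr) {
			return err // not retryable
		}
		if i == attempts-1 {
			break // out of attempts, no point sleeping
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(delay):
			delay *= 2
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

// demo fails twice with a retryable error, then succeeds.
func demo() (int, error) {
	calls := 0
	err := retry(context.Background(), 5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return &serverError{Message: "temporary failure"}
		}
		return nil
	})
	return calls, err
}

func main() {
	calls, err := demo()
	fmt.Println(calls, err) // 3 <nil>
}
```

The final error wraps the last failure with `%w`, so callers can still match it with `errors.As()`.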

<CardGroup cols={2}>
  <Card title="PHP" icon="php" href="/sdks/php">
    Official PHP SDK — idiomatic helpers, typed exceptions, and configurable polling.
  </Card>

  <Card title="Ruby" icon="gem" href="/sdks/ruby">
    Official Ruby SDK — pure stdlib, no external dependencies. Works in Rails, Sinatra, and scripts.
  </Card>
</CardGroup>
