The official Swift SDK for Spidra uses modern async/await concurrency throughout. All results come back as structured data ready to feed into your iOS, macOS, or server-side Swift applications.

Installation

Swift Package Manager

Add Spidra to your Package.swift dependencies:
dependencies: [
    .package(url: "https://github.com/spidra-io/spidra-swift.git", from: "1.0.0")
]
Or add it directly via Xcode: File → Add Packages… and paste the repository URL.
Get your API key from app.spidra.io under Settings → API Keys. Never hardcode it in source files — use an environment variable instead.
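A minimal sketch of loading the key from the environment at startup (the SPIDRA_API_KEY variable name is an assumption; use whatever your deployment defines):

import Foundation
import Spidra

// SPIDRA_API_KEY is an assumed variable name, not an SDK convention.
guard let apiKey = ProcessInfo.processInfo.environment["SPIDRA_API_KEY"] else {
    fatalError("SPIDRA_API_KEY is not set")
}
let spidra = SpidraClient(apiKey: apiKey)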

Requirements

  • Swift 5.9+
  • iOS 15.0+ / macOS 12.0+ / tvOS 15.0+ / watchOS 8.0+
  • A Spidra API key (sign up free)

Getting started

import Spidra

let spidra = SpidraClient(apiKey: "spd_YOUR_API_KEY")
From here you access everything through spidra.scrape, spidra.batch, spidra.crawl, spidra.logs, and spidra.usage.

Quick start

import Spidra

Task {
    do {
        let spidra = SpidraClient(apiKey: "spd_YOUR_API_KEY")

        let params = ScrapeParams(
            urls: [ScrapeUrl(url: "https://news.ycombinator.com")],
            prompt: "List the top 5 stories with title, points, and comment count",
            output: "json"
        )

        let job = try await spidra.scrape.run(params)

        if let content = job.result?.content?.value {
            print(content)
        }
    } catch {
        print("Error: \(error.localizedDescription)")
    }
}

Scraping

All scrape jobs run asynchronously using Swift’s async/await. The run() method submits a job and polls until it finishes. Up to 3 URLs can be passed per request, and they are processed in parallel.

Basic scrape

let params = ScrapeParams(
    urls: [ScrapeUrl(url: "https://example.com/pricing")],
    prompt: "Extract all pricing plans with name, price, and included features",
    output: "json"
)

let job = try await spidra.scrape.run(params)
print(job.result?.content?.value ?? "No data")
Parameters

Parameter          | Type        | Description
urls               | [ScrapeUrl] | Up to 3 URLs, each with optional per-URL browser actions
prompt             | String      | AI extraction instruction
output             | String      | "markdown" (default) or "json"
schema             | AnyCodable? | JSON Schema for guaranteed output shape
useProxy           | Bool        | Route through a residential proxy
proxyCountry       | String?     | Two-letter country code, e.g. "us", "de", "jp"
extractContentOnly | Bool        | Strip navigation, ads, and boilerplate before AI extraction
screenshot         | Bool        | Capture a screenshot of the page
fullPageScreenshot | Bool        | Capture a full-page (scrolled) screenshot
cookies            | String?     | Raw Cookie header string for authenticated pages
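As a sketch of how these flags combine (the initializer's argument order here is an assumption):

let params = ScrapeParams(
    urls: [ScrapeUrl(url: "https://example.com/pricing")],
    prompt: "Extract all pricing plans",
    extractContentOnly: true,  // strip nav, ads, and boilerplate first
    fullPageScreenshot: true   // capture the entire scrolled page
)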

Fire-and-forget approach

Use submit() and get() when you want to manage polling yourself.
// Submit a job immediately
let queued = try await spidra.scrape.submit(ScrapeParams(
    urls: [ScrapeUrl(url: "https://example.com")],
    prompt: "Extract the main headline"
))

// Check status later
let status = try await spidra.scrape.get(queued.jobId)
if status.status == "completed" {
    print(status.result?.content?.value ?? "")
}
Job statuses: waiting · active · completed · failed
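Building on the queued job above, a minimal polling loop might look like this; the 2-second interval and 30-attempt cap are arbitrary choices, not SDK defaults:

// Poll until the job leaves the waiting/active states, inside an async context.
for _ in 0..<30 {
    let job = try await spidra.scrape.get(queued.jobId)
    if job.status == "completed" {
        print(job.result?.content?.value ?? "")
        break
    }
    if job.status == "failed" {
        print("Job failed")
        break
    }
    try await Task.sleep(nanoseconds: 2_000_000_000) // wait 2s between checks
}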

Structured JSON output

Pass a schema to enforce an exact output shape. Missing fields come back as null rather than hallucinated values.
let schemaDict: [String: Any] = [
    "type": "object",
    "required": ["title", "company", "remote"],
    "properties": [
        "title":   ["type": "string"],
        "company": ["type": "string"],
        "remote":  ["type": ["boolean", "null"]]
    ]
]

let params = ScrapeParams(
    urls: [ScrapeUrl(url: "https://jobs.example.com/senior-engineer")],
    prompt: "Extract the job listing details",
    output: "json",
    schema: AnyCodable(schemaDict)
)

let job = try await spidra.scrape.run(params)
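Reading the structured fields back out might look like the sketch below; it assumes the decoded value arrives as a [String: Any] dictionary, which may differ from the SDK's actual container type:

// Assumption: content.value bridges to [String: Any] for JSON output.
if let data = job.result?.content?.value as? [String: Any] {
    let title = data["title"] as? String
    let company = data["company"] as? String
    let remote = data["remote"] as? Bool // schema null surfaces as nil here
    print(title ?? "-", company ?? "-", remote?.description ?? "null")
}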

Geo-targeted scraping

Pass useProxy: true and a proxyCountry code to route through a residential IP in that country.
let params = ScrapeParams(
    urls: [ScrapeUrl(url: "https://www.amazon.de/gp/bestsellers")],
    prompt: "List the top 10 products with name and price",
    useProxy: true,
    proxyCountry: "de"
)
Supported codes include us, gb, de, fr, jp, au, ca, br, in, nl, and 40+ more. Use "global" or "eu" for regional routing.

Authenticated pages

Pass cookies as a string to scrape pages that require a login session.
let params = ScrapeParams(
    urls: [ScrapeUrl(url: "https://app.example.com/dashboard")],
    prompt: "Extract the monthly revenue and active user count",
    cookies: "session=abc123; auth_token=xyz789"
)

Browser actions

Actions let you interact with the page before the scrape runs. They execute in order.
let url = ScrapeUrl(
    url: "https://example.com/products",
    actions: [
        .click(selector: "#accept-cookies", value: nil),
        .wait(duration: 1000),
        .scroll(to: "80%")
    ]
)

let params = ScrapeParams(urls: [url], prompt: "Extract product names and prices")
let job = try await spidra.scrape.run(params)
Available actions

Action                     | Description
.click(selector:value:)   | Click a button, link, or any element
.type(selector:value:)    | Type text into an input or textarea
.check(selector:value:)   | Check a checkbox
.uncheck(selector:value:) | Uncheck a checkbox
.wait(duration:)          | Pause for a set number of milliseconds
.scroll(to:)              | Scroll to a percentage of the page height
.forEach(observe:mode:...) | Loop over every matched element and process each

forEach — loop over every element

forEach finds a set of elements and processes each one individually. It's best suited to pagination, clicking into detail pages, and looping over long lists.
let forEachAction = BrowserAction.forEach(
    observe: "Find all book cards in the product grid",
    mode: "inline",
    captureSelector: "article.product_pod",
    maxItems: 20,
    itemPrompt: "Extract title, price, and star rating. Return as JSON",
    waitAfterClick: nil,
    actions: nil,
    pagination: nil
)

let url = ScrapeUrl(
    url: "https://books.toscrape.com/",
    actions: [forEachAction]
)
Modes:
  • inline — Read element content directly without navigating away.
  • navigate — Follow each element’s link to its destination page and capture content there.
  • click — Click each element, capture the content that appears (e.g., a modal), then move on.
You can also use pagination to navigate through multiple pages automatically:
let pagination = BrowserActionPagination(nextSelector: "li.next > a", maxPages: 3)
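Plugged into the forEach action from above, following the "next" link across up to 3 listing pages looks like this:

let pagedForEach = BrowserAction.forEach(
    observe: "Find all book cards in the product grid",
    mode: "inline",
    captureSelector: "article.product_pod",
    maxItems: 20,
    itemPrompt: "Extract title, price, and star rating. Return as JSON",
    waitAfterClick: nil,
    actions: nil,
    pagination: BrowserActionPagination(nextSelector: "li.next > a", maxPages: 3)
)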

Poll options

Override default polling intervals via PollOptions:
let options = PollOptions(pollInterval: 2.0, timeout: 60.0)
let job = try await spidra.scrape.run(params, options: options)
The same options work on batch.run() and crawl.run().

Batch scraping

Submit up to 50 URLs in a single request. All URLs are processed in parallel. Each URL is a plain string.
let params = BatchScrapeParams(
    urls: [
        "https://shop.example.com/product/1",
        "https://shop.example.com/product/2",
        "https://shop.example.com/product/3"
    ],
    prompt: "Extract product name, price, and availability",
    output: "json",
    useProxy: true
)

let batch = try await spidra.batch.run(params)

for item in batch.items {
    if item.status == "completed" {
        print("Completed: \(item.url)")
    } else if item.status == "failed" {
        print("Failed: \(item.error ?? "Unknown")")
    }
}
Item statuses: pending · running · completed · failed
Batch statuses: pending · running · completed · failed · cancelled
You can also list(), retry(), or cancel() batches using the same pattern as scrape.
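A hedged sketch of those calls; the exact parameters and return shapes of list(), retry(), and cancel() are assumptions modeled on the scrape API:

// All identifiers below (no-arg list(), batch.batchId) are assumptions.
let recent = try await spidra.batch.list()               // enumerate recent batches
let again = try await spidra.batch.retry(batch.batchId)  // re-run failed items
try await spidra.batch.cancel(batch.batchId)             // stop a running batch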

Crawling

Given a starting URL, Spidra discovers pages automatically according to your instruction and extracts structured data from each one.
let params = CrawlParams(
    baseUrl: "https://competitor.com/blog",
    crawlInstruction: "Find all blog posts published in 2024",
    transformInstruction: "Extract the title, author, and publish date",
    maxPages: 30
)

let job = try await spidra.crawl.run(params)

if let pages = job.result {
    for page in pages {
        print(page.url, page.data?.value ?? "No Data")
    }
}
Parameters

Parameter            | Type    | Description
baseUrl              | String  | Starting URL for the crawl
crawlInstruction     | String  | Which links to follow and which to skip
transformInstruction | String  | What to extract from each page
maxPages             | Int     | Maximum number of pages to crawl
useProxy             | Bool    | Route through a residential proxy
proxyCountry         | String? | Two-letter country code, e.g. "us"
cookies              | String? | Raw Cookie header string for authenticated sites

Download crawled content

Fetch signed download URLs for the HTML and Markdown versions of every crawled page. Links expire after 1 hour.
let response = try await spidra.crawl.pages(job.jobId)
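Iterating the response might look like the sketch below; the pages, url, htmlUrl, and markdownUrl names are assumptions, not confirmed SDK properties:

// Assumption: the response exposes a collection of per-page download links.
for page in response.pages {
    print(page.url)                        // crawled page address
    print(page.htmlUrl, page.markdownUrl)  // signed links, valid for 1 hour
}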

Logs

Every API scrape job is logged automatically.
let params = ScrapeLogsParams(status: "failed", limit: 20)
let response = try await spidra.logs.list(params)

for log in response.logs {
    print("Log: \(log.uuid) - Status: \(log.status) - Credits: \(log.creditsUsed)")
}

// Get full extraction result for a specific log
let detail = try await spidra.logs.get("log-uuid")

Usage statistics

Returns credit and request usage broken down by day or week.
let rows = try await spidra.usage.get("30d") // "7d" | "30d" | "weekly"

for row in rows {
    print("Date: \(row.date) - Requests: \(row.requests) - Credits: \(row.credits)")
}
Range    | Description
"7d"     | Last 7 days, one row per day
"30d"    | Last 30 days, one row per day
"weekly" | Last 7 weeks, one row per week

Error handling

Every API error throws a SpidraError. Catch the specific case you care about.
do {
    let job = try await spidra.scrape.run(params)
} catch SpidraError.authenticationError(let msg) {
    // 401: API key is missing or invalid
    print("Check your API key: \(msg)")
} catch SpidraError.insufficientCreditsError(let msg) {
    // 403: Monthly credit limit reached
    print("Out of credits: \(msg)")
} catch SpidraError.rateLimitError(let msg) {
    // 429: Too many requests
    print("Rate limited: \(msg)")
} catch SpidraError.serverError(let msg) {
    // 500: Server error
    print("Server error: \(msg)")
} catch {
    // Decoding errors, network timeouts, etc.
    print("Other error: \(error.localizedDescription)")
}
Error case                | Status | When
.authenticationError      | 401    | API key is missing or invalid
.insufficientCreditsError | 403    | No credits remaining
.rateLimitError           | 429    | Too many requests; back off
.serverError              | 500    | Unexpected server-side error
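A minimal retry wrapper with exponential backoff for 429s; the attempt count, base delay, and the ScrapeJob return-type name are assumptions, and it expects a spidra client in scope:

// Retries only on rate limits; all other errors propagate immediately.
func runWithBackoff(_ params: ScrapeParams, attempts: Int = 4) async throws -> ScrapeJob {
    var delay: UInt64 = 1_000_000_000 // 1s, doubled after each rate-limited attempt
    for _ in 0..<(attempts - 1) {
        do {
            return try await spidra.scrape.run(params)
        } catch SpidraError.rateLimitError {
            try await Task.sleep(nanoseconds: delay)
            delay *= 2
        }
    }
    return try await spidra.scrape.run(params) // final attempt: let errors propagate
}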
