Rate Limits

Spidra enforces rate limits to ensure fair usage and platform stability. This page explains the limits and best practices for handling them.

Submission limits are enforced per user account: 60 scrape jobs/minute and 20 batch jobs/minute. Polling (GET) endpoints are not rate-limited.

Two Types of Limits

Spidra enforces two separate limits. Understanding the difference matters when building integrations.

1. Submission Rate Limit (HTTP 429)

Controls how fast you can submit new jobs: 60 scrape submissions per minute and 20 batch submissions per minute, both tracked per user account. Polling an existing job’s status is exempt from this limit. A request over the limit receives this response:

{
  "status": "error",
  "message": "Too many requests, please try again later."
}

The response includes standard rate-limit headers:

Header	Description
`RateLimit-Limit`	Maximum requests allowed in the window
`RateLimit-Remaining`	Requests remaining in current window
`RateLimit-Reset`	Unix timestamp when the limit resets
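These headers make it possible to wait until the window resets instead of guessing a delay. The following is a minimal sketch, not part of the Spidra API: `msUntilReset` is a hypothetical helper, and it assumes `RateLimit-Reset` is a Unix timestamp in seconds (the table does not state the unit).

```javascript
// Hypothetical helper: milliseconds to wait until the rate-limit window resets.
// Assumption: RateLimit-Reset is a Unix timestamp in SECONDS (unit not confirmed above).
// `headers` is anything with a get(name) method, such as response.headers from fetch.
function msUntilReset(headers, nowMs = Date.now(), fallbackMs = 1000) {
  const raw = headers.get('RateLimit-Reset');
  const resetSeconds = Number(raw);

  // Missing or malformed header: fall back to a fixed delay.
  if (raw == null || raw === '' || !Number.isFinite(resetSeconds)) {
    return fallbackMs;
  }

  // Never return a negative delay if the reset time has already passed.
  return Math.max(0, resetSeconds * 1000 - nowMs);
}
```

On a 429 response you could call `msUntilReset(response.headers)` and sleep for that long before resubmitting.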

2. Service Capacity (HTTP 503 with code `SERVICE_BUSY`)

When the platform is under heavy load from all users combined, new jobs are temporarily rejected to protect stability. This is rare but can occur during traffic spikes. Response:

{
  "status": "error",
  "message": "The scraping service is currently at capacity. Please retry in a few minutes.",
  "code": "SERVICE_BUSY",
  "retry_after": 60
}

Wait the number of seconds given in `retry_after` before retrying.

Handling All Limits in Code

async function submitJob(request, maxRetries = 10) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch('https://api.spidra.io/api/scrape', {
      method: 'POST',
      headers: {
        Authorization: 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json'
      },
      body: JSON.stringify(request)
    });

    if (response.ok) return response.json();

    // Error bodies may not be JSON (for example, from a proxy), so parse defensively
    const body = await response.json().catch(() => ({}));

    if (response.status === 503 || body.code === 'SERVICE_BUSY') {
      // Platform at capacity: wait for the server-suggested delay, then retry
      const wait = (body.retry_after ?? 60) * 1000;
      await new Promise(r => setTimeout(r, wait));
      continue;
    }

    if (response.status === 429) {
      // Submission rate limit: exponential backoff, capped at the one-minute window
      const wait = Math.min(Math.pow(2, attempt) * 1000, 60000);
      await new Promise(r => setTimeout(r, wait));
      continue;
    }

    throw new Error(`Request failed (${response.status}): ${body.message ?? response.statusText}`);
  }
  throw new Error('Max retries reached');
}

2. Batch URLs Efficiently

Instead of making multiple single-URL requests, send several URLs in one request, up to your plan’s limit:

{
  "urls": [
    {"url": "https://example.com/page1"},
    {"url": "https://example.com/page2"},
    {"url": "https://example.com/page3"}
  ],
  "prompt": "Extract the title and main content"
}
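When a URL list is longer than one request allows, it can be split into several request bodies of the shape shown above. This is a sketch under assumptions: `buildBatchPayloads` is a hypothetical helper, and `maxPerBatch` stands in for your plan's per-request URL limit, which this page does not specify.

```javascript
// Hypothetical helper: split a list of URL strings into batch request bodies.
// maxPerBatch is an assumed stand-in for your plan's per-request URL limit.
function buildBatchPayloads(urls, prompt, maxPerBatch) {
  if (!Number.isInteger(maxPerBatch) || maxPerBatch < 1) {
    throw new RangeError('maxPerBatch must be a positive integer');
  }

  const payloads = [];
  for (let i = 0; i < urls.length; i += maxPerBatch) {
    payloads.push({
      // Each entry matches the {"url": "..."} objects in the request body above.
      urls: urls.slice(i, i + maxPerBatch).map(url => ({ url })),
      prompt
    });
  }
  return payloads;
}
```

Each returned object can be serialized with `JSON.stringify` and submitted as one request, which counts once against the submission limit instead of once per URL.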

3. Use Crawl for Large Sites

For scraping many pages from the same domain, use the Crawl API instead of multiple scrape requests. One crawl request can process up to 50 pages.

4. Cache Results

Store scrape results locally to avoid redundant requests for the same content.
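One way to do this is a small in-memory cache with a time-to-live, so repeated requests for the same URL and prompt reuse the stored result. The names `ScrapeCache` and `cachedScrape` are hypothetical, and `scrapeFn` is assumed to be any async function of the form `(url, prompt) => result` that you supply; nothing here is a Spidra-provided API.

```javascript
// Hypothetical in-memory cache keyed by URL and prompt, with a time-to-live.
class ScrapeCache {
  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  static key(url, prompt) {
    return JSON.stringify([url, prompt]);
  }

  // Returns the cached value, or undefined if absent or expired.
  get(url, prompt) {
    const k = ScrapeCache.key(url, prompt);
    const entry = this.entries.get(k);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt >= this.ttlMs) {
      this.entries.delete(k);
      return undefined;
    }
    return entry.value;
  }

  set(url, prompt, value) {
    this.entries.set(ScrapeCache.key(url, prompt), {
      value,
      storedAt: this.now()
    });
  }
}

// Calls scrapeFn only on a cache miss.
// Assumed signature: scrapeFn(url, prompt) returns a Promise of the result.
async function cachedScrape(cache, scrapeFn, url, prompt) {
  const hit = cache.get(url, prompt);
  if (hit !== undefined) return hit;
  const result = await scrapeFn(url, prompt);
  cache.set(url, prompt, result);
  return result;
}
```

An in-memory `Map` is lost on restart; for longer-lived results the same interface could sit in front of a file or database store.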

Some limits vary depending on your plan. The number of concurrent URLs per scrape request and the number of actions per URL both increase on higher tiers. See the Plans and Pricing page for a full breakdown.

Increasing Limits

Need higher limits? Contact us to discuss Enterprise plans with custom rate limits and dedicated infrastructure.
