What Are Browser Actions?
When Spidra opens a page, it does not just grab the raw HTML and leave. You can tell it to interact with the page first: clicking buttons, filling in search boxes, scrolling down to load more content, dismissing cookie banners, or looping through every card, accordion, or link on the page. Actions run in the order you provide them, one after the other. Spidra uses a real browser, so anything a human could do on the page, your action pipeline can do too. You add actions in the actions array on each URL object.
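As a rough sketch, a request body with an action pipeline might look like the Python dict below. Only the actions array on each URL object is described here; the urls, url, and type key names, the example.com address, and the selectors are assumptions made for illustration and should be checked against the API reference.

```python
import json

# Hypothetical request body. "urls", "url", and "type" are assumed key names;
# the address and selectors are placeholders.
payload = {
    "urls": [
        {
            "url": "https://example.com/products",
            "actions": [
                # Actions run top to bottom, one after the other.
                {"type": "click", "value": "Accept cookies button"},
                {"type": "type", "selector": "input[name='q']", "value": "laptop"},
                {"type": "click", "selector": "button[type='submit']"},
                {"type": "wait", "duration": 1500},
            ],
        }
    ]
}

print(json.dumps(payload, indent=2))
```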
How to Target Elements
Most actions need to know which element on the page to interact with. There are two ways to point Spidra at an element.
CSS or XPath selectors (the selector field)
If you know the page structure, CSS selectors are the most precise and reliable way to target an element.
Plain English descriptions (the value field)
You can also describe the element in plain English and Spidra will find it on the page for you. This is useful when CSS selectors are fragile or change between page loads.
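The two targeting styles can be sketched side by side. The selector and value fields come from the text above; the type key and the specific selectors and descriptions are hypothetical.

```python
# Three ways to target the same hypothetical "Load more" button.
by_css = {"type": "click", "selector": "button.load-more"}
by_xpath = {"type": "click", "selector": "//button[text()='Load more']"}
by_description = {"type": "click", "value": "the Load more button below the results"}


def click_target(action: dict) -> str:
    """Report which targeting field a click action relies on."""
    if "selector" in action:
        return "selector"
    if "value" in action:
        return "value"
    raise ValueError("a click action needs either selector or value")


for action in (by_css, by_xpath, by_description):
    print(click_target(action))
```

A selector is the better fit when the markup is stable; a description is a reasonable fallback when class names change between page loads.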
Actions Reference
click
Clicks any element on the page. This works for buttons, links, tabs, dropdowns, toggles, and anything else that responds to a click. Either selector or value is required.
Common uses:
- Dismiss cookie consent banners before the real content loads
- Click “Load More” to reveal additional results
- Open dropdown menus or select tabs
- Navigate to the next page
type
Types text into an input field, textarea, or search box. Both selector and value are required.
check
Checks a checkbox. If the checkbox is already checked, nothing happens.
uncheck
Unchecks a checkbox. If the checkbox is already unchecked, nothing happens.
wait
Pauses the scrape for a set number of milliseconds. Use this after actions that trigger loading, animations, or data fetching. The duration field sets the wait time in milliseconds.
The older value field (e.g. "value": 2000) is still accepted but deprecated. Use duration going forward.
scroll
Scrolls the page to a percentage of its total height. This is essential for pages that load content as you scroll (infinite scroll, lazy-loaded images). The to field takes a number or string percentage between 0 and 100.
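For infinite-scroll pages, one possible approach is to alternate scroll and wait actions so each batch of content has time to load. The helper below is a sketch, not part of Spidra; scroll_steps is a hypothetical name and the type key is an assumption.

```python
def scroll_steps(stops, pause_ms=1000):
    """Build alternating scroll and wait actions.

    stops: percentages of the page height (0 to 100) to scroll to, in order.
    pause_ms: how long to wait after each scroll, in milliseconds.
    """
    actions = []
    for pct in stops:
        if not 0 <= pct <= 100:
            raise ValueError(f"scroll target must be between 0 and 100, got {pct}")
        actions.append({"type": "scroll", "to": pct})
        actions.append({"type": "wait", "duration": pause_ms})
    return actions


steps = scroll_steps([25, 50, 75, 100], pause_ms=800)
print(len(steps))
```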
The older value field (e.g. "value": "80%") is still accepted but deprecated. Use to going forward.
forEach: Process Every Element on a Page
forEach is the most powerful action in Spidra. Rather than scraping a page once and hoping all the data is there, forEach finds a set of matching elements on the page and processes each one individually. It then combines all the results into one output.
Think of it as running a mini scrape on every item in a list.
Adding forEach
forEach is an action, so it goes inside the actions array along with any other actions you want to run first. Use the observe field to describe which elements to find.
Any actions listed before forEach run once on the page first. The forEach then runs on whatever state the page is in after those actions complete.
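A minimal sketch of a URL object that runs two ordinary actions and then a forEach. It assumes that forEach is written as an action whose type key is "forEach" with its fields placed directly on the action object; that layout, the url key, and the address are assumptions.

```python
url_object = {
    "url": "https://example.com/",  # placeholder address
    "actions": [
        # These run once, before forEach starts.
        {"type": "click", "value": "Accept cookies button"},
        {"type": "scroll", "to": 100},
        # forEach then works on the page as the actions above left it.
        {
            "type": "forEach",
            "observe": "product cards in the results grid",
        },
    ],
}

pre_actions = [a for a in url_object["actions"] if a["type"] != "forEach"]
print(len(pre_actions))
```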
Writing a good observe instruction
The observe field is the most important part of forEach. It tells Spidra which elements to find on the page. Vague instructions produce inconsistent results.
Describe what the elements are, not what you want to do with them. For example, “all product cards in the results grid” names the elements, while “get the prices” names a task.
A note on captureSelector: the observe instruction helps Spidra locate the set of elements to iterate over, but the captureSelector CSS is what actually reads each element’s content. Keep them consistent. If observe says “product cards”, then captureSelector should point at the same element (e.g. article.product_pod), not a child element.
Do you actually need forEach?
Before reaching for forEach, consider whether a top-level prompt is enough.
If the list is short and fits on one page, you usually do not need forEach at all. Just scrape the URL and use prompt to extract what you need from the full page. It is simpler and works just as well.
forEach is worth the extra setup when:
- You need to collect items across multiple pages using pagination. A top-level prompt only sees the page it lands on. It cannot follow next-page links on its own.
- You have 20 or more items and want itemPrompt to extract fields from each one individually, keeping each AI call small and the output consistently structured.
- You are using navigate or click mode to access content that is only available after clicking into each item.
The three forEach modes
The mode field tells Spidra how to interact with each element it finds.
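To make the comparison concrete, here is one hypothetical forEach action per mode; each mode is described in detail in the sections that follow. The type key is an assumed name, and the observe and captureSelector strings are illustrative.

```python
# inline: read each matched element where it sits on the page.
inline_cards = {
    "type": "forEach",
    "observe": "product cards",
    "mode": "inline",
    "captureSelector": "article.product_pod",
}

# navigate: follow each link and capture the destination page.
navigate_links = {
    "type": "forEach",
    "observe": "product title links",
    "mode": "navigate",
    "captureSelector": "the main product details section",
}

# click: click each element to expand it, capture, then move on.
click_rows = {
    "type": "forEach",
    "observe": "FAQ question rows",
    "mode": "click",
}

modes = [cfg["mode"] for cfg in (inline_cards, navigate_links, click_rows)]
print(modes)
```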
inline mode: Read the element directly
Use this when the data is visible on the page inside each matched element, and you do not need to click anything. Typical cases are product cards, quote blocks, search result rows, and table rows. Spidra reads each element’s content and moves on. The page is not changed between items.
For a short single-page list, inline mode without pagination or itemPrompt gives you structured output, but a top-level prompt on the raw page produces equivalent results with less setup. The main reasons to use inline are pagination across pages and per-item AI extraction at scale (20+ items).
navigate mode: Follow each link to its destination page
Use this when each element is a link and the content you want lives on the page it points to. Typical cases are product listings, search results, and category pages where clicking a card takes you to the full detail page. Spidra clicks each element, loads the destination page, captures it, then returns and moves to the next element.
click mode: Click to expand, capture, then move on
Use this for pages where clicking an element opens more content within the same page. Typical cases are hotel room cards that open modals, FAQ rows that expand, and product variant selectors that reveal details. Spidra clicks each element, waits for the expanded content to appear, captures it, then closes and moves to the next.
If no captureSelector is provided, Spidra first tries to find an open modal ([role="dialog"]) and falls back to capturing the full page.
captureSelector
This tells Spidra which part of the page to read for each item. Without it, Spidra captures whatever is most relevant by default, but being specific gives you cleaner, more focused results. You can use a CSS selector or a plain English description.
If you leave captureSelector out, the default depends on the mode:
- In click mode: Spidra looks for an open modal first, then captures the full page
- In navigate mode: Spidra captures the full destination page
- In inline mode: Spidra captures the full HTML of each matched element
itemPrompt: Extract specific fields from each item
itemPrompt runs an AI extraction on each item individually, right after it is captured. You tell it exactly what fields to pull out and in what format.
This is different from the top-level prompt, which runs once on the full combined output after all items are collected.
Why use itemPrompt:
- Each item is processed on its own, so the AI has full focus on just that one item
- Very useful in navigate mode where each destination page has a lot of unrelated content
- Keeps output clean and structured per item from the start
- The top-level prompt still runs afterwards if you also provide one
Without itemPrompt, each item’s raw captured content is returned as markdown text, and your top-level prompt handles all the extraction at the end.
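A sketch of a request that uses itemPrompt together with an optional top-level prompt. The itemPrompt, prompt, and output names come from this page; placing prompt and output at the top level of the body, the urls, url, and type keys, and the prompt wording are all assumptions.

```python
payload = {
    "urls": [
        {
            "url": "https://example.com/catalogue",  # placeholder address
            "actions": [
                {
                    "type": "forEach",
                    "observe": "book title links",
                    "mode": "navigate",
                    # Runs on each captured item, one at a time.
                    "itemPrompt": "Return the title, price, and stock status as JSON.",
                }
            ],
        }
    ],
    # Optional: runs once on the combined output after every item is collected.
    "prompt": "Sort the books by price, lowest first.",
    "output": "json",
}

for_each = payload["urls"][0]["actions"][0]
print(for_each["itemPrompt"])
```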
pagination: Keep going to the next page
After processing all elements on the current page, forEach can follow the next-page link and continue collecting until it hits your limit or runs out of pages. You need to tell it which button or link goes to the next page.
Pagination stops when Spidra has collected maxItems total across all pages, when it has visited maxPages additional pages, or when there is no next page button left.
nextSelector examples:
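The original samples are not reproduced here, so the values below are illustrative stand-ins. nextSelector accepts either a CSS selector or a plain English description; nesting nextSelector and maxPages under a pagination object is inferred from the dotted field names in the field reference and is an assumption. Since maxPages counts pages beyond the first, the small helper shows the resulting page total.

```python
# Hypothetical pagination settings, one per targeting style.
by_css = {"nextSelector": "li.next > a", "maxPages": 3}
by_description = {"nextSelector": "the Next page button", "maxPages": 3}

for_each = {
    "type": "forEach",  # assumed key name
    "observe": "product cards",
    "mode": "inline",
    "pagination": by_css,
}


def total_pages(max_pages: int) -> int:
    """Starting page plus max_pages additional pages."""
    return 1 + max_pages


print(total_pages(for_each["pagination"]["maxPages"]))  # 4
```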
maxPages is the number of extra pages beyond the first one. Setting maxPages: 3 means Spidra processes the starting page plus 3 more, so 4 pages total.
Per-element actions: Do something on each item before capturing
The actions field inside forEach lets you run browser actions after landing on each item but before capturing its content. This is useful when the destination page needs a scroll to load the full content, an extra click to expand a section, or a moment to settle before reading.
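A sketch of per-element actions, assuming the nested actions array takes the same action objects as the top-level one. The type key, the observe text, and the button description are hypothetical.

```python
for_each = {
    "type": "forEach",
    "observe": "article headline links",
    "mode": "navigate",
    # Runs on every destination page, after landing and before capture.
    "actions": [
        {"type": "scroll", "to": 100},
        {"type": "click", "value": "Read more button"},
        {"type": "wait", "duration": 500},
    ],
}

print([a["type"] for a in for_each["actions"]])
```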
maxItems and waitAfterClick
maxItems sets a cap on how many elements to process. The default is 50 and the maximum allowed is also 50. This limit applies across all pages combined when you use pagination.
waitAfterClick is how long to wait in milliseconds after clicking or navigating before capturing the content. The default is 2500ms. You can lower this for fast static pages or raise it for pages that fetch content from an API after load.
Advanced Patterns
Pattern 1: Click to a category first, then forEach over its items
The pre-actions run once when the URL loads. The forEach runs on whatever page the browser is on after those actions finish.
Pattern 2: Click to a category using plain English, then forEach with pagination
When you do not know the CSS selector for a navigation element, describe it in plain English using the value field on a click action. This is the same click action covered in the Actions Reference, so no special setup is needed.
Pattern 3: Scrape multiple categories at the same time
Pass up to 3 URL objects in a single request and they are all processed in parallel. Each URL has its own forEach config.
The data array in the response has three entries, one per URL, each with its 4 extracted books.
Maximum 3 URLs per request. Each URL counts as 1 credit.
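One way to build such a request is a small helper that enforces the 3-URL limit up front. build_category_request is a hypothetical name, the category addresses are placeholders, and the urls, url, and type keys are assumptions.

```python
MAX_URLS_PER_REQUEST = 3  # limit stated above


def build_category_request(category_urls, max_items=4):
    """Build one request body with a separate forEach config per URL."""
    if not 1 <= len(category_urls) <= MAX_URLS_PER_REQUEST:
        raise ValueError(
            f"expected 1 to {MAX_URLS_PER_REQUEST} URLs, got {len(category_urls)}"
        )
    return {
        "urls": [
            {
                "url": url,
                "actions": [
                    {
                        "type": "forEach",
                        "observe": "book cards in the listing",
                        "mode": "inline",
                        "maxItems": max_items,
                    }
                ],
            }
            for url in category_urls
        ]
    }


request = build_category_request(
    [
        "https://example.com/category/a",
        "https://example.com/category/b",
        "https://example.com/category/c",
    ]
)
credits = len(request["urls"])  # each URL counts as 1 credit
print(credits)  # 3
```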
Pattern 4: Click to category, navigate each book, scroll to load full description
Pre-action clicks into a category. forEach navigates into each book’s detail page. A per-element scroll reveals the full description. The AI extracts everything.
What happens, step by step:
- Opens the homepage in a real browser
- Clicks the Poetry category link
- Finds all book title links on the category page
- For each book (up to 3): opens the book page, waits 1 second, scrolls down 50% to reveal the full description, captures the product section, runs AI to extract the four fields
- Combines all results into a single output with numbered items
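The steps above could be expressed roughly as the request below. This is a reconstruction under assumptions, not the original sample: the homepage address is a placeholder, the urls, url, and type keys are assumed, the 1 second wait is modelled with waitAfterClick, and the four fields named in itemPrompt are hypothetical because the original prompt is not shown.

```python
payload = {
    "urls": [
        {
            "url": "https://example.com/",  # placeholder homepage
            "actions": [
                # Pre-action: runs once, before forEach.
                {"type": "click", "value": "Poetry category link"},
                {
                    "type": "forEach",
                    "observe": "book title links",
                    "mode": "navigate",
                    "maxItems": 3,
                    "waitAfterClick": 1000,  # wait 1 second after opening each book
                    # Per-element action: scroll to reveal the full description.
                    "actions": [{"type": "scroll", "to": 50}],
                    "captureSelector": "the main product section",
                    # Hypothetical field list.
                    "itemPrompt": "Extract title, price, availability, and description.",
                },
            ],
        }
    ]
}

pre_action, for_each = payload["urls"][0]["actions"]
print(for_each["mode"], for_each["maxItems"])
```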
Response Format
All forEach results are returned in the markdownContent field of each URL’s result. Items are numbered from 1 and separated by ---.
Without itemPrompt, you get the raw captured content per item.
With itemPrompt, each item has already been extracted by the AI before being combined.
With a top-level prompt and output: "json", the AI reads the combined output and returns a clean final JSON array in the content field of the response.
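If you need the items individually on the client side, the combined markdownContent can be split on the --- separator. The sketch below assumes the separator sits on a line of its own; the sample text and its "Item N" headings are made up, since the exact numbering format is not shown here.

```python
import re


def split_items(markdown_content: str) -> list[str]:
    """Split combined forEach output on lines that contain only '---'."""
    parts = re.split(r"^[ \t]*---[ \t]*$", markdown_content, flags=re.MULTILINE)
    return [part.strip() for part in parts if part.strip()]


# Made-up sample in an assumed layout.
sample = "## Item 1\nFirst item text\n\n---\n\n## Item 2\nSecond item text\n"
items = split_items(sample)
print(len(items))  # 2
```

Raw captured markdown can itself contain a --- line (a horizontal rule), so treat this as a convenience for well-behaved output rather than a strict parser.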
itemPrompt vs Top-level prompt
Both are optional but they serve different purposes and work well together.

| | itemPrompt | Top-level prompt |
|---|---|---|
| When it runs | Right after each item is captured, during scraping | Once, after all items are collected and combined |
| What it sees | Only that one item’s content | All items together |
| Best for | Per-item field extraction, cleaning up noisy pages | Final restructuring, filtering, or summarising all results |
| Where output appears | Each item’s section in result.data[].markdownContent | result.content in the response |
itemPrompt runs during scraping, one item at a time. The top-level prompt runs after all scraping is finished, on the full combined output. If you use both, itemPrompt cleans up each item first, then the top-level prompt does a final pass on all the cleaned results together.
Limits
| Setting | Default | Maximum |
|---|---|---|
| maxItems | 50 | 50 |
| pagination.maxPages | 5 | 10 |
| URLs per request | 1 | 3 |
Full forEach Field Reference
| Field | Type | Required | Description |
|---|---|---|---|
| observe | string | Yes | Plain English description of the elements to find and process |
| mode | string | No | How to interact with each element. One of inline, navigate, or click. Defaults to click. |
| captureSelector | string | No | CSS selector or plain English description of which part of the page to capture per item |
| maxItems | number | No | Maximum number of elements to process. Default is 50, maximum is 50. |
| waitAfterClick | number | No | Milliseconds to wait after clicking or navigating before capturing. Default is 2500. |
| actions | array | No | Browser actions to run on each item after clicking or navigating, before capturing |
| itemPrompt | string | No | AI extraction prompt to run on each item individually |
| pagination.nextSelector | string | Yes if using pagination | CSS selector or plain English description of the next page button |
| pagination.maxPages | number | No | How many additional pages to process beyond the first. Default is 5, maximum is 10. |
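
A client-side check along these lines can catch configuration mistakes before a request is sent. It is a sketch built from the tables above, not part of Spidra: validate_for_each is a hypothetical helper, it assumes pagination is a nested object, and it only checks the documented maximums and required fields.

```python
VALID_MODES = {"inline", "navigate", "click"}
MAX_ITEMS = 50
MAX_PAGES = 10


def validate_for_each(cfg: dict) -> list[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    if not cfg.get("observe"):
        problems.append("observe is required")
    if cfg.get("mode", "click") not in VALID_MODES:
        problems.append("mode must be inline, navigate, or click")
    if cfg.get("maxItems", 50) > MAX_ITEMS:
        problems.append(f"maxItems cannot exceed {MAX_ITEMS}")
    pagination = cfg.get("pagination")
    if pagination is not None:
        if not pagination.get("nextSelector"):
            problems.append("pagination.nextSelector is required when paginating")
        if pagination.get("maxPages", 5) > MAX_PAGES:
            problems.append(f"pagination.maxPages cannot exceed {MAX_PAGES}")
    return problems


good = {"observe": "product cards", "mode": "inline", "maxItems": 20}
bad = {"mode": "hover", "maxItems": 80, "pagination": {"maxPages": 12}}
print(validate_for_each(good))
print(len(validate_for_each(bad)))
```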

