Skip to content

Browser Setup and Testing

Use this guide when you need browser automation in Tandem and want to verify the full path from sidecar install to engine readiness.

What tandem-browser is

tandem-browser is the Chromium automation sidecar used by tandem-engine. It is not the same thing as the control panel browser UI or the Tauri desktop app.

The runtime model is:

  • tandem-browser runs on the same host as tandem-engine
  • a Chromium-family browser must also exist on that host
  • desktop and control panel query the engine for readiness instead of hosting the browser sidecar themselves

Prerequisites

  • Rust stable if you are building from source
  • tandem-engine and tandem-browser on the same host
  • Chrome, Chromium, Edge, or Brave on that host
  • Linux only: the shared libraries Chromium needs

Linux: install Chromium

On Debian/Ubuntu hosts, install Chromium with:

Terminal window
sudo apt update && sudo apt install -y chromium

Then verify:

Terminal window
tandem-browser doctor --json

If the browser binary is not on PATH, set:

  • TANDEM_BROWSER_EXECUTABLE
  • or browser.executable_path in config

Current setup paths

Engine-managed install

This is the preferred operator path.

Install the sidecar for the current Tandem version:

Terminal window
tandem-engine browser install

This downloads the matching tandem-browser release asset from GitHub Releases and installs it on the engine host.

Then verify readiness:

Terminal window
tandem-engine browser doctor --json

If the sidecar is installed but Chromium is missing, the doctor output will usually show browser_not_found.

Build from source

From the repo root:

Terminal window
cargo build -p tandem-browser
./target/debug/tandem-browser --version
./target/debug/tandem-browser doctor --json

Build a release binary when you want a realistic size measurement:

Terminal window
cargo build --release -p tandem-browser
ls -lh target/release/tandem-browser

Config keys

Browser automation is controlled through the browser config block or equivalent env vars.

{
"browser": {
"enabled": true,
"sidecar_path": "/path/to/tandem-browser",
"executable_path": "/path/to/chromium",
"headless_default": true,
"allow_no_sandbox": false,
"user_data_root": "/path/to/browser-data",
"allowed_hosts": ["example.com"]
}
}

Environment equivalents:

  • TANDEM_BROWSER_ENABLED
  • TANDEM_BROWSER_SIDECAR
  • TANDEM_BROWSER_EXECUTABLE
  • TANDEM_BROWSER_HEADLESS
  • TANDEM_BROWSER_ALLOW_NO_SANDBOX
  • TANDEM_BROWSER_USER_DATA_ROOT
  • TANDEM_BROWSER_ALLOWED_HOSTS

Test flow

1. Sidecar doctor

Standalone sidecar smoke test:

Terminal window
tandem-browser doctor --json

This confirms:

  • the sidecar binary is present
  • a Chromium-family browser can be found
  • Chromium can actually launch in the configured headless/sandbox mode

2. Engine doctor

Run the engine-facing doctor using the effective engine config:

Terminal window
tandem-engine browser doctor --json

This is the command to trust when validating a real engine deployment.

3. Running engine status

Start the engine and query browser readiness through HTTP:

Terminal window
tandem-engine serve --hostname 127.0.0.1 --port 39731
tandem-engine browser status --hostname 127.0.0.1 --port 39731
curl -s http://127.0.0.1:39731/browser/status | jq .

4. How browser automation is exposed

Browser automation is exposed through the engine tool registry, not through dedicated runtime endpoints like /browser/open or /browser/click.

Use the browser HTTP endpoints for host diagnostics and installation:

  • GET /browser/status
  • POST /browser/install
  • POST /browser/smoke-test

Use engine tools for actual automation:

  • browser_status
  • browser_open
  • browser_navigate
  • browser_snapshot
  • browser_click
  • browser_type
  • browser_press
  • browser_wait
  • browser_extract
  • browser_screenshot
  • browser_close

That means an SDK or agent has two valid ways to drive browser automation through the engine:

  • call POST /tool/execute directly with one of the browser tool names
  • start a session run with tool use enabled and include the browser tools in the run allowlist for a QA or browser-debugging agent

The browser namespace in the SDK is intentionally narrower. It is for readiness and install flows, while the actual browser actions are ordinary tools.

For browser_wait, the canonical payload shape is:

{
"session_id": "browser-123",
"condition": {
"kind": "selector",
"value": "#login"
},
"timeout_ms": 5000
}

The engine also accepts compatibility aliases that agents commonly generate:

  • wait_for or waitFor instead of condition
  • sessionId and timeoutMs camelCase fields
  • top-level kind plus value
  • top-level selector, text, or url, which infer the wait kind automatically

5. Tool-level smoke test

Once status is runnable, validate real browser actions through the engine runtime:

  1. Call browser_status
  2. Call browser_open
  3. Call browser_snapshot
  4. Call browser_close

If browser_status says blocked, do not debug the model flow first. Fix host readiness first.

6. Exact smoke-test commands

After Chromium is installed, run this sequence.

Verify the sidecar and engine host first:

Terminal window
tandem-browser doctor --json
tandem-engine browser doctor --json

Both should report a runnable browser setup.

Start the engine:

Terminal window
tandem-engine serve --hostname 127.0.0.1 --port 39731

In another terminal, confirm engine browser status:

Terminal window
tandem-engine browser status --hostname 127.0.0.1 --port 39731
curl -s http://127.0.0.1:39731/browser/status | jq .

Then run a full browser tool smoke test through the engine.

Check browser tool readiness:

Terminal window
cat <<'JSON' | tandem-engine tool --json -
{"tool":"browser_status","args":{}}
JSON

Open a page:

Terminal window
cat <<'JSON' | tandem-engine tool --json -
{"tool":"browser_open","args":{"url":"https://example.com"}}
JSON

Copy the returned session_id, then snapshot that session:

Terminal window
cat <<'JSON' | tandem-engine tool --json -
{"tool":"browser_snapshot","args":{"session_id":"PASTE_SESSION_ID_HERE","include_screenshot":false}}
JSON

Close the browser session:

Terminal window
cat <<'JSON' | tandem-engine tool --json -
{"tool":"browser_close","args":{"session_id":"PASTE_SESSION_ID_HERE"}}
JSON

Expected result:

  • browser_status returns readiness info
  • browser_open returns a new session_id
  • browser_snapshot returns page metadata and elements
  • browser_close succeeds without leaking the session

7. browser_wait argument guide

If an agent needs to wait explicitly, use one of these shapes.

Wait for a selector:

Terminal window
cat <<'JSON' | tandem-engine tool --json -
{"tool":"browser_wait","args":{"session_id":"PASTE_SESSION_ID_HERE","condition":{"kind":"selector","value":"#login"},"timeout_ms":5000}}
JSON

Wait for visible text with the compatibility alias:

Terminal window
cat <<'JSON' | tandem-engine tool --json -
{"tool":"browser_wait","args":{"sessionId":"PASTE_SESSION_ID_HERE","waitFor":{"type":"text","value":"Dashboard"},"timeoutMs":5000}}
JSON

Wait for a URL fragment with the short top-level form:

Terminal window
cat <<'JSON' | tandem-engine tool --json -
{"tool":"browser_wait","args":{"session_id":"PASTE_SESSION_ID_HERE","url":"/settings","timeout_ms":5000}}
JSON

Use browser_wait when you need an explicit synchronization step. For click and keypress flows, prefer the built-in wait_for field on browser_click and browser_press so the action and follow-up wait happen together.

8. Agent-driven QA usage

For browser-based QA agents, pass the browser tools through as allowed engine tools instead of inventing a second browser API layer.

A recommended allowlist is:

  • browser_status
  • browser_open
  • browser_navigate
  • browser_snapshot
  • browser_click
  • browser_type
  • browser_press
  • browser_wait
  • browser_extract
  • browser_screenshot
  • browser_close

Recommended run pattern:

  1. Call browser_status first.
  2. If the browser is blocked, report the blocking issue instead of attempting QA.
  3. If the browser is runnable, use the browser tools to navigate, interact, extract page content, and capture screenshots.

Desktop and Control Panel

Control Panel

The control panel does not run tandem-browser directly. It talks to the engine host.

Use Settings -> Browser automation to:

  • read GET /browser/status
  • trigger POST /browser/install

If the control panel is connected to a remote engine, installation still happens on that engine host.

Tauri Desktop

The desktop app should be treated as an engine client for browser automation. The browser sidecar belongs next to the engine host, not inside the web frontend runtime.

The recommended desktop behavior is:

  • keep managing the Tandem engine sidecar as usual
  • use engine browser diagnostics for browser automation readiness
  • rely on engine-managed tandem-browser install instead of bundling a second browser sidecar into the desktop app

Release assets and incorporation

tandem-browser should be published as standalone GitHub release assets alongside other Tandem binaries:

  • tandem-browser-linux-x64.tar.gz
  • tandem-browser-darwin-x64.zip
  • tandem-browser-darwin-arm64.zip
  • tandem-browser-windows-x64.zip

The engine install flow targets the matching Tandem version, not an unconstrained latest release. That keeps the browser sidecar protocol aligned with the engine build.

Troubleshooting

browser_sidecar_not_found

  • Run tandem-engine browser install
  • or set TANDEM_BROWSER_SIDECAR
  • or set browser.sidecar_path

browser_not_found

  • Install Chrome, Chromium, Edge, or Brave on the engine host
  • Debian/Ubuntu example: sudo apt update && sudo apt install -y chromium
  • set TANDEM_BROWSER_EXECUTABLE if the binary is outside PATH

Launch fails on Linux VPS

  • confirm Chromium shared libraries are installed
  • try headless mode first
  • only enable allow_no_sandbox when the host environment requires it

Control panel shows blocked

  • read the blocking issue from /browser/status
  • run tandem-engine browser doctor --json on the engine host
  • confirm the control panel is connected to the engine you actually configured