Browser Setup and Testing

Use this guide when you need browser automation in Tandem and want to verify the full path from sidecar install to engine readiness.

What `tandem-browser` is

tandem-browser is the Chromium automation sidecar used by tandem-engine. It is not the same thing as the control panel browser UI or the Tauri desktop app.

The runtime model is:

tandem-browser runs on the same host as tandem-engine
a Chromium-family browser must also exist on that host
desktop and control panel query the engine for readiness instead of hosting the browser sidecar themselves

Prerequisites

Rust stable if you are building from source
tandem-engine and tandem-browser on the same host
Chrome, Chromium, Edge, or Brave on that host
Linux only: the shared libraries Chromium needs

Linux: install Chromium

On Debian/Ubuntu hosts, install Chromium with:

sudo apt update && sudo apt install -y chromium

Then verify:

tandem-browser doctor --json

If the browser binary is not on PATH, set:

TANDEM_BROWSER_EXECUTABLE
or browser.executable_path in config

Current setup paths

Engine-managed install

This is the preferred operator path.

Install the sidecar for the current Tandem version:

tandem-engine browser install

This downloads the matching tandem-browser release asset from GitHub Releases and installs it on the engine host.

Then verify readiness:

tandem-engine browser doctor --json

If the sidecar is installed but Chromium is missing, the doctor output will usually show browser_not_found.

Build from source

From the repo root:

cargo build -p tandem-browser
./target/debug/tandem-browser --version
./target/debug/tandem-browser doctor --json

Build a release binary when you want a realistic size measurement:

cargo build --release -p tandem-browser
ls -lh target/release/tandem-browser

Config keys

Browser automation is controlled through the browser config block or equivalent env vars.

{
  "browser": {
    "enabled": true,
    "sidecar_path": "/path/to/tandem-browser",
    "executable_path": "/path/to/chromium",
    "headless_default": true,
    "allow_no_sandbox": false,
    "user_data_root": "/path/to/browser-data",
    "allowed_hosts": ["example.com"]
  }
}

Environment equivalents:

TANDEM_BROWSER_ENABLED
TANDEM_BROWSER_SIDECAR
TANDEM_BROWSER_EXECUTABLE
TANDEM_BROWSER_HEADLESS
TANDEM_BROWSER_ALLOW_NO_SANDBOX
TANDEM_BROWSER_USER_DATA_ROOT
TANDEM_BROWSER_ALLOWED_HOSTS

Test flow

1. Sidecar doctor

Standalone sidecar smoke test:

tandem-browser doctor --json

This confirms:

the sidecar binary is present
a Chromium-family browser can be found
Chromium can actually launch in the configured headless/sandbox mode

2. Engine doctor

Run the engine-facing doctor using the effective engine config:

tandem-engine browser doctor --json

This is the command to trust when validating a real engine deployment.

3. Running engine status

Start the engine and query browser readiness through HTTP:

tandem-engine serve --hostname 127.0.0.1 --port 39731
tandem-engine browser status --hostname 127.0.0.1 --port 39731
curl -s http://127.0.0.1:39731/browser/status | jq .

4. How browser automation is exposed

Browser automation is exposed through the engine tool registry, not through dedicated runtime endpoints like /browser/open or /browser/click.

Use the browser HTTP endpoints for host diagnostics and installation:

GET /browser/status
POST /browser/install
POST /browser/smoke-test

Use engine tools for actual automation:

browser_status
browser_open
browser_navigate
browser_snapshot
browser_click
browser_type
browser_press
browser_wait
browser_extract
browser_screenshot
browser_close

That means an SDK or agent has two valid ways to drive browser automation through the engine:

call POST /tool/execute directly with one of the browser tool names
start a session run with tool use enabled and include the browser tools in the run allowlist for a QA or browser-debugging agent

The browser namespace in the SDK is intentionally narrower. It is for readiness and install flows, while the actual browser actions are ordinary tools.

For browser_wait, the canonical payload shape is:

{
  "session_id": "browser-123",
  "condition": {
    "kind": "selector",
    "value": "#login"
  },
  "timeout_ms": 5000
}

The engine also accepts compatibility aliases that agents commonly generate:

wait_for or waitFor instead of condition
sessionId and timeoutMs camelCase fields
top-level kind plus value
top-level selector, text, or url, which infer the wait kind automatically

5. Tool-level smoke test

Once status is runnable, validate real browser actions through the engine runtime:

Call browser_status
Call browser_open
Call browser_snapshot
Call browser_close

If browser_status says blocked, do not debug the model flow first. Fix host readiness first.

6. Exact smoke-test commands

After Chromium is installed, run this sequence.

Verify the sidecar and engine host first:

tandem-browser doctor --json
tandem-engine browser doctor --json

Both should report a runnable browser setup.

Start the engine:

tandem-engine serve --hostname 127.0.0.1 --port 39731

In another terminal, confirm engine browser status:

tandem-engine browser status --hostname 127.0.0.1 --port 39731
curl -s http://127.0.0.1:39731/browser/status | jq .

Then run a full browser tool smoke test through the engine.

Check browser tool readiness:

cat <<'JSON' | tandem-engine tool --json -
{"tool":"browser_status","args":{}}
JSON

Open a page:

cat <<'JSON' | tandem-engine tool --json -
{"tool":"browser_open","args":{"url":"https://example.com"}}
JSON

Copy the returned session_id, then snapshot that session:

cat <<'JSON' | tandem-engine tool --json -
{"tool":"browser_snapshot","args":{"session_id":"PASTE_SESSION_ID_HERE","include_screenshot":false}}
JSON

Close the browser session:

cat <<'JSON' | tandem-engine tool --json -
{"tool":"browser_close","args":{"session_id":"PASTE_SESSION_ID_HERE"}}
JSON

Expected result:

browser_status returns readiness info
browser_open returns a new session_id
browser_snapshot returns page metadata and elements
browser_close succeeds without leaking the session

7. `browser_wait` argument guide

If an agent needs to wait explicitly, use one of these shapes.

Wait for a selector:

cat <<'JSON' | tandem-engine tool --json -
{"tool":"browser_wait","args":{"session_id":"PASTE_SESSION_ID_HERE","condition":{"kind":"selector","value":"#login"},"timeout_ms":5000}}
JSON

Wait for visible text with the compatibility alias:

cat <<'JSON' | tandem-engine tool --json -
{"tool":"browser_wait","args":{"sessionId":"PASTE_SESSION_ID_HERE","waitFor":{"type":"text","value":"Dashboard"},"timeoutMs":5000}}
JSON

Wait for a URL fragment with the short top-level form:

cat <<'JSON' | tandem-engine tool --json -
{"tool":"browser_wait","args":{"session_id":"PASTE_SESSION_ID_HERE","url":"/settings","timeout_ms":5000}}
JSON

Use browser_wait when you need an explicit synchronization step. For click and keypress flows, prefer the built-in wait_for field on browser_click and browser_press so the action and follow-up wait happen together.

8. Agent-driven QA usage

For browser-based QA agents, pass the browser tools through as allowed engine tools instead of inventing a second browser API layer.

A recommended allowlist is:

browser_status
browser_open
browser_navigate
browser_snapshot
browser_click
browser_type
browser_press
browser_wait
browser_extract
browser_screenshot
browser_close

Recommended run pattern:

Call browser_status first.
If the browser is blocked, report the blocking issue instead of attempting QA.
If the browser is runnable, use the browser tools to navigate, interact, extract page content, and capture screenshots.

Desktop and Control Panel

Control Panel

The control panel does not run tandem-browser directly. It talks to the engine host.

Use Settings -> Browser automation to:

read GET /browser/status
trigger POST /browser/install

If the control panel is connected to a remote engine, installation still happens on that engine host.

Tauri Desktop

The desktop app should be treated as an engine client for browser automation. The browser sidecar belongs next to the engine host, not inside the web frontend runtime.

The recommended desktop behavior is:

keep managing the Tandem engine sidecar as usual
use engine browser diagnostics for browser automation readiness
rely on engine-managed tandem-browser install instead of bundling a second browser sidecar into the desktop app

Release assets and incorporation

tandem-browser should be published as standalone GitHub release assets alongside other Tandem binaries:

tandem-browser-linux-x64.tar.gz
tandem-browser-darwin-x64.zip
tandem-browser-darwin-arm64.zip
tandem-browser-windows-x64.zip

The engine install flow targets the matching Tandem version, not an unconstrained latest release. That keeps the browser sidecar protocol aligned with the engine build.

Troubleshooting

`browser_sidecar_not_found`

Run tandem-engine browser install
or set TANDEM_BROWSER_SIDECAR
or set browser.sidecar_path

`browser_not_found`

Install Chrome, Chromium, Edge, or Brave on the engine host
Debian/Ubuntu example: sudo apt update && sudo apt install -y chromium
set TANDEM_BROWSER_EXECUTABLE if the binary is outside PATH

Launch fails on Linux VPS

confirm Chromium shared libraries are installed
try headless mode first
only enable allow_no_sandbox when the host environment requires it

Control panel shows blocked

read the blocking issue from /browser/status
run tandem-engine browser doctor --json on the engine host
confirm the control panel is connected to the engine you actually configured