Storage Maintenance For Agents

Tandem keeps active runtime state small and moves long-lived history into sharded storage. Agents working on local repair, release testing, or operator runbooks should treat storage cleanup as a maintenance operation, not as an automation workflow step.

Storage shape

The local state root is usually ~/.local/share/tandem on Linux or the configured TANDEM_HOME / TANDEM_STATE_DIR.

Important directories:

Path	Purpose
`data/automation-runs/YYYY/MM/`	Immutable Automation V2 per-run history shards
`data/context-runs/hot/`	Active and recent context-run directories
`data/context-runs/archive/YYYY/MM/`	Compressed old context-run archives
`data/mcp/`	MCP registry files
`data/channels/`	Channel sessions and tool preferences
`data/routines/`	Routine definitions and run history
`data/bug-monitor/`	Bug Monitor config, incidents, drafts, and posts
`data/actions/`	External action history
`data/pack-builder/`	Pack-builder workflows, plans, and zip artifacts
`data/system/`	Shared resources and other system-level state
`data/knowledge/`	Legacy embedded-doc bootstrap markers
`data/workflow-planner/`	Workflow-planner sessions

Hot indexes should stay small. Large node outputs, blackboards, runtime context, and terminal run details belong in per-run files, JSONL shards, or artifact files referenced by path.

Cleanup commands

Most agents and operators should use the normal command:

tandem-engine storage cleanup --dry-run --context-runs --json
tandem-engine storage cleanup --dry-run --root-json --json
tandem-engine storage cleanup --dry-run --default-knowledge --json

For an actual local cleanup:

sudo systemctl stop tandem-engine
tandem-engine storage cleanup --context-runs --root-json --default-knowledge --quarantine --json
sudo systemctl restart tandem-engine

On developer machines with more than one tandem-engine on PATH, run which -a tandem-engine first and call the intended binary explicitly.

Use --retention-days <N> to tune how long terminal context runs stay hot. The default is conservative for local repair work.

Managed worktree cleanup

Managed Git worktrees are different from the engine state root. Tandem creates them per repository under:

<repo>/.tandem/worktrees/<slug>

These worktrees are used for isolated coder runs, agent-team instances, and other edit-capable repo tasks. If a run is blocked, crashes, or the process restarts before teardown, the Git worktree entry can remain registered even after the task is gone.

Typical symptoms:

.tandem/worktrees/ grows large inside a repo
git worktree list shows many old tandem/... branches
operators have to manually remove worktrees and branches

Use the packaged CLI for a preview first:

tandem-engine storage worktrees --repo-root /abs/path/to/repo --json

Apply cleanup only after reviewing the preview:

tandem-engine storage worktrees --repo-root /abs/path/to/repo --apply --json

What this cleanup does:

Reads Git-registered worktrees under <repo>/.tandem/worktrees
Compares them to Tandem’s currently tracked in-memory managed worktrees
Skips worktrees that the live runtime still considers active
Removes stale Git worktrees and their managed branches when possible
Removes orphaned leftover directories that are no longer Git-registered

This cleanup is also available from the control panel at Settings -> Maintenance, where operators can preview stale worktrees, run cleanup, and inspect an animated per-item log of what was skipped, removed, or failed.

Runtime API and SDK access

The same operation is available through the engine runtime API:

POST /worktree/cleanup

Example request:

{
  "repo_root": "/abs/path/to/repo",
  "dry_run": true,
  "remove_orphan_dirs": true
}

Use the HTTP or SDK path when an operator tool, external service, or governed agent flow needs to inspect or clean stale worktrees without shelling out to the CLI.

SDK inspection

The TypeScript and Python SDKs expose storage inspection helpers for agents and tools that need to list files or trigger the legacy session-storage repair scan, plus worktree cleanup helpers for repo-local stale worktree maintenance:

const files = await client.storage.listFiles({ path: "data/context-runs", limit: 100 });
await client.storage.repair({ force: true });
const preview = await client.worktrees.cleanup({
  repoRoot: "/abs/path/to/repo",
  dryRun: true,
});

files = await client.storage.list_files(path="data/context-runs", limit=100)
await client.storage.repair(force=True)
preview = await client.worktrees.cleanup(
    repo_root="/abs/path/to/repo",
    dry_run=True,
)

Storage SDK methods do not run archive cleanup. Worktree cleanup only affects repo-local managed worktrees for the selected repository. Both should still be treated as operator-directed maintenance actions rather than background workflow behavior.

Agent guidance

Before fixing workflow or Bug Monitor bugs on a machine with slow startup, inspect storage first. A root directory full of large JSON maps or thousands of legacy context_runs entries can make unrelated bugs look worse.

The old embedded docs path used guide_docs: memory rows plus a legacy default_knowledge_state.json marker. If those still exist on a machine, purge them with --default-knowledge once and then keep using the docs MCP server for Tandem-specific guidance.

Prefer this order:

Run dry-run cleanup with --json.
Stop the engine service before mutating local storage.
Run cleanup with --quarantine so moved root files can be recovered.
Restart the engine and verify startup time.
Only then continue debugging workflow behavior.

For worktree maintenance, prefer this order:

Run tandem-engine storage worktrees --repo-root ... --json first.
Confirm the reported stale paths are not tied to an active operator session.
Apply cleanup with --apply.
Re-run git worktree list and verify the repo is back to the expected set of worktrees.