How Tandem Works Under the Hood
Tandem is a session-oriented engine. A session is the durable conversation record, a run is one execution of that session, and a message is made of ordered parts. The engine streams state changes as events instead of treating the transcript as one opaque blob.
If you only need setup or user-facing workflows, use the guide pages. If you need to answer “how does Tandem actually work?” or debug a hallucination, start here.
The Mental Model
flowchart LR
  U[Client / Channel] --> S[Session]
  S --> R[Run]
  R --> E[SSE Event Stream]
  R --> P[Provider]
  R --> M[Memory + Context Hooks]
  E --> U
  C[Channel Adapter] --> S
  C --> A[Attachments]
  A --> S
The important distinctions are:
- the session persists between turns
- the run is the active execution attached to one session
- the event stream is how the engine reports progress
- the memory store is separate from the raw transcript
Runtime Primitives
The core wire/runtime objects are:
- Session: persisted under sessions.json, with metadata such as title, directory, workspace root, project ID, provider/model, and the message history.
- Message: a user or assistant turn stored inside the session history.
- Part: the ordered pieces of a message. Tandem uses text parts, reasoning parts, and tool invocation/result parts.
- Run: a single execution of a session, identified by runID.
- Event: streamed runtime output such as message.part.updated, session.run.started, session.run.finished, todo.updated, question.asked, and agent_team.*.
- Memory record: a chunk in memory.sqlite with a tier, source, and metadata.
session_meta.json stores snapshot history and session flags such as archived/shared state, parent lineage, revert state, and TODOs.
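To make the session / message / part nesting concrete, here is a minimal Python sketch. The class and field names (Part.kind, Message.role, message_text, and so on) are illustrative assumptions, not Tandem's actual schema; only the nesting and the three part kinds come from the description above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Part:
    # Hypothetical field names; the part kinds mirror the ones listed above.
    kind: str  # "text", "reasoning", or "tool"
    text: str = ""


@dataclass
class Message:
    role: str  # "user" or "assistant"
    parts: List[Part] = field(default_factory=list)


@dataclass
class Session:
    id: str
    title: str
    messages: List[Message] = field(default_factory=list)


def message_text(message: Message) -> str:
    """Join only the text parts of a message, preserving part order."""
    return "".join(p.text for p in message.parts if p.kind == "text")
```

Because parts are ordered, a client can rebuild the visible reply by walking them in sequence and skipping reasoning and tool parts.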
Run Lifecycle
The main chat path is POST /session/{id}/prompt_async.
- The client creates or appends the user message.
- The engine acquires the session run lock.
- If another run is already active, Tandem returns 409 Conflict with the active run details and an attachEventStream path.
- If the request is accepted, the engine emits session.run.started.
- The run is streamed over SSE from /event?sessionID=<id>&runID=<runID>.
- The engine publishes message.part.updated events as text and tool parts evolve.
- The run ends with session.run.finished.
When the caller requests ?return=run, Tandem returns the runID and the SSE attach path immediately, which is how thin clients and channel adapters bind to the same execution.
The synchronous path uses the same engine logic, but blocks until the run is complete.
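The one-active-run-per-session rule can be sketched as a small in-memory registry. This is an illustration under assumptions, not the engine's implementation: RunRegistry, attach_path, and the body keys other than runID and attachEventStream are hypothetical, and the HTTP layer is expected to map a rejected start to 409 Conflict.

```python
import threading
import uuid
from typing import Dict, Tuple


def attach_path(session_id: str, run_id: str) -> str:
    # Path shape taken from the lifecycle description above.
    return f"/event?sessionID={session_id}&runID={run_id}"


class RunRegistry:
    """Tracks at most one active run per session (hypothetical helper)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active: Dict[str, str] = {}  # session_id -> run_id

    def start(self, session_id: str) -> Tuple[bool, dict]:
        """Return (accepted, body). A False result corresponds to 409 Conflict."""
        with self._lock:
            run_id = self._active.get(session_id)
            accepted = run_id is None
            if accepted:
                run_id = uuid.uuid4().hex
                self._active[session_id] = run_id
            body = {
                "runID": run_id,
                "attachEventStream": attach_path(session_id, run_id),
            }
            return accepted, body

    def finish(self, session_id: str, run_id: str) -> None:
        """Release the session only if run_id is still the active run."""
        with self._lock:
            if self._active.get(session_id) == run_id:
                del self._active[session_id]
```

In both outcomes the caller receives the attach path of the run that owns the session, so a second client can join the existing stream instead of starting a competing run.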
What gets streamed
- text deltas from the model
- tool invocation and tool result updates
- permission and question events
- terminal status and error events
If a run fails or times out, Tandem persists an error message into the session history and finishes the run with an error status instead of silently dropping the failure.
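A client consuming the stream might look roughly like the sketch below. It parses standard SSE framing (data: lines terminated by a blank line) and assumes each payload is a JSON object with a "type" field and, for part updates, a "delta" field; those payload keys are assumptions for illustration, not the documented wire shape.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON payload per SSE event."""
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield json.loads("\n".join(data_lines))
                data_lines = []
        elif line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]
            data_lines.append(value)
        # Other fields (event:, id:, retry:) and comments are ignored here.


def collect_text(lines: Iterable[str]) -> str:
    """Accumulate text deltas until the run reports that it finished."""
    chunks = []
    for event in iter_sse_events(lines):
        kind = event.get("type")
        if kind == "message.part.updated":
            chunks.append(event.get("delta", ""))  # "delta" is an assumed key
        elif kind == "session.run.finished":
            break
    return "".join(chunks)
```

Stopping on session.run.finished rather than on connection close lets the same loop work for clients that attach to a run started elsewhere.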
Context and Memory
Tandem keeps raw transcript history separate from retrieval memory.
- The session history is the source of truth for what happened in that session.
- The engine builds a bounded provider payload and trims older turns instead of growing one infinite prompt forever.
- Memory lives in memory.sqlite and is tiered:
  - session for ephemeral session-scoped chunks
  - project for reusable project facts
  - global for cross-project user patterns and preferences
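The bounded-payload idea can be illustrated with a simple newest-first trim. This sketch assumes a character budget and a {"role", "content"} message shape purely for illustration; the engine's real budget (most likely token-based) and trimming rules are not specified here.

```python
from typing import Dict, List


def trim_history(messages: List[Dict[str, str]], max_chars: int) -> List[Dict[str, str]]:
    """Keep the most recent messages that fit in max_chars.

    The newest message is always kept, even if it alone exceeds the budget.
    """
    kept: List[Dict[str, str]] = []
    used = 0
    for message in reversed(messages):
        cost = len(message["content"])
        if kept and used + cost > max_chars:
            break  # everything older than this point is dropped
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

The full history stays in the session record; only the payload sent to the provider is shortened.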
When memory is injected for prompting, it is formatted as a structured block like:
<memory_context>
  <current_session>...</current_session>
  <relevant_history>...</relevant_history>
  <project_facts>...</project_facts>
</memory_context>
That means Tandem can reuse context without replaying the full transcript every time.
Channel archival memory
Channel sessions add a second retrieval layer for long-lived chat integrations:
- raw channel transcripts still live in normal Tandem session storage
- after a successful channel reply, Tandem archives the latest completed user + assistant exchange into global memory
- the archived record is deduped by source hash and stores provenance such as session ID, project ID, workspace root, and message IDs
This is the key reason the docs need a canonical under-the-hood page: if someone asks about “Discord session context” or “Telegram channel memory”, the correct answer is not “the whole channel transcript is loaded every time”. The correct answer is that Tandem keeps the transcript and retrieval memory separate, then archives selected completed exchanges for reuse.
For a deeper storage-level breakdown of the memory layers, see Memory Internals.
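Dedupe-by-hash archival can be sketched with SQLite's INSERT OR IGNORE. The table name, columns, and the choice to hash the session ID together with both sides of the exchange are assumptions made for this example; the actual memory.sqlite schema and hash inputs may differ.

```python
import hashlib
import sqlite3


def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    """Open a store with a hypothetical table for archived exchanges."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS archived_exchanges (
               source_hash TEXT PRIMARY KEY,
               tier        TEXT NOT NULL,
               session_id  TEXT NOT NULL,
               content     TEXT NOT NULL
           )"""
    )
    return db


def source_hash(session_id: str, user_text: str, assistant_text: str) -> str:
    """Stable hash of one exchange; a NUL byte separates the fields."""
    joined = "\x00".join([session_id, user_text, assistant_text])
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def archive_exchange(db: sqlite3.Connection, session_id: str,
                     user_text: str, assistant_text: str) -> bool:
    """Archive one completed exchange into the global tier.

    Returns True if a new row was stored, False if it was a duplicate.
    """
    cur = db.execute(
        "INSERT OR IGNORE INTO archived_exchanges VALUES (?, 'global', ?, ?)",
        (
            source_hash(session_id, user_text, assistant_text),
            session_id,
            f"user: {user_text}\nassistant: {assistant_text}",
        ),
    )
    db.commit()
    return cur.rowcount == 1
```

Making the hash the primary key means a retried or replayed archive step cannot create a second copy of the same exchange.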
Channel Flow
Discord, Telegram, and Slack all ride the same engine session contract, but each adapter normalizes inbound messages differently before calling prompt_async.
- Discord connects through the Gateway WebSocket, maintains a heartbeat, normalizes bot mentions, and chunks outbound replies to Discord’s 2000-character limit.
- Telegram uses long polling, renders outbound MarkdownV2 safely, and chunks replies to Telegram’s 4096-character limit.
- Slack polls conversations.history, filters self-messages, and replies through chat.postMessage.
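Reply chunking against a per-platform limit can be sketched as follows. The two limits come from the text above; chunk_reply and its preference for breaking at newlines and then spaces are assumptions, and the sketch does not attempt to keep Markdown entities balanced across chunks.

```python
from typing import List

DISCORD_LIMIT = 2000
TELEGRAM_LIMIT = 4096


def chunk_reply(text: str, limit: int) -> List[str]:
    """Split text into pieces of at most `limit` characters.

    Prefers to break at a newline, then at a space, and only cuts
    mid-word when no separator is available within the limit.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks: List[str] = []
    remaining = text
    while len(remaining) > limit:
        cut = remaining.rfind("\n", 0, limit + 1)
        if cut <= 0:
            cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no separator found: hard cut
        chunks.append(remaining[:cut])
        # Drop the whitespace at the break so the next chunk starts cleanly.
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

An adapter would then post each chunk as a separate message, for example chunk_reply(reply, DISCORD_LIMIT) for Discord.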
Common adapter behavior:
- Normalize inbound text and detect whether the bot was explicitly mentioned.
- Attach file/media payloads when the adapter supports media ingestion.
- Store uploads under channel_uploads/<channel>/<chat_or_user>/....
- Forward text and file prompt parts to /session/{id}/prompt_async.
- Stream the run result back over SSE and post the final reply into the channel.
Channel adapters map channel/user identity to the Tandem session they own, so a channel conversation behaves like a stable engine session rather than a fire-and-forget webhook.
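The identity-to-session mapping amounts to a get-or-create lookup. The sketch below is a hypothetical in-memory version; SessionMap and the create_session callback are invented for illustration, and a real adapter would persist the mapping so it survives restarts.

```python
from typing import Callable, Dict, Tuple


class SessionMap:
    """Maps (channel, chat_or_user) to a stable session ID (hypothetical)."""

    def __init__(self, create_session: Callable[[], str]) -> None:
        self._create = create_session
        self._by_identity: Dict[Tuple[str, str], str] = {}

    def session_for(self, channel: str, chat_or_user: str) -> str:
        key = (channel, chat_or_user)
        if key not in self._by_identity:
            # First message from this identity: create its session once.
            self._by_identity[key] = self._create()
        return self._by_identity[key]
```

Keying on both the channel and the chat or user keeps unrelated conversations in separate sessions.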
Agent Teams And Missions
Agent teams sit on top of the same engine runtime, but they add policy-gated spawning and mission lineage.
- SpawnPolicy controls whether spawning is enabled, whether justification is required, role-to-role edges, skill requirements, and budget limits.
- spawn requests can come from the UI, from a tool call, or from the orchestrator runtime
- approvals can be required before a spawn is executed
- mission events stream alongside normal session events
Useful lifecycle events include:
- agent_team.spawn.requested
- agent_team.spawn.approved
- agent_team.spawn.denied
- agent_team.instance.started
- agent_team.instance.completed
- agent_team.instance.failed
- agent_team.instance.cancelled
- agent_team.budget.exhausted
- agent_team.mission.budget.exhausted
The practical mental model is that a mission can fan out into multiple spawned sessions, but each spawned worker still uses the same underlying session/run/event machinery.
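A policy gate of this kind might be evaluated as in the sketch below. The field names, the denial reasons, and the use of a simple spawn count as the budget are assumptions for illustration; the real SpawnPolicy also covers skill requirements and approvals, which are omitted here.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class SpawnPolicy:
    # Hypothetical field names modelled on the controls described above.
    enabled: bool = False
    require_justification: bool = True
    allowed_edges: Dict[str, Set[str]] = field(default_factory=dict)  # parent role -> child roles
    max_spawns: int = 0  # budget expressed as a simple spawn count


def evaluate_spawn(policy: SpawnPolicy, parent_role: str, child_role: str,
                   justification: str, spawned_so_far: int) -> Tuple[bool, str]:
    """Return (allowed, reason), checking the cheapest rules first."""
    if not policy.enabled:
        return False, "spawning disabled"
    if policy.require_justification and not justification.strip():
        return False, "justification required"
    if child_role not in policy.allowed_edges.get(parent_role, set()):
        return False, "role edge not allowed"
    if spawned_so_far >= policy.max_spawns:
        return False, "budget exhausted"
    return True, "approved"
```

A denial would surface as agent_team.spawn.denied, and an exhausted budget as one of the budget.exhausted events listed above.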
Storage And Wire Contracts
These are the most important persisted/runtime surfaces:
- sessions.json - canonical session storage
- session_meta.json - snapshots, revert state, archive/shared flags, and TODOs
- questions.json - pending structured question requests
- memory.sqlite - memory chunks, retrieval state, and knowledge records
- channel_uploads/ - inbound media and file attachments from channel adapters
The public wire types mirror the runtime model:
- WireSession
- WireSessionMessage
- WireMessagePart
The wire format uses stable identifiers like sessionID and runID, and tool parts are exposed as type: "tool" with state transitions such as running, completed, failed, or denied.
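As a small illustration of reading tool parts off the wire, the sketch below filters a list of part dictionaries for tools that are still in flight. Treating completed, failed, and denied as terminal states is an assumption, as is the "name" key used in the example data; only type: "tool" and the four state names come from the text above.

```python
from typing import Dict, List

# Assumed to be terminal: no further transitions are expected from these states.
TERMINAL_TOOL_STATES = {"completed", "failed", "denied"}


def pending_tool_parts(parts: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the tool parts that have not reached a terminal state."""
    return [
        part
        for part in parts
        if part.get("type") == "tool"
        and part.get("state") not in TERMINAL_TOOL_STATES
    ]
```

A client could use a check like this to decide whether to keep showing a progress indicator for a message.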
What Tandem Does Not Do
- It does not treat a Discord or Telegram thread as one giant prompt forever.
- It does not automatically share context across unrelated channel sessions.
- It does not replay the entire transcript into every run.
- It does not use memory as a raw chat-log dump.
- It does not make agent-team spawning automatic without policy and budget checks.