Concurrent Sessions Plan -- Issue #1493 Step 3
Goal
Enable multiple sessions per bot to run simultaneously. A bot can have many sessions (100, for example); each live session talks to its own Claude Code process through gate-runtime, subject to the per-bot process limit under Safety & Limits. The frontend shows which sessions are working, lets the user switch between them instantly, and accumulates messages from all sessions in real time.
Current Architecture (Single-Session)
The Problem: One Process Per Bot
The current system enforces one Claude Code process per bot and one active session per bot:
User -> sendMessage(botId) -> runtime.send(botId)
|
bots.get(botId) <-- ONE BotRuntime per botId
|
bot.proc (stdin) <-- ONE ChildProcess
|
stdout -> handleMessage(bot, msg)
|
_onEvent(botId, msg)
|
bridge.handleEvent(botId, event)
|
emitBotEvent(botId, { event: 'chat', payload: { sessionKey, chatSessionId, ... } })
|
SSE -> frontend handleChatEventForBot(botId, payload)
Single-Session Assumptions (Must Change)
| Layer / File | Symbol | Assumption |
|---|---|---|
| runtime.ts | `const bots = new Map<string, BotRuntime>()` | One BotRuntime per botId. Contains one proc, one runId, one sessionId. |
| runtime.ts | `activateSession()` | Calls stop() when switching sessions -- kills the old process. |
| runtime.ts | `send()` -> `sendToProcess()` | Writes to bot.proc.stdin -- one process, one stdin. |
| runtime.ts | `onTurnComplete()` | Drains bot.messageQueue -- one queue for all messages. |
| index.ts | `sendMessage()` | Calls ensureCurrentRuntimeSession() -- one session per bot. |
| index.ts | `activateSession()` | Sets bot.runtimeSessionId, bot.chatSessionId -- singular. |
| session-registry.ts | `ensureCurrentRuntimeSession()` | Finds or creates ONE current session. is_current column is a single boolean. |
| session-registry.ts | `createFreshRuntimeSession()` | Stops the previous session before creating a new one. |
| bridge.ts | `streams = new Map<string, ...>()` | One stream accumulator per botId. |
| bridge.ts | `_getSessionMeta(botId)` | Returns one chatSessionId, one sessionKey. |
| bot-state.ts | `botWorkingState.get(botId)` | One working state per bot. |
| bot-state.ts | `setBotWorking(botId, runId, sessionKey)` | One runId, one sessionKey per bot. |
| ws-manager.ts | SSE events keyed by botId | All events for a bot go to all SSE clients for that bot. |
| Frontend | `ChatState.currentMsg` | One streaming bubble. |
| Frontend | `bot.currentRunId` | One active run per bot. |
| Frontend | `bot.currentStreamText` | One stream accumulator per bot. |
| Frontend | `showTyping(botId)` / `hideTyping(botId)` | One typing indicator per bot. |
New Architecture (Multi-Session)
Core Concept: Session-Keyed Processes
Instead of Map<botId, BotRuntime>, use Map<sessionKey, BotRuntime> where sessionKey = "gate:{chatSessionId}".
Each session gets its own Claude Code process, its own stdin/stdout, its own queue.
User -> sendMessage(botId, sessionKey) -> runtime.send(sessionKey)
|
sessions.get(sessionKey) <-- BotRuntime per SESSION
|
session.proc (stdin) <-- own ChildProcess
|
stdout -> handleMessage(session, msg)
|
_onEvent(botId, msg) <-- botId + sessionKey tagged
|
bridge.handleEvent(botId, event) <-- includes sessionKey
|
emitBotEvent(botId, { ..., sessionKey, chatSessionId })
|
SSE -> frontend routes by sessionKey
Phase 1: Backend -- Multi-Process Runtime
File: lib/gate-runtime/runtime.ts
- Rename key from `botId` to `sessionKey`:
  - `const sessions = new Map<string, BotRuntime>()` (rename `bots` -> `sessions`)
  - Each `BotRuntime` still has a `botId` field for identification
  - Registration: `register(sessionKey, config)` where `sessionKey = "gate:{chatSessionId}"`
  - `send(sessionKey, text, opts)` -- routes to the correct process
- Keep bot-level helpers:
  - `getState(botId)` -> scan all sessions for this bot, return 'working' if any are
  - `getSessionsForBot(botId)` -> return all BotRuntime entries for this bot
  - `stopAll(botId)` -> stop all processes for a bot
- Remove single-session locks:
  - `activateSession()` no longer calls `stop()` -- sessions coexist
  - `newSession()` no longer kills old process -- just registers a new one
  - `onTurnComplete()` drains queue per-session, not per-bot
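The rekeying above can be sketched roughly as follows. This is an illustration under assumptions, not the actual runtime.ts: `SessionRuntime` is a simplified stand-in for `BotRuntime`, the `state` field and the "stopped" fallback in `getState` are invented for the example, and no child process is spawned. Only the helper names `register`, `getState`, `getSessionsForBot` and `stopAll` come from the plan.

```typescript
// Simplified stand-in for BotRuntime; the real type carries proc, runId, etc.
type RuntimeState = "idle" | "working" | "stopped";

interface SessionRuntime {
  botId: string; // kept so bot-level helpers can filter by owner
  sessionKey: string; // "gate:{chatSessionId}"
  state: RuntimeState;
}

// One entry per session instead of one per bot.
const sessions = new Map<string, SessionRuntime>();

// Get-or-create: registering the same sessionKey twice returns the same entry.
function register(sessionKey: string, config: { botId: string }): SessionRuntime {
  const existing = sessions.get(sessionKey);
  if (existing) return existing;
  const runtime: SessionRuntime = { botId: config.botId, sessionKey, state: "idle" };
  sessions.set(sessionKey, runtime);
  return runtime;
}

function getSessionsForBot(botId: string): SessionRuntime[] {
  return Array.from(sessions.values()).filter((s) => s.botId === botId);
}

// Bot-level view: "working" if any session is working, else "idle" if any
// session is idle, else "stopped" (assumed fallback).
function getState(botId: string): RuntimeState {
  const own = getSessionsForBot(botId);
  if (own.some((s) => s.state === "working")) return "working";
  if (own.some((s) => s.state === "idle")) return "idle";
  return "stopped";
}

// Marks every session of the bot as stopped and returns how many changed.
// The real helper would also terminate each child process.
function stopAll(botId: string): number {
  let changed = 0;
  for (const s of getSessionsForBot(botId)) {
    if (s.state !== "stopped") {
      s.state = "stopped";
      changed += 1;
    }
  }
  return changed;
}
```

Keeping `botId` on each entry is what lets the bot-level helpers stay cheap to write: they filter the session map instead of maintaining a second index.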
File: lib/gate-runtime/index.ts
- Route `sendMessage` by session:
  - `sendMessage(botId, message, bot, opts)` -> resolve session from `opts.chatSessionId`
  - If session doesn't have a registered process, register and spawn one
- `activateSession` becomes per-sessionKey:
  - Bot-level state queries (`getState`, `isRegistered`) aggregate across all sessions
- `syncRuntimeSessionEvent` already tags by bot + session -- works as-is.
File: lib/gate-runtime/session-registry.ts
- Remove single-session enforcement:
  - `ensureCurrentRuntimeSession` -> create or find session by chatSessionId, don't enforce `is_current`
  - `createFreshRuntimeSession` -> DON'T stop previous session, just create a new one
  - Remove `is_current` column dependency for session selection (keep for "last active" display)
File: lib/gate-runtime/bridge.ts
- Session-keyed stream accumulators:
  - `streams = new Map<string, ...>()` keyed by `sessionKey` instead of `botId`
  - `_toolNames` keyed by `sessionKey`
  - Every event emitted includes both `botId` and `sessionKey`
File: lib/bot-state.ts
- Multi-session working state:
  - `botWorkingState` -> `Map<string, { runId, sessionKey, since }>` keyed by composite `botId:sessionKey`
  - `isBotWorking(botId)` -> return true if ANY session is working
  - `isBotWorkingSession(botId, sessionKey)` -> check specific session
  - `setBotWorking(botId, runId, sessionKey)` -> track per-session
  - `clearBotWorking(botId, runId, sessionKey)` -> clear per-session
  - `getAllWorkingSessions(botId)` -> return all working sessions for a bot
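A minimal sketch of the per-session working state listed above. The function names follow the plan; everything else is an assumption made for the example: `runId` is typed as a string, `botId` is stored on each entry, and `clearBotWorking` ignores a clear whose `runId` does not match the tracked run.

```typescript
interface WorkingEntry {
  botId: string;
  sessionKey: string;
  runId: string;
  since: number; // epoch milliseconds
}

// Keyed by the composite "botId:sessionKey" described in the plan.
const botWorkingState = new Map<string, WorkingEntry>();

function workingKey(botId: string, sessionKey: string): string {
  return `${botId}:${sessionKey}`;
}

function setBotWorking(botId: string, runId: string, sessionKey: string): void {
  const entry: WorkingEntry = { botId, sessionKey, runId, since: Date.now() };
  botWorkingState.set(workingKey(botId, sessionKey), entry);
}

// Assumption: a clear is ignored when runId does not match the tracked run,
// so a late event from an older run cannot clear a newer one.
function clearBotWorking(botId: string, runId: string, sessionKey: string): boolean {
  const key = workingKey(botId, sessionKey);
  const entry = botWorkingState.get(key);
  if (!entry || entry.runId !== runId) return false;
  botWorkingState.delete(key);
  return true;
}

function isBotWorkingSession(botId: string, sessionKey: string): boolean {
  return botWorkingState.has(workingKey(botId, sessionKey));
}

// Filters on the stored botId so the composite key never has to be parsed
// (sessionKey itself contains ":").
function getAllWorkingSessions(botId: string): WorkingEntry[] {
  return Array.from(botWorkingState.values()).filter((e) => e.botId === botId);
}

function isBotWorking(botId: string): boolean {
  return getAllWorkingSessions(botId).length > 0;
}
```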
Phase 2: CS Ownership -- Code Sessions Belong to Bot Sessions
Critical problem: Code Sessions (CS) are currently owned by bot_id only. The code monitor pings sendMessageToBot(botId, ...) which routes to whatever session is "current". With concurrent sessions, this ping goes to the WRONG session -- the bot session that happens to be active, not the one that started the CS.
Current state (broken for concurrent):
code_sessions.bot_id = 'molly' <-- CS knows its bot
code_monitor -> sendMessageToBot('molly') <-- pings "the bot"
gate-runtime -> routes to is_current=1 <-- wrong session!
New state (session-aware):
code_sessions.bot_id = 'molly'
code_sessions.chat_session_id = 42 <-- CS knows its bot SESSION
code_monitor -> sendMessageToBot('molly', chatSessionId=42)
gate-runtime -> routes to session 42 <-- correct session
Changes needed:
- DB migration: Add `chat_session_id INTEGER` column to `code_sessions` table
  - Nullable for backward compat (existing CS get null = legacy bot-level routing)
  - Set on CS creation: `startSession(botId, { chatSessionId })` passes the owning session
- CS creation: When a bot session starts a CS, tag it with `chat_session_id`
  - `POST /api/bots/:botId/code/start` accepts `chatSessionId` param
  - `lib/code-session.ts` `startSession()` stores it in DB
  - The frontend sends `chatSessionId` from `ChatState.currentChatSessionIds[botId]`
- Monitor routing: `deliverMonitorPing()` routes to the owning session
  - `lib/code-monitor.ts`: read `code_sessions.chat_session_id` for the CS being monitored
  - Pass `chatSessionId` to `sendMessageToBot(botId, msg, ..., chatSessionId)`
  - Gate-runtime routes the message to that specific session's process (not "current")
  - Fallback: if `chat_session_id` is null (legacy CS), route to current session as before
- CS transfer: `POST /api/bots/:botId/code/session/transfer` must also update `chat_session_id`
  - When transferring a CS between bots, the new bot's current session becomes the owner
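The monitor routing rule, including the legacy fallback, could look roughly like this. `CodeSessionRow`, `PingTarget` and `resolvePingTarget` are hypothetical names introduced for the sketch; the real `deliverMonitorPing()` and `sendMessageToBot()` signatures are not reproduced here.

```typescript
// Hypothetical row shape: only the columns the routing decision needs.
interface CodeSessionRow {
  id: string;
  botId: string;
  chatSessionId: number | null; // null for CS created before the migration
}

interface PingTarget {
  botId: string;
  chatSessionId: number | null;
  legacyFallback: boolean; // true when the CS had no owning session recorded
}

// Prefer the owning session; fall back to the bot's current session only for
// legacy CS rows whose chat_session_id is null.
function resolvePingTarget(cs: CodeSessionRow, currentChatSessionId: number | null): PingTarget {
  if (cs.chatSessionId !== null) {
    return { botId: cs.botId, chatSessionId: cs.chatSessionId, legacyFallback: false };
  }
  return { botId: cs.botId, chatSessionId: currentChatSessionId, legacyFallback: true };
}
```

Returning the `legacyFallback` flag is a design choice of the sketch: it makes it easy to log how often the old bot-level routing is still being used.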
Phase 3: Frontend -- Tree Structure
Remove the standalone sidebar sessions section. Sessions are now nested under each bot in the sidebar tree:
Sidebar
|-- [Search button] (Ctrl+K — searches bots, sessions, CS, chat history)
|-- Bot: Molly
| |-- Session: "Fix auth bug" (active, working) <-- click = switch chat to this session
| | |-- CS: claude-a8f2 (running, project-x) <-- click = open terminal
| | '-- CS: claude-b3c1 (idle, project-y)
| |-- Session: "Refactor DB" (idle)
| | '-- CS: claude-d4e5 (completed)
| |-- Session: "New session" (new, empty)
| |-- [+ New session]
| '-- [5 more sessions...] <-- collapsed, click to search
|-- Bot: Klara
| |-- Session: "Deploy pipeline" (working)
| '-- [+ New session]
'-- [+ Add bot]
Key changes:
- Remove `#sidebarSessions` section from HTML and `loadSidebarSessions()` from JS:
  - Sessions are no longer a standalone sidebar section
  - They render as children of each bot list item
- Extend `renderBotListItem()` to show bot sessions as expandable children:
  - Each bot item expands to show up to 5 sessions (sorted: working first, then by activity)
  - Each session shows: status icon, label, message count
  - Clicking a session switches the chat view to that session (not the bot -- the SESSION)
  - CS items render nested under their owning session (not flat under the bot)
- CS renders under its owning session (not flat under the bot):
  - `renderSessionsInSidebar()` groups CS by `chat_session_id`
  - CS with null `chat_session_id` (legacy) renders under the bot's current/latest session
  - Each CS item: status dot, name/project, click to open terminal
- Active state is per-session, not per-bot:
  - Clicking a session makes it the active chat view AND the active session for that bot
  - The bot list item shows which session is active (highlighted child)
  - Multiple bots can have working sessions simultaneously
  - Only the currently viewed session shows the typing indicator in the chat area
- Session-level message routing (unchanged from Phase 1 plan):
  - `bot.sessionMessages: Record<string, ChatMessage[]>` -- keyed by chatSessionId
  - SSE events route by sessionKey to correct message array
  - Switching sessions re-renders from memory (instant)
  - Lazy-load history from API for sessions not yet in memory
- Search modal (Ctrl+K):
  - Already searches bots and code sessions
  - Extend to also search bot sessions (by label, first message)
  - Extend to search chat history (by message content)
  - Results show: type icon, label, metadata, click to navigate
Phase 4: DB & Status
- `gate_runtime_sessions.status` tracks per-session state:
  - `new` -> just created, no Claude process yet
  - `working` -> Claude process active, processing a message
  - `idle` -> Claude process alive but idle (waiting for timeout)
  - `stopped` -> Claude process killed (idle timeout or manual)
  - `errored` -> last turn errored
- `is_current` becomes "last user-interacted session":
  - Multiple sessions can be `working` simultaneously
  - `is_current = 1` just means "which session the user last sent a message to"
  - Used for UI default selection, not for runtime exclusivity
Event Flow (Concurrent)
Bot has 3 sessions: S1 (working), S2 (idle), S3 (working)
S1 stdout -> handleMessage -> bridge -> emitBotEvent(botId, { sessionKey: "gate:1", state: "delta" })
S3 stdout -> handleMessage -> bridge -> emitBotEvent(botId, { sessionKey: "gate:3", state: "delta" })
Frontend SSE receives both:
- delta for S1: append to sessionMessages["gate:1"]
- delta for S3: append to sessionMessages["gate:3"]
User is viewing S1:
- S1 delta -> render streaming bubble
- S3 delta -> accumulate silently, update sidebar status icon
User switches to S3:
- S1 streaming bubble paused (still accumulating in background)
- S3 messages rendered from sessionMessages["gate:3"]
- S3 streaming continues in real-time
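The flow above amounts to "always accumulate, render only the viewed session". A small sketch of that rule, assuming messages are plain strings and that `sessionMessages` is keyed by sessionKey as in this example; `BotViewState`, `routeDelta` and `switchSession` are hypothetical names, not existing frontend functions.

```typescript
interface ChatDelta {
  sessionKey: string; // e.g. "gate:1"
  text: string;
}

interface BotViewState {
  viewedSessionKey: string;
  sessionMessages: Record<string, string[]>;
}

type DeltaAction = "render" | "background";

// Every delta is stored; only deltas for the viewed session are rendered.
function routeDelta(bot: BotViewState, delta: ChatDelta): DeltaAction {
  const existing = bot.sessionMessages[delta.sessionKey];
  if (existing) {
    existing.push(delta.text);
  } else {
    bot.sessionMessages[delta.sessionKey] = [delta.text];
  }
  return delta.sessionKey === bot.viewedSessionKey ? "render" : "background";
}

// Switching is a lookup in memory; an empty array means history still has
// to be lazy-loaded from the API.
function switchSession(bot: BotViewState, sessionKey: string): string[] {
  bot.viewedSessionKey = sessionKey;
  return bot.sessionMessages[sessionKey] ?? [];
}
```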
Implementation Order
- Runtime multi-process (runtime.ts) -- key change: Map keyed by sessionKey
- Index.ts routing -- sendMessage resolves to sessionKey
- Bridge session-keying -- stream accumulators per session
- Bot-state multi-session -- working state per session
- Session-registry unlocking -- remove stop-previous enforcement
- Frontend message routing -- accumulate per-session
- Frontend UI -- session status indicators, instant switching
- Cleanup -- idle timeout per-session, max concurrent sessions limit
Safety & Limits
- Max concurrent PROCESSES per bot: 5 (configurable). Prevents runaway process spawning.
- Min sessions per bot: 1. Auto-created on first load if none exist.
- Idle timeout: 10 min per process (existing). When a session goes idle past the timeout, its process is killed automatically but the session data is preserved.
- Memory: Each Claude Code process uses ~50-100MB. 5 concurrent = 250-500MB per bot.
- Auth: All sessions for a bot share the same auth profile and credentials.
- Workspace: All sessions for a bot share the same workspace directory. Claude Code handles isolation with worktrees.
- Session count: Unlimited sessions can EXIST (history preserved). Only 5 can have live PROCESSES at once.
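The process cap and the memory estimate above reduce to a simple check and a multiplication. The sketch below uses hypothetical names (`canSpawnProcess`, `estimateMemoryMb`) and takes the 5-process default and the 50-100MB per-process figure directly from the list; what happens when the cap is reached (reject, queue, or evict an idle process) is not decided here.

```typescript
const MAX_LIVE_PROCESSES_PER_BOT = 5; // default from the plan, configurable
const MEMORY_PER_PROCESS_MB = { min: 50, max: 100 }; // rough estimate from the plan

// True when the bot may start one more Claude Code process.
function canSpawnProcess(
  liveProcessCount: number,
  limit: number = MAX_LIVE_PROCESSES_PER_BOT,
): boolean {
  return liveProcessCount < limit;
}

// Estimated memory range for a given number of live processes.
function estimateMemoryMb(liveProcessCount: number): { min: number; max: number } {
  return {
    min: liveProcessCount * MEMORY_PER_PROCESS_MB.min,
    max: liveProcessCount * MEMORY_PER_PROCESS_MB.max,
  };
}
```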
Gap Analysis (Critical Issues Found)
Audit of every file in the event chain found 8 critical gaps the plan must address:
GAP 1: messageQueue has no session routing (runtime.ts)
File: lib/gate-runtime/runtime.ts lines 86-90, 603-605
The QueuedMessage interface has { text, opts, runId } but no sessionKey or chatSessionId. When multiple sessions queue messages, onTurnComplete() drains FIFO without checking which session the message belongs to.
Example failure: Session 1 finishes, drains queue, sends Session 2's message to Session 1's process.
Fix: Add sessionKey to QueuedMessage. Queue per-session: Map<sessionKey, QueuedMessage[]>. Drain only the queue for the session that just completed.
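A sketch of the per-session queue from the fix above. The `QueuedMessage` shape is reduced to the fields needed for the example (`opts` is omitted), and `enqueueMessage` / `dequeueNext` are hypothetical helper names; in the real code the drain would happen inside `onTurnComplete()`.

```typescript
interface QueuedMessage {
  text: string;
  runId: string;
  sessionKey: string; // new field: which session this message belongs to
}

// One FIFO queue per session instead of one shared queue per bot.
const messageQueues = new Map<string, QueuedMessage[]>();

function enqueueMessage(msg: QueuedMessage): void {
  const queue = messageQueues.get(msg.sessionKey);
  if (queue) {
    queue.push(msg);
  } else {
    messageQueues.set(msg.sessionKey, [msg]);
  }
}

// Called when one session finishes a turn: only that session's queue is touched.
function dequeueNext(sessionKey: string): QueuedMessage | undefined {
  const queue = messageQueues.get(sessionKey);
  if (!queue) return undefined;
  const next = queue.shift();
  if (queue.length === 0) messageQueues.delete(sessionKey);
  return next;
}
```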
GAP 2: lastSessionCost is per-bot (runtime.ts)
File: lib/gate-runtime/runtime.ts lines 108, 704-705
BotRuntime.lastSessionCost is a single number. Cost is computed as turnCost = msg.total_cost_usd - bot.lastSessionCost. With concurrent sessions, each session has its own cumulative cost from Claude Code, but they share one lastSessionCost. Result: negative costs, wrong cost events.
Fix: Track lastSessionCost per runtimeSessionId: Map<number, number>.
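The per-session cost delta can be illustrated as below. `recordTurnCost` is a hypothetical helper; it assumes, as the paragraph above describes, that each session reports its own cumulative total and that a session with no previous entry starts from zero.

```typescript
// Last cumulative cost seen per runtimeSessionId.
const lastSessionCost = new Map<number, number>();

// Returns the cost of the turn that just finished for this session and
// remembers the new cumulative total.
function recordTurnCost(runtimeSessionId: number, totalCostUsd: number): number {
  const previous = lastSessionCost.get(runtimeSessionId) ?? 0;
  lastSessionCost.set(runtimeSessionId, totalCostUsd);
  return totalCostUsd - previous;
}
```

With a single shared `lastSessionCost`, interleaving session 1 (total 0.50) and session 2 (total 0.25) would yield 0.25 - 0.50 = -0.25 for session 2's turn; keyed per session, each delta is computed against that session's own previous total.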
GAP 3: Context tokens are per-bot (runtime.ts)
File: lib/gate-runtime/runtime.ts lines 109-113, 531-590
cumulativeInputTokens, cumulativeOutputTokens, contextWindow are fields on BotRuntime (shared). readContextFromConversation() reads one JSONL file and overwrites shared state.
Fix: Track context per session. Each BotRuntime (now per-session) carries its own context state -- this is solved by the rekey to Map<sessionKey, BotRuntime>.
GAP 4: activeProjectId is per-bot (message-capture.ts)
File: lib/message-capture.ts line 58
const activeProjectId = new Map<string, string>() keyed by botId only. Concurrent sessions could overwrite each other's project context.
Fix: Key by botId:chatSessionId composite.
GAP 5: Work rules refresh counter is per-bot
File: lib/work-rules-refresh.ts, called from lib/gate-runtime/index.ts line 363
incrementMessageCount(botId) increments a single counter per bot. Multiple sessions increment the same counter simultaneously, causing premature or skipped refreshes.
Fix: Accept chatSessionId parameter, track per-session. Or: accept per-bot counting as intentional (total messages across all sessions triggers refresh). Document the decision.
GAP 6: CLAUDE.md write collision (config.ts)
File: lib/gate-runtime/config.ts generateClaudeMd()
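All sessions for a bot write to the same CLAUDE.md in the shared workspace. With up to 5 sessions starting at once, overlapping writeFileSync() calls can leave a truncated or partially written file visible to a Claude Code process that reads it mid-write.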
Fix: Atomic write: write to temp file, then fs.renameSync() (atomic on same filesystem). Or: write once on first session, skip if already fresh (check hash/mtime).
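The first option (temp file plus rename) might look like this with Node's `fs` module. `writeFileAtomic` is a hypothetical helper, not the existing `generateClaudeMd()`; the temp-file naming scheme and the demo directory are assumptions. The temp file is created next to the target so the rename stays on one filesystem.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

let tempCounter = 0; // keeps temp names unique within this process

// Write to a sibling temp file, then rename it over the target. Readers see
// either the old file or the new one, never a partially written file.
function writeFileAtomic(targetPath: string, content: string): void {
  const dir = path.dirname(targetPath);
  const base = path.basename(targetPath);
  tempCounter += 1;
  const tempPath = path.join(dir, `.${base}.${process.pid}.${tempCounter}.tmp`);
  try {
    fs.writeFileSync(tempPath, content, "utf8");
    fs.renameSync(tempPath, targetPath);
  } catch (err) {
    fs.rmSync(tempPath, { force: true }); // do not leave temp files behind
    throw err;
  }
}

// Usage: two successive writes into a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "claude-md-"));
const demoFile = path.join(demoDir, "CLAUDE.md");
writeFileAtomic(demoFile, "# rules v1\n");
writeFileAtomic(demoFile, "# rules v2\n");
```

The second option from the fix (skip the write when the file is already fresh) avoids the write entirely and can be layered on top of this helper.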
GAP 7: bot-activity SSE is one state per bot
File: lib/ws-manager.ts broadcastBotStatus(), lib/bot-state.ts setBotWorking()
The dashboard and chat frontend receive one bot-activity event per bot with a single state. If Session 1 finishes (emits idle) while Session 2 is still working, the bot shows as idle.
Fix:
- `bot-activity` event includes `workingSessions: number` count
- `setBotWorking`/`clearBotWorking` check if ANY session is still working before emitting idle
- Dashboard bot card: show "2/5 sessions working" or similar aggregate
- Chat sidebar: per-session status icons (already planned)
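One way to picture the aggregate event: the bot-level state is derived from the set of working sessions, so idle is reported only when the last one finishes. The payload shape and the helper names (`markSessionWorking`, `markSessionDone`) are assumptions for this sketch; only the `workingSessions` field comes from the fix above.

```typescript
interface BotActivityEvent {
  botId: string;
  state: "working" | "idle";
  workingSessions: number;
}

// Working session keys per bot.
const workingByBot = new Map<string, Set<string>>();

function buildBotActivity(botId: string): BotActivityEvent {
  const count = workingByBot.get(botId)?.size ?? 0;
  return { botId, state: count > 0 ? "working" : "idle", workingSessions: count };
}

function markSessionWorking(botId: string, sessionKey: string): BotActivityEvent {
  const set = workingByBot.get(botId) ?? new Set<string>();
  set.add(sessionKey);
  workingByBot.set(botId, set);
  return buildBotActivity(botId);
}

// The returned event says "idle" only if no other session is still working.
function markSessionDone(botId: string, sessionKey: string): BotActivityEvent {
  const set = workingByBot.get(botId);
  if (set) {
    set.delete(sessionKey);
    if (set.size === 0) workingByBot.delete(botId);
  }
  return buildBotActivity(botId);
}
```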
GAP 8: Code Sessions have no session ownership
File: lib/drizzle/schema.ts code_sessions table, lib/code-monitor.ts, lib/code-session.ts
code_sessions table has bot_id but NO chat_session_id. The monitor pings sendMessageToBot(botId) which routes to whatever session is "current". With concurrent sessions, the ping hits the wrong session -- the user's active chat, not the session that started the CS.
Example failure: Session 1 starts a CS. User switches to Session 2. CS needs input. Monitor pings Session 2. Session 2 has no idea what the CS is about.
Fix:
- Add `chat_session_id INTEGER` to `code_sessions` table (nullable for backward compat)
- `startSession()` stores `chatSessionId` from the calling context
- `deliverMonitorPing()` reads `code_sessions.chat_session_id` and passes it to `sendMessageToBot()`
- Gate-runtime routes the message to the correct session process
Files to Modify (Complete List)
| File | Change | Gap |
|---|---|---|
| `lib/gate-runtime/runtime.ts` | Rekey Map by sessionKey, per-session queue, per-session cost/context | GAP 1, 2, 3 |
| `lib/gate-runtime/index.ts` | Route sendMessage by session, aggregate bot-level queries | |
| `lib/gate-runtime/bridge.ts` | Session-keyed stream accumulators | |
| `lib/gate-runtime/session-registry.ts` | Remove single-session enforcement | |
| `lib/gate-runtime/config.ts` | Atomic CLAUDE.md writes | GAP 6 |
| `lib/bot-state.ts` | Multi-session working state, aggregate bot status | GAP 7 |
| `lib/ws-manager.ts` | Include workingSessions count in bot-activity | GAP 7 |
| `lib/message-capture.ts` | Key activeProjectId by bot+session | GAP 4 |
| `lib/work-rules-refresh.ts` | Decide: per-session or per-bot counting | GAP 5 |
| `lib/code-session.ts` | Accept + store chatSessionId on CS creation | GAP 8 |
| `lib/code-monitor.ts` | Pass chatSessionId when pinging bot | GAP 8 |
| `lib/drizzle/schema.ts` | Add chat_session_id to code_sessions | GAP 8 |
| `api/code.ts` | Accept chatSessionId in start endpoint | GAP 8 |
| `api/bot-runtime.ts` | GET sessions shows per-session status + CS children | |
| `clients/chat/src/state.ts` | Add sessionMessages to BotState | |
| `clients/chat/src/chat-events.ts` | Route events by sessionKey | |
| `clients/chat/src/ui.ts` | Tree: Bot > Sessions > CS. Remove standalone sidebar sessions | |
| `clients/chat/src/code-sessions.ts` | Group CS under owning session, not flat under bot | |
| `clients/chat/src/search-modal.ts` | Add session + chat history search | |
| `clients/chat/src/history.ts` | Instant session switching from memory | |
| `clients/chat/src/messaging.ts` | Send with sessionKey, don't kill old session | |
| `clients/chat/index.html` | Remove #sidebarSessions, tree renders in bot list | |
| `clients/dashboard/src/stores/bot-store.ts` | Per-session state tracking (optional) | GAP 7 |