Process-tree correlation: matching VS Code terminal tabs to Claude Code sessions deterministically
Timing windows are how you lose data. Walking the process tree is how you do it deterministically. Here is the algorithm inside the Claude Recall VS Code extension (v0.7).
Here is the problem the Claude Recall VS Code extension solves. You open a terminal tab in VS Code. You type claude. Claude Code starts a session, writes a new JSONL to ~/.claude/projects/<encoded-cwd>/<uuid>.jsonl, and the session is live. Now: given a tab named "api: webhook work", which session UUID does that tab correspond to, right now, deterministically, even if four other tabs started Claude at roughly the same moment?
This is harder than it looks. The naive approaches all fail in normal usage. The correct approach walks the process tree. This post covers the internals of that algorithm, which shipped in the extension in v0.7.
Naive approach 1: latest session wins
"The most recently created JSONL is the one that belongs to the tab that just started Claude."
Fails immediately. If you open two tabs two seconds apart and type claude in both, you get two new JSONLs within a couple of seconds of each other. Which is which? Creation order does not tell you: startup latency varies, so the second tab's session can land first. Under any real concurrent workflow, this heuristic is a guess.
Naive approach 2: match on tab name
"The tab is named api; find the session whose cwd ends in api."
Fails for different reasons. Two tabs can have the same cwd (you might have two sessions going in the same repo deliberately). Tab names can be edited. And the cwd-to-session mapping is one-to-many by design, because the whole point of multiple sessions is that you can run more than one in the same project.
Naive approach 3: time-since-open
"The tab opened 3 seconds ago; any session created in the last 5 seconds belongs to it."
Fails because time windows are not identities. Two tabs opened within the same window claim the same sessions. A user closes a tab and reopens it, and the timing collides with activity in another tab. Any timing-based heuristic is correct until it is not, and when it is not, the wrong session gets aliased to the tab permanently.
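To make the collision concrete, here is a minimal sketch of a window-based matcher. The Tab and Session shapes, the claimByWindow name, and the timestamps are all made up for illustration; nothing here is taken from the extension.

```typescript
// Hypothetical shapes for illustration; not the extension's real types.
interface Tab { tabName: string; openedAtMs: number }
interface Session { uuid: string; createdAtMs: number }

// Naive approach 3: a tab claims every session created within
// `windowMs` after the tab opened.
function claimByWindow(tab: Tab, sessions: Session[], windowMs: number): string[] {
  return sessions
    .filter((s) => s.createdAtMs >= tab.openedAtMs && s.createdAtMs - tab.openedAtMs <= windowMs)
    .map((s) => s.uuid);
}

// Two tabs opened one second apart, each starting one session.
const tabs: Tab[] = [
  { tabName: "api", openedAtMs: 0 },
  { tabName: "web", openedAtMs: 1000 },
];
const sessions: Session[] = [
  { uuid: "session-a", createdAtMs: 2000 },
  { uuid: "session-b", createdAtMs: 3000 },
];

// With a 5-second window, both tabs claim both sessions.
const claims = tabs.map((t) => ({ tab: t.tabName, uuids: claimByWindow(t, sessions, 5000) }));
```

Shrinking the window does not fix this. It only trades double claims for missed claims when Claude Code starts slowly.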
The correct approach: walk the process tree
Every process on the system has a PID and a parent PID (PPID). The shell inside a VS Code terminal tab has a PID. When that shell runs claude, the Claude Code process gets the shell's PID as its parent PID. While Claude Code has a JSONL open for writing, the kernel knows which PID holds that file descriptor.
The chain is:
    VS Code terminal tab
            |
            v
    shell process (pid X)
            |
            v
    claude process (pid Y, ppid = X)
            |
            v
    JSONL file at ~/.claude/projects/<encoded-cwd>/<uuid>.jsonl
    (file descriptor held by pid Y)

If we can follow that chain backward from the JSONL to the tab, we have a deterministic mapping with no timing windows.
Here is the algorithm, shipped in correlator.ts:
    async function correlateJsonlToTab(jsonlPath: string): Promise<TabHandle | null> {
      // 1. Find the process holding the JSONL open.
      const writerPid = await lsofWriter(jsonlPath);
      if (writerPid === null) return null;

      // 2. Walk parents until we hit a shell PID that the extension
      //    has registered from VS Code's onDidOpenTerminal.
      let pid: number | null = writerPid;
      while (pid !== null && pid !== 1) {
        const tab = knownShellPids.get(pid);
        if (tab !== undefined) return tab;
        pid = await parentPid(pid);
      }
      return null;
    }

lsofWriter shells out to lsof (or the platform equivalent) to find the PID currently holding the JSONL open for writing. parentPid reads /proc/<pid>/stat on Linux, or runs ps -o ppid= -p <pid> on macOS. Neither call depends on time; both report the kernel's current view of the process table. Because the loop climbs one parent at a time, it still reaches the registered shell if any intermediate processes sit between the shell and Claude Code.
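The post does not show lsofWriter or parentPid, so here is a hedged sketch of just the parsing half of each. It assumes lsof is invoked in field-output mode (lsof -F pfa -- <path>) and that the Linux path reads the raw contents of /proc/<pid>/stat. The names parseLsofWriterPids and parsePpidFromProcStat are hypothetical, not the extension's.

```typescript
// Parse `lsof -F pfa -- <path>` output and return the PIDs that hold the
// file open for writing. Each field is on its own line, prefixed by a tag:
// p<pid>, f<fd>, a<access> where access is r (read), w (write), u (read/write).
function parseLsofWriterPids(output: string): number[] {
  const writers: number[] = [];
  let currentPid: number | null = null;
  for (const line of output.split("\n")) {
    const tag = line.charAt(0);
    const value = line.slice(1).trim();
    if (tag === "p") {
      const pid = parseInt(value, 10);
      currentPid = isNaN(pid) ? null : pid;
    } else if (tag === "a" && currentPid !== null && (value === "w" || value === "u")) {
      if (writers.indexOf(currentPid) === -1) writers.push(currentPid);
    }
  }
  return writers;
}

// /proc/<pid>/stat looks like "1234 (comm) S 567 ...". The comm field may
// itself contain spaces or parentheses, so split after the LAST ")".
function parsePpidFromProcStat(stat: string): number | null {
  const closeIdx = stat.lastIndexOf(")");
  if (closeIdx === -1) return null;
  const fields = stat.slice(closeIdx + 1).trim().split(/\s+/);
  // fields[0] is the process state; fields[1] is the parent PID.
  const ppid = parseInt(fields[1] ?? "", 10);
  return isNaN(ppid) ? null : ppid;
}
```

Keeping the parsers separate from the code that spawns lsof or reads /proc means they can be exercised with canned strings, with no live process needed.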
The other half: registering shells
For the walk to work, we need to know which shell PIDs belong to which VS Code tabs. The extension registers that at tab-open time by listening to VS Code's terminal lifecycle events:
    vscode.window.onDidOpenTerminal(async (term) => {
      // processId can resolve to undefined; skip terminals with no shell PID.
      const pid = await term.processId;
      if (pid === undefined) return;
      const tabName = term.name;
      await fetch('http://127.0.0.1:' + daemonPort + '/vscode/tab', {
        method: 'POST',
        // Declare the body as JSON so the daemon can parse it.
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ shell_pid: pid, tab_name: tabName }),
      });
    });

The daemon stores the {shell_pid, tab_name} pair in an in-memory map. When a new JSONL later appears and the correlator walks parents, the map lookup resolves the tab name. The extension also listens for onDidCloseTerminal and removes the entry. That keeps the map from accumulating dead PIDs over a long VS Code session and, because the OS recycles PIDs, keeps a stale entry from matching an unrelated process later.
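For a feel of how the map and the walk fit together, here is a small self-contained sketch. ShellRegistry, its method names, and the fake process table are assumptions made for this example; the daemon's real storage and endpoint handlers are not shown in this post.

```typescript
// Hypothetical in-memory registry; names are illustrative, not the daemon's API.
class ShellRegistry {
  private readonly tabsByShellPid = new Map<number, string>();

  register(shellPid: number, tabName: string): void {
    this.tabsByShellPid.set(shellPid, tabName);
  }

  // Called when the tab closes, so a recycled PID cannot match later.
  unregister(shellPid: number): void {
    this.tabsByShellPid.delete(shellPid);
  }

  // Climb from `startPid` toward PID 1 and return the first registered tab.
  // `parentOf` is injected so the walk can be tested without real processes.
  resolve(startPid: number, parentOf: (pid: number) => number | null): string | null {
    const seen = new Set<number>(); // guards against a cyclic fake table
    let pid: number | null = startPid;
    while (pid !== null && pid > 1 && !seen.has(pid)) {
      const tab = this.tabsByShellPid.get(pid);
      if (tab !== undefined) return tab;
      seen.add(pid);
      pid = parentOf(pid);
    }
    return null;
  }
}

// Fake process table: claude (4242) -> wrapper (4200) -> shell (4100) -> init (1).
const fakeParents: Record<number, number | undefined> = { 4242: 4200, 4200: 4100, 4100: 1 };
const fakeParentOf = (pid: number): number | null => fakeParents[pid] ?? null;

const registry = new ShellRegistry();
registry.register(4100, "api: webhook work");
const whileOpen = registry.resolve(4242, fakeParentOf);
registry.unregister(4100);
const afterClose = registry.resolve(4242, fakeParentOf);
```

In this sketch whileOpen resolves to the tab name even though a wrapper process sits between the shell and the writer, and afterClose is null once the tab's entry has been removed.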
Why the bind is 127.0.0.1
The extension POSTs to the daemon on the loopback interface. This is not a detail; it is a hard rule of the project. The daemon never binds 0.0.0.0, never binds a public IP, never listens on a LAN address. Loopback only. The correlator endpoint inherits that posture by default and cannot be reached from anywhere except the machine the extension is running on.
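One way to enforce a rule like this is a guard that runs before the server ever binds. The sketch below is an assumption about how such a check could look, not the daemon's actual startup code; isLoopbackIp and requireLoopback are hypothetical names.

```typescript
// Hypothetical startup guard; the daemon's real config handling is not shown here.
function isLoopbackIp(host: string): boolean {
  if (host === "::1") return true; // IPv6 loopback
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (m === null) return false;
  const octets = m.slice(1).map((o) => parseInt(o, 10));
  // The whole 127.0.0.0/8 block is loopback.
  return octets.every((o) => o <= 255) && octets[0] === 127;
}

// Returns the host unchanged if it is a loopback IP literal, otherwise throws.
function requireLoopback(host: string): string {
  if (!isLoopbackIp(host)) {
    throw new Error("refusing to bind non-loopback address: " + host);
  }
  return host;
}
```

The checked value would then be passed as the host argument when the server starts listening, so a misconfigured 0.0.0.0 fails loudly at startup instead of silently exposing the endpoint.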
What happens when the walk fails
Sometimes the walk returns null. This typically means Claude was started in a shell launched outside VS Code (a tmux session, a standalone iTerm window) and only later attached to a VS Code tab, so its shell PID was never registered. That is fine. The extension falls back to the non-deterministic heuristic ("most recent JSONL in this cwd") but marks the alias as low-confidence. The UI shows a tentative badge. The user can accept it or reject it, and the corrected mapping sticks.
This is the right failure mode: deterministic when it can be, clearly marked as a guess when it cannot, never silently wrong.
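The fallback described above can be sketched as a function that always reports how it reached its answer. The SessionFile and AliasResult shapes and the chooseSession name are assumptions for this example, as is using the last write time as the meaning of "most recent".

```typescript
// Hypothetical shapes; the extension's real types are not shown in this post.
interface SessionFile { uuid: string; lastWriteMs: number }
interface AliasResult { uuid: string; confidence: "high" | "low" }

// `walkedUuid` is the session whose process-tree walk reached this tab,
// or null if no walk did. `sameCwd` lists the JSONLs in the tab's cwd.
function chooseSession(walkedUuid: string | null, sameCwd: SessionFile[]): AliasResult | null {
  // Deterministic path: the walk succeeded, so trust it.
  if (walkedUuid !== null) return { uuid: walkedUuid, confidence: "high" };
  // Nothing to guess from.
  if (sameCwd.length === 0) return null;
  // Heuristic path: most recent JSONL in this cwd, flagged as tentative.
  const newest = sameCwd.reduce((a, b) => (b.lastWriteMs > a.lastWriteMs ? b : a));
  return { uuid: newest.uuid, confidence: "low" };
}

const candidates: SessionFile[] = [
  { uuid: "session-old", lastWriteMs: 1000 },
  { uuid: "session-new", lastWriteMs: 5000 },
];
const walked = chooseSession("session-old", candidates);
const guessed = chooseSession(null, candidates);
```

Here walked carries high confidence even though a newer file exists, while guessed picks session-new and is marked low so the UI can show its tentative badge.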
Takeaway
Process-tree correlation is the kind of thing that looks like over-engineering until the first time timing heuristics bite you. Once you are running ten terminal tabs and the aliases start going to the wrong sessions, the walk stops looking clever and starts looking mandatory.
The full shape of the extension is in the VS Code doc. The daemon's endpoint contract is documented under data. Install at /install and the correlator runs automatically on the first new terminal you open after enabling the extension.