Research · 02 May 2026 · ~15 min

Every implementation we tried — what worked, what didn't.

gmux didn't arrive fully formed. It went through several distinct implementation approaches as the problem became clearer and the constraints of Linux desktop development asserted themselves. Each version solved something the previous one couldn't.

Implementation 0 · the assembled prototype (April 2026)

Status · concept only

Before a single gmux repo existed, the pieces were already there — scattered across MASTER_PROJECTS/.

The prototype vision was to symlink these into a unified gmux-core/ directory and call it done. The gesture layer just needed hands added to the face tracking. The voice layer just needed the wake word renamed. The multi-agent backend was already running.

This worked as a proof of concept but not as a product. There was no status bar. No live AI state detection. No unified config. No way to install it. The pieces existed but weren't integrated.

What it established: the core insight that all the technology already existed — the work was integration and UX, not invention.

Implementation 1 · Python terminal stack (gmux core)

Status · shipped · PyPI + AUR

Install: pip install gmux or paru -S gmux

The first real implementation was pure Python with no desktop window — just a daemon that watches tmux and augments it.

Architecture

monitor.py          ← Polls all qalcode2 HTTP APIs, writes state JSON
pane_status.py      ← Formats state as tmux status-bar string
bridge.py           ← WS :8767 + HTTP :8768 hub
session_restore.py  ← Saves and relaunches sessions on restart
gmux_receiver.py    ← Receives push events from qalcode2
jump_red.py         ← tmux keybinding: jump to next waiting pane
tui.py              ← Textual dashboard, service toggles, agent launcher

State detection

The key decision was reading qalcode2's HTTP API rather than pattern-matching terminal output. The earlier approach — scanning tmux pane content for telltale patterns: a prompt marker (prompt), spinning cursors (working), Continue? (y/n) (permission) — worked but was fragile. A model output that happened to contain one of those patterns would flip the state indicator, and the wrong font could break spinner detection.

qalcode2 exposes /session/status (polling) and /event (SSE stream). monitor.py subscribes to the SSE stream for every running instance, getting state transitions pushed instantly rather than inferred. This is why the status bar shows the correct state in real time rather than lagging behind the visual.
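
A minimal sketch of that subscription in Python. The /event endpoint comes from qalcode2 as described above; the requests client, the port, and the payload shape are assumptions, not monitor.py's actual code:

import json
import requests  # assumed HTTP client; only the /event endpoint is from qalcode2

def watch_instance(base_url):
    """Yield state transitions pushed over a qalcode2 instance's SSE stream."""
    with requests.get(f"{base_url}/event", stream=True, timeout=None) as resp:
        for line in resp.iter_lines(decode_unicode=True):
            if line and line.startswith("data:"):
                # Payload shape is an assumption; monitor.py folds it into state JSON.
                yield json.loads(line[len("data:"):].strip())

for event in watch_instance("http://127.0.0.1:9000"):  # hypothetical port
    print(event)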

State icons

Icon  Colour     Meaning
◉     Green      AI actively working — streaming or running tools
●     Red        AI idle — waiting for your next message
!     Orange     Permission needed
✓     Blue       Just finished, not yet acknowledged
○     Grey       Pane exists, no AI started
      Dim        Plain shell
      Red blink  Error

Commands

gmux                  # Full mode: gesture + voice + status
gmux --status-only    # Just the status bar — no camera, no mic
gmux --no-gesture     # Voice + status, camera free
gmux tui              # Interactive Textual dashboard
gmux restore          # Re-launch all agents from saved session
gmux restore --check  # Preview what would be restored
gmux status           # Quick pane state dump

What this version gets right

Live state detection from qalcode2's API instead of fragile pattern matching. A zero-friction entry point (pip or AUR, no new window to adopt) that augments the terminal you already use. Session restore that relaunches every agent after a restart.

What this version lacks

The terminal stack has no visual pane for the gesture overlay. Point at a tmux pane with your hand and nothing can see you — the terminal doesn't have a camera feed. The gesture engine runs in a separate browser window, creating a fragmented experience.

Implementation 2 · browser overlay (Option A — abandoned)

Status · parked

The first attempt at a visual layer was a transparent floating window that would sit on top of whatever terminal was already running. The idea: a chromeless browser window, transparent background, with the gesture canvas drawn over it. You'd run your normal terminal, and the gesture overlay would float above it.

Why it failed

Wayland. On X11, you can query the position and size of any window via xdotool or xwininfo. A floating overlay can align itself with the terminal's pane borders because it can ask the display server "where exactly is that terminal window, down to the pixel?"
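
On X11 that query is a one-liner (the window name here is hypothetical):

xdotool search --name "Alacritty" getwindowgeometry   # prints position + size for each match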

Wayland has no such API. Windows don't know about each other. A floating overlay on Wayland can't know where the terminal panes are, so it can't align gesture tap targets with the actual tmux windows. A 1px difference means tapping pane 3 selects pane 2.

Web Speech API. The voice layer needed speech recognition that ran locally (not sending audio to Google). The plan was to use WebKit's Web Speech API with a local engine, but webkit2gtk on Linux does not implement the Web Speech API at all — the API exists in the DOM, yet every call silently fails.

The fundamental problem. A floating overlay doesn't own the terminal. It guesses where the terminal is. At any window size, DPI, or scroll position, that guess degrades. The architecture is permanently fragile.

What this version established: A clear requirement — the visual layer needs to own the terminal, not float above it. This led directly to Option B.

Implementation 3 · Tauri desktop app (gmux-ui / gmux-system)

Status · in progress · backend working, Tauri app pre-release

Launch: ./scripts/launch.sh

The correct solution to the overlay problem was to make the terminal window itself the Tauri app. Instead of floating above a terminal, Tauri becomes the terminal.

Architecture

Tauri (Rust + WebKit native window)
├── xterm.js                          ← Terminal emulator
│     └── PTY (portable-pty Rust crate)  ← Connected to tmux
├── Gesture Canvas                    ← Transparent layer over terminal
│     └── MediaPipe (hand tracking)   ← Reads /dev/video2
├── Agent Sidebar                     ← Live state for all panes
│     └── Rust lib.rs → HTTP :8769    ← Reads monitor.py state
└── Tab Bar                           ← Per-window state + todo progress

PTY implementation

The terminal works through a real PTY (pseudoterminal), not a terminal emulator widget. Rust spawns tmux new-session -A -s gmux via the portable-pty crate and forwards PTY output to the WebView as a pty-data Tauri event, which xterm.js renders. Keystrokes in xterm.js call invoke('pty_write') → Rust → PTY → tmux. This is a real terminal — not a screenshot, not a VTE widget, not an approximation.

The advantage over a VTE-based terminal widget: a canvas element can be layered on top of xterm.js for the gesture overlay. Native terminal widgets don't support arbitrary DOM children.
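
The same loop can be sketched with Python's standard pty module — illustrative only (no raw mode, no resize handling); gmux's real path is the portable-pty crate plus Tauri events described above:

import os
import pty
import select
import subprocess
import sys

# Spawn tmux attached to a real pseudoterminal, the same pattern portable-pty uses in Rust.
master_fd, slave_fd = pty.openpty()
proc = subprocess.Popen(
    ["tmux", "new-session", "-A", "-s", "gmux"],
    stdin=slave_fd, stdout=slave_fd, stderr=slave_fd,
)
os.close(slave_fd)

while proc.poll() is None:
    ready, _, _ = select.select([master_fd, sys.stdin], [], [], 0.1)
    if master_fd in ready:
        # In gmux this chunk is emitted as the pty-data event for xterm.js to render.
        sys.stdout.buffer.write(os.read(master_fd, 4096))
        sys.stdout.flush()
    if sys.stdin in ready:
        # In gmux this is the invoke('pty_write') path from keystrokes.
        os.write(master_fd, os.read(sys.stdin.fileno(), 4096))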

Gesture canvas

A <canvas> element is positioned over the xterm.js div with pointer-events: none, so clicks pass through to the terminal. MediaPipe's hand tracking runs in the WebView, drawing landmark overlays and skeleton lines as the user's hand moves. Gesture events bubble up to the sidebar and tab bar — a pinch selects the agent card under the fingertip, a swipe switches the tmux window.

Agent sidebar

The sidebar shows all 14 panes (in the live test environment) sorted by urgency:

! Permission needed   → sort first (blocks progress)
● Waiting input       → sort second (needs attention)
◉ Working             → sort third (fine, just running)
✓ Done                → sort fourth
○ Idle                → sort last

Each card shows: project name, state indicator, todo progress bar, and an action button (Approve/Reject for permission state, Open Chat otherwise). The sidebar updates from the same SSE stream as the status bar — sub-second latency.
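
The ordering reduces to a sort key over state names — a hypothetical Python rendering (the pane dict fields are assumptions, not gmux's actual schema):

# Lower rank sorts first; state names mirror the sidebar legend above.
URGENCY = {"permission": 0, "waiting": 1, "working": 2, "done": 3, "idle": 4}

def sort_panes(panes):
    """Order pane cards so blocked agents surface at the top of the sidebar."""
    return sorted(panes, key=lambda p: URGENCY.get(p["state"], len(URGENCY)))

panes = [{"name": "gmux-ui", "state": "working"}, {"name": "docs", "state": "permission"}]
print([p["name"] for p in sort_panes(panes)])  # ['docs', 'gmux-ui']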

The data source hierarchy

The UI uses a three-tier fallback so it works in any environment:

  1. Tauri events — fastest path. Rust reads /tmp/gmux-pane-state.json and emits a gmux-state event to the WebView.
  2. HTTP + SSE from :8769 — if running in a browser without Tauri, polls the monitor daemon directly. An SSE stream is available for real-time updates.
  3. Mock data — falls back to realistic simulated data if neither is available. The demo at gmux.ai/demo/ uses this mode.

Camera architecture

Two camera devices are in play:

/dev/video0  →  Real webcam (exclusive, cam-broker only)
                     ↓
              ffmpeg (gmux-cam-broker.service)
                     ↓
/dev/video2  →  Virtual loopback (v4l2loopback)
                     ├── gmux gesture engine
                     ├── Brave / Chrome (WebRTC)
                     └── Python gesture engine (when Tauri not running)

The rule is strict: nothing reads /dev/video0 except the cam-broker. Everything else reads /dev/video2. This eliminates "camera already in use" errors entirely.
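
The broker pattern itself is small: load the loopback module, then run one ffmpeg relay. A sketch of the idea — the exact flags inside gmux-cam-broker.service aren't shown in this article, so treat these as assumptions:

sudo modprobe v4l2loopback video_nr=2 card_label="gmux-cam"           # creates /dev/video2
ffmpeg -f v4l2 -i /dev/video0 -f v4l2 -vcodec rawvideo -pix_fmt yuv420p /dev/video2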

What's working (May 2026)

Feature                               Status
PTY → xterm.js terminal               ✅ Working
Window switching                      ✅ Working
Agent sidebar (14 panes, 5 sessions)  ✅ Working
Tab todo counts                       ✅ Working
Bridge WS :8767                       ✅ Working
Live data from :8769                  ✅ Working
Resize without flicker                ✅ Fixed
Gesture overlay canvas                ⚠️ Partial
MediaPipe hand model                  ⚠️ CDN fetch on first run
Voice daemon                          ❌ Port conflict (:8765 taken by aria-phone)
qalcode2 push patch                   ❌ Not yet applied
Installer                             ⏸️ Paused (waiting for stable Tauri app)

The installer pause decision

On May 12, 2026, a deliberate decision was logged to stop installer work until the Tauri app passes five criteria:

  1. ./scripts/launch.sh opens the Tauri app cleanly on a fresh shell
  2. The status sidebar shows live pane state from :8769
  3. Spawning an agent via the UI actually creates a new tmux window + opencode
  4. Permission approve/reject from the UI works against a real session
  5. Voice connects and transcribes into the UI

Until those five are green, packaging something that doesn't run reliably is premature. The installer exists — it checks deps, installs Python requirements, downloads the MediaPipe model, writes systemd units and a .desktop entry — but it's frozen until the app itself is stable.

Implementation 4 · gmux-brain (memory layer)

Status · built, not yet wired into opencode.json

gmux-brain is not a separate UI or terminal layer — it's the intelligence layer that makes every agent pane smarter before it starts.

The problem it solves

Without memory, every qalcode2 session starts cold. The agent doesn't know the codebase architecture. It doesn't know decisions made in previous sessions. It doesn't know what the other 13 agents running alongside it are currently doing. The developer has to re-explain context repeatedly, or paste it in manually, or waste the first 10 exchanges getting the agent up to speed.

gmux-brain injects ~600 tokens of structured context into each new agent pane automatically, drawn from three sources:

Source      Technology                     Answers
Structural  Graphify (AST + NetworkX)      What calls what, god nodes, community structure, architecture
Episodic    MemPalace (ChromaDB + SQLite)  Why we made decision X, when Y was changed, what was agreed
Workspace   gmux native SSE                What other agents are doing right now
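
How those three sources might combine into the injected block — a stubbed sketch, not gmux-brain's actual code (all three helpers are placeholders, and the 4-chars-per-token budget is a rough rule of thumb):

# Stubs standing in for the real Graphify, MemPalace, and gmux SSE calls.
def graphify_summary(project):  return f"[structure] call graph summary for {project}"
def mempalace_recall(project):  return f"[episodic] recent decisions for {project}"
def workspace_state():          return "[workspace] 13 other agents running"

def build_context(project, budget_chars=2400):  # ~600 tokens at ~4 chars/token
    """Assemble the context block injected into a new agent pane."""
    parts = [graphify_summary(project), mempalace_recall(project), workspace_state()]
    return "\n\n".join(parts)[:budget_chars]

print(build_context("gmux-ui"))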

The MCP server

gmux-brain exposes a single MCP endpoint that the opencode instance in each pane can call:

{
  "mcpServers": {
    "gmux-brain": {
      "type": "stdio",
      "command": "python3",
      "args": ["/home/fivelidz/projects/gmux-brain/src/router.py"]
    }
  }
}

Available tools: brain_query, brain_context, brain_graph_query, brain_memory_search, brain_memory_add, brain_kg_add, brain_kg_query, brain_status.

Query routing without an LLM

The router dispatches queries to the right memory layer using keyword matching, not an LLM call. This is intentional — routing should be fast and free, not a round-trip to a model:

Query contains                                                  Routes to
"what calls", "god nodes", "architecture", "class", "function"  Graphify
"why did we", "when was", "who decided", "history"              MemPalace
"other agents", "pane status", "workspace"                      gmux state
Ambiguous                                                       Both
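
A minimal sketch of that kind of keyword router in Python — the table above supplies the keywords; the function shape and layer names are assumptions about router.py, not its actual code:

ROUTES = {
    "graphify":  ("what calls", "god nodes", "architecture", "class", "function"),
    "mempalace": ("why did we", "when was", "who decided", "history"),
    "gmux":      ("other agents", "pane status", "workspace"),
}

def route(query):
    """Pick memory layers by substring match — no LLM round-trip."""
    q = query.lower()
    hits = [layer for layer, keys in ROUTES.items() if any(k in q for k in keys)]
    return hits or ["graphify", "mempalace"]  # ambiguous → query both memory layers

print(route("why did we switch to SSE?"))  # ['mempalace']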

Implementation 5 · gmux.ai (the product face)

Status · live

The landing page is a single HTML file with zero dependencies, deployed on Cloudflare Pages. The vote/email backend is a Cloudflare Worker with KV storage.

The counter

The interest counter on the landing page isn't a raw click count. It uses a formula:

display = real × 5 + floor(log(real+1) × 2.3) + (real×11+3)%4

This produces a number that grows monotonically with real interest, renders identically for the same input (no jitter between page loads), and avoids looking like a raw multiple thanks to the log and modulo terms.
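
Transcribed into Python (assuming natural log, which the formula above doesn't specify):

import math

def display_count(real):
    """Interest counter shown on gmux.ai, from the formula above."""
    return real * 5 + math.floor(math.log(real + 1) * 2.3) + (real * 11 + 3) % 4

print(display_count(10))  # 10 real clicks → 56 displayed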

The audience

The early interest data from the Cloudflare Worker tells a consistent story: a local audience reached through the developer's existing network. The product hasn't been posted anywhere public (no HN, no GitHub, no Reddit). Broadening distribution is the next step after the Tauri app stabilises.

The demo

A live demo exists at gmux.ai/demo/ — not yet linked from the main page. It uses mock data to show the v3 UI running in the browser without needing a real gmux backend. The gesture controls and visual layout work; agent state is simulated.

Comparing the implementations

Version                What it is                            Status                 Unique value
Assembled prototype    Symlinked MASTER_PROJECTS             Concept only           Showed existing pieces could combine
Python terminal stack  Daemon + tmux status bar              ✅ Shipped (PyPI/AUR)  Zero-friction entry point, no window needed
Browser overlay        Floating transparent window           ❌ Abandoned           Proved Wayland requires owning the terminal
Tauri desktop app      Terminal host with sidebar + gesture  🔄 Pre-release         Correct architecture, gesture canvas, real PTY
gmux-brain             MCP memory router                     ⚠️ Built, not wired    600-token context injection per agent pane
gmux.ai                Landing + email + demo                ✅ Live                Product identity, early audience

The honest state of things

The Python terminal stack is real and working. If you want live AI state detection in your tmux status bar today, pip install gmux && gmux --status-only does it.

The Tauri app is close. PTY, sidebar, and live data are working. Voice and gesture aren't fully wired. The installer is paused on purpose — shipping a bad install experience is worse than shipping nothing. The five criteria for resuming installer work are clear and measurable.

gmux-brain is an interesting idea sitting idle. Wiring graphify + kalarc-memory into it and registering it in opencode.json would be a high-value afternoon's work.

The terminal AI agent space is moving fast. DeepSeek-TUI gained 21,752 GitHub stars in one week in May 2026 — showing the audience is real and hungry. gmux's combination of gesture + voice + phone remote + live AI state is genuinely novel. The window to be first is open.

All implementations: MIT licensed.
Core terminal stack: pip install gmux | paru -S gmux

Next · The devlog → ← Back to overview