# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Overview

JARVIS is an Electron-based desktop voice assistant with a particle animation UI. It listens for the wake word "Jarvis", captures a voice command, sends it to the Claude CLI for processing (with Bash tool access), and speaks the response aloud using macOS TTS.

No build step, no TypeScript, no bundler: vanilla JavaScript + Electron.

## Running the App

```bash
npm install   # first time only
npm start     # launches Electron
```

The app requires:

- macOS (uses the `say` command for TTS)
- Claude CLI installed and accessible in PATH (`claude` command)
- `whisper-cpp` installed (`brew install whisper-cpp`), which provides `whisper-server`
- A GGML model at `~/whisper-models/ggml-base.bin` (override with `JARVIS_WHISPER_MODEL` env var). Download: `curl -L -o ~/whisper-models/ggml-base.bin https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin`

## Architecture

The app has three layers:

**`main.js`: Electron main process**

- Starts a local HTTP server on port 52736 (serves the renderer; required for `getUserMedia` to work in a secure context, not `file://`)
- Spawns `whisper-server` on port 52737 at startup (loads the GGML model once; killed on `before-quit`)
- Handles IPC: `askClaude`, `speak`, `stopSpeak`, `showContextMenu`, `whisperUrl`
- Calls the Claude CLI via `execFile` with `--model opus --allowed-tools Bash --dangerously-skip-permissions`, `cwd` set to `$HOME`
- Maintains conversation history (max 20 messages) in memory
- Detects French vs. English in responses to pick the `say` voice (Thomas vs. Alex)

**`preload.js`: context bridge**

- Exposes `window.jarvis.{askClaude, speak, stopSpeak, showContextMenu, whisperUrl}` with context isolation.

**`renderer.js`: UI + voice pipeline**

- `JarvisVisualizer`: Canvas-based particle animation. States: idle (cyan), listening (green), thinking (amber), speaking (light blue).
- `AudioPipeline`: `getUserMedia` → `AudioContext({sampleRate: 16000})` → `ScriptProcessorNode` delivers Float32 frames.
- `encodeWAV` / `transcribe`: encodes Float32 PCM to a 16-bit WAV Blob and POSTs it to `http://127.0.0.1:52737/inference` (multipart `file`, `response_format=text`).
- `JarvisController`: state machine (`idle`/`listening`/`thinking`/`speaking`) driving the full pipeline.

**Note**: We do NOT use `webkitSpeechRecognition`. It is broken in Electron (missing Google API key → `network` error). All STT goes through the local whisper-server.

## Voice Pipeline Flow

1. **Idle**: every 1.2s, take the last 2.2s from a rolling 3s ring buffer; if the RMS is above the floor, POST to whisper-server. If the transcription matches `/\bjarvis\b/i` → enter listening with the trailing text as the inline command.
2. **Listening**: accumulate Float32 frames into `cmdChunks`. Per-frame RMS drives a VAD: after speech onset, 1.5s of silence (or 12s max) → finalize.
3. **Finalize**: concat chunks, encode WAV, transcribe once, combine with the inline command → Claude CLI → `say`.
4. Audio capture is gated off during `thinking`/`speaking` so JARVIS never hears its own voice. The ring buffer is cleared on entry to listening and on return to idle.

## Key Constants (in `main.js`)

- Claude model alias: `opus`
- CLI timeout: 120s, output buffer: 2MB
- Conversation history cap: 20 items
- Local HTTP server port: 52736
- Whisper server port: 52737
- Whisper model path: `$JARVIS_WHISPER_MODEL` or `~/whisper-models/ggml-base.bin`

## Making Changes

- **System prompt / personality**: Edit `buildPrompt()` in `main.js`
- **Claude model or CLI flags**: Edit the `execFile` call in `askClaude()` in `main.js`
- **Wake word or silence timeout**: Edit `_startWakeLoop()` / `_listenContinuous()` in `renderer.js`
- **Visual states or animation**: Edit `JarvisVisualizer` in `renderer.js`
- Restart `npm start` after any change to see the effect (no hot reload)
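
For orientation when editing the voice pipeline, the idle-path steps described under Voice Pipeline Flow (RMS gate, wake-word match, WAV encoding) could look roughly like the sketch below. It is a minimal illustration, not the code in `renderer.js`. The helper names `rms` and `matchWakeWord`, the `RMS_FLOOR` value, and the exact `encodeWAV` signature are assumptions made for the example. Only the 16 kHz sample rate, the 16-bit WAV format, and the `/\bjarvis\b/i` pattern come from the description above.

```javascript
// Illustrative sketch only: names and thresholds are assumptions,
// not copied from renderer.js.

const SAMPLE_RATE = 16000; // matches AudioContext({sampleRate: 16000})
const RMS_FLOOR = 0.01;    // hypothetical silence floor; the real value may differ

// Root-mean-square level of a Float32 frame (0 for an empty frame).
function rms(samples) {
  if (samples.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < samples.length; i++) sum += samples[i] * samples[i];
  return Math.sqrt(sum / samples.length);
}

// Returns null when the wake word is absent; otherwise the text that
// follows it (possibly ''), with leading punctuation and spaces removed.
function matchWakeWord(text) {
  const m = /\bjarvis\b/i.exec(text);
  if (!m) return null;
  return text.slice(m.index + m[0].length).replace(/^[\s,.!?:;-]+/, '').trim();
}

// Encode mono Float32 PCM in [-1, 1] as a 16-bit little-endian WAV file.
// Returns an ArrayBuffer: 44-byte header followed by the sample data.
function encodeWAV(samples, sampleRate = SAMPLE_RATE) {
  const dataBytes = samples.length * 2;
  const buffer = new ArrayBuffer(44 + dataBytes);
  const view = new DataView(buffer);
  const writeTag = (offset, tag) => {
    for (let i = 0; i < tag.length; i++) view.setUint8(offset + i, tag.charCodeAt(i));
  };
  writeTag(0, 'RIFF');
  view.setUint32(4, 36 + dataBytes, true);  // remaining file size
  writeTag(8, 'WAVE');
  writeTag(12, 'fmt ');
  view.setUint32(16, 16, true);             // fmt chunk size
  view.setUint16(20, 1, true);              // format: PCM
  view.setUint16(22, 1, true);              // channels: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * 2, true); // byte rate
  view.setUint16(32, 2, true);              // block align
  view.setUint16(34, 16, true);             // bits per sample
  writeTag(36, 'data');
  view.setUint32(40, dataBytes, true);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp before scaling
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}
```

In the renderer, the returned buffer would typically be wrapped as `new Blob([encodeWAV(samples)], { type: 'audio/wav' })` and appended to a `FormData` under `file`, alongside `response_format=text`, before the POST to whisper-server. A window would only be sent when `rms(samples) > RMS_FLOOR`.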