How It Works¶
cc-vox is a Claude Code plugin that uses the hook system to inject voice feedback into every conversation turn. The entire pipeline is hands-free — once installed, Claude automatically includes voice summaries.
The Pipeline¶
sequenceDiagram
participant U as User
participant C as Claude Code
participant H1 as UserPromptSubmit Hook
participant H2 as PostToolUse Hook
participant H3 as Stop Hook
participant S as scripts/say
participant T as TTS Backend
U->>C: Types a message
C->>H1: Hook fires
H1-->>C: Injects 📢 reminder into system prompt
C->>C: Claude generates response
Note over C: Includes 📢 summary at end
opt If tools are used
C->>H2: After each tool call
H2-->>C: Brief reminder to keep 📢 in mind
end
C->>H3: Response complete (Stop event)
H3->>H3: Extract 📢 marker from response
alt 📢 marker found
H3->>S: Speak marker text
else Response is short
H3->>S: Speak response directly
else Long response, no marker
H3->>H3: Call headless Claude for summary
H3->>S: Speak generated summary
else Last resort
H3->>S: Speak truncated response
end
S->>S: Select backend & acquire lock
S->>T: Generate audio
T-->>S: WAV audio data
S->>S: Play audio
The Three Hooks¶
1. UserPromptSubmit — Inject Reminder¶
When: Every time the user sends a message.
The hook reads ~/.claude/cc-vox.toml and injects a system message telling Claude to include a 📢 voice summary at the end of its response. This reminder includes:
- The max word limit for summaries
- Style instructions (match user's tone, avoid technical identifiers)
- Any custom personality prompt
2. PostToolUse — Brief Nudge¶
When: After each tool call (file reads, edits, bash commands, etc.).
In long tool-heavy responses, Claude can lose track of the voice summary instruction. This hook injects a brief reminder to keep the 📢 summary in mind.
3. Stop — Extract & Speak¶
When: Claude finishes its response.
This hook runs the 4-strategy summarization cascade:
| Strategy | Speed | When |
|---|---|---|
1. Extract 📢 marker |
Instant | Claude included a 📢 line |
| 2. Speak directly | Instant | Response is short enough (<=max_sentences) |
| 3. Headless Claude | ~3--5s | Calls claude -p to generate a summary |
| 4. Truncate | Instant | Last resort — truncate the response |
The summary is passed to scripts/say, which selects a TTS backend, generates audio, and plays it.
Audio Playback¶
The say script handles:
- Backend selection — auto-detect or forced, with fallback
- Playback locking — file-based mutex prevents overlapping audio
- Audio player detection — prefers
ffplay(streaming), falls back toaplay,paplay, orafplay - Session state — sentinel files in
/tmp/so the stop hook knows TTS status