# Voice Loop Integration Plan

Created: 2026-03-07 | Status: Planning
## Current State

### What's Working
| Component | Status | Port/Location |
|---|---|---|
| Wyoming STT (Whisper large-v3) | ✅ Running | tcp://0.0.0.0:10300 |
| Wyoming TTS (Kokoro ONNX) | ✅ Running | tcp://0.0.0.0:10301 |
| openWakeWord daemon | ✅ Running | Detects "hey_jarvis" |
| OpenClaw Gateway | ✅ Running | ws://127.0.0.1:8080 |
| Home Assistant | ✅ Running | http://10.0.0.199:8123 |
| Ollama | ✅ Running | http://localhost:11434 |
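The table above can be re-verified at any time with a small TCP probe. This sketch assumes the Wyoming/OpenClaw/Ollama services run on the local machine and HA on `10.0.0.199`; adjust hosts as needed:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Hosts/ports taken from the status table; local hostnames are assumptions.
SERVICES = {
    "Wyoming STT": ("127.0.0.1", 10300),
    "Wyoming TTS": ("127.0.0.1", 10301),
    "OpenClaw Gateway": ("127.0.0.1", 8080),
    "Home Assistant": ("10.0.0.199", 8123),
    "Ollama": ("127.0.0.1", 11434),
}

if __name__ == "__main__":
    for name, (host, port) in SERVICES.items():
        status = "up" if port_open(host, port) else "DOWN"
        print(f"{name:18s} {host}:{port}  {status}")
```

Note this only confirms something is listening, not that it speaks the Wyoming protocol.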
### What's Broken

- **Wake word → STT gap**: `wakeword_daemon.py` detects the wake word but does NOT capture the subsequent audio command
- **OpenClaw `/wake` endpoint**: Returns 404; OpenClaw doesn't expose this endpoint by default
- **No audio routing**: After wake word detection, there's no mechanism to stream audio to STT and route the transcript to OpenClaw
## Architecture Options

### Option A: Full Wyoming Satellite (Recommended)

Replace the custom `wakeword_daemon.py` with a proper Wyoming satellite that handles the full pipeline.
```mermaid
flowchart LR
    A[USB Mic] --> B[Wyoming Satellite]
    B -->|wake detected| B
    B -->|audio stream| C[Wyoming STT :10300]
    C -->|transcript| B
    B -->|text| D[OpenClaw via WebSocket]
    D -->|response| E[Wyoming TTS :10301]
    E -->|audio| F[Speaker]
```
**Pros:**

- Standard Wyoming protocol - works with Home Assistant
- Handles wake word, VAD, audio streaming, and response playback
- Can be used with ESP32 satellites later

**Cons:**

- Need to write a custom satellite or use the `wyoming-satellite` package
- More complex than a simple script
### Option B: Enhanced Wake Word Daemon

Extend `wakeword_daemon.py` to capture audio after the wake word and send it to STT.
```mermaid
flowchart LR
    A[USB Mic] --> B[wakeword_daemon.py]
    B -->|wake detected| B
    B -->|record 5s| B
    B -->|audio file| C[Wyoming STT :10300]
    C -->|transcript| B
    B -->|POST /agent| D[OpenClaw]
    D -->|response| E[TTS via Wyoming]
    E -->|audio| F[Speaker via afplay]
```
**Pros:**

- Simpler to implement
- Keeps existing code

**Cons:**

- Not standard Wyoming protocol
- Won't integrate with Home Assistant voice pipelines
- Harder to extend to ESP32 satellites
### Option C: Home Assistant as Hub

Use Home Assistant's voice pipeline as the central coordinator.
```mermaid
flowchart LR
    A[USB Mic] --> B[HA Wyoming Integration]
    B -->|wake| C[HA Voice Pipeline]
    C -->|audio| D[Wyoming STT :10300]
    D -->|text| C
    C -->|intent| E[HA Conversation Agent]
    E -->|response| C
    C -->|text| F[Wyoming TTS :10301]
    F -->|audio| G[Speaker]
```
**Pros:**

- Leverages existing HA infrastructure
- Works with Assist UI and mobile app
- Easy to add ESP32 satellites

**Cons:**

- OpenClaw not directly in the loop (HA handles conversation)
- Need to configure HA as conversation agent or use custom component
## Recommended Approach: Hybrid (Option A + C)

1. **Phase 1: Get HA voice pipeline working first (Option C)**
   - Register Wyoming STT/TTS in Home Assistant
   - Create voice assistant pipeline
   - Test via HA Assist UI
2. **Phase 2: Write Wyoming satellite for Mac Mini (Option A)**
   - Use the `wyoming-satellite` package or write a custom one
   - Connects to HA as a satellite device
   - Handles local wake word + audio capture
3. **Phase 3: Wire OpenClaw as custom conversation agent**
   - Create HA custom component or use REST command
   - Route conversation to OpenClaw instead of HA default
## Implementation Plan

### Phase 1: Home Assistant Voice Pipeline (Sprint 1 completion)

- Open HA UI → Settings → Integrations → Add Integration → Wyoming Protocol
- Add STT provider: host `10.0.0.199`, port `10300`
- Add TTS provider: host `10.0.0.199`, port `10301`
- Create Voice Assistant pipeline in HA
- Test via HA Assist panel (type a query, hear response)
### Phase 2: Wyoming Satellite for Mac Mini

- Install the `wyoming-satellite` package in `homeai-voice-env`
- Configure satellite with openWakeWord integration
- Write launchd plist for satellite service
- Register satellite in Home Assistant
- Test full wake → STT → HA → TTS → playback cycle
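The launchd plist for the satellite service might look like the following sketch. The label, venv path, and CLI flags are assumptions to verify against the actual `wyoming-satellite` package (10700 is the conventional satellite port, and macOS will additionally need mic/speaker command plumbing):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <!-- Hypothetical label and paths; adjust to the real homeai-voice-env location -->
  <key>Label</key>
  <string>com.homeai.wyoming-satellite</string>
  <key>ProgramArguments</key>
  <array>
    <string>/path/to/homeai-voice-env/bin/python</string>
    <string>-m</string>
    <string>wyoming_satellite</string>
    <string>--name</string>
    <string>Mac Mini Living Room</string>
    <string>--uri</string>
    <string>tcp://0.0.0.0:10700</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>KeepAlive</key>
  <true/>
</dict>
</plist>
```

Load with `launchctl load -w` as with the existing STT/TTS services.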
### Phase 3: OpenClaw Integration

- Research OpenClaw's WebSocket API for sending messages
- Write HA custom component or automation to route to OpenClaw
- Configure OpenClaw to respond via TTS
- Test full voice → OpenClaw → action → response flow
## Technical Details

### Wyoming Satellite Configuration
```yaml
# Example satellite config
satellite:
  name: "Mac Mini Living Room"
  wake_word:
    model: hey_jarvis
    threshold: 0.5
  stt:
    uri: tcp://localhost:10300
  tts:
    uri: tcp://localhost:10301
  audio:
    input_device: default
    output_device: default
```
### OpenClaw WebSocket API

Based on the gateway health check, OpenClaw uses WebSocket for communication:

- URL: `ws://127.0.0.1:8080`
- Auth: Token-based (`gateway.auth.token` in config)
- Agent: `main` (default)

To send a message:

```bash
openclaw agent --message "Turn on the lights" --agent main
```
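From Python, the same CLI call can be wrapped with `subprocess`. This is a sketch; it assumes `openclaw agent` prints the reply text to stdout, which needs verifying:

```python
import subprocess

def build_openclaw_cmd(message: str, agent: str = "main") -> list[str]:
    """Build the CLI invocation shown above."""
    return ["openclaw", "agent", "--message", message, "--agent", agent]

def ask_openclaw(message: str, agent: str = "main", timeout: float = 30.0) -> str:
    """Run the OpenClaw CLI and return its stdout (assumed to be the reply)."""
    result = subprocess.run(
        build_openclaw_cmd(message, agent),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise if the CLI exits non-zero
    )
    return result.stdout.strip()
```

This avoids touching the WebSocket protocol at all, which is why the CLI route is attractive for a first integration.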
### Home Assistant Integration

**Bridge Script**: `homeai-agent/skills/home-assistant/openclaw_bridge.py`

Python bridge that calls the OpenClaw CLI and returns a JSON response:

```bash
python3 openclaw_bridge.py "Turn on the lights" --raw
```

**HA Configuration**: `homeai-agent/skills/home-assistant/ha-configuration.yaml`

Example shell command for HA:

```yaml
shell_command:
  openclaw_chat: 'python3 /Users/aodhan/gitea/homeai/homeai-agent/skills/home-assistant/openclaw_bridge.py "{{ message }}" --raw'
```

**Full Integration Guide**: `homeai-agent/skills/home-assistant/OPENCLAW_INTEGRATION.md`
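With that `shell_command` in place, an HA sentence trigger could route spoken commands to OpenClaw. This automation is a hypothetical sketch; the trigger phrase and slot wiring need testing in the Assist pipeline:

```yaml
automation:
  - alias: "Route voice command to OpenClaw"
    trigger:
      - platform: conversation
        command: "ask claw {message}"   # wildcard captures the rest of the utterance
    action:
      - service: shell_command.openclaw_chat
        data:
          message: "{{ trigger.slots.message }}"
```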
## Open Questions

- Does OpenClaw expose an HTTP API for chat? The gateway seems WebSocket-only.
- Can we use the `openclaw agent` CLI from an automation? This would be the simplest integration.
- Should we use HA's built-in conversation agent first? This would validate the voice pipeline before adding OpenClaw complexity.
## Next Actions

- Complete HA Wyoming integration (manual UI steps)
- Test HA Assist with typed queries
- Research the `wyoming-satellite` package for Mac Mini
- Decide on OpenClaw integration method
## Success Criteria

- Say "Hey Jarvis, turn on the reading lamp" → lamp turns on
- Hear spoken confirmation via TTS
- Latency under 5 seconds from wake word to action
- Works reliably 9/10 times
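The 5-second latency criterion is easiest to verify with per-stage timestamps; a minimal helper along these lines could be dropped into the satellite or daemon (the stage names are illustrative):

```python
import time

class StageTimer:
    """Accumulate per-stage latencies for the wake -> action pipeline."""

    def __init__(self) -> None:
        self.marks: list[tuple[str, float]] = []

    def mark(self, stage: str) -> None:
        """Record a monotonic timestamp for a named stage."""
        self.marks.append((stage, time.monotonic()))

    def report(self) -> dict[str, float]:
        """Return seconds elapsed between consecutive marks."""
        return {
            f"{s0}->{s1}": t1 - t0
            for (s0, t0), (s1, t1) in zip(self.marks, self.marks[1:])
        }
```

Marking `wake`, `stt_done`, `agent_done`, and `playback_start` would show exactly which stage eats the budget.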