feat(phase-04): Wyoming Satellite integration + OpenClaw HA components
## Voice Pipeline (P3)
- Replace openWakeWord daemon with Wyoming Satellite approach
- Add Wyoming Satellite service on port 10700 for HA voice pipeline
- Update setup.sh with cross-platform sed compatibility (macOS/Linux)
- Add version field to Kokoro TTS voice info
- Update launchd service loader to use Wyoming Satellite

## Home Assistant Integration (P4)
- Add custom conversation agent component (openclaw_conversation)
- Fix: Use IntentResponse instead of plain strings (HA API requirement)
- Support both HTTP API and CLI fallback modes
- Config flow for easy HA UI setup
- Add OpenClaw bridge scripts (Python + Bash)
- Add ha-ctl utility for HA entity control
- Fix: Use context manager for token file reading
- Add HA configuration examples and documentation

## Infrastructure
- Add mem0 backup automation (launchd + script)
- Add n8n workflow templates (morning briefing, notification router)
- Add VS Code workspace configuration
- Reorganize model files into categorized folders:
  - lmstudio-community/
  - mlx-community/
  - bartowski/
  - mradermacher/

## Documentation
- Update PROJECT_PLAN.md with Wyoming Satellite architecture
- Update TODO.md with completed Wyoming integration tasks
- Add OPENCLAW_INTEGRATION.md for HA setup guide

## Testing
- Verified Wyoming services running (STT:10300, TTS:10301, Satellite:10700)
- Verified OpenClaw CLI accessibility
- Confirmed cross-platform compatibility fixes
plans/voice-loop-integration.md (new file, +224 lines)
# Voice Loop Integration Plan

> Created: 2026-03-07 | Status: Planning

---
## Current State

### What's Working
| Component | Status | Port/Location |
|-----------|--------|---------------|
| Wyoming STT (Whisper large-v3) | ✅ Running | tcp://0.0.0.0:10300 |
| Wyoming TTS (Kokoro ONNX) | ✅ Running | tcp://0.0.0.0:10301 |
| openWakeWord daemon | ✅ Running | Detects "hey_jarvis" |
| OpenClaw Gateway | ✅ Running | ws://127.0.0.1:8080 |
| Home Assistant | ✅ Running | http://10.0.0.199:8123 |
| Ollama | ✅ Running | http://localhost:11434 |
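The table above can be re-verified from a short script before wiring anything further; a minimal stdlib-only sketch (hosts taken from the table, with `localhost` standing in for the `0.0.0.0` binds):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Services from the table above (the wake word daemon has no TCP port).
SERVICES = {
    "Wyoming STT": ("localhost", 10300),
    "Wyoming TTS": ("localhost", 10301),
    "OpenClaw Gateway": ("127.0.0.1", 8080),
    "Home Assistant": ("10.0.0.199", 8123),
    "Ollama": ("localhost", 11434),
}

def check_all(services=SERVICES):
    """Map each service name to whether its port currently accepts connections."""
    return {name: port_open(h, p) for name, (h, p) in services.items()}
```

Running `check_all()` on the Mac Mini should report every service in the table as up.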
### What's Broken

1. **Wake word → STT gap**: `wakeword_daemon.py` detects the wake word but does NOT capture the subsequent audio command.
2. **OpenClaw `/wake` endpoint**: Returns 404; OpenClaw doesn't expose this endpoint by default.
3. **No audio routing**: After wake word detection, there's no mechanism to stream audio to STT and route the transcript to OpenClaw.
---

## Architecture Options

### Option A: Full Wyoming Satellite (Recommended)

Replace the custom `wakeword_daemon.py` with a proper Wyoming satellite that handles the full pipeline.
```mermaid
flowchart LR
    A[USB Mic] --> B[Wyoming Satellite]
    B -->|wake detected| B
    B -->|audio stream| C[Wyoming STT :10300]
    C -->|transcript| B
    B -->|text| D[OpenClaw via WebSocket]
    D -->|response| E[Wyoming TTS :10301]
    E -->|audio| F[Speaker]
```
**Pros:**

- Standard Wyoming protocol - works with Home Assistant
- Handles wake word, VAD, audio streaming, and response playback
- Can be used with ESP32 satellites later

**Cons:**

- Need to write a custom satellite or use the `wyoming-satellite` package
- More complex than a simple script
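For anyone writing a custom satellite: Wyoming events are a JSON header line optionally followed by a binary payload whose size is announced in the header. The sketch below shows the framing; treat the exact field names (`type`, `data`, `payload_length`) as an assumption to verify against the `wyoming` package before relying on it.

```python
import json

def frame_event(event_type: str, data=None, payload: bytes = b"") -> bytes:
    """Serialize one Wyoming-style event: JSON header line + optional raw payload."""
    header = {
        "type": event_type,
        "data": data or {},
        "payload_length": len(payload) or None,
    }
    return json.dumps(header).encode() + b"\n" + payload

# An audio chunk as a satellite would stream it to the STT service.
chunk = frame_event(
    "audio-chunk",
    {"rate": 16000, "width": 2, "channels": 1},
    payload=b"\x00\x00" * 160,  # 10 ms of silent 16-bit mono PCM
)
```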
### Option B: Enhanced Wake Word Daemon

Extend `wakeword_daemon.py` to capture audio after the wake word and send it to STT.
```mermaid
flowchart LR
    A[USB Mic] --> B[wakeword_daemon.py]
    B -->|wake detected| B
    B -->|record 5s| B
    B -->|audio file| C[Wyoming STT :10300]
    C -->|transcript| B
    B -->|POST /agent| D[OpenClaw]
    D -->|response| E[TTS via Wyoming]
    E -->|audio| F[Speaker via afplay]
```
**Pros:**

- Simpler to implement
- Keeps existing code

**Cons:**

- Not standard Wyoming protocol
- Won't integrate with Home Assistant voice pipelines
- Harder to extend to ESP32 satellites
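If Option B were pursued, the 5-second capture would need to be packaged for the STT request. A sketch of wrapping raw PCM into an in-memory WAV using only the stdlib (16 kHz 16-bit mono is an assumption matching typical Whisper input):

```python
import io
import wave

def pcm_to_wav(pcm: bytes, rate: int = 16000, width: int = 2,
               channels: int = 1) -> bytes:
    """Wrap raw PCM samples in a WAV container, returned as bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(width)
        wf.setframerate(rate)
        wf.writeframes(pcm)
    return buf.getvalue()

# 5 seconds of silence at 16 kHz mono, 16-bit: 5 * 16000 frames * 2 bytes.
wav_bytes = pcm_to_wav(b"\x00" * (5 * 16000 * 2))
```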
### Option C: Home Assistant as Hub

Use Home Assistant's voice pipeline as the central coordinator.
```mermaid
flowchart LR
    A[USB Mic] --> B[HA Wyoming Integration]
    B -->|wake| C[HA Voice Pipeline]
    C -->|audio| D[Wyoming STT :10300]
    D -->|text| C
    C -->|intent| E[HA Conversation Agent]
    E -->|response| C
    C -->|text| F[Wyoming TTS :10301]
    F -->|audio| G[Speaker]
```
**Pros:**

- Leverages existing HA infrastructure
- Works with the Assist UI and mobile app
- Easy to add ESP32 satellites

**Cons:**

- OpenClaw not directly in the loop (HA handles conversation)
- Need to configure HA as conversation agent or use a custom component
---

## Recommended Approach: Hybrid (Option A + C)

1. **Phase 1**: Get the HA voice pipeline working first (Option C)
   - Register Wyoming STT/TTS in Home Assistant
   - Create a voice assistant pipeline
   - Test via the HA Assist UI

2. **Phase 2**: Write a Wyoming satellite for the Mac Mini (Option A)
   - Use the `wyoming-satellite` package or write a custom one
   - Connects to HA as a satellite device
   - Handles local wake word + audio capture

3. **Phase 3**: Wire OpenClaw in as a custom conversation agent
   - Create an HA custom component or use a REST command
   - Route conversation to OpenClaw instead of the HA default
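Phase 1 can be smoke-tested without a microphone: Home Assistant exposes a REST endpoint, `/api/conversation/process`, that drives the same pipeline as Assist. A sketch of building the request (the token is a placeholder for a long-lived access token; response handling is omitted):

```python
import json
import urllib.request

HA_URL = "http://10.0.0.199:8123"
TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"  # placeholder - create one in the HA profile page

def build_conversation_request(text: str, language: str = "en") -> urllib.request.Request:
    """Build the POST for HA's /api/conversation/process endpoint."""
    body = json.dumps({"text": text, "language": language}).encode()
    return urllib.request.Request(
        f"{HA_URL}/api/conversation/process",
        data=body,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_conversation_request("Turn on the reading lamp")
# urllib.request.urlopen(req) would return the pipeline's JSON response.
```

This validates STT-less intent handling end to end before OpenClaw enters the loop.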
---

## Implementation Plan

### Phase 1: Home Assistant Voice Pipeline (Sprint 1 completion)

- [x] Open HA UI → Settings → Integrations → Add Integration → Wyoming Protocol
- [x] Add STT provider: host `10.0.0.199`, port `10300`
- [x] Add TTS provider: host `10.0.0.199`, port `10301`
- [ ] Create Voice Assistant pipeline in HA
- [ ] Test via HA Assist panel (type a query, hear the response)
### Phase 2: Wyoming Satellite for Mac Mini

- [x] Install the `wyoming-satellite` package in homeai-voice-env
- [x] Configure the satellite with openWakeWord integration
- [x] Write a launchd plist for the satellite service
- [ ] Register the satellite in Home Assistant
- [ ] Test the full wake → STT → HA → TTS → playback cycle
### Phase 3: OpenClaw Integration

- [x] Research OpenClaw's WebSocket API for sending messages
- [x] Write an HA custom component or automation to route to OpenClaw
- [x] Configure OpenClaw to respond via TTS
- [ ] Test the full voice → OpenClaw → action → response flow
---

## Technical Details

### Wyoming Satellite Configuration
```yaml
# Example satellite config
satellite:
  name: "Mac Mini Living Room"
  wake_word:
    model: hey_jarvis
    threshold: 0.5
  stt:
    uri: tcp://localhost:10300
  tts:
    uri: tcp://localhost:10301
  audio:
    input_device: default
    output_device: default
```
### OpenClaw WebSocket API

Based on the gateway health check, OpenClaw uses WebSocket for communication:

- URL: `ws://127.0.0.1:8080`
- Auth: token-based (`gateway.auth.token` in config)
- Agent: `main` (default)

To send a message:

```bash
openclaw agent --message "Turn on the lights" --agent main
```
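From Python (e.g. inside a bridge script), the same CLI call can be assembled with `subprocess`; a minimal sketch, with the argument shape taken from the example above:

```python
import subprocess

def build_openclaw_cmd(message: str, agent: str = "main") -> list:
    """Argument vector for the `openclaw agent` CLI call shown above."""
    return ["openclaw", "agent", "--message", message, "--agent", agent]

def ask_openclaw(message: str, agent: str = "main") -> str:
    """Run the CLI and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(
        build_openclaw_cmd(message, agent),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

Passing the message as a single argv element (rather than interpolating into a shell string) sidesteps quoting issues with user speech.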
### Home Assistant Integration

**Bridge Script**: [`homeai-agent/skills/home-assistant/openclaw_bridge.py`](homeai-agent/skills/home-assistant/openclaw_bridge.py)

A Python bridge that calls the OpenClaw CLI and returns a JSON response:

```bash
python3 openclaw_bridge.py "Turn on the lights" --raw
```

**HA Configuration**: [`homeai-agent/skills/home-assistant/ha-configuration.yaml`](homeai-agent/skills/home-assistant/ha-configuration.yaml)

Example shell command for HA:

```yaml
shell_command:
  openclaw_chat: 'python3 /Users/aodhan/gitea/homeai/homeai-agent/skills/home-assistant/openclaw_bridge.py "{{ message }}" --raw'
```

**Full Integration Guide**: [`homeai-agent/skills/home-assistant/OPENCLAW_INTEGRATION.md`](homeai-agent/skills/home-assistant/OPENCLAW_INTEGRATION.md)
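Whatever consumes the bridge's `--raw` output needs to pull the reply text out of the JSON before handing it to TTS. The top-level `response` key below is an assumption about `openclaw_bridge.py`'s schema, not its documented output; adjust `key` to the real field name.

```python
import json

def extract_reply(raw: str, key: str = "response") -> str:
    """Pull the spoken reply out of the bridge's JSON output.

    The default `response` key is an assumed schema for openclaw_bridge.py;
    pass the actual field name if it differs.
    """
    data = json.loads(raw)
    reply = data.get(key)
    if not isinstance(reply, str):
        raise ValueError(f"no string field {key!r} in bridge output")
    return reply
```

Failing loudly on a missing field keeps a malformed bridge response from being spoken as an empty TTS utterance.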
---

## Open Questions

1. **Does OpenClaw expose an HTTP API for chat?** The gateway seems WebSocket-only.
2. **Can we use the `openclaw agent` CLI from an automation?** This would be the simplest integration.
3. **Should we use HA's built-in conversation agent first?** This would validate the voice pipeline before adding OpenClaw complexity.
---

## Next Actions

1. Complete the HA Wyoming integration (manual UI steps)
2. Test HA Assist with typed queries
3. Research the `wyoming-satellite` package for the Mac Mini
4. Decide on the OpenClaw integration method
---

## Success Criteria

- [ ] Say "Hey Jarvis, turn on the reading lamp" → lamp turns on
- [ ] Hear spoken confirmation via TTS
- [ ] Latency under 5 seconds from wake word to action
- [ ] Works reliably 9/10 times
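The 5-second latency budget can be verified by timestamping each pipeline stage; a small helper sketch (stage names are illustrative, not actual event names):

```python
import time

class StageTimer:
    """Record monotonic timestamps per pipeline stage and report the gaps."""

    def __init__(self):
        self.marks = []

    def mark(self, stage: str):
        self.marks.append((stage, time.monotonic()))

    def report(self) -> dict:
        """Seconds elapsed between consecutive stages, plus the total."""
        out = {}
        for (prev, t0), (cur, t1) in zip(self.marks, self.marks[1:]):
            out[f"{prev}->{cur}"] = t1 - t0
        if len(self.marks) >= 2:
            out["total"] = self.marks[-1][1] - self.marks[0][1]
        return out

# In the loop: timer.mark("wake") ... timer.mark("stt") ... timer.mark("action"),
# then assert timer.report()["total"] < 5.0 against the budget above.
```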