feat(phase-04): Wyoming Satellite integration + OpenClaw HA components

## Voice Pipeline (P3)

- Replace openWakeWord daemon with Wyoming Satellite approach
- Add Wyoming Satellite service on port 10700 for HA voice pipeline
- Update setup.sh with cross-platform sed compatibility (macOS/Linux)
- Add version field to Kokoro TTS voice info
- Update launchd service loader to use Wyoming Satellite

## Home Assistant Integration (P4)

- Add custom conversation agent component (openclaw_conversation)
- Fix: Use IntentResponse instead of plain strings (HA API requirement)
- Support both HTTP API and CLI fallback modes
- Config flow for easy HA UI setup
- Add OpenClaw bridge scripts (Python + Bash)
- Add ha-ctl utility for HA entity control
- Fix: Use context manager for token file reading
- Add HA configuration examples and documentation

## Infrastructure

- Add mem0 backup automation (launchd + script)
- Add n8n workflow templates (morning briefing, notification router)
- Add VS Code workspace configuration
- Reorganize model files into categorized folders:
  - lmstudio-community/
  - mlx-community/
  - bartowski/
  - mradermacher/

## Documentation

- Update PROJECT_PLAN.md with Wyoming Satellite architecture
- Update TODO.md with completed Wyoming integration tasks
- Add OPENCLAW_INTEGRATION.md for HA setup guide

## Testing

- Verified Wyoming services running (STT:10300, TTS:10301, Satellite:10700)
- Verified OpenClaw CLI accessibility
- Confirmed cross-platform compatibility fixes
---

**New file:** `plans/ha-voice-pipeline-implementation.md` (266 lines)
# Home Assistant Voice Pipeline Implementation Plan

> Created: 2026-03-07 | Phase: 2.1 - HA Voice Pipeline Setup

---

## Current State Summary

### ✅ Running Services

| Service | Port | Status | Location |
|---------|------|--------|----------|
| Wyoming STT (Whisper large-v3) | 10300 | ✅ Running | Mac Mini |
| Wyoming TTS (Kokoro ONNX) | 10301 | ✅ Running | Mac Mini |
| Wyoming Satellite | 10700 | ✅ Running | Mac Mini |
| OpenClaw Gateway | 8080 | ✅ Running | Mac Mini |
| Home Assistant | 8123 | ✅ Running | Docker (10.0.0.199) |
| Ollama | 11434 | ✅ Running | Mac Mini |
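The running-services table can be spot-checked from any machine on the LAN with plain TCP probes. A minimal sketch, assuming the hosts and ports listed in the table; it only proves the ports accept connections, not that the services behave correctly:

```python
import socket

# (host, port) pairs taken from the services table above.
SERVICES = {
    "Wyoming STT": ("10.0.0.199", 10300),
    "Wyoming TTS": ("10.0.0.199", 10301),
    "Wyoming Satellite": ("10.0.0.199", 10700),
    "Home Assistant": ("10.0.0.199", 8123),
}

def check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        return s.connect_ex((host, port)) == 0

if __name__ == "__main__":
    for name, (host, port) in SERVICES.items():
        status = "open" if check(host, port) else "CLOSED"
        print(f"{name:20s} {host}:{port} -> {status}")
```

The same loop doubles as a pre-flight check before the HA integration steps below.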
### ✅ Completed Capabilities

- OpenClaw tool calling works with `qwen2.5:7b`
- Home Assistant skill tested: "Turn on the study shelves light" ✓
- Wyoming test pipeline passes (3/3 tests)

---

## Implementation Steps

### Phase 1: Home Assistant Wyoming Integration (Manual UI Steps)

#### Step 1.1: Add Wyoming Protocol Integration

1. Open Home Assistant UI → **Settings → Devices & Services → Add Integration**
2. Search for **"Wyoming Protocol"**
3. Add the following three services:

**Speech-to-Text (STT):**
- Host: `10.0.0.199`
- Port: `10300`
- Name: `Mac Mini STT`

**Text-to-Speech (TTS):**
- Host: `10.0.0.199`
- Port: `10301`
- Name: `Mac Mini TTS`

**Satellite:**
- Host: `10.0.0.199`
- Port: `10700`
- Name: `Mac Mini Living Room`
#### Step 1.2: Create Voice Assistant Pipeline

1. Go to **Settings → Voice Assistants**
2. Click **Add Pipeline**
3. Configure:
   - **Name**: "HomeAI with OpenClaw"
   - **Speech-to-Text**: Select "Mac Mini STT" (Wyoming)
   - **Conversation Agent**: Select "Home Assistant" (for initial testing)
   - **Text-to-Speech**: Select "Mac Mini TTS" (Wyoming)
   - **Language**: English
4. Save the pipeline

#### Step 1.3: Assign Pipeline to Assist

1. Go to **Settings → Voice Assistants**
2. Click the **Assist** tab
3. Set the default pipeline to "HomeAI with OpenClaw"

---
### Phase 2: Test Basic Voice Pipeline

#### Step 2.1: Test via Browser (Typed Query)

1. Open Home Assistant UI
2. Click the **Assist** icon (microphone) in the top-right corner
3. Type: "What time is it?"
4. Expected: You should hear a spoken response via Kokoro TTS

#### Step 2.2: Test via Satellite (Voice)

1. Ensure Wyoming Satellite is running: `launchctl list com.homeai.wyoming-satellite`
2. Say the wake word: "Hey Jarvis" (or configured wake word)
3. Speak: "What time is it?"
4. Expected: You should hear a spoken response

#### Step 2.3: Verify Pipeline Components

Check logs for each component:

```bash
# STT logs
tail -f /tmp/homeai-wyoming-stt.log

# TTS logs
tail -f /tmp/homeai-wyoming-tts.log

# Satellite logs
tail -f /tmp/homeai-wyoming-satellite.log
```
---

### Phase 3: OpenClaw Integration

#### Step 3.1: Add Shell Command to HA

Edit Home Assistant `configuration.yaml` (via File Editor add-on or VS Code Server):

```yaml
shell_command:
  openclaw_chat: 'python3 /Users/aodhan/gitea/homeai/homeai-agent/skills/home-assistant/openclaw_bridge.py "{{ message }}" --raw'
```

Restart Home Assistant to load the new configuration.
#### Step 3.2: Create OpenClaw Conversation Automation

Create a new automation in the HA UI:

**Automation: "Voice Command via OpenClaw"**

```yaml
alias: "Voice Command via OpenClaw"
description: "Routes voice commands to OpenClaw for processing"
trigger:
  - platform: conversation
    command:
      - "ask jarvis {command}"
      - "ask assistant {command}"
      - "hey jarvis {command}"
condition: []
action:
  - service: shell_command.openclaw_chat
    data:
      message: "{{ trigger.slots.command }}"
    response_variable: openclaw_response
  - service: tts.speak
    target:
      entity_id: tts.mac_mini_tts  # adjust to the Wyoming TTS entity created in Step 1.1
    data:
      media_player_entity_id: media_player.mac_mini_speaker
      message: "{{ openclaw_response.stdout }}"
mode: single
```

Note: sentence triggers capture wildcards as named slots (here `{command}`), `shell_command` responses are dicts whose `stdout` field holds the script output, and `tts.speak` is targeted at a TTS entity.

**Alternative: Direct Conversation Agent**

For tighter integration, create a custom conversation agent that calls OpenClaw directly. This requires a custom component.
#### Step 3.3: Test OpenClaw Integration

1. Open HA Assist
2. Type: "ask jarvis turn on the reading lamp"
3. Expected:
   - Command sent to OpenClaw
   - OpenClaw processes via home-assistant skill
   - Light turns on
   - TTS confirmation spoken

---
### Phase 4: Advanced Configuration

#### Step 4.1: Create Custom Conversation Agent (Optional)

For full OpenClaw integration as the primary conversation agent:

1. Create custom component: `custom_components/openclaw_conversation/`
2. Implement `async_process` method that calls OpenClaw
3. Configure in `configuration.yaml`:

```yaml
conversation:
  - name: OpenClaw
    agent_id: openclaw_agent
```
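The heart of such a component is a function that forwards the user's text to OpenClaw and returns the reply as a string; per the commit message, the real component wraps that string in an `IntentResponse` rather than returning it directly. A minimal sketch using the CLI invocation documented later in this plan (the `cli` parameter is only a hook for exercising the function against a stand-in binary):

```python
import subprocess

def openclaw_reply(message: str, cli: str = "openclaw", timeout: float = 60.0) -> str:
    """Send `message` to the OpenClaw agent via its CLI and return the text reply."""
    result = subprocess.run(
        [cli, "agent", "--message", message, "--agent", "main"],
        capture_output=True, text=True, timeout=timeout,
    )
    if result.returncode != 0:
        # Surface CLI failures to the caller instead of speaking an empty reply.
        raise RuntimeError(f"openclaw exited {result.returncode}: {result.stderr.strip()}")
    return result.stdout.strip()
```

Inside `async_process`, the blocking CLI call should run in an executor to keep the event loop responsive, and the returned text attached to the `IntentResponse` (via its speech setter) before returning.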
#### Step 4.2: Add Intent Scripts

For specific intents that bypass OpenClaw:

```yaml
intent_script:
  HassTurnOn:
    action:
      - service: homeassistant.turn_on
        target:
          entity_id: "{{ name }}"
    speech:
      text: "Turned on {{ name }}"
```

---
## Verification Checklist

### Basic Pipeline

- [ ] Wyoming STT appears in HA Integrations
- [ ] Wyoming TTS appears in HA Integrations
- [ ] Wyoming Satellite appears in HA Integrations
- [ ] Voice Assistant pipeline created and assigned
- [ ] Typed query in Assist returns spoken response
- [ ] Voice query via satellite returns spoken response

### OpenClaw Integration

- [ ] Shell command `openclaw_chat` configured
- [ ] Automation triggers on conversation intent
- [ ] OpenClaw receives and processes command
- [ ] HA action executed (e.g., light turns on)
- [ ] TTS confirmation spoken

### Performance

- [ ] Latency under 5 seconds from wake to response
- [ ] STT transcription accurate
- [ ] TTS audio clear and natural
- [ ] No errors in service logs

---
## Troubleshooting

### Issue: Wyoming services not appearing in HA

**Solution:**
1. Verify services are running: `nc -z 10.0.0.199 10300`
2. Check HA can reach the Mac Mini: test from the HA container
3. Verify Wyoming protocol version compatibility

### Issue: No audio output

**Solution:**
1. Check SoX installation: `which play`
2. Test audio directly: `afplay /System/Library/Sounds/Glass.aiff`
3. Check satellite logs: `tail -f /tmp/homeai-wyoming-satellite-error.log`

### Issue: OpenClaw not responding

**Solution:**
1. Verify OpenClaw running: `pgrep -f openclaw`
2. Test CLI directly: `openclaw agent --message "Hello" --agent main`
3. Check bridge script permissions: `chmod +x openclaw_bridge.py`

### Issue: STT not transcribing

**Solution:**
1. Check STT logs: `tail -f /tmp/homeai-wyoming-stt.log`
2. Verify Whisper model loaded
3. Test with sample audio file

---
## Next Steps After Completion

1. **Install Chatterbox TTS** for voice cloning
2. **Set up mem0** for long-term memory
3. **Configure n8n workflows** for automation
4. **Add Uptime Kuma monitors** for all services
5. **Begin ESP32 satellite setup** (Phase 4)

---

## File References

| File | Purpose |
|------|---------|
| `homeai-agent/skills/home-assistant/openclaw_bridge.py` | Bridge script for HA → OpenClaw |
| `homeai-agent/skills/home-assistant/ha-configuration.yaml` | Example HA configuration |
| `homeai-voice/wyoming/test-pipeline.sh` | Pipeline smoke test |
| `homeai-voice/scripts/launchd/com.homeai.wyoming-satellite.plist` | Satellite service config |
---

**New file:** `plans/next-steps.md` (201 lines)
# HomeAI — Next Steps Plan

> Created: 2026-03-07 | Priority: Voice Loop → Foundation Hardening → Character System

---

## Current State Summary

| Sub-Project | Status | Done / Total |
|---|---|---|
| P1 homeai-infra | Core done, tail items remain | 6 / 9 |
| P2 homeai-llm | Core done, tail items remain | 6 / 8 |
| P3 homeai-voice | STT + TTS + wake word running, HA integration pending | 7 / 13 |
| P4 homeai-agent | OpenClaw + HA skill working, mem0 + n8n pending | 10 / 16 |
| P5 homeai-character | Not started | 0 / 11 |
| P6–P8 | Not started | 0 / * |

**Key milestone reached:** OpenClaw can receive text, call `qwen2.5:7b` via Ollama, execute tool calls, and control Home Assistant entities. The voice pipeline components (STT, TTS, wake word) are all running as launchd services.

**Critical gap:** The voice pipeline is not yet connected through Home Assistant to the agent. The pieces exist but the end-to-end flow is untested.

---
## Sprint 1 — Complete the Voice → Agent → HA Loop

**Goal:** Speak a command → hear a spoken response + see the HA action execute.

This is the highest-value work because it closes the core loop that every future feature builds on.

### Tasks

#### 1A. Finish HA Wyoming Integration (P3)

The Wyoming STT (port 10300) and TTS (port 10301) services are running. They need to be registered in Home Assistant.

- [ ] Open HA UI → Settings → Integrations → Add Integration → Wyoming Protocol
- [ ] Add STT provider: host `10.0.0.199` (or `localhost` if HA is on the same machine), port `10300`
- [ ] Add TTS provider: host `10.0.0.199`, port `10301`
- [ ] Verify both appear as STT/TTS providers in HA
#### 1B. Create HA Voice Assistant Pipeline (P3)

- [ ] HA → Settings → Voice Assistants → Add Assistant
- [ ] Configure: STT = Wyoming Whisper, TTS = Wyoming Kokoro, Conversation Agent = Home Assistant default (or OpenClaw if wired)
- [ ] Set as default voice assistant pipeline

#### 1C. Test HA Assist via Browser (P3)

- [ ] Open HA dashboard → Assist panel
- [ ] Type a query (e.g. "What time is it?") → verify spoken response plays back
- [ ] Type a device command (e.g. "Turn on the reading lamp") → verify HA executes it
#### 1D. Set Up mem0 with Chroma Backend (P4)

- [ ] Install mem0: `pip install mem0ai`
- [ ] Install chromadb: `pip install chromadb`
- [ ] Pull embedding model: `ollama pull nomic-embed-text`
- [ ] Write mem0 config pointing at Ollama for LLM + embeddings, Chroma for vector store
- [ ] Test: store a memory, recall it via semantic search
- [ ] Verify mem0 data persists at `~/.openclaw/memory/chroma/`
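A sketch of the config from the checklist above, following mem0's documented provider/config layout; the exact field names should be verified against the installed mem0 version, and the Ollama URL and model names are the ones used elsewhere in this plan:

```python
import os

# mem0 configuration sketch: Ollama for both the LLM and embeddings,
# Chroma as the vector store, persisted under ~/.openclaw/memory/chroma/.
MEMORY_DIR = os.path.expanduser("~/.openclaw/memory/chroma")

config = {
    "llm": {
        "provider": "ollama",
        "config": {"model": "qwen2.5:7b", "ollama_base_url": "http://localhost:11434"},
    },
    "embedder": {
        "provider": "ollama",
        "config": {"model": "nomic-embed-text", "ollama_base_url": "http://localhost:11434"},
    },
    "vector_store": {
        "provider": "chroma",
        "config": {"collection_name": "homeai", "path": MEMORY_DIR},
    },
}

# With mem0 installed, the store/recall test from the checklist looks like:
#   from mem0 import Memory
#   m = Memory.from_config(config)
#   m.add("The reading lamp is in the study", user_id="aodhan")
#   m.search("where is the reading lamp?", user_id="aodhan")
```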
#### 1E. Write Memory Backup launchd Job (P4)

- [ ] Create git repo at `~/.openclaw/memory/` (or a subdirectory)
- [ ] Write backup script: `git add . && git commit -m "mem0 backup $(date)" && git push`
- [ ] Write launchd plist: `com.homeai.mem0-backup.plist` — daily schedule
- [ ] Load plist, verify it runs
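One caveat with the one-liner above: `git commit` exits non-zero when nothing changed, which makes a daily launchd run on an unchanged memory store look like a failure. A sketch of a slightly more defensive version that skips the commit on a clean tree:

```python
import datetime
import pathlib
import subprocess

def backup_memory(repo: pathlib.Path, push: bool = True) -> None:
    """Commit any changes under `repo` with a timestamped message, then push.

    Skips the commit (and push) entirely when the working tree is clean,
    so a scheduled run on an unchanged store exits 0.
    """
    git = ["git", "-C", str(repo)]
    subprocess.run(git + ["add", "-A"], check=True)
    # `git diff --cached --quiet` exits 1 when something is staged.
    staged = subprocess.run(git + ["diff", "--cached", "--quiet"])
    if staged.returncode == 0:
        return  # nothing new to back up
    msg = f"mem0 backup {datetime.datetime.now():%Y-%m-%d %H:%M}"
    subprocess.run(git + ["commit", "-m", msg], check=True)
    if push:
        subprocess.run(git + ["push"], check=True)
```

The same check (`git diff --cached --quiet || git commit ...`) works verbatim in the bash one-liner if a shell script is preferred.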
#### 1F. Build Morning Briefing n8n Workflow (P4)

- [ ] Verify n8n is running (Docker, deployed in P1)
- [ ] Create workflow: time trigger → fetch weather from HA → compose briefing text → POST to OpenClaw `/speak` endpoint
- [ ] Export workflow JSON to `homeai-agent/workflows/morning-briefing.json`
- [ ] Test: manually trigger → hear spoken briefing

#### 1G. Build Notification Router n8n Workflow (P4)

- [ ] Create workflow: HA webhook trigger → classify urgency → high: TTS immediately, low: queue
- [ ] Export to `homeai-agent/workflows/notification-router.json`

#### 1H. Verify Full Voice → Agent → HA Action Flow (P3 + P4)

- [ ] Trigger wake word ("hey jarvis") via USB mic
- [ ] Speak a command: "Turn on the reading lamp"
- [ ] Verify: wake word detected → audio captured → STT transcribes → OpenClaw receives text → tool call to HA → lamp turns on → TTS response plays back
- [ ] Document any latency issues or failure points
### Sprint 1 Flow Diagram

```mermaid
flowchart LR
    A[USB Mic] -->|wake word| B[openWakeWord]
    B -->|audio stream| C[Wyoming STT - Whisper]
    C -->|transcribed text| D[Home Assistant Pipeline]
    D -->|text| E[OpenClaw Agent]
    E -->|tool call| F[HA REST API]
    F -->|action| G[Smart Device]
    E -->|response text| H[Wyoming TTS - Kokoro]
    H -->|audio| I[Speaker]
```

---

## Sprint 2 — Foundation Hardening

**Goal:** All services survive a reboot, are monitored, and are remotely accessible.

### Tasks
#### 2A. Install and Configure Tailscale (P1)

- [ ] Install Tailscale on Mac Mini: `brew install tailscale`
- [ ] Authenticate and join Tailnet
- [ ] Verify all services reachable via Tailscale IP (HA, Open WebUI, Portainer, Gitea, n8n, code-server)
- [ ] Document Tailscale IP → service URL mapping

#### 2B. Configure Uptime Kuma Monitors (P1 + P2)

- [ ] Add monitors for: Home Assistant, Portainer, Gitea, code-server, n8n
- [ ] Add monitors for: Ollama API (port 11434), Open WebUI (port 3030)
- [ ] Add monitors for: Wyoming STT (port 10300), Wyoming TTS (port 10301)
- [ ] Add monitor for: OpenClaw (port 8080)
- [ ] Configure mobile push alerts (ntfy or Pushover)

#### 2C. Cold Reboot Verification (P1)

- [ ] Reboot Mac Mini
- [ ] Verify all Docker containers come back up (restart policy: `unless-stopped`)
- [ ] Verify launchd services start: Ollama, Wyoming STT, Wyoming TTS, openWakeWord, OpenClaw
- [ ] Check Uptime Kuma — all monitors green within 2 minutes
- [ ] Document any services that failed to restart, and fix them
#### 2D. Run LLM Benchmarks (P2)

- [ ] Run `homeai-llm/scripts/benchmark.sh`
- [ ] Record results: tokens/sec for each model (qwen2.5:7b, llama3.3:70b, etc.)
- [ ] Write results to `homeai-llm/benchmark-results.md`

---

## Sprint 3 — Character System (P5)

**Goal:** Character schema defined, default character created, Character Manager UI functional.

### Tasks
#### 3A. Define Character Schema (P5)

- [ ] Write `homeai-character/schema/character.schema.json` (v1) — based on the spec in PLAN.md
- [ ] Write `homeai-character/schema/README.md` documenting each field

#### 3B. Create Default Character (P5)

- [ ] Write `homeai-character/characters/aria.json` with placeholder expression IDs
- [ ] Validate aria.json against schema (manual or script)
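Until `character.schema.json` exists, the "script" option for validation can be a stdlib-only check. The required field names below are placeholders for whatever the schema ends up defining, not the real schema:

```python
import json
import pathlib

# Placeholder top-level fields -- replace with the real ones once
# character.schema.json is written.
REQUIRED_FIELDS = {"name", "system_prompt", "expressions"}

def validate_character(path: str) -> list[str]:
    """Return a list of problems with a character file (empty list = valid)."""
    data = json.loads(pathlib.Path(path).read_text())
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - data.keys())]
    if not isinstance(data.get("expressions", {}), dict):
        problems.append("expressions must be an object mapping emotion -> expression ID")
    return problems
```

Once the JSON Schema lands, this can be swapped for a proper validator (the Character Manager UI already plans to use ajv for the same job on export).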
#### 3C. Set Up Vite Project (P5)

- [ ] Initialize Vite + React project in `homeai-character/`
- [ ] Install deps: `npm install react react-dom ajv`
- [ ] Move existing `character-manager.jsx` into `src/`
- [ ] Verify dev server runs at `http://localhost:5173`

#### 3D. Wire Character Manager Features (P5)

- [ ] Integrate schema validation on export (ajv)
- [ ] Add expression mapping UI section
- [ ] Add custom rules editor
- [ ] Test full edit → export → validate → load cycle

#### 3E. Wire Character into OpenClaw (P4 + P5)

- [ ] Copy/symlink `aria.json` to `~/.openclaw/characters/aria.json`
- [ ] Configure OpenClaw to load system prompt from character JSON
- [ ] Verify OpenClaw uses Aria's system prompt in responses

---

## Open Decisions to Resolve During These Sprints

| Decision | Options | Recommendation |
|---|---|---|
| Character name / wake word | "Aria" vs custom | Decide during Sprint 3 — affects wake word training later |
| mem0 backend | Chroma vs Qdrant | Start with Chroma (Sprint 1D) — migrate if recall quality is poor |
| HA conversation agent | Default HA vs OpenClaw | Test with HA default first, then wire OpenClaw as custom conversation agent |

---

## What This Unlocks

After these 3 sprints, the system will have:

- **End-to-end voice control**: speak → understand → act → respond
- **Persistent memory**: the assistant remembers across sessions
- **Automated workflows**: morning briefings, notification routing
- **Monitoring**: all services tracked, alerts on failure
- **Remote access**: everything reachable via Tailscale
- **Character identity**: Aria persona loaded into the agent pipeline
- **Reboot resilience**: everything survives a cold restart

This positions the project to move into **Phase 4 (ESP32 hardware)** and **Phase 5 (VTube Studio visual layer)** with confidence that the core pipeline is solid.
---

**New file:** `plans/voice-loop-integration.md` (224 lines)
# Voice Loop Integration Plan

> Created: 2026-03-07 | Status: Planning

---

## Current State

### What's Working

| Component | Status | Port/Location |
|-----------|--------|---------------|
| Wyoming STT (Whisper large-v3) | ✅ Running | tcp://0.0.0.0:10300 |
| Wyoming TTS (Kokoro ONNX) | ✅ Running | tcp://0.0.0.0:10301 |
| openWakeWord daemon | ✅ Running | Detects "hey_jarvis" |
| OpenClaw Gateway | ✅ Running | ws://127.0.0.1:8080 |
| Home Assistant | ✅ Running | http://10.0.0.199:8123 |
| Ollama | ✅ Running | http://localhost:11434 |
### What's Broken

1. **Wake word → STT gap**: The `wakeword_daemon.py` detects the wake word but does NOT capture the subsequent audio command
2. **OpenClaw /wake endpoint**: Returns 404 - OpenClaw doesn't expose this endpoint by default
3. **No audio routing**: After wake word detection, there's no mechanism to stream audio to STT and route the transcript to OpenClaw

---

## Architecture Options

### Option A: Full Wyoming Satellite (Recommended)

Replace the custom `wakeword_daemon.py` with a proper Wyoming satellite that handles the full pipeline.

```mermaid
flowchart LR
    A[USB Mic] --> B[Wyoming Satellite]
    B -->|wake detected| B
    B -->|audio stream| C[Wyoming STT :10300]
    C -->|transcript| B
    B -->|text| D[OpenClaw via WebSocket]
    D -->|response| E[Wyoming TTS :10301]
    E -->|audio| F[Speaker]
```

**Pros:**
- Standard Wyoming protocol - works with Home Assistant
- Handles wake word, VAD, audio streaming, and response playback
- Can be used with ESP32 satellites later

**Cons:**
- Need to write a custom satellite or use `wyoming-satellite` package
- More complex than a simple script

### Option B: Enhanced Wake Word Daemon

Extend `wakeword_daemon.py` to capture audio after the wake word and send it to STT.

```mermaid
flowchart LR
    A[USB Mic] --> B[wakeword_daemon.py]
    B -->|wake detected| B
    B -->|record 5s| B
    B -->|audio file| C[Wyoming STT :10300]
    C -->|transcript| B
    B -->|POST /agent| D[OpenClaw]
    D -->|response| E[TTS via Wyoming]
    E -->|audio| F[Speaker via afplay]
```

**Pros:**
- Simpler to implement
- Keeps existing code

**Cons:**
- Not standard Wyoming protocol
- Won't integrate with Home Assistant voice pipelines
- Harder to extend to ESP32 satellites

### Option C: Home Assistant as Hub

Use Home Assistant's voice pipeline as the central coordinator.

```mermaid
flowchart LR
    A[USB Mic] --> B[HA Wyoming Integration]
    B -->|wake| C[HA Voice Pipeline]
    C -->|audio| D[Wyoming STT :10300]
    D -->|text| C
    C -->|intent| E[HA Conversation Agent]
    E -->|response| C
    C -->|text| F[Wyoming TTS :10301]
    F -->|audio| G[Speaker]
```

**Pros:**
- Leverages existing HA infrastructure
- Works with Assist UI and mobile app
- Easy to add ESP32 satellites

**Cons:**
- OpenClaw not directly in the loop (HA handles conversation)
- Need to configure HA as conversation agent or use a custom component

---
## Recommended Approach: Hybrid (Option A + C)

1. **Phase 1**: Get HA voice pipeline working first (Option C)
   - Register Wyoming STT/TTS in Home Assistant
   - Create voice assistant pipeline
   - Test via HA Assist UI

2. **Phase 2**: Write Wyoming satellite for Mac Mini (Option A)
   - Use `wyoming-satellite` package or write custom
   - Connects to HA as a satellite device
   - Handles local wake word + audio capture

3. **Phase 3**: Wire OpenClaw as custom conversation agent
   - Create HA custom component or use REST command
   - Route conversation to OpenClaw instead of HA default

---
## Implementation Plan

### Phase 1: Home Assistant Voice Pipeline (Sprint 1 completion)

- [x] Open HA UI → Settings → Integrations → Add Integration → Wyoming Protocol
- [x] Add STT provider: host `10.0.0.199`, port `10300`
- [x] Add TTS provider: host `10.0.0.199`, port `10301`
- [ ] Create Voice Assistant pipeline in HA
- [ ] Test via HA Assist panel (type a query, hear response)

### Phase 2: Wyoming Satellite for Mac Mini

- [x] Install `wyoming-satellite` package in homeai-voice-env
- [x] Configure satellite with openWakeWord integration
- [x] Write launchd plist for satellite service
- [ ] Register satellite in Home Assistant
- [ ] Test full wake → STT → HA → TTS → playback cycle

### Phase 3: OpenClaw Integration

- [x] Research OpenClaw's WebSocket API for sending messages
- [x] Write HA custom component or automation to route to OpenClaw
- [x] Configure OpenClaw to respond via TTS
- [ ] Test full voice → OpenClaw → action → response flow

---
## Technical Details

### Wyoming Satellite Configuration

```yaml
# Example satellite config
satellite:
  name: "Mac Mini Living Room"
  wake_word:
    model: hey_jarvis
    threshold: 0.5
  stt:
    uri: tcp://localhost:10300
  tts:
    uri: tcp://localhost:10301
  audio:
    input_device: default
    output_device: default
```
### OpenClaw WebSocket API

Based on the gateway health check, OpenClaw uses WebSocket for communication:

- URL: `ws://127.0.0.1:8080`
- Auth: Token-based (`gateway.auth.token` in config)
- Agent: `main` (default)

To send a message:

```bash
openclaw agent --message "Turn on the lights" --agent main
```

### Home Assistant Integration

**Bridge Script**: [`homeai-agent/skills/home-assistant/openclaw_bridge.py`](homeai-agent/skills/home-assistant/openclaw_bridge.py)

Python bridge that calls the OpenClaw CLI and returns a JSON response:

```bash
python3 openclaw_bridge.py "Turn on the lights" --raw
```

**HA Configuration**: [`homeai-agent/skills/home-assistant/ha-configuration.yaml`](homeai-agent/skills/home-assistant/ha-configuration.yaml)

Example shell command for HA:

```yaml
shell_command:
  openclaw_chat: 'python3 /Users/aodhan/gitea/homeai/homeai-agent/skills/home-assistant/openclaw_bridge.py "{{ message }}" --raw'
```

**Full Integration Guide**: [`homeai-agent/skills/home-assistant/OPENCLAW_INTEGRATION.md`](homeai-agent/skills/home-assistant/OPENCLAW_INTEGRATION.md)

---
## Open Questions

1. **Does OpenClaw expose an HTTP API for chat?** The gateway seems WebSocket-only.
2. **Can we use the `openclaw agent` CLI from an automation?** This would be the simplest integration.
3. **Should we use HA's built-in conversation agent first?** This would validate the voice pipeline before adding OpenClaw complexity.

---

## Next Actions

1. Complete HA Wyoming integration (manual UI steps)
2. Test HA Assist with typed queries
3. Research `wyoming-satellite` package for Mac Mini
4. Decide on OpenClaw integration method

---

## Success Criteria

- [ ] Say "Hey Jarvis, turn on the reading lamp" → lamp turns on
- [ ] Hear spoken confirmation via TTS
- [ ] Latency under 5 seconds from wake word to action
- [ ] Works reliably 9/10 times