- Add openclaw-http-bridge.py: HTTP server translating POST requests to OpenClaw CLI calls
- Add launchd plist for HTTP bridge (port 8081, auto-start)
- Add install-to-docker-ha.sh: deploy custom component to Docker HA via SSH
- Add package-for-ha.sh: create distributable tarball of custom component
- Add test-services.sh: comprehensive voice pipeline service checker

Fixes from code review:
- Use OpenClawAgent (HTTP) in async_setup_entry instead of OpenClawCLIAgent (CLI agent fails inside Docker HA where openclaw binary doesn't exist)
- Update all port references from 8080 to 8081 (HTTP bridge port)
- Remove overly permissive CORS headers from HTTP bridge
- Fix zombie process leak: kill child process on CLI timeout
- Remove unused subprocess import in conversation.py
- Add version field to Kokoro TTS Wyoming info
- Update TODO.md with voice pipeline progress
Voice Pipeline Status Report
Last Updated: 2026-03-08
Executive Summary
The voice pipeline backend is fully operational on the Mac Mini. All services are running and tested:
- ✅ Wyoming STT (Whisper large-v3) - Port 10300
- ✅ Wyoming TTS (Kokoro ONNX) - Port 10301
- ✅ Wyoming Satellite (wake word + audio) - Port 10700
- ✅ OpenClaw Agent (LLM + skills) - Port 8080
- ✅ Ollama (local LLM runtime) - Port 11434
Next Step: Manual Home Assistant UI configuration to connect the pipeline.
What's Working ✅
1. Speech-to-Text (STT)
- Service: Wyoming Faster Whisper
- Model: large-v3 (multilingual, high accuracy)
- Port: 10300
- Status: Running via launchd (com.homeai.wyoming-stt)
- Test: nc -z localhost 10300 ✓
2. Text-to-Speech (TTS)
- Service: Wyoming Kokoro ONNX
- Voice: af_heart (default, configurable)
- Port: 10301
- Status: Running via launchd (com.homeai.wyoming-tts)
- Test: nc -z localhost 10301 ✓
3. Wyoming Satellite
- Function: Wake word detection + audio capture/playback
- Wake Word: "hey_jarvis" (openWakeWord model)
- Port: 10700
- Status: Running via launchd (com.homeai.wyoming-satellite)
- Test: nc -z localhost 10700 ✓
4. OpenClaw Agent
- Function: AI agent with tool calling (home automation, etc.)
- Gateway: WebSocket + CLI
- Port: 8080
- Status: Running via launchd (com.homeai.openclaw)
- Skills: home-assistant, voice-assistant
- Test: openclaw agent --message "Hello" --agent main ✓
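
The CLI test above can also be driven from Python. The following is a minimal sketch, not the project's bridge code: it assumes the `openclaw` binary is on `PATH` and accepts exactly the `--message` and `--agent` flags shown above. The helper names and the idea of returning `None` on timeout are illustrative choices. On timeout the child is killed and then reaped, which is the general pattern for avoiding leftover zombie processes.

```python
import subprocess
import sys
from typing import List, Optional


def build_agent_command(message: str, agent: str = "main") -> List[str]:
    """Build the argv for the CLI test shown above (flags taken from this report)."""
    return ["openclaw", "agent", "--message", message, "--agent", agent]


def run_with_timeout(argv: List[str], timeout_s: float) -> Optional[str]:
    """Run a command and return its stripped stdout, or None if it timed out.

    On timeout the child is killed and then reaped with a second
    communicate() call, so it cannot linger as a zombie process.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, text=True)
    try:
        out, _ = proc.communicate(timeout=timeout_s)
        return out.strip()
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.communicate()  # reap the killed child
        return None
```

On the Mac Mini, `run_with_timeout(build_agent_command("Hello"), timeout_s=30)` would run the same check as the Test line; the 30-second value is an assumption, not a documented setting.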
5. Ollama LLM
- Models: llama3.3:70b, qwen2.5:7b, and others
- Port: 11434
- Status: Running natively
- Test: ollama list ✓
6. Home Assistant Integration
- Custom Component: OpenClaw Conversation agent created
- Location: homeai-agent/custom_components/openclaw_conversation/
- Features:
  - Full conversation agent implementation
  - Config flow for UI setup
  - CLI fallback if HTTP unavailable
  - Error handling and logging
- Status: Ready for installation
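
The "CLI fallback if HTTP unavailable" feature can be pictured with a small sketch. This is not the component's actual code: `ask_agent`, `via_http`, and `via_cli` are hypothetical names, and the two transports are passed in as plain callables so the fallback logic stays independent of how each one is implemented. It assumes HTTP failures surface as `OSError` (which covers connection refused, timeouts, and `urllib.error.URLError`).

```python
from typing import Callable, Tuple


def ask_agent(
    message: str,
    via_http: Callable[[str], str],
    via_cli: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (transport, reply), preferring HTTP and falling back to the CLI."""
    try:
        return "http", via_http(message)
    except OSError:
        # Connection refused, timeout, DNS failure: treat HTTP as unavailable.
        return "cli", via_cli(message)
```

A fallback like this only helps on hosts where the `openclaw` binary is actually installed; where it is missing, the CLI path fails as well.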
What's Pending 🔄
Manual Steps Required (Home Assistant UI)
These steps require access to the Home Assistant web interface at http://10.0.0.199:8123:
1. Install OpenClaw Conversation Component
   - Copy component to HA server's /config/custom_components/
   - Restart Home Assistant
   - See: homeai-voice/VOICE_PIPELINE_SETUP.md
2. Add Wyoming Integrations
   - Settings → Devices & Services → Add Integration → Wyoming Protocol
   - Add STT (10.0.0.199:10300)
   - Add TTS (10.0.0.199:10301)
   - Add Satellite (10.0.0.199:10700)
3. Add OpenClaw Conversation
   - Settings → Devices & Services → Add Integration → OpenClaw Conversation
   - Configure: host=10.0.0.199, port=8080, agent=main
4. Create Voice Assistant Pipeline
   - Settings → Voice Assistants → Add Assistant
   - Name: "HomeAI with OpenClaw"
   - STT: Mac Mini STT
   - Conversation: OpenClaw Conversation
   - TTS: Mac Mini TTS
   - Set as preferred
5. Test the Pipeline
   - Type test: "What time is it?" in HA Assist
   - Voice test: "Hey Jarvis, turn on the reading lamp"
Future Enhancements
- Chatterbox TTS - Voice cloning for character personality
- Qwen3-TTS - Alternative voice synthesis via MLX
- Custom Wake Word - Train with character's name
- Uptime Kuma - Add monitoring for all services
Architecture
┌──────────────────────────────────────────────────────────────┐
│                       Mac Mini M4 Pro                        │
│                         (10.0.0.199)                         │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐           │
│  │   Wyoming   │  │   Wyoming   │  │   Wyoming   │           │
│  │     STT     │  │     TTS     │  │  Satellite  │           │
│  │   :10300    │  │   :10301    │  │   :10700    │           │
│  └─────────────┘  └─────────────┘  └─────────────┘           │
│                                                              │
│  ┌─────────────┐  ┌─────────────┐                            │
│  │  OpenClaw   │  │   Ollama    │                            │
│  │   Gateway   │  │     LLM     │                            │
│  │    :8080    │  │   :11434    │                            │
│  └─────────────┘  └─────────────┘                            │
│                                                              │
└──────────────────────────────────────────────────────────────┘
                                ▲
                                │ Wyoming Protocol + HTTP API
                                │
┌──────────────────────────────────────────────────────────────┐
│                    Home Assistant Server                     │
│                         (10.0.0.199)                         │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌─────────────────────────────────────────────────────┐     │
│  │              Voice Assistant Pipeline               │     │
│  │                                                     │     │
│  │  Wyoming STT → OpenClaw Conversation → Wyoming TTS  │     │
│  └─────────────────────────────────────────────────────┘     │
│                                                              │
│  ┌─────────────────────────────────────────────────────┐     │
│  │       OpenClaw Conversation Custom Component        │     │
│  │      (Routes to OpenClaw Gateway on Mac Mini)       │     │
│  └─────────────────────────────────────────────────────┘     │
│                                                              │
└──────────────────────────────────────────────────────────────┘
Voice Flow Example
User: "Hey Jarvis, turn on the reading lamp"
1. Wake Word Detection (Wyoming Satellite)
   - Detects "Hey Jarvis"
   - Starts recording audio
2. Speech-to-Text (Wyoming STT)
   - Transcribes: "turn on the reading lamp"
   - Sends text to Home Assistant
3. Conversation Processing (HA → OpenClaw)
   - HA Voice Pipeline receives text
   - Routes to OpenClaw Conversation agent
   - OpenClaw Gateway processes request
4. LLM Processing (Ollama)
   - llama3.3:70b generates response
   - Identifies intent: control light
   - Calls home-assistant skill
5. Action Execution (Home Assistant API)
   - OpenClaw calls HA REST API
   - Turns on "reading lamp" entity
   - Returns confirmation
6. Text-to-Speech (Wyoming TTS)
   - Generates audio: "I've turned on the reading lamp"
   - Sends to Wyoming Satellite
7. Audio Playback (Mac Mini Speaker)
   - Plays confirmation audio
   - User hears response
Total Latency: Target < 5 seconds
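
To see where time goes against the 5-second target, each stage of the flow above could be timed separately. The sketch below is illustrative only: the stage callables are placeholders to be supplied by whoever wires up the measurement, and `time_stages` and `within_target` are hypothetical helper names, not part of the project.

```python
import time
from typing import Callable, Dict, List, Tuple

LATENCY_TARGET_S = 5.0  # end-to-end target stated in this report


def time_stages(stages: List[Tuple[str, Callable[[], None]]]) -> Dict[str, float]:
    """Run each stage in order and return the elapsed seconds per stage."""
    timings: Dict[str, float] = {}
    for name, run in stages:
        start = time.perf_counter()
        run()
        timings[name] = time.perf_counter() - start
    return timings


def within_target(timings: Dict[str, float], target_s: float = LATENCY_TARGET_S) -> bool:
    """True if the summed stage times stay under the target."""
    return sum(timings.values()) < target_s
```

A per-stage breakdown shows which stage (STT, LLM, or TTS) dominates a given run, which a single end-to-end stopwatch cannot.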
Service Management
Check All Services
# Quick health check
./homeai-voice/scripts/test-services.sh
# Individual service status
launchctl list | grep homeai
Restart a Service
# Example: Restart STT
launchctl unload ~/Library/LaunchAgents/com.homeai.wyoming-stt.plist
launchctl load ~/Library/LaunchAgents/com.homeai.wyoming-stt.plist
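
The same unload/load pair can be generated for any of the services. This is a small sketch under one assumption: that every plist file is named after its launchd label and lives in `~/Library/LaunchAgents`, as the STT example suggests. The function names are hypothetical, and nothing is executed; the functions only build the argument lists.

```python
from pathlib import Path
from typing import List, Optional


def plist_path(label: str, home: Optional[Path] = None) -> Path:
    """Path of the per-user LaunchAgent plist for a launchd label (assumed naming)."""
    return (home or Path.home()) / "Library" / "LaunchAgents" / f"{label}.plist"


def restart_commands(label: str, home: Optional[Path] = None) -> List[List[str]]:
    """Return the unload/load argv pair used above; nothing is executed here."""
    plist = str(plist_path(label, home))
    return [["launchctl", "unload", plist], ["launchctl", "load", plist]]
```

Each returned command can then be passed to `subprocess.run(cmd, check=True)`, for example with the label `com.homeai.wyoming-tts`.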
View Logs
# STT logs
tail -f /tmp/homeai-wyoming-stt.log
# TTS logs
tail -f /tmp/homeai-wyoming-tts.log
# Satellite logs
tail -f /tmp/homeai-wyoming-satellite.log
# OpenClaw logs
tail -f /tmp/homeai-openclaw.log
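
For a one-shot look at recent log output (rather than following with `tail -f`), a short helper along these lines may be convenient. The log paths are the ones listed above; the short keys in `LOGS` and the `last_lines` name are illustrative.

```python
from collections import deque
from pathlib import Path
from typing import List

# Log files as listed in this report; the short keys are illustrative.
LOGS = {
    "stt": Path("/tmp/homeai-wyoming-stt.log"),
    "tts": Path("/tmp/homeai-wyoming-tts.log"),
    "satellite": Path("/tmp/homeai-wyoming-satellite.log"),
    "openclaw": Path("/tmp/homeai-openclaw.log"),
}


def last_lines(path: Path, n: int = 20) -> List[str]:
    """Return the last n lines of a log file, or [] if the file does not exist."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8", errors="replace") as fh:
        # deque with maxlen keeps only the final n lines while streaming the file.
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]
```

For example, `last_lines(LOGS["stt"], 50)` returns the 50 most recent STT log lines.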
Key Documentation
| Document | Purpose |
|---|---|
| homeai-voice/VOICE_PIPELINE_SETUP.md | Complete setup guide with step-by-step HA configuration |
| homeai-voice/RESUME_WORK.md | Quick reference for resuming work |
| homeai-agent/custom_components/openclaw_conversation/README.md | Custom component documentation |
| plans/ha-voice-pipeline-implementation.md | Detailed implementation plan |
| plans/voice-loop-integration.md | Architecture options and decisions |
Testing
Automated Tests
# Service health check
./homeai-voice/scripts/test-services.sh
# OpenClaw test
openclaw agent --message "What time is it?" --agent main
# Home Assistant skill test
openclaw agent --message "Turn on the reading lamp" --agent main
Manual Tests
1. Type Test (HA Assist)
   - Open HA UI → Click Assist icon
   - Type: "What time is it?"
   - Expected: Hear spoken response
2. Voice Test (Wyoming Satellite)
   - Say: "Hey Jarvis"
   - Wait for beep
   - Say: "What time is it?"
   - Expected: Hear spoken response
3. Home Control Test
   - Say: "Hey Jarvis"
   - Say: "Turn on the reading lamp"
   - Expected: Light turns on + confirmation
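
If the home control test fails, it can help to rule out Home Assistant itself by issuing the same action directly against its REST API. The sketch below builds (but does not send) a `light.turn_on` service call. The entity ID `light.reading_lamp` and the token value are assumptions for illustration: the real entity ID must be looked up in HA, and the token is a long-lived access token created in the HA user profile.

```python
import json
import urllib.request


def turn_on_request(base_url: str, token: str, entity_id: str) -> urllib.request.Request:
    """Build a POST to Home Assistant's light.turn_on service for one entity."""
    body = json.dumps({"entity_id": entity_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/services/light/turn_on",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Sending it is a matter of `urllib.request.urlopen(turn_on_request("http://10.0.0.199:8123", token, "light.reading_lamp"), timeout=5)`. If this direct call works but the voice command does not, the problem is more likely in the agent or the pipeline than in Home Assistant.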
Troubleshooting
Services Not Running
# Check launchd
launchctl list | grep homeai
# Reload all services
./homeai-voice/scripts/load-all-launchd.sh
Network Issues
# Test from Mac Mini to HA
curl http://10.0.0.199:8123/api/
# Test ports
nc -z localhost 10300 # STT
nc -z localhost 10301 # TTS
nc -z localhost 10700 # Satellite
nc -z localhost 8080 # OpenClaw
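
The port checks above can also be scripted. The following sketch mirrors `nc -z` with a plain TCP connect. It is not the project's test-services.sh, the port table simply repeats the values listed in this report, and the function names are illustrative.

```python
import socket
from typing import Dict

# Ports as listed in this report.
SERVICES = {
    "Wyoming STT": 10300,
    "Wyoming TTS": 10301,
    "Wyoming Satellite": 10700,
    "OpenClaw": 8080,
    "Ollama": 11434,
}


def port_open(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (like `nc -z`)."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def check_all(host: str = "localhost") -> Dict[str, bool]:
    """Probe every listed service port and report which ones accept connections."""
    return {name: port_open(host, port) for name, port in SERVICES.items()}
```

Running `check_all("10.0.0.199")` from another machine additionally verifies that the services are reachable over the network, not only on the loopback interface.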
Audio Issues
# Test microphone
rec -r 16000 -c 1 test.wav trim 0 5
# Test speaker
afplay /System/Library/Sounds/Glass.aiff
Next Actions
- Access Home Assistant UI at http://10.0.0.199:8123
- Follow setup guide: homeai-voice/VOICE_PIPELINE_SETUP.md
- Install OpenClaw component (see Step 1 in setup guide)
- Configure Wyoming integrations (see Step 2 in setup guide)
- Create voice pipeline (see Step 4 in setup guide)
- Test end-to-end (see Step 5 in setup guide)
Success Metrics
- All services show green in health check
- Wyoming integrations appear in HA
- OpenClaw Conversation agent registered
- Voice pipeline created and set as default
- Typed query returns spoken response
- Voice query via satellite works
- Home control via voice works
- End-to-end latency < 5 seconds
- Services survive Mac Mini reboot
Project Context
This is Phase 2 of the HomeAI project. See TODO.md for the complete project roadmap.
Previous Phase: Phase 1 - Foundation (Infrastructure + LLM) ✅ Complete
Current Phase: Phase 2 - Voice Pipeline 🔄 Backend Complete, HA Integration Pending
Next Phase: Phase 3 - Agent & Character (mem0, character system, workflows)