homeai/VOICE_PIPELINE_STATUS.md
Aodhan Collins 664bb6d275 feat: OpenClaw HTTP bridge, HA conversation agent fixes, voice pipeline tooling
- Add openclaw-http-bridge.py: HTTP server translating POST requests to OpenClaw CLI calls
- Add launchd plist for HTTP bridge (port 8081, auto-start)
- Add install-to-docker-ha.sh: deploy custom component to Docker HA via SSH
- Add package-for-ha.sh: create distributable tarball of custom component
- Add test-services.sh: comprehensive voice pipeline service checker

Fixes from code review:
- Use OpenClawAgent (HTTP) in async_setup_entry instead of OpenClawCLIAgent
  (CLI agent fails inside Docker HA where openclaw binary doesn't exist)
- Update all port references from 8080 to 8081 (HTTP bridge port)
- Remove overly permissive CORS headers from HTTP bridge
- Fix zombie process leak: kill child process on CLI timeout
- Remove unused subprocess import in conversation.py
- Add version field to Kokoro TTS Wyoming info
- Update TODO.md with voice pipeline progress
2026-03-08 22:46:04 +00:00

Voice Pipeline Status Report

Last Updated: 2026-03-08


Executive Summary

The voice pipeline backend is fully operational on the Mac Mini. All services are running and tested:

  • Wyoming STT (Whisper large-v3) - Port 10300
  • Wyoming TTS (Kokoro ONNX) - Port 10301
  • Wyoming Satellite (wake word + audio) - Port 10700
  • OpenClaw Agent (LLM + skills) - Port 8080
  • Ollama (local LLM runtime) - Port 11434

Next Step: Manual Home Assistant UI configuration to connect the pipeline.


What's Working

1. Speech-to-Text (STT)

  • Service: Wyoming Faster Whisper
  • Model: large-v3 (multilingual, high accuracy)
  • Port: 10300
  • Status: Running via launchd (com.homeai.wyoming-stt)
  • Test: nc -z localhost 10300

2. Text-to-Speech (TTS)

  • Service: Wyoming Kokoro ONNX
  • Voice: af_heart (default, configurable)
  • Port: 10301
  • Status: Running via launchd (com.homeai.wyoming-tts)
  • Test: nc -z localhost 10301

3. Wyoming Satellite

  • Function: Wake word detection + audio capture/playback
  • Wake Word: "hey_jarvis" (openWakeWord model)
  • Port: 10700
  • Status: Running via launchd (com.homeai.wyoming-satellite)
  • Test: nc -z localhost 10700

4. OpenClaw Agent

  • Function: AI agent with tool calling (home automation, etc.)
  • Gateway: WebSocket + CLI
  • Port: 8080
  • Status: Running via launchd (com.homeai.openclaw)
  • Skills: home-assistant, voice-assistant
  • Test: openclaw agent --message "Hello" --agent main

5. Ollama LLM

  • Models: llama3.3:70b, qwen2.5:7b, and others
  • Port: 11434
  • Status: Running natively
  • Test: ollama list

6. Home Assistant Integration

  • Custom Component: OpenClaw Conversation agent created
  • Location: homeai-agent/custom_components/openclaw_conversation/
  • Features:
    • Full conversation agent implementation
    • Config flow for UI setup
    • CLI fallback if HTTP unavailable
    • Error handling and logging
  • Status: Ready for installation

What's Pending 🔄

Manual Steps Required (Home Assistant UI)

These steps require access to the Home Assistant web interface at http://10.0.0.199:8123:

  1. Install OpenClaw Conversation Component

  2. Add Wyoming Integrations

    • Settings → Devices & Services → Add Integration → Wyoming Protocol
    • Add STT (10.0.0.199:10300)
    • Add TTS (10.0.0.199:10301)
    • Add Satellite (10.0.0.199:10700)
  3. Add OpenClaw Conversation

    • Settings → Devices & Services → Add Integration → OpenClaw Conversation
    • Configure: host=10.0.0.199, port=8080, agent=main (use port 8081 instead if connecting through the OpenClaw HTTP bridge)
  4. Create Voice Assistant Pipeline

    • Settings → Voice Assistants → Add Assistant
    • Name: "HomeAI with OpenClaw"
    • STT: Mac Mini STT
    • Conversation: OpenClaw Conversation
    • TTS: Mac Mini TTS
    • Set as preferred
  5. Test the Pipeline

    • Type test: "What time is it?" in HA Assist
    • Voice test: "Hey Jarvis, turn on the reading lamp"
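
Once the conversation agent is registered, the typed test in step 5 can also be scripted against Home Assistant's REST conversation endpoint (`/api/conversation/process`). The following is a minimal sketch, not part of the project: `HA_URL` and `HA_TOKEN` (a long-lived access token) are assumed environment variables, and the helper names are hypothetical. No `agent_id` is sent, so the request goes to whichever agent HA treats as its default, which may not be OpenClaw.

```shell
# Build the JSON body for the conversation endpoint.
# NOTE: quotes and backslashes in the sentence are not escaped;
# this is only suitable for simple test sentences.
conversation_payload() {
  printf '{"text": "%s", "language": "en"}' "$1"
}

# POST a sentence to Home Assistant and print the raw JSON reply.
# HA_URL and HA_TOKEN are assumptions, not values defined by this project.
ha_ask() {
  curl -sS -X POST \
    -H "Authorization: Bearer ${HA_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$(conversation_payload "$1")" \
    "${HA_URL:-http://10.0.0.199:8123}/api/conversation/process"
}
```

For example, `ha_ask "What time is it?"` should print a JSON response if the token is valid.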

Future Enhancements

  1. Chatterbox TTS - Voice cloning for character personality
  2. Qwen3-TTS - Alternative voice synthesis via MLX
  3. Custom Wake Word - Train with character's name
  4. Uptime Kuma - Add monitoring for all services

Architecture

┌──────────────────────────────────────────────────────────────┐
│                        Mac Mini M4 Pro                        │
│                      (10.0.0.199)                             │
├──────────────────────────────────────────────────────────────┤
│                                                               │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │  Wyoming    │  │  Wyoming    │  │  Wyoming    │          │
│  │    STT      │  │    TTS      │  │  Satellite  │          │
│  │   :10300    │  │   :10301    │  │   :10700    │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
│                                                               │
│  ┌─────────────┐  ┌─────────────┐                           │
│  │  OpenClaw   │  │   Ollama    │                           │
│  │   Gateway   │  │     LLM     │                           │
│  │    :8080    │  │   :11434    │                           │
│  └─────────────┘  └─────────────┘                           │
│                                                               │
└──────────────────────────────────────────────────────────────┘
                            ▲
                            │ Wyoming Protocol + HTTP API
                            │
┌──────────────────────────────────────────────────────────────┐
│                    Home Assistant Server                      │
│                      (10.0.0.199)                             │
├──────────────────────────────────────────────────────────────┤
│                                                               │
│  ┌─────────────────────────────────────────────────────┐    │
│  │         Voice Assistant Pipeline                     │    │
│  │                                                       │    │
│  │  Wyoming STT → OpenClaw Conversation → Wyoming TTS   │    │
│  └─────────────────────────────────────────────────────┘    │
│                                                               │
│  ┌─────────────────────────────────────────────────────┐    │
│  │      OpenClaw Conversation Custom Component          │    │
│  │      (Routes to OpenClaw Gateway on Mac Mini)        │    │
│  └─────────────────────────────────────────────────────┘    │
│                                                               │
└──────────────────────────────────────────────────────────────┘

Voice Flow Example

User: "Hey Jarvis, turn on the reading lamp"

  1. Wake Word Detection (Wyoming Satellite)

    • Detects "Hey Jarvis"
    • Starts recording audio
  2. Speech-to-Text (Wyoming STT)

    • Transcribes: "turn on the reading lamp"
    • Sends text to Home Assistant
  3. Conversation Processing (HA → OpenClaw)

    • HA Voice Pipeline receives text
    • Routes to OpenClaw Conversation agent
    • OpenClaw Gateway processes request
  4. LLM Processing (Ollama)

    • llama3.3:70b generates response
    • Identifies intent: control light
    • Calls home-assistant skill
  5. Action Execution (Home Assistant API)

    • OpenClaw calls HA REST API
    • Turns on "reading lamp" entity
    • Returns confirmation
  6. Text-to-Speech (Wyoming TTS)

    • Generates audio: "I've turned on the reading lamp"
    • Sends to Wyoming Satellite
  7. Audio Playback (Mac Mini Speaker)

    • Plays confirmation audio
    • User hears response

Total Latency: Target < 5 seconds
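
To compare a single stage against the 5-second target, a command can be timed from the shell. This is a rough sketch with hypothetical helper names; it uses whole-second resolution from `date +%s` and only covers whatever command is passed in (for example the OpenClaw leg), not the full wake-word-to-playback path.

```shell
# Latency target in seconds, taken from the figure stated above.
LATENCY_TARGET_S=5

# Run a command, discard its output, and print the elapsed whole seconds.
time_cmd() {
  local start end
  start=$(date +%s)
  "$@" >/dev/null 2>&1 || true
  end=$(date +%s)
  echo $((end - start))
}

# Succeed when the given duration (seconds) is under the target.
within_target() {
  [ "$1" -lt "$LATENCY_TARGET_S" ]
}
```

A possible use: `within_target "$(time_cmd openclaw agent --message "What time is it?" --agent main)" && echo "under target"`.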


Service Management

Check All Services

# Quick health check
./homeai-voice/scripts/test-services.sh

# Individual service status
launchctl list | grep homeai

Restart a Service

# Example: Restart STT
launchctl unload ~/Library/LaunchAgents/com.homeai.wyoming-stt.plist
launchctl load ~/Library/LaunchAgents/com.homeai.wyoming-stt.plist
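
The same unload/load pair works for any of the services, so it can be wrapped in a small function. This is a sketch under the assumption that every plist lives in `~/Library/LaunchAgents` and is named after its launchd label, as in the example above; the function names are hypothetical.

```shell
# Map a launchd label to its plist path under ~/Library/LaunchAgents.
plist_for() {
  echo "$HOME/Library/LaunchAgents/$1.plist"
}

# Unload and reload one service by label; fail early if the plist is missing.
restart_service() {
  local plist
  plist=$(plist_for "$1")
  if [ ! -f "$plist" ]; then
    echo "no plist found: $plist" >&2
    return 1
  fi
  launchctl unload "$plist"
  launchctl load "$plist"
}
```

For example, `restart_service com.homeai.wyoming-tts` restarts the TTS service.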

View Logs

# STT logs
tail -f /tmp/homeai-wyoming-stt.log

# TTS logs
tail -f /tmp/homeai-wyoming-tts.log

# Satellite logs
tail -f /tmp/homeai-wyoming-satellite.log

# OpenClaw logs
tail -f /tmp/homeai-openclaw.log
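
When a service misbehaves, it is often quicker to pull only the error lines than to follow the whole log. The sketch below assumes the `/tmp/homeai-<service>.log` naming shown above; `LOG_DIR` and the function names are hypothetical, and the match pattern is a guess at what the services log on failure.

```shell
# Path of a service log, e.g. log_for wyoming-stt -> /tmp/homeai-wyoming-stt.log.
# LOG_DIR is an optional override for where the logs live.
log_for() {
  echo "${LOG_DIR:-/tmp}/homeai-$1.log"
}

# Print the most recent lines (default 20) that look like errors.
recent_errors() {
  grep -i -E 'error|exception|traceback' "$(log_for "$1")" | tail -n "${2:-20}"
}
```

For example, `recent_errors wyoming-satellite 5` shows the last five matching lines from the satellite log.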

Key Documentation

| Document | Purpose |
|----------|---------|
| homeai-voice/VOICE_PIPELINE_SETUP.md | Complete setup guide with step-by-step HA configuration |
| homeai-voice/RESUME_WORK.md | Quick reference for resuming work |
| homeai-agent/custom_components/openclaw_conversation/README.md | Custom component documentation |
| plans/ha-voice-pipeline-implementation.md | Detailed implementation plan |
| plans/voice-loop-integration.md | Architecture options and decisions |

Testing

Automated Tests

# Service health check
./homeai-voice/scripts/test-services.sh

# OpenClaw test
openclaw agent --message "What time is it?" --agent main

# Home Assistant skill test
openclaw agent --message "Turn on the reading lamp" --agent main

Manual Tests

  1. Type Test (HA Assist)

    • Open HA UI → Click Assist icon
    • Type: "What time is it?"
    • Expected: Hear spoken response
  2. Voice Test (Wyoming Satellite)

    • Say: "Hey Jarvis"
    • Wait for beep
    • Say: "What time is it?"
    • Expected: Hear spoken response
  3. Home Control Test

    • Say: "Hey Jarvis"
    • Say: "Turn on the reading lamp"
    • Expected: Light turns on + confirmation

Troubleshooting

Services Not Running

# Check launchd
launchctl list | grep homeai

# Reload all services
./homeai-voice/scripts/load-all-launchd.sh

Network Issues

# Test from Mac Mini to HA (without a token this returns 401, which still confirms HA is reachable)
curl http://10.0.0.199:8123/api/

# Test ports
nc -z localhost 10300  # STT
nc -z localhost 10301  # TTS
nc -z localhost 10700  # Satellite
nc -z localhost 8080   # OpenClaw
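
The individual probes above can be folded into one loop that reports every service at once. This is a minimal sketch and not the project's `test-services.sh`; the function names are hypothetical, and the port list (including Ollama on 11434) is copied from this report.

```shell
# Service name and port pairs, as listed in this report.
SERVICES="stt:10300 tts:10301 satellite:10700 openclaw:8080 ollama:11434"

# Print the port for a service name; return 1 if the name is unknown.
service_port() {
  local entry
  for entry in $SERVICES; do
    if [ "${entry%%:*}" = "$1" ]; then
      echo "${entry##*:}"
      return 0
    fi
  done
  return 1
}

# Probe every port on a host (default: localhost) with nc -z.
# Prints "name port up|down" per service; returns non-zero if any is down.
check_all() {
  local host="${1:-localhost}" entry name port failed=0
  for entry in $SERVICES; do
    name="${entry%%:*}"
    port="${entry##*:}"
    if nc -z "$host" "$port" >/dev/null 2>&1; then
      echo "$name $port up"
    else
      echo "$name $port down"
      failed=1
    fi
  done
  return "$failed"
}
```

Running `check_all` on the Mac Mini, or `check_all 10.0.0.199` from another machine, gives a one-line status per service.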

Audio Issues

# Test microphone (rec is part of SoX, e.g. brew install sox)
rec -r 16000 -c 1 test.wav trim 0 5

# Test speaker
afplay /System/Library/Sounds/Glass.aiff

Next Actions

  1. Access Home Assistant UI at http://10.0.0.199:8123
  2. Follow setup guide: homeai-voice/VOICE_PIPELINE_SETUP.md
  3. Install OpenClaw component (see Step 1 in setup guide)
  4. Configure Wyoming integrations (see Step 2 in setup guide)
  5. Create voice pipeline (see Step 4 in setup guide)
  6. Test end-to-end (see Step 5 in setup guide)

Success Metrics

  • All services show green in health check
  • Wyoming integrations appear in HA
  • OpenClaw Conversation agent registered
  • Voice pipeline created and set as preferred
  • Typed query returns spoken response
  • Voice query via satellite works
  • Home control via voice works
  • End-to-end latency < 5 seconds
  • Services survive Mac Mini reboot

Project Context

This is Phase 2 of the HomeAI project. See TODO.md for the complete project roadmap.

Previous Phase: Phase 1 - Foundation (Infrastructure + LLM): Complete
Current Phase: Phase 2 - Voice Pipeline 🔄 Backend Complete, HA Integration Pending
Next Phase: Phase 3 - Agent & Character (mem0, character system, workflows)