
# Voice Pipeline Status Report

> Last Updated: 2026-03-08

---

## Executive Summary

The voice pipeline backend is **fully operational** on the Mac Mini. All services are running and tested:

- ✅ Wyoming STT (Whisper large-v3) - Port 10300
- ✅ Wyoming TTS (Kokoro ONNX) - Port 10301
- ✅ Wyoming Satellite (wake word + audio) - Port 10700
- ✅ OpenClaw Agent (LLM + skills) - Port 8080
- ✅ Ollama (local LLM runtime) - Port 11434

**Next Step**: Manual Home Assistant UI configuration to connect the pipeline.

---

## What's Working ✅

### 1. Speech-to-Text (STT)
- **Service**: Wyoming Faster Whisper
- **Model**: large-v3 (multilingual, high accuracy)
- **Port**: 10300
- **Status**: Running via launchd (`com.homeai.wyoming-stt`)
- **Test**: `nc -z localhost 10300` ✓

### 2. Text-to-Speech (TTS)
- **Service**: Wyoming Kokoro ONNX
- **Voice**: af_heart (default, configurable)
- **Port**: 10301
- **Status**: Running via launchd (`com.homeai.wyoming-tts`)
- **Test**: `nc -z localhost 10301` ✓

### 3. Wyoming Satellite
- **Function**: Wake word detection + audio capture/playback
- **Wake Word**: "hey_jarvis" (openWakeWord model)
- **Port**: 10700
- **Status**: Running via launchd (`com.homeai.wyoming-satellite`)
- **Test**: `nc -z localhost 10700` ✓

### 4. OpenClaw Agent
- **Function**: AI agent with tool calling (home automation, etc.)
- **Gateway**: WebSocket + CLI
- **Port**: 8080
- **Status**: Running via launchd (`com.homeai.openclaw`)
- **Skills**: home-assistant, voice-assistant
- **Test**: `openclaw agent --message "Hello" --agent main` ✓

### 5. Ollama LLM
- **Models**: llama3.3:70b, qwen2.5:7b, and others
- **Port**: 11434
- **Status**: Running natively
- **Test**: `ollama list` ✓

### 6. Home Assistant Integration
- **Custom Component**: OpenClaw Conversation agent created
- **Location**: `homeai-agent/custom_components/openclaw_conversation/`
- **Features**:
  - Full conversation agent implementation
  - Config flow for UI setup
  - CLI fallback when the HTTP API is unavailable
  - Error handling and logging
- **Status**: Ready for installation

---

## What's Pending 🔄

### Manual Steps Required (Home Assistant UI)

These steps require access to the Home Assistant web interface at http://10.0.0.199:8123:

1. **Install OpenClaw Conversation Component**
   - Copy component to HA server's `/config/custom_components/`
   - Restart Home Assistant
   - See: [`homeai-voice/VOICE_PIPELINE_SETUP.md`](homeai-voice/VOICE_PIPELINE_SETUP.md)

2. **Add Wyoming Integrations**
   - Settings → Devices & Services → Add Integration → Wyoming Protocol
   - Add STT (10.0.0.199:10300)
   - Add TTS (10.0.0.199:10301)
   - Add Satellite (10.0.0.199:10700)

3. **Add OpenClaw Conversation**
   - Settings → Devices & Services → Add Integration → OpenClaw Conversation
   - Configure: host=10.0.0.199, port=8080, agent=main

4. **Create Voice Assistant Pipeline**
   - Settings → Voice Assistants → Add Assistant
   - Name: "HomeAI with OpenClaw"
   - STT: Mac Mini STT
   - Conversation: OpenClaw Conversation
   - TTS: Mac Mini TTS
   - Set as preferred

5. **Test the Pipeline**
   - Type test: "What time is it?" in HA Assist
   - Voice test: "Hey Jarvis, turn on the reading lamp"
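
Step 1 amounts to a recursive copy into Home Assistant's config directory. The following is a minimal sketch, assuming that directory is reachable as a local path (for example a mounted share or a Docker volume). The `install_component` helper name is hypothetical, and a remote HA host would need `scp` or `rsync` instead:

```bash
# Hypothetical helper: copy a custom component into <config_dir>/custom_components/.
# Assumes <config_dir> is a locally reachable path to Home Assistant's config directory.
install_component() {
  local src="$1" config_dir="$2"
  if [ ! -d "$src" ]; then
    echo "component source not found: $src" >&2
    return 1
  fi
  mkdir -p "$config_dir/custom_components" || return 1
  # Without a trailing slash on $src, cp -R copies the directory itself.
  cp -R "$src" "$config_dir/custom_components/"
}
```

For example, `install_component homeai-agent/custom_components/openclaw_conversation /config`, followed by a Home Assistant restart.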

### Future Enhancements

1. **Chatterbox TTS** - Voice cloning for character personality
2. **Qwen3-TTS** - Alternative voice synthesis via MLX
3. **Custom Wake Word** - Train with character's name
4. **Uptime Kuma** - Add monitoring for all services

---

## Architecture

```
┌──────────────────────────────────────────────────────────────┐
│                        Mac Mini M4 Pro                        │
│                      (10.0.0.199)                             │
├──────────────────────────────────────────────────────────────┤
│                                                               │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │  Wyoming    │  │  Wyoming    │  │  Wyoming    │          │
│  │    STT      │  │    TTS      │  │  Satellite  │          │
│  │   :10300    │  │   :10301    │  │   :10700    │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
│                                                               │
│  ┌─────────────┐  ┌─────────────┐                           │
│  │  OpenClaw   │  │   Ollama    │                           │
│  │   Gateway   │  │     LLM     │                           │
│  │    :8080    │  │   :11434    │                           │
│  └─────────────┘  └─────────────┘                           │
│                                                               │
└──────────────────────────────────────────────────────────────┘
                            ▲
                            │ Wyoming Protocol + HTTP API
                            │
┌──────────────────────────────────────────────────────────────┐
│                    Home Assistant Server                      │
│                      (10.0.0.199)                             │
├──────────────────────────────────────────────────────────────┤
│                                                               │
│  ┌─────────────────────────────────────────────────────┐    │
│  │         Voice Assistant Pipeline                     │    │
│  │                                                       │    │
│  │  Wyoming STT → OpenClaw Conversation → Wyoming TTS   │    │
│  └─────────────────────────────────────────────────────┘    │
│                                                               │
│  ┌─────────────────────────────────────────────────────┐    │
│  │      OpenClaw Conversation Custom Component          │    │
│  │      (Routes to OpenClaw Gateway on Mac Mini)        │    │
│  └─────────────────────────────────────────────────────┘    │
│                                                               │
└──────────────────────────────────────────────────────────────┘
```

---

## Voice Flow Example

**User**: "Hey Jarvis, turn on the reading lamp"

1. **Wake Word Detection** (Wyoming Satellite)
   - Detects "Hey Jarvis"
   - Starts recording audio

2. **Speech-to-Text** (Wyoming STT)
   - Transcribes: "turn on the reading lamp"
   - Sends text to Home Assistant

3. **Conversation Processing** (HA → OpenClaw)
   - HA Voice Pipeline receives text
   - Routes to OpenClaw Conversation agent
   - OpenClaw Gateway processes request

4. **LLM Processing** (Ollama)
   - llama3.3:70b generates response
   - Identifies intent: control light
   - Calls home-assistant skill

5. **Action Execution** (Home Assistant API)
   - OpenClaw calls HA REST API
   - Turns on "reading lamp" entity
   - Returns confirmation

6. **Text-to-Speech** (Wyoming TTS)
   - Generates audio: "I've turned on the reading lamp"
   - Sends to Wyoming Satellite

7. **Audio Playback** (Mac Mini Speaker)
   - Plays confirmation audio
   - User hears response
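
Step 5 corresponds to a Home Assistant REST service call. The sketch below shows an equivalent request made with `curl`. The entity ID `light.reading_lamp`, the `HA_URL` and `HA_TOKEN` variables, and both function names are assumptions for illustration; how the OpenClaw skill actually issues the call is not shown in this report:

```bash
# Assumed base URL; override by exporting HA_URL.
HA_URL="${HA_URL:-http://10.0.0.199:8123}"

# Build the JSON body for a service call that targets one entity.
service_payload() {
  printf '{"entity_id": "%s"}' "$1"
}

# Call the light.turn_on service through the HA REST API.
# HA_TOKEN is assumed to hold a long-lived access token.
ha_light_on() {
  curl -s -X POST \
    -H "Authorization: Bearer ${HA_TOKEN:-}" \
    -H "Content-Type: application/json" \
    -d "$(service_payload "$1")" \
    "${HA_URL}/api/services/light/turn_on"
}
```

Usage would look like `ha_light_on light.reading_lamp`, with the entity ID replaced by whatever the reading lamp is called in your installation.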

**Total Latency**: Target < 5 seconds

---

## Service Management

### Check All Services

```bash
# Quick health check
./homeai-voice/scripts/test-services.sh

# Individual service status
launchctl list | grep homeai
```
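
The contents of `test-services.sh` are not reproduced in this report. As a rough sketch of what such a check might do, the functions below probe each port from the summary with `nc -z`. The `SERVICES` table layout and the function names are hypothetical:

```bash
# Hypothetical service table: name:port pairs taken from the Executive Summary.
SERVICES="stt:10300 tts:10301 satellite:10700 openclaw:8080 ollama:11434"

# Print "up" or "down" for one TCP port on the given host.
port_status() {
  if nc -z "$1" "$2" >/dev/null 2>&1; then echo "up"; else echo "down"; fi
}

# Report every service in the table; return non-zero if any port is down.
check_services() {
  local host="${1:-localhost}" failed=0 entry name port state
  for entry in $SERVICES; do
    name="${entry%%:*}"   # text before the colon
    port="${entry##*:}"   # text after the colon
    state="$(port_status "$host" "$port")"
    printf '%-10s %-6s %s\n' "$name" "$port" "$state"
    [ "$state" = "up" ] || failed=1
  done
  return "$failed"
}
```

Running `check_services` prints one line per service, and the exit status makes it usable from other scripts or a monitoring job.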

### Restart a Service

```bash
# Example: Restart STT
launchctl unload ~/Library/LaunchAgents/com.homeai.wyoming-stt.plist
launchctl load ~/Library/LaunchAgents/com.homeai.wyoming-stt.plist
```
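
The same two commands apply to the other services. A small wrapper could look like the sketch below, assuming every plist follows the `com.homeai.<name>.plist` naming shown above; the function names are hypothetical:

```bash
# Map a short service name (e.g. wyoming-stt) to its LaunchAgent plist path.
plist_path() {
  echo "$HOME/Library/LaunchAgents/com.homeai.$1.plist"
}

# Unload and reload one service; refuses to run if the plist is missing.
restart_service() {
  local plist
  plist="$(plist_path "$1")"
  if [ ! -f "$plist" ]; then
    echo "no plist at $plist" >&2
    return 1
  fi
  launchctl unload "$plist"
  launchctl load "$plist"
}
```

For example, `restart_service wyoming-tts` or `restart_service openclaw`.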

### View Logs

```bash
# STT logs
tail -f /tmp/homeai-wyoming-stt.log

# TTS logs
tail -f /tmp/homeai-wyoming-tts.log

# Satellite logs
tail -f /tmp/homeai-wyoming-satellite.log

# OpenClaw logs
tail -f /tmp/homeai-openclaw.log
```
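
For a quick overview before tailing a specific file, a helper along these lines can count suspicious lines per log. It is a sketch only: the function name is hypothetical, and matching on the word "error" is an assumption about what the services write to their logs:

```bash
# Count lines mentioning "error" (case-insensitive) in each homeai log.
# The homeai-*.log naming follows the paths above; the pattern is an assumption.
log_error_counts() {
  local dir="${1:-/tmp}" f
  for f in "$dir"/homeai-*.log; do
    [ -f "$f" ] || continue   # skip if the glob matched nothing
    printf '%s %s\n' "$(basename "$f")" "$(grep -ci 'error' "$f")"
  done
}
```

Calling `log_error_counts` with no argument scans `/tmp`.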

---

## Key Documentation

| Document | Purpose |
|----------|---------|
| [`homeai-voice/VOICE_PIPELINE_SETUP.md`](homeai-voice/VOICE_PIPELINE_SETUP.md) | Complete setup guide with step-by-step HA configuration |
| [`homeai-voice/RESUME_WORK.md`](homeai-voice/RESUME_WORK.md) | Quick reference for resuming work |
| [`homeai-agent/custom_components/openclaw_conversation/README.md`](homeai-agent/custom_components/openclaw_conversation/README.md) | Custom component documentation |
| [`plans/ha-voice-pipeline-implementation.md`](plans/ha-voice-pipeline-implementation.md) | Detailed implementation plan |
| [`plans/voice-loop-integration.md`](plans/voice-loop-integration.md) | Architecture options and decisions |

---

## Testing

### Automated Tests

```bash
# Service health check
./homeai-voice/scripts/test-services.sh

# OpenClaw test
openclaw agent --message "What time is it?" --agent main

# Home Assistant skill test
openclaw agent --message "Turn on the reading lamp" --agent main
```

### Manual Tests

1. **Type Test** (HA Assist)
   - Open HA UI → Click Assist icon
   - Type: "What time is it?"
   - Expected: Hear spoken response

2. **Voice Test** (Wyoming Satellite)
   - Say: "Hey Jarvis"
   - Wait for beep
   - Say: "What time is it?"
   - Expected: Hear spoken response

3. **Home Control Test**
   - Say: "Hey Jarvis"
   - Say: "Turn on the reading lamp"
   - Expected: Light turns on + confirmation

---

## Troubleshooting

### Services Not Running

```bash
# Check launchd
launchctl list | grep homeai

# Reload all services
./homeai-voice/scripts/load-all-launchd.sh
```
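
`launchctl list` prints three columns: PID, last exit status, and label. A `-` in the PID column means the job is loaded but not currently running. The filter below is a small sketch (the function name is hypothetical) that lists only the `com.homeai.*` jobs in that state:

```bash
# Read `launchctl list` output on stdin and print homeai jobs that have no PID.
stopped_services() {
  awk '$3 ~ /^com\.homeai\./ && $1 == "-" { print $3 " (last exit status " $2 ")" }'
}
```

Usage: `launchctl list | stopped_services`. Empty output means every loaded homeai job has a running process.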

### Network Issues

```bash
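# Test from Mac Mini to HA. The API expects a long-lived access token
# (assumed here to be in $HA_TOKEN); a 401 response still confirms HA is reachable.
curl -H "Authorization: Bearer $HA_TOKEN" http://10.0.0.199:8123/api/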

# Test ports
nc -z localhost 10300  # STT
nc -z localhost 10301  # TTS
nc -z localhost 10700  # Satellite
nc -z localhost 8080   # OpenClaw
```
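
When the Home Assistant check fails, the HTTP status code usually separates a network problem from an authentication problem. The sketch below assumes a long-lived access token in a variable named `HA_TOKEN`; both function names are hypothetical:

```bash
# Fetch only the HTTP status code from the HA API root.
# curl prints 000 when it cannot connect at all.
ha_api_code() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
    -H "Authorization: Bearer ${HA_TOKEN:-}" \
    "${1:-http://10.0.0.199:8123}/api/"
}

# Translate the status code into a short diagnosis.
explain_code() {
  case "$1" in
    200) echo "reachable, token accepted" ;;
    401) echo "reachable, token missing or rejected" ;;
    000) echo "unreachable (connection failed or timed out)" ;;
    *)   echo "unexpected HTTP status $1" ;;
  esac
}
```

Usage: `explain_code "$(ha_api_code)"`.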

### Audio Issues

```bash
# Test microphone: record 5 s of 16 kHz mono audio (rec ships with SoX, e.g. `brew install sox`)
rec -r 16000 -c 1 test.wav trim 0 5

# Test speaker
afplay /System/Library/Sounds/Glass.aiff
```

---

## Next Actions

1. **Access Home Assistant UI** at http://10.0.0.199:8123
2. **Follow setup guide**: [`homeai-voice/VOICE_PIPELINE_SETUP.md`](homeai-voice/VOICE_PIPELINE_SETUP.md)
3. **Install OpenClaw component** (see Step 1 in setup guide)
4. **Configure Wyoming integrations** (see Step 2 in setup guide)
5. **Create voice pipeline** (see Step 4 in setup guide)
6. **Test end-to-end** (see Step 5 in setup guide)

---

## Success Metrics

- [ ] All services show green in health check
- [ ] Wyoming integrations appear in HA
- [ ] OpenClaw Conversation agent registered
- [ ] Voice pipeline created and set as preferred
- [ ] Typed query returns spoken response
- [ ] Voice query via satellite works
- [ ] Home control via voice works
- [ ] End-to-end latency < 5 seconds
- [ ] Services survive Mac Mini reboot
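
The latency item can be spot-checked from the command line. The helper below is a rough sketch with a hypothetical name: it times any command at whole-second resolution, so wrapping the `openclaw` CLI covers only the agent and LLM leg, not wake word detection, STT, or TTS:

```bash
# Run a command and succeed only if it finishes in fewer than <budget> whole seconds.
within_budget() {
  local budget="$1" start end elapsed
  shift
  start="$(date +%s)"
  "$@" >/dev/null 2>&1      # the command's own exit status is ignored
  end="$(date +%s)"
  elapsed=$(( end - start ))
  echo "elapsed=${elapsed}s budget=${budget}s"
  [ "$elapsed" -lt "$budget" ]
}
```

For example: `within_budget 5 openclaw agent --message "What time is it?" --agent main`.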

---

## Project Context

This is **Phase 2** of the HomeAI project. See [`TODO.md`](TODO.md) for the complete project roadmap.

- **Previous Phase**: Phase 1 - Foundation (Infrastructure + LLM) ✅ Complete
- **Current Phase**: Phase 2 - Voice Pipeline 🔄 Backend Complete, HA Integration Pending
- **Next Phase**: Phase 3 - Agent & Character (mem0, character system, workflows)