homeai/plans/next-steps.md
Aodhan Collins 6a0bae2a0b feat(phase-04): Wyoming Satellite integration + OpenClaw HA components
## Voice Pipeline (P3)
- Replace openWakeWord daemon with Wyoming Satellite approach
- Add Wyoming Satellite service on port 10700 for HA voice pipeline
- Update setup.sh with cross-platform sed compatibility (macOS/Linux)
- Add version field to Kokoro TTS voice info
- Update launchd service loader to use Wyoming Satellite

## Home Assistant Integration (P4)
- Add custom conversation agent component (openclaw_conversation)
  - Fix: Use IntentResponse instead of plain strings (HA API requirement)
  - Support both HTTP API and CLI fallback modes
  - Config flow for easy HA UI setup
- Add OpenClaw bridge scripts (Python + Bash)
- Add ha-ctl utility for HA entity control
  - Fix: Use context manager for token file reading
- Add HA configuration examples and documentation

## Infrastructure
- Add mem0 backup automation (launchd + script)
- Add n8n workflow templates (morning briefing, notification router)
- Add VS Code workspace configuration
- Reorganize model files into categorized folders:
  - lmstudio-community/
  - mlx-community/
  - bartowski/
  - mradermacher/

## Documentation
- Update PROJECT_PLAN.md with Wyoming Satellite architecture
- Update TODO.md with completed Wyoming integration tasks
- Add OPENCLAW_INTEGRATION.md for HA setup guide

## Testing
- Verified Wyoming services running (STT:10300, TTS:10301, Satellite:10700)
- Verified OpenClaw CLI accessibility
- Confirmed cross-platform compatibility fixes
2026-03-08 02:06:37 +00:00


# HomeAI — Next Steps Plan

Created: 2026-03-07 | Priority: Voice Loop → Foundation Hardening → Character System


## Current State Summary

| Sub-Project | Status | Done / Total |
| --- | --- | --- |
| P1 homeai-infra | Core done, tail items remain | 6 / 9 |
| P2 homeai-llm | Core done, tail items remain | 6 / 8 |
| P3 homeai-voice | STT + TTS + wake word running, HA integration pending | 7 / 13 |
| P4 homeai-agent | OpenClaw + HA skill working, mem0 + n8n pending | 10 / 16 |
| P5 homeai-character | Not started | 0 / 11 |
| P6–P8 | Not started | 0 / * |

Key milestone reached: OpenClaw can receive text, call qwen2.5:7b via Ollama, execute tool calls, and control Home Assistant entities. The voice pipeline components (STT, TTS, wake word) are all running as launchd services.

Critical gap: The voice pipeline is not yet connected through Home Assistant to the agent. The pieces exist but the end-to-end flow is untested.


## Sprint 1 — Complete the Voice → Agent → HA Loop

Goal: Speak a command → hear a spoken response + see the HA action execute.

This is the highest-value work because it closes the core loop that every future feature builds on.

### Tasks

#### 1A. Finish HA Wyoming Integration (P3)

The Wyoming STT (port 10300) and TTS (port 10301) services are running. They need to be registered in Home Assistant.

- Open HA UI → Settings → Integrations → Add Integration → Wyoming Protocol
- Add STT provider: host 10.0.0.199 (or localhost if HA is on the same machine), port 10300
- Add TTS provider: host 10.0.0.199, port 10301
- Verify both appear as STT/TTS providers in HA
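
Before clicking through the UI, it is worth confirming the services are actually listening. A minimal probe, assuming the ports from this plan (10300/10301, plus 10700 for the satellite) and defaulting to localhost:

```shell
# check_port prints "open" or "closed" using bash's /dev/tcp redirection,
# so there are no external dependencies. Set WYOMING_HOST=10.0.0.199 when
# probing from a different machine than the one running the services.
check_port() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null && echo open || echo closed
}

WYOMING_HOST="${WYOMING_HOST:-localhost}"
for port in 10300 10301 10700; do
  echo "$WYOMING_HOST:$port $(check_port "$WYOMING_HOST" "$port")"
done
```

If any port reports closed, fix the launchd service before touching the HA integration — HA's error messages for an unreachable Wyoming endpoint are less specific.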

#### 1B. Create HA Voice Assistant Pipeline (P3)

- HA → Settings → Voice Assistants → Add Assistant
- Configure: STT = Wyoming Whisper, TTS = Wyoming Kokoro, Conversation Agent = Home Assistant default (or OpenClaw if wired)
- Set as default voice assistant pipeline

#### 1C. Test HA Assist via Browser (P3)

- Open HA dashboard → Assist panel
- Type a query (e.g. "What time is it?") → verify spoken response plays back
- Type a device command (e.g. "Turn on the reading lamp") → verify HA executes it
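
The same pipeline can be exercised from the CLI via HA's documented conversation REST endpoint, which is handy when debugging. `HA_URL` and `HA_TOKEN` are assumptions here — create a long-lived access token under your HA user profile:

```shell
# POST a command through HA's conversation API (the same path Assist uses).
HA_URL="${HA_URL:-http://localhost:8123}"
payload='{"text": "Turn on the reading lamp", "language": "en"}'
curl -s -X POST "$HA_URL/api/conversation/process" \
  -H "Authorization: Bearer ${HA_TOKEN:-}" \
  -H "Content-Type: application/json" \
  -d "$payload" || echo "HA not reachable at $HA_URL"
```

The JSON response includes the agent's spoken-reply text, so this also confirms which conversation agent the pipeline actually routed to.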

#### 1D. Set Up mem0 with Chroma Backend (P4)

- Install mem0: `pip install mem0ai`
- Install chromadb: `pip install chromadb`
- Pull embedding model: `ollama pull nomic-embed-text`
- Write mem0 config pointing at Ollama for LLM + embeddings, Chroma for vector store
- Test: store a memory, recall it via semantic search
- Verify mem0 data persists at `~/.openclaw/memory/chroma/`
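
A sketch of what that config could look like, written out as JSON. The `llm`/`embedder`/`vector_store` layout follows mem0's `from_config` convention as I understand it — verify the key names against the current mem0 docs, and prefer an absolute path over `~` for the Chroma store, since not every loader expands tildes:

```shell
# Write the sketched mem0 config; CFG_DIR is parameterized for dry runs.
CFG_DIR="${CFG_DIR:-$HOME/.openclaw}"
mkdir -p "$CFG_DIR"
cat > "$CFG_DIR/mem0-config.json" <<'EOF'
{
  "llm": {
    "provider": "ollama",
    "config": { "model": "qwen2.5:7b" }
  },
  "embedder": {
    "provider": "ollama",
    "config": { "model": "nomic-embed-text" }
  },
  "vector_store": {
    "provider": "chroma",
    "config": {
      "collection_name": "homeai",
      "path": "~/.openclaw/memory/chroma"
    }
  }
}
EOF
echo "wrote $CFG_DIR/mem0-config.json"
```

On the Python side, loading would then be roughly `Memory.from_config(json.load(open(...)))`, followed by an `add` / `search` round trip as the smoke test.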

#### 1E. Write Memory Backup launchd Job (P4)

- Create git repo at `~/.openclaw/memory/` (or a subdirectory)
- Write backup script: `git add . && git commit -m "mem0 backup $(date)" && git push`
- Write launchd plist: `com.homeai.mem0-backup.plist` — daily schedule
- Load plist, verify it runs
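
A slightly hardened take on the one-liner above (it skips empty commits and tolerates a missing remote), plus a matching plist. The script path inside the plist is a placeholder — point it at wherever the backup script actually lives:

```shell
# Backup script: commit memory changes, push if a remote exists. The -c
# identity flags just let the job commit non-interactively.
backup_memory() (
  dir="${1:-$HOME/.openclaw/memory}"
  cd "$dir" || exit 1
  if [ -z "$(git status --porcelain)" ]; then
    echo "nothing to back up"
    exit 0
  fi
  git add -A
  git -c user.name=homeai -c user.email=homeai@localhost \
    commit -m "mem0 backup $(date '+%Y-%m-%d %H:%M')"
  if git remote get-url origin >/dev/null 2>&1; then
    git push
  else
    echo "no remote configured; skipping push"
  fi
)

# Daily 03:00 schedule. Invoking via bash -c gets tilde expansion for the
# (placeholder) script path, which launchd would not do on its own.
PLIST_DIR="${PLIST_DIR:-$HOME/Library/LaunchAgents}"
mkdir -p "$PLIST_DIR"
cat > "$PLIST_DIR/com.homeai.mem0-backup.plist" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key><string>com.homeai.mem0-backup</string>
  <key>ProgramArguments</key>
  <array>
    <string>/bin/bash</string>
    <string>-c</string>
    <string>~/homeai/scripts/mem0-backup.sh</string>
  </array>
  <key>StartCalendarInterval</key>
  <dict><key>Hour</key><integer>3</integer><key>Minute</key><integer>0</integer></dict>
</dict>
</plist>
EOF
```

Load it with `launchctl load ~/Library/LaunchAgents/com.homeai.mem0-backup.plist`, then run `backup_memory` once by hand to confirm a commit lands.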

#### 1F. Build Morning Briefing n8n Workflow (P4)

- Verify n8n is running (Docker, deployed in P1)
- Create workflow: time trigger → fetch weather from HA → compose briefing text → POST to OpenClaw /speak endpoint
- Export workflow JSON to `homeai-agent/workflows/morning-briefing.json`
- Test: manually trigger → hear spoken briefing
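
The workflow's final POST step can be reproduced with curl for manual testing before wiring the n8n node. Note that the `/speak` endpoint and its payload shape are assumptions taken from this plan, not a documented OpenClaw API — adjust both to match the real interface:

```shell
# Reproduce the morning-briefing POST by hand (endpoint/payload assumed).
OPENCLAW_URL="${OPENCLAW_URL:-http://localhost:8080}"
briefing="Good morning. 14 degrees and clear. First meeting at 09:30."
payload=$(printf '{"text": "%s"}' "$briefing")
curl -s -X POST "$OPENCLAW_URL/speak" \
  -H "Content-Type: application/json" \
  -d "$payload" || echo "OpenClaw not reachable at $OPENCLAW_URL"
```

If this works by hand, the n8n HTTP Request node only needs the same URL, method, and JSON body.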

#### 1G. Build Notification Router n8n Workflow (P4)

- Create workflow: HA webhook trigger → classify urgency → high: TTS immediately, low: queue
- Export to `homeai-agent/workflows/notification-router.json`

#### 1H. Verify Full Voice → Agent → HA Action Flow (P3 + P4)

- Trigger wake word ("hey jarvis") via USB mic
- Speak a command: "Turn on the reading lamp"
- Verify: wake word detected → audio captured → STT transcribes → OpenClaw receives text → tool call to HA → lamp turns on → TTS response plays back
- Document any latency issues or failure points

### Sprint 1 Flow Diagram

```mermaid
flowchart LR
    A[USB Mic] -->|wake word| B[openWakeWord]
    B -->|audio stream| C[Wyoming STT - Whisper]
    C -->|transcribed text| D[Home Assistant Pipeline]
    D -->|text| E[OpenClaw Agent]
    E -->|tool call| F[HA REST API]
    F -->|action| G[Smart Device]
    E -->|response text| H[Wyoming TTS - Kokoro]
    H -->|audio| I[Speaker]
```

## Sprint 2 — Foundation Hardening

Goal: All services survive a reboot, are monitored, and are remotely accessible.

### Tasks

#### 2A. Install and Configure Tailscale (P1)

- Install Tailscale on Mac Mini: `brew install tailscale`
- Authenticate and join Tailnet
- Verify all services reachable via Tailscale IP (HA, Open WebUI, Portainer, Gitea, n8n, code-server)
- Document Tailscale IP → service URL mapping
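
One way to generate that mapping. `tailscale ip -4` is the real CLI; the port list mixes ports stated elsewhere in this plan with common defaults (HA 8123, Gitea 3000, n8n 5678, Portainer 9000) that should be verified against the actual deployment:

```shell
# Print a Tailscale IP -> service URL table for the docs.
TS_IP=$(tailscale ip -4 2>/dev/null | head -n1)
TS_IP="${TS_IP:-100.64.0.0}"   # placeholder if tailscale isn't installed yet

# Ports from this plan plus common defaults (verify!); code-server is
# omitted because its port depends entirely on how it was deployed.
services="Home Assistant:8123
Open WebUI:3030
Ollama API:11434
OpenClaw:8080
Gitea:3000
n8n:5678
Portainer:9000"

echo "$services" | while IFS=: read -r name port; do
  printf '%-16s http://%s:%s\n' "$name" "$TS_IP" "$port"
done
```

Redirect the output into the infra docs so the mapping stays regenerable instead of hand-maintained.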

#### 2B. Configure Uptime Kuma Monitors (P1 + P2)

- Add monitors for: Home Assistant, Portainer, Gitea, code-server, n8n
- Add monitors for: Ollama API (port 11434), Open WebUI (port 3030)
- Add monitors for: Wyoming STT (port 10300), Wyoming TTS (port 10301)
- Add monitor for: OpenClaw (port 8080)
- Configure mobile push alerts (ntfy or Pushover)
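
ntfy makes the alert channel easy to smoke-test: publishing is just an HTTP POST with the message as the request body. The topic name below is a placeholder — pick a hard-to-guess topic of your own and configure Uptime Kuma's ntfy notifier to match:

```shell
# Send a test push through ntfy (topic name is a placeholder).
NTFY_TOPIC="${NTFY_TOPIC:-homeai-alerts-example}"
curl -s -d "Uptime Kuma test alert" "https://ntfy.sh/$NTFY_TOPIC" >/dev/null \
  || echo "no network; skipping test push"
echo "sent to topic $NTFY_TOPIC"
```

If the phone receives this, the only remaining variable is Uptime Kuma's notifier configuration, not the transport.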

#### 2C. Cold Reboot Verification (P1)

- Reboot the Mac Mini
- Verify all Docker containers come back up (restart policy: unless-stopped)
- Verify launchd services start: Ollama, Wyoming STT, Wyoming TTS, openWakeWord, OpenClaw
- Check Uptime Kuma — all monitors green within 2 minutes
- Document any services that failed to restart and fix them
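
A post-reboot spot check that covers both supervisors in one pass. The `com.homeai` label prefix is an assumption — match the grep to whatever the launchd plists actually use:

```shell
# Post-reboot spot check: container status plus launchd jobs.
reboot_check() {
  echo "--- docker containers ---"
  command -v docker >/dev/null && docker ps --format '{{.Names}}\t{{.Status}}' \
    || echo "docker not available"
  echo "--- launchd jobs (com.homeai prefix assumed) ---"
  command -v launchctl >/dev/null && launchctl list | grep -i homeai \
    || echo "launchctl unavailable or no matching jobs"
}
reboot_check
```

Run it right after login; anything missing here is a restart-policy or plist bug to fix before moving on.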

#### 2D. Run LLM Benchmarks (P2)

- Run `homeai-llm/scripts/benchmark.sh`
- Record results: tokens/sec for each model (qwen2.5:7b, llama3.3:70b, etc.)
- Write results to `homeai-llm/benchmark-results.md`
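
As a cross-check on `benchmark.sh`, tokens/sec can be derived directly from the fields Ollama returns when `"stream": false`: `eval_count` (tokens generated) and `eval_duration` (nanoseconds):

```shell
# tps: pure arithmetic on Ollama's reported eval_count / eval_duration (ns).
tps() {
  python3 -c "import sys; c, d = float(sys.argv[1]), float(sys.argv[2]); print(round(c / d * 1e9, 1))" "$1" "$2"
}

# One-off generation with metrics (requires the Ollama server to be up).
curl -s http://localhost:11434/api/generate \
  -d '{"model": "qwen2.5:7b", "prompt": "Say hi.", "stream": false}' \
  || echo "Ollama not reachable"

# Example: 512 tokens over 16s of eval time
tps 512 16000000000   # prints 32.0
```

Feed the two fields from the JSON response into `tps` and compare against the script's numbers before recording them.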

## Sprint 3 — Character System (P5)

Goal: Character schema defined, default character created, Character Manager UI functional.

### Tasks

#### 3A. Define Character Schema (P5)

- Write `homeai-character/schema/character.schema.json` (v1) — based on the spec in PLAN.md
- Write `homeai-character/schema/README.md` documenting each field
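
A first cut at the schema file, guessed from the fields this plan mentions (expression IDs, a system prompt, custom rules) — treat the spec in PLAN.md as authoritative and rename fields to match it:

```shell
# Hypothetical v1 schema: every field name here is a guess.
SCHEMA_DIR="${SCHEMA_DIR:-homeai-character/schema}"
mkdir -p "$SCHEMA_DIR"
cat > "$SCHEMA_DIR/character.schema.json" <<'EOF'
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "HomeAI Character (v1)",
  "type": "object",
  "required": ["name", "system_prompt"],
  "properties": {
    "name": { "type": "string" },
    "system_prompt": { "type": "string" },
    "expressions": {
      "description": "logical state -> expression ID (placeholders for now)",
      "type": "object",
      "additionalProperties": { "type": "string" }
    },
    "rules": {
      "description": "custom behavior rules, free-form strings",
      "type": "array",
      "items": { "type": "string" }
    }
  }
}
EOF
```

The same file can later back the ajv validation in the Character Manager (3D), so schema and UI validate against one source of truth.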

#### 3B. Create Default Character (P5)

- Write `homeai-character/characters/aria.json` with placeholder expression IDs
- Validate aria.json against schema (manual or script)

#### 3C. Set Up Vite Project (P5)

- Initialize Vite + React project in `homeai-character/`
- Install deps: `npm install react react-dom ajv`
- Move existing character-manager.jsx into `src/`
- Verify dev server runs at http://localhost:5173

#### 3D. Wire Character Manager Features (P5)

- Integrate schema validation on export (ajv)
- Add expression mapping UI section
- Add custom rules editor
- Test full edit → export → validate → load cycle

#### 3E. Wire Character into OpenClaw (P4 + P5)

- Copy/symlink aria.json to `~/.openclaw/characters/aria.json`
- Configure OpenClaw to load its system prompt from the character JSON
- Verify OpenClaw uses Aria's system prompt in responses

## Open Decisions to Resolve During These Sprints

| Decision | Options | Recommendation |
| --- | --- | --- |
| Character name / wake word | "Aria" vs custom | Decide during Sprint 3 — affects wake word training later |
| mem0 backend | Chroma vs Qdrant | Start with Chroma (Sprint 1D) — migrate if recall quality is poor |
| HA conversation agent | Default HA vs OpenClaw | Test with HA default first, then wire OpenClaw in as a custom conversation agent |

## What This Unlocks

After these 3 sprints, the system will have:

- End-to-end voice control: speak → understand → act → respond
- Persistent memory: the assistant remembers across sessions
- Automated workflows: morning briefings, notification routing
- Monitoring: all services tracked, alerts on failure
- Remote access: everything reachable via Tailscale
- Character identity: Aria persona loaded into the agent pipeline
- Reboot resilience: everything survives a cold restart

This positions the project to move into Phase 4 (ESP32 hardware) and Phase 5 (VTube Studio visual layer) with confidence that the core pipeline is solid.