## Voice Pipeline (P3)

- Replace openWakeWord daemon with Wyoming Satellite approach
- Add Wyoming Satellite service on port 10700 for HA voice pipeline
- Update setup.sh with cross-platform sed compatibility (macOS/Linux)
- Add version field to Kokoro TTS voice info
- Update launchd service loader to use Wyoming Satellite

## Home Assistant Integration (P4)

- Add custom conversation agent component (openclaw_conversation)
- Fix: Use IntentResponse instead of plain strings (HA API requirement)
- Support both HTTP API and CLI fallback modes
- Config flow for easy HA UI setup
- Add OpenClaw bridge scripts (Python + Bash)
- Add ha-ctl utility for HA entity control
- Fix: Use context manager for token file reading
- Add HA configuration examples and documentation

## Infrastructure

- Add mem0 backup automation (launchd + script)
- Add n8n workflow templates (morning briefing, notification router)
- Add VS Code workspace configuration
- Reorganize model files into categorized folders:
  - lmstudio-community/
  - mlx-community/
  - bartowski/
  - mradermacher/

## Documentation

- Update PROJECT_PLAN.md with Wyoming Satellite architecture
- Update TODO.md with completed Wyoming integration tasks
- Add OPENCLAW_INTEGRATION.md for HA setup guide

## Testing

- Verified Wyoming services running (STT:10300, TTS:10301, Satellite:10700)
- Verified OpenClaw CLI accessibility
- Confirmed cross-platform compatibility fixes

# HomeAI — Master TODO

> Track progress across all sub-projects. See each sub-project `PLAN.md` for detailed implementation notes.
> Status: `[ ]` pending · `[~]` in progress · `[x]` done

---

## Phase 1 — Foundation

### P1 · homeai-infra

- [x] Install Docker Desktop for Mac, enable launch at login
- [x] Create shared `homeai` Docker network
- [x] Create `~/server/docker/` directory structure
- [x] Write compose files: Uptime Kuma, code-server, n8n (HA, Portainer, Gitea are pre-existing on 10.0.0.199)
- [x] `docker compose up -d` — bring all services up
- [x] Home Assistant onboarding — long-lived access token generated, stored in `.env`
- [ ] Install Tailscale, verify all services reachable on Tailnet
- [ ] Uptime Kuma: add monitors for all services, configure mobile alerts
- [ ] Verify all containers survive a cold reboot

### P2 · homeai-llm

- [x] Install Ollama natively via brew
- [x] Write and load launchd plist (`com.homeai.ollama.plist`) — `/opt/homebrew/bin/ollama`
- [x] Register local GGUF models via Modelfiles (no download): llama3.3:70b, qwen3:32b, codestral:22b, qwen2.5:7b
- [x] Register additional models: EVA-LLaMA-3.33-70B, Midnight-Miqu-70B, QwQ-32B, Qwen3.5-35B, Qwen3-Coder-30B, Qwen3-VL-30B, GLM-4.6V-Flash, DeepSeek-R1-8B, gemma-3-27b
- [x] Deploy Open WebUI via Docker compose (port 3030)
- [x] Verify Open WebUI connected to Ollama, all models available
- [ ] Run `scripts/benchmark.sh` — record results in `benchmark-results.md`
- [ ] Add Ollama + Open WebUI to Uptime Kuma monitors

---

## Phase 2 — Voice Pipeline

### P3 · homeai-voice

- [x] Install `wyoming-faster-whisper` — model: faster-whisper-large-v3 (auto-downloaded)
- [x] Install Kokoro ONNX TTS — models at `~/models/kokoro/`
- [x] Write Wyoming-Kokoro adapter server (`homeai-voice/tts/wyoming_kokoro_server.py`)
- [x] Write + load launchd plists for Wyoming STT (10300) and TTS (10301)
- [x] Install openWakeWord + pyaudio — model: hey_jarvis
- [x] Write + load openWakeWord launchd plist (`com.homeai.wakeword`) — DISABLED, replaced by Wyoming satellite
- [x] Write `wyoming/test-pipeline.sh` — smoke test (3/3 passing)
- [x] Install Wyoming satellite — handles wake word via HA voice pipeline
- [x] Connect Home Assistant Wyoming integration (STT + TTS + Satellite)
- [x] Install Wyoming satellite for Mac Mini (port 10700)
- [ ] Create HA Voice Assistant pipeline with OpenClaw conversation agent
- [ ] Test HA Assist via browser: type query → hear spoken response
- [ ] Install Chatterbox TTS (MPS build), test with sample `.wav`
- [ ] Install Qwen3-TTS via MLX (fallback)
- [ ] Train custom wake word using character name
- [ ] Add Wyoming STT/TTS to Uptime Kuma monitors

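
The ports listed in the P3 checklist (STT 10300, TTS 10301, satellite 10700) can be probed with a plain TCP connect before wiring up proper monitors. The following is a minimal sketch, not the project's `test-pipeline.sh`: the function names and the `127.0.0.1` default host are assumptions, and a successful connect only shows that something is listening, not that it speaks the Wyoming protocol.

```python
import socket
from typing import Dict, Optional

# Ports come from the checklist; the service names are labels chosen here.
WYOMING_SERVICES: Dict[str, int] = {"stt": 10300, "tts": 10301, "satellite": 10700}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(
    host: str = "127.0.0.1",  # assumption: services run on the local machine
    services: Optional[Dict[str, int]] = None,
) -> Dict[str, bool]:
    """Probe each named service and report which ports accept connections."""
    targets = WYOMING_SERVICES if services is None else services
    return {name: port_is_open(host, port) for name, port in targets.items()}
```

Calling `check_services()` returns a mapping such as `{"stt": True, ...}`, which a wrapper script could turn into an exit code for a monitor.
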
---

## Phase 3 — Agent & Character

### P4 · homeai-agent

- [x] Install OpenClaw (npm global, v2026.3.2)
- [x] Configure Ollama provider (native API, `http://localhost:11434`)
- [x] Write + load launchd plist (`com.homeai.openclaw`) — gateway on port 8080
- [x] Fix context window: set `contextWindow=32768` for llama3.3:70b in `openclaw.json`
- [x] Fix Llama 3.3 Modelfile: add tool-calling TEMPLATE block
- [x] Verify `openclaw agent --message "..." --agent main` → completed
- [x] Write `skills/home-assistant` SKILL.md — HA REST API control
- [x] Write `skills/voice-assistant` SKILL.md — voice response style guide
- [x] Wire HASS_TOKEN — create `~/.homeai/hass_token` or set env in launchd plist
- [x] Test home-assistant skill: "turn on/off the reading lamp"
- [ ] Set up mem0 with Chroma backend, test semantic recall
- [ ] Write memory backup launchd job
- [ ] Build morning briefing n8n workflow
- [ ] Build notification router n8n workflow
- [ ] Verify full voice → agent → HA action flow
- [ ] Add OpenClaw to Uptime Kuma monitors

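
The HASS_TOKEN item offers two sources for the token (a file at `~/.homeai/hass_token` or an environment variable), and the home-assistant skill drives the HA REST API. A rough sketch of both steps is shown here; it is not the actual skill or `ha-ctl` code. The function names are hypothetical, and the request is only built, not sent.

```python
import json
import os
import urllib.request
from pathlib import Path
from typing import Optional

# File location taken from the checklist; HASS_TOKEN is the env fallback it mentions.
DEFAULT_TOKEN_PATH = Path.home() / ".homeai" / "hass_token"


def load_token(path: Path = DEFAULT_TOKEN_PATH) -> Optional[str]:
    """Read the token file if present, else fall back to the HASS_TOKEN env var."""
    try:
        # The context manager closes the file even if reading fails.
        with open(path, encoding="utf-8") as fh:
            token = fh.read().strip()
        if token:
            return token
    except FileNotFoundError:
        pass
    return os.environ.get("HASS_TOKEN") or None


def build_service_call(
    base_url: str, token: str, domain: str, service: str, entity_id: str
) -> urllib.request.Request:
    """Build a POST to Home Assistant's /api/services/<domain>/<service> endpoint."""
    url = f"{base_url.rstrip('/')}/api/services/{domain}/{service}"
    body = json.dumps({"entity_id": entity_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

For the reading-lamp test, a call might look like `build_service_call("http://10.0.0.199:8123", token, "light", "turn_on", "light.reading_lamp")` followed by `urllib.request.urlopen(req, timeout=10)`. Port 8123 is Home Assistant's default and the entity ID is a guess; neither is confirmed by this file.
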
### P5 · homeai-character *(can start alongside P4)*

- [ ] Define and write `schema/character.schema.json` (v1)
- [ ] Write `characters/aria.json` — default character
- [ ] Set up Vite project in `src/`, install deps
- [ ] Integrate existing `character-manager.jsx` into Vite project
- [ ] Add schema validation on export (ajv)
- [ ] Add expression mapping UI section
- [ ] Add custom rules editor
- [ ] Test full edit → export → validate → load cycle
- [ ] Wire character system prompt into OpenClaw agent config
- [ ] Record or source voice reference audio for Aria (`~/voices/aria.wav`)
- [ ] Pre-process audio with ffmpeg, test with Chatterbox
- [ ] Update `aria.json` with voice clone path if quality is good

---

## Phase 4 — Hardware Satellites

### P6 · homeai-esp32

- [ ] Install ESPHome: `pip install esphome`
- [ ] Write `esphome/secrets.yaml` (gitignored)
- [ ] Write `base.yaml`, `voice.yaml`, `display.yaml`, `animations.yaml`
- [ ] Write `s3-box-living-room.yaml` for first unit
- [ ] Flash first unit via USB
- [ ] Verify unit appears in HA device list
- [ ] Assign Wyoming voice pipeline to unit in HA
- [ ] Test full wake → STT → LLM → TTS → audio playback cycle
- [ ] Test LVGL face: idle → listening → thinking → speaking → error
- [ ] Verify OTA firmware update works wirelessly
- [ ] Flash remaining units (bedroom, kitchen, etc.)
- [ ] Document MAC address → room name mapping

---

## Phase 5 — Visual Layer

### P7 · homeai-visual

- [ ] Install VTube Studio (Mac App Store)
- [ ] Enable WebSocket API on port 8001
- [ ] Source/purchase a Live2D model (nizima.com or booth.pm)
- [ ] Load model in VTube Studio
- [ ] Create hotkeys for all 8 expression states
- [ ] Write `skills/vtube_studio` SKILL.md + implementation
- [ ] Run auth flow — click Allow in VTube Studio, save token
- [ ] Test all 8 expressions via test script
- [ ] Update `aria.json` with real VTube Studio hotkey IDs
- [ ] Write `lipsync.py` amplitude-based helper
- [ ] Integrate lip sync into OpenClaw TTS dispatch
- [ ] Test full pipeline: voice → thinking expression → speaking with lip sync
- [ ] Set up VTube Studio mobile (iPhone/iPad) on Tailnet

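
One possible shape for the amplitude-based `lipsync.py` helper is sketched here. It assumes the TTS output arrives as chunks of signed 16-bit little-endian mono PCM, which this file does not state. The `gain` and `floor` parameters are illustrative tuning knobs rather than measured values, and sending the result to VTube Studio is left out.

```python
import math
import struct


def rms_amplitude(pcm: bytes) -> float:
    """RMS of signed 16-bit little-endian mono PCM, normalised to the range 0.0-1.0."""
    count = len(pcm) // 2
    if count == 0:
        return 0.0
    # Ignore a trailing odd byte, if any, so unpack sees whole samples only.
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    mean_square = sum(s * s for s in samples) / count
    return math.sqrt(mean_square) / 32768.0


def mouth_open(pcm: bytes, gain: float = 4.0, floor: float = 0.02) -> float:
    """Map one audio chunk to a mouth-open value in [0, 1].

    Levels below `floor` count as silence; louder chunks are scaled by `gain`
    and clamped at 1.0.
    """
    level = rms_amplitude(pcm)
    if level < floor:
        return 0.0
    return min(1.0, level * gain)
```

Feeding each chunk through `mouth_open` as it is played would give one value per chunk to drive the model's mouth. Smoothing between chunks is likely needed in practice and is omitted here.
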
---

## Phase 6 — Image Generation

### P8 · homeai-images

- [ ] Clone ComfyUI to `~/ComfyUI/`, install deps in venv
- [ ] Verify MPS is detected at launch
- [ ] Write and load launchd plist (`com.homeai.comfyui.plist`)
- [ ] Download SDXL base model
- [ ] Download Flux.1-schnell
- [ ] Download ControlNet models (canny, depth)
- [ ] Test generation via ComfyUI web UI (port 8188)
- [ ] Build and export `quick.json`, `portrait.json`, `scene.json`, `upscale.json` workflows
- [ ] Write `skills/comfyui` SKILL.md + implementation
- [ ] Test skill: "Generate a portrait of Aria looking happy"
- [ ] Collect character reference images for LoRA training
- [ ] Train SDXL LoRA with kohya_ss, verify character consistency
- [ ] Add ComfyUI to Uptime Kuma monitors

---

## Phase 7 — Extended Integrations & Polish

- [ ] Deploy Music Assistant (Docker), integrate with Home Assistant
- [ ] Write `skills/music` SKILL.md for OpenClaw
- [ ] Deploy Snapcast server on Mac Mini
- [ ] Configure Snapcast clients on ESP32 units for multi-room audio
- [ ] Configure Authelia as 2FA layer in front of web UIs
- [ ] Build advanced n8n workflows (calendar reminders, daily briefing v2)
- [ ] Create iOS Shortcuts to trigger OpenClaw from iPhone widget
- [ ] Configure ntfy/Pushover alerts in Uptime Kuma for all services
- [ ] Automate mem0 + character config backup to Gitea (daily)
- [ ] Train custom wake word using character's name
- [ ] Document all service URLs, ports, and credentials in a private Gitea wiki
- [ ] Tailscale ACL hardening — restrict which devices can reach which services
- [ ] Stress test: reboot Mac Mini, verify all services recover in <2 minutes

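
The reboot stress test amounts to polling every service until it responds or a two-minute deadline passes. A minimal sketch of that loop follows. The function name and the probe-callable interface are assumptions: each probe is any zero-argument function that returns `True` once its service is healthy, for example a TCP connect or an HTTP status check.

```python
import time
from typing import Callable, Dict


def wait_for_recovery(
    probes: Dict[str, Callable[[], bool]],
    deadline_s: float = 120.0,  # the "<2 minutes" target from the checklist
    interval_s: float = 5.0,    # polling interval; an arbitrary choice
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, float]:
    """Poll each probe until it passes or the deadline expires.

    Returns seconds-to-recover per service. Services that never passed are
    absent from the result, so a short result means the test failed.
    """
    start = clock()
    recovered: Dict[str, float] = {}
    while len(recovered) < len(probes):
        for name, probe in probes.items():
            if name not in recovered and probe():
                recovered[name] = clock() - start
        if len(recovered) == len(probes) or clock() - start >= deadline_s:
            break
        sleep(interval_s)
    return recovered
```

The `clock` and `sleep` parameters exist only so the loop can be exercised with a fake clock. In normal use the defaults apply, and the test passes when the returned mapping has one entry per probe.
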
---

## Open Decisions

- [ ] Confirm character name (determines wake word training)
- [ ] Live2D model: purchase off-the-shelf or commission custom?
- [ ] mem0 backend: Chroma (simple) vs Qdrant Docker (better semantic search)?
- [ ] Snapcast output: ESP32 built-in speakers or dedicated audio hardware per room?
- [ ] Authelia user store: local file vs LDAP?