Full project plan across 8 sub-projects (homeai-infra, homeai-llm, homeai-voice, homeai-agent, homeai-character, homeai-esp32, homeai-visual, homeai-images). Includes per-project PLAN.md files, top-level PROJECT_PLAN.md, and master TODO.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# HomeAI — Master TODO

Track progress across all sub-projects. See each sub-project's `PLAN.md` for detailed implementation notes.

Status: `[ ]` pending · `[~]` in progress · `[x]` done
## Phase 1 — Foundation

### P1 · homeai-infra

- [ ] Install Docker Desktop for Mac, enable launch at login
- [ ] Create shared `homeai` Docker network
- [ ] Create `~/server/docker/` directory structure
- [ ] Write compose files: Home Assistant, Portainer, Uptime Kuma, Gitea, code-server, n8n
- [ ] Write `.env.secrets.example` and `Makefile`
- [ ] `make up-all` — bring all services up
- [ ] Home Assistant onboarding — generate long-lived access token
- [ ] Write `~/server/.env.services` with all service URLs
- [ ] Install Tailscale, verify all services reachable on Tailnet
- [ ] Gitea: create admin account, initialise all 8 sub-project repos, configure SSH
- [ ] Uptime Kuma: add monitors for all services, configure mobile alerts
- [ ] Verify all containers survive a cold reboot
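As a reference for the compose files above, a minimal sketch of one service joining the shared network — the service chosen, image tag, and paths are illustrative, and the `homeai` network is assumed to be created once up front with `docker network create homeai`:

```yaml
# ~/server/docker/uptime-kuma/docker-compose.yml — illustrative sketch
services:
  uptime-kuma:
    image: louislam/uptime-kuma:1    # pin a major version
    restart: unless-stopped          # required for cold-reboot survival
    ports:
      - "3001:3001"
    volumes:
      - ./data:/app/data
    networks:
      - homeai

networks:
  homeai:
    external: true   # created separately: docker network create homeai
```

Marking the network `external` keeps every sub-project's compose file attachable to the same bridge without any one file owning it.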
### P2 · homeai-llm

- [ ] Install Ollama natively via `brew`
- [ ] Write and load launchd plist (`com.ollama.ollama.plist`)
- [ ] Write `ollama-models.txt` with model manifest
- [ ] Run `scripts/pull-models.sh` — pull all models
- [ ] Run `scripts/benchmark.sh` — record results in `benchmark-results.md`
- [ ] Deploy Open WebUI via Docker compose (port 3030)
- [ ] Verify Open WebUI connected to Ollama, all models available
- [ ] Add Ollama + Open WebUI to Uptime Kuma monitors
- [ ] Add `OLLAMA_URL` and `OPEN_WEBUI_URL` to `.env.services`
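One plausible shape for `com.ollama.ollama.plist`, assuming a Homebrew install at `/opt/homebrew/bin/ollama` (verify the path with `which ollama`); load it with `launchctl load ~/Library/LaunchAgents/com.ollama.ollama.plist`:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.ollama.ollama</string>
  <key>ProgramArguments</key>
  <array>
    <string>/opt/homebrew/bin/ollama</string>
    <string>serve</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>KeepAlive</key>
  <true/>
  <key>EnvironmentVariables</key>
  <dict>
    <key>OLLAMA_HOST</key>
    <string>0.0.0.0:11434</string>  <!-- expose on the LAN/Tailnet, not just loopback -->
  </dict>
</dict>
</plist>
```

`KeepAlive` restarts the daemon if it crashes, which also covers the cold-reboot requirement from P1.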
## Phase 2 — Voice Pipeline

### P3 · homeai-voice

- [ ] Compile Whisper.cpp with Metal support
- [ ] Download Whisper models (`large-v3`, `medium.en`) to `~/models/whisper/`
- [ ] Install `wyoming-faster-whisper`, test STT from audio file
- [ ] Install Kokoro TTS, test output to audio file
- [ ] Install Wyoming-Kokoro adapter, verify Wyoming protocol
- [ ] Write + load launchd plists for Wyoming STT (10300) and TTS (10301)
- [ ] Connect Home Assistant Wyoming integration (STT + TTS)
- [ ] Create HA Voice Assistant pipeline
- [ ] Test HA Assist via browser: type query → hear spoken response
- [ ] Install openWakeWord, test wake detection with USB mic
- [ ] Write + load openWakeWord launchd plist
- [ ] Install Chatterbox TTS (MPS build), test with sample `.wav`
- [ ] Install Qwen3-TTS via MLX (fallback)
- [ ] Write `wyoming/test-pipeline.sh` — end-to-end smoke test
- [ ] Add Wyoming STT/TTS to Uptime Kuma monitors
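Before the full end-to-end smoke test, a quick liveness probe can confirm the two Wyoming ports from the list above are actually listening. A minimal Python sketch (hostnames assumed to be loopback on the Mac Mini itself):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Ports match the launchd plists above; hosts are an assumption.
SERVICES = {
    "wyoming-stt": ("127.0.0.1", 10300),
    "wyoming-tts": ("127.0.0.1", 10301),
}

if __name__ == "__main__":
    for name, (host, port) in SERVICES.items():
        print(f"{name:12s} {host}:{port}  {'up' if port_open(host, port) else 'DOWN'}")
```

A port being open only proves the process is alive, not that transcription works — `test-pipeline.sh` still needs to push real audio through the Wyoming protocol.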
## Phase 3 — Agent & Character

### P5 · homeai-character (no runtime deps — can start alongside P1)

- [ ] Define and write `schema/character.schema.json` (v1)
- [ ] Write `characters/aria.json` — default character
- [ ] Set up Vite project in `src/`, install deps
- [ ] Integrate existing `character-manager.jsx` into Vite project
- [ ] Add schema validation on export (ajv)
- [ ] Add expression mapping UI section
- [ ] Add custom rules editor
- [ ] Test full edit → export → validate → load cycle
- [ ] Record or source voice reference audio for Aria (`~/voices/aria.wav`)
- [ ] Pre-process audio with ffmpeg, test with Chatterbox
- [ ] Update `aria.json` with voice clone path if quality is good
- [ ] Write `SchemaValidator.js` as standalone utility
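To make the export → validate step above concrete, here is the kind of check the validator performs, sketched in Python rather than JS for brevity. The field names are hypothetical until `character.schema.json` (v1) is finalised:

```python
# Hypothetical required fields — replace with the real v1 schema once defined.
REQUIRED_FIELDS = {"name", "persona", "voice", "expressions"}

def validate_character(doc: dict) -> list:
    """Return a list of problems; an empty list means the document passes."""
    errors = [f"missing required field: {f}"
              for f in sorted(REQUIRED_FIELDS - doc.keys())]
    expressions = doc.get("expressions")
    if expressions is not None and not isinstance(expressions, dict):
        errors.append("expressions must be an object mapping state -> hotkey id")
    return errors
```

In the actual Vite project this job belongs to ajv driven by the JSON Schema itself; the hand-rolled version just illustrates the failure modes the export UI should surface.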
### P4 · homeai-agent

- [ ] Confirm OpenClaw installation method and Ollama compatibility
- [ ] Install OpenClaw, write `~/.openclaw/config.yaml`
- [ ] Verify OpenClaw responds to basic text query via `/chat`
- [ ] Write `skills/home_assistant.py` — test lights on/off via voice
- [ ] Write `skills/memory.py` — test store and recall
- [ ] Write `skills/weather.py` — verify HA weather sensor data
- [ ] Write `skills/timer.py` — test set/fire a timer
- [ ] Write skill stubs: `music.py`, `vtube_studio.py`, `comfyui.py`
- [ ] Set up mem0 with Chroma backend, test semantic recall
- [ ] Write and load memory backup launchd job
- [ ] Symlink `homeai-agent/skills/` → `~/.openclaw/skills/`
- [ ] Build morning briefing n8n workflow
- [ ] Build notification router n8n workflow
- [ ] Verify full voice → agent → HA action flow
- [ ] Add OpenClaw to Uptime Kuma monitors
## Phase 4 — Hardware Satellites

### P6 · homeai-esp32

- [ ] Install ESPHome: `pip install esphome`
- [ ] Write `esphome/secrets.yaml` (gitignored)
- [ ] Write `base.yaml`, `voice.yaml`, `display.yaml`, `animations.yaml`
- [ ] Write `s3-box-living-room.yaml` for first unit
- [ ] Flash first unit via USB
- [ ] Verify unit appears in HA device list
- [ ] Assign Wyoming voice pipeline to unit in HA
- [ ] Test full wake → STT → LLM → TTS → audio playback cycle
- [ ] Test LVGL face: idle → listening → thinking → speaking → error
- [ ] Verify OTA firmware update works wirelessly
- [ ] Flash remaining units (bedroom, kitchen, etc.)
- [ ] Document MAC address → room name mapping
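A rough sketch of what the shared `base.yaml` might look like; board id, framework, and component options are assumptions and must be checked against the ESPHome docs for the ESP32-S3-Box before flashing:

```yaml
# base.yaml — illustrative sketch only; verify against ESPHome's
# ESP32-S3-Box documentation before use.
esphome:
  name: ${device_name}            # per-room files set these substitutions
  friendly_name: ${friendly_name}

esp32:
  board: esp32s3box
  framework:
    type: esp-idf

wifi:
  ssid: !secret wifi_ssid         # values live in the gitignored secrets.yaml
  password: !secret wifi_password

api:                              # native Home Assistant connection
  encryption:
    key: !secret api_key

ota:
  - platform: esphome
    password: !secret ota_password
```

Each per-unit file (`s3-box-living-room.yaml` etc.) would then only supply the substitutions and include `base.yaml` plus `voice.yaml`/`display.yaml`, which keeps the MAC → room mapping documentable in one place.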
## Phase 5 — Visual Layer

### P7 · homeai-visual

- [ ] Install VTube Studio (Mac App Store)
- [ ] Enable WebSocket API on port 8001
- [ ] Source/purchase a Live2D model (nizima.com or booth.pm)
- [ ] Load model in VTube Studio
- [ ] Create hotkeys for all 8 expression states
- [ ] Write `skills/vtube_studio.py` full implementation
- [ ] Run auth flow — click Allow in VTube Studio, save token
- [ ] Test all 8 expressions via test script
- [ ] Update `aria.json` with real VTube Studio hotkey IDs
- [ ] Write `lipsync.py` amplitude-based helper
- [ ] Integrate lip sync into OpenClaw TTS dispatch
- [ ] Symlink `skills/` → `~/.openclaw/skills/`
- [ ] Test full pipeline: voice → thinking expression → speaking with lip sync
- [ ] Set up VTube Studio mobile (iPhone/iPad) on Tailnet
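Every call in `skills/vtube_studio.py` uses the same message envelope from the VTube Studio public API. A minimal builder for that envelope (sending it still requires a websocket client and a prior successful authentication request, both omitted here):

```python
import json
import uuid

API_NAME = "VTubeStudioPublicAPI"   # fixed strings defined by the VTS API
API_VERSION = "1.0"

def vts_request(message_type, data=None, request_id=None):
    """Build one VTube Studio API message as a JSON string for the websocket."""
    return json.dumps({
        "apiName": API_NAME,
        "apiVersion": API_VERSION,
        "requestID": request_id or uuid.uuid4().hex,
        "messageType": message_type,
        "data": data or {},
    })

# e.g. trigger one of the 8 expression hotkeys (the real ID comes from aria.json):
msg = vts_request("HotkeyTriggerRequest", {"hotkeyID": "HOTKEY_ID_FROM_ARIA_JSON"})
```

Centralising the envelope means the expression test script is just a loop over the 8 hotkey IDs read out of `aria.json`.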
## Phase 6 — Image Generation

### P8 · homeai-images

- [ ] Clone ComfyUI to `~/ComfyUI/`, install deps in venv
- [ ] Verify MPS is detected at launch
- [ ] Write and load launchd plist (`com.homeai.comfyui.plist`)
- [ ] Download SDXL base model
- [ ] Download Flux.1-schnell
- [ ] Download ControlNet models (canny, depth)
- [ ] Test generation via ComfyUI web UI (port 8188)
- [ ] Build and export `quick.json` workflow
- [ ] Build and export `portrait.json` workflow
- [ ] Build and export `scene.json` workflow (ControlNet)
- [ ] Build and export `upscale.json` workflow
- [ ] Write `skills/comfyui.py` full implementation
- [ ] Test skill: `comfyui.quick("test prompt")` → image file returned
- [ ] Collect character reference images for LoRA training
- [ ] Train SDXL LoRA with kohya_ss
- [ ] Load LoRA into `portrait.json`, verify character consistency
- [ ] Symlink `skills/` → `~/.openclaw/skills/`
- [ ] Test via OpenClaw: "Generate a portrait of Aria looking happy"
- [ ] Add ComfyUI to Uptime Kuma monitors
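The core of `skills/comfyui.py` is queueing an exported workflow against ComfyUI's HTTP endpoint. A sketch, assuming `quick.json` was exported in ComfyUI's API format (injecting the prompt text into the right node is workflow-specific, so it is left out):

```python
import json
import urllib.request

COMFYUI_URL = "http://127.0.0.1:8188"   # port from the checklist above

def build_prompt_payload(workflow: dict, client_id: str) -> bytes:
    """ComfyUI's /prompt endpoint expects the API-format workflow graph
    wrapped as {"prompt": ..., "client_id": ...}."""
    return json.dumps({"prompt": workflow, "client_id": client_id}).encode()

def queue_prompt(workflow: dict, client_id: str = "homeai") -> dict:
    """Queue a workflow; the response includes a prompt_id to poll /history with."""
    req = urllib.request.Request(
        f"{COMFYUI_URL}/prompt",
        data=build_prompt_payload(workflow, client_id),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

`comfyui.quick("test prompt")` would then be: load `quick.json`, set the prompt text on its text-encode node, call `queue_prompt`, and poll `/history/<prompt_id>` until the output image is written.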
## Phase 7 — Extended Integrations & Polish

- [ ] Deploy Music Assistant (Docker), integrate with Home Assistant
- [ ] Complete `skills/music.py` in OpenClaw
- [ ] Deploy Snapcast server on Mac Mini
- [ ] Configure Snapcast clients on ESP32 units for multi-room audio
- [ ] Configure Authelia as 2FA layer in front of web UIs
- [ ] Build advanced n8n workflows (calendar reminders, daily briefing v2)
- [ ] Create iOS Shortcuts to trigger OpenClaw from iPhone widget
- [ ] Configure ntfy/Pushover alerts in Uptime Kuma for all services
- [ ] Automate mem0 + character config backup to Gitea (daily)
- [ ] Train custom wake word using character's name
- [ ] Document all service URLs, ports, and credentials in a private Gitea wiki
- [ ] Tailscale ACL hardening — restrict which devices can reach which services
- [ ] Stress test: reboot Mac Mini, verify all services recover in <2 minutes
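The reboot stress test pairs naturally with the `.env.services` file from P1: read every URL out of it and probe each one after boot. A sketch, assuming the file is plain `KEY=value` lines with optional `#` comments:

```python
import os
import urllib.request

def parse_env_services(text: str) -> dict:
    """Parse KEY=value lines, ignoring blanks and # comments."""
    services = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        services[key.strip()] = value.strip().strip('"')
    return services

def check_all(path: str = "~/server/.env.services") -> None:
    with open(os.path.expanduser(path)) as f:
        services = parse_env_services(f.read())
    for name, url in services.items():
        try:
            ok = urllib.request.urlopen(url, timeout=5).status == 200
        except OSError:
            ok = False
        print(f"{name:20s} {url:40s} {'OK' if ok else 'FAIL'}")
```

Running this in a loop with a timestamp gives a measured recovery time, rather than eyeballing whether "<2 minutes" actually holds.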
## Open Decisions
- Confirm character name (determines wake word training)
- Confirm OpenClaw version/fork and Ollama compatibility
- Live2D model: purchase off-the-shelf or commission custom?
- mem0 backend: Chroma (simple) vs Qdrant Docker (better semantic search)?
- Snapcast output: ESP32 built-in speakers or dedicated audio hardware per room?
- Authelia user store: local file vs LDAP?