Initial project structure and planning docs

Full project plan across 8 sub-projects (homeai-infra, homeai-llm,
homeai-voice, homeai-agent, homeai-character, homeai-esp32,
homeai-visual, homeai-images). Includes per-project PLAN.md files,
top-level PROJECT_PLAN.md, and master TODO.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author: Aodhan Collins
Date: 2026-03-04 01:11:37 +00:00
Commit: 38247d7cc4
11 changed files with 3060 additions and 0 deletions

TODO.md — new file (189 lines):
# HomeAI — Master TODO
> Track progress across all sub-projects. See each sub-project `PLAN.md` for detailed implementation notes.
> Status: `[ ]` pending · `[~]` in progress · `[x]` done
---
## Phase 1 — Foundation
### P1 · homeai-infra
- [ ] Install Docker Desktop for Mac, enable launch at login
- [ ] Create shared `homeai` Docker network
- [ ] Create `~/server/docker/` directory structure
- [ ] Write compose files: Home Assistant, Portainer, Uptime Kuma, Gitea, code-server, n8n
- [ ] Write `.env.secrets.example` and `Makefile`
- [ ] `make up-all` — bring all services up
- [ ] Home Assistant onboarding — generate long-lived access token
- [ ] Write `~/server/.env.services` with all service URLs
- [ ] Install Tailscale, verify all services reachable on Tailnet
- [ ] Gitea: create admin account, initialise all 8 sub-project repos, configure SSH
- [ ] Uptime Kuma: add monitors for all services, configure mobile alerts
- [ ] Verify all containers survive a cold reboot
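Once `.env.services` exists, a quick reachability check saves clicking through Portainer after `make up-all` or the cold-reboot test. A minimal sketch, assuming the file uses plain `KEY=URL` lines and that service variables end in `_URL` — both format guesses, not specified above:

```python
import urllib.error
import urllib.request
from pathlib import Path

def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def check(url: str, timeout: float = 5.0) -> bool:
    """True if the URL answers at all (any HTTP status counts as up)."""
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        return True  # server responded, even if with 4xx/5xx
    except OSError:
        return False

def report(env_path: str = "~/server/.env.services") -> None:
    """Print OK/DOWN for every *_URL entry in the services env file."""
    env = parse_env(Path(env_path).expanduser().read_text())
    for key, url in env.items():
        if key.endswith("_URL"):
            print(f"{'OK  ' if check(url) else 'DOWN'} {key} -> {url}")
```

Run it after bringing services up and again after the reboot-survival check.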
### P2 · homeai-llm
- [ ] Install Ollama natively via brew
- [ ] Write and load launchd plist (`com.ollama.ollama.plist`)
- [ ] Write `ollama-models.txt` with model manifest
- [ ] Run `scripts/pull-models.sh` — pull all models
- [ ] Run `scripts/benchmark.sh` — record results in `benchmark-results.md`
- [ ] Deploy Open WebUI via Docker compose (port 3030)
- [ ] Verify Open WebUI connected to Ollama, all models available
- [ ] Add Ollama + Open WebUI to Uptime Kuma monitors
- [ ] Add `OLLAMA_URL` and `OPEN_WEBUI_URL` to `.env.services`
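After `pull-models.sh` runs, it is worth diffing the manifest against what Ollama actually has. This sketch uses Ollama's real `/api/tags` endpoint on its default port; the one-model-name-per-line manifest format is an assumption about `ollama-models.txt`:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default API port

def parse_manifest(text: str) -> list[str]:
    """One model name per line; blanks and # comments ignored."""
    return [ln.strip() for ln in text.splitlines()
            if ln.strip() and not ln.strip().startswith("#")]

def installed_models(base_url: str = OLLAMA_URL) -> set[str]:
    """Ask Ollama's /api/tags endpoint which models have been pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        data = json.load(resp)
    return {m["name"] for m in data.get("models", [])}

def missing(manifest_text: str, have: set[str]) -> list[str]:
    """Manifest entries not present in the installed set."""
    return [m for m in parse_manifest(manifest_text) if m not in have]
```

The same check doubles as the "all models available" verification for Open WebUI, since it reads from the same Ollama instance.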
---
## Phase 2 — Voice Pipeline
### P3 · homeai-voice
- [ ] Compile Whisper.cpp with Metal support
- [ ] Download Whisper models (`large-v3`, `medium.en`) to `~/models/whisper/`
- [ ] Install `wyoming-faster-whisper`, test STT from audio file
- [ ] Install Kokoro TTS, test output to audio file
- [ ] Install Wyoming-Kokoro adapter, verify Wyoming protocol
- [ ] Write + load launchd plists for Wyoming STT (10300) and TTS (10301)
- [ ] Connect Home Assistant Wyoming integration (STT + TTS)
- [ ] Create HA Voice Assistant pipeline
- [ ] Test HA Assist via browser: type query → hear spoken response
- [ ] Install openWakeWord, test wake detection with USB mic
- [ ] Write + load openWakeWord launchd plist
- [ ] Install Chatterbox TTS (MPS build), test with sample `.wav`
- [ ] Install Qwen3-TTS via MLX (fallback)
- [ ] Write `wyoming/test-pipeline.sh` — end-to-end smoke test
- [ ] Add Wyoming STT/TTS to Uptime Kuma monitors
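Several items above ("test STT from audio file", the `test-pipeline.sh` smoke test) need a known-good input file. A minimal sketch that writes a sine-tone WAV in the 16 kHz / 16-bit / mono format Whisper expects — the tone is only for plumbing checks, a real speech sample is still needed to judge transcription quality:

```python
import math
import struct
import wave

def write_tone(path: str, freq: float = 440.0, seconds: float = 1.0,
               rate: int = 16000) -> int:
    """Write a half-amplitude sine tone; returns frames written."""
    n = int(rate * seconds)
    frames = b"".join(
        struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / rate)))
        for i in range(n)
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(frames)
    return n
```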
---
## Phase 3 — Agent & Character
### P5 · homeai-character *(no runtime deps — can start alongside P1)*
- [ ] Define and write `schema/character.schema.json` (v1)
- [ ] Write `characters/aria.json` — default character
- [ ] Set up Vite project in `src/`, install deps
- [ ] Integrate existing `character-manager.jsx` into Vite project
- [ ] Add schema validation on export (ajv)
- [ ] Add expression mapping UI section
- [ ] Add custom rules editor
- [ ] Test full edit → export → validate → load cycle
- [ ] Record or source voice reference audio for Aria (`~/voices/aria.wav`)
- [ ] Pre-process audio with ffmpeg, test with Chatterbox
- [ ] Update `aria.json` with voice clone path if quality is good
- [ ] Write `SchemaValidator.js` as standalone utility
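The actual validator is `SchemaValidator.js` running ajv against `character.schema.json` in the browser; as an illustration of the idea, here is the same required-keys-plus-types check sketched in Python. The field names are hypothetical — the real schema is defined in this sub-project:

```python
# Hypothetical slice of the character schema; the real one lives in
# schema/character.schema.json and is enforced by ajv in the browser.
REQUIRED = {
    "name": str,
    "voice": dict,
    "expressions": dict,
}

def validate_character(doc: dict) -> list[str]:
    """Return a list of error strings; an empty list means the doc passes."""
    errors = []
    for key, typ in REQUIRED.items():
        if key not in doc:
            errors.append(f"missing required field: {key}")
        elif not isinstance(doc[key], typ):
            errors.append(
                f"{key}: expected {typ.__name__}, got {type(doc[key]).__name__}"
            )
    return errors
```

The edit → export → validate → load test above should exercise exactly this path: a valid `aria.json` returns no errors, and a deliberately broken export is rejected before load.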
### P4 · homeai-agent
- [ ] Confirm OpenClaw installation method and Ollama compatibility
- [ ] Install OpenClaw, write `~/.openclaw/config.yaml`
- [ ] Verify OpenClaw responds to basic text query via `/chat`
- [ ] Write `skills/home_assistant.py` — test lights on/off via voice
- [ ] Write `skills/memory.py` — test store and recall
- [ ] Write `skills/weather.py` — verify HA weather sensor data
- [ ] Write `skills/timer.py` — test set/fire a timer
- [ ] Write skill stubs: `music.py`, `vtube_studio.py`, `comfyui.py`
- [ ] Set up mem0 with Chroma backend, test semantic recall
- [ ] Write and load memory backup launchd job
- [ ] Symlink `homeai-agent/skills/` → `~/.openclaw/skills/`
- [ ] Build morning briefing n8n workflow
- [ ] Build notification router n8n workflow
- [ ] Verify full voice → agent → HA action flow
- [ ] Add OpenClaw to Uptime Kuma monitors
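For the `skills/home_assistant.py` item, the lights test boils down to one call against Home Assistant's REST service API with the long-lived token from P1. A sketch of the request plumbing — the skill's actual interface depends on how OpenClaw loads skills, and the entity IDs are placeholders:

```python
import json
import os
import urllib.request

HA_URL = os.environ.get("HA_URL", "http://localhost:8123")  # placeholder default

def build_service_call(domain: str, service: str, entity_id: str,
                       token: str, base_url: str = HA_URL) -> urllib.request.Request:
    """Build HA's POST /api/services/<domain>/<service> request."""
    return urllib.request.Request(
        f"{base_url}/api/services/{domain}/{service}",
        data=json.dumps({"entity_id": entity_id}).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )

def turn_on_light(entity_id: str, token: str):
    """Fire the call; HA returns the list of changed states as JSON."""
    req = build_service_call("light", "turn_on", entity_id, token)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

`turn_off` is the same call with `light/turn_off`, and the weather skill can read `GET /api/states/<entity_id>` with the same bearer header.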
---
## Phase 4 — Hardware Satellites
### P6 · homeai-esp32
- [ ] Install ESPHome: `pip install esphome`
- [ ] Write `esphome/secrets.yaml` (gitignored)
- [ ] Write `base.yaml`, `voice.yaml`, `display.yaml`, `animations.yaml`
- [ ] Write `s3-box-living-room.yaml` for first unit
- [ ] Flash first unit via USB
- [ ] Verify unit appears in HA device list
- [ ] Assign Wyoming voice pipeline to unit in HA
- [ ] Test full wake → STT → LLM → TTS → audio playback cycle
- [ ] Test LVGL face: idle → listening → thinking → speaking → error
- [ ] Verify OTA firmware update works wirelessly
- [ ] Flash remaining units (bedroom, kitchen, etc.)
- [ ] Document MAC address → room name mapping
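For the last item, a tiny helper that renders the MAC address → room mapping as a markdown table for the repo docs. The MACs and rooms shown in the test are placeholders:

```python
def mac_table(mapping: dict[str, str]) -> str:
    """Render a MAC -> room dict as a sorted markdown table."""
    lines = ["| MAC address | Room |", "| --- | --- |"]
    lines += [f"| `{mac}` | {room} |" for mac, room in sorted(mapping.items())]
    return "\n".join(lines)
```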
---
## Phase 5 — Visual Layer
### P7 · homeai-visual
- [ ] Install VTube Studio (Mac App Store)
- [ ] Enable WebSocket API on port 8001
- [ ] Source/purchase a Live2D model (nizima.com or booth.pm)
- [ ] Load model in VTube Studio
- [ ] Create hotkeys for all 8 expression states
- [ ] Write `skills/vtube_studio.py` full implementation
- [ ] Run auth flow — click Allow in VTube Studio, save token
- [ ] Test all 8 expressions via test script
- [ ] Update `aria.json` with real VTube Studio hotkey IDs
- [ ] Write `lipsync.py` amplitude-based helper
- [ ] Integrate lip sync into OpenClaw TTS dispatch
- [ ] Symlink `skills/` → `~/.openclaw/skills/`
- [ ] Test full pipeline: voice → thinking expression → speaking with lip sync
- [ ] Set up VTube Studio mobile (iPhone/iPad) on Tailnet
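The amplitude-based approach in `lipsync.py` reduces to: take the RMS of each chunk of TTS audio and map it into VTube Studio's 0..1 `MouthOpen` parameter range. A sketch assuming 16-bit little-endian PCM frames; the gain constant is a tuning guess:

```python
import math
import struct

def rms(frame: bytes) -> float:
    """RMS of little-endian 16-bit samples, normalised to 0..1."""
    n = len(frame) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", frame[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n) / 32768.0

def mouth_open(frame: bytes, gain: float = 4.0) -> float:
    """Scale RMS into VTube Studio's 0..1 MouthOpen range, clamped.

    gain=4.0 is a starting point so normal speech levels reach a
    visibly open mouth; tune it against the actual TTS output.
    """
    return min(1.0, rms(frame) * gain)
```

Feeding one value per 20–50 ms chunk, with a little smoothing between frames, is usually enough to avoid visible mouth flicker.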
---
## Phase 6 — Image Generation
### P8 · homeai-images
- [ ] Clone ComfyUI to `~/ComfyUI/`, install deps in venv
- [ ] Verify MPS is detected at launch
- [ ] Write and load launchd plist (`com.homeai.comfyui.plist`)
- [ ] Download SDXL base model
- [ ] Download Flux.1-schnell
- [ ] Download ControlNet models (canny, depth)
- [ ] Test generation via ComfyUI web UI (port 8188)
- [ ] Build and export `quick.json` workflow
- [ ] Build and export `portrait.json` workflow
- [ ] Build and export `scene.json` workflow (ControlNet)
- [ ] Build and export `upscale.json` workflow
- [ ] Write `skills/comfyui.py` full implementation
- [ ] Test skill: `comfyui.quick("test prompt")` → image file returned
- [ ] Collect character reference images for LoRA training
- [ ] Train SDXL LoRA with kohya_ss
- [ ] Load LoRA into `portrait.json`, verify character consistency
- [ ] Symlink `skills/` → `~/.openclaw/skills/`
- [ ] Test via OpenClaw: "Generate a portrait of Aria looking happy"
- [ ] Add ComfyUI to Uptime Kuma monitors
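The `comfyui.quick()` test above amounts to: load the exported `quick.json` (ComfyUI's API-format export, a dict of node-id → `class_type`/`inputs`), patch the prompt text, and POST it to ComfyUI's `/prompt` endpoint. A sketch — which node holds the positive prompt is an assumption; here it is the first `CLIPTextEncode` found:

```python
import json
import urllib.request
import uuid

COMFYUI_URL = "http://localhost:8188"  # port from the checklist above

def patch_prompt(workflow: dict, text: str) -> dict:
    """Set the prompt text on the first CLIPTextEncode node."""
    for node in workflow.values():
        if node.get("class_type") == "CLIPTextEncode":
            node["inputs"]["text"] = text
            return workflow
    raise ValueError("no CLIPTextEncode node in workflow")

def queue(workflow: dict, base_url: str = COMFYUI_URL) -> dict:
    """POST the workflow; ComfyUI replies with a prompt_id to poll for output."""
    body = json.dumps({"prompt": workflow, "client_id": uuid.uuid4().hex}).encode()
    req = urllib.request.Request(f"{base_url}/prompt", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

A real `quick()` would also poll `/history/<prompt_id>` and fetch the image via `/view`; pinning the positive-prompt node by its ID in the exported JSON is more robust than searching by class.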
---
## Phase 7 — Extended Integrations & Polish
- [ ] Deploy Music Assistant (Docker), integrate with Home Assistant
- [ ] Complete `skills/music.py` in OpenClaw
- [ ] Deploy Snapcast server on Mac Mini
- [ ] Configure Snapcast clients on ESP32 units for multi-room audio
- [ ] Configure Authelia as 2FA layer in front of web UIs
- [ ] Build advanced n8n workflows (calendar reminders, daily briefing v2)
- [ ] Create iOS Shortcuts to trigger OpenClaw from iPhone widget
- [ ] Configure ntfy/Pushover alerts in Uptime Kuma for all services
- [ ] Automate mem0 + character config backup to Gitea (daily)
- [ ] Train custom wake word using character's name
- [ ] Document all service URLs, ports, and credentials in a private Gitea wiki
- [ ] Tailscale ACL hardening — restrict which devices can reach which services
- [ ] Stress test: reboot Mac Mini, verify all services recover in <2 minutes
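For the ntfy item: Uptime Kuma has a native ntfy notifier, but the manual equivalent is one POST with the message as the raw body and metadata in headers, which is handy for smoke-testing the topic before wiring up monitors. The topic name is a placeholder, and a self-hosted ntfy URL can be swapped in for `ntfy.sh`:

```python
import urllib.request

def build_alert(topic: str, message: str, title: str = "HomeAI",
                server: str = "https://ntfy.sh") -> urllib.request.Request:
    """ntfy takes the message as the raw POST body; metadata goes in headers."""
    return urllib.request.Request(
        f"{server}/{topic}",
        data=message.encode(),
        headers={"Title": title, "Priority": "high"},
        method="POST",
    )

def send_alert(topic: str, message: str) -> None:
    """Fire-and-forget publish to the given ntfy topic."""
    urllib.request.urlopen(build_alert(topic, message)).read()
```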
---
## Open Decisions
- [ ] Confirm character name (determines wake word training)
- [ ] Confirm OpenClaw version/fork and Ollama compatibility
- [ ] Live2D model: purchase off-the-shelf or commission custom?
- [ ] mem0 backend: Chroma (simple) vs Qdrant Docker (better semantic search)?
- [ ] Snapcast output: ESP32 built-in speakers or dedicated audio hardware per room?
- [ ] Authelia user store: local file vs LDAP?