# HomeAI — Next Steps Plan

> Created: 2026-03-07 | Priority: Voice Loop → Foundation Hardening → Character System

---

## Current State Summary

| Sub-Project | Status | Done / Total |
|---|---|---|
| P1 homeai-infra | Core done, tail items remain | 6 / 9 |
| P2 homeai-llm | Core done, tail items remain | 6 / 8 |
| P3 homeai-voice | STT + TTS + wake word running, HA integration pending | 7 / 13 |
| P4 homeai-agent | OpenClaw + HA skill working, mem0 + n8n pending | 10 / 16 |
| P5 homeai-character | Not started | 0 / 11 |
| P6–P8 | Not started | 0 / * |

**Key milestone reached:** OpenClaw can receive text, call `qwen2.5:7b` via Ollama, execute tool calls, and control Home Assistant entities. The voice pipeline components (STT, TTS, wake word) are all running as launchd services.

**Critical gap:** The voice pipeline is not yet connected through Home Assistant to the agent. The pieces exist, but the end-to-end flow is untested.

---

## Sprint 1 — Complete the Voice → Agent → HA Loop

**Goal:** Speak a command → hear a spoken response + see the HA action execute. This is the highest-value work because it closes the core loop that every future feature builds on.

### Tasks

#### 1A. Finish HA Wyoming Integration (P3)

The Wyoming STT (port 10300) and TTS (port 10301) services are running. They need to be registered in Home Assistant.

- [ ] Open HA UI → Settings → Integrations → Add Integration → Wyoming Protocol
- [ ] Add STT provider: host `10.0.0.199` (or `localhost` if HA is on the same machine), port `10300`
- [ ] Add TTS provider: host `10.0.0.199`, port `10301`
- [ ] Verify both appear as STT/TTS providers in HA

#### 1B. Create HA Voice Assistant Pipeline (P3)

- [ ] HA → Settings → Voice Assistants → Add Assistant
- [ ] Configure: STT = Wyoming Whisper, TTS = Wyoming Kokoro, Conversation Agent = Home Assistant default (or OpenClaw if wired)
- [ ] Set as the default voice assistant pipeline
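Before clicking through the HA UI in 1A, it can help to confirm the Wyoming services actually answer protocol traffic rather than merely holding their ports open. A minimal probe sketch, assuming the services speak standard Wyoming JSON-lines over TCP (this hand-rolled client is an illustration only; the official `wyoming` Python package is the robust option):

```python
import json
import socket

def wyoming_describe(host: str, port: int, timeout: float = 5.0) -> dict:
    """Send a Wyoming 'describe' event and return the first reply event.

    The Wyoming protocol exchanges newline-delimited JSON events over TCP;
    'describe' asks a service to identify itself. This is a hypothetical
    minimal client, not the official library.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b'{"type": "describe"}\n')
        reply = sock.makefile("r")
        return json.loads(reply.readline())

if __name__ == "__main__":
    # Ports from the checklist above: 10300 = STT, 10301 = TTS.
    for name, port in [("STT", 10300), ("TTS", 10301)]:
        try:
            info = wyoming_describe("127.0.0.1", port)
            print(f"{name} on {port}: replied with event type {info.get('type')!r}")
        except OSError as exc:
            print(f"{name} on {port}: unreachable ({exc})")
```

If either service is unreachable here, the HA integration step will fail too, so this narrows the failure to the launchd service rather than HA configuration.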
#### 1C. Test HA Assist via Browser (P3)

- [ ] Open HA dashboard → Assist panel
- [ ] Type a query (e.g. "What time is it?") → verify the spoken response plays back
- [ ] Type a device command (e.g. "Turn on the reading lamp") → verify HA executes it

#### 1D. Set Up mem0 with Chroma Backend (P4)

- [ ] Install mem0: `pip install mem0ai`
- [ ] Install chromadb: `pip install chromadb`
- [ ] Pull the embedding model: `ollama pull nomic-embed-text`
- [ ] Write a mem0 config pointing at Ollama for LLM + embeddings, Chroma for the vector store
- [ ] Test: store a memory, recall it via semantic search
- [ ] Verify mem0 data persists at `~/.openclaw/memory/chroma/`

#### 1E. Write Memory Backup launchd Job (P4)

- [ ] Create a git repo at `~/.openclaw/memory/` (or a subdirectory)
- [ ] Write the backup script: `git add . && git commit -m "mem0 backup $(date)" && git push`
- [ ] Write the launchd plist: `com.homeai.mem0-backup.plist` — daily schedule
- [ ] Load the plist, verify it runs

#### 1F. Build Morning Briefing n8n Workflow (P4)

- [ ] Verify n8n is running (Docker, deployed in P1)
- [ ] Create workflow: time trigger → fetch weather from HA → compose briefing text → POST to OpenClaw `/speak` endpoint
- [ ] Export the workflow JSON to `homeai-agent/workflows/morning-briefing.json`
- [ ] Test: trigger manually → hear the spoken briefing

#### 1G. Build Notification Router n8n Workflow (P4)

- [ ] Create workflow: HA webhook trigger → classify urgency → high: TTS immediately, low: queue
- [ ] Export to `homeai-agent/workflows/notification-router.json`
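For the launchd job in 1E, a plist sketch along these lines should work. The label matches the checklist; the script path, 03:00 schedule, and log paths are assumptions to adjust for this machine:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.homeai.mem0-backup</string>
  <key>ProgramArguments</key>
  <array>
    <string>/bin/bash</string>
    <!-- Assumed location for the backup script from the checklist -->
    <string>/Users/homeai/.openclaw/memory/backup.sh</string>
  </array>
  <!-- Daily at 03:00 -->
  <key>StartCalendarInterval</key>
  <dict>
    <key>Hour</key>
    <integer>3</integer>
    <key>Minute</key>
    <integer>0</integer>
  </dict>
  <key>StandardOutPath</key>
  <string>/tmp/mem0-backup.log</string>
  <key>StandardErrorPath</key>
  <string>/tmp/mem0-backup.err</string>
</dict>
</plist>
```

Load it with `launchctl load ~/Library/LaunchAgents/com.homeai.mem0-backup.plist`, then check the log files to verify the first run.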
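The mem0 config called for in 1D can be sketched as a Python dict. The provider and key names below follow mem0's documented config shape at the time of writing, but the format has changed between releases, so verify against the installed version; the model names, path, and collection name come from this plan:

```python
# Sketch of a mem0 config: Ollama serves both the LLM and the embedder,
# Chroma persists vectors on disk under ~/.openclaw/memory/chroma/.
# Key names are assumptions based on mem0's docs — check your version.
MEM0_CONFIG = {
    "llm": {
        "provider": "ollama",
        "config": {
            "model": "qwen2.5:7b",
            "ollama_base_url": "http://localhost:11434",
        },
    },
    "embedder": {
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text",
            "ollama_base_url": "http://localhost:11434",
        },
    },
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "homeai",
            "path": "~/.openclaw/memory/chroma",
        },
    },
}

# Usage (requires `pip install mem0ai chromadb`):
#   from mem0 import Memory
#   m = Memory.from_config(MEM0_CONFIG)
#   m.add("Prefers the reading lamp at 40% in the evening", user_id="home")
#   print(m.search("reading lamp", user_id="home"))
```

The store-then-search round trip in the usage comment covers the "store a memory, recall it via semantic search" checklist item.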
#### 1H. Verify Full Voice → Agent → HA Action Flow (P3 + P4)

- [ ] Trigger the wake word ("hey jarvis") via the USB mic
- [ ] Speak a command: "Turn on the reading lamp"
- [ ] Verify the chain: wake word detected → audio captured → STT transcribes → OpenClaw receives text → tool call to HA → lamp turns on → TTS response plays back
- [ ] Document any latency issues or failure points

### Sprint 1 Flow Diagram

```mermaid
flowchart LR
    A[USB Mic] -->|wake word| B[openWakeWord]
    B -->|audio stream| C[Wyoming STT - Whisper]
    C -->|transcribed text| D[Home Assistant Pipeline]
    D -->|text| E[OpenClaw Agent]
    E -->|tool call| F[HA REST API]
    F -->|action| G[Smart Device]
    E -->|response text| H[Wyoming TTS - Kokoro]
    H -->|audio| I[Speaker]
```

---

## Sprint 2 — Foundation Hardening

**Goal:** All services survive a reboot, are monitored, and are remotely accessible.

### Tasks

#### 2A. Install and Configure Tailscale (P1)

- [ ] Install Tailscale on the Mac Mini: `brew install tailscale`
- [ ] Authenticate and join the Tailnet
- [ ] Verify all services are reachable via the Tailscale IP (HA, Open WebUI, Portainer, Gitea, n8n, code-server)
- [ ] Document the Tailscale IP → service URL mapping

#### 2B. Configure Uptime Kuma Monitors (P1 + P2)

- [ ] Add monitors for: Home Assistant, Portainer, Gitea, code-server, n8n
- [ ] Add monitors for: Ollama API (port 11434), Open WebUI (port 3030)
- [ ] Add monitors for: Wyoming STT (port 10300), Wyoming TTS (port 10301)
- [ ] Add a monitor for: OpenClaw (port 8080)
- [ ] Configure mobile push alerts (ntfy or Pushover)

#### 2C. Cold Reboot Verification (P1)

- [ ] Reboot the Mac Mini
- [ ] Verify all Docker containers come back up (restart policy: `unless-stopped`)
- [ ] Verify launchd services start: Ollama, Wyoming STT, Wyoming TTS, openWakeWord, OpenClaw
- [ ] Check Uptime Kuma — all monitors green within 2 minutes
- [ ] Document any services that failed to restart, and fix them
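The reboot check in 2C can be semi-automated with a small port probe run once the machine is back up, before waiting on Uptime Kuma. The ports below come from the monitoring checklist; Home Assistant's `8123` is the stock default and an assumption for this deployment, and the remaining containers (Portainer, Gitea, n8n, code-server) can be added once their mapped ports are confirmed:

```python
import socket

# Service → port map from the Sprint 2 monitoring checklist.
# HA's 8123 is the common default and an assumption for this setup.
SERVICES = {
    "Home Assistant": 8123,
    "Ollama API": 11434,
    "Open WebUI": 3030,
    "Wyoming STT": 10300,
    "Wyoming TTS": 10301,
    "OpenClaw": 8080,
}

def check_port(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for name, port in SERVICES.items():
        status = "up" if check_port("127.0.0.1", port) else "DOWN"
        print(f"{name:16s} :{port:<6d} {status}")
```

Any `DOWN` line after a cold boot points directly at the launchd plist or Docker restart policy to fix, which feeds the last checklist item above.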
#### 2D. Run LLM Benchmarks (P2)

- [ ] Run `homeai-llm/scripts/benchmark.sh`
- [ ] Record results: tokens/sec for each model (qwen2.5:7b, llama3.3:70b, etc.)
- [ ] Write the results to `homeai-llm/benchmark-results.md`

---

## Sprint 3 — Character System (P5)

**Goal:** Character schema defined, default character created, Character Manager UI functional.

### Tasks

#### 3A. Define Character Schema (P5)

- [ ] Write `homeai-character/schema/character.schema.json` (v1) — based on the spec in PLAN.md
- [ ] Write `homeai-character/schema/README.md` documenting each field

#### 3B. Create Default Character (P5)

- [ ] Write `homeai-character/characters/aria.json` with placeholder expression IDs
- [ ] Validate aria.json against the schema (manually or via script)

#### 3C. Set Up Vite Project (P5)

- [ ] Initialize a Vite + React project in `homeai-character/`
- [ ] Install deps: `npm install react react-dom ajv`
- [ ] Move the existing `character-manager.jsx` into `src/`
- [ ] Verify the dev server runs at `http://localhost:5173`

#### 3D. Wire Character Manager Features (P5)

- [ ] Integrate schema validation on export (ajv)
- [ ] Add an expression mapping UI section
- [ ] Add a custom rules editor
- [ ] Test the full edit → export → validate → load cycle
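A possible starting point for `character.schema.json` from 3A. The real spec lives in PLAN.md, which this plan only references, so the field set here is illustrative — but it covers the pieces the Sprint 3 tasks name: a system prompt, placeholder expression IDs, and custom rules, in a draft-07 shape that ajv validates out of the box:

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "HomeAI Character (illustrative v1 sketch)",
  "type": "object",
  "required": ["name", "system_prompt"],
  "properties": {
    "name": { "type": "string" },
    "system_prompt": { "type": "string" },
    "voice": {
      "type": "string",
      "description": "TTS voice ID (e.g. a Kokoro voice)"
    },
    "expressions": {
      "type": "object",
      "description": "Emotion name → VTube Studio expression ID (placeholders until Phase 5)",
      "additionalProperties": { "type": "string" }
    },
    "rules": {
      "type": "array",
      "items": { "type": "string" },
      "description": "Custom behavior rules appended to the system prompt"
    }
  }
}
```

Whatever the final field list, keeping `required` minimal makes the 3B placeholder character easy to validate before the expression IDs are real.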
#### 3E. Wire Character into OpenClaw (P4 + P5)

- [ ] Copy/symlink `aria.json` to `~/.openclaw/characters/aria.json`
- [ ] Configure OpenClaw to load its system prompt from the character JSON
- [ ] Verify OpenClaw uses Aria's system prompt in responses

---

## Open Decisions to Resolve During These Sprints

| Decision | Options | Recommendation |
|---|---|---|
| Character name / wake word | "Aria" vs custom | Decide during Sprint 3 — affects wake word training later |
| mem0 backend | Chroma vs Qdrant | Start with Chroma (Sprint 1D) — migrate if recall quality is poor |
| HA conversation agent | Default HA vs OpenClaw | Test with the HA default first, then wire OpenClaw in as a custom conversation agent |

---

## What This Unlocks

After these 3 sprints, the system will have:

- **End-to-end voice control**: speak → understand → act → respond
- **Persistent memory**: the assistant remembers across sessions
- **Automated workflows**: morning briefings, notification routing
- **Monitoring**: all services tracked, alerts on failure
- **Remote access**: everything reachable via Tailscale
- **Character identity**: Aria persona loaded into the agent pipeline
- **Reboot resilience**: everything survives a cold restart

This positions the project to move into **Phase 4 (ESP32 hardware)** and **Phase 5 (VTube Studio visual layer)** with confidence that the core pipeline is solid.
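Returning to task 3E: a minimal sketch of how OpenClaw could derive its system prompt from `aria.json`. The field names (`system_prompt`, `rules`) are assumptions about the character schema, and OpenClaw's actual configuration hook may differ:

```python
import json
from pathlib import Path

def load_system_prompt(path: str) -> str:
    """Assemble a system prompt from a character JSON file.

    Assumes the file has a 'system_prompt' string and an optional
    'rules' list of strings — hypothetical schema fields, to be
    aligned with character.schema.json once it is finalized.
    """
    char = json.loads(Path(path).expanduser().read_text(encoding="utf-8"))
    prompt = char["system_prompt"]
    rules = char.get("rules", [])
    if rules:
        prompt += "\n\nAlways follow these rules:\n"
        prompt += "\n".join(f"- {rule}" for rule in rules)
    return prompt

# Example: load_system_prompt("~/.openclaw/characters/aria.json")
```

Folding the rules into the prompt at load time keeps the character file as the single source of truth, so edits in the Character Manager take effect on the next OpenClaw restart without code changes.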