# HomeAI — Next Steps Plan
> Created: 2026-03-07 | Priority: Voice Loop → Foundation Hardening → Character System

---
## Current State Summary

| Sub-Project | Status | Done / Total |
|---|---|---|
| P1 homeai-infra | Core done, tail items remain | 6 / 9 |
| P2 homeai-llm | Core done, tail items remain | 6 / 8 |
| P3 homeai-voice | STT + TTS + wake word running, HA integration pending | 7 / 13 |
| P4 homeai-agent | OpenClaw + HA skill working, mem0 + n8n pending | 10 / 16 |
| P5 homeai-character | Not started | 0 / 11 |
| P6–P8 | Not started | 0 / * |
**Key milestone reached:** OpenClaw can receive text, call `qwen2.5:7b` via Ollama, execute tool calls, and control Home Assistant entities. The voice pipeline components (STT, TTS, wake word) are all running as launchd services.

**Critical gap:** The voice pipeline is not yet connected through Home Assistant to the agent. The pieces exist but the end-to-end flow is untested.

---
## Sprint 1 — Complete the Voice → Agent → HA Loop

**Goal:** Speak a command → hear a spoken response + see the HA action execute.

This is the highest-value work because it closes the core loop that every future feature builds on.

### Tasks
#### 1A. Finish HA Wyoming Integration (P3)

The Wyoming STT (port 10300) and TTS (port 10301) services are running. They need to be registered in Home Assistant.

- [ ] Open HA UI → Settings → Integrations → Add Integration → Wyoming Protocol
- [ ] Add STT provider: host `10.0.0.199` (or `localhost` if HA is on the same machine), port `10300`
- [ ] Add TTS provider: host `10.0.0.199`, port `10301`
- [ ] Verify both appear as STT/TTS providers in HA
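
Before adding the integrations in the HA UI, it can save a debugging round-trip to confirm the ports actually accept TCP connections. A minimal sketch (host and ports from the checklist above; adjust the host if HA runs elsewhere):

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Wyoming ports from the checklist; adjust host for your setup.
    for name, port in [("Wyoming STT", 10300), ("Wyoming TTS", 10301)]:
        state = "open" if port_open("10.0.0.199", port) else "CLOSED"
        print(f"{name} (port {port}): {state}")
```

If a port reports CLOSED, check the corresponding launchd service before blaming the HA integration.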
#### 1B. Create HA Voice Assistant Pipeline (P3)

- [ ] HA → Settings → Voice Assistants → Add Assistant
- [ ] Configure: STT = Wyoming Whisper, TTS = Wyoming Kokoro, Conversation Agent = Home Assistant default (or OpenClaw if wired)
- [ ] Set as default voice assistant pipeline
#### 1C. Test HA Assist via Browser (P3)

- [ ] Open HA dashboard → Assist panel
- [ ] Type a query (e.g. "What time is it?") → verify spoken response plays back
- [ ] Type a device command (e.g. "Turn on the reading lamp") → verify HA executes it
#### 1D. Set Up mem0 with Chroma Backend (P4)

- [ ] Install mem0: `pip install mem0ai`
- [ ] Install chromadb: `pip install chromadb`
- [ ] Pull embedding model: `ollama pull nomic-embed-text`
- [ ] Write mem0 config pointing at Ollama for LLM + embeddings, Chroma for vector store
- [ ] Test: store a memory, recall it via semantic search
- [ ] Verify mem0 data persists at `~/.openclaw/memory/chroma/`
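
A sketch of what that config could look like. The provider/field names follow mem0's documented config format but may differ across versions, so treat this as a starting point, not a verified recipe:

```python
# Sketch of a mem0 config wired to local services. Provider and field
# names follow mem0's documented config shape but may vary by version.
MEM0_CONFIG = {
    "llm": {
        "provider": "ollama",
        "config": {"model": "qwen2.5:7b"},
    },
    "embedder": {
        "provider": "ollama",
        "config": {"model": "nomic-embed-text"},
    },
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "homeai",
            # Persistence dir from the checklist above.
            "path": "~/.openclaw/memory/chroma",
        },
    },
}

# Usage (after `pip install mem0ai chromadb`):
# from mem0 import Memory
# m = Memory.from_config(MEM0_CONFIG)
# m.add("The reading lamp is in the study", user_id="home")
# print(m.search("where is the reading lamp?", user_id="home"))
```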
#### 1E. Write Memory Backup launchd Job (P4)

- [ ] Create git repo at `~/.openclaw/memory/` (or a subdirectory)
- [ ] Write backup script: `git add . && git commit -m "mem0 backup $(date)" && git push`
- [ ] Write launchd plist: `com.homeai.mem0-backup.plist` — daily schedule
- [ ] Load plist, verify it runs
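
A launchd job of roughly this shape should cover the daily schedule. The label matches the checklist; the script path and the 03:30 run time are placeholders to adjust:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.homeai.mem0-backup</string>
    <!-- Script path is a placeholder; point it at the backup script from above. -->
    <key>ProgramArguments</key>
    <array>
        <string>/bin/bash</string>
        <string>/path/to/mem0-backup.sh</string>
    </array>
    <!-- Run daily at 03:30. -->
    <key>StartCalendarInterval</key>
    <dict>
        <key>Hour</key>
        <integer>3</integer>
        <key>Minute</key>
        <integer>30</integer>
    </dict>
    <key>StandardErrorPath</key>
    <string>/tmp/mem0-backup.err</string>
</dict>
</plist>
```

Load it with `launchctl load ~/Library/LaunchAgents/com.homeai.mem0-backup.plist`, then check `/tmp/mem0-backup.err` after the first run.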
#### 1F. Build Morning Briefing n8n Workflow (P4)

- [ ] Verify n8n is running (Docker, deployed in P1)
- [ ] Create workflow: time trigger → fetch weather from HA → compose briefing text → POST to OpenClaw `/speak` endpoint
- [ ] Export workflow JSON to `homeai-agent/workflows/morning-briefing.json`
- [ ] Test: manually trigger → hear spoken briefing
#### 1G. Build Notification Router n8n Workflow (P4)

- [ ] Create workflow: HA webhook trigger → classify urgency → high: TTS immediately, low: queue
- [ ] Export to `homeai-agent/workflows/notification-router.json`
#### 1H. Verify Full Voice → Agent → HA Action Flow (P3 + P4)

- [ ] Trigger wake word ("hey jarvis") via USB mic
- [ ] Speak a command: "Turn on the reading lamp"
- [ ] Verify: wake word detected → audio captured → STT transcribes → OpenClaw receives text → tool call to HA → lamp turns on → TTS response plays back
- [ ] Document any latency issues or failure points
### Sprint 1 Flow Diagram

```mermaid
flowchart LR
    A[USB Mic] -->|wake word| B[openWakeWord]
    B -->|audio stream| C[Wyoming STT - Whisper]
    C -->|transcribed text| D[Home Assistant Pipeline]
    D -->|text| E[OpenClaw Agent]
    E -->|tool call| F[HA REST API]
    F -->|action| G[Smart Device]
    E -->|response text| H[Wyoming TTS - Kokoro]
    H -->|audio| I[Speaker]
```

---
## Sprint 2 — Foundation Hardening

**Goal:** All services survive a reboot, are monitored, and are remotely accessible.

### Tasks
#### 2A. Install and Configure Tailscale (P1)

- [ ] Install Tailscale on Mac Mini: `brew install tailscale`
- [ ] Authenticate and join Tailnet
- [ ] Verify all services reachable via Tailscale IP (HA, Open WebUI, Portainer, Gitea, n8n, code-server)
- [ ] Document Tailscale IP → service URL mapping
#### 2B. Configure Uptime Kuma Monitors (P1 + P2)

- [ ] Add monitors for: Home Assistant, Portainer, Gitea, code-server, n8n
- [ ] Add monitors for: Ollama API (port 11434), Open WebUI (port 3030)
- [ ] Add monitors for: Wyoming STT (port 10300), Wyoming TTS (port 10301)
- [ ] Add monitor for: OpenClaw (port 8080)
- [ ] Configure mobile push alerts (ntfy or Pushover)
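
If ntfy is the chosen alert channel, it accepts a plain HTTP POST, which makes the channel easy to test before wiring it into Uptime Kuma. A minimal sketch — the topic name is an invented placeholder you should replace with a private one:

```python
from urllib import request

NTFY_TOPIC = "homeai-alerts"  # placeholder: pick your own private topic

def build_alert(message: str, title: str, priority: str = "default") -> request.Request:
    """Build an ntfy.sh push request: the POST body is the message text,
    metadata rides in the Title/Priority headers."""
    return request.Request(
        f"https://ntfy.sh/{NTFY_TOPIC}",
        data=message.encode("utf-8"),
        headers={"Title": title, "Priority": priority},
        method="POST",
    )

# To actually send (requires network):
# request.urlopen(build_alert("Home Assistant is DOWN", "Uptime Kuma", "urgent"))
```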
#### 2C. Cold Reboot Verification (P1)

- [ ] Reboot the Mac Mini
- [ ] Verify all Docker containers come back up (restart policy: `unless-stopped`)
- [ ] Verify launchd services start: Ollama, Wyoming STT, Wyoming TTS, openWakeWord, OpenClaw
- [ ] Check Uptime Kuma — all monitors green within 2 minutes
- [ ] Document any services that failed to restart and fix them
#### 2D. Run LLM Benchmarks (P2)

- [ ] Run `homeai-llm/scripts/benchmark.sh`
- [ ] Record results: tokens/sec for each model (qwen2.5:7b, llama3.3:70b, etc.)
- [ ] Write results to `homeai-llm/benchmark-results.md`
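
If you compute speeds yourself rather than relying on the script's output, Ollama's `/api/generate` response reports `eval_count` (tokens generated) and `eval_duration` (nanoseconds spent generating), so tokens/sec is a one-line calculation:

```python
def tokens_per_sec(eval_count: int, eval_duration_ns: int) -> float:
    """Generation speed from Ollama's /api/generate metrics:
    eval_count = tokens generated, eval_duration = time in nanoseconds."""
    return eval_count / (eval_duration_ns / 1e9)

# Illustrative numbers: 256 tokens generated in 8.0 s -> 32.0 tok/s
print(tokens_per_sec(256, 8_000_000_000))  # → 32.0
```

Prompt-processing speed works the same way from `prompt_eval_count` / `prompt_eval_duration`.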
---
## Sprint 3 — Character System (P5)

**Goal:** Character schema defined, default character created, Character Manager UI functional.

### Tasks
#### 3A. Define Character Schema (P5)

- [ ] Write `homeai-character/schema/character.schema.json` (v1) — based on the spec in PLAN.md
- [ ] Write `homeai-character/schema/README.md` documenting each field
#### 3B. Create Default Character (P5)

- [ ] Write `homeai-character/characters/aria.json` with placeholder expression IDs
- [ ] Validate aria.json against schema (manual or script)
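
For the "script" option, even a required-keys check catches most editing mistakes before the full ajv validation lands in 3D. The field names below are hypothetical until the 3A schema is written:

```python
import json

# Hypothetical required fields — replace with whatever character.schema.json defines.
REQUIRED_FIELDS = {"name", "system_prompt", "voice", "expressions"}

def validate_character(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the file passes."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        return [f"invalid JSON: {e}"]
    if not isinstance(data, dict):
        return ["top level must be an object"]
    return [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - data.keys())]

# Usage:
# print(validate_character(open("characters/aria.json").read()))
```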
#### 3C. Set Up Vite Project (P5)

- [ ] Initialize Vite + React project in `homeai-character/`
- [ ] Install deps: `npm install react react-dom ajv`
- [ ] Move existing `character-manager.jsx` into `src/`
- [ ] Verify dev server runs at `http://localhost:5173`
#### 3D. Wire Character Manager Features (P5)

- [ ] Integrate schema validation on export (ajv)
- [ ] Add expression mapping UI section
- [ ] Add custom rules editor
- [ ] Test full edit → export → validate → load cycle
#### 3E. Wire Character into OpenClaw (P4 + P5)

- [ ] Copy/symlink `aria.json` to `~/.openclaw/characters/aria.json`
- [ ] Configure OpenClaw to load system prompt from character JSON
- [ ] Verify OpenClaw uses Aria's system prompt in responses
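
How OpenClaw ingests the character file depends on its own config mechanism, but the loading step itself is small. An illustrative sketch, with the `name` and `system_prompt` field names assumed pending the 3A schema:

```python
import json
from pathlib import Path

def load_system_prompt(path: Path) -> str:
    """Read a character JSON and compose the system prompt, prefixed with the
    character's name so the model self-identifies. Field names are assumptions."""
    data = json.loads(path.read_text(encoding="utf-8"))
    return f"You are {data['name']}. {data['system_prompt']}"

# Usage:
# prompt = load_system_prompt(Path.home() / ".openclaw/characters/aria.json")
```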
---
## Open Decisions to Resolve During These Sprints

| Decision | Options | Recommendation |
|---|---|---|
| Character name / wake word | "Aria" vs custom | Decide during Sprint 3 — affects wake word training later |
| mem0 backend | Chroma vs Qdrant | Start with Chroma (Sprint 1D) — migrate if recall quality is poor |
| HA conversation agent | Default HA vs OpenClaw | Test with HA default first, then wire OpenClaw as custom conversation agent |
---

---

## What This Unlocks

After these 3 sprints, the system will have:

- **End-to-end voice control**: speak → understand → act → respond
- **Persistent memory**: the assistant remembers across sessions
- **Automated workflows**: morning briefings, notification routing
- **Monitoring**: all services tracked, alerts on failure
- **Remote access**: everything reachable via Tailscale
- **Character identity**: Aria persona loaded into the agent pipeline
- **Reboot resilience**: everything survives a cold restart

This positions the project to move into **Phase 4 (ESP32 hardware)** and **Phase 5 (VTube Studio visual layer)** with confidence that the core pipeline is solid.