Full project plan across 8 sub-projects (homeai-infra, homeai-llm, homeai-voice, homeai-agent, homeai-character, homeai-esp32, homeai-visual, homeai-images). Includes per-project PLAN.md files, top-level PROJECT_PLAN.md, and master TODO.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# P4: homeai-agent — AI Agent, Skills & Automation
> Phase 3 | Depends on: P1 (HA), P2 (Ollama), P3 (Wyoming/TTS), P5 (character JSON)

---

## Goal

OpenClaw running as the primary AI agent: receives voice/text input, loads character persona, calls tools (skills), manages memory (mem0), dispatches responses (TTS, HA actions, VTube expressions). n8n handles scheduled/automated workflows.

---

## Architecture

```
Voice input (text from P3 Wyoming STT)
        ↓
OpenClaw API (port 8080)
        ↓ loads character JSON from P5
System prompt construction
        ↓
Ollama LLM (P2) — llama3.3:70b
        ↓ response + tool calls
Skill dispatcher
  ├── home_assistant.py → HA REST API (P1)
  ├── memory.py → mem0 (local)
  ├── vtube_studio.py → VTube WS (P7)
  ├── comfyui.py → ComfyUI API (P8)
  ├── music.py → Music Assistant (Phase 7)
  └── weather.py → HA sensor data
        ↓ final response text
TTS dispatch:
  ├── Chatterbox (voice clone, if active)
  └── Kokoro (via Wyoming, fallback)
        ↓
Audio playback to appropriate room
```

---

## OpenClaw Setup

### Installation

```bash
# Confirm OpenClaw supports Ollama — check repo for latest install method
pip install openclaw

# or install from source
git clone https://github.com/<openclaw-repo>/openclaw
cd openclaw
pip install -e .
```

**Key question:** Verify OpenClaw's Ollama/OpenAI-compatible backend support before installation. If OpenClaw doesn't support local Ollama natively, use a thin adapter layer pointing its OpenAI endpoint at `http://localhost:11434/v1`.

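
If an adapter turns out to be necessary, a minimal sketch of calling Ollama's OpenAI-compatible chat endpoint directly (stdlib only; the `build_chat_request` helper is illustrative, not an OpenClaw API):

```python
import json
import urllib.request

OLLAMA_BASE = "http://localhost:11434/v1"  # Ollama's OpenAI-compatible endpoint


def build_chat_request(prompt: str, model: str = "llama3.3:70b") -> tuple[str, bytes]:
    # OpenAI-style chat completion: POST /v1/chat/completions with a messages array
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return f"{OLLAMA_BASE}/chat/completions", json.dumps(payload).encode()


def chat(prompt: str, model: str = "llama3.3:70b") -> str:
    url, body = build_chat_request(prompt, model)
    req = urllib.request.Request(url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=120) as resp:
        out = json.loads(resp.read())
    # OpenAI-compatible response shape: choices[0].message.content
    return out["choices"][0]["message"]["content"]
```

OpenClaw (or any OpenAI-client library) pointed at the same `base_url` would send the identical request shape.
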
### Config — `~/.openclaw/config.yaml`

```yaml
version: 1

llm:
  provider: ollama              # or openai-compatible
  base_url: http://localhost:11434/v1
  model: llama3.3:70b
  fast_model: qwen2.5:7b        # used for quick intent classification

character:
  active: aria
  config_dir: ~/.openclaw/characters/

memory:
  provider: mem0
  store_path: ~/.openclaw/memory/
  embedding_model: nomic-embed-text
  embedding_url: http://localhost:11434/v1

api:
  host: 0.0.0.0
  port: 8080

tts:
  primary: chatterbox           # when voice clone active
  fallback: kokoro-wyoming      # Wyoming TTS endpoint
  wyoming_tts_url: tcp://localhost:10301

wake:
  endpoint: /wake               # openWakeWord POSTs here to trigger listening
```

---

## Skills

All skills live in `~/.openclaw/skills/` (symlinked from `homeai-agent/skills/`).

### `home_assistant.py`

Wraps the HA REST API for common smart home actions.

**Functions:**

- `turn_on(entity_id, **kwargs)` — lights, switches, media players
- `turn_off(entity_id)`
- `toggle(entity_id)`
- `set_light(entity_id, brightness=None, color_temp=None, rgb_color=None)`
- `run_scene(scene_id)`
- `get_state(entity_id)` → returns state + attributes
- `list_entities(domain=None)` → returns entity list

Uses `HA_URL` and `HA_TOKEN` from `.env.services`.
### `memory.py`

Wraps mem0 for persistent long-term memory.

**Functions:**

- `remember(text, category=None)` — store a memory
- `recall(query, limit=5)` — semantic search over memories
- `forget(memory_id)` — delete a specific memory
- `list_recent(n=10)` — list most recent memories

mem0 uses `nomic-embed-text` via Ollama for embeddings.
### `weather.py`

Pulls weather data from Home Assistant sensors (local weather station or HA weather integration).

**Functions:**

- `get_current()` → temp, humidity, conditions
- `get_forecast(days=3)` → forecast array
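
HA weather entities report current conditions as the entity state, with temperature/humidity as attributes; `get_current()` mostly just flattens that. A sketch of the flattening step (the `weather.home` entity id in the docstring is an assumption about the HA setup):

```python
def summarize_weather(state: dict) -> dict:
    """Flatten an HA weather entity state (e.g. from GET /api/states/weather.home)
    into the fields get_current() returns."""
    attrs = state.get("attributes", {})
    return {
        "conditions": state.get("state"),
        "temperature": attrs.get("temperature"),
        "humidity": attrs.get("humidity"),
    }
```
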
### `timer.py`

Simple timer/reminder management.

**Functions:**

- `set_timer(duration_seconds, label=None)` → fires HA notification/TTS on expiry
- `set_reminder(datetime_str, message)` → schedules future TTS playback
- `list_timers()`
- `cancel_timer(timer_id)`
### `music.py` (stub — completed in Phase 7)

```python
def play(query: str): ...        # "play jazz" → Music Assistant
def pause(): ...
def skip(): ...
def set_volume(level: int): ...  # 0-100
```
### `vtube_studio.py` (implemented in P7)

Stub in P4, full implementation in P7:

```python
def trigger_expression(event: str): ...   # "thinking", "happy", etc.
def set_parameter(name: str, value: float): ...
```
### `comfyui.py` (implemented in P8)

Stub in P4, full implementation in P8:

```python
def generate(workflow: str, params: dict) -> str: ...  # returns image path
```

---

## mem0 — Long-Term Memory

### Setup

```bash
pip install mem0ai
```

### Config

```python
from mem0 import Memory

config = {
    "llm": {
        "provider": "ollama",
        "config": {
            "model": "llama3.3:70b",
            "ollama_base_url": "http://localhost:11434",
        },
    },
    "embedder": {
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text",
            "ollama_base_url": "http://localhost:11434",
        },
    },
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "homeai_memory",
            "path": "~/.openclaw/memory/chroma",
        },
    },
}

memory = Memory.from_config(config)
```
|
|
|
|
> **Decision point:** Start with Chroma (local file-based). If semantic recall quality is poor, migrate to Qdrant (Docker container).
|
|
|
|
### Backup
|
|
|
|
Daily cron (via launchd) commits mem0 data to Gitea:
|
|
|
|
```bash
|
|
#!/usr/bin/env bash
|
|
cd ~/.openclaw/memory
|
|
git add .
|
|
git commit -m "mem0 backup $(date +%Y-%m-%d)"
|
|
git push origin main
|
|
```
|
|
|
|
---
|
|
|
|
## n8n Workflows

n8n runs in Docker (deployed in P1). Workflows are exported as JSON and stored in `homeai-agent/workflows/`.

### Starter Workflows

**`morning-briefing.json`**

- Trigger: time-based (e.g., 7:30 AM on weekdays)
- Steps: fetch weather → fetch calendar events → compose briefing → POST to OpenClaw TTS → speak aloud

**`notification-router.json`**

- Trigger: HA webhook (new notification)
- Steps: classify urgency → if high: TTS immediately; if low: queue for next interaction

**`memory-backup.json`**

- Trigger: daily schedule
- Steps: commit mem0 data to Gitea

### n8n ↔ OpenClaw Integration

OpenClaw exposes a webhook endpoint that n8n can call to trigger TTS or run a skill:

```
POST http://localhost:8080/speak
{
  "text": "Good morning. It is 7:30 and the weather is...",
  "room": "all"
}
```

---

## API Surface (OpenClaw)

Key endpoints consumed by other projects:

| Endpoint | Method | Description |
|---|---|---|
| `/chat` | POST | Send text, get response (+ fires skills) |
| `/wake` | POST | Wake word trigger from openWakeWord |
| `/speak` | POST | TTS only — no LLM, just speak text |
| `/skill/<name>` | POST | Call a specific skill directly |
| `/memory` | GET/POST | Read/write memories |
| `/status` | GET | Health check |

---

## Directory Layout

```
homeai-agent/
├── skills/
│   ├── home_assistant.py
│   ├── memory.py
│   ├── weather.py
│   ├── timer.py
│   ├── music.py          # stub
│   ├── vtube_studio.py   # stub
│   └── comfyui.py        # stub
├── workflows/
│   ├── morning-briefing.json
│   ├── notification-router.json
│   └── memory-backup.json
└── config/
    ├── config.yaml.example
    └── mem0-config.py
```

---

## Interface Contracts

**Consumes:**

- Ollama API: `http://localhost:11434/v1`
- HA API: `$HA_URL` with `$HA_TOKEN`
- Wyoming TTS: `tcp://localhost:10301`
- Character JSON: `~/.openclaw/characters/<active>.json` (from P5)

**Exposes:**

- OpenClaw HTTP API: `http://localhost:8080` — consumed by P3 (voice), P7 (visual triggers), P8 (image skill)

**Add to `.env.services`:**

```dotenv
OPENCLAW_URL=http://localhost:8080
```

---

## Implementation Steps

- [ ] Confirm OpenClaw installation method and Ollama compatibility
- [ ] Install OpenClaw, write `config.yaml` pointing at Ollama and HA
- [ ] Verify OpenClaw responds to a basic text query via `/chat`
- [ ] Write `home_assistant.py` skill — test lights on/off via voice
- [ ] Write `memory.py` skill — test store and recall
- [ ] Write `weather.py` skill — verify HA weather sensor data
- [ ] Write `timer.py` skill — test set/fire a timer
- [ ] Write skill stubs: `music.py`, `vtube_studio.py`, `comfyui.py`
- [ ] Set up mem0 with Chroma backend, test semantic recall
- [ ] Write and test memory backup launchd job
- [ ] Deploy n8n via Docker (P1 task if not done)
- [ ] Build morning briefing n8n workflow
- [ ] Symlink `homeai-agent/skills/` → `~/.openclaw/skills/`
- [ ] Verify full voice → agent → HA action flow (with P3 pipeline)

---

## Success Criteria

- [ ] "Turn on the living room lights" → lights turn on via HA
- [ ] "Remember that I prefer jazz in the mornings" → mem0 stores it; "What do I like in the mornings?" → recalls it
- [ ] Morning briefing n8n workflow fires on schedule and speaks via TTS
- [ ] OpenClaw `/status` returns healthy
- [ ] OpenClaw survives Mac Mini reboot (launchd or Docker — TBD based on OpenClaw's preferred run method)