P4: homeai-agent — AI Agent, Skills & Automation
Phase 3 | Depends on: P1 (HA), P2 (Ollama), P3 (Wyoming/TTS), P5 (character JSON)
Goal
OpenClaw running as the primary AI agent: receives voice/text input, loads character persona, calls tools (skills), manages memory (mem0), dispatches responses (TTS, HA actions, VTube expressions). n8n handles scheduled/automated workflows.
Architecture
Voice input (text from P3 Wyoming STT)
↓
OpenClaw API (port 8080)
↓ loads character JSON from P5
System prompt construction
↓
Ollama LLM (P2) — llama3.3:70b
↓ response + tool calls
Skill dispatcher
├── home_assistant.py → HA REST API (P1)
├── memory.py → mem0 (local)
├── vtube_studio.py → VTube WS (P7)
├── comfyui.py → ComfyUI API (P8)
├── music.py → Music Assistant (Phase 7)
└── weather.py → HA sensor data
↓ final response text
TTS dispatch:
├── Chatterbox (voice clone, if active)
└── Kokoro (via Wyoming, fallback)
↓
Audio playback to appropriate room
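The skill-dispatch stage above can be sketched in a few lines. This is a minimal illustration, not OpenClaw's actual internals: the tool-call shape (`{"name": ..., "arguments": {...}}`) and the registry decorator are assumptions.

```python
# Minimal sketch of the skill-dispatch stage. The tool-call dict shape
# is an assumption, not OpenClaw's actual internal format.
from typing import Any, Callable, Dict

SKILLS: Dict[str, Callable[..., Any]] = {}

def skill(name: str):
    """Register a function under a skill name (e.g. 'weather.get_current')."""
    def decorator(fn):
        SKILLS[name] = fn
        return fn
    return decorator

@skill("weather.get_current")
def get_current() -> dict:
    # Real implementation would query HA sensors (P1); stubbed here.
    return {"temp_c": 21.0, "conditions": "clear"}

def dispatch(tool_call: dict) -> Any:
    """Route one LLM tool call to the matching skill function."""
    fn = SKILLS.get(tool_call["name"])
    if fn is None:
        raise KeyError(f"unknown skill: {tool_call['name']}")
    return fn(**tool_call.get("arguments", {}))
```

Each skill module in `~/.openclaw/skills/` would register its functions in this kind of registry, so the LLM's tool calls map one-to-one onto Python callables.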
OpenClaw Setup
Installation
# Confirm OpenClaw supports Ollama — check repo for latest install method
pip install openclaw
# or install from source
git clone https://github.com/<openclaw-repo>/openclaw
cd openclaw
pip install -e .
Key question: Verify OpenClaw's Ollama/OpenAI-compatible backend support before installation. If OpenClaw doesn't support local Ollama natively, use a thin adapter layer pointing its OpenAI endpoint at http://localhost:11434/v1.
Config — ~/.openclaw/config.yaml
version: 1
llm:
  provider: ollama                  # or openai-compatible
  base_url: http://localhost:11434/v1
  model: llama3.3:70b
  fast_model: qwen2.5:7b            # used for quick intent classification
character:
  active: aria
  config_dir: ~/.openclaw/characters/
memory:
  provider: mem0
  store_path: ~/.openclaw/memory/
  embedding_model: nomic-embed-text
  embedding_url: http://localhost:11434/v1
api:
  host: 0.0.0.0
  port: 8080
tts:
  primary: chatterbox               # when voice clone active
  fallback: kokoro-wyoming          # Wyoming TTS endpoint
  wyoming_tts_url: tcp://localhost:10301
wake:
  endpoint: /wake                   # openWakeWord POSTs here to trigger listening
Skills
All skills live in ~/.openclaw/skills/ (symlinked from homeai-agent/skills/).
home_assistant.py
Wraps the HA REST API for common smart home actions.
Functions:
- `turn_on(entity_id, **kwargs)` — lights, switches, media players
- `turn_off(entity_id)`
- `toggle(entity_id)`
- `set_light(entity_id, brightness=None, color_temp=None, rgb_color=None)`
- `run_scene(scene_id)`
- `get_state(entity_id)` → returns state + attributes
- `list_entities(domain=None)` → returns entity list
Uses HA_URL and HA_TOKEN from .env.services.
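A sketch of how `home_assistant.py` can build its calls. The service-call endpoint (`POST /api/services/<domain>/<service>` with a Bearer token) is Home Assistant's documented REST API; the helper names here are illustrative, not a fixed interface.

```python
import json
import os
import urllib.request

# HA_URL / HA_TOKEN come from .env.services
HA_URL = os.environ.get("HA_URL", "http://localhost:8123")
HA_TOKEN = os.environ.get("HA_TOKEN", "")

def build_service_call(ha_url: str, token: str, entity_id: str,
                       service: str, **kwargs) -> urllib.request.Request:
    """Build a POST to HA's service-call endpoint, e.g. light.turn_on."""
    domain = entity_id.split(".", 1)[0]   # "light.living_room" -> "light"
    payload = {"entity_id": entity_id, **kwargs}
    return urllib.request.Request(
        f"{ha_url}/api/services/{domain}/{service}",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )

def turn_on(entity_id: str, **kwargs):
    req = build_service_call(HA_URL, HA_TOKEN, entity_id, "turn_on", **kwargs)
    with urllib.request.urlopen(req) as resp:   # actually hits HA
        return json.load(resp)
```

Keeping the request construction in a pure helper makes the skill testable without a live HA instance.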
memory.py
Wraps mem0 for persistent long-term memory.
Functions:
- `remember(text, category=None)` — store a memory
- `recall(query, limit=5)` — semantic search over memories
- `forget(memory_id)` — delete a specific memory
- `list_recent(n=10)` — list most recent memories
mem0 uses nomic-embed-text via Ollama for embeddings.
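For reference, this is roughly how that embedding call maps onto Ollama's `POST /api/embeddings` endpoint. mem0 performs this internally, so skills never need to do it by hand; shown only to make the data flow concrete.

```python
import json
import urllib.request

def build_embedding_request(base_url: str, text: str) -> urllib.request.Request:
    """Build the Ollama embeddings call mem0 issues under the hood."""
    payload = {"model": "nomic-embed-text", "prompt": text}
    return urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed(text: str) -> list:
    req = build_embedding_request("http://localhost:11434", text)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]   # vector used for semantic recall
```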
weather.py
Pulls weather data from Home Assistant sensors (local weather station or HA weather integration).
Functions:
- `get_current()` → temp, humidity, conditions
- `get_forecast(days=3)` → forecast array
timer.py
Simple timer/reminder management.
Functions:
- `set_timer(duration_seconds, label=None)` → fires HA notification/TTS on expiry
- `set_reminder(datetime_str, message)` → schedules future TTS playback
- `list_timers()`
- `cancel_timer(timer_id)`
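A minimal sketch of `timer.py`'s bookkeeping using `threading.Timer`. In the real skill the expiry callback would POST to OpenClaw's `/speak` endpoint; here it is injected as a parameter so the logic stays testable on its own.

```python
import threading
import uuid

_timers: dict = {}

def set_timer(duration_seconds: float, label=None,
              on_expiry=lambda label: None) -> str:
    """Start a timer; returns an id usable with cancel_timer()."""
    timer_id = str(uuid.uuid4())
    def fire():
        _timers.pop(timer_id, None)
        on_expiry(label)            # e.g. TTS: "your pasta timer is done"
    t = threading.Timer(duration_seconds, fire)
    _timers[timer_id] = t
    t.start()
    return timer_id

def list_timers() -> list:
    return list(_timers)

def cancel_timer(timer_id: str) -> bool:
    t = _timers.pop(timer_id, None)
    if t is not None:
        t.cancel()
        return True
    return False
```

`set_reminder` would follow the same pattern, computing `duration_seconds` from the target datetime.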
music.py (stub — completed in Phase 7)
def play(query: str): ... # "play jazz" → Music Assistant
def pause(): ...
def skip(): ...
def set_volume(level: int): ... # 0-100
vtube_studio.py (implemented in P7)
Stub in P4, full implementation in P7:
def trigger_expression(event: str): ... # "thinking", "happy", etc.
def set_parameter(name: str, value: float): ...
comfyui.py (implemented in P8)
Stub in P4, full implementation in P8:
def generate(workflow: str, params: dict) -> str: ... # returns image path
mem0 — Long-Term Memory
Setup
pip install mem0ai
Config
from mem0 import Memory

config = {
    "llm": {
        "provider": "ollama",
        "config": {
            "model": "llama3.3:70b",
            "ollama_base_url": "http://localhost:11434",
        },
    },
    "embedder": {
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text",
            "ollama_base_url": "http://localhost:11434",
        },
    },
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "homeai_memory",
            # expand with os.path.expanduser() before use — Chroma does
            # not expand "~" itself
            "path": "~/.openclaw/memory/chroma",
        },
    },
}

memory = Memory.from_config(config)
Decision point: Start with Chroma (local file-based). If semantic recall quality is poor, migrate to Qdrant (Docker container).
Backup
Daily cron (via launchd) commits mem0 data to Gitea:
#!/usr/bin/env bash
cd ~/.openclaw/memory || exit 1
git add .
# commit and push only if something actually changed, so the job
# doesn't fail on quiet days
git diff --cached --quiet || {
  git commit -m "mem0 backup $(date +%Y-%m-%d)"
  git push origin main
}
n8n Workflows
n8n runs in Docker (deployed in P1). Workflows exported as JSON and stored in homeai-agent/workflows/.
Starter Workflows
morning-briefing.json
- Trigger: time-based (e.g., 7:30 AM on weekdays)
- Steps: fetch weather → fetch calendar events → compose briefing → POST to OpenClaw TTS → speak aloud
notification-router.json
- Trigger: HA webhook (new notification)
- Steps: classify urgency → if high: TTS immediately; if low: queue for next interaction
memory-backup.json
- Trigger: daily schedule
- Steps: commit mem0 data to Gitea
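The "compose briefing" step of morning-briefing.json can be sketched as a pure function that n8n calls via an OpenClaw skill (or an n8n Code node). The field names (`temp_c`, `conditions`, `time`, `title`) are assumptions about the upstream nodes' output, not a fixed schema.

```python
def compose_briefing(weather: dict, events: list) -> str:
    """Turn weather + calendar data into one spoken-briefing string."""
    parts = [f"Good morning. It is {weather['temp_c']} degrees "
             f"and {weather['conditions']}."]
    if events:
        parts.append("Today you have: " + "; ".join(
            f"{e['time']} {e['title']}" for e in events) + ".")
    else:
        parts.append("Your calendar is clear today.")
    return " ".join(parts)
```

The returned string is what the workflow would POST to OpenClaw's `/speak` endpoint.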
n8n ↔ OpenClaw Integration
OpenClaw exposes a webhook endpoint that n8n can call to trigger TTS or run a skill:
POST http://localhost:8080/speak
{
  "text": "Good morning. It is 7:30 and the weather is...",
  "room": "all"
}
API Surface (OpenClaw)
Key endpoints consumed by other projects:
| Endpoint | Method | Description |
|---|---|---|
| `/chat` | POST | Send text, get response (+ fires skills) |
| `/wake` | POST | Wake word trigger from openWakeWord |
| `/speak` | POST | TTS only — no LLM, just speak text |
| `/skill/<name>` | POST | Call a specific skill directly |
| `/memory` | GET/POST | Read/write memories |
| `/status` | GET | Health check |
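A hedged sketch of how another project (e.g. P3 handing STT text to the agent) might call `/chat`. The request body shape (`{"text": ..., "room": ...}`) mirrors the `/speak` example above but is an assumption for `/chat` until OpenClaw's API is confirmed.

```python
import json
import urllib.request

def build_chat_request(base_url: str, text: str,
                       room: str = "office") -> urllib.request.Request:
    """Build the POST /chat call; body shape is assumed, not confirmed."""
    return urllib.request.Request(
        f"{base_url}/chat",
        data=json.dumps({"text": text, "room": room}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def chat(text: str, room: str = "office") -> dict:
    req = build_chat_request("http://localhost:8080", text, room)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)   # response text plus any skill results
```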
Directory Layout
homeai-agent/
├── skills/
│ ├── home_assistant.py
│ ├── memory.py
│ ├── weather.py
│ ├── timer.py
│ ├── music.py # stub
│ ├── vtube_studio.py # stub
│ └── comfyui.py # stub
├── workflows/
│ ├── morning-briefing.json
│ ├── notification-router.json
│ └── memory-backup.json
└── config/
├── config.yaml.example
└── mem0-config.py
Interface Contracts
Consumes:
- Ollama API: `http://localhost:11434/v1`
- HA API: `$HA_URL` with `$HA_TOKEN`
- Wyoming TTS: `tcp://localhost:10301`
- Character JSON: `~/.openclaw/characters/<active>.json` (from P5)

Exposes:
- OpenClaw HTTP API: `http://localhost:8080` — consumed by P3 (voice), P7 (visual triggers), P8 (image skill)
Add to .env.services:
OPENCLAW_URL=http://localhost:8080
Implementation Steps
- Confirm OpenClaw installation method and Ollama compatibility
- Install OpenClaw, write `config.yaml` pointing at Ollama and HA
- Verify OpenClaw responds to a basic text query via `/chat`
- Write `home_assistant.py` skill — test lights on/off via voice
- Write `memory.py` skill — test store and recall
- Write `weather.py` skill — verify HA weather sensor data
- Write `timer.py` skill — test set/fire a timer
- Write skill stubs: `music.py`, `vtube_studio.py`, `comfyui.py`
- Set up mem0 with Chroma backend, test semantic recall
- Write and test memory backup launchd job
- Deploy n8n via Docker (P1 task if not done)
- Build morning briefing n8n workflow
- Symlink `homeai-agent/skills/` → `~/.openclaw/skills/`
- Verify full voice → agent → HA action flow (with P3 pipeline)
Success Criteria
- "Turn on the living room lights" → lights turn on via HA
- "Remember that I prefer jazz in the mornings" → mem0 stores it; "What do I like in the mornings?" → recalls it
- Morning briefing n8n workflow fires on schedule and speaks via TTS
- OpenClaw
/statusreturns healthy - OpenClaw survives Mac Mini reboot (launchd or Docker — TBD based on OpenClaw's preferred run method)