feat: Music Assistant, Claude primary LLM, model tag in chat, setup.sh rewrite

- Deploy Music Assistant on Pi (10.0.0.199:8095) with host networking for
  Chromecast mDNS discovery, Spotify + SMB library support
- Switch primary LLM from Ollama to Claude Sonnet 4 (Anthropic API),
  local models remain as fallback
- Add model info tag under each assistant message in dashboard chat,
  persisted in conversation JSON
- Rewrite homeai-agent/setup.sh: loads .env, injects API keys into plists,
  symlinks plists to ~/Library/LaunchAgents/, smoke tests services
- Update install_service() in common.sh to use symlinks instead of copies
- Open UFW ports on Pi for Music Assistant (8095, 8097, 8927)
- Add ANTHROPIC_API_KEY to openclaw + bridge launchd plists

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author: Aodhan Collins
Date: 2026-03-18 22:21:28 +00:00
parent 60eb89ea42
commit 117254d560
17 changed files with 1399 additions and 361 deletions

# P4: homeai-agent — AI Agent, Skills & Automation
---
## Goal
OpenClaw running as the primary AI agent: receives voice/text input, loads character persona, calls tools (skills), manages memory (mem0), dispatches responses (TTS, HA actions, VTube expressions). n8n handles scheduled/automated workflows.
> Phase 4 | Depends on: P1 (HA), P2 (Ollama), P3 (Wyoming/TTS), P5 (character JSON)
> Status: **COMPLETE** (all skills implemented)
---
## Architecture
```
Voice input (text from Wyoming STT via HA pipeline)
OpenClaw HTTP Bridge (port 8081)
resolves character, loads memories, checks mode
System prompt construction (profile + memories)
checks active-mode.json for model routing
OpenClaw CLI → LLM (Ollama local or cloud API)
↓ response + tool calls via exec
Skill dispatcher (CLIs on PATH)
├── ha-ctl → Home Assistant REST API
├── memory-ctl → JSON memory files
├── monitor-ctl → service health checks
├── character-ctl → character switching
├── routine-ctl → scenes, scripts, multi-step routines
├── music-ctl → media player control
├── workflow-ctl → n8n workflow triggering
├── gitea-ctl → Gitea repo/issue queries
├── calendar-ctl → HA calendar + voice reminders
├── mode-ctl → public/private LLM routing
├── gaze-ctl → image generation
└── vtube-ctl → VTube Studio expressions
↓ final response text
TTS dispatch (via active-tts-voice.json):
├── Kokoro (local, Wyoming)
└── ElevenLabs (cloud API)
Audio playback to appropriate room
```
## OpenClaw Setup
- **Runtime:** Node.js global install at `/opt/homebrew/bin/openclaw` (v2026.3.2)
- **Config:** `~/.openclaw/openclaw.json`
- **Gateway:** port 8080, mode local, launchd: `com.homeai.openclaw`
- **Default model:** `ollama/qwen3.5:35b-a3b` (MoE, 35B total, 3B active, 26.7 tok/s)
- **Cloud models (public mode):** `anthropic/claude-sonnet-4-20250514`, `openai/gpt-4o`
- **Critical:** `commands.native: true` in config (enables exec tool for CLI skills)
- **Critical:** `contextWindow: 32768` for large models (prevents GPU OOM)
---
## Skills (13 total)
All skills live in `~/.openclaw/skills/` (symlinked from `homeai-agent/skills/`) and follow the same pattern:
- `~/.openclaw/skills/<name>/SKILL.md` — metadata + agent instructions
- `~/.openclaw/skills/<name>/<tool>` — executable Python CLI (stdlib only)
- Symlinked to `/opt/homebrew/bin/` for PATH access
- Agent invokes via `exec` tool
- Documented in `~/.openclaw/workspace/TOOLS.md`
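As a concrete illustration of this pattern, a minimal skill CLI might look like the sketch below. The tool name `example-ctl` and its `status` subcommand are hypothetical; the real skills each define their own commands, but all are stdlib-only executables that print JSON for the agent's `exec` tool to consume.

```python
#!/usr/bin/env python3
"""Minimal sketch of the skill-CLI pattern: stdlib only, JSON output.

Hypothetical example -- the real skills (ha-ctl, memory-ctl, ...) each
define their own subcommands and talk to their own backing service.
"""
import argparse
import json


def status() -> dict:
    # A real skill would query its backing service here.
    return {"ok": True, "skill": "example"}


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(prog="example-ctl")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("status", help="report skill health")
    args = parser.parse_args(argv)
    if args.command == "status":
        print(json.dumps(status()))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

Keeping output as a single JSON object per invocation makes the result easy for the agent to parse regardless of which skill produced it.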
### Existing Skills (4)
| Skill | CLI | Description |
|-------|-----|-------------|
| home-assistant | `ha-ctl` | Smart home device control |
| image-generation | `gaze-ctl` | Image generation via ComfyUI/GAZE |
| voice-assistant | (none) | Voice pipeline handling |
| vtube-studio | `vtube-ctl` | VTube Studio expression control |
### New Skills (9) — Added 2026-03-17
| Skill | CLI | Description |
|-------|-----|-------------|
| memory | `memory-ctl` | Store/search/recall memories |
| service-monitor | `monitor-ctl` | Service health checks |
| character | `character-ctl` | Character switching |
| routine | `routine-ctl` | Scenes and multi-step routines |
| music | `music-ctl` | Media player control |
| workflow | `workflow-ctl` | n8n workflow management |
| gitea | `gitea-ctl` | Gitea repo/issue/PR queries |
| calendar | `calendar-ctl` | Calendar events and voice reminders |
| mode | `mode-ctl` | Public/private LLM routing |
See `SKILLS_GUIDE.md` for full user documentation.
---
## HTTP Bridge
### Setup
**File:** `openclaw-http-bridge.py` (runs in homeai-voice-env)
**Port:** 8081, launchd: `com.homeai.openclaw-bridge`
---
## API Surface (OpenClaw)
Key endpoints consumed by other projects:
### Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/agent/message` | POST | Send message → LLM → response |
| `/api/tts` | POST | Text-to-speech (Kokoro or ElevenLabs) |
| `/api/stt` | POST | Speech-to-text (Wyoming/Whisper) |
| `/wake` | POST | Wake word notification |
| `/status` | GET | Health check |
---
### Request Flow
1. Resolve character: explicit `character_id` > `satellite_id` mapping > default
2. Build system prompt: profile fields + metadata + personal/general memories
3. Write TTS config to `active-tts-voice.json`
4. Load mode from `active-mode.json`, resolve model (private → local, public → cloud)
5. Call OpenClaw CLI with `--model` flag if public mode
6. Detect/re-prompt if model promises action but doesn't call exec tool
7. Return response
### Timeout Strategy
| State | Timeout |
|-------|---------|
| Model warm (loaded in VRAM) | 120s |
| Model cold (loading) | 180s |
---
## Daemons
| Daemon | Plist | Purpose |
|--------|-------|---------|
| `com.homeai.openclaw` | `launchd/com.homeai.openclaw.plist` | OpenClaw gateway (port 8080) |
| `com.homeai.openclaw-bridge` | `launchd/com.homeai.openclaw-bridge.plist` | HTTP bridge (port 8081) |
| `com.homeai.reminder-daemon` | `launchd/com.homeai.reminder-daemon.plist` | Voice reminder checker (60s interval) |
---
## Data Files
| File | Purpose |
|------|---------|
| `~/homeai-data/memories/personal/*.json` | Per-character memories |
| `~/homeai-data/memories/general.json` | Shared general memories |
| `~/homeai-data/characters/*.json` | Character profiles (schema v2) |
| `~/homeai-data/satellite-map.json` | Satellite → character mapping |
| `~/homeai-data/active-tts-voice.json` | Current TTS engine/voice |
| `~/homeai-data/active-mode.json` | Public/private mode state |
| `~/homeai-data/routines/*.json` | Local routine definitions |
| `~/homeai-data/reminders.json` | Pending voice reminders |
| `~/homeai-data/conversations/*.json` | Chat conversation history |
---
## Environment Variables (OpenClaw Plist)
| Variable | Purpose |
|----------|---------|
| `HASS_TOKEN` / `HA_TOKEN` | Home Assistant API token |
| `HA_URL` | Home Assistant URL |
| `GAZE_API_KEY` | Image generation API key |
| `N8N_API_KEY` | n8n automation API key |
| `GITEA_TOKEN` | Gitea API token |
| `ANTHROPIC_API_KEY` | Claude API key (public mode) |
| `OPENAI_API_KEY` | OpenAI API key (public mode) |
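A small preflight check in the bridge (or in setup.sh's smoke tests) can catch missing keys before a request fails mid-conversation. The variable names come from the table above; treating only the HA pair as strictly required is an assumption to adjust against the actual plists.

```python
import os

# Names from the table above; which are strictly required is an
# assumption -- adjust to match the actual plists.
REQUIRED_VARS = ("HA_URL", "HA_TOKEN")
CLOUD_VARS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")  # public mode only


def missing_vars(required=REQUIRED_VARS, env=None) -> list[str]:
    """Return required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```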
---
## Implementation Status
- [x] OpenClaw installed and configured
- [x] HTTP bridge with character resolution and memory injection
- [x] ha-ctl — smart home control
- [x] gaze-ctl — image generation
- [x] vtube-ctl — VTube Studio expressions
- [x] memory-ctl — memory store/search/recall
- [x] monitor-ctl — service health checks
- [x] character-ctl — character switching
- [x] routine-ctl — scenes and multi-step routines
- [x] music-ctl — media player control
- [x] workflow-ctl — n8n workflow triggering
- [x] gitea-ctl — Gitea integration
- [x] calendar-ctl — calendar + voice reminders
- [x] mode-ctl — public/private LLM routing
- [x] Bridge mode routing (active-mode.json → --model flag)
- [x] Cloud providers in openclaw.json (Anthropic, OpenAI)
- [x] Dashboard /api/mode endpoint
- [x] Reminder daemon (com.homeai.reminder-daemon)
- [x] TOOLS.md updated with all skills
- [ ] Set N8N_API_KEY (requires generating in n8n UI)
- [ ] Set GITEA_TOKEN (requires generating in Gitea UI)
- [ ] Set ANTHROPIC_API_KEY / OPENAI_API_KEY for public mode
- [ ] End-to-end voice test of each skill