# P4: homeai-agent — AI Agent, Skills & Automation

> Phase 4 | Depends on: P1 (HA), P2 (Ollama), P3 (Wyoming/TTS), P5 (character JSON)
> Status: **COMPLETE** (all skills implemented)

---

## Architecture

```
Voice input (text from Wyoming STT via HA pipeline)
        ↓
OpenClaw HTTP Bridge (port 8081)
        ↓  resolves character, loads memories, checks mode
System prompt construction (profile + memories)
        ↓  checks active-mode.json for model routing
OpenClaw CLI → LLM (Ollama local or cloud API)
        ↓  response + tool calls via exec
Skill dispatcher (CLIs on PATH)
    ├── ha-ctl        → Home Assistant REST API
    ├── memory-ctl    → JSON memory files
    ├── monitor-ctl   → service health checks
    ├── character-ctl → character switching
    ├── routine-ctl   → scenes, scripts, multi-step routines
    ├── music-ctl     → media player control
    ├── workflow-ctl  → n8n workflow triggering
    ├── gitea-ctl     → Gitea repo/issue queries
    ├── calendar-ctl  → HA calendar + voice reminders
    ├── mode-ctl      → public/private LLM routing
    ├── gaze-ctl      → image generation
    └── vtube-ctl     → VTube Studio expressions
        ↓  final response text
TTS dispatch (via active-tts-voice.json):
    ├── Kokoro (local, Wyoming)
    └── ElevenLabs (cloud API)
        ↓
Audio playback to appropriate room
```

---

## OpenClaw Setup

- **Runtime:** Node.js global install at `/opt/homebrew/bin/openclaw` (v2026.3.2)
- **Config:** `~/.openclaw/openclaw.json`
- **Gateway:** port 8080, mode local, launchd: `com.homeai.openclaw`
- **Default model:** `ollama/qwen3.5:35b-a3b` (MoE, 35B total, 3B active, 26.7 tok/s)
- **Cloud models (public mode):** `anthropic/claude-sonnet-4-20250514`, `openai/gpt-4o`
- **Critical:** `commands.native: true` in config (enables exec tool for CLI skills)
- **Critical:** `contextWindow: 32768` for large models (prevents GPU OOM)
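
The two settings marked critical can be sanity-checked before the gateway starts. The sketch below is only an illustration: it assumes `commands.native` is stored as a nested `{"commands": {"native": true}}` object, and because the location of `contextWindow` inside `openclaw.json` is not shown here, it searches the whole file for that key instead of relying on a fixed path. `find_key` and `check_config` are hypothetical helper names, not part of OpenClaw.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".openclaw" / "openclaw.json"


def find_key(node, key):
    """Yield every value stored under `key` anywhere in nested JSON data."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            yield from find_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


def check_config(config):
    """Return a list of problems; an empty list means both settings are present."""
    problems = []
    # Assumption: "commands.native" maps to {"commands": {"native": true}}.
    commands = config.get("commands")
    native = commands.get("native") if isinstance(commands, dict) else None
    if native is not True:
        problems.append("commands.native is not true (exec tool unavailable)")
    # Assumption: contextWindow may sit on any model entry, so search everywhere.
    if not list(find_key(config, "contextWindow")):
        problems.append("no contextWindow found (risk of GPU OOM on large models)")
    return problems


if __name__ == "__main__":
    if CONFIG_PATH.exists():
        for problem in check_config(json.loads(CONFIG_PATH.read_text())):
            print("WARN:", problem)
```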

---

## Skills (13 total)

All skills follow the same pattern:

- `~/.openclaw/skills/<name>/SKILL.md` — metadata + agent instructions
- `~/.openclaw/skills/<name>/<tool>` — executable Python CLI (stdlib only)
- Symlinked to `/opt/homebrew/bin/` for PATH access
- Agent invokes via `exec` tool
- Documented in `~/.openclaw/workspace/TOOLS.md`
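
A skill CLI following this pattern might look like the sketch below. The name `example-ctl`, its `status` and `echo` subcommands, and the JSON output shape are all hypothetical; they only illustrate the stdlib-only structure, and the real tools may parse arguments and format output differently.

```python
#!/usr/bin/env python3
"""example-ctl: hypothetical skill CLI using only the standard library."""
import argparse
import json
import sys


def build_parser():
    parser = argparse.ArgumentParser(prog="example-ctl")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("status", help="report that the skill is reachable")
    echo = sub.add_parser("echo", help="echo text back to the agent")
    echo.add_argument("text")
    return parser


def run(argv):
    """Parse argv and return a JSON-serialisable result."""
    args = build_parser().parse_args(argv)
    if args.command == "status":
        return {"ok": True, "skill": "example"}
    return {"ok": True, "echo": args.text}


def main(argv=None):
    result = run(sys.argv[1:] if argv is None else argv)
    # One JSON object on stdout, so the caller of the exec tool can parse it.
    print(json.dumps(result))
    return 0


# When installed as an executable, end the file with:
#     if __name__ == "__main__":
#         sys.exit(main())
```

Keeping the logic in `run` and the printing in `main` lets the command be tested without spawning a process, while the symlinked executable stays a thin wrapper.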

### Existing Skills (4)

| Skill | CLI | Description |
|-------|-----|-------------|
| home-assistant | `ha-ctl` | Smart home device control |
| image-generation | `gaze-ctl` | Image generation via ComfyUI/GAZE |
| voice-assistant | (none) | Voice pipeline handling |
| vtube-studio | `vtube-ctl` | VTube Studio expression control |

### New Skills (9) — Added 2026-03-17

| Skill | CLI | Description |
|-------|-----|-------------|
| memory | `memory-ctl` | Store/search/recall memories |
| service-monitor | `monitor-ctl` | Service health checks |
| character | `character-ctl` | Character switching |
| routine | `routine-ctl` | Scenes and multi-step routines |
| music | `music-ctl` | Media player control |
| workflow | `workflow-ctl` | n8n workflow management |
| gitea | `gitea-ctl` | Gitea repo/issue/PR queries |
| calendar | `calendar-ctl` | Calendar events and voice reminders |
| mode | `mode-ctl` | Public/private LLM routing |

See `SKILLS_GUIDE.md` for full user documentation.

---

## HTTP Bridge

**File:** `openclaw-http-bridge.py` (runs in homeai-voice-env)
**Port:** 8081, launchd: `com.homeai.openclaw-bridge`

### Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/agent/message` | POST | Send message → LLM → response |
| `/api/tts` | POST | Text-to-speech (Kokoro or ElevenLabs) |
| `/api/stt` | POST | Speech-to-text (Wyoming/Whisper) |
| `/wake` | POST | Wake word notification |
| `/status` | GET | Health check |
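
As an illustration of calling the bridge, the sketch below builds a POST request for `/api/agent/message` using only the standard library. The request body is an assumption: the field name `message` is a guess, the JSON reply format is assumed, and the bridge is assumed to listen on `localhost`. Only `character_id` is taken from the bridge's documented character resolution.

```python
import json
import urllib.request

BRIDGE_URL = "http://localhost:8081"  # assumption: bridge reachable on localhost


def build_message_request(text, character_id=None, base_url=BRIDGE_URL):
    """Build (but do not send) a POST request for /api/agent/message."""
    payload = {"message": text}  # assumption: the text field is named "message"
    if character_id is not None:
        payload["character_id"] = character_id
    return urllib.request.Request(
        f"{base_url}/api/agent/message",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=200.0):
    """Send the request and decode a JSON reply (reply format is assumed).

    The default timeout is an assumption, chosen to sit slightly above the
    bridge's 180s cold-model timeout.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting construction from sending makes it possible to inspect the request (URL, method, body) without a running bridge; `send(build_message_request("turn on the lights"))` would perform the actual call.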

### Request Flow

1. Resolve the character, in order of precedence: explicit `character_id`, then the `satellite_id` mapping, then the default
2. Build the system prompt: profile fields + metadata + personal/general memories
3. Write the TTS config to `active-tts-voice.json`
4. Load the mode from `active-mode.json` and resolve the model (private → local, public → cloud)
5. Call the OpenClaw CLI, passing the `--model` flag only in public mode
6. Detect when the model promises an action without calling the `exec` tool, and re-prompt
7. Return the response
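
Steps 1 and 4 to 5 can be sketched as two small pure functions. This is not the bridge's actual code. It assumes `satellite-map.json` is a flat `{satellite_id: character_id}` object, that `active-mode.json` carries a `"mode"` key holding `"public"` or `"private"`, and that `--model` takes the model id as its next argument. The function names and the `"default"` character id are hypothetical.

```python
def resolve_character(character_id, satellite_id, satellite_map, default="default"):
    """Apply the precedence: explicit id, then satellite mapping, then default."""
    if character_id:
        return character_id
    if satellite_id and satellite_id in satellite_map:
        return satellite_map[satellite_id]
    return default


def resolve_model_args(mode_state, cloud_model):
    """Return extra CLI arguments: a --model override only in public mode."""
    # Assumption: mode_state is the parsed active-mode.json with a "mode" key.
    if mode_state.get("mode") == "public":
        return ["--model", cloud_model]
    # Private mode: no override, so the configured local default model is used.
    return []
```

For example, `resolve_character(None, "kitchen", {"kitchen": "aria"})` falls through to the satellite mapping, and `resolve_model_args({"mode": "private"}, ...)` returns an empty list so the local default stays in effect.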

### Timeout Strategy

| State | Timeout |
|-------|---------|
| Model warm (loaded in VRAM) | 120s |
| Model cold (loading) | 180s |
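
The table reduces to a single lookup once the caller knows which models are currently loaded. How the bridge detects a warm model is not described here, so the sketch below simply takes the set of loaded model names as an input; `pick_timeout` is a hypothetical name.

```python
WARM_TIMEOUT_S = 120  # model already loaded in VRAM
COLD_TIMEOUT_S = 180  # model still has to be loaded


def pick_timeout(model, loaded_models):
    """Return the request timeout in seconds for the given model.

    `loaded_models` is any collection of model names known to be in VRAM;
    obtaining that collection is left to the caller (assumption).
    """
    return WARM_TIMEOUT_S if model in loaded_models else COLD_TIMEOUT_S
```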

---

## Daemons

| Daemon | Plist | Purpose |
|--------|-------|---------|
| `com.homeai.openclaw` | `launchd/com.homeai.openclaw.plist` | OpenClaw gateway (port 8080) |
| `com.homeai.openclaw-bridge` | `launchd/com.homeai.openclaw-bridge.plist` | HTTP bridge (port 8081) |
| `com.homeai.reminder-daemon` | `launchd/com.homeai.reminder-daemon.plist` | Voice reminder checker (60s interval) |

---

## Data Files

| File | Purpose |
|------|---------|
| `~/homeai-data/memories/personal/*.json` | Per-character memories |
| `~/homeai-data/memories/general.json` | Shared general memories |
| `~/homeai-data/characters/*.json` | Character profiles (schema v2) |
| `~/homeai-data/satellite-map.json` | Satellite → character mapping |
| `~/homeai-data/active-tts-voice.json` | Current TTS engine/voice |
| `~/homeai-data/active-mode.json` | Public/private mode state |
| `~/homeai-data/routines/*.json` | Local routine definitions |
| `~/homeai-data/reminders.json` | Pending voice reminders |
| `~/homeai-data/conversations/*.json` | Chat conversation history |

---

## Environment Variables (OpenClaw Plist)

| Variable | Purpose |
|----------|---------|
| `HASS_TOKEN` / `HA_TOKEN` | Home Assistant API token |
| `HA_URL` | Home Assistant URL |
| `GAZE_API_KEY` | Image generation API key |
| `N8N_API_KEY` | n8n automation API key |
| `GITEA_TOKEN` | Gitea API token |
| `ANTHROPIC_API_KEY` | Claude API key (public mode) |
| `OPENAI_API_KEY` | OpenAI API key (public mode) |
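
A start-up check for these variables could look like the sketch below. It rests on two assumptions that the table does not confirm: that either `HASS_TOKEN` or `HA_TOKEN` is sufficient on its own, and that the two cloud keys are only needed when public mode is in use. `missing_vars` is a hypothetical helper.

```python
import os

REQUIRED = ("HA_URL", "GAZE_API_KEY", "N8N_API_KEY", "GITEA_TOKEN")
HA_TOKEN_NAMES = ("HASS_TOKEN", "HA_TOKEN")  # assumption: either one is enough
PUBLIC_MODE_ONLY = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")


def missing_vars(env=None, public_mode=False):
    """Return the names of unset or empty variables, in table order."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if not any(env.get(name) for name in HA_TOKEN_NAMES):
        missing.append("HASS_TOKEN or HA_TOKEN")
    if public_mode:
        # Cloud keys only matter when requests are routed to cloud models.
        missing += [name for name in PUBLIC_MODE_ONLY if not env.get(name)]
    return missing
```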

---

## Implementation Status

- [x] OpenClaw installed and configured
- [x] HTTP bridge with character resolution and memory injection
- [x] ha-ctl — smart home control
- [x] gaze-ctl — image generation
- [x] vtube-ctl — VTube Studio expressions
- [x] memory-ctl — memory store/search/recall
- [x] monitor-ctl — service health checks
- [x] character-ctl — character switching
- [x] routine-ctl — scenes and multi-step routines
- [x] music-ctl — media player control
- [x] workflow-ctl — n8n workflow triggering
- [x] gitea-ctl — Gitea integration
- [x] calendar-ctl — calendar + voice reminders
- [x] mode-ctl — public/private LLM routing
- [x] Bridge mode routing (active-mode.json → --model flag)
- [x] Cloud providers in openclaw.json (Anthropic, OpenAI)
- [x] Dashboard /api/mode endpoint
- [x] Reminder daemon (com.homeai.reminder-daemon)
- [x] TOOLS.md updated with all skills
- [ ] Set N8N_API_KEY (requires generating in n8n UI)
- [ ] Set GITEA_TOKEN (requires generating in Gitea UI)
- [ ] Set ANTHROPIC_API_KEY / OPENAI_API_KEY for public mode
- [ ] End-to-end voice test of each skill