homeai/plans/p5_development_plan.md
Aodhan Collins 6db8ae4492 feat: complete voice pipeline — fix wake word crash, bridge timeout, HA conversation agent
- Fix Wyoming satellite crash on wake word: convert macOS .aiff chimes to .wav
  (Python wave module only reads RIFF format, not AIFF)
- Fix OpenClaw HTTP bridge: increase subprocess timeout 30s → 120s, add SO_REUSEADDR
- Fix HA conversation component: use HTTP agent (not CLI) since HA runs in Docker
  on a different machine; update default host to Mac Mini IP, timeout to 120s
- Rewrite character manager as Vite+React app with schema validation
- Add Wyoming satellite wake word command, ElevenLabs TTS server, wakeword monitor
- Add Phase 5 development plan
- Update TODO.md: mark voice pipeline and agent tasks complete

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 00:15:55 +00:00


# P5: HomeAI Character System Development Plan
> Created: 2026-03-07 | Phase: 3 - Agent & Character
## Overview
Phase 5 (P5) focuses on creating a unified, JSON-based character configuration system that serves as the single source of truth for the AI assistant's personality, voice, visual expressions, and behavioral rules. This configuration will be consumed by OpenClaw (P4), the Voice Pipeline (P3), and the Visual Layer (P7).
A key component of this phase is building the **Character Manager UI**—a local React application that provides a user-friendly interface for editing character definitions, validating them against a strict JSON schema, and exporting them for use by the agent.
---
## 1. Schema & Foundation
The first step is establishing the strict data contract that all other services will rely on.
### 1.1 Define Character Schema
- Create `homeai-character/schema/character.schema.json` (v1).
- Define required fields: `schema_version`, `name`, `system_prompt`, `tts`.
- Define optional/advanced fields: `model_overrides`, `live2d_expressions`, `vtube_ws_triggers`, `custom_rules`, `notes`.
- Document the schema in `homeai-character/schema/README.md`.
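A minimal sketch of what `character.schema.json` (v1) might look like, using JSON Schema draft 2020-12. Only the required/optional fields named above are taken from this plan; the sub-field names inside `tts` (`kokoro_voice`, `voice_ref_path`, `speed`) are assumptions based on the TTS settings mentioned elsewhere in this document:

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "character.schema.json",
  "type": "object",
  "required": ["schema_version", "name", "system_prompt", "tts"],
  "properties": {
    "schema_version": { "type": "integer", "const": 1 },
    "name": { "type": "string", "minLength": 1 },
    "system_prompt": { "type": "string", "minLength": 1 },
    "tts": {
      "type": "object",
      "required": ["engine"],
      "properties": {
        "engine": { "enum": ["kokoro", "chatterbox", "qwen3"] },
        "kokoro_voice": { "type": "string" },
        "voice_ref_path": { "type": "string" },
        "speed": { "type": "number" }
      }
    },
    "model_overrides": { "type": "object" },
    "live2d_expressions": { "type": "object" },
    "vtube_ws_triggers": { "type": "object" },
    "custom_rules": { "type": "array" },
    "notes": { "type": "string" }
  },
  "additionalProperties": false
}
```

Setting `additionalProperties: false` keeps the contract strict: a typo'd field name fails validation instead of being silently ignored by consumers.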
### 1.2 Create Default Character Profile
- Create `homeai-character/characters/aria.json` conforming to the schema.
- Define the default system prompt for "Aria" (warm, helpful, concise for smart home tasks).
- Configure default TTS settings (`engine: "kokoro"`, `kokoro_voice: "af_heart"`).
- Add placeholder mappings for `live2d_expressions` and `vtube_ws_triggers`.
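An illustrative `aria.json` conforming to the fields above. The prompt text and the expression-state keys are placeholders, not final copy:

```json
{
  "schema_version": 1,
  "name": "Aria",
  "system_prompt": "You are Aria, a warm, helpful smart-home assistant. Keep responses concise.",
  "tts": {
    "engine": "kokoro",
    "kokoro_voice": "af_heart"
  },
  "live2d_expressions": {
    "idle": "",
    "listening": "",
    "thinking": "",
    "speaking": ""
  },
  "vtube_ws_triggers": {}
}
```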
---
## 2. Character Manager UI Development
Transform the existing prototype (`character-manager.jsx`) into a fully functional local web tool.
### 2.1 Project Initialization
- Scaffold a new Vite + React project in `homeai-character/src/`.
- Install necessary dependencies: `react`, `react-dom`, `ajv` (for schema validation), and styling utilities (e.g., Tailwind CSS).
- Migrate the existing `character-manager.jsx` into the new project structure.
### 2.2 Schema Validation Integration
- Implement `SchemaValidator.js` using `ajv` to validate character configurations against `character.schema.json`.
- Enforce validation checks before allowing the user to export or save a character profile.
- Display clear error messages in the UI if validation fails.
### 2.3 UI Feature Implementation
- **Basic Info & Prompt Editor:** Fields for name, description, and a multi-line editor for the system prompt (with character count).
- **TTS Configuration:** Dropdowns for engine selection (Kokoro, Chatterbox, Qwen3) and inputs for voice reference paths/speed.
- **Expression Mapping Table:** UI to map semantic states (idle, listening, thinking, speaking, etc.) to VTube Studio hotkey IDs.
- **Custom Rules Editor:** Interface to add, edit, and delete trigger/response/condition pairs.
- **Import/Export Pipeline:** Functionality to load an existing JSON file, edit it, and download/save the validated output.
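The export half of the pipeline can be sketched as a pure helper that serializes the profile and derives a filename; the function name and slug rule are assumptions. In the browser, the returned `json` string would feed a `Blob` and a download link:

```javascript
// Sketch of the export step: serialize a validated character and pick a
// filename derived from its name (e.g. "Aria" -> "aria.json").
function exportCharacter(character) {
  const json = JSON.stringify(character, null, 2);
  const filename =
    character.name.toLowerCase().replace(/[^a-z0-9]+/g, "-") + ".json";
  return { filename, json };
}
```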
---
## 3. Pipeline Integration (Wiring it up)
Ensure that the generated character configurations are actually used by the rest of the HomeAI ecosystem.
### 3.1 OpenClaw Integration (P4 Link)
- Configure OpenClaw to load the active character from `~/.openclaw/characters/aria.json`.
- Modify OpenClaw's initialization to inject the `system_prompt` from the JSON into Ollama requests.
- Implement schema version checking in OpenClaw (fail gracefully if `schema_version` is unsupported).
- Ensure OpenClaw supports hot-reloading if the character JSON is updated.
### 3.2 Voice Pipeline Integration (P3 Link)
- Update the TTS dispatch logic to read the `tts` configuration block from the character JSON.
- Dynamically route TTS requests based on the `engine` field (e.g., routing to Kokoro vs. Chatterbox).
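The dispatch can be sketched as a lookup table keyed on `engine`. The endpoint URLs and payload shapes below are placeholders, not the real service addresses:

```javascript
// Sketch: route a TTS request based on the character's tts.engine field.
// URLs and payload fields are illustrative placeholders.
const TTS_BACKENDS = {
  kokoro: {
    url: "http://localhost:8880/tts",
    buildPayload: (tts, text) => ({ text, voice: tts.kokoro_voice })
  },
  chatterbox: {
    url: "http://localhost:8881/tts",
    buildPayload: (tts, text) => ({
      text,
      voice_ref: tts.voice_ref_path,
      speed: tts.speed
    })
  },
  qwen3: {
    url: "http://localhost:8882/tts",
    buildPayload: (tts, text) => ({ text })
  }
};

function buildTtsRequest(ttsConfig, text) {
  const backend = TTS_BACKENDS[ttsConfig.engine];
  if (!backend) throw new Error(`Unknown TTS engine: ${ttsConfig.engine}`);
  return { url: backend.url, payload: backend.buildPayload(ttsConfig, text) };
}
```

Keeping the routing in one table means adding a new engine is a one-entry change rather than another `if/else` branch in the dispatch path.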
---
## 4. Custom Voice Cloning (Optional/Advanced)
To move beyond the default Kokoro voice, set up a custom voice clone.
### 4.1 Audio Processing
- Record 30-60 seconds of clean reference audio for the character (`~/voices/aria-raw.wav`).
- Pre-process the audio using FFmpeg: `ffmpeg -i aria-raw.wav -ar 22050 -ac 1 aria.wav`.
- Move the processed file to the designated directory (`~/voices/aria.wav`).
### 4.2 Configuration & Testing
- Update `aria.json` to use `"engine": "chatterbox"` and set `"voice_ref_path"` to the new audio file.
- Test the voice output. If the quality is insufficient, evaluate Qwen3-TTS as a fallback alternative.
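The updated `tts` block in `aria.json` might then look like this (field names per the plan above; `speed` is an assumed optional field):

```json
{
  "tts": {
    "engine": "chatterbox",
    "voice_ref_path": "~/voices/aria.wav",
    "speed": 1.0
  }
}
```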
---
## Success Criteria Checklist
- [ ] `character.schema.json` is fully defined and documented.
- [ ] `aria.json` is created and passes strict validation against the schema.
- [ ] Vite-based Character Manager UI runs locally without errors.
- [ ] Character Manager successfully imports, edits, validates, and exports character JSONs.
- [ ] OpenClaw successfully reads `aria.json` and applies the system prompt to LLM generation.
- [ ] TTS engine selection dynamically respects the configuration in the character JSON.
- [ ] (Optional) Custom voice reference audio is processed and tested.