homeai/plans/p5_development_plan.md
Aodhan Collins 6db8ae4492 feat: complete voice pipeline — fix wake word crash, bridge timeout, HA conversation agent
- Fix Wyoming satellite crash on wake word: convert macOS .aiff chimes to .wav
  (Python wave module only reads RIFF format, not AIFF)
- Fix OpenClaw HTTP bridge: increase subprocess timeout 30s → 120s, add SO_REUSEADDR
- Fix HA conversation component: use HTTP agent (not CLI) since HA runs in Docker
  on a different machine; update default host to Mac Mini IP, timeout to 120s
- Rewrite character manager as Vite+React app with schema validation
- Add Wyoming satellite wake word command, ElevenLabs TTS server, wakeword monitor
- Add Phase 5 development plan
- Update TODO.md: mark voice pipeline and agent tasks complete

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 00:15:55 +00:00


P5: HomeAI Character System Development Plan

Created: 2026-03-07 | Phase: 3 - Agent & Character

Overview

Phase 5 (P5) focuses on creating a unified, JSON-based character configuration system that serves as the single source of truth for the AI assistant's personality, voice, visual expressions, and behavioral rules. This configuration will be consumed by OpenClaw (P4), the Voice Pipeline (P3), and the Visual Layer (P7).

A key component of this phase is building the Character Manager UI—a local React application that provides a user-friendly interface for editing character definitions, validating them against a strict JSON schema, and exporting them for use by the agent.


1. Schema & Foundation

The first step is establishing the strict data contract that all other services will rely on.

1.1 Define Character Schema

  • Create homeai-character/schema/character.schema.json (v1).
  • Define required fields: schema_version, name, system_prompt, tts.
  • Define optional/advanced fields: model_overrides, live2d_expressions, vtube_ws_triggers, custom_rules, notes.
  • Document the schema in homeai-character/schema/README.md.
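To make the data contract concrete, here is a minimal sketch of what character.schema.json (v1) could look like, assuming JSON Schema draft-07. The field names come from the list above; the types, the engine enum values, and additionalProperties: false are illustrative assumptions to be finalized during 1.1.

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "HomeAI Character",
  "type": "object",
  "required": ["schema_version", "name", "system_prompt", "tts"],
  "properties": {
    "schema_version": { "type": "integer", "const": 1 },
    "name": { "type": "string", "minLength": 1 },
    "description": { "type": "string" },
    "system_prompt": { "type": "string", "minLength": 1 },
    "tts": {
      "type": "object",
      "required": ["engine"],
      "properties": {
        "engine": { "enum": ["kokoro", "chatterbox", "qwen3"] },
        "kokoro_voice": { "type": "string" },
        "voice_ref_path": { "type": "string" },
        "speed": { "type": "number" }
      }
    },
    "model_overrides": { "type": "object" },
    "live2d_expressions": { "type": "object" },
    "vtube_ws_triggers": { "type": "object" },
    "custom_rules": { "type": "array" },
    "notes": { "type": "string" }
  },
  "additionalProperties": false
}
```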

1.2 Create Default Character Profile

  • Create homeai-character/characters/aria.json conforming to the schema.
  • Define the default system prompt for "Aria" (warm, helpful, concise for smart home tasks).
  • Configure default TTS settings (engine: "kokoro", kokoro_voice: "af_heart").
  • Add placeholder mappings for live2d_expressions and vtube_ws_triggers.
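A sketch of what the resulting aria.json could contain, assuming the field names from 1.1; the prompt wording and TTS values below are placeholders, not the final profile.

```json
{
  "schema_version": 1,
  "name": "Aria",
  "description": "Default HomeAI assistant character",
  "system_prompt": "You are Aria, a warm, helpful home assistant. Keep answers concise and action-oriented for smart home tasks.",
  "tts": {
    "engine": "kokoro",
    "kokoro_voice": "af_heart",
    "speed": 1.0
  },
  "live2d_expressions": {},
  "vtube_ws_triggers": {}
}
```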

2. Character Manager UI Development

Transform the existing prototype (character-manager.jsx) into a fully functional local web tool.

2.1 Project Initialization

  • Scaffold a new Vite + React project in homeai-character/src/.
  • Install necessary dependencies: react, react-dom, ajv (for schema validation), and styling utilities (e.g., Tailwind CSS).
  • Migrate the existing character-manager.jsx into the new project structure.

2.2 Schema Validation Integration

  • Implement SchemaValidator.js using ajv to validate character configurations against character.schema.json.
  • Enforce validation checks before allowing the user to export or save a character profile.
  • Display clear error messages in the UI if validation fails.
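As a sketch of the checks SchemaValidator.js will enforce: the real implementation compiles character.schema.json with ajv and surfaces validate.errors in the UI; the dependency-free version below only covers the required top-level fields, so the shape of the result (valid flag plus human-readable error list) is the point, not the coverage.

```javascript
// Dependency-free sketch of the validation contract; the production
// SchemaValidator.js would compile the schema with ajv instead.
const REQUIRED = ["schema_version", "name", "system_prompt", "tts"];

function validateCharacter(character) {
  const errors = [];
  for (const field of REQUIRED) {
    if (!(field in character)) {
      errors.push(`missing required field: ${field}`);
    }
  }
  if ("tts" in character && typeof character.tts !== "object") {
    errors.push("tts must be an object");
  }
  return { valid: errors.length === 0, errors };
}

// A profile missing its system prompt fails validation,
// and the error message names the missing field for the UI.
const result = validateCharacter({
  schema_version: 1,
  name: "Aria",
  tts: { engine: "kokoro" }
});
console.log(result.valid, result.errors);
```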

2.3 UI Feature Implementation

  • Basic Info & Prompt Editor: Fields for name, description, and a multi-line editor for the system prompt (with character count).
  • TTS Configuration: Dropdowns for engine selection (Kokoro, Chatterbox, Qwen3) and inputs for voice reference paths/speed.
  • Expression Mapping Table: UI to map semantic states (idle, listening, thinking, speaking, etc.) to VTube Studio hotkey IDs.
  • Custom Rules Editor: Interface to add, edit, and delete trigger/response/condition pairs.
  • Import/Export Pipeline: Functionality to load an existing JSON file, edit it, and download/save the validated output.

3. Pipeline Integration (Wiring it up)

Ensure that the generated character configurations are actually used by the rest of the HomeAI ecosystem.

  • Configure OpenClaw to load the active character from ~/.openclaw/characters/aria.json.
  • Modify OpenClaw's initialization to inject the system_prompt from the JSON into Ollama requests.
  • Implement schema version checking in OpenClaw (fail gracefully if schema_version is unsupported).
  • Ensure OpenClaw supports hot-reloading if the character JSON is updated.
  • Update the TTS dispatch logic to read the tts configuration block from the character JSON.
  • Dynamically route TTS requests based on the engine field (e.g., routing to Kokoro vs. Chatterbox).
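The consumer side of these bullets can be sketched as three small steps: load and version-check the character, inject system_prompt into the LLM request, and dispatch TTS on the engine field. All names here (loadCharacter, buildOllamaRequest, routeTts, the model name) are illustrative assumptions, not OpenClaw's actual API; hot-reloading would layer on top, e.g. by re-running loadCharacter from an fs.watch callback.

```javascript
// Hypothetical sketch of the OpenClaw integration points.
const SUPPORTED_SCHEMA_VERSIONS = [1];

// Fail gracefully if schema_version is unsupported.
function loadCharacter(json) {
  const character = JSON.parse(json);
  if (!SUPPORTED_SCHEMA_VERSIONS.includes(character.schema_version)) {
    throw new Error(
      `unsupported schema_version ${character.schema_version}; ` +
      `supported: ${SUPPORTED_SCHEMA_VERSIONS.join(", ")}`
    );
  }
  return character;
}

// Inject the character's system_prompt into each Ollama request.
// The model name is a placeholder for whatever OpenClaw is configured with.
function buildOllamaRequest(character, userMessage) {
  return { model: "llama3", system: character.system_prompt, prompt: userMessage };
}

// Dispatch TTS based on the engine field of the tts block.
function routeTts(tts) {
  switch (tts.engine) {
    case "kokoro":
      return { backend: "kokoro", voice: tts.kokoro_voice };
    case "chatterbox":
      return { backend: "chatterbox", ref: tts.voice_ref_path };
    default:
      throw new Error(`unknown TTS engine: ${tts.engine}`);
  }
}
```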

4. Custom Voice Cloning (Optional/Advanced)

To move beyond the default Kokoro voice, set up a custom voice clone.

4.1 Audio Processing

  • Record 30-60 seconds of clean reference audio for the character (~/voices/aria-raw.wav).
  • Pre-process the audio with FFmpeg, resampling to 22.05 kHz mono: ffmpeg -i aria-raw.wav -ar 22050 -ac 1 aria.wav.
  • Move the processed file to the designated directory (~/voices/aria.wav).

4.2 Configuration & Testing

  • Update aria.json to use "engine": "chatterbox" and set "voice_ref_path" to the new audio file.
  • Test the voice output. If the quality is insufficient, evaluate Qwen3-TTS as an alternative.
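The resulting change to the tts block of aria.json might look like the following; the path is the processed file from 4.1, and the field names assume the schema from 1.1.

```json
{
  "tts": {
    "engine": "chatterbox",
    "voice_ref_path": "~/voices/aria.wav"
  }
}
```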

Success Criteria Checklist

  • character.schema.json is fully defined and documented.
  • aria.json is created and passes strict validation against the schema.
  • Vite-based Character Manager UI runs locally without errors.
  • Character Manager successfully imports, edits, validates, and exports character JSONs.
  • OpenClaw successfully reads aria.json and applies the system prompt to LLM generation.
  • TTS engine selection dynamically respects the configuration in the character JSON.
  • (Optional) Custom voice reference audio is processed and tested.