feat: upgrade voice pipeline — MLX Whisper STT (20x faster), Qwen3.5 MoE LLM, fix HA tool calling

- Replace faster-whisper with wyoming-mlx-whisper (whisper-large-v3-turbo, MLX Metal GPU)
  STT latency: 8.4s → 400ms for short voice commands
- Add Qwen3.5-35B-A3B (MoE, 3B active params, Q8_0) to Ollama: 26.7 tok/s vs 5.4 tok/s (70B)
- Add model preload launchd service to pin voice model in VRAM permanently
- Fix HA tool calling: set commands.native=true, symlink ha-ctl to PATH
- Add pipeline benchmark script (STT/LLM/TTS latency profiling)
- Add service restart buttons and STT endpoint to dashboard
- Bind Vite dev server to 0.0.0.0 for LAN access

Total estimated pipeline latency: ~27s → ~4s

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
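
For readers who want to check the latency figures quoted above, per-stage profiling of this kind can be sketched roughly as follows. This is a minimal illustration, not the benchmark script added in this commit: `profile_stages`, `speedup`, and the placeholder stage callables are hypothetical names, and real STT/LLM/TTS calls would replace the `time.sleep` stand-ins.

```python
import time
from typing import Callable, Dict, List, Tuple


def profile_stages(stages: List[Tuple[str, Callable[[], object]]]) -> Dict[str, float]:
    """Run each stage once, in order, and return wall-clock seconds per stage.

    A "total" entry holding the sum of all stages is added at the end,
    so stage names should not themselves be "total".
    """
    timings: Dict[str, float] = {}
    for name, fn in stages:
        start = time.perf_counter()
        fn()
        timings[name] = time.perf_counter() - start
    timings["total"] = sum(timings.values())
    return timings


def speedup(before: float, after: float) -> float:
    """Ratio between two measurements of the same quantity."""
    return before / after


if __name__ == "__main__":
    # Placeholder stages (assumption): swap in real STT, LLM, and TTS calls.
    demo_stages = [
        ("stt", lambda: time.sleep(0.01)),
        ("llm", lambda: time.sleep(0.02)),
        ("tts", lambda: time.sleep(0.01)),
    ]
    for name, seconds in profile_stages(demo_stages).items():
        print(f"{name:>5}: {seconds * 1000:.1f} ms")

    # Figures taken from the commit message: 8.4s -> 400ms STT latency.
    print(f"STT speedup: {speedup(8.4, 0.4):.0f}x")
```

With the numbers from the message, the STT change works out to roughly 21x, consistent with the "20x faster" in the subject line, and the throughput change (26.7 vs 5.4 tok/s) to roughly 4.9x.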
2026-03-13 18:03:12 +00:00
parent 1bfd7fbd08
commit af6b7bd945
10 changed files with 721 additions and 27 deletions
--- a/.env.example
+++ b/.env.example
@@ -2,6 +2,14 @@
 # Copy to .env and fill in your values.
 # .env is gitignored — never commit it.

+# ─── API Keys ──────────────────────────────────────────────────────────────────
+HUGGING_FACE_API_KEY=
+OPENROUTER_API_KEY=
+OPENAI_API_KEY=
+DEEPSEEK_API_KEY=
+GEMINI_API_KEY=
+ELEVENLABS_API_KEY=
+
 # ─── Data & Paths ──────────────────────────────────────────────────────────────
 DATA_DIR=${HOME}/homeai-data
 REPO_DIR=${HOME}/Projects/HomeAI
@@ -45,3 +53,4 @@ VTUBE_WS_URL=ws://localhost:8001

 # ─── P8: Images ────────────────────────────────────────────────────────────────
 COMFYUI_URL=http://localhost:8188
+