Merge branch 'esp32': ESP32-S3-BOX-3 room satellite with voice pipeline

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 20:48:09 +00:00
parent 3c0d905e64 c4cecbd8dc
commit 5f147cae61
13 changed files with 1410 additions and 341 deletions
--- a/TODO.md
+++ b/TODO.md
@@ -108,17 +108,19 @@

 ### P6 · homeai-esp32

- [ ] Install ESPHome: `pip install esphome`
- [ ] Write `esphome/secrets.yaml` (gitignored)
- [ ] Write `base.yaml`, `voice.yaml`, `display.yaml`, `animations.yaml`
- [ ] Write `s3-box-living-room.yaml` for first unit
- [ ] Flash first unit via USB
- [ ] Verify unit appears in HA device list
- [ ] Assign Wyoming voice pipeline to unit in HA
- [ ] Test full wake → STT → LLM → TTS → audio playback cycle
- [ ] Test LVGL face: idle → listening → thinking → speaking → error
- [ ] Verify OTA firmware update works wirelessly
- [ ] Flash remaining units (bedroom, kitchen, etc.)
+- [x] Install ESPHome in `~/homeai-esphome-env` (Python 3.12 venv)
+- [x] Write `esphome/secrets.yaml` (gitignored)
+- [x] Write `homeai-living-room.yaml` (based on official S3-BOX-3 reference config)
+- [x] Generate placeholder face illustrations (7 PNGs, 320×240)
+- [x] Write `setup.sh` with flash/ota/logs/validate commands
+- [x] Write `deploy.sh` with OTA deploy, image management, multi-unit support
+- [x] Flash first unit via USB (living room)
+- [x] Verify unit appears in HA device list (requires HA 2026.x for ESPHome 2025.12+ compat)
+- [x] Assign Wyoming voice pipeline to unit in HA
+- [x] Test full wake → STT → LLM → TTS → audio playback cycle
+- [x] Test display states: idle → listening → thinking → replying → error
+- [x] Verify OTA firmware update works wirelessly (`deploy.sh`, which runs `esphome run --device OTA`)
+- [ ] Flash remaining units (bedroom, kitchen)
 - [ ] Document MAC address → room name mapping

 ---
--- a/homeai-esp32/PLAN.md
+++ b/homeai-esp32/PLAN.md
@@ -6,7 +6,7 @@

 ## Goal

-Flash ESP32-S3-BOX-3 units with ESPHome. Each unit acts as a dumb room satellite: always-on mic, local wake word detection, audio playback, and an LVGL animated face showing assistant state. All intelligence stays on the Mac Mini.
+Flash ESP32-S3-BOX-3 units with ESPHome. Each unit acts as a dumb room satellite: always-on mic, on-device wake word detection, audio playback, and a display showing assistant state via static PNG face illustrations. All intelligence stays on the Mac Mini.

 ---

@@ -17,11 +17,12 @@ Flash ESP32-S3-BOX-3 units with ESPHome. Each unit acts as a dumb room satellite
 | SoC | ESP32-S3 (dual-core Xtensa, 240MHz) |
 | RAM | 512KB SRAM + 16MB PSRAM |
 | Flash | 16MB |
-| Display | 2.4" IPS LCD, 320×240, touchscreen |
-| Mic | Dual microphone array |
-| Speaker | Built-in 1W speaker |
-| Connectivity | WiFi 802.11b/g/n, BT 5.0 |
-| USB | USB-C (programming + power) |
+| Display | 2.4" IPS LCD, 320×240, touchscreen (ILI9xxx, model S3BOX) |
+| Audio ADC | ES7210 (dual mic array, 16kHz 16-bit) |
+| Audio DAC | ES8311 (speaker output, 48kHz 16-bit) |
+| Speaker | Built-in 1W |
+| Connectivity | WiFi 802.11b/g/n (2.4GHz only), BT 5.0 |
+| USB | USB-C (programming + power, native USB JTAG serial) |

 ---

@@ -29,273 +30,86 @@ Flash ESP32-S3-BOX-3 units with ESPHome. Each unit acts as a dumb room satellite

 ```
 ESP32-S3-BOX-3
-├── microWakeWord (on-device, always listening)
-│   └── triggers Wyoming Satellite on wake detection
-├── Wyoming Satellite
-│   ├── streams mic audio → Mac Mini Wyoming STT (port 10300)
-│   └── receives TTS audio ← Mac Mini Wyoming TTS (port 10301)
-├── LVGL Display
-│   └── animated face, driven by HA entity state
+├── micro_wake_word (on-device, always listening)
+│   └── "hey_jarvis" — triggers voice_assistant on wake detection
+├── voice_assistant (ESPHome component)
+│   ├── connects to Home Assistant via ESPHome API
+│   ├── HA routes audio → Mac Mini Wyoming STT (10.0.0.101:10300)
+│   ├── HA routes text → OpenClaw conversation agent (10.0.0.101:8081)
+│   └── HA routes response → Mac Mini Wyoming TTS (10.0.0.101:10301)
+├── Display (ili9xxx, model S3BOX, 320×240)
+│   └── static PNG faces per state (idle, listening, thinking, replying, error)
 └── ESPHome OTA
    └── firmware updates over WiFi
 ```

 ---

+## Pin Map (ESP32-S3-BOX-3)
+
+| Function | Pin(s) | Notes |
+|---|---|---|
+| I2S LRCLK | GPIO45 | strapping pin — warning ignored |
+| I2S BCLK | GPIO17 | |
+| I2S MCLK | GPIO2 | |
+| I2S DIN (mic) | GPIO16 | ES7210 ADC input |
+| I2S DOUT (speaker) | GPIO15 | ES8311 DAC output |
+| Speaker enable | GPIO46 | strapping pin — warning ignored |
+| I2C SCL | GPIO18 | audio codec control bus |
+| I2C SDA | GPIO8 | audio codec control bus |
+| SPI CLK (display) | GPIO7 | |
+| SPI MOSI (display) | GPIO6 | |
+| Display CS | GPIO5 | |
+| Display DC | GPIO4 | |
+| Display Reset | GPIO48 | inverted |
+| Backlight | GPIO47 | LEDC PWM |
+| Left top button | GPIO0 | strapping pin — mute toggle / factory reset |
+
+---
+
 ## ESPHome Configuration

-### Base Config Template
-
-`esphome/base.yaml` — shared across all units:
+### Platform & Framework

 ```yaml
-esphome:
-  name: homeai-${room}
-  friendly_name: "HomeAI ${room_display}"
-  platform: esp32
-  board: esp32-s3-box-3
+esp32:
+  board: esp32s3box
+  flash_size: 16MB
+  cpu_frequency: 240MHz
+  framework:
+    type: esp-idf
+    sdkconfig_options:
+      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
+      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
+      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"

-wifi:
-  ssid: !secret wifi_ssid
-  password: !secret wifi_password
-  ap:
-    ssid: "HomeAI Fallback"
-
-api:
-  encryption:
-    key: !secret api_key
-
-ota:
-  password: !secret ota_password
-
-logger:
-  level: INFO
+psram:
+  mode: octal
+  speed: 80MHz
 ```

-### Room-Specific Config
+### Audio Stack

-`esphome/s3-box-living-room.yaml`:
+Uses `i2s_audio` platform with external ADC/DAC codec chips:

-```yaml
-substitutions:
-  room: living-room
-  room_display: "Living Room"
-  mac_mini_ip: "192.168.1.x"    # or Tailscale IP
+- **Microphone**: ES7210 ADC via I2S, 16kHz 16-bit mono
+- **Speaker**: ES8311 DAC via I2S, 48kHz 16-bit mono (left channel)
+- **Media player**: wraps speaker with volume control (min 50%, max 85%)

-packages:
-  base: !include base.yaml
-  voice: !include voice.yaml
-  display: !include display.yaml
-```
+### Wake Word

-One file per room, only the substitutions change.
+On-device `micro_wake_word` component with the `hey_jarvis` model. It can optionally be switched to Home Assistant streaming wake word via a `select` entity (`wake_word_engine_location`).

-### Voice / Wyoming Satellite — `esphome/voice.yaml`
+### Display

-```yaml
-microphone:
-  - platform: esp_adf
-    id: mic
+`ili9xxx` platform with model `S3BOX`. Uses `update_interval: never` — display updates are triggered by scripts on voice assistant state changes. Static 320×240 PNG images for each state are compiled into firmware.

-speaker:
-  - platform: esp_adf
-    id: spk
+### Voice Assistant

-micro_wake_word:
-  model: hey_jarvis            # or custom model path
-  on_wake_word_detected:
-    - voice_assistant.start:
-
-voice_assistant:
-  microphone: mic
-  speaker: spk
-  noise_suppression_level: 2
-  auto_gain: 31dBFS
-  volume_multiplier: 2.0
-
-  on_listening:
-    - display.page.show: page_listening
-    - script.execute: animate_face_listening
-
-  on_stt_vad_end:
-    - display.page.show: page_thinking
-    - script.execute: animate_face_thinking
-
-  on_tts_start:
-    - display.page.show: page_speaking
-    - script.execute: animate_face_speaking
-
-  on_end:
-    - display.page.show: page_idle
-    - script.execute: animate_face_idle
-
-  on_error:
-    - display.page.show: page_error
-    - script.execute: animate_face_error
-```
-
-**Note:** ESPHome's `voice_assistant` component connects to HA, which routes to Wyoming STT/TTS on the Mac Mini. This is the standard ESPHome → HA → Wyoming path.
-
-### LVGL Display — `esphome/display.yaml`
-
-```yaml
-display:
-  - platform: ili9xxx
-    model: ILI9341
-    id: lcd
-    cs_pin: GPIO5
-    dc_pin: GPIO4
-    reset_pin: GPIO48
-
-touchscreen:
-  - platform: tt21100
-    id: touch
-
-lvgl:
-  displays:
-    - lcd
-  touchscreens:
-    - touch
-
-  # Face widget — centered on screen
-  widgets:
-    - obj:
-        id: face_container
-        width: 320
-        height: 240
-        bg_color: 0x000000
-        children:
-          # Eyes (two circles)
-          - obj:
-              id: eye_left
-              x: 90
-              y: 90
-              width: 50
-              height: 50
-              radius: 25
-              bg_color: 0xFFFFFF
-          - obj:
-              id: eye_right
-              x: 180
-              y: 90
-              width: 50
-              height: 50
-              radius: 25
-              bg_color: 0xFFFFFF
-          # Mouth (line/arc)
-          - arc:
-              id: mouth
-              x: 110
-              y: 160
-              width: 100
-              height: 40
-              start_angle: 180
-              end_angle: 360
-              arc_color: 0xFFFFFF
-
-  pages:
-    - id: page_idle
-    - id: page_listening
-    - id: page_thinking
-    - id: page_speaking
-    - id: page_error
-```
-
-### LVGL Face State Animations — `esphome/animations.yaml`
-
-```yaml
-script:
-  - id: animate_face_idle
-    then:
-      - lvgl.widget.modify:
-          id: eye_left
-          height: 50     # normal open
-      - lvgl.widget.modify:
-          id: eye_right
-          height: 50
-      - lvgl.widget.modify:
-          id: mouth
-          arc_color: 0xFFFFFF
-
-  - id: animate_face_listening
-    then:
-      - lvgl.widget.modify:
-          id: eye_left
-          height: 60     # wider eyes
-      - lvgl.widget.modify:
-          id: eye_right
-          height: 60
-      - lvgl.widget.modify:
-          id: mouth
-          arc_color: 0x00BFFF  # blue tint
-
-  - id: animate_face_thinking
-    then:
-      - lvgl.widget.modify:
-          id: eye_left
-          height: 20     # squinting
-      - lvgl.widget.modify:
-          id: eye_right
-          height: 20
-
-  - id: animate_face_speaking
-    then:
-      - lvgl.widget.modify:
-          id: mouth
-          arc_color: 0x00FF88  # green speaking indicator
-
-  - id: animate_face_error
-    then:
-      - lvgl.widget.modify:
-          id: eye_left
-          bg_color: 0xFF2200  # red eyes
-      - lvgl.widget.modify:
-          id: eye_right
-          bg_color: 0xFF2200
-```
-
-> **Note:** True lip-sync animation (mouth moving with audio) is complex on ESP32. Phase 1: static states. Phase 2: amplitude-driven mouth height using speaker volume feedback.
-
---
-
-## Secrets File
-
-`esphome/secrets.yaml` (gitignored):
-
-```yaml
-wifi_ssid: "YourNetwork"
-wifi_password: "YourPassword"
-api_key: "<32-byte base64 key>"
-ota_password: "YourOTAPassword"
-```
-
---
-
-## Flash & Deployment Workflow
-
-```bash
-# Install ESPHome
-pip install esphome
-
-# Compile + flash via USB (first time)
-esphome run esphome/s3-box-living-room.yaml
-
-# OTA update (subsequent)
-esphome upload esphome/s3-box-living-room.yaml --device <device-ip>
-
-# View logs
-esphome logs esphome/s3-box-living-room.yaml
-```
-
---
-
-## Home Assistant Integration
-
-After flashing:
-1. HA discovers ESP32 automatically via mDNS
-2. Add device in HA → Settings → Devices
-3. Assign Wyoming voice assistant pipeline to the device
-4. Set up room-specific automations (e.g., "Living Room" light control from that satellite)
+ESPHome's `voice_assistant` component connects to HA via the ESPHome native API (not directly to Wyoming). HA orchestrates the pipeline:
+1. Audio → Wyoming STT (Mac Mini) → text
+2. Text → OpenClaw conversation agent → response
+3. Response → Wyoming TTS (Mac Mini) → audio back to ESP32

 ---

@@ -303,43 +117,71 @@ After flashing:

 ```
 homeai-esp32/
+├── PLAN.md
+├── setup.sh                          # env check + flash/ota/logs commands
 └── esphome/
-    ├── base.yaml
-    ├── voice.yaml
-    ├── display.yaml
-    ├── animations.yaml
-    ├── s3-box-living-room.yaml
-    ├── s3-box-bedroom.yaml       # template, fill in when hardware available
-    ├── s3-box-kitchen.yaml       # template
-    └── secrets.yaml              # gitignored
+    ├── secrets.yaml                  # gitignored — WiFi + API key
+    ├── homeai-living-room.yaml       # first unit (full config)
+    ├── homeai-bedroom.yaml           # future: copy + change substitutions
+    ├── homeai-kitchen.yaml           # future: copy + change substitutions
+    └── illustrations/                # 320×240 PNG face images
+        ├── idle.png
+        ├── loading.png
+        ├── listening.png
+        ├── thinking.png
+        ├── replying.png
+        ├── error.png
+        └── timer_finished.png
 ```

 ---

-## Wake Word Decisions
+## ESPHome Environment
+
+```bash
+# Dedicated venv (Python 3.12) — do NOT share with voice/whisper venvs
+~/homeai-esphome-env/bin/esphome version  # ESPHome 2026.2.4+
+
+# Quick commands
+cd ~/gitea/homeai/homeai-esp32
+~/homeai-esphome-env/bin/esphome run esphome/homeai-living-room.yaml     # compile + flash
+~/homeai-esphome-env/bin/esphome logs esphome/homeai-living-room.yaml    # stream logs
+
+# Or use the setup script
+./setup.sh flash    # compile + USB flash
+./setup.sh ota      # compile + OTA update
+./setup.sh logs     # stream device logs
+./setup.sh validate # check YAML without compiling
+```
+
+---
+
+## Wake Word Options

 | Option | Latency | Privacy | Effort |
 |---|---|---|---|
-| `hey_jarvis` (built-in microWakeWord) | ~200ms | On-device | Zero |
+| `hey_jarvis` (built-in micro_wake_word) | ~200ms | On-device | Zero |
 | Custom word (trained model) | ~200ms | On-device | High — requires 50+ recordings |
-| Mac Mini openWakeWord (stream audio) | ~500ms | On Mac | Medium |
+| HA streaming wake word | ~500ms | On Mac Mini | Medium — stream all audio |

-**Recommendation:** Start with `hey_jarvis`. Train a custom word (character's name) once character name is finalised.
+**Current**: `hey_jarvis` on-device. Train a custom word (the character's name) once the name is finalised.

 ---

 ## Implementation Steps

- [ ] Install ESPHome: `pip install esphome`
- [ ] Write `esphome/secrets.yaml` (gitignored)
- [ ] Write `base.yaml`, `voice.yaml`, `display.yaml`, `animations.yaml`
- [ ] Write `s3-box-living-room.yaml` for first unit
- [ ] Flash first unit via USB: `esphome run s3-box-living-room.yaml`
- [ ] Verify unit appears in HA device list
- [ ] Assign Wyoming voice pipeline to unit in HA
- [ ] Test: speak wake word → transcription → LLM response → spoken reply
- [ ] Test: LVGL face cycles through idle → listening → thinking → speaking
- [ ] Verify OTA update works: change LVGL color, deploy wirelessly
+- [x] Install ESPHome in `~/homeai-esphome-env` (Python 3.12)
+- [x] Write `esphome/secrets.yaml` (gitignored)
+- [x] Write `homeai-living-room.yaml` (based on official S3-BOX-3 reference config)
+- [x] Generate placeholder face illustrations (7 PNGs, 320×240)
+- [x] Write `setup.sh` with flash/ota/logs/validate commands
+- [x] Write `deploy.sh` with OTA deploy, image management, multi-unit support
+- [x] Flash first unit via USB (living room)
+- [x] Verify unit appears in HA device list
+- [x] Assign Wyoming voice pipeline to unit in HA
+- [x] Test: speak wake word → transcription → LLM response → spoken reply
+- [x] Test: display cycles through idle → listening → thinking → replying
+- [x] Verify OTA update works: change config, deploy wirelessly
 - [ ] Write config templates for remaining rooms (bedroom, kitchen)
 - [ ] Flash remaining units, verify each works independently
 - [ ] Document final MAC address → room name mapping
@@ -351,7 +193,17 @@ homeai-esp32/
 - [ ] Wake word "hey jarvis" triggers pipeline reliably from 3m distance
 - [ ] STT transcription accuracy >90% for clear speech in quiet room
 - [ ] TTS audio plays clearly through ESP32 speaker
- [ ] LVGL face shows correct state for idle / listening / thinking / speaking / error
+- [ ] Display shows correct state for idle / listening / thinking / replying / error / muted
 - [ ] OTA firmware updates work without USB cable
 - [ ] Unit reconnects automatically after WiFi drop
 - [ ] Unit survives power cycle and resumes normal operation
+
+---
+
+## Known Constraints
+
+- **Memory**: voice_assistant + micro_wake_word + display is near the limit. Do NOT add Bluetooth or LVGL widgets — they will cause crashes.
+- **WiFi**: 2.4GHz only. 5GHz networks are not supported.
+- **Speaker**: 1W built-in. Volume capped at 85% to avoid distortion.
+- **Display**: Static PNGs compiled into firmware. To change images, reflash via OTA (~1-2 min).
+- **First compile**: Downloads the ESP-IDF toolchain (~500MB) and takes 5-10 minutes. Incremental builds take 1-2 minutes.
--- a/homeai-esp32/deploy.sh
+++ b/homeai-esp32/deploy.sh
@@ -0,0 +1,244 @@
+#!/usr/bin/env bash
+# homeai-esp32/deploy.sh — Quick OTA deploy for ESP32-S3-BOX-3 satellites
+#
+# Usage:
+#   ./deploy.sh                    — deploy config + images to living room (default)
+#   ./deploy.sh bedroom            — deploy to bedroom unit
+#   ./deploy.sh --images-only      — deploy existing PNGs from illustrations/ (no regen)
+#   ./deploy.sh --regen-images     — regenerate placeholder PNGs then deploy
+#   ./deploy.sh --validate         — validate config without deploying
+#   ./deploy.sh --all              — deploy to all configured units
+#
+# Images are compiled into firmware, so any PNG changes require a reflash.
+# To use custom images: drop 320x240 PNGs into esphome/illustrations/ then ./deploy.sh
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+ESPHOME_DIR="${SCRIPT_DIR}/esphome"
+ESPHOME_VENV="${HOME}/homeai-esphome-env"
+ESPHOME="${ESPHOME_VENV}/bin/esphome"
+PYTHON="${ESPHOME_VENV}/bin/python3"
+ILLUSTRATIONS_DIR="${ESPHOME_DIR}/illustrations"
+
+# Colors
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+BLUE='\033[0;34m'
+CYAN='\033[0;36m'
+NC='\033[0m'
+
+log_info()  { echo -e "${BLUE}[INFO]${NC} $*"; }
+log_ok()    { echo -e "${GREEN}[OK]${NC} $*"; }
+log_warn()  { echo -e "${YELLOW}[WARN]${NC} $*"; }
+log_error() { echo -e "${RED}[ERROR]${NC} $*" >&2; exit 1; }  # fatal: prints to stderr, then exits
+log_step()  { echo -e "${CYAN}[STEP]${NC} $*"; }
+
+# ─── Available units ──────────────────────────────────────────────────────────
+
+UNIT_NAMES=(living-room bedroom kitchen)
+DEFAULT_UNIT="living-room"
+
+unit_config() {
+  case "$1" in
+    living-room) echo "homeai-living-room.yaml" ;;
+    bedroom)     echo "homeai-bedroom.yaml" ;;
+    kitchen)     echo "homeai-kitchen.yaml" ;;
+    *)           echo "" ;;
+  esac
+}
+
+unit_list() {
+  echo "${UNIT_NAMES[*]}"
+}
+
+# ─── Face image generator ────────────────────────────────────────────────────
+
+generate_faces() {
+  log_step "Generating face illustrations (320x240 PNG)..."
+  "${PYTHON}" << 'PYEOF'
+from PIL import Image, ImageDraw
+import os
+
+WIDTH, HEIGHT = 320, 240
+OUT = os.environ.get("ILLUSTRATIONS_DIR", "esphome/illustrations")
+
+def draw_face(draw, eye_color, mouth_color, eye_height=40, eye_y=80, mouth_style="smile"):
+    ex1, ey1 = 95, eye_y
+    draw.ellipse([ex1-25, ey1-eye_height//2, ex1+25, ey1+eye_height//2], fill=eye_color)
+    ex2, ey2 = 225, eye_y
+    draw.ellipse([ex2-25, ey2-eye_height//2, ex2+25, ey2+eye_height//2], fill=eye_color)
+    if mouth_style == "smile":
+        draw.arc([110, 140, 210, 200], start=0, end=180, fill=mouth_color, width=3)
+    elif mouth_style == "open":
+        draw.ellipse([135, 150, 185, 190], fill=mouth_color)
+    elif mouth_style == "flat":
+        draw.line([120, 170, 200, 170], fill=mouth_color, width=3)
+    elif mouth_style == "frown":
+        draw.arc([110, 160, 210, 220], start=180, end=360, fill=mouth_color, width=3)
+
+states = {
+    "idle":            {"eye_color": "#FFFFFF", "mouth_color": "#FFFFFF", "eye_height": 40, "mouth_style": "smile"},
+    "loading":         {"eye_color": "#6366F1", "mouth_color": "#6366F1", "eye_height": 30, "mouth_style": "flat"},
+    "listening":       {"eye_color": "#00BFFF", "mouth_color": "#00BFFF", "eye_height": 50, "mouth_style": "open"},
+    "thinking":        {"eye_color": "#A78BFA", "mouth_color": "#A78BFA", "eye_height": 20, "mouth_style": "flat"},
+    "replying":        {"eye_color": "#10B981", "mouth_color": "#10B981", "eye_height": 40, "mouth_style": "open"},
+    "error":           {"eye_color": "#EF4444", "mouth_color": "#EF4444", "eye_height": 40, "mouth_style": "frown"},
+    "timer_finished":  {"eye_color": "#F59E0B", "mouth_color": "#F59E0B", "eye_height": 50, "mouth_style": "smile"},
+}
+
+os.makedirs(OUT, exist_ok=True)
+for name, p in states.items():
+    img = Image.new("RGBA", (WIDTH, HEIGHT), (0, 0, 0, 255))
+    draw = ImageDraw.Draw(img)
+    draw_face(draw, p["eye_color"], p["mouth_color"], p["eye_height"], mouth_style=p["mouth_style"])
+    img.save(f"{OUT}/{name}.png")
+    print(f"  {name}.png")
+PYEOF
+  log_ok "Generated 7 face illustrations"
+}
+
+# ─── Check existing images ───────────────────────────────────────────────────
+
+REQUIRED_IMAGES=(idle loading listening thinking replying error timer_finished)
+
+check_images() {
+  local missing=()
+  for name in "${REQUIRED_IMAGES[@]}"; do
+    if [[ ! -f "${ILLUSTRATIONS_DIR}/${name}.png" ]]; then
+      missing+=("${name}.png")
+    fi
+  done
+
+  if [[ ${#missing[@]} -gt 0 ]]; then
+    log_error "Missing illustrations: ${missing[*]}
+  Place 320x240 PNGs in ${ILLUSTRATIONS_DIR}/ or use --regen-images to generate placeholders."
+  fi
+
+  log_ok "All ${#REQUIRED_IMAGES[@]} illustrations present in illustrations/"
+  for name in "${REQUIRED_IMAGES[@]}"; do
+    local size
+    size=$(wc -c < "${ILLUSTRATIONS_DIR}/${name}.png" | tr -d ' ')
+    echo -e "  ${name}.png  (${size} bytes)"
+  done
+}
+
+# ─── Deploy to a single unit ─────────────────────────────────────────────────
+
+deploy_unit() {
+  local unit_name="$1"
+  local config
+  config="$(unit_config "$unit_name")"
+
+  if [[ -z "$config" ]]; then
+    log_error "Unknown unit: ${unit_name}. Available: $(unit_list)"
+  fi
+
+  local config_path="${ESPHOME_DIR}/${config}"
+  if [[ ! -f "$config_path" ]]; then
+    log_error "Config not found: ${config_path}"
+  fi
+
+  log_step "Validating ${config}..."
+  cd "${ESPHOME_DIR}"
+  "${ESPHOME}" config "${config}" > /dev/null
+  log_ok "Config valid"
+
+  log_step "Compiling + OTA deploying ${config}..."
+  "${ESPHOME}" run "${config}" --device OTA 2>&1
+  log_ok "Deployed to ${unit_name}"
+}
+
+# ─── Main ─────────────────────────────────────────────────────────────────────
+
+IMAGES_ONLY=false
+REGEN_IMAGES=false
+VALIDATE_ONLY=false
+DEPLOY_ALL=false
+TARGET="${DEFAULT_UNIT}"
+
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --images-only)   IMAGES_ONLY=true; shift ;;
+    --regen-images)  REGEN_IMAGES=true; shift ;;
+    --validate)      VALIDATE_ONLY=true; shift ;;
+    --all)           DEPLOY_ALL=true; shift ;;
+    --help|-h)
+      echo "Usage: $0 [unit-name] [--images-only] [--regen-images] [--validate] [--all]"
+      echo ""
+      echo "Units: $(unit_list)"
+      echo ""
+      echo "Options:"
+      echo "  --images-only    Deploy existing PNGs from illustrations/ (for custom images)"
+      echo "  --regen-images   Regenerate placeholder face PNGs then deploy"
+      echo "  --validate       Validate config without deploying"
+      echo "  --all            Deploy to all configured units"
+      echo ""
+      echo "Examples:"
+      echo "  $0                        # deploy config to living-room"
+      echo "  $0 bedroom                # deploy to bedroom"
+      echo "  $0 --images-only          # deploy with current images (custom or generated)"
+      echo "  $0 --regen-images         # regenerate placeholder faces + deploy"
+      echo "  $0 --all                  # deploy to all units"
+      echo ""
+      echo "Custom images: drop 320x240 PNGs into esphome/illustrations/"
+      echo "Required files: ${REQUIRED_IMAGES[*]}"
+      exit 0
+      ;;
+    *)
+      if [[ -n "$(unit_config "$1")" ]]; then
+        TARGET="$1"
+      else
+        log_error "Unknown option or unit: $1. Use --help for usage."
+      fi
+      shift
+      ;;
+  esac
+done
+
+# Check ESPHome
+if [[ ! -x "${ESPHOME}" ]]; then
+  log_error "ESPHome not found at ${ESPHOME}. Run setup.sh first."
+fi
+
+# Regenerate placeholder images if requested
+if $REGEN_IMAGES; then
+  export ILLUSTRATIONS_DIR
+  generate_faces
+fi
+
+# Verify that all required images exist when --images-only is given (other modes skip this check)
+if $IMAGES_ONLY; then
+  check_images
+fi
+
+# Validate only
+if $VALIDATE_ONLY; then
+  cd "${ESPHOME_DIR}"
+  for unit_name in "${UNIT_NAMES[@]}"; do
+    config="$(unit_config "$unit_name")"
+    if [[ -f "${config}" ]]; then
+      log_step "Validating ${config}..."
+      if "${ESPHOME}" config "${config}" > /dev/null; then log_ok "${config} valid"; else log_warn "${config} invalid"; fi
+    fi
+  done
+  exit 0
+fi
+
+# Deploy
+if $DEPLOY_ALL; then
+  for unit_name in "${UNIT_NAMES[@]}"; do
+    config="$(unit_config "$unit_name")"
+    if [[ -f "${ESPHOME_DIR}/${config}" ]]; then
+      deploy_unit "$unit_name"
+    else
+      log_warn "Skipping ${unit_name} — ${config} not found"
+    fi
+  done
+else
+  deploy_unit "$TARGET"
+fi
+
+echo ""
+log_ok "Deploy complete!"
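
Note on `check_images` in `deploy.sh`: it confirms that each required PNG exists and prints its byte size, but it does not confirm the 320x240 dimensions that the usage text asks for. A minimal sketch of such a check follows, using only the Python standard library so it does not depend on Pillow. The helper names `png_size` and `wrong_size` are hypothetical and are not part of this commit; the sketch assumes well-formed PNG files whose first chunk is IHDR, as the PNG format requires.

```python
import struct

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data):
    """Return (width, height) read from the IHDR chunk of PNG bytes.

    Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type
    ("IHDR"), then big-endian 4-byte width and 4-byte height.
    """
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def wrong_size(paths, expected=(320, 240)):
    """Return {path: (width, height)} for files that do not match expected."""
    bad = {}
    for path in paths:
        with open(path, "rb") as fh:
            size = png_size(fh.read(24))  # the first 24 bytes are enough
        if size != expected:
            bad[path] = size
    return bad
```

One possible use would be to call `wrong_size` on the seven files in `esphome/illustrations/` before `esphome run`, and abort the deploy if the returned dictionary is non-empty.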
--- a/homeai-esp32/esphome/.gitignore
+++ b/homeai-esp32/esphome/.gitignore
@@ -0,0 +1,5 @@
+# Gitignore settings for ESPHome
+# This is an example and may include too much for your use-case.
+# You can modify this file to suit your needs.
+/.esphome/
+/secrets.yaml
--- a/homeai-esp32/esphome/homeai-living-room.yaml
+++ b/homeai-esp32/esphome/homeai-living-room.yaml
@@ -0,0 +1,865 @@
+---
+# HomeAI Living Room Satellite — ESP32-S3-BOX-3
+# Based on official ESPHome voice assistant config
+# https://github.com/esphome/wake-word-voice-assistants
+
+substitutions:
+  name: homeai-living-room
+  friendly_name: HomeAI Living Room
+
+  # Face illustrations — compiled into firmware (320x240 PNG)
+  loading_illustration_file: illustrations/loading.png
+  idle_illustration_file: illustrations/idle.png
+  listening_illustration_file: illustrations/listening.png
+  thinking_illustration_file: illustrations/thinking.png
+  replying_illustration_file: illustrations/replying.png
+  error_illustration_file: illustrations/error.png
+  timer_finished_illustration_file: illustrations/timer_finished.png
+
+  # Dark background for all states (matches HomeAI dashboard theme)
+  loading_illustration_background_color: "000000"
+  idle_illustration_background_color: "000000"
+  listening_illustration_background_color: "000000"
+  thinking_illustration_background_color: "000000"
+  replying_illustration_background_color: "000000"
+  error_illustration_background_color: "000000"
+
+  voice_assist_idle_phase_id: "1"
+  voice_assist_listening_phase_id: "2"
+  voice_assist_thinking_phase_id: "3"
+  voice_assist_replying_phase_id: "4"
+  voice_assist_not_ready_phase_id: "10"
+  voice_assist_error_phase_id: "11"
+  voice_assist_muted_phase_id: "12"
+  voice_assist_timer_finished_phase_id: "20"
+
+  font_glyphsets: "GF_Latin_Core"
+  font_family: Figtree
+
+esphome:
+  name: ${name}
+  friendly_name: ${friendly_name}
+  min_version: 2025.5.0
+  name_add_mac_suffix: false
+  on_boot:
+    priority: 600
+    then:
+      - script.execute: draw_display
+      - delay: 30s
+      - if:
+          condition:
+            lambda: return id(init_in_progress);
+          then:
+            - lambda: id(init_in_progress) = false;
+            - script.execute: draw_display
+
+esp32:
+  board: esp32s3box
+  flash_size: 16MB
+  cpu_frequency: 240MHz
+  framework:
+    type: esp-idf
+    sdkconfig_options:
+      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
+      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
+      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
+
+psram:
+  mode: octal
+  speed: 80MHz
+
+wifi:
+  ssid: !secret wifi_ssid
+  password: !secret wifi_password
+  ap:
+    ssid: "HomeAI Fallback"
+  on_connect:
+    - script.execute: draw_display
+  on_disconnect:
+    - script.execute: draw_display
+
+captive_portal:
+
+api:
+  encryption:
+    key: !secret api_key
+  # Prevent device from rebooting if HA connection drops temporarily
+  reboot_timeout: 0s
+  on_client_connected:
+    - script.execute: draw_display
+  on_client_disconnected:
+    # Debounce: wait 5s before showing "HA not found" to avoid flicker on brief drops
+    - delay: 5s
+    - if:
+        condition:
+          not:
+            api.connected:
+        then:
+          - script.execute: draw_display
+
+ota:
+  - platform: esphome
+    id: ota_esphome
+
+logger:
+  hardware_uart: USB_SERIAL_JTAG
+
+button:
+  - platform: factory_reset
+    id: factory_reset_btn
+    internal: true
+
+binary_sensor:
+  - platform: gpio
+    pin:
+      number: GPIO0
+      ignore_strapping_warning: true
+      mode: INPUT_PULLUP
+      inverted: true
+    id: left_top_button
+    internal: true
+    on_multi_click:
+      # Short press: dismiss timer / toggle mute
+      - timing:
+          - ON for at least 50ms
+          - OFF for at least 50ms
+        then:
+          - if:
+              condition:
+                switch.is_on: timer_ringing
+              then:
+                - switch.turn_off: timer_ringing
+              else:
+                - switch.toggle: mute
+      # Long press (10s): factory reset
+      - timing:
+          - ON for at least 10s
+        then:
+          - button.press: factory_reset_btn
+
+# --- Display backlight ---
+
+output:
+  - platform: ledc
+    pin: GPIO47
+    id: backlight_output
+
+light:
+  - platform: monochromatic
+    id: led
+    name: Screen
+    icon: "mdi:television"
+    entity_category: config
+    output: backlight_output
+    restore_mode: RESTORE_DEFAULT_ON
+    default_transition_length: 250ms
+
+# --- Audio hardware ---
+
+i2c:
+  scl: GPIO18
+  sda: GPIO8
+
+i2s_audio:
+  - id: i2s_audio_bus
+    i2s_lrclk_pin:
+      number: GPIO45
+      ignore_strapping_warning: true
+    i2s_bclk_pin: GPIO17
+    i2s_mclk_pin: GPIO2
+
+audio_adc:
+  - platform: es7210
+    id: es7210_adc
+    bits_per_sample: 16bit
+    sample_rate: 16000
+
+audio_dac:
+  - platform: es8311
+    id: es8311_dac
+    bits_per_sample: 16bit
+    sample_rate: 48000
+
+microphone:
+  - platform: i2s_audio
+    id: box_mic
+    sample_rate: 16000
+    i2s_din_pin: GPIO16
+    bits_per_sample: 16bit
+    adc_type: external
+
+speaker:
+  - platform: i2s_audio
+    id: box_speaker
+    i2s_dout_pin: GPIO15
+    dac_type: external
+    sample_rate: 48000
+    bits_per_sample: 16bit
+    channel: left
+    audio_dac: es8311_dac
+    buffer_duration: 100ms
+
+media_player:
+  - platform: speaker
+    name: None
+    id: speaker_media_player
+    volume_min: 0.5
+    volume_max: 0.85
+    announcement_pipeline:
+      speaker: box_speaker
+      format: FLAC
+      sample_rate: 48000
+      num_channels: 1
+    files:
+      - id: timer_finished_sound
+        file: https://github.com/esphome/home-assistant-voice-pe/raw/dev/sounds/timer_finished.flac
+    on_announcement:
+      - if:
+          condition:
+            - microphone.is_capturing:
+          then:
+            - script.execute: stop_wake_word
+            - if:
+                condition:
+                  - lambda: return id(wake_word_engine_location).current_option() == "In Home Assistant";
+                then:
+                  - wait_until:
+                      - not:
+                          voice_assistant.is_running:
+      - if:
+          condition:
+            not:
+              voice_assistant.is_running:
+          then:
+            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
+            - script.execute: draw_display
+    on_idle:
+      - if:
+          condition:
+            not:
+              voice_assistant.is_running:
+          then:
+            - script.execute: start_wake_word
+            - script.execute: set_idle_or_mute_phase
+            - script.execute: draw_display
+
+# --- Wake word (on-device) ---
+
+micro_wake_word:
+  id: mww
+  models:
+    - hey_jarvis
+  on_wake_word_detected:
+    - voice_assistant.start:
+        wake_word: !lambda return wake_word;
+
+# --- Voice assistant ---
+
+voice_assistant:
+  id: va
+  microphone: box_mic
+  media_player: speaker_media_player
+  micro_wake_word: mww
+  noise_suppression_level: 2
+  auto_gain: 31dBFS
+  volume_multiplier: 2.0
+  on_listening:
+    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
+    - text_sensor.template.publish:
+        id: text_request
+        state: "..."
+    - text_sensor.template.publish:
+        id: text_response
+        state: "..."
+    - script.execute: draw_display
+  on_stt_vad_end:
+    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
+    - script.execute: draw_display
+  on_stt_end:
+    - text_sensor.template.publish:
+        id: text_request
+        state: !lambda return x;
+    - script.execute: draw_display
+  on_tts_start:
+    - text_sensor.template.publish:
+        id: text_response
+        state: !lambda return x;
+    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
+    - script.execute: draw_display
+  on_end:
+    - wait_until:
+        condition:
+          - media_player.is_announcing:
+        timeout: 0.5s
+    - wait_until:
+        - and:
+            - not:
+                media_player.is_announcing:
+            - not:
+                speaker.is_playing:
+    - if:
+        condition:
+          - lambda: return id(wake_word_engine_location).current_option() == "On device";
+        then:
+          - lambda: id(va).set_use_wake_word(false);
+          - micro_wake_word.start:
+    - script.execute: set_idle_or_mute_phase
+    - script.execute: draw_display
+    - text_sensor.template.publish:
+        id: text_request
+        state: ""
+    - text_sensor.template.publish:
+        id: text_response
+        state: ""
+  on_error:
+    - if:
+        condition:
+          lambda: return !id(init_in_progress);
+        then:
+          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
+          - script.execute: draw_display
+          - delay: 1s
+          - if:
+              condition:
+                switch.is_off: mute
+              then:
+                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
+              else:
+                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
+          - script.execute: draw_display
+  on_client_connected:
+    - lambda: id(init_in_progress) = false;
+    - script.execute: start_wake_word
+    - script.execute: set_idle_or_mute_phase
+    - script.execute: draw_display
+  on_client_disconnected:
+    - script.execute: stop_wake_word
+    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
+    - script.execute: draw_display
+  on_timer_started:
+    - script.execute: draw_display
+  on_timer_cancelled:
+    - script.execute: draw_display
+  on_timer_updated:
+    - script.execute: draw_display
+  on_timer_tick:
+    - script.execute: draw_display
+  on_timer_finished:
+    - switch.turn_on: timer_ringing
+    - wait_until:
+        media_player.is_announcing:
+    - lambda: id(voice_assistant_phase) = ${voice_assist_timer_finished_phase_id};
+    - script.execute: draw_display
+
+# --- Scripts ---
+
+script:
+  - id: draw_display
+    then:
+      - if:
+          condition:
+            lambda: return !id(init_in_progress);
+          then:
+            - if:
+                condition:
+                  wifi.connected:
+                then:
+                  - if:
+                      condition:
+                        api.connected:
+                      then:
+                        - lambda: |
+                            switch(id(voice_assistant_phase)) {
+                              case ${voice_assist_listening_phase_id}:
+                                id(s3_box_lcd).show_page(listening_page);
+                                id(s3_box_lcd).update();
+                                break;
+                              case ${voice_assist_thinking_phase_id}:
+                                id(s3_box_lcd).show_page(thinking_page);
+                                id(s3_box_lcd).update();
+                                break;
+                              case ${voice_assist_replying_phase_id}:
+                                id(s3_box_lcd).show_page(replying_page);
+                                id(s3_box_lcd).update();
+                                break;
+                              case ${voice_assist_error_phase_id}:
+                                id(s3_box_lcd).show_page(error_page);
+                                id(s3_box_lcd).update();
+                                break;
+                              case ${voice_assist_muted_phase_id}:
+                                id(s3_box_lcd).show_page(muted_page);
+                                id(s3_box_lcd).update();
+                                break;
+                              case ${voice_assist_not_ready_phase_id}:
+                                id(s3_box_lcd).show_page(no_ha_page);
+                                id(s3_box_lcd).update();
+                                break;
+                              case ${voice_assist_timer_finished_phase_id}:
+                                id(s3_box_lcd).show_page(timer_finished_page);
+                                id(s3_box_lcd).update();
+                                break;
+                              default:
+                                id(s3_box_lcd).show_page(idle_page);
+                                id(s3_box_lcd).update();
+                            }
+                      else:
+                        - display.page.show: no_ha_page
+                        - component.update: s3_box_lcd
+                else:
+                  - display.page.show: no_wifi_page
+                  - component.update: s3_box_lcd
+          else:
+            - display.page.show: initializing_page
+            - component.update: s3_box_lcd
+
+  - id: fetch_first_active_timer
+    then:
+      - lambda: |
+          const auto &timers = id(va).get_timers();
+          auto output_timer = timers.begin()->second;
+          for (const auto &timer : timers) {
+            if (timer.second.is_active && timer.second.seconds_left <= output_timer.seconds_left) {
+              output_timer = timer.second;
+            }
+          }
+          id(global_first_active_timer) = output_timer;
+
+  - id: check_if_timers_active
+    then:
+      - lambda: |
+          const auto &timers = id(va).get_timers();
+          bool output = false;
+          for (const auto &timer : timers) {
+            if (timer.second.is_active) { output = true; }
+          }
+          id(global_is_timer_active) = output;
+
+  - id: fetch_first_timer
+    then:
+      - lambda: |
+          const auto &timers = id(va).get_timers();
+          auto output_timer = timers.begin()->second;
+          for (const auto &timer : timers) {
+            if (timer.second.seconds_left <= output_timer.seconds_left) {
+              output_timer = timer.second;
+            }
+          }
+          id(global_first_timer) = output_timer;
+
+  - id: check_if_timers
+    then:
+      - lambda: |
+          const auto &timers = id(va).get_timers();
+          bool output = false;
+          if (!timers.empty()) {
+            output = true;
+          }
+          id(global_is_timer) = output;
+
+  - id: draw_timer_timeline
+    then:
+      - lambda: |
+          id(check_if_timers_active).execute();
+          id(check_if_timers).execute();
+          if (id(global_is_timer_active)){
+            id(fetch_first_active_timer).execute();
+            int active_pixels = round( 320 * id(global_first_active_timer).seconds_left / max(id(global_first_active_timer).total_seconds, static_cast<uint32_t>(1)) );
+            if (active_pixels > 0){
+              id(s3_box_lcd).filled_rectangle(0, 225, 320, 15, Color::WHITE);
+              id(s3_box_lcd).filled_rectangle(0, 226, active_pixels, 13, id(active_timer_color));
+            }
+          } else if (id(global_is_timer)){
+            id(fetch_first_timer).execute();
+            int active_pixels = round( 320 * id(global_first_timer).seconds_left / max(id(global_first_timer).total_seconds, static_cast<uint32_t>(1)));
+            if (active_pixels > 0){
+              id(s3_box_lcd).filled_rectangle(0, 225, 320, 15, Color::WHITE);
+              id(s3_box_lcd).filled_rectangle(0, 226, active_pixels, 13, id(paused_timer_color));
+            }
+          }
+
+  - id: draw_active_timer_widget
+    then:
+      - lambda: |
+          id(check_if_timers_active).execute();
+          if (id(global_is_timer_active)){
+            id(s3_box_lcd).filled_rectangle(80, 40, 160, 50, Color::WHITE);
+            id(s3_box_lcd).rectangle(80, 40, 160, 50, Color::BLACK);
+            id(fetch_first_active_timer).execute();
+            int hours_left = floor(id(global_first_active_timer).seconds_left / 3600);
+            int minutes_left = floor((id(global_first_active_timer).seconds_left - hours_left * 3600) / 60);
+            int seconds_left = id(global_first_active_timer).seconds_left - hours_left * 3600 - minutes_left * 60;
+            auto display_hours = (hours_left < 10 ? "0" : "") + std::to_string(hours_left);
+            auto display_minute = (minutes_left < 10 ? "0" : "") + std::to_string(minutes_left);
+            auto display_seconds = (seconds_left < 10 ? "0" : "") + std::to_string(seconds_left);
+            std::string display_string = "";
+            if (hours_left > 0) {
+              display_string = display_hours + ":" + display_minute;
+            } else {
+              display_string = display_minute + ":" + display_seconds;
+            }
+            id(s3_box_lcd).printf(120, 47, id(font_timer), Color::BLACK, "%s", display_string.c_str());
+          }
+
+  - id: start_wake_word
+    then:
+      - if:
+          condition:
+            and:
+              - not:
+                  - voice_assistant.is_running:
+              - lambda: return id(wake_word_engine_location).current_option() == "On device";
+          then:
+            - lambda: id(va).set_use_wake_word(false);
+            - micro_wake_word.start:
+      - if:
+          condition:
+            and:
+              - not:
+                  - voice_assistant.is_running:
+              - lambda: return id(wake_word_engine_location).current_option() == "In Home Assistant";
+          then:
+            - lambda: id(va).set_use_wake_word(true);
+            - voice_assistant.start_continuous:
+
+  - id: stop_wake_word
+    then:
+      - if:
+          condition:
+            lambda: return id(wake_word_engine_location).current_option() == "In Home Assistant";
+          then:
+            - lambda: id(va).set_use_wake_word(false);
+            - voice_assistant.stop:
+      - if:
+          condition:
+            lambda: return id(wake_word_engine_location).current_option() == "On device";
+          then:
+            - micro_wake_word.stop:
+
+  - id: set_idle_or_mute_phase
+    then:
+      - if:
+          condition:
+            switch.is_off: mute
+          then:
+            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
+          else:
+            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
+
+# --- Switches ---
+
+switch:
+  - platform: gpio
+    name: Speaker Enable
+    pin:
+      number: GPIO46
+      ignore_strapping_warning: true
+    restore_mode: RESTORE_DEFAULT_ON
+    entity_category: config
+    disabled_by_default: true
+  - platform: template
+    name: Mute
+    id: mute
+    icon: "mdi:microphone-off"
+    optimistic: true
+    restore_mode: RESTORE_DEFAULT_OFF
+    entity_category: config
+    on_turn_off:
+      - microphone.unmute:
+      - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
+      - script.execute: draw_display
+    on_turn_on:
+      - microphone.mute:
+      - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
+      - script.execute: draw_display
+  - platform: template
+    id: timer_ringing
+    optimistic: true
+    internal: true
+    restore_mode: ALWAYS_OFF
+    on_turn_off:
+      - lambda: |-
+          id(speaker_media_player)
+            ->make_call()
+            .set_command(media_player::MediaPlayerCommand::MEDIA_PLAYER_COMMAND_REPEAT_OFF)
+            .set_announcement(true)
+            .perform();
+          id(speaker_media_player)->set_playlist_delay_ms(speaker::AudioPipelineType::ANNOUNCEMENT, 0);
+      - media_player.stop:
+          announcement: true
+    on_turn_on:
+      - lambda: |-
+          id(speaker_media_player)
+            ->make_call()
+            .set_command(media_player::MediaPlayerCommand::MEDIA_PLAYER_COMMAND_REPEAT_ONE)
+            .set_announcement(true)
+            .perform();
+          id(speaker_media_player)->set_playlist_delay_ms(speaker::AudioPipelineType::ANNOUNCEMENT, 1000);
+      - media_player.speaker.play_on_device_media_file:
+          media_file: timer_finished_sound
+          announcement: true
+      - delay: 15min
+      - switch.turn_off: timer_ringing
+
+# --- Wake word engine location selector ---
+
+select:
+  - platform: template
+    entity_category: config
+    name: Wake word engine location
+    id: wake_word_engine_location
+    icon: "mdi:account-voice"
+    optimistic: true
+    restore_value: true
+    options:
+      - In Home Assistant
+      - On device
+    initial_option: On device
+    on_value:
+      - if:
+          condition:
+            lambda: return !id(init_in_progress);
+          then:
+            - wait_until:
+                lambda: return id(voice_assistant_phase) == ${voice_assist_muted_phase_id} || id(voice_assistant_phase) == ${voice_assist_idle_phase_id};
+            - if:
+                condition:
+                  lambda: return x == "In Home Assistant";
+                then:
+                  - micro_wake_word.stop
+                  - delay: 500ms
+                  - if:
+                      condition:
+                        switch.is_off: mute
+                      then:
+                        - lambda: id(va).set_use_wake_word(true);
+                        - voice_assistant.start_continuous:
+            - if:
+                condition:
+                  lambda: return x == "On device";
+                then:
+                  - lambda: id(va).set_use_wake_word(false);
+                  - voice_assistant.stop
+                  - delay: 500ms
+                  - if:
+                      condition:
+                        switch.is_off: mute
+                      then:
+                        - micro_wake_word.start
+
+# --- Global variables ---
+
+globals:
+  - id: init_in_progress
+    type: bool
+    restore_value: false
+    initial_value: "true"
+  - id: voice_assistant_phase
+    type: int
+    restore_value: false
+    initial_value: ${voice_assist_not_ready_phase_id}
+  - id: global_first_active_timer
+    type: voice_assistant::Timer
+    restore_value: false
+  - id: global_is_timer_active
+    type: bool
+    restore_value: false
+  - id: global_first_timer
+    type: voice_assistant::Timer
+    restore_value: false
+  - id: global_is_timer
+    type: bool
+    restore_value: false
+
+# --- Display images ---
+
+image:
+  - file: ${error_illustration_file}
+    id: casita_error
+    resize: 320x240
+    type: RGB
+    transparency: alpha_channel
+  - file: ${idle_illustration_file}
+    id: casita_idle
+    resize: 320x240
+    type: RGB
+    transparency: alpha_channel
+  - file: ${listening_illustration_file}
+    id: casita_listening
+    resize: 320x240
+    type: RGB
+    transparency: alpha_channel
+  - file: ${thinking_illustration_file}
+    id: casita_thinking
+    resize: 320x240
+    type: RGB
+    transparency: alpha_channel
+  - file: ${replying_illustration_file}
+    id: casita_replying
+    resize: 320x240
+    type: RGB
+    transparency: alpha_channel
+  - file: ${timer_finished_illustration_file}
+    id: casita_timer_finished
+    resize: 320x240
+    type: RGB
+    transparency: alpha_channel
+  - file: ${loading_illustration_file}
+    id: casita_initializing
+    resize: 320x240
+    type: RGB
+    transparency: alpha_channel
+  - file: https://github.com/esphome/wake-word-voice-assistants/raw/main/error_box_illustrations/error-no-wifi.png
+    id: error_no_wifi
+    resize: 320x240
+    type: RGB
+    transparency: alpha_channel
+  - file: https://github.com/esphome/wake-word-voice-assistants/raw/main/error_box_illustrations/error-no-ha.png
+    id: error_no_ha
+    resize: 320x240
+    type: RGB
+    transparency: alpha_channel
+
+# --- Fonts ---
+
+font:
+  - file:
+      type: gfonts
+      family: ${font_family}
+      weight: 300
+      italic: true
+    id: font_request
+    size: 15
+    glyphsets:
+      - ${font_glyphsets}
+  - file:
+      type: gfonts
+      family: ${font_family}
+      weight: 300
+    id: font_response
+    size: 15
+    glyphsets:
+      - ${font_glyphsets}
+  - file:
+      type: gfonts
+      family: ${font_family}
+      weight: 300
+    id: font_timer
+    size: 30
+    glyphsets:
+      - ${font_glyphsets}
+
+# --- Text sensors (request/response display) ---
+
+text_sensor:
+  - id: text_request
+    platform: template
+    on_value:
+      lambda: |-
+        if(id(text_request).state.length()>32) {
+          std::string name = id(text_request).state.c_str();
+          std::string truncated = esphome::str_truncate(name.c_str(),31);
+          id(text_request).state = (truncated+"...").c_str();
+        }
+  - id: text_response
+    platform: template
+    on_value:
+      lambda: |-
+        if(id(text_response).state.length()>32) {
+          std::string name = id(text_response).state.c_str();
+          std::string truncated = esphome::str_truncate(name.c_str(),31);
+          id(text_response).state = (truncated+"...").c_str();
+        }
+
+# --- Colors ---
+
+color:
+  - id: idle_color
+    hex: ${idle_illustration_background_color}
+  - id: listening_color
+    hex: ${listening_illustration_background_color}
+  - id: thinking_color
+    hex: ${thinking_illustration_background_color}
+  - id: replying_color
+    hex: ${replying_illustration_background_color}
+  - id: loading_color
+    hex: ${loading_illustration_background_color}
+  - id: error_color
+    hex: ${error_illustration_background_color}
+  - id: active_timer_color
+    hex: "26ed3a"
+  - id: paused_timer_color
+    hex: "3b89e3"
+
+# --- SPI + Display ---
+
+spi:
+  - id: spi_bus
+    clk_pin: 7
+    mosi_pin: 6
+
+display:
+  - platform: ili9xxx
+    id: s3_box_lcd
+    model: S3BOX
+    invert_colors: false
+    data_rate: 40MHz
+    cs_pin: 5
+    dc_pin: 4
+    reset_pin:
+      number: 48
+      inverted: true
+    update_interval: never
+    pages:
+      - id: idle_page
+        lambda: |-
+          it.fill(id(idle_color));
+          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_idle), ImageAlign::CENTER);
+          id(draw_timer_timeline).execute();
+          id(draw_active_timer_widget).execute();
+      - id: listening_page
+        lambda: |-
+          it.fill(id(listening_color));
+          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_listening), ImageAlign::CENTER);
+          id(draw_timer_timeline).execute();
+      - id: thinking_page
+        lambda: |-
+          it.fill(id(thinking_color));
+          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_thinking), ImageAlign::CENTER);
+          it.filled_rectangle(20, 20, 280, 30, Color::WHITE);
+          it.rectangle(20, 20, 280, 30, Color::BLACK);
+          it.printf(30, 25, id(font_request), Color::BLACK, "%s", id(text_request).state.c_str());
+          id(draw_timer_timeline).execute();
+      - id: replying_page
+        lambda: |-
+          it.fill(id(replying_color));
+          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_replying), ImageAlign::CENTER);
+          it.filled_rectangle(20, 20, 280, 30, Color::WHITE);
+          it.rectangle(20, 20, 280, 30, Color::BLACK);
+          it.filled_rectangle(20, 190, 280, 30, Color::WHITE);
+          it.rectangle(20, 190, 280, 30, Color::BLACK);
+          it.printf(30, 25, id(font_request), Color::BLACK, "%s", id(text_request).state.c_str());
+          it.printf(30, 195, id(font_response), Color::BLACK, "%s", id(text_response).state.c_str());
+          id(draw_timer_timeline).execute();
+      - id: timer_finished_page
+        lambda: |-
+          it.fill(id(idle_color));
+          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_timer_finished), ImageAlign::CENTER);
+      - id: error_page
+        lambda: |-
+          it.fill(id(error_color));
+          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_error), ImageAlign::CENTER);
+      - id: no_ha_page
+        lambda: |-
+          it.image((it.get_width() / 2), (it.get_height() / 2), id(error_no_ha), ImageAlign::CENTER);
+      - id: no_wifi_page
+        lambda: |-
+          it.image((it.get_width() / 2), (it.get_height() / 2), id(error_no_wifi), ImageAlign::CENTER);
+      - id: initializing_page
+        lambda: |-
+          it.fill(id(loading_color));
+          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_initializing), ImageAlign::CENTER);
+      - id: muted_page
+        lambda: |-
+          it.fill(Color::BLACK);
+          id(draw_timer_timeline).execute();
+          id(draw_active_timer_widget).execute();
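
The timer lambdas in the config above (`draw_active_timer_widget` and `draw_timer_timeline`) do two small calculations inline: they format the remaining time as a label, and they scale the remaining time to a bar width. The following is a minimal standalone sketch of that arithmetic, for reasoning about it outside the firmware. The function names `two_digits`, `format_timer_label`, and `timeline_pixels` are hypothetical and are not part of ESPHome or this config. The 320-pixel default is taken from the display width used in the lambdas.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Hypothetical helper: zero-pad a value to at least two digits.
static std::string two_digits(uint32_t value) {
  return (value < 10 ? "0" : "") + std::to_string(value);
}

// Mirrors the widget label: HH:MM while at least one full hour remains,
// otherwise MM:SS.
std::string format_timer_label(uint32_t seconds_left) {
  const uint32_t hours = seconds_left / 3600;
  const uint32_t minutes = (seconds_left % 3600) / 60;
  const uint32_t seconds = seconds_left % 60;
  if (hours > 0) {
    return two_digits(hours) + ":" + two_digits(minutes);
  }
  return two_digits(minutes) + ":" + two_digits(seconds);
}

// Mirrors the timeline bar: remaining fraction of the timer scaled to the
// display width. total_seconds is clamped to 1 to avoid division by zero,
// and integer division truncates, as in the config's lambda.
int timeline_pixels(uint32_t seconds_left, uint32_t total_seconds, int width = 320) {
  const uint32_t total = std::max(total_seconds, static_cast<uint32_t>(1));
  return static_cast<int>(static_cast<uint64_t>(width) * seconds_left / total);
}
```

Because the division operates on integers, the `round()` call in the config has no effect on the result; the sketch makes that truncation explicit.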
--- a/homeai-esp32/esphome/illustrations/error.png
+++ b/homeai-esp32/esphome/illustrations/error.png
--- a/homeai-esp32/esphome/illustrations/idle.png
+++ b/homeai-esp32/esphome/illustrations/idle.png
--- a/homeai-esp32/esphome/illustrations/listening.png
+++ b/homeai-esp32/esphome/illustrations/listening.png
--- a/homeai-esp32/esphome/illustrations/loading.png
+++ b/homeai-esp32/esphome/illustrations/loading.png
--- a/homeai-esp32/esphome/illustrations/replying.png
+++ b/homeai-esp32/esphome/illustrations/replying.png
--- a/homeai-esp32/esphome/illustrations/thinking.png
+++ b/homeai-esp32/esphome/illustrations/thinking.png
--- a/homeai-esp32/esphome/illustrations/timer_finished.png
+++ b/homeai-esp32/esphome/illustrations/timer_finished.png
--- a/homeai-esp32/setup.sh
+++ b/homeai-esp32/setup.sh
@@ -1,76 +1,177 @@
 #!/usr/bin/env bash
 # homeai-esp32/setup.sh — P6: ESPHome firmware for ESP32-S3-BOX-3
 #
-# Components:
-#   - ESPHome            — firmware build + flash tool
-#   - base.yaml          — shared device config
-#   - voice.yaml         — Wyoming Satellite + microWakeWord
-#   - display.yaml       — LVGL animated face
-#   - Per-room configs   — s3-box-living-room.yaml, etc.
+# Usage:
+#   ./setup.sh              — check environment + print usage
+#   ./setup.sh flash        — compile + flash via USB (first time)
+#   ./setup.sh ota          — compile + flash via OTA (wireless)
+#   ./setup.sh logs         — stream device logs
+#   ./setup.sh validate     — validate YAML without compiling
 #
 # Prerequisites:
-#   - P1 (homeai-infra) — Home Assistant running
-#   - P3 (homeai-voice) — Wyoming STT/TTS running (ports 10300/10301)
-#   - Python 3.10+
-#   - USB-C cable for first flash (subsequent updates via OTA)
-#   - On Linux: ensure user is in the dialout group for USB access
+#   - ~/homeai-esphome-env  — Python 3.12 venv with ESPHome
+#   - Home Assistant running on 10.0.0.199
+#   - Wyoming STT/TTS running on Mac Mini (ports 10300/10301)

 set -euo pipefail

 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 REPO_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)"
-source "${REPO_DIR}/scripts/common.sh"
+ESPHOME_VENV="${HOME}/homeai-esphome-env"
+ESPHOME="${ESPHOME_VENV}/bin/esphome"
+ESPHOME_DIR="${SCRIPT_DIR}/esphome"
+DEFAULT_CONFIG="${ESPHOME_DIR}/homeai-living-room.yaml"

-log_section "P6: ESP32 Firmware (ESPHome)"
-detect_platform
+# Colors
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+BLUE='\033[0;34m'
+NC='\033[0m'

-# ─── Prerequisite check ────────────────────────────────────────────────────────
-log_info "Checking prerequisites..."
+log_info()  { echo -e "${BLUE}[INFO]${NC} $*"; }
+log_ok()    { echo -e "${GREEN}[OK]${NC} $*"; }
+log_warn()  { echo -e "${YELLOW}[WARN]${NC} $*"; }
+log_error() { echo -e "${RED}[ERROR]${NC} $*"; }

-if ! command_exists python3; then
-  log_warn "python3 not found — required for ESPHome"
+# ─── Environment checks ──────────────────────────────────────────────────────
+
+check_env() {
+  local ok=true
+
+  log_info "Checking environment..."
+
+  # ESPHome venv
+  if [[ -x "${ESPHOME}" ]]; then
+    local version
+    version=$("${ESPHOME}" version 2>/dev/null || echo "unknown")
+    log_ok "ESPHome: ${version}"
+  else
+    log_error "ESPHome not found at ${ESPHOME}"
+    echo "  Install: /opt/homebrew/opt/python@3.12/bin/python3.12 -m venv ${ESPHOME_VENV}"
+    echo "           ${ESPHOME_VENV}/bin/pip install 'esphome>=2025.5.0'"
+    ok=false
  fi

-if ! command_exists esphome; then
-  log_info "ESPHome not installed. To install: pip install esphome"
+  # secrets.yaml
+  if [[ -f "${ESPHOME_DIR}/secrets.yaml" ]]; then
+    if grep -q "YOUR_" "${ESPHOME_DIR}/secrets.yaml" 2>/dev/null; then
+      log_warn "secrets.yaml contains placeholder values — edit before flashing"
+      ok=false
+    else
+      log_ok "secrets.yaml configured"
+    fi
+  else
+    log_error "secrets.yaml not found at ${ESPHOME_DIR}/secrets.yaml"
+    ok=false
  fi

-if [[ "$OS_TYPE" == "linux" ]]; then
-  if ! groups "$USER" | grep -q dialout; then
-    log_warn "User '$USER' not in 'dialout' group — USB flashing may fail."
-    log_warn "Fix: sudo usermod -aG dialout $USER  (then log out and back in)"
-  fi
+  # Config file
+  if [[ -f "${DEFAULT_CONFIG}" ]]; then
+    log_ok "Config: $(basename "${DEFAULT_CONFIG}")"
+  else
+    log_error "Config not found: ${DEFAULT_CONFIG}"
+    ok=false
  fi

-# Check P3 dependency
-if ! curl -sf http://localhost:8123 -o /dev/null 2>/dev/null; then
-  log_warn "Home Assistant (P1) not reachable — ESP32 units won't auto-discover"
+  # Illustrations
+  local illust_dir="${ESPHOME_DIR}/illustrations"
+  local illust_count
+  illust_count=$(find "${illust_dir}" -name "*.png" 2>/dev/null | wc -l | tr -d ' ')
+  if [[ "${illust_count}" -ge 7 ]]; then
+    log_ok "Illustrations: ${illust_count} PNGs in illustrations/"
+  else
+    log_warn "Missing illustrations (found ${illust_count}, need 7)"
  fi

-# ─── TODO: Implementation ──────────────────────────────────────────────────────
-cat <<'EOF'
+  # Wyoming services on Mac Mini
+  if nc -z localhost 10300 2>/dev/null; then
+    log_ok "Wyoming STT (port 10300) reachable"
+  else
+    log_warn "Wyoming STT (port 10300) not reachable"
+  fi

-  ┌─────────────────────────────────────────────────────────────────┐
-  │  P6: homeai-esp32 — NOT YET IMPLEMENTED                        │
-  │                                                                  │
-  │  Implementation steps:                                           │
-  │  1. pip install esphome                                         │
-  │  2. Create esphome/secrets.yaml (gitignored)                   │
-  │  3. Create esphome/base.yaml (WiFi, API, OTA)                  │
-  │  4. Create esphome/voice.yaml (Wyoming Satellite, wakeword)    │
-  │  5. Create esphome/display.yaml (LVGL face, 5 states)          │
-  │  6. Create esphome/animations.yaml (face state scripts)        │
-  │  7. Create per-room configs (s3-box-living-room.yaml, etc.)    │
-  │  8. First flash via USB: esphome run esphome/<room>.yaml       │
-  │  9. Subsequent OTA: esphome upload esphome/<room>.yaml         │
-  │  10. Add to Home Assistant → assign Wyoming voice pipeline     │
-  │                                                                  │
-  │  Quick flash (once esphome/ is ready):                          │
-  │    esphome run esphome/s3-box-living-room.yaml                 │
-  │    esphome logs esphome/s3-box-living-room.yaml                │
-  └─────────────────────────────────────────────────────────────────┘
+  if nc -z localhost 10301 2>/dev/null; then
+    log_ok "Wyoming TTS (port 10301) reachable"
+  else
+    log_warn "Wyoming TTS (port 10301) not reachable"
+  fi

-EOF
+  # Home Assistant
+  if curl -sk "https://10.0.0.199:8123" -o /dev/null 2>/dev/null; then
+    log_ok "Home Assistant (10.0.0.199:8123) reachable"
+  else
+    log_warn "Home Assistant not reachable — ESP32 won't be able to connect"
+  fi

-log_info "P6 is not yet implemented. See homeai-esp32/PLAN.md for details."
-exit 0
+  if $ok; then
+    log_ok "Environment ready"
+  else
+    log_warn "Some issues found — fix before flashing"
+  fi
+}
+
+# ─── Commands ─────────────────────────────────────────────────────────────────
+
+cmd_flash() {
+  local config="${1:-${DEFAULT_CONFIG}}"
+  log_info "Compiling + flashing via USB: $(basename "${config}")"
+  log_info "First compile downloads ESP-IDF toolchain (~500MB), takes 5-10 min..."
+  cd "${ESPHOME_DIR}"
+  "${ESPHOME}" run "$(basename "${config}")"
+}
+
+cmd_ota() {
+  local config="${1:-${DEFAULT_CONFIG}}"
+  log_info "Compiling + OTA upload: $(basename "${config}")"
+  cd "${ESPHOME_DIR}"
+  "${ESPHOME}" run "$(basename "${config}")" --device OTA
+}
+
+cmd_logs() {
+  local config="${1:-${DEFAULT_CONFIG}}"
+  log_info "Streaming logs for: $(basename "${config}")"
+  cd "${ESPHOME_DIR}"
+  "${ESPHOME}" logs "$(basename "${config}")"
+}
+
+cmd_validate() {
+  local config="${1:-${DEFAULT_CONFIG}}"
+  log_info "Validating: $(basename "${config}")"
+  cd "${ESPHOME_DIR}"
+  "${ESPHOME}" config "$(basename "${config}")"
+  log_ok "Config valid"
+}
+
+# ─── Main ─────────────────────────────────────────────────────────────────────
+
+case "${1:-}" in
+  flash)
+    check_env
+    echo ""
+    cmd_flash "${2:-}"
+    ;;
+  ota)
+    check_env
+    echo ""
+    cmd_ota "${2:-}"
+    ;;
+  logs)
+    cmd_logs "${2:-}"
+    ;;
+  validate)
+    cmd_validate "${2:-}"
+    ;;
+  *)
+    check_env
+    echo ""
+    echo "Usage: $0 {flash|ota|logs|validate} [config.yaml]"
+    echo ""
+    echo "  flash     Compile + flash via USB (first time)"
+    echo "  ota       Compile + flash via OTA (wireless, after first flash)"
+    echo "  logs      Stream device logs"
+    echo "  validate  Validate YAML config without compiling"
+    echo ""
+    echo "Default config: $(basename "${DEFAULT_CONFIG}")"
+    ;;
+esac