Merge branch 'esp32': ESP32-S3-BOX-3 room satellite with voice pipeline

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Aodhan Collins
2026-03-13 20:48:09 +00:00
13 changed files with 1410 additions and 341 deletions

24
TODO.md

@@ -108,17 +108,19 @@
### P6 · homeai-esp32
- [ ] Install ESPHome: `pip install esphome`
- [ ] Write `esphome/secrets.yaml` (gitignored)
- [ ] Write `base.yaml`, `voice.yaml`, `display.yaml`, `animations.yaml`
- [ ] Write `s3-box-living-room.yaml` for first unit
- [ ] Flash first unit via USB
- [ ] Verify unit appears in HA device list
- [ ] Assign Wyoming voice pipeline to unit in HA
- [ ] Test full wake → STT → LLM → TTS → audio playback cycle
- [ ] Test LVGL face: idle → listening → thinking → speaking → error
- [ ] Verify OTA firmware update works wirelessly
- [ ] Flash remaining units (bedroom, kitchen, etc.)
- [x] Install ESPHome in `~/homeai-esphome-env` (Python 3.12 venv)
- [x] Write `esphome/secrets.yaml` (gitignored)
- [x] Write `homeai-living-room.yaml` (based on official S3-BOX-3 reference config)
- [x] Generate placeholder face illustrations (7 PNGs, 320×240)
- [x] Write `setup.sh` with flash/ota/logs/validate commands
- [x] Write `deploy.sh` with OTA deploy, image management, multi-unit support
- [x] Flash first unit via USB (living room)
- [x] Verify unit appears in HA device list (requires HA 2026.x for ESPHome 2025.12+ compat)
- [x] Assign Wyoming voice pipeline to unit in HA
- [x] Test full wake → STT → LLM → TTS → audio playback cycle
- [x] Test display states: idle → listening → thinking → replying → error
- [x] Verify OTA firmware update works wirelessly (`deploy.sh`, which passes `--device OTA` to `esphome run`)
- [ ] Flash remaining units (bedroom, kitchen)
- [ ] Document MAC address → room name mapping
---


@@ -6,7 +6,7 @@
## Goal
Flash ESP32-S3-BOX-3 units with ESPHome. Each unit acts as a dumb room satellite: always-on mic, local wake word detection, audio playback, and an LVGL animated face showing assistant state. All intelligence stays on the Mac Mini.
Flash ESP32-S3-BOX-3 units with ESPHome. Each unit acts as a dumb room satellite: always-on mic, on-device wake word detection, audio playback, and a display showing assistant state via static PNG face illustrations. All intelligence stays on the Mac Mini.
---
@@ -17,11 +17,12 @@ Flash ESP32-S3-BOX-3 units with ESPHome. Each unit acts as a dumb room satellite
| SoC | ESP32-S3 (dual-core Xtensa, 240MHz) |
| RAM | 512KB SRAM + 16MB PSRAM |
| Flash | 16MB |
| Display | 2.4" IPS LCD, 320×240, touchscreen |
| Mic | Dual microphone array |
| Speaker | Built-in 1W speaker |
| Connectivity | WiFi 802.11b/g/n, BT 5.0 |
| USB | USB-C (programming + power) |
| Display | 2.4" IPS LCD, 320×240, touchscreen (ILI9xxx, model S3BOX) |
| Audio ADC | ES7210 (dual mic array, 16kHz 16-bit) |
| Audio DAC | ES8311 (speaker output, 48kHz 16-bit) |
| Speaker | Built-in 1W |
| Connectivity | WiFi 802.11b/g/n (2.4GHz only), BT 5.0 |
| USB | USB-C (programming + power, native USB JTAG serial) |
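
The sample rates in the table translate directly into raw PCM data rates. The following is a rough sketch of that arithmetic, assuming mono streams (as stated in the Audio Stack section) and the 100 ms speaker `buffer_duration` from `homeai-living-room.yaml`; the function name is illustrative only.

```python
def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Raw (uncompressed) PCM data rate in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


# Microphone path: ES7210, 16 kHz, 16-bit, mono
mic_rate = pcm_bytes_per_second(16_000, 16)

# Speaker path: ES8311, 48 kHz, 16-bit, mono
speaker_rate = pcm_bytes_per_second(48_000, 16)

# Bytes needed to hold 100 ms of speaker audio
buffer_bytes = speaker_rate * 100 // 1000

print(f"mic: {mic_rate} B/s, speaker: {speaker_rate} B/s, 100 ms buffer: {buffer_bytes} B")
```

These figures describe uncompressed audio on the I2S side only; what actually crosses the network depends on the encoding ESPHome and Home Assistant negotiate.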
---
@@ -29,273 +30,86 @@ Flash ESP32-S3-BOX-3 units with ESPHome. Each unit acts as a dumb room satellite
```
ESP32-S3-BOX-3
├── microWakeWord (on-device, always listening)
│ └── triggers Wyoming Satellite on wake detection
├── Wyoming Satellite
│ ├── streams mic audio → Mac Mini Wyoming STT (port 10300)
│ └── receives TTS audio ← Mac Mini Wyoming TTS (port 10301)
├── LVGL Display
│ └── animated face, driven by HA entity state
├── micro_wake_word (on-device, always listening)
│ └── "hey_jarvis" — triggers voice_assistant on wake detection
├── voice_assistant (ESPHome component)
│ ├── connects to Home Assistant via ESPHome API
│ ├── HA routes audio → Mac Mini Wyoming STT (10.0.0.101:10300)
│ ├── HA routes text → OpenClaw conversation agent (10.0.0.101:8081)
│ └── HA routes response → Mac Mini Wyoming TTS (10.0.0.101:10301)
├── Display (ili9xxx, model S3BOX, 320×240)
│ └── static PNG faces per state (idle, listening, thinking, replying, error)
└── ESPHome OTA
└── firmware updates over WiFi
```
---
## Pin Map (ESP32-S3-BOX-3)
| Function | Pin(s) | Notes |
|---|---|---|
| I2S LRCLK | GPIO45 | strapping pin — warning ignored |
| I2S BCLK | GPIO17 | |
| I2S MCLK | GPIO2 | |
| I2S DIN (mic) | GPIO16 | ES7210 ADC input |
| I2S DOUT (speaker) | GPIO15 | ES8311 DAC output |
| Speaker enable | GPIO46 | strapping pin — warning ignored |
| I2C SCL | GPIO18 | audio codec control bus |
| I2C SDA | GPIO8 | audio codec control bus |
| SPI CLK (display) | GPIO7 | |
| SPI MOSI (display) | GPIO6 | |
| Display CS | GPIO5 | |
| Display DC | GPIO4 | |
| Display Reset | GPIO48 | inverted |
| Backlight | GPIO47 | LEDC PWM |
| Left top button | GPIO0 | strapping pin — mute toggle / factory reset |
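
The "strapping pin" notes in the table can be cross-checked mechanically. The sketch below copies the pin map into a dictionary and flags entries that land on the ESP32-S3 strapping pins (GPIO0, GPIO3, GPIO45, GPIO46), which are the ones likely to need `ignore_strapping_warning: true` in ESPHome. It also checks that no GPIO is assigned twice. The helper names are hypothetical.

```python
from collections import Counter

# Strapping pins on the ESP32-S3
ESP32_S3_STRAPPING_PINS = {0, 3, 45, 46}

# Copied from the pin map table above (function -> GPIO number)
PIN_MAP = {
    "I2S LRCLK": 45,
    "I2S BCLK": 17,
    "I2S MCLK": 2,
    "I2S DIN (mic)": 16,
    "I2S DOUT (speaker)": 15,
    "Speaker enable": 46,
    "I2C SCL": 18,
    "I2C SDA": 8,
    "SPI CLK (display)": 7,
    "SPI MOSI (display)": 6,
    "Display CS": 5,
    "Display DC": 4,
    "Display Reset": 48,
    "Backlight": 47,
    "Left top button": 0,
}


def strapping_pins_in_use(pin_map: dict[str, int]) -> dict[str, int]:
    """Return the functions that are wired to a strapping pin."""
    return {name: gpio for name, gpio in pin_map.items() if gpio in ESP32_S3_STRAPPING_PINS}


def duplicate_pins(pin_map: dict[str, int]) -> list[int]:
    """Return GPIO numbers assigned to more than one function."""
    counts = Counter(pin_map.values())
    return sorted(gpio for gpio, n in counts.items() if n > 1)


for name, gpio in strapping_pins_in_use(PIN_MAP).items():
    print(f"GPIO{gpio} ({name}) is a strapping pin")
print("duplicates:", duplicate_pins(PIN_MAP))
```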
---
## ESPHome Configuration
### Base Config Template
`esphome/base.yaml` — shared across all units:
### Platform & Framework
```yaml
esphome:
name: homeai-${room}
friendly_name: "HomeAI ${room_display}"
platform: esp32
board: esp32-s3-box-3
esp32:
board: esp32s3box
flash_size: 16MB
cpu_frequency: 240MHz
framework:
type: esp-idf
sdkconfig_options:
CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
wifi:
ssid: !secret wifi_ssid
password: !secret wifi_password
ap:
ssid: "HomeAI Fallback"
api:
encryption:
key: !secret api_key
ota:
password: !secret ota_password
logger:
level: INFO
psram:
mode: octal
speed: 80MHz
```
### Room-Specific Config
### Audio Stack
`esphome/s3-box-living-room.yaml`:
Uses `i2s_audio` platform with external ADC/DAC codec chips:
```yaml
substitutions:
room: living-room
room_display: "Living Room"
mac_mini_ip: "192.168.1.x" # or Tailscale IP
- **Microphone**: ES7210 ADC via I2S, 16kHz 16-bit mono
- **Speaker**: ES8311 DAC via I2S, 48kHz 16-bit mono (left channel)
- **Media player**: wraps speaker with volume control (min 50%, max 85%)
packages:
base: !include base.yaml
voice: !include voice.yaml
display: !include display.yaml
```
### Wake Word
One file per room, only the substitutions change.
On-device `micro_wake_word` component with the `hey_jarvis` model. It can optionally be switched to Home Assistant streaming wake word detection via a `select` entity (`wake_word_engine_location` in the config).
### Voice / Wyoming Satellite — `esphome/voice.yaml`
### Display
```yaml
microphone:
- platform: esp_adf
id: mic
`ili9xxx` platform with model `S3BOX`. Uses `update_interval: never` — display updates are triggered by scripts on voice assistant state changes. Static 320×240 PNG images for each state are compiled into firmware.
speaker:
- platform: esp_adf
id: spk
### Voice Assistant
micro_wake_word:
model: hey_jarvis # or custom model path
on_wake_word_detected:
- voice_assistant.start:
voice_assistant:
microphone: mic
speaker: spk
noise_suppression_level: 2
auto_gain: 31dBFS
volume_multiplier: 2.0
on_listening:
- display.page.show: page_listening
- script.execute: animate_face_listening
on_stt_vad_end:
- display.page.show: page_thinking
- script.execute: animate_face_thinking
on_tts_start:
- display.page.show: page_speaking
- script.execute: animate_face_speaking
on_end:
- display.page.show: page_idle
- script.execute: animate_face_idle
on_error:
- display.page.show: page_error
- script.execute: animate_face_error
```
**Note:** ESPHome's `voice_assistant` component connects to HA, which routes to Wyoming STT/TTS on the Mac Mini. This is the standard ESPHome → HA → Wyoming path.
### LVGL Display — `esphome/display.yaml`
```yaml
display:
- platform: ili9xxx
model: ILI9341
id: lcd
cs_pin: GPIO5
dc_pin: GPIO4
reset_pin: GPIO48
touchscreen:
- platform: tt21100
id: touch
lvgl:
displays:
- lcd
touchscreens:
- touch
# Face widget — centered on screen
widgets:
- obj:
id: face_container
width: 320
height: 240
bg_color: 0x000000
children:
# Eyes (two circles)
- obj:
id: eye_left
x: 90
y: 90
width: 50
height: 50
radius: 25
bg_color: 0xFFFFFF
- obj:
id: eye_right
x: 180
y: 90
width: 50
height: 50
radius: 25
bg_color: 0xFFFFFF
# Mouth (line/arc)
- arc:
id: mouth
x: 110
y: 160
width: 100
height: 40
start_angle: 180
end_angle: 360
arc_color: 0xFFFFFF
pages:
- id: page_idle
- id: page_listening
- id: page_thinking
- id: page_speaking
- id: page_error
```
### LVGL Face State Animations — `esphome/animations.yaml`
```yaml
script:
- id: animate_face_idle
then:
- lvgl.widget.modify:
id: eye_left
height: 50 # normal open
- lvgl.widget.modify:
id: eye_right
height: 50
- lvgl.widget.modify:
id: mouth
arc_color: 0xFFFFFF
- id: animate_face_listening
then:
- lvgl.widget.modify:
id: eye_left
height: 60 # wider eyes
- lvgl.widget.modify:
id: eye_right
height: 60
- lvgl.widget.modify:
id: mouth
arc_color: 0x00BFFF # blue tint
- id: animate_face_thinking
then:
- lvgl.widget.modify:
id: eye_left
height: 20 # squinting
- lvgl.widget.modify:
id: eye_right
height: 20
- id: animate_face_speaking
then:
- lvgl.widget.modify:
id: mouth
arc_color: 0x00FF88 # green speaking indicator
- id: animate_face_error
then:
- lvgl.widget.modify:
id: eye_left
bg_color: 0xFF2200 # red eyes
- lvgl.widget.modify:
id: eye_right
bg_color: 0xFF2200
```
> **Note:** True lip-sync animation (mouth moving with audio) is complex on ESP32. Phase 1: static states. Phase 2: amplitude-driven mouth height using speaker volume feedback.
---
## Secrets File
`esphome/secrets.yaml` (gitignored):
```yaml
wifi_ssid: "YourNetwork"
wifi_password: "YourPassword"
api_key: "<32-byte base64 key>"
ota_password: "YourOTAPassword"
```
---
## Flash & Deployment Workflow
```bash
# Install ESPHome
pip install esphome
# Compile + flash via USB (first time)
esphome run esphome/s3-box-living-room.yaml
# OTA update (subsequent)
esphome upload esphome/s3-box-living-room.yaml --device <device-ip>
# View logs
esphome logs esphome/s3-box-living-room.yaml
```
---
## Home Assistant Integration
After flashing:
1. HA discovers ESP32 automatically via mDNS
2. Add device in HA → Settings → Devices
3. Assign Wyoming voice assistant pipeline to the device
4. Set up room-specific automations (e.g., "Living Room" light control from that satellite)
ESPHome's `voice_assistant` component connects to HA via the ESPHome native API (not directly to Wyoming). HA orchestrates the pipeline:
1. Audio → Wyoming STT (Mac Mini) → text
2. Text → OpenClaw conversation agent → response
3. Response → Wyoming TTS (Mac Mini) → audio back to ESP32
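
As the pipeline advances, the `voice_assistant` triggers in `homeai-living-room.yaml` update a phase number that selects the display page. The sketch below models that mapping in plain Python, using the phase IDs from the config's `substitutions`. It is a simplification, not the firmware logic: in particular it assumes `on_end` returns to idle or muted depending on the mute switch, and it ignores the 1 s delay after which the error phase reverts.

```python
from enum import IntEnum


class Phase(IntEnum):
    # Values mirror the voice_assist_*_phase_id substitutions
    IDLE = 1
    LISTENING = 2
    THINKING = 3
    REPLYING = 4
    NOT_READY = 10
    ERROR = 11
    MUTED = 12
    TIMER_FINISHED = 20


# Triggers that set a fixed phase
EVENT_TO_PHASE = {
    "on_listening": Phase.LISTENING,
    "on_stt_vad_end": Phase.THINKING,
    "on_tts_start": Phase.REPLYING,
    "on_error": Phase.ERROR,
    "on_client_disconnected": Phase.NOT_READY,
    "on_timer_finished": Phase.TIMER_FINISHED,
}


def next_phase(current: Phase, event: str, muted: bool = False) -> Phase:
    """Return the phase after an event; unknown events leave it unchanged."""
    if event in ("on_end", "on_client_connected"):
        return Phase.MUTED if muted else Phase.IDLE
    return EVENT_TO_PHASE.get(event, current)


def trace(events: list[str], start: Phase = Phase.IDLE, muted: bool = False) -> list[Phase]:
    """Replay a sequence of events and collect the phase after each one."""
    phases = []
    current = start
    for event in events:
        current = next_phase(current, event, muted)
        phases.append(current)
    return phases


happy_path = ["on_listening", "on_stt_vad_end", "on_stt_end", "on_tts_start", "on_end"]
print([phase.name for phase in trace(happy_path)])
```

Note that `on_stt_end` only publishes the transcribed text, so the display stays on the thinking page until TTS starts.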
---
@@ -303,43 +117,71 @@ After flashing:
```
homeai-esp32/
├── PLAN.md
├── setup.sh # env check + flash/ota/logs commands
└── esphome/
    ├── base.yaml
    ├── voice.yaml
    ├── display.yaml
    ├── animations.yaml
    ├── s3-box-living-room.yaml
    ├── s3-box-bedroom.yaml # template, fill in when hardware available
    ├── s3-box-kitchen.yaml # template
    └── secrets.yaml # gitignored
    ├── secrets.yaml # gitignored (WiFi + API key)
    ├── homeai-living-room.yaml # first unit (full config)
    ├── homeai-bedroom.yaml # future: copy + change substitutions
    ├── homeai-kitchen.yaml # future: copy + change substitutions
    └── illustrations/ # 320×240 PNG face images
        ├── idle.png
        ├── loading.png
        ├── listening.png
        ├── thinking.png
        ├── replying.png
        ├── error.png
        └── timer_finished.png
```
---
## Wake Word Decisions
## ESPHome Environment
```bash
# Dedicated venv (Python 3.12) — do NOT share with voice/whisper venvs
~/homeai-esphome-env/bin/esphome version # ESPHome 2026.2.4+
# Quick commands
cd ~/gitea/homeai/homeai-esp32
~/homeai-esphome-env/bin/esphome run esphome/homeai-living-room.yaml # compile + flash
~/homeai-esphome-env/bin/esphome logs esphome/homeai-living-room.yaml # stream logs
# Or use the setup script
./setup.sh flash # compile + USB flash
./setup.sh ota # compile + OTA update
./setup.sh logs # stream device logs
./setup.sh validate # check YAML without compiling
```
---
## Wake Word Options
| Option | Latency | Privacy | Effort |
|---|---|---|---|
| `hey_jarvis` (built-in microWakeWord) | ~200ms | On-device | Zero |
| `hey_jarvis` (built-in micro_wake_word) | ~200ms | On-device | Zero |
| Custom word (trained model) | ~200ms | On-device | High — requires 50+ recordings |
| Mac Mini openWakeWord (stream audio) | ~500ms | On Mac | Medium |
| HA streaming wake word | ~500ms | On Mac Mini | Medium — stream all audio |
**Recommendation:** Start with `hey_jarvis`. Train a custom word (character's name) once character name is finalised.
**Current**: `hey_jarvis` on-device. Train a custom word (character's name) once finalised.
---
## Implementation Steps
- [ ] Install ESPHome: `pip install esphome`
- [ ] Write `esphome/secrets.yaml` (gitignored)
- [ ] Write `base.yaml`, `voice.yaml`, `display.yaml`, `animations.yaml`
- [ ] Write `s3-box-living-room.yaml` for first unit
- [ ] Flash first unit via USB: `esphome run s3-box-living-room.yaml`
- [ ] Verify unit appears in HA device list
- [ ] Assign Wyoming voice pipeline to unit in HA
- [ ] Test: speak wake word → transcription → LLM response → spoken reply
- [ ] Test: LVGL face cycles through idle → listening → thinking → speaking
- [ ] Verify OTA update works: change LVGL color, deploy wirelessly
- [x] Install ESPHome in `~/homeai-esphome-env` (Python 3.12)
- [x] Write `esphome/secrets.yaml` (gitignored)
- [x] Write `homeai-living-room.yaml` (based on official S3-BOX-3 reference config)
- [x] Generate placeholder face illustrations (7 PNGs, 320×240)
- [x] Write `setup.sh` with flash/ota/logs/validate commands
- [x] Write `deploy.sh` with OTA deploy, image management, multi-unit support
- [x] Flash first unit via USB (living room)
- [x] Verify unit appears in HA device list
- [x] Assign Wyoming voice pipeline to unit in HA
- [x] Test: speak wake word → transcription → LLM response → spoken reply
- [x] Test: display cycles through idle → listening → thinking → replying
- [x] Verify OTA update works: change config, deploy wirelessly
- [ ] Write config templates for remaining rooms (bedroom, kitchen)
- [ ] Flash remaining units, verify each works independently
- [ ] Document final MAC address → room name mapping
@@ -351,7 +193,17 @@ homeai-esp32/
- [ ] Wake word "hey jarvis" triggers pipeline reliably from 3m distance
- [ ] STT transcription accuracy >90% for clear speech in quiet room
- [ ] TTS audio plays clearly through ESP32 speaker
- [ ] LVGL face shows correct state for idle / listening / thinking / speaking / error
- [ ] Display shows correct state for idle / listening / thinking / replying / error / muted
- [ ] OTA firmware updates work without USB cable
- [ ] Unit reconnects automatically after WiFi drop
- [ ] Unit survives power cycle and resumes normal operation
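
The ">90% accuracy" criterion does not say how accuracy is measured. One common choice, assumed here rather than taken from the plan, is word accuracy = 1 − WER, where the word error rate is the word-level edit distance divided by the number of reference words. A minimal sketch:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """1 - WER, clamped at zero for very poor transcripts."""
    return max(0.0, 1.0 - word_error_rate(reference, hypothesis))


# Example phrases are made up for illustration
print(word_accuracy("turn on the kitchen light", "turn on the kitchen lights"))
```

Averaging this over a fixed list of spoken test phrases would give a repeatable number to compare against the 90% target.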
---
## Known Constraints
- **Memory**: voice_assistant + micro_wake_word + display is near the limit. Do NOT add Bluetooth or LVGL widgets — they will cause crashes.
- **WiFi**: 2.4GHz only. 5GHz networks are not supported.
- **Speaker**: 1W built-in. Volume capped at 85% to avoid distortion.
- **Display**: Static PNGs compiled into firmware. To change images, reflash via OTA (~1-2 min).
- **First compile**: Downloads ESP-IDF toolchain (~500MB), takes 5-10 minutes. Incremental builds are 1-2 minutes.
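
Because the PNGs are compiled into firmware, a wrongly sized image is only noticed after a build and reflash. `deploy.sh` checks that the files exist but not their dimensions. The sketch below is a hypothetical pre-flight check that reads the width and height from each PNG's IHDR header using only the standard library. The file list and the 320×240 size come from this plan; the function names are illustrative.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
REQUIRED = ("idle", "loading", "listening", "thinking", "replying", "error", "timer_finished")
EXPECTED_SIZE = (320, 240)


def png_size(header: bytes) -> tuple[int, int]:
    """Extract (width, height) from the first 24 bytes of a PNG file."""
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR data starts at byte 16: big-endian width, then height
    return struct.unpack(">II", header[16:24])


def find_bad_illustrations(directory: str) -> dict[str, str]:
    """Map each problematic file name to a short description of the problem."""
    problems = {}
    for name in REQUIRED:
        path = Path(directory) / f"{name}.png"
        if not path.is_file():
            problems[path.name] = "missing"
            continue
        with path.open("rb") as fh:
            try:
                size = png_size(fh.read(24))
            except ValueError as exc:
                problems[path.name] = str(exc)
                continue
        if size != EXPECTED_SIZE:
            problems[path.name] = f"{size[0]}x{size[1]}, expected 320x240"
    return problems
```

Calling `find_bad_illustrations("esphome/illustrations")` before a deploy returns an empty dictionary when all seven images are present and correctly sized.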

244
homeai-esp32/deploy.sh Executable file

@@ -0,0 +1,244 @@
#!/usr/bin/env bash
# homeai-esp32/deploy.sh — Quick OTA deploy for ESP32-S3-BOX-3 satellites
#
# Usage:
# ./deploy.sh — deploy config + images to living room (default)
# ./deploy.sh bedroom — deploy to bedroom unit
# ./deploy.sh --images-only — deploy existing PNGs from illustrations/ (no regen)
# ./deploy.sh --regen-images — regenerate placeholder PNGs then deploy
# ./deploy.sh --validate — validate config without deploying
# ./deploy.sh --all — deploy to all configured units
#
# Images are compiled into firmware, so any PNG changes require a reflash.
# To use custom images: drop 320x240 PNGs into esphome/illustrations/ then ./deploy.sh
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
ESPHOME_DIR="${SCRIPT_DIR}/esphome"
ESPHOME_VENV="${HOME}/homeai-esphome-env"
ESPHOME="${ESPHOME_VENV}/bin/esphome"
PYTHON="${ESPHOME_VENV}/bin/python3"
ILLUSTRATIONS_DIR="${ESPHOME_DIR}/illustrations"
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
CYAN='\033[0;36m'
NC='\033[0m'
log_info() { echo -e "${BLUE}[INFO]${NC} $*"; }
log_ok() { echo -e "${GREEN}[OK]${NC} $*"; }
log_warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
log_error() { echo -e "${RED}[ERROR]${NC} $*"; exit 1; }
log_step() { echo -e "${CYAN}[STEP]${NC} $*"; }
# ─── Available units ──────────────────────────────────────────────────────────
UNIT_NAMES=(living-room bedroom kitchen)
DEFAULT_UNIT="living-room"
unit_config() {
    case "$1" in
        living-room) echo "homeai-living-room.yaml" ;;
        bedroom)     echo "homeai-bedroom.yaml" ;;
        kitchen)     echo "homeai-kitchen.yaml" ;;
        *)           echo "" ;;
    esac
}
unit_list() {
    echo "${UNIT_NAMES[*]}"
}
# ─── Face image generator ────────────────────────────────────────────────────
generate_faces() {
    log_step "Generating face illustrations (320x240 PNG)..."
    "${PYTHON}" << 'PYEOF'
from PIL import Image, ImageDraw
import os

WIDTH, HEIGHT = 320, 240
OUT = os.environ.get("ILLUSTRATIONS_DIR", "esphome/illustrations")

def draw_face(draw, eye_color, mouth_color, eye_height=40, eye_y=80, mouth_style="smile"):
    # Eyes: two ellipses, 50 px wide and eye_height px tall
    ex1, ey1 = 95, eye_y
    draw.ellipse([ex1-25, ey1-eye_height//2, ex1+25, ey1+eye_height//2], fill=eye_color)
    ex2, ey2 = 225, eye_y
    draw.ellipse([ex2-25, ey2-eye_height//2, ex2+25, ey2+eye_height//2], fill=eye_color)
    # Mouth: shape depends on the state
    if mouth_style == "smile":
        draw.arc([110, 140, 210, 200], start=0, end=180, fill=mouth_color, width=3)
    elif mouth_style == "open":
        draw.ellipse([135, 150, 185, 190], fill=mouth_color)
    elif mouth_style == "flat":
        draw.line([120, 170, 200, 170], fill=mouth_color, width=3)
    elif mouth_style == "frown":
        draw.arc([110, 160, 210, 220], start=180, end=360, fill=mouth_color, width=3)

states = {
    "idle": {"eye_color": "#FFFFFF", "mouth_color": "#FFFFFF", "eye_height": 40, "mouth_style": "smile"},
    "loading": {"eye_color": "#6366F1", "mouth_color": "#6366F1", "eye_height": 30, "mouth_style": "flat"},
    "listening": {"eye_color": "#00BFFF", "mouth_color": "#00BFFF", "eye_height": 50, "mouth_style": "open"},
    "thinking": {"eye_color": "#A78BFA", "mouth_color": "#A78BFA", "eye_height": 20, "mouth_style": "flat"},
    "replying": {"eye_color": "#10B981", "mouth_color": "#10B981", "eye_height": 40, "mouth_style": "open"},
    "error": {"eye_color": "#EF4444", "mouth_color": "#EF4444", "eye_height": 40, "mouth_style": "frown"},
    "timer_finished": {"eye_color": "#F59E0B", "mouth_color": "#F59E0B", "eye_height": 50, "mouth_style": "smile"},
}

os.makedirs(OUT, exist_ok=True)
for name, p in states.items():
    img = Image.new("RGBA", (WIDTH, HEIGHT), (0, 0, 0, 255))
    draw = ImageDraw.Draw(img)
    draw_face(draw, p["eye_color"], p["mouth_color"], p["eye_height"], mouth_style=p["mouth_style"])
    img.save(f"{OUT}/{name}.png")
    print(f" {name}.png")
PYEOF
    log_ok "Generated 7 face illustrations"
}
# ─── Check existing images ───────────────────────────────────────────────────
REQUIRED_IMAGES=(idle loading listening thinking replying error timer_finished)
check_images() {
    local missing=()
    for name in "${REQUIRED_IMAGES[@]}"; do
        if [[ ! -f "${ILLUSTRATIONS_DIR}/${name}.png" ]]; then
            missing+=("${name}.png")
        fi
    done
    if [[ ${#missing[@]} -gt 0 ]]; then
        log_error "Missing illustrations: ${missing[*]}
Place 320x240 PNGs in ${ILLUSTRATIONS_DIR}/ or use --regen-images to generate placeholders."
    fi
    log_ok "All ${#REQUIRED_IMAGES[@]} illustrations present in illustrations/"
    for name in "${REQUIRED_IMAGES[@]}"; do
        local size
        size=$(wc -c < "${ILLUSTRATIONS_DIR}/${name}.png" | tr -d ' ')
        echo -e " ${name}.png (${size} bytes)"
    done
}
# ─── Deploy to a single unit ─────────────────────────────────────────────────
deploy_unit() {
    local unit_name="$1"
    local config
    config="$(unit_config "$unit_name")"
    if [[ -z "$config" ]]; then
        log_error "Unknown unit: ${unit_name}. Available: $(unit_list)"
    fi
    local config_path="${ESPHOME_DIR}/${config}"
    if [[ ! -f "$config_path" ]]; then
        log_error "Config not found: ${config_path}"
    fi
    log_step "Validating ${config}..."
    cd "${ESPHOME_DIR}"
    "${ESPHOME}" config "${config}" > /dev/null
    log_ok "Config valid"
    log_step "Compiling + OTA deploying ${config}..."
    "${ESPHOME}" run "${config}" --device OTA 2>&1
    log_ok "Deployed to ${unit_name}"
}
# ─── Main ─────────────────────────────────────────────────────────────────────
IMAGES_ONLY=false
REGEN_IMAGES=false
VALIDATE_ONLY=false
DEPLOY_ALL=false
TARGET="${DEFAULT_UNIT}"
while [[ $# -gt 0 ]]; do
    case "$1" in
        --images-only) IMAGES_ONLY=true; shift ;;
        --regen-images) REGEN_IMAGES=true; shift ;;
        --validate) VALIDATE_ONLY=true; shift ;;
        --all) DEPLOY_ALL=true; shift ;;
        --help|-h)
            echo "Usage: $0 [unit-name] [--images-only] [--regen-images] [--validate] [--all]"
            echo ""
            echo "Units: $(unit_list)"
            echo ""
            echo "Options:"
            echo "  --images-only   Deploy existing PNGs from illustrations/ (for custom images)"
            echo "  --regen-images  Regenerate placeholder face PNGs then deploy"
            echo "  --validate      Validate config without deploying"
            echo "  --all           Deploy to all configured units"
            echo ""
            echo "Examples:"
            echo "  $0                 # deploy config to living-room"
            echo "  $0 bedroom         # deploy to bedroom"
            echo "  $0 --images-only   # deploy with current images (custom or generated)"
            echo "  $0 --regen-images  # regenerate placeholder faces + deploy"
            echo "  $0 --all           # deploy to all units"
            echo ""
            echo "Custom images: drop 320x240 PNGs into esphome/illustrations/"
            echo "Required files: ${REQUIRED_IMAGES[*]}"
            exit 0
            ;;
        *)
            if [[ -n "$(unit_config "$1")" ]]; then
                TARGET="$1"
            else
                log_error "Unknown option or unit: $1. Use --help for usage."
            fi
            shift
            ;;
    esac
done
# Check ESPHome
if [[ ! -x "${ESPHOME}" ]]; then
    log_error "ESPHome not found at ${ESPHOME}. Run setup.sh first."
fi
# Regenerate placeholder images if requested
if $REGEN_IMAGES; then
    export ILLUSTRATIONS_DIR
    generate_faces
fi
# With --images-only, verify that all required images exist before deploying
if $IMAGES_ONLY; then
    check_images
fi
# Validate only
if $VALIDATE_ONLY; then
    cd "${ESPHOME_DIR}"
    for unit_name in "${UNIT_NAMES[@]}"; do
        config="$(unit_config "$unit_name")"
        if [[ -f "${config}" ]]; then
            log_step "Validating ${config}..."
            "${ESPHOME}" config "${config}" > /dev/null && log_ok "${config} valid" || log_warn "${config} invalid"
        fi
    done
    exit 0
fi
# Deploy
if $DEPLOY_ALL; then
    for unit_name in "${UNIT_NAMES[@]}"; do
        config="$(unit_config "$unit_name")"
        if [[ -f "${ESPHOME_DIR}/${config}" ]]; then
            deploy_unit "$unit_name"
        else
            log_warn "Skipping ${unit_name}: ${config} not found"
        fi
    done
else
    deploy_unit "$TARGET"
fi
echo ""
log_ok "Deploy complete!"

5
homeai-esp32/esphome/.gitignore vendored Normal file

@@ -0,0 +1,5 @@
# Gitignore settings for ESPHome
# This is an example and may include too much for your use-case.
# You can modify this file to suit your needs.
/.esphome/
/secrets.yaml

homeai-esp32/esphome/homeai-living-room.yaml

@@ -0,0 +1,865 @@
---
# HomeAI Living Room Satellite — ESP32-S3-BOX-3
# Based on official ESPHome voice assistant config
# https://github.com/esphome/wake-word-voice-assistants
substitutions:
  name: homeai-living-room
  friendly_name: HomeAI Living Room
  # Face illustrations, compiled into firmware (320x240 PNG)
  loading_illustration_file: illustrations/loading.png
  idle_illustration_file: illustrations/idle.png
  listening_illustration_file: illustrations/listening.png
  thinking_illustration_file: illustrations/thinking.png
  replying_illustration_file: illustrations/replying.png
  error_illustration_file: illustrations/error.png
  timer_finished_illustration_file: illustrations/timer_finished.png
  # Dark background for all states (matches HomeAI dashboard theme)
  loading_illustration_background_color: "000000"
  idle_illustration_background_color: "000000"
  listening_illustration_background_color: "000000"
  thinking_illustration_background_color: "000000"
  replying_illustration_background_color: "000000"
  error_illustration_background_color: "000000"
  voice_assist_idle_phase_id: "1"
  voice_assist_listening_phase_id: "2"
  voice_assist_thinking_phase_id: "3"
  voice_assist_replying_phase_id: "4"
  voice_assist_not_ready_phase_id: "10"
  voice_assist_error_phase_id: "11"
  voice_assist_muted_phase_id: "12"
  voice_assist_timer_finished_phase_id: "20"
  font_glyphsets: "GF_Latin_Core"
  font_family: Figtree
esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  min_version: 2025.5.0
  name_add_mac_suffix: false
  on_boot:
    priority: 600
    then:
      - script.execute: draw_display
      - delay: 30s
      - if:
          condition:
            lambda: return id(init_in_progress);
          then:
            - lambda: id(init_in_progress) = false;
            - script.execute: draw_display
esp32:
  board: esp32s3box
  flash_size: 16MB
  cpu_frequency: 240MHz
  framework:
    type: esp-idf
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
psram:
  mode: octal
  speed: 80MHz
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  ap:
    ssid: "HomeAI Fallback"
  on_connect:
    - script.execute: draw_display
  on_disconnect:
    - script.execute: draw_display
captive_portal:
api:
  encryption:
    key: !secret api_key
  # Prevent device from rebooting if HA connection drops temporarily
  reboot_timeout: 0s
  on_client_connected:
    - script.execute: draw_display
  on_client_disconnected:
    # Debounce: wait 5s before showing "HA not found" to avoid flicker on brief drops
    - delay: 5s
    - if:
        condition:
          not:
            api.connected:
        then:
          - script.execute: draw_display
ota:
  - platform: esphome
    id: ota_esphome
logger:
  hardware_uart: USB_SERIAL_JTAG
button:
  - platform: factory_reset
    id: factory_reset_btn
    internal: true
binary_sensor:
  - platform: gpio
    pin:
      number: GPIO0
      ignore_strapping_warning: true
      mode: INPUT_PULLUP
      inverted: true
    id: left_top_button
    internal: true
    on_multi_click:
      # Short press: dismiss timer / toggle mute
      - timing:
          - ON for at least 50ms
          - OFF for at least 50ms
        then:
          - if:
              condition:
                switch.is_on: timer_ringing
              then:
                - switch.turn_off: timer_ringing
              else:
                - switch.toggle: mute
      # Long press (10s): factory reset
      - timing:
          - ON for at least 10s
        then:
          - button.press: factory_reset_btn
# --- Display backlight ---
output:
  - platform: ledc
    pin: GPIO47
    id: backlight_output
light:
  - platform: monochromatic
    id: led
    name: Screen
    icon: "mdi:television"
    entity_category: config
    output: backlight_output
    restore_mode: RESTORE_DEFAULT_ON
    default_transition_length: 250ms
# --- Audio hardware ---
i2c:
  scl: GPIO18
  sda: GPIO8
i2s_audio:
  - id: i2s_audio_bus
    i2s_lrclk_pin:
      number: GPIO45
      ignore_strapping_warning: true
    i2s_bclk_pin: GPIO17
    i2s_mclk_pin: GPIO2
audio_adc:
  - platform: es7210
    id: es7210_adc
    bits_per_sample: 16bit
    sample_rate: 16000
audio_dac:
  - platform: es8311
    id: es8311_dac
    bits_per_sample: 16bit
    sample_rate: 48000
microphone:
  - platform: i2s_audio
    id: box_mic
    sample_rate: 16000
    i2s_din_pin: GPIO16
    bits_per_sample: 16bit
    adc_type: external
speaker:
  - platform: i2s_audio
    id: box_speaker
    i2s_dout_pin: GPIO15
    dac_type: external
    sample_rate: 48000
    bits_per_sample: 16bit
    channel: left
    audio_dac: es8311_dac
    buffer_duration: 100ms
media_player:
  - platform: speaker
    name: None
    id: speaker_media_player
    volume_min: 0.5
    volume_max: 0.85
    announcement_pipeline:
      speaker: box_speaker
      format: FLAC
      sample_rate: 48000
      num_channels: 1
    files:
      - id: timer_finished_sound
        file: https://github.com/esphome/home-assistant-voice-pe/raw/dev/sounds/timer_finished.flac
    on_announcement:
      - if:
          condition:
            - microphone.is_capturing:
          then:
            - script.execute: stop_wake_word
            - if:
                condition:
                  - lambda: return id(wake_word_engine_location).current_option() == "In Home Assistant";
                then:
                  - wait_until:
                      - not:
                          voice_assistant.is_running:
      - if:
          condition:
            not:
              voice_assistant.is_running:
          then:
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
            - script.execute: draw_display
    on_idle:
      - if:
          condition:
            not:
              voice_assistant.is_running:
          then:
            - script.execute: start_wake_word
            - script.execute: set_idle_or_mute_phase
            - script.execute: draw_display
# --- Wake word (on-device) ---
micro_wake_word:
  id: mww
  models:
    - hey_jarvis
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
# --- Voice assistant ---
voice_assistant:
  id: va
  microphone: box_mic
  media_player: speaker_media_player
  micro_wake_word: mww
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  on_listening:
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
    - text_sensor.template.publish:
        id: text_request
        state: "..."
    - text_sensor.template.publish:
        id: text_response
        state: "..."
    - script.execute: draw_display
  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - script.execute: draw_display
  on_stt_end:
    - text_sensor.template.publish:
        id: text_request
        state: !lambda return x;
    - script.execute: draw_display
  on_tts_start:
    - text_sensor.template.publish:
        id: text_response
        state: !lambda return x;
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - script.execute: draw_display
  on_end:
    - wait_until:
        condition:
          - media_player.is_announcing:
        timeout: 0.5s
    - wait_until:
        - and:
            - not:
                media_player.is_announcing:
            - not:
                speaker.is_playing:
    - if:
        condition:
          - lambda: return id(wake_word_engine_location).current_option() == "On device";
        then:
          - lambda: id(va).set_use_wake_word(false);
          - micro_wake_word.start:
    - script.execute: set_idle_or_mute_phase
    - script.execute: draw_display
    - text_sensor.template.publish:
        id: text_request
        state: ""
    - text_sensor.template.publish:
        id: text_response
        state: ""
  on_error:
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          - script.execute: draw_display
          - delay: 1s
          - if:
              condition:
                switch.is_off: mute
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - script.execute: draw_display
  on_client_connected:
    - lambda: id(init_in_progress) = false;
    - script.execute: start_wake_word
    - script.execute: set_idle_or_mute_phase
    - script.execute: draw_display
  on_client_disconnected:
    - script.execute: stop_wake_word
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
    - script.execute: draw_display
  on_timer_started:
    - script.execute: draw_display
  on_timer_cancelled:
    - script.execute: draw_display
  on_timer_updated:
    - script.execute: draw_display
  on_timer_tick:
    - script.execute: draw_display
  on_timer_finished:
    - switch.turn_on: timer_ringing
    - wait_until:
        media_player.is_announcing:
    - lambda: id(voice_assistant_phase) = ${voice_assist_timer_finished_phase_id};
    - script.execute: draw_display
# --- Scripts ---
script:
  - id: draw_display
    then:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - if:
                condition:
                  wifi.connected:
                then:
                  - if:
                      condition:
                        api.connected:
                      then:
                        - lambda: |
                            switch(id(voice_assistant_phase)) {
                              case ${voice_assist_listening_phase_id}:
                                id(s3_box_lcd).show_page(listening_page);
                                id(s3_box_lcd).update();
                                break;
                              case ${voice_assist_thinking_phase_id}:
                                id(s3_box_lcd).show_page(thinking_page);
                                id(s3_box_lcd).update();
                                break;
                              case ${voice_assist_replying_phase_id}:
                                id(s3_box_lcd).show_page(replying_page);
                                id(s3_box_lcd).update();
                                break;
                              case ${voice_assist_error_phase_id}:
                                id(s3_box_lcd).show_page(error_page);
                                id(s3_box_lcd).update();
                                break;
                              case ${voice_assist_muted_phase_id}:
                                id(s3_box_lcd).show_page(muted_page);
                                id(s3_box_lcd).update();
                                break;
                              case ${voice_assist_not_ready_phase_id}:
                                id(s3_box_lcd).show_page(no_ha_page);
                                id(s3_box_lcd).update();
                                break;
                              case ${voice_assist_timer_finished_phase_id}:
                                id(s3_box_lcd).show_page(timer_finished_page);
                                id(s3_box_lcd).update();
                                break;
                              default:
                                id(s3_box_lcd).show_page(idle_page);
                                id(s3_box_lcd).update();
                            }
                      else:
                        - display.page.show: no_ha_page
                        - component.update: s3_box_lcd
                else:
                  - display.page.show: no_wifi_page
                  - component.update: s3_box_lcd
          else:
            - display.page.show: initializing_page
            - component.update: s3_box_lcd
- id: fetch_first_active_timer
then:
- lambda: |
const auto &timers = id(va).get_timers();
auto output_timer = timers.begin()->second;
for (const auto &timer : timers) {
if (timer.second.is_active && timer.second.seconds_left <= output_timer.seconds_left) {
output_timer = timer.second;
}
}
id(global_first_active_timer) = output_timer;
- id: check_if_timers_active
then:
- lambda: |
const auto &timers = id(va).get_timers();
bool output = false;
for (const auto &timer : timers) {
if (timer.second.is_active) { output = true; }
}
id(global_is_timer_active) = output;
- id: fetch_first_timer
then:
- lambda: |
const auto &timers = id(va).get_timers();
auto output_timer = timers.begin()->second;
for (const auto &timer : timers) {
if (timer.second.seconds_left <= output_timer.seconds_left) {
output_timer = timer.second;
}
}
id(global_first_timer) = output_timer;
- id: check_if_timers
then:
- lambda: |
const auto &timers = id(va).get_timers();
// Any timer counts here, active or paused; check_if_timers_active covers the active-only case.
bool output = !timers.empty();
id(global_is_timer) = output;
- id: draw_timer_timeline
then:
- lambda: |
id(check_if_timers_active).execute();
id(check_if_timers).execute();
if (id(global_is_timer_active)){
id(fetch_first_active_timer).execute();
int active_pixels = round( 320 * id(global_first_active_timer).seconds_left / max(id(global_first_active_timer).total_seconds, static_cast<uint32_t>(1)) );
if (active_pixels > 0){
id(s3_box_lcd).filled_rectangle(0, 225, 320, 15, Color::WHITE);
id(s3_box_lcd).filled_rectangle(0, 226, active_pixels, 13, id(active_timer_color));
}
} else if (id(global_is_timer)){
id(fetch_first_timer).execute();
int active_pixels = round( 320 * id(global_first_timer).seconds_left / max(id(global_first_timer).total_seconds, static_cast<uint32_t>(1)));
if (active_pixels > 0){
id(s3_box_lcd).filled_rectangle(0, 225, 320, 15, Color::WHITE);
id(s3_box_lcd).filled_rectangle(0, 226, active_pixels, 13, id(paused_timer_color));
}
}
- id: draw_active_timer_widget
then:
- lambda: |
id(check_if_timers_active).execute();
if (id(global_is_timer_active)){
id(s3_box_lcd).filled_rectangle(80, 40, 160, 50, Color::WHITE);
id(s3_box_lcd).rectangle(80, 40, 160, 50, Color::BLACK);
id(fetch_first_active_timer).execute();
int hours_left = floor(id(global_first_active_timer).seconds_left / 3600);
int minutes_left = floor((id(global_first_active_timer).seconds_left - hours_left * 3600) / 60);
int seconds_left = id(global_first_active_timer).seconds_left - hours_left * 3600 - minutes_left * 60;
auto display_hours = (hours_left < 10 ? "0" : "") + std::to_string(hours_left);
auto display_minute = (minutes_left < 10 ? "0" : "") + std::to_string(minutes_left);
auto display_seconds = (seconds_left < 10 ? "0" : "") + std::to_string(seconds_left);
std::string display_string = "";
if (hours_left > 0) {
display_string = display_hours + ":" + display_minute;
} else {
display_string = display_minute + ":" + display_seconds;
}
id(s3_box_lcd).printf(120, 47, id(font_timer), Color::BLACK, "%s", display_string.c_str());
}
- id: start_wake_word
then:
- if:
condition:
and:
- not:
- voice_assistant.is_running:
- lambda: return id(wake_word_engine_location).current_option() == "On device";
then:
- lambda: id(va).set_use_wake_word(false);
- micro_wake_word.start:
- if:
condition:
and:
- not:
- voice_assistant.is_running:
- lambda: return id(wake_word_engine_location).current_option() == "In Home Assistant";
then:
- lambda: id(va).set_use_wake_word(true);
- voice_assistant.start_continuous:
- id: stop_wake_word
then:
- if:
condition:
lambda: return id(wake_word_engine_location).current_option() == "In Home Assistant";
then:
- lambda: id(va).set_use_wake_word(false);
- voice_assistant.stop:
- if:
condition:
lambda: return id(wake_word_engine_location).current_option() == "On device";
then:
- micro_wake_word.stop:
- id: set_idle_or_mute_phase
then:
- if:
condition:
switch.is_off: mute
then:
- lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
else:
- lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
# --- Switches ---
switch:
- platform: gpio
name: Speaker Enable
pin:
number: GPIO46
ignore_strapping_warning: true
restore_mode: RESTORE_DEFAULT_ON
entity_category: config
disabled_by_default: true
- platform: template
name: Mute
id: mute
icon: "mdi:microphone-off"
optimistic: true
restore_mode: RESTORE_DEFAULT_OFF
entity_category: config
on_turn_off:
- microphone.unmute:
- lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
- script.execute: draw_display
on_turn_on:
- microphone.mute:
- lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
- script.execute: draw_display
- platform: template
id: timer_ringing
optimistic: true
internal: true
restore_mode: ALWAYS_OFF
on_turn_off:
- lambda: |-
id(speaker_media_player)
->make_call()
.set_command(media_player::MediaPlayerCommand::MEDIA_PLAYER_COMMAND_REPEAT_OFF)
.set_announcement(true)
.perform();
id(speaker_media_player)->set_playlist_delay_ms(speaker::AudioPipelineType::ANNOUNCEMENT, 0);
- media_player.stop:
announcement: true
on_turn_on:
- lambda: |-
id(speaker_media_player)
->make_call()
.set_command(media_player::MediaPlayerCommand::MEDIA_PLAYER_COMMAND_REPEAT_ONE)
.set_announcement(true)
.perform();
id(speaker_media_player)->set_playlist_delay_ms(speaker::AudioPipelineType::ANNOUNCEMENT, 1000);
- media_player.speaker.play_on_device_media_file:
media_file: timer_finished_sound
announcement: true
- delay: 15min
- switch.turn_off: timer_ringing
# --- Wake word engine location selector ---
select:
- platform: template
entity_category: config
name: Wake word engine location
id: wake_word_engine_location
icon: "mdi:account-voice"
optimistic: true
restore_value: true
options:
- In Home Assistant
- On device
initial_option: On device
on_value:
- if:
condition:
lambda: return !id(init_in_progress);
then:
- wait_until:
lambda: return id(voice_assistant_phase) == ${voice_assist_muted_phase_id} || id(voice_assistant_phase) == ${voice_assist_idle_phase_id};
- if:
condition:
lambda: return x == "In Home Assistant";
then:
- micro_wake_word.stop
- delay: 500ms
- if:
condition:
switch.is_off: mute
then:
- lambda: id(va).set_use_wake_word(true);
- voice_assistant.start_continuous:
- if:
condition:
lambda: return x == "On device";
then:
- lambda: id(va).set_use_wake_word(false);
- voice_assistant.stop
- delay: 500ms
- if:
condition:
switch.is_off: mute
then:
- micro_wake_word.start
# --- Global variables ---
globals:
- id: init_in_progress
type: bool
restore_value: false
initial_value: "true"
- id: voice_assistant_phase
type: int
restore_value: false
initial_value: ${voice_assist_not_ready_phase_id}
- id: global_first_active_timer
type: voice_assistant::Timer
restore_value: false
- id: global_is_timer_active
type: bool
restore_value: false
- id: global_first_timer
type: voice_assistant::Timer
restore_value: false
- id: global_is_timer
type: bool
restore_value: false
# --- Display images ---
image:
- file: ${error_illustration_file}
id: casita_error
resize: 320x240
type: RGB
transparency: alpha_channel
- file: ${idle_illustration_file}
id: casita_idle
resize: 320x240
type: RGB
transparency: alpha_channel
- file: ${listening_illustration_file}
id: casita_listening
resize: 320x240
type: RGB
transparency: alpha_channel
- file: ${thinking_illustration_file}
id: casita_thinking
resize: 320x240
type: RGB
transparency: alpha_channel
- file: ${replying_illustration_file}
id: casita_replying
resize: 320x240
type: RGB
transparency: alpha_channel
- file: ${timer_finished_illustration_file}
id: casita_timer_finished
resize: 320x240
type: RGB
transparency: alpha_channel
- file: ${loading_illustration_file}
id: casita_initializing
resize: 320x240
type: RGB
transparency: alpha_channel
- file: https://github.com/esphome/wake-word-voice-assistants/raw/main/error_box_illustrations/error-no-wifi.png
id: error_no_wifi
resize: 320x240
type: RGB
transparency: alpha_channel
- file: https://github.com/esphome/wake-word-voice-assistants/raw/main/error_box_illustrations/error-no-ha.png
id: error_no_ha
resize: 320x240
type: RGB
transparency: alpha_channel
# --- Fonts ---
font:
- file:
type: gfonts
family: ${font_family}
weight: 300
italic: true
id: font_request
size: 15
glyphsets:
- ${font_glyphsets}
- file:
type: gfonts
family: ${font_family}
weight: 300
id: font_response
size: 15
glyphsets:
- ${font_glyphsets}
- file:
type: gfonts
family: ${font_family}
weight: 300
id: font_timer
size: 30
glyphsets:
- ${font_glyphsets}
# --- Text sensors (request/response display) ---
text_sensor:
- id: text_request
platform: template
on_value:
lambda: |-
if(id(text_request).state.length()>32) {
std::string name = id(text_request).state.c_str();
std::string truncated = esphome::str_truncate(name.c_str(),31);
id(text_request).state = (truncated+"...").c_str();
}
- id: text_response
platform: template
on_value:
lambda: |-
if(id(text_response).state.length()>32) {
std::string name = id(text_response).state.c_str();
std::string truncated = esphome::str_truncate(name.c_str(),31);
id(text_response).state = (truncated+"...").c_str();
}
# --- Colors ---
color:
- id: idle_color
hex: ${idle_illustration_background_color}
- id: listening_color
hex: ${listening_illustration_background_color}
- id: thinking_color
hex: ${thinking_illustration_background_color}
- id: replying_color
hex: ${replying_illustration_background_color}
- id: loading_color
hex: ${loading_illustration_background_color}
- id: error_color
hex: ${error_illustration_background_color}
- id: active_timer_color
hex: "26ed3a"
- id: paused_timer_color
hex: "3b89e3"
# --- SPI + Display ---
spi:
- id: spi_bus
clk_pin: 7
mosi_pin: 6
display:
- platform: ili9xxx
id: s3_box_lcd
model: S3BOX
invert_colors: false
data_rate: 40MHz
cs_pin: 5
dc_pin: 4
reset_pin:
number: 48
inverted: true
update_interval: never
pages:
- id: idle_page
lambda: |-
it.fill(id(idle_color));
it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_idle), ImageAlign::CENTER);
id(draw_timer_timeline).execute();
id(draw_active_timer_widget).execute();
- id: listening_page
lambda: |-
it.fill(id(listening_color));
it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_listening), ImageAlign::CENTER);
id(draw_timer_timeline).execute();
- id: thinking_page
lambda: |-
it.fill(id(thinking_color));
it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_thinking), ImageAlign::CENTER);
it.filled_rectangle(20, 20, 280, 30, Color::WHITE);
it.rectangle(20, 20, 280, 30, Color::BLACK);
it.printf(30, 25, id(font_request), Color::BLACK, "%s", id(text_request).state.c_str());
id(draw_timer_timeline).execute();
- id: replying_page
lambda: |-
it.fill(id(replying_color));
it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_replying), ImageAlign::CENTER);
it.filled_rectangle(20, 20, 280, 30, Color::WHITE);
it.rectangle(20, 20, 280, 30, Color::BLACK);
it.filled_rectangle(20, 190, 280, 30, Color::WHITE);
it.rectangle(20, 190, 280, 30, Color::BLACK);
it.printf(30, 25, id(font_request), Color::BLACK, "%s", id(text_request).state.c_str());
it.printf(30, 195, id(font_response), Color::BLACK, "%s", id(text_response).state.c_str());
id(draw_timer_timeline).execute();
- id: timer_finished_page
lambda: |-
it.fill(id(idle_color));
it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_timer_finished), ImageAlign::CENTER);
- id: error_page
lambda: |-
it.fill(id(error_color));
it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_error), ImageAlign::CENTER);
- id: no_ha_page
lambda: |-
it.image((it.get_width() / 2), (it.get_height() / 2), id(error_no_ha), ImageAlign::CENTER);
- id: no_wifi_page
lambda: |-
it.image((it.get_width() / 2), (it.get_height() / 2), id(error_no_wifi), ImageAlign::CENTER);
- id: initializing_page
lambda: |-
it.fill(id(loading_color));
it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_initializing), ImageAlign::CENTER);
- id: muted_page
lambda: |-
it.fill(Color::BLACK);
id(draw_timer_timeline).execute();
id(draw_active_timer_widget).execute();

Binary files not shown (7 images; sizes 76, 90, 89, 91, 85, 89, and 88 KiB).


@@ -1,76 +1,177 @@
#!/usr/bin/env bash
# homeai-esp32/setup.sh — P6: ESPHome firmware for ESP32-S3-BOX-3
#
# Components:
# - ESPHome firmware build + flash tool
# - base.yaml — shared device config
# - voice.yaml — Wyoming Satellite + microWakeWord
# - display.yaml — LVGL animated face
# - Per-room configs — s3-box-living-room.yaml, etc.
# Usage:
# ./setup.sh (no arguments): check environment + print usage
# ./setup.sh flash — compile + flash via USB (first time)
# ./setup.sh ota — compile + flash via OTA (wireless)
# ./setup.sh logs — stream device logs
# ./setup.sh validate — validate YAML without compiling
#
# Prerequisites:
# - P1 (homeai-infra) — Home Assistant running
# - P3 (homeai-voice) — Wyoming STT/TTS running (ports 10300/10301)
# - Python 3.10+
# - USB-C cable for first flash (subsequent updates via OTA)
# - On Linux: ensure user is in the dialout group for USB access
# - ~/homeai-esphome-env — Python 3.12 venv with ESPHome
# - Home Assistant running on 10.0.0.199
# - Wyoming STT/TTS running on Mac Mini (ports 10300/10301)
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)"
source "${REPO_DIR}/scripts/common.sh"
ESPHOME_VENV="${HOME}/homeai-esphome-env"
ESPHOME="${ESPHOME_VENV}/bin/esphome"
ESPHOME_DIR="${SCRIPT_DIR}/esphome"
DEFAULT_CONFIG="${ESPHOME_DIR}/homeai-living-room.yaml"
log_section "P6: ESP32 Firmware (ESPHome)"
detect_platform
# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
# ─── Prerequisite check ────────────────────────────────────────────────────────
log_info "Checking prerequisites..."
log_info() { echo -e "${BLUE}[INFO]${NC} $*"; }
log_ok() { echo -e "${GREEN}[OK]${NC} $*"; }
log_warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
log_error() { echo -e "${RED}[ERROR]${NC} $*"; }
if ! command_exists python3; then
log_warn "python3 not found — required for ESPHome"
# ─── Environment checks ──────────────────────────────────────────────────────
check_env() {
local ok=true
log_info "Checking environment..."
# ESPHome venv
if [[ -x "${ESPHOME}" ]]; then
local version
version=$("${ESPHOME}" version 2>/dev/null)
log_ok "ESPHome: ${version}"
else
log_error "ESPHome not found at ${ESPHOME}"
echo " Install: /opt/homebrew/opt/python@3.12/bin/python3.12 -m venv ${ESPHOME_VENV}"
echo " ${ESPHOME_VENV}/bin/pip install 'esphome>=2025.5.0'"
ok=false
fi
if ! command_exists esphome; then
log_info "ESPHome not installed. To install: pip install esphome"
# secrets.yaml
if [[ -f "${ESPHOME_DIR}/secrets.yaml" ]]; then
if grep -q "YOUR_" "${ESPHOME_DIR}/secrets.yaml" 2>/dev/null; then
log_warn "secrets.yaml contains placeholder values — edit before flashing"
ok=false
else
log_ok "secrets.yaml configured"
fi
else
log_error "secrets.yaml not found at ${ESPHOME_DIR}/secrets.yaml"
ok=false
fi
if [[ "$OS_TYPE" == "linux" ]]; then
if ! groups "$USER" | grep -q dialout; then
log_warn "User '$USER' not in 'dialout' group — USB flashing may fail."
log_warn "Fix: sudo usermod -aG dialout $USER (then log out and back in)"
fi
# Config file
if [[ -f "${DEFAULT_CONFIG}" ]]; then
log_ok "Config: $(basename "${DEFAULT_CONFIG}")"
else
log_error "Config not found: ${DEFAULT_CONFIG}"
ok=false
fi
# Check P1 dependency
if ! curl -sf http://localhost:8123 -o /dev/null 2>/dev/null; then
log_warn "Home Assistant (P1) not reachable — ESP32 units won't auto-discover"
# Illustrations
local illust_dir="${ESPHOME_DIR}/illustrations"
local illust_count
illust_count=$(find "${illust_dir}" -name "*.png" 2>/dev/null | wc -l | tr -d ' ')
if [[ "${illust_count}" -ge 7 ]]; then
log_ok "Illustrations: ${illust_count} PNGs in illustrations/"
else
log_warn "Missing illustrations (found ${illust_count}, need 7)"
fi
# ─── TODO: Implementation ──────────────────────────────────────────────────────
cat <<'EOF'
# Wyoming services on Mac Mini
if curl -sf "http://localhost:10300" -o /dev/null 2>/dev/null || nc -z localhost 10300 2>/dev/null; then
log_ok "Wyoming STT (port 10300) reachable"
else
log_warn "Wyoming STT (port 10300) not reachable"
fi
┌─────────────────────────────────────────────────────────────────┐
│ P6: homeai-esp32 — NOT YET IMPLEMENTED │
│ │
│ Implementation steps: │
│ 1. pip install esphome │
│ 2. Create esphome/secrets.yaml (gitignored) │
│ 3. Create esphome/base.yaml (WiFi, API, OTA) │
│ 4. Create esphome/voice.yaml (Wyoming Satellite, wakeword) │
│ 5. Create esphome/display.yaml (LVGL face, 5 states) │
│ 6. Create esphome/animations.yaml (face state scripts) │
│ 7. Create per-room configs (s3-box-living-room.yaml, etc.) │
│ 8. First flash via USB: esphome run esphome/<room>.yaml │
│ 9. Subsequent OTA: esphome upload esphome/<room>.yaml │
│ 10. Add to Home Assistant → assign Wyoming voice pipeline │
│ │
│ Quick flash (once esphome/ is ready): │
│ esphome run esphome/s3-box-living-room.yaml │
│ esphome logs esphome/s3-box-living-room.yaml │
└─────────────────────────────────────────────────────────────────┘
if curl -sf "http://localhost:10301" -o /dev/null 2>/dev/null || nc -z localhost 10301 2>/dev/null; then
log_ok "Wyoming TTS (port 10301) reachable"
else
log_warn "Wyoming TTS (port 10301) not reachable"
fi
EOF
# Home Assistant
if curl -sk "https://10.0.0.199:8123" -o /dev/null 2>/dev/null; then
log_ok "Home Assistant (10.0.0.199:8123) reachable"
else
log_warn "Home Assistant not reachable — ESP32 won't be able to connect"
fi
log_info "P6 is not yet implemented. See homeai-esp32/PLAN.md for details."
exit 0
if $ok; then
log_ok "Environment ready"
else
log_warn "Some issues found — fix before flashing"
fi
}
# ─── Commands ─────────────────────────────────────────────────────────────────
cmd_flash() {
local config="${1:-${DEFAULT_CONFIG}}"
log_info "Compiling + flashing via USB: $(basename "${config}")"
log_info "First compile downloads ESP-IDF toolchain (~500MB), takes 5-10 min..."
cd "${ESPHOME_DIR}"
"${ESPHOME}" run "$(basename "${config}")"
}
cmd_ota() {
local config="${1:-${DEFAULT_CONFIG}}"
log_info "Compiling + OTA upload: $(basename "${config}")"
cd "${ESPHOME_DIR}"
"${ESPHOME}" run "$(basename "${config}")" --device OTA
}
cmd_logs() {
local config="${1:-${DEFAULT_CONFIG}}"
log_info "Streaming logs for: $(basename "${config}")"
cd "${ESPHOME_DIR}"
"${ESPHOME}" logs "$(basename "${config}")"
}
cmd_validate() {
local config="${1:-${DEFAULT_CONFIG}}"
log_info "Validating: $(basename "${config}")"
cd "${ESPHOME_DIR}"
"${ESPHOME}" config "$(basename "${config}")"
log_ok "Config valid"
}
# ─── Main ─────────────────────────────────────────────────────────────────────
case "${1:-}" in
flash)
check_env
echo ""
cmd_flash "${2:-}"
;;
ota)
check_env
echo ""
cmd_ota "${2:-}"
;;
logs)
cmd_logs "${2:-}"
;;
validate)
cmd_validate "${2:-}"
;;
*)
check_env
echo ""
echo "Usage: $0 {flash|ota|logs|validate} [config.yaml]"
echo ""
echo " flash Compile + flash via USB (first time)"
echo " ota Compile + flash via OTA (wireless, after first flash)"
echo " logs Stream device logs"
echo " validate Validate YAML config without compiling"
echo ""
echo "Default config: $(basename "${DEFAULT_CONFIG}")"
;;
esac
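
Each `cmd_*` function above picks its config with `${1:-${DEFAULT_CONFIG}}` and then passes only the file name to `esphome`, because it has already changed into `ESPHOME_DIR`. The sketch below isolates that pattern so it can be tried on its own. `resolve_config` is a hypothetical helper and `homeai-bedroom.yaml` is an assumed file name for a second unit; neither is part of `setup.sh`.

```shell
#!/usr/bin/env bash
# Sketch only: resolve_config is a hypothetical helper, not part of setup.sh.
set -euo pipefail

DEFAULT_CONFIG="esphome/homeai-living-room.yaml"

# Print the config file name to hand to esphome. The ${1:-...} expansion
# falls back to the default when the argument is missing OR empty, which is
# why the dispatcher can safely call cmd_flash "${2:-}".
resolve_config() {
  local config="${1:-${DEFAULT_CONFIG}}"
  basename "${config}"
}

resolve_config ""                             # empty argument: default is used
resolve_config "esphome/homeai-bedroom.yaml"  # assumed name for a second unit
```

Because only the base name is kept, a config passed by path is expected to live in `ESPHOME_DIR`; a file stored elsewhere would not be found after the `cd`.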