Initial commit

Aodhan Collins
2025-10-06 00:33:04 +01:00
commit 66749a5ce7
71 changed files with 22041 additions and 0 deletions


@@ -0,0 +1,206 @@
# ElevenLabs Voice ID Debug Checklist
## Current Issue
- **LocalStorage shows**: `"elevenlabs:undefined"`
- **Error message**: `Voice not found in available voices: "6WkvZo1vwba1zF4N2vlY"`
- **Problem**: The voice ID itself is valid (`6WkvZo1vwba1zF4N2vlY`), but it is not being captured from the API response, so `undefined` ends up in the stored value
## What to Check in Console
### 1. When Settings Opens (ElevenLabs API Response)
Look for these logs:
```javascript
🎤 ElevenLabs API Response: {...}
🎤 First voice object: {...}
🎤 Voice properties: [...]
```
**Check:**
- ✅ Does the first voice object have a `voice_id` property?
- ✅ Or does it use `voiceId`, `id`, or something else?
- ✅ What properties are listed?
### 2. Voice Processing
```javascript
🔍 Processing voice: {
  name: "...",
  voice_id: "...",
  voiceId: "...",
  id: "...",
  finalVoiceId: "...",
  allKeys: [...]
}
```
**Check:**
- ✅ Which property contains the actual voice ID?
- ✅ Is `finalVoiceId` populated correctly?
- ⚠️ Are any voices showing `finalVoiceId: undefined`?
### 3. After API Processing
```javascript
🎵 ElevenLabs voices loaded: 25
🎵 Sample voice: {...}
🎵 All voice IDs: ["Rachel: xxx", "Adam: xxx", ...]
```
**Check:**
- ✅ How many voices loaded?
- ✅ Do the voice IDs look valid? (should be long strings like `6WkvZo1vwba1zF4N2vlY`)
- ⚠️ Are any showing as "Rachel: undefined"?
### 4. Dropdown Options
```javascript
📋 Sample ElevenLabs dropdown option: {
  name: "Rachel",
  voice_id: "...",
  optionValue: "elevenlabs:..."
}
```
**Check:**
- ✅ Is `voice_id` populated?
- ✅ Does `optionValue` look correct? (e.g., `elevenlabs:6WkvZo1vwba1zF4N2vlY`)
- ⚠️ Is it showing `elevenlabs:undefined`?
### 5. When Selecting a Voice
```javascript
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🎛️ Settings: Voice dropdown changed
📥 Selected value: "elevenlabs:6WkvZo1vwba1zF4N2vlY"
🔍 Value breakdown: {
  hasPrefix: true,
  prefix: "elevenlabs",
  voiceId: "6WkvZo1vwba1zF4N2vlY"
}
💾 LocalStorage ttsVoice: "elevenlabs:6WkvZo1vwba1zF4N2vlY"
```
**Check:**
- ✅ Is the selected value correct?
- ✅ Is the voiceId part valid?
- ⚠️ Is it showing `elevenlabs:undefined`?
## Possible Scenarios
### Scenario 1: Property Name Mismatch
**Symptoms:**
```
voice_id: undefined
voiceId: undefined
id: "6WkvZo1vwba1zF4N2vlY" ← Actual ID
```
**Solution:** API uses `id` instead of `voice_id`
- Already handled with fallback: `voice.voice_id || voice.voiceId || voice.id`
### Scenario 2: Nested Property
**Symptoms:**
```
allKeys: ["voiceSettings", "data", ...]
// ID might be in voice.data.id or similar
```
**Solution:** Need to adjust path to voice ID
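Scenarios 1 and 2 can be covered by one resolver that tries each candidate location in turn. The following is a minimal sketch, not the project's actual code: the `RawVoice` shape, the nested `data.id` path, and the helper names `resolveVoiceId` and `toOptionValue` are assumptions made for illustration.

```typescript
// Hypothetical shape: only the property names mentioned in this checklist.
interface RawVoice {
  name?: string;
  voice_id?: string;
  voiceId?: string;
  id?: string;
  data?: { id?: string }; // assumed nesting, taken from Scenario 2
}

// Returns the first non-empty string among the candidate fields, else undefined.
function resolveVoiceId(voice: RawVoice): string | undefined {
  const candidates = [
    voice.voice_id,
    voice.voiceId,
    voice.id,
    voice.data ? voice.data.id : undefined,
  ];
  for (const candidate of candidates) {
    if (typeof candidate === "string" && candidate.length > 0) {
      return candidate;
    }
  }
  return undefined;
}

// Builds the dropdown value only when an ID was found, so the string
// "elevenlabs:undefined" is never produced.
function toOptionValue(voice: RawVoice): string | null {
  const id = resolveVoiceId(voice);
  return id ? `elevenlabs:${id}` : null;
}
```

Returning `null` instead of interpolating a missing ID makes the failure visible where the option is built, rather than later in LocalStorage.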
### Scenario 3: API Response Structure Different
**Symptoms:**
```
response.voices undefined
// Maybe it's response.data.voices or response.items
```
**Solution:** Adjust API response parsing
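A tolerant parser can make Scenario 3 easier to diagnose. The sketch below assumes the voice array might sit under `voices`, `data.voices`, or `items` (the last two are the guesses from the symptoms above, not confirmed API shapes), and the function name `extractVoiceList` is hypothetical.

```typescript
// Returns the voice array from whichever wrapper the response uses.
// An empty array means no known location held an array.
function extractVoiceList(response: unknown): unknown[] {
  if (Array.isArray(response)) return response;
  if (typeof response !== "object" || response === null) return [];

  const r = response as {
    voices?: unknown;
    data?: { voices?: unknown };
    items?: unknown;
  };
  const candidates = [r.voices, r.data ? r.data.voices : undefined, r.items];
  for (const candidate of candidates) {
    if (Array.isArray(candidate)) return candidate;
  }
  return [];
}
```

If this returns an empty array for a response that clearly contains voices, logging `Object.keys(response)` should reveal the real wrapper name.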
### Scenario 4: Async Timing Issue
**Symptoms:**
- Voices load correctly in console
- But dropdown options show undefined
- Race condition between state updates
**Solution:** Add loading state management
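One way to manage the loading state is an explicit state machine, so the dropdown only renders ElevenLabs options once the fetch has finished. This is a sketch under assumptions: the state and action names are invented here, and the real component may hold this state differently (for example in separate `useState` calls).

```typescript
type Voice = { name: string; voice_id: string };

type VoicesState =
  | { status: "idle" }
  | { status: "loading" }
  | { status: "ready"; voices: Voice[] }
  | { status: "error"; message: string };

type VoicesAction =
  | { type: "fetch_started" }
  | { type: "fetch_succeeded"; voices: Voice[] }
  | { type: "fetch_failed"; message: string };

// Pure reducer; it could be passed to React's useReducer.
function voicesReducer(state: VoicesState, action: VoicesAction): VoicesState {
  switch (action.type) {
    case "fetch_started":
      return { status: "loading" };
    case "fetch_succeeded":
      // Ignore results that arrive when no fetch is in flight.
      if (state.status !== "loading") return state;
      return { status: "ready", voices: action.voices };
    case "fetch_failed":
      if (state.status !== "loading") return state;
      return { status: "error", message: action.message };
  }
}

// ElevenLabs options exist only in the "ready" state.
function dropdownValues(state: VoicesState): string[] {
  if (state.status !== "ready") return [];
  return state.voices.map((v) => `elevenlabs:${v.voice_id}`);
}
```

Because options are derived from a single state value, the dropdown cannot render from a half-updated voice list.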
## Quick Test Commands
### Check LocalStorage Directly
```javascript
// In browser console (guards against a missing or empty store)
const raw = localStorage.getItem('eve-settings')
const settings = raw ? JSON.parse(raw) : null
console.log('TTS Voice:', settings?.state?.ttsVoice)
```
### Check ElevenLabs Client Directly
```javascript
// After opening settings with API key
const client = window.__elevenLabsClient // If we expose it
```
## What to Share
Please copy and share:
1. **All 🎤 logs** (ElevenLabs API Response)
2. **All 🔍 logs** (Voice processing)
3. **All 🎵 logs** (Final processed voices)
4. **All 📋 logs** (Dropdown options)
5. **The full first voice object** from the API
## Expected Good Output
```javascript
🎤 First voice object: {
  voice_id: "6WkvZo1vwba1zF4N2vlY",
  name: "Rachel",
  category: "premade",
  labels: {...}
}
🔍 Processing voice: {
  name: "Rachel",
  voice_id: "6WkvZo1vwba1zF4N2vlY",
  voiceId: "6WkvZo1vwba1zF4N2vlY",
  id: "6WkvZo1vwba1zF4N2vlY",
  finalVoiceId: "6WkvZo1vwba1zF4N2vlY",
  allKeys: [...]
}
🎵 All voice IDs: ["Rachel: 6WkvZo1vwba1zF4N2vlY", ...]
📋 Sample ElevenLabs dropdown option: {
  name: "Rachel",
  voice_id: "6WkvZo1vwba1zF4N2vlY",
  optionValue: "elevenlabs:6WkvZo1vwba1zF4N2vlY"
}
```
## Next Steps Based on Results
**If voice_id is undefined:**
→ Check the `allKeys` array to find the correct property name
**If voice_id exists but dropdown shows undefined:**
→ State/rendering issue, check React component re-render
**If API returns empty:**
→ API key or permissions issue
**If API returns different structure:**
→ Need to adjust response parsing
---
**Status**: Waiting for console output
**Action**: Open Settings and check console logs


@@ -0,0 +1,258 @@
# ElevenLabs TTS Models
## Overview
EVE uses **ElevenLabs Turbo v2.5** by default for text-to-speech. This model is specifically optimized for real-time conversational AI.
## ⚠️ Important: V3 Alpha Not Recommended for EVE
According to [ElevenLabs documentation](https://elevenlabs.io/docs/models#eleven-v3-alpha):
> **"Eleven v3 is not made for real-time applications like Agents Platform."**
While V3 offers the highest quality, it is:
- **Not optimized for real-time** conversation
- **Higher latency** - Slower response times
- **Requires multiple generations** - Need to generate several versions and pick the best
- **Best for**: Audiobooks, character discussions, pre-recorded content
## Current Default Model
**Default**: `eleven_turbo_v2_5`
This model is optimized for EVE and provides:
- ✅ Fast generation speed
- ✅ High-quality natural voices
- ✅ Low latency for real-time conversation (~100-300ms)
- ✅ Cost-effective
- ✅ Multilingual support
- **Recommended by ElevenLabs for conversational AI**
## Available Models
ElevenLabs offers several models you can use:
### Turbo Models (Recommended)
**`eleven_turbo_v2_5`** (Current Default)
- Latest turbo model
- Excellent quality with fast generation
- Best for conversational AI
- Low latency
**`eleven_turbo_v2`**
- Previous turbo version
- Still high quality
- Slightly older technology
### Multilingual Models
**`eleven_multilingual_v2`**
- Supports 29+ languages
- High quality across languages
- Slower than turbo but more versatile
**`eleven_multilingual_v1`**
- Original multilingual model
- Stable and reliable
- Good for non-English content
### Monolingual Models
**`eleven_monolingual_v1`**
- English only
- High quality
- Original ElevenLabs model
- More expensive than turbo
### Flash Models
**`eleven_flash_v2_5`**
- Ultra-fast generation
- Lowest latency
- Good quality
- Best for real-time applications
**`eleven_flash_v2`**
- Previous flash version
- Very fast
- Lower cost
## Changing the Model
The model is configurable in the settings store:
```typescript
// In settingsStore.ts
ttsModel: 'eleven_turbo_v2_5' // Default
```
To change:
```typescript
setTtsModel('eleven_flash_v2_5') // For lower latency
setTtsModel('eleven_multilingual_v2') // For better multilingual support
```
## Model Characteristics
### Speed Comparison
1. **Flash** - Fastest (< 300ms)
2. **Turbo** - Very Fast (< 500ms)
3. **Multilingual** - Fast (< 1s)
4. **Monolingual** - Standard (1-2s)
### Quality Comparison
1. **Monolingual** - Highest quality
2. **Turbo v2.5** - Excellent quality
3. **Multilingual v2** - Great quality
4. **Flash** - Good quality
### Cost Comparison
1. **Flash** - Most economical
2. **Turbo** - Cost-effective
3. **Multilingual** - Standard pricing
4. **Monolingual** - Premium pricing
## Recommended Use Cases
### Real-Time Conversation (Default)
```
Model: eleven_turbo_v2_5
Speed: 1.0x
Stability: 50%
Clarity: 75%
```
Best balance for EVE assistant
### Ultra-Low Latency
```
Model: eleven_flash_v2_5
Speed: 1.0x
Stability: 60%
Clarity: 80%
```
For instant responses
### Maximum Quality
```
Model: eleven_monolingual_v1
Speed: 1.0x
Stability: 70%
Clarity: 85%
```
For professional content
### Multilingual
```
Model: eleven_multilingual_v2
Speed: 1.0x
Stability: 55%
Clarity: 75%
```
For non-English languages
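The presets above could be kept as data and converted into request options. This is a minimal sketch: the preset keys and the helper `toRequestOptions` are hypothetical, and mapping "Clarity" to `similarity_boost` is an assumption based on the 0.75 value in the API call shown under Technical Details. Speed is left out because every preset uses 1.0x.

```typescript
interface Preset {
  model: string;
  stabilityPct: number; // "Stability" as shown in the presets, 0-100
  clarityPct: number;   // "Clarity" as shown in the presets, 0-100
}

const PRESETS = {
  realtime:     { model: "eleven_turbo_v2_5",      stabilityPct: 50, clarityPct: 75 },
  lowLatency:   { model: "eleven_flash_v2_5",      stabilityPct: 60, clarityPct: 80 },
  maxQuality:   { model: "eleven_monolingual_v1",  stabilityPct: 70, clarityPct: 85 },
  multilingual: { model: "eleven_multilingual_v2", stabilityPct: 55, clarityPct: 75 },
};

// Converts UI percentages into the 0..1 range used by voice_settings.
function toRequestOptions(preset: Preset) {
  return {
    model_id: preset.model,
    voice_settings: {
      stability: preset.stabilityPct / 100,
      similarity_boost: preset.clarityPct / 100, // assumed mapping for "Clarity"
    },
  };
}
```

For example, `toRequestOptions(PRESETS.realtime)` yields `stability: 0.5` and `similarity_boost: 0.75`, matching the default API call.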
## Technical Details
### API Call
```typescript
await client.textToSpeech.convert(voiceId, {
  text: "Hello, how can I help you?",
  model_id: "eleven_turbo_v2_5",
  voice_settings: {
    stability: 0.5,
    similarity_boost: 0.75,
    style: 0.0,
    use_speaker_boost: true
  }
})
```
### Model Selection Flow
1. User sends message
2. EVE responds
3. User clicks 🔊 speaker icon
4. TTSControls reads `ttsModel` from settings
5. Passes to TTS Manager
6. TTS Manager calls ElevenLabs with model ID
7. Audio generated and played
### Fallback Behavior
If ElevenLabs model fails or is unavailable:
- Falls back to Browser Web Speech API
- Logs warning in console
- Continues with free browser TTS
## Future Enhancements
### Planned Features
- **Model selector in UI** - Dropdown to choose model in Settings
- **Auto-detect best model** - Based on language and use case
- **Model presets** - Quick selection for different scenarios
- **Cost tracking** - Show estimated cost per request
- **Quality metrics** - User feedback on voice quality
### Potential Models
As ElevenLabs releases new models, EVE can be updated. The IDs below are speculative placeholders, not released models:
- `eleven_turbo_v3` - Next generation turbo
- `eleven_flash_v3` - Even faster flash model
- `eleven_multilingual_v3` - Improved multilingual
- Specialized models for specific use cases
## Troubleshooting
### Audio Not Playing
- Check that ElevenLabs API key is valid
- Verify model ID is correct
- Check console for error messages
- Try switching to `eleven_turbo_v2` if v2.5 fails
### Poor Quality
- Try `eleven_monolingual_v1` for better quality
- Adjust stability and clarity settings
- Check voice selection
- Ensure text is well-formatted
### Slow Generation
- Switch to `eleven_flash_v2_5` for speed
- Reduce text length
- Check network connection
- Verify API quota not exceeded
### Model Not Found Error
```
Error: Model 'eleven_turbo_v3' not found
```
- Model ID may be incorrect
- Model might not be available on your plan
- Fall back to `eleven_turbo_v2_5`
- Check ElevenLabs documentation
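
The fallback to `eleven_turbo_v2_5` can be applied before the request is sent. The sketch below validates against the model IDs listed in this document. The helper name `resolveModelId` is hypothetical, and a hard-coded list is an assumption that goes stale as models change.

```typescript
const DEFAULT_MODEL = "eleven_turbo_v2_5";

// Model IDs taken from the "Available Models" section of this document.
const KNOWN_MODELS = [
  "eleven_turbo_v2_5",
  "eleven_turbo_v2",
  "eleven_multilingual_v2",
  "eleven_multilingual_v1",
  "eleven_monolingual_v1",
  "eleven_flash_v2_5",
  "eleven_flash_v2",
];

// Returns the requested model if it is known, otherwise the default
// together with a warning the caller can log.
function resolveModelId(
  requested: string | undefined
): { modelId: string; warning?: string } {
  if (requested !== undefined && KNOWN_MODELS.indexOf(requested) !== -1) {
    return { modelId: requested };
  }
  return {
    modelId: DEFAULT_MODEL,
    warning: `Unknown model '${requested}', using ${DEFAULT_MODEL}`,
  };
}
```

A stricter variant would fetch the available models from the account rather than rely on a static list.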
## Model Changelog
### v2.5 Models (Current)
- Released: 2024
- Improvements: Better quality, faster generation
- Models: `eleven_turbo_v2_5`, `eleven_flash_v2_5`
### v2 Models
- Released: 2023
- Improvements: Multilingual support, reduced latency
- Models: `eleven_turbo_v2`, `eleven_flash_v2`, `eleven_multilingual_v2`
### v1 Models (Legacy)
- Released: 2022-2023
- Original high-quality models
- Models: `eleven_monolingual_v1`, `eleven_multilingual_v1`
## References
- [ElevenLabs Text-to-Speech API Reference](https://elevenlabs.io/docs/api-reference/text-to-speech)
- [ElevenLabs Models Documentation](https://elevenlabs.io/docs/models)
- [Pricing Information](https://elevenlabs.io/pricing)
---
**Current Default**: `eleven_turbo_v2_5`
**Status**: Configured
**Version**: v0.2.0-rc
**Date**: October 5, 2025


@@ -0,0 +1,225 @@
# ElevenLabs Voice Integration
## Overview
EVE now automatically fetches and displays **all available ElevenLabs voices** from your account when you configure your API key.
## Features
### Automatic Voice Discovery
- Fetches complete voice list from ElevenLabs API
- Updates automatically when API key is configured
- Shows loading state while fetching
- Graceful error handling if API fails
### Voice Details
Each voice includes:
- **Name** - The voice's display name
- **Voice ID** - Unique identifier
- **Category** - Voice category (premade, cloned, etc.)
- **Labels** - Metadata including:
  - Accent (e.g., "American", "British")
  - Age (e.g., "young", "middle-aged")
  - Gender (e.g., "male", "female")
  - Use case (e.g., "narration", "conversational")
- **Description** - Voice description
- **Preview URL** - Audio preview (future feature)
### Voice Selection UI
**Grouped Categories**:
1. **ElevenLabs Voices (Premium)** - All your ElevenLabs voices with rich details
2. **Browser Voices (Free)** - System text-to-speech voices
**Display Format**:
```text
Rachel - American (young)
Adam - American (middle-aged)
Antoni - British (young)
```
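The display format above could be produced by a small formatter. This is a sketch rather than the actual `SettingsPanel` code: the `VoiceSummary` type and the function name `formatVoiceLabel` are assumptions, and missing labels are simply omitted.

```typescript
interface VoiceSummary {
  name: string;
  labels?: { accent?: string; age?: string };
}

// Builds "Name - Accent (age)", skipping any part that is missing.
function formatVoiceLabel(voice: VoiceSummary): string {
  const accent = voice.labels ? voice.labels.accent : undefined;
  const age = voice.labels ? voice.labels.age : undefined;
  let label = voice.name;
  if (accent) label += ` - ${accent}`;
  if (age) label += ` (${age})`;
  return label;
}
```

Voices without labels, such as some cloned voices, fall back to the bare name.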
### Automatic Provider Detection
The system automatically detects which provider to use based on voice selection:
- Voice IDs prefixed with `elevenlabs:` → ElevenLabs TTS
- Voice IDs prefixed with `browser:` → Browser TTS
- `default` → Browser TTS fallback
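
The prefix rules above can be sketched as a single parsing function. The name `parseVoiceSelection` and the return shape are hypothetical. Treating a literal `undefined` ID as missing is an extra safeguard assumed here, so a stored `elevenlabs:undefined` falls back to browser TTS instead of reaching the API.

```typescript
type Provider = "elevenlabs" | "browser";

interface VoiceSelection {
  provider: Provider;
  voiceId: string | null; // null means "use the provider's default voice"
}

function parseVoiceSelection(value: string): VoiceSelection {
  const separator = value.indexOf(":");
  if (separator > 0) {
    const prefix = value.slice(0, separator);
    const id = value.slice(separator + 1);
    // An empty ID or the string "undefined" counts as missing.
    const valid = id.length > 0 && id !== "undefined";
    if (prefix === "elevenlabs" && valid) {
      return { provider: "elevenlabs", voiceId: id };
    }
    if (prefix === "browser" && valid) {
      return { provider: "browser", voiceId: id };
    }
  }
  // "default", unknown prefixes, and broken IDs use the browser fallback.
  return { provider: "browser", voiceId: null };
}
```

Only the first colon is treated as the separator, so browser voice names that contain a colon are preserved intact.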
## How It Works
### 1. API Key Configuration
When you enter your ElevenLabs API key in Settings:
1. API key is saved to settings store
2. `useEffect` hook triggers voice fetching
3. Loading state is shown
4. Voices are fetched from ElevenLabs API
5. Voices populate the dropdown
### 2. Voice Selection
1. User selects a voice from dropdown
2. Voice ID is saved with provider prefix (e.g., `elevenlabs:21m00Tcm4TlvDq8ikWAM`)
3. The prefixed voice ID is stored in settings
### 3. Playback
1. User clicks speaker icon on message
2. TTS manager parses voice ID prefix
3. Correct provider is initialized
4. Audio is generated and played
## Code Architecture
### Components
- **SettingsPanel** - Fetches and displays voices
- **TTSControls** - Initializes client and plays audio
### Libraries
- **elevenlabs.ts** - ElevenLabs API client with `getVoices()` method
- **tts.ts** - TTS manager with automatic provider detection
### Data Flow
```text
Settings Panel
    ↓
[ElevenLabs API Key Entered]
    ↓
useEffect Hook Triggered
    ↓
getElevenLabsClient(apiKey)
    ↓
client.getVoices()
    ↓
ElevenLabs API
    ↓
Voice List Returned
    ↓
Populate Dropdown
    ↓
User Selects Voice
    ↓
Save with Prefix (elevenlabs:VOICE_ID)
    ↓
TTSControls Plays Message
    ↓
Parse Prefix → Use ElevenLabs
    ↓
Audio Playback
```
## API Response Example
```typescript
{
voices: [
{
voice_id: "21m00Tcm4TlvDq8ikWAM",
name: "Rachel",
category: "premade",
labels: {
accent: "American",
age: "young",
gender: "female",
use_case: "narration"
},
description: "A calm and professional female voice",
preview_url: "https://..."
},
// ... more voices
]
}
```
## Error Handling
### No API Key
- Shows: "Add ElevenLabs API key above to access premium voices"
- Falls back to browser voices
### Invalid API Key
- Shows: "Failed to load ElevenLabs voices. Check your API key."
- Error message in red text
- Falls back to browser voices
### Network Error
- Logs error to console
- Shows user-friendly error message
- Maintains browser voices as fallback
## Future Enhancements
### Voice Preview
- Click to hear voice sample before selecting
- Uses `preview_url` from API response
### Voice Filtering
- Filter by accent
- Filter by age
- Filter by gender
- Filter by use case
### Custom Voice Upload
- Support for cloned voices
- Voice cloning interface
### Voice Settings per Character
- Different voices for different AI personalities
- Character-specific voice preferences
## Testing
### To Test Voice Fetching
1. Open Settings
2. Enter valid ElevenLabs API key
3. Enable TTS
4. Wait for "Loading voices..." to complete
5. Open TTS Voice Selection dropdown
6. Verify ElevenLabs voices appear with details
### To Test Voice Playback
1. Select an ElevenLabs voice
2. Save settings
3. Send a message to EVE
4. Click speaker icon on response
5. Verify audio plays with selected voice
### To Test Fallback
1. Select an ElevenLabs voice
2. Remove API key
3. Click speaker icon
4. Verify fallback to browser TTS with warning message
## Benefits
- **No Manual Configuration** - Voices auto-populate
- **Always Up-to-Date** - Gets latest voices from your account
- **Rich Information** - See voice details before selecting
- **Smart Fallback** - Gracefully handles errors
- **User-Friendly** - Clear feedback at every step
- **Flexible** - Mix ElevenLabs and browser voices
---
**Implementation Complete**: October 5, 2025
**Status**: Production Ready ✅