# ElevenLabs Voice Integration

## Overview

EVE now automatically fetches and displays **all available ElevenLabs voices** from your account when you configure your API key.

## Features

### Automatic Voice Discovery

- Fetches the complete voice list from the ElevenLabs API
- Refreshes automatically whenever the API key is configured
- Shows a loading state while fetching
- Handles API failures gracefully

### Voice Details

Each voice includes:

- **Name** - The voice's display name
- **Voice ID** - Unique identifier
- **Category** - Voice category (premade, cloned, etc.)
- **Labels** - Metadata including:
  - Accent (e.g., "American", "British")
  - Age (e.g., "young", "middle-aged")
  - Gender (e.g., "male", "female")
  - Use case (e.g., "narration", "conversational")
- **Description** - Voice description
- **Preview URL** - Audio preview (future feature)

### Voice Selection UI

**Grouped Categories**:

1. **ElevenLabs Voices (Premium)** - All your ElevenLabs voices with rich details
2. **Browser Voices (Free)** - System text-to-speech voices

**Display Format**:

```text
Rachel - American (young)
Adam - American (middle-aged)
Antoni - British (young)
```

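
As a rough illustration of the display format above, a dropdown label could be assembled from the voice name plus its `accent` and `age` labels. This is a minimal sketch, not EVE's actual code: `VoiceSummary` and `formatVoiceLabel` are hypothetical names, and the way missing labels are handled is an assumption.

```typescript
// Hypothetical shape: only the fields needed to build a dropdown label.
interface VoiceSummary {
  name: string;
  labels?: { accent?: string; age?: string };
}

// Builds "Name - Accent (age)", dropping whichever label parts are missing.
function formatVoiceLabel(voice: VoiceSummary): string {
  const accent = voice.labels?.accent;
  const age = voice.labels?.age;
  const detail = [accent, age ? `(${age})` : undefined]
    .filter((part): part is string => Boolean(part))
    .join(" ");
  return detail ? `${voice.name} - ${detail}` : voice.name;
}

console.log(
  formatVoiceLabel({ name: "Rachel", labels: { accent: "American", age: "young" } }),
);
// prints "Rachel - American (young)"
```

A voice with no labels falls back to its plain name, so the dropdown never shows a dangling separator.
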
### Automatic Provider Detection

The system chooses the TTS provider from the prefix of the selected voice ID:

- Voice IDs prefixed with `elevenlabs:` → ElevenLabs TTS
- Voice IDs prefixed with `browser:` → Browser TTS
- `default` → Browser TTS fallback

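
The prefix convention above can be sketched as a pair of helpers: one that adds the prefix before saving, and one that splits it off again before playback. The names `toStoredVoiceId` and `parseVoiceId` are hypothetical, and treating unrecognized values like `default` is an assumption rather than documented behavior.

```typescript
type Provider = "elevenlabs" | "browser";

interface ParsedVoice {
  provider: Provider;
  voiceId: string | null; // null means "let the provider pick its default voice"
}

// Adds the provider prefix before the ID is saved to settings.
function toStoredVoiceId(provider: Provider, rawId: string): string {
  return `${provider}:${rawId}`;
}

// Splits a stored value back into provider and raw voice ID.
function parseVoiceId(stored: string): ParsedVoice {
  if (stored.startsWith("elevenlabs:")) {
    return { provider: "elevenlabs", voiceId: stored.slice("elevenlabs:".length) };
  }
  if (stored.startsWith("browser:")) {
    return { provider: "browser", voiceId: stored.slice("browser:".length) };
  }
  // "default" (and, as an assumption, anything unrecognized) uses browser TTS.
  return { provider: "browser", voiceId: null };
}

console.log(parseVoiceId(toStoredVoiceId("elevenlabs", "21m00Tcm4TlvDq8ikWAM")));
```

Keeping the provider inside the stored string means a single settings field is enough to decide both which engine to use and which voice to request.
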
## How It Works

### 1. API Key Configuration

When you enter your ElevenLabs API key in Settings:

1. The API key is saved to the settings store
2. A `useEffect` hook triggers voice fetching
3. A loading state is shown
4. Voices are fetched from the ElevenLabs API
5. The fetched voices populate the dropdown

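
The fetch in step 4 might look roughly like the sketch below. It assumes the public ElevenLabs voice-list endpoint (`GET https://api.elevenlabs.io/v1/voices`, authenticated with an `xi-api-key` header); `fetchVoices` and `FetchLike` are hypothetical names and do not describe the project's actual `getVoices()` implementation. The fetch function is passed in so the sketch can be exercised with a stub.

```typescript
interface ElevenLabsVoice {
  voice_id: string;
  name: string;
  category?: string;
}

// Minimal slice of the fetch API this sketch relies on, so a stub can be injected.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function fetchVoices(
  apiKey: string,
  fetchImpl: FetchLike,
): Promise<ElevenLabsVoice[]> {
  const response = await fetchImpl("https://api.elevenlabs.io/v1/voices", {
    headers: { "xi-api-key": apiKey },
  });
  if (!response.ok) {
    // A 401 here usually points at a bad key; the caller decides what to show.
    throw new Error(`ElevenLabs voice request failed with status ${response.status}`);
  }
  const body = (await response.json()) as { voices?: ElevenLabsVoice[] };
  return body.voices ?? [];
}
```

In a browser or a recent Node.js runtime, the global `fetch` can be passed as `fetchImpl`. Inside a React component, the call would typically sit in the `useEffect` hook keyed on the API key, with the loading flag set before the call and cleared afterwards.
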
### 2. Voice Selection

1. The user selects a voice from the dropdown
2. The voice ID is combined with its provider prefix (e.g., `elevenlabs:21m00Tcm4TlvDq8ikWAM`)
3. The prefixed voice ID is stored in settings

### 3. Playback

1. The user clicks the speaker icon on a message
2. The TTS manager parses the voice ID prefix
3. The matching provider is initialized
4. Audio is generated and played

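
The playback steps can be sketched as a small dispatcher. Everything here is illustrative: `TTSProvider`, `playMessage`, and the `providers` map are hypothetical, and the rule that a missing ElevenLabs client falls back to the browser is an assumption based on the fallback behavior described elsewhere in this document.

```typescript
type ProviderName = "elevenlabs" | "browser";

// Hypothetical provider interface; the real TTS manager may look different.
interface TTSProvider {
  speak(text: string, voiceId: string | null): Promise<void>;
}

// Returns the name of the provider that actually spoke the text.
async function playMessage(
  text: string,
  storedVoiceId: string,
  providers: Record<ProviderName, TTSProvider | undefined>,
): Promise<ProviderName> {
  const separator = storedVoiceId.indexOf(":");
  const prefix = separator === -1 ? "" : storedVoiceId.slice(0, separator);
  const rawId = separator === -1 ? null : storedVoiceId.slice(separator + 1);

  // Use ElevenLabs only when the prefix asks for it and a client is available
  // (for example, when an API key is configured).
  const elevenlabs = providers.elevenlabs;
  if (prefix === "elevenlabs" && elevenlabs) {
    await elevenlabs.speak(text, rawId);
    return "elevenlabs";
  }

  // Everything else, including "default", goes to browser TTS.
  const browser = providers.browser;
  if (!browser) throw new Error("No TTS provider available");
  await browser.speak(text, prefix === "browser" ? rawId : null);
  return "browser";
}
```

Returning the provider name lets the caller show a warning when the selected ElevenLabs voice could not be used and browser TTS stepped in instead.
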
## Code Architecture

### Components

- **SettingsPanel** - Fetches and displays voices
- **TTSControls** - Initializes client and plays audio

### Libraries

- **elevenlabs.ts** - ElevenLabs API client with `getVoices()` method
- **tts.ts** - TTS manager with automatic provider detection

### Data Flow

```text
Settings Panel
    ↓
[ElevenLabs API Key Entered]
    ↓
useEffect Hook Triggered
    ↓
getElevenLabsClient(apiKey)
    ↓
client.getVoices()
    ↓
ElevenLabs API
    ↓
Voice List Returned
    ↓
Populate Dropdown
    ↓
User Selects Voice
    ↓
Save with Prefix (elevenlabs:VOICE_ID)
    ↓
TTSControls Plays Message
    ↓
Parse Prefix → Use ElevenLabs
    ↓
Audio Playback
```

## API Response Example

```typescript
// Abridged example of the voice list response
const exampleResponse = {
  voices: [
    {
      voice_id: "21m00Tcm4TlvDq8ikWAM",
      name: "Rachel",
      category: "premade",
      labels: {
        accent: "American",
        age: "young",
        gender: "female",
        use_case: "narration"
      },
      description: "A calm and professional female voice",
      preview_url: "https://..."
    },
    // ... more voices
  ]
};
```

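
To make the response shape easier to work with, it could be described with types and a small runtime check along these lines. The interface names and `isVoiceListResponse` are hypothetical; the field list mirrors the example above, and marking everything except `voice_id` and `name` as optional is an assumption, not a statement about what the API guarantees.

```typescript
interface VoiceLabels {
  accent?: string;
  age?: string;
  gender?: string;
  use_case?: string;
}

interface ElevenLabsVoice {
  voice_id: string;
  name: string;
  category?: string;
  labels?: VoiceLabels;
  description?: string;
  preview_url?: string;
}

interface VoiceListResponse {
  voices: ElevenLabsVoice[];
}

// Runtime check that narrows unknown JSON to VoiceListResponse.
// Only the fields the dropdown cannot work without are validated.
function isVoiceListResponse(value: unknown): value is VoiceListResponse {
  if (typeof value !== "object" || value === null) return false;
  const voices = (value as { voices?: unknown }).voices;
  return (
    Array.isArray(voices) &&
    voices.every(
      (v) =>
        typeof v === "object" &&
        v !== null &&
        typeof (v as { voice_id?: unknown }).voice_id === "string" &&
        typeof (v as { name?: unknown }).name === "string",
    )
  );
}
```

Checking the payload before populating the dropdown keeps a malformed or unexpected response from surfacing as a rendering error later on.
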
## Error Handling

### No API Key

- Shows: "Add ElevenLabs API key above to access premium voices"
- Falls back to browser voices

### Invalid API Key

- Shows: "Failed to load ElevenLabs voices. Check your API key."
- Displays the error message in red text
- Falls back to browser voices

### Network Error

- Logs the error to the console
- Shows a user-friendly error message
- Keeps browser voices available as a fallback

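
One way to keep these cases consistent is to map each outcome of the voice fetch to a single status object that the UI renders. This is a sketch under assumptions: `VoiceLoadResult`, `VoiceStatus`, and `describeVoiceStatus` are hypothetical names, and network errors reuse the invalid-key message here because the document does not quote a separate one.

```typescript
type VoiceLoadResult =
  | { kind: "no-key" }
  | { kind: "failed"; error: unknown }
  | { kind: "loaded"; voiceCount: number };

interface VoiceStatus {
  message: string | null;
  isError: boolean; // true when the UI should render the message in red
  elevenLabsAvailable: boolean; // browser voices are listed either way
}

function describeVoiceStatus(result: VoiceLoadResult): VoiceStatus {
  switch (result.kind) {
    case "no-key":
      return {
        message: "Add ElevenLabs API key above to access premium voices",
        isError: false,
        elevenLabsAvailable: false,
      };
    case "failed":
      // Keep the technical detail in the console, not in the UI.
      console.error("Failed to load ElevenLabs voices:", result.error);
      return {
        message: "Failed to load ElevenLabs voices. Check your API key.",
        isError: true,
        elevenLabsAvailable: false,
      };
    case "loaded":
      return { message: null, isError: false, elevenLabsAvailable: true };
  }
}
```

Because every branch leaves browser voices untouched, the dropdown always has something to offer even when the ElevenLabs request fails.
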
## Future Enhancements

### Voice Preview

- Click to hear voice sample before selecting
- Uses `preview_url` from API response

### Voice Filtering

- Filter by accent
- Filter by age
- Filter by gender
- Filter by use case

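
Filtering is not implemented yet, but since all four criteria live in the `labels` object, one possible shape is a single helper that matches any combination of them. `filterVoices`, `VoiceFilter`, and the case-insensitive comparison are assumptions made for this sketch only.

```typescript
const LABEL_KEYS = ["accent", "age", "gender", "use_case"] as const;
type LabelKey = (typeof LABEL_KEYS)[number];
type VoiceFilter = Partial<Record<LabelKey, string>>;

interface FilterableVoice {
  name: string;
  labels?: Partial<Record<LabelKey, string>>;
}

// Keeps voices whose labels match every criterion that is set (case-insensitive).
function filterVoices<T extends FilterableVoice>(voices: T[], filter: VoiceFilter): T[] {
  return voices.filter((voice) =>
    LABEL_KEYS.every((key) => {
      const wanted = filter[key];
      if (wanted === undefined) return true; // criterion not set
      return voice.labels?.[key]?.toLowerCase() === wanted.toLowerCase();
    }),
  );
}

const voices = [
  { name: "Rachel", labels: { accent: "American", age: "young" } },
  { name: "Adam", labels: { accent: "American", age: "middle-aged" } },
  { name: "Antoni", labels: { accent: "British", age: "young" } },
];
console.log(filterVoices(voices, { accent: "american", age: "young" }).map((v) => v.name));
// prints [ 'Rachel' ]
```
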
### Custom Voice Upload

- Support for cloned voices
- Voice cloning interface

### Voice Settings per Character

- Different voices for different AI personalities
- Character-specific voice preferences

## Testing

### To Test Voice Fetching

1. Open Settings
2. Enter a valid ElevenLabs API key
3. Enable TTS
4. Wait for "Loading voices..." to complete
5. Open the TTS Voice Selection dropdown
6. Verify that ElevenLabs voices appear with their details

### To Test Voice Playback

1. Select an ElevenLabs voice
2. Save settings
3. Send a message to EVE
4. Click the speaker icon on the response
5. Verify that audio plays with the selected voice

### To Test Fallback

1. Select an ElevenLabs voice
2. Remove the API key
3. Click the speaker icon
4. Verify that playback falls back to browser TTS and a warning message is shown

## Benefits

- ✅ **No Manual Configuration** - Voices auto-populate
- ✅ **Always Up-to-Date** - Gets latest voices from your account
- ✅ **Rich Information** - See voice details before selecting
- ✅ **Smart Fallback** - Gracefully handles errors
- ✅ **User-Friendly** - Clear feedback at every step
- ✅ **Flexible** - Mix ElevenLabs and browser voices

---

**Implementation Complete**: October 5, 2025

**Status**: Production Ready ✅