Phase 2 complete.

Aodhan Collins
2025-10-06 21:08:25 +01:00
parent 66749a5ce7
commit 8d6a681baa
26 changed files with 3163 additions and 164 deletions

README.md (105 lines changed)

@@ -2,12 +2,12 @@
A sophisticated local-first desktop AI assistant with modular personality system, multi-model support, and seamless integration with your development environment.
-> **Current Version**: 0.1.0
-> **Status**: ✅ Phase 1 Complete - Core functionality stable and ready to use
+> **Current Version**: 0.2.0
+> **Status**: ✅ Phase 2 Complete - Enhanced multimodal assistant with voice, files, and system integration
## ✨ Features
-### ✅ Implemented (v0.1.0)
+### ✅ Implemented (v0.2.0)
- **🤖 Multi-Model AI Chat**
- Full-featured chat interface with conversation history
@@ -33,13 +33,43 @@ A sophisticated local-first desktop AI assistant with modular personality system
- Conversation management (clear history)
- Persistent settings across sessions
+- **🗣️ Voice Integration (NEW in v0.2.0)**
+- Text-to-Speech with ElevenLabs API and browser fallback
+- Speech-to-Text with Web Speech API (25+ languages)
+- Audio conversation mode for hands-free interaction
+- Per-message voice controls
+- **📁 File Attachment Support (NEW in v0.2.0)**
+- Drag & drop file upload
+- Support for images, PDFs, text files, and code
+- Preview thumbnails and content
+- AI can analyze and discuss attached files
+- **💾 Conversation Management (NEW in v0.2.0)**
+- Save and load conversations
+- Export to Markdown, JSON, or TXT
+- Search and filter saved conversations
+- Tag and organize conversation history
+- **🎨 Advanced Formatting (NEW in v0.2.0)**
+- Markdown with GitHub Flavored Markdown
+- Syntax highlighting for code blocks
+- LaTeX math equation rendering
+- Mermaid diagrams for flowcharts
+- **🖥️ System Integration (NEW in v0.2.0)**
+- System tray icon for quick access
+- Global keyboard shortcut (Ctrl+Shift+E / Cmd+Shift+E)
+- Desktop notifications for responses
+- Minimize to tray functionality
### 🚧 Planned Features
See [Roadmap](./docs/planning/ROADMAP.md) for the complete development plan:
-- **Phase 2**: Voice integration (TTS/STT), file attachments, advanced formatting
-- **Phase 3**: Knowledge base, long-term memory, multi-modal capabilities
+- **Phase 3** (Next): Knowledge base, long-term memory, multi-modal capabilities
- **Phase 4**: Developer tools, plugin system, multi-device sync
- **Long-term**: Avatar system, screen/audio monitoring, gaming integration
## Tech Stack
@@ -114,6 +144,42 @@ xcode-select --install
- Install [Microsoft C++ Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/)
- Install [WebView2](https://developer.microsoft.com/en-us/microsoft-edge/webview2/)
+## Quick Start
+### Option 1: Automated Setup (Recommended)
+Use the setup script to automatically install dependencies:
+**Linux/macOS:**
+```bash
+./setup.sh
+```
+**Windows:**
+```powershell
+.\setup.ps1
+```
+The script will guide you through installing Node.js, Rust, and system dependencies.
+### Option 2: Development Scripts
+Once setup is complete, use these convenient scripts:
+**Start the app:**
+```bash
+./run.sh # Linux/macOS
+.\run.ps1 # Windows
+```
+**Stop the app:**
+```bash
+./kill.sh # Linux/macOS
+.\kill.ps1 # Windows
+```
+See [SCRIPTS.md](./docs/SCRIPTS.md) for detailed script documentation.
## Getting Started
### 1. Install Dependencies
@@ -250,15 +316,26 @@ All core features are implemented and stable:
- ✅ Linux graphics compatibility fixes
- ✅ Clean, modern UI with dark mode
-### 🚀 Next: Phase 2 - Enhanced Capabilities (v0.2.0)
+### Phase 2 Complete - Enhanced Capabilities (v0.2.0)
+All Phase 2 features are production-ready:
+- ✅ Voice integration (TTS with ElevenLabs, STT with Web Speech API)
+- ✅ File attachment support (images, PDFs, code, text files)
+- ✅ Advanced message formatting (syntax highlighting, LaTeX, Mermaid diagrams)
+- ✅ System integration (global shortcuts, system tray, notifications)
+- ✅ Conversation export and management (Markdown, JSON, TXT)
+- ✅ Audio conversation mode for hands-free interaction
+### 🚀 Next: Phase 3 - Knowledge Base & Memory (v0.3.0)
Planned features:
-- Voice integration (TTS with ElevenLabs, STT)
-- File attachment support
-- Advanced message formatting (code highlighting, LaTeX, diagrams)
-- System integration (keyboard shortcuts, tray icon)
-- Conversation export and management
+- Long-term memory with vector database
+- Semantic search across conversations
+- Personal knowledge graph
+- Document library integration
+- Multi-modal capabilities (vision, image generation)
See [Roadmap](./docs/planning/ROADMAP.md) for the complete development plan.
@@ -317,8 +394,8 @@ This is currently a personal project, but contributions, suggestions, and feedba
---
-**Version**: 0.1.0
-**Status**: ✅ Stable - Ready for use
-**Last Updated**: October 5, 2025
+**Version**: 0.2.0
+**Status**: ✅ Production Ready - Full-featured multimodal AI assistant
+**Last Updated**: October 6, 2025
For detailed changes, see [Changelog](./docs/releases/CHANGELOG.md)
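
The README changes above list conversation export to Markdown, JSON, or TXT, but the diff does not show how the export works. As a rough illustration only, one way to serialize a conversation into those three formats might look like the sketch below. The `ChatMessage`, `Conversation`, and `exportConversation` names and the exact output layout are assumptions made for this example, not the project's actual types or API.

```typescript
// Hypothetical data shapes; the project's real types are not shown in this commit.
type Role = "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface Conversation {
  title: string;
  messages: ChatMessage[];
}

type ExportFormat = "markdown" | "json" | "txt";

// Serialize a conversation into one of the three export formats.
function exportConversation(conv: Conversation, format: ExportFormat): string {
  if (format === "json") {
    // Pretty-printed so the exported file stays human-readable.
    return JSON.stringify(conv, null, 2);
  }

  if (format === "markdown") {
    // One bold speaker label per message, separated by horizontal rules.
    const body = conv.messages
      .map((m) => `**${m.role === "user" ? "User" : "Assistant"}:**\n\n${m.content}`)
      .join("\n\n---\n\n");
    return `# ${conv.title}\n\n${body}\n`;
  }

  // Plain text: one "ROLE: content" line per message.
  return conv.messages
    .map((m) => `${m.role.toUpperCase()}: ${m.content}`)
    .join("\n");
}

// Example usage with a made-up conversation.
const sampleConversation: Conversation = {
  title: "Demo",
  messages: [
    { role: "user", content: "Hi" },
    { role: "assistant", content: "Hello" },
  ],
};
console.log(exportConversation(sampleConversation, "txt"));
```

Keeping the serializer a pure function of the conversation data means the same code could back a "save to file" dialog or a clipboard copy, whichever the application actually uses.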