Phase 3 - Knowledge Base & Memory (v0.3.0)
Target Version: v0.3.0
Estimated Duration: 24-30 hours
Priority: High
Status: 📋 Planning
🎯 Phase 3 Goals
Transform EVE from a conversational assistant into an intelligent knowledge companion with:
- Long-term memory - Remember past conversations and user preferences
- Document library - Manage and reference documents
- Vision capabilities - Generate and analyze images
- Web access - Real-time information retrieval
📊 Feature Breakdown
1. Long-Term Memory System
Priority: Critical
Estimated Time: 8-10 hours
Objectives
- Store and retrieve conversational context across sessions
- Semantic search through all conversations
- Auto-extract and store key information
- Build personal knowledge graph
Technical Approach
A. Vector Database Integration
- Options:
  - ChromaDB (lightweight, local-first)
  - LanceDB (Rust-based, fast)
  - SQLite + vector extension
- Recommendation: ChromaDB for ease of use
- Storage: Embed messages, extract entities, store relationships
B. Embedding Pipeline
User Message → OpenAI Embeddings API → Vector Store
↓
Semantic Search ← Query
↓
Retrieved Context → Enhanced Prompt
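The retrieval half of this pipeline can be sketched without committing to a particular vector store. The example below is a minimal in-memory illustration, assuming embeddings have already been computed elsewhere; `StoredMessage`, `semanticSearch`, and `buildEnhancedPrompt` are hypothetical names that do not appear in the plan, and a real implementation would delegate the similarity search to the chosen database.

```typescript
// Hypothetical types and names for illustration; the plan does not define them.
type Vec = number[];

interface StoredMessage {
  id: string;
  text: string;
  embedding: Vec;
}

/** Cosine similarity of two equal-length vectors; 0 if either has zero norm. */
function cosineSimilarity(a: Vec, b: Vec): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

/** Return the k stored messages most similar to the query embedding. */
function semanticSearch(query: Vec, store: StoredMessage[], k = 3): StoredMessage[] {
  return store
    .map((message) => ({ message, score: cosineSimilarity(query, message.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.message);
}

/** Prepend retrieved context to the user's message to form the enhanced prompt. */
function buildEnhancedPrompt(userMessage: string, context: StoredMessage[]): string {
  if (context.length === 0) return userMessage;
  const lines = context.map((m) => `- ${m.text}`).join("\n");
  return `Relevant memories:\n${lines}\n\nUser: ${userMessage}`;
}
```

A brute-force scan like this is linear in the number of stored messages, which is one reason the plan reaches for a dedicated vector database once the history grows.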
C. Implementation Plan
- Set up vector database (ChromaDB)
- Create embedding service (src/lib/embeddings.ts)
- Background job to embed existing messages
- Add semantic search to conversation store
- UI for memory search and management
- Context injection for relevant memories
D. Files to Create
- src/lib/embeddings.ts - Embedding service
- src/lib/vectordb.ts - Vector database client
- src/stores/memoryStore.ts - Memory state management
- src/components/MemorySearch.tsx - Search UI
- src/components/MemoryPanel.tsx - Memory management UI
E. Features
- Vector database setup
- Automatic message embedding
- Semantic search interface
- Memory extraction (entities, facts)
- Knowledge graph visualization
- Context injection in prompts
- Memory management UI
2. Document Library
Priority: High
Estimated Time: 6-8 hours
Objectives
- Upload and store reference documents
- Full-text search across documents
- Automatic document summarization
- Link documents to conversations
Technical Approach
A. Document Storage
- Backend: Tauri file system access
- Location: {app_data_dir}/documents/
- Indexing: SQLite FTS5 for full-text search
- Metadata: Title, author, date, tags, summary
B. Document Processing Pipeline
Upload → Parse (PDF/DOCX/MD) → Extract Text → Embed Chunks
               ↓                     ↓              ↓
           Metadata           Full-Text Index   Vector Store
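The "Embed Chunks" step implies splitting extracted text into pieces small enough to embed. The sketch below shows one common approach, fixed-size character windows with overlap; the function name `chunkText` and the default sizes are assumptions, not values taken from the plan, and a production version might split on sentence or token boundaries instead.

```typescript
/**
 * Split text into chunks of at most `size` characters, where each chunk
 * repeats the last `overlap` characters of the previous one.
 * The defaults are illustrative assumptions.
 */
function chunkText(text: string, size = 1000, overlap = 200): string[] {
  if (size <= 0) throw new Error("size must be positive");
  if (overlap < 0 || overlap >= size) throw new Error("overlap must be in [0, size)");

  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    // Stop once the current window reaches the end of the text.
    if (start + size >= text.length) break;
  }
  return chunks;
}
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of embedding some text twice.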
C. Implementation Plan
- Rust commands for file management
- Document parser library integration
- SQLite database for metadata and FTS
- Chunking and embedding for semantic search
- Document viewer component
- Library management UI
D. Files to Create
- src-tauri/src/documents.rs - Document management (Rust)
- src/lib/documentParser.ts - Document parsing
- src/stores/documentStore.ts - Document state
- src/components/DocumentLibrary.tsx - Library UI
- src/components/DocumentViewer.tsx - Document viewer
E. Features
- Upload documents (PDF, DOCX, TXT, MD)
- Full-text search
- Document categorization
- Automatic summarization
- Reference in conversations
- Document viewer
- Export/backup library
F. Dependencies
{
  "pdf-parse": "^1.1.1",       // PDF parsing
  "mammoth": "^1.6.0",         // DOCX parsing
  "better-sqlite3": "^9.0.0"   // SQLite
}
3. Vision & Image Generation
Priority: High
Estimated Time: 4-6 hours
Objectives
- Generate images from text prompts
- Analyze uploaded images
- Edit and manipulate existing images
- Screenshot annotation tools
Technical Approach
A. Image Generation
- Provider: DALL-E 3 (via OpenAI API)
- Alternative: Stable Diffusion (local)
- Storage: {app_data_dir}/generated_images/
B. Image Analysis
- Provider: GPT-4 Vision (OpenAI)
- Features:
  - Describe images
  - Extract text (OCR)
  - Answer questions about images
  - Compare multiple images
C. Implementation Plan
- OpenAI Vision API integration
- DALL-E 3 API integration
- Image storage and management
- Image generation UI
- Image analysis in chat
- Gallery component
D. Files to Create
- src/lib/vision.ts - Vision API client
- src/lib/imageGeneration.ts - DALL-E client
- src/components/ImageGenerator.tsx - Generation UI
- src/components/ImageGallery.tsx - Gallery view
- src/stores/imageStore.ts - Image state
E. Features
- Text-to-image generation
- Image analysis and description
- OCR text extraction
- Image-based conversations
- Generation history
- Image editing tools (basic)
- Screenshot capture and analysis
F. Dependencies
{
  "openai": "^4.0.0"   // Already installed
}
4. Web Access & Real-Time Information
Priority: Medium
Estimated Time: 6-8 hours
Objectives
- Search the web for current information
- Extract and summarize web content
- Integrate news and articles
- Fact-checking capabilities
Technical Approach
A. Web Search
- Options:
  - Brave Search API (privacy-focused, free tier)
  - SerpAPI (Google results, paid)
  - Custom scraper (legal concerns)
- Recommendation: Brave Search API
B. Content Extraction
- Library: Mozilla Readability or Cheerio
- Process: Fetch → Parse → Clean → Summarize
- Caching: Store extracted content locally
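The caching step could look roughly like the sketch below: a small in-memory cache keyed by URL with a time-to-live. The class name `ContentCache`, the TTL policy, and the injectable clock are all assumptions made for illustration; the plan only says that extracted content is stored locally and does not specify expiry or a storage backend.

```typescript
// Hypothetical shape of a cached extraction result.
interface CachedPage {
  url: string;
  text: string;
  fetchedAt: number; // milliseconds since epoch
}

/** In-memory cache of extracted page text with a time-to-live (assumed policy). */
class ContentCache {
  private entries = new Map<string, CachedPage>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be exercised without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  /** Return cached text for the URL, or undefined if missing or expired. */
  get(url: string): string | undefined {
    const hit = this.entries.get(url);
    if (!hit) return undefined;
    if (this.now() - hit.fetchedAt > this.ttlMs) {
      this.entries.delete(url);
      return undefined;
    }
    return hit.text;
  }

  /** Store freshly extracted text for the URL. */
  set(url: string, text: string): void {
    this.entries.set(url, { url, text, fetchedAt: this.now() });
  }
}
```

Checking the cache before the Fetch step avoids re-downloading and re-summarizing an article that was extracted recently.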
C. Implementation Plan
- Web search API integration
- Content extraction service
- URL preview component
- Web search command in chat
- Article summarization
- Citation tracking
D. Files to Create
- src/lib/webSearch.ts - Search API client
- src/lib/webScraper.ts - Content extraction
- src/components/WebSearchPanel.tsx - Search UI
- src/components/ArticlePreview.tsx - Preview component
- src/stores/webStore.ts - Web content state
E. Features
- Web search from chat
- URL content extraction
- Article summarization
- News aggregation
- Fact verification
- Source citations
- Link preview cards
F. Commands
// In-chat commands
/search [query] // Web search
/summarize [url] // Summarize article
/news [topic] // Get latest news
/fact-check [claim] // Verify information
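One way to recognize these commands in chat input is a small parser that runs before the message is sent to the model. The sketch below is illustrative only: `parseCommand` and `ChatCommand` are hypothetical names, and treating a command with no argument as invalid is an assumption rather than something the plan specifies.

```typescript
type CommandName = "search" | "summarize" | "news" | "fact-check";

interface ChatCommand {
  name: CommandName;
  argument: string;
}

// The four commands listed in the plan.
const COMMANDS: readonly CommandName[] = ["search", "summarize", "news", "fact-check"];

/**
 * Parse a chat message such as "/search rust vector db".
 * Returns null for ordinary messages, unknown commands, or a missing argument.
 */
function parseCommand(input: string): ChatCommand | null {
  const match = /^\/([a-z-]+)(?:\s+([\s\S]*))?$/.exec(input.trim());
  if (!match) return null;

  const name = match[1] as CommandName;
  if (COMMANDS.indexOf(name) === -1) return null;

  const argument = (match[2] ?? "").trim();
  if (argument === "") return null;

  return { name, argument };
}
```

Returning null for anything that is not a recognized command lets the caller fall through to normal chat handling.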
G. Dependencies
{
  "cheerio": "^1.0.0-rc.12",          // HTML parsing
  "@mozilla/readability": "^0.5.0",   // Content extraction
  "node-fetch": "^3.3.2"              // HTTP requests
}
🗂️ Database Schema
Memory Database (Vector Store)
interface Memory {
  id: string
  conversationId: string
  messageId: string
  content: string
  embedding: number[]      // 1536-dim vector
  entities: string[]       // Extracted entities
  timestamp: number
  importance: number       // 0-1 relevance score
  metadata: {
    speaker: 'user' | 'assistant'
    tags: string[]
    references: string[]   // Related memory IDs
  }
}
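The comments in this interface imply a few invariants (a 1536-dimensional embedding, an importance score between 0 and 1) that are worth checking before a record is written. The sketch below is one possible guard; `validateMemory` and `EMBEDDING_DIM` are hypothetical names, and the self-reference check is an added assumption about what a sensible `references` list looks like.

```typescript
// Mirrors the Memory interface from the schema so the sketch compiles on its own.
interface Memory {
  id: string;
  conversationId: string;
  messageId: string;
  content: string;
  embedding: number[];
  entities: string[];
  timestamp: number;
  importance: number;
  metadata: {
    speaker: "user" | "assistant";
    tags: string[];
    references: string[];
  };
}

// Dimension stated in the schema comment.
const EMBEDDING_DIM = 1536;

/** Return a list of human-readable problems; an empty list means the record looks valid. */
function validateMemory(memory: Memory): string[] {
  const problems: string[] = [];
  if (memory.embedding.length !== EMBEDDING_DIM) {
    problems.push(`embedding has ${memory.embedding.length} dimensions, expected ${EMBEDDING_DIM}`);
  }
  if (!(memory.importance >= 0 && memory.importance <= 1)) {
    problems.push("importance must be between 0 and 1");
  }
  // Assumption: a memory should not list itself as a related memory.
  if (memory.metadata.references.indexOf(memory.id) !== -1) {
    problems.push("memory references itself");
  }
  return problems;
}
```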
Document Database (SQLite)
CREATE TABLE documents (
  id TEXT PRIMARY KEY,
  title TEXT NOT NULL,
  filename TEXT NOT NULL,
  filepath TEXT NOT NULL,
  content TEXT,          -- Full text for FTS
  summary TEXT,
  file_type TEXT,        -- pdf, docx, txt, md
  file_size INTEGER,
  upload_date INTEGER,
  tags TEXT,             -- JSON array
  metadata TEXT          -- JSON object
);
CREATE VIRTUAL TABLE documents_fts USING fts5(
  content,
  title,
  tags
);
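Raw user input should not be passed straight to an FTS5 `MATCH` expression, because characters such as quotes, `*`, and `-` have meaning in the FTS5 query syntax. A minimal sketch of one way to handle this is to quote each term as an FTS5 string, doubling any embedded double quotes. The helper name `toFtsQuery` and the SQL constant are assumptions for illustration; how the statement is executed depends on the SQLite driver the project ends up using.

```typescript
/**
 * Turn free-form user input into an FTS5 query in which every
 * whitespace-separated term is a quoted string. Quoted terms separated
 * by spaces are combined with an implicit AND by FTS5.
 */
function toFtsQuery(userInput: string): string {
  return userInput
    .split(/\s+/)
    .filter((term) => term.length > 0)
    .map((term) => `"${term.replace(/"/g, '""')}"`)
    .join(" ");
}

// Parameterized query against the documents_fts table defined above.
// `rank` is the built-in FTS5 relevance ordering.
const SEARCH_SQL =
  "SELECT title FROM documents_fts WHERE documents_fts MATCH ? ORDER BY rank LIMIT ?";
```

The caller would bind the output of `toFtsQuery` and a result limit to the two placeholders, and should skip the query entirely when the function returns an empty string.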
Image Database (SQLite)
CREATE TABLE images (
  id TEXT PRIMARY KEY,
  filename TEXT NOT NULL,
  filepath TEXT NOT NULL,
  prompt TEXT,           -- For generated images
  description TEXT,      -- AI-generated description
  analysis TEXT,         -- Detailed analysis
  width INTEGER,
  height INTEGER,
  file_size INTEGER,
  created_date INTEGER,
  source TEXT,           -- 'generated', 'uploaded', 'screenshot'
  metadata TEXT          -- JSON object
);
🎨 UI Components
New Screens
- Memory Dashboard (/memory)
  - Knowledge graph visualization
  - Memory timeline
  - Entity browser
  - Search interface
- Document Library (/documents)
  - Grid/list view
  - Upload area
  - Search and filter
  - Document viewer
- Image Gallery (/images)
  - Masonry layout
  - Generation form
  - Image details panel
  - Edit tools
- Web Research (/web)
  - Search interface
  - Article list
  - Preview panel
  - Saved articles
Enhanced Components
- Chat Interface
  - Memory context indicator
  - Document reference links
  - Image inline display
  - Web search results
- Settings
  - Memory settings (retention, privacy)
  - API keys (OpenAI, Brave)
  - Storage management
  - Feature toggles
🔧 Technical Architecture
State Management
// New Stores
memoryStore // Memory & knowledge graph
documentStore // Document library
imageStore // Image gallery
webStore // Web search & articles
// Enhanced Stores
chatStore // Add memory injection
settingsStore // Add new API keys
Backend (Rust)
// New modules
src-tauri/src/
├── memory/
│   ├── embeddings.rs
│   └── vectordb.rs
├── documents/
│   ├── parser.rs
│   ├── storage.rs
│   └── search.rs
└── images/
    ├── generator.rs
    └── storage.rs
API Integration
// New API clients
OpenAI Embeddings API // Text embeddings
OpenAI Vision API // Image analysis
DALL-E 3 API // Image generation
Brave Search API // Web search
📦 Dependencies
Frontend
{
  "chromadb": "^1.7.0",               // Vector database
  "better-sqlite3": "^9.0.0",         // SQLite
  "cheerio": "^1.0.0-rc.12",          // Web scraping
  "@mozilla/readability": "^0.5.0",   // Content extraction
  "d3": "^7.8.5",                     // Knowledge graph viz
  "react-force-graph": "^1.43.0",     // Graph component
  "pdfjs-dist": "^3.11.174",          // PDF preview
  "react-image-gallery": "^1.3.0"     // Image gallery
}
Backend (Rust)
[dependencies]
chromadb = "0.1" # Vector DB client
rusqlite = "0.30" # SQLite
pdf-extract = "0.7" # PDF parsing
lopdf = "0.31" # PDF manipulation
image = "0.24" # Image processing
🚀 Implementation Timeline
Week 1: Foundation (8-10 hours)
- Days 1-2: Vector database setup
- Day 3: Embedding pipeline
- Day 4: Memory store and basic UI
- Day 5: Testing and refinement
Week 2: Documents & Vision (10-12 hours)
- Days 1-2: Document storage and parsing
- Day 3: Full-text search implementation
- Day 4: Vision API integration
- Day 5: Image generation UI
Week 3: Web & Polish (6-8 hours)
- Days 1-2: Web search integration
- Day 3: Content extraction
- Day 4: UI polish and testing
- Day 5: Documentation
Total Estimated Time: 24-30 hours
🎯 Success Metrics
Functionality
- Can remember facts from past conversations
- Can search semantically through history
- Can reference uploaded documents
- Can generate images from prompts
- Can analyze uploaded images
- Can search the web for information
- Can summarize web articles
Performance
- Memory search: <500ms
- Document search: <200ms
- Image generation: <10s (API-dependent)
- Web search: <2s
- No UI lag with large knowledge base
User Experience
- Intuitive memory management
- Easy document upload and search
- Seamless image generation workflow
- Useful web search integration
- Clear indication of memory usage
🔒 Privacy & Security
Data Storage
- All data stored locally by default
- Encrypted sensitive information
- User control over data retention
- Clear data deletion options
API Keys
- Secure storage in Tauri config
- Never logged or exposed
- Optional API usage (user can disable features)
Memory System
- User can view all stored memories
- One-click memory deletion
- Configurable retention periods
- Export capabilities for transparency
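A configurable retention period could be enforced with a periodic pruning pass over stored memories. The sketch below assumes that `timestamp` is in milliseconds since the epoch and that retention is expressed in days; both are assumptions, and `pruneExpired` is a hypothetical helper rather than part of the plan. Returning the removed records separately leaves room for the export and deletion confirmations listed above.

```typescript
// Minimal shape needed for pruning; assumes timestamp is in milliseconds since epoch.
interface StoredMemory {
  id: string;
  timestamp: number;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

/**
 * Partition memories into those inside the retention window and those past it.
 * `now` is a parameter so the cutoff can be computed deterministically.
 */
function pruneExpired<T extends StoredMemory>(
  memories: T[],
  retentionDays: number,
  now: number = Date.now()
): { kept: T[]; removed: T[] } {
  const cutoff = now - retentionDays * MS_PER_DAY;
  const kept: T[] = [];
  const removed: T[] = [];
  for (const memory of memories) {
    if (memory.timestamp >= cutoff) {
      kept.push(memory);
    } else {
      removed.push(memory);
    }
  }
  return { kept, removed };
}
```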
🧪 Testing Strategy
Unit Tests
- Vector database operations
- Document parsing
- Search functionality
- Embedding generation
Integration Tests
- End-to-end memory storage/retrieval
- Document upload workflow
- Image generation pipeline
- Web search flow
Manual Testing
- Memory accuracy
- Search relevance
- UI responsiveness
- Cross-platform compatibility
📝 Documentation
User Documentation
- Memory system guide
- Document library tutorial
- Image generation how-to
- Web search commands reference
Developer Documentation
- Vector database architecture
- Embedding pipeline details
- API integration guides
- Database schemas
🎉 Phase 3 Vision
By the end of Phase 3, EVE will:
- Remember everything - Long-term conversational memory
- Reference knowledge - Built-in document library
- See and create - Vision and image generation
- Stay current - Real-time web information
This transforms EVE from a conversational assistant into a knowledge companion that grows smarter over time and has access to both personal knowledge and real-time information.
🔜 Post-Phase 3
After Phase 3 completion, we'll move to:
- Phase 4: Developer tools, plugins, customization
- v1.0: Production release with all core features
- Beyond: Mobile apps, team features, advanced AI
Status: Ready to Start
Prerequisites: Phase 2 Complete ✅
Next Step: Begin Long-Term Memory implementation
Created: October 6, 2025, 11:20pm UTC+01:00