llm-rpg/project_plan.md
Aodhan Collins d03d25ca3b Initial commit
2026-01-27 11:50:11 +00:00


# Project Plan: LLM RPG Engine
## Table of Contents
1. [Phase 1: Architecture & Tech Stack](#phase-1-architecture--tech-stack)
2. [Phase 2: Core Game Loop & Data Models](#phase-2-core-game-loop--data-models)
3. [Phase 3: Procedural Generation & Quest System](#phase-3-procedural-generation--quest-system)
4. [Phase 4: UI/UX & Client Architecture](#phase-4-uiux--client-architecture)
---
# Phase 1: Architecture & Tech Stack
## 1. High-Level Concept
A text-adventure RPG running as a web application. It combines nostalgic text-based input with modern RPG mechanics (inventory, stats, procedural dungeons). The core innovation is using an LLM as a strict "Game Master" that parses user intent and narrates outcomes, while a deterministic Game Engine handles all rules, state, and logic.
## 2. Architecture Overview
The system follows a **Client-Server** model to ensure security and state integrity.
* **Client (Frontend):** Handles user input and renders the game state (UI, Health, Inventory, Map).
* **Server (Backend):** Hosts the API, runs the deterministic Game Engine, manages the Database, and proxies calls to the LLM.
* **LLM (OpenRouter):** Acts as two distinct agents:
    1. **Intent Parser:** Translates natural language into strict Game Engine commands.
    2. **Narrator:** Translates Game Engine results into immersive flavor text.
## 3. Technology Stack
### Core
* **Language:** TypeScript (Full Stack)
    * *Reasoning:* Shared type definitions (`Item`, `Room`, `GameState`) between Frontend and Backend ensure the UI always reflects the true game state.
### Frontend
* **Framework:** React
* **Styling:** Tailwind CSS (for rapid UI development and a distinct "retro" aesthetic).
* **State Management:** React Context or Zustand (for local handling of UI state).
### Backend
* **Runtime:** Node.js
* **Framework:** Express or Fastify.
* **Database:** SQLite.
    * *Reasoning:* Lightweight, file-based, and well suited to structured game data and save states without complex setup. Easy to migrate to PostgreSQL later.
### AI / ML
* **Provider:** OpenRouter (Access to various models like Llama 3, Claude 3, Gemini, etc.).
* **Integration:** Server-side API calls to prevent API key exposure.
## 4. Data Flow (The Game Loop)
1. **Input:** Player types: *"I grab the rusty sword and slash the goblin."*
2. **Parse (LLM):** Backend sends text + context to LLM.
    * *Output:* `[{"action": "pickup", "target": "rusty sword"}, {"action": "attack", "target": "goblin"}]`
3. **Execute (Engine):** Backend Game Engine processes commands sequentially.
    * *Checks:* Is sword in room? (Yes) -> Add to inventory.
    * *Checks:* Is goblin in room? (Yes) -> Roll hit chance -> Apply damage.
    * *Update:* Database updated with new state.
4. **Narrate (LLM):** Engine sends the *result* to LLM.
    * *Input:* `Event: Pickup Success (Rusty Sword), Attack Success (Goblin, 5 dmg). Current HP: 10.`
    * *Output:* *"You snatch the rusty sword from the dirt, weighing it in your hand. With a desperate lunge, you slash at the goblin, carving a shallow gash across its chest."*
5. **Render:** Backend sends new Game State + Flavor Text to Frontend.
## 5. Development Strategy
* **Monorepo Structure:** Keep Client and Server in one repository to share types easily.
* **Offline-First Logic:** Design the Game Engine so it *could* run without an LLM (returning raw text) for easier testing/debugging.
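As a rough illustration of the loop and the offline-first idea, the two LLM roles can sit behind one interface so that a plain-text stand-in replaces them during testing. Everything named here (`Command`, `EngineResult`, `LlmAdapter`, `offlineAdapter`, `runTurn`) is hypothetical, the command shape is simplified, and a real adapter would return Promises because it calls a remote API.

```typescript
// Hypothetical names throughout; a sketch of the loop, not the final design.
interface Command {
  action: string;
  target: string;
}

interface EngineResult {
  command: Command;
  success: boolean;
  detail: string;
}

// Both LLM roles sit behind one interface so they can be swapped out.
interface LlmAdapter {
  parse(input: string): Command[];
  narrate(results: EngineResult[]): string;
}

// Offline stand-in: naive "verb target" parsing and raw-text narration.
const offlineAdapter: LlmAdapter = {
  parse(input) {
    return input
      .split(/\s+and\s+/i)
      .map((clause) => clause.trim().split(/\s+/))
      .filter((words) => words.length >= 2)
      .map(([action = '', ...rest]) => ({
        action: action.toLowerCase(),
        target: rest.join(' '),
      }));
  },
  narrate(results) {
    return results
      .map((r) => `${r.command.action} ${r.command.target}: ${r.success ? 'OK' : 'FAILED'} (${r.detail})`)
      .join('\n');
  },
};

type Executor = (command: Command) => EngineResult;

// One turn: parse the input, execute each command in order, narrate the results.
function runTurn(input: string, llm: LlmAdapter, execute: Executor): string {
  const commands = llm.parse(input);
  const results = commands.map((command) => execute(command));
  return llm.narrate(results);
}
```

With this shape, engine tests can call `runTurn` with `offlineAdapter` and a stub executor, and never touch the network.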
---
# Phase 2: Core Game Loop & Data Models
## 1. Data Models (TypeScript Interfaces)
The application state will be strictly typed to ensure consistency.
### Base Entities
```typescript
type StatBlock = {
  hp: number;
  maxHp: number;
  attack: number;
  defense: number;
};

interface Entity {
  id: string;
  name: string;
  description: string;
  stats: StatBlock;
  isDead: boolean;
}

interface Item {
  id: string;
  name: string;
  description: string;
  type: 'weapon' | 'armor' | 'consumable' | 'key';
  modifiers?: Partial<StatBlock>; // e.g., { attack: 2 }
}
```
### World Structure
```typescript
interface Room {
  id: string;
  name: string;
  description: string; // The "base" description before dynamic elements
  exits: Record<string, string>; // e.g., { "north": "room_id_102" }
  items: string[]; // List of Item IDs currently on the floor
  entities: string[]; // List of Entity IDs (monsters/NPCs) in the room
}
```
### Global State
```typescript
interface GameState {
  player: Entity & {
    inventory: string[]; // List of Item IDs
    equipped: {
      weapon?: string;
      armor?: string;
    };
  };
  currentRoomId: string;
  world: Record<string, Room>; // The dungeon map
  entities: Record<string, Entity>; // Lookup of all active monsters
  items: Record<string, Item>; // Item lookup
  turnLog: GameEvent[]; // Buffer of what happened this turn for the Narrator (GameEvent shape still to be defined)
}
```
## 2. RPG Mechanics (The "Engine")
We will use a simplified d20 system to keep math transparent but extensible.
### Combat Formulas
1. **Hit Chance:**
    * Roll: `1d20`
    * Target: `10 + (Target Defense) - (Attacker Attack)`
    * *Logic:* Higher attack lowers the threshold needed to hit.
2. **Damage Calculation:**
    * If Hit: `(Base Attack + Weapon Bonus) + (Random Variance 0-2)`
    * *Mitigation:* None for now. Damage is applied raw and Defense only helps avoid hits; flat damage reduction can be added later.
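The formulas above could be implemented along these lines. This is a sketch under assumptions: the names (`Rng`, `Combatant`, `resolveAttack`) are invented for illustration, a roll equal to the target number is treated as a hit, and the random source is injected so that tests can be deterministic.

```typescript
// Hypothetical names (Rng, Combatant, resolveAttack); not part of the plan itself.
type Rng = () => number; // returns a float in [0, 1), like Math.random

interface Combatant {
  attack: number; // base attack, before any weapon bonus
  defense: number;
}

interface AttackResult {
  roll: number;
  threshold: number;
  hit: boolean;
  damage: number;
}

// Roll one die with the given number of sides (1..sides inclusive).
function rollDie(sides: number, rng: Rng): number {
  return Math.floor(rng() * sides) + 1;
}

function resolveAttack(
  attacker: Combatant,
  target: Combatant,
  weaponBonus: number,
  rng: Rng = Math.random,
): AttackResult {
  const roll = rollDie(20, rng);
  // Target number from the plan: 10 + target defense - attacker attack.
  const threshold = 10 + target.defense - attacker.attack;
  // Assumption: a roll equal to the threshold counts as a hit.
  const hit = roll >= threshold;
  // Raw damage: base attack + weapon bonus + integer variance in 0..2.
  const damage = hit ? attacker.attack + weaponBonus + Math.floor(rng() * 3) : 0;
  return { roll, threshold, hit, damage };
}
```

For example, an attacker with attack 3 and a +2 weapon facing defense 2 needs a 9 or higher, and a hit deals between 5 and 7 damage.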
### Actions (Engine Primitives)
These are the atomic operations the Engine can perform.
* `MOVE(direction)`: Validates exit exists -> Updates `currentRoomId`.
* `PICKUP(itemId)`: Validates item in room -> Moves from Room.items to Player.inventory.
* `EQUIP(itemId)`: Validates item in inventory -> Updates Player.equipped -> Recalculates Stats.
* `ATTACK(targetId)`: Validates target in room -> Runs Combat Formula -> Updates Target HP -> Adds result to `turnLog`.
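A minimal sketch of two of these primitives as pure functions is shown below. `MiniRoom`, `MiniState`, and `ActionResult` are trimmed-down, hypothetical stand-ins for the full `Room` and `GameState` types, and returning a new state instead of mutating is a design assumption, not something the plan prescribes.

```typescript
// Trimmed-down, hypothetical versions of Room and GameState for illustration.
interface MiniRoom {
  exits: Record<string, string>;
  items: string[];
}

interface MiniState {
  currentRoomId: string;
  inventory: string[];
  world: Record<string, MiniRoom>;
}

// Each primitive either returns the next state or a reason for rejecting the action.
type ActionResult =
  | { ok: true; state: MiniState }
  | { ok: false; reason: string };

// MOVE: validate that the exit exists, then update currentRoomId.
function move(state: MiniState, direction: string): ActionResult {
  const room = state.world[state.currentRoomId];
  const nextId = room === undefined ? undefined : room.exits[direction];
  if (nextId === undefined) {
    return { ok: false, reason: `No exit to the ${direction}.` };
  }
  return { ok: true, state: { ...state, currentRoomId: nextId } };
}

// PICKUP: validate that the item is in the room, then move it to the inventory.
function pickup(state: MiniState, itemId: string): ActionResult {
  const room = state.world[state.currentRoomId];
  if (room === undefined || !room.items.includes(itemId)) {
    return { ok: false, reason: `There is no ${itemId} here.` };
  }
  const updatedRoom: MiniRoom = {
    ...room,
    items: room.items.filter((id) => id !== itemId),
  };
  return {
    ok: true,
    state: {
      ...state,
      inventory: [...state.inventory, itemId],
      world: { ...state.world, [state.currentRoomId]: updatedRoom },
    },
  };
}
```

Because every primitive validates before it changes anything, a command the parser hallucinates is rejected with a reason the Narrator can describe, and the state stays untouched.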
## 3. The LLM Interface Layer
### Intent Parser (Input)
The LLM will be given the user's text and a simplified list of valid targets.
**Expected JSON Output:**
```json
[
{ "action": "MOVE", "params": { "direction": "north" } },
{ "action": "EQUIP", "params": { "itemId": "rusty_sword" } }
]
```
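Since the parser's output is untrusted text, the backend would likely validate it before anything reaches the Engine. The sketch below shows one way to do that; `EngineCommand`, `toCommand`, and `parseCommands` are hypothetical names, and rejecting the whole batch when any entry is malformed is an assumption made so that a turn never half-executes.

```typescript
// Hypothetical validator; the command names mirror the Engine primitives.
type EngineCommand =
  | { action: 'MOVE'; params: { direction: string } }
  | { action: 'PICKUP'; params: { itemId: string } }
  | { action: 'EQUIP'; params: { itemId: string } }
  | { action: 'ATTACK'; params: { targetId: string } };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Convert one untrusted entry into a typed command, or null if it is malformed.
function toCommand(entry: unknown): EngineCommand | null {
  if (!isRecord(entry)) return null;
  const params = entry['params'];
  if (!isRecord(params)) return null;
  const text = (key: string): string | null => {
    const value = params[key];
    return typeof value === 'string' && value !== '' ? value : null;
  };
  const direction = text('direction');
  const itemId = text('itemId');
  const targetId = text('targetId');
  switch (entry['action']) {
    case 'MOVE':
      return direction === null ? null : { action: 'MOVE', params: { direction } };
    case 'PICKUP':
      return itemId === null ? null : { action: 'PICKUP', params: { itemId } };
    case 'EQUIP':
      return itemId === null ? null : { action: 'EQUIP', params: { itemId } };
    case 'ATTACK':
      return targetId === null ? null : { action: 'ATTACK', params: { targetId } };
    default:
      return null; // unknown action name
  }
}

// Parse the raw LLM reply; returns [] unless every entry is a valid command.
function parseCommands(raw: string): EngineCommand[] {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return []; // the reply was not JSON at all
  }
  if (!Array.isArray(data)) return [];
  const commands: EngineCommand[] = [];
  for (const entry of data) {
    const command = toCommand(entry);
    if (command === null) return []; // assumption: one bad entry rejects the batch
    commands.push(command);
  }
  return commands;
}
```

An empty result can then be narrated as "nothing happens" or trigger a retry of the parse step.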
### Narrator (Output)
The LLM receives the `turnLog` to generate the final response.
**Input Context:**
```text
Player Action: "I head north and stab the rat."
Engine Results:
1. Move North: Success. New Room: "Damp Cave".
2. Attack Rat: Success. Rolled 18 (Hit). Dealt 4 Damage. Rat dies.
```
**Generated Output:**
"You leave the corridor, stepping into a damp cave that smells of mildew. A large rat lunges from the shadows! You react instantly, skewering it mid-air. It lets out a final squeak before hitting the floor."
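The input context above can be assembled mechanically from the turn log. The plan does not define `GameEvent`, so the shape used here is an assumption, as is the helper name `buildNarratorContext`.

```typescript
// GameEvent is not defined in the plan; this shape is an assumption.
interface GameEvent {
  action: string; // e.g. "Move North"
  success: boolean;
  detail: string; // engine facts the Narrator may use, e.g. rolls and damage
}

// Build the plain-text context that is sent to the Narrator.
function buildNarratorContext(playerText: string, turnLog: GameEvent[]): string {
  const lines = turnLog.map(
    (event, index) =>
      `${index + 1}. ${event.action}: ${event.success ? 'Success' : 'Failure'}. ${event.detail}`,
  );
  return [`Player Action: "${playerText}"`, 'Engine Results:', ...lines].join('\n');
}
```

Keeping the context to engine facts only gives the Narrator nothing to invent outcomes from; it can only decorate results the Engine has already decided.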
---
# Phase 3: Procedural Generation & Quest System
## 1. Dungeon Generation Algorithm
For the prototype, we will use a **Random Walk** algorithm with constraint checking to ensure playability.
### Algorithm Steps
1. **Initialize Grid:** Create a 10x10 empty grid.
2. **Start Point:** Place `Entrance` at a random edge.
3. **Walk:** An agent moves randomly (N/S/E/W), carving "Rooms" until ~15 rooms exist.
4. **Connectivity:** Ensure all rooms are reachable (no islands). A Random Walk carves each new room next to the previous one, so the layout should already be connected; a flood-fill check confirms it.
5. **Tagging:**
    * Mark the room furthest from the `Entrance` as the `Boss Room` (locked by default).
    * Place `Exit` behind the `Boss Room`.
    * Randomly select 2 dead-end rooms (or secluded rooms) for `Treasure Chests`.
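The walk, the flood-fill check, and the boss-room tagging could be sketched as follows. All names are hypothetical. Two simplifications are assumed: the entrance is always placed on the top edge instead of a random edge, and any two adjacent carved cells are treated as connected. Treasure and exit placement are left out.

```typescript
// Hypothetical sketch; rng is injected so generation can be seeded in tests.
type Rng = () => number; // float in [0, 1), like Math.random

interface Cell {
  x: number;
  y: number;
}

const cellKey = (c: Cell): string => `${c.x},${c.y}`;
const STEPS: Cell[] = [{ x: 0, y: -1 }, { x: 0, y: 1 }, { x: 1, y: 0 }, { x: -1, y: 0 }];

// Breadth-first flood fill: hop distance from start to every reachable room.
function floodFill(rooms: Set<string>, start: Cell): Map<string, number> {
  const dist = new Map<string, number>([[cellKey(start), 0]]);
  const queue: Cell[] = [start];
  for (const cell of queue) { // the queue grows while it is being iterated
    const d = dist.get(cellKey(cell)) ?? 0;
    for (const step of STEPS) {
      const next = { x: cell.x + step.x, y: cell.y + step.y };
      if (rooms.has(cellKey(next)) && !dist.has(cellKey(next))) {
        dist.set(cellKey(next), d + 1);
        queue.push(next);
      }
    }
  }
  return dist;
}

function generateDungeon(rng: Rng, size = 10, targetRooms = 15) {
  // Simplification: the entrance always sits on the top edge.
  const entrance: Cell = { x: Math.floor(rng() * size), y: 0 };
  const rooms = new Set<string>([cellKey(entrance)]);
  let current = entrance;
  // The iteration cap guards against a pathological RNG.
  for (let i = 0; i < 10000 && rooms.size < targetRooms; i++) {
    const step = STEPS[Math.floor(rng() * STEPS.length)];
    if (step === undefined) continue;
    const next = { x: current.x + step.x, y: current.y + step.y };
    if (next.x < 0 || next.y < 0 || next.x >= size || next.y >= size) continue;
    current = next;
    rooms.add(cellKey(next));
  }
  // The flood fill doubles as the connectivity check and the distance map.
  const dist = floodFill(rooms, entrance);
  let bossRoom = cellKey(entrance);
  let best = 0;
  dist.forEach((d, roomKey) => {
    if (d > best) {
      best = d;
      bossRoom = roomKey;
    }
  });
  return { entrance: cellKey(entrance), rooms, bossRoom, connected: dist.size === rooms.size };
}
```

"Furthest" here means furthest in hops through rooms, not straight-line distance, which seems to be the measure that matters for how long the player has to travel.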
## 2. Quest System: "The Escape"
We will define a rigid Quest Interface to allow both static and dynamic creation.
### Data Structure
```typescript
interface QuestStep {
  id: string;
  description: string;
  isComplete: boolean;
  type: 'FIND_ITEM' | 'KILL_ENTITY' | 'UNLOCK_DOOR';
  targetId: string; // The ID of the key, monster, or door
}

interface Quest {
  id: string;
  title: string;
  description: string; // LLM Flavour text: "Escape the goblin prison!"
  steps: QuestStep[];
  reward: {
    xp: number;
    items?: string[];
  };
  winCondition: string; // 'ALL_STEPS', or the id of the one QuestStep that completes the quest
}
```
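Progress tracking and the win check for this structure might look like the sketch below. It uses cut-down, hypothetical `StepLike` and `QuestLike` types and assumes that `winCondition` holds either `'ALL_STEPS'` or the id of a single step.

```typescript
// Cut-down, hypothetical versions of QuestStep and Quest.
type StepType = 'FIND_ITEM' | 'KILL_ENTITY' | 'UNLOCK_DOOR';

interface StepLike {
  id: string;
  type: StepType;
  targetId: string;
  isComplete: boolean;
}

interface QuestLike {
  steps: StepLike[];
  winCondition: string; // assumption: 'ALL_STEPS' or the id of one step
}

// Mark every step that matches an engine event as complete (returns new steps).
function recordProgress(steps: StepLike[], type: StepType, targetId: string): StepLike[] {
  return steps.map((step) =>
    step.type === type && step.targetId === targetId ? { ...step, isComplete: true } : step,
  );
}

function isQuestWon(quest: QuestLike): boolean {
  if (quest.winCondition === 'ALL_STEPS') {
    // An empty quest is never "won"; every() alone would return true for it.
    return quest.steps.length > 0 && quest.steps.every((step) => step.isComplete);
  }
  const deciding = quest.steps.find((step) => step.id === quest.winCondition);
  return deciding !== undefined && deciding.isComplete;
}
```

The Engine would call `recordProgress` after each successful primitive (for example after a `PICKUP` of a key) and then check `isQuestWon`.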
### "Escape" Scenario Implementation
This specific scenario will be generated as follows:
1. **Item Generation:**
    * Create 3 unique Keys: `Key_Red` (Exit Door), `Key_Blue` (Weapon Chest), `Key_Green` (Potion Chest).
    * Create 1 `Boss Enemy` and N `Roaming Enemies`.
2. **Distribution:**
    * Place `Key_Red` (Exit) inside one of the Chests or on a roaming enemy (randomized).
    * Place `Key_Blue` & `Key_Green` in random rooms (hidden in "furniture" or dropped by mobs).
3. **Locking:**
    * `Boss Room` door, the only way to the `Exit`, requires `Key_Red`.
    * `Chest 1` requires `Key_Blue`.
    * `Chest 2` requires `Key_Green`.
## 3. Enemy AI (Roaming)
To make the dungeon feel alive, enemies won't just stand still.
* **Turn Based:** Every time the player takes an action (Move/Wait), enemies take a turn.
* **Logic:**
    1. **Aggro:** If Player is in same room -> Attack.
    2. **Patrol:** If Player not in room -> Move to a random adjacent connected room (25% chance to stay still).
    3. **Leash:** Enemies generally stay within 2-3 rooms of their spawn point to prevent clumping.
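The three rules can be expressed as one decision function per enemy turn. This is a sketch with hypothetical names; it assumes the hop distance from each enemy's spawn room has been precomputed (for example with a flood fill) and uses a hard leash of 3 rooms, where the plan only says "generally".

```typescript
// Hypothetical names; hop counts from spawn are assumed to be precomputed.
type Rng = () => number; // float in [0, 1), like Math.random

interface Enemy {
  id: string;
  roomId: string;
}

type EnemyAction =
  | { kind: 'attack' }
  | { kind: 'stay' }
  | { kind: 'move'; to: string };

function decideEnemyAction(
  enemy: Enemy,
  playerRoomId: string,
  neighbours: Record<string, string[]>, // roomId -> adjacent connected room IDs
  hopsFromSpawn: Record<string, number>, // roomId -> hops from this enemy's spawn
  rng: Rng = Math.random,
  leash = 3,
): EnemyAction {
  // 1. Aggro: the player is in the same room.
  if (enemy.roomId === playerRoomId) return { kind: 'attack' };
  // 2. Patrol: 25% chance to stay still.
  if (rng() < 0.25) return { kind: 'stay' };
  // 3. Leash: only consider rooms close enough to the spawn point.
  const options = (neighbours[enemy.roomId] ?? []).filter(
    (roomId) => (hopsFromSpawn[roomId] ?? Infinity) <= leash,
  );
  const to = options[Math.floor(rng() * options.length)];
  return to === undefined ? { kind: 'stay' } : { kind: 'move', to };
}
```

Returning a description of the action, not performing it, keeps the function pure; the Engine can then apply the move or run the combat formula and log the outcome to `turnLog`.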
## 4. LLM Integration for Generation
While the *logic* is hardcoded, the *names and descriptions* are generated once at dungeon start.
* **Prompt:** "Generate a dungeon theme. Theme: 'Goblin Prison'. Give me names for 3 keys, a boss name, and a description for the damp boss room."
* **Result:**
    * `Key_Red` -> "Rusty Iron Key"
    * `Boss` -> "Grumlock the Jailor"
    * `Room` -> "A circular chamber reeking of stale meat..."
---
# Phase 4: UI/UX & Client Architecture
## 1. Layout Structure (The "Dashboard")
The interface is designed to resemble a modern cockpit for a text adventure.
* **Header:** Game Title, Global Menu (Save/Settings).
* **Main Content (3-Column Grid):**
    * **Left Column (Visuals):**
        * **Scene View:** A static 16:9 image container showing the current room context (e.g., `dungeon_sparse_lmr.png`).
        * **Mini-Map:** A canvas-based grid visualizer. Renders visited rooms as squares and connections as lines. Unvisited rooms are black (Fog of War). Current room highlighted.
    * **Center Column (The Narrative):**
        * **Scrollable Log:** The history of LLM text and system messages.
        * **Input Area:** Fixed at the bottom.
            * Text Input (`> type here...`)
            * Quick Actions Bar (Contextual buttons: "Attack", "Search", "Wait").
    * **Right Column (State):**
        * **Character Card:** HP bar, Stats, Level.
        * **Inventory List:** Clickable items (Open modal for details/actions).
        * **Room Objects:** List of interactables currently visible (Enemies, Chests, Doors).
## 2. Visual Assets Strategy
To keep the prototype visually complete without dynamic generation yet, we use a **Compositional Asset Naming Convention**.
* **Base Path:** `/assets/images/rooms/`
* **Naming Schema:** `[biome]_[type]_[exits].png`
* **Parameters:**
    * `biome`: `dungeon` (fixed for now).
    * `type`: `empty`, `sparse`, `furnished`, `treasure`, `boss`.
    * `exits`: A string combination of `l` (left), `m` (middle/forward), `r` (right).
* *Example:* `dungeon_sparse_lr.png` (Room with exits Left and Right).
* *Fallback:* If specific exit variation missing, fall back to `[biome]_[type].png`.
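The naming schema and its fallback could be resolved with a small helper like the one below. The function name and the `available` set (a manifest of image files known to exist) are assumptions; the plan does not say how the client learns which variations are present.

```typescript
// Hypothetical helper; `available` is an assumed manifest of existing file names.
type RoomType = 'empty' | 'sparse' | 'furnished' | 'treasure' | 'boss';
type ExitSlot = 'l' | 'm' | 'r';

const BASE_PATH = '/assets/images/rooms/';
const EXIT_ORDER: ExitSlot[] = ['l', 'm', 'r']; // canonical order inside file names

function roomImagePath(
  type: RoomType,
  exits: ExitSlot[],
  available: Set<string>,
  biome = 'dungeon',
): string {
  // Normalise the exits so ['r', 'l'] and ['l', 'r'] both yield "lr".
  const suffix = EXIT_ORDER.filter((slot) => exits.includes(slot)).join('');
  const specific = `${biome}_${type}_${suffix}.png`;
  const fallback = `${biome}_${type}.png`;
  // Use the exit-specific image only when it exists; otherwise fall back.
  return BASE_PATH + (suffix !== '' && available.has(specific) ? specific : fallback);
}
```

Sorting the exit letters into a fixed order means each combination maps to exactly one file name, which keeps the number of required images at seven per room type at most.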
## 3. Frontend Technology (React + Tailwind)
### Components
* `GameContainer`: Manages the main API polling/WebSocket connection.
* `TerminalLog`: A virtualized list (using `react-window` or similar if text gets long) to handle infinite scroll efficiently.
* `RoomRenderer`: Logic to determine which PNG to load based on the room referenced by `GameState.currentRoomId`.
* `MapCanvas`: HTML5 Canvas component that draws the grid graph relative to the player's position.
### State Management
* **Zustand Store:**
    * `messages[]`: Array of `{ sender: 'user'|'system'|'narrator', text: string }`.
    * `gameState`: The full JSON object synced from backend (Player, Room, etc.).
    * `uiState`: `{ isInventoryOpen: boolean, isLoading: boolean }`.
## 4. Interaction Design
* **Hybrid Input:**
    * **Typing:** Users can type "Pick up sword".
    * **Clicking:** Clicking "Rusty Sword" in the `Room Objects` panel triggers the exact same API call (`pickup sword`) as if the user typed it. This teaches the user the text commands while offering GUI convenience.
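One way to guarantee that both input styles hit the same code path is to route clicks through the text submission function. The names below (`Submit`, `makeInputHandlers`) are hypothetical, and the exact command wording that a click produces is an assumption.

```typescript
// Hypothetical sketch: clicks are converted to text and share the typed-input path.
type Submit = (text: string) => void;

function makeInputHandlers(submit: Submit) {
  return {
    // Typed input: trimmed and sent as-is; blank input is ignored.
    onType(text: string): void {
      const trimmed = text.trim();
      if (trimmed !== '') submit(trimmed);
    },
    // Clicked object: build the equivalent text command, then reuse onType.
    onClickObject(verb: string, objectName: string): void {
      this.onType(`${verb} ${objectName.toLowerCase()}`);
    },
  };
}
```

The generated command can also be echoed into the log as a `user` message, so the player sees what they could have typed.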