llm-rpg/project_plan.md
Aodhan Collins d03d25ca3b Initial commit
2026-01-27 11:50:11 +00:00

Project Plan: LLM RPG Engine

Table of Contents

  1. Phase 1: Architecture & Tech Stack
  2. Phase 2: Core Game Loop & Data Models
  3. Phase 3: Procedural Generation & Quest System
  4. Phase 4: UI/UX & Client Architecture

Phase 1: Architecture & Tech Stack

1. High-Level Concept

A text-adventure RPG running as a web application. It combines nostalgic text-based input with modern RPG mechanics (inventory, stats, procedural dungeons). The core innovation is using an LLM as a strict "Game Master" that parses user intent and narrates outcomes, while a deterministic Game Engine handles all rules, state, and logic.

2. Architecture Overview

The system follows a Client-Server model to ensure security and state integrity.

  • Client (Frontend): Handles user input and renders the game state (UI, Health, Inventory, Map).
  • Server (Backend): Hosts the API, runs the deterministic Game Engine, manages the Database, and proxies calls to the LLM.
  • LLM (OpenRouter): Acts as two distinct agents:
    1. Intent Parser: Translates natural language into strict Game Engine commands.
    2. Narrator: Translates Game Engine results into immersive flavor text.

3. Technology Stack

Core

  • Language: TypeScript (Full Stack)
    • Reasoning: Shared type definitions (Item, Room, GameState) between Frontend and Backend keep both sides agreed on the shape of the game state at compile time, so the UI cannot silently drift from what the server sends.

Frontend

  • Framework: React
  • Styling: Tailwind CSS (for rapid UI development and distinct "retro" aesthetic).
  • State Management: React Context or Zustand (for local handling of UI state).

Backend

  • Runtime: Node.js
  • Framework: Express or Fastify.
  • Database: SQLite.
    • Reasoning: Lightweight and file-based, so it handles structured game data and save states with no separate database server to set up. The schema can be migrated to PostgreSQL later if needed.

AI / ML

  • Provider: OpenRouter (Access to various models like Llama 3, Claude 3, Gemini, etc.).
  • Integration: Server-side API calls to prevent API key exposure.

4. Data Flow (The Game Loop)

  1. Input: Player types: "I grab the rusty sword and slash the goblin."
  2. Parse (LLM): Backend sends text + context to LLM.
    • Output: [{"action": "PICKUP", "params": {"itemId": "rusty_sword"}}, {"action": "ATTACK", "params": {"targetId": "goblin"}}]
  3. Execute (Engine): Backend Game Engine processes commands sequentially.
    • Checks: Is sword in room? (Yes) -> Add to inventory.
    • Checks: Is goblin in room? (Yes) -> Roll hit chance -> Apply damage.
    • Update: Database updated with new state.
  4. Narrate (LLM): Engine sends the result to LLM.
    • Input: Event: Pickup Success (Rusty Sword), Attack Success (Goblin, 5 dmg). Current HP: 10.
    • Output: "You snatch the rusty sword from the dirt, weighing it in your hand. With a desperate lunge, you slash at the goblin, carving a shallow gash across its chest."
  5. Render: Backend sends new Game State + Flavor Text to Frontend.

5. Development Strategy

  • Monorepo Structure: Keep Client and Server in one repository to share types easily.
  • Offline-First Logic: Design the Game Engine so it can run without an LLM (returning raw result text instead of narration) for easier testing and debugging.

Phase 2: Core Game Loop & Data Models

1. Data Models (TypeScript Interfaces)

The application state will be strictly typed to ensure consistency.

Base Entities

type StatBlock = {
  hp: number;
  maxHp: number;
  attack: number;
  defense: number;
};

interface Entity {
  id: string;
  name: string;
  description: string;
  stats: StatBlock;
  isDead: boolean;
}

interface Item {
  id: string;
  name: string;
  description: string;
  type: 'weapon' | 'armor' | 'consumable' | 'key';
  modifiers?: Partial<StatBlock>; // e.g., { attack: 2 }
}

World Structure

interface Room {
  id: string;
  name: string;
  description: string; // The "base" description before dynamic elements
  exits: Record<string, string>; // e.g., { "north": "room_id_102" }
  items: string[]; // List of Item IDs currently on the floor
  entities: string[]; // List of Entity IDs (monsters/NPCs) in the room
}

Global State

interface GameState {
  player: Entity & {
    inventory: string[]; // List of Item IDs
    equipped: {
      weapon?: string;
      armor?: string;
    };
  };
  currentRoomId: string;
  world: Record<string, Room>; // The dungeon map
  entities: Record<string, Entity>; // All active monsters lookup
  items: Record<string, Item>; // Item lookup
  turnLog: GameEvent[]; // Buffer of what happened this turn for the Narrator
}
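
The turnLog above references a GameEvent type that this plan does not define. One possible shape, offered as an assumption rather than a settled design, is a discriminated union keyed on `type`. The variant names and fields below are hypothetical. `describeEvent` renders an event as plain text, which is roughly what an LLM-free narrator fallback could return.

```typescript
// Hypothetical event shapes; the plan only names the GameEvent type.
type GameEvent =
  | { type: 'MOVE'; direction: string; roomName: string }
  | { type: 'PICKUP'; itemName: string }
  | { type: 'ATTACK'; targetName: string; roll: number; hit: boolean; damage: number; killed: boolean }
  | { type: 'ERROR'; message: string };

// Render one event as raw text (no LLM involved).
function describeEvent(event: GameEvent): string {
  switch (event.type) {
    case 'MOVE':
      return `Move ${event.direction}: Success. New Room: "${event.roomName}".`;
    case 'PICKUP':
      return `Pickup: Success (${event.itemName}).`;
    case 'ATTACK': {
      if (!event.hit) return `Attack ${event.targetName}: Rolled ${event.roll} (Miss).`;
      const death = event.killed ? ` ${event.targetName} dies.` : '';
      return `Attack ${event.targetName}: Rolled ${event.roll} (Hit). Dealt ${event.damage} Damage.${death}`;
    }
    case 'ERROR':
      return `Failed: ${event.message}`;
  }
}

// Render a whole turn log as a numbered list, e.g. for the Narrator prompt.
function describeTurn(turnLog: GameEvent[]): string {
  return turnLog.map((e, i) => `${i + 1}. ${describeEvent(e)}`).join('\n');
}
```

Keeping the union closed means the compiler flags any event kind the renderer forgets to handle.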

2. RPG Mechanics (The "Engine")

We will use a simplified d20 system to keep math transparent but extensible.

Combat Formulas

  1. Hit Chance:
    • Roll: 1d20
    • Target: 10 + (Target Defense) - (Attacker Attack)
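    • Logic: The attack hits when the roll reaches the Target (meet-or-beat, as in standard d20 rules). Higher attack lowers the threshold needed to hit; higher defense raises it.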
  2. Damage Calculation:
    • If Hit: (Base Attack + Weapon Bonus) + (Random Variance 0-2)
    • Mitigation: None for now. Damage is applied raw and Defense only helps avoid hits; flat damage reduction can be added later.
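
The two formulas above could be sketched as a single function. This is a minimal illustration under stated assumptions: a roll equal to the threshold counts as a hit, "Base Attack" is the attacker's attack stat, and the random source is injected so tests can be deterministic. The names `resolveAttack`, `CombatStats`, and `Rng` are hypothetical.

```typescript
// Simplified stand-in for the plan's StatBlock (only the fields used here).
type CombatStats = { attack: number; defense: number };

// Random source returning a float in [0, 1); injectable so tests are deterministic.
type Rng = () => number;

type AttackResult = { roll: number; threshold: number; hit: boolean; damage: number };

function resolveAttack(
  attacker: CombatStats,
  target: CombatStats,
  weaponBonus: number,
  rng: Rng = Math.random,
): AttackResult {
  const roll = Math.floor(rng() * 20) + 1; // 1d20
  const threshold = 10 + target.defense - attacker.attack;
  const hit = roll >= threshold; // assumption: meeting the threshold is a hit
  // Variance is 0, 1, or 2 and is only rolled when the attack lands.
  const damage = hit ? attacker.attack + weaponBonus + Math.floor(rng() * 3) : 0;
  return { roll, threshold, hit, damage };
}
```

For example, an attacker with attack 3 against defense 2 needs a 9 or higher, so the attack lands on 12 of the 20 faces (60%).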

Actions (Engine Primitives)

These are the atomic operations the Engine can perform.

  • MOVE(direction): Validates exit exists -> Updates currentRoomId.
  • PICKUP(itemId): Validates item in room -> Moves from Room.items to Player.inventory.
  • EQUIP(itemId): Validates item in inventory -> Updates Player.equipped -> Recalculates Stats.
  • ATTACK(targetId): Validates target in room -> Runs Combat Formula -> Updates Target HP -> Adds result to turnLog.
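
As an illustration of the validate-then-update pattern, MOVE and PICKUP might look like the sketch below. It uses trimmed-down, hypothetical versions of Room and GameState (`MiniRoom`, `MiniState`) and returns a new state instead of mutating the old one. That immutability is a design assumption, not something this plan requires.

```typescript
// Trimmed-down versions of the plan's Room and GameState (only fields used here).
type MiniRoom = { exits: Record<string, string>; items: string[] };
type MiniState = {
  currentRoomId: string;
  inventory: string[];
  world: Record<string, MiniRoom>;
};
type ActionResult = { ok: boolean; state: MiniState; message: string };

// MOVE(direction): validate the exit, then update currentRoomId.
function move(state: MiniState, direction: string): ActionResult {
  const destination = state.world[state.currentRoomId].exits[direction];
  if (destination === undefined) {
    return { ok: false, state, message: `No exit to the ${direction}.` };
  }
  return { ok: true, state: { ...state, currentRoomId: destination }, message: `Moved ${direction}.` };
}

// PICKUP(itemId): validate the item is on the floor, then move it to the inventory.
function pickup(state: MiniState, itemId: string): ActionResult {
  const room = state.world[state.currentRoomId];
  if (!room.items.includes(itemId)) {
    return { ok: false, state, message: `There is no ${itemId} here.` };
  }
  const updatedRoom = { ...room, items: room.items.filter((id) => id !== itemId) };
  return {
    ok: true,
    state: {
      ...state,
      inventory: [...state.inventory, itemId],
      world: { ...state.world, [state.currentRoomId]: updatedRoom },
    },
    message: `Picked up ${itemId}.`,
  };
}
```

A failed validation returns the original state untouched, so the Narrator can describe the failure without the Engine having changed anything.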

3. The LLM Interface Layer

Intent Parser (Input)

The LLM will be given the user's text and a simplified list of valid targets. Expected JSON Output:

[
  { "action": "MOVE", "params": { "direction": "north" } },
  { "action": "EQUIP", "params": { "itemId": "rusty_sword" } }
]
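
Because the LLM's reply is untrusted text, the server would likely validate it before handing anything to the Engine. The sketch below is one way to do that for the four primitives listed earlier. The `Command` type, the `REQUIRED_PARAM` table, and the choice to reject the whole batch on any malformed entry are assumptions for illustration.

```typescript
type Command =
  | { action: 'MOVE'; params: { direction: string } }
  | { action: 'PICKUP'; params: { itemId: string } }
  | { action: 'EQUIP'; params: { itemId: string } }
  | { action: 'ATTACK'; params: { targetId: string } };

// The single parameter each action requires (taken from the Engine primitives).
const REQUIRED_PARAM: Record<Command['action'], string> = {
  MOVE: 'direction',
  PICKUP: 'itemId',
  EQUIP: 'itemId',
  ATTACK: 'targetId',
};

// Parse raw LLM text into commands; returns null if anything is malformed.
function parseCommands(raw: string): Command[] | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (!Array.isArray(data)) return null;
  const commands: Command[] = [];
  for (const entry of data) {
    if (typeof entry !== 'object' || entry === null) return null;
    const { action, params } = entry as { action?: unknown; params?: unknown };
    // Object.keys avoids matching inherited names such as "toString".
    if (typeof action !== 'string' || !Object.keys(REQUIRED_PARAM).includes(action)) return null;
    if (typeof params !== 'object' || params === null) return null;
    const paramName = REQUIRED_PARAM[action as Command['action']];
    const value = (params as Record<string, unknown>)[paramName];
    if (typeof value !== 'string') return null;
    // Keep only the expected parameter; extra fields from the LLM are dropped.
    commands.push({ action, params: { [paramName]: value } } as Command);
  }
  return commands;
}
```

This only checks shape. Whether "rusty_sword" actually exists in the room remains the Engine's job.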

Narrator (Output)

The LLM receives the turnLog to generate the final response. Input Context:

Player Action: "I head north and stab the rat."
Engine Results:
1. Move North: Success. New Room: "Damp Cave".
2. Attack Rat: Success. Rolled 18 (Hit). Dealt 4 Damage. Rat dies.

Generated Output: "You leave the corridor, stepping into a damp cave that smells of mildew. A large rat lunges from the shadows! You react instantly, skewering it mid-air. It lets out a final squeak before hitting the floor."


Phase 3: Procedural Generation & Quest System

1. Dungeon Generation Algorithm

For the prototype, we will use a Random Walk algorithm with constraint checking to ensure playability.

Algorithm Steps

  1. Initialize Grid: Create a 10x10 empty grid.
  2. Start Point: Place Entrance at a random edge.
  3. Walk: An agent moves randomly (N/S/E/W), carving "Rooms" until ~15 rooms exist.
  4. Connectivity: Ensure all rooms are reachable (no islands). A single-agent Random Walk produces a connected layout by construction, and a flood-fill check from the Entrance confirms it.
  5. Tagging:
    • Mark the room furthest from the Entrance as the Boss Room (locked by default).
    • Place Exit behind the Boss Room.
    • Randomly select 2 dead-end rooms (or secluded rooms) for Treasure Chests.

2. Quest System: "The Escape"

We will define a rigid Quest Interface to allow both static and dynamic creation.

Data Structure

interface QuestStep {
  id: string;
  description: string;
  isComplete: boolean;
  type: 'FIND_ITEM' | 'KILL_ENTITY' | 'UNLOCK_DOOR';
  targetId: string; // The ID of the key, monster, or door
}

interface Quest {
  id: string;
  title: string;
  description: string; // LLM Flavour text: "Escape the goblin prison!"
  steps: QuestStep[];
  reward: {
    xp: number;
    items?: string[];
  };
  winCondition: 'ALL_STEPS' | string; // 'ALL_STEPS', or the id of the single step that completes the quest
}
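
A small sketch of how the Engine might evaluate these structures. It reads winCondition as either 'ALL_STEPS' or the id of one decisive step, which is an interpretation of the interface above rather than something the plan spells out. `StepLike`, `QuestLike`, `recordProgress`, and `isQuestComplete` are hypothetical names, and the types are trimmed to the fields used.

```typescript
type StepType = 'FIND_ITEM' | 'KILL_ENTITY' | 'UNLOCK_DOOR';
type StepLike = { id: string; type: StepType; targetId: string; isComplete: boolean };
type QuestLike = { steps: StepLike[]; winCondition: string }; // 'ALL_STEPS' or a step id

// Mark every step matching the event as complete; returns a new quest object.
function recordProgress(quest: QuestLike, type: StepType, targetId: string): QuestLike {
  const steps = quest.steps.map((step) =>
    step.type === type && step.targetId === targetId ? { ...step, isComplete: true } : step,
  );
  return { ...quest, steps };
}

// A quest is won when all steps are done, or when its one decisive step is done.
function isQuestComplete(quest: QuestLike): boolean {
  if (quest.winCondition === 'ALL_STEPS') {
    return quest.steps.every((step) => step.isComplete);
  }
  const decisive = quest.steps.find((step) => step.id === quest.winCondition);
  return decisive !== undefined && decisive.isComplete;
}
```

The Engine could call `recordProgress` after each successful PICKUP or ATTACK and then check `isQuestComplete` before handing the turn to the Narrator.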

"Escape" Scenario Implementation

This specific scenario will be generated as follows:

  1. Item Generation:
    • Create 3 unique Keys: Key_Red (Exit Door), Key_Blue (Weapon Chest), Key_Green (Potion Chest).
    • Create 1 Boss Enemy and N Roaming Enemies.
  2. Distribution:
    • Place Key_Red (Exit) inside one of the Chests or on a roaming enemy (randomized).
    • Place Key_Blue & Key_Green in random rooms (hidden in "furniture" or dropped by mobs).
  3. Locking:
    • Boss Room door (the only route to the Exit) requires Key_Red.
    • Chest 1 requires Key_Blue.
    • Chest 2 requires Key_Green.

3. Enemy AI (Roaming)

To make the dungeon feel alive, enemies won't just stand still.

  • Turn Based: Every time the player takes an action (Move/Wait), enemies take a turn.
  • Logic:
    1. Aggro: If Player is in same room -> Attack.
    2. Patrol: If Player not in room -> Move to a random adjacent connected room (25% chance to stay still).
    3. Leash: Enemies generally stay within 2-3 rooms of their spawn point to prevent clumping.
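
The three rules above can be expressed as one decision function per enemy turn. In this sketch the leash is enforced as a hard limit, although the plan only says enemies "generally" stay close, and room distances from the spawn point are assumed to be precomputed. `EnemyContext`, `EnemyAction`, and `decideEnemyAction` are hypothetical names.

```typescript
type EnemyAction =
  | { kind: 'ATTACK' }
  | { kind: 'STAY' }
  | { kind: 'MOVE'; toRoomId: string };

type EnemyContext = {
  enemyRoomId: string;
  playerRoomId: string;
  adjacentRoomIds: string[]; // rooms connected to the enemy's current room
  distanceFromSpawn: Record<string, number>; // assumed precomputed, in rooms
  leashRadius: number; // e.g. 2 or 3
};

function decideEnemyAction(ctx: EnemyContext, rng: () => number = Math.random): EnemyAction {
  // 1. Aggro: the player is in the same room.
  if (ctx.enemyRoomId === ctx.playerRoomId) return { kind: 'ATTACK' };
  // 2. Patrol: 25% chance to stay still this turn.
  if (rng() < 0.25) return { kind: 'STAY' };
  // 3. Leash: only consider rooms within the radius; unknown rooms count as too far.
  const allowed = ctx.adjacentRoomIds.filter(
    (id) => (ctx.distanceFromSpawn[id] ?? Infinity) <= ctx.leashRadius,
  );
  if (allowed.length === 0) return { kind: 'STAY' };
  return { kind: 'MOVE', toRoomId: allowed[Math.floor(rng() * allowed.length)] };
}
```

Returning a description of the action, not applying it, lets the Engine run the enemy's MOVE or ATTACK through the same primitives the player uses.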

4. LLM Integration for Generation

While the logic is hardcoded, the names and descriptions are generated once at dungeon start.

  • Prompt: "Generate a dungeon theme. Theme: 'Goblin Prison'. Give me names for 3 keys, a boss name, and a description for the damp boss room."
  • Result:
    • Key_Red -> "Rusty Iron Key"
    • Boss -> "Grumlock the Jailor"
    • Room -> "A circular chamber reeking of stale meat..."

Phase 4: UI/UX & Client Architecture

1. Layout Structure (The "Dashboard")

The interface is designed to resemble a modern cockpit for a text adventure.

  • Header: Game Title, Global Menu (Save/Settings).
  • Main Content (3-Column Grid):
    • Left Column (Visuals):
      • Scene View: A static 16:9 image container showing the current room context (e.g., dungeon_sparse_lmr.png).
      • Mini-Map: A canvas-based grid visualizer. Renders visited rooms as squares and connections as lines. Unvisited rooms are black (Fog of War). Current room highlighted.
    • Center Column (The Narrative):
      • Scrollable Log: The history of LLM text and system messages.
      • Input Area: Fixed at the bottom.
        • Text Input (> type here...)
        • Quick Actions Bar (Contextual buttons: "Attack", "Search", "Wait").
    • Right Column (State):
      • Character Card: HP bar, Stats, Level.
      • Inventory List: Clickable items (Open modal for details/actions).
      • Room Objects: List of interactables currently visible (Enemies, Chests, Doors).

2. Visual Assets Strategy

To keep the prototype visually complete without dynamic generation yet, we use a Compositional Asset Naming Convention.

  • Base Path: /assets/images/rooms/
  • Naming Schema: [biome]_[type]_[exits].png
  • Parameters:
    • biome: dungeon (fixed for now).
    • type: empty, sparse, furnished, treasure, boss.
    • exits: A string combination of l (left), m (middle/forward), r (right).
      • Example: dungeon_sparse_lr.png (Room with exits Left and Right).
      • Fallback: If specific exit variation missing, fall back to [biome]_[type].png.
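
The naming schema and its fallback could be implemented along these lines. The sketch assumes exit letters always appear in l, m, r order (as in the dungeon_sparse_lmr.png and dungeon_sparse_lr.png examples) and that the client has some manifest of available files, represented here by a Set. How compass exits map to l/m/r is not specified in this plan and is not addressed. Function names are hypothetical.

```typescript
type RoomType = 'empty' | 'sparse' | 'furnished' | 'treasure' | 'boss';
type ExitSlot = 'l' | 'm' | 'r';

const BASE_PATH = '/assets/images/rooms/';

// Build candidate file paths, most specific first.
function roomImageCandidates(biome: string, type: RoomType, exits: ExitSlot[]): string[] {
  const order: ExitSlot[] = ['l', 'm', 'r'];
  // Normalise to l, m, r order regardless of how the caller listed the exits.
  const suffix = order.filter((slot) => exits.includes(slot)).join('');
  const fallback = `${BASE_PATH}${biome}_${type}.png`;
  return suffix === '' ? [fallback] : [`${BASE_PATH}${biome}_${type}_${suffix}.png`, fallback];
}

// Pick the first candidate that exists; `available` stands in for a manifest of shipped files.
function resolveRoomImage(candidates: string[], available: Set<string>): string | undefined {
  return candidates.find((path) => available.has(path));
}
```

For instance, `roomImageCandidates('dungeon', 'sparse', ['r', 'l'])` yields the `dungeon_sparse_lr.png` path followed by the `dungeon_sparse.png` fallback.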

3. Frontend Technology (React + Tailwind)

Components

  • GameContainer: Manages the main API polling/WebSocket connection.
  • TerminalLog: A virtualized list (using react-window or similar if text gets long) to handle infinite scroll efficiently.
  • RoomRenderer: Logic to determine which PNG to load based on the current room (GameState.world[GameState.currentRoomId]).
  • MapCanvas: HTML5 Canvas component that draws the grid graph relative to the player's position.

State Management

  • Zustand Store:
    • messages[]: Array of { sender: 'user'|'system'|'narrator', text: string }.
    • gameState: The full JSON object synced from backend (Player, Room, etc.).
    • uiState: { isInventoryOpen: boolean, isLoading: boolean }.

4. Interaction Design

  • Hybrid Input:
    • Typing: Users can type "Pick up sword".
    • Clicking: Clicking "Rusty Sword" in the Room Objects panel triggers the exact same API call (pickup sword) as if the user typed it. This teaches the user the text commands while offering GUI convenience.