From Prompt to Platform

Building an AI-powered D&D game engine, from a single system prompt to a real-time multiplayer platform.

↓56% Token Cost

4 Specialized AI Roles

Zero Game State Drift

Claude AI · AI Systems Design · Prompt Engineering

The Problem

D&D Needs a DM. Finding One Is Hard.

Dungeons & Dragons is one of the most collaborative tabletop games ever made, but it requires one person to run the whole show: the Dungeon Master. A DM designs worlds, interactive stories, and encounters; voices every NPC; enforces rules; tracks game state; and improvises for hours at a time.

D&D has over 50 million players worldwide and saw 33% year-over-year growth for seven consecutive years — but the player base is expanding faster than the DM pool. Only 1 in 6 players will step into the DM role, and burnout is the second most common reason campaigns end before their natural conclusion.

1 in 6

Players willing to take on the DM role

10%

Online players seeking games offer to be the DM

#2

Reason campaigns end: DM burnout

29%

Of DMs spend 3+ hours preparing for each session

The campaign home screen — start a new adventure or pick up where your party left off.

The Approach

What If the AI Could Be the DM?

Claude has the language ability, rules knowledge, and narrative range to serve as a Dungeon Master. It can improvise story beats, voice NPCs, resolve ambiguous player actions, and keep a campaign feeling alive — all the things that make DMing demanding for a human to sustain across a long campaign.

But a language model alone can't run a game. Context windows don't persist state. Without an external source of truth, a complex game would eventually outpace the model's ability to track it reliably. The question wasn't whether AI could narrate a D&D session. It was whether you could build a game engine around it that was reliable enough to give players a fun, consistent experience.

Customized for you

Campaign Setup

Before a session begins, someone needs to design the adventure. In a traditional game, that's the DM's job. I wanted to give users a structured setup flow that collects the right information upfront and uses it to shape the AI's behavior, the game's style and tone, and the other elements that make a campaign enjoyable for its players.

The solution is a Campaign Architect feature: a player talks with the agent so it learns what kind of campaign they and their friends want. This planning conversation is spoiler-safe but still gives players some agency over the style of adventure they will be playing, much as they would have with a human DM.

The Campaign Architect asks players about their preferred style, length, and party before the adventure begins.

Early game interface showing the original dark amber theme with battlemap, character sidebar, and DM chat

The first working version: one system prompt, no persistent state, no structured output.

Starting simple

The First Experiment: One Large System Prompt

The initial experiment relied on a carefully crafted 14KB system prompt that told Claude who it was, what rules to follow, and how to respond. It worked well for storytelling. Rich narrative, consistent NPC voices, strong rules knowledge.

Problems surfaced with the integrated scene map (the battlemap), especially when many details had to be tracked at once, as in combat. Claude was tracking everything in its context window: hit points, positions, turn order, conditions, and so on. Numbers drifted. HP changed by inconsistent amounts. Positions were inconsistent. Rules were applied unevenly depending on what Claude could recall from earlier in the conversation.

Keeping the current game state predominantly in context produced an experience that was fine when details could stay loose but failed when precision was required.

More than an agent

Separating Narrative from State

The fix was architectural. Instead of letting Claude respond in free text, every response was forced through a structured tool call. Claude had to return two things: a narrative block for the players, and an actions array describing every state change, expressed as structured data that the engine applied deterministically.

This separated two things that had been mixed together: the art of storytelling and the accounting of a game. Claude kept the creative work. The engine kept the books. Neither had to do the other's job.
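A minimal sketch of that split, in Python. The action types (`damage`, `heal`, `move`), the field names, and the `apply_actions` helper are illustrative assumptions, not the platform's actual schema. The model's output is treated as a plain dict shaped like a tool call's input; the engine checks each action and applies it with ordinary code, so the arithmetic never depends on the model.

```python
from dataclasses import dataclass, field


@dataclass
class Creature:
    name: str
    hp: int
    max_hp: int
    pos: tuple[int, int]  # grid square as (column, row)


@dataclass
class GameState:
    creatures: dict[str, Creature] = field(default_factory=dict)


def apply_actions(state: GameState, actions: list[dict]) -> list[str]:
    """Apply each action in order; return descriptions of rejected actions."""
    rejected = []
    for action in actions:
        target = state.creatures.get(action.get("target", ""))
        if target is None:
            rejected.append(f"unknown target: {action}")
            continue
        kind = action.get("type")
        if kind == "damage":
            # Clamp so HP can never drop below zero.
            target.hp = max(0, target.hp - int(action["amount"]))
        elif kind == "heal":
            # Clamp so healing can never exceed the maximum.
            target.hp = min(target.max_hp, target.hp + int(action["amount"]))
        elif kind == "move":
            target.pos = (int(action["x"]), int(action["y"]))
        else:
            rejected.append(f"unknown action type: {action}")
    return rejected


# Hypothetical tool-call payload, shaped as the model might return it.
response = {
    "narrative": "The goblin staggers as your arrow finds its shoulder.",
    "actions": [
        {"type": "damage", "target": "goblin_1", "amount": 5},
        {"type": "move", "target": "goblin_1", "x": 4, "y": 7},
    ],
}

state = GameState({"goblin_1": Creature("Goblin", hp=7, max_hp=7, pos=(3, 7))})
rejected = apply_actions(state, response["actions"])
```

Only the `narrative` field would be shown to players. Rejected actions are collected rather than raised so a single malformed entry does not discard the rest of the turn; that choice is also an assumption of this sketch.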

The Architecture

Player Input

Text command sent to the game engine

↓

Claude API

Receives structured context, must return a tool call

↓

Structured Response

Narrative text + actions array (separate fields)

↓

Game Engine

Applies all state mutations deterministically

↓

State Store

HP, positions, conditions, etc.

Story Broadcast

All players update live

Rich experience

The World Beyond the Chat Window

The chat tells the story, and the panels around it make it richer. Scene images drop players into each new location the moment it changes, and character portraits give faces to NPCs and creatures so players can feel the joy of seeing a familiar face or the dread of seeing something truly dangerous.

The game map shows where everyone stands relative to each other and the environment (players, NPCs, buildings, doors, objects, etc.), so positioning decisions feel grounded rather than abstract. When it's time to roll, integrated character sheets and Beyond20 browser extension support make it easy for players to roll quickly.

Combat scene with battle map, character sheet, and game chat

The full game environment: battle map, character sheet actions, and live DM narrative side by side.

Zoomed battle map with character token and enemy positions

Visual dice rolling experience keeps players immersed.

Making a connection

Dice Animations and Synthesized Audio

Visible dice rolling matters. When you roll at a physical table, everyone sees it. There's anticipation, shared reaction, a moment that belongs to the group. Text-based games can lose that.

The dice animation plays for all players simultaneously. Multi-die rolls stagger by 80ms per die so a handful of d6s doesn't collapse into a single indistinguishable visual event. Small details, but they help a player connect with the game.
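The stagger itself is simple arithmetic. As a rough sketch, assuming the first die starts immediately and each later die starts 80ms after the previous one (the function name is hypothetical):

```python
STAGGER_MS = 80  # per-die delay described above


def die_start_delays(num_dice: int, stagger_ms: int = STAGGER_MS) -> list[int]:
    """Return the animation start offset, in milliseconds, for each die."""
    return [i * stagger_ms for i in range(num_dice)]


delays = die_start_delays(4)  # a 4d6 roll -> [0, 80, 160, 240]
```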

Campaign Memory

Improving The Long Campaign Experience

Keeping a campaign alive across hundreds of sessions requires remembering things: NPC names, motivations, past decisions, ongoing quests. These details don't survive in a general-purpose AI's context window, and they can be hard to track even with a human DM, depending on how organized the players and DM are.

With an NPC section filled with details about everyone met along the way, a quest log that shows clearly defined tasks, and session logs with a full transcript, players can easily stay up to date with the evolving campaign, even if they miss a session, have to leave the campaign for a while, or join midway through.

NPC browser, session log, and quest tracker — persistent memory across the full campaign.

Making it Affordable

Right Model for the Right Job

Running an AI-powered game has real costs. Every high-capability model response adds up across a session, and some tasks simply don't need that level of capability. Monster tactical decisions, session summarization, and wall detection from map images are high-volume, low-ambiguity tasks that a lighter, faster model handles just as well.

The platform routes DM narrative to the most capable model and everything else to a smaller one. That single routing decision drops per-session model cost by 56% compared to running everything through a single high-capability model. And it's just the first lever.

Model Routing

DM Narrative → Sonnet

Monster Tactics → Haiku

Session Summarization → Haiku

Wall Detection (Vision) → Haiku

Measured per-call cost reduction ↓35%
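The routing table above could be expressed as a small lookup, sketched here in Python. The task keys and the `pick_model` helper are hypothetical, the values are model tiers rather than real model identifiers, and falling back to the more capable tier for unknown tasks is an assumption rather than something the platform is known to do.

```python
# Task type -> model tier, mirroring the routing table above.
ROUTES = {
    "dm_narrative": "sonnet",
    "monster_tactics": "haiku",
    "session_summary": "haiku",
    "wall_detection": "haiku",
}

# Assumption: when a task is not listed, prefer quality over cost.
DEFAULT_TIER = "sonnet"


def pick_model(task: str) -> str:
    """Return the model tier that should handle the given task type."""
    return ROUTES.get(task, DEFAULT_TIER)
```

Keeping the mapping in one table means a task can be moved between tiers by changing a single entry, which makes it easy to test a cheaper model against the logged cost data.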

Token usage audit: every API call logged with model, tokens, cache status, and cost.

Cost Engineering

Cache Strategy, Logging, and Tool-First Design

Model routing cuts cost by choosing the right model for each task. Caching cuts it further by not re-sending the same context twice. The game's memory is structured with this in mind: stable content (campaign rules, character data, world state, and session summaries) is front-loaded so it can be cached across requests. Across logged sessions, these stable blocks achieve a 100% cache hit rate, and about 22% of all input tokens are served from cache rather than billed at full price. Caching alone reduces DM response costs by ~15%, compounding directly on top of the routing savings.
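One way to front-load stable content is sketched below. The `cache_control` field follows the shape Anthropic's prompt caching uses on content blocks; the function name and the section contents are hypothetical, and the source does not say how many cache breakpoints the platform actually sets. The code only builds the request blocks and makes no API call.

```python
def build_system_blocks(stable_sections: list[str],
                        volatile_sections: list[str]) -> list[dict]:
    """Order prompt sections so the stable prefix can be cached."""
    blocks = [{"type": "text", "text": s} for s in stable_sections]
    if blocks:
        # Mark the end of the stable prefix as the cache breakpoint.
        blocks[-1]["cache_control"] = {"type": "ephemeral"}
    # Content that changes every turn goes after the breakpoint.
    blocks += [{"type": "text", "text": s} for s in volatile_sections]
    return blocks


blocks = build_system_blocks(
    stable_sections=["<campaign rules>", "<character data>", "<session summaries>"],
    volatile_sections=["<current turn and positions>"],
)
```

Because caching matches on an exact prefix, anything that changes between requests has to sit after the breakpoint; a single edit inside the stable sections would invalidate the cached prefix.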

Within individual campaigns, session summarization keeps the active context window bounded. Without it, input tokens grow unboundedly as a campaign progresses. With it, per-call input token counts dropped 44–60% over the course of campaigns in the logs, keeping long adventures from becoming prohibitively expensive. Every API call is logged with token count, model, cache status, and cost. That data is the primary tool for deciding where to optimize next, with the goal of making sessions affordable enough to be viable as a real product.
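A rough sketch of how summarization can bound the context: keep the most recent turns verbatim and fold everything older into a running summary. The `compact_history` helper, the `keep_recent` parameter, and the stub summarizer are assumptions for illustration; in practice the summarizer would be a call to the smaller model.

```python
from typing import Callable

Summarizer = Callable[[str, list[str]], str]


def compact_history(summary: str, turns: list[str], keep_recent: int,
                    summarize: Summarizer) -> tuple[str, list[str]]:
    """Fold all but the last `keep_recent` turns into the running summary."""
    if len(turns) <= keep_recent:
        return summary, turns
    cut = len(turns) - keep_recent
    older, recent = turns[:cut], turns[cut:]
    return summarize(summary, older), recent


def stub_summarize(previous: str, older: list[str]) -> str:
    """Stand-in for a model call; just records how many turns were folded."""
    return f"{previous} [+{len(older)} turns]".strip()


turns = ["t1", "t2", "t3", "t4", "t5"]
summary, recent = compact_history("", turns, keep_recent=2,
                                  summarize=stub_summarize)
```

With this shape, the prompt carries one summary plus at most `keep_recent` raw turns, so input size stops growing with campaign length.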

A third lever is tool-first design. When Claude calls a structured function to look up a rule, update a position, or fetch campaign state, it uses far fewer tokens than reasoning through the same task from prose in context. Building new tools as features grow keeps per-task token usage efficient, and compounds with routing and caching to reduce the cost of every session.
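As an illustration of tool-first design, the sketch below registers small lookup functions and dispatches a tool call to them by name. Every name here (`TOOLS`, `get_position`, `distance_in_squares`, `dispatch`) and the sample positions are hypothetical, and counting a diagonal step as one square is an assumed grid rule, not one the source specifies.

```python
from typing import Any, Callable

TOOLS: dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function so the model can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn


# Made-up sample state; the real engine would read from its state store.
POSITIONS = {"aria": (2, 3), "goblin_1": (4, 7)}


@tool
def get_position(creature_id: str) -> dict:
    x, y = POSITIONS[creature_id]
    return {"creature_id": creature_id, "x": x, "y": y}


@tool
def distance_in_squares(a: str, b: str) -> dict:
    (ax, ay), (bx, by) = POSITIONS[a], POSITIONS[b]
    # Assumed rule: a diagonal step counts as one square (Chebyshev distance).
    return {"squares": max(abs(ax - bx), abs(ay - by))}


def dispatch(call: dict) -> Any:
    """Run a tool call of the form {"name": ..., "input": {...}}."""
    fn = TOOLS.get(call["name"])
    if fn is None:
        return {"error": f"unknown tool: {call['name']}"}
    return fn(**call["input"])


result = dispatch({"name": "distance_in_squares",
                   "input": {"a": "aria", "b": "goblin_1"}})
```

The model gets back a short structured answer instead of having to re-derive the distance from a prose description of the map, which is where the token savings would come from.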

What I Learned

Four Lessons from Building an AI Platform

The platform evolved significantly from its first version. These are the decisions that had the biggest impact on reliability, player experience, and cost.

Create tools for agents. AI is most reliable when it expresses intent through structured data, not free text. Design the interface between AI and code before designing the screens around it.
Visual references beat text descriptions. Claude with a battle map coordinate system produced spatially accurate combat. Without it, positions drifted. Give AI agents ground truth wherever possible.
Build your own observability tools. The admin panel and token auditor weren't conveniences. They were how I understood what was happening inside the platform and made real-time decisions about design and cost.
Prototype over mockups. This entire platform started as a conversation with a system prompt. Building early revealed design problems that wireframes would have missed entirely.
