Harker’s Escape Lesson 1: LLMs Don't Automatically Make Gameplay Better
A Year of Discoveries at Atomic Dirt
Building Harker’s Escape taught us that while LLM integration has real potential, it also has very real problems we shouldn’t hand-wave away. Issues like hallucinations, confabulations, and gaslighting are well documented, and they become arguably more problematic when you’re building a game like Harker’s Escape.
Right now, many (most?) early AI integrations in games are primarily about generating “flavor copy” that makes a game a bit richer, but this largely offers only an illusion of spontaneity; e.g., a player never hears an NPC say the exact same sentence twice. But is the juice worth the squeeze?
A lesson from Harker’s Escape:
In Harker’s Escape there’s a narrator who helps immerse you in the setting. When they’re doing their job right, they give you meaningful context that makes the world more immersive and escaping Dracula’s castle more achievable. But it’s not always easy for an LLM to know its job responsibilities…
Example: In Harker’s Escape, you can smash up a table and chairs to use the pieces in various ways to escape.
Here’s an early example of what we learned after I intentionally dropped a chair:
The visuals show me simply dropping the chair, while at the same moment our LLM-powered narrator says: “The chair slips through your grasp and crashes into the darkness.”
Anyone who’s worked with an LLM has likely seen similar incorrectly mashed-together thoughts, where the LLM pulls some previous context (e.g., you are in Dracula’s castle, which is likely dark and creepy) and force-feeds it into your latest command (dropping the chair).
At best, the narration in this example is unnecessary: the Unity Engine shows me dropping the chair, and I can clearly see it. At worst, it is inaccurate. It creates immediate cognitive dissonance between what I did (the chair didn’t slip), what happened (it didn’t crash), and where it ended up (it’s not in the dark).
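One way to blunt that dissonance is to let the engine, not the model, own the facts: narration only ships if it doesn’t contradict what the engine logged. Here’s a minimal sketch of that idea; every name in it is hypothetical, and it is not our actual implementation:

```python
# Hypothetical sketch: grounding narration in game state. The narrator may
# embellish, but any line that contradicts the engine's event log is rejected.

from dataclasses import dataclass

@dataclass
class WorldEvent:
    """Ground truth reported by the engine for one player action."""
    verb: str           # what actually happened, e.g. "dropped"
    obj: str            # the object involved, e.g. "chair"
    room_is_dark: bool  # lighting state where the object landed

# Words the narrator may not use unless the event log supports them.
FORBIDDEN_UNLESS = {
    "slips": lambda e: e.verb == "slipped",
    "crashes": lambda e: e.verb == "crashed",
    "darkness": lambda e: e.room_is_dark,
}

def validate_narration(line: str, event: WorldEvent) -> bool:
    """Return True only if the line's claims are consistent with the event."""
    words = line.lower()
    return all(check(event) for word, check in FORBIDDEN_UNLESS.items()
               if word in words)

event = WorldEvent(verb="dropped", obj="chair", room_is_dark=False)
bad_line = "The chair slips through your grasp and crashes into the darkness."
good_line = "The chair clatters onto the stone floor at your feet."

print(validate_narration(bad_line, event))   # False: contradicts the event
print(validate_narration(good_line, event))  # True: consistent with it
```

A rejected line can simply be dropped, since silence beats a narrator who saw a different game than you did.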
Going Commando
The “chair drop” highlights an ongoing question: what precise problem are we even having the LLM solve?
For instance, let’s say we injected an LLM and ElevenLabs (an incredibly impressive AI voice platform) into the original Command & Conquer real-time strategy game. We give the LLM enough proper context to play the Commando unit and connect it to ElevenLabs so he can deliver dynamic responses when you give him an order. Now, instead of always hearing a stock quote like “I’m on it” or “No problem,” our AI-infused Commando could reply with an effectively infinite number of order confirmations. Assuming we can train him properly, this could make the game a nicer experience for the player, but the gameplay is exactly the same.
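To make the thought experiment concrete, here’s a hypothetical sketch of the plumbing. It doesn’t call any real LLM or ElevenLabs API; it only shows how the order context might be assembled into a grounded prompt, with the original stock quotes kept as a fallback:

```python
# Hypothetical sketch (no real APIs): assemble a grounded prompt for the
# Commando and fall back to the original canned lines when generation
# fails or repeats itself.

import random

STOCK_QUOTES = ["I'm on it", "No problem"]  # the original confirmations

def build_commando_prompt(order: str, target: str) -> str:
    """Assemble a persona plus the exact order the player just gave."""
    return (
        "You are the Commando, a terse elite soldier in Command & Conquer.\n"
        f"The player just ordered you to {order} {target}.\n"
        "Reply with one short confirmation line. Do not invent events."
    )

def pick_confirmation(generated, recent):
    """Use the generated line unless it's missing or a recent repeat."""
    if generated and generated not in recent:
        return generated
    return random.choice(STOCK_QUOTES)

print(build_commando_prompt("attack", "the enemy barracks"))
print(pick_confirmation("Consider it rubble.", recent=[]))
print(pick_confirmation(None, recent=[]))  # falls back to a stock quote
```

Note that the prompt contains only facts the game just produced (the order and its target), which is the same grounding discipline the chair-drop example demands of the narrator.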
Is all of this lift worth it? Is making a game a bit richer worth the time and effort it takes to fight off illusion-breaking hallucinations? Maybe; your mileage may vary. But I believe pushing the technology further so it can unlock new kinds of gameplay is worth the lift.
Takeaway: Our industry should prioritize AI that truly elevates gameplay, creating novel experiences and streamlining complexity. If you're using AI to add flashy bells and whistles, that's your prerogative, but ensure they offer genuine value and a return on investment.
Next on the Dirt: we’ll dig into generative AI in 3D