How I built an AI story generator in 1 day that writes better than I do

Nov 1, 2022

“The morning air is chill and the fog is rolling in, shrouding the valley in a thick blanket of white. The sun is a mere sliver of light on the horizon, casting a weak glow over the landscape. There is a silence in the air, broken only by the sound of the wind rustling through the trees.”
— John Muir describing Yosemite, 1868

Ha, you’ve been fooled! The quote above is not from John Muir, but from asking OpenAI’s GPT-3 AI engine to describe Yosemite on a calm morning. My mind has been blown by the creative capabilities of AI, particularly in the arena of creative storytelling. Every piece of generated writing has left me laughing out loud, shocked by a plot twist, or simply moved by the beauty of the language.

An excerpt from an AI generated story. The AI is exceptionally good at writing suspenseful plots.

My programmer friend Eric and I set out to share the joys of AI storytelling with more people. Our goal was simple:

Make people chuckle 🤭
Excite people about the future of this nascent technology 📈

The result is 🙈 InfiniteMonkeys 🙊, a super simple web app that allows anyone to create short stories with AI.

Just type in a prompt, and press Generate.

The app writes a 200–400 word story along with an AI-generated illustration! Here’s an example result!

Here are 3 nuggets I learned from building InfiniteMonkeys:

AI is now accessible to everyone through easy-to-use APIs. 🙌🏼
Some of these APIs are slow 🐌 and costly 💰.
The modern web stack makes building simple web apps trivial ⏩. I’ll do a deep dive into how we built the entire app 🔨.

1. AI is now accessible to all

For the first time, a non-AI/ML trained person such as myself can add “intelligence” to their app with a simple API call. APIs we used:

Text generation: OpenAI’s GPT-3 Davinci model. No training or data sets required, just a bit of trial and error to tweak prompt formats and parameters. Just ask for what you want, e.g. “Write me a story about …”, and the model will generate it!
Image generation: StabilityAI’s Stable Diffusion API. Again, no training required. It takes in a text prompt and returns an image. The API was a little difficult to use.
Text to speech: Amazon Polly has a simple API that takes in text and returns a URL to stream the generated audio. The voice quality and modulation are much more natural and realistic than I expected.
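To give a feel for how small the text-generation call is, here is a minimal sketch of a request to OpenAI's completions endpoint. The model name `text-davinci-002`, the token limit, and the temperature are assumptions for illustration, not our exact settings, and `buildStoryPrompt` and `buildStoryRequest` are hypothetical helper names.

```typescript
// Sketch of a GPT-3 completion request. The model name, token limit, and
// temperature are assumptions for illustration, not the app's real settings.

interface CompletionBody {
  model: string;
  prompt: string;
  max_tokens: number;
  temperature: number;
}

interface HttpRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Wrap the user's idea in a plain-English instruction.
function buildStoryPrompt(idea: string): string {
  return `Write me a story about ${idea.trim()}.`;
}

// Assemble the HTTP request for OpenAI's completions endpoint.
function buildStoryRequest(idea: string, apiKey: string): HttpRequest {
  const body: CompletionBody = {
    model: "text-davinci-002", // assumed Davinci variant
    prompt: buildStoryPrompt(idea),
    max_tokens: 600, // assumed: rough headroom for a 200-400 word story
    temperature: 0.9, // assumed: higher values give more varied text
  };
  return {
    url: "https://api.openai.com/v1/completions",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(body),
    },
  };
}

const request = buildStoryRequest("a magical seamstress", "YOUR_API_KEY");
console.log(request.init.body);
```

Passing `request.url` and `request.init` to `fetch` would send the request; the generated text comes back in `choices[0].text` of the JSON response.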

A story about a magical seamstress. GPT-3’s writing is natural, creative, and witty.

2. Some of these APIs are slow and costly

Specifically, GPT-3 and Stable Diffusion have quite high latency, which is completely understandable given how powerful they are, but still a unique challenge to build smooth UX around. For example, you’ll see that InfiniteMonkeys takes ~15 seconds to generate a story. Both GPT-3 and Stable Diffusion take >10 seconds to execute. This means slow loading times for users. I may be missing some performance optimizations, but this was my out of the box experience.

We spent a lot of time perfecting the loading animation since the user has to stare at it for so long.
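One general way to keep the wait closer to the slower of two calls, rather than their sum, is to start both before awaiting either. The sketch below uses stand-in functions with made-up names and short fake delays; it is not our production code, and it only applies when the image prompt does not depend on the finished story.

```typescript
// Stand-ins for the two slow API calls; names and delays are hypothetical.
function delay<T>(ms: number, value: T): Promise<T> {
  return new Promise((resolve) => setTimeout(() => resolve(value), ms));
}

function generateStoryText(prompt: string): Promise<string> {
  return delay(30, `Story about ${prompt}`);
}

function generateIllustration(prompt: string): Promise<string> {
  return delay(20, `image-for-${prompt}.png`);
}

// Start both requests before awaiting either one, so the total wait is
// roughly the slower call rather than the sum of the two.
async function generateStoryPage(
  prompt: string
): Promise<{ text: string; imageUrl: string }> {
  const [text, imageUrl] = await Promise.all([
    generateStoryText(prompt),
    generateIllustration(prompt),
  ]);
  return { text, imageUrl };
}

generateStoryPage("a calm morning in Yosemite").then((page) => {
  console.log(page.text, page.imageUrl);
});
```

If either call fails, `Promise.all` rejects as a whole, so a real app would still need error handling around it.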

Amazon Polly, on the other hand, is an incredibly performant API that returns in under a second. 👏🏼 to their engineering team.

These APIs, while 100x cheaper than hiring a writer, artist, or voiceover actor to do the equivalent task, are still expensive enough to be cost-prohibitive in some use cases.

Cost breakdown

GPT-3: $0.03 per story on average
Stable Diffusion: $0.02 per image on average. This model is open-source, but we use a hosted API for convenience.
Amazon Polly: $0.02 for highest quality model. $0.005 for standard model.

Total price per story: ~$0.06
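The arithmetic behind that total can be sketched as follows. With Polly's standard voice the three line items sum to $0.055, which rounds to the ~$0.06 above; with the highest-quality voice the same sum is $0.07. The $50 budget is a hypothetical number purely for illustration, and the constant and function names are mine.

```typescript
// Per-story prices from the breakdown above, in tenths of a cent so the
// sums stay exact integers.
const GPT3_PER_STORY = 30; // $0.03
const IMAGE_PER_STORY = 20; // $0.02
const POLLY_STANDARD = 5; // $0.005
const POLLY_HIGHEST_QUALITY = 20; // $0.02

function costPerStory(pollyCost: number): number {
  return GPT3_PER_STORY + IMAGE_PER_STORY + pollyCost;
}

// How many stories a budget (in dollars) covers at a given per-story cost.
function storiesWithinBudget(budgetDollars: number, perStory: number): number {
  return Math.floor((budgetDollars * 1000) / perStory);
}

const standardCost = costPerStory(POLLY_STANDARD); // 55, i.e. $0.055
const highestCost = costPerStory(POLLY_HIGHEST_QUALITY); // 70, i.e. $0.07

console.log(`standard voice: $${(standardCost / 1000).toFixed(3)} per story`);
console.log(`highest-quality voice: $${(highestCost / 1000).toFixed(3)} per story`);

// Hypothetical $50 budget, purely for illustration.
console.log(storiesWithinBudget(50, standardCost)); // 909
console.log(storiesWithinBudget(50, highestCost)); // 714
```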

While $0.06 doesn’t seem like much, it quickly adds up, especially in the context of a free hobby site like the one we built. We’ll see how long we can keep the site up before we exhaust our budget.

The API costs can quickly add up. We’re hoping the costs will decline in the near future.

3. The modern web stack makes building simple web apps trivial

Just as surprising as GPT-3 writing better than I can was that it only took 10 hours to build and deploy the MVP of InfiniteMonkeys. For context, Eric is a web developer, whereas I am an iOS developer who spent a week doing React tutorials in preparation for this project. We both have 5 years of industry experience, so YMMV.

We started off the day with a sketch of what the app would look like. 10 hours later we could show it to our friends through a public URL.

Our web stack

Frontend: Next.js is a framework built on React that makes it super easy to build and deploy web apps. It even allows you to write serverless backend code in the same project. If you hook up your GitHub repo to Vercel (the company that created Next.js), every commit will automatically be deployed to a public URL within minutes!
Backend: We used Firebase to store stories and images. It takes 5 minutes to integrate into your project, has an elegant API, and an easy-to-use GUI console to view and modify your data live.
Leveraging Open Source: Every problem we encountered had an open source solution. For example, StabilityAI’s API was difficult to use. A quick search on npmjs.com revealed an open source JavaScript wrapper to call their API. With so many free and ready-to-use building blocks out there, it felt like our task was just to glue things together.
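To illustrate the "backend code in the same project" point, here is a rough sketch of what a Next.js API route can look like. The file path `pages/api/generate.ts`, the request shape, and the placeholder response are all hypothetical, and the two interfaces are minimal stand-ins for Next.js's `NextApiRequest` and `NextApiResponse` types so the snippet runs on its own.

```typescript
// Minimal stand-ins for Next.js's NextApiRequest / NextApiResponse so the
// sketch runs on its own; a real route would import those types from "next".
interface ApiRequest {
  method?: string;
  body?: { prompt?: unknown };
}

interface ApiResponse {
  status(code: number): ApiResponse;
  json(payload: unknown): void;
}

// Hypothetical handler for a file such as pages/api/generate.ts.
// In a Next.js project it would be that file's default export.
function handler(req: ApiRequest, res: ApiResponse): void {
  if (req.method !== "POST") {
    res.status(405).json({ error: "Use POST" });
    return;
  }
  const prompt = req.body?.prompt;
  if (typeof prompt !== "string" || prompt.trim() === "") {
    res.status(400).json({ error: "Missing prompt" });
    return;
  }
  // Placeholder: the calls to the text and image APIs would go here.
  res.status(200).json({ story: `Story about ${prompt.trim()}` });
}

// Tiny fake response object to exercise the handler without a server.
function makeFakeResponse() {
  const recorded: { code: number; payload: unknown } = { code: 0, payload: null };
  const res: ApiResponse = {
    status(code) {
      recorded.code = code;
      return res;
    },
    json(payload) {
      recorded.payload = payload;
    },
  };
  return { res, recorded };
}

const fake = makeFakeResponse();
handler({ method: "POST", body: { prompt: "two monkeys at a typewriter" } }, fake.res);
console.log(fake.recorded.code, fake.recorded.payload);
```

Keeping API keys inside a server-side handler like this, rather than in browser code, is the main reason to route the AI calls through the backend.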

Time breakdown for MVP (10 hours total)

We ran into surprisingly few glitches while making our MVP. Maybe we just got lucky. We spent a lot of additional time (not shown in the diagram) polishing the app for public use and adding things like reCAPTCHA to protect our site.

Final thoughts

It’s never been easier to turn ideas into reality using the modern web stack. It’s a fun time to be a tinkerer!

Note: I tried to get GPT-3 to write this blog post for me, but unfortunately it’s not sophisticated enough. Maybe GPT-4 will be able to do the job 😄.

Here’s one last link to InfiniteMonkeys. We hope it makes you chuckle 🤭. Please give us feedback! — Matt and Eric