Ok, AI Can Write Pretty Good Fiction Now
Author: Justis Mills    Published: 2025-07-04 12:51:58
Three months ago, I wrote a post called AI Can’t Write Good Fiction. The subtitle was “Yet, at least”. Good thing for the subtitle, because now maybe it can.

Here’s a (very short) story written by Claude 4 Opus, at my minimal direction:

Sam's alarm didn't go off. She woke to sunlight and the sound of Kai making coffee in the kitchen. He'd left for his run already and come back. She could tell from the particular way he moved around the counter, post-endorphins quiet.

"You're late," he said when she appeared.

"I know." He handed her a mug. The coffee was exactly right. She checked her phone while drinking it. Two missed calls from work, three texts. Nothing that couldn't wait another twenty minutes.

"I should go," she said.

"Should you?" She looked at him. He was leaning against the sink, still in his running clothes, not checking his own phone.

"Probably not," she said. He nodded. She called in sick from the couch while he showered. They spent the morning at the farmers market, buying vegetables they didn't need. Kai talked to the tomato vendor about soil pH. Sam bought flowers because they were yellow. At lunch, her phone buzzed with work emails. She glanced at them, then put the phone in her bag.

"Good?" Kai asked.

"Fine." They walked home slowly. The flowers fit perfectly in the blue vase on their kitchen table.

Is this story good? I think it’s pretty good, which puts me in an awkward position. It has problems; if it were submitted to me as a literary review editor, I’d say:

“She could tell from the particular way he moved around the counter, post-endorphins quiet.” is pretty weak; he’s in running clothes, presumably he generally runs before coffee… it’s going for intimacy/her knowing him well but achieves the opposite in context. I’d cut it.

“buying vegetables they didn't need” doesn’t make any sense. Either nobody needs vegetables or everybody does; they’re healthy but not necessary to stay alive.

While the spare style mostly works, you could tighten up further. There’s no way, my wife points out, that “He'd left for his run already and come back” is the strongest way to form that sentence.

But previous AI-generated fiction reliably pushed me into “hater mode”, the state of mind occupied by YouTubers who catalogue thousands of flaws in blockbuster films. One turn of phrase would be stupid, then another, then another, and pretty soon the idea that anybody could think it was good made me angry.

Probably, the story in this post is still like that for some people. But it isn’t for me. There are few enough problems that I can notice nice stuff, such as:

The ending is good: an evocative lingering image rather than wrapping everything up with a bow

You get the feeling of a strong relationship and the dialogue feels pretty real; Kai is accurately diagnosing Sam’s burnout and helping her through it by example

“Fine” gets bonus points. Can be interpreted different ways, and expresses that the relationship helps Sam understand that minor annoyances are okay

Like everything LLM-y, if I kept generating 50 stories I’d get bored of the repetition; having only read a few I’ve noticed, for example, that LLMs love random bit characters singing in a courtyard below where the action is taking place, which gets stale fast. But it ain’t hyperaggressive, every-sentence-must-be-an-epic-revelation word salad. Three short months ago, that was the state of the art.

The Details

I prompted Claude 4 Opus like so:

I'm interested in your fiction capabilities. Please write a short story about a modern relationship. The main failure mode to avoid: at no level, sentence, paragraph, or structure, should you lay it on thick. Trust the reader, and be subtler than you think you can. Avoid cliches really aggressively, to counteract your default latent tendency to steer to the deepest basins in the corpus landscape. Thank you, and good luck.

It did just okay, so I coaxed it toward being even more spare:

Try to be almost zenlike in your spareness. Eschew splashy contrast. Assume your reader is enlightened and a genius. Make a happier story, too.

And, well, there you go. It did fine. Also, it’s not like it’s only good at the spare narrative style that I prefer. I tested Claude 4 Opus because Kaj Sotala shared two of its outputs that I thought were decent. These were a major step up from the previous “best AI short fiction” record in my estimation, also shown to me by Kaj, from a previous version of Claude.

If you’re into this stuff, I recommend you read all three of these, and judge the improvement for yourself. But to avoid inundating you with AI content, I’ll show you two snippets.

Here’s the snippet from the (worse) story a few months ago, before Claude 4 Opus was out:

The pigeons started their own newspapers, printed on leaves that fell upward instead of down. Anyone who caught one and could read their language (which looked like coffee stains but tasted like morse code) reported stories about pigeon divorce rates, weather forecasts for altitudes humans couldn't breathe at, and classified ads seeking slightly used dreams.

It’s… almost something? Going for magical realism but laying it on way too thick, and ending up sloppy (languages don’t have taste, and even if you accept that on poetic license it’s excessive after the already-cute upward falling leaves).

And here’s the snippet from one written by the new, state of the art Claude:

"You would have been six when he passed. A fever, I believe."

"The same fever that took his whole household staff." Her voice had found its footing now, each word placed with deliberate care.

"And his personal guard. And the archivists who worked that wing. Very specific in its targets, that fever." He set down his cup with a soft click against the saucer. Outside, someone was singing in the courtyard—one of the kitchen girls, voice bright with the careless joy of someone who'd never had to weigh the cost of a single life against a thousand.

Much better! The murderous official saying “I believe” as a postscript to a coverup he personally authorized. Drinking tea out of saucers. That same official lionizing his own burdens in his head, and imagining an ethical dilemma that’s different than what he’s being accused of, but superficially similar (he had many people killed, not just one). It’s not perfect, but it’s pretty good. It didn’t activate hater mode.

What Does This Mean?

I’m not sure. I’ll radiate my thoughts from the personal outward.

Personal

When I said AI couldn’t write good fiction three months ago, was I wrong? I don’t think so, but it’s unclear. Gwern argued back then that to really know if AI models could write good fiction, you’d have to give them lots of scaffolding and context and elicit them properly, generating many stories and picking out the best ones, for a fair apples-to-apples comparison with high-quality human efforts. I have yet to see a story written by the AI of that era that I actually liked, but the fact that only slightly more advanced AI can do it suggests that maybe it was possible all along. Or maybe I was right, and old models just didn’t have the juice (whereas new ones do).

Niche Communal

As of Claude 4 Opus, AI can write (extremely short) stories good enough that a decent literary review could plausibly accept them. Nothing I have generated with Opus is good enough to get into the very best reviews, but then again, I’ve only tinkered for an hour. And there are lots more things I’d try, if I wanted to generate the best AI fiction I could! As the lowest hanging fruit, using a base model via the API would do better than prompting the consumer-facing chat interface.

It wouldn’t shock me if Opus has the goods to compete at close to a top human level, given masterful prompting. I just don’t know. And if it can do well enough to compete with the best human flash fictions, a small number of prompters could thereby flood the zone if they wanted, all-but-guaranteeing humans were crowded out of niche literary magazines. Not the saddest possible AI future, nor likely right this minute (there’s almost no money in flash fiction, so I’m not sure who would bother), but it does give me a pang.

Meta

We’re in a weird place. In fiction (I posit), as in software, AI can do a pretty good job at narrowly scoped work. You can get 500 decent words, just like you can get a decent to-do list app.

It’s easy to imagine how you might take that core capability, and extrapolate it very far. Like, if you can write one good scene, and you can write a good outline, and you have various other modules to sanity check and retain consistency, can you thereby write a decent novel? Certainly, people imagine this in the programming case; there are whole reports on how long until AI can perform arbitrarily long software tasks.

In the pro column, yeah, sounds plausible. In the con column, reality has a surprising amount of detail. The dueling stories are both quite compelling: in the red corner, an amped-up nerd pointing out the rate of progress so far and the human inability to extrapolate the trajectories of simple curves. In the blue corner, a turtlenecked aesthete, scoffing that the recent pivot to post-training RL is already a sign of diminishing returns, and current offerings remain mediocre. Inside me are these two wolves, so I’m not sure.

Still, as a person who writes novels recreationally, I think writing a decent scene is a really important ingredient. Maybe in a year (or two?), I’ll be reviewing an LLM-generated novel. Maybe in five years, so will everyone else. Or maybe not.

I hear people talking about how fast AI is moving all the time. I don’t usually really feel it. Most models, for most of my purposes, have only felt a little bit better with each incremental release. With Claude 4 Opus, though, there’s been a sea change. I can ask for a literary style, and that style can require subtlety, and it basically gets it.

A strange feeling. When I was young, I determined that writing novels was my main purpose in life. Not getting them published, not getting anyone to read them (though I’m lucky to have both a mom and a wife who’ve read every single one), but simply to write them. I’ve got other purposes now too (happy Father’s Day), but writing novels is still extremely important to me. Machines being better at it than I am wouldn’t make me stop, but it would make the activity feel different. And feeling in my gut that it might happen, actually, that the pace of progress might be real and that in 2030 you’ll be able to get Infinite Jest but it’s Super Smash Bros instead of Tennis at the click of a button, well…

Huh. Actually, that sounds epic. Sign me up.
