Julián Deangelis published a piece this week that hit 194k+ views in days. The argument: AI coding agents don't fail because the model is weak. They fail because the instructions are ambiguous.
He called it Spec Driven Development. Four steps: specify what you want, plan how to build it technically, break it into tasks, implement one task at a time. Each step reduces ambiguity. By the time the agent starts writing code, it has everything it needs — what the feature does, what the edge cases are, what the tests should verify.
The piece is right. The problem is the workflow takes discipline most sessions don't have. Under deadline pressure, the spec step disappears and you're back to prompting directly.
So I built a skill that does the spec-writing for you.
The silent decisions problem
Julián's example stuck with me:
"Add a feature to manage items from the backoffice."
The agent reads the codebase, picks a pattern, writes the feature. At first it looks fine. Then you click "add item" again and it inserts the same item twice. All the assumptions you thought were obvious were never in the prompt.
Which backoffice — internal or seller-facing? Should the operation be idempotent? Admin-only or all users? Which storage layer? Which error handling strategy?
Each one is a silent decision. The agent guesses. Some guesses are right. Some are wrong. And you don't find out which until the feature is in production behaving in ways you didn't expect.
The spec is what makes those decisions visible before the agent guesses. But writing a spec from scratch, for every feature, before every session — that's the friction that kills the habit.
Generate first, flag assumptions inline
The standard approach to spec tools is Q&A: the tool asks clarifying questions before generating anything. What's the user role? What's the auth model? What should happen on error?
The problem with Q&A is you have to know what you don't know. If you knew exactly what questions to ask, you'd be close to having the spec already.
spec-writer takes a different approach. It generates the full spec immediately and marks every decision it made that you didn't specify:
Given an authenticated user requesting an export
When the export contains more than 1,000 rows [ASSUMPTION: async for large exports]
Then the export is generated asynchronously and the user is notified by email
At the end, every assumption is collected into a prioritized list:
## Assumptions to review
1. Async generation for exports over 1,000 rows — Impact: HIGH
Correct this if: your order volumes are low and synchronous is fine
2. Date range filter is required — Impact: MEDIUM
Correct this if: the first version should export all orders with no filtering
Correcting a draft is faster than answering questions cold. The [ASSUMPTION: ...] tags show you exactly where the agent would have guessed wrong — before it guesses. That's the value. Not the spec itself. The visible assumptions.
A real example
I ran it on something I'm actually building — CLI capture for Foundation, my knowledge commons project. Foundation currently captures Claude.ai browser conversations via a browser extension. Claude Code sessions live in ~/.claude/projects/ and disappear after 30 days. I needed to capture those too.
The prompt:
/spec-writer Add CLI capture for Claude Code sessions stored in ~/.claude/projects/
Here's a slice of what came back:
From the spec:
Given a developer with Claude Code sessions in ~/.claude/projects/
When they run the Foundation CLI capture command
Then a list of uncaptured sessions is displayed, sorted by most recent first
Given a session already in Foundation
When the developer runs capture again
Then that session does not appear in the review list
Given a malformed .jsonl file
When the capture command processes it
Then an error is logged for that file and processing continues on remaining sessions
From the plan:
CLI capture runs locally — it needs to read the local filesystem, which
Cloudflare Workers cannot do. [ASSUMPTION: CLI capture is a local Node.js
or Bun script that calls the Foundation API to insert sessions, rather
than a Worker itself]
That assumption stopped me. I hadn't thought through the architecture split — Foundation runs on Cloudflare Workers, which can't read local filesystems. The skill surfaced it before I'd written a line of code.
The assumptions summary:
1. CLI capture is a local script calling the Foundation API, not a Worker
Impact: HIGH
Correct this if: you want a purely serverless approach
2. Manual curation before capture, not automatic bulk import
Impact: HIGH
Correct this if: you want automatic background capture
3. Session ID from .jsonl filename is the deduplication key
Impact: MEDIUM
Correct this if: session IDs are stored differently in your schema
4. No sensitive data scrubbing in v1
Impact: MEDIUM
Correct this if: your sessions contain credentials or keys
Four assumptions, two of them high-impact. The architecture one I would have hit mid-implementation. The sensitive data one I wouldn't have thought about until someone complained.
That's the value. Not the spec itself. The visible assumptions.
How it fits into SDD
spec-writer gets you to Spec-First immediately, with no ceremony. One command, full output, correct the assumptions, hand to the agent.
Julián describes two levels beyond that. Spec-Anchored: the spec lives in the repo and evolves with the code. Spec-as-Source: the spec is the primary artifact, code is regenerated to match. If you want to move toward Spec-Anchored, save the output to specs/feature-name.md in your repo. The skill produces something worth keeping.
The methodology is Julián's. The skill is the friction remover.
Install
mkdir -p ~/.claude/skills
git clone https://github.com/dannwaneri/spec-writer.git ~/.claude/skills/spec-writer
Then:
/spec-writer [your feature description]
Works across Claude Code, Cursor, Gemini CLI, and any agent that supports the Agent Skills standard. The same SKILL.md file works everywhere.
The repo is at github.com/dannwaneri/spec-writer. The README has the full output format and a worked example.
The agents are getting better at implementing. The bottleneck was always specification — knowing what to build precisely enough that the agent doesn't have to guess. spec-writer doesn't remove that requirement. It makes it faster to satisfy.
The spec isn't the output. The assumptions are.
Built on the Spec Driven Development methodology — operationalized by tools like GitHub Spec Kit and OpenSpec. Julián Deangelis's writing on SDD at MercadoLibre was the direct inspiration.
Other skills: voice-humanizer — checks your writing against your own voice, not generic AI patterns.
Top comments (18)
Sounds useful !
Try it on your next feature and let me know what the assumptions surface.
Thanks man - bookmarked this and archived it in my "AI - All The Stuff I Should Really Check Out" folder - will let you know when I get around to it!
That folder name is doing a lot of work. The assumptions list is the part worth seeing in practice. it's not obvious until you watch the skill surface a decision you would have left implicit. When you get around to it, run it on something you're actually building, not a toy example. 👊👊
Gotcha - toy examples tend to be artificial, or simplistic ...
Exactly.
The assumptions that matter are the ones you didn't know you were making and those only show up when the domain is real. A toy example has no legacy constraints, no auth model you didn't pick, no business logic the agent has to guess at. The skill earns its keep on the second or third feature in a codebase you've been living in for months.
If AI agents can even perform spec-driven development as well, I wonder how the role of developers will shift...
The shift is already happening - away from writing code toward writing the contract that defines what correct looks like. The spec is the thing agents can't generate without you because it requires knowing what your system is supposed to do, what your business constraints are, what failed last time. The agent can implement from a good spec. It can't produce the spec without the understanding that comes from having built the system and watched it fail.
The role isn't disappearing. It's moving upstream.
Just used this as my first skill to add a new 'birthday' feature to my telegrambot. Very usefull. I'm coding on my phone in a termux-ubuntu environment with claude code and codex, so its very easy to miss the unknown assumptions and end up in a long conversation to fix the problem. This fixes the problem up front :)
Even with a small feature like this it helps, for example it shows the commands it wanted me to type, but i want more of a free format - Fixed. Thanks for this!
Termux on a phone is exactly the environment where the assumption problem bites hardest . The iteration cost is high enough that one wrong assumption can cost you an hour. Glad it caught the commands-vs-free-format issue before you were mid-conversation.
That free format flag is good feedback on the skill itself. Defaulting to explicit commands made sense for the use case I built it around, but it's an assumption the skill makes without asking which is ironic given what the skill is supposed to do. Worth adding a prompt at the start that asks about output format preference before generating the spec.
What's the birthday feature doing?
It gives updates in the morning with a birthday reminder. Can create overview, add or remove birthdays. And sync with calender. Plan to add a personalized card feature.
Calendar sync on a phone in Termux is genuinely impressive and a personalized card feature on top of that is the kind of thing that makes a utility actually feel like it was built for someone.
This is a really interesting shift from “prompting” to actually encoding thinking into reusable skills. Instead of repeating instructions every time, you’re building a system that compounds knowledge—exactly what modern AI workflows need. It also reinforces how critical clear specs are, since AI performs best when the intent is precise and structured
"Encoding thinking" is the right frame and it's why the output format matters as much as the prompt. A spec that marks its assumptions explicitly with [ASSUMPTION: ...] forces the thinking to be visible, not just the conclusion. The AI didn't decide those were assumptions. You did. That's the difference between a reusable artifact and a one-time answer.
The key insight here is brilliant: Correcting a draft is faster than answering questions cold. This is exactly why Q&A-first tools fail they assume you know what you don't know. By generating the spec first and flagging assumptions inline, you're making the unknown unknowns visible. That's a fundamentally better UX for spec writing. The architecture assumption surfacing in your example proves the point perfectly.
"Unknown unknowns" is exactly the framing. Q&A assumes you can enumerate the gaps before you've seen the spec which is precisely when you can't. The generate-first approach inverts that: produce the draft, let the assumptions surface, then correct only what's wrong.
The architecture assumption in the Foundation example was the one that would have hit hardest mid-implementation. Cloudflare Workers can't read local filesystems - obvious in retrospect, invisible before the spec named it. That's the category that costs the most: not bugs you can see in the diff, but constraints you didn't know applied until the agent was already 3 hours into the wrong approach...
The Q&A tools that fail are optimizing for first-draft cleanliness. spec-writer optimizes for assumption visibility. Those are different goals and only one of them scales with complexity...
Great project, congrat
Some comments may only be visible to logged-in visitors. Sign in to view all comments.