Spec-Driven Development: How to Build Real Software with AI Coding Agents
Software Development in the Age of AI
Overview
When AI coding assistants became mature and accessible in early 2025, early adopters took to ‘Vibe Coding’ - guiding an AI coding agent through a series of prompts toward building a full-fledged application. This approach works well for quick prototypes. However, it breaks down when you build something larger with real scope: multiple components, interdependencies, testing requirements, and design decisions. The root cause is ‘Context Collapse’ - you are trying to define architecture, requirements, and implementation all at once, overflowing the context window.
How do we address this problem? Enter Spec-Driven Development (SDD).
SDD solves the context collapse problem by enforcing separation of concerns across phases. Each phase produces an artifact that constrains and guides the next: no spec without guiding principles, no plan without a spec, no implementation without a plan. The result is a process that turns vague requirements into production-ready code through disciplined, incremental refinement.
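The gating rule can be made concrete with a small sketch. The phase names and the functions `missing_prerequisites` and `can_start` are hypothetical, chosen only for illustration; no SDD framework is assumed to expose them.

```python
# Hypothetical artifact names, listed in the order each must be produced.
PHASES = ["constitution", "spec", "plan", "tasks", "implementation"]


def missing_prerequisites(phase: str, completed: set[str]) -> list[str]:
    """Return the upstream artifacts that must exist before `phase` can start."""
    index = PHASES.index(phase)  # raises ValueError for an unknown phase
    return [name for name in PHASES[:index] if name not in completed]


def can_start(phase: str, completed: set[str]) -> bool:
    """A phase may start only when every earlier artifact is in place."""
    return not missing_prerequisites(phase, completed)


# Only the constitution exists so far: a spec may begin, a plan may not.
done = {"constitution"}
print(can_start("spec", done))              # True
print(missing_prerequisites("plan", done))  # ['spec']
```

The point of the sketch is that the gate is mechanical: a phase either has its upstream artifacts or it does not.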
The Constitution: The Rules That Govern Your Specs
The constitution is the anchor point in SDD. Before you define WHAT to build or HOW to build it, you must clearly establish the core governing principles: the non-negotiable rules that every subsequent decision must respect. It is a meta-document that every spec in your organization must conform to.
A well-formed constitution addresses the following aspects:
Design Principles: DRY, single responsibility, interface segregation as they apply to a project
Code Quality Standards: naming conventions, module structure, linting rules
Testing Philosophy: what needs tests, what level of coverage, testing strategies
Performance Expectations: latency budgets, resource constraints
Security Posture: authentication and authorization patterns, data handling rules
Governance: review standards, deployment rules, rollback procedures
The constitution is a Decision Filter against which all design choices are validated during implementation. This matters especially with AI coding agents: the constitution becomes the system prompt that keeps every subsequent phase aligned with one consistent set of rules.
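As a minimal sketch of how a constitution might be turned into a system prompt, the example below stores rules by section and renders them as text. The `CONSTITUTION` contents and the `render_system_prompt` function are invented for illustration and are not taken from any of the frameworks discussed later.

```python
# Hypothetical constitution: section name -> list of non-negotiable rules.
CONSTITUTION = {
    "Code Quality Standards": ["Use snake_case for function names."],
    "Testing Philosophy": ["Every public function has a unit test."],
}


def render_system_prompt(constitution: dict[str, list[str]]) -> str:
    """Flatten the constitution into plain text an agent can be given up front."""
    lines = ["Follow these non-negotiable project rules in every phase:"]
    for section, rules in constitution.items():
        lines.append(f"{section}:")
        lines.extend(f"- {rule}" for rule in rules)
    return "\n".join(lines)


prompt = render_system_prompt(CONSTITUTION)
print(prompt)
```

Keeping the rules as structured data rather than free text makes it easier to reuse the same constitution across the spec, plan, and build phases.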
The Specify Phase
With non-negotiable principles in place, the next question is “What should the system do?” The Specify Phase captures functional requirements, user narratives, and success criteria in a Specification document, without delving into any implementation details.
The Specify Phase covers the following aspects:
User stories with clear acceptance criteria
Functional requirements (what the system must do)
Edge cases and error conditions
Data boundaries (what data flows where)
Non-functional requirements that affect design (throughput, availability)
A spec document should read like a contract between the product team and the engineering team. It should be detailed enough that a developer who didn’t write it can implement from it, but abstract enough that swapping a framework or database does not require rewriting the specification.
The Plan Phase
The Plan Phase bridges the gap between the WHAT and the HOW. It translates functional requirements into a technical implementation Plan document covering technology choices, data models, API contracts, and component interactions.
The Plan Phase covers the following aspects:
Technology Stack: frameworks, libraries, infrastructure, with rationale
Data Model: schemas, relationships, migration strategy
API Contracts: endpoints, request/response shapes, error formats
Component Architecture: module boundaries, dependency direction, interface contracts
Implementation Strategy: which parts go first, what parallelizes, what depends on what
The plan is a living document. It should be updated as implementation proceeds and reality deviates from the original plan. Deviations from the spec, however, require going back to the spec and amending it - not quietly absorbing them into the plan.
The Task Phase
The Task Phase decomposes the technical implementation Plan into an ordered, dependency-aware execution list of actionable Tasks that can be executed sequentially or in parallel. Tasks are the atomic units of work derived from the Plan. They are granular enough to be completed in a single sitting (typically two to four hours), independently reviewable, and unambiguous in their definition of done.
Each Task covers the following aspects:
Context: a reference to the spec and plan section this task implements
Objective: one sentence describing what this task accomplishes
Acceptance Criteria: a checklist of conditions that must be true for the task to be complete
Dependencies: any tasks that must be completed before this one can start
Parallelism Signal: whether this task can run alongside others or must be serial
The relationship between spec, plan, and tasks is hierarchical and traceable. Every task should point back to the plan section it implements. Every plan section should point back to the spec goal it addresses. This traceability is what makes spec-driven development auditable.
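One way to picture a task and its traceability is the sketch below. The `Task` fields mirror the aspects listed above; the identifiers (`T1`, `plan/api`, `spec/goal-1`) and the `untraceable` helper are hypothetical examples, not a format defined by any framework.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    plan_section: str               # Context: the plan section this task implements
    objective: str                  # One sentence describing the outcome
    acceptance_criteria: list[str]  # Conditions that define "done"
    depends_on: list[str] = field(default_factory=list)
    parallelizable: bool = False    # Parallelism signal


def untraceable(tasks: list[Task], plan_to_spec: dict[str, str]) -> list[str]:
    """Return IDs of tasks whose plan section does not map to any spec goal."""
    return [t.task_id for t in tasks if not plan_to_spec.get(t.plan_section)]


# Hypothetical mapping from plan sections to the spec goals they address.
plan_to_spec = {"plan/data-model": "spec/goal-1", "plan/api": "spec/goal-2"}

tasks = [
    Task("T1", "plan/data-model", "Create the users table.",
         ["Migration applies cleanly"]),
    Task("T2", "plan/api", "Add the create-user endpoint.",
         ["Returns a success response for valid input"], depends_on=["T1"]),
    Task("T3", "plan/caching", "Add a cache layer.",
         ["Cache hit rate is measured"], parallelizable=True),
]

print(untraceable(tasks, plan_to_spec))  # ['T3']
```

Here `T3` is flagged because its plan section has no corresponding spec goal, which is exactly the kind of gap an audit is meant to surface.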
The Build Phase
With the constitution, spec, plan, and tasks all in place, implementation becomes a disciplined translation exercise rather than an act of improvisation.
The Build Phase covers the following aspects:
Executing tasks in dependency order, one atomic unit at a time
Writing code that conforms to the constitution’s rules and naming conventions
Logging any deviations from the plan as they arise, for later review
Keeping each task’s acceptance criteria visible as the implementation target
When upstream artifacts are solid, building is not creative work - it is translating a well-defined task into well-structured code. The creative decisions were made upstream. The Build Phase is about fidelity to the plan.
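Executing tasks in dependency order amounts to a topological sort. The sketch below uses Python's standard `graphlib` module to group tasks into batches, where every task in a batch has all of its dependencies satisfied. The task IDs and the `execution_batches` function are hypothetical.

```python
from graphlib import TopologicalSorter


def execution_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into batches; tasks within a batch could run in parallel.

    `dependencies` maps each task ID to the set of task IDs it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable order
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical task graph: T2 and T3 both need T1; T4 needs T2 and T3.
deps = {"T1": set(), "T2": {"T1"}, "T3": {"T1"}, "T4": {"T2", "T3"}}
print(execution_batches(deps))  # [['T1'], ['T2', 'T3'], ['T4']]
```

A circular dependency between tasks surfaces as an error before any work starts, which is a sign the task list itself needs to be revised.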
The Review Phase
The Review Phase closes the loop between what was specified and what was built. A spec-driven review is not just a code review - it is a verification that the implementation satisfies the spec.
The review checklist covers the following aspects:
Does the implementation satisfy all goals stated in the spec?
Does it avoid doing anything excluded by non-goals?
Do all decision records reflect what was actually implemented?
Are all success criteria testable and passing?
Is the spec itself updated to reflect any approved deviations?
Is the plan marked complete?
If the implementation diverges from the spec in ways that were not approved, the divergence must be resolved - either by fixing the code or formally amending the spec. Leaving unapproved divergences in place trains the team to treat the spec as aspirational rather than authoritative.
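Part of this checklist can be checked mechanically. The sketch below compares the spec's success criteria against test results and reports criteria that are untested or failing. The criterion IDs and the `review_gaps` function are hypothetical, and it assumes each test result is keyed by the criterion it verifies.

```python
def review_gaps(
    success_criteria: list[str], test_results: dict[str, bool]
) -> dict[str, list[str]]:
    """Report spec success criteria that have no test or a failing test."""
    untested = [c for c in success_criteria if c not in test_results]
    failing = [c for c in success_criteria if test_results.get(c) is False]
    return {"untested": untested, "failing": failing}


# Hypothetical criteria from the spec and results from a test run.
criteria = ["SC-1", "SC-2", "SC-3"]
results = {"SC-1": True, "SC-2": False}

print(review_gaps(criteria, results))
# {'untested': ['SC-3'], 'failing': ['SC-2']}
```

An empty report on both lists does not replace human review, but a non-empty one is a clear signal that the implementation and the spec have not yet been reconciled.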
Frameworks That Make SDD Real
The phases described above are not just theory: several open-source frameworks operationalize them into concrete tooling. Three projects have emerged as the leading options for teams adopting spec-driven development.
BMad-Method
BMad-Method (https://github.com/bmadcode/bmad-method) is an agent orchestration framework built around the concept of specialized AI personas - each agent type owns a specific SDD phase and has a defined scope of authority. The framework ships with personas for the Analyst, Architect, Developer, and QA roles, along with a set of structured document templates that each persona produces and consumes. This persona-based model makes the handoff between phases explicit: an Architect agent cannot begin until the Analyst agent has produced a signed-off spec. BMad integrates with both Claude Code and popular IDE extensions, making it practical for teams that want the discipline of SDD without abandoning their existing workflows.
GitHub spec-kit
GitHub spec-kit (https://github.com/github/spec-kit) is a Python toolkit that structures AI-assisted development as a pipeline of artifacts rather than a conversation. It provides templates, agent instructions, and validation tooling that turn the abstract phases into concrete commands.
The specify CLI initializes a project with structured templates for each phase. You then run agent commands in sequence:
/speckit.constitution - Drafts the constitution with the governing rules for the codebase
/speckit.specify - Captures functional requirements and user stories into a specification
/speckit.plan - Produces technical architecture and an implementation plan
/speckit.tasks - Decomposes the plan into an ordered, dependency-aware task execution list
/speckit.implement - Implements the code for each task
OpenSpec
OpenSpec (https://github.com/openspec-ai/openspec) takes a document-first approach: every SDD artifact - constitution, spec, plan, and task list - is a versioned, schema-validated Markdown file stored alongside the code. OpenSpec defines a formal schema for each document type and ships a validator that CI can run to reject PRs whose implementation diverges from the approved spec. Where BMad enforces SDD through agent personas and spec-kit enforces it through CLI commands, OpenSpec enforces it through the repository itself - making compliance visible in every pull request and code review.
Choosing a Framework
All three frameworks enforce the same core SDD discipline: no plan without a spec, no implementation without a plan. The choice depends on team context:
BMad is a strong fit for teams using AI agents heavily and wanting persona-level role separation
spec-kit is the fastest path to getting started, with minimal setup and a straightforward CLI
OpenSpec is the right choice when auditability, schema validation, and CI enforcement are priorities
Conclusion
Vibe coding is a powerful starting point, but it does not scale. Spec-driven development gives AI coding agents the structure they need to build production-grade software - not by constraining creativity, but by channeling it into the right phase at the right time. When the constitution, spec, plan, and tasks are in place before the first line of code is written, the AI coding agent becomes a precise executor rather than an improviser.
If you are building anything beyond a weekend prototype, SDD may be a worthwhile investment.


