From a SPEC to tickets, tests, and acceptance criteria: a real SDD workflow

How to turn a spec into executable work: tickets, acceptance criteria, tests, and prompts for AI agents.

Roger Bosch May 18, 2026

Writing a good spec is half the work. The other half is turning it into something that someone (person or agent) can execute. And that’s exactly where most teams fail.

In the previous articles on SDD and on how to write a spec, I talked about why specs matter and how to write them well. But the most practical step was still missing: how to go from an approved spec to concrete tickets, verifiable acceptance criteria, tests derived from those criteria, and prompts that an AI agent can use to implement each ticket.

This article is that bridge. It’s what I use when I have a signed-off spec and need the team (human and agents) to start working.

The problem: specs that don’t get executed

I’ve seen technically perfect specs that sit in a Google Doc and never become real work. The team reads the spec, nods during the planning meeting, and then everyone interprets on their own what needs to be done. Vague tickets like “Implement notification system” get created and the developer (or the agent) has to guess the scope.

The result: tickets that get reopened because “this was missing,” PRs that don’t pass review because “that’s not what was asked,” and a spec that nobody consults again after the first day.

A spec that isn’t broken down into executable work is a statement of intent, not a working tool.

Concrete example: spec for a notification system

I’ll use a real (simplified) example to show the full workflow. Imagine the spec describes an email notification system for an order platform.

Summarized spec

# SPEC: Email notification system

## Objective
Allow users to receive email notifications when their order status changes
(confirmed, shipped, delivered, cancelled).

## Scope
- Email only (no push or SMS in this phase).
- Order status changes only.
- Users can disable notifications from their profile.

## Functional requirements
1. When an order changes status, an email is sent to the user.
2. Email content depends on the new status (different template per status).
3. Emails are sent asynchronously (they don't block the order flow).
4. If sending fails, retry up to 3 times with exponential backoff.
5. Users can enable/disable email notifications.
6. A log is kept for each notification sent (status, timestamp, attempts).

## Non-functional requirements
- Maximum latency between status change and email delivery: 5 minutes.
- The system must support 1000 notifications/hour without degradation.
- Emails must comply with anti-spam policies (SPF, DKIM).

## Out of scope
- Push or SMS notifications.
- Frequency customization (daily digest, etc.).
- Notifications for events other than order status changes.

This spec is clear, has a defined scope, and specifies what’s out of bounds. Now comes the question: how do I turn it into work?

Step 1: Identify the functional blocks

Before creating tickets, I identify the main functional blocks. These aren’t tickets yet; they’re logical groupings of work:

Event infrastructure: publishing and consuming status change events.
Notification service: deciding what to notify, to whom, and managing preferences.
Email sending: integration with email provider, templates, retries.
User preferences: API and persistence to enable/disable.
Logging and observability: record of notifications sent.

Each block can be developed relatively independently, and that’s key for ticket decomposition.

Step 2: Break down into tickets

Each ticket must meet three criteria:

Self-contained: it can be implemented and tested without depending on other tickets in progress (though it can depend on already completed tickets).
Verifiable: it has concrete acceptance criteria that can be checked.
Right-sized: a developer (or an agent) can complete it in 1-3 days.

For the notification spec, the breakdown looks like this:

NOTIF-1: Publish order status change events

Description: When an order changes status, the order service publishes an OrderStatusChanged event on the internal event bus.

Acceptance criteria:

An OrderStatusChanged event is published every time Order.status changes.
The event contains: orderId, userId, previousStatus, newStatus, timestamp.
The event is published asynchronously (doesn’t block the order transaction).
If publishing fails, the error is logged but the status change is not rolled back.

Dependencies: none.
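As a rough illustration of criteria 1, 2, and 4, here is a minimal Kotlin sketch. The `EventBus` interface, the `Order` class, and the `logError` callback are hypothetical names introduced for this example; they are not part of the spec. The sketch assumes the real bus implementation enqueues the event and returns immediately, which is what would cover the asynchronous criterion.

```kotlin
import java.time.Instant

enum class OrderStatus { CONFIRMED, SHIPPED, DELIVERED, CANCELLED }

// Payload with the five fields listed in the acceptance criteria.
data class OrderStatusChanged(
    val orderId: Long,
    val userId: Long,
    val previousStatus: OrderStatus,
    val newStatus: OrderStatus,
    val timestamp: Instant,
)

// Hypothetical abstraction over the internal event bus.
fun interface EventBus {
    fun publish(event: OrderStatusChanged)
}

class Order(val id: Long, val userId: Long, status: OrderStatus) {
    var status: OrderStatus = status
        private set

    fun changeStatus(newStatus: OrderStatus, bus: EventBus, logError: (Throwable) -> Unit) {
        if (newStatus == status) return // no change, nothing to publish
        val event = OrderStatusChanged(id, userId, status, newStatus, Instant.now())
        status = newStatus
        try {
            bus.publish(event)
        } catch (e: Exception) {
            // A publishing failure is logged; the status change is kept.
            logError(e)
        }
    }
}
```

Passing the bus and the error logger as parameters keeps the sketch free of framework code and makes the “log but don’t roll back” criterion easy to test with a bus that throws.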

NOTIF-2: Event consumer and notification decision service

Description: A consumer listens for OrderStatusChanged events, checks the user’s preferences, and decides whether a notification should be sent.

Acceptance criteria:

The consumer processes OrderStatusChanged events.
Before notifying, it verifies that the user has email notifications enabled.
If the user has notifications disabled, the event is discarded and logged.
If the user has notifications enabled, a Notification record is created with status PENDING.

Dependencies: NOTIF-1.
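The decision logic in this ticket is small enough to sketch in a few lines. The version below is only an assumption about how it could look: `DecisionService`, `StatusChange`, and the three function parameters are hypothetical stand-ins for the real consumer, event, repositories, and logger.

```kotlin
enum class NotificationStatus { PENDING, SENT, FAILED, PERMANENTLY_FAILED }

// Simplified stand-in for the OrderStatusChanged event.
data class StatusChange(val orderId: Long, val userId: Long, val newStatus: String)

data class Notification(
    val orderId: Long,
    val userId: Long,
    val type: String,
    val status: NotificationStatus,
)

class DecisionService(
    private val emailEnabled: (userId: Long) -> Boolean, // preferences lookup
    private val save: (Notification) -> Unit,            // persistence
    private val log: (String) -> Unit,                   // logging
) {
    fun handle(event: StatusChange) {
        if (!emailEnabled(event.userId)) {
            // Disabled: discard the event and leave a trace in the log.
            log("Notification discarded: user ${event.userId} has email disabled")
            return
        }
        // Enabled: create the record that the sending service will pick up.
        save(
            Notification(
                orderId = event.orderId,
                userId = event.userId,
                type = "ORDER_${event.newStatus}",
                status = NotificationStatus.PENDING,
            )
        )
    }
}
```

Each branch of `handle` corresponds to one acceptance criterion, which is what makes the tests in Step 3 straightforward to derive.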

NOTIF-3: Email sending service with templates

Description: Service that takes a pending notification, selects the template corresponding to the event type, renders the email, and sends it through the email provider.

Acceptance criteria:

A template exists for each status: CONFIRMED, SHIPPED, DELIVERED, CANCELLED.
Each template includes: user name, order number, new status, link to details.
The email is sent to the user’s registered address.
If sending succeeds, the notification moves to SENT status.
If sending fails, the notification moves to FAILED status and the error is recorded.

Dependencies: NOTIF-2.
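The two status transitions in this ticket (SENT on success, FAILED on error) can be sketched as follows. This is a simplified illustration, not the actual implementation: `EmailSender`, the `recipient` and `lastError` fields, the subject line, and the `render` function are assumptions. In particular, `render` stands in for whatever template engine the project uses.

```kotlin
enum class NotificationStatus { PENDING, SENT, FAILED }

data class Notification(
    val orderId: Long,
    val recipient: String, // the user's registered address
    val type: String,
    val status: NotificationStatus = NotificationStatus.PENDING,
    val lastError: String? = null,
)

// Abstraction over the email provider, so the service is not coupled to one vendor.
fun interface EmailProvider {
    fun send(to: String, subject: String, body: String)
}

class EmailSender(
    private val provider: EmailProvider,
    private val render: (Notification) -> String, // hypothetical template rendering hook
) {
    // Returns a copy of the notification carrying the resulting status.
    fun send(notification: Notification): Notification =
        try {
            provider.send(
                notification.recipient,
                "Order ${notification.orderId} update",
                render(notification),
            )
            notification.copy(status = NotificationStatus.SENT)
        } catch (e: Exception) {
            // The failure is recorded on the notification instead of propagating.
            notification.copy(status = NotificationStatus.FAILED, lastError = e.message)
        }
}
```

Returning an updated copy instead of mutating the record keeps the sketch easy to assert on; a real service would also persist the result.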

NOTIF-4: Retry system for failed emails

Description: Notifications in FAILED status are automatically retried with exponential backoff, up to a maximum of 3 retries.

Acceptance criteria:

FAILED notifications are retried automatically.
Backoff: 1 minute, 5 minutes, 15 minutes.
Maximum 3 retries. After that, moves to PERMANENTLY_FAILED status.
Each retry is recorded with timestamp and result.

Dependencies: NOTIF-3.
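The backoff schedule from the criteria fits in a small lookup. A minimal sketch, assuming a hypothetical `nextRetryDelay` helper where a `null` result means the notification should move to PERMANENTLY_FAILED:

```kotlin
import java.time.Duration

// Delays taken from the acceptance criteria: 1, 5, and 15 minutes.
val RETRY_DELAYS: List<Duration> = listOf(
    Duration.ofMinutes(1),
    Duration.ofMinutes(5),
    Duration.ofMinutes(15),
)

// retriesSoFar = number of retries already performed for this notification.
// Returns the wait before the next retry, or null when no retries are left.
fun nextRetryDelay(retriesSoFar: Int): Duration? {
    require(retriesSoFar >= 0) { "retriesSoFar must not be negative" }
    return RETRY_DELAYS.getOrNull(retriesSoFar)
}
```

With this schedule, a notification that only succeeds on its second retry has already waited 6 minutes, so it may be worth stating in the ticket whether the spec’s 5-minute latency target applies to retried deliveries.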

NOTIF-5: User notification preferences API

Description: Endpoints for users to view and modify their email notification preferences.

Acceptance criteria:

GET /api/v1/users/{userId}/notification-preferences: returns current preferences.
PUT /api/v1/users/{userId}/notification-preferences: updates preferences.
By default, email notifications are enabled for new users.
Only the user themselves can modify their preferences (auth validation).
Changes take effect immediately.

Dependencies: none (can be developed in parallel with NOTIF-1).
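The behavior behind the two endpoints can be sketched without any web framework. The names below (`PreferencesService`, `ForbiddenException`, the in-memory map) are assumptions made for illustration; a real implementation would sit behind the HTTP layer and use the project’s persistence.

```kotlin
data class NotificationPreferences(val userId: Long, val emailEnabled: Boolean = true)

class ForbiddenException(message: String) : RuntimeException(message)

class PreferencesService {
    // In-memory store standing in for the real persistence layer.
    private val store = mutableMapOf<Long, NotificationPreferences>()

    // Backs the GET endpoint: users without a stored record get the default (enabled).
    fun get(userId: Long): NotificationPreferences =
        store[userId] ?: NotificationPreferences(userId)

    // Backs the PUT endpoint: only the authenticated user may change their own preferences.
    fun update(authenticatedUserId: Long, userId: Long, emailEnabled: Boolean): NotificationPreferences {
        if (authenticatedUserId != userId) {
            throw ForbiddenException("Users can only modify their own preferences")
        }
        return NotificationPreferences(userId, emailEnabled).also { store[userId] = it }
    }
}
```

Because `get` reads the store directly, an update is visible on the next read, which matches the “changes take effect immediately” criterion in this simplified setting.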

NOTIF-6: Notification logging and metrics

Description: Record of each notification sent and metrics for monitoring.

Acceptance criteria:

Each notification has a record with: notificationId, userId, orderId, type, status, attempts, timestamps.
Exposed metrics: notifications sent/minute, error rate, average latency between event and delivery.
Logs include enough information for debugging without exposing the user’s personal data.

Dependencies: NOTIF-3.
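To make the metrics criteria concrete, here is a hedged sketch of how error rate and average latency could be derived from notification records. The record shape follows the fields in the first criterion, but the exact field names (`eventAt`, `deliveredAt`), the `ERROR_STATUSES` set, and the log line format are assumptions.

```kotlin
import java.time.Duration
import java.time.Instant

data class NotificationRecord(
    val notificationId: Long,
    val userId: Long,
    val orderId: Long,
    val type: String,
    val status: String,
    val attempts: Int,
    val eventAt: Instant,      // when the status change happened
    val deliveredAt: Instant?, // null until the email is delivered
)

val ERROR_STATUSES = setOf("FAILED", "PERMANENTLY_FAILED")

// Share of records that ended in an error status.
fun errorRate(records: List<NotificationRecord>): Double =
    if (records.isEmpty()) 0.0
    else records.count { it.status in ERROR_STATUSES }.toDouble() / records.size

// Mean time between event and delivery, over delivered notifications only.
fun averageLatency(records: List<NotificationRecord>): Duration? {
    val latenciesMs = records.mapNotNull { r ->
        r.deliveredAt?.let { Duration.between(r.eventAt, it).toMillis() }
    }
    return if (latenciesMs.isEmpty()) null else Duration.ofMillis(latenciesMs.average().toLong())
}

// Log line built from identifiers only: no name or email address.
fun NotificationRecord.toLogLine(): String =
    "notification=$notificationId user=$userId order=$orderId type=$type status=$status attempts=$attempts"
```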

Step 3: Tests derived from acceptance criteria

Here’s the part where most teams cut corners. They say “yeah, we’ll write the tests later” and the tests end up being generic or insufficient. My approach: each acceptance criterion generates at least one test.

For NOTIF-2 (the decision service), the tests would be:

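import io.mockk.every
import io.mockk.mockk
import io.mockk.verify
import org.junit.jupiter.api.Test
import org.slf4j.Logger
import java.time.Instant

class NotificationDecisionServiceTest {

    // Mocks are kept as properties so each test can stub and verify them.
    // UserPreferencesRepository is an assumed type name, inferred from the parameter name.
    private val notificationRepository = mockk<NotificationRepository>()
    private val userPreferencesRepository = mockk<UserPreferencesRepository>()

    // Assumption: the service receives its logger through the constructor,
    // which is what allows the third test to verify the log call.
    private val logger = mockk<Logger>(relaxed = true)

    private val eventConsumer = NotificationDecisionService(
        notificationRepository = notificationRepository,
        userPreferencesRepository = userPreferencesRepository,
        eventPublisher = mockk(relaxed = true),
        logger = logger
    )

    // Test data builder with sensible defaults for the fields a test doesn't care about.
    private fun orderStatusChangedEvent(
        userId: Long,
        newStatus: OrderStatus,
        orderId: Long = 1L,
        previousStatus: OrderStatus = OrderStatus.CONFIRMED
    ) = OrderStatusChangedEvent(
        orderId = orderId,
        userId = userId,
        previousStatus = previousStatus,
        newStatus = newStatus,
        timestamp = Instant.now()
    )

    @Test
    fun `should create pending notification when user has email enabled`() {
        // Given
        val event = OrderStatusChangedEvent(
            orderId = 1L,
            userId = 42L,
            previousStatus = OrderStatus.CONFIRMED,
            newStatus = OrderStatus.SHIPPED,
            timestamp = Instant.now()
        )
        every { userPreferencesRepository.findByUserId(42L) } returns
            UserPreferences(userId = 42L, emailNotificationsEnabled = true)
        every { notificationRepository.save(any()) } answers { firstArg() }

        // When
        eventConsumer.handleOrderStatusChanged(event)

        // Then
        verify {
            notificationRepository.save(match { notification ->
                notification.userId == 42L &&
                notification.orderId == 1L &&
                notification.status == NotificationStatus.PENDING &&
                notification.type == "ORDER_SHIPPED"
            })
        }
    }

    @Test
    fun `should skip notification when user has email disabled`() {
        // Given
        val event = OrderStatusChangedEvent(
            orderId = 1L,
            userId = 42L,
            previousStatus = OrderStatus.CONFIRMED,
            newStatus = OrderStatus.SHIPPED,
            timestamp = Instant.now()
        )
        every { userPreferencesRepository.findByUserId(42L) } returns
            UserPreferences(userId = 42L, emailNotificationsEnabled = false)

        // When
        eventConsumer.handleOrderStatusChanged(event)

        // Then
        verify(exactly = 0) { notificationRepository.save(any()) }
    }

    @Test
    fun `should log discarded notification when user preferences disabled`() {
        // Given
        val event = orderStatusChangedEvent(userId = 42L, newStatus = OrderStatus.SHIPPED)
        every { userPreferencesRepository.findByUserId(42L) } returns
            UserPreferences(userId = 42L, emailNotificationsEnabled = false)

        // When
        eventConsumer.handleOrderStatusChanged(event)

        // Then (assumes the service logs one preformatted message string)
        verify { logger.info(match<String> { it.contains("discarded") && it.contains("42") }) }
    }
}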

Notice the direct correspondence:

Acceptance criterion	Test
“verifies that the user has notifications enabled”	`should create pending notification when user has email enabled`
“if the user has notifications disabled, it’s discarded”	`should skip notification when user has email disabled`
“the discard is logged”	`should log discarded notification when user preferences disabled`

Each criterion has its test. There’s no ambiguity about what gets verified or how.

Step 4: Prompts for AI agents

When I have tickets with acceptance criteria and defined tests, preparing a prompt for an agent is almost mechanical. The trick is giving it all the necessary information without noise.

Prompt structure I use:

## Context
Project: [name] - Spring Boot 3.x with Kotlin
Skills loaded: backend-conventions.md, testing-patterns.md

## Ticket: NOTIF-3 - Email sending service with templates

### Description
Implement the service that takes a notification in PENDING status, selects
the corresponding email template, renders the content, and sends it through
the configured email provider.

### Acceptance criteria
1. A template exists for each status: CONFIRMED, SHIPPED, DELIVERED, CANCELLED
2. Each template includes: user name, order number, status, link
3. The email is sent to the user's registered address
4. If sending succeeds: notification moves to SENT
5. If sending fails: notification moves to FAILED with error recorded

### Existing interfaces
- `NotificationRepository`: already implemented in NOTIF-2
- `EmailProvider`: interface to implement (abstraction over the provider)
- `Notification`: entity with fields orderId, userId, type, status, attempts

### Expected tests
- Unit test: correct rendering of each template
- Unit test: status update to SENT on success
- Unit test: status update to FAILED on error
- Integration test: full send flow against a mock provider (Testcontainers not applicable here)

### Constraints
- Don't use string concatenation for templates. Use Thymeleaf or similar.
- EmailProvider must be an interface (don't couple to a specific provider).
- User's personal data is not logged in plain text.

The key to the prompt isn’t being long. It’s being precise. The agent needs to know: what to build, what criteria to meet, what interfaces exist, what tests to write, and what constraints to respect. With that and the project’s skills, the result is usually quite good.
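Because the structure is fixed, the prompt can be assembled from ticket data rather than written by hand each time. The sketch below is one possible way to do that; the `Ticket` class and the `buildPrompt` function are hypothetical and only mirror the section headings of the prompt structure above.

```kotlin
data class Ticket(
    val id: String,
    val title: String,
    val description: String,
    val acceptanceCriteria: List<String>,
    val existingInterfaces: List<String>,
    val expectedTests: List<String>,
    val constraints: List<String>,
)

// Renders the prompt as markdown, one section per heading.
fun buildPrompt(context: List<String>, ticket: Ticket): String {
    val sections: List<Pair<String, List<String>>> = listOf(
        "## Context" to context,
        "## Ticket: ${ticket.id} - ${ticket.title}" to emptyList<String>(),
        "### Description" to listOf(ticket.description),
        // Criteria are numbered so review comments can refer to them.
        "### Acceptance criteria" to ticket.acceptanceCriteria.mapIndexed { i, c -> "${i + 1}. $c" },
        "### Existing interfaces" to ticket.existingInterfaces.map { "- $it" },
        "### Expected tests" to ticket.expectedTests.map { "- $it" },
        "### Constraints" to ticket.constraints.map { "- $it" },
    )
    return sections.joinToString("\n\n") { (heading, lines) ->
        (listOf(heading) + lines).joinToString("\n")
    }
}
```

Generating the prompt from the same data as the ticket keeps the two from drifting apart when a criterion changes.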

Common mistakes when decomposing specs

After doing this many times, these are the mistakes I’ve seen (and made) the most:

1. Tickets that are too large

“Implement the notification system” is not a ticket. It’s an epic. If a ticket can’t be completed in 1-3 days, it needs more decomposition. A large ticket generates enormous PRs, superficial code reviews, and a false sense of progress.

2. Tickets without verifiable acceptance criteria

“The system should work correctly” is not an acceptance criterion. “When an order changes to SHIPPED status, an email is sent with the order number and tracking link to the user’s registered email” is. The difference is that the second one can be turned into an automated test.

3. Ignoring dependencies between tickets

If NOTIF-3 needs NOTIF-2 to be complete to function, that needs to be explicit. Otherwise, someone starts NOTIF-3 without having the interface they need and wastes half a day figuring out what’s missing.
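Once dependencies are explicit, a valid working order can be computed instead of guessed. Here is a small sketch, assuming a hypothetical `executionOrder` helper that takes a map from ticket id to the ids it depends on and groups tickets into rounds:

```kotlin
// Returns the tickets in an order where every ticket comes after its dependencies.
// Tickets that become ready in the same round are sorted by id.
fun executionOrder(dependencies: Map<String, List<String>>): List<String> {
    val remaining = dependencies.mapValues { it.value.toMutableSet() }.toMutableMap()
    val order = mutableListOf<String>()
    while (remaining.isNotEmpty()) {
        // A ticket is ready once all of its dependencies have been completed.
        val ready = remaining.filterValues { it.isEmpty() }.keys.sorted()
        require(ready.isNotEmpty()) { "Cyclic or unknown dependency among: ${remaining.keys}" }
        order += ready
        ready.forEach { remaining.remove(it) }
        remaining.values.forEach { it.removeAll(ready.toSet()) }
    }
    return order
}

// Dependencies as declared in the tickets above.
val notifDependencies = mapOf(
    "NOTIF-1" to emptyList(),
    "NOTIF-2" to listOf("NOTIF-1"),
    "NOTIF-3" to listOf("NOTIF-2"),
    "NOTIF-4" to listOf("NOTIF-3"),
    "NOTIF-5" to emptyList(),
    "NOTIF-6" to listOf("NOTIF-3"),
)
```

For the tickets in this article, `executionOrder(notifDependencies)` yields NOTIF-1, NOTIF-5, NOTIF-2, NOTIF-3, NOTIF-4, NOTIF-6. A dependency on a ticket that doesn’t exist, or a cycle, fails fast with an error.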

4. Not separating infrastructure from business logic

Mixing “configure the email service” with “implement the notification decision logic” in the same ticket is a recipe for disaster. They’re different responsibilities, requiring different expertise, and probably done by different people (or agents).

5. Forgetting the observability ticket

There should always be a ticket for logs, metrics, and alerts. Always. If it’s not explicit, it doesn’t get done. And when something fails in production, you regret not having included it.

6. Ambiguous acceptance criteria that invite interpretation

“The email is sent quickly” is ambiguous. Quickly for whom, under what conditions, measured how? “The latency between the status change and email delivery is under 5 minutes at the 95th percentile” is verifiable. If you can’t write a test for the criterion, the criterion isn’t well written.
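A criterion phrased as a percentile translates directly into a check. The sketch below uses the nearest-rank method, which is one of several common percentile definitions; the `percentile` function and the sample data are illustrative assumptions.

```kotlin
import java.time.Duration

// Nearest-rank percentile: the smallest sample such that at least
// `percent`% of all samples are less than or equal to it.
fun percentile(samples: List<Duration>, percent: Int): Duration {
    require(samples.isNotEmpty()) { "at least one sample is required" }
    require(percent in 1..100) { "percent must be between 1 and 100" }
    val sorted = samples.sorted()
    // Integer ceiling of percent * n / 100 avoids floating-point rounding.
    val rank = (percent * sorted.size + 99) / 100
    return sorted[rank - 1]
}

// Hypothetical latency samples: 15 s, 30 s, ..., 300 s (20 values).
val latencySamples: List<Duration> = (1..20).map { Duration.ofSeconds(it * 15L) }

// The acceptance criterion expressed as a boolean check.
val meetsLatencyCriterion: Boolean = percentile(latencySamples, 95) < Duration.ofMinutes(5)
```

With these 20 samples the 95th percentile is the 19th value (285 seconds), so the criterion holds even though the slowest delivery took a full 5 minutes.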

The full workflow in one picture

SPEC (approved)
  │
  ├── Identify functional blocks
  │
  ├── Break down into tickets
  │     ├── Concrete description
  │     ├── Verifiable acceptance criteria
  │     └── Explicit dependencies
  │
  ├── Derive tests from criteria
  │     ├── One test per criterion (minimum)
  │     ├── Unit tests for logic
  │     └── Integration tests for flow
  │
  └── Prepare prompts for agents
        ├── Context + skills
        ├── Ticket + criteria
        ├── Existing interfaces
        ├── Expected tests
        └── Constraints

Each step feeds the next. The spec defines the requirements, tickets make them executable, criteria make them verifiable, tests make them automatable, and prompts make them delegable to an agent.

It’s not a bureaucratic process. It’s a process that reduces ambiguity at each step. And in a world where part of your team consists of AI agents that interpret instructions literally, reducing ambiguity is the difference between a well-implemented feature and a technical disaster.

What changed when I started doing this right

Before applying this workflow, sprints ended with reopened tickets, PRs that needed three rounds of review, and a constant feeling that “the spec said one thing but something else got implemented.”

Now the workflow is more predictable. Not perfect, but predictable. Tickets get closed on the first try more often. Agents generate code that passes the acceptance criteria because the criteria are written in a way that a literal system can verify. And tests exist before the code exists, not after.

It’s not magic. It’s structure. And structure works regardless of whether the one executing the ticket is a person or an agent.

Tags: #sdd #spec #tickets #acceptance criteria #tests #ai agents #planning
