AI Skills as living documentation for a development team

How to use AI skills to capture conventions, architecture, and internal patterns of a team. Not magic, but operational documentation.

Every team I know has conventions that aren’t written down anywhere. How endpoints are named. What structure a new service follows. Where integration tests go. What the pattern is for handling errors. Which libraries are approved and which aren’t. Why Kotlin was chosen over Java for that module.

All that knowledge lives in the heads of three or four people on the team. When someone new joins, they learn it through code reviews, Slack questions, and mistakes that get corrected in PRs. When someone leaves, part of that knowledge disappears.

AI skills aren’t the magic solution to this problem. But they are a surprisingly good tool for capturing it in a way that actually gets used. Not as a Confluence document that nobody reads, but as operational instructions that an AI agent applies every time it works with your code.


## What is a skill (in the context we care about)

In the article on MCP vs Skills I already explained the difference between the two concepts, so I won’t repeat it here. What interests me is one specific use case: skills as living documentation of a development team’s practices.

A skill, in this context, is a structured document that tells an AI agent how your team works. It’s not a generic manual. It’s the codification of your specific decisions, conventions, and patterns.

The difference between a skill and traditional documentation is that the skill gets executed. A person doesn’t read it and interpret it. An agent reads it and applies it directly when generating code, reviewing PRs, or creating tests.

That radically changes what’s worth documenting and how to do it.


## Real example: backend conventions skill

I’m going to show a concrete example of a skill I use on a project with Kotlin and Spring Boot. This isn’t a made-up example; it’s a simplified version of what I have in a real repository.

# Backend Conventions - [Project Name]

## Stack and versions
- Kotlin 2.x, Spring Boot 3.x, Gradle (Kotlin DSL)
- Database: PostgreSQL 16
- Tests: JUnit 5 + MockK + Testcontainers

## Package structure
We follow feature-based structure, not layer-based:

    src/main/kotlin/com/example/project/
    ├── order/
    │   ├── OrderController.kt
    │   ├── OrderService.kt
    │   ├── OrderRepository.kt
    │   ├── Order.kt              # Entity
    │   ├── OrderDto.kt           # Request/response DTOs
    │   └── OrderException.kt     # Feature-specific exceptions
    ├── product/
    │   └── …
    └── shared/
        ├── config/
        ├── security/
        └── exception/


## Code conventions

### Naming
- Controllers: `{Feature}Controller`
- Services: `{Feature}Service` as a concrete class; extract a `{Feature}Service` interface plus `{Feature}ServiceImpl` only if there's more than one implementation
- DTOs: `{Feature}Request`, `{Feature}Response`
- Exceptions: `{Feature}{Type}Exception` (e.g.: `OrderNotFoundException`)

### REST endpoints
- Always versioned: `/api/v1/{resource}`
- Plural: `/api/v1/orders`, not `/api/v1/order`
- Filters as query params, not path params
- Paginated responses use Spring Data's `Page<T>`

### Error handling
- We use a centralized `@RestControllerAdvice`
- Business exceptions extend `BusinessException`
- Error format is always:
```json
{
  "code": "ORDER_NOT_FOUND",
  "message": "Descriptive text",
  "timestamp": "2026-05-18T10:00:00Z"
}
```

## Tests

- Unit tests: MockK for mocks, `Test` suffix
- Integration tests: Testcontainers, `IntegrationTest` suffix
- Each service has at least: happy path, input validation, main error case
- We do NOT mock repositories in integration tests

## Anti-patterns (don't do this)

- Don't use `@Autowired` field injection. Always constructor injection.
- Don't create generic DTOs shared between features. Each feature has its own.
- Don't put business logic in controllers.
- Don't use Java's `Optional<T>`. Use Kotlin nullable types.
- Don't write queries with string concatenation. Always named parameters.

## Example of a correct service

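import org.springframework.context.ApplicationEventPublisher
import org.springframework.data.repository.findByIdOrNull
import org.springframework.stereotype.Service
import org.springframework.transaction.annotation.Transactional

@Service
class OrderService(
    // Dependencies arrive through the constructor, never through field injection
    private val orderRepository: OrderRepository,
    private val eventPublisher: ApplicationEventPublisher
) {
    // Nullable lookup plus a feature-specific exception, instead of Optional<T>
    fun getOrder(orderId: Long): OrderResponse {
        val order = orderRepository.findByIdOrNull(orderId)
            ?: throw OrderNotFoundException(orderId)
        return order.toResponse()
    }

    // Persist first, then publish the event with the saved entity's id
    @Transactional
    fun createOrder(request: CreateOrderRequest): OrderResponse {
        val order = request.toEntity()
        val saved = orderRepository.save(order)
        eventPublisher.publishEvent(OrderCreatedEvent(saved.id))
        return saved.toResponse()
    }
}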

Loaded as a skill in an AI agent, this document steers the generated code toward the team's conventions from the start. There's far less to correct in code review, because the agent already knows that you use constructor injection, that endpoints are plural and versioned, and that integration tests use Testcontainers.
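
To make the error-handling convention from the skill concrete, here's a minimal sketch in plain Kotlin, without Spring on the classpath. The skill only says that business exceptions extend `BusinessException` and that the error body has `code`, `message`, and `timestamp`. The constructor shape, the `ErrorResponse` class, and the `toErrorResponse` function are my assumptions for illustration.

```kotlin
import java.time.Instant

// Hypothetical base class: the skill only states that such a base exists.
open class BusinessException(val code: String, message: String) : RuntimeException(message)

// Follows the {Feature}{Type}Exception naming convention.
class OrderNotFoundException(orderId: Long) :
    BusinessException("ORDER_NOT_FOUND", "Order $orderId was not found")

// Mirrors the three fields of the error format defined in the skill.
data class ErrorResponse(val code: String, val message: String, val timestamp: Instant)

// In a Spring project this mapping would live inside the centralized
// @RestControllerAdvice; here it is a plain function so it runs standalone.
fun toErrorResponse(ex: BusinessException, now: Instant = Instant.now()): ErrorResponse =
    ErrorResponse(code = ex.code, message = ex.message ?: "Unexpected error", timestamp = now)

fun main() {
    val error = toErrorResponse(OrderNotFoundException(42), Instant.parse("2026-05-18T10:00:00Z"))
    println(error)
}
```

An example like this is worth keeping short: the agent needs the shape of the contract, not the whole handler.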

---

## What to put in a skill

Not everything belongs in a skill. The content should meet three criteria:

**1. It's stable.** It doesn't change every week. Naming conventions, package structure, error handling patterns... those are stable. The exact version of a dependency you're evaluating isn't.

**2. It's operational.** Someone (or an agent) can apply it directly when writing code. "We use feature-based structure" is operational. "We value clean architecture" is not (it's too abstract).

**3. It's specific to your team.** If it's a universal Kotlin or Spring Boot practice, it doesn't need to be there. The agent already knows that. What the agent doesn't know is that you use a centralized `@RestControllerAdvice` with a specific error format.

Specifically, what I usually include:

### Code conventions
- Class, method, and package naming.
- Project structure.
- Error handling patterns.
- How DTOs are defined.
- Log format.

### Architectural patterns
- How modules communicate with each other.
- What pattern is used for events (if you use an event-driven architecture).
- How transactionality is managed.
- Separation of responsibilities between layers.

### Key technical decisions
- Why one technology was chosen over another.
- Security constraints (which libraries are banned, which endpoints need auth).
- API versioning policies.

### Concrete examples
- An example of a well-written service.
- An example of a well-written test.
- An example of a database migration.

### Explicit anti-patterns
- Things that shouldn't be done, with the reason why.
- Mistakes made in the past that shouldn't be repeated.

---

## What NOT to put in a skill

Knowing what to leave out is just as important as knowing what to include. I've seen skills that try to be a project encyclopedia and end up being useless because the agent gets lost in all the noise.

**Ephemeral information.** The current state of a sprint, who's on vacation, which tickets are in progress. That changes constantly and pollutes the agent's context.

**Sensitive data.** Credentials, API keys, internal system URLs, real customer names. A skill can end up in the context of an external LLM. Never put anything in a skill that you wouldn't put in a public repo.

**Third-party implementation details.** You don't need to explain to the agent how Spring Security works internally. You do need to explain how you've configured it.

**Decisions that are still in flux.** If a decision isn't consolidated yet, don't put it in the skill. Wait until it stabilizes. A skill with contradictory or outdated information is worse than having no skill at all.

**Personal opinions without team consensus.** "I prefer using coroutines for everything" is not a team convention. If the team has decided to use coroutines in a specific context, that is worth documenting.

> The rule is simple: if you wouldn't defend it in a code review as a team convention, it doesn't go in the skill.

---

## How to version it: alongside the code, in the repo

Skills should live in the same repository as the code. Not in a separate wiki. Not in a Google Doc. Not in a pinned Slack message.

The reason is practical: if the skill is in the repo, it's versioned with git, reviewed in PRs, and evolves alongside the code it describes. If it's somewhere else, it goes out of sync within two weeks.

Structure I use:

    project/
    ├── .ai/
    │   ├── skills/
    │   │   ├── backend-conventions.md
    │   │   ├── testing-patterns.md
    │   │   ├── api-design.md
    │   │   └── deployment-checklist.md
    │   └── context/
    │       ├── architecture-overview.md
    │       └── tech-decisions.md
    ├── src/
    ├── build.gradle.kts
    └── …


The `.ai/skills/` folder contains the skills themselves. The `.ai/context/` folder contains broader context documents (architecture overview, ADR-style technical decisions) that complement the skills but aren't operational instructions.

Benefits of this approach:

- **Versioning.** Every change to a skill is recorded in git. You can see who changed it, when, and why.
- **Code review.** A skill change goes through the same review process as a code change. This prevents one person from adding their personal preferences without consensus.
- **Synchronization.** If you change a code convention, you can update the skill in the same PR, so the two don't drift apart.
- **Discoverability.** Any new person who clones the repo sees the skills immediately. They don't have to search across three different tools.
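
How the skills get from the repo into an agent depends on the tool you use. As a rough sketch of the idea, assuming the `.ai/skills/` layout above and nothing else, this is what collecting them into a single context string could look like. The `loadSkills` function and the comment marker that separates skills are hypothetical, not part of any agent's API.

```kotlin
import java.io.File

// Reads every Markdown file directly under the given directory and joins them
// into one context string. The "<!-- skill: ... -->" marker is an assumption
// made for readability; no agent requires this format.
fun loadSkills(skillsDir: File): String =
    skillsDir.listFiles { file -> file.isFile && file.extension == "md" }
        .orEmpty() // listFiles returns null when the directory does not exist
        .sortedBy { it.name }
        .joinToString(separator = "\n\n") { file ->
            "<!-- skill: ${file.name} -->\n${file.readText().trim()}"
        }

fun main() {
    val context = loadSkills(File(".ai/skills"))
    println(if (context.isEmpty()) "No skills found" else context)
}
```

Sorting by file name keeps the resulting context stable between runs, which makes differences in agent output easier to attribute to changes in the skills themselves.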

---

## How to avoid garbage documentation

The biggest risk with skills is that they become yet another repository of dead documentation. To prevent this, I apply three practices:

### 1. Mandatory review

Any change to a skill goes through a PR with at least one team reviewer. It doesn't get merged without consensus. If someone wants to add a new convention, they have to justify it and the team has to accept it.

This seems obvious but it prevents the most common problem: one person writes a skill with their personal preferences and the team ignores it because they don't feel ownership of it.

### 2. Testing skills

Sounds weird, but it makes sense. Periodically (I do it every two or three sprints) I run a simple test: I give the skill to an AI agent and ask it to generate a new component. Then I review whether the result follows the team's conventions.

If the agent generates code that doesn't comply with your conventions, the skill isn't well written. It's a functional test of the documentation.

```text
# Example prompt for testing a skill
"Using the project conventions, generate a new service for managing
notifications. Include the controller, the service, the DTOs, the unit
tests, and an integration test."
```

I review the output and verify:

- Correct package structure.
- Correct naming.
- Error handling following the pattern.
- Tests in the expected format.
- No anti-patterns.

If it fails on something, I update the skill to be clearer on that point.
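
I do this review by hand, but a few of the anti-pattern checks are mechanical enough to script. The sketch below is one way to do it, assuming the generated files are available as name-to-source pairs. The rule list, the `Finding` type, and `checkGeneratedCode` are hypothetical, and the regexes are rough heuristics rather than a parser, so expect false positives.

```kotlin
// A finding ties a violated rule to the file where it was detected.
data class Finding(val file: String, val rule: String)

// Each rule pairs a description with a regex that signals a likely violation.
// The rules mirror conventions from the example skill; the patterns are assumptions.
val antiPatternRules: List<Pair<String, Regex>> = listOf(
    "Field injection with @Autowired" to
        Regex("""@Autowired\s+(private\s+)?(lateinit\s+)?var"""),
    "Java Optional instead of nullable types" to
        Regex("""\bOptional<"""),
    "Class-level mapping not under /api/v{n}/" to
        Regex("""@RequestMapping\("(?!/api/v\d+/)"""),
)

// Returns one finding per (file, rule) pair whose pattern matches the source.
fun checkGeneratedCode(files: Map<String, String>): List<Finding> =
    files.flatMap { (name, source) ->
        antiPatternRules
            .filter { (_, pattern) -> pattern.containsMatchIn(source) }
            .map { (rule, _) -> Finding(name, rule) }
    }

fun main() {
    val generated = mapOf(
        "NotificationService.kt" to """
            @Service
            class NotificationService {
                @Autowired
                private lateinit var repository: NotificationRepository
            }
        """.trimIndent(),
        "NotificationController.kt" to """
            @RestController
            @RequestMapping("/api/v1/notifications")
            class NotificationController(private val service: NotificationService)
        """.trimIndent(),
    )
    checkGeneratedCode(generated).forEach { println("${it.file}: ${it.rule}") }
}
```

A script like this only catches what a pattern can express. Package structure, naming, and whether the tests make sense still need a human reading the output.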

### 3. Review date

Each skill has a last-reviewed date in the document itself:

# Backend Conventions

> Last reviewed: 2026-05-01 | Next review: 2026-08-01
> Owner: @roger

If the review date passes without anyone reviewing it, that’s a signal that the skill may be outdated. It’s not automation, it’s team discipline.
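
The discipline is the point, but if the team ever wants a reminder, the header is easy to parse. A minimal sketch, assuming the exact `Next review: YYYY-MM-DD` wording from the header above; the function names are hypothetical and nothing here is required for the practice to work.

```kotlin
import java.time.LocalDate

// Matches the "Next review: YYYY-MM-DD" part of the skill header.
private val nextReviewPattern = Regex("""Next review:\s*(\d{4}-\d{2}-\d{2})""")

// Returns the next review date declared in a skill, or null if the header is missing.
fun nextReviewDate(skillText: String): LocalDate? =
    nextReviewPattern.find(skillText)?.groupValues?.get(1)?.let { LocalDate.parse(it) }

// A skill is overdue when its declared next review date is strictly before today.
// Skills without a header are reported as not overdue.
fun isOverdue(skillText: String, today: LocalDate = LocalDate.now()): Boolean =
    nextReviewDate(skillText)?.isBefore(today) ?: false

fun main() {
    val header = """
        # Backend Conventions

        > Last reviewed: 2026-05-01 | Next review: 2026-08-01
        > Owner: @roger
    """.trimIndent()
    println("Overdue: ${isOverdue(header)}")
}
```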


## Skills for onboarding

One of the most powerful uses of skills is onboarding new team members. Instead of a week of meetings explaining “how we do things here,” the new person can:

  1. Read the skills in the repo (they’re short, concrete documents).
  2. Use an AI agent with those skills loaded to make their first contributions.
  3. Learn the team’s patterns while working, because the agent generates code that already follows the conventions.

It doesn’t replace human mentoring, but it complements it in a way that wasn’t possible before. The new person doesn’t have to guess conventions or wait to be corrected in the PR. The agent already gives them the right format from the start.

I’ve seen teams where the effective onboarding time (until the person contributes code that doesn’t require major style corrections) went from 3-4 weeks to under 1 week using well-written skills.


## Relationship with specs and SDD

Skills fit naturally into a Spec-Driven Development workflow. The relationship is this:

- The spec defines what will be built and with what acceptance criteria.
- The skill defines how it’s built on this team (conventions, patterns, constraints).
- The agent uses both to generate code that meets the requirements and follows the standards.

Without the spec, the agent doesn’t know what to build. Without the skill, the agent builds something generic that doesn’t fit your project. You need both.

A typical prompt that combines both:

Context: Load the skills from backend-conventions.md and testing-patterns.md

Task: Implement ticket ORDER-123 following the attached spec.
Requirements:
- Endpoint POST /api/v1/orders/bulk for bulk order creation
- Stock validation before confirming
- OrdersBulkCreated event on completion
- Unit and integration tests

Spec: [spec content for the ticket]

The result is code that meets the specified functionality AND follows the team’s conventions. That’s what makes skills more than a decorative document: they’re a functional piece of the workflow.
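
If you write this kind of prompt often, the layout is regular enough to generate. Here's a small sketch that renders the same Context / Task / Requirements / Spec structure from data. The `TaskPrompt` type and its field names are hypothetical; they're just one way to keep the skill references and the spec separate until the last moment.

```kotlin
// Inputs for one task. Field names are illustrative, not a fixed format.
data class TaskPrompt(
    val skillFiles: List<String>,
    val task: String,
    val requirements: List<String>,
    val spec: String,
)

// Renders the Context / Task / Requirements / Spec layout used in the prompt above.
fun TaskPrompt.render(): String = buildString {
    appendLine("Context: Load the skills from ${skillFiles.joinToString(" and ")}")
    appendLine()
    appendLine("Task: $task")
    appendLine("Requirements:")
    requirements.forEach { appendLine("- $it") }
    appendLine()
    append("Spec: $spec")
}

fun main() {
    val prompt = TaskPrompt(
        skillFiles = listOf("backend-conventions.md", "testing-patterns.md"),
        task = "Implement ticket ORDER-123 following the attached spec.",
        requirements = listOf(
            "Endpoint POST /api/v1/orders/bulk for bulk order creation",
            "Stock validation before confirming",
            "OrdersBulkCreated event on completion",
            "Unit and integration tests",
        ),
        spec = "[spec content for the ticket]",
    )
    println(prompt.render())
}
```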


## Start small

If you don’t have skills and want to start, don’t try to document everything on day one. Start with a single skill: the most basic code conventions of your project. The minimum you’d tell someone new on their first day.

A well-written 40-50 line skill is worth more than a 500-line document that nobody maintains. Over time, you’ll add more skills as you identify patterns the team repeats and that the agent needs to know.

What matters isn’t having the perfect skill. It’s having a skill that gets used, gets reviewed, and evolves with the project. That alone is more living documentation than 90% of teams have.

OshyTech

Copyright 2026 OshyTech. All Rights Reserved