
AI Coding PR Review Checklist

Do not review AI-generated code by asking whether it looks plausible. Review it by comparing the diff against the spec, criteria, allowed files, and evidence.


Last updated: May 25, 2026

ai-pr-review.md

Review order:
1. Scope
2. Criteria mapping
3. Evidence
4. Contracts
5. Drift

Decision:
- Approve
- Request changes
- Split PR

Review In This Order

Scope first

Check changed files before reading style or implementation details.

Criteria second

Map each accepted behavior to a named acceptance criterion.

Evidence third

Require tests, screenshots, logs, metrics, or manual checks for each criterion.

AI Review Needs A Different First Pass

AI-generated diffs often include useful code and quiet extras in the same patch. A normal style-first review may miss the extras because the useful part looks correct.

The first pass should be mechanical: files changed, contracts preserved, criteria satisfied, evidence present. Only then is it worth reviewing naming, structure, or implementation taste.

Reviewer Stop Signals

- The diff touches files outside the allowed write scope.
- A criterion has no matching evidence.
- The agent changed public contracts without approval.
- The PR bundles a refactor with the requested behavior.
- The agent's final summary does not mention known gaps or skipped checks.

Copy-Ready PR Review Checklist

Paste this into the PR description or reviewer comment before approving generated code.

ai-pr-review.md

# AI Coding PR Review

Spec:
Agent/tool:
Reviewer:

## 1. Scope Check
- Allowed files:
- Actual files changed:
- Out-of-scope changes:

## 2. Acceptance Mapping
- AC-1:
- AC-2:
- AC-3:

## 3. Evidence
- Tests added or updated:
- Test command and result:
- Manual check, screenshot, log, or metric:

## 4. Contract Check
- API shape preserved:
- DB schema preserved or migration reviewed:
- Events, permissions, and error codes preserved:

## 5. Decision
- Approve | Request changes | Split PR
- Follow-up owner:
- Notes:
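
A filled template is only useful if no field is left blank. As a rough illustration, the sketch below scans a review record and lists every label that still ends in a bare colon. `unfilled_fields` is a hypothetical helper, not part of the template, and the `refunds-retry` spec name in the sample is invented for the example.

```python
def unfilled_fields(review_markdown):
    """Return (section, label) pairs for template lines that end with a bare colon."""
    missing = []
    section = "Header"
    for line in review_markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Track the current heading so each gap can be reported in context.
            section = stripped.lstrip("#").strip()
        elif stripped.endswith(":"):
            # A label with nothing after the colon was never filled in.
            label = stripped[2:] if stripped.startswith("- ") else stripped
            missing.append((section, label[:-1]))
    return missing


# Hypothetical, partially filled review record.
sample = """# AI Coding PR Review

Spec: refunds-retry
Reviewer:

## 1. Scope Check
- Allowed files: services/refunds/*
- Actual files changed:
"""

for section, label in unfilled_fields(sample):
    print(f"{section}: '{label}' is empty")
```

The check is deliberately shallow: it proves a field was filled in, not that the value is true. It also does not notice a Decision line that still lists all three options.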

Filled Example

A strong review record makes it obvious whether the agent followed the original spec.

filled-example.md

## Scope Check
- Allowed files: services/refunds/*, tests/refunds/*
- Actual files changed: services/refunds/retry.ts, tests/refunds/retry.test.ts
- Out-of-scope changes: none

## Acceptance Mapping
- AC-1 retry once on timeout: retry.test.ts "retries once"
- AC-2 double timeout response: retry.test.ts "keeps retryable failure"
- AC-3 invalid signature: existing signature test still passes

## Evidence
- npm test -- refunds
- Manual: PR diff shows no provider API change

## Decision
- Approve with follow-up: add production metric alert in next spec.

Real scenario: AI-generated user search PR

A coding agent adds account search to an admin page. The diff looks helpful, but it also touches a shared query helper, changes default sorting, and removes the tenant filter from one code path.

First pass

Review the changed-file list before reading implementation details. The shared helper and tenant filter are scope questions, not style comments.

Evidence map

Require each accepted behavior to point to a criterion: empty query, tenant-scoped results, latency budget, and permission denial.

Decision

Approve the search behavior only after the tenant filter is restored and evidence shows that results are tenant-scoped. Move sorting cleanup into a follow-up spec instead of merging it silently.

How To Review A Generated PR Without Getting Pulled Into Style

The safest AI PR review is intentionally boring at first. Reviewers should verify that the diff stayed inside scope, preserved contracts, and produced evidence before they spend energy on naming, formatting, architecture taste, or whether the generated code looks clever.

Start with the changed-file list

Open the file list before the code body. Compare it with the spec's allowed write scope and mark every unexpected file as a question. This prevents a reviewer from becoming attached to a useful-looking implementation that already crossed a boundary.
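
The comparison described above can be automated before a human opens the diff. The following is a minimal sketch, assuming the allowed write scope is expressed as glob patterns like those in the filled example. `out_of_scope` is a hypothetical helper, and `lib/query/helper.ts` is an invented path standing in for an unexpected file. In practice the changed-file list could come from something like `git diff --name-only` against the base branch.

```python
from fnmatch import fnmatchcase


def out_of_scope(changed_files, allowed_globs):
    """Return changed files that match none of the allowed glob patterns."""
    return [
        path
        for path in changed_files
        if not any(fnmatchcase(path, pattern) for pattern in allowed_globs)
    ]


allowed = ["services/refunds/*", "tests/refunds/*"]
changed = [
    "services/refunds/retry.ts",
    "tests/refunds/retry.test.ts",
    "lib/query/helper.ts",  # hypothetical file outside the allowed scope
]

unexpected = out_of_scope(changed, allowed)
print(unexpected)  # ['lib/query/helper.ts']
```

Note that `*` in `fnmatchcase` also matches path separators, so `services/refunds/*` covers nested directories as well. Tighten the patterns if the spec means a single directory level.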

Ask for evidence, not confidence

AI final summaries often sound certain. Require concrete proof: command output, named test cases, screenshots for UI states, migration logs, metrics, or manual QA notes. If the evidence is missing, the correct review decision is "Request changes", not a longer discussion.
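
One way to make this rule mechanical is to keep an explicit map from each acceptance criterion to its evidence and flag any criterion whose list is empty. The sketch below assumes that shape. `criteria_without_evidence` is a hypothetical helper, and AC-3 is left empty on purpose to show a failing case.

```python
def criteria_without_evidence(evidence_by_criterion):
    """Return criterion IDs whose evidence list is empty or only whitespace."""
    return [
        criterion
        for criterion, evidence in evidence_by_criterion.items()
        if not any(item.strip() for item in evidence)
    ]


# Hypothetical review record: AC-3 is claimed in the summary but has no proof attached.
review = {
    "AC-1": ['retry.test.ts "retries once"'],
    "AC-2": ['retry.test.ts "keeps retryable failure"'],
    "AC-3": [],
}

missing = criteria_without_evidence(review)
decision = "Request changes" if missing else "Continue review"
print(missing, decision)  # ['AC-3'] Request changes
```

This only confirms that evidence was attached. The reviewer still has to judge whether the named test or log actually exercises the criterion.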

Split drift into a new spec

Sometimes the extra behavior is genuinely useful. Still split it out unless the current spec is updated and reviewed. Quietly merging extra behavior trains agents to treat review as permission to expand scope, which makes future PRs harder to judge.
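
Taken together, the three checks above map onto the three review decisions. The sketch below shows one possible mapping. The `FirstPass` structure and the precedence (blocking findings first, then drift) are assumptions made for illustration, not rules stated by this checklist.

```python
from dataclasses import dataclass, field


@dataclass
class FirstPass:
    """Findings from the mechanical first pass (hypothetical structure)."""
    out_of_scope_files: list = field(default_factory=list)
    criteria_without_evidence: list = field(default_factory=list)
    unapproved_contract_changes: list = field(default_factory=list)
    unmapped_behavior: list = field(default_factory=list)


def decide(result):
    """Map first-pass findings to a decision; the precedence here is an assumption."""
    blocking = (
        result.out_of_scope_files
        or result.criteria_without_evidence
        or result.unapproved_contract_changes
    )
    if blocking:
        return "Request changes"
    if result.unmapped_behavior:
        # Useful extras go into a follow-up spec instead of riding along.
        return "Split PR"
    return "Approve"


# The user search scenario: tenant scoping is unproven and sorting drifted.
search_pr = FirstPass(
    criteria_without_evidence=["tenant-scoped results"],
    unmapped_behavior=["default sorting changed"],
)
print(decide(search_pr))  # Request changes
```

Once the tenant evidence is supplied, the same record with only `unmapped_behavior` set yields "Split PR", which matches moving the sorting cleanup into a follow-up spec.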

Review Questions Before Adoption

The AI Coding PR Review Checklist is more than a prompt for an AI tool. It should help people decide whether the task is ready for implementation, who owns unresolved questions, and which evidence will remain with the pull request. Use these questions in team conventions, PR templates, or pre-implementation review.

Who owns the decision?

Before using the AI Coding PR Review Checklist, name the person or role allowed to approve scope changes. Open questions without an owner invite the agent to fill the gaps itself, and reviewers then discover product, data, or permission decisions only after the code exists.

What blocks implementation?

Separate questions that must be answered from questions that can move forward as accepted risk. Blocking items usually include public APIs, data migrations, permission boundaries, payment behavior, rollback paths, and user-facing copy. If these are unclear, do not ask an agent to generate production code yet.

What evidence stays with the PR?

The final pull request should link ai-pr-review.md, list changed files, name skipped checks, and map tests, screenshots, logs, or metrics back to acceptance criteria. Without that record, future readers have to reverse-engineer the decision from the diff.

What To Reject

Unmapped behavior

The diff adds behavior that is not tied to a criterion, non-goal exception, or approved follow-up.

Unproven tests

The PR says tests pass but does not show the command, relevant case, or CI result.

Quiet contract changes

A response field, schema, permission, or event payload changed without being listed in the spec.
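
A quiet contract change is easier to catch when the before and after shapes are compared directly instead of read from the diff. The following is a minimal sketch, assuming each contract can be flattened to a field-to-type mapping. `contract_changes` and the field names are hypothetical and only illustrate the comparison.

```python
def contract_changes(before, after):
    """Compare two flat field-to-type mappings and report what differs."""
    shared = set(before) & set(after)
    return {
        "removed": sorted(set(before) - set(after)),
        "added": sorted(set(after) - set(before)),
        "retyped": sorted(name for name in shared if before[name] != after[name]),
    }


# Hypothetical response shapes before and after the generated change.
before = {"id": "string", "status": "string", "retryable": "boolean"}
after = {"id": "string", "status": "number", "attempts": "number"}

changes = contract_changes(before, after)
print(changes)
# {'removed': ['retryable'], 'added': ['attempts'], 'retyped': ['status']}
```

Any non-empty entry that the spec does not list is a reason to reject. An added field can break strict consumers just as a removed one can.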

Related Resources

Pair this checklist with the spec and criteria pages so review has a source of truth.

AI Coding Review Template

Use a longer reusable review artifact for generated PRs.


AI Coding Acceptance Criteria

Write criteria that can be mapped directly to review evidence.


Claude Code Spec Template

Give the coding agent a scope boundary before the PR exists.


AI PR Review FAQ

Should reviewers trust AI final summaries?

No. Treat them as claims until commands, tests, screenshots, logs, or diff evidence support them.

What if the extra behavior is useful?

Split it into a follow-up spec or update the current spec before merge. Do not hide it in the same PR.

Should review start with code style?

No. Start with scope, criteria, evidence, and contracts. Style review comes after the change is bounded.

When a generated PR arrives, review the spec first, the evidence second, and the code third.
