
2026-04-23 / 7 MIN READ

Agent teams for code review: the parallel-reviewer pattern

A pattern library for running parallel agent code review. Three reviewers, one coordinator, and the file-ownership rule that makes it reliable.

The first time I ran three agents in parallel on the same pull request, one of them returned a note about missing tests, one returned a note about a hardcoded secret, and one returned a paragraph about naming conventions. Three reviewers, three different angles, three findings I would have taken 30 minutes to catch on my own reading the diff once.

This is the pattern I use now. Same shape every time; it works across DTC ecommerce code, healthcare compliance code, and internal tooling. Three instances follow.

[Diagram: a diff fans out to three parallel reviewers; a coordinator merges, dedups, and ranks the findings by severity.]

The pattern

One diff, three or more agents, each with a specific review role, running in parallel, writing to separate output files. A coordinator agent reads the outputs, dedups overlapping findings, and ranks by severity.

The key insight: one agent reviewing code with "generic" instructions tends to return generic findings. Three agents, each given a specific role (security, performance, style, accessibility, whatever your stack needs), surface issues a generic reviewer misses. You are not just splitting the work. You are forcing each reviewer to look through a specific lens.

The cost math holds up because agent time is cheap. Three parallel reviews at 5-8K tokens each cost less than one human reviewer's hour and return faster. Parallel versus sequential dispatch covers why parallel wins here specifically (tasks are independent, outputs go to separate files, no shared state).
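The fan-out can be sketched in a few lines. Everything here is illustrative: `run_agent` is a stand-in for whatever actually invokes your agent runtime, and the role prompts are examples, not a fixed set.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Stand-in for the call into your agent runtime; in practice this would
# dispatch a sub-agent with the role prompt plus the diff and return its review.
def run_agent(prompt: str) -> str:
    return f"# Findings\n\n(placeholder output for: {prompt.splitlines()[0]})"

# Illustrative role prompts; tune the lenses to your stack.
ROLES = {
    "security": "Review this diff for auth gaps, secret exposure, and PII in logs.",
    "performance": "Review this diff for N+1 patterns and hot-path blocking.",
    "style": "Review this diff for duplicated helpers and naming drift.",
}

def fan_out_review(diff: str, out_dir: Path = Path("/tmp/review")) -> None:
    out_dir.mkdir(parents=True, exist_ok=True)
    # All reviewers run in parallel; the `with` block waits for them to finish.
    with ThreadPoolExecutor(max_workers=len(ROLES)) as pool:
        futures = {
            role: pool.submit(run_agent, f"{prompt}\n\n{diff}")
            for role, prompt in ROLES.items()
        }
    # Each reviewer owns exactly one output file: no races, no lost findings.
    for role, future in futures.items():
        (out_dir / f"{role}.md").write_text(future.result())
```

The tasks share nothing but the read-only diff, which is what makes the parallel dispatch safe.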

Instance 1: the three-lens review on a payment change

A mid-market DTC brand I was working with shipped a change to their checkout's coupon-stacking logic. The diff was 180 lines across four files. I ran a three-agent review.

  • Security reviewer flagged that a new helper was logging the full cart object, including the customer email, to the server log.
  • Performance reviewer flagged an N+1 pattern where the coupon lookup was inside a map that iterated cart items.
  • Style reviewer flagged that two new date-formatting helpers duplicated logic already in lib/date.ts.

All three were real findings. None of them would have shown up in a single-agent "review this diff" pass because a single agent picks one lens and runs with it.

What the shape tells us: role specialization beats generalized review. Each agent in this pattern is given a persona and a narrow set of things to look for. They do not try to cover everything.

How it resolved: the three findings got fixed. The merged review took one human pass to confirm, maybe ten minutes. The diff went to main clean.

Instance 2: the four-lens review on a security-sensitive change

A client I was working with on compliance tooling shipped a change to their audit logging layer. Sensitive code. I ran four reviewers.

  • Security reviewer checked auth, secret exposure, and PII handling.
  • Compliance reviewer checked that the log format matched the contract the compliance team had signed off.
  • Performance reviewer checked that the new logging path did not block hot-path requests.
  • Backward-compatibility reviewer checked that existing log consumers would not break on the new format.

The compliance reviewer found a field rename that would have broken a downstream analytics job. The performance reviewer found a sync-write that should have been async. The security and backward-compat reviewers came back clean.

What the shape tells us: when the review needs to hit multiple stakeholder concerns, add a reviewer per concern. The marginal cost of another parallel reviewer is 5-8K tokens. The marginal value is catching a finding the others would miss.

How it resolved: both findings were fixed before merge. The compliance team saw a clean sign-off instead of a rollback later.

Instance 3: the two-lens review on a fast-moving feature

Not every change needs a three-lens review. For a small theme change, a copy tweak, or a config update, one reviewer is plenty.

I was working on a landing page copy change recently. I ran a two-lens review: copy/brand and SEO. The copy reviewer flagged a tense shift mid-paragraph. The SEO reviewer flagged a meta description over 160 characters. Two findings, two minutes, ship.

What the shape tells us: scale the reviewer count to the blast radius of the change. A copy change gets two lenses. A checkout change gets four. A payments integration gets five or six and probably a human at the end too.

The file-ownership rule

The one rule that makes this pattern reliable: each reviewer writes to its own output file. No shared file, no shared branch, no shared context.

If you try to have all three reviewers edit the same file (say, a single review.md), they race. Whoever writes last wins. You lose findings. The pattern breaks.

What I do instead:

/tmp/review/
  security.md
  performance.md
  style.md

Each agent is told "write your findings to this file, do not touch the others." The coordinator reads all three at the end.
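One way to bake the ownership rule into each reviewer's prompt is to generate the instruction from the role name, so the output path and the "do not touch" clause can never drift apart. The paths and wording here are illustrative:

```python
from pathlib import Path

REVIEW_DIR = Path("/tmp/review")

def ownership_instructions(role: str) -> str:
    # One output file per reviewer, stated explicitly in the prompt.
    out_file = REVIEW_DIR / f"{role}.md"
    return (
        f"You are the {role} reviewer. Review the diff through the {role} "
        f"lens only. Write all findings to {out_file}. Do not read or "
        f"modify any other file under {REVIEW_DIR}."
    )
```

Generating the clause per role beats copy-pasting it, because a pasted prompt with a stale path is exactly how two reviewers end up racing on one file.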

For agents actually editing code in parallel (not reviewing, editing), the same rule applies at the file level. Git worktrees for parallel agents covers that case. For review, separate output files are enough.

The coordinator

After the three reviewers finish, a coordinator agent reads the three output files and produces one merged review. Its job:

  • Dedup. If two reviewers flagged the same issue (say, both security and compliance noticed the PII log), collapse them.
  • Rank. Sort findings by severity. Blocking issues first, suggestions last.
  • Attribute. Tag each finding with which reviewer caught it so you can trust-check the severity calibration.

The coordinator is a plain sub-agent with a short prompt. It does not re-review anything; it just merges what the reviewers already did.
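In practice the coordinator is itself an agent, but the merge shape is simple enough to sketch deterministically. This sketch assumes each reviewer emits one finding per line as `- [severity] description`; the exact-match dedup is a crude stand-in for the semantic dedup an agent would do.

```python
import re
from pathlib import Path

SEVERITY_ORDER = {"blocker": 0, "major": 1, "minor": 2, "nit": 3}

def merge_reviews(out_dir: Path) -> list[str]:
    # normalized finding text -> (severity rank, attributed output line)
    findings: dict[str, tuple[int, str]] = {}
    for path in sorted(out_dir.glob("*.md")):
        reviewer = path.stem
        for line in path.read_text().splitlines():
            m = re.match(r"-\s*\[(blocker|major|minor|nit)\]\s*(.+)", line)
            if not m:
                continue
            sev, text = m.group(1), m.group(2).strip()
            key = text.lower()  # crude dedup: exact text match, case-folded
            rank = SEVERITY_ORDER[sev]
            # On a duplicate, keep the more severe copy and its attribution.
            if key not in findings or rank < findings[key][0]:
                findings[key] = (rank, f"[{sev}] {text} (via {reviewer})")
    # Blocking issues first, nits last.
    return [line for _, line in sorted(findings.values())]
```

The attribution tag is what lets you spot a reviewer whose severity calibration runs hot or cold over time.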

What the pattern tells us

Three things I keep relearning:

  1. Specialization beats generalization. A "security reviewer" finds more security issues than a "reviewer" looking at everything.
  2. Parallelism is cheap when the tasks are independent. Three reviewers writing to three files is embarrassingly parallel. No coordination cost during the review itself.
  3. The coordinator earns its keep. Merging three raw outputs by hand takes longer than having an agent do it, and the agent dedups more consistently.

How to spot when the pattern is overkill

Two signals:

  • The change is small and low-risk. A one-line config change does not need three reviewers. One is fine. Zero is also fine.
  • The reviewers all find nothing. If three parallel reviews on three changes in a row return clean, you are probably over-reviewing. Scale down.

On the other side, two signals that you need more reviewers:

  • One reviewer keeps catching issues the others miss. Promote that reviewer's lens to a dedicated one-shot checklist that runs on every diff.
  • The review pass is the bottleneck: not the agent time, but your merge time. At that point, add an automated reviewer to CI and reserve the interactive three-reviewer pass for riskier changes.

For operators who want to package this into a reusable pattern, the Claude Code skills pack ships with review skills pre-built. For the broader agent engineering handbook, the hub post indexes the full pattern set.

FAQ

Can I run parallel reviewers without sub-agents?

Not easily. Parallelism needs each reviewer to have its own context. If you run three "reviews" in a single context, they interfere with each other and you lose the specialization. The sub-agent post covers when this kind of dispatch is worth it.

How many reviewers is too many?

Four or five is my ceiling. Past that, the findings start overlapping and the coordinator struggles to dedup cleanly. Add a reviewer only when you can define a distinct lens it will bring.

Does this work for pull request review in GitHub?

Yes. The Claude Code review team pattern maps cleanly to a GitHub Actions workflow where each reviewer runs as a separate job. The coordinator's output becomes the PR comment.

How do I calibrate severity across reviewers?

Give each reviewer the same severity scale in its prompt (blocker / major / minor / nit). The coordinator can re-rank after, but starting with a shared scale reduces the merge work.
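A sketch of that shared scale as a rubric block appended to every reviewer prompt. The wording is illustrative; the point is that every reviewer sees the identical scale.

```python
# A shared severity rubric injected verbatim into every reviewer prompt,
# so the coordinator starts from a common scale instead of re-ranking from scratch.
SEVERITY_RUBRIC = """\
Rate each finding with exactly one severity tag:
- [blocker]: must be fixed before merge (data loss, security, broken contract)
- [major]: should be fixed before merge (correctness or performance risk)
- [minor]: worth fixing soon (duplication, missing edge-case tests)
- [nit]: optional polish (naming, formatting)
Emit one finding per line: "- [severity] description".
"""

def with_rubric(role_prompt: str) -> str:
    # Role-specific lens first, shared scale second.
    return f"{role_prompt}\n\n{SEVERITY_RUBRIC}"
```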

What if two reviewers contradict each other?

Surface the contradiction in the merged output. Do not have the coordinator pick a winner. Human reviewer adjudicates. Contradictions are rare; most of the time the two agents just emphasize different aspects of the same finding.

// related

Claude Code Skills Pack

If you want to go deeper on agentic builds, this pack covers the patterns I use every day. File ownership, parallel agents, tool contracts.
