How a 2-Person Team Ships Like 10
A practical look at how a lead engineer and product lead use Workforce to run multiple production codebases — the workflow, the costs, and the honest limitations.

A lead engineer and a product lead. Two people running production software for multiple clients, shipping features weekly, handling bug fixes same-day, maintaining code quality that passes security audits.
No sprint planning ceremonies. No standup theatre. No six-month hiring pipeline to bring on a mid-level developer who'll need three months to ramp up.
This is what engineering with Workforce agents looks like in practice. Not in theory, not in a controlled demo — in daily production work.
The Problem with Scaling Through Hiring
The default answer to "we have more work than we can handle" is "hire more engineers." It's also a slow and expensive answer with diminishing returns.
A single mid-to-senior developer in a major market costs £80,000–£150,000 in salary, plus benefits, equipment, and management overhead. Recruiting takes 2–4 months. Onboarding takes another 1–3 months before they're productive in a new codebase. And here's the part nobody likes to talk about: adding a developer doesn't double your output. It adds coordination overhead — more PRs to review, more context to share, more meetings to synchronise, more communication paths.
Brooks's Law isn't a suggestion. Adding people to a late project makes it later, and adding people to a small team makes it slower before it makes it faster.
What if the execution layer — the actual code writing, PR management, ticket updates, review cycles — could be handled by agents? And the humans focused on the work that actually requires human judgment: architecture, product decisions, client relationships, and quality control?
The Setup
The team runs on a simple structure:
One lead engineer — Sets architecture, reviews agent output, makes design decisions, handles the work that requires deep technical judgment
One product lead — Manages client relationships, defines requirements, prioritises the backlog, writes tickets that agents can act on
Workforce agents fill the execution layer:
Agents configured with role templates for different responsibilities (implementation, review, monitoring)
Each agent has access to the relevant GitHub repos, Linear workspaces, and Slack channels
The knowledge graph is indexed across all active codebases
Capabilities are scoped per agent — not every agent can merge, not every agent can modify infrastructure code
The tooling is straightforward. Linear for tickets. GitHub for code. Slack for communication. Workforce orchestrating the agents across all three.
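As a rough illustration of what per-agent capability scoping might look like in code (the role names and schema here are hypothetical, not Workforce's actual configuration format), a deny-by-default model keeps merge and infrastructure rights opt-in per agent:

```python
# Hypothetical sketch of capability scoping: every permission is denied
# unless the role explicitly grants it. Names are illustrative only.
from dataclasses import dataclass

@dataclass
class AgentRole:
    name: str
    repos: list[str]                # repos the agent may work in
    can_merge: bool = False         # most agents open PRs but cannot merge
    can_touch_infra: bool = False   # infrastructure code is opt-in per agent

implementer = AgentRole("implementer", repos=["client-a/api", "client-b/app"])
reviewer    = AgentRole("reviewer",    repos=["client-a/api", "client-b/app"])
maintainer  = AgentRole("maintainer",  repos=["client-a/api"],
                        can_merge=True, can_touch_infra=True)

def allowed(agent: AgentRole, repo: str, action: str) -> bool:
    """Deny by default; grant only what the role explicitly scopes."""
    if repo not in agent.repos:
        return False
    if action == "merge":
        return agent.can_merge
    if action == "modify_infra":
        return agent.can_touch_infra
    return action in {"read", "open_pr", "review"}
```

The useful property is that a compromised or confused implementation agent simply cannot merge its own work; that right lives with a narrower role.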
A Typical Day
8:30 AM — The lead engineer opens Linear and reviews what's in the backlog. There are six tickets across two client projects: three feature requests, two bug reports, one refactoring task. They spend 30 minutes refining ticket descriptions, adding context about implementation approach, and assigning them to agents.
9:00 AM — Agents pick up assigned tickets during their heartbeat cycle. Within minutes, three agents are working in parallel: one on a new API endpoint for Client A, one diagnosing a caching bug for Client B, one starting the refactor.
10:30 AM — The first PR is up. The agent working on the API endpoint has opened a PR with a clear description: what the ticket asked for, what approach was taken, which files were modified and why. Another agent picks up the review, leaving inline comments on a potential edge case in the error handling.
11:00 AM — The lead engineer reviews the reviewed PR. The reviewer agent flagged one issue — a missing null check on a query parameter. The implementing agent has already pushed a fix. CI is green. The engineer approves the merge. SHA-pinned verification confirms the reviewed commit matches the merge target. Done.
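The SHA-pinning step can be reduced to a small sketch (illustrative code, not Workforce's implementation): the merge proceeds only if the branch head at merge time is exactly the commit the reviewer approved, so any push after approval forces a re-review.

```python
# Minimal sketch of SHA-pinned merge verification (hypothetical helper
# names). A push after approval changes the head SHA, the pin fails,
# and the new commits must be re-reviewed before merging.

def verify_sha_pin(approved_sha: str, merge_target_sha: str) -> bool:
    """True only if the reviewed commit is the exact merge target."""
    return approved_sha == merge_target_sha

def safe_merge(approved_sha: str, fetch_head_sha, do_merge) -> bool:
    # fetch_head_sha: callable returning the branch's current SHA
    # (e.g. from the Git CLI or a hosting API); injected so the
    # policy itself stays testable.
    current = fetch_head_sha()
    if not verify_sha_pin(approved_sha, current):
        return False  # refuse: reviewed commit != merge target
    do_merge()
    return True
```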
11:30 AM — The caching bug PR is up. This one's more complex — the agent queried the knowledge graph to trace the data flow through three services before identifying the root cause. The PR includes a clear explanation of the bug mechanism and the fix. The review agent approved it with no changes requested. The engineer reads through it, agrees with the diagnosis, approves.
1:00 PM — The product lead has been in client meetings all morning. They come back, check Slack, and see status updates from agents on each ticket. Two are merged, one is in review, one is still in progress. They update the client: "The caching fix and the new endpoint are deployed. The dashboard feature will be ready for review this afternoon."
3:00 PM — The refactoring PR is the biggest of the day. The agent has restructured a module that was growing unwieldy, moving related functions into a cleaner namespace and updating all call sites. The review agent went through it carefully, flagging two places where the import paths could be simplified. The implementing agent applied the suggestions and the PR is clean.
4:30 PM — All six tickets are either merged or in final review. The engineer spent the day doing architecture work, reviewing agent output, and making a few design decisions. The product lead spent the day with clients and stakeholders. Six pieces of work shipped.
With a traditional team, this would have required three to four developers working full days, plus coordination overhead.
Why Kanban Works Here
This workflow maps naturally to Kanban — continuous flow, no fixed sprints, work pulled as capacity allows.
Agents don't need sprint planning. They don't have velocity estimates. They pick up the next prioritised ticket when they finish the current one. Work flows through the pipeline at whatever rate the agents can handle, bounded only by human review capacity.
For a small team, this eliminates one of the biggest overhead categories in traditional engineering management: the planning and estimation cycle. Instead of spending half a day every two weeks negotiating what fits in a sprint, you maintain a prioritised backlog and let agents pull from it continuously.
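The pull model is simple enough to sketch in a few lines (illustrative code, not a real Workforce API): a backlog ordered by priority, a WIP limit standing in for human review capacity, and agents that take the next ticket whenever they have room.

```python
# Sketch of continuous Kanban-style pull. The WIP limit models the real
# bottleneck described above: human review capacity, not agent capacity.
from __future__ import annotations
import heapq

class Backlog:
    def __init__(self, wip_limit: int):
        self.ready: list[tuple[int, str]] = []  # (priority, ticket); low = urgent
        self.in_progress: set[str] = set()
        self.wip_limit = wip_limit

    def add(self, priority: int, ticket_id: str) -> None:
        heapq.heappush(self.ready, (priority, ticket_id))

    def pull(self) -> str | None:
        """Called on an agent's heartbeat; returns the next ticket or None."""
        if len(self.in_progress) >= self.wip_limit or not self.ready:
            return None  # at capacity: stop pulling until reviews clear
        _, ticket = heapq.heappop(self.ready)
        self.in_progress.add(ticket)
        return ticket

    def done(self, ticket_id: str) -> None:
        self.in_progress.discard(ticket_id)
```

There is no sprint boundary anywhere in this loop: priority ordering replaces planning, and the WIP limit replaces capacity negotiation.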
The lead engineer's job becomes quality control and direction-setting, not task allocation and progress tracking.
The Cost Equation
Here's the comparison that makes finance teams pay attention.
A traditional approach to this workload — six tickets per day across multiple codebases with same-day turnaround — would require a team of roughly four developers, and possibly more depending on seniority and codebase familiarity. Conservatively:
4 developers × £100,000 average fully-loaded cost = £400,000/year
Plus recruiting costs, management overhead, office/equipment, onboarding time
With Workforce:
LLM API costs with adaptive model routing (frontier models for complex reasoning, cheaper models for routine tasks) — the 40–60% savings from intelligent routing keeps this well below what a naive "run everything through GPT-4" approach would cost
Workforce platform licensing
Two human team members who were already there
The exact numbers depend on ticket volume, codebase complexity, and model usage. But the order-of-magnitude difference is consistent: you're replacing £400,000+ in annual headcount costs with a fraction of that in compute and licensing.
Per-ticket cost tracking makes this concrete, not theoretical. You can see exactly what each piece of work cost in tokens and dollars and make informed decisions about where the economics work and where they don't.
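One way to make per-ticket tracking concrete is a small ledger that attributes each model call's tokens to the ticket that triggered it. The ticket ID, prices, and token counts below are invented for illustration, not Workforce's schema:

```python
# Hypothetical per-ticket cost ledger: every model call books its token
# cost against a ticket, so each merged PR carries a dollar figure.
from collections import defaultdict

PRICE_PER_1K = {"frontier": 0.01, "routine": 0.001}  # assumed $ per 1K tokens

ledger: dict = defaultdict(float)

def record_call(ticket_id: str, model: str, tokens: int) -> None:
    ledger[ticket_id] += tokens / 1000 * PRICE_PER_1K[model]

record_call("TKT-142", "routine", 12_000)   # e.g. triage and diff summary
record_call("TKT-142", "frontier", 8_000)   # e.g. root-cause reasoning
cost = round(ledger["TKT-142"], 4)          # 0.012 + 0.08 = $0.092
```

At that granularity, "where do the economics work?" stops being a debate and becomes a query over the ledger.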
What Agents Can't Do
Honesty about limitations matters more than hype. Here's where agents fall short and humans remain essential:
Strategic product thinking — Agents can implement a feature, but they can't tell you whether you should build it. Product intuition, market understanding, and customer empathy are still human territory.
Stakeholder management — No agent is joining your client call to explain why the roadmap shifted. Relationship management, negotiation, and the kind of communication that builds trust require human judgment and emotional intelligence.
Creative problem-solving on novel challenges — Agents excel at well-defined tasks with clear requirements. When the problem is ambiguous, when the right approach isn't obvious, when you need to invent a new architecture rather than implement a known pattern — that's where senior engineering judgment earns its keep.
Business context that isn't written down — Agents work from what's codified: tickets, documentation, code comments, identity files. The unspoken context that experienced team members carry — "we tried that approach two years ago and it didn't work because of X" — only becomes accessible to agents if someone writes it down.
Reviewing their own blind spots — Agents make mistakes. They sometimes misinterpret ticket requirements. They occasionally write code that's functional but not idiomatic. They can miss edge cases that an experienced developer would catch instinctively. Human review isn't optional — it's a core part of the workflow.
The Honest Assessment
Running engineering with Workforce agents isn't magic. It's a different operating model with real tradeoffs.
The setup isn't trivial. You need well-written tickets with enough context for agents to act on. Your codebase needs to be in reasonable shape — agents work better with clean code and clear conventions than with a legacy tangle. Identity files and memory configurations need tuning as you learn how agents work with your specific codebase.
Agents are a multiplier, not a replacement. They multiply the output of the humans on the team by handling the execution layer at scale. But the quality of that output is directly proportional to the quality of human direction. Vague tickets produce vague code. Clear architecture produces clean implementations.
The teams that get the most from this model are the ones that treat agents as junior-to-mid-level developers with perfect memory and unlimited hours: capable of solid execution when given clear direction, but not yet ready to make the judgment calls that shape a product.
For a two-person team that operates this way — clear direction in, reviewed output out — the throughput is genuinely comparable to a team five times the size. Not because the agents are smarter than developers. Because they don't context-switch, they don't need to be onboarded, they work in parallel, and they don't lose context between sessions.
Book a demo to see Workforce in action.