Agent Army MECE Audit Scorecard

Date: 2026-05-22
Agents Analyzed: 169
Categories: 11
Overall MECE Score: 72/100 ⚠️


Executive Summary

Your agent army achieves reasonable MECE separation by category (architecture, language, infrastructure) but suffers from diagonal overlaps within categories and insufficient boundary rules across them. Primary gaps:

  1. Routing ambiguity: ~15% of realistic tasks could legitimately go to 2+ agents (target: <5%)
  2. Decision rule explicitness: 60% of overlapping pairs lack documented boundaries
  3. Description clarity: 30% of agents don't signal primary deliverable unambiguously

Good news: The 11-category structure is sound. Fixes require refining descriptions + adding boundary rules, not reorganizing.


Category-by-Category Scorecard

01 · Core Development

MECE Score: 65/100

| Agent | Deliverable | Clarity | Overlaps With | Severity | Status |
|---|---|---|---|---|---|
| api-designer | API specifications | ✅ Clear | backend-developer (APIs) | ⚠️ Medium | Define when backend owns API vs designer owns spec |
| backend-developer | Server code + architecture | ⚠️ Vague | api-designer, node-specialist, fastapi-developer | 🔴 High | Needs boundary: "owns architecture"; specialists own implementation |
| design-bridge | UI implementation from design specs | ✅ Clear | none | ✅ Distinct | MECE-ready |
| electron-pro | Desktop app (Electron) | ✅ Clear | none | ✅ Distinct | MECE-ready |
| frontend-developer | Frontend application + architecture | ⚠️ Vague | react-specialist, mobile-web-specialist, ui-designer | 🔴 High | Unclear: "builds complete apps" but so does react-specialist |
| fullstack-developer | End-to-end features | ⚠️ Vague | backend-developer, frontend-developer | 🔴 High | Is this always needed, or rare case? Consider deprecating |
| graphql-architect | GraphQL schema + federation | ✅ Clear | api-designer (GraphQL is API) | ⚠️ Medium | Rule: architect owns schema design; designer owns spec format |
| microservices-architect | Service architecture | ✅ Clear | backend-developer (builds services) | ⚠️ Medium | Boundary: architect designs; backend implements |
| mobile-developer | Cross-platform mobile (React Native, Flutter) | ✅ Clear | mobile-web-specialist | ⚠️ Medium | Rule: mobile-developer=native/native-bridge; web-specialist=responsive web |
| mobile-web-specialist | Responsive web for phones | ✅ Clear | frontend-developer, mobile-developer | ⚠️ Medium | Rule: web-specialist=responsive CSS/viewport quirks only |
| ui-designer | Visual design + systems | ✅ Clear | frontend-developer, design-bridge | ⚠️ Medium | Boundary: designer owns aesthetics; frontend owns implementation |
| websocket-engineer | Real-time bidirectional communication | ✅ Clear | none | ✅ Distinct | MECE-ready |

Findings:
  - frontend-developer bloat: Description says "multi-framework" but also "full-stack integration", so its scope is unclear. Consider restricting to non-language-specific orchestration.
  - fullstack-developer redundancy: Its scope (database + API + frontend) overlaps fully with frontend-developer + backend-developer. Recommend: deprecate or redefine as "feature-level orchestrator" (non-code work).
  - API design split: api-designer (specs) vs backend-developer (implementation) needs explicit rule.

Recommendation: Add 3 boundary rules to AGENTS.md; consider consolidating fullstack into backend/frontend.


02 · Language Specialists

MECE Score: 78/100

| Language | Agents | Overlap? | Notes |
|---|---|---|---|
| Python | python-pro, fastapi-developer | ⚠️ Yes | Rule: fastapi-developer for FastAPI-based async APIs; python-pro for general Python, non-API async code, and scripts |
| JavaScript/TypeScript | javascript-pro, typescript-pro, nextjs-developer, node-specialist | 🔴 Yes | 4 agents; needs hierarchy: language → framework → specialty |
| React | react-specialist (in this list but also in category 02 for optimization) | ⚠️ Yes | Diagonal: frontend-developer also owns React |
| .NET | csharp-developer, dotnet-core-expert, dotnet-framework-4.8-expert | ⚠️ Yes | Clear: 4.8 legacy, Core cloud-native, C# general |
| PowerShell | powershell-5.1-expert, powershell-7-expert, powershell-module-architect, powershell-security-hardening, powershell-ui-architect | 🔴 Yes | 5 agents; good specialization (version + domain) but high proliferation |
| PHP | php-pro, laravel-specialist, symfony-specialist | ⚠️ Yes | Rule: php-pro=language-level; framework specialists=framework idioms |
| Go | golang-pro | ✅ One | MECE-ready |
| Rust | rust-engineer | ✅ One | MECE-ready |
| Java | java-architect, spring-boot-engineer | ⚠️ Yes | Clear: architect=design; spring-boot=Spring-specific implementation |

Findings:
  - Best practice: Go, Rust, Elixir have one agent per language, which eliminates overlap.
  - Worst practice: PowerShell (5 agents) and the JavaScript ecosystem (4 agents). High specialization can fragment coverage.
  - Root cause: Framework-level agents in a language-first category create diagonal overlap. E.g., nextjs-developer is language + framework + concern.

Recommendation:
  - Keep fastapi-developer (async is specialized enough)
  - Merge typescript-pro into javascript-pro (TypeScript is a superset of JavaScript)
  - Consolidate nextjs-developer into node-specialist (Next.js is a React framework that runs on Node.js)
  - Keep PowerShell agents (Windows-specific, justified granularity)
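
The "language → framework → specialty" hierarchy noted in the table can be read as a most-specific-match rule. The sketch below is one possible interpretation, not the routing logic actually in use: the tier ranks, the tag sets, and the `route` helper are all hypothetical and exist only for illustration.

```python
from typing import Optional

# Hypothetical tiers: a more specific tier outranks a less specific one.
TIER_RANK = {"language": 1, "framework": 2, "specialty": 3}

# Illustrative registry, not the real agent metadata: name -> (tier, tags).
REGISTRY = {
    "javascript-pro": ("language", {"javascript", "typescript"}),
    "node-specialist": ("framework", {"node"}),
    "nextjs-developer": ("framework", {"nextjs"}),
    "python-pro": ("language", {"python"}),
    "fastapi-developer": ("framework", {"fastapi"}),
}


def route(task_tags: set[str]) -> Optional[str]:
    """Return the single most specific matching agent, or None if unclear."""
    matches = [
        (TIER_RANK[tier], name)
        for name, (tier, tags) in REGISTRY.items()
        if tags & task_tags
    ]
    if not matches:
        return None
    best = max(rank for rank, _ in matches)
    winners = [name for rank, name in matches if rank == best]
    # A tie at the most specific tier is exactly the ambiguity a boundary
    # rule has to resolve, so it is reported as None rather than guessed.
    return winners[0] if len(winners) == 1 else None


print(route({"python", "fastapi"}))  # fastapi-developer
print(route({"javascript"}))         # javascript-pro
print(route({"node", "nextjs"}))     # None
```

Under these assumptions a framework-level agent wins over its language-level sibling, and two agents on the same tier surface as an unresolved tie instead of being routed arbitrarily.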


03 · Infrastructure

MECE Score: 81/100

| Agent | Scope | Overlaps | Severity | Status |
|---|---|---|---|---|
| cloud-architect | Multi-cloud architecture | azure-infra-engineer (Cloud provider specific) | ⚠️ Medium | Rule: architect=strategy; azure=Azure-specific implementation |
| devops-engineer | CI/CD + containerization | deployment-engineer (CI/CD too) | 🔴 High | CRITICAL: Both own CI/CD. Boundary unclear. |
| deployment-engineer | CI/CD + deployment automation | devops-engineer (same) | 🔴 High | CRITICAL: See above. |
| docker-expert | Docker containers | devops-engineer (containerization) | ⚠️ Medium | Rule: docker-expert=image/compose/registry; devops=orchestration |
| kubernetes-specialist | K8s | platform-engineer (self-service infra) | ⚠️ Medium | Rule: k8s=deployment; platform=developer experience |
| platform-engineer | IDP + golden paths | sre-engineer (reliability) | ⚠️ Medium | Both optimize developer/system experience; clarify focus |
| sre-engineer | SLI/SLOs + reliability | devops-engineer (reliability too) | ⚠️ Medium | Rule: sre=metrics + culture; devops=automation tools |
| security-engineer | Security automation | security-architect (enterprise security) | ⚠️ Medium | Boundary: architect=design; engineer=implementation |
| incident-responder | Security breaches | devops-incident-responder (ops incidents) | ⚠️ Medium | Clear: security vs operational incidents |
| terraform-engineer + terragrunt-expert | IaC | devops-engineer (infrastructure automation) | ⚠️ Medium | Rule: IaC specialists own code; devops owns process |

Critical Finding: devops-engineer vs deployment-engineer is broken.
  - DevOps description: "CI/CD pipelines, containerization strategies, deployment workflows"
  - Deployment description: "designing, building, optimizing CI/CD pipelines"
  - Both own CI/CD. No boundary rule exists.

Recommendation:
  1. Immediate: Add decision rule to both descriptions. Proposed:
     - deployment-engineer: "owns release orchestration, rollback, deployment strategies"
     - devops-engineer: "owns CI/CD architecture, infrastructure automation, build optimization"
  2. Clarify sre-engineer boundary: "owns SLOs, error budgets, toil reduction" (not automation)
  3. Add rule for platform-engineer vs kubernetes-specialist: "platform owns IDP end-to-end; k8s specialist owns k8s ops only"
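
One way to check that a proposed rule set is mutually exclusive is to list each agent's owned concerns and look for any concern claimed twice. The sketch below is a simplification under stated assumptions: the concern labels are shortened from the quoted descriptions, matching is by exact string, and `find_conflicts` is a hypothetical helper rather than part of any existing tooling.

```python
from collections import defaultdict

# Current state, reduced to concern labels (a simplification of the
# quoted descriptions; exact-string matching is an assumption).
CURRENT = {
    "devops-engineer": {"CI/CD pipelines", "containerization strategies",
                        "deployment workflows"},
    "deployment-engineer": {"CI/CD pipelines"},
}

# Proposed state, taken from the recommended decision rules.
PROPOSED = {
    "deployment-engineer": {"release orchestration", "rollback",
                            "deployment strategies"},
    "devops-engineer": {"CI/CD architecture", "infrastructure automation",
                        "build optimization"},
    "sre-engineer": {"SLOs", "error budgets", "toil reduction"},
}


def find_conflicts(ownership: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each concern claimed by more than one agent to its claimants."""
    owners = defaultdict(list)
    for agent, concerns in ownership.items():
        for concern in concerns:
            owners[concern].append(agent)
    return {c: sorted(a) for c, a in owners.items() if len(a) > 1}


print(find_conflicts(CURRENT))   # {'CI/CD pipelines': ['deployment-engineer', 'devops-engineer']}
print(find_conflicts(PROPOSED))  # {}
```

An empty result only shows that no label is claimed twice. It does not show that the labels themselves are collectively exhaustive.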


04 · Quality & Security

MECE Score: 79/100

| Agent | Scope | Overlaps | Status |
|---|---|---|---|
| code-reviewer | Code review (quality) | security-auditor (includes code) | ⚠️ Medium |
| security-auditor | Comprehensive security audits | penetration-tester (testing) | ⚠️ Medium |
| penetration-tester | Offensive security testing | security-auditor (audits include testing) | ⚠️ Medium |
| debugger | Root cause analysis | error-detective (also diagnoses errors) | 🔴 High |
| error-detective | Error diagnosis + pattern analysis | debugger (same) | 🔴 High |
| performance-engineer | Bottleneck elimination | All others (performance touches everything) | 🔴 High |
| chaos-engineer | Resilience testing | sre-engineer (also tests reliability) | ⚠️ Medium |
| qa-expert | QA strategy | test-automator (test automation) | ⚠️ Medium |

Critical Finding: debugger vs error-detective are nearly identical.
  - Debugger: "diagnose and fix bugs, identify root causes"
  - Error-detective: "diagnose errors, correlate across services, identify root causes"
  - Both own root-cause analysis. Difference is unclear.

Recommendation:
  1. Merge debugger and error-detective into one agent, or:
  2. Split clearly: debugger = single-service/local diagnosis; error-detective = distributed systems + observability
  3. Add boundary rule for performance-engineer: "diagnoses bottlenecks (any layer); delegates layer-specific fixes to specialist"
  4. Clarify security-auditor vs penetration-tester: "auditor=assessment+reporting; tester=exploitation+validation"
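
If the split option is chosen over the merge, the local-versus-distributed boundary could be expressed as a small decision function. This is a minimal sketch: the two inputs (`services_involved`, `needs_cross_service_telemetry`) are assumed signals that a router would have to extract from the task, and the function name is hypothetical.

```python
def pick_diagnostic_agent(services_involved: int,
                          needs_cross_service_telemetry: bool = False) -> str:
    """Apply the proposed split: local diagnosis vs. distributed diagnosis."""
    if services_involved < 1:
        raise ValueError("services_involved must be at least 1")
    # Anything spanning services, or relying on correlated logs/traces,
    # falls on the distributed side of the proposed boundary.
    if services_involved > 1 or needs_cross_service_telemetry:
        return "error-detective"
    return "debugger"


print(pick_diagnostic_agent(1))        # debugger
print(pick_diagnostic_agent(4))        # error-detective
print(pick_diagnostic_agent(1, True))  # error-detective
```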


05 · Data & AI

MECE Score: 74/100

| Agent | Scope | Overlaps | Status |
|---|---|---|---|
| data-engineer | ETL/ELT pipelines | dlt-engineer (ELT pipelines) | 🔴 High |
| dlt-engineer | dlt-specific ELT | data-engineer (same) | 🔴 High |
| data-analyst | Analysis + dashboards | data-scientist (analysis too) | ⚠️ Medium |
| data-scientist | ML models + analysis | data-analyst (analysis) | ⚠️ Medium |
| ml-engineer | Production ML systems | machine-learning-engineer (same) | 🔴 High |
| machine-learning-engineer | ML model serving | ml-engineer (same) | 🔴 High |
| mlops-engineer | ML infrastructure | ml-engineer (infrastructure) | ⚠️ Medium |
| database-optimizer | Query tuning | postgres-pro (PostgreSQL tuning) | ⚠️ Medium |
| postgres-pro | PostgreSQL specialist | database-optimizer (query optimization) | ⚠️ Medium |
| prompt-engineer | Prompt design | llm-architect (LLM systems) | ⚠️ Medium |

Critical Findings: Three overlaps need attention (two high-severity, one medium-severity):
  1. data-engineer vs dlt-engineer (High): Both build ELT pipelines. dlt is a tool; data-engineer is a role. This is vertical, not horizontal overlap.
  2. ml-engineer vs machine-learning-engineer (High): RESOLVED. Merged into machine-learning-engineer (broader training/retraining scope folded in); ml-engineer removed.
  3. data-analyst vs data-scientist (Medium): Unclear boundary (both do analysis). Rule needed.

Recommendation:
  1. Consolidate (DONE): Merged ml-engineer and machine-learning-engineer into one agent. (Kept machine-learning-engineer; removed ml-engineer.)
  2. Clarify: dlt-engineer is a specialist (dlt framework), not a replacement for data-engineer. Update descriptions:
     - data-engineer: "Design & build ETL/ELT pipelines using any tool (SQL, Spark, Airflow, dlt)"
     - dlt-engineer: "Build & optimize dlt-specific pipelines for complex source-to-destination workflows"
  3. Split: data-analyst (business intelligence, dashboards) vs data-scientist (statistical modeling, predictions)
  4. Scope: database-optimizer owns any DB; postgres-pro owns PostgreSQL. Boundary: "optimizer=general; postgres-pro=PostgreSQL-specific tuning"


06 · Developer Experience

MECE Score: 76/100

| Agent | Scope | Overlaps | Status |
|---|---|---|---|
| documentation-engineer | Docs systems | technical-writer (docs) | ⚠️ Medium |
| technical-writer | Docs + guides | documentation-engineer (same) | ⚠️ Medium |
| legacy-modernizer | Incremental modernization | refactoring-specialist (code cleanup) | ⚠️ Medium |
| refactoring-specialist | Code refactoring | legacy-modernizer (same) | ⚠️ Medium |
| dependency-manager | Dependency audits | security-engineer (security audits) | ⚠️ Medium |
| powershell-* (5 agents) | PowerShell specialization | Internal to category (clear hierarchy) | ✅ OK |

Findings:
  - Diagonal overlap between documentation-engineer (systems) and technical-writer (content) is minor; the boundary is roughly "architect vs. writer".
  - legacy-modernizer vs refactoring-specialist: Separation is unclear. The modernizer appears broader (tech debt) and the specialist narrower (code structure), but neither description says so.
  - PowerShell agents are well-scoped (version + domain), no issues.

Recommendation:
  1. Add boundary rules for documentation-engineer vs technical-writer: "engineer designs systems/architecture; writer creates content"
  2. Clarify legacy-modernizer vs refactoring-specialist: "modernizer=strategy + sequencing; specialist=tactical code cleanup"


07 · Specialized Domains

MECE Score: 85/100

Finding: This category is well-scoped by domain (blockchain, game, fintech, healthcare, etc.). Minimal diagonal overlap. Strong MECE.

Minor issues:
  - mobile-app-developer (iOS/Android strategy) vs mobile-developer (cross-platform): located in different categories, but the distinction is clear.
  - payment-integration vs fintech-engineer: fintech is broader; payment-integration is narrow. Clear hierarchy.

Status: MECE-ready with one minor clarification.


08 · Business & Product

MECE Score: 82/100

| Agent | Scope | Overlaps | Status |
|---|---|---|---|
| project-manager | Project planning + execution | scrum-master (agile ceremonies) | ⚠️ Medium |
| scrum-master | Scrum ceremonies + impediments | project-manager (planning) | ⚠️ Medium |
| business-analyst | Requirements gathering | product-manager (product decisions) | ⚠️ Medium |
| product-manager | Roadmap + feature prioritization | business-analyst (requirements) | ⚠️ Medium |
| technical-writer | Docs | (also in category 06) | |
| legal-advisor | Legal risk | license-engineer (licensing) | ⚠️ Medium |
| license-engineer | OSS compliance | legal-advisor (legal) | ⚠️ Medium |

Findings: Business/Product agents have moderate overlaps but clear intent differences. Boundaries exist but aren't explicit.

Recommendation: Add decision rules:
  - project-manager (planning, timeline, budget) vs scrum-master (agile facilitation, ceremonies)
  - business-analyst (elicitation, requirements) vs product-manager (strategy, roadmap)
  - legal-advisor (legal risk, contracts) vs license-engineer (OSS compliance)


09 · Meta & Orchestration

MECE Score: 87/100

| Agent | Scope | Overlaps | Status |
|---|---|---|---|
| agent-organizer | Multi-agent team assembly | multi-agent-coordinator (agent orchestration) | ⚠️ Medium |
| multi-agent-coordinator | Coordinating concurrent agents | agent-organizer (assembling teams) | ⚠️ Medium |
| task-distributor | Task routing + load balancing | multi-agent-coordinator (orchestration) | ⚠️ Medium |
| workflow-orchestrator | Business process workflows | task-distributor (task routing) | ⚠️ Medium |

Findings: Orchestration layer is mostly coherent. Overlaps are fine-grained (assembly vs. coordination vs. execution) but boundaries are fuzzy.

Recommendation: Add explicit rules:
  - agent-organizer: "designs agent teams for complex projects; one-time setup"
  - multi-agent-coordinator: "runs concurrent agents; synchronization + state sharing"
  - task-distributor: "routes individual tasks to agents; queue management"
  - workflow-orchestrator: "manages stateful business processes with multiple states"


10 · Research & Analysis

MECE Score: 88/100

| Agent | Scope | Overlaps | Status |
|---|---|---|---|
| research-analyst | Multi-source synthesis | data-researcher (data collection) | ⚠️ Medium |
| data-researcher | Data collection | research-analyst (synthesis) | ⚠️ Medium |
| search-specialist | Information retrieval | research-analyst (research) | ⚠️ Medium |
| market-researcher | Market sizing | competitive-analyst (competitive intel) | ⚠️ Medium |
| competitive-analyst | Competitor analysis | market-researcher (market analysis) | ⚠️ Medium |
| trend-analyst | Emerging patterns | market-researcher (trends) | ⚠️ Medium |

Findings: This category is coherent. Overlaps are minimal and follow a pipeline (collection → analysis → synthesis → strategy).

Recommendation: Add pipeline rule for clarity:

data-researcher (collect) → research-analyst (synthesize) → business-analyst (act)
search-specialist (find) → market-researcher (size) → competitive-analyst (strategy) → trend-analyst (foresight)
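
The two pipelines above can be encoded as ordered stage lists so that a handoff is always "the next stage in the same pipeline". This is a sketch under assumptions: the pipeline keys `"research"` and `"market"` and the `next_agent` helper are hypothetical names introduced here.

```python
from typing import Optional

# Stage order copied from the pipeline rule; the dictionary keys are
# hypothetical labels, not names used elsewhere in this audit.
PIPELINES = {
    "research": ["data-researcher", "research-analyst", "business-analyst"],
    "market": ["search-specialist", "market-researcher",
               "competitive-analyst", "trend-analyst"],
}


def next_agent(pipeline: str, current: str) -> Optional[str]:
    """Return the agent that receives the handoff, or None at the last stage."""
    stages = PIPELINES[pipeline]
    position = stages.index(current)  # raises ValueError for an unknown agent
    if position + 1 < len(stages):
        return stages[position + 1]
    return None


print(next_agent("research", "data-researcher"))  # research-analyst
print(next_agent("market", "trend-analyst"))      # None
```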


11 · Enterprise Architecture

MECE Score: 91/100

| Agent | Scope | TOGAF Phase | Overlaps | Status |
|---|---|---|---|---|
| enterprise-architect | All phases + orchestration | Preliminary → G | solution-architect (implementation) | ⚠️ Minor |
| togaf-adm-advisor | Phase guidance | All | none | ✅ Clear |
| wardley-strategist | Strategic landscape | A, E | none | ✅ Clear |
| business-architect | Capabilities + value streams | B | none | ✅ Clear |
| capability-planner | Investment prioritization | B, E, F | none | ✅ Clear |
| information-architect | Data architecture | C (Data) | none | ✅ Clear |
| integration-architect | Integration patterns | C (App), D | none | ✅ Clear |
| solution-architect | ABB→SBB translation | E, F | none | ✅ Clear |
| security-architect | Security by design | Cross-cutting | none | ✅ Clear |
| platform-architect | IDP + Team Topologies | D | none | ✅ Clear |
| us-regulatory-architect | Compliance architecture | Cross-cutting | none | ✅ Clear |

Findings: Best-in-class MECE structure. Agents are scoped to TOGAF phases with explicit sequencing. Clear inputs/outputs. No confusing overlaps.

Status: MECE-ready. No changes needed.


Cross-Category Critical Overlaps (Summary)

| Pair | Category | Severity | Current Boundary | Status |
|---|---|---|---|---|
| frontend-developer vs react-specialist | 01 vs 02 | 🔴 High | None | FIX: greenfield vs. optimization |
| backend-developer vs node-specialist vs fastapi-developer | 01 vs 02 | 🔴 High | None | FIX: architecture vs. language vs. framework |
| devops-engineer vs deployment-engineer | 03 | 🔴 High | None | CRITICAL: Both own CI/CD |
| debugger vs error-detective | 04 | 🔴 High | None | MERGE or clarify: local vs. distributed |
| data-engineer vs dlt-engineer | 05 | 🔴 High | Tool-specific | CLARIFY: dlt is tool-specialist, not replacement |
| ml-engineer vs machine-learning-engineer | 05 | 🔴 High | None | MERGE: identical scope (resolved: merged into machine-learning-engineer; ml-engineer removed) |
| performance-engineer vs all layer-specialists | 04 vs others | 🔴 High | None | ADD RULE: diagnose vs. fix pattern |
| documentation-engineer vs technical-writer | 06 vs 08 | ⚠️ Medium | Implicit | ADD RULE: systems vs. content |

Routing Test: 20 Real-World Tasks

Test methodology: Each task listed below; marked with which agent(s) could legitimately claim it.

| # | Task | Primary | Secondary | Ambiguity? | Notes |
|---|---|---|---|---|---|
| 1 | "Our React app is slow. Optimize render performance." | react-specialist | performance-engineer | 🔴 Yes | Both own this. Need rule. |
| 2 | "Build a new Node.js API from scratch." | node-specialist | backend-developer | 🔴 Yes | Language vs. architecture. |
| 3 | "Design an OpenAPI spec for our payment API." | api-designer | backend-developer | ⚠️ Maybe | Designer owns spec; backend owns implementation. Clear? |
| 4 | "Set up CI/CD for our Docker containers." | devops-engineer | deployment-engineer | 🔴 Yes | CRITICAL overlap. |
| 5 | "Implement a FastAPI REST endpoint." | fastapi-developer | python-pro | ⚠️ Maybe | Framework-specific or language-wide? |
| 6 | "Our database queries are slow. Why?" | performance-engineer | database-optimizer | ⚠️ Maybe | Both diagnose. Need rule. |
| 7 | "Refactor our legacy monolith into microservices." | legacy-modernizer | microservices-architect | ⚠️ Maybe | Sequencing vs. design. |
| 8 | "Audit our code for security vulnerabilities." | code-reviewer | security-auditor | ⚠️ Maybe | Code review vs. security review. |
| 9 | "We found a bug. Debug it." | debugger | error-detective | 🔴 Yes | Nearly identical. |
| 10 | "Build our ELT pipeline from Salesforce to DuckDB." | data-engineer | dlt-engineer | 🔴 Yes | Tool-specific vs. general. |
| 11 | "Write the README for our project." | technical-writer | documentation-engineer | ⚠️ Maybe | Content vs. systems. |
| 12 | "We're slow on mobile. Fix it." | mobile-web-specialist | performance-engineer | ⚠️ Maybe | Responsive design vs. performance. |
| 13 | "Build the landing page." | frontend-developer | ui-designer | ⚠️ Maybe | Implementation vs. design. |
| 14 | "Set up Kubernetes for our microservices." | kubernetes-specialist | platform-engineer | ⚠️ Maybe | Ops vs. developer experience. |
| 15 | "Train a machine learning model." | ml-engineer | machine-learning-engineer | 🔴 Yes | Identical. |
| 16 | "Analyze why our service is unreliable." | sre-engineer | performance-engineer | ⚠️ Maybe | Reliability vs. performance. |
| 17 | "Build a GraphQL API." | graphql-architect | api-designer | ⚠️ Maybe | Graph-specific vs. API-general. |
| 18 | "Pen test our application." | penetration-tester | security-auditor | ⚠️ Maybe | Offensive vs. comprehensive. |
| 19 | "Plan our Q3 roadmap." | product-manager | business-analyst | ⚠️ Maybe | Strategy vs. requirements. |
| 20 | "Trace errors across our microservices." | error-detective | debugger | 🔴 Yes | See #9. |

Ambiguity Rate: 50% (10/20 tasks counted as ambiguous; the table marks 7 tasks 🔴 Yes and 13 ⚠️ Maybe, and every task lists a secondary agent)
Target: <5% (≤1 task)
Gap: 45 percentage points ❌
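
The rate and gap figures above follow from simple arithmetic, sketched here so the same calculation can be repeated on a new task set. The function name and the `TARGET_PERCENT` constant are hypothetical; the inputs are the counts reported in this section.

```python
def ambiguity_rate(ambiguous_tasks: int, total_tasks: int) -> float:
    """Return the share of tasks with more than one plausible owner, in percent."""
    if total_tasks <= 0:
        raise ValueError("total_tasks must be positive")
    if not 0 <= ambiguous_tasks <= total_tasks:
        raise ValueError("ambiguous_tasks must be between 0 and total_tasks")
    return 100.0 * ambiguous_tasks / total_tasks


TARGET_PERCENT = 5.0  # target stated in this report

rate = ambiguity_rate(10, 20)
gap = rate - TARGET_PERCENT
print(f"rate={rate:.0f}% gap={gap:.0f} points")  # rate=50% gap=45 points
```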


Synthesis: MECE Improvement Roadmap

Phase 1 (Immediate): Fix Critical Overlaps

3–5 day effort. High-impact fixes.

| Pair | Action | Affected Agents | Effort |
|---|---|---|---|
| devops-engineer vs deployment-engineer | Document decision rule in both descriptions | 2 | 2 hrs |
| debugger vs error-detective | Merge into one agent OR split by scope (local vs. distributed) | 2 | 4 hrs |
| ml-engineer vs machine-learning-engineer | Deprecate one; consolidate descriptions (done: kept machine-learning-engineer, removed ml-engineer) | 2 | 2 hrs |
| react-specialist vs frontend-developer | Add boundary rule (greenfield vs. optimization) | 2 | 2 hrs |
| backend-developer vs others | Clarify architecture-vs-implementation boundary | 3+ | 4 hrs |

Outcome: Reduce routing ambiguity from 50% → ~20%.

Phase 2 (Short-term): Add Explicit Decision Rules

1 week effort. Medium-impact clarity.

For each overlapping pair, add to AGENTS.md:

## Routing Rules

### When to use X instead of Y
- X: [condition A]
- Y: [condition B]
- Edge case [scenario]: use [agent name] because [reason]

Pairs to address:
  1. data-engineer vs dlt-engineer
  2. documentation-engineer vs technical-writer
  3. api-designer vs backend-developer
  4. performance-engineer vs layer-specialists
  5. legacy-modernizer vs refactoring-specialist
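
To keep every rule in the same shape as the template, the entries could be generated from structured data rather than written by hand. The sketch below is illustrative only: `BoundaryRule` and `render_rule` are hypothetical names, and the example conditions and edge case are placeholder wording, not agreed boundary language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundaryRule:
    """One 'When to use X instead of Y' entry (hypothetical structure)."""
    agent_x: str
    condition_x: str
    agent_y: str
    condition_y: str
    edge_case: str
    edge_agent: str
    edge_reason: str


def render_rule(rule: BoundaryRule) -> str:
    """Render a rule in the same layout as the AGENTS.md template."""
    return "\n".join([
        f"### When to use {rule.agent_x} instead of {rule.agent_y}",
        f"- {rule.agent_x}: {rule.condition_x}",
        f"- {rule.agent_y}: {rule.condition_y}",
        f"- Edge case {rule.edge_case}: use {rule.edge_agent} "
        f"because {rule.edge_reason}",
    ])


# Placeholder wording for illustration; the real conditions need review.
example = BoundaryRule(
    agent_x="dlt-engineer",
    condition_x="the pipeline is built on the dlt framework",
    agent_y="data-engineer",
    condition_y="the pipeline uses another tool or the tool is undecided",
    edge_case="tool choice still open",
    edge_agent="data-engineer",
    edge_reason="tool selection is part of pipeline design",
)
print(render_rule(example))
```

Generating the section this way makes a missing condition or edge case a construction error instead of a silent gap in the document.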

Phase 3 (Backlog): Assess New Agents

Before adding, run each candidate through the rubric:
  - Primary deliverable (distinct from 5+ existing agents?)
  - Boundary conditions (vs. overlapping agents)
  - Category fit (or new category?)

Use the template in AGENT_MECE_AUDIT_RUBRIC.md.


Based on MECE gaps, prioritize consolidation over new agents:

Do NOT add without addressing:
  - Data quality / governance specialist (overlaps with data-engineer)
  - Observability specialist (overlaps with performance-engineer, sre-engineer)
  - Native iOS/Android specialist (overlaps with mobile-developer)

Safe to add (clear gaps):
  - Visual design systems specialist (distinct from ui-designer)
  - Edge computing specialist (no current agent)
  - Compliance automation specialist (distinct from security-engineer)
  - API governance architect (distinct from integration-architect)


Appendix: MECE Scoring Rubric Reminder

| Score | Criterion |
|---|---|
| 90–100 | Excellent MECE. Descriptions are unambiguous. <5% routing conflicts. Boundary rules documented. |
| 75–89 | Good MECE. Minor overlaps exist but boundary rules can resolve them quickly. |
| 60–74 | Fair MECE. Diagonal overlaps present. Need consolidation or explicit rules. |
| <60 | Poor MECE. High routing ambiguity. Requires restructuring. |

Your current score: 72/100 → Fair → Actionable improvements exist.
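
The rubric bands map directly to a threshold lookup. The sketch below assumes integer scores from 0 to 100; `mece_band` is a hypothetical helper, and the category scores are the ones reported in this audit.

```python
def mece_band(score: int) -> str:
    """Map a 0-100 MECE score to its rubric band."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "Excellent"
    if score >= 75:
        return "Good"
    if score >= 60:
        return "Fair"
    return "Poor"


# Category scores as reported in the scorecard above.
CATEGORY_SCORES = {
    "01": 65, "02": 78, "03": 81, "04": 79, "05": 74, "06": 76,
    "07": 85, "08": 82, "09": 87, "10": 88, "11": 91,
}

print(mece_band(72))  # Fair
for category, score in CATEGORY_SCORES.items():
    print(category, score, mece_band(score))
```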


Next steps:
  1. Review critical overlaps (🔴 High severity above)
  2. Draft boundary rules for Phase 1
  3. Run routing test on 10 new tasks to validate fixes
  4. Publish revised descriptions
