[analyze 2/3] persona-discoverer: built-in persona that clusters work into starter personas #76

@willwashburn

Part of the agentworkforce analyze feature. Issue 2 of 3. Consumes the JSON from #75 and produces the proposals that #77 will walk.

Depends on #71. This issue assumes the persona-kit migration (#64–#71) has shipped. The file targets below reflect the post-migration layout.

Goal

Add a new internal built-in persona persona-discoverer that reads the gathered signal JSON (from #75) and emits a JSON proposals file describing 3–7 distinct starter personas, each grounded in a real cluster of work the repo has been doing.

The analyzer is a persona — not bespoke code — because clustering "ways of working" is a judgment task where heuristics produce shallow buckets, and because keeping the clustering logic in a persona JSON lets users iterate on it the same way they would persona-improver.

Files to touch

New:

  • personas/persona-discoverer.json — built-in persona spec.

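Modify:

  • workload-router/src/index.ts: add 'persona-discovery' to PERSONA_INTENTS.
  • packages/workload-router/routing-profiles/default.json: add a routing rule for persona-discovery.
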
Persona shape

Pattern: copy the input/output contract style from personas/persona-improver.json, and the sparse systemPrompt + rich agentsMdContent style from personas/persona-maker.json.

  • id: persona-discoverer
  • intent: persona-discovery
  • tags: ["discovery", "planning"]
  • description: one or two plain sentences — what it does, when to use it.
  • tiers:
    • best: codex / gpt-5.3-codex / reasoning high / timeoutSeconds: 1200 / sandboxMode: workspace-write / workspaceWriteNetworkAccess: true
    • best-value: opencode / gpt-5-nano / reasoning medium / timeoutSeconds: 900
    • minimum: opencode / minimax-m2.5-free / reasoning low / timeoutSeconds: 600
    • Each tier's systemPrompt: "$TASK_DESCRIPTION" (matches persona-maker — the heavy spec lives in agentsMdContent).
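  • inputs: ANALYSIS_INPUT_PATH, PROPOSALS_OUTPUT_PATH, TARGET_DIR (the three variables the smoke test below sets).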
  • agentsMdContent: the operating spec (see below).
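
For orientation, the shape above could be sketched as a TypeScript literal. This is only an illustration: the field names harness, model, and reasoning are assumptions (the real names come from PersonaSpec), and the description and agentsMdContent values are placeholders, not proposed wording.

```typescript
// Hypothetical skeleton of personas/persona-discoverer.json as a typed literal.
// `harness`, `model`, and `reasoning` are ASSUMED field names, not confirmed API.
type TierName = "best" | "best-value" | "minimum";

interface TierSketch {
  harness: string; // assumed field name
  model: string; // assumed field name
  reasoning: "low" | "medium" | "high"; // assumed field name
  timeoutSeconds: number;
  sandboxMode?: string;
  workspaceWriteNetworkAccess?: boolean;
  systemPrompt: string;
}

interface PersonaSketch {
  id: string;
  intent: string;
  tags: string[];
  description: string;
  tiers: Record<TierName, TierSketch>;
  agentsMdContent: string;
}

// Every tier shares the same sparse prompt; the heavy spec lives in agentsMdContent.
const SYSTEM_PROMPT = "$TASK_DESCRIPTION";

const personaDiscovererSketch: PersonaSketch = {
  id: "persona-discoverer",
  intent: "persona-discovery",
  tags: ["discovery", "planning"],
  description: "<one or two plain sentences>", // placeholder
  tiers: {
    best: {
      harness: "codex",
      model: "gpt-5.3-codex",
      reasoning: "high",
      timeoutSeconds: 1200,
      sandboxMode: "workspace-write",
      workspaceWriteNetworkAccess: true,
      systemPrompt: SYSTEM_PROMPT,
    },
    "best-value": {
      harness: "opencode",
      model: "gpt-5-nano",
      reasoning: "medium",
      timeoutSeconds: 900,
      systemPrompt: SYSTEM_PROMPT,
    },
    minimum: {
      harness: "opencode",
      model: "minimax-m2.5-free",
      reasoning: "low",
      timeoutSeconds: 600,
      systemPrompt: SYSTEM_PROMPT,
    },
  },
  agentsMdContent: "<operating spec>", // placeholder
};
```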

Output contract (what the persona writes to PROPOSALS_OUTPUT_PATH)

{
  "analysisInputPath": "<abs>",
  "proposals": [
    {
      "id": "kebab-case-id",
      "summary": "<= 80 chars, one line",
      "rationale": "Paragraph citing concrete signal from the analysis input: specific commits, files, PRs, sessions. No marketing.",
      "persona": { /* full PersonaSpec matching workload-router/src/index.ts */ }
    }
  ]
}

Each persona must validate against the existing PersonaSpec interface — id, intent, tags, description, skills, tiers.{best,best-value,minimum}, optional mount / permissions / inputs. If the persona declares skills, run npx skills find <kw> first (per the persona-maker spec) and only include skills that actually exist.
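
A structural check of the contract could look roughly like the sketch below. It is not the real PersonaSpec validation: validateProposalsFile and its helpers are hypothetical names, and the persona check only confirms that the three tiers exist. Full validation would still go through the --dry-run step.

```typescript
// Minimal structural check for the proposals file. All names here are
// hypothetical; the persona check is a stand-in for real PersonaSpec validation.
const REQUIRED_TIERS = ["best", "best-value", "minimum"] as const;
const KEBAB_CASE = /^[a-z0-9]+(-[a-z0-9]+)*$/;

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Returns a list of human-readable problems; an empty list means the shape is OK.
function validateProposalsFile(value: unknown): string[] {
  const errors: string[] = [];
  if (!isRecord(value)) return ["root must be an object"];
  if (typeof value.analysisInputPath !== "string") {
    errors.push("analysisInputPath must be a string");
  }
  if (!Array.isArray(value.proposals)) {
    errors.push("proposals must be an array");
    return errors;
  }
  value.proposals.forEach((p: unknown, i: number) => {
    const at = `proposals[${i}]`;
    if (!isRecord(p)) {
      errors.push(`${at} must be an object`);
      return;
    }
    if (typeof p.id !== "string" || !KEBAB_CASE.test(p.id)) {
      errors.push(`${at}.id must be kebab-case`);
    }
    if (typeof p.summary !== "string" || p.summary.length > 80 || p.summary.indexOf("\n") !== -1) {
      errors.push(`${at}.summary must be one line of at most 80 chars`);
    }
    if (typeof p.rationale !== "string" || p.rationale.trim() === "") {
      errors.push(`${at}.rationale must be a non-empty string`);
    }
    if (!isRecord(p.persona)) {
      errors.push(`${at}.persona must be an object`);
      return;
    }
    const tiers = p.persona.tiers;
    for (const tier of REQUIRED_TIERS) {
      if (!isRecord(tiers) || !isRecord(tiers[tier])) {
        errors.push(`${at}.persona.tiers.${tier} is missing`);
      }
    }
  });
  return errors;
}
```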

agentsMdContent outline

The spec should cover:

  • Read the analysis input. Specifically: walk commits, hotFiles, prs, codebase, sessions. Quote concrete signal (sha prefixes, file paths, PR numbers) in rationale.
  • Cluster shape. Name clusters by the work, not the code area. Good: "Database migration writer". Bad: "Person who touches src/db/".
  • Cluster count. 3–7. Fewer if the repo is small; more if there's clear separation. Don't pad to hit a number.
  • Conflict avoidance. Read TARGET_DIR for existing persona files; don't reuse those ids.
  • Skill curation. Run npx skills find <kw> for any skill before declaring it. Drop trivial single-flag CLIs.
  • Tier defaults. Codex@best, opencode@best-value, opencode@minimum — same shape as persona-maker. Override per-persona only when the work genuinely benefits.
  • Mount discipline. Use mount.readonlyPatterns to scope each persona to the directory cluster the signal pointed to — don't grant universal read/write.
  • Output discipline. Write the proposals JSON to PROPOSALS_OUTPUT_PATH and exit. Do not print proposals to stdout. Do not write any other files.
  • Anti-goals. Don't propose duplicates of persona-maker, persona-improver, persona-discoverer. Don't propose meta/management personas. Don't draft personas the signal doesn't actually support.
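
The id-level rules in this outline (conflict avoidance, the upper cluster bound, no duplicates of the built-ins) are simple enough that a caller could also check them mechanically after the persona runs. A rough sketch, where screenProposalIds is a hypothetical helper and not part of the planned work:

```typescript
// Hypothetical post-hoc screen for proposed persona ids. It does not judge
// cluster quality; it only enforces the id rules from the outline.
const BUILT_IN_IDS = ["persona-maker", "persona-improver", "persona-discoverer"];

interface Rejection {
  id: string;
  reason: string;
}

interface ScreenResult {
  accepted: string[];
  rejected: Rejection[];
}

function screenProposalIds(
  proposed: string[],
  existingIds: string[], // ids of persona files already present in TARGET_DIR
  maxCount = 7, // upper bound from the "Cluster count" rule
): ScreenResult {
  const taken = new Set<string>([...existingIds, ...BUILT_IN_IDS]);
  const accepted: string[] = [];
  const rejected: Rejection[] = [];
  for (const id of proposed) {
    if (taken.has(id)) {
      rejected.push({ id, reason: "id already in use" });
      continue;
    }
    if (accepted.length >= maxCount) {
      rejected.push({ id, reason: "over the cluster limit" });
      continue;
    }
    accepted.push(id);
    taken.add(id); // also catches duplicates within the same proposal set
  }
  return { accepted, rejected };
}
```

There is deliberately no lower bound here, since the outline allows fewer than three clusters for a small repo.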

Tasks

  • Add 'persona-discovery' to PERSONA_INTENTS in workload-router/src/index.ts.
  • Author personas/persona-discoverer.json with the shape above. Cross-check against personas/persona-maker.json and personas/persona-improver.json for consistency.
  • Add a routing rule for persona-discovery in packages/workload-router/routing-profiles/default.json (mirror persona-improvement).
  • Run corepack pnpm --filter @agentworkforce/workload-router run dev once and confirm packages/workload-router/src/generated/personas.ts picks up the new built-in.
  • Validate the persona: npm run dev:cli -- agent persona-discoverer@best --dry-run — must pass.
  • Smoke test: write a minimal canned analysis JSON to /tmp/canned.json (2–3 fake commits, 1 PR, 1 package), invoke the persona headless with ANALYSIS_INPUT_PATH=/tmp/canned.json PROPOSALS_OUTPUT_PATH=/tmp/proposals.json TARGET_DIR=/tmp/personas, and verify the output file is valid JSON matching the contract above and that each proposed persona's JSON passes agentworkforce agent <written-persona>@best-value --dry-run.
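
The canned input for the smoke test could be generated along these lines. The top-level keys (commits, hotFiles, prs, codebase, sessions) come from the outline above, but the shape of each entry is an assumption, since the real schema is defined in #75. All data below is fake.

```typescript
// Builds a minimal fake analysis payload for the smoke test.
// Entry shapes are ASSUMED; only the top-level keys come from the issue text.
interface CannedCommit {
  sha: string;
  subject: string;
  files: string[];
}

interface CannedAnalysis {
  commits: CannedCommit[];
  hotFiles: { path: string; touches: number }[];
  prs: { number: number; title: string }[];
  codebase: { packages: string[] };
  sessions: unknown[];
}

// Derives hot files from the commits so the two sections stay consistent.
function countTouches(commits: CannedCommit[]): { path: string; touches: number }[] {
  const counts = new Map<string, number>();
  for (const commit of commits) {
    for (const file of commit.files) {
      counts.set(file, (counts.get(file) ?? 0) + 1);
    }
  }
  return Array.from(counts, ([path, touches]) => ({ path, touches })).sort(
    (a, b) => b.touches - a.touches,
  );
}

function buildCannedAnalysis(): CannedAnalysis {
  const commits: CannedCommit[] = [
    { sha: "a1b2c3d", subject: "Add users table migration", files: ["db/migrations/001_users.sql"] },
    { sha: "d4e5f6a", subject: "Add orders table migration", files: ["db/migrations/002_orders.sql", "db/schema.sql"] },
    { sha: "0f9e8d7", subject: "Fix schema drift", files: ["db/schema.sql"] },
  ];
  return {
    commits,
    hotFiles: countTouches(commits),
    prs: [{ number: 1, title: "Introduce orders schema" }],
    codebase: { packages: ["db"] },
    sessions: [],
  };
}

// The serialized form is what would be written to /tmp/canned.json.
const serialized = JSON.stringify(buildCannedAnalysis(), null, 2);
```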

Verification

  • corepack pnpm -r build clean.
  • corepack pnpm run check clean.
  • agentworkforce list shows persona-discoverer after the regenerated catalog ships.
  • agentworkforce show persona-discoverer@best prints the persona without warnings.
  • Every persona in the smoke-test proposals JSON, once extracted and written under TARGET_DIR, passes --dry-run cleanly.
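
For the last check, the extraction step could be sketched as a pure function that plans one file per proposed persona. planPersonaFiles is a hypothetical helper, and naming each file after the persona id is an assumption. The sketch does no disk I/O.

```typescript
// Plans the files a verifier would write under TARGET_DIR, one per proposal.
// File naming (<persona id>.json) is an ASSUMPTION for illustration.
interface PlannedFile {
  path: string;
  contents: string;
}

function planPersonaFiles(proposalsJson: string, targetDir: string): PlannedFile[] {
  const parsed = JSON.parse(proposalsJson) as {
    proposals: { persona: { id: string } }[];
  };
  const dir = targetDir.replace(/\/+$/, ""); // drop trailing slashes
  return parsed.proposals.map((proposal) => ({
    path: `${dir}/${proposal.persona.id}.json`,
    contents: JSON.stringify(proposal.persona, null, 2) + "\n",
  }));
}
```

Each planned file would then be written out and checked with agentworkforce agent <written-persona>@best-value --dry-run, as in the smoke test.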

Constraints

  • Built-in persona, not a persona pack. It lives under /personas/ like persona-maker and persona-improver and ships with the CLI; not under packages/personas-core/.
  • Model-agnostic prompt. agentsMdContent must not name specific models or hardcode tier identities — the same prompt drives all three tiers.
  • No new dependencies.
  • Don't write outside PROPOSALS_OUTPUT_PATH. The persona may read TARGET_DIR to avoid id conflicts but must not write to it; writing personas to disk is the job of #77 ([analyze 3/3] agentworkforce analyze: subcommand wiring + proposal walk + write to disk).
