
feat(table): chunked dispatcher + workflow cascade + UX polish #4672

Merged
TheodoreSpeaks merged 21 commits into staging from feat/table-chunked-dispatcher on May 20, 2026

@TheodoreSpeaks
Collaborator

@TheodoreSpeaks TheodoreSpeaks commented May 20, 2026

Summary

  • chunked dispatcher for workflow-column runs with tableRunDispatches rows + windowed walk via batchTriggerAndWait
  • loop-in-cell cascade under a Redis lock per (tableId, rowId); reactor removed; every cell-enqueue routes through runWorkflowColumn
  • unified trigger.dev + inline paths behind JobQueueBackend.batchEnqueueAndWait
  • new mode: 'new' for auto-fire — skips rows with any prior executions[gid]; SQL pushdown via NOT jsonb_exists_all
  • dep-aware retrigger: editing a column a workflow depends on clears terminal-state downstream groups and cancels + re-runs in-flight ones
  • optimistic UI: cancel distinguishes optimistic vs real claims; create-row stamps eligible groups instantly; unmet-deps cells stay Waiting
  • active-dispatches overlay preserves queued badges across refresh during long Run-all
  • backend-counted "X running" via the same endpoint, kept live by cell SSE deltas
  • sidebar: 'Run after' defaults empty, requires ≥1 when auto-run, anchors to leftmost group column on edit; reorder scrubs deps server-side
  • server returns schema.columns sorted by metadata.columnOrder
  • /api/resume/poll routes through executeResumeJob so paused cells get cell-context restoration in local dev; paused cells render Pending pill and surface View Execution
  • typewriter reveal for SSE-delivered workflow values

Type of Change

  • New feature
  • Bug fix

Testing

Tested manually — CSV import, Run all + Stop + refresh, dep edits re-run downstream workflows, cancel scoped per-row, delete-cell flicker gone, reorder scrubs deps, paused (wait-block) cells resume via cron and complete cleanly. tsc + vitest (198/198) + check:api-validation:strict pass.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

TheodoreSpeaks and others added 13 commits May 16, 2026 23:09
Replaces the all-rows-at-once runWorkflowColumn with a row-window dispatcher
backed by a new table_run_dispatches row. Each user click inserts a dispatch
row and triggers a trigger.dev task that crawls the table 20 rows at a time,
re-enqueueing itself between windows. The HTTP/Mothership entrypoints return
{ dispatchId } immediately instead of holding the request open for minutes
on multi-thousand-row dispatches.

- Per-row cancel stamps cancelledAt; the dispatcher skips cells whose
  cancelledAt > dispatch.requestedAt so a mid-cascade cancel sticks even
  under isManualRun.
- Table-wide cancel marks active dispatches cancelled atomically so the
  dispatcher bails on its next iteration.
- New 'dispatch' SSE event variant plumbed; client ignores for v1.
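The per-row cancel rule above can be sketched as a small filter. This is an illustration only: the `CellState` and `DispatchInfo` shapes and the function names are hypothetical, and only `cancelledAt`, `requestedAt`, and the comparison between them come from the commit message.

```typescript
// Hypothetical shapes; only `cancelledAt` and `requestedAt` are named in the commit message.
interface CellState {
  rowId: string;
  cancelledAt: Date | null;
}

interface DispatchInfo {
  requestedAt: Date;
}

// A cancel stamped after the dispatch was requested acts as a tombstone.
function isTombstoned(cell: CellState, dispatch: DispatchInfo): boolean {
  return cell.cancelledAt !== null && cell.cancelledAt.getTime() > dispatch.requestedAt.getTime();
}

// Keep only the cells the dispatcher should still enqueue for this window.
function cellsToEnqueue(cells: CellState[], dispatch: DispatchInfo): string[] {
  return cells.filter((cell) => !isTombstoned(cell, dispatch)).map((cell) => cell.rowId);
}

const exampleDispatch: DispatchInfo = { requestedAt: new Date("2026-05-16T12:00:00Z") };
const exampleCells: CellState[] = [
  { rowId: "r1", cancelledAt: null },
  { rowId: "r2", cancelledAt: new Date("2026-05-16T11:59:00Z") }, // cancel predates the dispatch
  { rowId: "r3", cancelledAt: new Date("2026-05-16T12:00:05Z") }, // cancelled mid-dispatch
];
const enqueuedRowIds = cellsToEnqueue(exampleCells, exampleDispatch);
```

Comparing against `requestedAt` rather than checking for any `cancelledAt` lets a stale cancel from an earlier run coexist with a fresh dispatch.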
Run-column with run-mode 'all' wasn't visually flipping rows that already
had data — the cell renderer's "value wins" branch kept showing the prior
output behind the queued/running state. The dispatcher only cleared one
window of rows at a time, so most of the column stayed stale until the
cursor walked to it.

Now:
- Dispatcher's `pending → dispatching` transition runs a single SQL UPDATE
  that wipes targeted `data` output columns and `executions[gid]` across
  every targeted row (mode-aware: 'incomplete' skips fully-filled rows).
- Per-window clear in `dispatcherStep` is gone — rows are pre-cleared,
  the loop only filters cancel tombstones / unmet deps and enqueues.
- Optimistic patch in `useRunColumn` mirrors the bulk clear by nulling
  output values in the cached row, so the UI flips queued/running
  instantly without waiting for the SSE catch-up.

The eager bulk clear for mode: 'incomplete' only skipped rows that were
already fully filled, so two overlapping dispatches could race — dispatch B
would nuke executions[gid] on a row dispatch A had just stamped 'queued',
flickering the cell and potentially confusing the worker.

Skip any row whose targeted group is currently queued/running/pending — an
'incomplete' run shouldn't touch what another dispatch is actively working
on. The per-walk 'in-flight' eligibility skip already handles rows that
flip in-flight between the clear and the cursor reaching them.

Switch the per-window cell fan-out from fire-and-forget tasks.trigger to
tasks.batchTriggerAndWait. The dispatcher is now a single long-lived
trigger.dev task that loops dispatcherStep until the table is exhausted;
trigger.dev CRIU-checkpoints the parent during each wait so we don't pay
compute while cells execute. Queue depth is bounded at WINDOW_SIZE per
dispatch — no more flooding trigger.dev with a million queued runs.

- dispatcher.ts builds payloads via the new shared buildPendingRuns helper
  and calls tasks.batchTriggerAndWait directly. Pre-stamps each cell to
  `queued` (jobId=null) so the UI flips instantly.
- table-run-dispatcher.ts is now a plain while-true loop. No
  RUN_BUDGET_MS, no self-re-enqueue, no cold-start tax per window.

Cancel:
- New cancelCellRunsByTags(tags) paginates runs.list + runs.cancel(id).
- cancelWorkflowGroupRuns fires the tag-sweep alongside the per-jobId
  queue.cancelJob path (preserved for auto-fire cells that have real
  jobIds from single tasks.trigger calls).
- Trigger.dev acks the cancel → batchTriggerAndWait resumes → dispatcher
  observes the dispatch-row cancel flag → exits.

Side fixes:
- getAsyncBackendType returns 'trigger-dev' whenever taskContext.isInsideTask
  is true, regardless of TRIGGER_DEV_ENABLED env. The preview/dev-sim
  worker silently routing cell jobs to DatabaseJobQueue (no poller) is
  fixed without any env config change.
- runWorkflowColumn skips the dispatcher entirely when trigger.dev is
  disabled, running cells inline via DatabaseJobQueue.runInline. HTTP
  response returns dispatchId: null in that mode.
- runColumnContract response schema updated to dispatchId.nullable().

isExecInFlight required a jobId for `pending` status, gating it as "real
backend pending" vs "optimistic flag only." The row-gutter Stop button
keyed on this — so a freshly clicked Play sat as `pending` (no jobId) and
the user couldn't cancel it until the server-side `queued` stamp arrived
via SSE. With the dispatcher pre-batch stamping cells as `queued` (not
`pending`) and no per-cell jobIds under batchTriggerAndWait, the gap was
worse.

Drop the jobId requirement. `pending` now counts as in-flight everywhere.
Cancel writes `cancelled` to the cell exec authoritatively whether or not
a real trigger.dev run exists yet — cancelling an optimistic cell means
"don't run this," which is correct.

Also collapse isOptimisticInFlight into isExecInFlight since the two
helpers are now identical.
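A minimal sketch of the relaxed check, assuming a simplified `CellExec` shape (the real type is not shown here). The status names are the ones used throughout this PR.

```typescript
type ExecStatus = "pending" | "queued" | "running" | "completed" | "error" | "cancelled";

// Simplified, assumed shape of a cell execution record.
interface CellExec {
  status: ExecStatus;
  jobId?: string | null;
}

// `pending` counts as in-flight whether or not a jobId has been assigned yet.
function isExecInFlight(exec: CellExec | undefined): boolean {
  if (!exec) return false;
  return exec.status === "pending" || exec.status === "queued" || exec.status === "running";
}

const optimisticPending: CellExec = { status: "pending", jobId: null };
const finished: CellExec = { status: "completed", jobId: "job-1" };
```

With this shape, a freshly clicked Play (`pending`, no jobId) already satisfies the check that the Stop button keys on.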
Two coupled changes:

1. Cell-task runs the row's full cascade in-process. executeWorkflowGroupCellJob
   acquires a Redis lock per (tableId, rowId) with a 10s heartbeat and a 30s TTL,
   then loops through eligible workflow groups for the row. One cell-task =
   one row's full cascade, not N. Resume worker holds the same lock and
   continues the cascade after a HITL resume. Shared withCascadeLock helper
   in lib/table/cascade-lock.ts.

2. Every cell-enqueue goes through the dispatcher. The implicit
   scheduleRunsForRows reactor in service.ts is removed — 8 callsites
   (insertRow, batchInsertRows, upsertRow, updateRowsByFilter,
   batchUpdateRows, addWorkflowGroup, updateWorkflowGroup) now fire
   runWorkflowColumn with mode: 'incomplete', isManualRun: false. HTTP
   routes that call updateRow directly also fire runWorkflowColumn
   afterwards. scheduleRunsForTable / scheduleRunsForRowIds deleted;
   scheduleRunsForRows demoted to private (only the TRIGGER_DEV_ENABLED=false
   fallback uses it). skipScheduler flag dropped from UpdateRowData /
   BatchUpdateByIdData — no longer meaningful since there's nothing implicit
   to suppress.

Plumbed isManualRun through the dispatch row (new is_manual_run column,
default true) so auto-fire callers honor autoRun: false and don't re-run
completed cells.

Stamp 'pending' with executionId: null (rather than 'queued') before
batchTriggerAndWait; the cell-task writes its own 'queued' on lock acquire.

Small UI polish: row gutter Play button spacing, "Delete workflow" →
"Delete column" label, optimistic-pending cells now show Stop button
(isExecInFlight no longer requires jobId).
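The lock semantics described above (acquire with a TTL, extend on heartbeat, release only if still the owner) can be sketched with an in-memory stand-in. This is not the code in lib/table/cascade-lock.ts: the `LockStore` class, the key format, and the explicit `now` argument are assumptions made so the example runs without Redis.

```typescript
// In-memory stand-in for Redis; the real helper talks to Redis instead.
class LockStore {
  private entries = new Map<string, { token: string; expiresAt: number }>();

  // Similar in spirit to SET key token NX PX ttlMs.
  acquire(key: string, token: string, ttlMs: number, now: number): boolean {
    const current = this.entries.get(key);
    if (current && current.expiresAt > now) return false;
    this.entries.set(key, { token, expiresAt: now + ttlMs });
    return true;
  }

  // Heartbeat: extend the TTL only while we still own an unexpired lock.
  extend(key: string, token: string, ttlMs: number, now: number): boolean {
    const current = this.entries.get(key);
    if (!current || current.token !== token || current.expiresAt <= now) return false;
    current.expiresAt = now + ttlMs;
    return true;
  }

  // Compare-and-delete: never release a lock that another holder has taken over.
  release(key: string, token: string): boolean {
    const current = this.entries.get(key);
    if (!current || current.token !== token) return false;
    this.entries.delete(key);
    return true;
  }
}

const TTL_MS = 30000;
const HEARTBEAT_MS = 10000;

// Hypothetical key format for the per-(tableId, rowId) lock.
const cascadeLockKey = (tableId: string, rowId: string): string => `cascade:${tableId}:${rowId}`;

const lockStore = new LockStore();
const rowLockKey = cascadeLockKey("t1", "r1");
const aAcquired = lockStore.acquire(rowLockKey, "worker-a", TTL_MS, 0);
const bBlocked = lockStore.acquire(rowLockKey, "worker-b", TTL_MS, 5000);
const aBeat = lockStore.extend(rowLockKey, "worker-a", TTL_MS, HEARTBEAT_MS);
const bStillBlocked = lockStore.acquire(rowLockKey, "worker-b", TTL_MS, 35000);
const bWrongRelease = lockStore.release(rowLockKey, "worker-b");
const aReleased = lockStore.release(rowLockKey, "worker-a");
const bAcquired = lockStore.acquire(rowLockKey, "worker-b", TTL_MS, 36000);
```

The heartbeat interval is a third of the TTL, so a holder can miss a beat or two before the lock expires and another worker takes over the row.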
…Id cell

The dispatcher's pre-batch `pending` stamp leaves executionId unset so any
cell-task that wins the cascade lock can claim the cell. The cancellation-
guard SQL clause was rejecting these claims because it tested
`executions->gid IS NULL` (whole exec missing) but the pre-stamp leaves
the exec present with executionId=null.

Add a third carve-out: `executions->gid->>'executionId' IS NULL`. Now the
guard reads "write allowed if no exec exists, OR no executionId is set
yet, OR the executionId matches ours."

Symptom: every cell-task's first markWorkflowGroupPickedUp call would log
"SQL guard saw cancelled" and skip, leaving cells stuck at the dispatcher's
pending stamp.
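Expressed as a TypeScript predicate rather than SQL, the three carve-outs read roughly as follows. The `Exec` shape and the `canWrite` name are hypothetical; the actual guard is a SQL clause.

```typescript
// Assumed minimal shape of an executions[gid] entry.
interface Exec {
  status: string;
  executionId: string | null;
}

type Executions = Record<string, Exec | undefined>;

// Write is allowed if no exec exists, no executionId is set yet, or the executionId is ours.
function canWrite(executions: Executions, gid: string, ourExecutionId: string): boolean {
  const exec = executions[gid];
  if (exec === undefined) return true; // whole exec missing
  if (exec.executionId === null) return true; // dispatcher pre-stamp, not yet claimed
  return exec.executionId === ourExecutionId; // already claimed by this cell-task
}

const preStamped: Executions = { g1: { status: "pending", executionId: null } };
const claimedByOther: Executions = { g1: { status: "running", executionId: "exec-other" } };
```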
The dispatcher's row-window SELECT is `position > cursor` for exclusive
lower-bound semantics. With cursor initialized to 0, position-0 rows were
never picked up — every dispatch silently skipped the table's first row.

Start cursor at -1 instead. First window's filter `position > -1` matches
position 0; subsequent iterations advance to `lastPosition` which then
correctly excludes already-processed rows.
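The off-by-one is easy to reproduce with an in-memory version of the window walk. This is a sketch under assumptions: `PositionedRow`, `selectWindow`, and `walkTable` are illustrative names, and the array filter stands in for the SQL `position > cursor` SELECT.

```typescript
interface PositionedRow {
  id: string;
  position: number;
}

// Stand-in for the windowed SELECT: rows strictly after the cursor, in position order.
function selectWindow(rows: PositionedRow[], cursor: number, size: number): PositionedRow[] {
  return rows
    .filter((row) => row.position > cursor)
    .sort((a, b) => a.position - b.position)
    .slice(0, size);
}

// Walk the table window by window, advancing the cursor to the last position seen.
function walkTable(rows: PositionedRow[], startCursor: number, size: number): string[] {
  const visited: string[] = [];
  let cursor = startCursor;
  for (;;) {
    const batch = selectWindow(rows, cursor, size);
    const last = batch[batch.length - 1];
    if (last === undefined) break;
    for (const row of batch) visited.push(row.id);
    cursor = last.position;
  }
  return visited;
}

const tableRows: PositionedRow[] = [0, 1, 2, 3, 4].map((position) => ({ id: `r${position}`, position }));
const visitedFromZero = walkTable(tableRows, 0, 2); // skips r0
const visitedFromMinusOne = walkTable(tableRows, -1, 2); // covers every row
```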
…el via 'new' mode

Fix 0: new `DispatchMode = 'new'` for auto-fire callsites. Eligibility skips
rows with any prior `executions[gid]` entry — cancelled / errored / completed
cells stay sticky until a manual run. Dispatcher's windowed SELECT pushes
`NOT jsonb_exists_any(...)` to SQL so CSV imports into mostly-attempted
tables don't pay a per-window load+JS-filter. `batchInsertRows` drops its
`rowIds` payload (keeps dispatch scope tiny on big imports).

Fix A/B/D: client optimistic patches now mirror the backend's actual
invariants. `useCreateTableRow.onSuccess` stamps eligible groups via
`optimisticallyScheduleNewlyEligibleGroups` so newly-inserted rows show
`Queued` instantly. `useCancelTableRuns.onMutate` distinguishes optimistic-
only pending (`executionId == null` — strip silently) from real worker
claims (stamp cancelled; SSE will reconcile). Drop `onSettled` invalidation
on `useUpdateTableRow` / `useBatchUpdateTableRows` to kill the
delete-cell flicker.

Fix C: active-dispatches overlay. New `listActiveDispatches` helper,
contract, and `GET /api/table/[tableId]/dispatches` route. `kind:'dispatch'`
SSE events carry scope+cursor+mode on every transition. New
`useActiveDispatches` hook + `resolveCellExec` synthesize a virtual
`pending` exec for cells in an active dispatch's scope ahead of cursor —
queued indicators now survive page refresh during long Run-all dispatches.
`cancelWorkflowGroupRuns` emits `kind:'dispatch',status:'cancelled'`
events so the overlay clears without a refetch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

`runWorkflowColumn` now always inserts a `table_run_dispatches` row and
drives the dispatcher state machine. The trigger.dev / in-process branch
narrows to a single line: trigger.dev fires `tableRunDispatcherTask` (which
calls the new `runDispatcherToCompletion`), the inline path calls the same
helper fire-and-forget. Deletes `scheduleRunsForRows` and
`stampQueuedOrCancel` — the inline-fallback no longer duplicates window
walking, SSE emission, or cancel.

The dispatcher's window-execute call goes through `JobQueueBackend`:
- New `batchEnqueueAndWait` interface method.
- Trigger.dev impl wraps `tasks.batchTriggerAndWait` behind a
  `taskContext.isInsideTask` guard (clear error if called from outside a
  task).
- Database impl skips `async_jobs` entirely — `Promise.all` over
  `options.runner(payload, signal)` per item, with per-cell AbortControllers
  tracked by `cancelKey` for cancel.

`cancelInlineRun` moves to the interface as `cancelByKey` so
`cancelWorkflowGroupRuns` no longer reaches into the database backend.
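A rough sketch of the inline path described above: run every item through a runner under `Promise.all`, track one AbortController per `cancelKey`, and expose cancel by key. The signatures here are assumptions and do not reproduce the actual `JobQueueBackend` interface; `demoRunner` and `runDemo` are purely illustrative.

```typescript
interface BatchItem<P> {
  payload: P;
  cancelKey: string;
}

type Runner<P> = (payload: P, signal: AbortSignal) => Promise<void>;

// One controller per in-flight cell, keyed by cancelKey.
const controllers = new Map<string, AbortController>();

async function batchEnqueueAndWait<P>(items: BatchItem<P>[], runner: Runner<P>): Promise<void> {
  await Promise.all(
    items.map(async (item) => {
      const controller = new AbortController();
      controllers.set(item.cancelKey, controller);
      try {
        await runner(item.payload, controller.signal);
      } finally {
        // Only drop the entry if it is still ours, so a newer run under the same key survives.
        if (controllers.get(item.cancelKey) === controller) controllers.delete(item.cancelKey);
      }
    }),
  );
}

function cancelByKey(cancelKey: string): boolean {
  const controller = controllers.get(cancelKey);
  if (!controller) return false;
  controller.abort();
  return true;
}

// Illustrative runner: finishes after a short delay unless aborted first.
const outcomes: Record<string, string> = {};
const demoRunner: Runner<string> = (payload, signal) =>
  new Promise<void>((resolve) => {
    const timer = setTimeout(() => {
      outcomes[payload] = "completed";
      resolve();
    }, 20);
    signal.addEventListener("abort", () => {
      clearTimeout(timer);
      outcomes[payload] = "cancelled";
      resolve();
    });
  });

async function runDemo(): Promise<Record<string, string>> {
  const batch = batchEnqueueAndWait(
    [
      { payload: "a", cancelKey: "key-a" },
      { payload: "b", cancelKey: "key-b" },
    ],
    demoRunner,
  );
  cancelByKey("key-b"); // cancel one cell while the batch is in flight
  await batch;
  return outcomes;
}
```

Because the controllers are registered synchronously before the first await, a cancel issued right after the batch starts still finds its key.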

Fix `mode: 'new'` SQL filter:
- `${array}::text[]` was interpolated as a tuple cast, which Postgres rejected
  ("cannot cast type record to text[]"), so every inline dispatch silently
  failed. Switched to `ARRAY[${sql.join(...)}]::text[]`.
- Predicate was `jsonb_exists_any` ("any one targeted group present"),
  which excluded rows that needed at least one group re-run after a
  downstream output was deleted. Switched to `jsonb_exists_all` — per-group
  JS eligibility handles the rest.
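The difference between the two predicates can be checked in plain TypeScript. `existsAny` and `existsAll` below mimic what Postgres `jsonb_exists_any` and `jsonb_exists_all` do for top-level keys; the row data is made up for illustration.

```typescript
type Executions = Record<string, unknown>;

const hasKey = (obj: Executions, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Like jsonb_exists_any: true when at least one targeted group has an entry.
function existsAny(executions: Executions, gids: string[]): boolean {
  return gids.some((gid) => hasKey(executions, gid));
}

// Like jsonb_exists_all: true only when every targeted group has an entry.
function existsAll(executions: Executions, gids: string[]): boolean {
  return gids.every((gid) => hasKey(executions, gid));
}

const targetedGroups = ["g1", "g2"];
const untouchedRow: Executions = {};
const partialRow: Executions = { g1: { status: "completed" } }; // g2 still needs a run
const attemptedRow: Executions = { g1: { status: "completed" }, g2: { status: "cancelled" } };
const sampleRows = [untouchedRow, partialRow, attemptedRow];

// Which rows survive each SQL filter.
const keptByNotAny = sampleRows.map((row) => !existsAny(row, targetedGroups));
const keptByNotAll = sampleRows.map((row) => !existsAll(row, targetedGroups));
```

Only the `NOT ... all` form keeps the partially attempted row, which is the case the fix targets.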

Cascade-loop workflowId bug: `runRowCascadeLoop` was not threading the new
group's `workflowId` when advancing across groups. The cell-task ran the
previous group's workflow against the next group's cell, terminating
`completed` with empty `accumulatedData`. Fixed by tracking
`currentWorkflowId` alongside `currentGroupId` / `currentExecutionId`.

Client optimistic-patch tightening:
- `useRunColumn.onMutate` mirrors server eligibility — skip cells with
  unmet deps so unmet rows don't flash Queued and get stuck (no SSE will
  arrive for cells the server skipped).
- `resolveCellExec` overlay synthesizes a virtual `pending` only when
  `areGroupDepsSatisfied` is true. Rows with unmet deps render Waiting,
  matching the dispatcher's actual behavior.

Cleanup from /simplify pass:
- Use `generateShortId(20)` instead of
  `generateId().replace(/-/g, '').slice(0, 20)`.
- Inline `batchEnqueueAndWait` no longer allocates synthetic ids
  (returned `string[]` is unused).
- Flattened the per-cell `tracked` array — only push entries that
  registered controllers, drop the null placeholders.
- Extracted `runDispatcherToCompletion` to share the loop between the
  trigger.dev wrapper and the in-process path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lish

Counter (Fix 1): top-right "X running" + per-row badge are now
backend-bootstrapped via a count on `user_table_rows.executions ->> 'status'
= 'running'` returned alongside active dispatches. SSE `kind: 'cell'` events
compute a delta from `prev → next` status to keep the cache live; cell
events for rows outside the loaded page slice trigger a run-state refetch.
On `pruned` we invalidate the cache. Counts only worker-claimed `running`
cells — optimistic queued/pending no longer inflate the badge, and rows
outside the loaded page slice are counted too.
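The `prev → next` delta can be sketched as a pure function. The names `runningDelta` and `applyCellEvent` are hypothetical, and clamping at zero is an assumption rather than something the commit states.

```typescript
type CellStatus = "pending" | "queued" | "running" | "completed" | "error" | "cancelled";

// +1 when a cell enters `running`, -1 when it leaves, 0 otherwise.
function runningDelta(prev: CellStatus | undefined, next: CellStatus | undefined): number {
  const wasRunning = prev === "running" ? 1 : 0;
  const isRunning = next === "running" ? 1 : 0;
  return isRunning - wasRunning;
}

// Apply one cell event to the cached count; the clamp is a defensive assumption.
function applyCellEvent(count: number, prev: CellStatus | undefined, next: CellStatus | undefined): number {
  return Math.max(0, count + runningDelta(prev, next));
}

let runningCount = 2;
runningCount = applyCellEvent(runningCount, "queued", "running"); // a worker claimed a cell
runningCount = applyCellEvent(runningCount, "running", "completed"); // a cell finished
runningCount = applyCellEvent(runningCount, "pending", "queued"); // no change
```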

Sidebar (Fix 2 + 3a): `Run after` no longer ticks every column by default
for new groups (empty list). Save is disabled with an inline error when
auto-run is on with zero deps. `edit-group` mode anchors the left-of-current
filter to the group's leftmost column, so a workflow can only depend on
columns to its left.

Reorder scrub (Fix 3b): `updateTableMetadata` walks the schema's workflow
groups when `columnOrder` is in the patch and drops any dep whose new
position lands at or after the group's leftmost column (uses the existing
`stripGroupDeps` helper). Metadata + schema updates land atomically.
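A sketch of the scrub rule, assuming each group lists its own output columns and that `columnOrder` is an array of column names. `outputColumns` and `scrubDepsAfterReorder` are hypothetical; the PR's `stripGroupDeps` helper is not reproduced here.

```typescript
interface WorkflowGroup {
  id: string;
  outputColumns: string[]; // assumed field naming the group's own columns
  dependencies: { columns: string[] };
}

// Drop any dependency that now sits at or after the group's leftmost column.
function scrubDepsAfterReorder(groups: WorkflowGroup[], columnOrder: string[]): WorkflowGroup[] {
  const positionOf = new Map(columnOrder.map((column, index) => [column, index] as const));
  return groups.map((group) => {
    const ownPositions = group.outputColumns
      .map((column) => positionOf.get(column))
      .filter((position): position is number => position !== undefined);
    if (ownPositions.length === 0) return group;
    const leftmost = Math.min(...ownPositions);
    const kept = group.dependencies.columns.filter((column) => {
      const position = positionOf.get(column);
      // Assumption: deps missing from columnOrder are left untouched.
      return position === undefined || position < leftmost;
    });
    return { ...group, dependencies: { ...group.dependencies, columns: kept } };
  });
}

const reordered = scrubDepsAfterReorder(
  [{ id: "g1", outputColumns: ["out1"], dependencies: { columns: ["a", "b"] } }],
  ["b", "out1", "a"], // column "a" was dragged to the right of the group
);
```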

Server returns ordered columns (Fix 3b cont'd): `getTableById` /
`listTables` now sort `schema.columns` by `metadata.columnOrder` before
returning, via a new `applyColumnOrderToSchema` helper. Every consumer
(grid, sidebar, copilot, mothership) gets one ordered list — the sidebar's
leftmost-group-column anchor now points at the right index.
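One plausible shape for the ordering helper, assuming `columnOrder` holds column ids and that columns missing from it keep their relative order at the end. The actual `applyColumnOrderToSchema` may differ; `applyColumnOrder` and the `Column` shape are illustrative.

```typescript
// Assumed minimal column shape.
interface Column {
  id: string;
}

function applyColumnOrder(columns: Column[], columnOrder: string[] | undefined): Column[] {
  if (!columnOrder || columnOrder.length === 0) return columns;
  const rank = new Map(columnOrder.map((id, index) => [id, index] as const));
  const unranked = Number.MAX_SAFE_INTEGER;
  // Array.prototype.sort is stable, so unranked columns keep their original relative order.
  return [...columns].sort((a, b) => (rank.get(a.id) ?? unranked) - (rank.get(b.id) ?? unranked));
}

const orderedColumns = applyColumnOrder(
  [{ id: "b" }, { id: "a" }, { id: "d" }, { id: "c" }],
  ["a", "b", "c"],
);
const orderedIds = orderedColumns.map((column) => column.id);
```

Sorting once on the server means index-based logic on the client, such as the sidebar's leftmost-column anchor, sees the same order as the grid.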

Dep-aware retrigger (Fix 4): editing a value that a downstream workflow
depends on now re-runs that workflow.
- `deriveExecClearsForDataPatch` returns
  `{ executionsPatch, inFlightDownstreamGroups }`. Walks
  `schema.workflowGroups[].dependencies.columns` for every column in the
  patch, clears terminal-state downstream entries, and reports in-flight
  entries.
- `updateRow` calls `cancelWorkflowGroupRuns` + `runWorkflowColumn`
  (`mode: 'incomplete' + isManualRun: true`) for in-flight downstream
  groups, then always fires `runWorkflowColumn({ mode: 'new' })` for the
  cleared groups. Skips both when `executionsPatch` is provided by the
  caller — those are cell-task / cancel writes that would otherwise spawn
  a recursive flood of dispatches per partial-write.
- `cancelWorkflowGroupRuns(tableId, rowId, { groupIds? })` accepts a
  per-group filter so the cancel only touches the affected groups, not
  every in-flight cell on the row.
- `pickNextEligibleGroupForRow` now treats a dispatcher pre-stamp
  (`pending` + `executionId: null`) as claimable — the cascade-loop is the
  real owner. Without this, the dispatcher's pre-stamp of downstream
  groups made the cascade-loop see them as "in-flight" and skip them,
  stranding `pending` cells forever.
- `optimisticallyScheduleNewlyEligibleGroups` extends the cache patch to
  flip dep-touched groups to `pending` regardless of their current status,
  matching the server's cancel-then-rerun behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…der Pending + viewable

Three connected issues with workflows that pause mid-cell (e.g. wait blocks):

1. `/api/resume/poll` (the time-pause auto-resumer) called
   `PauseResumeManager.startResumeExecution` directly, bypassing
   `executeResumeJob` from `background/resume-execution.ts`. The wrapper is
   where the cell-context restoration + cascade-loop continuation lives —
   without it, the resumed workflow ran to completion but never wrote the
   terminal state back to the table cell. Cell stays `pending` forever
   even though the underlying execution finished.

   Fix: dynamically import `executeResumeJob` and use it for the
   `'starting'` branch. Same primitive the trigger.dev `resumeExecutionTask`
   wraps — calling it directly handles both trigger.dev-disabled local dev
   and trigger.dev-enabled prod identically.

2. The cell renderer mapped `status: 'pending'` to `kind: 'queued'` (gray
   "Queued" badge) regardless of whether the run had started. A HITL-paused
   run has `status: 'pending'` + `jobId` prefixed `paused-` + a real
   `executionId` — semantically very different from "queued, hasn't run."
   Now renders as `pending-upstream` (the existing Pending pill) for
   paused-jobId rows.

3. Right-click "View execution" was disabled for `pending` cells (gated to
   `completed | error | running`), so users couldn't open the trace for a
   paused execution. Paused runs have a viewable trace (the executionId is
   real and the log row exists). Both the per-row context menu and the
   action-bar derivation now recognize `pending` + `paused-` jobId as a
   started run.
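Points 2 and 3 can be sketched as two small predicates. The `PausableExec` shape and function names are hypothetical; the `paused-` prefix, the `pending-upstream` kind, and the `completed | error | running` gate come from the description above.

```typescript
// Assumed minimal exec shape for rendering decisions.
interface PausableExec {
  status: string;
  jobId: string | null;
  executionId: string | null;
}

// A HITL-paused run is `pending` with a jobId prefixed `paused-`.
function isPausedRun(exec: PausableExec): boolean {
  return exec.status === "pending" && exec.jobId !== null && exec.jobId.startsWith("paused-");
}

// Only the pending branch of the renderer is sketched here.
function pendingRenderKind(exec: PausableExec): "queued" | "pending-upstream" {
  return isPausedRun(exec) ? "pending-upstream" : "queued";
}

// "View execution" is available for started runs, which now includes paused ones.
function canViewExecution(exec: PausableExec): boolean {
  return ["completed", "error", "running"].includes(exec.status) || isPausedRun(exec);
}

const pausedCell: PausableExec = { status: "pending", jobId: "paused-123", executionId: "exec-1" };
const queuedCell: PausableExec = { status: "pending", jobId: null, executionId: null };
```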

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Workflow-output cells now reveal their text character-by-character when an
SSE update lands, while page reloads and virtualization remounts still paint
the value instantly. A first-render guard inside the new useTypewriter hook
distinguishes hydration from live updates with no plumbing through the cell
tree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented May 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped May 20, 2026 9:32am


@cursor

cursor Bot commented May 20, 2026

PR Summary

High Risk
High risk because it refactors core table workflow execution/dispatch, introduces new async batching/cancellation semantics, and adds Redis locking + new DB-driven dispatch state that affects run correctness and UI state across refreshes.

Overview
Adds a chunked, DB-backed dispatcher for table workflow runs: run requests now create table_run_dispatches rows, bulk-clear targeted outputs/executions up front, and advance a persisted cursor window-by-window via batchEnqueueAndWait (trigger.dev batchTriggerAndWait in prod; inline Promise.all locally). A new GET /api/table/[tableId]/dispatches endpoint + SSE kind: 'dispatch' events expose active dispatches and running-cell counters so the UI can preserve queued indicators across refresh and show backend-derived per-row/total running counts.

Reworks workflow execution + resume to be cascade-safe: cell tasks now run under a per-(table,row) Redis withCascadeLock and can continue eligible downstream groups in-process; resume paths route through executeResumeJob, which restores cell context, writes terminal state, and continues the cascade when it owns the lock (falls back gracefully on contention).

Updates client behavior and contracts: runColumnContract returns dispatchId, optimistic run/cancel logic is tightened (deps-aware queuing, output clearing, tombstones with cancelledAt), row listing fetches executions from a sidecar table to preserve the row.executions[groupId] shape, paused HITL runs render distinctly and remain viewable, workflow sidebar requires at least one dependency when auto-run is enabled, and completed cell values animate in via a typewriter effect.

Reviewed by Cursor Bugbot for commit 9f30cb7.

Comment thread apps/sim/app/api/table/[tableId]/rows/[rowId]/route.ts Outdated
@TheodoreSpeaks TheodoreSpeaks changed the base branch from main to staging May 20, 2026 01:38
@greptile-apps
Contributor

greptile-apps Bot commented May 20, 2026

Greptile Summary

This PR is a major rework of the table workflow-column execution system, replacing a synchronous per-row reactor with a chunked, windowed dispatcher backed by a new table_row_executions sidecar table and table_run_dispatches state machine.

  • New sidecar schema: user_table_rows.executions JSONB is dropped; per-(row, group) execution state moves to table_row_executions with a composite PK and three partial indexes, removing last-writer-wins races between concurrent workers.
  • Dispatcher pattern: runWorkflowColumn now inserts a tableRunDispatches row and drives dispatcherStep (trigger.dev batchTriggerAndWait or in-process Promise.all) in windows equal to the table concurrency limit (20), gating window N+1 on N's completion.
  • Dep-aware cascade: deriveExecClearsForDataPatch does a forward-propagating dirty-column walk so editing a dep column clears terminal downstream groups and cancels + reruns in-flight ones; the cascade lock (withCascadeLock) serializes per-row group advancement.

Confidence Score: 4/5

The core dispatch + sidecar migration is well-architected and previously-flagged bugs all appear addressed. The remaining findings are minor schema/comment inconsistencies that don't affect runtime correctness.

The schema default for tableRunDispatches.cursor is 0 while the invariant requires -1; any row inserted outside insertDispatch would silently skip position-0 rows. No production path does this today, but it is a latent trap. All other changes are solid: transaction semantics are correct, the cascade lock heartbeat and compare-and-delete cleanup are properly handled, and the dep-aware retrigger walk is topologically correct.

packages/db/schema.ts for the cursor default; apps/sim/lib/table/dispatcher.ts for the stale double-comment; apps/sim/lib/table/deps.ts for the pending+jobId guard ordering.

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/sim/lib/table/dispatcher.ts | New chunked dispatcher state machine: windowed cursor walk + pre-stamp + batchEnqueueAndWait; tombstone logic for per-row cancel ahead of cursor. Dangling stale doc comment found. |
| apps/sim/lib/table/service.ts | Major rewrite: drops JSONB executions, loads sidecar via loadExecutionsByRow/loadExecutionsForRow, transactional writeExecutionsPatch replaces SQL jsonb patch expr, dep-aware cascade propagates dirty columns forward. |
| apps/sim/lib/table/workflow-columns.ts | Replaced scheduleRunsForRows with buildPendingRuns + dispatcher flow; classifyEligibility gets new modes; cancelWorkflowGroupRuns gains groupIds filter and tombstone path. |
| apps/sim/lib/table/deps.ts | isExecInFlight now counts all three pending/queued/running statuses; optimisticallyScheduleNewlyEligibleGroups gains dep-touched propagation; pending+jobId exits early before depTouched check. |
| packages/db/migrations/0209_smiling_fixer.sql | Creates table_row_executions and table_run_dispatches; drops user_table_rows.executions; no data migration needed (feature unreleased). |
| apps/sim/lib/core/async-jobs/backends/database.ts | Adds batchEnqueueAndWait and cancelByKey; compare-and-delete cleanup prevents stale-key collisions. |
| apps/sim/lib/core/async-jobs/backends/trigger-dev.ts | Adds batchEnqueueAndWait wrapping tasks.batchTriggerAndWait with taskContext.isInsideTask guard. |
| apps/sim/app/workspace/[workspaceId]/tables/[tableId]/hooks/use-table-event-stream.ts | Adds applyDispatch handler for new dispatch SSE kind; isManualRun correctly sourced from event payload. |
| apps/sim/hooks/queries/tables.ts | Adds useTableRunState for active-dispatch overlay; withOptimisticAutoFireExec for newly created rows. |
| apps/sim/background/workflow-column-execution.ts | Cell task now acquires cascade lock; inner loop advances through eligible groups with fresh executionIds per group. |
| apps/sim/lib/table/cascade-lock.ts | New Redis-backed cascade lock with 30s TTL, 10s heartbeat, compare-and-delete release. |
| apps/sim/app/api/table/[tableId]/dispatches/route.ts | New GET endpoint returning active dispatches, runningCellCount, and runningByRowId; properly auth-gated. |

Sequence Diagram

sequenceDiagram
    participant UI as Client UI
    participant API as API Route
    participant SVC as service.ts
    participant WFC as workflow-columns.ts
    participant DISP as dispatcher.ts
    participant TDev as trigger.dev / DB backend
    participant CELL as workflow-column-execution.ts
    participant SSE as TableEventStream

    UI->>API: PATCH /rows/:rowId (user edit)
    API->>SVC: updateRow(data)
    SVC->>SVC: deriveExecClearsForDataPatch()
    SVC->>SVC: writeExecutionsPatch (sidecar upsert)
    SVC-->>API: updatedRow
    API-->>UI: 200 OK

    SVC->>WFC: runWorkflowColumn(mode:'new')
    WFC->>DISP: bulkClearWorkflowGroupCells
    WFC->>DISP: insertDispatch()
    WFC->>TDev: trigger tableRunDispatcherTask(dispatchId)

    loop Each window (WINDOW_SIZE rows)
        TDev->>DISP: dispatcherStep(dispatchId)
        DISP->>DISP: fetch chunk, buildPendingRuns
        DISP->>DISP: stampQueuedForBatch to sidecar pending
        DISP->>SSE: dispatch SSE event (cursor advance)
        SSE->>UI: dispatch event to overlay update
        DISP->>TDev: batchTriggerAndWait(workflow-group-cell items)
        TDev->>CELL: executeWorkflowGroupCellJob(payload)
        CELL->>CELL: withCascadeLock + runRowCascadeLoop
        CELL->>SSE: cell SSE events (pending to running to completed)
        SSE->>UI: cell event to row data update
        CELL-->>TDev: done
        TDev-->>DISP: all cells finished
        DISP->>DISP: advanceCursor
    end
    DISP->>SSE: dispatch SSE event (complete)

Reviews (4): Last reviewed commit: "fix(table): seed dispatch overlay on Run..."

Comment thread apps/sim/app/api/table/[tableId]/rows/[rowId]/route.ts Outdated
Comment thread apps/sim/app/api/v1/tables/[tableId]/rows/[rowId]/route.ts Outdated
Comment thread apps/sim/lib/table/workflow-columns.ts Outdated
Two P1 issues + one cleanup from the bot reviewers:

1. **Double-dispatch + completed-output wipe.** Both PATCH row routes
   (`app/api/table/[tableId]/rows/[rowId]` and
   `app/api/v1/tables/[tableId]/rows/[rowId]`) were firing a second
   `runWorkflowColumn({ mode: 'incomplete' })` after `updateRow` returns.
   `updateRow` already fires `mode: 'new'` internally for user edits, so
   the second call created a concurrent dispatch. Worse, the
   `mode: 'incomplete'` path's `bulkClearWorkflowGroupCells` wipes ALL
   targeted output columns on any row where any one column is empty —
   meaning sibling-group completed outputs could be erased. Removed both
   route-level calls; auto-dispatch lives entirely in `updateRow`.

2. **`runWorkflowColumn` log-spamming on plain tables.**
   `if (targetGroups.length === 0) throw new Error(...)` fired on every
   row insert/update for tables without any workflow groups (the
   majority). Every caller wraps with `.catch(logger.error)`, so each
   PATCH produced an error-level log. Return `{ dispatchId: null }`
   silently — manual `runWorkflowColumn` callers pass `groupIds`
   explicitly so they can't reach this branch.

3. **`isManualRun` plumbed through dispatch SSE events.** Late-arriving
   `kind: 'dispatch'` events for dispatches not in the initial fetch
   were hardcoding `isManualRun: false`. Added the field to the event
   shape, emit it from `dispatcherStep` (pending → complete, dispatching
   transitions) and `markActiveDispatchesCancelled`, and consume it in
   the SSE handler with a sensible fallback for legacy emits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread apps/sim/lib/copilot/tools/server/table/user-table.ts Outdated
Comment thread apps/sim/lib/table/service.ts Outdated
… + cancel counter refresh

Split per-row workflow-group execution state out of the user_table_rows.executions
JSONB column into a new table_row_executions sidecar keyed by (row_id, group_id).
Dispatcher filters, "X running" counter, bulk clears, and the cancellation guard
all hit indexed columns instead of walking JSONB. Wire shape unchanged — server
merges sidecar rows back into row.executions on the way out.

Also:
- deriveExecClearsForDataPatch now walks workflowGroups left-to-right with a
  propagating dirtied-column set so transitive dep chains (edit col A → group 1
  re-runs → group 2 depends on group 1's output → group 2 re-runs) collapse to
  a single forward pass.
- useCancelTableRuns.onSettled invalidates the activeDispatches query so the
  top-right counter and row gutter Stop button refetch from the server after
  any Stop (per-cell, row, or table-wide). countRunningCells is the source of
  truth; client no longer needs duplicate state.

Three migrations on this branch (0209 + 0210 + new sidecar) collapsed into one
since the feature is unreleased.
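The single forward pass can be sketched as follows. Only the propagation of the dirty set is shown; the split into cleared terminal entries versus reported in-flight entries is omitted. `outputColumns` and `groupsToClear` are hypothetical names, and the groups are assumed to be ordered left to right already.

```typescript
interface WorkflowGroup {
  id: string;
  dependencies: { columns: string[] };
  outputColumns: string[]; // assumed field naming the columns this group writes
}

// Walk groups left to right; a group is dirtied if any of its deps is dirty,
// and its outputs then join the dirty set for the groups to its right.
function groupsToClear(groups: WorkflowGroup[], editedColumns: string[]): string[] {
  const dirty = new Set(editedColumns);
  const cleared: string[] = [];
  for (const group of groups) {
    if (group.dependencies.columns.some((column) => dirty.has(column))) {
      cleared.push(group.id);
      for (const output of group.outputColumns) dirty.add(output);
    }
  }
  return cleared;
}

const exampleGroups: WorkflowGroup[] = [
  { id: "g1", dependencies: { columns: ["A"] }, outputColumns: ["B"] },
  { id: "g2", dependencies: { columns: ["B"] }, outputColumns: ["C"] },
  { id: "g3", dependencies: { columns: ["X"] }, outputColumns: ["D"] },
];
const clearedAfterEditingA = groupsToClear(exampleGroups, ["A"]);
const clearedAfterEditingX = groupsToClear(exampleGroups, ["X"]);
```

This works without a graph traversal because a group can only depend on columns to its left, so one left-to-right pass sees every upstream change before it reaches a dependent group.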
@TheodoreSpeaks
Collaborator Author

Pushed 01bb2339b — three changes on top of the existing PR:

1. row executions sidecar. Split user_table_rows.executions JSONB into a new table_row_executions table keyed by (row_id, group_id). Dispatcher's mode:'new' filter, "X running" counter, bulk clears, and the cancellation guard all hit indexed columns now. Wire shape unchanged — server merges sidecar rows back into row.executions before responding. Three branch migrations (0209+0210+sidecar) collapsed into one new migration since the feature is unreleased.

2. left-to-right dep-edit retrigger. deriveExecClearsForDataPatch now walks workflowGroups in order with a propagating dirty-column set. Editing col A → group 1 (deps on A) gets cleared → group 1's output columns join the dirty set → group 2 (deps on group 1's output) gets cleared too. Single forward pass, no DAG traversal. Cascade lock + pickNextEligibleGroupForRow already enforce serial execution at runtime.

3. cancel-flow counter refresh. useCancelTableRuns.onSettled invalidates tableKeys.activeDispatches(tableId) so the top-right counter and row gutter Stop button refetch from the server after any Stop (per-cell, row, or table-wide). countRunningCells is now the source of truth.

Verified: tsc clean, 193 vitest tests pass, lint clean, check:api-validation:strict clean.

@TheodoreSpeaks
Collaborator Author

@greptile review @BugBot review

- Mothership update_row no longer double-dispatches. updateRow already fires
  the auto-cascade internally; the second `mode: 'incomplete'` call here
  raced with it and could bulk-clear sibling-group outputs.
- SSE dispatch events no longer dropped when the activeDispatches cache is
  cold. Seed an empty TableRunState if the initial fetch hasn't landed yet
  so the queued overlay doesn't lose the first dispatch event.
- batchUpdateRows now runs cancel+rerun for per-row in-flight downstream
  groups, mirroring updateRow. Without this, dep edits in a batch left
  running workflows reading stale upstream values.
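
The cold-cache fix in the second bullet can be illustrated roughly like this. It is a sketch under assumptions: the `TableRunState` shape, the `DispatchEvent` type, and the plain `Map` standing in for the query cache are all hypothetical, and the real handler presumably goes through the query client instead.

```typescript
// Hypothetical shapes; the real TableRunState in the PR may carry more fields.
interface TableRunState {
  dispatches: Record<string, { status: string }>
}

interface DispatchEvent {
  dispatchId: string
  status: string
}

/**
 * Apply an SSE dispatch event to the cached state for a table.
 * If nothing is cached yet (the initial fetch has not landed), start from
 * an empty state instead of dropping the event.
 */
function applyDispatchEvent(
  cache: Map<string, TableRunState>,
  tableId: string,
  event: DispatchEvent
): TableRunState {
  const current: TableRunState = cache.get(tableId) ?? { dispatches: {} }
  const next: TableRunState = {
    dispatches: { ...current.dispatches, [event.dispatchId]: { status: event.status } },
  }
  cache.set(tableId, next)
  return next
}

const runStateCache = new Map<string, TableRunState>()
// The cache is cold here, yet the first event is still recorded.
applyDispatchEvent(runStateCache, 'table-1', { dispatchId: 'd1', status: 'active' })
console.log(runStateCache.get('table-1'))
```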
Comment thread apps/sim/lib/table/workflow-columns.ts
Comment thread apps/sim/lib/table/dispatcher.ts
Comment thread apps/sim/lib/table/dispatcher.ts
Comment thread apps/sim/lib/table/service.ts
Comment thread apps/sim/lib/table/dispatcher.ts
…rphan pre-stamps

Addresses cursor + greptile review feedback on table dispatcher edge cases:

- Manual table-wide Run-all / Run-column now cancels prior active dispatches
  AND in-flight cell workers before bulk-clearing. Without this, mode:'all'
  deleted running sidecar rows out from under their workers (which kept
  writing into the wiped state) and a second Run-all could enqueue overlapping
  cells racing on the same rows. Row-scoped manual calls (dep-edit cascade)
  are excluded — those already cancel their own scope.
- batchInsertRowsWithTx now scopes its auto-dispatch to the newly-inserted
  row ids. Without this, every pre-existing row has zero sidecar entries after
  the sidecar migration, so the NOT EXISTS filter matches all of them and a CSV
  import would walk the entire table, dispatching workflow runs on every
  pre-existing row.
- classifyEligibility carve-out: pending + executionId=null is an orphan
  pre-stamp (cascade-lock contention, batchEnqueueAndWait failure, etc.),
  treated as claimable so future dispatchers can re-stamp instead of skipping
  it as 'in-flight' forever. Matches pickNextEligibleGroupForRow's logic.
- On batchEnqueueAndWait failure, dispatcherStep now sweeps the orphan
  pre-stamps it wrote for the failed batch so the cells don't render Queued
  forever; the next user action picks them up cleanly.
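
A rough sketch of the orphan pre-stamp carve-out described above. The status names beyond queued / running / pending / error / cancelled, the `CellExec` shape, and the return labels are assumptions for illustration; the actual `classifyEligibility` also depends on the dispatch mode, which this sketch ignores.

```typescript
// Hypothetical status set and exec shape; 'completed' is an assumed name.
type ExecStatus = 'queued' | 'running' | 'pending' | 'completed' | 'error' | 'cancelled'

interface CellExec {
  status: ExecStatus
  executionId: string | null
}

type Eligibility = 'claimable' | 'in-flight' | 'terminal'

function classifySketch(exec: CellExec | undefined): Eligibility {
  // No record at all: nothing has claimed this cell yet.
  if (!exec) return 'claimable'
  // Orphan pre-stamp: marked pending but never bound to an execution.
  // Treat it as claimable so a later dispatcher can re-stamp it.
  if (exec.status === 'pending' && exec.executionId === null) return 'claimable'
  if (exec.status === 'queued' || exec.status === 'running' || exec.status === 'pending') {
    return 'in-flight'
  }
  return 'terminal'
}

console.log(classifySketch({ status: 'pending', executionId: null })) // claimable
console.log(classifySketch({ status: 'pending', executionId: 'exec-1' })) // in-flight
```

The order of the checks matters: the orphan test has to run before the general in-flight test, otherwise a pre-stamp with no execution would be skipped as in-flight indefinitely.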
Comment thread apps/sim/lib/table/workflow-columns.ts
@TheodoreSpeaks
Collaborator Author

@greptile review

Comment thread packages/db/migrations/0209_smiling_fixer.sql
…eued/pending

- runWorkflowColumn now cancels prior in-flight cells for row-scoped manual
  runs too (context-menu Refresh on a row subset, action-bar Refresh on
  selected rows). Previously only the table-wide path cancelled, so a
  row-scoped Refresh would bulk-clear running sidecar rows without aborting
  workers. Per-row cancel skips markActiveDispatchesCancelled so unrelated
  dispatches keep running.
- countRunningCells now counts all in-flight statuses (queued / running /
  pending) instead of just running. The row gutter Run/Stop button reads
  this map — with the old behavior, clicking Play during the queued window
  would re-enqueue an already-queued cell. SSE applyCell handler updated
  to use isExecInFlight so client deltas track the same semantics.
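
To make the counting change concrete, here is a small sketch. The names `isExecInFlight` and the queued / running / pending statuses come from the commit message; the function body, the `CellExecRow` shape, the other status names, and the `countInFlightByRow` helper are assumptions rather than the actual code.

```typescript
// Assumed status set; only queued / running / pending are confirmed as in-flight.
type CellStatus = 'queued' | 'running' | 'pending' | 'completed' | 'error' | 'cancelled'

interface CellExecRow {
  rowId: string
  groupId: string
  status: CellStatus
}

const IN_FLIGHT_STATUSES: ReadonlySet<CellStatus> = new Set<CellStatus>([
  'queued',
  'running',
  'pending',
])

// Shared predicate so the server count and the client SSE deltas agree.
function isExecInFlight(status: CellStatus): boolean {
  return IN_FLIGHT_STATUSES.has(status)
}

// Per-row count of in-flight cells; rows with none are omitted from the map.
function countInFlightByRow(execs: CellExecRow[]): Map<string, number> {
  const counts = new Map<string, number>()
  for (const exec of execs) {
    if (!isExecInFlight(exec.status)) continue
    counts.set(exec.rowId, (counts.get(exec.rowId) ?? 0) + 1)
  }
  return counts
}

const sampleExecs: CellExecRow[] = [
  { rowId: 'r1', groupId: 'g1', status: 'queued' },
  { rowId: 'r1', groupId: 'g2', status: 'running' },
  { rowId: 'r2', groupId: 'g1', status: 'completed' },
]
// r1 has two in-flight cells, so its gutter button would show Stop, not Play.
console.log(countInFlightByRow(sampleExecs))
```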
Comment thread apps/sim/lib/table/workflow-columns.ts
Comment thread apps/sim/app/api/table/[tableId]/columns/run/route.ts
Per-row Stop only cancelled sidecar rows already in flight. A row the
dispatcher hadn't reached yet had no exec record, so Stop was a no-op there
— the dispatcher would later walk to it, classify the group eligible, and
re-fire workflows the user thought they stopped.

cancelWorkflowGroupRuns now, for a per-row cancel, checks active dispatches
whose scope covers the row and writes `cancelled` tombstones (cancelledAt =
now) for the at-risk groups that don't already have a sidecar entry. The
dispatcher's existing `cancelledAt > dispatch.requestedAt` filter then skips
them when the cursor arrives. onConflictDoNothing guards against clobbering
a concurrently-written entry; the active-dispatch check avoids stamping
spurious cancels on idle rows.
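
The tombstone mechanism can be sketched with in-memory stand-ins. Everything below is illustrative: the `Dispatch` and `SidecarEntry` shapes, the `Map` keyed by `rowId:groupId`, and both helper names are hypothetical, and the insert-if-absent check only mimics what `onConflictDoNothing` does at the database level.

```typescript
// Hypothetical shapes; rowIds === null stands for a table-wide dispatch.
interface Dispatch {
  id: string
  requestedAt: number
  rowIds: string[] | null
}

interface SidecarEntry {
  status: string
  cancelledAt: number | null
}

/** Write `cancelled` tombstones for groups with no entry, if a dispatch covers the row. */
function writeTombstones(
  sidecar: Map<string, SidecarEntry>,
  activeDispatches: Dispatch[],
  rowId: string,
  groupIds: string[],
  now: number
): number {
  const covered = activeDispatches.some((d) => d.rowIds === null || d.rowIds.includes(rowId))
  if (!covered) return 0 // idle row: do not stamp spurious cancels
  let written = 0
  for (const groupId of groupIds) {
    const key = `${rowId}:${groupId}`
    if (sidecar.has(key)) continue // never clobber an existing entry
    sidecar.set(key, { status: 'cancelled', cancelledAt: now })
    written++
  }
  return written
}

/** Dispatcher-side check: skip cells cancelled after this dispatch was requested. */
function shouldSkip(entry: SidecarEntry | undefined, dispatch: Dispatch): boolean {
  return (
    entry !== undefined &&
    entry.status === 'cancelled' &&
    entry.cancelledAt !== null &&
    entry.cancelledAt > dispatch.requestedAt
  )
}

const sidecarStore = new Map<string, SidecarEntry>()
const runAll: Dispatch = { id: 'd1', requestedAt: 100, rowIds: null }
writeTombstones(sidecarStore, [runAll], 'row-9', ['g1', 'g2'], 150)
console.log(shouldSkip(sidecarStore.get('row-9:g1'), runAll)) // true
```

Comparing `cancelledAt` against `requestedAt` means a tombstone only blocks dispatches that were already requested when Stop was pressed; a later Run-all has a newer `requestedAt` and is not skipped by it.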
@TheodoreSpeaks
Collaborator Author

@greptile review

Comment thread apps/sim/hooks/queries/tables.ts
Comment thread apps/sim/lib/table/dispatcher.ts
…res as error

- useRunColumn.onSuccess invalidates the activeDispatches query so the
  resolveCellExec queued overlay populates immediately for ahead-of-cursor
  rows (scrolled-in / refetched), instead of waiting for the first dispatch
  SSE. Targeted at activeDispatches only — the rows cache stays owned by
  useTableEventStream.
- On batchEnqueueAndWait failure, dispatcherStep now flips the orphan
  pre-stamps to a terminal `error` state and emits a cell SSE event, rather
  than deleting them. The cursor still advances past the window, but the
  dropped cells are now visible (Error pill) instead of silently empty, stay
  out of the in-flight set, and re-run on the next manual run.
…res as error

- useRunColumn.onSuccess invalidates activeDispatches so the resolveCellExec
  queued overlay populates immediately for ahead-of-cursor rows instead of
  waiting for the first dispatch SSE. Rows cache stays owned by SSE.
- On batchEnqueueAndWait failure, dispatcherStep flips orphan pre-stamps to a
  terminal error state (+ cell SSE) instead of deleting them, so the dropped
  window is visible (Error pill) rather than silently empty and re-runs on the
  next manual run.

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 9f30cb7.

Comment thread apps/sim/lib/table/deps.ts
@TheodoreSpeaks TheodoreSpeaks merged commit f0311a6 into staging May 20, 2026
14 checks passed
@TheodoreSpeaks TheodoreSpeaks deleted the feat/table-chunked-dispatcher branch May 20, 2026 09:44