Problem
The submission_status_cleanup() task only recovers submissions stuck in the Running state. Submissions stuck in Submitted, Preparing, or Scoring are never picked up by the task, so they stay in that state indefinitely.
Root Cause
In src/apps/competitions/tasks.py, the cleanup task filters for:
submissions = Submission.objects.filter(
    status=Submission.RUNNING,  # Only Running!
    has_children=False,
).select_related('phase', 'parent')
Additionally, the task uses started_when to calculate the deadline, which is null for submissions that never reached Running state.
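To illustrate why widening the status filter alone would not be enough, here is a minimal standalone sketch (plain Python, no Django) of the deadline arithmetic meeting a null started_when. The variable names mirror the model fields, but the values are made up for illustration, and this does not claim to reproduce how the existing task behaves on such rows.

```python
from datetime import timedelta

# Stand-in for a submission that never reached Running: started_when is null.
started_when = None
execution_time_limit_ms = 600_000  # hypothetical 10-minute phase limit

try:
    deadline = started_when + timedelta(
        milliseconds=(3600000 * 24) + execution_time_limit_ms
    )
    failed = False
except TypeError:
    # None cannot be added to a timedelta, so no deadline can be computed.
    failed = True
```

Any fix therefore needs a non-null reference time for every state it covers.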
Impact
- Submissions can get stuck before reaching Running (during submission queue processing, preparation, or scoring re-enqueue)
- No recovery mechanism exists for these states
- Users see permanently stuck submissions with no way to recover
This bug was discovered during the EEG Foundation Challenge incident analysis.
Solution
- Extend cleanup to all non-terminal states: Submitted, Preparing, Running, Scoring
- Add fallback logic: use created_when when started_when is null
- Same deadline calculation: 24h + execution_time_limit, measured from reference_time (started_when, or created_when as the fallback)
New Flow
non_terminal_statuses = [
    Submission.SUBMITTED,
    Submission.PREPARING,
    Submission.RUNNING,
    Submission.SCORING,
]
submissions = Submission.objects.filter(
    status__in=non_terminal_statuses,
    has_children=False,
).select_related('phase', 'parent')
for sub in submissions:
    # Use started_when when it is set; fall back to created_when for
    # submissions that never started running.
    reference_time = sub.started_when or sub.created_when
    # 24 hours (in milliseconds) plus the phase's execution time limit.
    deadline = reference_time + timedelta(
        milliseconds=(3600000 * 24) + sub.phase.execution_time_limit
    )
    if now() > deadline:
        sub.cancel(status=Submission.FAILED)
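The staleness check in the loop above can also be read as a pure function, which makes the fallback easy to reason about without a database. The sketch below is illustrative only: is_stale, GRACE_PERIOD, and the parameter names are hypothetical and not part of the codebase, and it assumes execution_time_limit is expressed in milliseconds, as the timedelta call above implies.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical constant: the fixed 24h margin added on top of the phase limit.
GRACE_PERIOD = timedelta(hours=24)


def is_stale(
    started_when: Optional[datetime],
    created_when: datetime,
    execution_time_limit_ms: int,
    current_time: datetime,
) -> bool:
    """Return True once the submission has outlived its cleanup deadline."""
    # Fall back to created_when for submissions that never started running.
    reference_time = started_when or created_when
    deadline = reference_time + GRACE_PERIOD + timedelta(
        milliseconds=execution_time_limit_ms
    )
    return current_time > deadline


created = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
limit_ms = 600_000  # hypothetical 10-minute execution limit

# Never started, checked 25h after creation: past the 24h10m deadline.
never_started_late = is_stale(None, created, limit_ms, created + timedelta(hours=25))
# Never started, checked 24h after creation: still inside the deadline.
never_started_early = is_stale(None, created, limit_ms, created + timedelta(hours=24))
# Started 2h after creation: the deadline is measured from started_when instead.
started_late = is_stale(
    created + timedelta(hours=2), created, limit_ms, created + timedelta(hours=25)
)
```

Here never_started_late is True, while never_started_early and started_late are both False, because the deadline is always measured from whichever reference time applies.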
Testing
Comprehensive test suite included:
- Unit tests: src/apps/competitions/tests/test_submissions.py (4 new tests)
- Integration tests: tests/k6/ (K6 orchestrator + conservation harness)
Run integration tests:
cd tests/k6
./run_cleanup_test.sh