Re-add Fleet.AmazonSqs sample — builds clean but Aspire+LocalStack runtime battery is red #5

Description

@jeremydmiller

Summary

Fleet.AmazonSqs (CritterWatch console monitoring a Trip fleet over Amazon SQS via LocalStack) was pulled from the initial samples round. It builds clean, but its Aspire test battery is red at runtime: the CritterWatch console resource starts, but its HTTP endpoint refuses connections. The smoke test fails after ~3m20s, which is consistent with an SQS AutoProvision / LocalStack-startup timeout taking the console host down. The other transports in the round (RabbitMQ ✅, Azure Service Bus ✅, DB-backed queues) are unaffected.

This issue tracks adding it back once the LocalStack-under-Aspire startup is sorted.

What already works (don't redo from scratch)

The solution scaffold and control-channel wiring are done and compile. The SQS control channel mirrors the verified flagship; only the transport calls differ:

  • Console (CritterWatchConsole):
    opts.UseAmazonSqsTransportLocally(port: SampleConnections.LocalStackPort()).AutoProvision();
    opts.ListenToSqsQueue("critterwatch").ListenOnlyAtLeader().UseCritterWatchSerializer();
  • Each monitored service:
    opts.UseAmazonSqsTransportLocally(port).UseConventionalRouting().AutoProvision();
    opts.ListenToSqsQueue("{service}");          // trip_service / trip_publisher / repair_shop
    opts.AddCritterWatchMonitoring("sqs://critterwatch".ToUri(), "sqs://{service}".ToUri());
    (URI shape is sqs://{queueName} — no //queue/ host segment, unlike RabbitMQ.)
  • AppHost: LocalStack container, plus a Postgres container (DLQ/Scheduled panels read CritterWatch's durable store). Services get LOCALSTACK_PORT=4566.
    AddContainer("localstack", "localstack/localstack")
        .WithImageTag("4")
        .WithEnvironment("SERVICES", "sqs")
        .WithEndpoint(port: 4566, targetPort: 4566, scheme: "http", name: "gateway")
        .WithHttpHealthCheck("/_localstack/health", endpointName: "gateway")
  • Storage: Marten/Postgres; Trip trio only (no Incidents).

Likely culprits to investigate when re-adding

  1. Hardcoded host port 4566. The AWS SDK ServiceURL must be a concrete http://localhost:{port}, so an Aspire-dynamic port can't be used — but a fixed 4566 can clash with a running CritterWatch docker-compose LocalStack (which maps host 4666→container 4566) or anything else on 4566.
  2. LocalStack readiness vs. the console's SQS AutoProvision. A WithHttpHealthCheck("/_localstack/health") was added so WaitFor(localstack) gates on readiness, but the battery was still red — confirm the health path reports SQS-ready, and that the console's AutoProvision actually reaches the gateway.
  3. Get the console's real startup exception. It wasn't captured (macOS lacks the timeout command; dotnet run on the AppHost trips the Aspire dashboard env-var check, which the test harness bypasses). Reproduce under DistributedApplicationTestingBuilder and dump the critterwatch resource logs.
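
For culprit 2, one way to check whether the health path really reports SQS-ready is to probe it directly, outside Aspire. The following is a minimal sketch, not code from the sample: LocalStackProbe, IsSqsReady and WaitForSqsAsync are hypothetical names, and it assumes the health body has the shape {"services":{"sqs":"<status>"}} with "available" or "running" meaning SQS can take requests. Verify that shape against the LocalStack image tag actually in use before relying on it.

```csharp
using System;
using System.Net.Http;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of Fleet.AmazonSqs.
public static class LocalStackProbe
{
    // Assumption: the body looks like {"services":{"sqs":"available", ...}}
    // and "available" / "running" both mean the SQS provider is usable.
    public static bool IsSqsReady(string healthJson)
    {
        try
        {
            using var doc = JsonDocument.Parse(healthJson);
            var root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Object) return false;
            if (!root.TryGetProperty("services", out var services)) return false;
            if (services.ValueKind != JsonValueKind.Object) return false;
            if (!services.TryGetProperty("sqs", out var sqs)) return false;
            if (sqs.ValueKind != JsonValueKind.String) return false;
            return sqs.GetString() is "available" or "running";
        }
        catch (JsonException)
        {
            // Not JSON at all (for example an HTML error page): treat as not ready.
            return false;
        }
    }

    // Polls http://localhost:{port}/_localstack/health once per second until
    // SQS reports ready or the timeout elapses. Returns false on timeout.
    public static async Task<bool> WaitForSqsAsync(
        int port, TimeSpan timeout, CancellationToken ct = default)
    {
        using var http = new HttpClient { BaseAddress = new Uri($"http://localhost:{port}") };
        var deadline = DateTime.UtcNow + timeout;
        while (DateTime.UtcNow < deadline)
        {
            ct.ThrowIfCancellationRequested();
            try
            {
                var body = await http.GetStringAsync("/_localstack/health", ct);
                if (IsSqsReady(body)) return true;
            }
            catch (HttpRequestException)
            {
                // Gateway is not accepting connections yet (or returned non-2xx); retry.
            }
            await Task.Delay(TimeSpan.FromSeconds(1), ct);
        }
        return false;
    }
}
```

Calling await LocalStackProbe.WaitForSqsAsync(4566, TimeSpan.FromSeconds(60)) while the AppHost is up would show whether the gateway on the fixed port is the sample's LocalStack and whether SQS itself is reported ready, or whether the endpoint answers 200 before SQS is usable. If the latter, the WithHttpHealthCheck gate alone would not be enough for the console's AutoProvision.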

Acceptance

Fleet.AmazonSqs restored to the round with a green Aspire battery (/about returns 200 and TripService/RepairShop are registered).
