Skip to content

Update polling and timeout message handling | CTT-188#335

Open
codingLogan wants to merge 5 commits into
masterfrom
lr/CTT-188
Open

Update polling and timeout message handling | CTT-188#335
codingLogan wants to merge 5 commits into
masterfrom
lr/CTT-188

Conversation

@codingLogan
Copy link
Copy Markdown
Collaborator

@codingLogan codingLogan commented Jun 1, 2026

This PR implements some edge-case fixes that have been reported by customers, which cause the user to get stuck on the Connecting screen, or get the wrong screen.

  1. Sometimes the first response from the backend APIs already signal that the job is done, and in an error state. The widget incorrectly would keep on polling waiting for a different response from the first one. This was recreated by using a Slow 3G network speed emulation, and using verification and No DDA member.
  2. Not aggregating fix - we have a handler for when we don't have an explicit actionable error code, but we are in a connection_status that is an error. Updates were made to allow that state to be reached.
  3. The two above fixes also exposed another problem. The Connecting screen is supposed to send a timeout post message after 60 seconds. Polling timing has changed in the last half-year, and this the timeout message was out of sync. This problem has also been addressed because the above two fixes changed that code as well.
  4. While testing the above, it was also discovered that sometimes the widget would show a false-positive success screen in No DDA scenarios. That has also been fixed.

Fixes

  • 188
  • 190
  • 191
  • 194

Test Results

✅ Vitest
✅ New Playwright OAuth tests
✅ Cypress Tests (when npm linked to these changes)

@codingLogan codingLogan marked this pull request as ready for review June 2, 2026 16:03
@codingLogan codingLogan self-assigned this Jun 2, 2026
Comment thread src/hooks/usePollMember.tsx Outdated
Comment thread src/hooks/__tests__/usePollMember-test.tsx Outdated
Comment thread src/views/connecting/Connecting.js
expect(message).toEqual(CONNECTING_MESSAGES.VERIFYING)
})

test('should stop polling on second consecutive CONNECTED update in verify mode', () => {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my own learning, why does verify mode need two CONNECTED updates?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also wonder this

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This particular scenario is one reason I would like reviews from some others because it is odd. I'm open to suggestions of handling this differently. I'll find where I placed my comments about this and add them to the test file.

The gist of the problem, is that an asynchronous edge case exists... What happens behind the scenes of a member record that hits the No DDA flow, apparently is this.

  1. The member goes to CONNECTED (no errors) - yay success!
  2. The member then goes to IMPEDED (with errors) - oh no, error...

In some rare instances, 1 in 6, or sometimes 1 in 20 attempts, the widget would catch the CONNECTED status and show the success screen. But, it should actually be showing the IMPAIRED screen... With a second confirmed CONNECTED message we're more confident that it should actually be a success instead of an error.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like it's a backend problem, no?

Copy link
Copy Markdown
Collaborator Author

@codingLogan codingLogan Jun 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially? If there is a bug, it would be with the is_being_aggregated flag. That should likely stay true until completely after the jobs are done AND NO DDA is checked for verification involved jobs.

I'll raise some questions about this behavior, but for now this is a workaround we can handle.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like a backend issue to me as well. I'm also wondering whether it's expected for a member with no DDA to be marked as CONNECTED and then IMPEDED, or if it should go straight to IMPEDED.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a monkey patch that is adding tech debt and complexity to our widget, when the problem really should be solved on the backend. Is adding this worth it? Do we have a ticket to remove this as soon as it's fixed on the backend? Has the backend prioritized fixing this, or is this monkey patch going to delay the backend getting the proper fix in?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to leave the code in place while I personally investigate the backend behaviors. Depending on what is found we can decide to revert the monkey-patch behavior, or adjust as needed to correctly detect no dda scenarios with both polling and/or websockets.

Best scenario, is that we can figure out the complete picture back-to-front, and then move this entire set of changes forward after we apply whatever fix is necessary.

Comment thread src/utilities/__tests__/pollers-test.js Outdated
Comment thread README.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants