Update polling and timeout message handling | CTT-188#335
Conversation
… handling of polls
This change allows a proper detection of NO DDA flows in verification mode even when websockets are insanely fast in their detection
| expect(message).toEqual(CONNECTING_MESSAGES.VERIFYING) | ||
| }) | ||
|
|
||
| test('should stop polling on second consecutive CONNECTED update in verify mode', () => { |
There was a problem hiding this comment.
Just for my own learning, why does verify mode need two CONNECTED updates?
There was a problem hiding this comment.
This particular scenario is one reason I would like reviews from some others because it is odd. I'm open to suggestions of handling this differently. I'll find where I placed my comments about this and add them to the test file.
The gist of the problem, is that an asynchronous edge case exists... What happens behind the scenes of a member record that hits the No DDA flow, apparently is this.
- The member goes to CONNECTED (no errors) - yay success!
- The member then goes to IMPEDED (with errors) - oh no, error...
In some rare instances, 1 in 6, or sometimes 1 in 20 attempts, the widget would catch the CONNECTED status and show the success screen. But, it should actually be showing the IMPAIRED screen... With a second confirmed CONNECTED message we're more confident that it should actually be a success instead of an error.
There was a problem hiding this comment.
This sounds like it's a backend problem, no?
There was a problem hiding this comment.
Potentially? If there is a bug, it would be with the is_being_aggregated flag. That should likely stay true until completely after the jobs are done AND NO DDA is checked for verification involved jobs.
I'll raise some questions about this behavior, but for now this is a workaround we can handle.
There was a problem hiding this comment.
This sounds like a backend issue to me as well. I'm also wondering whether it's expected for a member with no DDA to be marked as CONNECTED and then IMPEDED, or if it should go straight to IMPEDED.
There was a problem hiding this comment.
This feels like a monkey patch that is adding tech debt and complexity to our widget, when the problem really should be solved on the backend. Is adding this worth it? Do we have a ticket to remove this as soon as it's fixed on the backend? Has the backend prioritized fixing this, or is this monkey patch going to delay the backend getting the proper fix in?
There was a problem hiding this comment.
I'm going to leave the code in place while I personally investigate the backend behaviors. Depending on what is found we can decide to revert the monkey-patch behavior, or adjust as needed to correctly detect no dda scenarios with both polling and/or websockets.
Best scenario, is that we can figure out the complete picture back-to-front, and then move this entire set of changes forward after we apply whatever fix is necessary.
This PR implements some edge-case fixes that have been reported by customers, which cause the user to get stuck on the Connecting screen, or get the wrong screen.
Fixes
Test Results
✅ Vitest
✅ New Playwright OAuth tests
✅ Cypress Tests (when npm linked to these changes)