Specify timeout when running cmd_list in CleanUnknownWorker#182
Conversation
48a3478 to
a29dcc0
Compare
|
I have an easy reproducer with all corner-cases here I am planning to incorporate the code into this PR if we agree that the approach is correct. |
a29dcc0 to
ecbfc3a
Compare
2f46abe to
6baa202
Compare
| while True: | ||
| # How many seconds are left before we want to timeout | ||
| remaining = deadline - time.monotonic() | ||
| if remaining <= 0: |
There was a problem hiding this comment.
could we put here some log message that tells it timeouted? It could be useful for debugging
| remaining = deadline - time.monotonic() | ||
| if remaining <= 0: | ||
| sp.kill() | ||
| break |
There was a problem hiding this comment.
on line 108 there is return {'status': 124} which returned 124 on timeout.... now because we break here, it exits but with a -9 as sigkill signal instead (eventhough the reason was timeout)
| break | ||
|
|
||
| return { | ||
| 'status': sp.wait(), |
There was a problem hiding this comment.
to fix 6baa202#r3201157553 I think we need to add here 124 if timed_out else sp.wait(), to be consistent
| # We encountered EOF but we still have unprocessed bytes | ||
| elif buffer: | ||
| # Treat partial line as a complete line | ||
| # TODO do we want this or rather discard the incomplete line? |
There was a problem hiding this comment.
I would discard it for catch_stdout_lines_securely. A truncated resource name would end up in the unknown_resources set and trigger a bogus cmd_delete.
A truncated line like vm-00<MISSING> (instead of e.g. vm-002) would end up in unknown_resources and trigger a cmd_delete for a resource that either doesn't exist or shouldn't be deleted.
| # TODO do we want this or rather discard the incomplete line? | |
| if not catch_stdout_lines_securely: | |
| buffer += b'\n' |
|
Thank you for the review @nikromen, all good points. Updated, PTAL |
|
|
||
| # Either we get a readable file descriptors back or we get nothing | ||
| # because the `select` timeouted on `read` | ||
| ready, _, _ = select.select([sp.stdout], [], [], remaining) |
There was a problem hiding this comment.
please use fileno() here, too, just so we are consistent
| catch_stdout_bytes=5120, | ||
| catch_stdout_lines_securely=True, | ||
| timeout=120, | ||
| discard_partial_line=False, |
There was a problem hiding this comment.
shouldn't this be true? With catch_stdout_lines_securely=True, we don't want a truncated resource name to end up in unknown_resources and trigger a bogus cmd_delete.
Fix #178