22 changes: 22 additions & 0 deletions src/lib/snyk-test/common.ts
@@ -111,6 +111,28 @@ export type FailOn = 'all' | 'upgradable' | 'patchable';
export const RETRY_ATTEMPTS = 3;
export const RETRY_DELAY = 500;

const DEFAULT_REQUEST_CONCURRENCY = 10;
Contributor:

Thinking: Changing a default without visibility into the behaviour makes me wonder if we should be more careful — we currently don't have any metrics on CPU consumption. The impact is quite large, as it affects all SCA and Container scans.

Contributor Author:

I don't think there'll be significant CPU impact. Node executes JS on a single thread, and the code path this PR touches spends nearly all its time waiting on HTTP responses, not in JS execution.

On the "affects all SCA and Container scans" concern, I agree. A few things that make me comfortable with the bump:

  • The headroom before throttling appears large (I put more discussion here). The api-gateway rate limit on /v1/test* and /v1/monitor* is keyed per principal_id (per token), with a default high bucket of 200 req/s burst, 2000/min, and 60000/hour. Even at the override ceiling (50), a single CLI scan stays at ≤25% of the per-second cap.
  • 5 is a conservatively low bound and 10 is still quite modest as a default.
  • OS flows team reviewed and approved the PR 😄
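The headroom claim in the first bullet can be checked with a back-of-envelope sketch. The per-second cap comes from the rate-limit figures above; the ~1 s average request duration is an assumption for illustration, not a measurement:

```typescript
// Back-of-envelope headroom check for the override-ceiling case.
const perSecondCap = 200; // api-gateway burst limit, req/s per token
const concurrencyCeiling = 50; // MAX_REQUEST_CONCURRENCY override ceiling
const assumedRequestSeconds = 1; // hypothetical mean request latency

// With N requests in flight and ~1 s each, steady-state throughput is ~N req/s.
const worstCaseReqPerSec = concurrencyCeiling / assumedRequestSeconds;
const utilisation = worstCaseReqPerSec / perSecondCap;

console.log(utilisation); // 0.25, i.e. ≤25% of the per-second cap
```

Shorter requests would push the request rate higher than this, so the 25% figure holds only under the stated latency assumption.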

Other options:

  • we could land a 5 → 8 bump first and revisit after we have telemetry?
  • add a feature-flag-style override/escape hatch to drop it back to 5
  • any other ideas?

What do you think?

Contributor:


Makes sense for a single CLI run. Many of our users actually run the CLI highly concurrently and already experience rate limiting. In combination, in the best case the increased request parallelism and the rate limiting might just cancel out; in the worst case, the rate limiting will eventually exceed the retry limit and fail the CLI.

const MIN_REQUEST_CONCURRENCY = 1;
const MAX_REQUEST_CONCURRENCY = 50;

/**
 * Returns the maximum number of in-flight Snyk dependency-test or
 * dependency-monitor HTTP requests permitted at once. Override with the
 * SNYK_REQUEST_CONCURRENCY environment variable; values are clamped to
 * [MIN_REQUEST_CONCURRENCY, MAX_REQUEST_CONCURRENCY].
 */
export function getRequestConcurrency(): number {
Contributor:


Suggestion: There is already a similar configuration value in the Golang CLI. While this is currently not publicly documented, I think we should inject it into the Legacy CLI and make the Golang configuration the single source of truth for this. We could also expose the config value, but would probably want to frame it around SNYK_MAX_CONCURRENCY rather than focusing on requests.

Contributor Author:


Thanks for the pointers. I didn't know about this config value. I agree having a single source of truth here makes sense. Before I plumb it through, I want to flag a semantic concern.

MAX_THREADS is documented as "maximum number of threads that can be used by Extensions" and defaults to runtime.NumCPU(). That makes sense for CPU-bound parallelism.

What this PR controls is HTTP request concurrency — how many in-flight /test-dependencies / /monitor-dependencies requests we allow at once. That's network I/O, not CPU-bound, so runtime.NumCPU() is a poor proxy and also somewhat unrelated to the number of threads being used.

Binding HTTP concurrency to MAX_THREADS would make the default machine-dependent. It would also tie an HTTP knob to a CPU knob, where in principle we might want to tune them independently (e.g., a small machine could still benefit from high HTTP concurrency without needing many CPU threads).

A few options I see:

  1. Keep as separate config for now (SNYK_MAX_CONCURRENCY, Node-side env-var read). Acknowledges the semantic split; loses centralization.
  2. Bind to MAX_THREADS but override the default for the legacy CLI bridge (e.g., max(MAX_THREADS, 10)). Preserves the source-of-truth idea for users who deliberately set it high, and also preserves the functional change in this PR. However, it hijacks a single variable for two different uses.
  3. Create a new go-application-framework config key (MAX_HTTP_CONCURRENCY or similar) that's separate from MAX_THREADS. Cleanest semantically; larger scope.

Am I understanding things correctly? What are your preferences from a CLI architecture perspective?
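Option 2 amounts to a one-line clamp. A hypothetical sketch (illustrative only; `bridgedConcurrency` is not actual code from either CLI):

```typescript
// Hypothetical option-2 bridge: derive the legacy CLI's request
// concurrency from MAX_THREADS, but never fall below the new default.
const LEGACY_DEFAULT_REQUEST_CONCURRENCY = 10;

function bridgedConcurrency(maxThreads: number): number {
  // Users who deliberately set MAX_THREADS high get their value;
  // small machines still get the higher HTTP default.
  return Math.max(maxThreads, LEGACY_DEFAULT_REQUEST_CONCURRENCY);
}

console.log(bridgedConcurrency(4)); // 10 on a 4-core machine
console.log(bridgedConcurrency(32)); // 32 when MAX_THREADS is set high
```

The asymmetry is the "hijacking" concern from option 2: a user lowering MAX_THREADS to reduce CPU use would not lower HTTP concurrency at all.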

Contributor:


I think semantically and technically you are right and point 3 would make sense.

Besides this, there is a usability question: as a user specifying MAX_HTTP_CONCURRENCY, I would expect that value to actually be the maximum.
Given the current codebase, I don't know how many requests actually happen in parallel, or whether we can guarantee this. A solution could be to limit it in the network stack, but that could have unforeseen impact as well.

Architecturally, application configuration via environment variables and general configuration handling is done through the Golang CLI and its configuration package. That way, the decision on where values come from (env var, config file, cmd arg, ...) is left to the application. Making this available in the TypeScript CLI via internal environment variables makes sense, though.

Contributor:


Is there an option to separate concerns and decouple the configurability from the increase of the default value?

  const raw = process.env.SNYK_REQUEST_CONCURRENCY;
  if (!raw) {
    return DEFAULT_REQUEST_CONCURRENCY;
  }
  const parsed = parseInt(raw, 10);
  if (!Number.isFinite(parsed) || parsed < MIN_REQUEST_CONCURRENCY) {
    return DEFAULT_REQUEST_CONCURRENCY;
  }
  return Math.min(parsed, MAX_REQUEST_CONCURRENCY);
}

/**
 * printDepGraph writes the given dep-graph and target name to the destination
 * stream as expected by the `depgraph` CLI workflow.
6 changes: 2 additions & 4 deletions src/lib/snyk-test/run-test.ts
@@ -39,6 +39,7 @@ import { isCI } from '../is-ci';
import {
  RETRY_ATTEMPTS,
  RETRY_DELAY,
  getRequestConcurrency,
  printDepGraph,
  printEffectiveDepGraph,
  printEffectiveDepGraphError,
@@ -94,9 +95,6 @@ import { ProblemError } from '@snyk/error-catalog-nodejs-public';

const debug = debugModule('snyk:run-test');

// Controls the number of simultaneous test requests that can be in-flight.
const MAX_CONCURRENCY = 5;

function prepareResponseForParsing(
  payload: Payload,
  response: TestDependenciesResponse,
@@ -293,7 +291,7 @@ async function sendAndParseResults(
  };

  const responses = await pMap(payloads, sendRequest, {
    concurrency: MAX_CONCURRENCY,
    concurrency: getRequestConcurrency(),
  });

  for (const { payload, originalPayload, response } of responses) {
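For reference, the guarantee the `pMap` call relies on — at most `concurrency` mapper promises in flight at once, with results in input order — can be sketched without the library. This is an illustrative stand-in, not p-map's actual implementation:

```typescript
// Illustrative stand-in for p-map's `concurrency` semantics: at most
// `limit` mapper calls in flight at once, results in input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  mapper: (item: T) => Promise<R>,
  limit: number,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let nextIndex = 0;

  // Each worker claims the next unclaimed index; JS's single-threaded
  // event loop makes the `nextIndex++` claim race-free.
  async function worker(): Promise<void> {
    while (nextIndex < items.length) {
      const i = nextIndex++;
      results[i] = await mapper(items[i]);
    }
  }

  const workers = Array.from(
    { length: Math.max(1, Math.min(limit, items.length)) },
    () => worker(),
  );
  await Promise.all(workers);
  return results;
}
```

With `limit` set to the request concurrency, raising the value only increases how many HTTP requests wait in flight; the mapper's JS execution stays serialized on the event loop, which is why the CPU impact is expected to be small.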
62 changes: 61 additions & 1 deletion test/jest/unit/lib/snyk-test/common.spec.ts
@@ -1,7 +1,10 @@
import { CLI, ProblemError } from '@snyk/error-catalog-nodejs-public';
import { CustomError } from '../../../../../src/lib/errors';
import { FailedProjectScanError } from '../../../../../src/lib/plugins/get-multi-plugin-result';
import { getOrCreateErrorCatalogError } from '../../../../../src/lib/snyk-test/common';
import {
  getOrCreateErrorCatalogError,
  getRequestConcurrency,
} from '../../../../../src/lib/snyk-test/common';

describe('getOrCreateErrorCatalogError', () => {
  const defaultErrMessage = 'Default error message';
@@ -116,3 +119,60 @@ describe('getOrCreateErrorCatalogError', () => {
    expect(result.detail).toBe(defaultErrMessage);
  });
});

describe('getRequestConcurrency', () => {
  const originalValue = process.env.SNYK_REQUEST_CONCURRENCY;

  afterEach(() => {
    if (originalValue === undefined) {
      delete process.env.SNYK_REQUEST_CONCURRENCY;
    } else {
      process.env.SNYK_REQUEST_CONCURRENCY = originalValue;
    }
  });

  it('returns the default of 10 when SNYK_REQUEST_CONCURRENCY is unset', () => {
    delete process.env.SNYK_REQUEST_CONCURRENCY;
    expect(getRequestConcurrency()).toBe(10);
  });

  it('returns the default of 10 when SNYK_REQUEST_CONCURRENCY is empty', () => {
    process.env.SNYK_REQUEST_CONCURRENCY = '';
    expect(getRequestConcurrency()).toBe(10);
  });

  it('returns the parsed value when SNYK_REQUEST_CONCURRENCY is a valid integer', () => {
    process.env.SNYK_REQUEST_CONCURRENCY = '15';
    expect(getRequestConcurrency()).toBe(15);
  });

  it('clamps to the maximum of 50 when SNYK_REQUEST_CONCURRENCY exceeds the cap', () => {
    process.env.SNYK_REQUEST_CONCURRENCY = '500';
    expect(getRequestConcurrency()).toBe(50);
  });

  it('returns the default when SNYK_REQUEST_CONCURRENCY is below the minimum', () => {
    process.env.SNYK_REQUEST_CONCURRENCY = '0';
    expect(getRequestConcurrency()).toBe(10);
  });

  it('returns the default when SNYK_REQUEST_CONCURRENCY is negative', () => {
    process.env.SNYK_REQUEST_CONCURRENCY = '-5';
    expect(getRequestConcurrency()).toBe(10);
  });

  it('returns the default when SNYK_REQUEST_CONCURRENCY is non-numeric', () => {
    process.env.SNYK_REQUEST_CONCURRENCY = 'abc';
    expect(getRequestConcurrency()).toBe(10);
  });

  it('honors the minimum boundary', () => {
    process.env.SNYK_REQUEST_CONCURRENCY = '1';
    expect(getRequestConcurrency()).toBe(1);
  });

  it('honors the maximum boundary', () => {
    process.env.SNYK_REQUEST_CONCURRENCY = '50';
    expect(getRequestConcurrency()).toBe(50);
  });
});