Skip to content

perf(oss): reduce conversion allocations [IDE-1940]#1255

Merged
bastiandoetsch merged 5 commits intomainfrom
perf/IDE-1940-oss-conversion-caches
May 9, 2026
Merged

perf(oss): reduce conversion allocations [IDE-1940]#1255
bastiandoetsch merged 5 commits intomainfrom
perf/IDE-1940-oss-conversion-caches

Conversation

@bastiandoetsch
Copy link
Copy Markdown
Collaborator

@bastiandoetsch bastiandoetsch commented May 6, 2026

User description

Description

Reduces OSS conversion allocation pressure by indexing matching vulnerabilities, memoizing repeated extended messages, caching parsed URLs, and pre-sizing reference slices.

This is split PR 4 from performance/IDE-1940_optimize-memory-footprint. It is temporarily stacked on PR #1250 so the pre-push/test-hook stabilizers from the fixture harness PR are present; after PR #1250 lands this PR can be retargeted to main.

Split Context

flowchart LR
  main[origin/main] --> pr1["PR #1250: test fixture + megaproject harness"]
  pr1 --> pr2["PR #1251: Code SARIF streaming"]
  pr1 --> pr3["PR #1253: OSS CLI streaming"]
  pr1 --> pr4["PR #1255: OSS conversion caches"]
  main --> pr5["PR TBD: IssueCache index + memory backend"]
  pr5 --> pr6["PR TBD: Bolt backend + scanner integration"]
  pr6 --> pr7["PR TBD: Workspace/product cache clearing"]
  main --> pr8["PR TBD: Small hotpath optimizations"]
  main --> pr9["PR TBD: Perf docs + profiling history"]
Loading

Scope

  • Indexes OSS vulnerabilities by ID before matching related issues.
  • Memoizes repeated OSS extended-message rendering.
  • Caches parsed URLs and pre-sizes OSS reference slices.
  • Adds focused OSS tests for matching issue parity, memoized messages, and URL parse cache behavior.

Verification

  • go test -count=1 -v ./infrastructure/oss (build/pr4_focused_oss_tests.log)
  • make lint (build/pr4_make_lint.log)
  • Pre-push hook: lint-fix and unit tests passed during git push --set-upstream origin perf/IDE-1940-oss-conversion-caches

Checklist

  • Tests added and all succeed
  • Regenerated mocks, etc. (make generate)
  • Linted (make lint-fix)
  • README.md updated, if user-facing
  • License file updated, if new 3rd-party dependency is introduced

PR Type

Enhancement


Description

  • Index OSS vulnerabilities by ID for faster matching.

  • Memoize extended message rendering for performance.

  • Cache parsed URLs to reduce overhead.


File Walkthrough

Relevant files
Enhancement
4 files
issue.go
Index vulnerabilities by ID for efficient matching.           
+27/-14 
types.go
Implement memoization for extended messages and URL parsing.
+64/-4   
unified_converter.go
Integrate URL parsing cache for references and issue URLs.
+13/-17 
url_parse_cache.go
Introduce URL parsing cache mechanism.                                     
+45/-0   
Tests
4 files
issue_test.go
Add tests for vulnerability ID indexing and matching.       
+119/-0 
oss_test.go
Adjust tests for toIssue and URL creation.                             
+3/-3     
types_extended_message_test.go
Test memoization of extended message generation.                 
+108/-0 
url_parse_cache_test.go
Test URL parsing cache functionality.                                       
+51/-0   

@bastiandoetsch
Copy link
Copy Markdown
Collaborator Author

/describe

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented May 6, 2026

CLA assistant check
All committers have signed the CLA.

@snyk-io
Copy link
Copy Markdown

snyk-io Bot commented May 6, 2026

Snyk checks have passed. No issues have been found so far.

Status Scan Engine Critical High Medium Low Total (0)
Open Source Security 0 0 0 0 0 issues
Licenses 0 0 0 0 0 issues
Code Security 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

@snyk-pr-review-bot
Copy link
Copy Markdown

PR Description updated to latest commit (66becca)

@bastiandoetsch
Copy link
Copy Markdown
Collaborator Author

Addressed the OSS cache review finding:

  • Bounded extended-message memoization with maxExtendedMessageCacheEntries.
  • Bounded URL parse memoization with maxParsedURLStringCacheEntries.
  • Added tests that exercise the cache caps for repeated unique OSS messages/URLs.
  • Rebased onto the latest test: add megaproject fixture and smoke harness [IDE-1940] #1250 base after origin/main was merged there.

Verification:

  • go test -count=1 -v ./infrastructure/oss
  • make lint
  • pre-push lint-fix and unit tests passed during force-with-lease push

@bastiandoetsch bastiandoetsch added the enhancement New feature or request label May 6, 2026
Base automatically changed from perf/IDE-1940-megaproject-harness to main May 6, 2026 17:14
Build map[id][]index once per convertScanResultToIssues and pass same-ID
indices into toIssue instead of scanning all Vulnerabilities per issue.

Adds issue_test coverage for index grouping and same-id MatchingIssues
parity; updates direct toIssue unit tests to pass nil indices.
Cache repeated OSS extended-message rendering for identical vulnerability rows and add focused coverage for the memoized behavior.
Checkpoint 16: memoize url.Parse per distinct URL string (sync.Map + shallow copy per consumer). Pre-size reference slices in toReferences and extractReferences. Route CreateIssueURL and unified createIssueURL through the cache.

Tests: url_parse_cache_test for cache semantics and CreateIssueURL.
Cap OSS extended-message and URL parse memoization so repeated unique scan data cannot grow package-level caches without limit in long-lived language-server sessions.
@bastiandoetsch bastiandoetsch force-pushed the perf/IDE-1940-oss-conversion-caches branch from a477253 to 29307b0 Compare May 7, 2026 05:44
@bastiandoetsch bastiandoetsch marked this pull request as ready for review May 7, 2026 15:47
@bastiandoetsch bastiandoetsch requested a review from a team as a code owner May 7, 2026 15:47
@bastiandoetsch bastiandoetsch requested review from a team as code owners May 7, 2026 15:47
@snyk-pr-review-bot

This comment has been minimized.

@snyk-pr-review-bot
Copy link
Copy Markdown

PR Reviewer Guide 🔍

🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Shallow Copy Risk 🟡 [minor]

The urlParseCachedCopy function returns a shallow copy of *url.URL. While this protects the top-level fields, url.URL contains nested pointers like User *Userinfo. Any caller that modifies the Userinfo of the returned URL will inadvertently mutate the shared instance inside parsedURLStringCache.values. While current use cases in toReference and CreateIssueURL appear to be read-only, this creates a subtle trap for future callers expecting a truly independent object (as url.Parse provides).

copied := *orig
return &copied, true
Pre-sizing Mismatch 🟡 [minor]

In extractReferences, the references slice is pre-sized to len(problem.References), but elements are added via append inside a loop that can continue on parse errors. This results in the final slice having trailing empty capacity (zero-valued types.Reference structs) equal to the number of failed parses. Although append is used, the pre-allocation should ideally be make([]types.Reference, 0, len(problem.References)) to avoid redundant allocations while ensuring the slice length correctly matches the count of successfully parsed URLs.

references := make([]types.Reference, 0, len(problem.References))
📚 Repository Context Analyzed

This review considered 21 relevant code sections from 13 files (average relevance: 0.98)

Copy link
Copy Markdown
Contributor

@acke acke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oss_integration_test is failing

@bastiandoetsch bastiandoetsch merged commit e710fa1 into main May 9, 2026
27 of 28 checks passed
@bastiandoetsch bastiandoetsch deleted the perf/IDE-1940-oss-conversion-caches branch May 9, 2026 11:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants