Skip to content

feat: autoclaim L1 -> L2#1644

Draft
arnaubennassar wants to merge 16 commits into
developfrom
feat/autoclaim-plan
Draft

feat: autoclaim L1 -> L2#1644
arnaubennassar wants to merge 16 commits into
developfrom
feat/autoclaim-plan

Conversation

@arnaubennassar

Copy link
Copy Markdown
Collaborator

🔄 Changes Summary

  • [Brief description of what was added, fixed, or changed in this PR]

⚠️ Breaking Changes

  • 🛠️ Config: [Optional: Changes to TOML config]
  • 🔌 API/CLI: [Optional: Breaking interface changes]
  • 🗑️ Deprecated Features: [Optional: Removed features]

📋 Config Updates

  • 🧾 Diff/Config snippet: [Optional: Enumerate config changes/provide snippet of changes]

✅ Testing

  • 🤖 Automatic: [Optional: Enumerate E2E tests]
  • 🖱️ Manual: [Optional: Steps to verify]

🐞 Issues

  • Closes #[issue-number]

🔗 Related PRs

  • [Optional: Enumerate related pull requests]

📝 Notes

  • [Optional: design decisions, tradeoffs, or TODOs]

@arnaubennassar arnaubennassar marked this pull request as draft June 8, 2026 09:06
@claude

claude Bot commented Jun 8, 2026

Copy link
Copy Markdown

Claude finished @arnaubennassar's task in 2m 59s —— View job


Code Review: Auto Claim L1 to L2 Feature

I've completed a comprehensive review of this PR introducing the Auto Claim service for automated L1 to L2 bridge claims. This is a substantial feature addition with ~14K lines of new code across 77 files.


🚀 What's New

This PR implements an automated bridge claim system that:

  • Automatically discovers L1 bridge exits from l1bridgesync
  • Evaluates configurable policies (allow-all, api-approve, no-message, basic-filter)
  • Prepares L1 claim proofs using L1 info tree data
  • Submits claim transactions to destination L2 networks via EthTxManager
  • Provides optional REST API for manual approval workflows
  • Tracks request lifecycle from detection through confirmation in SQLite storage

The implementation includes comprehensive unit tests (10 test files), E2E tests, and detailed documentation.


🐛 Bug Fixes

None - this is a new feature.


📋 Config Updates

New top-level config section: [AutoClaim]

[AutoClaim]
Enabled = false  # Default: disabled
StoragePath = "{{PathRWData}}/autoclaim.sqlite"

[AutoClaim.API]
Enabled = false
Host = "0.0.0.0"
Port = 5579

[AutoClaim.L1ToL2Watchdog]
Enabled = true
PollInterval = "3s"
RetryAfterErrorPeriod = "1s"
MaxRetryAttemptsAfterError = -1
EtrogL1UpgradeBlock = 0

[AutoClaim.L2ToLxWatchdog]
Enabled = false  # Must stay false - not implemented

[[AutoClaim.Claimers]]
Enabled = true
ID = "l2-primary"
NetworkType = "EVM"
NetworkID = 1
URLRPC = "http://l2-rpc:8545"
BridgeAddr = "0x..."
PolicyName = "allow-all"  # Options: allow-all, api-approve, no-message, basic-filter
GasOffset = 100000
WaitPeriod = "1s"
RetryAfter = "1s"
MaxRetries = 30

[AutoClaim.Claimers.Policy]
AllowMessageClaims = false
AllowedOrigins = [0]
AllowedTokens = []
ManualFallback = false
MaxGas = 500000

[AutoClaim.Claimers.EthTxManager]
# Standard EthTxManager config per claimer
StoragePath = "/var/lib/aggkit/ethtxmanager-autoclaim-l2-primary.sqlite"
# ... other EthTxManager fields

⚠️ Breaking Changes

None - The feature is disabled by default and requires explicit opt-in via:

  1. Selecting the autoclaim component
  2. Setting AutoClaim.Enabled = true

🔴 Required Issues

1. SQL Injection Risk in List Queries (autoclaim/storage/storage.go:714, 722)

Severity: Medium - Currently safe due to implementation, but fragile pattern.

The code concatenates WHERE clauses into SQL queries:

countQuery := "SELECT COUNT(*) FROM autoclaim_request" + where
query := selectRequestSQL() + where + " ORDER BY block_num DESC..."

While buildRequestWhereClause() properly uses parameterized queries with ? placeholders, the string concatenation pattern is error-prone and could lead to SQL injection if modified incorrectly in the future.

Recommendation: Refactor to make parameterization more explicit or add comments emphasizing that where must only contain ? placeholders, never user-supplied values.

2. Missing context.Background() Justification (9 files)

Severity: Low - Found in test files and one production file (autoclaim/api/api.go:163)

Test files appropriately use context.Background(), but autoclaim/api/api.go:163 creates a background context for shutdown timeout:

shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)

Recommendation: This is acceptable for graceful shutdown, but consider documenting why the parent context isn't used.

3. Acknowledged E2E Test Failures (docs/autoclaim/P14_LOG.md)

Severity: High - Per the PR's own documentation, E2E tests are failing.

The PR documentation states:

  • make lint fails with 28 issues including Auto Claim source/test findings
  • E2E tests fail with requests reaching failed status and HTTP 500 errors
  • Tests blocked on "bridge not included in L1 Info Tree" timeouts

Recommendation: These must be resolved before merge. The PR is marked as blocked in its own documentation (P14_LOG.md).


🟡 Suggested Improvements

1. Error Wrapping Consistency (Multiple files)

Some errors use fmt.Errorf with %w, others don't consistently wrap. The codebase follows the pattern well overall, but ensure all errors that should be inspected with errors.Is() use %w.

2. Race Condition in Concurrent Access (autoclaim/claimer/claimer.go)

The Claimer.Recover() method iterates over recoverable requests and calls Advance() on each. If multiple claimers share storage, concurrent updates could conflict. Consider adding transaction isolation or documenting single-claimer-per-destination requirement.

Current mitigation: Configuration validation prevents duplicate destination networks (config.go:119-121), so this is likely safe, but worth documenting.

3. Magic Numbers in Tests (test/e2e/autoclaim_test.go:24)

autoClaimKeystorePass = "pSnv6Dh5s9ahuzGzH9RoCDrKAMddaX3m"

While this is a test-only password, consider documenting its source or purpose.

4. Proof Preparer Binary Search (autoclaim/proof/preparer.go:20)

const binarySearchDivider = 2

This constant name could be clearer (e.g., binarySearchBranchingFactor), though the current name is acceptable.

5. API Error Responses Could Include Request IDs (autoclaim/api/api.go)

When API operations fail, including the request ID in error responses would aid debugging. Current errors are clear but could be enhanced.


🟢 Optional Enhancements

1. Consider Adding Metrics/Telemetry

This is a production-critical automation service that handles financial transactions. Consider adding:

  • Prometheus metrics for request counts by status
  • Latency histograms for proof preparation and claim submission
  • Alert-worthy counters for policy rejections and failures

2. Policy Simulation Without Submission

For debugging, a "dry-run" mode where policies are evaluated but claims aren't submitted could be valuable.

3. Request Replay/Retry API

If a request reaches failed status, operators currently lack an API to retry it without manual database modification.


✅ Strengths

🎯 Excellent Architecture

  • Clean separation of concerns with well-defined interfaces (types/interfaces.go)
  • Proper dependency injection throughout
  • Testable components with mock generation via mockery
  • Clear lifecycle state machine with transition validation

🔒 Good Security Practices

  • Parameterized SQL queries throughout storage layer
  • No credential exposure in logs or config files
  • Proper use of EthTxManager for transaction signing
  • Policy framework prevents unbounded gas consumption

📚 Comprehensive Documentation

  • Detailed operator documentation (docs/autoclaim.md)
  • Architecture boundaries clearly documented (docs/autoclaim-boundaries.md)
  • Extensive development logs (P1-P14 logs) show thoughtful iteration
  • Clear configuration examples with explanations

🧪 Strong Test Coverage

  • 10 dedicated test files for autoclaim package
  • Unit tests for all major components (claimer, sender, storage, policy, proof, watchdog)
  • E2E tests for allow-all and api-approve workflows
  • Mock implementations for all interfaces

🛡️ Defensive Programming

  • Extensive error checking and wrapping
  • Proper resource cleanup (database connections, HTTP servers)
  • Graceful degradation (e.g., proof preparer returns "not ready" instead of failing)
  • Duplicate request deduplication via unique keys

📊 Robust Storage Layer

  • Proper database migrations with up/down support
  • Atomic state transitions with precondition checks
  • Idempotent operations (enqueue, cursor save)
  • Comprehensive indexing for performance

🔍 Code Quality Assessment

Go Best Practices Compliance: ✅ Excellent

  • Follows repo's CLAUDE.md guidelines
  • Proper CamelCase naming conventions
  • Correct import grouping (stdlib, external, internal)
  • Uses require in tests (not assert) as specified
  • Error wrapping with context
  • Interfaces defined where needed

SQL Safety: ✅ Good (with one caveat noted above)

  • All queries use parameterized placeholders
  • No string interpolation of user input
  • Proper null handling with sql.NullString, sql.NullInt64

Concurrency Safety: ⚠️ Adequate

  • No obvious data races
  • Context propagation is correct
  • Configuration validates unique network IDs per claimer
  • Could benefit from explicit documentation of concurrency assumptions

Resource Management: ✅ Excellent

  • Proper defer usage for cleanup
  • Database connection pooling through sql.DB
  • HTTP server graceful shutdown
  • Transaction manager lifecycle handled correctly

📝 Additional Notes

Design Decisions (Well-Justified)

  1. SQLite for storage - Appropriate for single-node operation, avoids external dependencies
  2. Separate EthTxManager per claimer - Proper isolation for concurrent destination handling
  3. Hold cursor on GER unavailability - Prevents skipping claimable bridges, correct tradeoff
  4. Conservative basic-filter policy - Falls back to manual when simulation unavailable, safety-first approach

Known Limitations (Documented)

  1. L2 to Lx direction not implemented (clearly documented as disabled)
  2. Pre-Etrog zkEVM handling requires explicit configuration
  3. E2E tests currently blocked (acknowledged in P14_LOG.md)

🎯 Verdict

This is high-quality work with excellent architecture, comprehensive testing, and thorough documentation. The code demonstrates careful design and defensive programming.

However, the PR should not be merged until:

  1. 🔴 E2E test failures are resolved - The PR's own documentation (P14_LOG.md) acknowledges that focused E2E tests are failing and the feature is blocked
  2. 🔴 Lint issues are fixed - 28 lint findings need resolution
  3. 🟡 SQL concatenation pattern is reviewed - Low risk but should be hardened or documented

Once these blockers are addressed, this will be an excellent addition to the codebase.


🚀 Recommendation

APPROVE with Required Changes - Outstanding architectural work that needs completion of validation criteria before merge.

Next Steps:

  1. Resolve E2E test failures (GER injection timing, bridge inclusion index)
  2. Fix lint findings (28 issues per P14_LOG.md)
  3. Verify SQL safety patterns or add protective comments
  4. Re-run full validation suite (make build && make lint && make test-unit && E2E tests)

Comment thread autoclaim/storage/storage.go Fixed
@arnaubennassar arnaubennassar changed the title Feat/autoclaim plan feat: autoclaim plan Jun 9, 2026
@arnaubennassar arnaubennassar self-assigned this Jun 9, 2026
# Conflicts:
#	multidownloader/evm_multidownloader_reorg.go
#	multidownloader/evm_multidownloader_reorg_test.go
Comment thread autoclaim/storage/storage.go Fixed
@arnaubennassar arnaubennassar changed the title feat: autoclaim plan feat: autoclaim L1 -> L2 Jun 11, 2026
arnaubennassar and others added 8 commits June 11, 2026 10:57
…tic logs

- config/config_test.go: move testify/urfave imports to end of non-stdlib
  group to satisfy gci two-group ordering (fixes CI lint failure)
- config/default.go: set AutoClaim.L1ToL2Watchdog Enabled = true to match
  the assertion in TestLoadDefaultConfig (fixes unit test failure)
- autoclaim/proof/preparer.go: add debug logs in selectL1InfoTreeIndex to
  surface whether the proof is blocked by l1InfoTree lag or gerSyncer lag
- autoclaim/claimer/claimer.go: log info when proof not ready so E2E logs
  clearly show the pending state on each poll cycle

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rackedBlocks wipe

reorgDetector.Start calls loadTrackedHeaders which replaces the in-memory
trackedBlocks map entirely from DB. If l2gersync.New (which calls Subscribe
internally) ran first, the subscription entry it added would be wiped by the
subsequent loadTrackedHeaders call, causing every AddBlockToTrack to return
"subscriber not subscribed" forever.

The per-claimer l2gersync was stuck in a retry loop at the first GER injection
block (block 24), never processing the GER that covered the bridge deposit, so
GetFirstGERAfterL1InfoTreeIndex always returned ErrNotFound and the claim proof
was never prepared.

Fix: call reorgDetector.Start synchronously before l2gersync.New so that
loadTrackedHeaders completes before Subscribe adds the entry.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@sonarqubecloud

sonarqubecloud Bot commented Jun 11, 2026

Copy link
Copy Markdown

Quality Gate Failed Quality Gate failed

Failed conditions
75.2% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants