test(agent): cover netprobe add-on activation rollback paths (#3425) #3498

Closed
mfreeman451 wants to merge 0 commits from feat/netprobe-rollback-tests into staging
Owner

What

migrate-netprobe-to-native-addon §2.4 — test the activation rollback paths so a failed netprobe add-on activation never leaves a half-installed/enabled unit or a running-but-incapable process.

The rollback behavior already existed, but the post-stage orchestration in applySystemdAddon was untestable: it hardcoded the artifact root and shelled out to the root-owned updater.

  • Refactor (behavior-preserving): extract the post-stage unit reconcile/rollback into reconcileStagedSystemdUnits with an injectable installer (installUnitsFn) + a threaded runtime root. applySystemdAddon still passes runtimeRoot="" + the real updater installer — production path unchanged.
  • Tests (push_loop_addon_rollback_test.go):
    • install-failure → current rolled back to the prior version, no remembered units;
    • no units in the staged bundle → rollback without an install attempt;
    • success → stays on the new version + records the installed units.

The capability-application failure path already rolls back inside stageAndCapability (covered there); this PR covers the remaining unit discovery, selection, and install paths.

Validation

  • go test ./go/pkg/agent/ -run TestReconcileStagedSystemdUnits — 3/3 pass; go vet clean.
  • bazel build //go/pkg/agent:agent_test — green (new test registered in the go_test srcs).

Note: //:gazelle is currently broken on staging by an unrelated root-BUILD reference (#3496 made scripts/ a bazel package, invalidating srcs = ["scripts/swiftlint_bazel.sh"]). Fixed in a separate PR; here the test file was registered by hand.

🤖 Generated with Claude Code

test(agent): cover netprobe add-on activation rollback paths (#3425)
Some checks failed
Secret Scan / gitleaks (pull_request) Successful in 32s
lint / lint (push) Failing after 59s
Golang Tests / test-go (push) Successful in 1m26s
lint / lint (pull_request) Failing after 1m29s
CI / build (pull_request) Failing after 1m47s
264bf8a15e
migrate-netprobe §2.4. The systemd add-on activation rolls `current` back on failure
so a failed activation never leaves a half-installed/enabled unit, but the post-stage
orchestration in applySystemdAddon was untestable (hardcoded artifact root + the
root-owned-updater install shell-out).

- Extract the post-stage unit reconcile/rollback into reconcileStagedSystemdUnits with
  an injectable installer (installUnitsFn) + a threaded runtime root. Behavior-preserving:
  applySystemdAddon still passes runtimeRoot="" + the real updater installer.
- Tests (push_loop_addon_rollback_test.go): install-failure -> `current` rolled back to
  the prior version (no remembered units); no-units-in-bundle -> rollback without an
  install attempt; success -> stays on the new version + records the installed units.

(The capability-application failure path already rolls back inside stageAndCapability.)

Validated: go test ./go/pkg/agent/ (the 3 new tests) + go vet clean;
bazel build //go/pkg/agent:agent_test green (test registered in the go_test srcs —
gazelle is currently blocked by an unrelated root-BUILD break from #3496, fixed separately).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
mfreeman451 left a comment

lgtm
fix(agent): use static rollback test error
Some checks failed
Helm Lint / Helm Lint (pull_request) Successful in 13s
Secret Scan / gitleaks (pull_request) Successful in 38s
lint / lint (push) Successful in 1m57s
lint / lint (pull_request) Successful in 1m51s
Fingerprint Licensing / netprobe-fingerprint-licenses (pull_request) Successful in 1m19s
Golang Tests / test-go (push) Successful in 2m52s
CI / build (pull_request) Failing after 2m34s
Elixir Quality / Elixir Quality (pull_request) Failing after 31m17s
c3df466d9d
mfreeman451 closed this pull request 2026-06-01 16:37:50 +00:00

Pull request closed

No reviewers
No milestone
No project
No assignees
1 participant
Reference
carverauto/serviceradar!3498