feat(agent): sidecar manager attach mode for systemd-managed sidecars (#3425) #3482
carverauto/serviceradar!3482
Summary
`migrate-netprobe-to-native-addon` §2.2, building block. Teaches the agent sidecar supervisor to attach to an externally (systemd-) managed sidecar socket without launching the process. This is the mechanism netprobe's `systemd-service` lifecycle connects through (config push + event ingest). It lands first (additive, fully tested, zero production behavior change because there is no caller yet) to de-risk the cutover, in the same spirit as #3481 shipping the inert unit.

Why attach mode
Under `systemd-service`, systemd owns the netprobe process (it binds the IPC socket). The agent must still get a connected client to push `VisibilityConfig` over IPC and ingest fingerprint/DPI/flow events. Ingest is already launch-decoupled (`DrainFlowAttributionEvents` drains a channel fed by any connected client), so the only thing the agent needs is a health/connect loop that wires the client in via `OnHealthy`, which is exactly what the supervisor's health loop already does. Attach mode reuses it, minus the `exec`/restart.

What changed (`go/pkg/agent/sidecar/manager.go`)

- `StartAttach(ctx)` vs launch `Start(ctx)`; both delegate to `start(ctx, attach bool)`.
- `supervise()` → `superviseAttached()`: runs the existing `healthLoop` against the well-known socket (dial, health-check, `OnHealthy` to wire the client for config push + ingest, reconnect on loss) with no `exec`, no process wait, and no restart/backoff/circuit-breaker. `PID` stays 0; status is driven to `Running`/`Unhealthy`/`Stopped` by connectivity.
- `Mode() (started, attach bool)` so the forthcoming push-loop cutover can decide when a launch↔attach switch is needed.
- `attach` is passed to `supervise()` as a parameter (not read from the field), so there is no new unsynchronized shared state; `Mode()` reads the mode field under the existing `RLock`.

Tests (`manager_attach_test.go`, registered in the Bazel `go_test` `srcs`)

- `Running` with `PID 0`, and `OnHealthy` fires with a non-nil client: impossible on the launch path, so it proves no `exec`.
- `Running` → `Ping` starts failing (stale client `Close()`'d) → `Unhealthy` → `Ping` recovers → health loop re-dials → `Running`, with `RestartCount==0` and `PID==0` (no process restart).
- `Mode()` across `StartAttach`/`Stop`/`Start`.

Validation
- `go test -race ./go/pkg/agent/sidecar/` green (full package, incl. existing launch tests; the `supervise` signature change broke nothing)
- `go vet` + `gofmt` clean; agent packages build
- Registering the test in the Bazel `go_test` `srcs` (this repo enumerates test srcs, no globbing), and making the reconnect test actually exercise the drop/recover path.

Next (the cutover, §2.2 remainder)
`push_loop_config.go`: detect the netprobe `AddonAssignment` (enabled `systemd_service`) → `StartAttach` + push config; absent → the existing launch path (unchanged). Plus apply-on-connect for the netprobe sidecar. This is required because `applyVisibilityConfig` runs before `applyAddonAssignments` installs the unit (a synchronous attach `ApplyConfig` would `return false` and abort the apply at `push_loop_config.go:203`), and the `NotModified` short-circuit means poll-driven re-push won't recover after a systemd restart.

🤖 Generated with Claude Code
lgtm