feat(core): per-agent add-on status read model (#3425) #3455

Merged
mfreeman451 merged 2 commits from feat/native-addon-edge-ops into staging 2026-05-29 17:18:15 +00:00
Owner

Implements task 7.2 of add-native-addon-edge-ops: the control plane now records the per-add-on status agents report. Previously the agent capability status (service_name 'agent') was received and dropped; StatusHandler now routes it to AddonStatusIngestor, which decodes the payload, extracts the addon:<id> sidecar entries (ignoring real sidecars like netprobe), and upserts a ServiceRadar.Plugins.AddonStatus row per add-on keyed by {agent_uid, addon_id} with state, active, degradation_reason, pid, restart_count, and last_health_at.

AddonStatus lives in the platform schema (migration 20260529010000) and is registered in the Plugins domain, so Edge Ops can reconcile desired assignments against observed state. version/arch columns exist but are nullable: the agent does not yet report them per add-on (task 7.1, agent-side enrichment); they can be populated without a schema change once it does. Rust-SRQL entity registration for addon_statuses is a follow-up.
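
To illustrate the filter-and-upsert behaviour described above, here is a minimal sketch. The real ingestor is the Elixir module AddonStatusIngestor writing to the database; this Rust version only models the logic with an in-memory map. The struct names, the `ingest` and `addon_id` functions, and the assumption that addon_id is the text after the `addon:` prefix are all hypothetical and not taken from the PR's code.

```rust
use std::collections::HashMap;

/// One sidecar entry from the agent capability status payload (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct SidecarEntry {
    pub name: String, // e.g. "addon:foo" for an add-on, "netprobe" for a real sidecar
    pub state: String,
    pub active: bool,
    pub degradation_reason: Option<String>,
    pub pid: Option<u32>,
    pub restart_count: u32,
    pub last_health_at: Option<String>,
}

/// Read-model row; version/arch are optional because agents do not report them yet.
#[derive(Debug, Clone, PartialEq)]
pub struct AddonStatus {
    pub agent_uid: String,
    pub addon_id: String,
    pub state: String,
    pub active: bool,
    pub degradation_reason: Option<String>,
    pub pid: Option<u32>,
    pub restart_count: u32,
    pub last_health_at: Option<String>,
    pub version: Option<String>,
    pub arch: Option<String>,
}

/// In-memory stand-in for the table, keyed by (agent_uid, addon_id).
pub type ReadModel = HashMap<(String, String), AddonStatus>;

/// Returns the add-on id for "addon:<id>" names, None for anything else.
pub fn addon_id(name: &str) -> Option<&str> {
    name.strip_prefix("addon:").filter(|id| !id.is_empty())
}

/// Upserts one row per add-on sidecar and returns how many rows were written.
pub fn ingest(model: &mut ReadModel, agent_uid: &str, sidecars: &[SidecarEntry]) -> usize {
    let mut written = 0;
    for s in sidecars {
        if let Some(id) = addon_id(&s.name) {
            let row = AddonStatus {
                agent_uid: agent_uid.to_string(),
                addon_id: id.to_string(),
                state: s.state.clone(),
                active: s.active,
                degradation_reason: s.degradation_reason.clone(),
                pid: s.pid,
                restart_count: s.restart_count,
                last_health_at: s.last_health_at.clone(),
                version: None, // not reported per add-on yet
                arch: None,    // not reported per add-on yet
            };
            // Inserting on an existing key replaces the row, which models the upsert.
            model.insert((agent_uid.to_string(), id.to_string()), row);
            written += 1;
        }
    }
    written
}
```

Re-ingesting a report for the same agent and add-on overwrites the existing entry rather than adding a second one, which is the property the re-ingest test checks against the real table.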

Tests (addon_status_ingestor_test.exs, verified against the srql-fixtures CNPG cluster, 2 tests / 0 failures): addon sidecars parsed into the read model, non-addon sidecars ignored, re-ingest upserts the existing row. mix compile --warnings-as-errors green; openspec validate --strict green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

feat(core): per-agent add-on status read model (#3425)
Some checks failed
lint / lint (push) Successful in 1m19s
Golang Tests / test-go (push) Failing after 2m1s
Secret Scan / gitleaks (pull_request) Successful in 1m6s
lint / lint (pull_request) Successful in 2m14s
Elixir Quality / Elixir Quality (pull_request) Failing after 12m44s
CI / build (pull_request) Failing after 15m13s
32f73c3988
feat(srql): add addon_statuses SRQL entity (#3425)
Some checks failed
Rust Tests / test-rust (rust/rdp-adapter, cargo) (push) Successful in 1m58s
lint / lint (push) Successful in 2m7s
Golang Tests / test-go (push) Failing after 2m30s
Rust Tests / test-rust (rust/consumers/zen, cargo) (push) Successful in 2m50s
Rust Tests / test-rust (//rust/rperf-server:rperf, rust/rperf-server, bazel) (push) Successful in 3m9s
Secret Scan / gitleaks (pull_request) Successful in 1m14s
Rust Tests / test-rust (rust/log-collector, cargo) (push) Successful in 3m18s
Rust Tests / test-rust (rust/rperf-client, cargo) (push) Successful in 3m0s
lint / lint (pull_request) Successful in 2m50s
Rust Tests / test-rust (//rust/netprobe:netprobe, //build/platforms:linux_aarch64_musl, rust/netprobe, bazel-static) (push) Successful in 3m42s
Rust Tests / test-rust (rust/rdp-connector-probe, cargo) (push) Successful in 4m2s
Rust Tests / test-rust (//rust/netprobe:netprobe, //build/platforms:linux_x86_64_musl, rust/netprobe, bazel-static) (push) Successful in 4m10s
Rust Tests / test-rust (rust/trapd, cargo) (push) Successful in 3m44s
Rust Tests / test-rust (//rust/netprobe:netprobe_test, rust/netprobe, bazel-test) (push) Successful in 5m12s
Rust Tests / test-rust (rust/srql, cargo) (push) Successful in 5m14s
Elixir Quality / Elixir Quality (pull_request) Failing after 16m28s
CI / build (pull_request) Failing after 18m47s
dc9899d7ad
Completes add-native-addon-edge-ops task 1.2: per-agent add-on status is now queryable via SRQL (e.g. in:addon_statuses agent_uid:agent-1 state:unhealthy). SRQL runs in-process in web-ng as a Rustler NIF over the rust/srql crate (the NIF translates SRQL -> SQL; Elixir executes it), so the entity is registered on the live translate_request path.

Adds Entity::AddonStatuses + parse mapping (parser.rs), the addon_statuses Diesel table (schema.rs, unqualified like ocsf_agents so the platform schema resolves via search_path), AddonStatusRow (models.rs), and a query/addon_statuses.rs executor mirroring agents.rs (text filters agent_uid/addon_id/state/version/arch, reported_at time-range + default desc ordering). Wired into both exhaustive dispatch matches (translate_request used by the NIF, execute_query used by the standalone server) and the viz metadata match.
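
The translation step described above can be pictured with a small standalone sketch. It does not use the rust/srql crate or Diesel; `translate`, `Translated`, and `TEXT_FILTERS` are hypothetical names, and treating every text filter as an exact-match equality is an assumption. It shows only the shape of the output: an unqualified addon_statuses table, bound parameters for the text filters, and the default reported_at descending order.

```rust
/// Text filters the commit message lists for the addon_statuses entity.
const TEXT_FILTERS: [&str; 5] = ["agent_uid", "addon_id", "state", "version", "arch"];

/// SQL text plus bound parameters, in placeholder order.
#[derive(Debug, PartialEq)]
pub struct Translated {
    pub sql: String,
    pub params: Vec<String>,
}

/// Translates a whitespace-separated list of key:value tokens into SQL.
/// Only in:addon_statuses and the text filters above are handled.
pub fn translate(query: &str) -> Result<Translated, String> {
    let mut entity: Option<&str> = None;
    let mut clauses: Vec<String> = Vec::new();
    let mut params: Vec<String> = Vec::new();

    for token in query.split_whitespace() {
        let (key, value) = token
            .split_once(':')
            .ok_or_else(|| format!("expected key:value, got `{}`", token))?;
        if key == "in" {
            entity = Some(value);
            continue;
        }
        if !TEXT_FILTERS.contains(&key) {
            return Err(format!("unsupported filter `{}`", key));
        }
        // Bind the value as a parameter instead of splicing it into the SQL text.
        params.push(value.to_string());
        clauses.push(format!("{} = ${}", key, params.len()));
    }

    if entity != Some("addon_statuses") {
        return Err("this sketch only handles in:addon_statuses".to_string());
    }

    // Unqualified table name: the schema is expected to resolve via search_path.
    let mut sql = String::from("SELECT * FROM addon_statuses");
    if !clauses.is_empty() {
        sql.push_str(" WHERE ");
        sql.push_str(&clauses.join(" AND "));
    }
    // Default ordering when the query does not specify one.
    sql.push_str(" ORDER BY reported_at DESC");
    Ok(Translated { sql, params })
}
```

Because the function is pure (string in, SQL and parameters out), it can be tested without a database, which matches how the commit describes verifying the real translation.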

Verified with cargo test (translation is pure, so no DB is needed): the example test asserts that in:addon_statuses translates to SQL against addon_statuses with the expected filters and ordering; the full srql lib suite reports 179/0. cargo fmt and clippy are clean. The prebuilt NIF .so is not git-tracked (it is built in CI from this source).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
mfreeman451 force-pushed feat/native-addon-edge-ops from dc9899d7ad
to b628716eaa
Some checks failed
Netprobe eBPF Verifier / Verify eBPF programs on Linux 5.8 (push) Has been cancelled
Netprobe eBPF Verifier / Verify eBPF programs on Linux 6.x (push) Has been cancelled
Netprobe eBPF Verifier / Verify eBPF refusal on Linux 5.4 (push) Has been cancelled
Netprobe eBPF Verifier / Verify eBPF programs on Linux 5.15 (push) Has been cancelled
Secret Scan / gitleaks (pull_request) Successful in 29s
Rust Tests / test-rust (rust/rdp-adapter, cargo) (push) Successful in 1m35s
Fingerprint Licensing / netprobe-fingerprint-licenses (push) Successful in 2m1s
Rust Tests / test-rust (rust/consumers/zen, cargo) (push) Failing after 2m26s
Rust Tests / test-rust (//rust/rperf-server:rperf, rust/rperf-server, bazel) (push) Successful in 2m33s
Rust Tests / test-rust (rust/rperf-client, cargo) (push) Successful in 2m50s
Rust Tests / test-rust (//rust/netprobe:netprobe, //build/platforms:linux_x86_64_musl, rust/netprobe, bazel-static) (push) Successful in 2m52s
Rust Tests / test-rust (//rust/netprobe:netprobe, //build/platforms:linux_aarch64_musl, rust/netprobe, bazel-static) (push) Successful in 2m57s
lint / lint (push) Successful in 4m12s
Golang Tests / test-go (push) Failing after 4m22s
lint / lint (pull_request) Successful in 3m56s
Rust Tests / test-rust (rust/trapd, cargo) (push) Successful in 4m12s
Rust Tests / test-rust (rust/log-collector, cargo) (push) Successful in 4m34s
Rust Tests / test-rust (//rust/netprobe:netprobe_test, rust/netprobe, bazel-test) (push) Successful in 5m9s
Rust Tests / test-rust (rust/rdp-connector-probe, cargo) (push) Successful in 5m12s
Rust Tests / test-rust (rust/srql, cargo) (push) Successful in 5m58s
Elixir Quality / Elixir Quality (pull_request) Failing after 15m22s
CI / build (pull_request) Failing after 19m9s
2026-05-29 17:16:28 +00:00
mfreeman451 left a comment

lgtm

Reference
carverauto/serviceradar!3455