agent/tenant updates #2628
Reference
carverauto/serviceradar!2628
Imported from GitHub pull request.
Original GitHub pull request: #2220
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/2220
Original created: 2026-01-02T07:16:44Z
Original updated: 2026-01-02T21:22:10Z
Original head: carverauto/serviceradar:agent-push-update
Original base: testing
Original merged: 2026-01-02T21:22:08Z by @mfreeman451
User description
IMPORTANT: Please sign the Developer Certificate of Origin
Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include
a DCO sign-off statement indicating the DCO acceptance in one commit message. Here
is an example DCO Signed-off-by line in a commit message:
Describe your changes
Issue ticket number and link
Code checklist before requesting a review
PR Type
Enhancement
Description
Rename poller-related terminology to gateway across gRPC services and agent process
Add SPKI SHA-256 hash computation and storage for tenant CA certificate validation
Implement tenant identity resolution via issuer CA SPKI hash matching
Add agent-initiated SaaS connectivity design and specification documents
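The SPKI SHA-256 hash mentioned above can be illustrated with a short Go sketch. The PR's own implementation is Elixir code that is not reproduced here, so this is only an assumption about the general technique: hash the DER-encoded SubjectPublicKeyInfo of the CA certificate and store it as lowercase hex. The function names `spkiSHA256` and `newSelfSignedCA` are hypothetical.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/hex"
	"fmt"
	"math/big"
	"time"
)

// spkiSHA256 returns the lowercase hex SHA-256 digest of the certificate's
// DER-encoded SubjectPublicKeyInfo. (Hypothetical helper, not from the PR.)
func spkiSHA256(cert *x509.Certificate) string {
	sum := sha256.Sum256(cert.RawSubjectPublicKeyInfo)
	return hex.EncodeToString(sum[:])
}

// newSelfSignedCA builds a throwaway self-signed CA certificate so the
// example runs without external files. (Demonstration only.)
func newSelfSignedCA() (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "example-tenant-ca"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	ca, err := newSelfSignedCA()
	if err != nil {
		panic(err)
	}
	// The digest depends only on the public key, so it stays stable
	// when a CA certificate is reissued with the same key pair.
	fmt.Println(spkiSHA256(ca))
}
```

Because the digest covers only the public key, a lookup keyed on it would match any certificate issued for the same CA key, which is one plausible reason to key tenant resolution on it rather than on a full certificate fingerprint.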
Diagram Walkthrough
File Walkthrough
6 files
Rename poller to gateway in agent status operations
Add SPKI SHA-256 hash attribute and lookup query
Compute and store SPKI SHA-256 hash for CA certificates
Resolve tenant identity via issuer CA SPKI hash validation
Rename PollerService to AgentGatewayService and poller_id fields
Rename report_status to push_status with gateway terminology
1 files
Add spki_sha256 column and index to tenant_cas table
5 files
Resource snapshot for tenant CA SPKI hash migration
Design document for agent-initiated SaaS connectivity
Proposal for agent SaaS connectivity and configuration changes
Specification for agent-initiated connection and enrollment
Implementation tasks for agent SaaS connectivity feature
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2220#issuecomment-3704652677
Original created: 2026-01-02T07:17:29Z
PR Compliance Guide 🔍
Below is a summary of compliance checks for this PR:
Authorization bypass risk
Description: The new issuer-to-tenant lookup path uses Ash.read_one(authorize?: false, load: [:tenant]) and accepts an :issuer_spki_sha256 option that may be caller-controlled, which could enable unauthorized tenant CA/tenant association lookup (tenant enumeration) if this resolver is reachable from untrusted inputs without strict TLS-derived issuer enforcement.
tenant_resolver.ex [262-286]
Referred Code
🎫 No ticket provided
Codebase context is not defined
Follow the guide to enable codebase context checks.
Generic: Meaningful Naming and Self-Documenting Code
Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting
Status: Passed
Generic: Secure Error Handling
Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.
Status: Passed
Generic: Robust Error Handling and Edge Case Management
Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation
Status: 🏷️
Swallowed migration errors: The data backfill silently ignores certificate parsing failures ({:error, _} -> :ok), which can leave spki_sha256 unset without any visibility or remediation path.
Referred Code
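One way to avoid silently dropping parse failures during a backfill is to collect them and report them afterwards. The Go sketch below illustrates that idea only; the actual migration is Elixir and is not shown here. The `caRow` type, its fields, and the `hashFromPEM` and `backfill` helpers are hypothetical stand-ins, not names from the PR.

```go
package main

import (
	"crypto/sha256"
	"crypto/x509"
	"encoding/hex"
	"encoding/pem"
	"errors"
	"fmt"
)

// caRow is a hypothetical stand-in for a row in the tenant_cas table.
type caRow struct {
	ID             string
	CertificatePEM string
	SPKISHA256     string
}

// hashFromPEM parses a PEM certificate and returns the lowercase hex
// SHA-256 of its SubjectPublicKeyInfo, or an error describing the failure.
func hashFromPEM(certPEM string) (string, error) {
	block, _ := pem.Decode([]byte(certPEM))
	if block == nil {
		return "", errors.New("no PEM block found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return "", fmt.Errorf("parse certificate: %w", err)
	}
	sum := sha256.Sum256(cert.RawSubjectPublicKeyInfo)
	return hex.EncodeToString(sum[:]), nil
}

// backfill sets SPKISHA256 on every row it can parse and returns the
// failures keyed by row ID, so the caller can log or retry them.
func backfill(rows []caRow) map[string]error {
	failures := make(map[string]error)
	for i := range rows {
		h, err := hashFromPEM(rows[i].CertificatePEM)
		if err != nil {
			failures[rows[i].ID] = err
			continue
		}
		rows[i].SPKISHA256 = h
	}
	return failures
}

func main() {
	rows := []caRow{{ID: "ca-1", CertificatePEM: "not a certificate"}}
	for id, err := range backfill(rows) {
		fmt.Printf("backfill skipped %s: %v\n", id, err)
	}
}
```

Returning the failures instead of discarding them leaves the choice of remediation (log, abort the migration, or flag the rows) to the caller.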
Generic: Comprehensive Audit Trails
Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.
Status: 🏷️
Missing audit events: Tenant identity resolution decisions (e.g., CA lookup success/failure and slug mismatch)
are returned as errors but are not logged as auditable security events, making it hard to
reconstruct authentication/authorization-related outcomes.
Referred Code
Generic: Secure Logging Practices
Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.
Status: 🏷️
Potentially sensitive logs: The log line includes request.gateway_id and logs raw GRPC.RPCError.message(error), which may contain environment/internal details depending on upstream error formatting and should be reviewed against logging policy.
Referred Code
Generic: Security-First Input Validation and Data Handling
Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities
Status: 🏷️
Weak SPKI validation: issuer_spki_sha256 accepts any binary string (only downcased) without validating expected SHA-256 hex length/format, which could permit malformed inputs to drive database lookups and unpredictable behavior.
Referred Code
Compliance status legend
🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2220#issuecomment-3704654015
Original created: 2026-01-02T07:18:35Z
PR Code Suggestions ✨
Explore these optional code suggestions:
✅ Optimize index for SPKI lookups
Suggestion Impact: The migration was updated to create a unique partial index on spki_sha256 for active rows, and the down migration was adjusted to drop that corresponding index.
In the migration file, replace the index on [:tenant_id, :spki_sha256] with a partial unique index on [:spki_sha256] where status = 'active' to optimize lookups and enforce data integrity.
elixir/serviceradar_core/priv/repo/migrations/20260102060204_add_tenant_ca_spki_hash.exs [41]
[Suggestion processed]
Suggestion importance[1-10]: 9
Why: The suggestion provides a significant improvement by proposing a partial unique index that both optimizes the critical by_spki query and enforces data integrity, which is a crucial aspect of the tenant resolution logic.
✅ Align error code with spec
Suggestion Impact: The commit updates ensure_tenant_loaded/1 to return :tenant_ca_not_found when given a %TenantCA{}, aligning the error atom with the spec as suggested.
In ensure_tenant_loaded/1, return {:error, :tenant_ca_not_found} instead of {:error, :tenant_not_found} to align with the error types documented in the calling function's spec.
elixir/serviceradar_core/lib/serviceradar/edge/tenant_resolver.ex [293]
[Suggestion processed]
Suggestion importance[1-10]: 4
Why: The suggestion correctly points out an inconsistency where a function returns :tenant_not_found while the calling context expects :tenant_ca_not_found, improving error handling clarity.
✅ Propagate original database query errors
Suggestion Impact: The match clause for {:error, _} was changed to return the original error tuple ({:error, reason}) rather than masking it as {:error, :tenant_ca_not_found}, improving error visibility.
In lookup_tenant_ca_by_spki/1, propagate the original error from Ash.read_one/1 instead of returning a generic :tenant_ca_not_found atom for all failure cases.
elixir/serviceradar_core/lib/serviceradar/edge/tenant_resolver.ex [278-287]
[Suggestion processed]
Suggestion importance[1-10]: 7
Why: The suggestion correctly identifies that masking database errors as :tenant_ca_not_found hinders debugging, and propagating the original error is a significant improvement for observability and maintenance.
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2220#issuecomment-3706043871
Original created: 2026-01-02T19:04:11Z
CI Feedback 🧐
A test triggered by this PR failed. Here is an AI-generated analysis of the failure:
Action: build
Failed stage: Test [❌]
Failed test name: ""
Failure summary:
The action failed during the Bazel build/test step due to a repository fetch (loading phase) error, not because any test failed.
- Bazel failed while fetching the external repository gazelle++go_deps+org_golang_x_net from golang.org/x/net@v0.48.0.
- The fetch failed with a network error when downloading from https://proxy.golang.org/golang.org/x/net/@v/v0.48.0.zip: read: connection reset by peer (see external/gazelle+/internal/go_repository.bzl:297:13).
- Because that dependency could not be fetched, Bazel reported many downstream no such package errors for targets under @@gazelle++go_deps+org_golang_x_net//..., leading to Analysis failed and ultimately: ERROR: command succeeded, but there were loading phase errors and exit code 1.
- Although earlier there was also an OCI-related error (//docker/images:cnpg_postgresql_16_6_rootfs_tar with could not parse reference: layout:/.../layout), the build ultimately failed because the loading phase error (Go module fetch) persisted.
Relevant error logs: