SystemActor integration work #2655
Imported from GitHub pull request.
Original GitHub pull request: #2272
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/2272
Original created: 2026-01-12T07:40:02Z
Original updated: 2026-01-12T09:20:18Z
Original head: carverauto/serviceradar:updates/systemactor-module
Original base: staging
Original merged: 2026-01-12T09:20:16Z by @mfreeman451
User description
IMPORTANT: Please sign the Developer Certificate of Origin
Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include
a DCO sign-off statement indicating the DCO acceptance in one commit message. Here
is an example DCO Signed-off-by line in a commit message:
Describe your changes
Issue ticket number and link
Code checklist before requesting a review
PR Type
Enhancement, Bug fix
Description
- Introduce `SystemActor` module for tenant-scoped system operations
  - `for_tenant/2` for tenant-scoped background operations
  - `platform/1` for cross-tenant bootstrap operations
  - `system_actor?/1` predicate for actor validation
- Replace `authorize?: false` with proper system actors across multiple modules
  - `ConfigServer`, `HealthTracker`, `StateMonitor` to use SystemActor
  - `StatefulAlertCleanupWorker`, `ZenRuleSync`, `SweepMonitorWorker`
- Add comprehensive proposal and task tracking documentation
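A minimal sketch of how the three functions named in this description might fit together, written in plain Elixir without Ash. The module name, the argument order, the id formats, the `platform:` prefix, and the use of `:super_admin` for the platform actor are assumptions for illustration; the PR's actual `system_actor.ex` is not reproduced here.

```elixir
defmodule SystemActorSketch do
  @moduledoc "Illustrative stand-in only; not the module added by this PR."

  # Tenant-scoped actor for background work inside one tenant.
  # Argument order and the "system:" id format are assumptions.
  def for_tenant(tenant_id, component)
      when is_binary(tenant_id) and is_atom(component) do
    %{id: "system:#{component}", role: :system, tenant_id: tenant_id}
  end

  # Cross-tenant actor for bootstrap work that runs before a tenant is known.
  # The "platform:" prefix is hypothetical.
  def platform(component) when is_atom(component) do
    %{id: "platform:#{component}", role: :super_admin}
  end

  # Permissive check, as the review comments describe it: any map with
  # role: :system passes, while :super_admin also needs the id prefix.
  def system_actor?(%{role: :system}), do: true
  def system_actor?(%{role: :super_admin, id: "platform:" <> _rest}), do: true
  def system_actor?(_other), do: false
end
```

A background worker would then pass something like `SystemActorSketch.for_tenant(tenant_id, :health_tracker)` as the actor of an operation instead of disabling authorization.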
Diagram Walkthrough
File Walkthrough
1 files
- New SystemActor module for tenant-scoped system operations

6 files
- Replace `authorize?: false` with `SystemActor.for_tenant`
- Replace `authorize?: false` with SystemActor across multiple operations
- Replace `build_system_actor` with `SystemActor.for_tenant`
- Replace `authorize?: false` with SystemActor in cleanup operations
- Integrate SystemActor for rule synchronization operations
- Replace `authorize?: false` with SystemActor in sweep monitoring

2 files
- Document security debt fix proposal and solution strategy
- Define phased implementation tasks for SystemActor migration

Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2272#issuecomment-3737239249
Original created: 2026-01-12T07:40:51Z
PR Compliance Guide 🔍
Below is a summary of compliance checks for this PR:
Actor spoofing
Description:
`system_actor?/1` treats any map containing `role: :system` as a valid system actor (without requiring `id` prefix, `tenant_id`, or other invariants), which can enable actor spoofing/privilege escalation if any request path allows user-controlled actors to be provided to Ash operations.
system_actor.ex [156-159]
Referred Code
🎫 No ticket provided
Codebase context is not defined
Follow the guide to enable codebase context checks.
Generic: Meaningful Naming and Self-Documenting Code
Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting
Status: Passed
Generic: Robust Error Handling and Edge Case Management
Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation
Status: Passed
Generic: Comprehensive Audit Trails
Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.
Status:
Audit context only: The PR introduces identifiable system actors but does not demonstrate that the critical
background Ash operations now emit audit log entries with actor id/outcome in a
centralized audit trail.
Referred Code
Generic: Secure Error Handling
Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.
Status:
Inspect reason logging: Error logs include `inspect(reason)` which may expose internal details depending on upstream error contents, so verification is needed that these logs are not user-facing and are appropriately protected.
Referred Code
Generic: Secure Logging Practices
Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.
Status:
Reason logging content: New/modified warning logs include `inspect(reason)` and should be verified to never include sensitive data (e.g., secrets from downstream clients or DB errors) in production log sinks.
Referred Code
Generic: Security-First Input Validation and Data Handling
Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities
Status:
Actor validation lax: `system_actor?/1` currently returns true for any map containing `%{role: :system}` without requiring `tenant_id` or a trusted `id` prefix, which could allow incorrectly-shaped actors to be treated as privileged if used for authorization decisions elsewhere.
Referred Code
Compliance status legend
🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2272#issuecomment-3737243261
Original created: 2026-01-12T07:42:17Z
PR Code Suggestions ✨
Latest suggestions up to 05b7a51
Enforce tenant match in bypass
Strengthen the system actor bypass policy by adding a check to ensure the actor's `tenant_id` matches the record's `tenant_id`, preventing cross-tenant data access.
elixir/serviceradar_core/lib/serviceradar/integrations/integration_source.ex [277-280]
Suggestion importance[1-10]: 9
Why: This suggestion correctly identifies a potential security flaw where a system actor could bypass tenant isolation policies, and proposes a fix that strengthens security by enforcing tenant ID matching.
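The check this suggestion describes can be sketched as a plain Elixir predicate. This is not the Ash policy from `integration_source.ex`; the module and function names are hypothetical, and the actor and record are assumed to be maps that carry a `tenant_id` key.

```elixir
defmodule BypassSketch do
  # Allow the bypass only when a system actor and the record share a tenant.
  # Reusing the variable tenant_id in both patterns forces the values to match.
  def bypass?(%{role: :system, tenant_id: tenant_id}, %{tenant_id: tenant_id})
      when not is_nil(tenant_id),
      do: true

  # Any other combination (wrong role, missing or different tenant) is refused.
  def bypass?(_actor, _record), do: false
end
```

With this shape, a system actor for tenant `"a"` is accepted for a record in tenant `"a"` and refused for a record in tenant `"b"`, and an actor whose `tenant_id` is `nil` is never accepted.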
Guard against missing tenant id
Add a check to handle cases where
extract_tenant_id_from_schemareturnsniltoprevent a crash when calling
SystemActor.for_tenant.elixir/serviceradar_core/lib/serviceradar/agent_config/changes/create_version_history.ex [78-87]
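The suggested diff is not preserved on this page. As a rough illustration of the guard it describes, the sketch below builds a tenant actor only when a tenant id was actually derived. The module name, the `extract_fun` parameter (standing in for `extract_tenant_id_from_schema`), the error tuple, and the actor map shape are all assumptions.

```elixir
defmodule TenantGuardSketch do
  # Returns {:ok, actor} when a tenant id could be derived from the schema,
  # and an error tuple instead of crashing when it could not.
  def actor_for_schema(schema, extract_fun, component)
      when is_function(extract_fun, 1) and is_atom(component) do
    case extract_fun.(schema) do
      tenant_id when is_binary(tenant_id) ->
        {:ok, %{id: "system:#{component}", role: :system, tenant_id: tenant_id}}

      # nil (or any other non-binary value) means no tenant was found.
      _missing ->
        {:error, :tenant_not_found}
    end
  end
end
```

The caller can then decide whether a missing tenant should skip the operation or surface an error, rather than hitting a `FunctionClauseError` inside the actor constructor.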
Suggestion importance[1-10]: 8
Why: The suggestion correctly identifies a potential `FunctionClauseError` if `extract_tenant_id_from_schema` returns `nil`, preventing a runtime crash and improving robustness.
Avoid nil-derived tenant actors
Use a platform actor instead of a tenant-scoped actor for the initial package
lookup to avoid a potential crash when the tenant ID cannot be derived from the
schema.
elixir/serviceradar_core/lib/serviceradar/events/onboarding_writer.ex [30-38]
Suggestion importance[1-10]: 8
Why: The suggestion correctly identifies a potential crash and proposes using a `platform` actor, which is the correct pattern for looking up a resource before its tenant context is known.
Fix invalid checkout action version
Correct the invalid `actions/checkout` version in the new CI workflow. Change `@v6` to the valid major version `@v4` to ensure the lint job can run.
.github/workflows/serviceradar-core-lint.yml [44]
Suggestion importance[1-10]: 7
Why: This is a correct and important fix for the new CI workflow, as using an invalid action version like `@v6` would cause the job to fail, defeating the purpose of adding the lint check.
Limit cross-tenant reads to metadata
When querying all tenants, explicitly select only the necessary fields (`:id`, `:slug`) to minimize data exposure and adhere to the principle of least privilege.
elixir/serviceradar_core/lib/serviceradar/oban/tenant_queues.ex [372-376]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 6
Why: This is a good security practice that follows the principle of least privilege by ensuring the query only fetches the data it actually needs, reducing the risk of exposing sensitive tenant information.
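To illustrate the effect of this suggestion, the sketch below projects tenant maps down to `:id` and `:slug`. It works on in-memory maps only; in the real code the restriction would belong in the query itself, which is not shown here. The module name and the sample fields are hypothetical.

```elixir
defmodule TenantMetadataSketch do
  # Keep only the fields the queue setup needs from each tenant map.
  def metadata_only(tenants) when is_list(tenants) do
    Enum.map(tenants, &Map.take(&1, [:id, :slug]))
  end
end
```

Any other field on the tenant, such as a contact address or a credential, is dropped before the data leaves the function.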
Scope bypass to system actors
Refine the `bypass` clause to be more specific by moving the role check into the bypass condition and adding a check to ensure the system actor has a `tenant_id`.
elixir/serviceradar_core/lib/serviceradar/monitoring/poll_job.ex [294-296]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 8
Why: This is a valuable security hardening suggestion that improves upon the new authorization pattern by making it more explicit and robust, reducing the risk of accidental privilege escalation.
Previous suggestions
Suggestions up to commit 5356ddb
The PR is incomplete without policy updates
The PR introduces actors with a new `:system` role but omits the necessary authorization policy updates to recognize this role. This will cause background jobs using these new actors to be blocked and fail.
Examples:
elixir/serviceradar_core/lib/serviceradar/actors/system_actor.ex [84-91]
elixir/serviceradar_core/lib/serviceradar/infrastructure/health_tracker.ex [132-136]
Solution Walkthrough:
Before:
After:
Suggestion importance[1-10]: 9
Why: This suggestion correctly identifies a critical flaw; the PR replaces `authorize?: false` with actors having a new `:system` role, but without updating authorization policies to recognize this role, the refactored operations will fail.
Use tenant schema in queries
In `load_from_database/4`, replace the use of a raw `tenant_id` in the `Ash.read` call with a tenant schema. Fetch the schema using `TenantSchemas.schema_for_tenant/1` and handle the case where it might not be found.
elixir/serviceradar_core/lib/serviceradar/agent_config/config_server.ex [169-185]
Suggestion importance[1-10]: 8
Why: The suggestion correctly identifies that an `Ash.read` call is using a raw `tenant_id` instead of a tenant schema, which is inconsistent with the pattern used elsewhere in the PR and can cause issues with Ash's multi-tenancy. Adopting the standard pattern of resolving the schema first is a critical fix for correctness.
Include tenant in actor id
Modify the `for_tenant/2` function to include the `tenant_id` in the generated actor `id`. This ensures actor IDs are unique across all tenants, which improves traceability in audit logs.
elixir/serviceradar_core/lib/serviceradar/actors/system_actor.ex [84-91]
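The suggested diff is not preserved on this page. One way the change could look is sketched below; the module name, the argument order, and the `system:<tenant_id>:<component>` id format are assumptions, not the format the PR settled on.

```elixir
defmodule TenantScopedIdSketch do
  # Embed the tenant id in the actor id so that the same component
  # running for two tenants produces two distinct ids in audit logs.
  def for_tenant(tenant_id, component)
      when is_binary(tenant_id) and is_atom(component) do
    %{
      id: "system:#{tenant_id}:#{component}",
      role: :system,
      tenant_id: tenant_id
    }
  end
end
```

Under this format the ids for `:sweep_monitor` in tenants `"t1"` and `"t2"` differ, whereas an id built from the component name alone would be identical for both.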
Suggestion importance[1-10]: 6
Why: The suggestion correctly points out that actor `id`s for tenant-scoped actors are not unique across tenants, which could complicate auditing. Including the `tenant_id` in the `id` is a good practice for ensuring global uniqueness and improving traceability.
Make system actor check more specific
Refine the `system_actor?` function to be more specific for `:system` role actors. Add a check to ensure the actor's `id` starts with the "system:" prefix, matching the stricter validation used for `:super_admin` actors.
elixir/serviceradar_core/lib/serviceradar/actors/system_actor.ex [157-159]
Suggestion importance[1-10]: 7
Why: This suggestion correctly identifies that the `system_actor?` check for the `:system` role is too permissive and could lead to incorrect actor identification. Aligning it with the actor creation logic in `for_tenant/2` by checking the `id` prefix significantly improves the robustness of this new security feature.
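A stricter predicate along the lines of this suggestion might look like the sketch below. It is written in plain Elixir with a hypothetical module name and covers only the `:system` role; the `"system:"` prefix comes from the suggestion text, and the real clause for `:super_admin` actors is not shown on this page.

```elixir
defmodule StrictActorCheckSketch do
  # Accept a :system actor only when its id is a binary starting with "system:".
  def system_actor?(%{role: :system, id: "system:" <> _rest}), do: true

  # Maps with the right role but a missing or foreign id no longer pass.
  def system_actor?(_other), do: false
end
```

A bare `%{role: :system}` map, which the permissive check accepted, is rejected here because it carries no `id` with the expected prefix.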