carverauto/serviceradar

Fork 0

SystemActor integration work #2655

Merged

mfreeman451 merged 7 commits from refs/pull/2655/head into staging

2026-01-12 09:20:16 +00:00

mfreeman451 commented

2026-01-12 07:40:02 +00:00

(Migrated from github.com)

Owner

Imported from GitHub pull request.

Original GitHub pull request: #2272
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/2272
Original created: 2026-01-12T07:40:02Z
Original updated: 2026-01-12T09:20:18Z
Original head: carverauto/serviceradar:updates/systemactor-module
Original base: staging
Original merged: 2026-01-12T09:20:16Z by @mfreeman451

User description

IMPORTANT: Please sign the Developer Certificate of Origin

Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include
a DCO sign-off statement indicating the DCO acceptance in one commit message. Here
is an example DCO Signed-off-by line in a commit message:

Signed-off-by: J. Doe <j.doe@domain.com>

Describe your changes

Issue ticket number and link

Code checklist before requesting a review

I have signed the DCO?
The build completes without errors?
All tests are passing when running make test?

PR Type

Enhancement, Bug fix

Description

Introduce SystemActor module for tenant-scoped system operations
- Provides for_tenant/2 for tenant-scoped background operations
- Provides platform/1 for cross-tenant bootstrap operations
- Includes system_actor?/1 predicate for actor validation
Replace authorize?: false with proper system actors across multiple modules
- Updates ConfigServer, HealthTracker, StateMonitor to use SystemActor
- Updates StatefulAlertCleanupWorker, ZenRuleSync, SweepMonitorWorker
- Maintains tenant isolation policy enforcement via actor tenant_id
Add comprehensive proposal and task tracking documentation
- Documents security debt and migration strategy
- Provides phased implementation plan for remaining ~115 instances

Diagram Walkthrough

flowchart LR
  A["authorize?: false<br/>bypasses all policies"] -->|replaced by| B["SystemActor.for_tenant<br/>maintains tenant isolation"]
  B --> C["Tenant-scoped<br/>background operations"]
  D["SystemActor.platform<br/>cross-tenant operations"] --> E["Bootstrap &<br/>platform operations"]
  C --> F["Policy enforcement<br/>with actor tenant_id"]
  E --> G["Super admin role<br/>for platform operations"]

File Walkthrough

Relevant files

Enhancement

1 files

system_actor.ex
New SystemActor module for tenant-scoped system operations +169/-0

Bug fix

6 files

config_server.ex `Replace authorize?: false with SystemActor.for_tenant`	+5/-2
health_tracker.ex `Replace authorize?: false with SystemActor across multiple operations`	+16/-6
state_monitor.ex `Replace build_system_actor with SystemActor.for_tenant`	+2/-10
stateful_alert_cleanup_worker.ex `Replace authorize?: false with SystemActor in cleanup operations`	+8/-5
zen_rule_sync.ex `Integrate SystemActor for rule synchronization operations`	+28/-12
sweep_monitor_worker.ex `Replace authorize?: false with SystemActor in sweep monitoring`	+4/-1

Documentation

2 files

proposal.md `Document security debt fix proposal and solution strategy`	+78/-0
tasks.md `Define phased implementation tasks for SystemActor migration`	+81/-0

Imported from GitHub pull request. Original GitHub pull request: #2272 Original author: @mfreeman451 Original URL: https://github.com/carverauto/serviceradar/pull/2272 Original created: 2026-01-12T07:40:02Z Original updated: 2026-01-12T09:20:18Z Original head: carverauto/serviceradar:updates/systemactor-module Original base: staging Original merged: 2026-01-12T09:20:16Z by @mfreeman451 --- ### **User description** ## IMPORTANT: Please sign the Developer Certificate of Origin Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include a [DCO sign-off statement]( https://developercertificate.org/) indicating the DCO acceptance in one commit message. Here is an example DCO Signed-off-by line in a commit message: ``` Signed-off-by: J. Doe <j.doe@domain.com> ``` ## Describe your changes ## Issue ticket number and link ## Code checklist before requesting a review - [ ] I have signed the DCO? - [ ] The build completes without errors? - [ ] All tests are passing when running make test? ___ ### **PR Type** Enhancement, Bug fix ___ ### **Description** - Introduce `SystemActor` module for tenant-scoped system operations - Provides `for_tenant/2` for tenant-scoped background operations - Provides `platform/1` for cross-tenant bootstrap operations - Includes `system_actor?/1` predicate for actor validation - Replace `authorize?: false` with proper system actors across multiple modules - Updates `ConfigServer`, `HealthTracker`, `StateMonitor` to use SystemActor - Updates `StatefulAlertCleanupWorker`, `ZenRuleSync`, `SweepMonitorWorker` - Maintains tenant isolation policy enforcement via actor tenant_id - Add comprehensive proposal and task tracking documentation - Documents security debt and migration strategy - Provides phased implementation plan for remaining ~115 instances ___ ### Diagram Walkthrough ```mermaid flowchart LR A["authorize?: false bypasses all policies"] -->|replaced by| B["SystemActor.for_tenant maintains tenant isolation"] B --> C["Tenant-scoped background operations"] D["SystemActor.platform cross-tenant operations"] --> E["Bootstrap & platform operations"] C --> F["Policy enforcement with actor tenant_id"] E --> G["Super admin role for platform operations"] ``` <details><summary><h3>File Walkthrough</h3></summary> <table><thead><tr><th></th><th align="left">Relevant files</th></tr></thead><tbody><tr><td>Enhancement</td><td><details><summary>1 files</summary><table> <tr> <td>system_actor.ex<dd><code>New SystemActor module for tenant-scoped system operations</code></dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2272/files#diff-f3ebf5072e8a72e025076ca013e6726fb23a0169e9e8c9b0089489a4cf17a73d">+169/-0</a>  </td> </tr> </table></details></td></tr><tr><td>Bug fix</td><td><details><summary>6 files</summary><table> <tr> <td>config_server.ex<dd><code>Replace authorize?: false with SystemActor.for_tenant</code>        </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2272/files#diff-a4d8447584bfcd1088465714c00bea67c90100320b125857ac5bd6a9783de468">+5/-2</a>      </td> </tr> <tr> <td>health_tracker.ex<dd><code>Replace authorize?: false with SystemActor across multiple operations</code></dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2272/files#diff-51e82fcb12ce016dad558a5dcc0e17c2c58454af416f01a7ddb6668d9e5c6670">+16/-6</a>    </td> </tr> <tr> <td>state_monitor.ex<dd><code>Replace build_system_actor with SystemActor.for_tenant</code>      </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2272/files#diff-f7126063b55676150db789fed6fd10c9ca4df307b4f62eb6253ae71b78f47c3d">+2/-10</a>    </td> </tr> <tr> <td>stateful_alert_cleanup_worker.ex<dd><code>Replace authorize?: false with SystemActor in cleanup operations</code></dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2272/files#diff-223ffe1df2c7295d4b04bf355ce2e5dfb99b3bfbf0ceaa55aebd9b46ae8db591">+8/-5</a>      </td> </tr> <tr> <td>zen_rule_sync.ex<dd><code>Integrate SystemActor for rule synchronization operations</code></dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2272/files#diff-d5f190f1d1692ffe3e1dba929363e6ff1ecbd70ddcb1f6f2240473466fe4ec8a">+28/-12</a>  </td> </tr> <tr> <td>sweep_monitor_worker.ex<dd><code>Replace authorize?: false with SystemActor in sweep monitoring</code></dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2272/files#diff-c25f50cc2e496fc7f8f883a12098f278c56ae1d2d42d4d2af4200d364e2ef9a8">+4/-1</a>      </td> </tr> </table></details></td></tr><tr><td>Documentation</td><td><details><summary>2 files</summary><table> <tr> <td>proposal.md<dd><code>Document security debt fix proposal and solution strategy</code></dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2272/files#diff-5a6d9406a8357a5178d6f12507bd86dcbfc797cd2ecb2153843af5b475133419">+78/-0</a>    </td> </tr> <tr> <td>tasks.md<dd><code>Define phased implementation tasks for SystemActor migration</code></dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2272/files#diff-988c983aa7137ccc47b4295a8add07f3e70bf6b8bfbd8e94aa44adb4cc681486">+81/-0</a>    </td> </tr> </table></details></td></tr></tbody></table> </details> ___

qodo-code-review[bot] commented

2026-01-12 07:40:51 +00:00

(Migrated from github.com)

Author

Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2272#issuecomment-3737239249
Original created: 2026-01-12T07:40:51Z

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
⚪	Actor spoofing Description: `system_actor?/1` treats any map containing `role: :system` as a valid system actor (without requiring `id` prefix, `tenant_id`, or other invariants), which can enable actor spoofing/privilege escalation if any request path allows user-controlled actors to be provided to Ash operations. system_actor.ex [156-159] Referred Code `@spec system_actor?(any()) :: boolean() def system_actor?(%{role: :system}), do: true def system_actor?(%{role: :super_admin, id: "platform:" <> _}), do: true def system_actor?(_), do: false`
Ticket Compliance
⚪	🎫 No ticket provided Create ticket/issue
Codebase Duplication Compliance
⚪	Codebase context is not defined Follow the guide to enable codebase context checks.
Custom Compliance
🟢	Generic: Meaningful Naming and Self-Documenting Code Objective: Ensure all identifiers clearly express their purpose and intent, making code self-documenting Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Robust Error Handling and Edge Case Management Objective: Ensure comprehensive error handling that provides meaningful context and graceful degradation Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
⚪	Generic: Comprehensive Audit Trails Objective: To create a detailed and reliable record of critical system actions for security analysis and compliance. Status: Audit context only: The PR introduces identifiable system actors but does not demonstrate that the critical background Ash operations now emit audit log entries with actor id/outcome in a centralized audit trail. Referred Code defmodule ServiceRadar.Actors.SystemActor do @moduledoc """ Generates tenant-scoped system actors for background operations. System actors allow background processes (GenServers, Oban workers, seeders) to perform Ash operations while maintaining tenant isolation policy enforcement. ## Usage # For tenant-scoped operations actor = SystemActor.for_tenant(tenant_id, :state_monitor) Gateway \|> Ash.read(actor: actor, tenant: tenant_schema) # For platform-wide operations (bootstrap, tenant management) actor = SystemActor.platform(:tenant_bootstrap) Tenant \|> Ash.read(actor: actor) ## Why Not authorize?: false? Using `authorize?: false` bypasses ALL authorization policies including tenant isolation. This creates security vulnerabilities where background ... (clipped 15 lines) Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Secure Error Handling Objective: To prevent the leakage of sensitive system information through error messages while providing sufficient detail for internal debugging. Status: Inspect reason logging: Error logs include `inspect(reason)` which may expose internal details depending on upstream error contents, so verification is needed that these logs are not user-facing and are appropriately protected. Referred Code `{:error, reason} -> Logger.error("Failed to transition checker", tenant_id: state.tenant_id, checker_id: checker.id, reason: inspect(reason) )` Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Secure Logging Practices Objective: To ensure logs are useful for debugging and auditing without exposing sensitive information like PII, PHI, or cardholder data. Status: Reason logging content: New/modified warning logs include `inspect(reason)` and should be verified to never include sensitive data (e.g., secrets from downstream clients or DB errors) in production log sinks. Referred Code `{:ok, _} -> :ok {:error, reason} -> Logger.warning("Zen rule reconcile failed", tenant_id: state.tenant_id, rule_id: rule.id, reason: inspect(reason) ) end end) {:error, reason} -> Logger.warning("Failed to load zen rules", tenant_id: state.tenant_id, reason: inspect(reason) )` Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Security-First Input Validation and Data Handling Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent vulnerabilities Status: Actor validation lax: `system_actor?/1` currently returns true for any map containing `%{role: :system}` without requiring `tenant_id` or a trusted `id` prefix, which could allow incorrectly-shaped actors to be treated as privileged if used for authorization decisions elsewhere. Referred Code `@spec system_actor?(any()) :: boolean() def system_actor?(%{role: :system}), do: true def system_actor?(%{role: :super_admin, id: "platform:" <> _}), do: true def system_actor?(_), do: false` Learn more about managing compliance generic rules or creating your own custom rules
Update

Compliance status legend

🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

Imported GitHub PR comment. Original author: @qodo-code-review[bot] Original URL: https://github.com/carverauto/serviceradar/pull/2272#issuecomment-3737239249 Original created: 2026-01-12T07:40:51Z --- ## PR Compliance Guide 🔍  Below is a summary of compliance checks for this PR: <table><tbody><tr><td colspan='2'>Security Compliance</td></tr> <tr><td rowspan=1>⚪</td> <td><details><summary>Actor spoofing </summary> Description: <code>system_actor?/1</code> treats any map containing <code>role: :system</code> as a valid system actor (without requiring <code>id</code> prefix, <code>tenant_id</code>, or other invariants), which can enable actor spoofing/privilege escalation if any request path allows user-controlled actors to be provided to Ash operations. <a href='https://github.com/carverauto/serviceradar/pull/2272/files#diff-f3ebf5072e8a72e025076ca013e6726fb23a0169e9e8c9b0089489a4cf17a73dR156-R159'>system_actor.ex [156-159]</a> <details open><summary>Referred Code</summary> ```elixir @spec system_actor?(any()) :: boolean() def system_actor?(%{role: :system}), do: true def system_actor?(%{role: :super_admin, id: "platform:" <> _}), do: true def system_actor?(_), do: false ``` </details></details></td></tr> <tr><td colspan='2'>Ticket Compliance</td></tr> <tr><td>⚪</td><td><details><summary>🎫 No ticket provided </summary> - [ ] Create ticket/issue  </details></td></tr> <tr><td colspan='2'>Codebase Duplication Compliance</td></tr> <tr><td>⚪</td><td><details><summary>Codebase context is not defined </summary> Follow the <a href='https://qodo-merge-docs.qodo.ai/core-abilities/rag_context_enrichment/'>guide</a> to enable codebase context checks. </details></td></tr> <tr><td colspan='2'>Custom Compliance</td></tr> <tr><td rowspan=2>🟢</td><td> <details><summary>Generic: Meaningful Naming and Self-Documenting Code</summary> **Objective:** Ensure all identifiers clearly express their purpose and intent, making code self-documenting **Status:** Passed > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td> <details><summary>Generic: Robust Error Handling and Edge Case Management</summary> **Objective:** Ensure comprehensive error handling that provides meaningful context and graceful degradation **Status:** Passed > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td rowspan=4>⚪</td> <td><details> <summary>Generic: Comprehensive Audit Trails</summary> **Objective:** To create a detailed and reliable record of critical system actions for security analysis and compliance. **Status:** <a href='https://github.com/carverauto/serviceradar/pull/2272/files#diff-f3ebf5072e8a72e025076ca013e6726fb23a0169e9e8c9b0089489a4cf17a73dR1-R36'>Audit context only</a>: The PR introduces identifiable system actors but does not demonstrate that the critical background Ash operations now emit audit log entries with actor id/outcome in a centralized audit trail. <details open><summary>Referred Code</summary> ```elixir defmodule ServiceRadar.Actors.SystemActor do @moduledoc """ Generates tenant-scoped system actors for background operations. System actors allow background processes (GenServers, Oban workers, seeders) to perform Ash operations while maintaining tenant isolation policy enforcement. ## Usage # For tenant-scoped operations actor = SystemActor.for_tenant(tenant_id, :state_monitor) Gateway |> Ash.read(actor: actor, tenant: tenant_schema) # For platform-wide operations (bootstrap, tenant management) actor = SystemActor.platform(:tenant_bootstrap) Tenant |> Ash.read(actor: actor) ## Why Not authorize?: false? Using `authorize?: false` bypasses ALL authorization policies including tenant isolation. This creates security vulnerabilities where background ... (clipped 15 lines) ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td><details> <summary>Generic: Secure Error Handling</summary> **Objective:** To prevent the leakage of sensitive system information through error messages while providing sufficient detail for internal debugging. **Status:** <a href='https://github.com/carverauto/serviceradar/pull/2272/files#diff-f7126063b55676150db789fed6fd10c9ca4df307b4f62eb6253ae71b78f47c3dR415-R420'>Inspect reason logging</a>: Error logs include <code>inspect(reason)</code> which may expose internal details depending on upstream error contents, so verification is needed that these logs are not user-facing and are appropriately protected. <details open><summary>Referred Code</summary> ```elixir {:error, reason} -> Logger.error("Failed to transition checker", tenant_id: state.tenant_id, checker_id: checker.id, reason: inspect(reason) ) ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td><details> <summary>Generic: Secure Logging Practices</summary> **Objective:** To ensure logs are useful for debugging and auditing without exposing sensitive information like PII, PHI, or cardholder data. **Status:** <a href='https://github.com/carverauto/serviceradar/pull/2272/files#diff-d5f190f1d1692ffe3e1dba929363e6ff1ecbd70ddcb1f6f2240473466fe4ec8aR205-R219'>Reason logging content</a>: New/modified warning logs include <code>inspect(reason)</code> and should be verified to never include sensitive data (e.g., secrets from downstream clients or DB errors) in production log sinks. <details open><summary>Referred Code</summary> ```elixir {:ok, _} -> :ok {:error, reason} -> Logger.warning("Zen rule reconcile failed", tenant_id: state.tenant_id, rule_id: rule.id, reason: inspect(reason) ) end end) {:error, reason} -> Logger.warning("Failed to load zen rules", tenant_id: state.tenant_id, reason: inspect(reason) ) ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td><details> <summary>Generic: Security-First Input Validation and Data Handling</summary> **Objective:** Ensure all data inputs are validated, sanitized, and handled securely to prevent vulnerabilities **Status:** <a href='https://github.com/carverauto/serviceradar/pull/2272/files#diff-f3ebf5072e8a72e025076ca013e6726fb23a0169e9e8c9b0089489a4cf17a73dR156-R159'>Actor validation lax</a>: <code>system_actor?/1</code> currently returns true for any map containing <code>%{role: :system}</code> without requiring <code>tenant_id</code> or a trusted <code>id</code> prefix, which could allow incorrectly-shaped actors to be treated as privileged if used for authorization decisions elsewhere. <details open><summary>Referred Code</summary> ```elixir @spec system_actor?(any()) :: boolean() def system_actor?(%{role: :system}), do: true def system_actor?(%{role: :super_admin, id: "platform:" <> _}), do: true def system_actor?(_), do: false ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td align="center" colspan="2"> - [ ] Update  </td></tr></tbody></table> <details><summary>Compliance status legend</summary> 🟢 - Fully Compliant 🟡 - Partial Compliant 🔴 - Not Compliant ⚪ - Requires Further Human Verification 🏷️ - Compliance label </details>

qodo-code-review[bot] commented

2026-01-12 07:42:17 +00:00

(Migrated from github.com)

Author

Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2272#issuecomment-3737243261
Original created: 2026-01-12T07:42:17Z

PR Code Suggestions ✨

Latest suggestions up to 05b7a51

Category	Suggestion	Impact
Incremental ^[*]	Enforce tenant match in bypass Strengthen the system actor bypass policy by adding a check to ensure the actor's `tenant_id` matches the record's `tenant_id`, preventing cross-tenant data access. elixir/serviceradar_core/lib/serviceradar/integrations/integration_source.ex [277-280] `-# System actors can perform all operations (tenant isolation via schema) +# System actors can perform operations within their tenant bypass always() do - authorize_if actor_attribute_equals(:role, :system) + authorize_if all_of([ + actor_attribute_equals(:role, :system), + expr(tenant_id == ^actor(:tenant_id)) + ]) end` Apply / Chat Suggestion importance[1-10]: 9 __ Why: This suggestion correctly identifies a potential security flaw where a system actor could bypass tenant isolation policies, and proposes a fix that strengthens security by enforcing tenant ID matching.	High
	Guard against missing tenant id Add a check to handle cases where `extract_tenant_id_from_schema` returns `nil` to prevent a crash when calling `SystemActor.for_tenant`. elixir/serviceradar_core/lib/serviceradar/agent_config/changes/create_version_history.ex [78-87] `actor = case Map.get(context, :actor) do nil -> # Extract tenant_id from schema name (format: "tenant_<uuid>") - tenant_id = extract_tenant_id_from_schema(tenant) - SystemActor.for_tenant(tenant_id, :config_version_history) + case extract_tenant_id_from_schema(tenant) do + tenant_id when is_binary(tenant_id) -> + SystemActor.for_tenant(tenant_id, :config_version_history) + + _ -> + raise "missing tenant_id in context (cannot derive from tenant schema)" + end provided_actor -> provided_actor end` Apply / Chat Suggestion importance[1-10]: 8 __ Why: The suggestion correctly identifies a potential `FunctionClauseError` if `extract_tenant_id_from_schema` returns `nil`, preventing a runtime crash and improving robustness.	Medium
	Avoid nil-derived tenant actors Use a platform actor instead of a tenant-scoped actor for the initial package lookup to avoid a potential crash when the tenant ID cannot be derived from the schema. elixir/serviceradar_core/lib/serviceradar/events/onboarding_writer.ex [30-38] `-# Extract tenant_id from the schema name (format: "tenant_<uuid>") -tenant_id = extract_tenant_id_from_schema(tenant_schema) -actor = SystemActor.for_tenant(tenant_id, :onboarding_writer) +# Tenant id may not be derivable here; use platform actor for initial lookup +actor = SystemActor.platform(:onboarding_writer) case Ash.get(OnboardingPackage, event.package_id, tenant: tenant_schema, actor: actor) do {:ok, package} -> {:ok, package} {:error, reason} -> {:error, reason} end` Apply / Chat Suggestion importance[1-10]: 8 __ Why: The suggestion correctly identifies a potential crash and proposes using a `platform` actor, which is the correct pattern for looking up a resource before its tenant context is known.	Medium
	Fix invalid checkout action version Correct the invalid `actions/checkout` version in the new CI workflow. Change `@v6` to the valid major version `@v4` to ensure the lint job can run. .github/workflows/serviceradar-core-lint.yml [44] `-- uses: actions/checkout@v6 +- uses: actions/checkout@v4` Apply / Chat Suggestion importance[1-10]: 7 __ Why: This is a correct and important fix for the new CI workflow, as using an invalid action version like `@v6` would cause the job to fail, defeating the purpose of adding the lint check.	Medium
	Limit cross-tenant reads to metadata When querying all tenants, explicitly select only the necessary fields (`:id`, `:slug`) to minimize data exposure and adhere to the principle of least privilege. elixir/serviceradar_core/lib/serviceradar/oban/tenant_queues.ex [372-376] `# Platform actor since we need to read all tenants actor = SystemActor.platform(:tenant_queues) # Query all tenants from the database and provision their queues -case Ash.read(ServiceRadar.Identity.Tenant, actor: actor) do +case ServiceRadar.Identity.Tenant + \|> Ash.Query.select([:id, :slug]) + \|> Ash.read(actor: actor) do` `[To ensure code accuracy, apply this suggestion manually]` Suggestion importance[1-10]: 6 __ Why: This is a good security practice that follows the principle of least privilege by ensuring the query only fetches the data it actually needs, reducing the risk of exposing sensitive tenant information.	Low
Security	Scope bypass to system actors Refine the `bypass` clause to be more specific by moving the role check into the bypass condition and adding a check to ensure the system actor has a `tenant_id`. elixir/serviceradar_core/lib/serviceradar/monitoring/poll_job.ex [294-296] `-bypass always() do - authorize_if actor_attribute_equals(:role, :system) +bypass actor_attribute_equals(:role, :system) do + authorize_if expr(not is_nil(^actor(:tenant_id))) end` `[To ensure code accuracy, apply this suggestion manually]` Suggestion importance[1-10]: 8 __ Why: This is a valuable security hardening suggestion that improves upon the new authorization pattern by making it more explicit and robust, reducing the risk of accidental privilege escalation.	Medium
Update

Previous suggestions

Suggestions up to commit 5356ddb

Category	Suggestion	Impact
High-level	The PR is incomplete without policy updates The PR introduces actors with a new `:system` role but omits the necessary authorization policy updates to recognize this role. This will cause background jobs using these new actors to be blocked and fail. Examples: elixir/serviceradar_core/lib/serviceradar/actors/system_actor.ex [84-91] `def for_tenant(tenant_id, component) when is_binary(tenant_id) and is_atom(component) do %{ id: "system:#{component}", email: "#{component_to_email(component)}@system.serviceradar", role: :system, tenant_id: tenant_id } end` elixir/serviceradar_core/lib/serviceradar/infrastructure/health_tracker.ex [132-136] `actor = SystemActor.for_tenant(tenant_id, :health_tracker) HealthEvent \|> Ash.Changeset.for_create(:record, attrs, tenant: schema) \|> Ash.create(actor: actor)` Solution Walkthrough: Before: # elixir/serviceradar_core/lib/serviceradar/infrastructure/health_tracker.ex def record(tenant_id, attrs) do # ... schema -> actor = SystemActor.for_tenant(tenant_id, :health_tracker) HealthEvent \|> Ash.Changeset.for_create(:record, attrs, tenant: schema) # This call will fail because authorization policies # do not yet recognize `role: :system`. \|> Ash.create(actor: actor) end After: `# In a hypothetical Ash resource policy file (not in PR) authorizers do # ... authorize_if fn actor, _ -> case actor do # Add a clause to allow the new system role %{role: :system} -> true # ... other roles _ -> false end end end # elixir/serviceradar_core/lib/serviceradar/infrastructure/health_tracker.ex # (No change needed here after policies are updated) def record(tenant_id, attrs) do # ... # This call will now succeed. \|> Ash.create(actor: actor) end` Suggestion importance[1-10]: 9 __ Why: This suggestion correctly identifies a critical flaw; the PR replaces `authorize?: false` with actors having a new `:system` role, but without updating authorization policies to recognize this role, the refactored operations will fail.	High
General	Use tenant schema in queries In `load_from_database/4`, replace the use of a raw `tenant_id` in the `Ash.read` call with a tenant schema. Fetch the schema using `TenantSchemas.schema_for_tenant/1` and handle the case where it might not be found. elixir/serviceradar_core/lib/serviceradar/agent_config/config_server.ex [169-185] defp load_from_database(tenant_id, config_type, partition, agent_id) do + tenant_schema = TenantSchemas.schema_for_tenant(tenant_id) actor = SystemActor.for_tenant(tenant_id, :config_server) - case Ash.read( - ConfigInstance, - action: :for_agent, - tenant: tenant_id, - actor: actor, - args: %{ - config_type: config_type, - partition: partition, - agent_id: agent_id - } - ) do - {:ok, [instance \| _]} -> {:ok, instance} + case tenant_schema do + nil -> + {:error, :tenant_schema_not_found} + schema -> + case Ash.read( + ConfigInstance, + action: :for_agent, + tenant: schema, + actor: actor, + args: %{ + config_type: config_type, + partition: partition, + agent_id: agent_id + } + ) do + {:ok, [instance \| _]} -> {:ok, instance} Suggestion importance[1-10]: 8 __ Why: The suggestion correctly identifies that an `Ash.read` call is using a raw `tenant_id` instead of a tenant schema, which is inconsistent with the pattern used elsewhere in the PR and can cause issues with Ash's multi-tenancy. Adopting the standard pattern of resolving the schema first is a critical fix for correctness.	Medium
General	Include tenant in actor id Modify the `for_tenant/2` function to include the `tenant_id` in the generated actor `id`. This ensures actor IDs are unique across all tenants, which improves traceability in audit logs. elixir/serviceradar_core/lib/serviceradar/actors/system_actor.ex [84-91] `def for_tenant(tenant_id, component) when is_binary(tenant_id) and is_atom(component) do %{ - id: "system:#{component}", + id: "system:#{tenant_id}:#{component}", email: "#{component_to_email(component)}@system.serviceradar", role: :system, tenant_id: tenant_id } end` Suggestion importance[1-10]: 6 __ Why: The suggestion correctly points out that actor `id`s for tenant-scoped actors are not unique across tenants, which could complicate auditing. Including the `tenant_id` in the `id` is a good practice for ensuring global uniqueness and improving traceability.	Low
Possible issue	Make system actor check more specific Refine the `system_actor?` function to be more specific for `:system` role actors. Add a check to ensure the actor's `id` starts with the "system:" prefix, matching the stricter validation used for `:super_admin` actors. elixir/serviceradar_core/lib/serviceradar/actors/system_actor.ex [157-159] `-def system_actor?(%{role: :system}), do: true +def system_actor?(%{role: :system, id: "system:" <> _}), do: true def system_actor?(%{role: :super_admin, id: "platform:" <> _}), do: true def system_actor?(_), do: false` Suggestion importance[1-10]: 7 __ Why: This suggestion correctly identifies that the `system_actor?` check for the `:system` role is too permissive and could lead to incorrect actor identification. Aligning it with the actor creation logic in `for_tenant/2` by checking the `id` prefix significantly improves the robustness of this new security feature.	Medium

Imported GitHub PR comment. Original author: @qodo-code-review[bot] Original URL: https://github.com/carverauto/serviceradar/pull/2272#issuecomment-3737243261 Original created: 2026-01-12T07:42:17Z --- ## PR Code Suggestions ✨  Latest suggestions up to 05b7a51 <table><thead><tr><td>Category</td><td align=left>Suggestion                                                                                                                                    </td><td align=center>Impact</td></tr><tbody><tr><td rowspan=5>Incremental <a href='https://qodo-merge-docs.qodo.ai/core-abilities/incremental_update/'>[*]</a></td> <td> <details><summary>Enforce tenant match in bypass</summary> ___ **Strengthen the system actor bypass policy by adding a check to ensure the actor's <code>tenant_id</code> matches the record's <code>tenant_id</code>, preventing cross-tenant data access.** [elixir/serviceradar_core/lib/serviceradar/integrations/integration_source.ex [277-280]](https://github.com/carverauto/serviceradar/pull/2272/files#diff-2cd4228801a527fa4b0b74c0e16210095001e39a8bdc359ac564e62a52c38bd9R277-R280) ```diff -# System actors can perform all operations (tenant isolation via schema) +# System actors can perform operations within their tenant bypass always() do - authorize_if actor_attribute_equals(:role, :system) + authorize_if all_of([ + actor_attribute_equals(:role, :system), + expr(tenant_id == ^actor(:tenant_id)) + ]) end ``` - [ ] **Apply / Chat**  <details><summary>Suggestion importance[1-10]: 9</summary> __ Why: This suggestion correctly identifies a potential security flaw where a system actor could bypass tenant isolation policies, and proposes a fix that strengthens security by enforcing tenant ID matching. </details></details></td><td align=center>High </td></tr><tr><td> <details><summary>Guard against missing tenant id</summary> ___ **Add a check to handle cases where <code>extract_tenant_id_from_schema</code> returns <code>nil</code> to prevent a crash when calling <code>SystemActor.for_tenant</code>.** [elixir/serviceradar_core/lib/serviceradar/agent_config/changes/create_version_history.ex [78-87]](https://github.com/carverauto/serviceradar/pull/2272/files#diff-edf276f4ea9f73468d843cc8db84191727a3e37c3e964146c29419e91f120d50R78-R87) ```diff actor = case Map.get(context, :actor) do nil -> # Extract tenant_id from schema name (format: "tenant_<uuid>") - tenant_id = extract_tenant_id_from_schema(tenant) - SystemActor.for_tenant(tenant_id, :config_version_history) + case extract_tenant_id_from_schema(tenant) do + tenant_id when is_binary(tenant_id) -> + SystemActor.for_tenant(tenant_id, :config_version_history) + + _ -> + raise "missing tenant_id in context (cannot derive from tenant schema)" + end provided_actor -> provided_actor end ``` - [ ] **Apply / Chat**  <details><summary>Suggestion importance[1-10]: 8</summary> __ Why: The suggestion correctly identifies a potential `FunctionClauseError` if `extract_tenant_id_from_schema` returns `nil`, preventing a runtime crash and improving robustness. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>Avoid nil-derived tenant actors</summary> ___ **Use a platform actor instead of a tenant-scoped actor for the initial package lookup to avoid a potential crash when the tenant ID cannot be derived from the schema.** [elixir/serviceradar_core/lib/serviceradar/events/onboarding_writer.ex [30-38]](https://github.com/carverauto/serviceradar/pull/2272/files#diff-afd0d16afe71e46bf1d2be36b464f3f0bb140df2d1b59f417e89b08d7d2fee45R30-R38) ```diff -# Extract tenant_id from the schema name (format: "tenant_<uuid>") -tenant_id = extract_tenant_id_from_schema(tenant_schema) -actor = SystemActor.for_tenant(tenant_id, :onboarding_writer) +# Tenant id may not be derivable here; use platform actor for initial lookup +actor = SystemActor.platform(:onboarding_writer) case Ash.get(OnboardingPackage, event.package_id, tenant: tenant_schema, actor: actor) do {:ok, package} -> {:ok, package} {:error, reason} -> {:error, reason} end ``` - [ ] **Apply / Chat**  <details><summary>Suggestion importance[1-10]: 8</summary> __ Why: The suggestion correctly identifies a potential crash and proposes using a `platform` actor, which is the correct pattern for looking up a resource before its tenant context is known. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>Fix invalid checkout action version</summary> ___ **Correct the invalid <code>actions/checkout</code> version in the new CI workflow. Change <code>@v6</code> to the valid major version <code>@v4</code> to ensure the lint job can run.** [.github/workflows/serviceradar-core-lint.yml [44]](https://github.com/carverauto/serviceradar/pull/2272/files#diff-e5c0285bdbd2331395152999bb32e56f3ceda3dde07288ca50ac43d1abfa14b5R44-R44) ```diff -- uses: actions/checkout@v6 +- uses: actions/checkout@v4 ``` - [ ] **Apply / Chat**  <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: This is a correct and important fix for the new CI workflow, as using an invalid action version like `@v6` would cause the job to fail, defeating the purpose of adding the lint check. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>Limit cross-tenant reads to metadata</summary> ___ **When querying all tenants, explicitly select only the necessary fields (<code>:id</code>, <code>:slug</code>) to minimize data exposure and adhere to the principle of least privilege.** [elixir/serviceradar_core/lib/serviceradar/oban/tenant_queues.ex [372-376]](https://github.com/carverauto/serviceradar/pull/2272/files#diff-44a3d7b5eae3e35edf813f060b444c311f39a363b35f41b77258f45fdc3ae32bR372-R376) ```diff # Platform actor since we need to read all tenants actor = SystemActor.platform(:tenant_queues) # Query all tenants from the database and provision their queues -case Ash.read(ServiceRadar.Identity.Tenant, actor: actor) do +case ServiceRadar.Identity.Tenant + |> Ash.Query.select([:id, :slug]) + |> Ash.read(actor: actor) do ``` `[To ensure code accuracy, apply this suggestion manually]` <details><summary>Suggestion importance[1-10]: 6</summary> __ Why: This is a good security practice that follows the principle of least privilege by ensuring the query only fetches the data it actually needs, reducing the risk of exposing sensitive tenant information. </details></details></td><td align=center>Low </td></tr><tr><td rowspan=1>Security</td> <td> <details><summary>Scope bypass to system actors</summary> ___ **Refine the <code>bypass</code> clause to be more specific by moving the role check into the bypass condition and adding a check to ensure the system actor has a <code>tenant_id</code>.** [elixir/serviceradar_core/lib/serviceradar/monitoring/poll_job.ex [294-296]](https://github.com/carverauto/serviceradar/pull/2272/files#diff-e136cc8589304337f0da1f828071baed58c38dc25e82a40a611cd1e5c03fd88dR294-R296) ```diff -bypass always() do - authorize_if actor_attribute_equals(:role, :system) +bypass actor_attribute_equals(:role, :system) do + authorize_if expr(not is_nil(^actor(:tenant_id))) end ``` `[To ensure code accuracy, apply this suggestion manually]` <details><summary>Suggestion importance[1-10]: 8</summary> __ Why: This is a valuable security hardening suggestion that improves upon the new authorization pattern by making it more explicit and robust, reducing the risk of accidental privilege escalation. </details></details></td><td align=center>Medium </td></tr> <tr><td align="center" colspan="2"> - [ ] Update  </td><td></td></tr></tbody></table> ___ #### Previous suggestions <details><summary>Suggestions up to commit 5356ddb</summary> <table><thead><tr><td>Category</td><td align=left>Suggestion                                                                                                                                    </td><td align=center>Impact</td></tr><tbody><tr><td rowspan=1>High-level</td> <td> <details><summary>The PR is incomplete without policy updates</summary> ___ **The PR introduces actors with a new <code>:system</code> role but omits the necessary authorization policy updates to recognize this role. This will cause background jobs using these new actors to be blocked and fail.** ### Examples: <details> <summary> <a href="https://github.com/carverauto/serviceradar/pull/2272/files#diff-f3ebf5072e8a72e025076ca013e6726fb23a0169e9e8c9b0089489a4cf17a73dR84-R91">elixir/serviceradar_core/lib/serviceradar/actors/system_actor.ex [84-91]</a> </summary> ```elixir def for_tenant(tenant_id, component) when is_binary(tenant_id) and is_atom(component) do %{ id: "system:#{component}", email: "#{component_to_email(component)}@system.serviceradar", role: :system, tenant_id: tenant_id } end ``` </details> <details> <summary> <a href="https://github.com/carverauto/serviceradar/pull/2272/files#diff-51e82fcb12ce016dad558a5dcc0e17c2c58454af416f01a7ddb6668d9e5c6670R132-R136">elixir/serviceradar_core/lib/serviceradar/infrastructure/health_tracker.ex [132-136]</a> </summary> ```elixir actor = SystemActor.for_tenant(tenant_id, :health_tracker) HealthEvent |> Ash.Changeset.for_create(:record, attrs, tenant: schema) |> Ash.create(actor: actor) ``` </details> ### Solution Walkthrough: #### Before: ```elixir # elixir/serviceradar_core/lib/serviceradar/infrastructure/health_tracker.ex def record(tenant_id, attrs) do # ... schema -> actor = SystemActor.for_tenant(tenant_id, :health_tracker) HealthEvent |> Ash.Changeset.for_create(:record, attrs, tenant: schema) # This call will fail because authorization policies # do not yet recognize `role: :system`. |> Ash.create(actor: actor) end ``` #### After: ```elixir # In a hypothetical Ash resource policy file (not in PR) authorizers do # ... authorize_if fn actor, _ -> case actor do # Add a clause to allow the new system role %{role: :system} -> true # ... other roles _ -> false end end end # elixir/serviceradar_core/lib/serviceradar/infrastructure/health_tracker.ex # (No change needed here after policies are updated) def record(tenant_id, attrs) do # ... # This call will now succeed. |> Ash.create(actor: actor) end ``` <details><summary>Suggestion importance[1-10]: 9</summary> __ Why: This suggestion correctly identifies a critical flaw; the PR replaces `authorize?: false` with actors having a new `:system` role, but without updating authorization policies to recognize this role, the refactored operations will fail. </details></details></td><td align=center>High </td></tr><tr><td rowspan=2>General</td> <td> <details><summary>Use tenant schema in queries</summary> ___ **In <code>load_from_database/4</code>, replace the use of a raw <code>tenant_id</code> in the <code>Ash.read</code> call with a tenant schema. Fetch the schema using <code>TenantSchemas.schema_for_tenant/1</code> and handle the case where it might not be found.** [elixir/serviceradar_core/lib/serviceradar/agent_config/config_server.ex [169-185]](https://github.com/carverauto/serviceradar/pull/2272/files#diff-a4d8447584bfcd1088465714c00bea67c90100320b125857ac5bd6a9783de468R169-R185) ```diff defp load_from_database(tenant_id, config_type, partition, agent_id) do + tenant_schema = TenantSchemas.schema_for_tenant(tenant_id) actor = SystemActor.for_tenant(tenant_id, :config_server) - case Ash.read( - ConfigInstance, - action: :for_agent, - tenant: tenant_id, - actor: actor, - args: %{ - config_type: config_type, - partition: partition, - agent_id: agent_id - } - ) do - {:ok, [instance | _]} -> {:ok, instance} + case tenant_schema do + nil -> + {:error, :tenant_schema_not_found} + schema -> + case Ash.read( + ConfigInstance, + action: :for_agent, + tenant: schema, + actor: actor, + args: %{ + config_type: config_type, + partition: partition, + agent_id: agent_id + } + ) do + {:ok, [instance | _]} -> {:ok, instance} ``` <details><summary>Suggestion importance[1-10]: 8</summary> __ Why: The suggestion correctly identifies that an `Ash.read` call is using a raw `tenant_id` instead of a tenant schema, which is inconsistent with the pattern used elsewhere in the PR and can cause issues with Ash's multi-tenancy. Adopting the standard pattern of resolving the schema first is a critical fix for correctness. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>Include tenant in actor id</summary> ___ **Modify the <code>for_tenant/2</code> function to include the <code>tenant_id</code> in the generated actor <code>id</code>. This ensures actor IDs are unique across all tenants, which improves traceability in audit logs.** [elixir/serviceradar_core/lib/serviceradar/actors/system_actor.ex [84-91]](https://github.com/carverauto/serviceradar/pull/2272/files#diff-f3ebf5072e8a72e025076ca013e6726fb23a0169e9e8c9b0089489a4cf17a73dR84-R91) ```diff def for_tenant(tenant_id, component) when is_binary(tenant_id) and is_atom(component) do %{ - id: "system:#{component}", + id: "system:#{tenant_id}:#{component}", email: "#{component_to_email(component)}@system.serviceradar", role: :system, tenant_id: tenant_id } end ```  <details><summary>Suggestion importance[1-10]: 6</summary> __ Why: The suggestion correctly points out that actor `id`s for tenant-scoped actors are not unique across tenants, which could complicate auditing. Including the `tenant_id` in the `id` is a good practice for ensuring global uniqueness and improving traceability. </details></details></td><td align=center>Low </td></tr><tr><td rowspan=1>Possible issue</td> <td> <details><summary>Make system actor check more specific</summary> ___ **Refine the <code>system_actor?</code> function to be more specific for <code>:system</code> role actors. Add a check to ensure the actor's <code>id</code> starts with the "system:" prefix, matching the stricter validation used for <code>:super_admin</code> actors.** [elixir/serviceradar_core/lib/serviceradar/actors/system_actor.ex [157-159]](https://github.com/carverauto/serviceradar/pull/2272/files#diff-f3ebf5072e8a72e025076ca013e6726fb23a0169e9e8c9b0089489a4cf17a73dR157-R159) ```diff -def system_actor?(%{role: :system}), do: true +def system_actor?(%{role: :system, id: "system:" <> _}), do: true def system_actor?(%{role: :super_admin, id: "platform:" <> _}), do: true def system_actor?(_), do: false ```  <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: This suggestion correctly identifies that the `system_actor?` check for the `:system` role is too permissive and could lead to incorrect actor identification. Aligning it with the actor creation logic in `for_tenant/2` by checking the `id` prefix significantly improves the robustness of this new security feature. </details></details></td><td align=center>Medium </td></tr> <tr><td align="center" colspan="2">  </td><td></td></tr></tbody></table> </details>