sync #2651
No reviewers
Labels
No labels
1week
2weeks
Failed compliance check
IP cameras
NATS
Possible security concern
Review effort 1/5
Review effort 2/5
Review effort 3/5
Review effort 4/5
Review effort 5/5
UI
aardvark
accessibility
amd64
api
arm64
auth
back-end
bgp
blog
bug
build
checkers
ci-cd
cleanup
cnpg
codex
core
dependencies
device-management
documentation
duplicate
dusk
ebpf
enhancement
eta 1d
eta 1hr
eta 3d
eta 3hr
feature
fieldsurvey
github_actions
go
good first issue
help wanted
invalid
javascript
k8s
log-collector
mapper
mtr
needs-triage
netflow
network-sweep
observability
oracle
otel
plug-in
proton
python
question
reddit
redhat
research
rperf
rperf-checker
rust
sdk
security
serviceradar-agent
serviceradar-agent-gateway
serviceradar-web
serviceradar-web-ng
siem
snmp
sysmon
topology
ubiquiti
wasm
wontfix
zen-engine
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
carverauto/serviceradar!2651
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "refs/pull/2651/head"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Imported from GitHub pull request.
Original GitHub pull request: #2250
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/2250
Original created: 2026-01-11T18:41:39Z
Original updated: 2026-01-11T18:45:39Z
Original head: carverauto/serviceradar:staging
Original base: fix/discovery_sources_empty
Original merged: 2026-01-11T18:45:38Z by @mfreeman451
User description
IMPORTANT: Please sign the Developer Certificate of Origin
Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include
a DCO sign-off statement indicating the DCO acceptance in one commit message. Here
is an example DCO Signed-off-by line in a commit message:
Describe your changes
Issue ticket number and link
Code checklist before requesting a review
PR Type
Enhancement, Tests
Description
Major infrastructure overhaul: Comprehensive multi-tenant database schema with 30+ tables for user management, agents, devices, gateways, and monitoring systems
Stateful alert engine: New GenServer implementing bucketed time-window alert evaluation with rule state persistence and lifecycle management (firing, recovery, cooldown, renotification)
Gateway process infrastructure: New GenServer for agent coordination, check dispatch, and health monitoring with metrics tracking and load balancing
Enhanced LiveView modules: New cluster status monitoring, infrastructure monitoring with role-based access, integration sources management, and improved agent/gateway/job management UIs
Form modernization: Migrated edge package creation from Ecto changesets to
AshPhoenix.Formwith automatic TLS certificate generationJob scheduler redesign: Unified job catalog view supporting both Cron and AshOban jobs with search, filtering, and triggering capabilities
Live agent registry integration: Dynamic agent lookups via Horde with real-time status tracking and node system metrics retrieval
Comprehensive test coverage: New test suites for TenantRegistry multi-tenant infrastructure validation
Legacy code removal: Removed Go-based poller implementation and legacy Elixir user/device models in favor of new Ash-based architecture
Diagram Walkthrough
File Walkthrough
1 files
20260107043446_initial_schema.exs
Initial tenant schema migration with core infrastructure tableselixir/serviceradar_core/priv/repo/tenant_migrations/20260107043446_initial_schema.exs
infrastructure
monitoring systems
gateways, partitions)
certificates, keys)
1 files
index.ex
Integration sources management LiveView with modal formsweb-ng/lib/serviceradar_web_ng_web/live/admin/integration_live/index.ex
Nmap, Custom)
view operations
features
2 files
tenant_registry_test.exs
TenantRegistry unit tests for multi-tenant infrastructureelixir/serviceradar_core/test/serviceradar/cluster/tenant_registry_test.exs
TenantRegistrymodule with 15+ test casesand lifecycle management
cleanup
test_helper.exs
ExUnit test framework initializationweb-ng/serviceradar/test/test_helper.exs
7 files
index.ex
New cluster status monitoring LiveView for settingsweb-ng/lib/serviceradar_web_ng_web/live/settings/cluster_live/index.ex
area
Oban job queue status
refresh scheduling
tables, and recent events log
index.ex
Refactor edge package creation with Ash forms and auto-cert generationweb-ng/lib/serviceradar_web_ng_web/live/admin/edge_package_live/index.ex
AshPhoenix.Formfor form handlingcreate_with_tenant_certfunction
commands, and advanced options
checker
index.ex
New infrastructure monitoring LiveView with role-based accessweb-ng/lib/serviceradar_web_ng_web/live/infrastructure_live/index.ex
agent gateways
regular users see only agents)
cluster nodes
connected agents sections
show.ex
Live agent registry integration with node system metricsweb-ng/lib/serviceradar_web_ng_web/live/agent_live/show.ex
ServiceRadar.Infrastructure.Agentmodulereal-time status tracking
processes, schedulers, uptime) via RPC calls
status indicators
comprehensive node metrics visualization
index.ex
Job scheduler UI redesign with catalog integrationweb-ng/lib/serviceradar_web_ng_web/live/admin/job_live/index.ex
supporting both Cron and AshOban jobs
capabilities with configurable auto-refresh
admin users
navigation to detail views
permissions and tenant type
stateful_alert_engine.ex
Stateful alert engine with bucketed time-window evaluationelixir/serviceradar_core/lib/serviceradar/observability/stateful_alert_engine.ex
and events with time-windowed thresholds
names, and custom predicates
and renotification scheduling
cache for performance
and notification dispatch
gateway_process.ex
Gateway process for agent coordination and check dispatchelixir/serviceradar_core/lib/serviceradar/edge/gateway_process.ex
check execution and agent coordination
with load balancing via random selection
monitoring services
times) and registry heartbeats
result broadcasting
1 files
20260110054954_add_stateful_alert_rules.exs
Database migration for stateful alert rules systemelixir/serviceradar_core/priv/repo/tenant_migrations/20260110054954_add_stateful_alert_rules.exs
stateful_alert_rulestable for storing alert ruleconfigurations with thresholds and windows
stateful_alert_rule_statestable for tracking per-group rulestate and firing history
tenant_id+rule_id+group_key for states
101 files
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2250#issuecomment-3735392523
Original created: 2026-01-11T18:43:30Z
PR Compliance Guide 🔍
Below is a summary of compliance checks for this PR:
Authorization bypass
Description: The new
load_tenant/1performsAsh.get(Tenant, tenant_id, authorize?: false), which canbypass authorization checks and may enable tenant enumeration or cross-tenant data access
if an attacker can influence
tenant_id(e.g., via a forged/compromisedcurrent_scope).index.ex [981-999]
Referred Code
DoS via atom conversion
Description:
build_create_form/2converts user-controlled"component_type"viaString.to_existing_atom/1, which is safe from atom exhaustion but can raise on unexpectedvalues and potentially be abused to crash the LiveView process (DoS) if invalid
component_typevalues reach this path.index.ex [947-977]
Referred Code
🎫 No ticket provided
Codebase context is not defined
Follow the guide to enable codebase context checks.
Generic: Meaningful Naming and Self-Documenting Code
Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting
Status: Passed
Generic: Robust Error Handling and Edge Case Management
Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation
Status:
Crashable atom cast: The code converts user-controlled
component_typewithString.to_existing_atom/1without awhitelist/guard which can raise and crash the LiveView instead of handling invalid input
gracefully.
Referred Code
Generic: Security-First Input Validation and Data Handling
Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities
Status:
Missing input hardening: The
component_typeparameter is not strictly validated before being used (includingconversion to an atom), allowing crafted input to trigger exceptions and potentially
impact availability.
Referred Code
Generic: Comprehensive Audit Trails
Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.
Status:
Audit coverage unclear: The PR adds create/revoke/delete flows for onboarding packages but the diff does not show
whether these critical actions are consistently recorded in an immutable audit trail with
actor, timestamp, action, and outcome.
Referred Code
Generic: Secure Error Handling
Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.
Status:
User error details: The UI flashes
"Failed to create package: #{error_msg}"whereerror_msgmayinclude internal Ash/DB details depending on error content, and the diff does not show
sanitization guarantees.
Referred Code
Generic: Secure Logging Practices
Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.
Status:
Logging not shown: The diff does not provide enough visibility into application logging to verify structured
logging and confirm no sensitive data (tokens/certs/PII) is logged during the new
onboarding flows.
Referred Code
Compliance status legend
🟢 - Fully Compliant🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2250#issuecomment-3735398070
Original created: 2026-01-11T18:45:11Z
PR Code Suggestions ✨
Explore these optional code suggestions:
Add tenant to edit form
In
build_edit_form/2, provide thetenantschema toAshPhoenix.Form.for_updatetoensure the form operates within the correct tenant context.
web-ng/lib/serviceradar_web_ng_web/live/admin/integration_live/index.ex [1336-1341]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 9
__
Why: This suggestion fixes a critical bug where the edit form was created without a tenant context, which would cause updates to fail in a multi-tenant environment.
Log RPC failures for cluster debugging
Add a case clause to handle and log
{:badrpc, reason}errors from:rpc.calltoavoid silently ignoring RPC failures.
web-ng/lib/serviceradar_web_ng_web/live/infrastructure_live/index.ex [736-741]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 7
__
Why: The suggestion correctly points out that RPC failures are being silently ignored. Adding logging for
{:badrpc, reason}significantly improves the observability and debuggability of the distributed system.Remove redundant unique indexes on primary keys
Remove the redundant unique indexes on the
uidcolumns for theocsf_agentsandocsf_devicestables, as these columns are already primary keys.elixir/serviceradar_core/priv/repo/tenant_migrations/20260107043446_initial_schema.exs [591-797]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 6
__
Why: The suggestion correctly identifies redundant unique indexes on columns already defined as primary keys, which improves database performance and simplifies the migration script.
Remove raw JSON params
In
parse_json_field/3, remove the original raw JSON string parameter from theparamsmap after it has been successfully decoded or if decoding fails.web-ng/lib/serviceradar_web_ng_web/live/admin/integration_live/index.ex [1368-1379]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 4
__
Why: The suggestion improves code hygiene by removing the original JSON string parameter after it has been parsed, preventing it from being unintentionally passed to Ash actions.
Include tenant in update calls
In the
toggle_enabledevent handler, pass thetenantschema to theAsh.updatecall to ensure the operation is performed within the correct tenant context.
web-ng/lib/serviceradar_web_ng_web/live/admin/integration_live/index.ex [350]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 9
__
Why: This suggestion fixes a critical bug where
Ash.updatewas called without a tenant context, which would cause it to fail in a multi-tenant environment.Safely convert component type to atom
Replace the unsafe
String.to_existing_atom/1call with a pattern that validatesthe input against a list of known component types before converting it to an
atom.
web-ng/lib/serviceradar_web_ng_web/live/admin/edge_package_live/index.ex [954-961]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 9
__
Why: This is a critical suggestion that prevents a potential server crash. Using
String.to_existing_atom/1with user-provided data is unsafe; the proposed change correctly validates the input against a safelist before conversion.Consolidate multiple RPC calls into one
Refactor
fetch_node_info/1to use a single RPC call instead of multiplesequential calls. This will prevent the LiveView process from blocking for a
long time if the remote node is unresponsive.
web-ng/lib/serviceradar_web_ng_web/live/agent_live/show.ex [164-189]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 8
__
Why: The suggestion correctly identifies a significant performance and reliability issue where multiple sequential RPC calls can block the LiveView process for an extended period, leading to timeouts. Consolidating these into a single RPC call is a critical improvement for the robustness of this feature.
Remove redundant column removal before dropping tables
In the
down/0function, remove the redundantalter table ... removeblocks fortables that are subsequently dropped, as dropping a table automatically removes
all its columns.
elixir/serviceradar_core/priv/repo/tenant_migrations/20260107043446_initial_schema.exs [1086-1125]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 6
__
Why: The suggestion correctly identifies that removing columns individually before dropping the entire table is redundant. Applying this change simplifies the migration's
downfunction and improves clarity.Handle unexpected errors during package creation
Wrap the package creation logic in a
try/rescueblock to ensure the:creatingflag is reset if an unexpected error occurs.
web-ng/lib/serviceradar_web_ng_web/live/admin/edge_package_live/index.ex [121-126]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 6
__
Why: The suggestion correctly identifies that an unhandled exception during package creation would leave the UI in a permanent loading state, and proposes a
try/rescueblock to improve robustness and user experience.Prevent map injection by validating input
Prevent a map injection vulnerability in the
update_queryevent handler byvalidating the
fieldparameter against an allowlist of permitted fields beforeupdating the query map.
web-ng/lib/serviceradar_web_ng_web/live/admin/integration_live/index.ex [232-241]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 8
__
Why: The suggestion correctly identifies a map injection vulnerability by allowing a client-provided
fieldto update a map. Implementing an allowlist for thefieldparameter is a critical security fix.Remove authorization bypass for tenant loading
Remove the
authorize?: falseoption when callingAsh.getto ensure tenant dataaccess is properly authorized by the framework's policies.
web-ng/lib/serviceradar_web_ng_web/live/settings/cluster_live/index.ex [994-998]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 8
__
Why: This is a valid security recommendation to avoid bypassing the authorization system. Relying on
Ashpolicies by removingauthorize?: falseis a best practice that strengthens security.