merging testing into staging #2649
Imported from GitHub pull request.
Original GitHub pull request: #2246
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/2246
Original created: 2026-01-11T18:01:29Z
Original updated: 2026-01-11T18:04:33Z
Original head: carverauto/serviceradar:testing
Original base: staging
Original merged: 2026-01-11T18:01:51Z by @mfreeman451
User description
IMPORTANT: Please sign the Developer Certificate of Origin
Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include
a DCO sign-off statement indicating the DCO acceptance in one commit message. Here
is an example DCO Signed-off-by line in a commit message:
Describe your changes
Issue ticket number and link
Code checklist before requesting a review
PR Type
Enhancement, Tests
Description
Comprehensive multi-tenant infrastructure implementation with 30+ database tables for user management, agents, devices, gateways, and monitoring systems
New stateful alert engine with bucketed event aggregation supporting time-windowed rule evaluation and alert lifecycle management
Gateway process implementation for distributed check execution with load balancing and execution metrics tracking
Enhanced LiveView components for infrastructure monitoring, cluster status, agent details, and integration source management
Job scheduler UI redesigned with search, filtering, pagination, and support for multiple job sources (Cron and AshOban triggers)
Edge package creation refactored with Ash forms and automatic certificate generation
Comprehensive test coverage for TenantRegistry multi-tenant isolation and infrastructure components
Migration from legacy poller-based architecture to gateway-based distributed execution model
Significant codebase refactoring from Go to Elixir with removal of legacy Go packages and old user/device management modules
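As a rough illustration of the bucketed, time-windowed evaluation mentioned for the stateful alert engine, the sketch below counts events in fixed-size buckets and compares the windowed total against a threshold. It is not the engine's actual code (which this page does not show); the module name, function names, and the use of integer-second timestamps are all assumptions.

```elixir
defmodule BucketWindowSketch do
  @moduledoc false

  # Align a timestamp (integer seconds) to the start of its bucket.
  def bucket_start(ts, bucket_seconds) when bucket_seconds > 0 do
    div(ts, bucket_seconds) * bucket_seconds
  end

  # Increment the counter of the bucket that contains `ts`.
  def record(buckets, ts, bucket_seconds) do
    Map.update(buckets, bucket_start(ts, bucket_seconds), 1, &(&1 + 1))
  end

  # Sum the counts of buckets whose start falls in (now - window, now].
  def window_count(buckets, now, window_seconds) do
    cutoff = now - window_seconds

    buckets
    |> Enum.filter(fn {start, _count} -> start > cutoff and start <= now end)
    |> Enum.map(fn {_start, count} -> count end)
    |> Enum.sum()
  end

  # A rule "fires" when the windowed count reaches the threshold.
  def firing?(buckets, now, window_seconds, threshold) do
    window_count(buckets, now, window_seconds) >= threshold
  end
end
```

Keeping per-bucket counters rather than raw events bounds memory per rule and group, at the cost of window edges that are only accurate to one bucket width.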
Diagram Walkthrough
File Walkthrough
1 files
20260107043446_initial_schema.exs
Initial tenant schema migration with core infrastructure tables
elixir/serviceradar_core/priv/repo/tenant_migrations/20260107043446_initial_schema.exs
infrastructure
monitoring systems
gateways, partitions)
audit tables
1 files
index.ex
Integration sources management LiveView with full CRUD UI
web-ng/lib/serviceradar_web_ng_web/live/admin/integration_live/index.ex
Syslog, Nmap, Custom)
view details
and credential handling
agent/partition assignment
2 files
tenant_registry_test.exs
TenantRegistry unit tests with multi-tenant isolation coverage
elixir/serviceradar_core/test/serviceradar/cluster/tenant_registry_test.exs
TenantRegistry module with 15+ test cases
and lookup
cleanup
test_helper.exs
ExUnit test framework initialization
web-ng/serviceradar/test/test_helper.exs
8 files
index.ex
New cluster status monitoring LiveView for settings
web-ng/lib/serviceradar_web_ng_web/live/settings/cluster_live/index.ex
area
Oban job queue status
refresh scheduling
tables, and recent events log
index.ex
Refactor edge package creation with Ash forms and certificates
web-ng/lib/serviceradar_web_ng_web/live/admin/edge_package_live/index.ex
create_with_tenant_cert function
"sync" type option
improved UX with loading states
SettingsComponents instead of AdminComponents and refactored layout structure
index.ex
New infrastructure monitoring LiveView with cluster visibility
web-ng/lib/serviceradar_web_ng_web/live/infrastructure_live/index.ex
infrastructure
with platform admin visibility controls
real-time updates
fallback
metrics
20260110054954_add_stateful_alert_rules.exs
Add stateful alert rules database schema migration
elixir/serviceradar_core/priv/repo/tenant_migrations/20260110054954_add_stateful_alert_rules.exs
stateful_alert_rules table with rule configuration (name, signal, thresholds, windows, cooldown)
stateful_alert_rule_states table for tracking per-group rule state and firing history
tenant_id+rule_id+group_key for states
last_notification tracking
show.ex
Agent detail view enhanced with live registry and system metrics
web-ng/lib/serviceradar_web_ng_web/live/agent_live/show.ex
and database records with rich system information
(memory, processes, schedulers), and registration timeline
indicators
ServiceRadar.Infrastructure.Agent module
gateway node info, registration timeline, and service checks
index.ex
Job scheduler UI redesigned with search, filtering, and pagination
web-ng/lib/serviceradar_web_ng_web/live/admin/job_live/index.ex
searchable, sortable table with pagination
triggers) with filtering and search capabilities
manual trigger capability for jobs
jobs, and accessing Oban Web
navigation to detail pages
stateful_alert_engine.ex
New stateful alert engine with bucketed event aggregation
elixir/serviceradar_core/lib/serviceradar/observability/stateful_alert_engine.ex
for log and event rules
sizes and cooldown periods
history tracking and snapshot persistence
grouping, severity filtering, and attribute matching
persistence via ETS and database snapshots
gateway_process.ex
New gateway process for distributed check execution
elixir/serviceradar_core/lib/serviceradar/edge/gateway_process.ex
for check execution and result aggregation
checks and result retrieval
partition-based selection strategies
execution time) and maintains registry heartbeats
processes for distributed check execution
101 files
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2246#issuecomment-3735211678
Original created: 2026-01-11T18:03:18Z
PR Compliance Guide 🔍
Below is a summary of compliance checks for this PR:
Authorization bypass
Description: load_tenant/1 calls Ash.get(Tenant, tenant_id, authorize?: false), which can bypass authorization checks and may allow cross-tenant data access if an attacker can influence tenant_id (directly or indirectly).
index.ex [994-998]
Referred Code
Denial of service
Description: Converting user-controlled component_type with String.to_existing_atom/1 can raise on unexpected values and crash the LiveView process, creating a realistic denial-of-service vector via repeated invalid submissions.
index.ex [952-961]
Referred Code
🎫 No ticket provided
Codebase context is not defined
Follow the guide to enable codebase context checks.
Generic: Meaningful Naming and Self-Documenting Code
Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting
Status: Passed
Generic: Robust Error Handling and Edge Case Management
Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation
Status:
Unsafe atom conversion: User-supplied component_type is converted using String.to_existing_atom/1 without a whitelist, which can raise and crash the LiveView for unexpected inputs.
Referred Code
Generic: Secure Error Handling
Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.
Status:
Detailed error exposed: The UI displays internal error details to the end-user via put_flash(:error, "Failed to create package: #{error_msg}"), which may leak implementation/system information.
Referred Code
Compliance status legend
🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2246#issuecomment-3735215686
Original created: 2026-01-11T18:04:33Z
PR Code Suggestions ✨
Explore these optional code suggestions:
Whitelist sortable fields
In handle_event("sort", ...), replace String.to_existing_atom/1 with a whitelist of allowed sort fields to prevent potential crashes from unsafe atom conversion.
web-ng/lib/serviceradar_web_ng_web/live/admin/job_live/index.ex [83-102]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 9
Why: The suggestion addresses a potential denial-of-service vulnerability by replacing String.to_existing_atom/1 with a secure whitelisting approach for user-provided sort fields.
Normalize data to use consistent keys
In source_queries_to_form/1, normalize map keys to strings to handle inconsistent data structures and improve code robustness, instead of using fallbacks for both atom and string keys.
web-ng/lib/serviceradar_web_ng_web/live/admin/integration_live/index.ex [136-145]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 6
Why: The suggestion correctly points out that handling both atom and string keys can mask data consistency issues. Normalizing keys to a single type improves code clarity and maintainability, making it a valuable refactoring.
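A minimal sketch of the key-normalization idea follows. The real body of source_queries_to_form/1 is not shown on this page, so the module name and the "label" and "query" field names are hypothetical placeholders.

```elixir
defmodule KeyNormalizeSketch do
  # Convert every top-level key to a string so later lookups need one form only.
  def stringify_keys(map) when is_map(map) do
    Map.new(map, fn {key, value} -> {to_string(key), value} end)
  end

  # Hypothetical form builder: "label" and "query" are assumed field names.
  def queries_to_form(queries) when is_list(queries) do
    Enum.map(queries, fn query ->
      normalized = stringify_keys(query)

      %{
        "label" => Map.get(normalized, "label", ""),
        "query" => Map.get(normalized, "query", "")
      }
    end)
  end
end
```

Normalizing once at the boundary means the rest of the function can read string keys without an atom-key fallback on each access.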
Avoid broad exception swallowing for robustness
In get_leader_node/0, replace the broad try/rescue with more specific pattern matching on the return value of :rpc.call to handle and log expected errors gracefully.
web-ng/lib/serviceradar_web_ng_web/live/admin/job_live/index.ex [655-670]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 6
Why: The suggestion correctly identifies the anti-pattern of swallowing all exceptions and proposes a more robust error handling strategy, which improves maintainability and debugging.
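One way this could look, as a sketch only: :rpc.call/5 reports failures as a {:badrpc, reason} tuple rather than raising, so a case expression can separate expected failures from valid replies. The module name, the extra arguments, and the returned error tuples are assumptions; the project's get_leader_node/0 takes no arguments and its body is not shown here.

```elixir
defmodule LeaderLookupSketch do
  @rpc_timeout 5_000

  # `mod`, `fun` and `args` name the remote function that reports the leader.
  # They are placeholders, not the project's actual API.
  def get_leader_node(node, mod, fun, args \\ []) do
    case :rpc.call(node, mod, fun, args, @rpc_timeout) do
      # Node down, timeout, or a crash in the remote function.
      {:badrpc, reason} ->
        {:error, {:rpc_failed, reason}}

      # A node name is an atom; nil means no leader was reported.
      leader when is_atom(leader) and not is_nil(leader) ->
        {:ok, leader}

      # Anything else is surfaced instead of being silently swallowed.
      other ->
        {:error, {:unexpected_reply, other}}
    end
  end
end
```

The caller can then log the {:error, reason} case explicitly, which keeps unexpected failures visible.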
Remove redundant unique indexes on primary keys
Remove redundant unique indexes on primary key columns for tables like ocsf_devices, ocsf_agents, and others, as PostgreSQL automatically creates them.
elixir/serviceradar_core/priv/repo/tenant_migrations/20260107043446_initial_schema.exs [797]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 5
Why: The suggestion correctly identifies that creating a unique index on a primary key column is redundant. Applying this change improves schema clarity and removes unnecessary database objects, which is a good practice.
Avoid using exceptions for flow control
In call/2, replace the try/catch block with GenServer.whereis/1 to check if the process is alive before making the GenServer.call, avoiding exceptions for flow control.
elixir/serviceradar_core/lib/serviceradar/observability/stateful_alert_engine.ex [96-101]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 5
Why: The suggestion replaces non-idiomatic exception-based flow control with a standard check using GenServer.whereis/1, improving code clarity and adherence to Elixir best practices.
Add prefix to indexes
Add prefix: prefix() to all create index and create unique_index calls to ensure they are created within the correct tenant-specific schema.
elixir/serviceradar_core/priv/repo/tenant_migrations/20260107043446_initial_schema.exs [60]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 9
Why: This suggestion is critical for the correctness of the multi-tenant architecture. Without prefix: prefix(), indexes would be created in the public schema, leading to incorrect behavior and potential data integrity issues.
Use safe parsing for user input
Replace String.to_integer/1 with the safer Integer.parse/1 in the remove_query event handler to prevent crashes from invalid user input.
web-ng/lib/serviceradar_web_ng_web/live/admin/integration_live/index.ex [224-230]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 8
Why: The suggestion correctly identifies a potential crash in the LiveView process due to unsafe string-to-integer conversion of user-provided input. Using Integer.parse/1 for safe parsing is a critical improvement for robustness and error handling.
Enable pgcrypto extension
Add execute("CREATE EXTENSION IF NOT EXISTS pgcrypto;") at the beginning of the up function to ensure the pgcrypto extension is available for gen_random_uuid().
elixir/serviceradar_core/priv/repo/tenant_migrations/20260107043446_initial_schema.exs [10]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 7
Why: The suggestion correctly identifies that the pgcrypto extension is a prerequisite for using gen_random_uuid() on PostgreSQL versions before 13 (the function is built in from 13 onward). Adding this command ensures the migration is self-contained and will not fail in a fresh database environment.
Use asynchronous RPC calls for performance
Refactor fetch_node_info/1 to use rpc.async_call/4 and rpc.yield_many/2 for concurrent, non-blocking RPC calls to improve performance.
web-ng/lib/serviceradar_web_ng_web/live/agent_live/show.ex [164-189]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 7
Why: The suggestion correctly identifies a performance bottleneck from sequential RPC calls and proposes a valid asynchronous alternative, which improves UI responsiveness.
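A hedged sketch of running the RPC calls concurrently is shown below. Erlang's :rpc module pairs async_call/4 with yield/1 and nb_yield/2; yield_many/2 is a Task function. This sketch therefore wraps :rpc.call/5 in Task.async/1 and collects replies with Task.yield_many/2, which is a substitute approach and not necessarily what the reviewer's hidden snippet does. The module name, the shape of `calls`, and the timeout values are assumptions.

```elixir
defmodule NodeInfoSketch do
  @rpc_timeout 2_000

  # `calls` is a keyword list of {key, {module, function, args}} entries.
  # All calls are started first, then awaited together.
  def fetch(node, calls) do
    keyed_tasks =
      for {key, {mod, fun, args}} <- calls do
        {key, Task.async(fn -> :rpc.call(node, mod, fun, args, @rpc_timeout) end)}
      end

    tasks = Enum.map(keyed_tasks, fn {_key, task} -> task end)

    # Results come back in the same order as the tasks that were passed in.
    results = Task.yield_many(tasks, @rpc_timeout + 500)

    keyed_tasks
    |> Enum.zip(results)
    |> Map.new(fn {{key, _task}, {task, outcome}} ->
      # A nil outcome means the task is still running; stop it.
      {key, normalize(outcome || Task.shutdown(task, :brutal_kill))}
    end)
  end

  defp normalize({:ok, {:badrpc, reason}}), do: {:error, reason}
  defp normalize({:ok, value}), do: {:ok, value}
  defp normalize({:exit, reason}), do: {:error, reason}
  defp normalize(nil), do: {:error, :timeout}
end
```

With this shape the total wait is bounded by the slowest call rather than the sum of all calls, and a failed call shows up as {:error, reason} under its own key.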
Prevent atom exhaustion security vulnerability
To prevent a potential atom exhaustion vulnerability, validate the component_type against a whitelist of allowed values before converting the string to an atom.
web-ng/lib/serviceradar_web_ng_web/live/admin/edge_package_live/index.ex [948-961]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 9
Why: The suggestion correctly identifies a potential atom exhaustion DoS vulnerability and provides a robust fix by whitelisting user input before atom conversion.
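The whitelist approach could be sketched as follows. String.to_existing_atom/1 raises ArgumentError for a string with no matching atom, and String.to_atom/1 creates new atoms, so a lookup table avoids both problems. The module name and the allowed values here are hypothetical; the real set of component types is not listed on this page.

```elixir
defmodule ComponentTypeSketch do
  # Hypothetical whitelist: the actual component types may differ.
  @allowed %{"agent" => :agent, "gateway" => :gateway, "checker" => :checker}

  # Map a user-supplied string to a known atom without converting the string itself.
  def parse(value) when is_binary(value) do
    case Map.fetch(@allowed, value) do
      {:ok, atom} -> {:ok, atom}
      :error -> {:error, :invalid_component_type}
    end
  end

  # Non-string input (nil, numbers, maps) is rejected the same way.
  def parse(_other), do: {:error, :invalid_component_type}
end
```

An event handler can then match on {:ok, type} and show a generic validation error for anything else, so malformed input never raises inside the LiveView process.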
Guard tenant when listing packages
To prevent unintended data access, ensure that OnboardingPackages.list is only called with a tenant when one is present; otherwise, return an empty list.
web-ng/lib/serviceradar_web_ng_web/live/admin/edge_package_live/index.ex [20-26]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 7
Why: The suggestion correctly points out a potential issue where a nil tenant might bypass tenancy scoping, and proposes a safer default of returning an empty list.
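The guard can be expressed with a function clause that matches nil first. In this sketch `list_fun` stands in for a tenant-scoped listing function such as OnboardingPackages.list, whose real signature is not shown on this page; the module and function names are assumptions.

```elixir
defmodule PackageListSketch do
  # No tenant in scope: return nothing rather than running an unscoped query.
  def list_for_tenant(nil, _list_fun), do: []

  # Tenant present: delegate to the tenant-scoped listing function.
  def list_for_tenant(tenant, list_fun) when is_function(list_fun, 1) do
    list_fun.(tenant)
  end
end
```

Putting the nil clause first makes the safe default explicit and keeps the listing function from ever being called without a tenant.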