Updates/log events work #2643
Imported from GitHub pull request.
Original GitHub pull request: #2239
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/2239
Original created: 2026-01-10T03:54:03Z
Original updated: 2026-01-10T05:31:28Z
Original head: carverauto/serviceradar:updates/log-events-work
Original base: testing
Original merged: 2026-01-10T05:31:26Z by @mfreeman451
PR Type
Enhancement, Refactoring
Description
- Major terminology migration: Comprehensive rename of `poller` to `gateway` throughout the codebase (100+ files affected)
- New gateway monitoring system: Implements comprehensive gateway health checks, offline/recovery detection, and event publishing to NATS with streaming status report handling
- Agent-gateway communication: Adds `PushLoop` implementation for periodic agent status pushes with enrollment flow and config polling
- NATS bootstrap CLI: New CLI command implementation for NATS operator/account generation with JWT and credentials file management
- NATS account service: New test suite with 474 lines covering tenant account creation, user credential generation, and JWT signing
- Database updates: Renamed `poller_id` columns to `gateway_id` across metrics and discovery layers
- API refactoring: Updated REST API routes from `/pollers/{id}` to `/gateways/{id}` with corresponding type and method renames
- Edge onboarding updates: Added `EdgeOnboardingComponentTypeSync` support and updated KV storage paths from `config/pollers/` to `config/gateways/`
- Deleted legacy components: Removed poller-specific files including `cmd/poller/main.go`, `cmd/sync/main.go`, and related Docker/Helm configurations
File Walkthrough
3 files
gateways.go
Gateway monitoring and status management implementation
pkg/core/gateways.go
implementation with 1549 lines of core functionality
event publishing to NATS
datasets
gateways, agents, and checkers
nats_bootstrap.go
Add NATS bootstrap CLI command implementation
pkg/cli/nats_bootstrap.go
of code
`nats-bootstrap`, `admin nats generate-bootstrap-token`, `admin nats status`, and `admin nats tenants` subcommands
and credentials file management
formatting (text/JSON)
push_loop.go
Add push loop implementation for agent-gateway communication
pkg/agent/push_loop.go
`PushLoop` struct for managing periodic agent status pushes to gateway
`Hello` and `GetConfig` RPC calls
from gateway
to `GatewayServiceStatus` proto messages
status reports
8 files
edge_onboarding.go
Poller to gateway terminology refactoring and sync support
pkg/core/edge_onboarding.go
`poller` terminology to `gateway` throughout the file (100+ occurrences)
reflect gateway naming convention
`EdgeOnboardingComponentTypeSync` component type in package creation and validation
`config/pollers/` to `config/gateways/` for consistency
cnpg_discovery.go
Database column and variable naming updates for gateway
pkg/db/cnpg_discovery.go
`poller_id` column references to `gateway_id` in SQL queries and function arguments
`pollerID` to `gatewayID` in discovery interface and topology event builders
operations
edge_onboarding_test.go
Rename poller to gateway in edge onboarding tests
pkg/core/edge_onboarding_test.go
`poller` to `gateway` terminology throughout test cases
`validPollerMetadataJSON()` → `validGatewayMetadataJSON()`
`ListEdgeOnboardingPollerIDs()` → `ListEdgeOnboardingGatewayIDs()`
`SetAllowedPollerCallback()` → `SetAllowedGatewayCallback()`
`PollerID` → `GatewayID`, `ComponentTypePoller` → `ComponentTypeGateway`
metrics.go
Rename poller to gateway in metrics processing
pkg/core/metrics.go
`pollerID` to `gatewayID` across all metric processing functions
`createSNMPMetric()`, `bufferMetrics()`, `processSNMPMetrics()`, `processRperfMetrics()`, `processSysmonMetrics()`, `processICMPMetrics()`, `processGRPCService()`, `processServicePayload()`
`PollerID` → `GatewayID` in metric structures
`gateway_id` instead of `poller_id` terminology
cnpg_metrics.go
Rename poller to gateway in database metrics layer
pkg/db/cnpg_metrics.go
`poller_id` to `gateway_id` in INSERT statements for all metric tables
`pollerID` to `gatewayID` in all metric insertion functions
`gateway_id` instead of `poller_id`
`gatewayID` instead of `pollerID`
server_test.go
Rename poller terminology to gateway throughout server tests
pkg/core/server_test.go
`poller` to `gateway` throughout test cases and mock setups
`MaxPollers` → `MaxGateways`, `PollerStatus` → `GatewayStatus`, `pollerStatusCache` → `gatewayStatusCache`
`TestReportStatus` → `TestPushStatus`, `TestUpdatePollerStatus` → `TestUpdateGatewayStatus`
`PollerStatusRequest` → `GatewayStatusRequest`, `ServiceStatus` → `GatewayServiceStatus`
server.go
Refactor API server to use gateway terminology instead of poller
pkg/core/api/server.go
`pollers` → `gateways`, `knownPollerSet` → `knownGatewaySet`, `dynamicPollers` → `dynamicGateways`
`getPollers` → `getGateways`, `getPoller` → `getGateway`, `UpdatePollerStatus` → `UpdateGatewayStatus`
`/pollers/{id}` to `/gateways/{id}` and related endpoints
`PollerNode` → `GatewayNode`, `PollerStatus` → `GatewayStatus`, `PollerHistoryPoint` → `GatewayHistoryPoint`
`models.RolePoller` → `models.RoleGateway`
`TotalPollers` → `TotalGateways`, `HealthyPollers` → `HealthyGateways`
mock_db.go
Update mock database methods to use gateway terminology
pkg/db/mock_db.go
`DeletePoller` → `DeleteGateway`, `GetPollerStatus` → `GetGatewayStatus`, `UpdatePollerStatus` → `UpdateGatewayStatus`
`PollerStatus` → `GatewayStatus`, `PollerHistoryPoint` → `GatewayHistoryPoint`
`GetPollerHistory` → `GetGatewayHistory`, `GetPollerServices` → `GetGatewayServices`, `ListPollers` → `ListGateways`
`ListPollerStatuses` → `ListGatewayStatuses`, `ListAgentsByPoller` → `ListAgentsByGateway`
`IsPollerOffline` → `IsGatewayOffline`
1 files
nats_account_service_test.go
NATS account service test suite implementation
pkg/datasvc/nats_account_service_test.go
of test coverage
account JWT signing
tests
assertions
1 files
main.go
Update API documentation description
cmd/core/main.go
1 files
prod.exs
Add Elixir production configuration file
elixir/serviceradar_core_elx/config/prod.exs
`info` for production environment
101 files
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2239#issuecomment-3731785021
Original created: 2026-01-10T03:55:18Z
PR Compliance Guide 🔍
Below is a summary of compliance checks for this PR:
Sensitive seed disclosure
Description: The CLI exposes highly sensitive NATS secrets (operator/system seeds) by printing them to stdout and including them in JSON output (e.g., `output["operator_seed"]`, `output["system_account_seed"]`, and the text output that prints `Operator Seed`/`System Account Seed`), which can leak via terminal history, logs, CI artifacts, or process capture.
nats_bootstrap.go [471-530]
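One possible mitigation, sketched here under assumptions rather than taken from the PR: write each seed to an owner-only file and print only its path plus a redacted prefix. The helper names `writeSecretFile` and `redact`, the file name, and the placeholder seed are all hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeSecretFile (hypothetical helper) stores a secret with owner-only
// permissions and refuses to overwrite an existing file.
func writeSecretFile(path, secret string) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o600)
	if err != nil {
		return fmt.Errorf("create %s: %w", path, err)
	}
	if _, err := f.WriteString(secret + "\n"); err != nil {
		f.Close()
		return fmt.Errorf("write %s: %w", path, err)
	}
	return f.Close()
}

// redact keeps a short prefix so seeds can be told apart without exposing them.
func redact(secret string) string {
	const keep = 4
	if len(secret) <= keep {
		return "****"
	}
	return secret[:keep] + "****"
}

func main() {
	dir, err := os.MkdirTemp("", "nats-bootstrap")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)

	seed := "SOPLACEHOLDERSEEDVALUE" // placeholder, not a real NKey seed
	path := filepath.Join(dir, "operator.seed")
	if err := writeSecretFile(path, seed); err != nil {
		fmt.Println("error:", err)
		return
	}
	// Only the location and a redacted prefix reach stdout.
	fmt.Printf("operator seed written to %s (%s)\n", path, redact(seed))
}
```

The same redaction could be applied to the JSON output so that the full seed never appears in captured logs.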
TLS verification bypass
Description: The `--tls-skip-verify` option is plumbed into `newHTTPClient(cfg.TLSSkipVerify)` for Core API calls, enabling certificate verification bypass and creating a realistic MITM risk if users run the command with this flag in non-controlled environments.
nats_bootstrap.go [52-213]
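A common alternative to skipping verification is to trust a specific CA bundle. The sketch below illustrates that approach with the standard library; `newHTTPClientWithCA` is a hypothetical name and is not the PR's `newHTTPClient`.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"net/http"
	"time"
)

// newHTTPClientWithCA (hypothetical) builds a client that verifies the server
// certificate against the supplied PEM bundle instead of disabling verification.
func newHTTPClientWithCA(caPEM []byte) (*http.Client, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no valid certificates found in CA bundle")
	}
	return &http.Client{
		Timeout: 30 * time.Second,
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{
				RootCAs:    pool,
				MinVersion: tls.VersionTLS12,
			},
		},
	}, nil
}

func main() {
	// An invalid bundle is rejected up front rather than silently trusted.
	if _, err := newHTTPClientWithCA([]byte("not a PEM bundle")); err != nil {
		fmt.Println("error:", err)
	}
}
```

With this shape, a self-signed Core API certificate can still be used in test environments by passing its CA, without turning verification off.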
🎫 No ticket provided
Codebase context is not defined
Generic: Meaningful Naming and Self-Documenting Code
Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting
Status: Passed
Generic: Secure Error Handling
Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.
Status: Passed
Generic: Robust Error Handling and Edge Case Management
Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation
Status:
Silent config drops: Invalid/unsupported checker configs can be silently skipped (returning nil) without any
warning/error logging, making misconfigurations difficult to detect and debug.
Generic: Secure Logging Practices
Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.
Status:
Sensitive target logging: The code logs `check.Target` verbatim when applying gateway-provided checks, which could include sensitive data (e.g., URLs with embedded credentials or internal endpoints) and therefore risks leaking secrets into logs.
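For URL-form targets, the standard library can mask embedded passwords before logging. This is a minimal sketch, assuming targets are either URLs or plain `host:port` strings; `safeTarget` is a hypothetical helper, and it does not scrub tokens in query strings or credentials in targets that lack a scheme.

```go
package main

import (
	"fmt"
	"net/url"
)

// safeTarget (hypothetical) returns a form of a check target suitable for logs.
// URLs have any password replaced via URL.Redacted; anything that does not
// parse as a URL with a host is returned unchanged.
func safeTarget(target string) string {
	u, err := url.Parse(target)
	if err != nil || u.Host == "" {
		return target
	}
	return u.Redacted()
}

func main() {
	fmt.Println(safeTarget("https://user:secret@example.com/health"))
	fmt.Println(safeTarget("10.0.0.5:161"))
}
```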
Generic: Comprehensive Audit Trails
Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.
Status:
Missing audit context: The new enrollment/config-application and status-push flows add operational logs but the
diff does not demonstrate audit-grade logging that consistently includes an acting
identity and outcome for critical actions across the system.
Generic: Security-First Input Validation and Data Handling
Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities
Status:
Gateway config trust: The agent consumes and applies gateway-provided check configuration (targets/paths/ports)
and while some validation exists, the diff does not show authentication/authorization and
provenance guarantees for this external input.
Compliance status legend
🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2239#issuecomment-3731786208
Original created: 2026-01-10T03:56:41Z
PR Code Suggestions ✨
Explore these optional code suggestions:
Consider a phased feature rollout
Introduce feature flags for major new functionalities like the NATS bootstrap
CLI and the agent push-loop. This enables a safer, incremental rollout in
production by allowing features to be enabled one by one.
Examples:
pkg/core/gateways.go [801-814]
pkg/agent/push_loop.go [230-280]
Suggestion importance[1-10]: 8
Why: The suggestion addresses the significant operational risk of deploying multiple large, independent features at once, which is a critical concern for a PR of this magnitude.
Robustly merge streaming JSON chunks
Implement a robust method for merging streaming JSON array chunks in `mergeSyncServiceChunks` by handling array delimiters and joining with commas to ensure the final output is a valid JSON array.
pkg/core/gateways.go [1097-1109]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 8
Why: The suggestion correctly identifies that simple byte concatenation can produce invalid JSON, and the proposed change robustly merges JSON array chunks, preventing downstream parsing errors.
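The merge described above could be sketched as follows. This is not the PR's `mergeSyncServiceChunks`; it assumes every chunk is itself a complete JSON array (never split mid-element), and the function name `mergeJSONArrayChunks` is hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// mergeJSONArrayChunks (hypothetical) strips the brackets from each array
// chunk, joins the non-empty bodies with commas, and validates the result.
func mergeJSONArrayChunks(chunks [][]byte) ([]byte, error) {
	var out bytes.Buffer
	out.WriteByte('[')
	wrote := false
	for i, chunk := range chunks {
		trimmed := bytes.TrimSpace(chunk)
		if len(trimmed) == 0 {
			continue // skip empty chunks
		}
		if len(trimmed) < 2 || trimmed[0] != '[' || trimmed[len(trimmed)-1] != ']' {
			return nil, fmt.Errorf("chunk %d is not a JSON array", i)
		}
		inner := bytes.TrimSpace(trimmed[1 : len(trimmed)-1])
		if len(inner) == 0 {
			continue // "[]" contributes no elements
		}
		if wrote {
			out.WriteByte(',')
		}
		out.Write(inner)
		wrote = true
	}
	out.WriteByte(']')
	if !json.Valid(out.Bytes()) {
		return nil, errors.New("merged payload is not valid JSON")
	}
	return out.Bytes(), nil
}

func main() {
	merged, err := mergeJSONArrayChunks([][]byte{
		[]byte(`[{"id":1},{"id":2}]`),
		[]byte(`[]`),
		[]byte(`[{"id":3}]`),
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(merged))
}
```

Plain concatenation of the same three chunks would yield `[...][][...]`, which a JSON parser rejects; the final `json.Valid` check catches malformed chunk bodies as well.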
Fix unsupported component type error
Update the `CreatePackage` function to correctly handle the `EdgeOnboardingComponentTypeSync` case, removing the unsupported error and implementing the necessary logic for package creation.
pkg/core/edge_onboarding.go [875-899]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 8
Why: The suggestion correctly identifies a logical contradiction where a component type is handled in some parts of the code but rejected in the creation logic, and provides a fix to complete the feature.
Fix stale checks when receiving empty config
Modify `applyChecks` to correctly handle an empty list of checks from the gateway by removing all existing checker configurations on the agent.
pkg/agent/push_loop.go [709-773]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 8
Why: This suggestion correctly identifies a bug where an empty check configuration from the gateway doesn't clear existing checks on the agent, leading to stale checks continuing to run. The fix ensures the agent's configuration accurately mirrors the central configuration.
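The behavior this suggestion asks for amounts to treating the gateway's list as the full desired state. A minimal sketch, assuming checks are keyed by name; the `Check` type and `reconcileChecks` function are hypothetical and not the PR's `applyChecks`:

```go
package main

import (
	"fmt"
	"sort"
)

// Check is a hypothetical stand-in for a gateway-provided check definition.
type Check struct {
	Name   string
	Target string
}

// reconcileChecks makes current mirror desired: entries missing from desired
// are removed (so an empty list clears everything), the rest are upserted.
// It returns the names that were removed, sorted for stable output.
func reconcileChecks(current map[string]Check, desired []Check) []string {
	want := make(map[string]Check, len(desired))
	for _, c := range desired {
		want[c.Name] = c
	}
	var removed []string
	for name := range current {
		if _, ok := want[name]; !ok {
			delete(current, name) // deleting during range is safe in Go
			removed = append(removed, name)
		}
	}
	for name, c := range want {
		current[name] = c
	}
	sort.Strings(removed)
	return removed
}

func main() {
	current := map[string]Check{
		"icmp-core": {Name: "icmp-core", Target: "10.0.0.1"},
		"snmp-edge": {Name: "snmp-edge", Target: "10.0.0.2"},
	}
	// An empty config from the gateway removes every existing check.
	removed := reconcileChecks(current, nil)
	fmt.Println(removed, len(current))
}
```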
Populate GatewayId in request
Populate the `GatewayId` field in the `GatewayStatusRequest` with the ID stored in the server's configuration.
pkg/agent/push_loop.go [349-359]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 8
Why: This suggestion fixes a significant omission. The `GatewayStatusRequest` was missing the `GatewayId`, which is crucial for the receiving gateway to process the status correctly. This change ensures the request is properly formed and functional.
Persist LastEvaluated timestamp
In `checkGatewayStatus`, persist the updated `LastEvaluated` timestamp to the `gatewayStatusCache` within a lock to ensure the skip logic functions correctly on subsequent checks.
pkg/core/gateways.go [110-116]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 7
Why: The suggestion correctly points out that the `LastEvaluated` timestamp is not persisted, causing redundant work, and provides a thread-safe fix to update the shared cache.
Add missing sweepType constant
Add a `sweepType` constant to replace the use of the magic string "sweep" for better code maintainability.
pkg/agent/push_loop.go [122-127]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 5
Why: The code uses the magic string "sweep" in multiple places. Adding a `sweepType` constant improves maintainability and readability by providing a single, authoritative source for this value, reducing the risk of typos.
Handle flag parsing failures gracefully
Change the `flag.NewFlagSet` error handling policy from `ExitOnError` to `ContinueOnError` for more graceful error management.
pkg/cli/nats_bootstrap.go [53-116]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 7
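As an illustration of the policy change, the sketch below parses flags with `flag.ContinueOnError` and returns the error to the caller instead of exiting. The flag names `-operator-name` and `-json` and the `bootstrapFlags` type are hypothetical; the real command's flags may differ.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// bootstrapFlags holds hypothetical options for illustration only.
type bootstrapFlags struct {
	OperatorName string
	JSONOutput   bool
}

// parseBootstrapFlags returns parse failures as errors so the calling command
// decides how to report them and which exit code to use.
func parseBootstrapFlags(args []string) (bootstrapFlags, error) {
	var cfg bootstrapFlags
	fs := flag.NewFlagSet("nats-bootstrap", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // the caller prints the error, not the flag package
	fs.StringVar(&cfg.OperatorName, "operator-name", "", "operator name (hypothetical flag)")
	fs.BoolVar(&cfg.JSONOutput, "json", false, "emit JSON output (hypothetical flag)")
	if err := fs.Parse(args); err != nil {
		return bootstrapFlags{}, fmt.Errorf("nats-bootstrap: %w", err)
	}
	return cfg, nil
}

func main() {
	if _, err := parseBootstrapFlags([]string{"-no-such-flag"}); err != nil {
		fmt.Println("error:", err)
	}
}
```

With `ExitOnError`, the same bad flag would terminate the process inside `Parse`, which makes the function hard to test and bypasses any cleanup in the caller.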
Why: The suggestion correctly recommends using `flag.ContinueOnError` to allow for graceful error handling within the application's command structure, which is a best practice for CLI tools.
Store gateway ID after enrollment
After successful enrollment, store the `GatewayId` received from the gateway in the server's configuration.
pkg/agent/push_loop.go [569-572]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 7
Why: This is a good suggestion for improving the agent's state management. Storing the `GatewayId` after enrollment is necessary for subsequent requests to the gateway, as shown by another valid suggestion that uses this value.
Sanitize agent ID before validation
In `processICMPMetrics`, trim whitespace from `agentID` before checking if it is empty to improve validation.
pkg/core/metrics.go [579-597]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 6
Why: The suggestion correctly identifies a potential issue where an `agentID` with only whitespace would bypass validation, leading to data integrity problems. Trimming the `agentID` before validation is a good practice for robustness.
Default partition constant
In `handleGatewayDown`, initialize the `partition` variable with the `defaultPartition` constant instead of an empty string to ensure offline events have a valid partition.
pkg/core/gateways.go [277-280]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 4
Why: The suggestion improves data consistency by ensuring offline events are published with a default partition value instead of an empty string, aligning with best practices seen elsewhere in the code.
Avoid sending empty optional parameters
In `runNatsBootstrapCreate`, only include the `operator_name` in the API request payload if it has a non-empty value.
pkg/cli/nats_bootstrap.go [174-190]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 4
Why: The suggestion correctly points out that an empty `operator_name` is sent. While the API might handle this, adding a check improves robustness by making the request payload more explicit and avoiding sending empty optional values.
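One way to get this behavior in Go is the `omitempty` JSON tag, sketched below. Only the `operator_name` key comes from the review; the `bootstrapRequest` type and `encodeBootstrapRequest` function are hypothetical, and the real payload likely carries more fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// bootstrapRequest is a hypothetical payload; omitempty drops the key
// entirely when the value is the empty string.
type bootstrapRequest struct {
	OperatorName string `json:"operator_name,omitempty"`
}

// encodeBootstrapRequest trims the name first so whitespace-only input is
// treated the same as no input.
func encodeBootstrapRequest(operatorName string) ([]byte, error) {
	return json.Marshal(bootstrapRequest{OperatorName: strings.TrimSpace(operatorName)})
}

func main() {
	for _, name := range []string{"", "acme"} {
		body, err := encodeBootstrapRequest(name)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(string(body))
	}
}
```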