fixing flowgger / rust kv watcher infinite restart issues #2558
Reference
carverauto/serviceradar!2558
Imported from GitHub pull request.
Original GitHub pull request: #2119
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/2119
Original created: 2025-12-14T03:59:48Z
Original updated: 2025-12-14T16:50:45Z
Original head: carverauto/serviceradar:feat/ui_update_mapper_and_snmp
Original base: main
Original merged: 2025-12-14T16:50:41Z by @mfreeman451
PR Type
Enhancement, Bug fix
Description
Fix infinite restart loops in Rust KV watchers by skipping initial config value
Add sensitive field redaction for mapper and SNMP checker configs in API responses
Implement per-interface SNMP polling preference storage and management in KV
Add UI controls for mapper discovery configuration and interface SNMP polling toggles
Enable checker services to act as KV clients for telemetry and config updates
Diagram Walkthrough
File Walkthrough
4 files
Skip initial KV watch event to prevent restart (same description for all four files)
12 files
Mark community field as sensitive in struct tag
New module for redacting sensitive config values
New endpoints for SNMP polling preferences and targets
Redact config bytes before returning to client
Register new SNMP polling preference API routes
Enable checker services as KV/telemetry clients
Mark sensitive fields with struct tags and fix alignment
New proxy endpoint for batch SNMP polling preferences
New proxy endpoint for SNMP polling preference updates
New proxy endpoint for rebuilding SNMP checker targets
Add discovery config UI and SNMP polling toggle controls
Add SNMP polling toggle column to interface table
1 file
Tests for config redaction and restoration logic
1 file
Fix alignment and formatting of error definitions
1 file
Add new source files and dependencies to build
8 files
Design proposal for network discovery UI feature
Detailed design decisions and migration plan
KV storage requirements for interface preferences
Network discovery and polling control requirements
Implementation tasks and validation checklist
Proposal for fixing KV watcher restart loops
KV watch event handling requirements
Tasks for fixing Rust service restart loops
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2119#issuecomment-3650171065
Original created: 2025-12-14T04:00:36Z
PR Compliance Guide 🔍
Below is a summary of compliance checks for this PR:
Redaction bypass risk
Description: The redaction logic uses a fixed placeholder "SR_REDACTED" without contextual binding,
making it possible for a client to submit that exact string for new sensitive fields to
cause unintended restoration or deletion of values; an attacker could also infer presence
of secrets by toggling placeholders—use per-field nonces or server-side field-level
tracking instead.
config_redaction.go [9-40]
Referred Code
Credential exposure
Description: The SNMP community string from mapper config may be propagated into generated snmp-checker
targets as plaintext ("community" field), risking storage or transmission of sensitive
credentials if the resulting config is exposed—ensure encryption-at-rest or restrict
access to this KV key and avoid logging these values.
network_discovery.go [480-503]
Referred Code
Client-side authZ only
Description: Admin detection decodes and parses the JWT in the browser to infer roles, which could be
spoofed if any upstream check relies on this client-side role; ensure server-side
authorization on all related API endpoints and treat this only as UI gating.
DeviceBasedDiscoveryDashboard.tsx [203-229]
Referred Code
Unvalidated IP usage
Description: SQL query uses a parameterized $1 but relies on device IDs from KV/user input; while
parameterization mitigates injection, the code selects the latest "device_ip" without
validation, enabling potential SSRF or misconfiguration if bogus IPs are stored—validate
IP format before using it to create SNMP targets.
network_discovery.go [446-468]
Referred Code
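The validation suggested here could look roughly like the following, using the standard library's `net/netip` package. `validTargetIP` is a hypothetical helper, and which address classes to reject (loopback, unspecified, multicast, link-local) is an assumed policy, not something the PR specifies.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// validTargetIP parses raw and rejects addresses that should not be used
// as SNMP targets. The rejected classes are an assumed policy.
func validTargetIP(raw string) (netip.Addr, error) {
	addr, err := netip.ParseAddr(strings.TrimSpace(raw))
	if err != nil {
		return netip.Addr{}, fmt.Errorf("invalid device IP %q: %w", raw, err)
	}
	if addr.IsUnspecified() || addr.IsLoopback() || addr.IsMulticast() || addr.IsLinkLocalUnicast() {
		return netip.Addr{}, fmt.Errorf("device IP %q is not a usable unicast address", raw)
	}
	return addr, nil
}

func main() {
	for _, raw := range []string{"192.0.2.10", "127.0.0.1", "not-an-ip"} {
		addr, err := validTargetIP(raw)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		fmt.Println("target:", addr)
	}
}
```

Running the check just before a target is generated means a bogus `device_ip` row is skipped rather than written into the checker config.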
Proxy auth header trust
Description: Next.js proxy trusts any "Authorization: Bearer ..." header and forwards it along with a
global X-API-Key, potentially enabling header spoofing from the browser; enforce
server-side session verification and restrict CORS/origins to prevent token replay.
route.ts [9-29]
Referred Code
🎫 No ticket provided
Codebase context is not defined
Generic: Meaningful Naming and Self-Documenting Code
Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting
Status: Passed
Generic: Secure Error Handling
Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.
Status: Passed
Generic: Secure Logging Practices
Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.
Status: Passed
Generic: Comprehensive Audit Trails
Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.
Status:
Action Logging: New admin endpoints change per-interface SNMP polling and rebuild targets without explicit
audit logging of who did what beyond optional UpdatedBy, which may be insufficient for
comprehensive audit trails.
Referred Code
Generic: Robust Error Handling and Edge Case Management
Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation
Status:
Error Context: Proxy routes return generic 502 errors without surfacing actionable context or structured
error payloads, which may hinder debugging and edge-case handling.
Referred Code
Generic: Security-First Input Validation and Data Handling
Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities
Status:
Input Validation: Batch and PUT handlers minimally validate fields and echo error strings while deriving KV
keys from user input, which may need stricter validation/authorization checks not visible
in the diff.
Referred Code
Compliance status legend
🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2119#issuecomment-3650171764
Original created: 2025-12-14T04:01:55Z
PR Code Suggestions ✨
Explore these optional code suggestions:
Decouple SNMP target generation from Core
The Core API is currently responsible for generating the snmp-checker configuration by reading interface preferences and mapper credentials. This logic should be moved into the snmp-checker service itself to improve modularity.
Examples:
pkg/core/api/network_discovery.go [313-400]
Solution Walkthrough:
Before:
After:
Suggestion importance [1-10]: 9
Why: The suggestion correctly identifies a significant architectural issue where the Core API is tightly coupled with the configuration generation of the snmp-checker service, and the proposed decoupling would improve modularity and long-term maintainability.
Improve array merging logic for safety
In restoreRedactions, modify the array merging logic for the targets array to use the name field as a key instead of relying on array index. This prevents incorrect restoration of sensitive data if the order of elements changes.
pkg/core/api/config_redaction.go [169-177]
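A minimal sketch of the key-based merge, under the assumption that each target carries a unique name. `Target` is a hypothetical, trimmed-down struct and `restoreTargets` is an illustrative helper; neither is the actual code in `config_redaction.go`.

```go
package main

import "fmt"

// placeholder is the fixed redaction string named in the review.
const placeholder = "SR_REDACTED"

// Target is a hypothetical, trimmed-down SNMP target entry.
type Target struct {
	Name      string
	Host      string
	Community string
}

// restoreTargets replaces redacted community strings by looking up the
// stored target with the same name, so element order does not matter.
// A placeholder with no matching stored target is left in place for the
// caller to reject.
func restoreTargets(incoming, stored []Target) []Target {
	byName := make(map[string]Target, len(stored))
	for _, t := range stored {
		byName[t.Name] = t
	}
	out := make([]Target, len(incoming))
	for i, t := range incoming {
		if t.Community == placeholder {
			if prev, ok := byName[t.Name]; ok {
				t.Community = prev.Community
			}
		}
		out[i] = t
	}
	return out
}

func main() {
	stored := []Target{
		{Name: "core-sw", Host: "192.0.2.1", Community: "alpha"},
		{Name: "edge-rt", Host: "192.0.2.2", Community: "bravo"},
	}
	// The client sends the targets back in a different order.
	incoming := []Target{
		{Name: "edge-rt", Host: "192.0.2.2", Community: placeholder},
		{Name: "core-sw", Host: "192.0.2.1", Community: placeholder},
	}
	for _, t := range restoreTargets(incoming, stored) {
		fmt.Println(t.Name, t.Community)
	}
}
```

With an index-based merge the reordered input above would swap the two community strings; keying on the name keeps each secret attached to its own target.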
Suggestion importance [1-10]: 8
Why: The suggestion correctly identifies that index-based array merging in restoreRedactions is unsafe for the snmp-checker's targets list, and proposes a safer key-based merge using the name field, preventing potential misapplication of sensitive credentials.
Prevent unsafe client-side API access
Refactor the logic for reading the accessToken from cookies to use a useState and useEffect hook. This ensures the cookie is only accessed on the client-side, preventing server-side rendering (SSR) errors and hydration mismatches.
web/src/components/Network/DeviceBasedDiscoveryDashboard.tsx [203-216]
Suggestion importance [1-10]: 8
Why: The suggestion correctly identifies a server-side rendering (SSR) issue where document.cookie is accessed outside a useEffect hook, which can cause hydration errors. The proposed fix correctly moves this client-side logic into a useEffect hook.
Fix redaction logic for all JSON types
In handleUpdateConfig, trigger the restoration of redacted configuration values based on whether the service requires redaction by calling shouldRedactConfig(service), rather than checking if the Content-Type header contains "application/json".
pkg/core/api/auth.go [801-805]
Suggestion importance [1-10]: 7
Why: The suggestion correctly points out that relying on the Content-Type header to trigger redaction restoration is brittle. Using shouldRedactConfig(service) is more robust and directly ties the behavior to the services that actually need it.
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2119#issuecomment-3651567594
Original created: 2025-12-14T16:03:02Z
CI Feedback 🧐
(Feedback updated until commit github.com/carverauto/serviceradar@db4e9ac8d3)
A test triggered by this PR failed. Here is an AI-generated analysis of the failure:
Action: build
Failed stage: Test [❌]
Failed test name: ""
Failure summary:
Bazel failed during the loading/fetch phase due to a missing external Debian package required by Docker image targets:
- Failed to fetch repository @@+_repo_rules2+debian_gcc_15_base_amd64_deb because the URL returned 404: https://deb.debian.org/debian/pool/main/g/gcc-15/gcc-15-base_15.2.0-10_amd64.deb
- Error surfaced at /home/runner/.cache/bazel/_bazel_runner/.../external/bazel_tools/tools/build_defs/repo/http.bzl:216:33
- Subsequent “no such package” errors at:
  - /docker/images/BUILD.bazel:1484:8 for //docker/images:glibc_runtime_layer
  - /docker/images/BUILD.bazel:1509:8 for //docker/images:timescaledb_extension_layer
  - /docker/images/BUILD.bazel:1592:8 for //docker/images:age_extension_layer
- Additional warning: Node.js v20.18.1 URL also 404’d, but the primary failure is the missing Debian GCC 15 base .deb, which caused analysis to fail for multiple docker/image targets.
- Result: “command succeeded, but there were loading phase errors” and the build did not complete successfully, causing the action to exit with code 1.
Relevant error logs: