2026 bugdire device inventory count growing beyond 50k #2484
Imported from GitHub pull request.
Original GitHub pull request: #2027
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/2027
Original created: 2025-11-27T20:07:48Z
Original updated: 2025-11-29T00:43:56Z
Original head: carverauto/serviceradar:2026-bugdire-device-inventory-count-growing-beyond-50k
Original base: main
Original merged: 2025-11-28T00:53:29Z by @mfreeman451
User description
PR Type
Bug fix, Enhancement
Description
Stabilize device inventory at 50k by enforcing single IP per device and bounded IP pool management
Implement strong-ID merging across IP churn to prevent duplicate device creation
Add cardinality drift detection and promotion blocking when device count exceeds baseline
Mark promoted sightings as unavailable until positive probe confirms availability
Add identity drift metrics and Prometheus monitoring infrastructure
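The strong-ID merge and single-IP rules described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the `Device` and `Inventory` types, the `Upsert` method, and the use of a MAC-like string as the strong ID are all hypothetical names chosen for the example.

```go
package main

// Device is a hypothetical inventory record; the real ServiceRadar types may differ.
type Device struct {
	StrongID  string // stable identifier such as a MAC address (assumed)
	IP        string // the single current IP for this device
	Available bool   // stays false until a positive probe confirms availability
}

// Inventory indexes devices by strong ID and by current IP (hypothetical structure).
type Inventory struct {
	byStrongID map[string]*Device
	byIP       map[string]*Device
}

// NewInventory returns an empty inventory.
func NewInventory() *Inventory {
	return &Inventory{
		byStrongID: map[string]*Device{},
		byIP:       map[string]*Device{},
	}
}

// Upsert merges a sighting into the device that already owns strongID, if any,
// so an IP change updates the existing record instead of creating a duplicate.
func (inv *Inventory) Upsert(strongID, ip string) *Device {
	dev, ok := inv.byStrongID[strongID]
	if !ok {
		// New devices start unavailable until a probe succeeds.
		dev = &Device{StrongID: strongID}
		inv.byStrongID[strongID] = dev
	}
	if dev.IP != ip {
		// Release the old address so the device holds exactly one IP.
		if dev.IP != "" && inv.byIP[dev.IP] == dev {
			delete(inv.byIP, dev.IP)
		}
		// A previous holder of the new address no longer owns it.
		if prev, taken := inv.byIP[ip]; taken && prev != dev {
			prev.IP = ""
		}
		dev.IP = ip
		inv.byIP[ip] = dev
	}
	return dev
}

// Count reports the number of distinct devices, keyed by strong ID.
func (inv *Inventory) Count() int {
	return len(inv.byStrongID)
}
```

Under these assumptions, two sightings of the same strong ID at different IPs leave the device count at one, which is the property the PR aims for when IPs churn.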
Diagram Walkthrough
File Walkthrough
5 files
Implement bounded IP pool and strong-ID merge logic
Define IdentityDriftConfig structure
Add batch-level strong ID assignment tracking
Add cardinality drift metrics and gauges
Implement cardinality drift blocking and availability semantics
4 files
Add tests for IP cardinality and strong-ID preservation
Add test for strong ID merging across IP churn
Add test for promoted sighting availability defaults
Add end-to-end tests for cardinality and availability
5 files
Add identity drift configuration defaults
Add IP pool and expansion configuration parameters
Add IP pool and expansion configuration parameters
Wire drift config and faker IP pool settings
Add drift baseline and faker IP pool configuration
8 files
Document identity drift metrics and Prometheus alerts
Remove obsolete KV store migration documentation
Add faker guardrails and availability enforcement requirements
Add strong-ID merge, availability, and drift detection requirements
Add tasks for cardinality clamping and availability fixes
Add Prometheus monitoring bridge proposal
Define Prometheus endpoint and metrics requirements
Add Prometheus exporter implementation tasks
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2027#issuecomment-3587168115
Original created: 2025-11-27T20:08:26Z
PR Compliance Guide 🔍
Below is a summary of compliance checks for this PR:
Input validation risk
Description: The function replacePrimaryIP and related IP pool operations manipulate comma-separated IP strings without strict validation, which could allow malformed inputs to be persisted or logged if upstream sources are untrusted; add input validation or structured IP storage to avoid potential injection/log pollution.
main.go [570-588]
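One way to address this finding is to parse the comma-separated string into structured addresses before it is stored or logged. The sketch below is not the referred code from the PR; `parseIPList` is a hypothetical helper, and it assumes the standard library's `net/netip` package is acceptable for validation.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// parseIPList (hypothetical helper) validates a comma-separated IP string and
// returns structured addresses. Any malformed entry rejects the whole input.
func parseIPList(raw string) ([]netip.Addr, error) {
	if strings.TrimSpace(raw) == "" {
		return nil, nil
	}
	parts := strings.Split(raw, ",")
	addrs := make([]netip.Addr, 0, len(parts))
	for _, p := range parts {
		addr, err := netip.ParseAddr(strings.TrimSpace(p))
		if err != nil {
			// %q escapes control characters, so a bad entry cannot inject log lines.
			return nil, fmt.Errorf("invalid IP %q: %w", p, err)
		}
		addrs = append(addrs, addr)
	}
	return addrs, nil
}
```

Storing the resulting `[]netip.Addr` rather than the raw string would also cover the "structured IP storage" option mentioned in the finding.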
Referred Code
🎫 #2026
despite IP churn.
duplicate devices.
count.
(cardinality drift).
sustained churn and over time, including after restarts and persistence reuse.
expected drift metrics.
Codebase context is not defined
Follow the guide to enable codebase context checks.
Generic: Meaningful Naming and Self-Documenting Code
Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting
Status: Passed
Generic: Robust Error Handling and Edge Case Management
Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation
Status: Passed
Generic: Secure Error Handling
Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.
Status: Passed
Generic: Secure Logging Practices
Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.
Status: Passed
Generic: Security-First Input Validation and Data Handling
Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities
Status: Passed
Generic: Comprehensive Audit Trails
Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.
Status:
Audit Logging: New critical actions (promotion blocking via drift and reconciliation decisions) add only
warning/info logs without clear, structured audit records tied to actor/context, which may
be insufficient for audit trail requirements.
Referred Code
Compliance status legend
🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2027#issuecomment-3587170221
Original created: 2025-11-27T20:09:33Z
PR Code Suggestions ✨
Explore these optional code suggestions:
Retry device selection on IP collision
In reassignIPsFromPool, retry finding a suitable device for a newly allocated IP if the first randomly selected device is not a valid target. This increases the effectiveness of the IP shuffle simulation.
cmd/faker/main.go [484-488]
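A bounded retry loop of the kind suggested here might look like the sketch below. It is an illustration only: `pickTarget`, `newRNG`, the `valid` callback, and the attempt limit are hypothetical and do not reflect the actual body of reassignIPsFromPool.

```go
package main

import "math/rand"

// newRNG (hypothetical helper) returns a seeded generator so runs are repeatable.
func newRNG(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// pickTarget (hypothetical helper) draws up to maxAttempts random device
// indexes in [0, n) and returns the first one that valid accepts.
// It returns (-1, false) when no valid target was found within the limit.
func pickTarget(rng *rand.Rand, n, maxAttempts int, valid func(idx int) bool) (int, bool) {
	if n <= 0 {
		return -1, false
	}
	for attempt := 0; attempt < maxAttempts; attempt++ {
		idx := rng.Intn(n)
		if valid(idx) {
			return idx, true
		}
	}
	return -1, false
}
```

Capping the attempts keeps the loop from spinning when few or no devices are valid targets; the caller can skip that IP change when the second return value is false.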
Suggestion importance[1-10]: 6
Why: The suggestion correctly identifies that not retrying to find a suitable device for a newIP can make the IP shuffle simulation less effective. Implementing a retry loop is a good improvement to make the simulation more robust, ensuring it is more likely to perform the intended number of IP changes.
Avoid reporting metrics for disabled features
In blockPromotionForCardinalityDrift, avoid recording drift metrics when the feature is disabled (drift.BaselineDevices <= 0). This prevents misleading metrics from being reported.
pkg/registry/registry.go [822-825]
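The early exit described in this suggestion could be sketched as follows. Only the names IdentityDriftConfig and BaselineDevices come from the PR text; the `TolerancePercent` field, the `driftRecorder` interface, `countingRecorder`, and `shouldBlockPromotion` are assumptions made for the example and are not the registry's actual API.

```go
package main

// IdentityDriftConfig borrows its name and BaselineDevices field from the PR;
// TolerancePercent is an assumed field for this sketch.
type IdentityDriftConfig struct {
	BaselineDevices  int
	TolerancePercent int
}

// driftRecorder is a hypothetical stand-in for the metrics layer.
type driftRecorder interface {
	Record(current, baseline int, blocked bool)
}

// countingRecorder is a trivial recorder that only counts calls.
type countingRecorder struct{ calls int }

func (c *countingRecorder) Record(current, baseline int, blocked bool) { c.calls++ }

// shouldBlockPromotion (hypothetical) reports whether promotion should be
// blocked because the device count drifted past the baseline plus tolerance.
func shouldBlockPromotion(drift IdentityDriftConfig, current int, rec driftRecorder) bool {
	if drift.BaselineDevices <= 0 {
		// Feature disabled: return before touching any metrics.
		return false
	}
	limit := drift.BaselineDevices + drift.BaselineDevices*drift.TolerancePercent/100
	blocked := current > limit
	rec.Record(current, drift.BaselineDevices, blocked)
	return blocked
}
```

With the guard placed first, a deployment that leaves the baseline unset emits no drift metrics at all, so dashboards only show values when the check is active.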
Suggestion importance[1-10]: 5
Why: The suggestion correctly points out that recording metrics for a disabled feature can be misleading. Exiting early when drift detection is disabled improves the clarity of monitoring data by ensuring metrics are only reported when the feature is active.