2036 bugui agents showing sysmon collections incorrectly #2491

Merged
mfreeman451 merged 4 commits from refs/pull/2491/head into main 2025-12-01 02:36:26 +00:00
mfreeman451 commented 2025-12-01 02:29:35 +00:00 (Migrated from github.com)
Owner

Imported from GitHub pull request.

Original GitHub pull request: #2037
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/2037
Original created: 2025-12-01T02:29:35Z
Original updated: 2025-12-01T02:36:29Z
Original head: carverauto/serviceradar:2036-bugui-agents-showing-sysmon-collections-incorrectly
Original base: main
Original merged: 2025-12-01T02:36:26Z by @mfreeman451

User description

IMPORTANT: Please sign the Developer Certificate of Origin

Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include
a DCO sign-off statement indicating the DCO acceptance in one commit message. Here
is an example DCO Signed-off-by line in a commit message:

Signed-off-by: J. Doe <j.doe@domain.com>

Describe your changes

Code checklist before requesting a review

  • I have signed the DCO?
  • The build completes without errors?
  • All tests are passing when running make test?

PR Type

Bug fix, Enhancement


Description

  • Fix phantom device creation from checker host IPs by detecting collector IPs

  • Add stable service device IDs for core services (datasvc, sync, mapper, otel, zen)

  • Implement heuristic detection of ephemeral Docker container IPs with collector hostnames

  • Add database migration to clean up existing phantom devices while preserving service devices


Diagram Walkthrough

flowchart LR
  A["Checker Reports<br/>host_ip"] --> B{"Is host_ip<br/>collector IP?"}
  B -->|Yes| C["Skip Device<br/>Creation"]
  B -->|No| D["Create Target<br/>Device"]
  E["Core Service<br/>Reports Status"] --> F["Register with<br/>serviceradar:type:id"]
  F --> G["Stable Device ID<br/>Survives IP Changes"]
  C --> H["No Phantom<br/>Devices"]
  D --> I["Only Target<br/>Devices Created"]

File Walkthrough

Relevant files
Bug fix
3 files
devices.go
Add collector IP detection to skip phantom devices             
+132/-0 
00000000000011_cleanup_phantom_devices.up.sql
Migrate database to remove phantom devices with backup     
+62/-0   
00000000000011_cleanup_phantom_devices.down.sql
Rollback migration to restore phantom devices from backup
+12/-0   
Tests
5 files
devices_test.go
Add comprehensive tests for device filtering logic             
+755/-0 
services_core_test.go
Add tests for core service registration                                   
+417/-0 
service_device_test.go
Add tests for core service device updates                               
+146/-0 
identity_resolver_cnpg_test.go
Fix flaky cache expiry test with eventual consistency       
+6/-8     
sweeper_test.go
Increase watch context timeout to reduce CI flakiness       
+2/-1     
Enhancement
4 files
pollers.go
Refactor device registration to support core services       
+9/-35   
services.go
Add core service detection and registration logic               
+118/-1 
service_device.go
Add ServiceType constants for core services                           
+14/-0   
service_registration.go
Add CreateCoreServiceDeviceUpdate helper function               
+33/-0   
Configuration changes
1 files
Makefile
Add Bazel build and image push targets                                     
+10/-2   
Documentation
3 files
proposal.md
Document design proposal for phantom device fix                   
+83/-0   
spec.md
Define requirements for service device IDs and checker filtering
+112/-0 
tasks.md
Track implementation tasks and test coverage                         
+84/-0   

Imported from GitHub pull request. Original GitHub pull request: #2037 Original author: @mfreeman451 Original URL: https://github.com/carverauto/serviceradar/pull/2037 Original created: 2025-12-01T02:29:35Z Original updated: 2025-12-01T02:36:29Z Original head: carverauto/serviceradar:2036-bugui-agents-showing-sysmon-collections-incorrectly Original base: main Original merged: 2025-12-01T02:36:26Z by @mfreeman451 --- ### **User description** ## IMPORTANT: Please sign the Developer Certificate of Origin Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include a [DCO sign-off statement]( https://developercertificate.org/) indicating the DCO acceptance in one commit message. Here is an example DCO Signed-off-by line in a commit message: ``` Signed-off-by: J. Doe <j.doe@domain.com> ``` ## Describe your changes ## Issue ticket number and link ## Code checklist before requesting a review - [ ] I have signed the DCO? - [ ] The build completes without errors? - [ ] All tests are passing when running make test? ___ ### **PR Type** Bug fix, Enhancement ___ ### **Description** - Fix phantom device creation from checker host IPs by detecting collector IPs - Add stable service device IDs for core services (datasvc, sync, mapper, otel, zen) - Implement heuristic detection of ephemeral Docker container IPs with collector hostnames - Add database migration to clean up existing phantom devices while preserving service devices ___ ### Diagram Walkthrough ```mermaid flowchart LR A["Checker Reports<br/>host_ip"] --> B{"Is host_ip<br/>collector IP?"} B -->|Yes| C["Skip Device<br/>Creation"] B -->|No| D["Create Target<br/>Device"] E["Core Service<br/>Reports Status"] --> F["Register with<br/>serviceradar:type:id"] F --> G["Stable Device ID<br/>Survives IP Changes"] C --> H["No Phantom<br/>Devices"] D --> I["Only Target<br/>Devices Created"] ``` <details> <summary><h3> File Walkthrough</h3></summary> <table><thead><tr><th></th><th align="left">Relevant files</th></tr></thead><tbody><tr><td><strong>Bug fix</strong></td><td><details><summary>3 files</summary><table> <tr> <td><strong>devices.go</strong><dd><code>Add collector IP detection to skip phantom devices</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-fb7f5159558f22afdc809b9cbf5a9c0f71560be2f7389c944da8f1604998ea48">+132/-0</a>&nbsp; </td> </tr> <tr> <td><strong>00000000000011_cleanup_phantom_devices.up.sql</strong><dd><code>Migrate database to remove phantom devices with backup</code>&nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-feccedab5f5095c10c66ec9e0dd3dcb8e23dd7b966b2033647bcaf1e53a85e9b">+62/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td><strong>00000000000011_cleanup_phantom_devices.down.sql</strong><dd><code>Rollback migration to restore phantom devices from backup</code></dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-b177e748d5c281969f40ac05e05e6d27efe86eeb2ef5679d1c3e46bfb34adc12">+12/-0</a>&nbsp; &nbsp; </td> </tr> </table></details></td></tr><tr><td><strong>Tests</strong></td><td><details><summary>5 files</summary><table> <tr> <td><strong>devices_test.go</strong><dd><code>Add comprehensive tests for device filtering logic</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-133a0e1348e2d2dcfc9645f0409219463138e17380efd4fbc213be14c104043b">+755/-0</a>&nbsp; </td> </tr> <tr> <td><strong>services_core_test.go</strong><dd><code>Add tests for core service registration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-2dab739cbd7eb56504ebe06b0349381b816503006b701baa42a8914bea04b45a">+417/-0</a>&nbsp; </td> </tr> <tr> <td><strong>service_device_test.go</strong><dd><code>Add tests for core service device updates</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-6b85e0c22bc48ca6678b23cd683d6b8b5dee9d20f7a8e822d6d13502460f3689">+146/-0</a>&nbsp; </td> </tr> <tr> <td><strong>identity_resolver_cnpg_test.go</strong><dd><code>Fix flaky cache expiry test with eventual consistency</code>&nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-9af43f5d2677c38f3515024e4268531f4125e4368719ae4725c830ef1f89eec2">+6/-8</a>&nbsp; &nbsp; &nbsp; </td> </tr> <tr> <td><strong>sweeper_test.go</strong><dd><code>Increase watch context timeout to reduce CI flakiness</code>&nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-7714a2d9e6f06cd0d5944247437dda7796228b95a507478e92f0fc6e09179424">+2/-1</a>&nbsp; &nbsp; &nbsp; </td> </tr> </table></details></td></tr><tr><td><strong>Enhancement</strong></td><td><details><summary>4 files</summary><table> <tr> <td><strong>pollers.go</strong><dd><code>Refactor device registration to support core services</code>&nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-fe81e2a32f1ac64bcdc6f25f55c5fa918d17bad8c0546f2cf80c757ff4051816">+9/-35</a>&nbsp; &nbsp; </td> </tr> <tr> <td><strong>services.go</strong><dd><code>Add core service detection and registration logic</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-b75091f9768dcdaf46aedeee40cb2eaa33b46a484d77d5d432bab19fe437237f">+118/-1</a>&nbsp; </td> </tr> <tr> <td><strong>service_device.go</strong><dd><code>Add ServiceType constants for core services</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-f6d3995ae69f2f6d1e7f2c68e4c8a50759aebb26df3149af4cde2a05a018f14c">+14/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td><strong>service_registration.go</strong><dd><code>Add CreateCoreServiceDeviceUpdate helper function</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-3ad8d9e7f1f17e0198a6a5a53398cc9bcae94a111f907d965dfcc43daeeb95e8">+33/-0</a>&nbsp; &nbsp; </td> </tr> </table></details></td></tr><tr><td><strong>Configuration changes</strong></td><td><details><summary>1 files</summary><table> <tr> <td><strong>Makefile</strong><dd><code>Add Bazel build and image push targets</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-76ed074a9305c04054cdebb9e9aad2d818052b07091de1f20cad0bbac34ffb52">+10/-2</a>&nbsp; &nbsp; </td> </tr> </table></details></td></tr><tr><td><strong>Documentation</strong></td><td><details><summary>3 files</summary><table> <tr> <td><strong>proposal.md</strong><dd><code>Document design proposal for phantom device fix</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-ed59ef5f7b9f94cea669c81eccf7ae83d579c5ea60b4f139da3eb68a02e368ff">+83/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td><strong>spec.md</strong><dd><code>Define requirements for service device IDs and checker filtering</code></dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-796f2e05e4b1942bcfd740f51f5f5a63cace288520ceee7a439936da6196521e">+112/-0</a>&nbsp; </td> </tr> <tr> <td><strong>tasks.md</strong><dd><code>Track implementation tasks and test coverage</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-7306d84aa629841d30563d0ddbcc2178e690cdd564451cbf009ebdbfdcbf3e95">+84/-0</a>&nbsp; &nbsp; </td> </tr> </table></details></td></tr></tr></tbody></table> </details> ___
qodo-code-review[bot] commented 2025-12-01 02:30:12 +00:00 (Migrated from github.com)
Author
Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2037#issuecomment-3594269806
Original created: 2025-12-01T02:30:12Z

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Risky data deletion

Description: The cleanup migration deletes rows from unified_devices based on regex IP and hostname
heuristics, which risks unintended data loss if criteria are overly broad or metadata
fields vary; ensure execution only in controlled environments and with validated backups.
00000000000011_cleanup_phantom_devices.up.sql [12-62]

Referred Code
CREATE TABLE IF NOT EXISTS _phantom_devices_backup AS
SELECT *
FROM unified_devices
WHERE
    -- Device ID is NOT a service device (we want to keep those)
    device_id NOT LIKE 'serviceradar:%'
    -- IP is in Docker bridge network ranges
    AND (
        ip ~ '^172\.17\.' OR
        ip ~ '^172\.18\.' OR
        ip ~ '^172\.19\.' OR
        ip ~ '^172\.20\.' OR
        ip ~ '^172\.21\.'
    )
    -- Source indicates it came from a checker
    AND (
        metadata->>'source' = 'checker'
        OR metadata->>'source' = 'self-reported'
    )
    -- Hostname suggests this is a collector, not a real target
    AND (


 ... (clipped 30 lines)
Ticket Compliance
🟡
🎫 #2036
🟢 Do not create devices from checker/agent/poller host IPs unless they are actual monitored
targets producing metrics.
Ensure agents and pollers themselves appear as devices, but external collectors/checkers
only create devices for their targets (e.g., SNMP target 192.168.1.1).
Prevent phantom devices caused by ephemeral Docker container IPs from collectors.
Correctly identify and retain the real target device (e.g., sysmon-vm at 192.168.1.218) in
inventory.
Provide stable device IDs for core ServiceRadar services so they appear as devices and
survive IP changes.
Add a data cleanup/migration to remove existing phantom devices without deleting
legitimate service devices.
Validate in a running environment that checker-originated phantom devices no longer appear
and real targets still do, across agent/poller restarts and Docker/Kubernetes IP churn.
Verify the migration safely removes only phantom devices on a real database snapshot and
that service devices (serviceradar:*) are preserved.
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Unstructured Logging: New debug logs for skipping device creation and registration paths lack explicit
user/action identifiers and structured audit context, which may be insufficient for
comprehensive audit trails.

Referred Code
	s.logger.Debug().
		Str("host_ip", hostIP).
		Str("collector_ip", collectorIP).
		Str("agent_id", agentID).
		Str("poller_id", pollerID).
		Str("service_name", svc.ServiceName).
		Msg("Skipping device creation: host_ip matches collector IP (this is the collector, not a target)")
	return
}

// Also skip if the IP is in a common Docker bridge network range and matches agent/poller characteristics.
// This catches cases where the collector IP couldn't be looked up but the IP is clearly ephemeral.
if s.isEphemeralCollectorIP(hostIP, hostname, hostID) {
	s.logger.Debug().
		Str("host_ip", hostIP).
		Str("hostname", hostname).
		Str("host_id", hostID).
		Str("service_name", svc.ServiceName).
		Msg("Skipping device creation: detected ephemeral collector IP with agent/poller hostname")
	return

Learn more about managing compliance generic rules or creating your own custom rules

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
- Requires Further Human Verification
🏷️ - Compliance label
Imported GitHub PR comment. Original author: @qodo-code-review[bot] Original URL: https://github.com/carverauto/serviceradar/pull/2037#issuecomment-3594269806 Original created: 2025-12-01T02:30:12Z --- ## PR Compliance Guide 🔍 <!-- https://github.com/carverauto/serviceradar/commit/013a181a82b49250ba9aadda091f4c27aead7a04 --> Below is a summary of compliance checks for this PR:<br> <table><tbody><tr><td colspan='2'><strong>Security Compliance</strong></td></tr> <tr><td rowspan=1>⚪</td> <td><details><summary><strong>Risky data deletion </strong></summary><br> <b>Description:</b> The cleanup migration deletes rows from unified_devices based on regex IP and hostname <br>heuristics, which risks unintended data loss if criteria are overly broad or metadata <br>fields vary; ensure execution only in controlled environments and with validated backups.<br> <strong><a href='https://github.com/carverauto/serviceradar/pull/2037/files#diff-feccedab5f5095c10c66ec9e0dd3dcb8e23dd7b966b2033647bcaf1e53a85e9bR12-R62'>00000000000011_cleanup_phantom_devices.up.sql [12-62]</a></strong><br> <details open><summary>Referred Code</summary> ```sql CREATE TABLE IF NOT EXISTS _phantom_devices_backup AS SELECT * FROM unified_devices WHERE -- Device ID is NOT a service device (we want to keep those) device_id NOT LIKE 'serviceradar:%' -- IP is in Docker bridge network ranges AND ( ip ~ '^172\.17\.' OR ip ~ '^172\.18\.' OR ip ~ '^172\.19\.' OR ip ~ '^172\.20\.' OR ip ~ '^172\.21\.' ) -- Source indicates it came from a checker AND ( metadata->>'source' = 'checker' OR metadata->>'source' = 'self-reported' ) -- Hostname suggests this is a collector, not a real target AND ( ... (clipped 30 lines) ``` </details></details></td></tr> <tr><td colspan='2'><strong>Ticket Compliance</strong></td></tr> <tr><td>🟡</td> <td> <details> <summary>🎫 <a href=https://github.com/carverauto/serviceradar/issues/2036>#2036</a></summary> <table width='100%'><tbody> <tr><td rowspan=6>🟢</td> <td>Do not create devices from checker/agent/poller host IPs unless they are actual monitored <br>targets producing metrics.</td></tr> <tr><td>Ensure agents and pollers themselves appear as devices, but external collectors/checkers <br>only create devices for their targets (e.g., SNMP target 192.168.1.1).</td></tr> <tr><td>Prevent phantom devices caused by ephemeral Docker container IPs from collectors.</td></tr> <tr><td>Correctly identify and retain the real target device (e.g., sysmon-vm at 192.168.1.218) in <br>inventory.</td></tr> <tr><td>Provide stable device IDs for core ServiceRadar services so they appear as devices and <br>survive IP changes.</td></tr> <tr><td>Add a data cleanup/migration to remove existing phantom devices without deleting <br>legitimate service devices.</td></tr> <tr><td rowspan=2>⚪</td> <td>Validate in a running environment that checker-originated phantom devices no longer appear <br>and real targets still do, across agent/poller restarts and Docker/Kubernetes IP churn.</td></tr> <tr><td>Verify the migration safely removes only phantom devices on a real database snapshot and <br>that service devices (serviceradar:*) are preserved.</td></tr> </tbody></table> </details> </td></tr> <tr><td colspan='2'><strong>Codebase Duplication Compliance</strong></td></tr> <tr><td>⚪</td><td><details><summary><strong>Codebase context is not defined </strong></summary> Follow the <a href='https://qodo-merge-docs.qodo.ai/core-abilities/rag_context_enrichment/'>guide</a> to enable codebase context checks. </details></td></tr> <tr><td colspan='2'><strong>Custom Compliance</strong></td></tr> <tr><td rowspan=5>🟢</td><td> <details><summary><strong>Generic: Meaningful Naming and Self-Documenting Code</strong></summary><br> **Objective:** Ensure all identifiers clearly express their purpose and intent, making code <br>self-documenting<br> **Status:** Passed<br> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td> <details><summary><strong>Generic: Robust Error Handling and Edge Case Management</strong></summary><br> **Objective:** Ensure comprehensive error handling that provides meaningful context and graceful <br>degradation<br> **Status:** Passed<br> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td> <details><summary><strong>Generic: Secure Error Handling</strong></summary><br> **Objective:** To prevent the leakage of sensitive system information through error messages while <br>providing sufficient detail for internal debugging.<br> **Status:** Passed<br> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td> <details><summary><strong>Generic: Secure Logging Practices</strong></summary><br> **Objective:** To ensure logs are useful for debugging and auditing without exposing sensitive <br>information like PII, PHI, or cardholder data.<br> **Status:** Passed<br> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td> <details><summary><strong>Generic: Security-First Input Validation and Data Handling</strong></summary><br> **Objective:** Ensure all data inputs are validated, sanitized, and handled securely to prevent <br>vulnerabilities<br> **Status:** Passed<br> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td rowspan=1>⚪</td> <td><details> <summary><strong>Generic: Comprehensive Audit Trails</strong></summary><br> **Objective:** To create a detailed and reliable record of critical system actions for security analysis <br>and compliance.<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/2037/files#diff-fb7f5159558f22afdc809b9cbf5a9c0f71560be2f7389c944da8f1604998ea48R62-R81'><strong>Unstructured Logging</strong></a>: New debug logs for skipping device creation and registration paths lack explicit <br>user/action identifiers and structured audit context, which may be insufficient for <br>comprehensive audit trails.<br> <details open><summary>Referred Code</summary> ```go s.logger.Debug(). Str("host_ip", hostIP). Str("collector_ip", collectorIP). Str("agent_id", agentID). Str("poller_id", pollerID). Str("service_name", svc.ServiceName). Msg("Skipping device creation: host_ip matches collector IP (this is the collector, not a target)") return } // Also skip if the IP is in a common Docker bridge network range and matches agent/poller characteristics. // This catches cases where the collector IP couldn't be looked up but the IP is clearly ephemeral. if s.isEphemeralCollectorIP(hostIP, hostname, hostID) { s.logger.Debug(). Str("host_ip", hostIP). Str("hostname", hostname). Str("host_id", hostID). Str("service_name", svc.ServiceName). Msg("Skipping device creation: detected ephemeral collector IP with agent/poller hostname") return ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td align="center" colspan="2"> <!-- placeholder --> <!-- /compliance --update_compliance=true --> </td></tr></tbody></table> <details><summary>Compliance status legend</summary> 🟢 - Fully Compliant<br> 🟡 - Partial Compliant<br> 🔴 - Not Compliant<br> ⚪ - Requires Further Human Verification<br> 🏷️ - Compliance label<br> </details>
qodo-code-review[bot] commented 2025-12-01 02:31:31 +00:00 (Migrated from github.com)
Author
Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2037#issuecomment-3594272422
Original created: 2025-12-01T02:31:31Z

PR Code Suggestions

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
High-level
Hardcoded Docker IP ranges are brittle

The hardcoded list of Docker IP ranges used for phantom device detection is
brittle. These ranges should be made configurable to support different container
runtime network configurations.

Examples:

pkg/core/devices.go [427-453]
func isDockerBridgeIP(ipStr string) bool {
	ip := net.ParseIP(ipStr)
	if ip == nil {
		return false
	}

	// Common Docker bridge network ranges
	dockerCIDRs := []string{
		"172.17.0.0/16", // Default docker0 bridge
		"172.18.0.0/16", // Docker compose default

 ... (clipped 17 lines)
pkg/db/cnpg/migrations/00000000000011_cleanup_phantom_devices.up.sql [18-25]
    -- IP is in Docker bridge network ranges
    AND (
        ip ~ '^172\.17\.' OR
        ip ~ '^172\.18\.' OR
        ip ~ '^172\.19\.' OR
        ip ~ '^172\.20\.' OR
        ip ~ '^172\.21\.'
    )

Solution Walkthrough:

Before:

// pkg/core/devices.go
func isDockerBridgeIP(ipStr string) bool {
    ip := net.ParseIP(ipStr)
    // ...

    // Common Docker bridge network ranges are hardcoded
    dockerCIDRs := []string{
        "172.17.0.0/16",
        "172.18.0.0/16",
        "172.19.0.0/16",
        // ...
    }

    for _, cidr := range dockerCIDRs {
        // ... check if ip is in network
    }
    return false
}

After:

// pkg/core/server.go (or similar config struct)
type Config struct {
    // ...
    EphemeralIPRanges []string
}

// pkg/core/devices.go
func (s *Server) isEphemeralCollectorIP(...) bool {
    // Use configurable ranges from the server config
    if !isIPInConfiguredRanges(hostIP, s.Config.EphemeralIPRanges) {
        return false
    }
    // ... rest of the logic
}

Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies that hardcoding Docker IP ranges in isDockerBridgeIP is a brittle design choice that will fail in custom environments, and making it configurable would significantly improve the solution's robustness.

Medium
Possible issue
Use robust network operators for IP matching

Replace the regex-based IP matching in the SQL migration with PostgreSQL's inet
type and << operator for a more robust and efficient query.

pkg/db/cnpg/migrations/00000000000011_cleanup_phantom_devices.up.sql [18-25]

 -- IP is in Docker bridge network ranges
 AND (
-    ip ~ '^172\.17\.' OR
-    ip ~ '^172\.18\.' OR
-    ip ~ '^172\.19\.' OR
-    ip ~ '^172\.20\.' OR
-    ip ~ '^172\.21\.'
+    ip::inet << '172.17.0.0/16' OR
+    ip::inet << '172.18.0.0/16' OR
+    ip::inet << '172.19.0.0/16' OR
+    ip::inet << '172.20.0.0/16' OR
+    ip::inet << '172.21.0.0/16'
 )
  • Apply / Chat
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly points out that using regex for IP matching is brittle and proposes using PostgreSQL's native inet type and network operators, which is more robust, correct, and performant.

Medium
General
Improve performance by pre-parsing CIDRs

To improve performance and robustness, pre-parse the hardcoded CIDR strings in
an init() function and panic on any parsing errors.

pkg/core/devices.go [426-453]

+var dockerNetworks []*net.IPNet
+
+func init() {
+	// Common Docker bridge network ranges
+	dockerCIDRs := []string{
+		"172.17.0.0/16", // Default docker0 bridge
+		"172.18.0.0/16", // Docker compose default
+		"192.19.0.0/16", // Additional compose networks
+		"172.20.0.0/16", // Additional compose networks
+		"172.21.0.0/16", // Additional compose networks
+	}
+
+	for _, cidr := range dockerCIDRs {
+		_, network, err := net.ParseCIDR(cidr)
+		if err != nil {
+			// This should not happen with hardcoded CIDRs.
+			// Panicking at startup is appropriate to catch typos immediately.
+			panic(fmt.Sprintf("failed to parse hardcoded CIDR '%s': %v", cidr, err))
+		}
+		dockerNetworks = append(dockerNetworks, network)
+	}
+}
+
 // isDockerBridgeIP checks if an IP is in common Docker bridge network ranges.
 func isDockerBridgeIP(ipStr string) bool {
 	ip := net.ParseIP(ipStr)
 	if ip == nil {
 		return false
 	}
 
-	// Common Docker bridge network ranges
-	dockerCIDRs := []string{
-		"172.17.0.0/16", // Default docker0 bridge
-		"172.18.0.0/16", // Docker compose default
-		"172.19.0.0/16", // Additional compose networks
-		"172.20.0.0/16", // Additional compose networks
-		"172.21.0.0/16", // Additional compose networks
-	}
-
-	for _, cidr := range dockerCIDRs {
-		_, network, err := net.ParseCIDR(cidr)
-		if err != nil {
-			continue
-		}
+	for _, network := range dockerNetworks {
 		if network.Contains(ip) {
 			return true
 		}
 	}
 
 	return false
 }
  • Apply / Chat
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly identifies an inefficiency and proposes a standard Go pattern to fix it, improving both performance and robustness by pre-parsing CIDRs in an init function.

Low
  • More
Imported GitHub PR comment. Original author: @qodo-code-review[bot] Original URL: https://github.com/carverauto/serviceradar/pull/2037#issuecomment-3594272422 Original created: 2025-12-01T02:31:31Z --- ## PR Code Suggestions ✨ <!-- 013a181 --> Explore these optional code suggestions: <table><thead><tr><td><strong>Category</strong></td><td align=left><strong>Suggestion&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </strong></td><td align=center><strong>Impact</strong></td></tr><tbody><tr><td rowspan=1>High-level</td> <td> <details><summary>Hardcoded Docker IP ranges are brittle</summary> ___ **The hardcoded list of Docker IP ranges used for phantom device detection is <br>brittle. These ranges should be made configurable to support different container <br>runtime network configurations.** ### Examples: <details> <summary> <a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-fb7f5159558f22afdc809b9cbf5a9c0f71560be2f7389c944da8f1604998ea48R427-R453">pkg/core/devices.go [427-453]</a> </summary> ```go func isDockerBridgeIP(ipStr string) bool { ip := net.ParseIP(ipStr) if ip == nil { return false } // Common Docker bridge network ranges dockerCIDRs := []string{ "172.17.0.0/16", // Default docker0 bridge "172.18.0.0/16", // Docker compose default ... (clipped 17 lines) ``` </details> <details> <summary> <a href="https://github.com/carverauto/serviceradar/pull/2037/files#diff-feccedab5f5095c10c66ec9e0dd3dcb8e23dd7b966b2033647bcaf1e53a85e9bR18-R25">pkg/db/cnpg/migrations/00000000000011_cleanup_phantom_devices.up.sql [18-25]</a> </summary> ```sql -- IP is in Docker bridge network ranges AND ( ip ~ '^172\.17\.' OR ip ~ '^172\.18\.' OR ip ~ '^172\.19\.' OR ip ~ '^172\.20\.' OR ip ~ '^172\.21\.' ) ``` </details> ### Solution Walkthrough: #### Before: ```sql // pkg/core/devices.go func isDockerBridgeIP(ipStr string) bool { ip := net.ParseIP(ipStr) // ... // Common Docker bridge network ranges are hardcoded dockerCIDRs := []string{ "172.17.0.0/16", "172.18.0.0/16", "172.19.0.0/16", // ... } for _, cidr := range dockerCIDRs { // ... check if ip is in network } return false } ``` #### After: ```sql // pkg/core/server.go (or similar config struct) type Config struct { // ... EphemeralIPRanges []string } // pkg/core/devices.go func (s *Server) isEphemeralCollectorIP(...) bool { // Use configurable ranges from the server config if !isIPInConfiguredRanges(hostIP, s.Config.EphemeralIPRanges) { return false } // ... rest of the logic } ``` <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: The suggestion correctly identifies that hardcoding Docker IP ranges in `isDockerBridgeIP` is a brittle design choice that will fail in custom environments, and making it configurable would significantly improve the solution's robustness. </details></details></td><td align=center>Medium </td></tr><tr><td rowspan=1>Possible issue</td> <td> <details><summary>Use robust network operators for IP matching</summary> ___ **Replace the regex-based IP matching in the SQL migration with PostgreSQL's <code>inet</code> <br>type and <code><<</code> operator for a more robust and efficient query.** [pkg/db/cnpg/migrations/00000000000011_cleanup_phantom_devices.up.sql [18-25]](https://github.com/carverauto/serviceradar/pull/2037/files#diff-feccedab5f5095c10c66ec9e0dd3dcb8e23dd7b966b2033647bcaf1e53a85e9bR18-R25) ```diff -- IP is in Docker bridge network ranges AND ( - ip ~ '^172\.17\.' OR - ip ~ '^172\.18\.' OR - ip ~ '^172\.19\.' OR - ip ~ '^172\.20\.' OR - ip ~ '^172\.21\.' + ip::inet << '172.17.0.0/16' OR + ip::inet << '172.18.0.0/16' OR + ip::inet << '172.19.0.0/16' OR + ip::inet << '172.20.0.0/16' OR + ip::inet << '172.21.0.0/16' ) ``` - [ ] **Apply / Chat** <!-- /improve --apply_suggestion=1 --> <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: The suggestion correctly points out that using regex for IP matching is brittle and proposes using PostgreSQL's native `inet` type and network operators, which is more robust, correct, and performant. </details></details></td><td align=center>Medium </td></tr><tr><td rowspan=1>General</td> <td> <details><summary>Improve performance by pre-parsing CIDRs</summary> ___ **To improve performance and robustness, pre-parse the hardcoded CIDR strings in <br>an <code>init()</code> function and panic on any parsing errors.** [pkg/core/devices.go [426-453]](https://github.com/carverauto/serviceradar/pull/2037/files#diff-fb7f5159558f22afdc809b9cbf5a9c0f71560be2f7389c944da8f1604998ea48R426-R453) ```diff +var dockerNetworks []*net.IPNet + +func init() { + // Common Docker bridge network ranges + dockerCIDRs := []string{ + "172.17.0.0/16", // Default docker0 bridge + "172.18.0.0/16", // Docker compose default + "192.19.0.0/16", // Additional compose networks + "172.20.0.0/16", // Additional compose networks + "172.21.0.0/16", // Additional compose networks + } + + for _, cidr := range dockerCIDRs { + _, network, err := net.ParseCIDR(cidr) + if err != nil { + // This should not happen with hardcoded CIDRs. + // Panicking at startup is appropriate to catch typos immediately. + panic(fmt.Sprintf("failed to parse hardcoded CIDR '%s': %v", cidr, err)) + } + dockerNetworks = append(dockerNetworks, network) + } +} + // isDockerBridgeIP checks if an IP is in common Docker bridge network ranges. func isDockerBridgeIP(ipStr string) bool { ip := net.ParseIP(ipStr) if ip == nil { return false } - // Common Docker bridge network ranges - dockerCIDRs := []string{ - "172.17.0.0/16", // Default docker0 bridge - "172.18.0.0/16", // Docker compose default - "172.19.0.0/16", // Additional compose networks - "172.20.0.0/16", // Additional compose networks - "172.21.0.0/16", // Additional compose networks - } - - for _, cidr := range dockerCIDRs { - _, network, err := net.ParseCIDR(cidr) - if err != nil { - continue - } + for _, network := range dockerNetworks { if network.Contains(ip) { return true } } return false } ``` - [ ] **Apply / Chat** <!-- /improve --apply_suggestion=2 --> <details><summary>Suggestion importance[1-10]: 5</summary> __ Why: The suggestion correctly identifies an inefficiency and proposes a standard Go pattern to fix it, improving both performance and robustness by pre-parsing CIDRs in an `init` function. </details></details></td><td align=center>Low </td></tr> <tr><td align="center" colspan="2"> - [ ] More <!-- /improve --more_suggestions=true --> </td><td></td></tr></tbody></table>
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
carverauto/serviceradar!2491
No description provided.