2023 bugdire internal services not first class devices #2483

Merged
mfreeman451 merged 4 commits from refs/pull/2483/head into main 2025-11-27 16:52:14 +00:00
mfreeman451 commented 2025-11-27 16:48:34 +00:00 (Migrated from github.com)
Owner

Imported from GitHub pull request.

Original GitHub pull request: #2025
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/2025
Original created: 2025-11-27T16:48:34Z
Original updated: 2025-11-27T16:52:28Z
Original head: carverauto/serviceradar:2023-bugdire-internal-services-not-first-class-devices
Original base: main
Original merged: 2025-11-27T16:52:14Z by @mfreeman451

PR Type

Bug fix, Enhancement


Description

  • Fix ICMP metrics attachment to agent devices instead of pollers

  • Resolve source IP normalization for poller/agent identity reporting

  • Treat ServiceRadar service updates as authoritative devices bypassing sightings

  • Add partition parameter to device registration for proper inventory placement

  • Enhance source IP resolution to check POD_IP, HOST_IP, NODE_IP environment variables
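The environment-variable fallback in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the code from poller.go: the helper name `resolveSourceIPFromEnv`, the injected `getenv` function, the lookup order POD_IP, HOST_IP, NODE_IP, and the choice to skip loopback and unspecified addresses are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strings"
)

// sourceIPEnvVars lists the environment variables named in the PR description.
// The order in which they are consulted here is an assumption.
var sourceIPEnvVars = []string{"POD_IP", "HOST_IP", "NODE_IP"}

// resolveSourceIPFromEnv is a hypothetical helper: it prefers the configured
// value and otherwise falls back to the first environment variable that holds
// a usable IP address. getenv is injected so the logic can be tested.
func resolveSourceIPFromEnv(configured string, getenv func(string) string) string {
	candidates := []string{configured}
	for _, name := range sourceIPEnvVars {
		candidates = append(candidates, getenv(name))
	}

	for _, candidate := range candidates {
		ip := net.ParseIP(strings.TrimSpace(candidate))
		// Skipping loopback and unspecified addresses is an assumption; they
		// rarely identify a host usefully in an inventory.
		if ip != nil && !ip.IsUnspecified() && !ip.IsLoopback() {
			return ip.String()
		}
	}

	return "" // nothing usable found
}

func main() {
	fmt.Println(resolveSourceIPFromEnv("", os.Getenv))
}
```

Passing the lookup function as a parameter keeps the sketch free of global state, so a test can supply a map-backed `getenv` instead of mutating the process environment.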


Diagram Walkthrough

flowchart LR
  A["Poller/Agent Status"] -->|resolveServiceHostIP| B["Normalized Source IP"]
  B -->|registerPollerAsDevice| C["Service Device Update"]
  C -->|ProcessBatchDeviceUpdates| D["Authoritative Device"]
  D -->|bypass sightings| E["Inventory Device"]
  F["ICMP Metrics"] -->|resolveICMPDevice| G["Agent Device ID"]
  G -->|attach capability| H["Agent Service Device"]

File Walkthrough

Relevant files

Bug fix (2 files)
  • metrics.go: ICMP device resolution and agent-based attribution (+84/-21)
  • registry.go: Bypass sightings for authoritative service updates (+30/-0)

Tests (5 files)
  • metrics_test.go: Test ICMP metrics agent device attachment (+126/-0)
  • pollers_test.go: Test source IP fallback to stored status (+33/-0)
  • service_device_test.go: Update tests for partition parameter (+13/-13)
  • source_ip_test.go: Test source IP resolution from environment (+58/-0)
  • service_device_test.go: Test service devices bypass sightings under identity reconciliation (+111/-15)

Enhancement (4 files)
  • pollers.go: Source IP resolution and device registration refactoring (+133/-49)
  • services.go: Pass partition to device registration functions (+2/-2)
  • service_registration.go: Add partition parameter to device creation functions (+24/-6)
  • poller.go: Enhance source IP resolution with environment variables (+44/-12)

Documentation (4 files)
  • AGENTS.md: Remove outdated Beads documentation references (+0/-3)
  • proposal.md: Add OpenSpec proposal for service device fixes (+14/-0)
  • spec.md: Add specification requirements for service device identity (+14/-0)
  • tasks.md: Add implementation tasks and deployment checklist (+17/-0)

Configuration changes (3 files)
  • values.yaml: Update container image SHA tags (+18/-18)
  • kustomization.yaml: Update image tags for production deployment (+8/-8)
  • kustomization.yaml: Update image tags for staging deployment (+9/-9)

qodo-code-review[bot] commented 2025-11-27 16:49:15 +00:00 (Migrated from github.com)
Author
Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2025#issuecomment-3586768289
Original created: 2025-11-27T16:49:15Z

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Input validation

Description: ICMP metric processing logs and accepts payload-derived values (e.g., host, response_time)
without strict schema validation or size limits, which could enable log injection or
excessive memory usage if untrusted payloads are passed through.
metrics.go [587-646]

Referred Code
serviceName := strings.TrimSpace(svc.ServiceName)
serviceType := strings.TrimSpace(svc.ServiceType)

if agentID == "" {
	s.logger.Warn().
		Str("poller_id", pollerID).
		Msg("Skipping ICMP metrics without agent ID to avoid poller device attribution")
	return nil
}

var pingResult struct {
	Host         string  `json:"host"`
	ResponseTime int64   `json:"response_time"`
	PacketLoss   float64 `json:"packet_loss"`
	Available    bool    `json:"available"`
	DeviceID     string  `json:"device_id,omitempty"`
}

if err := json.Unmarshal(details, &pingResult); err != nil {
	s.logger.Error().
		Err(err).


 ... (clipped 39 lines)
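One way to address the input-validation concern above is to bound and check the payload before any of its fields reach the logger. The following is only a sketch under stated assumptions: `parsePingResult`, the 4096-byte cap, the 253-character host limit, and the 0 to 100 packet-loss range are hypothetical choices for illustration, while the struct fields mirror the referred code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"unicode"
)

const (
	maxPingPayloadBytes = 4096 // hypothetical size cap, not from the PR
	maxHostLen          = 253  // assumption: longest valid DNS name
)

// pingResult mirrors the anonymous struct in the referred code.
type pingResult struct {
	Host         string  `json:"host"`
	ResponseTime int64   `json:"response_time"`
	PacketLoss   float64 `json:"packet_loss"`
	Available    bool    `json:"available"`
	DeviceID     string  `json:"device_id,omitempty"`
}

// parsePingResult is a hypothetical helper that rejects oversized or
// suspicious payloads so that untrusted values are never logged verbatim.
func parsePingResult(details []byte) (pingResult, error) {
	if len(details) > maxPingPayloadBytes {
		return pingResult{}, errors.New("ping payload too large")
	}

	var res pingResult
	if err := json.Unmarshal(details, &res); err != nil {
		return pingResult{}, fmt.Errorf("decode ping payload: %w", err)
	}

	res.Host = strings.TrimSpace(res.Host)
	if len(res.Host) > maxHostLen {
		return pingResult{}, errors.New("host too long")
	}
	// Control characters (for example newlines) are the usual vehicle for log injection.
	if strings.IndexFunc(res.Host, unicode.IsControl) >= 0 {
		return pingResult{}, errors.New("host contains control characters")
	}
	// Treating packet loss as a percentage is an assumption.
	if res.ResponseTime < 0 || res.PacketLoss < 0 || res.PacketLoss > 100 {
		return pingResult{}, errors.New("metric value out of range")
	}

	return res, nil
}

func main() {
	res, err := parsePingResult([]byte(`{"host":"10.0.0.1","response_time":1200,"available":true}`))
	fmt.Println(res.Host, err)
}
```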
Supply chain risk

Description: Pinning many component images to opaque sha tags without provenance verification or
signatures may risk supply chain integrity if the registry is compromised; ensure images
are trusted and verified.
values.yaml [3-28]

Referred Code
registryPullSecret: ghcr-io-cred
tags:
  # ServiceRadar UUID identity system - generates stable device IDs based on strong identifiers
  core: "sha-88d3a8af915b167407b36d991f757864c09dfeb5a4180de539f99498f650ebcf"
  web: "sha-ea0415aa1069be6420d2d4a80a21b2a8c835f4ca8daed27000cea23e710a70a6"
  nats: "2.12.2-alpine"
  datasvc: "sha-bdc0057ce88c9f275f700f573a989947ab5b7ff78903ee7ea3108c1b003feb80"
  agent: "sha-9c92617fce5fa5570786c3f3f43b298726b7ee0ceb02c857f3940623bdc60954"
  poller: "sha-bccc4567ef2a27492ccb98a9310c6cf82c931bb497275f5e0b015219efd98ad7"
  snmpChecker: "sha-b32c3d1c9923b85d6fa04517785701a44c2172fcc0a04c8b613951498d9859c2"
  dbEventWriter: "sha-0457fff7a78fb4063e1759847a4efb6c6be1208d5b4e1fb2de6df2798d38d2e4"
  otel: "sha-c4a130449dfeccd8c12d56371b0c06a1e7b79f2a8ee71c6d277c6cb783b230ca"
  mapper: "sha-327483dfd4769d124eb6e49b3b3e5ca577d5bde4a33bd885099e043c1673224d"
  trapd: "sha-3895ec49a493e40b062f7ea5c339846095bcad0f56a012dd6a82ee72245602d6"
  flowgger: "1.0.56"
  zen: "sha-8f62f5d7d9c758348b24a6d7849ae5d7e0aa00011ab09afb67fcb03fb2f42383"
  # Armis sync uses empty DeviceID - registry generates ServiceRadar UUIDs
  sync: "sha-022d570f8aeb461a2a105ed2065e77f0a5aaf41843c045e1199dadc4a2bedbb3"
  rperfClient: "sha-90ac66d44343d9cbd769d3598aedc49b581b056fb771266e2cfea86f097e01c5"
  faker: "sha-8fc66aff046f6e65e122e420609807b2260d6c8b2c3566d5c301518146810231"
  rperfChecker: "sha-8f62f5d7d9c758348b24a6d7849ae5d7e0aa00011ab09afb67fcb03fb2f42383"


 ... (clipped 5 lines)
Ticket Compliance
🟡 🎫 #2023
🟢 Fully compliant:
  • Ensure pollers/agents/checkers/collectors are treated as first-class devices, not demoted to sightings.
  • Prevent promoted service components from reverting to sightings/DIRE; they should remain in device inventory.
  • Attach ICMP metrics to the correct agent device instead of pollers.
  • Normalize and correctly resolve source IP/host identity for poller/agent reports, including environment-based fallbacks.
  • Place service devices in the correct partition (default when unspecified) during registration and updates.
⚪ Requires further human verification:
  • Validate in the running system/UI that service devices no longer appear in sightings/DIRE after promotion and persist in inventory across cycles.
  • Confirm in staging/production telemetry that ICMP graphs appear on agents and not on pollers, and that hostnames/IPs are correctly populated.
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status: Passed

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Audit context: New critical flows resolving/normalizing source IPs and registering devices (e.g.,
resolveServiceHostIP, register*AsDevice) add behavior changes without explicit audit/log
entries capturing actor, action, and outcome beyond warnings/info, requiring verification
that upstream logging/audit layers record these actions.

Referred Code
ctx, span := s.tracer.Start(ctx, "ReportStatus")
defer span.End()

resolvedSourceIP := s.resolveServiceHostIP(ctx, req.PollerId, req.AgentId, req.SourceIp)

// Add span attributes for the request
span.SetAttributes(
	attribute.String("poller_id", req.PollerId),
	attribute.String("partition", req.Partition),
	attribute.String("source_ip", resolvedSourceIP),
	attribute.String("source_ip_raw", req.SourceIp),
	attribute.Int("service_count", len(req.Services)),
)

// Get trace-aware logger from context (added by LoggingInterceptor)
logger := grpc.GetLogger(ctx, s.logger)
logger.Debug().
	Str("poller_id", req.PollerId).
	Int("service_count", len(req.Services)).
	Time("timestamp", time.Now()).
	Msg("Received status report")


 ... (clipped 554 lines)

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Input validation: The new resolveICMPDevice and ICMP processing pathways derive and propagate IPs/IDs from
environment/metadata without explicit validation beyond normalizeHostIP and net.ParseIP
checks, requiring verification that upstream layers sanitize inputs to prevent injection
or malformed identifiers in device updates.

Referred Code
	return nil
}

func (s *Server) resolveICMPDevice(
	ctx context.Context,
	pollerID string,
	partition string,
	agentID string,
	sourceIP string,
	deviceID string,
) (string, string, string, string, error) {
	collectorIP := normalizeHostIP(s.resolveServiceHostIP(ctx, pollerID, agentID, sourceIP))
	if collectorIP == "" {
		collectorIP = normalizeHostIP(sourceIP)
	}
	if collectorIP != "" && net.ParseIP(collectorIP) == nil {
		collectorIP = ""
	}

	resolvedDeviceID := strings.TrimSpace(deviceID)
	if resolvedDeviceID == "" {


 ... (clipped 40 lines)

Compliance status legend
🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
qodo-code-review[bot] commented 2025-11-27 16:50:42 +00:00 (Migrated from github.com)
Author
Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2025#issuecomment-3586772355
Original created: 2025-11-27T16:50:42Z

PR Code Suggestions

Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Prevent data race on shared fields

Add a mutex lock in the sourceIP method to prevent data races when accessing
shared Poller fields from multiple goroutines.

pkg/poller/poller.go [189-211]

 func (p *Poller) sourceIP() string {
+	p.mu.Lock()
+	defer p.mu.Unlock()
+
 	if ip := parseSourceCandidate(p.resolvedSourceIP, p.logger); ip != "" {
 		if p.resolvedSourceIP != ip {
 			p.resolvedSourceIP = ip
 		}
 		if p.config.SourceIP != ip {
 			p.config.SourceIP = ip
 		}
 		return ip
 	}
 
 	resolved := resolveSourceIP(p.config.SourceIP, p.logger)
 	if resolved != "" {
 		p.resolvedSourceIP = resolved
 		p.config.SourceIP = resolved
 		return resolved
 	}
 
 	p.resolvedSourceIP = ""
 	p.config.SourceIP = ""
 
 	return ""
 }
Suggestion importance[1-10]: 8

Why: The suggestion correctly identifies a potential data race on shared fields p.resolvedSourceIP and p.config.SourceIP and proposes adding a mutex to ensure thread safety.

Medium
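
As a rough illustration of the locking pattern this suggestion describes, here is a minimal standalone sketch. The `sourceIPCache` type, its fields, and `parseIP` are hypothetical stand-ins for the `Poller` fields and helpers (they are not taken from the ServiceRadar code), and the sketch only accepts literal IP addresses.

```go
package main

import (
	"fmt"
	"net"
	"strings"
	"sync"
)

// sourceIPCache is a hypothetical stand-in for the Poller fields that
// sourceIP reads and writes; it is not the real Poller type.
type sourceIPCache struct {
	mu         sync.Mutex
	configured string // plays the role of p.config.SourceIP
	resolved   string // plays the role of p.resolvedSourceIP
}

// parseIP returns the canonical form of a literal IP address, or "" when
// the candidate is empty or not a valid IP.
func parseIP(candidate string) string {
	ip := net.ParseIP(strings.TrimSpace(candidate))
	if ip == nil {
		return ""
	}
	return ip.String()
}

// SourceIP reads and rewrites both fields while holding one lock, so
// concurrent callers never race on the cached values.
func (c *sourceIPCache) SourceIP() string {
	c.mu.Lock()
	defer c.mu.Unlock()

	ip := parseIP(c.resolved)
	if ip == "" {
		ip = parseIP(c.configured)
	}
	c.resolved = ip
	c.configured = ip
	return ip
}

func main() {
	c := &sourceIPCache{configured: " 10.0.0.5 "}

	// Several goroutines call SourceIP at once; the mutex serializes them.
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = c.SourceIP()
		}()
	}
	wg.Wait()

	fmt.Println(c.SourceIP())
}
```

Running a sketch like this under `go run -race` is one way to check that no unsynchronized access remains.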
Fix incorrect IP resolution fallback

Modify resolveServiceHostIP to ensure it falls back to resolving the IP from the
poller's information if the agent's information does not yield an IP.

pkg/core/pollers.go [1327-1373]

 func (s *Server) resolveServiceHostIP(ctx context.Context, pollerID, agentID, hostIP string) string {
 	resolvedIP := normalizeHostIP(hostIP)
+	if resolvedIP != "" {
+		return resolvedIP
+	}
 
 	resolveFromMetadata := func(metadata map[string]string) string {
 		if metadata == nil {
 			return ""
 		}
 
 		if ip := normalizeHostIP(metadata["source_ip"]); ip != "" {
 			return ip
 		}
 
 		if ip := normalizeHostIP(metadata["host_ip"]); ip != "" {
 			return ip
 		}
 
 		return ""
 	}
 
-	if resolvedIP == "" && s.ServiceRegistry != nil {
+	if s.ServiceRegistry != nil {
 		if agentID != "" {
 			if agent, err := s.ServiceRegistry.GetAgent(ctx, agentID); err == nil && agent != nil {
 				if ip := resolveFromMetadata(agent.Metadata); ip != "" {
-					resolvedIP = ip
+					return ip
 				}
 			}
 		}
 
-		if resolvedIP == "" && pollerID != "" {
+		if pollerID != "" {
 			if poller, err := s.ServiceRegistry.GetPoller(ctx, pollerID); err == nil && poller != nil {
 				if ip := resolveFromMetadata(poller.Metadata); ip != "" {
-					resolvedIP = ip
+					return ip
 				}
 			}
 		}
 	}
 
-	if resolvedIP == "" && pollerID != "" && s.DB != nil {
+	if pollerID != "" && s.DB != nil {
 		if status, err := s.DB.GetPollerStatus(ctx, pollerID); err == nil && status != nil {
 			if ip := normalizeHostIP(status.HostIP); ip != "" {
 				resolvedIP = ip
 			}
 		}
 	}
 
 	return resolvedIP
 }
Suggestion importance[1-10]: 7

Why: The suggestion correctly identifies a logic flaw where the IP resolution fallback from agent to poller is incomplete, improving the robustness of IP resolution.

Medium
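
The early-return fallback order in this suggestion (request IP, then agent metadata, then poller metadata, then stored status) can be sketched as a generic "first non-empty source wins" helper. The `lookup` type, `normalize`, and `firstResolved` below are hypothetical names for illustration only; the real `normalizeHostIP` may accept more input forms than a literal IP.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// lookup is a hypothetical stand-in for one IP source, such as the request
// field, agent metadata, poller metadata, or a stored poller status.
type lookup func() string

// normalize is a simplified stand-in for normalizeHostIP: it accepts only
// literal IP addresses and returns "" for anything else.
func normalize(raw string) string {
	ip := net.ParseIP(strings.TrimSpace(raw))
	if ip == nil {
		return ""
	}
	return ip.String()
}

// firstResolved tries each source in order and returns as soon as one
// yields a usable IP, mirroring the early-return style of the suggestion.
func firstResolved(sources ...lookup) string {
	for _, src := range sources {
		if ip := normalize(src()); ip != "" {
			return ip
		}
	}
	return ""
}

func main() {
	agentMeta := map[string]string{} // agent reported nothing
	pollerMeta := map[string]string{"host_ip": "192.0.2.10"}

	ip := firstResolved(
		func() string { return "" }, // host IP from the request
		func() string { return agentMeta["source_ip"] },
		func() string { return pollerMeta["host_ip"] },
	)
	fmt.Println(ip)
}
```

Keeping the sources in an ordered list makes the precedence explicit and avoids repeating the `resolvedIP == ""` guard at every step.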
Ensure ICMP is always agent-associated

Update resolveICMPDevice to always use the agentID to generate the
resolvedDeviceID, ensuring ICMP capabilities are correctly associated with the
agent.

pkg/core/metrics.go [801-858]

 func (s *Server) resolveICMPDevice(
 	ctx context.Context,
 	pollerID string,
 	partition string,
 	agentID string,
 	sourceIP string,
 	deviceID string,
 ) (string, string, string, string, error) {
 	collectorIP := normalizeHostIP(s.resolveServiceHostIP(ctx, pollerID, agentID, sourceIP))
 	if collectorIP == "" {
 		collectorIP = normalizeHostIP(sourceIP)
 	}
 	if collectorIP != "" && net.ParseIP(collectorIP) == nil {
 		collectorIP = ""
 	}
 
 	resolvedDeviceID := strings.TrimSpace(deviceID)
+	if agentID != "" {
+		// When an agent is present, always associate with the agent's service device.
+		resolvedDeviceID = models.GenerateServiceDeviceID(models.ServiceTypeAgent, agentID)
+		hostDeviceID := resolvedDeviceID
+		updateIP := collectorIP
+		return resolvedDeviceID, hostDeviceID, collectorIP, updateIP, nil
+	}
+
 	if resolvedDeviceID == "" {
-		switch {
-		case agentID != "":
-			resolvedDeviceID = models.GenerateServiceDeviceID(models.ServiceTypeAgent, agentID)
-		case collectorIP != "":
+		if collectorIP != "" {
 			resolvedDeviceID = models.GenerateNetworkDeviceID(partition, collectorIP)
-		default:
+		} else {
 			return "", "", collectorIP, "", errICMPDeviceIdentifiersMissing
 		}
-	}
-
-	hostDeviceID := ""
-	updateIP := ""
-
-	if agentID != "" {
-		// When an agent is present, keep the ICMP capability attached to the agent service device.
-		hostDeviceID = resolvedDeviceID
-		updateIP = collectorIP
-		return resolvedDeviceID, hostDeviceID, collectorIP, updateIP, nil
 	}
 
 	resolution := s.resolveCanonicalDevice(ctx, collectorIP, resolvedDeviceID)
 	if trimmed := strings.TrimSpace(resolution.DeviceID); trimmed != "" {
 		resolvedDeviceID = trimmed
 	}
 
-	updateIP = strings.TrimSpace(resolution.IP)
+	updateIP := strings.TrimSpace(resolution.IP)
 	if updateIP == "" {
 		updateIP = ipFromDeviceID(resolvedDeviceID)
 	}
 	if updateIP == "" {
 		updateIP = collectorIP
 	}
 
-	hostDeviceID = strings.TrimSpace(resolution.DeviceID)
+	hostDeviceID := strings.TrimSpace(resolution.DeviceID)
 	if hostDeviceID == "" {
 		hostDeviceID = resolvedDeviceID
 	}
 
 	return resolvedDeviceID, hostDeviceID, collectorIP, updateIP, nil
 }
Suggestion importance[1-10]: 7

Why: The suggestion correctly points out that if an agentID is present, the ICMP capability should always be associated with that agent, overriding any other deviceID to enforce the PR's goal.

Medium
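
The precedence this suggestion proposes (agent ID first, then an explicit device ID, then the collector IP) can be sketched in isolation as follows. The ID formats `agent:<id>` and `<partition>:<ip>` are assumptions made for this example and may not match what `models.GenerateServiceDeviceID` and `models.GenerateNetworkDeviceID` actually produce; the canonical-device lookup from the original function is omitted.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errNoIdentifiers plays the role of errICMPDeviceIdentifiersMissing.
var errNoIdentifiers = errors.New("no agent ID, device ID, or collector IP")

// pickICMPDeviceID sketches the proposed precedence. The returned ID
// formats are hypothetical and exist only for this illustration.
func pickICMPDeviceID(agentID, deviceID, partition, collectorIP string) (string, error) {
	// An agent, when present, always wins over any supplied device ID.
	if agentID != "" {
		return "agent:" + agentID, nil
	}
	// Otherwise keep an explicitly supplied device ID.
	if trimmed := strings.TrimSpace(deviceID); trimmed != "" {
		return trimmed, nil
	}
	// Last resort: derive a network device ID from the collector IP.
	if collectorIP != "" {
		return partition + ":" + collectorIP, nil
	}
	return "", errNoIdentifiers
}

func main() {
	id, _ := pickICMPDeviceID("agent-1", "default:10.0.0.9", "default", "10.0.0.5")
	fmt.Println(id)

	id, _ = pickICMPDeviceID("", "", "default", "10.0.0.5")
	fmt.Println(id)

	_, err := pickICMPDeviceID("", "", "default", "")
	fmt.Println(err)
}
```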
High-level
Consider a more robust service identity

To make service identity more robust, add an explicit IsAuthoritative boolean
field to the DeviceUpdate model. The constructors for service updates would set
this field at creation time, consolidating the logic for identifying
authoritative internal services that is currently spread across several checks.

Examples:

pkg/registry/registry.go [2050-2064]
func isAuthoritativeServiceUpdate(update *models.DeviceUpdate) bool {
	if update == nil {
		return false
	}

	if update.Source == models.DiscoverySourceServiceRadar || update.Source == models.DiscoverySourceSelfReported {
		return true
	}

	if update.ServiceType != nil {

 ... (clipped 5 lines)
pkg/models/service_registration.go [11-43]
func CreatePollerDeviceUpdate(pollerID, hostIP, partition string, metadata map[string]string) *DeviceUpdate {
	serviceType := ServiceTypePoller

	if metadata == nil {
		metadata = make(map[string]string)
	}

	normalizedPartition := strings.TrimSpace(partition)
	if normalizedPartition == "" {
		normalizedPartition = defaultServicePartition

 ... (clipped 23 lines)

Solution Walkthrough:

Before:

// pkg/registry/registry.go
func isAuthoritativeServiceUpdate(update *models.DeviceUpdate) bool {
	if update.Source == models.DiscoverySourceServiceRadar || update.Source == models.DiscoverySourceSelfReported {
		return true
	}
	if update.ServiceType != nil {
		return true
	}
	return isServiceDeviceID(update.DeviceID)
}

func (r *DeviceRegistry) ProcessBatchDeviceUpdates(ctx, updates) {
  // ...
  for _, u := range valid {
    if isAuthoritativeServiceUpdate(u) {
      // ... treat as authoritative
    }
  }
}

After:

// pkg/models/unified_device.go
type DeviceUpdate struct {
    // ... existing fields
    IsAuthoritative bool `json:"is_authoritative,omitempty"`
}

// pkg/models/service_registration.go
func CreatePollerDeviceUpdate(...) *DeviceUpdate {
    return &DeviceUpdate{
        // ... existing fields
        IsAuthoritative: true,
    }
}

// pkg/registry/registry.go
func (r *DeviceRegistry) ProcessBatchDeviceUpdates(ctx, updates) {
  // ...
  for _, u := range valid {
    if u.IsAuthoritative {
      // ... treat as authoritative
    }
  }
}

Suggestion importance[1-10]: 7

Why: The suggestion correctly identifies that the logic for "authoritative" updates is implicit and spread out, and proposes a valid design improvement to make this explicit in the DeviceUpdate model, which would improve code clarity and maintainability.

Medium
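
To make the walkthrough above concrete, here is a small runnable sketch of the explicit-flag approach. The `deviceUpdate` type, `newPollerUpdate`, `splitUpdates`, and the `poller:<id>` ID format are hypothetical simplifications and not the real `models.DeviceUpdate` or registry API.

```go
package main

import "fmt"

// deviceUpdate is a hypothetical, trimmed-down stand-in for
// models.DeviceUpdate carrying only what this sketch needs.
type deviceUpdate struct {
	DeviceID        string
	IsAuthoritative bool
}

// newPollerUpdate sets the flag at creation time, so later stages do not
// need to infer authority from the source, service type, or ID prefix.
func newPollerUpdate(pollerID string) *deviceUpdate {
	return &deviceUpdate{
		DeviceID:        "poller:" + pollerID, // assumed ID format
		IsAuthoritative: true,
	}
}

// splitUpdates separates authoritative service updates from ordinary
// updates that would go through the sightings path. Nil entries are skipped.
func splitUpdates(updates []*deviceUpdate) (authoritative, sightings []*deviceUpdate) {
	for _, u := range updates {
		if u == nil {
			continue
		}
		if u.IsAuthoritative {
			authoritative = append(authoritative, u)
		} else {
			sightings = append(sightings, u)
		}
	}
	return authoritative, sightings
}

func main() {
	updates := []*deviceUpdate{
		newPollerUpdate("p1"),
		{DeviceID: "default:10.0.0.7"},
	}
	auth, sightings := splitUpdates(updates)
	fmt.Println(len(auth), len(sightings))
}
```

One trade-off to weigh: a flag on the model is only as trustworthy as its producers, so updates arriving from outside the service constructors would still need validation before the flag is honored.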
Reference
carverauto/serviceradar!2483