carverauto/serviceradar

Fork 0

phase 2 work complete #2402

Merged

mfreeman451 merged 13 commits from refs/pull/2402/head into main

2025-11-07 18:37:19 +00:00

mfreeman451 commented

2025-11-05 06:47:01 +00:00

(Migrated from github.com)

Owner

Imported from GitHub pull request.

Original GitHub pull request: #1925
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/1925
Original created: 2025-11-05T06:47:01Z
Original updated: 2025-11-07T18:37:58Z
Original head: carverauto/serviceradar:1924-architecture-refactor-move-device-state-to-registry-treat-proton-as-olap-warehouse
Original base: main
Original merged: 2025-11-07T18:37:19Z by @mfreeman451

User description

IMPORTANT: Please sign the Developer Certificate of Origin

Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include
a DCO sign-off statement indicating the DCO acceptance in one commit message. Here
is an example DCO Signed-off-by line in a commit message:

Signed-off-by: J. Doe <j.doe@domain.com>

Describe your changes

Issue ticket number and link

Code checklist before requesting a review

I have signed the DCO?
The build completes without errors?
All tests are passing when running make test?

PR Type

Enhancement, Tests

Description

Moved device state management from database to in-memory registry: Implemented DeviceRecord struct and in-memory storage with multi-index lookup (by ID, IP, MAC) to enable fast device queries without warehouse access
Implemented trigram-based device search: Added full-text search index with ranking logic for efficient device discovery by ID, IP, and hostname
Added collector capability tracking: Introduced CollectorCapability model and in-memory capability index to track which collectors (SNMP, ICMP, Sysmon, agents, pollers, checkers) are available for each device
Refactored API layer to use registry: Updated device API endpoints to query in-memory registry instead of database, with fallback to warehouse for missing data
Implemented registry hydration: Added batch loading of devices from Proton warehouse into in-memory registry on server startup
Simplified capability response generation: Removed 150+ lines of complex metadata-based capability derivation, replaced with explicit CollectorCapability records
Added comprehensive test coverage: Implemented tests for device storage, transformation, search, capability indexing, and registry hydration
Updated frontend types and components: Extended TypeScript interfaces and refactored device table to use new capabilities API
Optimized build configuration: Simplified Bazel build files using glob patterns for registry and core packages

Diagram Walkthrough

flowchart LR
  Proton["Proton Warehouse<br/>OLAP Store"]
  Hydrate["HydrateFromStore<br/>Batch Load"]
  Registry["In-Memory Registry<br/>DeviceRecord + Indexes"]
  Search["SearchDevices<br/>Trigram Index"]
  Capabilities["CapabilityIndex<br/>Collector Metadata"]
  API["API Server<br/>Fast Queries"]
  
  Proton -->|"Batch Load on Startup"| Hydrate
  Hydrate --> Registry
  Registry --> Search
  Registry --> Capabilities
  Search --> API
  Capabilities --> API

File Walkthrough

Relevant files

Enhancement

21 files

registry.go `In-memory device registry with search and capability indexing` pkg/registry/registry.go Added in-memory device storage with `mu`, `devices`, `devicesByIP`, `devicesByMAC` maps and search/capability indexes Initialized new storage structures in `NewDeviceRegistry` constructor Added `applyRegistryStore` call in `ProcessBatchDeviceUpdates` to persist updates to in-memory store Implemented four new collector capability methods: `SetCollectorCapabilities`, `GetCollectorCapabilities`, `HasDeviceCapability`, `ListDevicesWithCapability` Refactored `GetDevice`, `GetDevicesByIP`, `ListDevices` to use in-memory records instead of database queries Added `SearchDevices` method with trigram index and scoring logic for device search Added `DeleteLocal` method to remove devices from registry without emitting tombstones	+195/-28
server.go `API server integration with registry search and capabilities` pkg/core/api/server.go Replaced `filterDevices` search logic with new `SearchDevices` method from device registry Updated `sendDeviceRegistryResponse` to accept `context.Context` parameter Changed collector capability retrieval from metadata derivation to explicit registry lookup via `GetCollectorCapabilities` Refactored `getDevice` to prioritize registry lookup over database fallback Updated `getDeviceMetrics` to use registry capability hints instead of inline `collectorCapabilities` struct	+67/-28
device_transform.go `Device record transformation layer implementation` pkg/registry/device_transform.go New file implementing device record transformation functions `DeviceRecordFromUnified` converts Proton unified devices to in-memory `DeviceRecord` format `UnifiedDeviceFromRecord` and `LegacyDeviceFromRecord` convert records back to API representations Helper functions for source selection, metadata cloning, and record sorting	+278/-0
device_store.go `In-memory device storage with multi-index lookup` pkg/registry/device_store.go New file implementing in-memory device storage and indexing `UpsertDeviceRecord` maintains primary device map and IP/MAC lookup indexes `DeleteDeviceRecord` removes devices and cleans up all indexes `FindDevicesByIP` and `FindDevicesByMAC` query indexed lookups `snapshotRecords` provides thread-safe read-only view of all devices Defensive cloning ensures registry state isolation from caller mutations	+275/-0
collectors.go `Simplified collector capability response generation` pkg/core/api/collectors.go Removed complex metadata-based capability derivation logic (150+ lines) Simplified `toCollectorCapabilityResponse` to work with explicit `CollectorCapability` records New response includes `Capabilities`, `AgentID`, `PollerID`, `ServiceName`, `LastSeen` fields Capability flags (`SupportsICMP`, `SupportsSNMP`, `SupportsSysmon`) derived from capability list	+40/-176
capabilities.go `In-memory collector capability index implementation` pkg/registry/capabilities.go New file implementing `CapabilityIndex` for in-memory collector capability tracking `Set` stores/updates capability records with deduplication and normalization `Get` returns defensive copy of capability record `HasCapability` and `ListDevicesWithCapability` query capability indexes Maintains bidirectional indexes: device→capabilities and capability→devices	+206/-0
device_store_update.go `Device update application and merging logic` pkg/registry/device_store_update.go New file implementing device update application logic `applyRegistryStore` processes batch updates and tombstones `deviceRecordFromUpdate` merges updates with existing records, preserving timestamps Helper functions for metadata merging, discovery source deduplication, and deletion detection	+221/-0
trigram_index.go `Trigram-based device search index implementation` pkg/registry/trigram_index.go New file implementing trigram-based full-text search index `Add` indexes device text with trigram decomposition `Search` returns ranked matches using trigram scoring `Remove` cleans up trigram postings Supports short query fallback with substring matching	+210/-0
device_registry.go `Device deletion integration with registry` pkg/core/api/device_registry.go Added `DeleteDeviceRecord` call in `deleteDevice` to remove from in-memory registry Fixed indentation in checker component type case statement	+13/-5
capabilities.go `Collector capability management utilities` pkg/core/capabilities.go New file implementing capability management utilities `normalizeCapabilities` deduplicates and sorts capability lists `mergeCollectorCapabilityRecord` merges new capabilities with existing records `upsertCollectorCapabilities` helper for registry updates	+129/-0
pollers.go `Collector capability recording for service registration` pkg/core/pollers.go Added `upsertCollectorCapabilities` calls in poller, agent, and checker registration Poller registration records `poller` capability Agent registration records `agent` capability Checker registration records checker kind or `checker` capability	+25/-2
hydrate.go `Registry hydration from persistent storage` pkg/registry/hydrate.go New file implementing registry hydration from database `HydrateFromStore` loads devices in batches and populates in-memory indexes `replaceAll` atomically replaces registry state with new records	+76/-0
metrics.go `Collector capability recording for metrics` pkg/core/metrics.go Added `upsertCollectorCapabilities` calls in SNMP, ICMP, and Sysmon metric processing Records appropriate capability type for each metric source	+11/-0
devices.go `Collector capability recording for service devices` pkg/core/devices.go Added `upsertCollectorCapabilities` call in `ensureServiceDevice` for checker registration Normalizes service type and name as capabilities	+4/-0
interfaces.go `Manager interface capability method additions` pkg/registry/interfaces.go Added four new methods to `Manager` interface for capability operations `SetCollectorCapabilities`, `GetCollectorCapabilities`, `HasDeviceCapability`, `ListDevicesWithCapability`	+12/-0
server.go `Initialize device registry hydration from Proton warehouse` pkg/core/server.go Added device registry hydration from Proton store on server initialization Logs device count on successful hydration or warning if hydration fails Allows server to start with cold cache if hydration encounters errors	+8/-0
device.go `Add DeviceRecord struct for registry state management` pkg/registry/device.go New file defining `DeviceRecord` struct for in-memory device state Captures authoritative device fields including ID, IP, poller/agent IDs, and capabilities Enables registry to answer state queries without accessing warehouse	+25/-0
types.go `Add GetCollectorCapabilities method to registry interface` pkg/core/api/types.go Added `GetCollectorCapabilities` method to `DeviceRegistryService` interface Returns collector capability information for a specific device	+1/-0
collector.go `Add CollectorCapability model for device collectors` pkg/models/collector.go New file defining `CollectorCapability` struct for collector metadata Includes device ID, capabilities list, agent/poller IDs, and service information Provides JSON serialization for API responses	+14/-0
devices.ts `Extend CollectorCapabilities interface with new fields` web/src/types/devices.ts Extended `CollectorCapabilities` interface with new fields Added `capabilities` array, `agent_id`, `poller_id`, `service_name`, and `last_seen` fields Aligns TypeScript types with backend collector capability model	+5/-0
DeviceTable.tsx `Refactor device collector info extraction to use capabilities API` web/src/components/Devices/DeviceTable.tsx Refactored `extractCollectorInfo` to use `capabilities` array from API instead of metadata parsing Simplified logic by removing complex metadata extraction and service type registration Added `capabilities`, `agentId`, `pollerId`, and `lastSeen` to `CollectorInfo` type Improved collector detection to prioritize explicit capabilities over metadata hints	+52/-125

Tests

10 files

collectors_test.go `Simplified collector capability response tests` pkg/core/api/collectors_test.go Removed 252 lines of tests for metadata-based capability derivation Added simplified tests for `toCollectorCapabilityResponse` with explicit `CollectorCapability` records Tests now verify flag conversion from capability list instead of metadata inspection	+29/-241
device_transform_test.go `Device transformation function tests` pkg/registry/device_transform_test.go New test file for device transformation functions Tests `DeviceRecordFromUnified` with metadata, hostname, MAC, and discovery source handling Tests `UnifiedDeviceFromRecord` and `LegacyDeviceFromRecord` conversions Verifies defensive copying prevents mutation of original data	+170/-0
registry_test.go `Registry store and search functionality tests` pkg/registry/registry_test.go Added `TestProcessBatchDeviceUpdatesUpdatesStore` verifying updates persist to in-memory store Added `TestProcessBatchDeviceUpdatesRemovesDeletedRecords` testing deletion metadata handling Added `TestSearchDevices` testing search ranking by device ID, IP, hostname, and recency	+156/-0
device_store_test.go `Device storage and indexing tests` pkg/registry/device_store_test.go New test file for device storage operations Tests upsert, retrieval, and index maintenance Tests multi-device IP lookups and MAC address parsing Verifies defensive cloning prevents external mutations	+147/-0
device_store_update_test.go `Device update application tests` pkg/registry/device_store_update_test.go New test file for device update application Tests new record creation from updates with metadata and timestamp handling Tests merging updates with existing records while preserving older timestamps Tests deletion metadata detection	+110/-0
devices_test.go `Service device registration capability tests` pkg/core/devices_test.go Updated `TestEnsureServiceDeviceRegistersOnStatusSource` to expect capability record operations Added expectations for `GetCollectorCapabilities` and `SetCollectorCapabilities` calls Verifies capability record contains service type and agent/poller IDs	+41/-26
hydrate_test.go `Registry hydration from database tests` pkg/registry/hydrate_test.go New test file for registry hydration from database Tests `HydrateFromStore` loads devices into in-memory registry Tests error handling preserves existing state Tests context cancellation handling	+119/-0
mock_registry.go `Mock registry capability method implementations` pkg/registry/mock_registry.go Added mock implementations for four new capability methods `SetCollectorCapabilities`, `GetCollectorCapabilities`, `HasDeviceCapability`, `ListDevicesWithCapability` Includes mock recorder methods for test expectations	+55/-0
capabilities_test.go `Capability index tests` pkg/registry/capabilities_test.go New test file for capability index operations Tests set/get with deduplication and normalization Tests capability queries and device listing Tests removal on empty capabilities	+71/-0
trigram_index_test.go `Trigram index search tests` pkg/registry/trigram_index_test.go New test file for trigram index search functionality Tests add, search, and remove operations Tests multi-result queries and token-based matching	+36/-0

Configuration changes

2 files

BUILD.bazel `Simplify Bazel build configuration with glob patterns` pkg/registry/BUILD.bazel Changed `srcs` from explicit file list to glob pattern `[".go"]` excluding tests Changed `go_test` srcs to use glob pattern `["_test.go"]` Reduces maintenance burden when adding new files to registry package	+5/-18
BUILD.bazel `Add new capability and test files to build configuration` pkg/core/BUILD.bazel Added `capabilities.go` to library sources Added `devices_test.go` to test sources Registers new capability and test files in build configuration	+2/-0

Imported from GitHub pull request. Original GitHub pull request: #1925 Original author: @mfreeman451 Original URL: https://github.com/carverauto/serviceradar/pull/1925 Original created: 2025-11-05T06:47:01Z Original updated: 2025-11-07T18:37:58Z Original head: carverauto/serviceradar:1924-architecture-refactor-move-device-state-to-registry-treat-proton-as-olap-warehouse Original base: main Original merged: 2025-11-07T18:37:19Z by @mfreeman451 --- ### **User description** ## IMPORTANT: Please sign the Developer Certificate of Origin Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include a [DCO sign-off statement]( https://developercertificate.org/) indicating the DCO acceptance in one commit message. Here is an example DCO Signed-off-by line in a commit message: ``` Signed-off-by: J. Doe <j.doe@domain.com> ``` ## Describe your changes ## Issue ticket number and link ## Code checklist before requesting a review - [ ] I have signed the DCO? - [ ] The build completes without errors? - [ ] All tests are passing when running make test? ___ ### **PR Type** Enhancement, Tests ___ ### **Description** - **Moved device state management from database to in-memory registry**: Implemented `DeviceRecord` struct and in-memory storage with multi-index lookup (by ID, IP, MAC) to enable fast device queries without warehouse access - **Implemented trigram-based device search**: Added full-text search index with ranking logic for efficient device discovery by ID, IP, and hostname - **Added collector capability tracking**: Introduced `CollectorCapability` model and in-memory capability index to track which collectors (SNMP, ICMP, Sysmon, agents, pollers, checkers) are available for each device - **Refactored API layer to use registry**: Updated device API endpoints to query in-memory registry instead of database, with fallback to warehouse for missing data - **Implemented registry hydration**: Added batch loading of devices from Proton warehouse into in-memory registry on server startup - **Simplified capability response generation**: Removed 150+ lines of complex metadata-based capability derivation, replaced with explicit `CollectorCapability` records - **Added comprehensive test coverage**: Implemented tests for device storage, transformation, search, capability indexing, and registry hydration - **Updated frontend types and components**: Extended TypeScript interfaces and refactored device table to use new capabilities API - **Optimized build configuration**: Simplified Bazel build files using glob patterns for registry and core packages ___ ### Diagram Walkthrough ```mermaid flowchart LR Proton["Proton Warehouse OLAP Store"] Hydrate["HydrateFromStore Batch Load"] Registry["In-Memory Registry DeviceRecord + Indexes"] Search["SearchDevices Trigram Index"] Capabilities["CapabilityIndex Collector Metadata"] API["API Server Fast Queries"] Proton -->|"Batch Load on Startup"| Hydrate Hydrate --> Registry Registry --> Search Registry --> Capabilities Search --> API Capabilities --> API ``` <details> <summary><h3> File Walkthrough</h3></summary> <table><thead><tr><th></th><th align="left">Relevant files</th></tr></thead><tbody><tr><td>Enhancement</td><td><details><summary>21 files</summary><table> <tr> <td> <details> <summary>registry.go<dd><code>In-memory device registry with search and capability indexing</code></dd></summary> <hr> pkg/registry/registry.go <ul><li>Added in-memory device storage with <code>mu</code>, <code>devices</code>, <code>devicesByIP</code>, <code>devicesByMAC</code> maps and search/capability indexes <li> Initialized new storage structures in <code>NewDeviceRegistry</code> constructor <li> Added <code>applyRegistryStore</code> call in <code>ProcessBatchDeviceUpdates</code> to persist updates to in-memory store <li> Implemented four new collector capability methods: <code>SetCollectorCapabilities</code>, <code>GetCollectorCapabilities</code>, <code>HasDeviceCapability</code>, <code>ListDevicesWithCapability</code> <li> Refactored <code>GetDevice</code>, <code>GetDevicesByIP</code>, <code>ListDevices</code> to use in-memory records instead of database queries <li> Added <code>SearchDevices</code> method with trigram index and scoring logic for device search <li> Added <code>DeleteLocal</code> method to remove devices from registry without emitting tombstones</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-cb61d8f79451b9541de4a8cc0811523a68d15452b2f5971c7618ea5b423cf4ec">+195/-28</a></td> </tr> <tr> <td> <details> <summary>server.go<dd><code>API server integration with registry search and capabilities</code></dd></summary> <hr> pkg/core/api/server.go <ul><li>Replaced <code>filterDevices</code> search logic with new <code>SearchDevices</code> method from device registry <li> Updated <code>sendDeviceRegistryResponse</code> to accept <code>context.Context</code> parameter <li> Changed collector capability retrieval from metadata derivation to explicit registry lookup via <code>GetCollectorCapabilities</code> <li> Refactored <code>getDevice</code> to prioritize registry lookup over database fallback <li> Updated <code>getDeviceMetrics</code> to use registry capability hints instead of inline <code>collectorCapabilities</code> struct</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-1bb99367fdd853c728b7cfbf5893a293f6d217144dfb5282cb8dd32e5261021e">+67/-28</a>  </td> </tr> <tr> <td> <details> <summary>device_transform.go<dd><code>Device record transformation layer implementation</code>                </dd></summary> <hr> pkg/registry/device_transform.go <ul><li>New file implementing device record transformation functions <li> <code>DeviceRecordFromUnified</code> converts Proton unified devices to in-memory <code>DeviceRecord</code> format <li> <code>UnifiedDeviceFromRecord</code> and <code>LegacyDeviceFromRecord</code> convert records back to API representations <li> Helper functions for source selection, metadata cloning, and record sorting</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-ece077c963daf0f46cb4cfe0e938ff22569f0d29da9376f3f78bd5bb8e586836">+278/-0</a>  </td> </tr> <tr> <td> <details> <summary>device_store.go<dd><code>In-memory device storage with multi-index lookup</code>                  </dd></summary> <hr> pkg/registry/device_store.go <ul><li>New file implementing in-memory device storage and indexing <li> <code>UpsertDeviceRecord</code> maintains primary device map and IP/MAC lookup indexes <li> <code>DeleteDeviceRecord</code> removes devices and cleans up all indexes <li> <code>FindDevicesByIP</code> and <code>FindDevicesByMAC</code> query indexed lookups <li> <code>snapshotRecords</code> provides thread-safe read-only view of all devices <li> Defensive cloning ensures registry state isolation from caller mutations</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-40d722882cf0d1b27b15d87cd15166e070641f4e77f720da2db89201c7ccfba9">+275/-0</a>  </td> </tr> <tr> <td> <details> <summary>collectors.go<dd><code>Simplified collector capability response generation</code>            </dd></summary> <hr> pkg/core/api/collectors.go <ul><li>Removed complex metadata-based capability derivation logic (150+ lines) <li> Simplified <code>toCollectorCapabilityResponse</code> to work with explicit <code>CollectorCapability</code> records <li> New response includes <code>Capabilities</code>, <code>AgentID</code>, <code>PollerID</code>, <code>ServiceName</code>, <code>LastSeen</code> fields <li> Capability flags (<code>SupportsICMP</code>, <code>SupportsSNMP</code>, <code>SupportsSysmon</code>) derived from capability list</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-ea8e042db40227646592d91e2cabf35dfbad5fdcd4d7e48566768ec9cd91b50a">+40/-176</a></td> </tr> <tr> <td> <details> <summary>capabilities.go<dd><code>In-memory collector capability index implementation</code>            </dd></summary> <hr> pkg/registry/capabilities.go <ul><li>New file implementing <code>CapabilityIndex</code> for in-memory collector capability tracking <li> <code>Set</code> stores/updates capability records with deduplication and normalization <li> <code>Get</code> returns defensive copy of capability record <li> <code>HasCapability</code> and <code>ListDevicesWithCapability</code> query capability indexes <li> Maintains bidirectional indexes: device→capabilities and capability→devices</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-6ecec5065cbadea2f6285ad9ab39ed4906dd94d5b8dc586bf3a544d2121cc7b4">+206/-0</a>  </td> </tr> <tr> <td> <details> <summary>device_store_update.go<dd><code>Device update application and merging logic</code>                            </dd></summary> <hr> pkg/registry/device_store_update.go <ul><li>New file implementing device update application logic <li> <code>applyRegistryStore</code> processes batch updates and tombstones <li> <code>deviceRecordFromUpdate</code> merges updates with existing records, preserving timestamps <li> Helper functions for metadata merging, discovery source deduplication, and deletion detection</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-bebfdd26ec81773366665f5496e6f7af538de7a826443c9b5942a66c7df6c407">+221/-0</a>  </td> </tr> <tr> <td> <details> <summary>trigram_index.go<dd><code>Trigram-based device search index implementation</code>                  </dd></summary> <hr> pkg/registry/trigram_index.go <ul><li>New file implementing trigram-based full-text search index <li> <code>Add</code> indexes device text with trigram decomposition <li> <code>Search</code> returns ranked matches using trigram scoring <li> <code>Remove</code> cleans up trigram postings <li> Supports short query fallback with substring matching</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-d15089251c1f11295acd5dbb6ca83aecaa384f91e49522b87e08f33ec7fd5138">+210/-0</a>  </td> </tr> <tr> <td> <details> <summary>device_registry.go<dd><code>Device deletion integration with registry</code>                                </dd></summary> <hr> pkg/core/api/device_registry.go <ul><li>Added <code>DeleteDeviceRecord</code> call in <code>deleteDevice</code> to remove from in-memory registry <li> Fixed indentation in checker component type case statement</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-34b7da1b2845de83ee2b0eeba93ef1c8b7abf40517f20617c861abffec32ee1c">+13/-5</a>    </td> </tr> <tr> <td> <details> <summary>capabilities.go<dd><code>Collector capability management utilities</code>                                </dd></summary> <hr> pkg/core/capabilities.go <ul><li>New file implementing capability management utilities <li> <code>normalizeCapabilities</code> deduplicates and sorts capability lists <li> <code>mergeCollectorCapabilityRecord</code> merges new capabilities with existing records <li> <code>upsertCollectorCapabilities</code> helper for registry updates</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-4084a2d451d8d35a130d6ead2417126a70dd4929ffb18004640c4f7a91d44ce5">+129/-0</a>  </td> </tr> <tr> <td> <details> <summary>pollers.go<dd><code>Collector capability recording for service registration</code>    </dd></summary> <hr> pkg/core/pollers.go <ul><li>Added <code>upsertCollectorCapabilities</code> calls in poller, agent, and checker registration <li> Poller registration records <code>poller</code> capability <li> Agent registration records <code>agent</code> capability <li> Checker registration records checker kind or <code>checker</code> capability</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-fe81e2a32f1ac64bcdc6f25f55c5fa918d17bad8c0546f2cf80c757ff4051816">+25/-2</a>    </td> </tr> <tr> <td> <details> <summary>hydrate.go<dd><code>Registry hydration from persistent storage</code>                              </dd></summary> <hr> pkg/registry/hydrate.go <ul><li>New file implementing registry hydration from database <li> <code>HydrateFromStore</code> loads devices in batches and populates in-memory indexes <li> <code>replaceAll</code> atomically replaces registry state with new records</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-e27e237716b5830731f55caab246c132a8ba98f3b37fe5e39de3867038941e4d">+76/-0</a>    </td> </tr> <tr> <td> <details> <summary>metrics.go<dd><code>Collector capability recording for metrics</code>                              </dd></summary> <hr> pkg/core/metrics.go <ul><li>Added <code>upsertCollectorCapabilities</code> calls in SNMP, ICMP, and Sysmon metric processing <li> Records appropriate capability type for each metric source</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-6d98e853ce17576c088e77956ae4ecfa8078019e0bff107a79d8d1d6ed2443ad">+11/-0</a>    </td> </tr> <tr> <td> <details> <summary>devices.go<dd><code>Collector capability recording for service devices</code>              </dd></summary> <hr> pkg/core/devices.go <ul><li>Added <code>upsertCollectorCapabilities</code> call in <code>ensureServiceDevice</code> for checker registration <li> Normalizes service type and name as capabilities</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-fb7f5159558f22afdc809b9cbf5a9c0f71560be2f7389c944da8f1604998ea48">+4/-0</a>      </td> </tr> <tr> <td> <details> <summary>interfaces.go<dd><code>Manager interface capability method additions</code>                        </dd></summary> <hr> pkg/registry/interfaces.go <ul><li>Added four new methods to <code>Manager</code> interface for capability operations <li> <code>SetCollectorCapabilities</code>, <code>GetCollectorCapabilities</code>, <code>HasDeviceCapability</code>, <code>ListDevicesWithCapability</code></ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-21496eb3d34ec02e874f36c98cdd2ca7f39130369caf774c672dcf4be1619503">+12/-0</a>    </td> </tr> <tr> <td> <details> <summary>server.go<dd><code>Initialize device registry hydration from Proton warehouse</code></dd></summary> <hr> pkg/core/server.go <ul><li>Added device registry hydration from Proton store on server initialization <li> Logs device count on successful hydration or warning if hydration fails <li> Allows server to start with cold cache if hydration encounters errors</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-a12d34beb8640efbbedad28f8610fd58afeea457572d82e4fd295ab2a6bb8ee6">+8/-0</a>      </td> </tr> <tr> <td> <details> <summary>device.go<dd><code>Add DeviceRecord struct for registry state management</code>        </dd></summary> <hr> pkg/registry/device.go <ul><li>New file defining <code>DeviceRecord</code> struct for in-memory device state <li> Captures authoritative device fields including ID, IP, poller/agent IDs, and capabilities <li> Enables registry to answer state queries without accessing warehouse</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-a71b8e62531ef2608845e1f2581bb30c62fe3b77b8a98094723d5cfa0e7571a2">+25/-0</a>    </td> </tr> <tr> <td> <details> <summary>types.go<dd><code>Add GetCollectorCapabilities method to registry interface</code></dd></summary> <hr> pkg/core/api/types.go <ul><li>Added <code>GetCollectorCapabilities</code> method to <code>DeviceRegistryService</code> interface <li> Returns collector capability information for a specific device</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-39f024121630282f9e3eeee5b77e5a63a87950b9eae4f479277a289796b42d56">+1/-0</a>      </td> </tr> <tr> <td> <details> <summary>collector.go<dd><code>Add CollectorCapability model for device collectors</code>            </dd></summary> <hr> pkg/models/collector.go <ul><li>New file defining <code>CollectorCapability</code> struct for collector metadata <li> Includes device ID, capabilities list, agent/poller IDs, and service information <li> Provides JSON serialization for API responses</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-63afb8aba7142e90fa337e411d33a535bec0d57219a0c6916b7cf8737fabe552">+14/-0</a>    </td> </tr> <tr> <td> <details> <summary>devices.ts<dd><code>Extend CollectorCapabilities interface with new fields</code>      </dd></summary> <hr> web/src/types/devices.ts <ul><li>Extended <code>CollectorCapabilities</code> interface with new fields <li> Added <code>capabilities</code> array, <code>agent_id</code>, <code>poller_id</code>, <code>service_name</code>, and <code>last_seen</code> fields <li> Aligns TypeScript types with backend collector capability model</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-11c8b544ff6fb09788d7629254fc37c177d2d1d8e087a4263caa6dceea6059f0">+5/-0</a>      </td> </tr> <tr> <td> <details> <summary>DeviceTable.tsx<dd><code>Refactor device collector info extraction to use capabilities API</code></dd></summary> <hr> web/src/components/Devices/DeviceTable.tsx <ul><li>Refactored <code>extractCollectorInfo</code> to use <code>capabilities</code> array from API instead of metadata parsing <li> Simplified logic by removing complex metadata extraction and service type registration <li> Added <code>capabilities</code>, <code>agentId</code>, <code>pollerId</code>, and <code>lastSeen</code> to <code>CollectorInfo</code> type <li> Improved collector detection to prioritize explicit capabilities over metadata hints</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-62d65ad4e929bc5554024981aa1c3418d70d54333e3afd80af69d93cea2edec3">+52/-125</a></td> </tr> </table></details></td></tr><tr><td>Tests</td><td><details><summary>10 files</summary><table> <tr> <td> <details> <summary>collectors_test.go<dd><code>Simplified collector capability response tests</code>                      </dd></summary> <hr> pkg/core/api/collectors_test.go <ul><li>Removed 252 lines of tests for metadata-based capability derivation <li> Added simplified tests for <code>toCollectorCapabilityResponse</code> with explicit <code>CollectorCapability</code> records <li> Tests now verify flag conversion from capability list instead of metadata inspection</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-55e61784b820564f9424218a59dbc62d3c358156488b8beb09bdcd1894172e95">+29/-241</a></td> </tr> <tr> <td> <details> <summary>device_transform_test.go<dd><code>Device transformation function tests</code>                                          </dd></summary> <hr> pkg/registry/device_transform_test.go <ul><li>New test file for device transformation functions <li> Tests <code>DeviceRecordFromUnified</code> with metadata, hostname, MAC, and discovery source handling <li> Tests <code>UnifiedDeviceFromRecord</code> and <code>LegacyDeviceFromRecord</code> conversions <li> Verifies defensive copying prevents mutation of original data</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-a58d6e3f70cb01404143759590961e48032011956bb29fa8a830c741ea15fbb5">+170/-0</a>  </td> </tr> <tr> <td> <details> <summary>registry_test.go<dd><code>Registry store and search functionality tests</code>                        </dd></summary> <hr> pkg/registry/registry_test.go <ul><li>Added <code>TestProcessBatchDeviceUpdatesUpdatesStore</code> verifying updates persist to in-memory store <li> Added <code>TestProcessBatchDeviceUpdatesRemovesDeletedRecords</code> testing deletion metadata handling <li> Added <code>TestSearchDevices</code> testing search ranking by device ID, IP, hostname, and recency</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-f010972d104404be52d2a8e6e784cb56e31194f90795a69571a12696bcbdc075">+156/-0</a>  </td> </tr> <tr> <td> <details> <summary>device_store_test.go<dd><code>Device storage and indexing tests</code>                                                </dd></summary> <hr> pkg/registry/device_store_test.go <ul><li>New test file for device storage operations <li> Tests upsert, retrieval, and index maintenance <li> Tests multi-device IP lookups and MAC address parsing <li> Verifies defensive cloning prevents external mutations</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-8768ba941333f15376041e9abc2080bc00a813c32cb2cef78dac197e346644e3">+147/-0</a>  </td> </tr> <tr> <td> <details> <summary>device_store_update_test.go<dd><code>Device update application tests</code>                                                    </dd></summary> <hr> pkg/registry/device_store_update_test.go <ul><li>New test file for device update application <li> Tests new record creation from updates with metadata and timestamp handling <li> Tests merging updates with existing records while preserving older timestamps <li> Tests deletion metadata detection</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-faeaf1af263e0a4ac29579fd17652337bc20008e2bbf525f8801b381261a9017">+110/-0</a>  </td> </tr> <tr> <td> <details> <summary>devices_test.go<dd><code>Service device registration capability tests</code>                          </dd></summary> <hr> pkg/core/devices_test.go <ul><li>Updated <code>TestEnsureServiceDeviceRegistersOnStatusSource</code> to expect capability record operations <li> Added expectations for <code>GetCollectorCapabilities</code> and <code>SetCollectorCapabilities</code> calls <li> Verifies capability record contains service type and agent/poller IDs</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-133a0e1348e2d2dcfc9645f0409219463138e17380efd4fbc213be14c104043b">+41/-26</a>  </td> </tr> <tr> <td> <details> <summary>hydrate_test.go<dd><code>Registry hydration from database tests</code>                                      </dd></summary> <hr> pkg/registry/hydrate_test.go <ul><li>New test file for registry hydration from database <li> Tests <code>HydrateFromStore</code> loads devices into in-memory registry <li> Tests error handling preserves existing state <li> Tests context cancellation handling</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-806ca574aab8dd9d007c842838d807534833a38e0d30569f943bb3a6df76d0fe">+119/-0</a>  </td> </tr> <tr> <td> <details> <summary>mock_registry.go<dd><code>Mock registry capability method implementations</code>                    </dd></summary> <hr> pkg/registry/mock_registry.go <ul><li>Added mock implementations for four new capability methods <li> <code>SetCollectorCapabilities</code>, <code>GetCollectorCapabilities</code>, <code>HasDeviceCapability</code>, <code>ListDevicesWithCapability</code> <li> Includes mock recorder methods for test expectations</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-1c3085186c5d9b0c552ee50f98a3ff2251904cf3c9adedfd693f16488539f0bb">+55/-0</a>    </td> </tr> <tr> <td> <details> <summary>capabilities_test.go<dd><code>Capability index tests</code>                                                                      </dd></summary> <hr> pkg/registry/capabilities_test.go <ul><li>New test file for capability index operations <li> Tests set/get with deduplication and normalization <li> Tests capability queries and device listing <li> Tests removal on empty capabilities</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-d964d06970947f12ecef0023b30d65997cfc446d04c274299425c1c12971c921">+71/-0</a>    </td> </tr> <tr> <td> <details> <summary>trigram_index_test.go<dd><code>Trigram index search tests</code>                                                              </dd></summary> <hr> pkg/registry/trigram_index_test.go <ul><li>New test file for trigram index search functionality <li> Tests add, search, and remove operations <li> Tests multi-result queries and token-based matching</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-5c9a46ee3386c76a6fc7498e31ce304a41a5716bbc725ec36c62874983e8db5b">+36/-0</a>    </td> </tr> </table></details></td></tr><tr><td>Configuration changes</td><td><details><summary>2 files</summary><table> <tr> <td> <details> <summary>BUILD.bazel<dd><code>Simplify Bazel build configuration with glob patterns</code>        </dd></summary> <hr> pkg/registry/BUILD.bazel <ul><li>Changed <code>srcs</code> from explicit file list to glob pattern <code>["*.go"]</code> excluding tests <li> Changed <code>go_test</code> srcs to use glob pattern <code>["*_test.go"]</code> <li> Reduces maintenance burden when adding new files to registry package</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-3e838afce9a84935e04b7ff8fd3e48d5452c21538a3ea1d36e3fd00aa3c30cd0">+5/-18</a>    </td> </tr> <tr> <td> <details> <summary>BUILD.bazel<dd><code>Add new capability and test files to build configuration</code>  </dd></summary> <hr> pkg/core/BUILD.bazel <ul><li>Added <code>capabilities.go</code> to library sources <li> Added <code>devices_test.go</code> to test sources <li> Registers new capability and test files in build configuration</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-af133ce6c45ec51c66bcb5fe0b424291d78688285b614275765003b83c177534">+2/-0</a>      </td> </tr> </table></details></td></tr></tr></tbody></table> </details> ___

qodo-code-review[bot] commented

2025-11-05 06:48:18 +00:00

(Migrated from github.com)

Author

Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1925#issuecomment-3489630672
Original created: 2025-11-05T06:48:18Z

PR Compliance Guide 🔍

(Compliance updated until commit `github.com/carverauto/serviceradar@da2ee0f076`)

Below is a summary of compliance checks for this PR:

Security Compliance

⚪

Data exposure via search

Description: The device search handler returns raw SRQL rows in response (field RawResults and fallback
to result.Rows), potentially exposing unvetted warehouse data and internal schema fields
directly to clients.
server.go [1847-1906]

Referred Code

plannerReq := &search.Request{
	Query:   request.Query,
	Mode:    search.Mode(request.Mode),
	Filters: request.Filters,
	Pagination: search.Pagination{
		Limit:     request.Pagination.Limit,
		Offset:    request.Pagination.Offset,
		Cursor:    request.Pagination.Cursor,
		Direction: request.Pagination.Direction,
	},
}

result, err := s.searchPlanner.Search(ctx, plannerReq)
if err != nil {
	s.logger.Error().Err(err).Msg("Device search planner failed")
	writeError(w, err.Error(), http.StatusInternalServerError)
	return
}

diagnostics := make(map[string]any, len(result.Diagnostics)+2)
for k, v := range result.Diagnostics {


 ... (clipped 39 lines)

Sensitive information exposure

Description: Capability events and metadata include multiple IDs and network details taken from inputs
without normalization (e.g., agent_id, poller_id, target_host), which could be logged or
stored and later exposed; ensure these fields are sanitized and not echoed to clients.
metrics.go [728-776]

Referred Code


s.enqueueServiceDeviceUpdate(update)

s.upsertCollectorCapabilities(ctx, deviceID, []string{"icmp"}, agentID, pollerID, serviceName, now)

eventMetadata := map[string]any{
	"collector_ip":      sourceIP,
	"response_time_ms":  pingResult.ResponseTime,
	"packet_loss":       pingResult.PacketLoss,
	"available":         pingResult.Available,
	"partition":         partition,
	"host_device_id":    hostDeviceID,
	"host_last_seen_ip": updateIP,
}
if agentID != "" {
	eventMetadata["agent_id"] = agentID
}
if pollerID != "" {
	eventMetadata["poller_id"] = pollerID
}
if targetHost != "" {


 ... (clipped 28 lines)

Sensitive information exposure

Description: Stats discrepancy diagnostics may log lists of missing/excess device IDs
(missing_device_ids, excess_device_ids) which could leak inventory identifiers in logs
accessible beyond intended scope.
stats_aggregator.go [732-787]

Referred Code

func (a *StatsAggregator) maybeReportDiscrepancy(ctx context.Context, records []*registry.DeviceRecord, registryTotal int, protonSnapshot int64) {
	if a.dbService == nil || a.registry == nil {
		return
	}

	protonTotal := protonSnapshot
	var err error
	if protonTotal <= 0 {
		protonTotal, err = a.dbService.CountUnifiedDevices(ctx)
		if err != nil {
			a.logger.Warn().Err(err).Msg("Failed to count Proton devices during stats diagnostics")
			return
		}
	}

	if int64(registryTotal) == protonTotal {
		return
	}

	if !a.lastMismatchLog.IsZero() && a.mismatchLogInterval > 0 && a.now().Before(a.lastMismatchLog.Add(a.mismatchLogInterval)) {
		return


 ... (clipped 35 lines)

Ticket Compliance

🟡

🎫 #1924

🟢	Define CollectorCapability schema and implement in-memory capability index; stop deriving capability from metadata
	Update agent/poller registration and processing paths to emit capabilities and capability events
	Update API endpoints to use the in-memory registry for device/collector data with Proton fallback only when necessary
	Add device search using a new in-memory search index and planner
	Implement StatsAggregator that runs periodically, exposes a cached snapshot via API, and removes SRQL-driven stats from UI
	Add capability snapshots and matrix support to answer "last successful capability" style questions
	Provide registry hydration from Proton on startup and keep it in sync
	Enforce boundary to prefer registry over Proton, with a flag to disable Proton fallback in certain paths
	Add tests for registry, capabilities, search, hydration, and stats
⚪	Verify UI/frontend changes correctly consume new /stats and capabilities APIs and no longer issue SRQL/stat queries
	Validate production performance targets (latency and Proton CPU) after deployment
	End-to-end verification that all API paths avoid Proton for state except analytics, including linter/middleware enforcement

Codebase Duplication Compliance

⚪

Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance

🟢

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

🔴

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status:
Internal error leak: The device search handler returns err.Error() directly to clients on planner failure,
potentially exposing internal system details.

Referred Code

if err != nil {
	s.logger.Error().Err(err).Msg("Device search planner failed")
	writeError(w, err.Error(), http.StatusInternalServerError)
	return

⚪

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Missing audit logs: New critical endpoints and capability events (e.g., device search, stats, critical logs,
capability recordings) lack explicit audit logging of actor identity and action outcomes,
making audit trail completeness uncertain.

Referred Code

// SRQL HTTP endpoint removed; SRQL moves to OCaml service

// Device-centric endpoints
if s.searchPlanner != nil {
	protected.HandleFunc("/devices/search", s.handleDeviceSearch).Methods("POST")
}
protected.HandleFunc("/devices", s.getDevices).Methods("GET")
protected.HandleFunc("/devices/{id}", s.handleDeviceByID).Methods("GET", "DELETE")
protected.HandleFunc("/devices/{id}/registry", s.getDeviceRegistryInfo).Methods("GET")
protected.HandleFunc("/devices/{id}/metrics", s.getDeviceMetrics).Methods("GET")
protected.HandleFunc("/devices/metrics/status", s.getDeviceMetricsStatus).Methods("GET")
protected.HandleFunc("/devices/snmp/status", s.getDeviceSNMPStatus).Methods("POST")
protected.HandleFunc("/stats", s.handleDeviceStats).Methods("GET")
protected.HandleFunc("/logs/critical", s.handleCriticalLogs).Methods("GET")
protected.HandleFunc("/logs/critical/counters", s.handleCriticalLogCounters).Methods("GET")

// Store reference to protected router for MCP routes
s.protectedRouter = protected

// Admin routes with RBAC protection
// These routes use the route protection configuration from core.json


 ... (clipped 1061 lines)

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Input validation gaps: The device search handler minimally validates request fields (e.g., pagination limits,
mode) and passes them through, which may miss edge-case validation and error context for
invalid inputs.

Referred Code

func (s *APIServer) handleDeviceSearch(w http.ResponseWriter, r *http.Request) {
	if s.searchPlanner == nil {
		writeError(w, "device search planner unavailable", http.StatusServiceUnavailable)
		return
	}

	var request deviceSearchRequest
	if err := json.NewDecoder(r.Body).Decode(&request); err != nil {
		writeError(w, "invalid search request payload", http.StatusBadRequest)
		return
	}

	if request.Filters == nil {
		request.Filters = make(map[string]string)
	}

	ctx, cancel := context.WithTimeout(r.Context(), defaultTimeout)
	defer cancel()

	plannerReq := &search.Request{
		Query:   request.Query,


 ... (clipped 9 lines)

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
Log metadata risk: New logging and capability events include device IDs, IPs, hostnames, and errors that may
contain sensitive data; ensure policies and redaction guardrails prevent PII/secret
exposure.

Referred Code

	return fmt.Errorf("failed to process batch SNMP target devices: %w", err)
}

for _, update := range deviceUpdates {
	if update == nil || strings.TrimSpace(update.DeviceID) == "" {
		continue
	}
	s.upsertCollectorCapabilities(ctx, update.DeviceID, []string{"snmp"}, agentID, pollerID, "snmp", timestamp)

	status := statusByDeviceID[update.DeviceID]
	metadata := map[string]any{
		"agent_id":  agentID,
		"poller_id": pollerID,
		"partition": partition,
	}
	if update.IP != "" {
		metadata["target_ip"] = update.IP
	}
	if update.Hostname != nil && *update.Hostname != "" {
		metadata["target_hostname"] = *update.Hostname
	}


 ... (clipped 513 lines)

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Weak pagination checks: The device search API accepts pagination fields (limit/offset/cursor) without explicit
bounds or sanitization in the handler, which could allow excessive resource usage or
injection risk depending on downstream handling.

Referred Code

plannerReq := &search.Request{
	Query:   request.Query,
	Mode:    search.Mode(request.Mode),
	Filters: request.Filters,
	Pagination: search.Pagination{
		Limit:     request.Pagination.Limit,
		Offset:    request.Pagination.Offset,
		Cursor:    request.Pagination.Cursor,
		Direction: request.Pagination.Direction,
	},

Compliance status legend

🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

Previous compliance checks

Compliance check up to commit 057b69f

Security Compliance

⚪

Unsafe type assertion

Description: Type assertion without proper error handling could cause runtime panic if deviceRegistry
doesn't implement SearchDevices interface, potentially crashing the service.
registry.go [1766-1777]

Referred Code

Unvalidated capability exposure

Description: Capability data retrieved from registry is directly exposed in API response without
validation or sanitization, potentially leaking sensitive collector metadata.
server.go [1820-1826]

Referred Code

if s.deviceRegistry != nil {
	if record, ok := s.deviceRegistry.GetCollectorCapabilities(ctx, device.DeviceID); ok {
		if resp := toCollectorCapabilityResponse(record); resp != nil {
			entry["collector_capabilities"] = resp
		}
	}
}

Ticket Compliance

🟢

🎫 #1924

🟢	Define DeviceRecord schema in pkg/registry/device.go
	Implement DeviceRegistry with in-memory map and RWMutex
	Hydrate registry from Proton on startup (HydrateRegistryFromProton)
	Update DeviceManager.UpsertDevice() to write to both Proton and Registry
	Unit tests for registry operations
	Define CollectorCapability schema
	Implement CapabilityIndex in pkg/registry/capabilities.go
	Update agent/poller registration to emit capabilities
	Update API /devices/{id}/collectors to use registry
	Remove all metadata scraping (_alias_last_seen_service_id, etc.)
	Update UI to use new capabilities API

Codebase Duplication Compliance

⚪

Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance

🟢

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status: Passed

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

🔴

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status:
Generic variable names: Variables named 'src', 'dst', 'out' are too generic and
don't clearly express their purpose without additional context

Referred Code

func cloneMetadata(src map[string]string) map[string]string {
	if len(src) == 0 {
		return nil
	}

	out := make(map[string]string, len(src))
	for key, value := range src {
		out[key] = value
	}
	return out
}

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Silent nil handling: When search returns nil results, the code silently sets filteredDevices to nil without
logging or providing context about why the search failed

Referred Code

if searchTerm := strings.TrimSpace(params["searchTerm"].(string)); searchTerm != "" {
	if searcher, ok := s.deviceRegistry.(interface {
		SearchDevices(query string, limit int) []*models.UnifiedDevice
	}); ok {
		if searched := searcher.SearchDevices(searchTerm, params["limit"].(int)); len(searched) > 0 {
			filteredDevices = searched
		} else {
			filteredDevices = nil
		}
	}

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Missing input validation: The UpsertDeviceRecord function accepts a DeviceRecord pointer without validating critical
fields like IP address format or MAC address format before indexing

Referred Code

func (r *DeviceRegistry) UpsertDeviceRecord(record *DeviceRecord) {
	if record == nil || strings.TrimSpace(record.DeviceID) == "" {
		return
	}

	input := cloneDeviceRecord(record)

	r.mu.Lock()
	defer r.mu.Unlock()

	if existing, ok := r.devices[input.DeviceID]; ok {
		r.removeIndexesLocked(existing)
		r.removeFromSearchIndex(existing)
	}

	r.devices[input.DeviceID] = input
	r.indexRecordLocked(input)
	r.addToSearchIndex(input)
}

Imported GitHub PR comment. Original author: @qodo-code-review[bot] Original URL: https://github.com/carverauto/serviceradar/pull/1925#issuecomment-3489630672 Original created: 2025-11-05T06:48:18Z --- ## PR Compliance Guide 🔍  #### (Compliance updated until commit https://github.com/carverauto/serviceradar/commit/da2ee0f0763a68bcfca1e9baeb7602d67f9b06b0) Below is a summary of compliance checks for this PR: <table><tbody><tr><td colspan='2'>Security Compliance</td></tr> <tr><td rowspan=3>⚪</td> <td><details><summary>Data exposure via search </summary> Description: The device search handler returns raw SRQL rows in response (field <code>RawResults</code> and fallback to <code>result.Rows</code>), potentially exposing unvetted warehouse data and internal schema fields directly to clients. <a href='https://github.com/carverauto/serviceradar/pull/1925/files#diff-1bb99367fdd853c728b7cfbf5893a293f6d217144dfb5282cb8dd32e5261021eR1847-R1906'>server.go [1847-1906]</a> <details open><summary>Referred Code</summary> ```go plannerReq := &search.Request{ Query: request.Query, Mode: search.Mode(request.Mode), Filters: request.Filters, Pagination: search.Pagination{ Limit: request.Pagination.Limit, Offset: request.Pagination.Offset, Cursor: request.Pagination.Cursor, Direction: request.Pagination.Direction, }, } result, err := s.searchPlanner.Search(ctx, plannerReq) if err != nil { s.logger.Error().Err(err).Msg("Device search planner failed") writeError(w, err.Error(), http.StatusInternalServerError) return } diagnostics := make(map[string]any, len(result.Diagnostics)+2) for k, v := range result.Diagnostics { ... (clipped 39 lines) ``` </details></details></td></tr> <tr><td><details><summary>Sensitive information exposure </summary> Description: Capability events and metadata include multiple IDs and network details taken from inputs without normalization (e.g., agent_id, poller_id, target_host), which could be logged or stored and later exposed; ensure these fields are sanitized and not echoed to clients. <a href='https://github.com/carverauto/serviceradar/pull/1925/files#diff-6d98e853ce17576c088e77956ae4ecfa8078019e0bff107a79d8d1d6ed2443adR728-R776'>metrics.go [728-776]</a> <details open><summary>Referred Code</summary> ```go s.enqueueServiceDeviceUpdate(update) s.upsertCollectorCapabilities(ctx, deviceID, []string{"icmp"}, agentID, pollerID, serviceName, now) eventMetadata := map[string]any{ "collector_ip": sourceIP, "response_time_ms": pingResult.ResponseTime, "packet_loss": pingResult.PacketLoss, "available": pingResult.Available, "partition": partition, "host_device_id": hostDeviceID, "host_last_seen_ip": updateIP, } if agentID != "" { eventMetadata["agent_id"] = agentID } if pollerID != "" { eventMetadata["poller_id"] = pollerID } if targetHost != "" { ... (clipped 28 lines) ``` </details></details></td></tr> <tr><td><details><summary>Sensitive information exposure </summary> Description: Stats discrepancy diagnostics may log lists of missing/excess device IDs (<code>missing_device_ids</code>, <code>excess_device_ids</code>) which could leak inventory identifiers in logs accessible beyond intended scope. <a href='https://github.com/carverauto/serviceradar/pull/1925/files#diff-09304e5f27f54a3a8fa2b57cc6a5485b9f7074d7554babde62b51a75753205e0R732-R787'>stats_aggregator.go [732-787]</a> <details open><summary>Referred Code</summary> ```go func (a *StatsAggregator) maybeReportDiscrepancy(ctx context.Context, records []*registry.DeviceRecord, registryTotal int, protonSnapshot int64) { if a.dbService == nil || a.registry == nil { return } protonTotal := protonSnapshot var err error if protonTotal <= 0 { protonTotal, err = a.dbService.CountUnifiedDevices(ctx) if err != nil { a.logger.Warn().Err(err).Msg("Failed to count Proton devices during stats diagnostics") return } } if int64(registryTotal) == protonTotal { return } if !a.lastMismatchLog.IsZero() && a.mismatchLogInterval > 0 && a.now().Before(a.lastMismatchLog.Add(a.mismatchLogInterval)) { return ... (clipped 35 lines) ``` </details></details></td></tr> <tr><td colspan='2'>Ticket Compliance</td></tr> <tr><td>🟡</td> <td> <details> <summary>🎫 <a href=https://github.com/carverauto/serviceradar/issues/1924>#1924</a></summary> <table width='100%'><tbody> <tr><td rowspan=9>🟢</td> <td>Define CollectorCapability schema and implement in-memory capability index; stop deriving capability from metadata</td></tr> <tr><td>Update agent/poller registration and processing paths to emit capabilities and capability events</td></tr> <tr><td>Update API endpoints to use the in-memory registry for device/collector data with Proton fallback only when necessary</td></tr> <tr><td>Add device search using a new in-memory search index and planner</td></tr> <tr><td>Implement StatsAggregator that runs periodically, exposes a cached snapshot via API, and removes SRQL-driven stats from UI</td></tr> <tr><td>Add capability snapshots and matrix support to answer "last successful capability" style questions</td></tr> <tr><td>Provide registry hydration from Proton on startup and keep it in sync</td></tr> <tr><td>Enforce boundary to prefer registry over Proton, with a flag to disable Proton fallback in certain paths</td></tr> <tr><td>Add tests for registry, capabilities, search, hydration, and stats</td></tr> <tr><td rowspan=3>⚪</td> <td>Verify UI/frontend changes correctly consume new /stats and capabilities APIs and no longer issue SRQL/stat queries</td></tr> <tr><td>Validate production performance targets (latency and Proton CPU) after deployment</td></tr> <tr><td>End-to-end verification that all API paths avoid Proton for state except analytics, including linter/middleware enforcement</td></tr> </tbody></table> </details> </td></tr> <tr><td colspan='2'>Codebase Duplication Compliance</td></tr> <tr><td>⚪</td><td><details><summary>Codebase context is not defined </summary> Follow the <a href='https://qodo-merge-docs.qodo.ai/core-abilities/rag_context_enrichment/'>guide</a> to enable codebase context checks. </details></td></tr> <tr><td colspan='2'>Custom Compliance</td></tr> <tr><td rowspan=1>🟢</td><td> <details><summary>Generic: Meaningful Naming and Self-Documenting Code</summary> **Objective:** Ensure all identifiers clearly express their purpose and intent, making code self-documenting **Status:** Passed </details></td></tr> <tr><td rowspan=1>🔴</td> <td><details> <summary>Generic: Secure Error Handling</summary> **Objective:** To prevent the leakage of sensitive system information through error messages while providing sufficient detail for internal debugging. **Status:** <a href='https://github.com/carverauto/serviceradar/pull/1925/files#diff-1bb99367fdd853c728b7cfbf5893a293f6d217144dfb5282cb8dd32e5261021eR1860-R1863'>Internal error leak</a>: The device search handler returns err.Error() directly to clients on planner failure, potentially exposing internal system details. <details open><summary>Referred Code</summary> ```go if err != nil { s.logger.Error().Err(err).Msg("Device search planner failed") writeError(w, err.Error(), http.StatusInternalServerError) return ``` </details></details></td></tr> <tr><td rowspan=4>⚪</td> <td><details> <summary>Generic: Comprehensive Audit Trails</summary> **Objective:** To create a detailed and reliable record of critical system actions for security analysis and compliance. **Status:** <a href='https://github.com/carverauto/serviceradar/pull/1925/files#diff-1bb99367fdd853c728b7cfbf5893a293f6d217144dfb5282cb8dd32e5261021eR827-R1908'>Missing audit logs</a>: New critical endpoints and capability events (e.g., device search, stats, critical logs, capability recordings) lack explicit audit logging of actor identity and action outcomes, making audit trail completeness uncertain. <details open><summary>Referred Code</summary> ```go // SRQL HTTP endpoint removed; SRQL moves to OCaml service // Device-centric endpoints if s.searchPlanner != nil { protected.HandleFunc("/devices/search", s.handleDeviceSearch).Methods("POST") } protected.HandleFunc("/devices", s.getDevices).Methods("GET") protected.HandleFunc("/devices/{id}", s.handleDeviceByID).Methods("GET", "DELETE") protected.HandleFunc("/devices/{id}/registry", s.getDeviceRegistryInfo).Methods("GET") protected.HandleFunc("/devices/{id}/metrics", s.getDeviceMetrics).Methods("GET") protected.HandleFunc("/devices/metrics/status", s.getDeviceMetricsStatus).Methods("GET") protected.HandleFunc("/devices/snmp/status", s.getDeviceSNMPStatus).Methods("POST") protected.HandleFunc("/stats", s.handleDeviceStats).Methods("GET") protected.HandleFunc("/logs/critical", s.handleCriticalLogs).Methods("GET") protected.HandleFunc("/logs/critical/counters", s.handleCriticalLogCounters).Methods("GET") // Store reference to protected router for MCP routes s.protectedRouter = protected // Admin routes with RBAC protection // These routes use the route protection configuration from core.json ... (clipped 1061 lines) ``` </details></details></td></tr> <tr><td><details> <summary>Generic: Robust Error Handling and Edge Case Management</summary> **Objective:** Ensure comprehensive error handling that provides meaningful context and graceful degradation **Status:** <a href='https://github.com/carverauto/serviceradar/pull/1925/files#diff-1bb99367fdd853c728b7cfbf5893a293f6d217144dfb5282cb8dd32e5261021eR1828-R1857'>Input validation gaps</a>: The device search handler minimally validates request fields (e.g., pagination limits, mode) and passes them through, which may miss edge-case validation and error context for invalid inputs. <details open><summary>Referred Code</summary> ```go func (s *APIServer) handleDeviceSearch(w http.ResponseWriter, r *http.Request) { if s.searchPlanner == nil { writeError(w, "device search planner unavailable", http.StatusServiceUnavailable) return } var request deviceSearchRequest if err := json.NewDecoder(r.Body).Decode(&request); err != nil { writeError(w, "invalid search request payload", http.StatusBadRequest) return } if request.Filters == nil { request.Filters = make(map[string]string) } ctx, cancel := context.WithTimeout(r.Context(), defaultTimeout) defer cancel() plannerReq := &search.Request{ Query: request.Query, ... (clipped 9 lines) ``` </details></details></td></tr> <tr><td><details> <summary>Generic: Secure Logging Practices</summary> **Objective:** To ensure logs are useful for debugging and auditing without exposing sensitive information like PII, PHI, or cardholder data. **Status:** <a href='https://github.com/carverauto/serviceradar/pull/1925/files#diff-6d98e853ce17576c088e77956ae4ecfa8078019e0bff107a79d8d1d6ed2443adR243-R776'>Log metadata risk</a>: New logging and capability events include device IDs, IPs, hostnames, and errors that may contain sensitive data; ensure policies and redaction guardrails prevent PII/secret exposure. <details open><summary>Referred Code</summary> ```go return fmt.Errorf("failed to process batch SNMP target devices: %w", err) } for _, update := range deviceUpdates { if update == nil || strings.TrimSpace(update.DeviceID) == "" { continue } s.upsertCollectorCapabilities(ctx, update.DeviceID, []string{"snmp"}, agentID, pollerID, "snmp", timestamp) status := statusByDeviceID[update.DeviceID] metadata := map[string]any{ "agent_id": agentID, "poller_id": pollerID, "partition": partition, } if update.IP != "" { metadata["target_ip"] = update.IP } if update.Hostname != nil && *update.Hostname != "" { metadata["target_hostname"] = *update.Hostname } ... (clipped 513 lines) ``` </details></details></td></tr> <tr><td><details> <summary>Generic: Security-First Input Validation and Data Handling</summary> **Objective:** Ensure all data inputs are validated, sanitized, and handled securely to prevent vulnerabilities **Status:** <a href='https://github.com/carverauto/serviceradar/pull/1925/files#diff-1bb99367fdd853c728b7cfbf5893a293f6d217144dfb5282cb8dd32e5261021eR1847-R1856'>Weak pagination checks</a>: The device search API accepts pagination fields (limit/offset/cursor) without explicit bounds or sanitization in the handler, which could allow excessive resource usage or injection risk depending on downstream handling. <details open><summary>Referred Code</summary> ```go plannerReq := &search.Request{ Query: request.Query, Mode: search.Mode(request.Mode), Filters: request.Filters, Pagination: search.Pagination{ Limit: request.Pagination.Limit, Offset: request.Pagination.Offset, Cursor: request.Pagination.Cursor, Direction: request.Pagination.Direction, }, ``` </details></details></td></tr> <tr><td align="center" colspan="2">   </td></tr></tbody></table> <details><summary>Compliance status legend</summary> 🟢 - Fully Compliant 🟡 - Partial Compliant 🔴 - Not Compliant ⚪ - Requires Further Human Verification 🏷️ - Compliance label </details> ___ #### Previous compliance checks <details> <summary>Compliance check up to commit <a href='https://github.com/carverauto/serviceradar/commit/057b69fdcc8cb45a3d1e46ffb395d910474d897a'>057b69f</a></summary> <table><tbody><tr><td colspan='2'>Security Compliance</td></tr> <tr><td rowspan=2>⚪</td> <td><details><summary>Unsafe type assertion </summary> Description: Type assertion without proper error handling could cause runtime panic if deviceRegistry doesn't implement SearchDevices interface, potentially crashing the service. <a href='https://github.com/carverauto/serviceradar/pull/1925/files#diff-cb61d8f79451b9541de4a8cc0811523a68d15452b2f5971c7618ea5b423cf4ecR1766-R1777'>registry.go [1766-1777]</a> <details open><summary>Referred Code</summary> ```go ``` </details></details></td></tr> <tr><td><details><summary>Unvalidated capability exposure </summary> Description: Capability data retrieved from registry is directly exposed in API response without validation or sanitization, potentially leaking sensitive collector metadata. <a href='https://github.com/carverauto/serviceradar/pull/1925/files#diff-1bb99367fdd853c728b7cfbf5893a293f6d217144dfb5282cb8dd32e5261021eR1820-R1826'>server.go [1820-1826]</a> <details open><summary>Referred Code</summary> ```go if s.deviceRegistry != nil { if record, ok := s.deviceRegistry.GetCollectorCapabilities(ctx, device.DeviceID); ok { if resp := toCollectorCapabilityResponse(record); resp != nil { entry["collector_capabilities"] = resp } } } ``` </details></details></td></tr> <tr><td colspan='2'>Ticket Compliance</td></tr> <tr><td>🟢</td> <td> <details> <summary>🎫 <a href=https://github.com/carverauto/serviceradar/issues/1924>#1924</a></summary> <table width='100%'><tbody> <tr><td rowspan=11>🟢</td> <td>Define DeviceRecord schema in pkg/registry/device.go</td></tr> <tr><td>Implement DeviceRegistry with in-memory map and RWMutex</td></tr> <tr><td>Hydrate registry from Proton on startup (HydrateRegistryFromProton)</td></tr> <tr><td>Update DeviceManager.UpsertDevice() to write to both Proton and Registry</td></tr> <tr><td>Unit tests for registry operations</td></tr> <tr><td>Define CollectorCapability schema</td></tr> <tr><td>Implement CapabilityIndex in pkg/registry/capabilities.go</td></tr> <tr><td>Update agent/poller registration to emit capabilities</td></tr> <tr><td>Update API /devices/{id}/collectors to use registry</td></tr> <tr><td>Remove all metadata scraping (_alias_last_seen_service_id, etc.)</td></tr> <tr><td>Update UI to use new capabilities API</td></tr> </tbody></table> </details> </td></tr> <tr><td colspan='2'>Codebase Duplication Compliance</td></tr> <tr><td>⚪</td><td><details><summary>Codebase context is not defined </summary> Follow the <a href='https://qodo-merge-docs.qodo.ai/core-abilities/rag_context_enrichment/'>guide</a> to enable codebase context checks. </details></td></tr> <tr><td colspan='2'>Custom Compliance</td></tr> <tr><td rowspan=3>🟢</td><td> <details><summary>Generic: Comprehensive Audit Trails</summary> **Objective:** To create a detailed and reliable record of critical system actions for security analysis and compliance. **Status:** Passed </details></td></tr> <tr><td> <details><summary>Generic: Secure Error Handling</summary> **Objective:** To prevent the leakage of sensitive system information through error messages while providing sufficient detail for internal debugging. **Status:** Passed </details></td></tr> <tr><td> <details><summary>Generic: Secure Logging Practices</summary> **Objective:** To ensure logs are useful for debugging and auditing without exposing sensitive information like PII, PHI, or cardholder data. **Status:** Passed </details></td></tr> <tr><td rowspan=3>🔴</td> <td><details> <summary>Generic: Meaningful Naming and Self-Documenting Code</summary> **Objective:** Ensure all identifiers clearly express their purpose and intent, making code self-documenting **Status:** <a href='https://github.com/carverauto/serviceradar/pull/1925/files#diff-ece077c963daf0f46cb4cfe0e938ff22569f0d29da9376f3f78bd5bb8e586836R135-R145'>Generic variable names</a>: Variables named 'src', 'dst', 'out' are too generic and don't clearly express their purpose without additional context <details open><summary>Referred Code</summary> ```go func cloneMetadata(src map[string]string) map[string]string { if len(src) == 0 { return nil } out := make(map[string]string, len(src)) for key, value := range src { out[key] = value } return out } ``` </details></details></td></tr> <tr><td><details> <summary>Generic: Robust Error Handling and Edge Case Management</summary> **Objective:** Ensure comprehensive error handling that provides meaningful context and graceful degradation **Status:** <a href='https://github.com/carverauto/serviceradar/pull/1925/files#diff-1bb99367fdd853c728b7cfbf5893a293f6d217144dfb5282cb8dd32e5261021eR1767-R1776'>Silent nil handling</a>: When search returns nil results, the code silently sets filteredDevices to nil without logging or providing context about why the search failed <details open><summary>Referred Code</summary> ```go if searchTerm := strings.TrimSpace(params["searchTerm"].(string)); searchTerm != "" { if searcher, ok := s.deviceRegistry.(interface { SearchDevices(query string, limit int) []*models.UnifiedDevice }); ok { if searched := searcher.SearchDevices(searchTerm, params["limit"].(int)); len(searched) > 0 { filteredDevices = searched } else { filteredDevices = nil } } ``` </details></details></td></tr> <tr><td><details> <summary>Generic: Security-First Input Validation and Data Handling</summary> **Objective:** Ensure all data inputs are validated, sanitized, and handled securely to prevent vulnerabilities **Status:** <a href='https://github.com/carverauto/serviceradar/pull/1925/files#diff-40d722882cf0d1b27b15d87cd15166e070641f4e77f720da2db89201c7ccfba9R9-R27'>Missing input validation</a>: The UpsertDeviceRecord function accepts a DeviceRecord pointer without validating critical fields like IP address format or MAC address format before indexing <details open><summary>Referred Code</summary> ```go func (r *DeviceRegistry) UpsertDeviceRecord(record *DeviceRecord) { if record == nil || strings.TrimSpace(record.DeviceID) == "" { return } input := cloneDeviceRecord(record) r.mu.Lock() defer r.mu.Unlock() if existing, ok := r.devices[input.DeviceID]; ok { r.removeIndexesLocked(existing) r.removeFromSearchIndex(existing) } r.devices[input.DeviceID] = input r.indexRecordLocked(input) r.addToSearchIndex(input) } ``` </details></details></td></tr> <tr><td align="center" colspan="2">  </td></tr></tbody></table> </details>

qodo-code-review[bot] commented

2025-11-05 06:49:34 +00:00

(Migrated from github.com)

Author

Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1925#issuecomment-3489633158
Original created: 2025-11-05T06:49:34Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
High-level	Re-evaluate custom cache and search implementation Instead of building a custom in-memory cache and trigram search engine, consider using established libraries like Ristretto for caching and Bleve for search. This would reduce maintenance and provide more robust features. Examples: pkg/registry/registry.go [62-67] `mu sync.RWMutex devices map[string]DeviceRecord devicesByIP map[string]map[string]DeviceRecord devicesByMAC map[string]map[string]DeviceRecord searchIndex TrigramIndex capabilities CapabilityIndex` pkg/registry/trigram_index.go [11-26] `// TrigramIndex maintains an in-memory trigram search index over device text fields. type TrigramIndex struct { mu sync.RWMutex trigramMap map[string]map[string]struct{} // trigram -> device IDs deviceTexts map[string]string // deviceID -> searchable text (lowercased) log logger.Logger } // NewTrigramIndex creates a trigram index using the provided logger for warnings. func NewTrigramIndex(log logger.Logger) TrigramIndex { ... (clipped 6 lines)` Solution Walkthrough: Before: // pkg/registry/registry.go type DeviceRegistry struct { mu sync.RWMutex devices map[string]DeviceRecord devicesByIP map[string]map[string]DeviceRecord devicesByMAC map[string]map[string]DeviceRecord searchIndex TrigramIndex // Custom trigram index // ... } // pkg/registry/trigram_index.go type TrigramIndex struct { mu sync.RWMutex trigramMap map[string]map[string]struct{} deviceTexts map[string]string } func (t TrigramIndex) Search(query string) []TrigramMatch { // ... custom trigram generation and scoring logic } After: import ( "github.com/blevesearch/bleve" "github.com/dgraph-io/ristretto" ) // pkg/registry/registry.go type DeviceRegistry struct { deviceCache ristretto.Cache // Using a library for caching searchIndex bleve.Index // Using a library for search // ... } func (r DeviceRegistry) GetDevice(ctx context.Context, deviceID string) (models.UnifiedDevice, error) { if record, found := r.deviceCache.Get(deviceID); found { return UnifiedDeviceFromRecord(record.(DeviceRecord)), nil } // ... fallback logic } func (r DeviceRegistry) SearchDevices(query string, limit int) []*models.UnifiedDevice { searchRequest := bleve.NewSearchRequest(bleve.NewQueryStringQuery(query)) searchRequest.Size = limit searchResults, err := r.searchIndex.Search(searchRequest) // ... process results from search library } Suggestion importance[1-10]: 8 __ Why: The suggestion correctly identifies the custom implementation of a cache and search engine, proposing a valid, high-impact architectural alternative using battle-tested libraries that could reduce long-term maintenance and improve features.	Medium
General	✅ ~~Propagate context for proper cancellation~~ Suggestion Impact: The commit modified processSysmonMetrics to accept a ctx parameter and replaced context.Background() with ctx when calling upsertCollectorCapabilities, implementing proper context propagation. code diff: `func (s *Server) processSysmonMetrics( + ctx context.Context, pollerID, partition, agentID string, details json.RawMessage, timestamp time.Time) error { @@ -730,7 +828,35 @@ Str("timestamp", sysmonPayload.Status.Timestamp). Msg("Parsed sysmon metrics") - s.upsertCollectorCapabilities(context.Background(), deviceID, []string{"sysmon"}, agentID, pollerID, "sysmon", pollerTimestamp) + s.upsertCollectorCapabilities(ctx, deviceID, []string{"sysmon"}, agentID, pollerID, "sysmon", pollerTimestamp) +` In `processSysmonMetrics`, pass the incoming `ctx` to `upsertCollectorCapabilities` instead of using `context.Background()` to ensure proper context propagation. pkg/core/metrics.go [733] `-s.upsertCollectorCapabilities(context.Background(), deviceID, []string{"sysmon"}, agentID, pollerID, "sysmon", pollerTimestamp) +s.upsertCollectorCapabilities(ctx, deviceID, []string{"sysmon"}, agentID, pollerID, "sysmon", pollerTimestamp)` `[Suggestion processed]` Suggestion importance[1-10]: 7 __ Why: The suggestion correctly identifies that `context.Background()` is used instead of propagating the function's `ctx`, which is important for handling cancellations and tracing.	Medium
	✅ ~~Add context to function signature~~ Suggestion Impact: The function ensureServiceDevice now takes a context.Context and passes it to upsertCollectorCapabilities; previously it used context.Background(). However, new calls to recordCapabilityEvent still use context.Background(). code diff: `func (s Server) ensureServiceDevice( + ctx context.Context, agentID, pollerID, partition string, svc proto.ServiceStatus, serviceData json.RawMessage, @@ -112,7 +113,36 @@ } capabilities := normalizeCapabilities([]string{svc.ServiceType, svc.ServiceName}) - s.upsertCollectorCapabilities(context.Background(), deviceID, capabilities, agentID, pollerID, svc.ServiceName, timestamp) + s.upsertCollectorCapabilities(ctx, deviceID, capabilities, agentID, pollerID, svc.ServiceName, timestamp)` Update the `ensureServiceDevice` function to accept a `context.Context` and pass it to the `upsertCollectorCapabilities` call instead of using `context.Background()`. pkg/core/devices.go [109-116] `func (s Server) ensureServiceDevice( + ctx context.Context, svc models.Service, agentID, pollerID string, timestamp time.Time, ) { ... - s.upsertCollectorCapabilities(context.Background(), deviceID, capabilities, agentID, pollerID, svc.ServiceName, timestamp) + s.upsertCollectorCapabilities(ctx, deviceID, capabilities, agentID, pollerID, svc.ServiceName, timestamp) }` `[To ensure code accuracy, apply this suggestion manually]` Suggestion importance[1-10]: 7 __ Why: The suggestion correctly points out the use of `context.Background()` and proposes adding context propagation to `ensureServiceDevice`, which is a good practice for consistency and robustness.	Medium
	✅ ~~Optimize capability set creation logic~~ Suggestion Impact: The commit replaced the chained filter/map/set construction with a single for-loop that trims, lowercases, type-checks, and adds capabilities to a Set. code diff: - const rawCapabilities = Array.isArray(capabilityHints?.capabilities) - ? capabilityHints!.capabilities - : []; - const capabilitySet = new Set( - rawCapabilities - .filter((value): value is string => typeof value === "string") - .map((value) => value.trim().toLowerCase()) - .filter(Boolean), - ); + const capabilitySet = new Set<string>(); + if (Array.isArray(capabilityHints?.capabilities)) { + for (const cap of capabilityHints.capabilities) { + if (typeof cap === "string" && cap.trim()) { + capabilitySet.add(cap.trim().toLowerCase()); + } + } + } Refactor the creation of `capabilitySet` to use a single loop instead of multiple chained `filter` and `map` calls for better efficiency. web/src/components/Devices/DeviceTable.tsx [100-108] -const rawCapabilities = Array.isArray(capabilityHints?.capabilities) - ? capabilityHints!.capabilities - : []; -const capabilitySet = new Set( - rawCapabilities - .filter((value): value is string => typeof value === "string") - .map((value) => value.trim().toLowerCase()) - .filter(Boolean), -); +const capabilitySet = new Set<string>(); +if (Array.isArray(capabilityHints?.capabilities)) { + for (const cap of capabilityHints.capabilities) { + if (typeof cap === "string" && cap.trim()) { + capabilitySet.add(cap.trim().toLowerCase()); + } + } +} `[Suggestion processed]` Suggestion importance[1-10]: 4 __ Why: The suggestion offers a valid micro-optimization by reducing chained array methods to a single loop, which can be slightly more performant and readable.	Low
Update

Imported GitHub PR comment. Original author: @qodo-code-review[bot] Original URL: https://github.com/carverauto/serviceradar/pull/1925#issuecomment-3489633158 Original created: 2025-11-05T06:49:34Z --- ## PR Code Suggestions ✨  Explore these optional code suggestions: <table><thead><tr><td>Category</td><td align=left>Suggestion                                                                                                                                    </td><td align=center>Impact</td></tr><tbody><tr><td rowspan=1>High-level</td> <td> <details><summary>Re-evaluate custom cache and search implementation</summary> ___ **Instead of building a custom in-memory cache and trigram search engine, consider using established libraries like Ristretto for caching and Bleve for search. This would reduce maintenance and provide more robust features.** ### Examples: <details> <summary> <a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-cb61d8f79451b9541de4a8cc0811523a68d15452b2f5971c7618ea5b423cf4ecR62-R67">pkg/registry/registry.go [62-67]</a> </summary> ```go mu sync.RWMutex devices map[string]*DeviceRecord devicesByIP map[string]map[string]*DeviceRecord devicesByMAC map[string]map[string]*DeviceRecord searchIndex *TrigramIndex capabilities *CapabilityIndex ``` </details> <details> <summary> <a href="https://github.com/carverauto/serviceradar/pull/1925/files#diff-d15089251c1f11295acd5dbb6ca83aecaa384f91e49522b87e08f33ec7fd5138R11-R26">pkg/registry/trigram_index.go [11-26]</a> </summary> ```go // TrigramIndex maintains an in-memory trigram search index over device text fields. type TrigramIndex struct { mu sync.RWMutex trigramMap map[string]map[string]struct{} // trigram -> device IDs deviceTexts map[string]string // deviceID -> searchable text (lowercased) log logger.Logger } // NewTrigramIndex creates a trigram index using the provided logger for warnings. func NewTrigramIndex(log logger.Logger) *TrigramIndex { ... (clipped 6 lines) ``` </details> ### Solution Walkthrough: #### Before: ```go // pkg/registry/registry.go type DeviceRegistry struct { mu sync.RWMutex devices map[string]*DeviceRecord devicesByIP map[string]map[string]*DeviceRecord devicesByMAC map[string]map[string]*DeviceRecord searchIndex *TrigramIndex // Custom trigram index // ... } // pkg/registry/trigram_index.go type TrigramIndex struct { mu sync.RWMutex trigramMap map[string]map[string]struct{} deviceTexts map[string]string } func (t *TrigramIndex) Search(query string) []TrigramMatch { // ... custom trigram generation and scoring logic } ``` #### After: ```go import ( "github.com/blevesearch/bleve" "github.com/dgraph-io/ristretto" ) // pkg/registry/registry.go type DeviceRegistry struct { deviceCache *ristretto.Cache // Using a library for caching searchIndex bleve.Index // Using a library for search // ... } func (r *DeviceRegistry) GetDevice(ctx context.Context, deviceID string) (*models.UnifiedDevice, error) { if record, found := r.deviceCache.Get(deviceID); found { return UnifiedDeviceFromRecord(record.(*DeviceRecord)), nil } // ... fallback logic } func (r *DeviceRegistry) SearchDevices(query string, limit int) []*models.UnifiedDevice { searchRequest := bleve.NewSearchRequest(bleve.NewQueryStringQuery(query)) searchRequest.Size = limit searchResults, err := r.searchIndex.Search(searchRequest) // ... process results from search library } ``` <details><summary>Suggestion importance[1-10]: 8</summary> __ Why: The suggestion correctly identifies the custom implementation of a cache and search engine, proposing a valid, high-impact architectural alternative using battle-tested libraries that could reduce long-term maintenance and improve features. </details></details></td><td align=center>Medium </td></tr><tr><td rowspan=3>General</td> <td> <details><summary>✅ <s>Propagate context for proper cancellation</s></summary> ___ <details><summary>Suggestion Impact:</summary>The commit modified processSysmonMetrics to accept a ctx parameter and replaced context.Background() with ctx when calling upsertCollectorCapabilities, implementing proper context propagation. code diff: ```diff func (s *Server) processSysmonMetrics( + ctx context.Context, pollerID, partition, agentID string, details json.RawMessage, timestamp time.Time) error { @@ -730,7 +828,35 @@ Str("timestamp", sysmonPayload.Status.Timestamp). Msg("Parsed sysmon metrics") - s.upsertCollectorCapabilities(context.Background(), deviceID, []string{"sysmon"}, agentID, pollerID, "sysmon", pollerTimestamp) + s.upsertCollectorCapabilities(ctx, deviceID, []string{"sysmon"}, agentID, pollerID, "sysmon", pollerTimestamp) + ``` </details> ___ **In <code>processSysmonMetrics</code>, pass the incoming <code>ctx</code> to <code>upsertCollectorCapabilities</code> instead of using <code>context.Background()</code> to ensure proper context propagation.** [pkg/core/metrics.go [733]](https://github.com/carverauto/serviceradar/pull/1925/files#diff-6d98e853ce17576c088e77956ae4ecfa8078019e0bff107a79d8d1d6ed2443adR733-R733) ```diff -s.upsertCollectorCapabilities(context.Background(), deviceID, []string{"sysmon"}, agentID, pollerID, "sysmon", pollerTimestamp) +s.upsertCollectorCapabilities(ctx, deviceID, []string{"sysmon"}, agentID, pollerID, "sysmon", pollerTimestamp) ``` `[Suggestion processed]` <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: The suggestion correctly identifies that `context.Background()` is used instead of propagating the function's `ctx`, which is important for handling cancellations and tracing. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>✅ <s>Add context to function signature</s></summary> ___ <details><summary>Suggestion Impact:</summary>The function ensureServiceDevice now takes a context.Context and passes it to upsertCollectorCapabilities; previously it used context.Background(). However, new calls to recordCapabilityEvent still use context.Background(). code diff: ```diff func (s *Server) ensureServiceDevice( + ctx context.Context, agentID, pollerID, partition string, svc *proto.ServiceStatus, serviceData json.RawMessage, @@ -112,7 +113,36 @@ } capabilities := normalizeCapabilities([]string{svc.ServiceType, svc.ServiceName}) - s.upsertCollectorCapabilities(context.Background(), deviceID, capabilities, agentID, pollerID, svc.ServiceName, timestamp) + s.upsertCollectorCapabilities(ctx, deviceID, capabilities, agentID, pollerID, svc.ServiceName, timestamp) ``` </details> ___ **Update the <code>ensureServiceDevice</code> function to accept a <code>context.Context</code> and pass it to the <code>upsertCollectorCapabilities</code> call instead of using <code>context.Background()</code>.** [pkg/core/devices.go [109-116]](https://github.com/carverauto/serviceradar/pull/1925/files#diff-fb7f5159558f22afdc809b9cbf5a9c0f71560be2f7389c944da8f1604998ea48R109-R116) ```diff func (s *Server) ensureServiceDevice( + ctx context.Context, svc *models.Service, agentID, pollerID string, timestamp time.Time, ) { ... - s.upsertCollectorCapabilities(context.Background(), deviceID, capabilities, agentID, pollerID, svc.ServiceName, timestamp) + s.upsertCollectorCapabilities(ctx, deviceID, capabilities, agentID, pollerID, svc.ServiceName, timestamp) } ``` `[To ensure code accuracy, apply this suggestion manually]` <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: The suggestion correctly points out the use of `context.Background()` and proposes adding context propagation to `ensureServiceDevice`, which is a good practice for consistency and robustness. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>✅ <s>Optimize capability set creation logic</s></summary> ___ <details><summary>Suggestion Impact:</summary>The commit replaced the chained filter/map/set construction with a single for-loop that trims, lowercases, type-checks, and adds capabilities to a Set. code diff: ```diff - const rawCapabilities = Array.isArray(capabilityHints?.capabilities) - ? capabilityHints!.capabilities - : []; - const capabilitySet = new Set( - rawCapabilities - .filter((value): value is string => typeof value === "string") - .map((value) => value.trim().toLowerCase()) - .filter(Boolean), - ); + const capabilitySet = new Set<string>(); + if (Array.isArray(capabilityHints?.capabilities)) { + for (const cap of capabilityHints.capabilities) { + if (typeof cap === "string" && cap.trim()) { + capabilitySet.add(cap.trim().toLowerCase()); + } + } + } ``` </details> ___ **Refactor the creation of <code>capabilitySet</code> to use a single loop instead of multiple chained <code>filter</code> and <code>map</code> calls for better efficiency.** [web/src/components/Devices/DeviceTable.tsx [100-108]](https://github.com/carverauto/serviceradar/pull/1925/files#diff-62d65ad4e929bc5554024981aa1c3418d70d54333e3afd80af69d93cea2edec3R100-R108) ```diff -const rawCapabilities = Array.isArray(capabilityHints?.capabilities) - ? capabilityHints!.capabilities - : []; -const capabilitySet = new Set( - rawCapabilities - .filter((value): value is string => typeof value === "string") - .map((value) => value.trim().toLowerCase()) - .filter(Boolean), -); +const capabilitySet = new Set<string>(); +if (Array.isArray(capabilityHints?.capabilities)) { + for (const cap of capabilityHints.capabilities) { + if (typeof cap === "string" && cap.trim()) { + capabilitySet.add(cap.trim().toLowerCase()); + } + } +} ``` `[Suggestion processed]` <details><summary>Suggestion importance[1-10]: 4</summary> __ Why: The suggestion offers a valid micro-optimization by reducing chained array methods to a single loop, which can be slightly more performant and readable. </details></details></td><td align=center>Low </td></tr> <tr><td align="center" colspan="2"> - [ ] Update  </td><td></td></tr></tbody></table>