Datasvc/fixing UI #2403

Merged
mfreeman451 merged 19 commits from refs/pull/2403/head into main 2025-11-10 19:37:05 +00:00
mfreeman451 commented 2025-11-08 17:03:39 +00:00 (Migrated from github.com)
Owner

Imported from GitHub pull request.

Original GitHub pull request: #1926
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/1926
Original created: 2025-11-08T17:03:39Z
Original updated: 2025-11-10T19:38:10Z
Original head: carverauto/serviceradar:datasvc/fixing_ui
Original base: main
Original merged: 2025-11-10T19:37:05Z by @mfreeman451

User description

IMPORTANT: Please sign the Developer Certificate of Origin

Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include
a DCO sign-off statement indicating the DCO acceptance in one commit message. Here
is an example DCO Signed-off-by line in a commit message:

Signed-off-by: J. Doe <j.doe@domain.com>

Describe your changes

Code checklist before requesting a review

  • I have signed the DCO?
  • The build completes without errors?
  • All tests are passing when running make test?

PR Type

Enhancement, Tests


Description

  • Configuration Management System: Implemented comprehensive configuration management with service descriptors, metadata tracking, and KV-based storage with template seeding

  • Configuration API Endpoints: Added new admin routes for listing config descriptors, checking status, and managing watchers with metadata exposure

  • Bootstrap Service Pattern: Created unified cfgbootstrap.Service abstraction for simplified configuration loading and KV watching across all services

  • Configuration Templates: Added 25+ embedded service configuration templates (JSON/TOML) for all managed services including core, poller, agent, mapper, sync, and various checkers/consumers

  • Configuration Sanitization: Implemented sensitive field removal for KV storage with TOML-specific sanitization support

  • Watcher Registry: Built watcher tracking system for monitoring active KV configuration watchers with event and status tracking

  • Config-Sync Tool: Created new command-line utility for syncing service configurations from KV to local files with template seeding and optional watching

  • Web UI Integration: Enhanced config editor with metadata display (revision, origin, timestamp) and dynamic global services discovery from descriptors

  • Docker & Helm Integration: Updated Dockerfiles and Helm templates with config-sync support including environment variable helpers and entrypoint scripts

  • Service Migrations: Migrated all services (poller, agent, mapper, sync, checkers, consumers) from legacy KVManager to new bootstrap pattern

  • Unit Tests: Added comprehensive tests for configuration API, status endpoints, watcher registry, and TOML sanitization

  • Documentation: Added KV configuration checks and config-sync tool usage documentation


Diagram Walkthrough

flowchart LR
  A["Services<br/>poller, agent, mapper, etc."] -->|"cfgbootstrap.Service"| B["Bootstrap Service"]
  B -->|"Load & Watch"| C["KV Store"]
  C -->|"Fetch Config"| D["Config Registry<br/>Descriptors"]
  B -->|"Seed Template"| E["Config Templates<br/>25+ services"]
  E -->|"Merge"| F["Final Config"]
  F -->|"Sanitize"| G["Remove Sensitive<br/>Fields"]
  G -->|"Store"| C
  H["Config-Sync Tool"] -->|"Sync to Disk"| I["Local Files"]
  C -->|"Provide"| I
  J["Web UI"] -->|"Query"| K["Admin API<br/>Descriptors/Status/Watchers"]
  K -->|"Display Metadata"| L["Config Editor<br/>with Revision Info"]

File Walkthrough

Relevant files
Enhancement
32 files
auth.go
Configuration management API with template seeding and metadata
tracking

pkg/core/api/auth.go

  • Added comprehensive configuration management endpoints
    (handleListConfigDescriptors, handleConfigStatus,
    handleConfigWatchers) with metadata tracking
  • Refactored handleGetConfig to support service descriptors, template
    seeding, and configuration metadata recording
  • Introduced helper functions for KV key resolution, metadata
    persistence, and configuration entry loading
  • Enhanced error handling with standardized writeAPIError method and
    improved response formatting
+632/-70
registry.go
Service configuration descriptor registry and metadata     

pkg/config/registry.go

  • Created new service descriptor registry mapping service names to
    configuration metadata
  • Defined ConfigFormat (JSON/TOML) and ConfigScope (global/poller/agent)
    enums
  • Implemented descriptor lookup functions and KV key resolution with
    template variable substitution
  • Registered 25+ service descriptors with their KV keys and
    configuration formats
+282/-0 
main.go
Faker service configuration bootstrap integration               

cmd/faker/main.go

  • Added Validate() and applyDefaults() methods to Config struct for
    configuration validation
  • Refactored loadConfig into loadFakerConfig using new bootstrap service
    pattern
  • Integrated with cfgbootstrap.Service for unified configuration loading
    and KV support
+101/-38
main.go
Configuration synchronization utility tool                             

cmd/tools/config-sync/main.go

  • Created new command-line tool for syncing service configurations from
    KV to local files
  • Supports template seeding, KV watching, and multiple service roles
  • Implements graceful shutdown and atomic file writes with temporary
    files
+200/-0 
main.go
DB event writer bootstrap service integration                       

cmd/consumers/db-event-writer/main.go

  • Replaced legacy KVManager pattern with new cfgbootstrap.Service API
  • Simplified configuration loading and KV watch setup
  • Removed explicit cleanup calls in favor of deferred
    bootstrapResult.Close()
+27/-45 
main.go
Poller service bootstrap integration                                         

cmd/poller/main.go

  • Migrated from legacy KVManager to new cfgbootstrap.Service pattern
  • Simplified configuration loading with service descriptor lookup
  • Updated KV watch integration to use new bootstrap result API
+35/-44 
server.go
API server configuration endpoints and KV client updates 

pkg/core/api/server.go

  • Updated kvGetFn signature to return revision number as fourth return
    value
  • Added new admin routes for config descriptors, status, and watchers
  • Updated isServiceConfigured to use new resolveServiceKey method
+12/-9   
watchers.go
KV watcher registry and tracking system                                   

pkg/config/watchers.go

  • Implemented watcher registry for tracking active KV configuration
    watchers
  • Added WatcherInfo and WatcherStatus types for runtime metadata
    exposure
  • Provided functions to register, mark events, and list watchers with
    thread-safe access
+150/-0 
config_templates.go
Embedded service configuration templates                                 

pkg/core/api/config_templates.go

  • Created embedded template assets for all 25+ managed service
    configurations
  • Mapped service names to template data with format information
  • Supports both JSON and TOML configuration formats
+85/-0   
main.go
Sync service bootstrap integration                                             

cmd/sync/main.go

  • Replaced legacy KVManager initialization with cfgbootstrap.Service API
  • Simplified configuration loading and KV watch setup
  • Removed explicit cleanup in favor of deferred bootstrap result closure
+16/-31 
main.go
Agent service bootstrap integration                                           

cmd/agent/main.go

  • Migrated from legacy KVManager to new cfgbootstrap.Service pattern
  • Simplified configuration loading with service descriptor lookup
  • Updated KV watch to use new bootstrap result API
+17/-26 
service.go
Configuration bootstrap service abstraction                           

pkg/config/bootstrap/service.go

  • Created new Service function for unified configuration loading and
    watching
  • Implemented Result type with Close() and StartWatch() helper methods
  • Handles KV manager initialization, config loading, bootstrapping, and
    watcher registration
+126/-0 
toml_mask.go
TOML document sanitization utility                                             

pkg/config/toml_mask.go

  • Implemented SanitizeTOML function to remove sensitive keys from TOML
    documents
  • Supports table-specific and wildcard key filtering without full
    parsing
  • Preserves comments and multiline strings while filtering sensitive
    data
+114/-0 
kv_client.go
KV bootstrap configuration sanitization                                   

pkg/config/kv_client.go

  • Updated BootstrapConfig to accept configPath parameter and prefer
    on-disk sanitization
  • Added sanitizeBootstrapSource to read and sanitize config files before
    KV storage
  • Implemented cloneConfig for creating configuration copies via
    reflection
+58/-8   
main.go
Mapper service bootstrap integration                                         

cmd/mapper/main.go

  • Migrated from direct config loading to cfgbootstrap.Service pattern
  • Added service descriptor lookup and bootstrap result handling
  • Integrated KV watch with watcher registration
+16/-7   
main.go
SNMP checker bootstrap integration                                             

cmd/checkers/snmp/main.go

  • Replaced direct config loading with cfgbootstrap.Service API
  • Added service descriptor lookup for SNMP checker
  • Integrated KV watch with watcher registration and logging
+16/-6   
main.go
Sysmon-VM checker bootstrap integration                                   

cmd/checkers/sysmon-vm/main.go

  • Migrated from direct config loading to cfgbootstrap.Service pattern
  • Added service descriptor lookup for sysmon-vm checker
  • Integrated KV watch with watcher registration
+16/-3   
app.go
Core service bootstrap integration                                             

cmd/core/app/app.go

  • Replaced core.LoadConfig with cfgbootstrap.Service pattern
  • Added service descriptor lookup for core service
  • Integrated KV watch with watcher registration for hot-reload
    notifications
+21/-1   
main.go
NetFlow consumer bootstrap integration                                     

cmd/consumers/netflow/main.go

  • Migrated from direct config loading to cfgbootstrap.Service API
  • Added service descriptor lookup for netflow consumer
  • Integrated KV watch with watcher registration
+16/-4   
main.go
Data service bootstrap integration                                             

cmd/data-services/main.go

  • Replaced direct config loading with cfgbootstrap.Service pattern
  • Added service descriptor lookup for datasvc
  • Removed manual KV store setup in favor of bootstrap result handling
+11/-9   
kv_watch.go
KV watch watcher event tracking                                                   

pkg/config/kv_watch.go

  • Enhanced StartKVWatchOverlay to track watcher events and errors
  • Integrated watcher ID from context for event marking
  • Added error tracking and status updates during KV watch operations
+19/-1   
main.go
Dusk checker bootstrap integration                                             

cmd/checkers/dusk/main.go

  • Migrated from direct config loading to cfgbootstrap.Service pattern
  • Added service descriptor lookup for dusk checker
  • Simplified configuration initialization
+12/-6   
types.go
KV client function signature update                                           

pkg/core/api/types.go

  • Updated kvGetFn function signature to include revision number as
    fourth return value
+1/-1     
route.ts
Web API proxy for config descriptor listing                           

web/src/app/api/admin/config/route.ts

  • Created new API proxy endpoint for listing configuration descriptors
  • Implements authentication and error handling for admin config API
+43/-0   
route.ts
Web API proxy for service configuration management             

web/src/app/api/admin/config/[service]/route.ts

  • Created new API proxy endpoints for GET, PUT, DELETE operations on
    service configurations
  • Routes requests through shared proxy utility
+23/-0   
sanitize.go
Configuration sanitization for KV storage                               

pkg/config/sanitize.go

  • Implemented sanitizeForKV function to remove sensitive fields before
    KV storage
  • Filters fields marked with sensitive:"true" tag using models utility
+27/-0   
config.go
Core configuration KV bootstrap integration                           

pkg/core/config.go

  • Enhanced overlayFromKV to bootstrap core configuration into KV after
    loading
  • Added service descriptor lookup and KV bootstrap call with error
    logging
+12/-1   
proxy.ts
New shared proxy implementation for config API                     

web/src/app/api/admin/config/proxy.ts

  • Created new shared proxy module for admin config routes
  • Implements proxyConfigRequest function handling GET, PUT, DELETE
    methods
  • Normalizes legacy kvStore query parameter to kv_store_id
  • Forwards requests to upstream API with proper headers and error
    handling
+93/-0   
run-with-config-sync.sh
Config-sync wrapper entrypoint for services                           

docker/entrypoints/run-with-config-sync.sh

  • New bash entrypoint script for optional config-sync execution
  • Supports environment variables for config service, output path,
    template, role, seeding, and watching
  • Conditionally runs config-sync binary before launching main service
  • Handles missing binary gracefully with informative error messages
+59/-0   
_helpers.tpl
Helm helper for config-sync environment variables               

helm/serviceradar/templates/_helpers.tpl

  • Added new serviceradar.configSyncEnv Helm template helper
  • Generates environment variables for config-sync configuration
  • Supports optional kvKey, role, extraArgs, and extraEnv customization
  • Provides sensible defaults for enabled, seed, and watch flags
+28/-0   
ServicesTreeNavigation.tsx
Dynamic global services discovery and rendering                   

web/src/components/Admin/ServicesTreeNavigation.tsx

  • Dynamically fetch config descriptors from /api/admin/config endpoint
  • Filter global services based on scope from descriptors instead of
    hardcoded list
  • Implement dynamic service label generation with display_name support
  • Render global services list dynamically with status indicators and
    metadata display
+174/-49
ConfigEditor.tsx
Config metadata parsing and display in editor                       

web/src/components/Admin/ConfigEditor.tsx

  • Added ConfigMetadata and ConfigEnvelope types for structured config
    responses
  • Implement buildConfigQuery helper for consistent query parameter
    construction
  • Parse JSON envelope responses containing metadata (revision, origin,
    last_writer, updated_at)
  • Display config metadata badges showing revision, origin, and update
    timestamp
  • Refresh config after successful save to reflect server state
+99/-15 
Tests
5 files
auth_test.go
Configuration API unit tests                                                         

pkg/core/api/auth_test.go

  • Added tests for template seeding functionality in handleGetConfig
  • Added tests for agent-scoped configuration requiring agent_id
    parameter
  • Added tests for watcher registration and listing
+136/-0 
config_status_test.go
Configuration status API tests                                                     

pkg/core/api/config_status_test.go

  • Added tests for config status endpoint reporting missing KV keys
  • Added tests for KV key qualification with domain prefixing
  • Added tests verifying template coverage for all service descriptors
+76/-0   
watchers_test.go
Watcher registry unit tests                                                           

pkg/config/watchers_test.go

  • Added tests for watcher registry event tracking and status updates
  • Added tests for context-based watcher ID propagation
  • Verified watcher lifecycle management
+58/-0   
toml_mask_test.go
TOML sanitization unit tests                                                         

pkg/config/toml_mask_test.go

  • Added tests for TOML sanitization with specific key filtering
  • Added tests for wildcard-based table and key filtering
+54/-0   
sanitize_test.go
Configuration sanitization unit tests                                       

pkg/config/sanitize_test.go

  • Added tests for sensitive field removal from configuration structures
  • Verified nested sensitive field filtering
+43/-0   
Configuration changes
30 files
flowgger.yaml
Flowgger Helm template config sync integration                     

helm/serviceradar/templates/flowgger.yaml

  • Added config sync environment variables to flowgger deployment
+1/-0     
core.json
Core service configuration template                                           

pkg/core/api/templates/core.json

  • New comprehensive core service configuration template
  • Includes database, security, NATS, events, auth, RBAC, webhooks, and
    logging settings
  • Defines route protection rules for API endpoints with role-based
    access control
  • Contains placeholder values for TLS certificates, API keys, and NATS
    configuration
+165/-0 
docker-compose.yml
Config-sync environment setup in docker-compose                   

docker-compose.yml

  • Added config-sync environment variables to otel, flowgger, trapd, and
    zen-consumer services
  • Configures service name, output path, template path, sync role, and
    optional watch/seed flags
  • Enables dynamic configuration synchronization for each collector
    service
+40/-0   
mapper.json
Mapper service configuration template                                       

pkg/core/api/templates/mapper.json

  • New mapper service configuration template with SNMP discovery settings
  • Includes worker pool, timeout, retry, and job management configuration
  • Defines OID mappings for basic system info, interfaces, and topology
    discovery
  • Contains stream configuration for publishing discovery results to NATS
+105/-0 
poller.json
Poller service configuration template                                       

pkg/core/api/templates/poller.json

  • New poller service configuration template with agent definitions
  • Configures local and default agents with security and check
    definitions
  • Includes core address, listen address, poll interval, and partition
    settings
  • Defines OTEL logging configuration for observability
+111/-0 
db-event-writer.json
DB event writer service configuration template                     

pkg/core/api/templates/db-event-writer.json

  • New event writer service configuration template
  • Configures NATS stream consumption and database event persistence
  • Maps event subjects to database tables for logs, traces, and metrics
  • Includes database connection, security, and OTEL logging settings
+87/-0   
sync.json
Sync service configuration template                                           

pkg/core/api/templates/sync.json

  • New sync service configuration template for device discovery
  • Configures NATS, KV store, and security settings
  • Includes Armis API integration with credentials and discovery queries
  • Defines network blacklist and custom field mappings for device
    filtering
+77/-0   
Dockerfile
Zen consumer Dockerfile with config-sync integration         

cmd/consumers/zen/Dockerfile

  • Added config-sync builder stage to compile config-sync binary
  • Copy config-sync and entrypoint script into runtime image
  • Move template config to /etc/serviceradar/templates/consumers/
    directory
  • Set CONFIG_* environment variables and use run-with-config-sync.sh as
    ENTRYPOINT
+22/-2   
zen-consumer.json
Zen consumer service configuration template                           

pkg/core/api/templates/zen-consumer.json

  • New zen-consumer service configuration template
  • Defines NATS stream consumption and decision groups for event
    processing
  • Configures syslog, SNMP, OTEL logs, and metrics processing rules
  • Includes security, gRPC, and KV bucket settings
+61/-0   
Dockerfile
TRAPD Dockerfile with config-sync integration                       

cmd/trapd/Dockerfile

  • Added config-sync builder stage to compile config-sync binary
  • Copy config-sync and entrypoint script into runtime image
  • Move template config to /etc/serviceradar/templates/ directory
  • Set CONFIG_* environment variables and use run-with-config-sync.sh as
    ENTRYPOINT
+22/-1   
Dockerfile
Flowgger Dockerfile with config-sync integration                 

cmd/flowgger/Dockerfile

  • Added config-sync builder stage to compile config-sync binary
  • Copy config-sync and entrypoint script into runtime image
  • Move template config to /etc/serviceradar/templates/ directory
  • Set CONFIG_* environment variables and use run-with-config-sync.sh as
    ENTRYPOINT
+23/-1   
Dockerfile
OTEL Dockerfile with config-sync integration                         

cmd/otel/Dockerfile

  • Added config-sync builder stage to compile config-sync binary
  • Copy config-sync and entrypoint script into runtime image
  • Move template config to /etc/serviceradar/templates/ directory
  • Set CONFIG_* environment variables and use run-with-config-sync.sh as
    ENTRYPOINT
+23/-1   
netflow-consumer.json
Netflow consumer service configuration template                   

pkg/core/api/templates/netflow-consumer.json

  • New netflow-consumer service configuration template
  • Configures NATS stream consumption for netflow metrics
  • Defines enabled/disabled fields and dictionary-based enrichment
  • Includes database connection and mTLS security settings
+58/-0   
snmp-checker.json
SNMP checker service configuration template                           

pkg/core/api/templates/snmp-checker.json

  • New SNMP checker service configuration template
  • Configures node address, listen address, and SPIFFE security
  • Defines target devices with SNMP credentials and OID mappings
  • Includes logging and OTEL configuration
+47/-0   
agent.json
Agent service configuration template                                         

pkg/core/api/templates/agent.json

  • New agent service configuration template
  • Configures gRPC service with mTLS security
  • Defines agent identity, partition, and checker directory
  • Includes OTEL logging configuration
+39/-0   
flowgger.toml
Flowgger service TOML configuration template                         

pkg/core/api/templates/flowgger.toml

  • New flowgger service TOML configuration template
  • Configures UDP syslog input and NATS JetStream output
  • Defines NATS connection, TLS, and retry behavior
  • Includes gRPC listener with mTLS security settings
+34/-0   
faker.json
Faker service configuration template                                         

pkg/core/api/templates/faker.json

  • New faker service configuration template for testing
  • Configures fake Armis API with device simulation
  • Includes IP shuffle simulation and persistent storage settings
  • Defines logging configuration with file output
+34/-0   
otel.toml
OTEL collector service TOML configuration template             

pkg/core/api/templates/otel.toml

  • New OTEL collector service TOML configuration template
  • Configures OTEL server binding and NATS output
  • Defines JetStream stream and subject configuration
  • Includes TLS settings for NATS and gRPC
+31/-0   
trapd.json
TRAPD service configuration template                                         

pkg/core/api/templates/trapd.json

  • New TRAPD service configuration template
  • Configures SNMP trap listener and NATS output
  • Defines gRPC listener with mTLS security
  • Includes NATS domain and stream configuration
+21/-0   
sysmon-vm-checker.json
Sysmon-vm checker service configuration template                 

pkg/core/api/templates/sysmon-vm-checker.json

  • New sysmon-vm checker service configuration template
  • Configures gRPC listener with SPIFFE security
  • Defines ZFS pool monitoring and filesystem tracking
  • Includes poll interval and monitoring settings
+27/-0   
values.yaml
Helm values for config-sync service configuration               

helm/serviceradar/values.yaml

  • Added configSync section with per-service configuration
  • Defines enabled, seed, watch, kvKey, role, extraArgs, and extraEnv for
    each service
  • Provides configuration for flowgger, trapd, otel, and zen services
  • Enables flexible config-sync behavior per deployment
+34/-0   
datasvc.json
Datasvc service configuration template                                     

pkg/core/api/templates/datasvc.json

  • New datasvc service configuration template
  • Configures NATS connection and KV bucket settings
  • Defines RBAC roles and bucket history/TTL configuration
  • Includes mTLS security settings
+25/-0   
dusk-checker.json
Dusk checker service configuration template                           

pkg/core/api/templates/dusk-checker.json

  • New dusk checker service configuration template
  • Configures gRPC node connection and listener
  • Defines SPIFFE security with trust domain and workload socket
  • Includes timeout and TLS certificate settings
+22/-0   
trapd.yaml
Trapd Helm template config-sync environment injection       

helm/serviceradar/templates/trapd.yaml

  • Injected serviceradar.configSyncEnv helper to generate config-sync
    environment variables
  • Uses configSync.trapd values from Helm chart
+1/-0     
agent-sweep.json
Agent sweep service configuration template                             

pkg/core/api/templates/agent-sweep.json

  • New agent-sweep service configuration template
  • Configures network sweep parameters and port scanning
  • Defines ICMP and TCP sweep modes with concurrency and timeout
  • Includes rate limiting and high-performance ICMP settings
+11/-0   
otel.yaml
OTEL Helm template config-sync environment injection         

helm/serviceradar/templates/otel.yaml

  • Injected serviceradar.configSyncEnv helper to generate config-sync
    environment variables
  • Uses configSync.otel values from Helm chart
+1/-0     
zen.yaml
Zen Helm template config-sync environment injection           

helm/serviceradar/templates/zen.yaml

  • Injected serviceradar.configSyncEnv helper to generate config-sync
    environment variables
  • Uses configSync.zen values from Helm chart
+1/-0     
agent-snmp.json
Agent SNMP service configuration template                               

pkg/core/api/templates/agent-snmp.json

  • New agent-snmp service configuration template
  • Configures SNMP checker with enabled flag and listener address
  • Defines node address and target list
+6/-0     
agent-mapper.json
Agent mapper service configuration template                           

pkg/core/api/templates/agent-mapper.json

  • New agent-mapper service configuration template
  • Configures mapper service address and enabled flag
+4/-0     
agent-trapd.json
Agent TRAPD service configuration template                             

pkg/core/api/templates/agent-trapd.json

  • New agent-trapd service configuration template
  • Configures TRAPD listener address and enabled flag
+4/-0     
Refactoring
1 files
route.ts
Centralize config API route handling with shared proxy     

web/src/app/api/config/[service]/route.ts

  • Refactored route handlers to delegate to centralized
    proxyConfigRequest function
  • Removed duplicate authentication and API forwarding logic from GET,
    PUT, DELETE handlers
  • Simplified each handler to a single-line call to shared proxy
    implementation
  • Updated parameter naming from destructured object to ctx for
    consistency
+11/-139
Documentation
2 files
agents.md
Documentation for KV config checks and config-sync tool   

docs/docs/agents.md

  • Added KV Configuration Checks section with nats-kv helper commands
  • Documented config-sync tool usage for hydrating collectors
  • Provided example command showing service, output, template, and watch
    options
  • Explained tool behavior: pulls KV record, seeds template, writes to
    disk, optionally watches
+22/-0   
newarch_plan.md
Architecture plan issue tracking references                           

newarch_plan.md

  • Added tracking issue reference #1924 and related issue #1921
  • Links to tactical CTE query fix for context
+3/-0     
Additional files
1 files
agent-rperf.json +4/-0     

Imported from GitHub pull request. Original GitHub pull request: #1926 Original author: @mfreeman451 Original URL: https://github.com/carverauto/serviceradar/pull/1926 Original created: 2025-11-08T17:03:39Z Original updated: 2025-11-10T19:38:10Z Original head: carverauto/serviceradar:datasvc/fixing_ui Original base: main Original merged: 2025-11-10T19:37:05Z by @mfreeman451 --- ### **User description** ## IMPORTANT: Please sign the Developer Certificate of Origin Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include a [DCO sign-off statement]( https://developercertificate.org/) indicating the DCO acceptance in one commit message. Here is an example DCO Signed-off-by line in a commit message: ``` Signed-off-by: J. Doe <j.doe@domain.com> ``` ## Describe your changes ## Issue ticket number and link ## Code checklist before requesting a review - [ ] I have signed the DCO? - [ ] The build completes without errors? - [ ] All tests are passing when running make test? ___ ### **PR Type** Enhancement, Tests ___ ### **Description** - **Configuration Management System**: Implemented comprehensive configuration management with service descriptors, metadata tracking, and KV-based storage with template seeding - **Configuration API Endpoints**: Added new admin routes for listing config descriptors, checking status, and managing watchers with metadata exposure - **Bootstrap Service Pattern**: Created unified `cfgbootstrap.Service` abstraction for simplified configuration loading and KV watching across all services - **Configuration Templates**: Added 25+ embedded service configuration templates (JSON/TOML) for all managed services including core, poller, agent, mapper, sync, and various checkers/consumers - **Configuration Sanitization**: Implemented sensitive field removal for KV storage with TOML-specific sanitization support - **Watcher Registry**: Built watcher tracking system for monitoring active KV configuration watchers with event and status tracking - **Config-Sync Tool**: Created new command-line utility for syncing service configurations from KV to local files with template seeding and optional watching - **Web UI Integration**: Enhanced config editor with metadata display (revision, origin, timestamp) and dynamic global services discovery from descriptors - **Docker & Helm Integration**: Updated Dockerfiles and Helm templates with config-sync support including environment variable helpers and entrypoint scripts - **Service Migrations**: Migrated all services (poller, agent, mapper, sync, checkers, consumers) from legacy `KVManager` to new bootstrap pattern - **Unit Tests**: Added comprehensive tests for configuration API, status endpoints, watcher registry, and TOML sanitization - **Documentation**: Added KV configuration checks and config-sync tool usage documentation ___ ### Diagram Walkthrough ```mermaid flowchart LR A["Services<br/>poller, agent, mapper, etc."] -->|"cfgbootstrap.Service"| B["Bootstrap Service"] B -->|"Load & Watch"| C["KV Store"] C -->|"Fetch Config"| D["Config Registry<br/>Descriptors"] B -->|"Seed Template"| E["Config Templates<br/>25+ services"] E -->|"Merge"| F["Final Config"] F -->|"Sanitize"| G["Remove Sensitive<br/>Fields"] G -->|"Store"| C H["Config-Sync Tool"] -->|"Sync to Disk"| I["Local Files"] C -->|"Provide"| I J["Web UI"] -->|"Query"| K["Admin API<br/>Descriptors/Status/Watchers"] K -->|"Display Metadata"| L["Config Editor<br/>with Revision Info"] ``` <details> <summary><h3> File Walkthrough</h3></summary> <table><thead><tr><th></th><th align="left">Relevant files</th></tr></thead><tbody><tr><td><strong>Enhancement</strong></td><td><details><summary>32 files</summary><table> <tr> <td> <details> <summary><strong>auth.go</strong><dd><code>Configuration management API with template seeding and metadata </code><br><code>tracking</code></dd></summary> <hr> pkg/core/api/auth.go <ul><li>Added comprehensive configuration management endpoints <br>(<code>handleListConfigDescriptors</code>, <code>handleConfigStatus</code>, <br><code>handleConfigWatchers</code>) with metadata tracking<br> <li> Refactored <code>handleGetConfig</code> to support service descriptors, template <br>seeding, and configuration metadata recording<br> <li> Introduced helper functions for KV key resolution, metadata <br>persistence, and configuration entry loading<br> <li> Enhanced error handling with standardized <code>writeAPIError</code> method and <br>improved response formatting</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-61eb09279471ca59a1b57a306f0ad05988a8cb39eca163b32bfb86c53f8a553e">+632/-70</a></td> </tr> <tr> <td> <details> <summary><strong>registry.go</strong><dd><code>Service configuration descriptor registry and metadata</code>&nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/config/registry.go <ul><li>Created new service descriptor registry mapping service names to <br>configuration metadata<br> <li> Defined <code>ConfigFormat</code> (JSON/TOML) and <code>ConfigScope</code> (global/poller/agent) <br>enums<br> <li> Implemented descriptor lookup functions and KV key resolution with <br>template variable substitution<br> <li> Registered 25+ service descriptors with their KV keys and <br>configuration formats</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-571dd192b00a11d35be5792632115c32a9fc59eb70e0716f0e9c49fe12940e8c">+282/-0</a>&nbsp; </td> </tr> <tr> <td> <details> <summary><strong>main.go</strong><dd><code>Faker service configuration bootstrap integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/faker/main.go <ul><li>Added <code>Validate()</code> and <code>applyDefaults()</code> methods to <code>Config</code> struct for <br>configuration validation<br> <li> Refactored <code>loadConfig</code> into <code>loadFakerConfig</code> using new bootstrap service <br>pattern<br> <li> Integrated with <code>cfgbootstrap.Service</code> for unified configuration loading <br>and KV support</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-f101715b3bafbe3843ac9df0ffa3566fd0278b8f730efa3665557ce98acf6d23">+101/-38</a></td> </tr> <tr> <td> <details> <summary><strong>main.go</strong><dd><code>Configuration synchronization utility tool</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/tools/config-sync/main.go <ul><li>Created new command-line tool for syncing service configurations from <br>KV to local files<br> <li> Supports template seeding, KV watching, and multiple service roles<br> <li> Implements graceful shutdown and atomic file writes with temporary <br>files</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-bc6eeb1b05bcb9179525e32fac1de9926b5823ec3504be546ab10c5c9740f544">+200/-0</a>&nbsp; </td> </tr> <tr> <td> <details> <summary><strong>main.go</strong><dd><code>DB event writer bootstrap service integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/consumers/db-event-writer/main.go <ul><li>Replaced legacy <code>KVManager</code> pattern with new <code>cfgbootstrap.Service</code> API<br> <li> Simplified configuration loading and KV watch setup<br> <li> Removed explicit cleanup calls in favor of deferred <br><code>bootstrapResult.Close()</code></ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-c9a73828b631e4618af51a47bc4c618d72ad1726fef3c3cbe12ab73b57b0eb63">+27/-45</a>&nbsp; </td> </tr> <tr> <td> <details> <summary><strong>main.go</strong><dd><code>Poller service bootstrap integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/poller/main.go <ul><li>Migrated from legacy <code>KVManager</code> to new <code>cfgbootstrap.Service</code> pattern<br> <li> Simplified configuration loading with service descriptor lookup<br> <li> Updated KV watch integration to use new bootstrap result API</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-4b8ec845da50cd58d011e69f9d1c30530ee1968df26616b8768bb1fc03433bbe">+35/-44</a>&nbsp; </td> </tr> <tr> <td> <details> <summary><strong>server.go</strong><dd><code>API server configuration endpoints and KV client updates</code>&nbsp; </dd></summary> <hr> pkg/core/api/server.go <ul><li>Updated <code>kvGetFn</code> signature to return revision number as fourth return <br>value<br> <li> Added new admin routes for config descriptors, status, and watchers<br> <li> Updated <code>isServiceConfigured</code> to use new <code>resolveServiceKey</code> method</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-1bb99367fdd853c728b7cfbf5893a293f6d217144dfb5282cb8dd32e5261021e">+12/-9</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>watchers.go</strong><dd><code>KV watcher registry and tracking system</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/config/watchers.go <ul><li>Implemented watcher registry for tracking active KV configuration <br>watchers<br> <li> Added <code>WatcherInfo</code> and <code>WatcherStatus</code> types for runtime metadata <br>exposure<br> <li> Provided functions to register, mark events, and list watchers with <br>thread-safe access</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-bbc878c252ce9408c4bb821b90aa297836781531d7530fb1c84a8dd1c31a3d86">+150/-0</a>&nbsp; </td> </tr> <tr> <td> <details> <summary><strong>config_templates.go</strong><dd><code>Embedded service configuration templates</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/config_templates.go <ul><li>Created embedded template assets for all 25+ managed service <br>configurations<br> <li> Mapped service names to template data with format information<br> <li> Supports both JSON and TOML configuration formats</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-552a472f331bb543ec2a34145e8a3db569f68945a97130b71e6328a212a044f8">+85/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>main.go</strong><dd><code>Sync service bootstrap integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/sync/main.go <ul><li>Replaced legacy <code>KVManager</code> initialization with <code>cfgbootstrap.Service</code> API<br> <li> Simplified configuration loading and KV watch setup<br> <li> Removed explicit cleanup in favor of deferred bootstrap result closure</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-78dc6bc53f1c760c66f43ff5f486bfe78a65bee8b2e0d4862293ec0892da2b29">+16/-31</a>&nbsp; </td> </tr> <tr> <td> <details> <summary><strong>main.go</strong><dd><code>Agent service bootstrap integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/agent/main.go <ul><li>Migrated from legacy <code>KVManager</code> to new <code>cfgbootstrap.Service</code> pattern<br> <li> Simplified configuration loading with service descriptor lookup<br> <li> Updated KV watch to use new bootstrap result API</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-61358711e980ccf505246fd3915f97cbd3a380e9b66f6fa5aad46749968c5ca3">+17/-26</a>&nbsp; </td> </tr> <tr> <td> <details> <summary><strong>service.go</strong><dd><code>Configuration bootstrap service abstraction</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/config/bootstrap/service.go <ul><li>Created new <code>Service</code> function for unified configuration loading and <br>watching<br> <li> Implemented <code>Result</code> type with <code>Close()</code> and <code>StartWatch()</code> helper methods<br> <li> Handles KV manager initialization, config loading, bootstrapping, and <br>watcher registration</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-bdbe338c04da107ded4d961b6a02c05f7580a5b70d33d2e141c4b60769a35ea7">+126/-0</a>&nbsp; </td> </tr> <tr> <td> <details> <summary><strong>toml_mask.go</strong><dd><code>TOML document sanitization utility</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/config/toml_mask.go <ul><li>Implemented <code>SanitizeTOML</code> function to remove sensitive keys from TOML <br>documents<br> <li> Supports table-specific and wildcard key filtering without full <br>parsing<br> <li> Preserves comments and multiline strings while filtering sensitive <br>data</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-0d1120abd3a87aa7e1b1b9f80e70db597353f622e9c90992116379e4b39367d0">+114/-0</a>&nbsp; </td> </tr> <tr> <td> <details> <summary><strong>kv_client.go</strong><dd><code>KV bootstrap configuration sanitization</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/config/kv_client.go <ul><li>Updated <code>BootstrapConfig</code> to accept <code>configPath</code> parameter and prefer <br>on-disk sanitization<br> <li> Added <code>sanitizeBootstrapSource</code> to read and sanitize config files before <br>KV storage<br> <li> Implemented <code>cloneConfig</code> for creating configuration copies via <br>reflection</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-0c911990aa32d43404f0f23dadc0363ec97ed4ca315592e6c9bd24510e82e06d">+58/-8</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>main.go</strong><dd><code>Mapper service bootstrap integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/mapper/main.go <ul><li>Migrated from direct config loading to <code>cfgbootstrap.Service</code> pattern<br> <li> Added service descriptor lookup and bootstrap result handling<br> <li> Integrated KV watch with watcher registration</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-1b7aa67fec348b5072091f561abc40b75a0a18b660af797c9efcf90803840e39">+16/-7</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>main.go</strong><dd><code>SNMP checker bootstrap integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/checkers/snmp/main.go <ul><li>Replaced direct config loading with <code>cfgbootstrap.Service</code> API<br> <li> Added service descriptor lookup for SNMP checker<br> <li> Integrated KV watch with watcher registration and logging</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-f25402eade63525184cb5e7437accff93c7b9338eebe81add6dc5f2a9eb12550">+16/-6</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>main.go</strong><dd><code>Sysmon-VM checker bootstrap integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/checkers/sysmon-vm/main.go <ul><li>Migrated from direct config loading to <code>cfgbootstrap.Service</code> pattern<br> <li> Added service descriptor lookup for sysmon-vm checker<br> <li> Integrated KV watch with watcher registration</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-b1ab49590901666d7536778838bbf3eda1cac7c33257e8e89cb3ab74df8ae376">+16/-3</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>app.go</strong><dd><code>Core service bootstrap integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/core/app/app.go <ul><li>Replaced <code>core.LoadConfig</code> with <code>cfgbootstrap.Service</code> pattern<br> <li> Added service descriptor lookup for core service<br> <li> Integrated KV watch with watcher registration for hot-reload <br>notifications</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-4ad8a289575edf3b163088617b7a40ae1305c29ced0c7d59b3751c57d6938072">+21/-1</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>main.go</strong><dd><code>NetFlow consumer bootstrap integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/consumers/netflow/main.go <ul><li>Migrated from direct config loading to <code>cfgbootstrap.Service</code> API<br> <li> Added service descriptor lookup for netflow consumer<br> <li> Integrated KV watch with watcher registration</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-7e1a99c83e5e5c2d5deec445135861c820d0aeddb2d3b5365e24b4c3b6955f3a">+16/-4</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>main.go</strong><dd><code>Data service bootstrap integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/data-services/main.go <ul><li>Replaced direct config loading with <code>cfgbootstrap.Service</code> pattern<br> <li> Added service descriptor lookup for datasvc<br> <li> Removed manual KV store setup in favor of bootstrap result handling</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-5e7731adfb877918cd65d9d5531621312496450fd550fea2682efca4ca8fe816">+11/-9</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>kv_watch.go</strong><dd><code>KV watch watcher event tracking</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/config/kv_watch.go <ul><li>Enhanced <code>StartKVWatchOverlay</code> to track watcher events and errors<br> <li> Integrated watcher ID from context for event marking<br> <li> Added error tracking and status updates during KV watch operations</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-6d4dd0c34977dc5f8eda8c3a62075478ae14ab26d45bbcdd24974310ee839e84">+19/-1</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>main.go</strong><dd><code>Dusk checker bootstrap integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/checkers/dusk/main.go <ul><li>Migrated from direct config loading to <code>cfgbootstrap.Service</code> pattern<br> <li> Added service descriptor lookup for dusk checker<br> <li> Simplified configuration initialization</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-2beb066906fd7a77f5812c5dcdb294a4886c9c2943378f49bce239cfe51b8035">+12/-6</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>types.go</strong><dd><code>KV client function signature update</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/types.go <ul><li>Updated <code>kvGetFn</code> function signature to include revision number as <br>fourth return value</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-39f024121630282f9e3eeee5b77e5a63a87950b9eae4f479277a289796b42d56">+1/-1</a>&nbsp; &nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>route.ts</strong><dd><code>Web API proxy for config descriptor listing</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> web/src/app/api/admin/config/route.ts <ul><li>Created new API proxy endpoint for listing configuration descriptors<br> <li> Implements authentication and error handling for admin config API</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-cd21c2156b445a89675aafed261e647269cef390709656ee228c8c0859ace35b">+43/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>route.ts</strong><dd><code>Web API proxy for service configuration management</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> web/src/app/api/admin/config/[service]/route.ts <ul><li>Created new API proxy endpoints for GET, PUT, DELETE operations on <br>service configurations<br> <li> Routes requests through shared proxy utility</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-cb55bb18ed52e3dc35d18e17d25f22c42db346e96c0f870c2973ce13df953281">+23/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>sanitize.go</strong><dd><code>Configuration sanitization for KV storage</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/config/sanitize.go <ul><li>Implemented <code>sanitizeForKV</code> function to remove sensitive fields before <br>KV storage<br> <li> Filters fields marked with <code>sensitive:"true"</code> tag using models utility</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-ea7244a86d39e6b09ad0811dbdf1a84f85779f6dac03089af47f64cb4877741d">+27/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>config.go</strong><dd><code>Core configuration KV bootstrap integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/config.go <ul><li>Enhanced <code>overlayFromKV</code> to bootstrap core configuration into KV after <br>loading<br> <li> Added service descriptor lookup and KV bootstrap call with error <br>logging</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-a2dcdf3bac7ca59032d52f7a0635b5dbcfcb83a6f07cc45db8f57fba0cc33c30">+12/-1</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>proxy.ts</strong><dd><code>New shared proxy implementation for config API</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> web/src/app/api/admin/config/proxy.ts <ul><li>Created new shared proxy module for admin config routes<br> <li> Implements <code>proxyConfigRequest</code> function handling GET, PUT, DELETE <br>methods<br> <li> Normalizes legacy <code>kvStore</code> query parameter to <code>kv_store_id</code><br> <li> Forwards requests to upstream API with proper headers and error <br>handling</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-efbdcfc4a88b03f4292320f91f31515bd86eb070be3b8c65a3dc2e583276e9d7">+93/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>run-with-config-sync.sh</strong><dd><code>Config-sync wrapper entrypoint for services</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> docker/entrypoints/run-with-config-sync.sh <ul><li>New bash entrypoint script for optional config-sync execution<br> <li> Supports environment variables for config service, output path, <br>template, role, seeding, and watching<br> <li> Conditionally runs <code>config-sync</code> binary before launching main service<br> <li> Handles missing binary gracefully with informative error messages</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-3af218090e73b0ba3d67825ed131ad8d58ecea2884175dd333a876cdac559eeb">+59/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>_helpers.tpl</strong><dd><code>Helm helper for config-sync environment variables</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> helm/serviceradar/templates/_helpers.tpl <ul><li>Added new <code>serviceradar.configSyncEnv</code> Helm template helper<br> <li> Generates environment variables for config-sync configuration<br> <li> Supports optional kvKey, role, extraArgs, and extraEnv customization<br> <li> Provides sensible defaults for enabled, seed, and watch flags</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-3d59d815f528d134e097ce2c3e830c6eaa738e27b6645df1e9b18136cd5d3c0d">+28/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>ServicesTreeNavigation.tsx</strong><dd><code>Dynamic global services discovery and rendering</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> web/src/components/Admin/ServicesTreeNavigation.tsx <ul><li>Dynamically fetch config descriptors from <code>/api/admin/config</code> endpoint<br> <li> Filter global services based on scope from descriptors instead of <br>hardcoded list<br> <li> Implement dynamic service label generation with display_name support<br> <li> Render global services list dynamically with status indicators and <br>metadata display</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-1ed7ef6d80b5c019b943bbd7a1827806a8f054b84e1ca993bc51179748e9f4b5">+174/-49</a></td> </tr> <tr> <td> <details> <summary><strong>ConfigEditor.tsx</strong><dd><code>Config metadata parsing and display in editor</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> web/src/components/Admin/ConfigEditor.tsx <ul><li>Added <code>ConfigMetadata</code> and <code>ConfigEnvelope</code> types for structured config <br>responses<br> <li> Implement <code>buildConfigQuery</code> helper for consistent query parameter <br>construction<br> <li> Parse JSON envelope responses containing metadata (revision, origin, <br>last_writer, updated_at)<br> <li> Display config metadata badges showing revision, origin, and update <br>timestamp<br> <li> Refresh config after successful save to reflect server state</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-7c31e6580bb0219035f62a50bbaa55695349e610982708bc32c9a8ae78b30ad9">+99/-15</a>&nbsp; </td> </tr> </table></details></td></tr><tr><td><strong>Tests</strong></td><td><details><summary>5 files</summary><table> <tr> <td> <details> <summary><strong>auth_test.go</strong><dd><code>Configuration API unit tests</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/auth_test.go <ul><li>Added tests for template seeding functionality in <code>handleGetConfig</code><br> <li> Added tests for agent-scoped configuration requiring <code>agent_id</code> <br>parameter<br> <li> Added tests for watcher registration and listing</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-4e62854ea14cbf28a84fb3d9fc1341866261872678a73c3912628e475ddb7d06">+136/-0</a>&nbsp; </td> </tr> <tr> <td> <details> <summary><strong>config_status_test.go</strong><dd><code>Configuration status API tests</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/config_status_test.go <ul><li>Added tests for config status endpoint reporting missing KV keys<br> <li> Added tests for KV key qualification with domain prefixing<br> <li> Added tests verifying template coverage for all service descriptors</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-f2e69c39ea36d06d2101e2ce9cd4dda2cbbb53831c3e804a452aca603a81e440">+76/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>watchers_test.go</strong><dd><code>Watcher registry unit tests</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/config/watchers_test.go <ul><li>Added tests for watcher registry event tracking and status updates<br> <li> Added tests for context-based watcher ID propagation<br> <li> Verified watcher lifecycle management</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-7e73cc9429d87cf11a87f5ce84b68329af719b0331393a92987d9470010e8bdd">+58/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>toml_mask_test.go</strong><dd><code>TOML sanitization unit tests</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/config/toml_mask_test.go <ul><li>Added tests for TOML sanitization with specific key filtering<br> <li> Added tests for wildcard-based table and key filtering</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-a8d92d1c06c3ea486e386ddcbc61a5632a6d63bb8c0262246e9d5713347ee8cf">+54/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>sanitize_test.go</strong><dd><code>Configuration sanitization unit tests</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/config/sanitize_test.go <ul><li>Added tests for sensitive field removal from configuration structures<br> <li> Verified nested sensitive field filtering</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-5460aef12cf0765bd0d264dceca65d792f819bf897822b6e65657b6d900ea2fd">+43/-0</a>&nbsp; &nbsp; </td> </tr> </table></details></td></tr><tr><td><strong>Configuration changes</strong></td><td><details><summary>30 files</summary><table> <tr> <td> <details> <summary><strong>flowgger.yaml</strong><dd><code>Flowgger Helm template config sync integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> helm/serviceradar/templates/flowgger.yaml - Added config sync environment variables to flowgger deployment </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-511cfbfe42e0c41cd2fddf67a04911b48724ca9ea9f6d1ddc1f4bb7bf07086ab">+1/-0</a>&nbsp; &nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>core.json</strong><dd><code>Core service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/core.json <ul><li>New comprehensive core service configuration template<br> <li> Includes database, security, NATS, events, auth, RBAC, webhooks, and <br>logging settings<br> <li> Defines route protection rules for API endpoints with role-based <br>access control<br> <li> Contains placeholder values for TLS certificates, API keys, and NATS <br>configuration</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-d6c48d9e3feaabbac2dd13d8406276ba887d662a5532444bff601775ac017b0a">+165/-0</a>&nbsp; </td> </tr> <tr> <td> <details> <summary><strong>docker-compose.yml</strong><dd><code>Config-sync environment setup in docker-compose</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> docker-compose.yml <ul><li>Added config-sync environment variables to otel, flowgger, trapd, and <br>zen-consumer services<br> <li> Configures service name, output path, template path, sync role, and <br>optional watch/seed flags<br> <li> Enables dynamic configuration synchronization for each collector <br>service</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-e45e45baeda1c1e73482975a664062aa56f20c03dd9d64a827aba57775bed0d3">+40/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>mapper.json</strong><dd><code>Mapper service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/mapper.json <ul><li>New mapper service configuration template with SNMP discovery settings<br> <li> Includes worker pool, timeout, retry, and job management configuration<br> <li> Defines OID mappings for basic system info, interfaces, and topology <br>discovery<br> <li> Contains stream configuration for publishing discovery results to NATS</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-61e1b38bbd9093f5f1c61290d018c4267080ff9b1f1071cbb01279404b4f5cdd">+105/-0</a>&nbsp; </td> </tr> <tr> <td> <details> <summary><strong>poller.json</strong><dd><code>Poller service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/poller.json <ul><li>New poller service configuration template with agent definitions<br> <li> Configures local and default agents with security and check <br>definitions<br> <li> Includes core address, listen address, poll interval, and partition <br>settings<br> <li> Defines OTEL logging configuration for observability</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-83605b42fd8b1b70f5309344e5250738431b371c38ee743d354799bd88ce46eb">+111/-0</a>&nbsp; </td> </tr> <tr> <td> <details> <summary><strong>db-event-writer.json</strong><dd><code>DB event writer service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/db-event-writer.json <ul><li>New event writer service configuration template<br> <li> Configures NATS stream consumption and database event persistence<br> <li> Maps event subjects to database tables for logs, traces, and metrics<br> <li> Includes database connection, security, and OTEL logging settings</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-df452c379e82aa142456365b936bd8021016acde9a1ec237b0e9425d523fb3a9">+87/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>sync.json</strong><dd><code>Sync service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/sync.json <ul><li>New sync service configuration template for device discovery<br> <li> Configures NATS, KV store, and security settings<br> <li> Includes Armis API integration with credentials and discovery queries<br> <li> Defines network blacklist and custom field mappings for device <br>filtering</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-3a6c6d9636bae153e76252c8da0d941ae042d54e4b2bd76cb809b54b7cc0346a">+77/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>Dockerfile</strong><dd><code>Zen consumer Dockerfile with config-sync integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/consumers/zen/Dockerfile <ul><li>Added config-sync builder stage to compile config-sync binary<br> <li> Copy config-sync and entrypoint script into runtime image<br> <li> Move template config to <code>/etc/serviceradar/templates/consumers/</code> <br>directory<br> <li> Set CONFIG_* environment variables and use run-with-config-sync.sh as <br>ENTRYPOINT</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-9f8119755792e77da338d2d29ce5e1bbfeeb8f5816a3233e78a9e206bafb0b53">+22/-2</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>zen-consumer.json</strong><dd><code>Zen consumer service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/zen-consumer.json <ul><li>New zen-consumer service configuration template<br> <li> Defines NATS stream consumption and decision groups for event <br>processing<br> <li> Configures syslog, SNMP, OTEL logs, and metrics processing rules<br> <li> Includes security, gRPC, and KV bucket settings</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-6a8e7da446b263b0aa8bd5dca18b5c73cf425c60e451cf999d6dd0a94a482d39">+61/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>Dockerfile</strong><dd><code>TRAPD Dockerfile with config-sync integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/trapd/Dockerfile <ul><li>Added config-sync builder stage to compile config-sync binary<br> <li> Copy config-sync and entrypoint script into runtime image<br> <li> Move template config to <code>/etc/serviceradar/templates/</code> directory<br> <li> Set CONFIG_* environment variables and use run-with-config-sync.sh as <br>ENTRYPOINT</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-7ee3f454058c8cf12947948e4f4f302e7a461b9d79bb447c15e68f4b02668647">+22/-1</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>Dockerfile</strong><dd><code>Flowgger Dockerfile with config-sync integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/flowgger/Dockerfile <ul><li>Added config-sync builder stage to compile config-sync binary<br> <li> Copy config-sync and entrypoint script into runtime image<br> <li> Move template config to <code>/etc/serviceradar/templates/</code> directory<br> <li> Set CONFIG_* environment variables and use run-with-config-sync.sh as <br>ENTRYPOINT</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-7199f3111d0ff4e1154dd5a0f2250ea0e82c39031abdf7864356570dd1007c87">+23/-1</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>Dockerfile</strong><dd><code>OTEL Dockerfile with config-sync integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> cmd/otel/Dockerfile <ul><li>Added config-sync builder stage to compile config-sync binary<br> <li> Copy config-sync and entrypoint script into runtime image<br> <li> Move template config to <code>/etc/serviceradar/templates/</code> directory<br> <li> Set CONFIG_* environment variables and use run-with-config-sync.sh as <br>ENTRYPOINT</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-62c9619630b9f9c73e89622525098ec4722282a8499ef89df09116d0840566ae">+23/-1</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>netflow-consumer.json</strong><dd><code>Netflow consumer service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/netflow-consumer.json <ul><li>New netflow-consumer service configuration template<br> <li> Configures NATS stream consumption for netflow metrics<br> <li> Defines enabled/disabled fields and dictionary-based enrichment<br> <li> Includes database connection and mTLS security settings</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-f10c3caba327629f4f0cd3708354ffac884508ef90e57aba7bbb11abc6d9a481">+58/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>snmp-checker.json</strong><dd><code>SNMP checker service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/snmp-checker.json <ul><li>New SNMP checker service configuration template<br> <li> Configures node address, listen address, and SPIFFE security<br> <li> Defines target devices with SNMP credentials and OID mappings<br> <li> Includes logging and OTEL configuration</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-648a52615eb992bcde77be6a1da9e961cf0d1e8b3e179d8542dcb538d50c47ac">+47/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>agent.json</strong><dd><code>Agent service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/agent.json <ul><li>New agent service configuration template<br> <li> Configures gRPC service with mTLS security<br> <li> Defines agent identity, partition, and checker directory<br> <li> Includes OTEL logging configuration</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-112ab33cc236d53be962db02d3035e805e80c8bbcbea489b6b3e523a27afcd77">+39/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>flowgger.toml</strong><dd><code>Flowgger service TOML configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/flowgger.toml <ul><li>New flowgger service TOML configuration template<br> <li> Configures UDP syslog input and NATS JetStream output<br> <li> Defines NATS connection, TLS, and retry behavior<br> <li> Includes gRPC listener with mTLS security settings</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-19eb3f34dc9dd906f5d747b6a265b097e4a2b82f9d101cb9d44daf6f3630f663">+34/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>faker.json</strong><dd><code>Faker service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/faker.json <ul><li>New faker service configuration template for testing<br> <li> Configures fake Armis API with device simulation<br> <li> Includes IP shuffle simulation and persistent storage settings<br> <li> Defines logging configuration with file output</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-6ce518fe56d8e597e74e870c6e2e43dc947b9a97252594d0f1a2a690290af0e8">+34/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>otel.toml</strong><dd><code>OTEL collector service TOML configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/otel.toml <ul><li>New OTEL collector service TOML configuration template<br> <li> Configures OTEL server binding and NATS output<br> <li> Defines JetStream stream and subject configuration<br> <li> Includes TLS settings for NATS and gRPC</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-13066e71d4bcfa375c5cacd2fe0330d5e6f07835286350683e0d93ae983ab104">+31/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>trapd.json</strong><dd><code>TRAPD service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/trapd.json <ul><li>New TRAPD service configuration template<br> <li> Configures SNMP trap listener and NATS output<br> <li> Defines gRPC listener with mTLS security<br> <li> Includes NATS domain and stream configuration</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-8bbb29491cffc17c4fba15177fa585d428524df2acb1343e43d74b5ac8af5433">+21/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>sysmon-vm-checker.json</strong><dd><code>Sysmon-vm checker service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/sysmon-vm-checker.json <ul><li>New sysmon-vm checker service configuration template<br> <li> Configures gRPC listener with SPIFFE security<br> <li> Defines ZFS pool monitoring and filesystem tracking<br> <li> Includes poll interval and monitoring settings</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-b212ca180313ce49e8dbebe85b45ce6364c5b3ac6608014e5b26b301f6963ecb">+27/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>values.yaml</strong><dd><code>Helm values for config-sync service configuration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> helm/serviceradar/values.yaml <ul><li>Added <code>configSync</code> section with per-service configuration<br> <li> Defines enabled, seed, watch, kvKey, role, extraArgs, and extraEnv for <br>each service<br> <li> Provides configuration for flowgger, trapd, otel, and zen services<br> <li> Enables flexible config-sync behavior per deployment</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-d4449c7cb70362554b274f81eae5a4b81a8e81df494282e383d1b7ea3871c452">+34/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>datasvc.json</strong><dd><code>Datasvc service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/datasvc.json <ul><li>New datasvc service configuration template<br> <li> Configures NATS connection and KV bucket settings<br> <li> Defines RBAC roles and bucket history/TTL configuration<br> <li> Includes mTLS security settings</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-947e04774230915769b7c49f34e53d102eb5918ff882bc827afc788661c6e975">+25/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>dusk-checker.json</strong><dd><code>Dusk checker service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/dusk-checker.json <ul><li>New dusk checker service configuration template<br> <li> Configures gRPC node connection and listener<br> <li> Defines SPIFFE security with trust domain and workload socket<br> <li> Includes timeout and TLS certificate settings</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-97b76cac9094a87b9fa47bf7964632dd41771e2e5bdba216795be569cd7115f3">+22/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>trapd.yaml</strong><dd><code>Trapd Helm template config-sync environment injection</code>&nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> helm/serviceradar/templates/trapd.yaml <ul><li>Injected <code>serviceradar.configSyncEnv</code> helper to generate config-sync <br>environment variables<br> <li> Uses <code>configSync.trapd</code> values from Helm chart</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-007ac4e264749139ee8ba84630cf34244ad17ed524b50ad7b28981c0ec3cf3f6">+1/-0</a>&nbsp; &nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>agent-sweep.json</strong><dd><code>Agent sweep service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/agent-sweep.json <ul><li>New agent-sweep service configuration template<br> <li> Configures network sweep parameters and port scanning<br> <li> Defines ICMP and TCP sweep modes with concurrency and timeout<br> <li> Includes rate limiting and high-performance ICMP settings</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-e8f199b19954ec38b1b945f156ef552b4b4bf647a6553b6ceedb8edf20730ab0">+11/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>otel.yaml</strong><dd><code>OTEL Helm template config-sync environment injection</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> helm/serviceradar/templates/otel.yaml <ul><li>Injected <code>serviceradar.configSyncEnv</code> helper to generate config-sync <br>environment variables<br> <li> Uses <code>configSync.otel</code> values from Helm chart</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-3422f8a7811e511a91937331e4a2795bbcae8846c19ae8f7f66d52a241ad300c">+1/-0</a>&nbsp; &nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>zen.yaml</strong><dd><code>Zen Helm template config-sync environment injection</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> helm/serviceradar/templates/zen.yaml <ul><li>Injected <code>serviceradar.configSyncEnv</code> helper to generate config-sync <br>environment variables<br> <li> Uses <code>configSync.zen</code> values from Helm chart</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-6ad57a6af05e6ac723002f0979b35265589bd65639d2ea1605465c695f51364b">+1/-0</a>&nbsp; &nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>agent-snmp.json</strong><dd><code>Agent SNMP service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/agent-snmp.json <ul><li>New agent-snmp service configuration template<br> <li> Configures SNMP checker with enabled flag and listener address<br> <li> Defines node address and target list</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-f8e5312b2b514563b53c2ccd9cd6fd81626b9c3288bba5da7e0454046d1faafa">+6/-0</a>&nbsp; &nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>agent-mapper.json</strong><dd><code>Agent mapper service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/agent-mapper.json <ul><li>New agent-mapper service configuration template<br> <li> Configures mapper service address and enabled flag</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-74a3fd4b2aae0ede0121d9950110f48a4f9ac3dece131f15a51bde1e9e55ac95">+4/-0</a>&nbsp; &nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>agent-trapd.json</strong><dd><code>Agent TRAPD service configuration template</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> pkg/core/api/templates/agent-trapd.json <ul><li>New agent-trapd service configuration template<br> <li> Configures TRAPD listener address and enabled flag</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-209e52478d55435201fa589bf46568956a49d3ba3e14261fb509169f97fef5d6">+4/-0</a>&nbsp; &nbsp; &nbsp; </td> </tr> </table></details></td></tr><tr><td><strong>Refactoring</strong></td><td><details><summary>1 files</summary><table> <tr> <td> <details> <summary><strong>route.ts</strong><dd><code>Centralize config API route handling with shared proxy</code>&nbsp; &nbsp; &nbsp; </dd></summary> <hr> web/src/app/api/config/[service]/route.ts <ul><li>Refactored route handlers to delegate to centralized <br><code>proxyConfigRequest</code> function<br> <li> Removed duplicate authentication and API forwarding logic from GET, <br>PUT, DELETE handlers<br> <li> Simplified each handler to a single-line call to shared proxy <br>implementation<br> <li> Updated parameter naming from destructured object to <code>ctx</code> for <br>consistency</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-8ec3e6eab784db17164c21c173f8254157bcb0ab8f330dfb21b39fc6cbda72cd">+11/-139</a></td> </tr> </table></details></td></tr><tr><td><strong>Documentation</strong></td><td><details><summary>2 files</summary><table> <tr> <td> <details> <summary><strong>agents.md</strong><dd><code>Documentation for KV config checks and config-sync tool</code>&nbsp; &nbsp; </dd></summary> <hr> docs/docs/agents.md <ul><li>Added KV Configuration Checks section with nats-kv helper commands<br> <li> Documented config-sync tool usage for hydrating collectors<br> <li> Provided example command showing service, output, template, and watch <br>options<br> <li> Explained tool behavior: pulls KV record, seeds template, writes to <br>disk, optionally watches</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-af8d04277f2353629065b0cc5fad3e44bd3e7c20339bd125e0812104bdbeff28">+22/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td> <details> <summary><strong>newarch_plan.md</strong><dd><code>Architecture plan issue tracking references</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></summary> <hr> newarch_plan.md <ul><li>Added tracking issue reference #1924 and related issue #1921<br> <li> Links to tactical CTE query fix for context</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-87750e31d04e7123dc95483d779a90deef0eb64c8d52afda1e6c2b59b68d9a1c">+3/-0</a>&nbsp; &nbsp; &nbsp; </td> </tr> </table></details></td></tr><tr><td><strong>Additional files</strong></td><td><details><summary>1 files</summary><table> <tr> <td><strong>agent-rperf.json</strong></td> <td><a href="https://github.com/carverauto/serviceradar/pull/1926/files#diff-230b52475761573f26dcaac43d647fbfcbf63f4b0ba8bd333409d2d26ece7f75">+4/-0</a>&nbsp; &nbsp; &nbsp; </td> </tr> </table></details></td></tr></tr></tbody></table> </details> ___
qodo-code-review[bot] commented 2025-11-08 17:04:56 +00:00 (Migrated from github.com)
Author
Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1926#issuecomment-3506720817
Original created: 2025-11-08T17:04:56Z

PR Compliance Guide 🔍

(Compliance updated until commit github.com/carverauto/serviceradar@af6b19d330)

Below is a summary of compliance checks for this PR:

Security Compliance
Insecure TLS verification

Description: TLS configuration allows InsecureSkipVerify to be enabled via a flag, which can permit
MITM if used in production; this should be guarded or clearly restricted to
development-only builds.
main.go [401-418]

Referred Code

tlsConfig := &tls.Config{
	MinVersion: tls.VersionTLS12,
}
if cfg.natsInsecureTLS {
	tlsConfig.InsecureSkipVerify = true
}

if cfg.natsTLSCA != "" {
	caCert, err := os.ReadFile(cfg.natsTLSCA)
	if err != nil {
		return nil, fmt.Errorf("read NATS CA file: %w", err)
	}
	cp := x509.NewCertPool()
	cp.AppendCertsFromPEM(caCert)
	tlsConfig.RootCAs = cp
}


Error information leakage

Description: Error responses echo internal messages directly in JSON which may leak internal details
(keys, store IDs, or stack messages) to clients; sanitize messages before returning to
avoid information disclosure.
auth.go [818-870]

Referred Code

func (s *APIServer) writeRawConfigResponse(w http.ResponseWriter, data []byte, format config.ConfigFormat, meta *configMetadata) {
	if meta != nil {
		w.Header().Set("X-Serviceradar-Kv-Key", meta.KVKey)
		if meta.KVStoreID != "" {
			w.Header().Set("X-Serviceradar-Kv-Store", meta.KVStoreID)
		}
		if meta.Revision != 0 {
			w.Header().Set("X-Serviceradar-Kv-Revision", fmt.Sprintf("%d", meta.Revision))
		}
		if meta.Origin != "" {
			w.Header().Set("X-Serviceradar-Config-Origin", meta.Origin)
		}
		if meta.LastWriter != "" {
			w.Header().Set("X-Serviceradar-Config-Writer", meta.LastWriter)
		}
		if meta.UpdatedAt != nil {
			w.Header().Set("X-Serviceradar-Config-Updated-At", meta.UpdatedAt.Format(time.RFC3339))
		}
	}




 ... (clipped 32 lines)
Sensitive data exposure

Description: Corrupt payloads are base64-dumped to disk using filenames derived from untrusted KV keys,
which could enable path confusion or sensitive data exposure if directories are
world-readable; sanitize and restrict permissions.
main.go [430-438]

Referred Code
func dumpPayload(dir, key string, value []byte) (string, error) {
	safeName := strings.NewReplacer("/", "_", ":", "_").Replace(key)
	target := filepath.Join(dir, safeName+".b64")
	encoded := base64.StdEncoding.EncodeToString(value)
	if err := os.WriteFile(target, []byte(encoded+"\n"), fs.FileMode(0o644)); err != nil {
		return "", err
	}
	return target, nil
}

Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Missing audit logs: New admin endpoints perform critical configuration reads/writes, seeding, and watcher
enumeration without emitting structured audit events capturing user identity, action,
target KV key, and outcome.

Referred Code
// @Failure 500 {object} models.ErrorResponse "Internal server error"
// @Router /api/admin/config/{service} [get]
func (s *APIServer) handleGetConfig(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()
	vars := mux.Vars(r)
	service := vars["service"]

	key := r.URL.Query().Get("key")
	agentID := r.URL.Query().Get("agent_id")
	pollerID := r.URL.Query().Get("poller_id")
	serviceType := r.URL.Query().Get("service_type")
	kvStoreID := r.URL.Query().Get("kv_store_id")
	if kvStoreID == "" {
		kvStoreID = r.URL.Query().Get("kvStore")
	}

	var desc config.ServiceDescriptor
	var hasDescriptor bool
	if key == "" {
		var err error
		desc, key, hasDescriptor, err = s.resolveServiceKey(service, serviceType, agentID, pollerID)



 ... (clipped 520 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Partial edge handling: Tool logs and returns errors appropriately but delete and rehydrate operations lack
retries/backoff and per-key timeouts are parsed but not applied to requests, which may
cause hangs or partial failures under transient conditions.

Referred Code
	ctx context.Context,
	kv nats.KeyValue,
	cfg sweepConfig,
	stats *sweepStats,
	problems *[]corruptRecord,
	coreClient proto.CoreServiceClient,
) error {
	lister, err := kv.ListKeys()
	if err != nil {
		return fmt.Errorf("list keys: %w", err)
	}
	defer stopLister(lister)

	for key := range lister.Keys() {
		stats.totalKeys++
		if cfg.prefix != "" && !strings.HasPrefix(key, cfg.prefix) {
			continue
		}

		stats.filteredKeys++
		if cfg.maxKeys > 0 && stats.filteredKeys > cfg.maxKeys {



 ... (clipped 223 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
Potential sensitive logs: New handlers log KV keys, service names, and errors which may include sensitive
configuration paths; confirm no secret values or sanitized TOML fields are ever logged at
any level.

Referred Code
}

entry, metaRecord, err := s.loadConfigEntry(ctx, kvStoreID, resolvedKey)
if err != nil && kvStoreID != "" {
	if s.logger != nil {
		s.logger.Warn().Err(err).Str("key", resolvedKey).Str("kv_store_id", kvStoreID).Msg("KV fetch failed; falling back to default store")
	}
	kvStoreID = ""
	baseKeyUnqualified := baseKey
	if strings.HasPrefix(strings.ToLower(baseKeyUnqualified), "domains/") {
		parts := strings.SplitN(baseKeyUnqualified, "/", 3)
		if len(parts) == 3 {
			baseKeyUnqualified = parts[2]
		}
	}
	resolvedKey = s.qualifyKVKey(kvStoreID, baseKeyUnqualified)
	if !hasDescriptor {
		formatHint = guessFormatFromKey(resolvedKey)
	}
	entry, metaRecord, err = s.loadConfigEntry(ctx, kvStoreID, resolvedKey)
}



 ... (clipped 484 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Query validation gaps: Handlers accept query parameters like kv_store_id, key, agent_id, and poller_id and
forward them to KV and DB lookups without explicit validation/allowlists, which may risk
key traversal or cross-tenant access without additional unseen checks.

Referred Code
// @Failure 500 {object} models.ErrorResponse "Internal server error"
// @Router /api/admin/config/{service} [get]
func (s *APIServer) handleGetConfig(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()
	vars := mux.Vars(r)
	service := vars["service"]

	key := r.URL.Query().Get("key")
	agentID := r.URL.Query().Get("agent_id")
	pollerID := r.URL.Query().Get("poller_id")
	serviceType := r.URL.Query().Get("service_type")
	kvStoreID := r.URL.Query().Get("kv_store_id")
	if kvStoreID == "" {
		kvStoreID = r.URL.Query().Get("kvStore")
	}

	var desc config.ServiceDescriptor
	var hasDescriptor bool
	if key == "" {
		var err error
		desc, key, hasDescriptor, err = s.resolveServiceKey(service, serviceType, agentID, pollerID)



 ... (clipped 309 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
- Requires Further Human Verification
🏷️ - Compliance label

Previous compliance checks

Compliance check up to commit 0cbe2e6
Security Compliance
Sensitive information exposure

Description: Raw configuration responses are written back to clients without additional sanitization or
authentication scope checks for sensitive fields, potentially exposing secrets if configs
contain credentials (e.g., TOML/JSON containing tokens).
auth.go [650-676]

Referred Code
func (s *APIServer) writeRawConfigResponse(w http.ResponseWriter, data []byte, format config.ConfigFormat, meta *configMetadata) {
	if meta != nil {
		w.Header().Set("X-Serviceradar-Kv-Key", meta.KVKey)
		if meta.KVStoreID != "" {
			w.Header().Set("X-Serviceradar-Kv-Store", meta.KVStoreID)
		}
		if meta.Revision != 0 {
			w.Header().Set("X-Serviceradar-Kv-Revision", fmt.Sprintf("%d", meta.Revision))
		}
		if meta.Origin != "" {
			w.Header().Set("X-Serviceradar-Config-Origin", meta.Origin)
		}
		if meta.LastWriter != "" {
			w.Header().Set("X-Serviceradar-Config-Writer", meta.LastWriter)
		}
		if meta.UpdatedAt != nil {
			w.Header().Set("X-Serviceradar-Config-Updated-At", meta.UpdatedAt.Format(time.RFC3339))
		}
	}

	switch format {


 ... (clipped 6 lines)
Insecure file write

Description: The tool writes KV-fetched configuration to a filesystem path which may be
attacker-controllable via CLI without permission checks, risking overwriting sensitive
files if run with elevated privileges.
main.go [150-174]

Referred Code
func fetchConfig(ctx context.Context, client *kvgrpc.Client, key string) ([]byte, error) {
	if client == nil {
		return nil, nil
	}

	data, found, err := client.Get(ctx, key)
	if err != nil {
		return nil, fmt.Errorf("kv get %q failed: %w", key, err)
	}
	if !found {
		return nil, nil
	}
	return data, nil
}

func writeOutput(path string, data []byte) error {
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return err
	}
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o640); err != nil {


 ... (clipped 4 lines)
Insecure default seeding

Description: Seeding configuration from embedded templates into KV on GET when a key is missing can let
an unauthenticated or overly-permitted caller initialize configuration state if route
protection is misconfigured, leading to privilege or config injection.
auth.go [826-896]

Referred Code
func (s *APIServer) checkConfigKey(ctx context.Context, kvStoreID, key string) (bool, error) {
	if key == "" {
		return false, errors.New("kv key not specified")
	}

	resolvedKey := s.qualifyKVKey(kvStoreID, key)

	entry, err := s.getKVEntry(ctx, kvStoreID, resolvedKey)
	if err != nil {
		return false, err
	}
	return entry != nil && entry.Found, nil
}

func (s *APIServer) seedConfigFromTemplate(ctx context.Context, desc config.ServiceDescriptor, key, kvStoreID string) ([]byte, error) {
	tmpl, ok := serviceTemplates[desc.Name]
	if !ok || len(tmpl.data) == 0 {
		return nil, errTemplateUnavailable
	}

	payload := make([]byte, len(tmpl.data))


 ... (clipped 50 lines)
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Missing Audit Logs: New admin config endpoints perform critical actions (reading/updating/seeding
configurations and KV writes) without explicit audit logging of user, action, key, and
outcome in the added code.

Referred Code
// @Summary List managed config descriptors
// @Description Returns metadata about known service configurations
// @Tags Admin
// @Produce json
// @Success 200 {array} map[string]string
// @Router /api/admin/config [get]
func (s *APIServer) handleListConfigDescriptors(w http.ResponseWriter, r *http.Request) {
	descs := config.ServiceDescriptors()
	resp := make([]configDescriptorResponse, 0, len(descs))
	for _, desc := range descs {
		resp = append(resp, configDescriptorResponse{
			Name:          desc.Name,
			DisplayName:   desc.DisplayName,
			ServiceType:   desc.ServiceType,
			Scope:         string(desc.Scope),
			KVKey:         desc.KVKey,
			KVKeyTemplate: desc.KVKeyTemplate,
			Format:        string(desc.Format),
			RequiresAgent: desc.Scope == config.ConfigScopeAgent && desc.KVKeyTemplate != "",
		})
	}


 ... (clipped 765 lines)
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Incomplete Edge Handling: Several new handlers and helpers return generic 500s on KV/template failures without
differentiating transient vs persistent errors or validating query inputs beyond presence
which may hinder graceful degradation.

Referred Code
// @Summary Configuration coverage status
// @Description Checks whether every known descriptor has an initialized KV entry
// @Tags Admin
// @Produce json
// @Param kvStore query string false "Optional KV store identifier (defaults to local hub store)"
// @Success 200 {object} configStatusResponse "All descriptors have KV entries"
// @Failure 503 {object} configStatusResponse "One or more descriptors missing KV entries"
// @Failure 502 {object} configStatusResponse "KV lookups failed"
// @Router /api/admin/config/status [get]
func (s *APIServer) handleConfigStatus(w http.ResponseWriter, r *http.Request) {
	if s == nil {
		http.Error(w, "server not initialized", http.StatusInternalServerError)
		return
	}

	kvStoreID := r.URL.Query().Get("kv_store_id")
	if kvStoreID == "" {
		kvStoreID = r.URL.Query().Get("kvStore")
	}

	descs := config.ServiceDescriptors()


 ... (clipped 737 lines)
Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
Sensitive Log Risk: New logging around configuration get/put and template seeding logs KV keys and may log raw
configuration values indirectly; ensure no secrets or PII from configs (e.g., TOML/JSON
with tokens) are emitted at any level.

Referred Code
	if err := s.encodeJSONResponse(w, infos); err != nil {
		s.logger.Error().Err(err).Msg("failed to encode watcher response")
		s.writeAPIError(w, http.StatusInternalServerError, "failed to enumerate watchers")
	}
}

// @Summary Authenticate with username and password
// @Description Logs in a user with username and password and returns authentication tokens
// @Tags Authentication
// @Accept json
// @Produce json
// @Param credentials body LoginCredentials true "User credentials"
// @Success 200 {object} models.Token "Authentication tokens"
// @Failure 400 {object} models.ErrorResponse "Invalid request"
// @Failure 401 {object} models.ErrorResponse "Authentication failed"
// @Failure 500 {object} models.ErrorResponse "Internal server error"
// @Router /auth/login [post]
func (s *APIServer) handleLocalLogin(w http.ResponseWriter, r *http.Request) {
	if r.Method == http.MethodOptions {
		w.WriteHeader(http.StatusOK)
		return


 ... (clipped 635 lines)
Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Limited Validation: The update and get config endpoints accept query/body inputs (keys, agent_id, JSON/TOML)
with minimal validation and no explicit sanitization of configuration content before
storage beyond format hints.

Referred Code
// @Failure 500 {object} models.ErrorResponse "Internal server error"
// @Router /api/admin/config/{service} [get]
func (s *APIServer) handleGetConfig(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()
	vars := mux.Vars(r)
	service := vars["service"]

	key := r.URL.Query().Get("key")
	agentID := r.URL.Query().Get("agent_id")
	serviceType := r.URL.Query().Get("service_type")
	kvStoreID := r.URL.Query().Get("kv_store_id")
	if kvStoreID == "" {
		kvStoreID = r.URL.Query().Get("kvStore")
	}

	var desc config.ServiceDescriptor
	var hasDescriptor bool
	if key == "" {
		var err error
		desc, key, hasDescriptor, err = s.resolveServiceKey(service, serviceType, agentID)
		if err != nil {


 ... (clipped 178 lines)
Imported GitHub PR comment. Original author: @qodo-code-review[bot] Original URL: https://github.com/carverauto/serviceradar/pull/1926#issuecomment-3506720817 Original created: 2025-11-08T17:04:56Z --- ## PR Compliance Guide 🔍 <!-- https://github.com/carverauto/serviceradar/commit/af6b19d3302c61cc09bb493ca204365f9caa0f74 --> #### (Compliance updated until commit https://github.com/carverauto/serviceradar/commit/af6b19d3302c61cc09bb493ca204365f9caa0f74) Below is a summary of compliance checks for this PR:<br> <table><tbody><tr><td colspan='2'><strong>Security Compliance</strong></td></tr> <tr><td rowspan=3>⚪</td> <td><details><summary><strong>Insecure TLS verification </strong></summary><br> <b>Description:</b> TLS configuration allows InsecureSkipVerify to be enabled via a flag, which can permit <br>MITM if used in production; this should be guarded or clearly restricted to <br>development-only builds.<br> <strong><a href='https://github.com/carverauto/serviceradar/pull/1926/files#diff-d333a5d2d948c60657097bfd7fef250908a625c36d14041c634c4d5df2468fccR401-R418'>main.go [401-418]</a></strong><br> <details open><summary>Referred Code</summary> ```go tlsConfig := &tls.Config{ MinVersion: tls.VersionTLS12, } if cfg.natsInsecureTLS { tlsConfig.InsecureSkipVerify = true } if cfg.natsTLSCA != "" { caCert, err := os.ReadFile(cfg.natsTLSCA) if err != nil { return nil, fmt.Errorf("read NATS CA file: %w", err) } cp := x509.NewCertPool() cp.AppendCertsFromPEM(caCert) tlsConfig.RootCAs = cp } ``` </details></details></td></tr> <tr><td><details><summary><strong>Error information leakage </strong></summary><br> <b>Description:</b> Error responses echo internal messages directly in JSON which may leak internal details <br>(keys, store IDs, or stack messages) to clients; sanitize messages before returning to <br>avoid information disclosure.<br> <strong><a href='https://github.com/carverauto/serviceradar/pull/1926/files#diff-61eb09279471ca59a1b57a306f0ad05988a8cb39eca163b32bfb86c53f8a553eR818-R870'>auth.go [818-870]</a></strong><br> <details open><summary>Referred Code</summary> ```go func (s *APIServer) writeRawConfigResponse(w http.ResponseWriter, data []byte, format config.ConfigFormat, meta *configMetadata) { if meta != nil { w.Header().Set("X-Serviceradar-Kv-Key", meta.KVKey) if meta.KVStoreID != "" { w.Header().Set("X-Serviceradar-Kv-Store", meta.KVStoreID) } if meta.Revision != 0 { w.Header().Set("X-Serviceradar-Kv-Revision", fmt.Sprintf("%d", meta.Revision)) } if meta.Origin != "" { w.Header().Set("X-Serviceradar-Config-Origin", meta.Origin) } if meta.LastWriter != "" { w.Header().Set("X-Serviceradar-Config-Writer", meta.LastWriter) } if meta.UpdatedAt != nil { w.Header().Set("X-Serviceradar-Config-Updated-At", meta.UpdatedAt.Format(time.RFC3339)) } } ... (clipped 32 lines) ``` </details></details></td></tr> <tr><td><details><summary><strong>Sensitive data exposure </strong></summary><br> <b>Description:</b> Corrupt payloads are base64-dumped to disk using filenames derived from untrusted KV keys, <br>which could enable path confusion or sensitive data exposure if directories are <br>world-readable; sanitize and restrict permissions.<br> <strong><a href='https://github.com/carverauto/serviceradar/pull/1926/files#diff-d333a5d2d948c60657097bfd7fef250908a625c36d14041c634c4d5df2468fccR430-R438'>main.go [430-438]</a></strong><br> <details open><summary>Referred Code</summary> ```go func dumpPayload(dir, key string, value []byte) (string, error) { safeName := strings.NewReplacer("/", "_", ":", "_").Replace(key) target := filepath.Join(dir, safeName+".b64") encoded := base64.StdEncoding.EncodeToString(value) if err := os.WriteFile(target, []byte(encoded+"\n"), fs.FileMode(0o644)); err != nil { return "", err } return target, nil } ``` </details></details></td></tr> <tr><td colspan='2'><strong>Ticket Compliance</strong></td></tr> <tr><td>⚪</td><td><details><summary>🎫 <strong>No ticket provided </summary></strong> - [ ] Create ticket/issue <!-- /create_ticket --create_ticket=true --> </details></td></tr> <tr><td colspan='2'><strong>Codebase Duplication Compliance</strong></td></tr> <tr><td>⚪</td><td><details><summary><strong>Codebase context is not defined </strong></summary> Follow the <a href='https://qodo-merge-docs.qodo.ai/core-abilities/rag_context_enrichment/'>guide</a> to enable codebase context checks. </details></td></tr> <tr><td colspan='2'><strong>Custom Compliance</strong></td></tr> <tr><td rowspan=2>🟢</td><td> <details><summary><strong>Generic: Meaningful Naming and Self-Documenting Code</strong></summary><br> **Objective:** Ensure all identifiers clearly express their purpose and intent, making code <br>self-documenting<br> **Status:** Passed<br> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td> <details><summary><strong>Generic: Secure Error Handling</strong></summary><br> **Objective:** To prevent the leakage of sensitive system information through error messages while <br>providing sufficient detail for internal debugging.<br> **Status:** Passed<br> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td rowspan=4>⚪</td> <td><details> <summary><strong>Generic: Comprehensive Audit Trails</strong></summary><br> **Objective:** To create a detailed and reliable record of critical system actions for security analysis <br>and compliance.<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/1926/files#diff-61eb09279471ca59a1b57a306f0ad05988a8cb39eca163b32bfb86c53f8a553eR570-R1110'><strong>Missing audit logs</strong></a>: New admin endpoints perform critical configuration reads/writes, seeding, and watcher <br>enumeration without emitting structured audit events capturing user identity, action, <br>target KV key, and outcome.<br> <details open><summary>Referred Code</summary> ```go // @Failure 500 {object} models.ErrorResponse "Internal server error" // @Router /api/admin/config/{service} [get] func (s *APIServer) handleGetConfig(w http.ResponseWriter, r *http.Request) { ctx := r.Context() vars := mux.Vars(r) service := vars["service"] key := r.URL.Query().Get("key") agentID := r.URL.Query().Get("agent_id") pollerID := r.URL.Query().Get("poller_id") serviceType := r.URL.Query().Get("service_type") kvStoreID := r.URL.Query().Get("kv_store_id") if kvStoreID == "" { kvStoreID = r.URL.Query().Get("kvStore") } var desc config.ServiceDescriptor var hasDescriptor bool if key == "" { var err error desc, key, hasDescriptor, err = s.resolveServiceKey(service, serviceType, agentID, pollerID) ... (clipped 520 lines) ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td><details> <summary><strong>Generic: Robust Error Handling and Edge Case Management</strong></summary><br> **Objective:** Ensure comprehensive error handling that provides meaningful context and graceful <br>degradation<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/1926/files#diff-d333a5d2d948c60657097bfd7fef250908a625c36d14041c634c4d5df2468fccR240-R483'><strong>Partial edge handling</strong></a>: Tool logs and returns errors appropriately but delete and rehydrate operations lack <br>retries/backoff and per-key timeouts are parsed but not applied to requests, which may <br>cause hangs or partial failures under transient conditions.<br> <details open><summary>Referred Code</summary> ```go ctx context.Context, kv nats.KeyValue, cfg sweepConfig, stats *sweepStats, problems *[]corruptRecord, coreClient proto.CoreServiceClient, ) error { lister, err := kv.ListKeys() if err != nil { return fmt.Errorf("list keys: %w", err) } defer stopLister(lister) for key := range lister.Keys() { stats.totalKeys++ if cfg.prefix != "" && !strings.HasPrefix(key, cfg.prefix) { continue } stats.filteredKeys++ if cfg.maxKeys > 0 && stats.filteredKeys > cfg.maxKeys { ... (clipped 223 lines) ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td><details> <summary><strong>Generic: Secure Logging Practices</strong></summary><br> **Objective:** To ensure logs are useful for debugging and auditing without exposing sensitive <br>information like PII, PHI, or cardholder data.<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/1926/files#diff-61eb09279471ca59a1b57a306f0ad05988a8cb39eca163b32bfb86c53f8a553eR606-R1110'><strong>Potential sensitive logs</strong></a>: New handlers log KV keys, service names, and errors which may include sensitive <br>configuration paths; confirm no secret values or sanitized TOML fields are ever logged at <br>any level.<br> <details open><summary>Referred Code</summary> ```go } entry, metaRecord, err := s.loadConfigEntry(ctx, kvStoreID, resolvedKey) if err != nil && kvStoreID != "" { if s.logger != nil { s.logger.Warn().Err(err).Str("key", resolvedKey).Str("kv_store_id", kvStoreID).Msg("KV fetch failed; falling back to default store") } kvStoreID = "" baseKeyUnqualified := baseKey if strings.HasPrefix(strings.ToLower(baseKeyUnqualified), "domains/") { parts := strings.SplitN(baseKeyUnqualified, "/", 3) if len(parts) == 3 { baseKeyUnqualified = parts[2] } } resolvedKey = s.qualifyKVKey(kvStoreID, baseKeyUnqualified) if !hasDescriptor { formatHint = guessFormatFromKey(resolvedKey) } entry, metaRecord, err = s.loadConfigEntry(ctx, kvStoreID, resolvedKey) } ... (clipped 484 lines) ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td><details> <summary><strong>Generic: Security-First Input Validation and Data Handling</strong></summary><br> **Objective:** Ensure all data inputs are validated, sanitized, and handled securely to prevent <br>vulnerabilities<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/1926/files#diff-61eb09279471ca59a1b57a306f0ad05988a8cb39eca163b32bfb86c53f8a553eR570-R899'><strong>Query validation gaps</strong></a>: Handlers accept query parameters like kv_store_id, key, agent_id, and poller_id and <br>forward them to KV and DB lookups without explicit validation/allowlists, which may risk <br>key traversal or cross-tenant access without additional unseen checks.<br> <details open><summary>Referred Code</summary> ```go // @Failure 500 {object} models.ErrorResponse "Internal server error" // @Router /api/admin/config/{service} [get] func (s *APIServer) handleGetConfig(w http.ResponseWriter, r *http.Request) { ctx := r.Context() vars := mux.Vars(r) service := vars["service"] key := r.URL.Query().Get("key") agentID := r.URL.Query().Get("agent_id") pollerID := r.URL.Query().Get("poller_id") serviceType := r.URL.Query().Get("service_type") kvStoreID := r.URL.Query().Get("kv_store_id") if kvStoreID == "" { kvStoreID = r.URL.Query().Get("kvStore") } var desc config.ServiceDescriptor var hasDescriptor bool if key == "" { var err error desc, key, hasDescriptor, err = s.resolveServiceKey(service, serviceType, agentID, pollerID) ... (clipped 309 lines) ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td align="center" colspan="2"> <!-- placeholder --> <!-- /compliance --update_compliance=true --> </td></tr></tbody></table> <details><summary>Compliance status legend</summary> 🟢 - Fully Compliant<br> 🟡 - Partial Compliant<br> 🔴 - Not Compliant<br> ⚪ - Requires Further Human Verification<br> 🏷️ - Compliance label<br> </details> ___ #### Previous compliance checks <details> <summary>Compliance check up to commit <a href='https://github.com/carverauto/serviceradar/commit/0cbe2e6faecad12f512cd08bee8531f7b7b11bed'>0cbe2e6</a></summary><br> <table><tbody><tr><td colspan='2'><strong>Security Compliance</strong></td></tr> <tr><td rowspan=3>⚪</td> <td><details><summary><strong>Sensitive information exposure </strong></summary><br> <b>Description:</b> Raw configuration responses are written back to clients without additional sanitization or <br>authentication scope checks for sensitive fields, potentially exposing secrets if configs <br>contain credentials (e.g., TOML/JSON containing tokens).<br> <strong><a href='https://github.com/carverauto/serviceradar/pull/1926/files#diff-61eb09279471ca59a1b57a306f0ad05988a8cb39eca163b32bfb86c53f8a553eR650-R676'>auth.go [650-676]</a></strong><br> <details open><summary>Referred Code</summary> ```go func (s *APIServer) writeRawConfigResponse(w http.ResponseWriter, data []byte, format config.ConfigFormat, meta *configMetadata) { if meta != nil { w.Header().Set("X-Serviceradar-Kv-Key", meta.KVKey) if meta.KVStoreID != "" { w.Header().Set("X-Serviceradar-Kv-Store", meta.KVStoreID) } if meta.Revision != 0 { w.Header().Set("X-Serviceradar-Kv-Revision", fmt.Sprintf("%d", meta.Revision)) } if meta.Origin != "" { w.Header().Set("X-Serviceradar-Config-Origin", meta.Origin) } if meta.LastWriter != "" { w.Header().Set("X-Serviceradar-Config-Writer", meta.LastWriter) } if meta.UpdatedAt != nil { w.Header().Set("X-Serviceradar-Config-Updated-At", meta.UpdatedAt.Format(time.RFC3339)) } } switch format { ... (clipped 6 lines) ``` </details></details></td></tr> <tr><td><details><summary><strong>Insecure file write </strong></summary><br> <b>Description:</b> The tool writes KV-fetched configuration to a filesystem path which may be <br>attacker-controllable via CLI without permission checks, risking overwriting sensitive <br>files if run with elevated privileges.<br> <strong><a href='https://github.com/carverauto/serviceradar/pull/1926/files#diff-bc6eeb1b05bcb9179525e32fac1de9926b5823ec3504be546ab10c5c9740f544R150-R174'>main.go [150-174]</a></strong><br> <details open><summary>Referred Code</summary> ```go func fetchConfig(ctx context.Context, client *kvgrpc.Client, key string) ([]byte, error) { if client == nil { return nil, nil } data, found, err := client.Get(ctx, key) if err != nil { return nil, fmt.Errorf("kv get %q failed: %w", key, err) } if !found { return nil, nil } return data, nil } func writeOutput(path string, data []byte) error { if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil { return err } tmp := path + ".tmp" if err := os.WriteFile(tmp, data, 0o640); err != nil { ... (clipped 4 lines) ``` </details></details></td></tr> <tr><td><details><summary><strong>Insecure default seeding </strong></summary><br> <b>Description:</b> Seeding configuration from embedded templates into KV on GET when a key is missing can let <br>an unauthenticated or overly-permitted caller initialize configuration state if route <br>protection is misconfigured, leading to privilege or config injection.<br> <strong><a href='https://github.com/carverauto/serviceradar/pull/1926/files#diff-61eb09279471ca59a1b57a306f0ad05988a8cb39eca163b32bfb86c53f8a553eR826-R896'>auth.go [826-896]</a></strong><br> <details open><summary>Referred Code</summary> ```go func (s *APIServer) checkConfigKey(ctx context.Context, kvStoreID, key string) (bool, error) { if key == "" { return false, errors.New("kv key not specified") } resolvedKey := s.qualifyKVKey(kvStoreID, key) entry, err := s.getKVEntry(ctx, kvStoreID, resolvedKey) if err != nil { return false, err } return entry != nil && entry.Found, nil } func (s *APIServer) seedConfigFromTemplate(ctx context.Context, desc config.ServiceDescriptor, key, kvStoreID string) ([]byte, error) { tmpl, ok := serviceTemplates[desc.Name] if !ok || len(tmpl.data) == 0 { return nil, errTemplateUnavailable } payload := make([]byte, len(tmpl.data)) ... (clipped 50 lines) ``` </details></details></td></tr> <tr><td colspan='2'><strong>Ticket Compliance</strong></td></tr> <tr><td>⚪</td><td><details><summary>🎫 <strong>No ticket provided </summary></strong> - [ ] Create ticket/issue <!-- /create_ticket --create_ticket=true --> </details></td></tr> <tr><td colspan='2'><strong>Codebase Duplication Compliance</strong></td></tr> <tr><td>⚪</td><td><details><summary><strong>Codebase context is not defined </strong></summary> Follow the <a href='https://qodo-merge-docs.qodo.ai/core-abilities/rag_context_enrichment/'>guide</a> to enable codebase context checks. </details></td></tr> <tr><td colspan='2'><strong>Custom Compliance</strong></td></tr> <tr><td rowspan=2>🟢</td><td> <details><summary><strong>Generic: Meaningful Naming and Self-Documenting Code</strong></summary><br> **Objective:** Ensure all identifiers clearly express their purpose and intent, making code <br>self-documenting<br> **Status:** Passed<br> </details></td></tr> <tr><td> <details><summary><strong>Generic: Secure Error Handling</strong></summary><br> **Objective:** To prevent the leakage of sensitive system information through error messages while <br>providing sufficient detail for internal debugging.<br> **Status:** Passed<br> </details></td></tr> <tr><td rowspan=4>⚪</td> <td><details> <summary><strong>Generic: Comprehensive Audit Trails</strong></summary><br> **Objective:** To create a detailed and reliable record of critical system actions for security analysis <br>and compliance.<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/1926/files#diff-61eb09279471ca59a1b57a306f0ad05988a8cb39eca163b32bfb86c53f8a553eR111-R896'><strong>Missing Audit Logs</strong></a>: New admin config endpoints perform critical actions (reading/updating/seeding <br>configurations and KV writes) without explicit audit logging of user, action, key, and <br>outcome in the added code.<br> <details open><summary>Referred Code</summary> ```go // @Summary List managed config descriptors // @Description Returns metadata about known service configurations // @Tags Admin // @Produce json // @Success 200 {array} map[string]string // @Router /api/admin/config [get] func (s *APIServer) handleListConfigDescriptors(w http.ResponseWriter, r *http.Request) { descs := config.ServiceDescriptors() resp := make([]configDescriptorResponse, 0, len(descs)) for _, desc := range descs { resp = append(resp, configDescriptorResponse{ Name: desc.Name, DisplayName: desc.DisplayName, ServiceType: desc.ServiceType, Scope: string(desc.Scope), KVKey: desc.KVKey, KVKeyTemplate: desc.KVKeyTemplate, Format: string(desc.Format), RequiresAgent: desc.Scope == config.ConfigScopeAgent && desc.KVKeyTemplate != "", }) } ... (clipped 765 lines) ``` </details></details></td></tr> <tr><td><details> <summary><strong>Generic: Robust Error Handling and Edge Case Management</strong></summary><br> **Objective:** Ensure comprehensive error handling that provides meaningful context and graceful <br>degradation<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/1926/files#diff-61eb09279471ca59a1b57a306f0ad05988a8cb39eca163b32bfb86c53f8a553eR139-R896'><strong>Incomplete Edge Handling</strong></a>: Several new handlers and helpers return generic 500s on KV/template failures without <br>differentiating transient vs persistent errors or validating query inputs beyond presence <br>which may hinder graceful degradation.<br> <details open><summary>Referred Code</summary> ```go // @Summary Configuration coverage status // @Description Checks whether every known descriptor has an initialized KV entry // @Tags Admin // @Produce json // @Param kvStore query string false "Optional KV store identifier (defaults to local hub store)" // @Success 200 {object} configStatusResponse "All descriptors have KV entries" // @Failure 503 {object} configStatusResponse "One or more descriptors missing KV entries" // @Failure 502 {object} configStatusResponse "KV lookups failed" // @Router /api/admin/config/status [get] func (s *APIServer) handleConfigStatus(w http.ResponseWriter, r *http.Request) { if s == nil { http.Error(w, "server not initialized", http.StatusInternalServerError) return } kvStoreID := r.URL.Query().Get("kv_store_id") if kvStoreID == "" { kvStoreID = r.URL.Query().Get("kvStore") } descs := config.ServiceDescriptors() ... (clipped 737 lines) ``` </details></details></td></tr> <tr><td><details> <summary><strong>Generic: Secure Logging Practices</strong></summary><br> **Objective:** To ensure logs are useful for debugging and auditing without exposing sensitive <br>information like PII, PHI, or cardholder data.<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/1926/files#diff-61eb09279471ca59a1b57a306f0ad05988a8cb39eca163b32bfb86c53f8a553eR241-R896'><strong>Sensitive Log Risk</strong></a>: New logging around configuration get/put and template seeding logs KV keys and may log raw <br>configuration values indirectly; ensure no secrets or PII from configs (e.g., TOML/JSON <br>with tokens) are emitted at any level.<br> <details open><summary>Referred Code</summary> ```go if err := s.encodeJSONResponse(w, infos); err != nil { s.logger.Error().Err(err).Msg("failed to encode watcher response") s.writeAPIError(w, http.StatusInternalServerError, "failed to enumerate watchers") } } // @Summary Authenticate with username and password // @Description Logs in a user with username and password and returns authentication tokens // @Tags Authentication // @Accept json // @Produce json // @Param credentials body LoginCredentials true "User credentials" // @Success 200 {object} models.Token "Authentication tokens" // @Failure 400 {object} models.ErrorResponse "Invalid request" // @Failure 401 {object} models.ErrorResponse "Authentication failed" // @Failure 500 {object} models.ErrorResponse "Internal server error" // @Router /auth/login [post] func (s *APIServer) handleLocalLogin(w http.ResponseWriter, r *http.Request) { if r.Method == http.MethodOptions { w.WriteHeader(http.StatusOK) return ... (clipped 635 lines) ``` </details></details></td></tr> <tr><td><details> <summary><strong>Generic: Security-First Input Validation and Data Handling</strong></summary><br> **Objective:** Ensure all data inputs are validated, sanitized, and handled securely to prevent <br>vulnerabilities<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/1926/files#diff-61eb09279471ca59a1b57a306f0ad05988a8cb39eca163b32bfb86c53f8a553eR416-R614'><strong>Limited Validation</strong></a>: The update and get config endpoints accept query/body inputs (keys, agent_id, JSON/TOML) <br>with minimal validation and no explicit sanitization of configuration content before <br>storage beyond format hints.<br> <details open><summary>Referred Code</summary> ```go // @Failure 500 {object} models.ErrorResponse "Internal server error" // @Router /api/admin/config/{service} [get] func (s *APIServer) handleGetConfig(w http.ResponseWriter, r *http.Request) { ctx := r.Context() vars := mux.Vars(r) service := vars["service"] key := r.URL.Query().Get("key") agentID := r.URL.Query().Get("agent_id") serviceType := r.URL.Query().Get("service_type") kvStoreID := r.URL.Query().Get("kv_store_id") if kvStoreID == "" { kvStoreID = r.URL.Query().Get("kvStore") } var desc config.ServiceDescriptor var hasDescriptor bool if key == "" { var err error desc, key, hasDescriptor, err = s.resolveServiceKey(service, serviceType, agentID) if err != nil { ... (clipped 178 lines) ``` </details></details></td></tr> <tr><td align="center" colspan="2"> <!-- /compliance --update_compliance=true --> </td></tr></tbody></table> </details>
qodo-code-review[bot] commented 2025-11-08 17:06:07 +00:00 (Migrated from github.com)
Author
Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1926#issuecomment-3506721680
Original created: 2025-11-08T17:06:07Z

PR Code Suggestions

Latest suggestions up to af6b19d

CategorySuggestion                                                                                                                                    Impact
Possible issue
Use correct gRPC dial function

Replace the non-existent grpc.NewClient with grpc.DialContext to correctly
establish a gRPC connection and fix a compilation error.

cmd/tools/kv-sweep/main.go [470-474]

-conn, err := grpc.NewClient(address, dialOpts...)
+conn, err := grpc.DialContext(ctx, address, dialOpts...)
 if err != nil {
 	closeProvider()
 	return nil, nil, err
 }

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies that grpc.NewClient is not a valid function for creating a gRPC connection, which would cause a compilation error, and provides the correct grpc.DialContext function as a fix.

High
Avoid KV key collisions

Update the KVKeyTemplate for poller-icmp, poller-port, and poller-process to be
unique for each service, preventing configuration overwrites.

pkg/config/registry.go [279-301]

 "poller-icmp": {
 	Name:          "poller-icmp",
 	DisplayName:   "ICMP Check",
 	ServiceType:   "icmp",
 	Scope:         ConfigScopePoller,
-	KVKeyTemplate: "config/pollers/{{poller_id}}.json",
+	KVKeyTemplate: "config/pollers/{{poller_id}}/icmp.json",
 	Format:        ConfigFormatJSON,
 },
 "poller-port": {
 	Name:          "poller-port",
 	DisplayName:   "Port Check",
 	ServiceType:   "port",
 	Scope:         ConfigScopePoller,
-	KVKeyTemplate: "config/pollers/{{poller_id}}.json",
+	KVKeyTemplate: "config/pollers/{{poller_id}}/port.json",
 	Format:        ConfigFormatJSON,
 },
 "poller-process": {
 	Name:          "poller-process",
 	DisplayName:   "Process Check",
 	ServiceType:   "process",
 	Scope:         ConfigScopePoller,
-	KVKeyTemplate: "config/pollers/{{poller_id}}.json",
+	KVKeyTemplate: "config/pollers/{{poller_id}}/process.json",
 	Format:        ConfigFormatJSON,
 },

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies a critical bug where multiple poller service descriptors use the same KVKeyTemplate, which would cause them to overwrite each other's configurations.

High
Validate SPIFFE env requirements

When KV_SEC_MODE=spiffe, validate that the KV_TRUST_DOMAIN and
KV_SERVER_SPIFFE_ID environment variables are set. This ensures a clear, early
failure instead of a confusing runtime error.

pkg/agent/server.go [269-310]

 func kvSecurityFromEnv() (*models.SecurityConfig, error) {
 	mode := strings.ToLower(strings.TrimSpace(os.Getenv("KV_SEC_MODE")))
 	switch mode {
 	case "":
 		return nil, nil
 	case "spiffe":
 		socket := strings.TrimSpace(os.Getenv("KV_WORKLOAD_SOCKET"))
 		if socket == "" {
 			socket = "unix:/run/spire/sockets/agent.sock"
 		}
+		trustDomain := strings.TrimSpace(os.Getenv("KV_TRUST_DOMAIN"))
+		serverSPIFFE := strings.TrimSpace(os.Getenv("KV_SERVER_SPIFFE_ID"))
+		if trustDomain == "" || serverSPIFFE == "" {
+			return nil, fmt.Errorf("KV_SEC_MODE=spiffe requires KV_TRUST_DOMAIN and KV_SERVER_SPIFFE_ID")
+		}
 
 		return &models.SecurityConfig{
 			Mode:           "spiffe",
 			CertDir:        strings.TrimSpace(os.Getenv("KV_CERT_DIR")),
 			Role:           models.RoleAgent,
-			TrustDomain:    strings.TrimSpace(os.Getenv("KV_TRUST_DOMAIN")),
-			ServerSPIFFEID: strings.TrimSpace(os.Getenv("KV_SERVER_SPIFFE_ID")),
+			TrustDomain:    trustDomain,
+			ServerSPIFFEID: serverSPIFFE,
 			WorkloadSocket: socket,
 		}, nil
 	case "mtls":
 		cert := strings.TrimSpace(os.Getenv("KV_CERT_FILE"))
 		key := strings.TrimSpace(os.Getenv("KV_KEY_FILE"))
 		ca := strings.TrimSpace(os.Getenv("KV_CA_FILE"))
 		if cert == "" || key == "" || ca == "" {
 			return nil, errKVMtlsEnvMissing
 		}
 
 		return &models.SecurityConfig{
 			Mode:       "mtls",
 			CertDir:    strings.TrimSpace(os.Getenv("KV_CERT_DIR")),
 			Role:       models.RoleAgent,
 			ServerName: strings.TrimSpace(os.Getenv("KV_SERVER_NAME")),
 			TLS: models.TLSConfig{
 				CertFile: cert,
 				KeyFile:  key,
 				CAFile:   ca,
 			},
 		}, nil
 	default:
 		return nil, fmt.Errorf("%w %q", errUnsupportedKVSecMode, mode)
 	}
 }

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 8

__

Why: This suggestion improves robustness by adding validation for required environment variables (KV_TRUST_DOMAIN, KV_SERVER_SPIFFE_ID) when using SPIFFE authentication, ensuring the application fails fast with a clear error message.

Medium
Bound KV get with timeout

Add a timeout to the context for the KV Get operation to prevent hangs and
resource leaks if the KV store is slow or unavailable. This improves API server
resilience.

pkg/core/api/server.go [148-176]

 // Default kvGetFn dials KV and performs a Get
 s.kvGetFn = func(ctx context.Context, key string) ([]byte, bool, uint64, error) {
 	if s.kvAddress == "" {
 		return nil, false, 0, ErrKVAddressNotConfigured
 	}
+
+	// Bound the KV operation to avoid hangs and ensure timely cleanup.
+	callCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
+	defer cancel()
+
 	clientCfg := coregrpc.ClientConfig{Address: s.kvAddress, MaxRetries: 3, Logger: s.logger, DisableTelemetry: true}
 	var provider coregrpc.SecurityProvider
 	if s.kvSecurity != nil {
 		sec := *s.kvSecurity
 		sec.Role = models.RolePoller
 		var err error
-		provider, err = coregrpc.NewSecurityProvider(ctx, &sec, s.logger)
+		provider, err = coregrpc.NewSecurityProvider(callCtx, &sec, s.logger)
 		if err != nil {
 			return nil, false, 0, err
 		}
 		defer func() { _ = provider.Close() }()
 		clientCfg.SecurityProvider = provider
 	}
-	c, err := coregrpc.NewClient(ctx, clientCfg)
+
+	c, err := coregrpc.NewClient(callCtx, clientCfg)
 	if err != nil {
 		return nil, false, 0, err
 	}
 	defer func() { _ = c.Close() }()
+
 	kv := proto.NewKVServiceClient(c.GetConnection())
-	resp, err := kv.Get(ctx, &proto.GetRequest{Key: key})
+	resp, err := kv.Get(callCtx, &proto.GetRequest{Key: key})
 	if err != nil {
 		return nil, false, 0, err
 	}
 	return resp.Value, resp.Found, resp.Revision, nil
 }

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies a potential for long-running requests by adding a timeout to the KV client call, which improves the API server's resilience against a slow or unresponsive KV store.

Medium
Security
Normalize and dedupe kv_store_id

Refactor the buildUpstreamUrl function to explicitly reconstruct query
parameters, normalizing kvStore to kv_store_id and preventing duplicate
parameter issues.

web/src/app/api/admin/config/proxy.ts [16-36]

 function buildUpstreamUrl(req: NextRequest, service: string): string {
   const normalized = service.trim().toLowerCase();
   if (!/^[a-z0-9-]+$/.test(normalized)) {
     throw new TypeError("bad-service");
   }
 
-  const params = new URLSearchParams(req.nextUrl.searchParams);
+  const incoming = req.nextUrl.searchParams;
+  const params = new URLSearchParams();
 
-  // Normalize legacy kvStore query parameter if present.
-  if (!params.has("kv_store_id") && params.has("kvStore")) {
-    const legacy = params.get("kvStore");
-    if (legacy) {
-      params.set("kv_store_id", legacy);
-    }
-    params.delete("kvStore");
+  // Copy all params except kv_store_id/kvStore; we will normalize those explicitly.
+  for (const [k, v] of incoming.entries()) {
+    if (k.toLowerCase() === "kv_store_id" || k === "kvStore") continue;
+    params.append(k, v);
+  }
+
+  // Normalize legacy kvStore to kv_store_id with a single value.
+  let kvStoreID = incoming.get("kv_store_id");
+  if (!kvStoreID && incoming.has("kvStore")) {
+    kvStoreID = incoming.get("kvStore") || undefined;
+  }
+  if (kvStoreID) {
+    params.set("kv_store_id", kvStoreID);
   }
 
   const query = params.toString();
   const base = `${getInternalApiUrl()}/api/admin/config/${normalized}`;
   return query ? `${base}?${query}` : base;
 }

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 8

__

Why: This is a valuable security hardening suggestion that prevents query parameter pollution by explicitly sanitizing and rebuilding the upstream URL's query string.

Medium
Validate required SPIFFE env vars

Add validation to ensure CORE_TRUST_DOMAIN and CORE_SERVER_SPIFFE_ID environment
variables are not empty when using SPIFFE security mode.

pkg/config/bootstrap/core_client.go [52-95]

 func CoreSecurityProviderFromEnv(ctx context.Context, role models.ServiceRole, log logger.Logger) (coregrpc.SecurityProvider, error) {
 	mode := strings.ToLower(strings.TrimSpace(os.Getenv("CORE_SEC_MODE")))
 	switch mode {
 	case "", "none":
 		return nil, nil
 	case "spiffe":
 		workloadSocket := strings.TrimSpace(os.Getenv("CORE_WORKLOAD_SOCKET"))
 		if workloadSocket == "" {
 			workloadSocket = "unix:/run/spire/sockets/agent.sock"
 		}
 
+		trustDomain := strings.TrimSpace(os.Getenv("CORE_TRUST_DOMAIN"))
+		serverID := strings.TrimSpace(os.Getenv("CORE_SERVER_SPIFFE_ID"))
+		if trustDomain == "" || serverID == "" {
+			return nil, fmt.Errorf("spiffe mode requires CORE_TRUST_DOMAIN and CORE_SERVER_SPIFFE_ID")
+		}
+
 		conf := &models.SecurityConfig{
 			Mode:           "spiffe",
 			CertDir:        strings.TrimSpace(os.Getenv("CORE_CERT_DIR")),
 			Role:           role,
-			TrustDomain:    strings.TrimSpace(os.Getenv("CORE_TRUST_DOMAIN")),
-			ServerSPIFFEID: strings.TrimSpace(os.Getenv("CORE_SERVER_SPIFFE_ID")),
+			TrustDomain:    trustDomain,
+			ServerSPIFFEID: serverID,
 			WorkloadSocket: workloadSocket,
 		}
 		return coregrpc.NewSecurityProvider(ctx, conf, log)
 	case "mtls":
 		cert := strings.TrimSpace(os.Getenv("CORE_CERT_FILE"))
 		key := strings.TrimSpace(os.Getenv("CORE_KEY_FILE"))
 		ca := strings.TrimSpace(os.Getenv("CORE_CA_FILE"))
 		if cert == "" || key == "" || ca == "" {
 			return nil, errCoreMTLSConfig
 		}
 
 		conf := &models.SecurityConfig{
 			Mode:       "mtls",
 			CertDir:    strings.TrimSpace(os.Getenv("CORE_CERT_DIR")),
 			Role:       role,
 			ServerName: strings.TrimSpace(os.Getenv("CORE_SERVER_NAME")),
 			TLS: models.TLSConfig{
 				CertFile: cert,
 				KeyFile:  key,
 				CAFile:   ca,
 			},
 		}
 		return coregrpc.NewSecurityProvider(ctx, conf, log)
 	default:
 		return nil, fmt.Errorf("%w %q", errUnsupportedCoreSecMode, mode)
 	}
 }

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies that missing SPIFFE environment variables can lead to runtime failures and adds validation to fail fast with a clear error message.

Medium
General
Validate watcher snapshot fields

Add validation to ensure the Status and KVKey fields of a WatcherInfo are not
empty before publishing a snapshot to the key-value store.

pkg/config/watchers_snapshot.go [33-51]

 func (m *KVManager) PublishWatcherSnapshot(ctx context.Context, info WatcherInfo) error {
 	if m == nil || m.client == nil {
 		return errKVClientUnavailable
+	}
+	if info.Status == "" {
+		return fmt.Errorf("watcher status is required")
+	}
+	if strings.TrimSpace(info.KVKey) == "" {
+		return fmt.Errorf("watcher KV key is required")
 	}
 	key, err := WatcherSnapshotKey(info.Service, info.InstanceID)
 	if err != nil {
 		return err
 	}
 
 	payload, err := json.Marshal(WatcherSnapshot{
 		WatcherInfo: info,
 		UpdatedAt:   time.Now().UTC(),
 	})
 	if err != nil {
 		return err
 	}
 
 	return m.client.Put(ctx, key, payload, watcherSnapshotTTL)
 }

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 6

__

Why: The suggestion improves data integrity by adding validation for required fields before publishing a watcher snapshot, preventing incomplete records from being stored in the KV.

Low
Write explicit HTTP status

Add an explicit w.WriteHeader(http.StatusOK) call in writeRawConfigResponse
before writing the response body to improve clarity and robustness.

pkg/core/api/auth.go [819-850]

 func (s *APIServer) writeRawConfigResponse(w http.ResponseWriter, data []byte, format config.ConfigFormat, meta *configMetadata) {
 	if meta != nil {
 		w.Header().Set("X-Serviceradar-Kv-Key", meta.KVKey)
 		if meta.KVStoreID != "" {
 			w.Header().Set("X-Serviceradar-Kv-Store", meta.KVStoreID)
 		}
 		if meta.Revision != 0 {
 			w.Header().Set("X-Serviceradar-Kv-Revision", fmt.Sprintf("%d", meta.Revision))
 		}
 		if meta.Origin != "" {
 			w.Header().Set("X-Serviceradar-Config-Origin", meta.Origin)
 		}
 		if meta.LastWriter != "" {
 			w.Header().Set("X-Serviceradar-Config-Writer", meta.LastWriter)
 		}
 		if meta.UpdatedAt != nil {
 			w.Header().Set("X-Serviceradar-Config-Updated-At", meta.UpdatedAt.Format(time.RFC3339))
 		}
 	}
 
 	switch format {
 	case config.ConfigFormatTOML:
 		w.Header().Set("Content-Type", "text/plain; charset=utf-8")
 	case config.ConfigFormatJSON:
 		w.Header().Set("Content-Type", "application/json")
 	default:
 		w.Header().Set("Content-Type", "application/json")
 	}
+	// Ensure an explicit status code is sent before writing the body.
+	w.WriteHeader(http.StatusOK)
 	if _, err := w.Write(data); err != nil && s != nil && s.logger != nil {
 		s.logger.Warn().Err(err).Msg("failed to write raw config response")
 	}
 }

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 4

__

Why: While Go's HTTP server implicitly sends a 200 OK status, explicitly calling WriteHeader is a good practice for clarity and robustness, especially in environments with proxies or middleware.

Low
  • More

Previous suggestions

Suggestions up to commit 15f83a6
CategorySuggestion                                                                                                                                    Impact
Possible issue
Use correct gRPC dial API

Replace the non-existent grpc.NewClient with the correct grpc.DialContext
function to fix a compilation error when creating a gRPC client connection.

pkg/config/bootstrap/template.go [202-206]

-conn, err := grpc.NewClient(coreAddr, dialOpts...)
+conn, err := grpc.DialContext(regCtx, coreAddr, dialOpts...)
 if err != nil {
 	closeProvider()
 	return fmt.Errorf("%w: %w", errTemplateRegistration, err)
 }
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies that grpc.NewClient is not a function in the standard google.golang.org/grpc package, and replacing it with grpc.DialContext fixes a compilation error.

High
Avoid nil kvStore panic
Suggestion Impact:The commit added a c.kvStore != nil check around the Put call and introduced a debug log when kvStore is nil, preventing a nil pointer dereference.

code diff:

+		if c.kvStore != nil {
+			if normalized, err := json.Marshal(over); err == nil {
+				if err := c.kvStore.Put(ctx, key, normalized, 0); err != nil && c.logger != nil {
+					c.logger.Warn().
+						Err(err).
+						Str("key", key).
+						Msg("failed to rewrite normalized KV entry")
+				}
+			}
+		} else if c.logger != nil {
+			c.logger.Debug().
+				Str("key", key).
+				Msg("skipping KV normalization rewrite; kvStore not configured")

Add a nil check for c.kvStore before attempting to write back the normalized
configuration to prevent a potential panic.

pkg/config/config.go [211-220]

 if normalizeOverlayTypes(over, base) {
-	if normalized, err := json.Marshal(over); err == nil {
-		if err := c.kvStore.Put(ctx, key, normalized, 0); err != nil && c.logger != nil {
-			c.logger.Warn().
-				Err(err).
-				Str("key", key).
-				Msg("failed to rewrite normalized KV entry")
+	if c.kvStore != nil {
+		if normalized, err := json.Marshal(over); err == nil {
+			if err := c.kvStore.Put(ctx, key, normalized, 0); err != nil && c.logger != nil {
+				c.logger.Warn().
+					Err(err).
+					Str("key", key).
+					Msg("failed to rewrite normalized KV entry")
+			}
 		}
+	} else if c.logger != nil {
+		c.logger.Debug().
+			Str("key", key).
+			Msg("skipping KV normalization rewrite; kvStore not configured")
 	}
 }
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies a potential nil pointer dereference on c.kvStore and provides a check to prevent a panic, which is a significant bug fix.

Medium
Harden error response writing
Suggestion Impact:The commit updated writeAPIError to early-return without writing a body for StatusNoContent and StatusNotModified, and it deletes Content-Type before setting it, matching the suggestion's intent.

code diff:

+	switch status {
+	case http.StatusNoContent, http.StatusNotModified:
+		w.WriteHeader(status)
+		return
+	}
+
+	w.Header().Del("Content-Type")
 	w.Header().Set("Content-Type", "application/json")
 	w.WriteHeader(status)
+
 	errResp := models.ErrorResponse{

Modify writeAPIError to avoid writing a response body for HTTP statuses like 204
No Content and 304 Not Modified, which must not include one.

pkg/core/api/auth.go [852-862]

 func (s *APIServer) writeAPIError(w http.ResponseWriter, status int, message string) {
+	// Do not attempt to write a body for statuses that must not include one.
+	switch status {
+	case http.StatusNoContent, http.StatusNotModified:
+		w.WriteHeader(status)
+		return
+	}
+
+	// Set content type explicitly before writing status.
+	// Some proxies may have already set content-type; overwrite safely.
+	w.Header().Del("Content-Type")
 	w.Header().Set("Content-Type", "application/json")
 	w.WriteHeader(status)
+
 	errResp := models.ErrorResponse{
 		Message: message,
 		Status:  status,
 	}
 	if err := json.NewEncoder(w).Encode(errResp); err != nil && s != nil && s.logger != nil {
 		s.logger.Warn().Err(err).Msg("failed to encode error response")
 	}
 }
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies that writing a body for status codes like 204 or 304 violates the HTTP protocol, and the proposed change fixes this, improving the robustness of the new writeAPIError helper function.

Medium
Reject numeric duration inputs
Suggestion Impact:The commit implemented rejecting numeric jwt_expiration inputs, preferred string parsing with TrimSpace, and returned specific errors for invalid formats, aligning with the suggestion’s intent.

code diff:

 import (
 	"encoding/json"
+	"errors"
+	"strings"
 	"time"
+)
+
+var (
+	errJWTDurationString = errors.New("jwt_expiration must be a duration string (e.g., \"24h\")")
+	errJWTInvalidFormat  = errors.New("invalid jwt_expiration format")
 )
 
 // User contains information about an authenticated user.
@@ -111,8 +118,10 @@
 		return nil
 	}
 
+	// Prefer string durations like "24h".
 	var expStr string
 	if err := json.Unmarshal(aux.JWTExpiration, &expStr); err == nil {
+		expStr = strings.TrimSpace(expStr)
 		if expStr == "" {
 			a.JWTExpiration = 0
 			return nil
@@ -125,13 +134,13 @@
 		return nil
 	}
 
-	var expNumber float64
-	if err := json.Unmarshal(aux.JWTExpiration, &expNumber); err == nil {
-		a.JWTExpiration = time.Duration(expNumber)
-		return nil
+	// Reject numeric inputs to avoid ambiguous nanosecond interpretation.
+	var numericProbe float64
+	if err := json.Unmarshal(aux.JWTExpiration, &numericProbe); err == nil {
+		return errJWTDurationString
 	}
 
-	return json.Unmarshal(aux.JWTExpiration, &expStr)
+	return errJWTInvalidFormat
 }

Modify the UnmarshalJSON method for AuthConfig to reject numeric values for
jwt_expiration, enforcing the use of duration strings to prevent silent
misconfiguration.

pkg/models/auth.go [97-135]

 func (a *AuthConfig) UnmarshalJSON(data []byte) error {
 	type Alias AuthConfig
 	aux := &struct {
 		*Alias
 		JWTExpiration json.RawMessage `json:"jwt_expiration"`
 	}{
 		Alias: (*Alias)(a),
 	}
 
 	if err := json.Unmarshal(data, &aux); err != nil {
 		return err
 	}
 
 	if len(aux.JWTExpiration) == 0 {
 		return nil
 	}
 
+	// Prefer string durations like "24h"
 	var expStr string
 	if err := json.Unmarshal(aux.JWTExpiration, &expStr); err == nil {
+		expStr = strings.TrimSpace(expStr)
 		if expStr == "" {
 			a.JWTExpiration = 0
 			return nil
 		}
 		dur, err := time.ParseDuration(expStr)
 		if err != nil {
 			return err
 		}
 		a.JWTExpiration = dur
 		return nil
 	}
 
-	var expNumber float64
-	if err := json.Unmarshal(aux.JWTExpiration, &expNumber); err == nil {
-		a.JWTExpiration = time.Duration(expNumber)
+	// Reject numeric inputs to avoid ambiguous nanosecond interpretation
+	var dummy float64
+	if err := json.Unmarshal(aux.JWTExpiration, &dummy); err == nil {
+		return fmt.Errorf("jwt_expiration must be a duration string (e.g., \"24h\")")
+	}
+
+	// As a last resort, try parsing again as string
+	if err := json.Unmarshal(aux.JWTExpiration, &expStr); err == nil {
+		dur, err := time.ParseDuration(expStr)
+		if err != nil {
+			return err
+		}
+		a.JWTExpiration = dur
 		return nil
 	}
-
-	return json.Unmarshal(aux.JWTExpiration, &expStr)
+	return fmt.Errorf("invalid jwt_expiration format")
 }
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies that parsing a numeric JSON value for jwt_expiration can lead to misinterpretation as nanoseconds, and proposes a stricter validation that improves configuration robustness.

Medium
Nil-safe KV client cleanup
Suggestion Impact:The commit added a conditional nil check around closeFn before deferring it, matching the suggestion.

code diff:

+		if closeFn != nil {
+			defer closeFn()
+		}

In getKVEntry, add a nil check for closeFn before deferring its call to prevent
a potential panic.

pkg/core/api/auth.go [898-926]

 func (s *APIServer) getKVEntry(ctx context.Context, kvStoreID, key string) (*kvEntry, error) {
 	if key == "" {
 		return nil, errKVKeyNotSpecified
 	}
 
 	if kvStoreID != "" {
 		kvClient, closeFn, err := s.getKVClient(ctx, kvStoreID)
 		if err != nil {
 			return nil, err
 		}
-		defer closeFn()
+		if closeFn != nil {
+			defer closeFn()
+		}
 
 		resp, err := kvClient.Get(ctx, &proto.GetRequest{Key: key})
 		if err != nil {
 			return nil, err
 		}
 		return &kvEntry{Value: resp.GetValue(), Found: resp.GetFound(), Revision: resp.GetRevision()}, nil
 	}
 
 	if s.kvGetFn == nil {
 		return nil, ErrKVAddressNotConfigured
 	}
 
 	value, found, revision, err := s.kvGetFn(ctx, key)
 	if err != nil {
 		return nil, err
 	}
 	return &kvEntry{Value: value, Found: found, Revision: revision}, nil
 }
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly points out a potential nil pointer dereference if closeFn is nil. While the implementation of getKVClient might prevent this, adding a nil check is a good defensive programming practice.

Low
Security
Strip hop-by-hop proxy headers
Suggestion Impact:The commit added logic to remove hop-by-hop headers from forwarded requests and adjusted Content-Type handling to only set it for PUT requests based on incoming content-type.

code diff:

+  // Remove hop-by-hop headers that must not be forwarded.
+  const hopByHopHeaders = [
+    "connection",
+    "upgrade",
+    "transfer-encoding",
+    "proxy-connection",
+    "keep-alive",
+    "te",
+    "trailer",
+  ];
+  for (const header of hopByHopHeaders) {
+    headers.delete(header);
+  }
+
+  if (method === "PUT") {
+    const contentType = req.headers.get("content-type");
+    if (contentType) {
+      headers.set("Content-Type", contentType);
+    }
   }

Prevent potential request smuggling vulnerabilities by stripping hop-by-hop
headers (e.g., connection, upgrade) before forwarding requests to the upstream
API.

web/src/app/api/admin/config/proxy.ts [37-81]

 async function forwardRequest(
   method: Method,
   req: NextRequest,
   upstreamUrl: string,
   authHeader: string,
 ): Promise<NextResponse> {
   const headers = new Headers({
     "X-API-Key": getApiKey(),
     Authorization: authHeader,
   });
-  const contentType = req.headers.get("Content-Type");
-  if (contentType) {
-    headers.set("Content-Type", contentType);
+
+  // Drop hop-by-hop headers to avoid proxy issues
+  const hopByHop = [
+    "connection",
+    "upgrade",
+    "transfer-encoding",
+    "proxy-connection",
+    "keep-alive",
+    "te",
+    "trailer",
+  ];
+  for (const h of hopByHop) {
+    // Ensure they are not propagated
+    headers.delete(h);
+  }
+
+  // Only set Content-Type if we actually send a body
+  if (method === "PUT") {
+    const ct = req.headers.get("content-type");
+    if (ct) headers.set("Content-Type", ct);
   }
 
   const init: RequestInit = {
     method,
     headers,
-    body: undefined,
   };
 
   if (method === "PUT") {
     init.body = await req.text();
   }
 
   try {
     const resp = await fetch(upstreamUrl, init);
     const body = await resp.text();
     const responseHeaders = new Headers();
     const respContentType = resp.headers.get("content-type");
     if (respContentType) {
       responseHeaders.set("Content-Type", respContentType);
     }
     return new NextResponse(body, {
       status: resp.status,
       headers: responseHeaders,
     });
   } catch (error) {
     console.error("Admin config proxy error", error);
     return NextResponse.json(
       { error: "Failed to reach core admin config API" },
       { status: 502 },
     );
   }
 }
Suggestion importance[1-10]: 8

__

Why: The suggestion addresses a valid security concern by proposing to strip hop-by-hop headers, which helps prevent HTTP request smuggling vulnerabilities in this proxy implementation.

Medium
Harden key unsanitize parsing
Suggestion Impact:The commit updated unsanitizeSegment to require an even number of hex digits, return an error for invalid length, restrict decoded values to 0x20–0x7E with an error otherwise, write bytes instead of runes, and explicitly handle lone '='. It also introduced specific error variables for these cases.

code diff:

@@ -561,15 +563,24 @@
 		}
 
 		if j == i+1 {
+			// Lone '='; leave as-is.
 			b.WriteByte('=')
 			continue
 		}
 
-		val, err := strconv.ParseInt(seg[i+1:j], 16, 32)
+		hexRun := seg[i+1 : j]
+		if len(hexRun)%2 != 0 {
+			return "", fmt.Errorf("%w after '=' in %q", errInvalidHexEscape, seg)
+		}
+
+		val, err := strconv.ParseInt(hexRun, 16, 32)
 		if err != nil {
 			return "", err
 		}
-		b.WriteRune(rune(val))
+		if val < 0x20 || val > 0x7E {
+			return "", fmt.Errorf("%w in %q", errUnsafeASCIICharacter, seg)
+		}
+		b.WriteByte(byte(val))
 		i = j - 1

In unsanitizeSegment, add validation to ensure hex escape sequences have an even
number of digits and that the decoded character is within the printable ASCII
range.

cmd/tools/kv-sweep/main.go [543-577]

 func unsanitizeSegment(seg string) (string, error) {
 	if !strings.Contains(seg, "=") {
 		return seg, nil
 	}
 
 	var b strings.Builder
 	b.Grow(len(seg))
 
 	for i := 0; i < len(seg); i++ {
 		ch := seg[i]
 		if ch != '=' {
 			b.WriteByte(ch)
 			continue
 		}
 
 		j := i + 1
 		for j < len(seg) && isHexDigit(seg[j]) {
 			j++
 		}
 
 		if j == i+1 {
+			// Lone '='; keep as-is
 			b.WriteByte('=')
 			continue
 		}
 
-		val, err := strconv.ParseInt(seg[i+1:j], 16, 32)
+		// Require even number of hex digits to form whole bytes
+		hexRun := seg[i+1 : j]
+		if len(hexRun)%2 != 0 {
+			return "", fmt.Errorf("invalid hex escape length after '=' in %q", seg)
+		}
+
+		val, err := strconv.ParseInt(hexRun, 16, 32)
 		if err != nil {
 			return "", err
 		}
-		b.WriteRune(rune(val))
+		// Constrain to visible ASCII to avoid control characters in keys
+		if val < 0x20 || val > 0x7E {
+			return "", fmt.Errorf("decoded character out of safe ASCII range in %q", seg)
+		}
+		b.WriteByte(byte(val))
 		i = j - 1
 	}
 
 	return b.String(), nil
 }
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies that the unsanitizeSegment function could misinterpret malformed hex escape sequences. Adding validation for an even number of hex digits and restricting the output to a safe ASCII range makes the key parsing more robust and secure.

Medium
Incremental [*]
Fix provider close lifecycle
Suggestion Impact:The commit moved the deferred provider.Close() directly after creating the provider and removed the conditional explicit close in the client creation error path, fixing the double-close issue.

code diff:

+			defer func() { _ = provider.Close() }()
 			clientCfg.SecurityProvider = provider
-			defer func() { _ = provider.Close() }()
 		}
 		c, err := coregrpc.NewClient(ctx, clientCfg)
 		if err != nil {
-			if provider != nil {
-				_ = provider.Close()
-			}
 			return nil, false, 0, err

Fix a double-close bug by moving the deferred call to provider.Close() to
immediately after its creation and removing the redundant explicit close in the
error path.

pkg/core/api/server.go [148-179]

 // Default kvGetFn dials KV and performs a Get
 s.kvGetFn = func(ctx context.Context, key string) ([]byte, bool, uint64, error) {
 	if s.kvAddress == "" {
 		return nil, false, 0, ErrKVAddressNotConfigured
 	}
 	clientCfg := coregrpc.ClientConfig{Address: s.kvAddress, MaxRetries: 3, Logger: s.logger, DisableTelemetry: true}
 	var provider coregrpc.SecurityProvider
 	if s.kvSecurity != nil {
 		sec := *s.kvSecurity
 		sec.Role = models.RolePoller
 		var err error
 		provider, err = coregrpc.NewSecurityProvider(ctx, &sec, s.logger)
 		if err != nil {
 			return nil, false, 0, err
 		}
+		defer func() { _ = provider.Close() }()
 		clientCfg.SecurityProvider = provider
-		defer func() { _ = provider.Close() }()
 	}
 	c, err := coregrpc.NewClient(ctx, clientCfg)
 	if err != nil {
-		if provider != nil {
-			_ = provider.Close()
-		}
 		return nil, false, 0, err
 	}
 	defer func() { _ = c.Close() }()
 	kv := proto.NewKVServiceClient(c.GetConnection())
 	resp, err := kv.Get(ctx, &proto.GetRequest{Key: key})
 	if err != nil {
 		return nil, false, 0, err
 	}
 	return resp.Value, resp.Found, resp.Revision, nil
 }
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies a double-close bug in the original code and proposes a fix that follows the standard Go idiom for resource cleanup, making the code more robust.

Medium
Harden TOML key extraction parser
Suggestion Impact:The commit refactored extractKey to a stateful tomlKeyScanner that handles single and double-quoted strings, multiline triple-quoted strings, escape sequences, and bracket/brace nesting, and stops at comments only outside strings—fulfilling the suggested hardening.

code diff:

 func extractKey(line string) string {
-	inString := false
-	escapeNext := false
+	scanner := tomlKeyScanner{}
 	for i := 0; i < len(line); i++ {
 		ch := line[i]
-		if ch == '#' && !inString {
+
+		if scanner.shouldStopAtComment(ch) {
 			break
 		}
-		if inString {
-			if escapeNext {
-				escapeNext = false
-				continue
-			}
-			if ch == '\\' {
-				escapeNext = true
-				continue
-			}
-			if ch == '"' {
-				inString = false
-			}
-			continue
+
+		if scanner.handleMultiline(line, &i) {
+			continue
+		}
+
+		if scanner.handleSingleString(ch) {
+			continue
+		}
+
+		if scanner.handleBasicString(ch) {
+			continue
+		}
+
+		if scanner.handleBrackets(ch) {
+			continue
+		}
+
+		if ch == '=' && scanner.bracketDepth == 0 {
+			return strings.TrimSpace(line[:i])
+		}
+	}
+	return ""
+}
+
+type tomlKeyScanner struct {
+	inBasicString  bool
+	inSingleString bool
+	inMultiline    bool
+	escapeNext     bool
+	bracketDepth   int
+}
+
+func (s *tomlKeyScanner) inAnyString() bool {
+	return s.inBasicString || s.inSingleString || s.inMultiline
+}
+
+func (s *tomlKeyScanner) shouldStopAtComment(ch byte) bool {
+	return ch == '#' && !s.inAnyString()
+}
+
+func (s *tomlKeyScanner) handleMultiline(line string, idx *int) bool {
+	if s.inSingleString {
+		return false
+	}
+	if !s.inMultiline {
+		if s.inBasicString {
+			return false
+		}
+		if hasTripleQuote(line, *idx) {
+			s.inMultiline = true
+			*idx += 2
+			return true
+		}
+		return false
+	}
+	if hasTripleQuote(line, *idx) {
+		s.inMultiline = false
+		*idx += 2
+	}
+	return true
+}
+
+func (s *tomlKeyScanner) handleSingleString(ch byte) bool {
+	if s.inBasicString || s.inMultiline {
+		return false
+	}
+	if s.inSingleString {
+		if ch == '\'' {
+			s.inSingleString = false
+		}
+		return true
+	}
+	if ch == '\'' {
+		s.inSingleString = true
+		return true
+	}
+	return false
+}
+
+func (s *tomlKeyScanner) handleBasicString(ch byte) bool {
+	if s.inSingleString || s.inMultiline {
+		return false
+	}
+	if s.inBasicString {
+		if s.escapeNext {
+			s.escapeNext = false
+			return true
+		}
+		if ch == '\\' {
+			s.escapeNext = true
+			return true
 		}
 		if ch == '"' {
-			inString = true
-			continue
-		}
-		if ch == '=' {
-			return strings.TrimSpace(line[:i])
-		}
-	}
-	return ""
-}
+			s.inBasicString = false
+		}
+		return true
+	}
+	if ch == '"' {
+		s.inBasicString = true
+		return true
+	}
+	return false
+}
+
+func (s *tomlKeyScanner) handleBrackets(ch byte) bool {
+	if s.inAnyString() {
+		return false
+	}
+	switch ch {
+	case '[', '{', '(':
+		s.bracketDepth++
+		return true
+	case ']', '}', ')':
+		if s.bracketDepth > 0 {
+			s.bracketDepth--
+		}
+		return true
+	default:
+		return false
+	}
+}
+
+func hasTripleQuote(line string, idx int) bool {
+	return idx+2 < len(line) && line[idx] == '"' && line[idx+1] == '"' && line[idx+2] == '"'
+}

Harden the extractKey function's TOML parser to correctly handle various TOML
constructs like single-quoted strings, multiline strings, and inline tables.
This ensures accurate key extraction for sanitization purposes.

pkg/config/toml_mask.go [109-140]

 func extractKey(line string) string {
 	inString := false
+	inSQString := false
+	inMultiline := false
 	escapeNext := false
+	bracketDepth := 0 // counts [], {} nesting
 	for i := 0; i < len(line); i++ {
 		ch := line[i]
-		if ch == '#' && !inString {
+
+		// comment begins only when not inside a string
+		if ch == '#' && !inString && !inSQString && !inMultiline {
 			break
+		}
+
+		// handle triple-quote start/end for multiline basic strings
+		if !inSQString && !inMultiline && ch == '"' && i+2 < len(line) && line[i:i+3] == `"""` {
+			inMultiline = true
+			i += 2
+			continue
+		}
+		if inMultiline {
+			// end of multiline basic string
+			if ch == '"' && i+2 < len(line) && line[i:i+3] == `"""` {
+				inMultiline = false
+				i += 2
+			}
+			continue
+		}
+
+		// handle single-quoted strings (no escapes inside)
+		if !inString && ch == '\'' && !inSQString {
+			inSQString = true
+			continue
+		}
+		if inSQString {
+			if ch == '\'' {
+				inSQString = false
+			}
+			continue
+		}
+
+		// handle basic strings with escapes
+		if ch == '"' && !inString {
+			inString = true
+			continue
 		}
 		if inString {
 			if escapeNext {
 				escapeNext = false
 				continue
 			}
 			if ch == '\\' {
 				escapeNext = true
 				continue
 			}
 			if ch == '"' {
 				inString = false
 			}
 			continue
 		}
-		if ch == '"' {
-			inString = true
+
+		// track bracket/brace nesting to avoid '=' inside arrays/inline tables
+		if ch == '[' || ch == '{' || ch == '(' {
+			bracketDepth++
 			continue
 		}
-		if ch == '=' {
+		if ch == ']' || ch == '}' || ch == ')' {
+			if bracketDepth > 0 {
+				bracketDepth--
+			}
+			continue
+		}
+
+		// only consider '=' at top level (not in strings or nested structures)
+		if ch == '=' && bracketDepth == 0 {
 			return strings.TrimSpace(line[:i])
 		}
 	}
 	return ""
 }
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies that the simple TOML parser in extractKey is not robust and could fail to sanitize sensitive data in more complex but valid TOML files, and the proposed change makes it more resilient.

Medium
Handle encrypted/non-RSA PEM safely
Suggestion Impact:The commit added an encrypted PEM check and adjusted the RSA type assertion to assign only when ok, matching the suggestion's intent. It implemented the encrypted check via a helper isEncryptedPEMBlock instead of x509.IsEncryptedPEMBlock.

code diff:

+	if isEncryptedPEMBlock(block) {
+		return "", errUnsupportedPrivateKey
+	}
 
 	var rsaKey *rsa.PrivateKey
 
@@ -306,9 +309,9 @@
 		if err != nil {
 			return "", err
 		}
-		var ok bool
-		rsaKey, ok = key.(*rsa.PrivateKey)
-		if !ok {
+		if k, ok := key.(*rsa.PrivateKey); ok {
+			rsaKey = k
+		} else {
 			return "", errUnsupportedPrivateKey
 		}
 	case "RSA PRIVATE KEY":
@@ -326,6 +329,14 @@
 		return "", err
 	}
 	return string(pem.EncodeToMemory(&pem.Block{Type: "PUBLIC KEY", Bytes: pubDER})), nil
+}
+
+func isEncryptedPEMBlock(block *pem.Block) bool {
+	if block == nil {
+		return false
+	}
+	procType := block.Headers["Proc-Type"]
+	return strings.EqualFold(procType, "4,ENCRYPTED")

Improve derivePublicKeyPEM to explicitly check for and reject encrypted PEM
blocks and non-RSA keys, providing clearer error messages for unsupported key
formats.

pkg/config/kv_client.go [295-329]

 func derivePublicKeyPEM(privatePEM string) (string, error) {
 	block, _ := pem.Decode([]byte(privatePEM))
 	if block == nil {
 		return "", errDecodePrivateKeyPEM
+	}
+	// Encrypted PEMs are not supported here; surface a clear error
+	if x509.IsEncryptedPEMBlock(block) {
+		return "", errUnsupportedPrivateKey
 	}
 
 	var rsaKey *rsa.PrivateKey
 
 	switch block.Type {
 	case "PRIVATE KEY":
 		key, err := x509.ParsePKCS8PrivateKey(block.Bytes)
 		if err != nil {
 			return "", err
 		}
-		var ok bool
-		rsaKey, ok = key.(*rsa.PrivateKey)
-		if !ok {
+		if k, ok := key.(*rsa.PrivateKey); ok {
+			rsaKey = k
+		} else {
 			return "", errUnsupportedPrivateKey
 		}
 	case "RSA PRIVATE KEY":
 		key, err := x509.ParsePKCS1PrivateKey(block.Bytes)
 		if err != nil {
 			return "", err
 		}
 		rsaKey = key
 	default:
 		return "", errUnsupportedPrivateKey
 	}
 
 	pubDER, err := x509.MarshalPKIXPublicKey(&rsaKey.PublicKey)
 	if err != nil {
 		return "", err
 	}
 	return string(pem.EncodeToMemory(&pem.Block{Type: "PUBLIC KEY", Bytes: pubDER})), nil
 }
Suggestion importance[1-10]: 6

__

Why: The suggestion improves the robustness of the derivePublicKeyPEM function by adding an explicit check for encrypted PEM blocks and refining the type assertion for RSA keys. This leads to clearer error messages for unsupported key types, which is a valuable improvement for maintainability and debugging.

Low
Return 400 on bad service input
Suggestion Impact:The commit normalized the service name, changed validation to throw TypeError("bad-service"), used the normalized name in the URL, and added a try/catch to return a 400 response with an error message when the service name is invalid.

code diff:

-  if (!/^[a-z0-9-]+$/i.test(service)) {
-    throw new Error("invalid service name");
+  const normalized = service.trim().toLowerCase();
+  if (!/^[a-z0-9-]+$/.test(normalized)) {
+    throw new TypeError("bad-service");
   }
 
   const params = new URLSearchParams(req.nextUrl.searchParams);
@@ -30,7 +31,7 @@
   }
 
   const query = params.toString();
-  const base = `${getInternalApiUrl()}/api/admin/config/${service}`;
+  const base = `${getInternalApiUrl()}/api/admin/config/${normalized}`;
   return query ? `${base}?${query}` : base;
 }
 
@@ -44,15 +45,31 @@
     "X-API-Key": getApiKey(),
     Authorization: authHeader,
   });
-  const contentType = req.headers.get("Content-Type");
-  if (contentType) {
-    headers.set("Content-Type", contentType);
+
+  // Remove hop-by-hop headers that must not be forwarded.
+  const hopByHopHeaders = [
+    "connection",
+    "upgrade",
+    "transfer-encoding",
+    "proxy-connection",
+    "keep-alive",
+    "te",
+    "trailer",
+  ];
+  for (const header of hopByHopHeaders) {
+    headers.delete(header);
+  }
+
+  if (method === "PUT") {
+    const contentType = req.headers.get("content-type");
+    if (contentType) {
+      headers.set("Content-Type", contentType);
+    }
   }
 
   const init: RequestInit = {
     method,
     headers,
-    body: undefined,
   };
 
   if (method === "PUT") {
@@ -91,6 +108,17 @@
   }
 
   const { service } = await params;
-  const upstreamUrl = buildUpstreamUrl(req, service);
+  let upstreamUrl: string;
+  try {
+    upstreamUrl = buildUpstreamUrl(req, service);
+  } catch (err) {
+    if (err instanceof TypeError && err.message === "bad-service") {
+      return NextResponse.json(
+        { error: "Invalid service name" },
+        { status: 400 },
+      );
+    }
+    throw err;
+  }
   return forwardRequest(method, req, upstreamUrl, authHeader);

Modify the buildUpstreamUrl function to handle invalid service names by enabling
a 400 Bad Request response instead of throwing an exception that causes a 500
Internal Server Error. Also, normalize the service name to lowercase.

web/src/app/api/admin/config/proxy.ts [17-35]

 function buildUpstreamUrl(req: NextRequest, service: string): string {
-  if (!/^[a-z0-9-]+$/i.test(service)) {
-    throw new Error("invalid service name");
+  const normalized = service.trim().toLowerCase();
+  if (!/^[a-z0-9-]+$/.test(normalized)) {
+    // Return a sentinel that caller can handle to produce 400
+    throw new TypeError("bad-service");
   }
 
   const params = new URLSearchParams(req.nextUrl.searchParams);
-
-  // Normalize legacy kvStore query parameter if present.
   if (!params.has("kv_store_id") && params.has("kvStore")) {
     const legacy = params.get("kvStore");
     if (legacy) {
       params.set("kv_store_id", legacy);
     }
     params.delete("kvStore");
   }
 
   const query = params.toString();
-  const base = `${getInternalApiUrl()}/api/admin/config/${service}`;
+  const base = `${getInternalApiUrl()}/api/admin/config/${normalized}`;
   return query ? `${base}?${query}` : base;
 }
 
+export async function proxyConfigRequest(
+  method: Method,
+  req: NextRequest,
+  { params }: { params: ParamsPromise },
+): Promise<NextResponse> {
+  const authHeader = req.headers.get("Authorization");
+  if (!authHeader?.startsWith("Bearer ")) {
+    return UNAUTHORIZED;
+  }
+
+  const { service } = await params;
+  let upstreamUrl: string;
+  try {
+    upstreamUrl = buildUpstreamUrl(req, service);
+  } catch (e) {
+    if (e instanceof TypeError && e.message === "bad-service") {
+      return NextResponse.json({ error: "Invalid service name" }, { status: 400 });
+    }
+    throw e;
+  }
+  return forwardRequest(method, req, upstreamUrl, authHeader);
+}
+
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly points out that throwing an error for invalid input leads to a 500 response, and proposes a better approach of returning a 400 error, which improves error handling and security.

Low
General
Add WithBlock to dial options

Add the grpc.WithBlock() dial option to gRPC connections to ensure connection
errors are surfaced promptly during startup.

pkg/config/bootstrap/core_client.go [26-49]

 func BuildCoreDialOptionsFromEnv(ctx context.Context, role models.ServiceRole, log logger.Logger) ([]grpc.DialOption, func(), error) {
 	noOpCloser := func() {}
 
 	provider, err := CoreSecurityProviderFromEnv(ctx, role, log)
 	if err != nil {
 		return nil, noOpCloser, err
 	}
 
 	if provider != nil {
 		creds, credErr := provider.GetClientCredentials(ctx)
 		if credErr != nil {
 			_ = provider.Close()
 			return nil, noOpCloser, credErr
 		}
 		closer := func() {
 			if err := provider.Close(); err != nil && log != nil {
 				log.Warn().Err(err).Msg("failed to close core security provider")
 			}
 		}
-		return []grpc.DialOption{creds}, closer, nil
+		return []grpc.DialOption{
+			creds,
+			grpc.WithBlock(),
+		}, closer, nil
 	}
 
-	return []grpc.DialOption{grpc.WithTransportCredentials(insecure.NewCredentials())}, noOpCloser, nil
+	return []grpc.DialOption{
+		grpc.WithTransportCredentials(insecure.NewCredentials()),
+		grpc.WithBlock(),
+	}, noOpCloser, nil
 }
Suggestion importance[1-10]: 7

__

Why: Adding grpc.WithBlock() is a good practice to make gRPC connections fail fast on startup if the endpoint is unreachable, improving service reliability and providing clearer startup error messages.

Medium
Suggestions up to commit c096ab3
CategorySuggestion                                                                                                                                    Impact
Possible issue
Use grpc.Dial instead of NewClient
Suggestion Impact:The commit replaced grpc.NewClient with grpc.DialContext and adjusted provider cleanup, implementing the suggested fix for creating the gRPC client connection.

code diff:

-	conn, err := grpc.NewClient(address, dialOpts...)
-	if err != nil {
-		if provider != nil {
-			_ = provider.Close()
-		}
+	conn, err := grpc.DialContext(ctx, address, dialOpts...) //nolint:staticcheck // DialContext ensures ctx cancellation adherence
+	if err != nil {
+		closeProvider()
 		return nil, nil, err
 	}
 
 	cleanup := func() {
 		_ = conn.Close()
-		if provider != nil {
-			_ = provider.Close()
-		}
+		closeProvider()
 	}
 	return proto.NewCoreServiceClient(conn), cleanup, nil

Replace the incorrect grpc.NewClient call with grpc.DialContext to fix a
critical bug that prevents gRPC connections.

cmd/tools/kv-sweep/main.go [453-483]

 func connectCore(ctx context.Context, cfg sweepConfig) (proto.CoreServiceClient, func(), error) {
 	role := models.ServiceRole(cfg.coreRole)
 	dialOpts, provider, err := bootstrap.BuildCoreDialOptionsFromEnv(ctx, role, logger.NewTestLogger())
 	if err != nil {
 		return nil, nil, err
 	}
 
 	address := cfg.coreAddress
 	if address == "" {
 		address = os.Getenv("CORE_ADDRESS")
 	}
 	if address == "" {
 		return nil, nil, errCoreAddressNotConfigured
 	}
 
-	conn, err := grpc.NewClient(address, dialOpts...)
+	// grpc.Dial is the correct API to create a client connection
+	conn, err := grpc.DialContext(ctx, address, dialOpts...)
 	if err != nil {
 		if provider != nil {
 			_ = provider.Close()
 		}
 		return nil, nil, err
 	}
 
 	cleanup := func() {
 		_ = conn.Close()
 		if provider != nil {
 			_ = provider.Close()
 		}
 	}
 	return proto.NewCoreServiceClient(conn), cleanup, nil
 }
Suggestion importance[1-10]: 10

__

Why: The suggestion correctly identifies a critical bug where grpc.NewClient is used instead of grpc.Dial or grpc.DialContext, which would cause a runtime failure and prevent the tool from connecting to the gRPC server.

High
Fix unsafe KV fallback key resolution
Suggestion Impact:The commit updated the fallback logic to strip a "domains/" prefix from the base key, re-qualify the key for the default store, and retry loading. It also guarded the logger call and adjusted format hinting. This addresses the unsafe key resolution during KV fallback as suggested.

code diff:

@@ -607,9 +607,21 @@
 
 	entry, metaRecord, err := s.loadConfigEntry(ctx, kvStoreID, resolvedKey)
 	if err != nil && kvStoreID != "" {
-		s.logger.Warn().Err(err).Str("key", resolvedKey).Str("kv_store_id", kvStoreID).Msg("KV fetch failed; falling back to default store")
+		if s.logger != nil {
+			s.logger.Warn().Err(err).Str("key", resolvedKey).Str("kv_store_id", kvStoreID).Msg("KV fetch failed; falling back to default store")
+		}
 		kvStoreID = ""
-		resolvedKey = baseKey
+		baseKeyUnqualified := baseKey
+		if strings.HasPrefix(strings.ToLower(baseKeyUnqualified), "domains/") {
+			parts := strings.SplitN(baseKeyUnqualified, "/", 3)
+			if len(parts) == 3 {
+				baseKeyUnqualified = parts[2]
+			}
+		}
+		resolvedKey = s.qualifyKVKey(kvStoreID, baseKeyUnqualified)
+		if !hasDescriptor {
+			formatHint = guessFormatFromKey(resolvedKey)
+		}
 		entry, metaRecord, err = s.loadConfigEntry(ctx, kvStoreID, resolvedKey)
 	}

Fix an unsafe key resolution issue in the KV fallback logic. When falling back
to the default store, ensure the key is correctly unqualified to prevent reading
from a domain-specific path.

pkg/core/api/auth.go [608-614]

-func (s *APIServer) handleGetConfig(w http.ResponseWriter, r *http.Request) {
-	ctx := r.Context()
-	vars := mux.Vars(r)
-	service := vars["service"]
+entry, metaRecord, err := s.loadConfigEntry(ctx, kvStoreID, resolvedKey)
+if err != nil && kvStoreID != "" {
+	if s.logger != nil {
+		s.logger.Warn().Err(err).Str("key", resolvedKey).Str("kv_store_id", kvStoreID).Msg("KV fetch failed; falling back to default store")
+	}
+	// Fallback to default store; ensure we use an appropriate unqualified key
+	kvStoreID = ""
+	// If the original baseKey includes a domains/ prefix, strip it for default store
+	baseKeyUnqualified := baseKey
+	if strings.HasPrefix(strings.ToLower(baseKeyUnqualified), "domains/") {
+		parts := strings.SplitN(baseKeyUnqualified, "/", 3)
+		if len(parts) == 3 {
+			baseKeyUnqualified = parts[2]
+		}
+	}
+	resolvedKey = s.qualifyKVKey(kvStoreID, baseKeyUnqualified)
+	entry, metaRecord, err = s.loadConfigEntry(ctx, kvStoreID, resolvedKey)
+}
 
-	key := r.URL.Query().Get("key")
-	agentID := r.URL.Query().Get("agent_id")
-	pollerID := r.URL.Query().Get("poller_id")
-	serviceType := r.URL.Query().Get("service_type")
-	kvStoreID := r.URL.Query().Get("kv_store_id")
-	if kvStoreID == "" {
-		kvStoreID = r.URL.Query().Get("kvStore")
-	}
-
-	var desc config.ServiceDescriptor
-	var hasDescriptor bool
-	if key == "" {
-		var err error
-		desc, key, hasDescriptor, err = s.resolveServiceKey(service, serviceType, agentID, pollerID)
-		if err != nil {
-			s.writeAPIError(w, http.StatusBadRequest, err.Error())
-			return
-		}
-	} else {
-		desc, hasDescriptor = s.lookupServiceDescriptor(service, serviceType, agentID, pollerID)
-	}
-
-	rawMode := isRawConfigRequested(r)
-	baseKey := key
-	resolvedKey := s.qualifyKVKey(kvStoreID, key)
-
-	formatHint := guessFormatFromKey(resolvedKey)
-	if hasDescriptor {
-		formatHint = desc.Format
-	}
-
-	entry, metaRecord, err := s.loadConfigEntry(ctx, kvStoreID, resolvedKey)
-	if err != nil && kvStoreID != "" {
-		s.logger.Warn().Err(err).Str("key", resolvedKey).Str("kv_store_id", kvStoreID).Msg("KV fetch failed; falling back to default store")
-		kvStoreID = ""
-		resolvedKey = baseKey
-		entry, metaRecord, err = s.loadConfigEntry(ctx, kvStoreID, resolvedKey)
-	}
-
Suggestion importance[1-10]: 8

__

Why: This suggestion correctly identifies a critical bug where a fallback to the default KV store could use a domain-qualified key, leading to incorrect data access or failures. The fix is accurate and prevents this cross-namespace issue.

Medium
Prevent nil pointer on shutdown
Suggestion Impact:The commit added a nil check around bootstrapResult.Close() in the ListenAndServe error handling path, aligning with the suggestion. It also added graceful shutdown handling, but the nil check directly reflects the suggestion’s intent.

code diff:

 	if err := server.ListenAndServe(); err != nil {
-		_ = bootstrapResult.Close()
+		if bootstrapResult != nil {
+			_ = bootstrapResult.Close()
+		}

Add a nil check for bootstrapResult before calling Close() in the
server.ListenAndServe() error path to prevent a panic.

cmd/faker/main.go [474-477]

-func main() {
-  var configPath string
-  ...
-  config = &Config{}
-  bootstrapResult := loadFakerConfig(ctx, configPath, config)
+if err := server.ListenAndServe(); err != nil {
   if bootstrapResult != nil {
-    defer func() { _ = bootstrapResult.Close() }()
+    _ = bootstrapResult.Close()
   }
-  ...
-  if err := server.ListenAndServe(); err != nil {
-    _ = bootstrapResult.Close()
-    log.Fatalf("Server failed: %v", err) //nolint:gocritic // Close is explicitly called before Fatalf
-  }
+  log.Fatalf("Server failed: %v", err) //nolint:gocritic // Close is explicitly called before Fatalf
 }
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies a potential nil pointer dereference on bootstrapResult in an error handling path, which would cause a panic during server shutdown.

Medium
Prevent unsafe duration coercion

Modify formatDurationValue to avoid interpreting raw numeric values as
nanoseconds for durations, as this can lead to misconfiguration.

pkg/config/config.go [354-382]

 func formatDurationValue(value interface{}) (string, bool) {
+	// Only accept string-like inputs for durations; numeric inputs are ambiguous (nanoseconds vs seconds).
 	switch v := value.(type) {
+	case string:
+		if d, err := time.ParseDuration(v); err == nil {
+			return d.String(), true
+		}
+		return "", false
 	case json.Number:
-		if i, err := v.Int64(); err == nil {
-			return time.Duration(i).String(), true
+		// Treat only when it includes a unit-like suffix once converted to string
+		s := v.String()
+		if d, err := time.ParseDuration(s); err == nil {
+			return d.String(), true
 		}
-		if f, err := v.Float64(); err == nil {
-			return time.Duration(int64(f)).String(), true
-		}
-	case float64:
-		return time.Duration(int64(v)).String(), true
-	case float32:
-		return time.Duration(int64(v)).String(), true
-	case int64:
-		return time.Duration(v).String(), true
-	case int32:
-		return time.Duration(v).String(), true
-	case int:
-		return time.Duration(v).String(), true
-	case uint64:
-		return time.Duration(v).String(), true
-	case uint32:
-		return time.Duration(v).String(), true
-	case uint:
-		return time.Duration(v).String(), true
+		return "", false
+	default:
+		return "", false
 	}
-
-	return "", false
 }
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly points out that treating numeric JSON values as nanoseconds for time.Duration is likely incorrect and can lead to silent configuration errors.

Medium
Prevent security provider resource leaks
Suggestion Impact:The function signature was changed to return a closer func(), with no-op closer on error/insecure paths, and a proper closer that closes the provider. It also handles error paths by closing the provider and returning the no-op closer, aligning with the suggestion’s intent to prevent leaks.

code diff:

+// using the SPIFFE or mTLS settings provided via environment variables. It returns a
+// closer function that callers should invoke to release any underlying security provider.
+func BuildCoreDialOptionsFromEnv(ctx context.Context, role models.ServiceRole, log logger.Logger) ([]grpc.DialOption, func(), error) {
+	noOpCloser := func() {}
 
 	provider, err := CoreSecurityProviderFromEnv(ctx, role, log)
 	if err != nil {
-		return nil, nil, err
+		return nil, noOpCloser, err
 	}
 
 ...
Imported GitHub PR comment. Original author: @qodo-code-review[bot] Original URL: https://github.com/carverauto/serviceradar/pull/1926#issuecomment-3506721680 Original created: 2025-11-08T17:06:07Z --- ## PR Code Suggestions ✨ <!-- af6b19d --> Latest suggestions up to af6b19d <table><thead><tr><td><strong>Category</strong></td><td align=left><strong>Suggestion&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </strong></td><td align=center><strong>Impact</strong></td></tr><tbody><tr><td rowspan=4>Possible issue</td> <td> <details><summary>Use correct gRPC dial function</summary> ___ **Replace the non-existent <code>grpc.NewClient</code> with <code>grpc.DialContext</code> to correctly <br>establish a gRPC connection and fix a compilation error.** [cmd/tools/kv-sweep/main.go [470-474]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-d333a5d2d948c60657097bfd7fef250908a625c36d14041c634c4d5df2468fccR470-R474) ```diff -conn, err := grpc.NewClient(address, dialOpts...) +conn, err := grpc.DialContext(ctx, address, dialOpts...) if err != nil { closeProvider() return nil, nil, err } ``` `[To ensure code accuracy, apply this suggestion manually]` <details><summary>Suggestion importance[1-10]: 9</summary> __ Why: The suggestion correctly identifies that `grpc.NewClient` is not a valid function for creating a gRPC connection, which would cause a compilation error, and provides the correct `grpc.DialContext` function as a fix. </details></details></td><td align=center>High </td></tr><tr><td> <details><summary>Avoid KV key collisions</summary> ___ **Update the <code>KVKeyTemplate</code> for <code>poller-icmp</code>, <code>poller-port</code>, and <code>poller-process</code> to be <br>unique for each service, preventing configuration overwrites.** [pkg/config/registry.go [279-301]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-571dd192b00a11d35be5792632115c32a9fc59eb70e0716f0e9c49fe12940e8cR279-R301) ```diff "poller-icmp": { Name: "poller-icmp", DisplayName: "ICMP Check", ServiceType: "icmp", Scope: ConfigScopePoller, - KVKeyTemplate: "config/pollers/{{poller_id}}.json", + KVKeyTemplate: "config/pollers/{{poller_id}}/icmp.json", Format: ConfigFormatJSON, }, "poller-port": { Name: "poller-port", DisplayName: "Port Check", ServiceType: "port", Scope: ConfigScopePoller, - KVKeyTemplate: "config/pollers/{{poller_id}}.json", + KVKeyTemplate: "config/pollers/{{poller_id}}/port.json", Format: ConfigFormatJSON, }, "poller-process": { Name: "poller-process", DisplayName: "Process Check", ServiceType: "process", Scope: ConfigScopePoller, - KVKeyTemplate: "config/pollers/{{poller_id}}.json", + KVKeyTemplate: "config/pollers/{{poller_id}}/process.json", Format: ConfigFormatJSON, }, ``` `[To ensure code accuracy, apply this suggestion manually]` <details><summary>Suggestion importance[1-10]: 9</summary> __ Why: The suggestion correctly identifies a critical bug where multiple poller service descriptors use the same `KVKeyTemplate`, which would cause them to overwrite each other's configurations. </details></details></td><td align=center>High </td></tr><tr><td> <details><summary>Validate SPIFFE env requirements</summary> ___ **When <code>KV_SEC_MODE=spiffe</code>, validate that the <code>KV_TRUST_DOMAIN</code> and <br><code>KV_SERVER_SPIFFE_ID</code> environment variables are set. This ensures a clear, early <br>failure instead of a confusing runtime error.** [pkg/agent/server.go [269-310]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-3c51e5356a25859a3300ab62ed2494462feb2567ef69e6db3aa2bbc1032c1c5dR269-R310) ```diff func kvSecurityFromEnv() (*models.SecurityConfig, error) { mode := strings.ToLower(strings.TrimSpace(os.Getenv("KV_SEC_MODE"))) switch mode { case "": return nil, nil case "spiffe": socket := strings.TrimSpace(os.Getenv("KV_WORKLOAD_SOCKET")) if socket == "" { socket = "unix:/run/spire/sockets/agent.sock" } + trustDomain := strings.TrimSpace(os.Getenv("KV_TRUST_DOMAIN")) + serverSPIFFE := strings.TrimSpace(os.Getenv("KV_SERVER_SPIFFE_ID")) + if trustDomain == "" || serverSPIFFE == "" { + return nil, fmt.Errorf("KV_SEC_MODE=spiffe requires KV_TRUST_DOMAIN and KV_SERVER_SPIFFE_ID") + } return &models.SecurityConfig{ Mode: "spiffe", CertDir: strings.TrimSpace(os.Getenv("KV_CERT_DIR")), Role: models.RoleAgent, - TrustDomain: strings.TrimSpace(os.Getenv("KV_TRUST_DOMAIN")), - ServerSPIFFEID: strings.TrimSpace(os.Getenv("KV_SERVER_SPIFFE_ID")), + TrustDomain: trustDomain, + ServerSPIFFEID: serverSPIFFE, WorkloadSocket: socket, }, nil case "mtls": cert := strings.TrimSpace(os.Getenv("KV_CERT_FILE")) key := strings.TrimSpace(os.Getenv("KV_KEY_FILE")) ca := strings.TrimSpace(os.Getenv("KV_CA_FILE")) if cert == "" || key == "" || ca == "" { return nil, errKVMtlsEnvMissing } return &models.SecurityConfig{ Mode: "mtls", CertDir: strings.TrimSpace(os.Getenv("KV_CERT_DIR")), Role: models.RoleAgent, ServerName: strings.TrimSpace(os.Getenv("KV_SERVER_NAME")), TLS: models.TLSConfig{ CertFile: cert, KeyFile: key, CAFile: ca, }, }, nil default: return nil, fmt.Errorf("%w %q", errUnsupportedKVSecMode, mode) } } ``` `[To ensure code accuracy, apply this suggestion manually]` <details><summary>Suggestion importance[1-10]: 8</summary> __ Why: This suggestion improves robustness by adding validation for required environment variables (`KV_TRUST_DOMAIN`, `KV_SERVER_SPIFFE_ID`) when using SPIFFE authentication, ensuring the application fails fast with a clear error message. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>Bound KV get with timeout</summary> ___ **Add a timeout to the context for the KV <code>Get</code> operation to prevent hangs and <br>resource leaks if the KV store is slow or unavailable. This improves API server <br>resilience.** [pkg/core/api/server.go [148-176]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-1bb99367fdd853c728b7cfbf5893a293f6d217144dfb5282cb8dd32e5261021eR148-R176) ```diff // Default kvGetFn dials KV and performs a Get s.kvGetFn = func(ctx context.Context, key string) ([]byte, bool, uint64, error) { if s.kvAddress == "" { return nil, false, 0, ErrKVAddressNotConfigured } + + // Bound the KV operation to avoid hangs and ensure timely cleanup. + callCtx, cancel := context.WithTimeout(ctx, 5*time.Second) + defer cancel() + clientCfg := coregrpc.ClientConfig{Address: s.kvAddress, MaxRetries: 3, Logger: s.logger, DisableTelemetry: true} var provider coregrpc.SecurityProvider if s.kvSecurity != nil { sec := *s.kvSecurity sec.Role = models.RolePoller var err error - provider, err = coregrpc.NewSecurityProvider(ctx, &sec, s.logger) + provider, err = coregrpc.NewSecurityProvider(callCtx, &sec, s.logger) if err != nil { return nil, false, 0, err } defer func() { _ = provider.Close() }() clientCfg.SecurityProvider = provider } - c, err := coregrpc.NewClient(ctx, clientCfg) + + c, err := coregrpc.NewClient(callCtx, clientCfg) if err != nil { return nil, false, 0, err } defer func() { _ = c.Close() }() + kv := proto.NewKVServiceClient(c.GetConnection()) - resp, err := kv.Get(ctx, &proto.GetRequest{Key: key}) + resp, err := kv.Get(callCtx, &proto.GetRequest{Key: key}) if err != nil { return nil, false, 0, err } return resp.Value, resp.Found, resp.Revision, nil } ``` `[To ensure code accuracy, apply this suggestion manually]` <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: The suggestion correctly identifies a potential for long-running requests by adding a timeout to the KV client call, which improves the API server's resilience against a slow or unresponsive KV store. </details></details></td><td align=center>Medium </td></tr><tr><td rowspan=2>Security</td> <td> <details><summary>Normalize and dedupe kv_store_id</summary> ___ **Refactor the <code>buildUpstreamUrl</code> function to explicitly reconstruct query <br>parameters, normalizing <code>kvStore</code> to <code>kv_store_id</code> and preventing duplicate <br>parameter issues.** [web/src/app/api/admin/config/proxy.ts [16-36]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-efbdcfc4a88b03f4292320f91f31515bd86eb070be3b8c65a3dc2e583276e9d7R16-R36) ```diff function buildUpstreamUrl(req: NextRequest, service: string): string { const normalized = service.trim().toLowerCase(); if (!/^[a-z0-9-]+$/.test(normalized)) { throw new TypeError("bad-service"); } - const params = new URLSearchParams(req.nextUrl.searchParams); + const incoming = req.nextUrl.searchParams; + const params = new URLSearchParams(); - // Normalize legacy kvStore query parameter if present. - if (!params.has("kv_store_id") && params.has("kvStore")) { - const legacy = params.get("kvStore"); - if (legacy) { - params.set("kv_store_id", legacy); - } - params.delete("kvStore"); + // Copy all params except kv_store_id/kvStore; we will normalize those explicitly. + for (const [k, v] of incoming.entries()) { + if (k.toLowerCase() === "kv_store_id" || k === "kvStore") continue; + params.append(k, v); + } + + // Normalize legacy kvStore to kv_store_id with a single value. + let kvStoreID = incoming.get("kv_store_id"); + if (!kvStoreID && incoming.has("kvStore")) { + kvStoreID = incoming.get("kvStore") || undefined; + } + if (kvStoreID) { + params.set("kv_store_id", kvStoreID); } const query = params.toString(); const base = `${getInternalApiUrl()}/api/admin/config/${normalized}`; return query ? `${base}?${query}` : base; } ``` `[To ensure code accuracy, apply this suggestion manually]` <details><summary>Suggestion importance[1-10]: 8</summary> __ Why: This is a valuable security hardening suggestion that prevents query parameter pollution by explicitly sanitizing and rebuilding the upstream URL's query string. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>Validate required SPIFFE env vars</summary> ___ **Add validation to ensure <code>CORE_TRUST_DOMAIN</code> and <code>CORE_SERVER_SPIFFE_ID</code> environment <br>variables are not empty when using SPIFFE security mode.** [pkg/config/bootstrap/core_client.go [52-95]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-5148a65652c2ea30798812e62706a3355f3517f2a20b4561741bcc24b2bd6673R52-R95) ```diff func CoreSecurityProviderFromEnv(ctx context.Context, role models.ServiceRole, log logger.Logger) (coregrpc.SecurityProvider, error) { mode := strings.ToLower(strings.TrimSpace(os.Getenv("CORE_SEC_MODE"))) switch mode { case "", "none": return nil, nil case "spiffe": workloadSocket := strings.TrimSpace(os.Getenv("CORE_WORKLOAD_SOCKET")) if workloadSocket == "" { workloadSocket = "unix:/run/spire/sockets/agent.sock" } + trustDomain := strings.TrimSpace(os.Getenv("CORE_TRUST_DOMAIN")) + serverID := strings.TrimSpace(os.Getenv("CORE_SERVER_SPIFFE_ID")) + if trustDomain == "" || serverID == "" { + return nil, fmt.Errorf("spiffe mode requires CORE_TRUST_DOMAIN and CORE_SERVER_SPIFFE_ID") + } + conf := &models.SecurityConfig{ Mode: "spiffe", CertDir: strings.TrimSpace(os.Getenv("CORE_CERT_DIR")), Role: role, - TrustDomain: strings.TrimSpace(os.Getenv("CORE_TRUST_DOMAIN")), - ServerSPIFFEID: strings.TrimSpace(os.Getenv("CORE_SERVER_SPIFFE_ID")), + TrustDomain: trustDomain, + ServerSPIFFEID: serverID, WorkloadSocket: workloadSocket, } return coregrpc.NewSecurityProvider(ctx, conf, log) case "mtls": cert := strings.TrimSpace(os.Getenv("CORE_CERT_FILE")) key := strings.TrimSpace(os.Getenv("CORE_KEY_FILE")) ca := strings.TrimSpace(os.Getenv("CORE_CA_FILE")) if cert == "" || key == "" || ca == "" { return nil, errCoreMTLSConfig } conf := &models.SecurityConfig{ Mode: "mtls", CertDir: strings.TrimSpace(os.Getenv("CORE_CERT_DIR")), Role: role, ServerName: strings.TrimSpace(os.Getenv("CORE_SERVER_NAME")), TLS: models.TLSConfig{ CertFile: cert, KeyFile: key, CAFile: ca, }, } return coregrpc.NewSecurityProvider(ctx, conf, log) default: return nil, fmt.Errorf("%w %q", errUnsupportedCoreSecMode, mode) } } ``` `[To ensure code accuracy, apply this suggestion manually]` <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: The suggestion correctly identifies that missing SPIFFE environment variables can lead to runtime failures and adds validation to fail fast with a clear error message. </details></details></td><td align=center>Medium </td></tr><tr><td rowspan=2>General</td> <td> <details><summary>Validate watcher snapshot fields</summary> ___ **Add validation to ensure the <code>Status</code> and <code>KVKey</code> fields of a <code>WatcherInfo</code> are not <br>empty before publishing a snapshot to the key-value store.** [pkg/config/watchers_snapshot.go [33-51]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-e6b688991669ff2e740b25b031841a14db1e2da72068981f3f6fd1fd0e2a236bR33-R51) ```diff func (m *KVManager) PublishWatcherSnapshot(ctx context.Context, info WatcherInfo) error { if m == nil || m.client == nil { return errKVClientUnavailable + } + if info.Status == "" { + return fmt.Errorf("watcher status is required") + } + if strings.TrimSpace(info.KVKey) == "" { + return fmt.Errorf("watcher KV key is required") } key, err := WatcherSnapshotKey(info.Service, info.InstanceID) if err != nil { return err } payload, err := json.Marshal(WatcherSnapshot{ WatcherInfo: info, UpdatedAt: time.Now().UTC(), }) if err != nil { return err } return m.client.Put(ctx, key, payload, watcherSnapshotTTL) } ``` `[To ensure code accuracy, apply this suggestion manually]` <details><summary>Suggestion importance[1-10]: 6</summary> __ Why: The suggestion improves data integrity by adding validation for required fields before publishing a watcher snapshot, preventing incomplete records from being stored in the KV. </details></details></td><td align=center>Low </td></tr><tr><td> <details><summary>Write explicit HTTP status</summary> ___ **Add an explicit <code>w.WriteHeader(http.StatusOK)</code> call in <code>writeRawConfigResponse</code> <br>before writing the response body to improve clarity and robustness.** [pkg/core/api/auth.go [819-850]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-61eb09279471ca59a1b57a306f0ad05988a8cb39eca163b32bfb86c53f8a553eR819-R850) ```diff func (s *APIServer) writeRawConfigResponse(w http.ResponseWriter, data []byte, format config.ConfigFormat, meta *configMetadata) { if meta != nil { w.Header().Set("X-Serviceradar-Kv-Key", meta.KVKey) if meta.KVStoreID != "" { w.Header().Set("X-Serviceradar-Kv-Store", meta.KVStoreID) } if meta.Revision != 0 { w.Header().Set("X-Serviceradar-Kv-Revision", fmt.Sprintf("%d", meta.Revision)) } if meta.Origin != "" { w.Header().Set("X-Serviceradar-Config-Origin", meta.Origin) } if meta.LastWriter != "" { w.Header().Set("X-Serviceradar-Config-Writer", meta.LastWriter) } if meta.UpdatedAt != nil { w.Header().Set("X-Serviceradar-Config-Updated-At", meta.UpdatedAt.Format(time.RFC3339)) } } switch format { case config.ConfigFormatTOML: w.Header().Set("Content-Type", "text/plain; charset=utf-8") case config.ConfigFormatJSON: w.Header().Set("Content-Type", "application/json") default: w.Header().Set("Content-Type", "application/json") } + // Ensure an explicit status code is sent before writing the body. + w.WriteHeader(http.StatusOK) if _, err := w.Write(data); err != nil && s != nil && s.logger != nil { s.logger.Warn().Err(err).Msg("failed to write raw config response") } } ``` `[To ensure code accuracy, apply this suggestion manually]` <details><summary>Suggestion importance[1-10]: 4</summary> __ Why: While Go's HTTP server implicitly sends a 200 OK status, explicitly calling `WriteHeader` is a good practice for clarity and robustness, especially in environments with proxies or middleware. </details></details></td><td align=center>Low </td></tr> <tr><td align="center" colspan="2"> - [ ] More <!-- /improve --more_suggestions=true --> </td><td></td></tr></tbody></table> ___ #### Previous suggestions <details><summary>✅ Suggestions up to commit 15f83a6</summary> <br><table><thead><tr><td><strong>Category</strong></td><td align=left><strong>Suggestion&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </strong></td><td align=center><strong>Impact</strong></td></tr><tbody><tr><td rowspan=5>Possible issue</td> <td> <details><summary>Use correct gRPC dial API</summary> ___ **Replace the non-existent <code>grpc.NewClient</code> with the correct <code>grpc.DialContext</code> <br>function to fix a compilation error when creating a gRPC client connection.** [pkg/config/bootstrap/template.go [202-206]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-9a44dcfb7a9e43b18ebf20b4be4123c6cf52a4e9fcf1b7ad65877148325f2b7eR202-R206) ```diff -conn, err := grpc.NewClient(coreAddr, dialOpts...) +conn, err := grpc.DialContext(regCtx, coreAddr, dialOpts...) if err != nil { closeProvider() return fmt.Errorf("%w: %w", errTemplateRegistration, err) } ``` <details><summary>Suggestion importance[1-10]: 9</summary> __ Why: The suggestion correctly identifies that `grpc.NewClient` is not a function in the standard `google.golang.org/grpc` package, and replacing it with `grpc.DialContext` fixes a compilation error. </details></details></td><td align=center>High </td></tr><tr><td> <details><summary>✅ <s>Avoid nil kvStore panic</s></summary> ___ <details><summary><b>Suggestion Impact:</b></summary>The commit added a c.kvStore != nil check around the Put call and introduced a debug log when kvStore is nil, preventing a nil pointer dereference. code diff: ```diff + if c.kvStore != nil { + if normalized, err := json.Marshal(over); err == nil { + if err := c.kvStore.Put(ctx, key, normalized, 0); err != nil && c.logger != nil { + c.logger.Warn(). + Err(err). + Str("key", key). + Msg("failed to rewrite normalized KV entry") + } + } + } else if c.logger != nil { + c.logger.Debug(). + Str("key", key). + Msg("skipping KV normalization rewrite; kvStore not configured") ``` </details> ___ **Add a nil check for <code>c.kvStore</code> before attempting to write back the normalized <br>configuration to prevent a potential panic.** [pkg/config/config.go [211-220]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-a3d824da3c42420cd5cbb0a4a2c0e7b5bfddd819652788a0596d195dc6e31fa5R211-R220) ```diff if normalizeOverlayTypes(over, base) { - if normalized, err := json.Marshal(over); err == nil { - if err := c.kvStore.Put(ctx, key, normalized, 0); err != nil && c.logger != nil { - c.logger.Warn(). - Err(err). - Str("key", key). - Msg("failed to rewrite normalized KV entry") + if c.kvStore != nil { + if normalized, err := json.Marshal(over); err == nil { + if err := c.kvStore.Put(ctx, key, normalized, 0); err != nil && c.logger != nil { + c.logger.Warn(). + Err(err). + Str("key", key). + Msg("failed to rewrite normalized KV entry") + } } + } else if c.logger != nil { + c.logger.Debug(). + Str("key", key). + Msg("skipping KV normalization rewrite; kvStore not configured") } } ``` <details><summary>Suggestion importance[1-10]: 8</summary> __ Why: The suggestion correctly identifies a potential nil pointer dereference on `c.kvStore` and provides a check to prevent a panic, which is a significant bug fix. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>✅ <s>Harden error response writing</s></summary> ___ <details><summary><b>Suggestion Impact:</b></summary>The commit updated writeAPIError to early-return without writing a body for StatusNoContent and StatusNotModified, and it deletes Content-Type before setting it, matching the suggestion's intent. code diff: ```diff + switch status { + case http.StatusNoContent, http.StatusNotModified: + w.WriteHeader(status) + return + } + + w.Header().Del("Content-Type") w.Header().Set("Content-Type", "application/json") w.WriteHeader(status) + errResp := models.ErrorResponse{ ``` </details> ___ **Modify <code>writeAPIError</code> to avoid writing a response body for HTTP statuses like <code>204 </code><br><code>No Content</code> and <code>304 Not Modified</code>, which must not include one.** [pkg/core/api/auth.go [852-862]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-61eb09279471ca59a1b57a306f0ad05988a8cb39eca163b32bfb86c53f8a553eR852-R862) ```diff func (s *APIServer) writeAPIError(w http.ResponseWriter, status int, message string) { + // Do not attempt to write a body for statuses that must not include one. + switch status { + case http.StatusNoContent, http.StatusNotModified: + w.WriteHeader(status) + return + } + + // Set content type explicitly before writing status. + // Some proxies may have already set content-type; overwrite safely. + w.Header().Del("Content-Type") w.Header().Set("Content-Type", "application/json") w.WriteHeader(status) + errResp := models.ErrorResponse{ Message: message, Status: status, } if err := json.NewEncoder(w).Encode(errResp); err != nil && s != nil && s.logger != nil { s.logger.Warn().Err(err).Msg("failed to encode error response") } } ``` <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: The suggestion correctly identifies that writing a body for status codes like `204` or `304` violates the HTTP protocol, and the proposed change fixes this, improving the robustness of the new `writeAPIError` helper function. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>✅ <s>Reject numeric duration inputs</s></summary> ___ <details><summary><b>Suggestion Impact:</b></summary>The commit implemented rejecting numeric jwt_expiration inputs, preferred string parsing with TrimSpace, and returned specific errors for invalid formats, aligning with the suggestion’s intent. code diff: ```diff import ( "encoding/json" + "errors" + "strings" "time" +) + +var ( + errJWTDurationString = errors.New("jwt_expiration must be a duration string (e.g., \"24h\")") + errJWTInvalidFormat = errors.New("invalid jwt_expiration format") ) // User contains information about an authenticated user. @@ -111,8 +118,10 @@ return nil } + // Prefer string durations like "24h". var expStr string if err := json.Unmarshal(aux.JWTExpiration, &expStr); err == nil { + expStr = strings.TrimSpace(expStr) if expStr == "" { a.JWTExpiration = 0 return nil @@ -125,13 +134,13 @@ return nil } - var expNumber float64 - if err := json.Unmarshal(aux.JWTExpiration, &expNumber); err == nil { - a.JWTExpiration = time.Duration(expNumber) - return nil + // Reject numeric inputs to avoid ambiguous nanosecond interpretation. + var numericProbe float64 + if err := json.Unmarshal(aux.JWTExpiration, &numericProbe); err == nil { + return errJWTDurationString } - return json.Unmarshal(aux.JWTExpiration, &expStr) + return errJWTInvalidFormat } ``` </details> ___ **Modify the <code>UnmarshalJSON</code> method for <code>AuthConfig</code> to reject numeric values for <br><code>jwt_expiration</code>, enforcing the use of duration strings to prevent silent <br>misconfiguration.** [pkg/models/auth.go [97-135]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-5ec89db9baf2975f8777da05352fac09d7d4352cf358af79b5904c3304c77f0cR97-R135) ```diff func (a *AuthConfig) UnmarshalJSON(data []byte) error { type Alias AuthConfig aux := &struct { *Alias JWTExpiration json.RawMessage `json:"jwt_expiration"` }{ Alias: (*Alias)(a), } if err := json.Unmarshal(data, &aux); err != nil { return err } if len(aux.JWTExpiration) == 0 { return nil } + // Prefer string durations like "24h" var expStr string if err := json.Unmarshal(aux.JWTExpiration, &expStr); err == nil { + expStr = strings.TrimSpace(expStr) if expStr == "" { a.JWTExpiration = 0 return nil } dur, err := time.ParseDuration(expStr) if err != nil { return err } a.JWTExpiration = dur return nil } - var expNumber float64 - if err := json.Unmarshal(aux.JWTExpiration, &expNumber); err == nil { - a.JWTExpiration = time.Duration(expNumber) + // Reject numeric inputs to avoid ambiguous nanosecond interpretation + var dummy float64 + if err := json.Unmarshal(aux.JWTExpiration, &dummy); err == nil { + return fmt.Errorf("jwt_expiration must be a duration string (e.g., \"24h\")") + } + + // As a last resort, try parsing again as string + if err := json.Unmarshal(aux.JWTExpiration, &expStr); err == nil { + dur, err := time.ParseDuration(expStr) + if err != nil { + return err + } + a.JWTExpiration = dur return nil } - - return json.Unmarshal(aux.JWTExpiration, &expStr) + return fmt.Errorf("invalid jwt_expiration format") } ``` <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: The suggestion correctly identifies that parsing a numeric JSON value for `jwt_expiration` can lead to misinterpretation as nanoseconds, and proposes a stricter validation that improves configuration robustness. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>✅ <s>Nil-safe KV client cleanup</s></summary> ___ <details><summary><b>Suggestion Impact:</b></summary>The commit added a conditional nil check around closeFn before deferring it, matching the suggestion. code diff: ```diff + if closeFn != nil { + defer closeFn() + } ``` </details> ___ **In <code>getKVEntry</code>, add a <code>nil</code> check for <code>closeFn</code> before deferring its call to prevent <br>a potential panic.** [pkg/core/api/auth.go [898-926]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-61eb09279471ca59a1b57a306f0ad05988a8cb39eca163b32bfb86c53f8a553eR898-R926) ```diff func (s *APIServer) getKVEntry(ctx context.Context, kvStoreID, key string) (*kvEntry, error) { if key == "" { return nil, errKVKeyNotSpecified } if kvStoreID != "" { kvClient, closeFn, err := s.getKVClient(ctx, kvStoreID) if err != nil { return nil, err } - defer closeFn() + if closeFn != nil { + defer closeFn() + } resp, err := kvClient.Get(ctx, &proto.GetRequest{Key: key}) if err != nil { return nil, err } return &kvEntry{Value: resp.GetValue(), Found: resp.GetFound(), Revision: resp.GetRevision()}, nil } if s.kvGetFn == nil { return nil, ErrKVAddressNotConfigured } value, found, revision, err := s.kvGetFn(ctx, key) if err != nil { return nil, err } return &kvEntry{Value: value, Found: found, Revision: revision}, nil } ``` <details><summary>Suggestion importance[1-10]: 5</summary> __ Why: The suggestion correctly points out a potential nil pointer dereference if `closeFn` is `nil`. While the implementation of `getKVClient` might prevent this, adding a `nil` check is a good defensive programming practice. </details></details></td><td align=center>Low </td></tr><tr><td rowspan=2>Security</td> <td> <details><summary>✅ <s>Strip hop-by-hop proxy headers</s></summary> ___ <details><summary><b>Suggestion Impact:</b></summary>The commit added logic to remove hop-by-hop headers from forwarded requests and adjusted Content-Type handling to only set it for PUT requests based on incoming content-type. code diff: ```diff + // Remove hop-by-hop headers that must not be forwarded. + const hopByHopHeaders = [ + "connection", + "upgrade", + "transfer-encoding", + "proxy-connection", + "keep-alive", + "te", + "trailer", + ]; + for (const header of hopByHopHeaders) { + headers.delete(header); + } + + if (method === "PUT") { + const contentType = req.headers.get("content-type"); + if (contentType) { + headers.set("Content-Type", contentType); + } } ``` </details> ___ **Prevent potential request smuggling vulnerabilities by stripping hop-by-hop <br>headers (e.g., <code>connection</code>, <code>upgrade</code>) before forwarding requests to the upstream <br>API.** [web/src/app/api/admin/config/proxy.ts [37-81]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-efbdcfc4a88b03f4292320f91f31515bd86eb070be3b8c65a3dc2e583276e9d7R37-R81) ```diff async function forwardRequest( method: Method, req: NextRequest, upstreamUrl: string, authHeader: string, ): Promise<NextResponse> { const headers = new Headers({ "X-API-Key": getApiKey(), Authorization: authHeader, }); - const contentType = req.headers.get("Content-Type"); - if (contentType) { - headers.set("Content-Type", contentType); + + // Drop hop-by-hop headers to avoid proxy issues + const hopByHop = [ + "connection", + "upgrade", + "transfer-encoding", + "proxy-connection", + "keep-alive", + "te", + "trailer", + ]; + for (const h of hopByHop) { + // Ensure they are not propagated + headers.delete(h); + } + + // Only set Content-Type if we actually send a body + if (method === "PUT") { + const ct = req.headers.get("content-type"); + if (ct) headers.set("Content-Type", ct); } const init: RequestInit = { method, headers, - body: undefined, }; if (method === "PUT") { init.body = await req.text(); } try { const resp = await fetch(upstreamUrl, init); const body = await resp.text(); const responseHeaders = new Headers(); const respContentType = resp.headers.get("content-type"); if (respContentType) { responseHeaders.set("Content-Type", respContentType); } return new NextResponse(body, { status: resp.status, headers: responseHeaders, }); } catch (error) { console.error("Admin config proxy error", error); return NextResponse.json( { error: "Failed to reach core admin config API" }, { status: 502 }, ); } } ``` <details><summary>Suggestion importance[1-10]: 8</summary> __ Why: The suggestion addresses a valid security concern by proposing to strip hop-by-hop headers, which helps prevent HTTP request smuggling vulnerabilities in this proxy implementation. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>✅ <s>Harden key unsanitize parsing</s></summary> ___ <details><summary><b>Suggestion Impact:</b></summary>The commit updated unsanitizeSegment to require an even number of hex digits, return an error for invalid length, restrict decoded values to 0x20–0x7E with an error otherwise, write bytes instead of runes, and explicitly handle lone '='. It also introduced specific error variables for these cases. code diff: ```diff @@ -561,15 +563,24 @@ } if j == i+1 { + // Lone '='; leave as-is. b.WriteByte('=') continue } - val, err := strconv.ParseInt(seg[i+1:j], 16, 32) + hexRun := seg[i+1 : j] + if len(hexRun)%2 != 0 { + return "", fmt.Errorf("%w after '=' in %q", errInvalidHexEscape, seg) + } + + val, err := strconv.ParseInt(hexRun, 16, 32) if err != nil { return "", err } - b.WriteRune(rune(val)) + if val < 0x20 || val > 0x7E { + return "", fmt.Errorf("%w in %q", errUnsafeASCIICharacter, seg) + } + b.WriteByte(byte(val)) i = j - 1 ``` </details> ___ **In <code>unsanitizeSegment</code>, add validation to ensure hex escape sequences have an even <br>number of digits and that the decoded character is within the printable ASCII <br>range.** [cmd/tools/kv-sweep/main.go [543-577]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-d333a5d2d948c60657097bfd7fef250908a625c36d14041c634c4d5df2468fccR543-R577) ```diff func unsanitizeSegment(seg string) (string, error) { if !strings.Contains(seg, "=") { return seg, nil } var b strings.Builder b.Grow(len(seg)) for i := 0; i < len(seg); i++ { ch := seg[i] if ch != '=' { b.WriteByte(ch) continue } j := i + 1 for j < len(seg) && isHexDigit(seg[j]) { j++ } if j == i+1 { + // Lone '='; keep as-is b.WriteByte('=') continue } - val, err := strconv.ParseInt(seg[i+1:j], 16, 32) + // Require even number of hex digits to form whole bytes + hexRun := seg[i+1 : j] + if len(hexRun)%2 != 0 { + return "", fmt.Errorf("invalid hex escape length after '=' in %q", seg) + } + + val, err := strconv.ParseInt(hexRun, 16, 32) if err != nil { return "", err } - b.WriteRune(rune(val)) + // Constrain to visible ASCII to avoid control characters in keys + if val < 0x20 || val > 0x7E { + return "", fmt.Errorf("decoded character out of safe ASCII range in %q", seg) + } + b.WriteByte(byte(val)) i = j - 1 } return b.String(), nil } ``` <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: The suggestion correctly identifies that the `unsanitizeSegment` function could misinterpret malformed hex escape sequences. Adding validation for an even number of hex digits and restricting the output to a safe ASCII range makes the key parsing more robust and secure. </details></details></td><td align=center>Medium </td></tr><tr><td rowspan=4>Incremental <sup><a href='https://qodo-merge-docs.qodo.ai/core-abilities/incremental_update/'>[*]</a></sup></td> <td> <details><summary>✅ <s>Fix provider close lifecycle</s></summary> ___ <details><summary><b>Suggestion Impact:</b></summary>The commit moved the deferred provider.Close() directly after creating the provider and removed the conditional explicit close in the client creation error path, fixing the double-close issue. code diff: ```diff + defer func() { _ = provider.Close() }() clientCfg.SecurityProvider = provider - defer func() { _ = provider.Close() }() } c, err := coregrpc.NewClient(ctx, clientCfg) if err != nil { - if provider != nil { - _ = provider.Close() - } return nil, false, 0, err ``` </details> ___ **Fix a double-close bug by moving the deferred call to <code>provider.Close()</code> to <br>immediately after its creation and removing the redundant explicit close in the <br>error path.** [pkg/core/api/server.go [148-179]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-1bb99367fdd853c728b7cfbf5893a293f6d217144dfb5282cb8dd32e5261021eR148-R179) ```diff // Default kvGetFn dials KV and performs a Get s.kvGetFn = func(ctx context.Context, key string) ([]byte, bool, uint64, error) { if s.kvAddress == "" { return nil, false, 0, ErrKVAddressNotConfigured } clientCfg := coregrpc.ClientConfig{Address: s.kvAddress, MaxRetries: 3, Logger: s.logger, DisableTelemetry: true} var provider coregrpc.SecurityProvider if s.kvSecurity != nil { sec := *s.kvSecurity sec.Role = models.RolePoller var err error provider, err = coregrpc.NewSecurityProvider(ctx, &sec, s.logger) if err != nil { return nil, false, 0, err } + defer func() { _ = provider.Close() }() clientCfg.SecurityProvider = provider - defer func() { _ = provider.Close() }() } c, err := coregrpc.NewClient(ctx, clientCfg) if err != nil { - if provider != nil { - _ = provider.Close() - } return nil, false, 0, err } defer func() { _ = c.Close() }() kv := proto.NewKVServiceClient(c.GetConnection()) resp, err := kv.Get(ctx, &proto.GetRequest{Key: key}) if err != nil { return nil, false, 0, err } return resp.Value, resp.Found, resp.Revision, nil } ``` <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: The suggestion correctly identifies a double-close bug in the original code and proposes a fix that follows the standard Go idiom for resource cleanup, making the code more robust. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>✅ <s>Harden TOML key extraction parser</s></summary> ___ <details><summary><b>Suggestion Impact:</b></summary>The commit refactored extractKey to a stateful tomlKeyScanner that handles single and double-quoted strings, multiline triple-quoted strings, escape sequences, and bracket/brace nesting, and stops at comments only outside strings—fulfilling the suggested hardening. code diff: ```diff func extractKey(line string) string { - inString := false - escapeNext := false + scanner := tomlKeyScanner{} for i := 0; i < len(line); i++ { ch := line[i] - if ch == '#' && !inString { + + if scanner.shouldStopAtComment(ch) { break } - if inString { - if escapeNext { - escapeNext = false - continue - } - if ch == '\\' { - escapeNext = true - continue - } - if ch == '"' { - inString = false - } - continue + + if scanner.handleMultiline(line, &i) { + continue + } + + if scanner.handleSingleString(ch) { + continue + } + + if scanner.handleBasicString(ch) { + continue + } + + if scanner.handleBrackets(ch) { + continue + } + + if ch == '=' && scanner.bracketDepth == 0 { + return strings.TrimSpace(line[:i]) + } + } + return "" +} + +type tomlKeyScanner struct { + inBasicString bool + inSingleString bool + inMultiline bool + escapeNext bool + bracketDepth int +} + +func (s *tomlKeyScanner) inAnyString() bool { + return s.inBasicString || s.inSingleString || s.inMultiline +} + +func (s *tomlKeyScanner) shouldStopAtComment(ch byte) bool { + return ch == '#' && !s.inAnyString() +} + +func (s *tomlKeyScanner) handleMultiline(line string, idx *int) bool { + if s.inSingleString { + return false + } + if !s.inMultiline { + if s.inBasicString { + return false + } + if hasTripleQuote(line, *idx) { + s.inMultiline = true + *idx += 2 + return true + } + return false + } + if hasTripleQuote(line, *idx) { + s.inMultiline = false + *idx += 2 + } + return true +} + +func (s *tomlKeyScanner) handleSingleString(ch byte) bool { + if s.inBasicString || s.inMultiline { + return false + } + if s.inSingleString { + if ch == '\'' { + s.inSingleString = false + } + return true + } + if ch == '\'' { + s.inSingleString = true + return true + } + return false +} + +func (s *tomlKeyScanner) handleBasicString(ch byte) bool { + if s.inSingleString || s.inMultiline { + return false + } + if s.inBasicString { + if s.escapeNext { + s.escapeNext = false + return true + } + if ch == '\\' { + s.escapeNext = true + return true } if ch == '"' { - inString = true - continue - } - if ch == '=' { - return strings.TrimSpace(line[:i]) - } - } - return "" -} + s.inBasicString = false + } + return true + } + if ch == '"' { + s.inBasicString = true + return true + } + return false +} + +func (s *tomlKeyScanner) handleBrackets(ch byte) bool { + if s.inAnyString() { + return false + } + switch ch { + case '[', '{', '(': + s.bracketDepth++ + return true + case ']', '}', ')': + if s.bracketDepth > 0 { + s.bracketDepth-- + } + return true + default: + return false + } +} + +func hasTripleQuote(line string, idx int) bool { + return idx+2 < len(line) && line[idx] == '"' && line[idx+1] == '"' && line[idx+2] == '"' +} ``` </details> ___ **Harden the <code>extractKey</code> function's TOML parser to correctly handle various TOML <br>constructs like single-quoted strings, multiline strings, and inline tables. <br>This ensures accurate key extraction for sanitization purposes.** [pkg/config/toml_mask.go [109-140]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-0d1120abd3a87aa7e1b1b9f80e70db597353f622e9c90992116379e4b39367d0R109-R140) ```diff func extractKey(line string) string { inString := false + inSQString := false + inMultiline := false escapeNext := false + bracketDepth := 0 // counts [], {} nesting for i := 0; i < len(line); i++ { ch := line[i] - if ch == '#' && !inString { + + // comment begins only when not inside a string + if ch == '#' && !inString && !inSQString && !inMultiline { break + } + + // handle triple-quote start/end for multiline basic strings + if !inSQString && !inMultiline && ch == '"' && i+2 < len(line) && line[i:i+3] == `"""` { + inMultiline = true + i += 2 + continue + } + if inMultiline { + // end of multiline basic string + if ch == '"' && i+2 < len(line) && line[i:i+3] == `"""` { + inMultiline = false + i += 2 + } + continue + } + + // handle single-quoted strings (no escapes inside) + if !inString && ch == '\'' && !inSQString { + inSQString = true + continue + } + if inSQString { + if ch == '\'' { + inSQString = false + } + continue + } + + // handle basic strings with escapes + if ch == '"' && !inString { + inString = true + continue } if inString { if escapeNext { escapeNext = false continue } if ch == '\\' { escapeNext = true continue } if ch == '"' { inString = false } continue } - if ch == '"' { - inString = true + + // track bracket/brace nesting to avoid '=' inside arrays/inline tables + if ch == '[' || ch == '{' || ch == '(' { + bracketDepth++ continue } - if ch == '=' { + if ch == ']' || ch == '}' || ch == ')' { + if bracketDepth > 0 { + bracketDepth-- + } + continue + } + + // only consider '=' at top level (not in strings or nested structures) + if ch == '=' && bracketDepth == 0 { return strings.TrimSpace(line[:i]) } } return "" } ``` <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: The suggestion correctly identifies that the simple TOML parser in `extractKey` is not robust and could fail to sanitize sensitive data in more complex but valid TOML files, and the proposed change makes it more resilient. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>✅ <s>Handle encrypted/non-RSA PEM safely</s></summary> ___ <details><summary><b>Suggestion Impact:</b></summary>The commit added an encrypted PEM check and adjusted the RSA type assertion to assign only when ok, matching the suggestion's intent. It implemented the encrypted check via a helper isEncryptedPEMBlock instead of x509.IsEncryptedPEMBlock. code diff: ```diff + if isEncryptedPEMBlock(block) { + return "", errUnsupportedPrivateKey + } var rsaKey *rsa.PrivateKey @@ -306,9 +309,9 @@ if err != nil { return "", err } - var ok bool - rsaKey, ok = key.(*rsa.PrivateKey) - if !ok { + if k, ok := key.(*rsa.PrivateKey); ok { + rsaKey = k + } else { return "", errUnsupportedPrivateKey } case "RSA PRIVATE KEY": @@ -326,6 +329,14 @@ return "", err } return string(pem.EncodeToMemory(&pem.Block{Type: "PUBLIC KEY", Bytes: pubDER})), nil +} + +func isEncryptedPEMBlock(block *pem.Block) bool { + if block == nil { + return false + } + procType := block.Headers["Proc-Type"] + return strings.EqualFold(procType, "4,ENCRYPTED") ``` </details> ___ **Improve <code>derivePublicKeyPEM</code> to explicitly check for and reject encrypted PEM <br>blocks and non-RSA keys, providing clearer error messages for unsupported key <br>formats.** [pkg/config/kv_client.go [295-329]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-0c911990aa32d43404f0f23dadc0363ec97ed4ca315592e6c9bd24510e82e06dR295-R329) ```diff func derivePublicKeyPEM(privatePEM string) (string, error) { block, _ := pem.Decode([]byte(privatePEM)) if block == nil { return "", errDecodePrivateKeyPEM + } + // Encrypted PEMs are not supported here; surface a clear error + if x509.IsEncryptedPEMBlock(block) { + return "", errUnsupportedPrivateKey } var rsaKey *rsa.PrivateKey switch block.Type { case "PRIVATE KEY": key, err := x509.ParsePKCS8PrivateKey(block.Bytes) if err != nil { return "", err } - var ok bool - rsaKey, ok = key.(*rsa.PrivateKey) - if !ok { + if k, ok := key.(*rsa.PrivateKey); ok { + rsaKey = k + } else { return "", errUnsupportedPrivateKey } case "RSA PRIVATE KEY": key, err := x509.ParsePKCS1PrivateKey(block.Bytes) if err != nil { return "", err } rsaKey = key default: return "", errUnsupportedPrivateKey } pubDER, err := x509.MarshalPKIXPublicKey(&rsaKey.PublicKey) if err != nil { return "", err } return string(pem.EncodeToMemory(&pem.Block{Type: "PUBLIC KEY", Bytes: pubDER})), nil } ``` <details><summary>Suggestion importance[1-10]: 6</summary> __ Why: The suggestion improves the robustness of the `derivePublicKeyPEM` function by adding an explicit check for encrypted PEM blocks and refining the type assertion for RSA keys. This leads to clearer error messages for unsupported key types, which is a valuable improvement for maintainability and debugging. </details></details></td><td align=center>Low </td></tr><tr><td> <details><summary>✅ <s>Return 400 on bad service input</s></summary> ___ <details><summary><b>Suggestion Impact:</b></summary>The commit normalized the service name, changed validation to throw TypeError("bad-service"), used the normalized name in the URL, and added a try/catch to return a 400 response with an error message when the service name is invalid. code diff: ```diff - if (!/^[a-z0-9-]+$/i.test(service)) { - throw new Error("invalid service name"); + const normalized = service.trim().toLowerCase(); + if (!/^[a-z0-9-]+$/.test(normalized)) { + throw new TypeError("bad-service"); } const params = new URLSearchParams(req.nextUrl.searchParams); @@ -30,7 +31,7 @@ } const query = params.toString(); - const base = `${getInternalApiUrl()}/api/admin/config/${service}`; + const base = `${getInternalApiUrl()}/api/admin/config/${normalized}`; return query ? `${base}?${query}` : base; } @@ -44,15 +45,31 @@ "X-API-Key": getApiKey(), Authorization: authHeader, }); - const contentType = req.headers.get("Content-Type"); - if (contentType) { - headers.set("Content-Type", contentType); + + // Remove hop-by-hop headers that must not be forwarded. + const hopByHopHeaders = [ + "connection", + "upgrade", + "transfer-encoding", + "proxy-connection", + "keep-alive", + "te", + "trailer", + ]; + for (const header of hopByHopHeaders) { + headers.delete(header); + } + + if (method === "PUT") { + const contentType = req.headers.get("content-type"); + if (contentType) { + headers.set("Content-Type", contentType); + } } const init: RequestInit = { method, headers, - body: undefined, }; if (method === "PUT") { @@ -91,6 +108,17 @@ } const { service } = await params; - const upstreamUrl = buildUpstreamUrl(req, service); + let upstreamUrl: string; + try { + upstreamUrl = buildUpstreamUrl(req, service); + } catch (err) { + if (err instanceof TypeError && err.message === "bad-service") { + return NextResponse.json( + { error: "Invalid service name" }, + { status: 400 }, + ); + } + throw err; + } return forwardRequest(method, req, upstreamUrl, authHeader); ``` </details> ___ **Modify the <code>buildUpstreamUrl</code> function to handle invalid <code>service</code> names by enabling <br>a 400 Bad Request response instead of throwing an exception that causes a 500 <br>Internal Server Error. Also, normalize the service name to lowercase.** [web/src/app/api/admin/config/proxy.ts [17-35]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-efbdcfc4a88b03f4292320f91f31515bd86eb070be3b8c65a3dc2e583276e9d7R17-R35) ```diff function buildUpstreamUrl(req: NextRequest, service: string): string { - if (!/^[a-z0-9-]+$/i.test(service)) { - throw new Error("invalid service name"); + const normalized = service.trim().toLowerCase(); + if (!/^[a-z0-9-]+$/.test(normalized)) { + // Return a sentinel that caller can handle to produce 400 + throw new TypeError("bad-service"); } const params = new URLSearchParams(req.nextUrl.searchParams); - - // Normalize legacy kvStore query parameter if present. if (!params.has("kv_store_id") && params.has("kvStore")) { const legacy = params.get("kvStore"); if (legacy) { params.set("kv_store_id", legacy); } params.delete("kvStore"); } const query = params.toString(); - const base = `${getInternalApiUrl()}/api/admin/config/${service}`; + const base = `${getInternalApiUrl()}/api/admin/config/${normalized}`; return query ? `${base}?${query}` : base; } +export async function proxyConfigRequest( + method: Method, + req: NextRequest, + { params }: { params: ParamsPromise }, +): Promise<NextResponse> { + const authHeader = req.headers.get("Authorization"); + if (!authHeader?.startsWith("Bearer ")) { + return UNAUTHORIZED; + } + + const { service } = await params; + let upstreamUrl: string; + try { + upstreamUrl = buildUpstreamUrl(req, service); + } catch (e) { + if (e instanceof TypeError && e.message === "bad-service") { + return NextResponse.json({ error: "Invalid service name" }, { status: 400 }); + } + throw e; + } + return forwardRequest(method, req, upstreamUrl, authHeader); +} + ``` <details><summary>Suggestion importance[1-10]: 6</summary> __ Why: The suggestion correctly points out that throwing an error for invalid input leads to a 500 response, and proposes a better approach of returning a 400 error, which improves error handling and security. </details></details></td><td align=center>Low </td></tr><tr><td rowspan=1>General</td> <td> <details><summary>Add WithBlock to dial options</summary> ___ **Add the <code>grpc.WithBlock()</code> dial option to gRPC connections to ensure connection <br>errors are surfaced promptly during startup.** [pkg/config/bootstrap/core_client.go [26-49]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-5148a65652c2ea30798812e62706a3355f3517f2a20b4561741bcc24b2bd6673R26-R49) ```diff func BuildCoreDialOptionsFromEnv(ctx context.Context, role models.ServiceRole, log logger.Logger) ([]grpc.DialOption, func(), error) { noOpCloser := func() {} provider, err := CoreSecurityProviderFromEnv(ctx, role, log) if err != nil { return nil, noOpCloser, err } if provider != nil { creds, credErr := provider.GetClientCredentials(ctx) if credErr != nil { _ = provider.Close() return nil, noOpCloser, credErr } closer := func() { if err := provider.Close(); err != nil && log != nil { log.Warn().Err(err).Msg("failed to close core security provider") } } - return []grpc.DialOption{creds}, closer, nil + return []grpc.DialOption{ + creds, + grpc.WithBlock(), + }, closer, nil } - return []grpc.DialOption{grpc.WithTransportCredentials(insecure.NewCredentials())}, noOpCloser, nil + return []grpc.DialOption{ + grpc.WithTransportCredentials(insecure.NewCredentials()), + grpc.WithBlock(), + }, noOpCloser, nil } ``` <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: Adding `grpc.WithBlock()` is a good practice to make gRPC connections fail fast on startup if the endpoint is unreachable, improving service reliability and providing clearer startup error messages. </details></details></td><td align=center>Medium </td></tr> <tr><td align="center" colspan="2"> <!-- /improve_multi --more_suggestions=true --> </td><td></td></tr></tbody></table> </details> <details><summary>✅ Suggestions up to commit c096ab3</summary> <br><table><thead><tr><td><strong>Category</strong></td><td align=left><strong>Suggestion&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </strong></td><td align=center><strong>Impact</strong></td></tr><tbody><tr><td rowspan=5>Possible issue</td> <td> <details><summary>✅ <s>Use grpc.Dial instead of NewClient</s></summary> ___ <details><summary><b>Suggestion Impact:</b></summary>The commit replaced grpc.NewClient with grpc.DialContext and adjusted provider cleanup, implementing the suggested fix for creating the gRPC client connection. code diff: ```diff - conn, err := grpc.NewClient(address, dialOpts...) - if err != nil { - if provider != nil { - _ = provider.Close() - } + conn, err := grpc.DialContext(ctx, address, dialOpts...) //nolint:staticcheck // DialContext ensures ctx cancellation adherence + if err != nil { + closeProvider() return nil, nil, err } cleanup := func() { _ = conn.Close() - if provider != nil { - _ = provider.Close() - } + closeProvider() } return proto.NewCoreServiceClient(conn), cleanup, nil ``` </details> ___ **Replace the incorrect <code>grpc.NewClient</code> call with <code>grpc.DialContext</code> to fix a <br>critical bug that prevents gRPC connections.** [cmd/tools/kv-sweep/main.go [453-483]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-d333a5d2d948c60657097bfd7fef250908a625c36d14041c634c4d5df2468fccR453-R483) ```diff func connectCore(ctx context.Context, cfg sweepConfig) (proto.CoreServiceClient, func(), error) { role := models.ServiceRole(cfg.coreRole) dialOpts, provider, err := bootstrap.BuildCoreDialOptionsFromEnv(ctx, role, logger.NewTestLogger()) if err != nil { return nil, nil, err } address := cfg.coreAddress if address == "" { address = os.Getenv("CORE_ADDRESS") } if address == "" { return nil, nil, errCoreAddressNotConfigured } - conn, err := grpc.NewClient(address, dialOpts...) + // grpc.Dial is the correct API to create a client connection + conn, err := grpc.DialContext(ctx, address, dialOpts...) if err != nil { if provider != nil { _ = provider.Close() } return nil, nil, err } cleanup := func() { _ = conn.Close() if provider != nil { _ = provider.Close() } } return proto.NewCoreServiceClient(conn), cleanup, nil } ``` <details><summary>Suggestion importance[1-10]: 10</summary> __ Why: The suggestion correctly identifies a critical bug where `grpc.NewClient` is used instead of `grpc.Dial` or `grpc.DialContext`, which would cause a runtime failure and prevent the tool from connecting to the gRPC server. </details></details></td><td align=center>High </td></tr><tr><td> <details><summary>✅ <s>Fix unsafe KV fallback key resolution</s></summary> ___ <details><summary><b>Suggestion Impact:</b></summary>The commit updated the fallback logic to strip a "domains/" prefix from the base key, re-qualify the key for the default store, and retry loading. It also guarded the logger call and adjusted format hinting. This addresses the unsafe key resolution during KV fallback as suggested. code diff: ```diff @@ -607,9 +607,21 @@ entry, metaRecord, err := s.loadConfigEntry(ctx, kvStoreID, resolvedKey) if err != nil && kvStoreID != "" { - s.logger.Warn().Err(err).Str("key", resolvedKey).Str("kv_store_id", kvStoreID).Msg("KV fetch failed; falling back to default store") + if s.logger != nil { + s.logger.Warn().Err(err).Str("key", resolvedKey).Str("kv_store_id", kvStoreID).Msg("KV fetch failed; falling back to default store") + } kvStoreID = "" - resolvedKey = baseKey + baseKeyUnqualified := baseKey + if strings.HasPrefix(strings.ToLower(baseKeyUnqualified), "domains/") { + parts := strings.SplitN(baseKeyUnqualified, "/", 3) + if len(parts) == 3 { + baseKeyUnqualified = parts[2] + } + } + resolvedKey = s.qualifyKVKey(kvStoreID, baseKeyUnqualified) + if !hasDescriptor { + formatHint = guessFormatFromKey(resolvedKey) + } entry, metaRecord, err = s.loadConfigEntry(ctx, kvStoreID, resolvedKey) } ``` </details> ___ **Fix an unsafe key resolution issue in the KV fallback logic. When falling back <br>to the default store, ensure the key is correctly unqualified to prevent reading <br>from a domain-specific path.** [pkg/core/api/auth.go [608-614]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-61eb09279471ca59a1b57a306f0ad05988a8cb39eca163b32bfb86c53f8a553eR608-R614) ```diff -func (s *APIServer) handleGetConfig(w http.ResponseWriter, r *http.Request) { - ctx := r.Context() - vars := mux.Vars(r) - service := vars["service"] +entry, metaRecord, err := s.loadConfigEntry(ctx, kvStoreID, resolvedKey) +if err != nil && kvStoreID != "" { + if s.logger != nil { + s.logger.Warn().Err(err).Str("key", resolvedKey).Str("kv_store_id", kvStoreID).Msg("KV fetch failed; falling back to default store") + } + // Fallback to default store; ensure we use an appropriate unqualified key + kvStoreID = "" + // If the original baseKey includes a domains/ prefix, strip it for default store + baseKeyUnqualified := baseKey + if strings.HasPrefix(strings.ToLower(baseKeyUnqualified), "domains/") { + parts := strings.SplitN(baseKeyUnqualified, "/", 3) + if len(parts) == 3 { + baseKeyUnqualified = parts[2] + } + } + resolvedKey = s.qualifyKVKey(kvStoreID, baseKeyUnqualified) + entry, metaRecord, err = s.loadConfigEntry(ctx, kvStoreID, resolvedKey) +} - key := r.URL.Query().Get("key") - agentID := r.URL.Query().Get("agent_id") - pollerID := r.URL.Query().Get("poller_id") - serviceType := r.URL.Query().Get("service_type") - kvStoreID := r.URL.Query().Get("kv_store_id") - if kvStoreID == "" { - kvStoreID = r.URL.Query().Get("kvStore") - } - - var desc config.ServiceDescriptor - var hasDescriptor bool - if key == "" { - var err error - desc, key, hasDescriptor, err = s.resolveServiceKey(service, serviceType, agentID, pollerID) - if err != nil { - s.writeAPIError(w, http.StatusBadRequest, err.Error()) - return - } - } else { - desc, hasDescriptor = s.lookupServiceDescriptor(service, serviceType, agentID, pollerID) - } - - rawMode := isRawConfigRequested(r) - baseKey := key - resolvedKey := s.qualifyKVKey(kvStoreID, key) - - formatHint := guessFormatFromKey(resolvedKey) - if hasDescriptor { - formatHint = desc.Format - } - - entry, metaRecord, err := s.loadConfigEntry(ctx, kvStoreID, resolvedKey) - if err != nil && kvStoreID != "" { - s.logger.Warn().Err(err).Str("key", resolvedKey).Str("kv_store_id", kvStoreID).Msg("KV fetch failed; falling back to default store") - kvStoreID = "" - resolvedKey = baseKey - entry, metaRecord, err = s.loadConfigEntry(ctx, kvStoreID, resolvedKey) - } - ``` <details><summary>Suggestion importance[1-10]: 8</summary> __ Why: This suggestion correctly identifies a critical bug where a fallback to the default KV store could use a domain-qualified key, leading to incorrect data access or failures. The fix is accurate and prevents this cross-namespace issue. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>✅ <s>Prevent nil pointer on shutdown</s></summary> ___ <details><summary><b>Suggestion Impact:</b></summary>The commit added a nil check around bootstrapResult.Close() in the ListenAndServe error handling path, aligning with the suggestion. It also added graceful shutdown handling, but the nil check directly reflects the suggestion’s intent. code diff: ```diff if err := server.ListenAndServe(); err != nil { - _ = bootstrapResult.Close() + if bootstrapResult != nil { + _ = bootstrapResult.Close() + } ``` </details> ___ **Add a nil check for <code>bootstrapResult</code> before calling <code>Close()</code> in the <br><code>server.ListenAndServe()</code> error path to prevent a panic.** [cmd/faker/main.go [474-477]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-f101715b3bafbe3843ac9df0ffa3566fd0278b8f730efa3665557ce98acf6d23R474-R477) ```diff -func main() { - var configPath string - ... - config = &Config{} - bootstrapResult := loadFakerConfig(ctx, configPath, config) +if err := server.ListenAndServe(); err != nil { if bootstrapResult != nil { - defer func() { _ = bootstrapResult.Close() }() + _ = bootstrapResult.Close() } - ... - if err := server.ListenAndServe(); err != nil { - _ = bootstrapResult.Close() - log.Fatalf("Server failed: %v", err) //nolint:gocritic // Close is explicitly called before Fatalf - } + log.Fatalf("Server failed: %v", err) //nolint:gocritic // Close is explicitly called before Fatalf } ``` <details><summary>Suggestion importance[1-10]: 8</summary> __ Why: The suggestion correctly identifies a potential nil pointer dereference on `bootstrapResult` in an error handling path, which would cause a panic during server shutdown. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>Prevent unsafe duration coercion</summary> ___ **Modify <code>formatDurationValue</code> to avoid interpreting raw numeric values as <br>nanoseconds for durations, as this can lead to misconfiguration.** [pkg/config/config.go [354-382]](https://github.com/carverauto/serviceradar/pull/1926/files#diff-a3d824da3c42420cd5cbb0a4a2c0e7b5bfddd819652788a0596d195dc6e31fa5R354-R382) ```diff func formatDurationValue(value interface{}) (string, bool) { + // Only accept string-like inputs for durations; numeric inputs are ambiguous (nanoseconds vs seconds). switch v := value.(type) { + case string: + if d, err := time.ParseDuration(v); err == nil { + return d.String(), true + } + return "", false case json.Number: - if i, err := v.Int64(); err == nil { - return time.Duration(i).String(), true + // Treat only when it includes a unit-like suffix once converted to string + s := v.String() + if d, err := time.ParseDuration(s); err == nil { + return d.String(), true } - if f, err := v.Float64(); err == nil { - return time.Duration(int64(f)).String(), true - } - case float64: - return time.Duration(int64(v)).String(), true - case float32: - return time.Duration(int64(v)).String(), true - case int64: - return time.Duration(v).String(), true - case int32: - return time.Duration(v).String(), true - case int: - return time.Duration(v).String(), true - case uint64: - return time.Duration(v).String(), true - case uint32: - return time.Duration(v).String(), true - case uint: - return time.Duration(v).String(), true + return "", false + default: + return "", false } - - return "", false } ``` <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: The suggestion correctly points out that treating numeric JSON values as nanoseconds for `time.Duration` is likely incorrect and can lead to silent configuration errors. </details></details></td><td align=center>Medium </td></tr><tr><td> <details><summary>✅ <s>Prevent security provider resource leaks</s></summary> ___ <details><summary><b>Suggestion Impact:</b></summary>The function signature was changed to return a closer func(), with no-op closer on error/insecure paths, and a proper closer that closes the provider. It also handles error paths by closing the provider and returning the no-op closer, aligning with the suggestion’s intent to prevent leaks. code diff: ```diff +// using the SPIFFE or mTLS settings provided via environment variables. It returns a +// closer function that callers should invoke to release any underlying security provider. +func BuildCoreDialOptionsFromEnv(ctx context.Context, role models.ServiceRole, log logger.Logger) ([]grpc.DialOption, func(), error) { + noOpCloser := func() {} provider, err := CoreSecurityProviderFromEnv(ctx, role, log) if err != nil { - return nil, nil, err + return nil, noOpCloser, err } ...
gitguardian[bot] commented 2025-11-10 03:30:19 +00:00 (Migrated from github.com)
Author
Owner

Imported GitHub PR comment.

Original author: @gitguardian[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1926#issuecomment-3509257034
Original created: 2025-11-10T03:30:19Z

There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
While these secrets were previously flagged, we no longer have a reference to the
specific commits where they were detected. Once a secret has been leaked into a git
repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

Imported GitHub PR comment. Original author: @gitguardian[bot] Original URL: https://github.com/carverauto/serviceradar/pull/1926#issuecomment-3509257034 Original created: 2025-11-10T03:30:19Z --- #### ️✅ There are no secrets present in this pull request anymore. If these secrets were true positive and are still valid, we highly recommend you to revoke them. While these secrets were previously flagged, we no longer have a reference to the specific commits where they were detected. Once a secret has been leaked into a git repository, you should consider it compromised, even if it was deleted immediately. Find [here](https://docs.gitguardian.com/platform/remediate/remediate-incidents) more information about risks. --- <sup><sub>🦉 [GitGuardian](https://dashboard.gitguardian.com/auth/login/?utm_medium=checkruns&amp;utm_source=github&amp;utm_campaign=cr1) detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.<br/></sup></sub>
qodo-code-review[bot] commented 2025-11-10 19:35:53 +00:00 (Migrated from github.com)
Author
Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1926#issuecomment-3513563431
Original created: 2025-11-10T19:35:53Z

Persistent suggestions updated to latest commit af6b19d

Imported GitHub PR comment. Original author: @qodo-code-review[bot] Original URL: https://github.com/carverauto/serviceradar/pull/1926#issuecomment-3513563431 Original created: 2025-11-10T19:35:53Z --- **[Persistent suggestions](https://github.com/carverauto/serviceradar/pull/1926#issuecomment-3506721680)** updated to latest commit af6b19d
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
carverauto/serviceradar!2403
No description provided.