1944 schema migration pkgdb #2414
No reviewers
Labels
No labels
1week
2weeks
Failed compliance check
IP cameras
NATS
Possible security concern
Review effort 1/5
Review effort 2/5
Review effort 3/5
Review effort 4/5
Review effort 5/5
UI
aardvark
accessibility
amd64
api
arm64
auth
back-end
bgp
blog
bug
build
checkers
ci-cd
cleanup
cnpg
codex
core
dependencies
device-management
documentation
duplicate
dusk
ebpf
enhancement
eta 1d
eta 1hr
eta 3d
eta 3hr
feature
fieldsurvey
github_actions
go
good first issue
help wanted
invalid
javascript
k8s
log-collector
mapper
mtr
needs-triage
netflow
network-sweep
observability
oracle
otel
plug-in
proton
python
question
reddit
redhat
research
rperf
rperf-checker
rust
sdk
security
serviceradar-agent
serviceradar-agent-gateway
serviceradar-web
serviceradar-web-ng
siem
snmp
sysmon
topology
ubiquiti
wasm
wontfix
zen-engine
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
carverauto/serviceradar!2414
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "refs/pull/2414/head"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Imported from GitHub pull request.
Original GitHub pull request: #1945
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/1945
Original created: 2025-11-16T05:40:14Z
Original updated: 2025-11-16T06:25:01Z
Original head: carverauto/serviceradar:1944-schema-migration-pkgdb
Original base: main
Original merged: 2025-11-16T06:24:20Z by @mfreeman451
User description
IMPORTANT: Please sign the Developer Certificate of Origin
Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include
a DCO sign-off statement indicating the DCO acceptance in one commit message. Here
is an example DCO Signed-off-by line in a commit message:
Describe your changes
Issue ticket number and link
Code checklist before requesting a review
PR Type
Enhancement
Description
Complete migration from Proton to CloudNativePG (CNPG) PostgreSQL backend across the entire codebase
Replaced Proton driver (
proton-go-driver) withpgx/pgxpoolfor PostgreSQL connectivityRefactored database layer with new CNPG-specific implementations for metrics, registry, events, discovery, and device operations
Introduced
CompatConnwrapper to emulate Proton batch API on top of PostgreSQLMigrated all metric storage and retrieval operations to use CNPG batch inserts and parameterized queries
Updated service registry with CNPG read/write paths including upsert operations for pollers, agents, and checkers
Simplified database configuration model from
ProtonSettings/ProtonDatabasetoCNPGDatabasewith PostgreSQL connection parametersAdded new CNPG migration tool (
cnpg-migrate) for schema setup and initializationRefactored event writer consumer to use CNPG backend with generic
processOTELTablehelperImplemented identifier resolution with CNPG fallback logic for Armis IDs, Netbox IDs, MACs, and IPs
Added comprehensive test infrastructure for CNPG query operations
Removed 813+ lines of Proton-specific batch processing and query logic
Deleted legacy Proton migration files and configuration
Diagram Walkthrough
File Walkthrough
18 files
metrics.go
Migrate metrics module from Proton to CNPG backendpkg/db/metrics.go
strings, proton driver)
(
cnpgInsertTimeseriesMetrics,cnpgInsertCPUMetrics, etc.)storeRperfMetricsto build metric series and call CNPGinsert instead of direct batch operations
helpers (
cnpgGetCPUMetrics,cnpgGetAllDiskMetrics, etc.)consolidating into thin wrapper functions
db.go
Replace Proton database driver with CloudNativePG/PostgreSQLpkg/db/db.go
(CloudNativePG/PostgreSQL)
setup
DBstruct to usepgxpool.Poolinstead of Proton connectionCompatConnwrapper to emulate Proton batch API on top ofCNPG
?placeholders toPostgreSQL
$nformatconnection logic
ExecuteQueryto use pgx rows and normalize CNPG valuesservice_registry_queries.go
Add CNPG read path support to service registry queriespkg/registry/service_registry_queries.go
GetPoller,GetAgent,GetChecker,ListPollers, etc.)decodeServiceMetadatahelper functiongetPollerCNPG,getAgentCNPG,listPollersCNPG, etc.)UpdateServiceStatusto call CNPG upsert methods after Protonwrites
isKnownPollerCNPGandrefreshPollerCacheCNPGfor CNPG-backedcaching
dedicated CNPG upsert functions
processor.go
Schema migration from Proton to CNPG database backendpkg/consumers/db-event-writer/processor.go
replacing
proton.Connwith*db.DBLogRow,MetricsRow,TracesRow) and replaced withmodels.OTELLogRow,models.OTELMetricRow,models.OTELTraceRowprocessOTELTablehelper function, eliminating code duplication
(
InsertEvents,InsertOTELLogs,InsertOTELMetrics,InsertOTELTraces)cnpg_metrics_reads.go
CNPG metrics read operations implementationpkg/db/cnpg_metrics_reads.go
memory, and process metrics
range support, and device-based queries
building, and metric row scanning
complexity
edge_onboarding.go
Edge onboarding schema migration to CNPG PostgreSQLpkg/db/edge_onboarding.go
proton-go-driverimports withpgx/v5arg_maxaggregations tostandard PostgreSQL
DISTINCT ONpatternListEdgeOnboardingPackagesto use parameterized querieswith proper argument indexing
scanEdgeOnboardingPackageto work with PostgreSQL row scanninginstead of Proton-specific interface
service_registry.go
Service registry CNPG write operations integrationpkg/registry/service_registry.go
pollers,agents, andcheckerstableswrite execution
upsertCNPGPoller,upsertCNPGAgent,upsertCNPGCheckerfunctionsdeleteServiceCNPGmethod to handle service deletion in CNPGbackend
cnpg_metrics.go
CNPG metrics write operations implementationpkg/db/cnpg_metrics.go
memory, and process metrics
type
pgx.Batchfor efficient bulk inserts with error handlingevents.go
CNPG CloudEvents insertion implementationpkg/db/events.go
semantics
defaulting
ON CONFLICTclause for idempotent event storagepollers.go
Migrate poller queries from Proton to CNPG delegationpkg/db/pollers.go
implementations via wrapper functions
pollerIDandlimitparameterscnpg*helperfunctions
cnpg_registry.go
New CNPG registry implementation for poller/service datapkg/db/cnpg_registry.go
operations
pollers,poller_history,service_status, andservicestablesqueries against PostgreSQL
normalization
unified_devices.go
Simplify unified devices to CNPG delegation layerpkg/db/unified_devices.go
utilities
cnpg*implementationsfile
discovery.go
Simplify discovery publishing to CNPG delegationpkg/db/discovery.go
topology events
PublishDiscoveredInterfaceandPublishTopologyDiscoveryEventto delegate to CNPG functionscnpg_unified_devices.go
New CNPG unified devices implementation with scanningpkg/db/cnpg_unified_devices.go
operations
proper null handling
last_seentimestampcnpg_sweep.go
New CNPG sweep host state operationspkg/db/cnpg_sweep.go
field mapping
normalization
type handling
main.go
New CNPG migration command-line toolcmd/tools/cnpg-migrate/main.go
port, database, credentials)
db.RunCNPGMigrationsto execute schema setupregistry.go
Add CNPG identifier resolution with fallback logicpkg/registry/registry.go
cnpgRegistryClientinterface for CNPG read operations(
resolveArmisIDsCNPG,resolveNetboxIDsCNPG,resolveMACsCNPG,resolveIPsToCanonicalCNPG)(
cnpgIdentifierChunkSize)delegate accordingly
cnpg_edge_onboarding.go
New CNPG edge onboarding package operationspkg/db/cnpg_edge_onboarding.go
operations
normalization
1 files
db.go
Update database configuration model for CNPGpkg/models/db.go
ProtonSettingsstruct with Proton-specific configurationoptions
ProtonDatabasestruct withCNPGDatabasestructdatabase, username, password)
max_conn_lifetime)
health_check_period)
1 files
service_registry_queries_test.go
Add CNPG query testing infrastructure and testspkg/registry/service_registry_queries_test.go
testing
testCNPGClientandstubRowstypes to mock databaseresponses
TestGetPollerCNPGandTestListPollersCNPGFilterstest cases101 files
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1945#issuecomment-3537916848
Original created: 2025-11-16T05:41:18Z
PR Compliance Guide 🔍
(Compliance updated until commit
github.com/carverauto/serviceradar@8663f5004e)Below is a summary of compliance checks for this PR:
SQL placeholder rewrite
Description: The CompatConn placeholder rewriter replaces all '?' with positional parameters without
parsing SQL, which can corrupt literals/comments and lead to query breakage or unexpected
behavior; while not an injection itself (pgx uses parameterization), it should parse only
parameter placeholders outside string literals.
db.go [280-379]
Referred Code
Large JSON payload risk
Description: JSON metadata is accepted and embedded after a basic Unmarshal check but other CNPG insert
args rely on upstream validation; ensure untrusted strings (e.g., names) are not
concatenated into SQL elsewhere—current file uses parameters, but validation of large
payloads may allow oversized records causing resource exhaustion.
cnpg_metrics.go [492-503]
Referred Code
🎫 #1944
operations.
Codebase context is not defined
Follow the guide to enable codebase context checks.
Generic: Meaningful Naming and Self-Documenting Code
Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting
Status: Passed
Generic: Secure Error Handling
Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.
Status: Passed
Generic: Comprehensive Audit Trails
Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.
Status:
Missing audit logs: New CNPG write paths (e.g., cnpgInsertTimeseriesMetrics via storeRperfMetrics and
StoreSysmonMetrics) add or modify critical data without any explicit audit logging of
actor, action, and outcome in the updated code segment.
Referred Code
Generic: Robust Error Handling and Edge Case Management
Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation
Status:
Error context gaps: Placeholder rewrite and batch emulation paths return generic errors (e.g.,
ErrUnsupportedInsertStatement, ErrPlaceholderMismatch) without logging or operation
context, which may hinder diagnosis if SQL parsing or CNPG exec fails.
Referred Code
Generic: Secure Logging Practices
Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.
Status:
Metadata in logs: Error branches log serialized rperf result metadata and error strings that may include
target addresses; ensure no sensitive data (e.g., IPs considered sensitive in your
context) is emitted at error/warn levels.
Referred Code
Generic: Security-First Input Validation and Data Handling
Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities
Status:
SQL parsing risks: The CompatConn SQL parsing (parseInsertStatement, rewritePlaceholders) accepts and
rewrites arbitrary SQL without visible input validation or strict allowance lists in this
diff, which could introduce risks if fed untrusted statements.
Referred Code
Compliance status legend
🟢 - Fully Compliant🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
Previous compliance checks
Compliance check up to commit e0d1313
Query placeholder rewrite
Description: The placeholder rewrite in
rewritePlaceholdersis naive and replaces every '?' characterglobally without parsing SQL, which can corrupt queries containing '?' inside string
literals or comments and cause unintended execution paths.
db.go [361-379]
Referred Code
🎫 #1944
helpers.
memory, process).
use CNPG.
Codebase context is not defined
Follow the guide to enable codebase context checks.
Generic: Meaningful Naming and Self-Documenting Code
Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting
Status: Passed
Generic: Secure Error Handling
Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.
Status: Passed
Generic: Comprehensive Audit Trails
Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.
Status:
Missing audit logs: New CNPG write paths (e.g., cnpgInsertTimeseriesMetrics and related helpers) are invoked
without visible audit logging of critical actions, but the actual helpers are not in diff,
requiring verification.
Referred Code
Generic: Robust Error Handling and Edge Case Management
Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation
Status:
Partial error context: Several CNPG proxy methods wrap/return errors but some paths (e.g., Close, cnpgConfigured
fallbacks) return generic errors without operation context; actual insert/query helpers
are external so full coverage is uncertain.
Referred Code
Generic: Secure Logging Practices
Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.
Status:
Metadata logging risk: Rperf metadata is marshaled and stored, and warnings/errors include target and poller_id;
confirm that cnpgInsert... helpers and surrounding logs do not print sensitive metadata
contents.
Referred Code
Generic: Security-First Input Validation and Data Handling
Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities
Status:
SQL placeholder rewrite: CompatConn rewrites '?' to '$n' and builds INSERT from parsed columns;
while pgx parameterization is used, parseInsertStatement relies on regex and may accept
unexpected inputs—needs verification of upstream-controlled statements.
Referred Code
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1945#issuecomment-3537921527
Original created: 2025-11-16T05:42:26Z
PR Code Suggestions ✨
Explore these optional code suggestions:
Use efficient bulk insert operation
Replace the inefficient loop of single
INSERTstatements incompatBatch.Sendwith a single, high-performance
pgx.CopyFrombulk insert operation to improvedatabase write performance.
pkg/db/db.go [265-277]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 9
__
Why: The suggestion correctly identifies a critical performance issue where the batch insert is implemented as a loop of single inserts, and it proposes the correct and idiomatic
pgx.CopyFromsolution for efficient bulk insertion.Refactor to use safe batch helper
Refactor
cnpgInsertDeviceUpdatesto use thesendCNPGhelper function for sendingthe batch, ensuring thread-safe database operations and centralizing logic.
pkg/db/cnpg_unified_devices.go [58-127]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 8
__
Why: The suggestion correctly identifies a concurrency issue by using
pgPool.SendBatchdirectly and proposes a valid refactoring to use thesendCNPGhelper. This improves thread safety and centralizes batch sending logic, enhancing code maintainability.Improve placeholder rewriting logic
Refactor
rewritePlaceholdersto use a regular expression for replacing?with$nplaceholders to improve efficiency and prevent incorrect replacements within
string literals.
pkg/db/db.go [360-379]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 4
__
Why: The suggestion correctly points out a potential edge case where
?inside string literals would be incorrectly replaced, but the proposed regex-based solution does not fix this issue and is not necessarily more performant than the existingstrings.Builderimplementation.Return error on database scan failure
In list functions, return an error immediately on a row scan failure instead of
logging and continuing, to prevent returning partial data without an error.
pkg/registry/service_registry_queries.go [961-977]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 9
__
Why: This suggestion identifies a critical bug where a scan error is suppressed, leading to the function returning a partial dataset without an error, which could cause silent data corruption or incorrect behavior downstream.
Return messages on database insertion failure
On a database insertion failure in
processEventsTable, return the originalmessages along with the error to prevent them from being incorrectly handled by
the caller, which could lead to an infinite redelivery loop.
pkg/consumers/db-event-writer/processor.go [622-644]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 9
__
Why: The suggestion correctly identifies a critical bug where returning
nilon a database error would cause message redelivery and a potential infinite processing loop, and provides the correct fix.Ensure concurrent database batch safety
Modify
sendCNPGto acquire a dedicated connection from the pool before sending abatch and release it afterward, preventing potential concurrency issues and data
corruption.
pkg/db/cnpg_metrics.go [291-297]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 9
__
Why: The suggestion correctly identifies a critical concurrency issue where sending batches directly on a connection pool (
pgPool) can lead to race conditions and data corruption. Acquiring a dedicated connection for each batch operation is the correct way to ensure thread safety.Log dual-write errors without failing
In dual-write scenarios, log errors from the secondary database write instead of
returning an error to prevent data inconsistency and failed operations.
pkg/registry/service_registry_queries.go [449-453]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 8
__
Why: The suggestion correctly identifies a potential data inconsistency issue in the dual-write migration strategy and proposes a robust solution to handle it, improving the reliability of the system during the transition.
Return correct type for nullables
Modify the
toNullableStringfunction to returnsql.NullStringinstead ofinterface{}to ensure type safety and proper handling of nullable strings by thedatabase driver.
pkg/db/cnpg_unified_devices.go [359-369]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 6
__
Why: The suggestion correctly points out that returning
sql.NullStringis more idiomatic and type-safe for database operations than returninginterface{}. While thepgxdriver might handlenilcorrectly, this change improves code clarity and robustness.