1942 build cnpg with age and timescale extensions #2413
Imported from GitHub pull request.
Original GitHub pull request: #1943
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/1943
Original created: 2025-11-15T03:04:40Z
Original updated: 2025-11-15T03:09:16Z
Original head: carverauto/serviceradar:1942-build-cnpg-with-age-and-timescale-extensions
Original base: main
Original merged: 2025-11-15T03:08:50Z by @mfreeman451
User description
IMPORTANT: Please sign the Developer Certificate of Origin
Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include
a DCO sign-off statement indicating the DCO acceptance in one commit message. Here
is an example DCO Signed-off-by line in a commit message:
Describe your changes
Issue ticket number and link
Code checklist before requesting a review
PR Type
Enhancement
Description
Build custom CNPG image with TimescaleDB and Apache AGE extensions
Add Python helpers for OCI rootfs extraction with whiteout handling
Update SPIRE deployment manifests to use custom image with extensions
Document clean rebuild procedure for CNPG cluster with new image
Diagram Walkthrough
File Walkthrough
11 files
Wrapper script for PostgreSQL config path rewriting
Rewrite pg_config paths for custom root directory
Extract container rootfs with OCI whiteout handling
Export OCI image layout to rootfs tarball
Overlay Debian packages into extracted rootfs
Bazel repository alias rule for Bzlmod compatibility
Refactor push targets to dict format with CNPG image
Add CNPG image build with TimescaleDB and AGE layers
Configure CNPG cluster with custom image and extensions
Update SPIRE server to use renamed cnpg cluster
Configure demo CNPG cluster with custom image and extensions
4 files
Build package placeholder for repository alias
Add CNPG image configuration and extension parameters
Rename CNPG cluster manifest reference
Update SPIRE server database connection to cnpg cluster
3 files
Add CNPG PostgreSQL 16.6 and extension source dependencies
Add crane binary filegroup for OCI export
Add TimescaleDB source repository submodule
8 files
Document CNPG rebuild with TimescaleDB and AGE
Add SPIRE CNPG cluster rebuild runbook
Update CNPG cluster name reference in documentation
Update CNPG cluster name and description
Expand project context with comprehensive architecture details
OpenSpec proposal for CNPG with extensions
OpenSpec requirements for CNPG image and deployment
OpenSpec task checklist for CNPG implementation
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1943#issuecomment-3535480600
Original created: 2025-11-15T03:05:27Z
PR Compliance Guide 🔍
(Compliance updated until commit github.com/carverauto/serviceradar@c848a4fda2)
Below is a summary of compliance checks for this PR:
Archive extraction traversal
Description: The script creates files, symlinks, and hardlinks from tar members into a destination
directory without enforcing a chroot/jail, which could allow path confusion within the
work directory if a malicious tar is processed (e.g., symlink traversal combined with
later file writes).
extract_rootfs.py [121-154]
Referred Code
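One way to confine each tar member to the destination directory is sketched below. This is a minimal illustration, not the code in extract_rootfs.py; the function name `confined_path` is hypothetical.

```python
import os


def confined_path(dest_root: str, member_name: str) -> str:
    """Return the absolute target for a tar member, or raise if it escapes dest_root."""
    root = os.path.realpath(dest_root)
    # Strip leading slashes so absolute member names are treated as relative.
    relative = member_name.lstrip("/")
    # realpath collapses ".." and resolves symlinks that already exist on disk.
    target = os.path.realpath(os.path.join(root, relative))
    if os.path.commonpath([root, target]) != root:
        raise ValueError(f"tar member {member_name!r} escapes {dest_root!r}")
    return target
```

Because realpath resolves links against the host filesystem, an absolute symlink inside the rootfs would be rejected by this check. A real helper would likely need to resolve such links relative to the rootfs instead.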
Symlink write redirection
Description: Extracts tar payloads from .deb archives into a rootfs directory and honors
symlinks/hardlinks without sandboxing; while it normalizes paths to prevent ../ escapes,
writing through attacker-controlled symlinks inside the destination could still redirect
writes within the rootfs tree.
overlay_deb_packages.py [88-120]
Referred Code
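The redirection risk described above comes from directory components that are themselves symlinks. A minimal sketch of a check that inspects each parent component without following it is shown here; `has_symlinked_parent` is a hypothetical name and is not taken from overlay_deb_packages.py.

```python
import os


def has_symlinked_parent(dest_root: str, relative_path: str) -> bool:
    """Return True if any directory component above the final one is a symlink."""
    current = dest_root
    parts = [p for p in relative_path.split("/") if p not in ("", ".")]
    for part in parts[:-1]:
        current = os.path.join(current, part)
        # islink uses lstat, so the link itself is inspected, not its target.
        if os.path.islink(current):
            return True
    return False
```

Merged-/usr Debian images contain legitimate top-level links (for example /lib pointing to usr/lib), so a real helper would probably resolve such links inside the rootfs rather than reject them outright.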
🎫 #1942
TimescaleDB and Apache AGE extensions.
both extensions via shared_preload_libraries and init SQL.
(including whiteout handling and local crane fallback).
verifying extension availability.
appropriate tags.
EXTENSION succeeds) on a real cluster.
succeed with provided imagePullSecrets.
production-like load.
Codebase context is not defined
Follow the guide to enable codebase context checks.
Generic: Meaningful Naming and Self-Documenting Code
Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting
Status: Passed
Generic: Secure Logging Practices
Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.
Status: Passed
Generic: Robust Error Handling and Edge Case Management
Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation
Status:
Weak error handling: Several code paths raise generic ValueError or continue silently (e.g., missing hardlink
sources) without contextual logging or handling for edge cases, reducing diagnosability
when overlays fail.
Referred Code
Generic: Comprehensive Audit Trails
Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.
Status:
Missing audit logs: New helper scripts perform filesystem extraction and OCI layer processing without emitting
structured logs of critical actions or outcomes, making it unclear who did what and when
during rootfs/export operations.
Referred Code
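As a rough illustration of what structured logging of extraction steps could look like, the sketch below emits one JSON line per action. The helper name `audit_event` and the field names are assumptions, not part of the PR.

```python
import json
import sys
import time


def audit_event(action: str, outcome: str, **fields) -> str:
    """Serialize one audit record as a JSON line and write it to stderr."""
    record = {"ts": time.time(), "action": action, "outcome": outcome, **fields}
    line = json.dumps(record, sort_keys=True)
    print(line, file=sys.stderr)
    return line
```

A call such as `audit_event("extract_layer", "ok", entries=3)` would record which step ran and how it ended; the caller decides which extra fields are safe to log.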
Generic: Secure Error Handling
Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.
Status:
Verbose traceback: On failure the script prints full stack traces and exception messages to stderr which may
expose internal paths or details if surfaced to users of automated build systems.
Referred Code
Generic: Security-First Input Validation and Data Handling
Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities
Status:
Path sanitization: While normalization exists, the extraction processes write files and symlinks from
external archives without additional validation or sandboxing beyond simple path checks,
which may warrant further review for traversal and link handling safety.
Referred Code
Compliance status legend
🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
Previous compliance checks
Compliance check up to commit c848a4f
Path traversal/whiteout delete
Description: Whiteout handling deletes arbitrary filesystem paths during extraction which, if pointed
at an unintended directory (via crafted tar entries), could remove files within the
extraction root; ensure all paths are strictly confined to the destination and inputs are
trusted.
extract_rootfs.py [68-86]
Referred Code
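In the OCI image layout, a whiteout is an entry whose basename starts with `.wh.`, and the special basename `.wh..wh..opq` marks a directory as opaque. The sketch below classifies a member name and rejects names that climb out of the root before any deletion is attempted. It is an assumption-laden illustration; `classify_whiteout` is a hypothetical name, not the helper in extract_rootfs.py.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."
OPAQUE_MARKER = ".wh..wh..opq"


def classify_whiteout(member_name: str):
    """Return (kind, path); kind is 'opaque', 'whiteout', or None for a normal entry."""
    clean = posixpath.normpath(member_name.lstrip("/"))
    if clean == ".." or clean.startswith("../"):
        raise ValueError(f"member {member_name!r} points outside the rootfs")
    parent, base = posixpath.split(clean)
    if base == OPAQUE_MARKER:
        # Lower-layer contents of `parent` should be hidden.
        return "opaque", parent
    if base.startswith(WHITEOUT_PREFIX):
        # The path that the whiteout removes from lower layers.
        return "whiteout", posixpath.join(parent, base[len(WHITEOUT_PREFIX):])
    return None, clean
```

The returned path is still relative; the caller would join it to the extraction root and confirm confinement before removing anything.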
Untrusted archive extraction
Description: The .deb ar parsing trusts entry headers and extracts data.tar.* contents, which if
untrusted could write arbitrary files within the rootfs; ensure packages are verified
(e.g., checksum/signature) and paths remain confined to the destination.
overlay_deb_packages.py [120-149]
Referred Code
🎫 #1942
Apache AGE extensions.
enabling the extensions via shared_preload_libraries and init SQL.
whiteouts correctly and working without remote crane on executors.
image and verifying extensions.
using the produced image.
that pull credentials work in target clusters.
operator accepts the parameters and pods become Ready.
Codebase context is not defined
Follow the guide to enable codebase context checks.
Generic: Meaningful Naming and Self-Documenting Code
Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting
Status: Passed
Generic: Secure Logging Practices
Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.
Status: Passed
Generic: Comprehensive Audit Trails
Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.
Status:
No audit logs: The new scripts and Helm changes perform critical build/deploy actions (rootfs extraction,
package overlay, image selection) without emitting any structured audit logs identifying
user, action, and outcome.
Referred Code
Generic: Robust Error Handling and Edge Case Management
Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation
Status:
Limited edge handling: Several helpers raise generic ValueError without contextual logging and do not validate
all edge cases (e.g., ar header parsing, missing data.tar stream) which may hinder
actionable diagnostics in CI.
Referred Code
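The edge cases named above (ar header parsing, a missing data.tar stream) can be made diagnosable by raising errors that carry the offset and member name. The following is a sketch under the standard ar layout (8-byte global magic, 60-byte member headers, even-length padding); the function names are hypothetical and do not mirror overlay_deb_packages.py.

```python
import io

AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60


def iter_ar_members(stream):
    """Yield (name, data) for each member of an ar archive, validating headers."""
    if stream.read(len(AR_MAGIC)) != AR_MAGIC:
        raise ValueError("not an ar archive: bad global magic")
    offset = len(AR_MAGIC)
    while True:
        header = stream.read(HEADER_SIZE)
        if not header:
            return
        if len(header) != HEADER_SIZE or header[58:60] != b"`\n":
            raise ValueError(f"corrupt ar header at offset {offset}")
        name = header[0:16].decode("ascii").strip().rstrip("/")
        try:
            size = int(header[48:58].decode("ascii").strip())
        except ValueError as exc:
            raise ValueError(f"bad size field for {name!r} at offset {offset}") from exc
        data = stream.read(size)
        if len(data) != size:
            raise ValueError(f"truncated member {name!r}: expected {size} bytes")
        if size % 2:
            stream.read(1)  # members are padded to an even length
        offset += HEADER_SIZE + size + (size % 2)
        yield name, data


def find_data_tar(stream):
    """Return (name, data) of the data.tar.* member, or raise with a clear message."""
    for name, data in iter_ar_members(stream):
        if name.startswith("data.tar"):
            return name, data
    raise ValueError("no data.tar.* member found in .deb archive")
```

Reading each member fully into memory keeps the sketch short; a production helper would more likely stream large payloads.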
Generic: Secure Error Handling
Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.
Status:
Verbose traceback: On failure the script prints full tracebacks to stderr which may expose internal paths;
consider gating detailed traces behind a debug flag while keeping user-facing messages
generic.
Referred Code
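Gating the traceback behind a debug flag could look like the sketch below. The wrapper name `run_cli` and the environment variable `ROOTFS_DEBUG` are assumptions made for illustration; neither appears in the PR.

```python
import os
import sys
import traceback


def run_cli(main, debug_env: str = "ROOTFS_DEBUG") -> int:
    """Run main(); on failure print a generic message, plus a traceback in debug mode."""
    try:
        main()
        return 0
    except Exception as exc:
        # Only the exception type is shown by default, so paths stay out of CI logs.
        print(f"error: operation failed ({type(exc).__name__}); "
              f"set {debug_env}=1 for details", file=sys.stderr)
        if os.environ.get(debug_env) == "1":
            traceback.print_exc()
        return 1
```

A script would then end with `sys.exit(run_cli(main))`, keeping the non-zero exit code that build systems rely on.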
Generic: Security-First Input Validation and Data Handling
Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities
Status:
Plaintext secret use: The template constructs a plaintext Postgres connection and relies on inline
username/password values, lacking enforcement of TLS or secret referencing best practices
for secure data handling.
Referred Code
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1943#issuecomment-3535481770
Original created: 2025-11-15T03:06:39Z
PR Code Suggestions ✨
Explore these optional code suggestions:
Simplify the custom image build
Refactor the custom image build process by replacing the complex Bazel
genrules and shell scripts with a more maintainable multi-stage Dockerfile. This
encapsulates the build logic, improves readability, and utilizes Docker's
caching.
Examples:
docker/images/BUILD.bazel [1415-1541]
Solution Walkthrough:
Before:
After:
Suggestion importance[1-10]: 9
Why: The suggestion correctly identifies that using complex genrules with multiple helper scripts to build the custom image is brittle and hard to maintain, and proposes a much more robust and standard multi-stage Dockerfile approach.
Improve hard link creation reliability
Refactor the hard link creation logic to use a retry loop, ensuring links are
created even if their targets appear later in the tar archive.
docker/images/extract_rootfs.py [140-150]
Suggestion importance[1-10]: 7
Why: The suggestion correctly identifies that hard links may fail to be created if their targets appear later in the archive, and the proposed retry logic robustly fixes this potential extraction bug.
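The retry idea can be sketched as follows: hard links are collected during extraction and created afterwards, looping until a pass makes no progress. This is an illustrative sketch only; `create_hardlinks` is a hypothetical helper, not the code at extract_rootfs.py [140-150].

```python
import os


def create_hardlinks(pending):
    """Create hard links from (source, link_path) pairs, retrying until no progress.

    Returns the pairs whose source never appeared.
    """
    remaining = list(pending)
    while remaining:
        unresolved = []
        for source, link_path in remaining:
            if os.path.exists(source):
                parent = os.path.dirname(link_path)
                if parent:
                    os.makedirs(parent, exist_ok=True)
                if os.path.lexists(link_path):
                    os.unlink(link_path)
                os.link(source, link_path)
            else:
                unresolved.append((source, link_path))
        if len(unresolved) == len(remaining):
            # A full pass created nothing new, so these sources are missing.
            return unresolved
        remaining = unresolved
    return []
```

Returning the unresolved pairs lets the caller log or fail on missing sources instead of skipping them silently.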
Fix broken symlink handling during cleanup
In _safe_rmtree, modify the _onerror handler to check for and unlink broken symbolic links when a FileNotFoundError occurs to ensure proper cleanup.
docker/images/extract_rootfs.py [54-65]
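A simpler alternative to patching the error handler is to check for links before calling rmtree. The sketch below takes that different approach and does not reproduce the _onerror handler in _safe_rmtree; `remove_path` is a hypothetical name.

```python
import os
import shutil


def remove_path(path: str) -> None:
    """Remove a file, directory tree, or symlink (including a broken one) if present."""
    # lexists is True for a dangling symlink, whereas exists would be False.
    if not os.path.lexists(path):
        return
    if os.path.islink(path) or not os.path.isdir(path):
        os.unlink(path)
    else:
        shutil.rmtree(path)
```

Unlinking a symlink removes only the link itself, so the target (if any) is left untouched.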
Suggestion importance[1-10]: 6
Why: The suggestion correctly identifies a potential FileNotFoundError when _safe_rmtree encounters a broken symbolic link and provides a robust fix, improving the script's reliability.
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1943#issuecomment-3535482211
Original created: 2025-11-15T03:06:44Z
CI Feedback 🧐
A test triggered by this PR failed. Here is an AI-generated analysis of the failure:
Action: test-go
Failed stage: Run Go Tests [❌]
Failed test name: TestNetworkSweeper_WatchConfigWithInitialSignal/WatchConfig_with_initial_KV_config
Failure summary:
The GitHub Action failed because a Go test in the pkg/sweeper package failed:
- Test TestNetworkSweeper_WatchConfigWithInitialSignal/WatchConfig_with_initial_KV_config timed out waiting for a config ready signal.
- File and line: sweeper_test.go:282 reported "Timeout waiting for config ready signal".
- Package result: FAIL github.com/carverauto/serviceradar/pkg/sweeper 0.642s, causing the overall process to exit with code 1.
Relevant error logs: