Release staging pipeline (OpenSpec): manual promotion + pinned demo #2551

Merged
mfreeman451 merged 18 commits from refs/pull/2551/head into main 2025-12-12 21:40:41 +00:00
mfreeman451 commented 2025-12-12 05:00:27 +00:00 (Migrated from github.com)
Owner

Imported from GitHub pull request.

Original GitHub pull request: #2112
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/2112
Original created: 2025-12-12T05:00:27Z
Original updated: 2025-12-12T21:40:45Z
Original head: carverauto/serviceradar:chore/fix_releases_oci_image_versioning
Original base: main
Original merged: 2025-12-12T21:40:41Z by @mfreeman451

What

  • Align release-staging proposal/specs with the current simplified approach (no GitOps Promoter).
  • Staging tracks the latest published Helm chart and uses images tagged latest; demo pins chart and image versions for stability.
  • Reduce e2e flakiness by waiting for API readiness instead of fixed sleep.
  • Switch ArgoCD demo apps to use chart valueFiles (less inline YAML).
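
The readiness-wait idea above can be sketched as a small polling helper. This is a minimal illustration, not the logic in `.github/workflows/e2e-tests.yml`: the function names, the timeout values, and the health URL in the usage note are all assumptions.

```python
import time
import urllib.request
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `probe` until it returns True or `timeout_s` elapses.

    Returns True as soon as the probe succeeds, False on timeout.
    Probe exceptions are treated as "not ready yet".
    """
    deadline = clock() + timeout_s
    while True:
        try:
            if probe():
                return True
        except Exception:
            pass  # connection refused, DNS not ready, etc.
        if clock() >= deadline:
            return False
        sleep(interval_s)


def http_probe(url: str, request_timeout_s: float = 5.0) -> Callable[[], bool]:
    """Build a probe that succeeds when `url` answers with HTTP 200."""
    def _probe() -> bool:
        with urllib.request.urlopen(url, timeout=request_timeout_s) as resp:
            return resp.status == 200
    return _probe
```

Usage would look like `wait_until_ready(http_probe("https://staging.example.invalid/healthz"))`, where the URL is a placeholder. Unlike a fixed sleep, the loop returns as soon as the API answers and fails explicitly when it never does.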

Key files

  • k8s/argocd/applications/demo-staging.yaml
  • k8s/argocd/applications/demo-prod.yaml
  • .github/workflows/e2e-tests.yml
  • helm/serviceradar/values-demo.yaml
  • helm/serviceradar/values-demo-staging.yaml
  • openspec/changes/add-release-staging-pipeline/*

Next steps (pipeline test)

After merge, run a pre-release tag (example):

  • scripts/cut-release.sh --version 1.0.76-pre1 --push
    This should publish images + chart, then trigger E2E Tests against demo-staging.

Promotion to demo is a follow-up PR that bumps targetRevision + global.imageTag in k8s/argocd/applications/demo-prod.yaml.
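
As a rough sketch of what such a promotion PR changes, the helper below bumps both fields on an already-parsed Application manifest. It assumes `global.imageTag` is set through the Application's `spec.source.helm.parameters` list; the actual `demo-prod.yaml` may set it differently (for example through inline values), and the version numbers in the usage note are made up.

```python
import copy


def promote(app: dict, chart_version: str, image_tag: str) -> dict:
    """Return a copy of an ArgoCD Application dict pinned to a new release.

    Sets spec.source.targetRevision (chart version) and the
    global.imageTag Helm parameter (image version).
    """
    out = copy.deepcopy(app)  # leave the caller's manifest untouched
    source = out["spec"]["source"]
    source["targetRevision"] = chart_version

    # Assumption: the tag override lives in helm.parameters.
    params = source.setdefault("helm", {}).setdefault("parameters", [])
    for param in params:
        if param.get("name") == "global.imageTag":
            param["value"] = image_tag
            break
    else:
        params.append({"name": "global.imageTag", "value": image_tag})
    return out
```

For example, `promote(app, "1.0.76", "v1.0.76")` moves both pins in one step, which keeps the chart and image versions from drifting apart in the promotion PR.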

qodo-code-review[bot] commented 2025-12-12 05:00:53 +00:00 (Migrated from github.com)
Author
Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2112#issuecomment-3644915500
Original created: 2025-12-12T05:00:53Z

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢 No security concerns identified: No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
⚪ 🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
⚪ Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No runtime logs: The PR adds documentation/specs only and introduces no executable code that could
implement audit logging of critical actions, so compliance cannot be verified from this
diff.

Referred Code
## Context

ServiceRadar releases currently follow a manual process:
1. Developer runs `scripts/cut-release.sh --version X.Y.Z`
2. Push triggers `.github/workflows/release.yml`
3. Bazel builds and pushes OCI images tagged with `sha-<commit>` and `latest`
4. Packages (deb/rpm) are uploaded to GitHub Releases
5. Manual Helm update to demo namespace (often forgotten or delayed)

This creates gaps where released images may not have been validated in a staging environment, leading to potential production issues.

### Current State Analysis

**OCI Images (`docker/images/push_targets.bzl:48-63`):**
- Tags generated: `sha-<commit>`, `latest`, and image-specific static tags
- Missing: Semantic version tags (e.g., `v1.0.70`)

**Helm Chart (`helm/serviceradar/`):**
- Lives only in Git repository
- `Chart.yaml` has static version `0.1.0`, appVersion `1.0.0`
- `values.yaml` uses hardcoded SHA tags: `sha-0933fd20c98038af196c35ea9f5cc95e3dc38909`


 ... (clipped 197 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status:
No code added: Only markdown documentation was added and no identifiers or implementations are present to
assess naming conventions.

Referred Code
# Change: Add Release Staging Pipeline with GitOps Promotion

## Why

The current release process requires significant manual intervention: building images, manually deploying to demo, testing, then cutting the release. This creates risk of releasing untested changes and wastes time on repetitive tasks. Additionally, Helm charts are only stored in the repository and not published to a chart repository, limiting external consumption and ArgoCD best practices.

## What Changes

### 1. OCI Image Versioning
- **Ensure release workflow tags images with semantic version** (e.g., `v1.0.70`) in addition to `sha-<commit>` and `latest`
- Verify `latest` tag is applied correctly during releases
- Update `docker/images/push_targets.bzl` and `container_tags.bzl` to support version-based tags

### 2. Helm Chart Repository
- **Set up GitHub Pages-hosted Helm chart repository** following https://helm.sh/docs/topics/chart_repository/
- Add workflow to package and publish `helm/serviceradar` chart on release
- Update `Chart.yaml` version and `appVersion` automatically during releases
- Configure ArgoCD to consume charts from the Helm repository instead of raw Git paths

### 3. Demo-Staging Environment
- **Recreate `demo-staging` namespace** for pre-release validation


 ... (clipped 29 lines)


Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
No error logic: The changes are specifications in markdown with no executable error handling paths to
validate robustness or edge case management.

Referred Code
## ADDED Requirements

### Requirement: OCI Image Version Tagging
The release workflow SHALL tag all OCI images with the semantic version (e.g., `v1.0.70`) in addition to `sha-<commit>` and `latest` tags.

#### Scenario: Release triggers version-tagged image push
- **WHEN** a release tag `v1.0.70` is pushed to the repository
- **THEN** all images in `GHCR_PUSH_TARGETS` SHALL be pushed with tags:
  - `v1.0.70` (semantic version)
  - `sha-<commit>` (immutable digest-based)
  - `latest` (mutable latest reference)

#### Scenario: Version tag matches VERSION file
- **WHEN** the release workflow runs
- **THEN** the image version tag SHALL match the content of the VERSION file

### Requirement: Helm Chart Repository
The project SHALL maintain a Helm chart repository hosted on GitHub Pages at `https://carverauto.github.io/serviceradar`.

#### Scenario: Chart published on release
- **WHEN** a new release is created


 ... (clipped 89 lines)
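
The two tagging scenarios quoted above can be illustrated with a short sketch. It is not the Bazel implementation in `push_targets.bzl` or `container_tags.bzl`; the function names are hypothetical, and it assumes the VERSION file holds a bare version such as `1.0.70` (an optional leading `v` is tolerated).

```python
import re

# Loose semantic-version check: MAJOR.MINOR.PATCH with an optional pre-release part.
_SEMVER = re.compile(r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?$")


def _normalize(version: str) -> str:
    """Strip whitespace and an optional leading 'v', then validate."""
    bare = version.strip().removeprefix("v")
    if not _SEMVER.match(bare):
        raise ValueError(f"not a semantic version: {version!r}")
    return bare


def release_tags(version: str, commit: str) -> list[str]:
    """Tags the spec requires on every pushed image for one release."""
    if not commit:
        raise ValueError("commit SHA is required")
    return [f"v{_normalize(version)}", f"sha-{commit}", "latest"]


def tag_matches_version_file(git_tag: str, version_file_text: str) -> bool:
    """True when the pushed release tag agrees with the VERSION file content."""
    return _normalize(git_tag) == _normalize(version_file_text)
```

A release workflow could call `tag_matches_version_file` as an early guard and fail the run before any image is pushed when the tag and the VERSION file disagree.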


Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status:
No user errors: No user-facing error messages or handling were introduced in this PR diff, so security of
error outputs cannot be evaluated.

Referred Code
## Context

ServiceRadar releases currently follow a manual process:
1. Developer runs `scripts/cut-release.sh --version X.Y.Z`
2. Push triggers `.github/workflows/release.yml`
3. Bazel builds and pushes OCI images tagged with `sha-<commit>` and `latest`
4. Packages (deb/rpm) are uploaded to GitHub Releases
5. Manual Helm update to demo namespace (often forgotten or delayed)

This creates gaps where released images may not have been validated in a staging environment, leading to potential production issues.

### Current State Analysis

**OCI Images (`docker/images/push_targets.bzl:48-63`):**
- Tags generated: `sha-<commit>`, `latest`, and image-specific static tags
- Missing: Semantic version tags (e.g., `v1.0.70`)

**Helm Chart (`helm/serviceradar/`):**
- Lives only in Git repository
- `Chart.yaml` has static version `0.1.0`, appVersion `1.0.0`
- `values.yaml` uses hardcoded SHA tags: `sha-0933fd20c98038af196c35ea9f5cc95e3dc38909`


 ... (clipped 197 lines)


Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
Logging not present: The PR adds planning docs and tasks only; no logging statements or configurations are
introduced to assess secure logging practices.

Referred Code
## 1. OCI Image Version Tagging

- [ ] 1.1 Update `docker/images/push_targets.bzl` to accept version tag from release workflow
- [ ] 1.2 Modify `container_tags.bzl` to include semantic version (e.g., `v1.0.70`) when provided
- [ ] 1.3 Verify `latest` tag is applied to all images during release builds
- [ ] 1.4 Update `.github/workflows/release.yml` to pass `--tag v$VERSION` to image push step
- [ ] 1.5 Test image tagging with a dry-run release

## 2. Helm Chart Repository Setup

- [ ] 2.1 Create `.github/workflows/helm-release.yml` workflow for chart publishing
- [ ] 2.2 Configure GitHub Pages for chart hosting (branch: `gh-pages`, path: `/charts`)
- [ ] 2.3 Add `helm package` and `helm repo index` steps to workflow
- [ ] 2.4 Update `helm/serviceradar/Chart.yaml` with versioning scheme matching releases
- [ ] 2.5 Create script to update Chart.yaml version from VERSION file
- [ ] 2.6 Document Helm repository URL and usage in README

## 3. Helm Values Modernization

- [ ] 3.1 Update `helm/serviceradar/values.yaml` to use `image.tag` defaulting to chart appVersion
- [ ] 3.2 Add `global.imageTag` override for consistent tag across all services


 ... (clipped 50 lines)
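
Tasks 3.1 and 3.2 above describe a tag fallback chain, which the sketch below makes concrete. The precedence (global override, then per-service `image.tag`, then the chart `appVersion`) and the `<service>.image.tag` values layout are assumptions for illustration; the real chart templates may resolve tags differently.

```python
def resolve_image_tag(values: dict, service: str, app_version: str) -> str:
    """Pick the image tag for one service from Helm-style values.

    Assumed precedence:
      1. global.imageTag       (one tag for every service)
      2. <service>.image.tag   (per-service override)
      3. chart appVersion      (default)
    """
    global_tag = (values.get("global") or {}).get("imageTag")
    if global_tag:
        return global_tag
    service_tag = ((values.get(service) or {}).get("image") or {}).get("tag")
    return service_tag or app_version
```

With this ordering, a deployment that sets only `global.imageTag` pins every service to the same release, while a values file without overrides follows the chart's `appVersion`.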


Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
No input handling: This PR contains only markdown specifications and does not add executable input handling
to assess validation, sanitization, or secret management beyond high-level guidance.

Referred Code
## ADDED Requirements

### Requirement: OCI Image Version Tagging
The release workflow SHALL tag all OCI images with the semantic version (e.g., `v1.0.70`) in addition to `sha-<commit>` and `latest` tags.

#### Scenario: Release triggers version-tagged image push
- **WHEN** a release tag `v1.0.70` is pushed to the repository
- **THEN** all images in `GHCR_PUSH_TARGETS` SHALL be pushed with tags:
  - `v1.0.70` (semantic version)
  - `sha-<commit>` (immutable digest-based)
  - `latest` (mutable latest reference)

#### Scenario: Version tag matches VERSION file
- **WHEN** the release workflow runs
- **THEN** the image version tag SHALL match the content of the VERSION file

### Requirement: Helm Chart Repository
The project SHALL maintain a Helm chart repository hosted on GitHub Pages at `https://carverauto.github.io/serviceradar`.

#### Scenario: Chart published on release
- **WHEN** a new release is created


 ... (clipped 89 lines)


Compliance status legend
🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
qodo-code-review[bot] commented 2025-12-12 05:01:53 +00:00 (Migrated from github.com)
Author
Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2112#issuecomment-3644917740
Original created: 2025-12-12T05:01:53Z

PR Code Suggestions

Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Test hotfixes in staging before release

Modify the hotfix bypass scenario to include an expedited, automated test run in
the staging environment before release, instead of skipping testing entirely.

openspec/changes/add-release-staging-pipeline/specs/release-automation/spec.md [94-97]

 #### Scenario: Hotfix bypass
-- **WHEN** `scripts/cut-release.sh --version X.Y.Z --skip-staging` is executed
-- **THEN** the staging deployment and promotion steps SHALL be skipped
-- **AND** the release SHALL proceed directly to GitHub release creation
+- **WHEN** `scripts/cut-release.sh --version X.Y.Z --hotfix` is executed
+- **THEN** the release SHALL be deployed to the `demo-staging` environment
+- **AND** an expedited test suite SHALL run against staging
+- **AND** upon test success, promotion to `demo` SHALL be automatic, bypassing any manual approval gates
+- **AND** the release SHALL proceed to GitHub release creation
Suggestion importance[1-10]: 9


Why: The suggestion correctly identifies a significant risk in the proposed hotfix process, which completely bypasses staging tests, and proposes a safer, more robust alternative that still allows for an expedited release.

High
High-level
Consider a simpler Git-based promotion flow
Suggestion Impact:The design was changed to "Manual PR-Based Promotion (Simplified)" and "Environment Promotion (SIMPLIFIED)," explicitly dropping the GitOps Promoter/Source Hydrator and adopting promotion via manual PRs (with possible future GitHub Action automation). This aligns with the suggested simpler Git-based promotion approach.

code diff:

+### Decision 2: Manual PR-Based Promotion (Simplified)
+**Rationale:** After evaluating GitOps Promoter with Source Hydrator, we chose a simpler approach. The Source Hydrator's credential requirements for write operations added complexity without proportional value for our use case.
+
+**Approach:**
+- Demo-staging deploys automatically via ArgoCD when chart version is updated
+- Promotion to demo-prod happens via manual PR (or future GitHub Action)
+- ArgoCD syncs both environments from their respective configurations
+
+**Alternatives Evaluated:**
+- **GitOps Promoter + Source Hydrator:** Too complex for credential setup with GitHub Apps. Source Hydrator requires write access to push hydrated manifests.
 - Argo Rollouts: More focused on canary/blue-green within a single app
 - Flux: Would require migrating from ArgoCD
-- Jenkins/custom: More maintenance overhead
-
-**Key Resources:**
-```yaml
-apiVersion: promoter.argoproj.io/v1alpha1
-kind: PromotionStrategy
-metadata:
-  name: serviceradar-release
-spec:
-  environments:
-    - name: demo-staging
-      source:
-        type: registry
-        registryRef: ghcr.io/carverauto/serviceradar-core
-    - name: demo
-      source:
-        type: pull-request
-      gates:
-        - name: e2e-tests
-          type: commit-status
-```
+
+**Future Enhancement:**
+Could add a GitHub Action that creates a promotion PR automatically when staging e2e tests pass.
 
 ### Decision 3: Helm-Based ArgoCD Applications
-**Rationale:** ArgoCD Helm support is mature, allows value overrides per environment, and integrates well with chart repositories.
+**Rationale:** ArgoCD Helm support is mature, allows value overrides per environment, and integrates well with OCI registries.
 
 **Implementation:**
 ```yaml
 # k8s/argocd/applications/demo-staging.yaml
 spec:
   source:
-    repoURL: https://carverauto.github.io/serviceradar
+    repoURL: ghcr.io/carverauto/charts
     chart: serviceradar
-    targetRevision: "*"  # Latest chart version
+    targetRevision: "1.0.70"  # or "*" for latest
     helm:
       valueFiles:
         - values-demo-staging.yaml

+Note: ArgoCD requires OCI URLs without the oci:// prefix for Helm sources.

Decision 4: Version Tag Strategy

Rationale: Use v<VERSION> tag (e.g., v1.0.70) as the primary release tag, with latest for dev convenience and sha-<commit> for immutability.
@@ -115,6 +107,20 @@

  1. v1.0.70 - Primary release tag
  2. sha-<commit> - Immutable reference
  3. latest - Development convenience (staging only)

+Implementation:
+- Modified docker/images/container_tags.bzl to add version_tags attribute to immutable_push_tags rule
+- Modified docker/images/push_targets.bzl to generate version tag via expand_template using {{STABLE_VERSION}}
+- Version is read from VERSION file via scripts/workspace_status.sh when --stamp is used
+- The "vdev" tag is excluded when building without proper workspace status (local dev builds)
+
+Generated Tags (example):
+sha-486cbfcbc1027b4255e8287df1c7ced48402b1c4 # commit SHA
+v1.0.70 # semantic version
+latest # static tag
+sha-c30cd42eb275 # short digest

Decision 5: E2E Test Credentials via GitHub Environments

Rationale: Cannot expose kubectl/kubeadm API credentials to GitHub Actions. Instead, store application-level credentials in GitHub Secrets using deployment environments for isolation.
@@ -168,40 +174,133 @@

Risk: Staging environment divergence

Mitigation: Use same Helm chart with minimal value overrides. Document differences clearly.

-### Trade-off: GitHub Pages vs dedicated chart registry
-Accepted: GitHub Pages has rate limits but sufficient for internal use. Can migrate later if needed.
+### Trade-off: OCI registry visibility
+Accepted: OCI charts in ghcr.io are less discoverable than GitHub Pages but provide better integration with existing GHCR authentication and image workflows.
+
+### Decision 6: Helm Chart and Image Versioning Strategy
+Rationale: Keep chart version in sync with app version (standard practice for charts in the same repo as the app). Image tags default to latest in values.yaml; deployments override via global.imageTag.
+
+Release updates required:
+1. VERSION file - app version
+2. helm/serviceradar/Chart.yaml - chart version + appVersion (via cut-release.sh)
+
+Not updated on release:
+- values.yaml - keeps appTag: "latest" as default
+- ArgoCD Applications override with global.imageTag: "v1.0.71" for specific versions
+
+Benefits:
+- Minimal files to update during release
+- Flexible: local dev uses latest, deployments pin to specific versions
+- Standard Helm versioning practice

 ## Migration Plan

-### Phase 1: Helm Chart Repository (Week 1)
-1. Create gh-pages branch with index.yaml
-2. Add helm-release workflow
-3. Publish initial chart version

-### Phase 2: Image Version Tagging (Week 1)
-1. Update push_targets.bzl for version tags
-2. Modify release.yml to pass version
-3. Verify with dry-run release

-### Phase 3: Demo-Staging Setup (Week 2)
-1. Create demo-staging ArgoCD Application
-2. Deploy via Helm chart
-3. Validate staging deployment works

-### Phase 4: GitOps Promoter (Week 2-3)
-1. Install promoter CRDs
-2. Configure staging->demo promotion
-3. Integrate e2e test gate

-### Phase 5: Full Pipeline (Week 3)
-1. Update release workflow for staged deployment
-2. Test complete flow with pre-release
-3. Document and train team
+### Phase 1: Helm Chart OCI Registry (DONE)
+1. ~~Create gh-pages branch~~ Using OCI registry instead
+2. Added helm package/push step to release.yml
+3. Published chart: oci://ghcr.io/carverauto/charts/serviceradar:1.0.75
+4. Updated cut-release.sh to bump Chart.yaml version automatically
+5. Created ArgoCD repo credentials template (not needed - chart made public)
+
+### Phase 2: Helm Values Modernization (DONE)
+1. Added global.imageTag and global.imagePullPolicy to values.yaml
+2. Set default image.tags.appTag to latest
+3. Added helper templates for image tag/policy resolution
+4. Updated key templates (core, web, datasvc, agent, poller, srql)
+5. Fixed db-event-writer-config.yaml template whitespace issue (malformed apiVersion)
+6. Fixed db-event-writer.yaml duplicate volume/volumeMount definitions
+
+### Phase 3: Demo-Staging Setup (DONE)
+1. Created demo-staging ArgoCD Application
+2. Configured to use OCI Helm chart with inline values
+3. Made Helm chart public in GHCR (no credentials needed)
+4. Copied ghcr-io-cred image pull secret to demo-staging namespace
+5. Fixed CNPG secret name to use dynamic cluster name ($cnpgClusterName-ca) in templates
+6. Fixed CNPG host to use dynamic cluster name ($cnpgClusterName-rw) in templates
+7. Published chart v1.0.75 with all template fixes
+8. Successfully deployed demo-staging: Sync: Synced, Health: Healthy (all 19 deployments running)
+
+### Phase 4: Environment Promotion (SIMPLIFIED)
+Approach Changed: After evaluating GitOps Promoter with Source Hydrator, we found the complexity didn't match our needs. The Source Hydrator requires specific credential configurations for write operations that proved difficult to set up with GitHub Apps.
+
+Current Simple Approach:
+1. Demo-staging deploys automatically when chart is updated
+2. Demo-staging validation happens (manual or via e2e tests)
+3. Production promotion is via manual PR to update demo-prod version
+4. ArgoCD syncs demo-prod when PR is merged
+
+What was tried and removed:
+- GitOps Promoter v0.18.3 CRDs
+- ArgoCD Source Hydrator (argocd-commit-server)
+- Hydrated branch model (environments/demo-staging, environments/demo)
+- GitHub App for SCM access (sr-argocd-promoter)
+
+Future consideration: Could add GitHub Action to automate PR creation when staging e2e tests pass.


</details>
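
The tag scheme from Decision 4 in the diff above can be sketched in a few lines of shell. This is only an illustration, not the project's Bazel rules: `release_tags` is a hypothetical helper, and the check for `dev` assumes that an unstamped build reports the version as `dev` (which is what would produce the excluded "vdev" tag). The short-digest tag from the example output is left out because it depends on the built image.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): prints the tag set
# described in Decision 4, one tag per line.
#   $1 = version string, e.g. the contents of the VERSION file
#   $2 = full commit SHA
release_tags() {
  version="$1"
  commit="$2"

  # Immutable reference, always emitted.
  printf 'sha-%s\n' "$commit"

  # Assumption: an unstamped local build reports "dev" (or nothing),
  # so the version tag is skipped instead of emitting "vdev".
  if [ -n "$version" ] && [ "$version" != "dev" ]; then
    printf 'v%s\n' "$version"
  fi

  # Development convenience tag.
  printf 'latest\n'
}

# Values taken from the example in the diff.
release_tags "1.0.70" "486cbfcbc1027b4255e8287df1c7ced48402b1c4"
```

Keeping the version tag conditional means a local build still gets an immutable `sha-` tag and `latest`, but never publishes a misleading release-style tag.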


___

**Instead of adding the ArgoCD GitOps Promoter, consider a simpler approach where the CI pipeline directly commits promotion changes to the Git repository after tests pass. This avoids introducing a new dependency and controller to the cluster.**


### Examples:



<details>
<summary>
<a href="https://github.com/carverauto/serviceradar/pull/2112/files#diff-90617d610307ed2ea02da97436aea57e8399b8f594d1368d680049b5afabbec3R67-R93">openspec/changes/add-release-staging-pipeline/design.md [67-93]</a>
</summary>



```markdown
### Decision 2: ArgoCD GitOps Promoter for Environment Promotion
**Rationale:** Native ArgoCD integration, declarative configuration, supports commit status gates for test validation.

**Alternatives Considered:**
- Argo Rollouts: More focused on canary/blue-green within a single app
- Flux: Would require migrating from ArgoCD
- Jenkins/custom: More maintenance overhead

**Key Resources:**
```yaml

 ... (clipped 17 lines)
```

</details>

### Solution Walkthrough:

#### Before:

```yaml
# design.md: Proposing ArgoCD GitOps Promoter
# A new controller and CRDs are added to the cluster.

# k8s/argocd/promoter-config.yaml
apiVersion: promoter.argoproj.io/v1alpha1
kind: PromotionStrategy
metadata:
  name: serviceradar-release
spec:
  environments:
    - name: demo-staging
      ...
    - name: demo
      gates:
        - name: e2e-tests
          type: commit-status
```

#### After:

```yaml
# .github/workflows/release.yml: CI-driven promotion
# No new controller is needed; promotion is a Git commit.

jobs:
  promote-to-demo:
    runs-on: ubuntu-latest
    needs: e2e-tests
    if: success()
    steps:
      - name: Checkout gitops repo
        uses: actions/checkout@v3
        with:
          repository: my-org/gitops-repo
      - name: Update demo environment config
        run: |
          # Update image tag in Helm values or Kustomization
          update_app_version('k8s/demo/', env.NEW_VERSION)
      - name: Commit and push changes
        run: |
          git commit -m "Promote app to v${{ env.NEW_VERSION }}"
          git push
```

Suggestion importance[1-10]: 8

__

Why: The suggestion proposes a valid and simpler architectural alternative to a core component of the design, the ArgoCD GitOps Promoter, which significantly impacts the implementation's complexity and dependencies.

Medium
mfreeman451 commented 2025-12-12 17:17:39 +00:00 (Migrated from github.com)
Author
Owner

Imported GitHub PR comment.

Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/2112#issuecomment-3647439411
Original created: 2025-12-12T17:17:39Z

Validated locally:

  • openspec validate add-release-staging-pipeline --strict
  • helm lint helm/serviceradar + helm template with demo + staging values
  • scripts/cut-release.sh --version 1.0.76-pre1 --dry-run (clean tree)

After merge, to exercise the full pipeline:

  • scripts/cut-release.sh --version 1.0.76-pre1 --push
    Then confirm Publish Release Artifacts and follow-on E2E Tests (staging) succeed, and promote by bumping targetRevision + global.imageTag in k8s/argocd/applications/demo-prod.yaml.
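
The promotion bump described above could be scripted roughly as follows. This is a sketch under stated assumptions, not a script from the repository: `bump_demo_pins` is a hypothetical helper, and it assumes the Application manifest carries the two pins as plain `targetRevision:` and `imageTag:` keys, each on its own line. The sample manifest in the heredoc is illustrative and may not match the real layout of `demo-prod.yaml`.

```shell
#!/bin/sh
# Hypothetical helper: rewrites the chart and image pins in an ArgoCD
# Application manifest read from stdin and writes the result to stdout.
#   $1 = chart version (e.g. 1.0.76)
#   $2 = image tag     (e.g. v1.0.76)
# Note: every line whose key is targetRevision or imageTag is rewritten,
# so this only suits manifests with a single occurrence of each.
bump_demo_pins() {
  chart_version="$1"
  image_tag="$2"
  sed -e "s|^\([[:space:]]*targetRevision:\).*|\1 \"$chart_version\"|" \
      -e "s|^\([[:space:]]*imageTag:\).*|\1 \"$image_tag\"|"
}

# Illustrative input only; the real file layout is an assumption.
bump_demo_pins "1.0.76" "v1.0.76" <<'EOF'
spec:
  source:
    chart: serviceradar
    targetRevision: "1.0.75"
    helm:
      valuesObject:
        global:
          imageTag: "v1.0.75"
EOF
```

Against the real file, the output would be redirected to a temporary file, reviewed with `git diff`, and committed as the promotion PR, for example `bump_demo_pins 1.0.76 v1.0.76 < k8s/argocd/applications/demo-prod.yaml > demo-prod.yaml.new`.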
qodo-code-review[bot] commented 2025-12-12 17:19:30 +00:00 (Migrated from github.com)
Author
Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2112#issuecomment-3647445223
Original created: 2025-12-12T17:19:30Z

CI Feedback 🧐

A test triggered by this PR failed. Here is an AI-generated analysis of the failure:

Action: test-go

Failed stage: [Run Go Tests](https://github.com/carverauto/serviceradar/actions/runs/20174431498/job/57918676589) [❌]

Failed test name: TestSanitizeTOML

Failure summary:

The action failed because the Go test github.com/carverauto/serviceradar/pkg/config timed out and panicked after 3s. The failing test is TestSanitizeTOML, which repeatedly attempted to read a configuration file at /etc/serviceradar/core.json and logged errors:
- open /etc/serviceradar/core.json: no such file or directory

The repeated read attempts caused the test to hang until the timeout, leading to:
- panic: test timed out after 3s

Relevant frames:
- pkg/config/toml_mask.go:57
- pkg/config/toml_mask_test.go:28

Relevant error logs:
1:  Runner name: 'arc-runner-set-2tp2m-runner-lrmxg'
2:  Runner group name: 'Default'
...

194:  github.com/carverauto/serviceradar/cmd/mapper		coverage: 0.0% of statements
195:  github.com/carverauto/serviceradar/cmd/poller		coverage: 0.0% of statements
196:  github.com/carverauto/serviceradar/cmd/sync		coverage: 0.0% of statements
197:  github.com/carverauto/serviceradar/cmd/tools/cnpg-migrate		coverage: 0.0% of statements
198:  github.com/carverauto/serviceradar/cmd/tools/config-sync		coverage: 0.0% of statements
199:  github.com/carverauto/serviceradar/cmd/tools/kv-sweep		coverage: 0.0% of statements
200:  github.com/carverauto/serviceradar/cmd/tools/waitforport		coverage: 0.0% of statements
201:  github.com/carverauto/serviceradar/internal/fastsum		coverage: 0.0% of statements
202:  ok  	github.com/carverauto/serviceradar/pkg/agent	1.693s	coverage: 1.3% of statements in ./...
203:  github.com/carverauto/serviceradar/pkg/checker		coverage: 0.0% of statements
204:  github.com/carverauto/serviceradar/pkg/checker/dusk		coverage: 0.0% of statements
205:  ok  	github.com/carverauto/serviceradar/pkg/checker/snmp	1.520s	coverage: 0.6% of statements in ./...
206:  ok  	github.com/carverauto/serviceradar/pkg/checker/sysmonosx	1.447s	coverage: 0.3% of statements in ./...
207:  github.com/carverauto/serviceradar/pkg/cli		coverage: 0.0% of statements
208:  -test.shuffle 1765559951957357935
209:  {"level":"error","path":"/etc/serviceradar/core.json","error":"open /etc/serviceradar/core.json: no such file or directory","time":"2025-12-12T17:19:11Z","message":"Failed to read configuration file"}
210:  {"level":"error","path":"/etc/serviceradar/core.json","error":"open /etc/serviceradar/core.json: no such file or directory","time":"2025-12-12T17:19:12Z","message":"Failed to read configuration file"}
211:  {"level":"error","path":"/etc/serviceradar/core.json","error":"open /etc/serviceradar/core.json: no such file or directory","time":"2025-12-12T17:19:12Z","message":"Failed to read configuration file"}
212:  {"level":"error","path":"/etc/serviceradar/core.json","error":"open /etc/serviceradar/core.json: no such file or directory","time":"2025-12-12T17:19:13Z","message":"Failed to read configuration file"}
213:  {"level":"error","path":"/etc/serviceradar/core.json","error":"open /etc/serviceradar/core.json: no such file or directory","time":"2025-12-12T17:19:13Z","message":"Failed to read configuration file"}
214:  {"level":"error","path":"/etc/serviceradar/core.json","error":"open /etc/serviceradar/core.json: no such file or directory","time":"2025-12-12T17:19:13Z","message":"Failed to read configuration file"}
215:  {"level":"error","path":"/etc/serviceradar/core.json","error":"open /etc/serviceradar/core.json: no such file or directory","time":"2025-12-12T17:19:14Z","message":"Failed to read configuration file"}
216:  {"level":"error","path":"/etc/serviceradar/core.json","error":"open /etc/serviceradar/core.json: no such file or directory","time":"2025-12-12T17:19:14Z","message":"Failed to read configuration file"}
217:  coverage: 1.7% of statements in ./...
218:  {"level":"error","path":"/etc/serviceradar/core.json","error":"open /etc/serviceradar/core.json: no such file or directory","time":"2025-12-12T17:19:15Z","message":"Failed to read configuration file"}
219:  panic: test timed out after 3s
...

244:  /home/runner/_work/serviceradar/serviceradar/pkg/config/toml_mask.go:57 +0xa75
245:  github.com/carverauto/serviceradar/pkg/config.TestSanitizeTOML(0xc000392540)
246:  /home/runner/_work/serviceradar/serviceradar/pkg/config/toml_mask_test.go:28 +0x169
247:  testing.tRunner(0xc000392540, 0x13b4f28)
248:  /home/runner/_work/_tool/go/1.25.0/x64/src/testing/testing.go:1934 +0x21d
249:  created by testing.(*T).Run in goroutine 1
250:  /home/runner/_work/_tool/go/1.25.0/x64/src/testing/testing.go:1997 +0x9d3
251:  FAIL	github.com/carverauto/serviceradar/pkg/config	3.327s
252:  github.com/carverauto/serviceradar/pkg/config/bootstrap		coverage: 0.0% of statements
253:  ?   	github.com/carverauto/serviceradar/pkg/config/kv	[no test files]
254:  github.com/carverauto/serviceradar/pkg/config/kvgrpc		coverage: 0.0% of statements
255:  github.com/carverauto/serviceradar/pkg/config/kvnats		coverage: 0.0% of statements
256:  ok  	github.com/carverauto/serviceradar/pkg/consumers/db-event-writer	1.454s	coverage: 0.4% of statements in ./...
257:  github.com/carverauto/serviceradar/pkg/consumers/netflow		coverage: 0.0% of statements
258:  FAIL
259:  ##[error]Process completed with exit code 1.
260:  Post job cleanup.
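
The log above shows the same read being retried until the 3s test timeout fires. One general way to avoid that kind of hang is to cap the number of attempts and fail fast. The sketch below illustrates the idea in shell only; it is not the project's Go code and not a proposed patch for `toml_mask.go`. The `read_config` helper and its parameters are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper illustrating a bounded retry.
#   $1 = path to read
#   $2 = maximum number of attempts
#   $3 = delay between attempts in seconds (defaults to 1)
# Prints the file and returns 0 on success; returns 1 once attempts run out.
read_config() {
  path="$1"
  max_attempts="$2"
  delay="${3:-1}"
  attempt=1

  while [ "$attempt" -le "$max_attempts" ]; do
    if [ -r "$path" ]; then
      cat "$path"
      return 0
    fi
    echo "attempt $attempt/$max_attempts: cannot read $path" >&2
    attempt=$((attempt + 1))
    # Only wait if another attempt will follow.
    if [ "$attempt" -le "$max_attempts" ]; then
      sleep "$delay"
    fi
  done

  return 1
}

# Three attempts with no delay, then give up instead of looping.
read_config /etc/serviceradar/core.json 3 0 || echo "config unavailable, failing fast" >&2
```

With a cap in place, a missing file surfaces as an ordinary error within a predictable time rather than as a test-runner timeout.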
