doc updates #2487

Merged
mfreeman451 merged 1 commit from refs/pull/2487/head into main 2025-11-28 18:15:04 +00:00
mfreeman451 commented 2025-11-28 18:14:52 +00:00 (Migrated from github.com)
Owner

Imported from GitHub pull request.

Original GitHub pull request: #2031
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/2031
Original created: 2025-11-28T18:14:52Z
Original updated: 2025-11-28T18:16:23Z
Original head: carverauto/serviceradar:chore/docs_updates_nov28
Original base: main
Original merged: 2025-11-28T18:15:04Z by @mfreeman451

User description

IMPORTANT: Please sign the Developer Certificate of Origin

Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include
a DCO sign-off statement indicating the DCO acceptance in one commit message. Here
is an example DCO Signed-off-by line in a commit message:

```
Signed-off-by: J. Doe <j.doe@domain.com>
```

Describe your changes

Code checklist before requesting a review

  • I have signed the DCO?
  • The build completes without errors?
  • All tests are passing when running make test?

PR Type

Documentation


Description

  • Redesigned architecture diagram with clearer Kubernetes cluster structure
  • Updated component organization into logical subgraphs (Ingress, API, Monitoring, Data Plane, Collectors, Identity)
  • Added comprehensive traffic flow documentation and cluster requirements
  • Upgraded Docusaurus dependencies from 3.7.0 to 3.9.2
  • Replaced Timeplus references with Timescale/TimescaleDB terminology

Diagram Walkthrough

```mermaid
flowchart LR
    ArchDoc["Architecture Diagram<br/>Redesign"]
    ArchDoc -->|"Clearer K8s<br/>structure"| NewDiagram["New Mermaid<br/>Flowchart"]
    ArchDoc -->|"Traffic flow<br/>summary"| TrafficDocs["User/Agent/Data<br/>Flow Docs"]
    ArchDoc -->|"Cluster<br/>requirements"| ReqDocs["Ingress/Storage/<br/>CPU/Identity Specs"]

    DepUpgrade["Dependency<br/>Upgrades"]
    DepUpgrade -->|"3.7.0 → 3.9.2"| DocusaurusUpgrade["Docusaurus Core<br/>Preset & Mermaid"]

    TerminologyFix["Terminology<br/>Updates"]
    TerminologyFix -->|"Timeplus →<br/>Timescale"| DBRefs["Database<br/>References"]
```

File Walkthrough

Relevant files
Documentation
architecture.md
Redesign architecture diagram and add cluster requirements

docs/docs/architecture.md

  • Completely redesigned Mermaid architecture diagram from graph TD to
    flowchart TB with improved Kubernetes cluster structure
  • Reorganized components into logical subgraphs: External Access,
    Ingress Layer, API Layer, Monitoring Layer, Data Plane, Telemetry
    Collectors, and Identity & Security
  • Added detailed traffic flow summary explaining user requests, Kong
    validation, edge agent connections, NATS messaging, and SPIRE
    certificate distribution
  • Added comprehensive "Cluster requirements" section documenting Ingress
    configuration, persistent storage needs (~150GiB baseline), CPU/memory
    resource requests, and identity plane requirements
  • Replaced all references to "Timeplus" with "Timescale" or
    "TimescaleDB" for consistency
  • Updated Web UI documentation to reference cluster ingress exposure
    instead of Nginx reverse proxy
+80/-83 
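For reference, the `graph TD` → `flowchart TB` change amounts to swapping the diagram directive while keeping Mermaid's flowchart syntax; subgraph styling and labeled containers are the main gain. A minimal before/after sketch with placeholder nodes (not the actual diagram contents):

```mermaid
%% Before: graph TD    (top-down graph)
%% After:  flowchart TB (top-to-bottom flowchart, full subgraph support)
flowchart TB
    subgraph Cluster["Kubernetes Cluster"]
        A[Service] --> B[Database]
    end
```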
Dependencies
package.json
Upgrade Docusaurus dependencies to 3.9.2                                 

docs/package.json

  • Upgraded @docusaurus/core from 3.7.0 to ^3.9.2
  • Upgraded @docusaurus/preset-classic from 3.7.0 to ^3.9.2
  • Upgraded @docusaurus/theme-mermaid from ^3.7.0 to ^3.9.2
  • Upgraded @docusaurus/module-type-aliases from 3.7.0 to ^3.9.2
  • Upgraded @docusaurus/tsconfig from 3.7.0 to ^3.9.2
  • Upgraded @docusaurus/types from 3.7.0 to ^3.9.2
+6/-6     
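After the bump, the Docusaurus entries in docs/package.json look roughly like this (a sketch showing only the upgraded lines; the dependencies/devDependencies split shown here is the conventional one and surrounding fields are omitted):

```json
{
  "dependencies": {
    "@docusaurus/core": "^3.9.2",
    "@docusaurus/preset-classic": "^3.9.2",
    "@docusaurus/theme-mermaid": "^3.9.2"
  },
  "devDependencies": {
    "@docusaurus/module-type-aliases": "^3.9.2",
    "@docusaurus/tsconfig": "^3.9.2",
    "@docusaurus/types": "^3.9.2"
  }
}
```

Note the move from pinned `3.7.0` to caret ranges (`^3.9.2`), which allows future patch and minor updates within 3.x.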

qodo-code-review[bot] commented 2025-11-28 18:15:26 +00:00 (Migrated from github.com)
Author
Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2031#issuecomment-3590079079
Original created: 2025-11-28T18:15:26Z

You are nearing your monthly Qodo Merge usage quota. For more information, please visit here.

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢 No security concerns identified: no security vulnerabilities detected by AI analysis. Human verification advised for critical code.

Ticket Compliance
⚪ 🎫 No ticket provided
  • Create ticket/issue

Codebase Duplication Compliance
⚪ Codebase context is not defined. Follow the guide to enable codebase context checks.

Custom Compliance
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No code impact: The PR only updates documentation and package versions without adding or modifying runtime
code paths that would affect audit logging of critical actions.

Referred Code
**Traffic flow summary:**
- **User requests** → Ingress → Web UI (static/SSR) or Kong (API)
- **Kong** validates JWTs and routes to Core (control plane) or SRQL (queries)
- **Edge agents** connect via gRPC mTLS to the Poller
- **NATS JetStream** provides pub/sub messaging and KV storage for all services
- **SPIRE** issues X.509 certificates to all workloads via DaemonSet agents

### Cluster requirements

- **Ingress**: Required for the web UI and API. Default host/class/TLS come from `helm/serviceradar/values.yaml` (`ingress.enabled=true`, `host=demo.serviceradar.cloud`, `className=nginx`, `tls.secretName=serviceradar-prod-tls`, `tls.clusterIssuer=carverauto-issuer`). If you use nginx, mirror the demo annotations (`nginx.ingress.kubernetes.io/proxy-body-size: 100m`, `proxy-buffer-size: 128k`, `proxy-buffers-number: 4`, `proxy-busy-buffers-size: 256k`, `proxy-read-timeout: 86400`, `proxy-send-timeout: 86400`, `proxy-connect-timeout: 60`) to keep SRQL streams and large asset uploads stable (`k8s/demo/prod/ingress.yaml`).

- **Persistent storage (~150GiB/node baseline)**: CNPG consumes the majority (3×100Gi PVCs from `k8s/demo/base/spire/cnpg-cluster.yaml`). JetStream adds 30Gi (`k8s/demo/base/serviceradar-nats.yaml`), OTEL 10Gi (`k8s/demo/base/serviceradar-otel.yaml`), and several 5Gi claims for Core, Datasvc, Mapper, Zen, DB event writer, plus 1Gi claims for Faker/Flowgger/Cert jobs. Spread the CNPG replicas across at least three nodes with SSD-class volumes; the extra PVCs lift per-node needs to roughly 150Gi of usable capacity when co-scheduled with CNPG.

- **CPU / memory (requested)**: Core 1 CPU / 4Gi, Poller 0.5 CPU / 2Gi (`k8s/demo/base/serviceradar-core.yaml`, `serviceradar-poller.yaml`); Kong 0.5 CPU / 1Gi; Web 0.2 CPU / 512Mi; Datasvc 0.5 CPU / 128Mi; SRQL 0.1 CPU / 128Mi; NATS 1 CPU / 8Gi; OTEL 0.2 CPU / 256Mi. The steady-state floor is ~4 vCPU and ~16 GiB for the core path, before adding optional sync/checker pods or horizontal scaling.

- **Identity plane**: SPIRE server (StatefulSet) and daemonset agents must be running; services expect the workload socket at `/run/spire/sockets/agent.sock` and SPIFFE IDs derived from `spire.trustDomain` in `values.yaml`.

- **TLS artifacts**: Pods mount `serviceradar-cert-data` for inter-service TLS and `cnpg-ca` for database verification; ensure these secrets/PVCs are provisioned before rolling workloads.
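The "~4 vCPU and ~16 GiB" steady-state floor quoted above can be sanity-checked by summing the listed per-component requests (a quick sketch; the values are copied from the excerpt, with Mi converted to Gi):

```shell
# Sum the documented CPU/memory requests for the core path:
# Core, Poller, Kong, Web, Datasvc, SRQL, NATS, OTEL
awk 'BEGIN {
  cpu = 1 + 0.5 + 0.5 + 0.2 + 0.5 + 0.1 + 1 + 0.2
  mem = 4 + 2 + 1 + 0.5 + 0.125 + 0.125 + 8 + 0.25  # 512Mi=0.5Gi, 128Mi=0.125Gi, 256Mi=0.25Gi
  printf "%.1f vCPU, %.2f GiB\n", cpu, mem
}'
# prints: 4.0 vCPU, 16.00 GiB
```

The totals match the documented floor exactly, before optional sync/checker pods or horizontal scaling are added.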

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status:
Documentation only: Changes introduce documentation prose and diagram nodes rather than executable code, so
identifier naming in code cannot be evaluated from this diff.

Referred Code
```mermaid
flowchart TB
    subgraph External["External Access"]
        User([User Browser])
        EdgeAgent([Edge Agents])
    end

    subgraph Cluster["Kubernetes Cluster"]
        subgraph Ingress["Edge Layer"]
            ING[Ingress Controller]
            WEB[Web UI<br/>Next.js :3000]
            KONG[Kong Gateway<br/>:8000]
        end

        subgraph API["API Layer"]
            CORE[Core Service<br/>REST :8090 / gRPC :50052]
            SRQL[SRQL Service<br/>:8080]
        end

        subgraph Monitoring["Monitoring Layer"]
            POLLER[Poller<br/>:50053]


    ... (clipped 49 lines)
```

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Dependency upgrade: Only documentation and dependency versions were updated; no new executable code was added
where error handling could be assessed.

Referred Code
```json
"@docusaurus/core": "^3.9.2",
"@docusaurus/preset-classic": "^3.9.2",
"@docusaurus/theme-mermaid": "^3.9.2",
"@mdx-js/react": "^3.0.0",
```

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status:
No user errors: The diff does not include user-facing error handling code; only docs were changed, so
exposure of internal details in errors cannot be validated.

Referred Code: same traffic-flow and cluster-requirements excerpt as quoted under "Generic: Comprehensive Audit Trails" above.

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
No logging code: No application logging changes are present in this documentation-focused PR; secure
logging practices cannot be assessed.

Referred Code: same traffic-flow and cluster-requirements excerpt as quoted under "Generic: Comprehensive Audit Trails" above.

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Package bumps: Only Docusaurus-related dependency versions and documentation text changed; there are no
new input-handling code paths to validate for security.

Referred Code
```json
"@docusaurus/module-type-aliases": "^3.9.2",
"@docusaurus/tsconfig": "^3.9.2",
"@docusaurus/types": "^3.9.2",
```

Learn more about managing compliance generic rules or creating your own custom rules

Compliance status legend

🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
Imported GitHub PR comment. Original author: @qodo-code-review[bot] Original URL: https://github.com/carverauto/serviceradar/pull/2031#issuecomment-3590079079 Original created: 2025-11-28T18:15:26Z --- _You are nearing your monthly Qodo Merge usage quota. For more information, please visit [here](https://qodo-merge-docs.qodo.ai/installation/qodo_merge/#cloud-users)._ ## PR Compliance Guide 🔍 <!-- https://github.com/carverauto/serviceradar/commit/febbecf313062a35bb61c685fd2346ac19be73c3 --> Below is a summary of compliance checks for this PR:<br> <table><tbody><tr><td colspan='2'><strong>Security Compliance</strong></td></tr> <tr><td>🟢</td><td><details><summary><strong>No security concerns identified</strong></summary> No security vulnerabilities detected by AI analysis. Human verification advised for critical code. </details></td></tr> <tr><td colspan='2'><strong>Ticket Compliance</strong></td></tr> <tr><td>⚪</td><td><details><summary>🎫 <strong>No ticket provided </strong></summary> - [ ] Create ticket/issue <!-- /create_ticket --create_ticket=true --> </details></td></tr> <tr><td colspan='2'><strong>Codebase Duplication Compliance</strong></td></tr> <tr><td>⚪</td><td><details><summary><strong>Codebase context is not defined </strong></summary> Follow the <a href='https://qodo-merge-docs.qodo.ai/core-abilities/rag_context_enrichment/'>guide</a> to enable codebase context checks. 
</details></td></tr> <tr><td colspan='2'><strong>Custom Compliance</strong></td></tr> <tr><td rowspan=6>⚪</td> <td><details> <summary><strong>Generic: Comprehensive Audit Trails</strong></summary><br> **Objective:** To create a detailed and reliable record of critical system actions for security analysis <br>and compliance.<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/2031/files#diff-90abd06467420fd89391fd1a4d75ceb1f6a9381de4d13a95fffe606abff38d37R83-R101'><strong>No code impact</strong></a>: The PR only updates documentation and package versions without adding or modifying runtime <br>code paths that would affect audit logging of critical actions.<br> <details open><summary>Referred Code</summary> ```markdown **Traffic flow summary:** - **User requests** → Ingress → Web UI (static/SSR) or Kong (API) - **Kong** validates JWTs and routes to Core (control plane) or SRQL (queries) - **Edge agents** connect via gRPC mTLS to the Poller - **NATS JetStream** provides pub/sub messaging and KV storage for all services - **SPIRE** issues X.509 certificates to all workloads via DaemonSet agents ### Cluster requirements - **Ingress**: Required for the web UI and API. Default host/class/TLS come from `helm/serviceradar/values.yaml` (`ingress.enabled=true`, `host=demo.serviceradar.cloud`, `className=nginx`, `tls.secretName=serviceradar-prod-tls`, `tls.clusterIssuer=carverauto-issuer`). If you use nginx, mirror the demo annotations (`nginx.ingress.kubernetes.io/proxy-body-size: 100m`, `proxy-buffer-size: 128k`, `proxy-buffers-number: 4`, `proxy-busy-buffers-size: 256k`, `proxy-read-timeout: 86400`, `proxy-send-timeout: 86400`, `proxy-connect-timeout: 60`) to keep SRQL streams and large asset uploads stable (`k8s/demo/prod/ingress.yaml`). - **Persistent storage (~150GiB/node baseline)**: CNPG consumes the majority (3×100Gi PVCs from `k8s/demo/base/spire/cnpg-cluster.yaml`). 
JetStream adds 30Gi (`k8s/demo/base/serviceradar-nats.yaml`), OTEL 10Gi (`k8s/demo/base/serviceradar-otel.yaml`), and several 5Gi claims for Core, Datasvc, Mapper, Zen, DB event writer, plus 1Gi claims for Faker/Flowgger/Cert jobs. Spread the CNPG replicas across at least three nodes with SSD-class volumes; the extra PVCs lift per-node needs to roughly 150Gi of usable capacity when co-scheduled with CNPG. - **CPU / memory (requested)**: Core 1 CPU / 4Gi, Poller 0.5 CPU / 2Gi (`k8s/demo/base/serviceradar-core.yaml`, `serviceradar-poller.yaml`); Kong 0.5 CPU / 1Gi; Web 0.2 CPU / 512Mi; Datasvc 0.5 CPU / 128Mi; SRQL 0.1 CPU / 128Mi; NATS 1 CPU / 8Gi; OTEL 0.2 CPU / 256Mi. The steady-state floor is ~4 vCPU and ~16 GiB for the core path, before adding optional sync/checker pods or horizontal scaling. - **Identity plane**: SPIRE server (StatefulSet) and daemonset agents must be running; services expect the workload socket at `/run/spire/sockets/agent.sock` and SPIFFE IDs derived from `spire.trustDomain` in `values.yaml`. - **TLS artifacts**: Pods mount `serviceradar-cert-data` for inter-service TLS and `cnpg-ca` for database verification; ensure these secrets/PVCs are provisioned before rolling workloads. 
``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td><details> <summary><strong>Generic: Meaningful Naming and Self-Documenting Code</strong></summary><br> **Objective:** Ensure all identifiers clearly express their purpose and intent, making code <br>self-documenting<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/2031/files#diff-90abd06467420fd89391fd1a4d75ceb1f6a9381de4d13a95fffe606abff38d37R12-R81'><strong>Documentation only</strong></a>: Changes introduce documentation prose and diagram nodes rather than executable code, so <br>identifier naming in code cannot be evaluated from this diff.<br> <details open><summary>Referred Code</summary> ```markdown ```mermaid flowchart TB subgraph External["External Access"] User([User Browser]) EdgeAgent([Edge Agents]) end subgraph Cluster["Kubernetes Cluster"] subgraph Ingress["Edge Layer"] ING[Ingress Controller] WEB[Web UI<br/>Next.js :3000] KONG[Kong Gateway<br/>:8000] end subgraph API["API Layer"] CORE[Core Service<br/>REST :8090 / gRPC :50052] SRQL[SRQL Service<br/>:8080] end subgraph Monitoring["Monitoring Layer"] POLLER[Poller<br/>:50053] ... 
(clipped 49 lines) ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td><details> <summary><strong>Generic: Robust Error Handling and Edge Case Management</strong></summary><br> **Objective:** Ensure comprehensive error handling that provides meaningful context and graceful <br>degradation<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/2031/files#diff-adfa337ce44dc2902621da20152a048dac41878cf3716dfc4cc56d03aa212a56R18-R21'><strong>Dependency upgrade</strong></a>: Only documentation and dependency versions were updated; no new executable code was added <br>where error handling could be assessed.<br> <details open><summary>Referred Code</summary> ```json "@docusaurus/core": "^3.9.2", "@docusaurus/preset-classic": "^3.9.2", "@docusaurus/theme-mermaid": "^3.9.2", "@mdx-js/react": "^3.0.0", ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td><details> <summary><strong>Generic: Secure Error Handling</strong></summary><br> **Objective:** To prevent the leakage of sensitive system information through error messages while <br>providing sufficient detail for internal debugging.<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/2031/files#diff-90abd06467420fd89391fd1a4d75ceb1f6a9381de4d13a95fffe606abff38d37R83-R101'><strong>No user errors</strong></a>: The diff does not include user-facing error handling code; only docs were changed, so <br>exposure of internal details in errors cannot be validated.<br> <details open><summary>Referred 
Code</summary> ```markdown **Traffic flow summary:** - **User requests** → Ingress → Web UI (static/SSR) or Kong (API) - **Kong** validates JWTs and routes to Core (control plane) or SRQL (queries) - **Edge agents** connect via gRPC mTLS to the Poller - **NATS JetStream** provides pub/sub messaging and KV storage for all services - **SPIRE** issues X.509 certificates to all workloads via DaemonSet agents ### Cluster requirements - **Ingress**: Required for the web UI and API. Default host/class/TLS come from `helm/serviceradar/values.yaml` (`ingress.enabled=true`, `host=demo.serviceradar.cloud`, `className=nginx`, `tls.secretName=serviceradar-prod-tls`, `tls.clusterIssuer=carverauto-issuer`). If you use nginx, mirror the demo annotations (`nginx.ingress.kubernetes.io/proxy-body-size: 100m`, `proxy-buffer-size: 128k`, `proxy-buffers-number: 4`, `proxy-busy-buffers-size: 256k`, `proxy-read-timeout: 86400`, `proxy-send-timeout: 86400`, `proxy-connect-timeout: 60`) to keep SRQL streams and large asset uploads stable (`k8s/demo/prod/ingress.yaml`). - **Persistent storage (~150GiB/node baseline)**: CNPG consumes the majority (3×100Gi PVCs from `k8s/demo/base/spire/cnpg-cluster.yaml`). JetStream adds 30Gi (`k8s/demo/base/serviceradar-nats.yaml`), OTEL 10Gi (`k8s/demo/base/serviceradar-otel.yaml`), and several 5Gi claims for Core, Datasvc, Mapper, Zen, DB event writer, plus 1Gi claims for Faker/Flowgger/Cert jobs. Spread the CNPG replicas across at least three nodes with SSD-class volumes; the extra PVCs lift per-node needs to roughly 150Gi of usable capacity when co-scheduled with CNPG. - **CPU / memory (requested)**: Core 1 CPU / 4Gi, Poller 0.5 CPU / 2Gi (`k8s/demo/base/serviceradar-core.yaml`, `serviceradar-poller.yaml`); Kong 0.5 CPU / 1Gi; Web 0.2 CPU / 512Mi; Datasvc 0.5 CPU / 128Mi; SRQL 0.1 CPU / 128Mi; NATS 1 CPU / 8Gi; OTEL 0.2 CPU / 256Mi. 
The steady-state floor is ~4 vCPU and ~16 GiB for the core path, before adding optional sync/checker pods or horizontal scaling.
- **Identity plane**: SPIRE server (StatefulSet) and DaemonSet agents must be running; services expect the workload socket at `/run/spire/sockets/agent.sock` and SPIFFE IDs derived from `spire.trustDomain` in `values.yaml`.
- **TLS artifacts**: Pods mount `serviceradar-cert-data` for inter-service TLS and `cnpg-ca` for database verification; ensure these secrets/PVCs are provisioned before rolling workloads.
```

> Learn more about managing compliance [generic rules](https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options) or creating your own [custom rules](https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance).

**Generic: Secure Logging Practices**

**Objective:** Ensure logs are useful for debugging and auditing without exposing sensitive information like PII, PHI, or cardholder data.

**Status:** No logging code: no application logging changes are present in this documentation-focused PR, so secure logging practices cannot be assessed.

Referred code:

```markdown
**Traffic flow summary:**

- **User requests** → Ingress → Web UI (static/SSR) or Kong (API)
- **Kong** validates JWTs and routes to Core (control plane) or SRQL (queries)
- **Edge agents** connect via gRPC mTLS to the Poller
- **NATS JetStream** provides pub/sub messaging and KV storage for all services
- **SPIRE** issues X.509 certificates to all workloads via DaemonSet agents

### Cluster requirements

- **Ingress**: Required for the web UI and API. Default host/class/TLS come from `helm/serviceradar/values.yaml` (`ingress.enabled=true`, `host=demo.serviceradar.cloud`, `className=nginx`, `tls.secretName=serviceradar-prod-tls`, `tls.clusterIssuer=carverauto-issuer`). If you use nginx, mirror the demo annotations (`nginx.ingress.kubernetes.io/proxy-body-size: 100m`, `proxy-buffer-size: 128k`, `proxy-buffers-number: 4`, `proxy-busy-buffers-size: 256k`, `proxy-read-timeout: 86400`, `proxy-send-timeout: 86400`, `proxy-connect-timeout: 60`) to keep SRQL streams and large asset uploads stable (`k8s/demo/prod/ingress.yaml`).
- **Persistent storage (~150GiB/node baseline)**: CNPG consumes the majority (3×100Gi PVCs from `k8s/demo/base/spire/cnpg-cluster.yaml`). JetStream adds 30Gi (`k8s/demo/base/serviceradar-nats.yaml`), OTEL 10Gi (`k8s/demo/base/serviceradar-otel.yaml`), and several 5Gi claims for Core, Datasvc, Mapper, Zen, and the DB event writer, plus 1Gi claims for Faker/Flowgger/Cert jobs. Spread the CNPG replicas across at least three nodes with SSD-class volumes; the extra PVCs lift per-node needs to roughly 150Gi of usable capacity when co-scheduled with CNPG.
- **CPU / memory (requested)**: Core 1 CPU / 4Gi, Poller 0.5 CPU / 2Gi (`k8s/demo/base/serviceradar-core.yaml`, `serviceradar-poller.yaml`); Kong 0.5 CPU / 1Gi; Web 0.2 CPU / 512Mi; Datasvc 0.5 CPU / 128Mi; SRQL 0.1 CPU / 128Mi; NATS 1 CPU / 8Gi; OTEL 0.2 CPU / 256Mi. The steady-state floor is ~4 vCPU and ~16 GiB for the core path, before adding optional sync/checker pods or horizontal scaling.
- **Identity plane**: SPIRE server (StatefulSet) and DaemonSet agents must be running; services expect the workload socket at `/run/spire/sockets/agent.sock` and SPIFFE IDs derived from `spire.trustDomain` in `values.yaml`.
- **TLS artifacts**: Pods mount `serviceradar-cert-data` for inter-service TLS and `cnpg-ca` for database verification; ensure these secrets/PVCs are provisioned before rolling workloads.
```

**Generic: Security-First Input Validation and Data Handling**

**Objective:** Ensure all data inputs are validated, sanitized, and handled securely to prevent vulnerabilities.

**Status:** Package bumps: only Docusaurus-related dependency versions and documentation text changed; there are no new input-handling code paths to validate for security.

Referred code:

```json
"@docusaurus/module-type-aliases": "^3.9.2",
"@docusaurus/tsconfig": "^3.9.2",
"@docusaurus/types": "^3.9.2",
```

Compliance status legend:

- 🟢 Fully Compliant
- 🟡 Partially Compliant
- 🔴 Not Compliant
- ⚪ Requires Further Human Verification
- 🏷️ Compliance label
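For orientation, the nginx annotations quoted in the cluster requirements would sit in an Ingress manifest roughly like the sketch below. This is assembled from the values named above (host, class name, and TLS secret come from the demo values and may differ in your cluster); the object and backend service names are hypothetical, and the real manifest is `k8s/demo/prod/ingress.yaml`:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: serviceradar            # hypothetical name for illustration
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "100m"
    nginx.ingress.kubernetes.io/proxy-buffer-size: "128k"
    nginx.ingress.kubernetes.io/proxy-buffers-number: "4"
    nginx.ingress.kubernetes.io/proxy-busy-buffers-size: "256k"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "86400"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "86400"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "60"
spec:
  ingressClassName: nginx
  tls:
    - hosts: [demo.serviceradar.cloud]
      secretName: serviceradar-prod-tls
  rules:
    - host: demo.serviceradar.cloud
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: serviceradar-web   # hypothetical backend service
                port:
                  number: 80
```

The long read/send timeouts (86400s) are what keep long-lived SRQL streams from being cut off by the proxy.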
qodo-code-review[bot] commented 2025-11-28 18:16:22 +00:00 (Migrated from github.com)

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2031#issuecomment-3590080554
Original created: 2025-11-28T18:16:22Z

You are nearing your monthly Qodo Merge usage quota. For more information, please visit [here](https://qodo-merge-docs.qodo.ai/installation/qodo_merge/#cloud-users).

PR Code Suggestions

Explore these optional code suggestions:

Category: General
**Avoid hardcoding configuration-defined storage values**

In the documentation, replace hardcoded storage size values with a general overview and a reference to the configuration files for the exact figures, to improve maintainability.

`docs/docs/architecture.md` [94]

-- **Persistent storage (~150GiB/node baseline)**: CNPG consumes the majority (3×100Gi PVCs from `k8s/demo/base/spire/cnpg-cluster.yaml`). JetStream adds 30Gi (`k8s/demo/base/serviceradar-nats.yaml`), OTEL 10Gi (`k8s/demo/base/serviceradar-otel.yaml`), and several 5Gi claims for Core, Datasvc, Mapper, Zen, DB event writer, plus 1Gi claims for Faker/Flowgger/Cert jobs. Spread the CNPG replicas across at least three nodes with SSD-class volumes; the extra PVCs lift per-node needs to roughly 150Gi of usable capacity when co-scheduled with CNPG.
+- **Persistent storage (~150GiB/node baseline)**: Several components require persistent storage. CNPG (TimescaleDB) is the largest consumer, followed by NATS JetStream, the OTel Collector, and various services like Core and Datasvc. For specific PVC sizes, refer to the manifests in `k8s/demo/base/`. Spread the CNPG replicas across at least three nodes with SSD-class volumes. The combined storage requirements can reach approximately 150Gi of usable capacity per node when co-scheduled with a CNPG replica.

Suggestion importance [1-10]: 6

Why: This is a valid suggestion that improves the long-term maintainability of the documentation by avoiding hardcoded values that can become stale.

Impact: Low
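As a concrete illustration of how those storage figures surface in practice, a claim like the 30Gi JetStream volume is an ordinary PersistentVolumeClaim. The sketch below uses a hypothetical claim name; the real definition lives in `k8s/demo/base/serviceradar-nats.yaml` and may be shaped differently:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nats-jetstream-data    # hypothetical name for illustration
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi            # JetStream figure quoted in the docs
```

Pointing docs at such manifests, rather than repeating the numbers, is exactly what the suggestion above argues for.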
**Avoid hardcoding resource request values**

In the documentation, replace hardcoded CPU and memory resource requests with a high-level summary and a reference to the configuration files for the precise values.

`docs/docs/architecture.md` [96]

-- **CPU / memory (requested)**: Core 1 CPU / 4Gi, Poller 0.5 CPU / 2Gi (`k8s/demo/base/serviceradar-core.yaml`, `serviceradar-poller.yaml`); Kong 0.5 CPU / 1Gi; Web 0.2 CPU / 512Mi; Datasvc 0.5 CPU / 128Mi; SRQL 0.1 CPU / 128Mi; NATS 1 CPU / 8Gi; OTEL 0.2 CPU / 256Mi. The steady-state floor is ~4 vCPU and ~16 GiB for the core path, before adding optional sync/checker pods or horizontal scaling.
+- **CPU / memory (requested)**: The core services require significant resources. For example, the Core service and NATS request 1 CPU each, with 4Gi and 8Gi of memory respectively. For a complete and up-to-date list of resource requests for all components, please refer to the relevant deployment manifests (e.g., `k8s/demo/base/serviceradar-core.yaml`). The steady-state floor is ~4 vCPU and ~16 GiB for the core path, before adding optional sync/checker pods or horizontal scaling.

Suggestion importance [1-10]: 6

Why: This is a valid suggestion that improves the long-term maintainability of the documentation by avoiding hardcoded values that can become stale.

Impact: Low
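For reference, a figure like "Core 1 CPU / 4Gi" translates to a standard Kubernetes `resources.requests` stanza in the pod spec. The fragment below is a sketch with hypothetical container and image names, not the actual manifest; see `k8s/demo/base/serviceradar-core.yaml` for the real values:

```yaml
containers:
  - name: core                 # hypothetical container name
    image: serviceradar/core   # hypothetical image reference
    resources:
      requests:
        cpu: "1"               # Core's quoted CPU request
        memory: 4Gi            # Core's quoted memory request
```

Summing such requests across the core-path pods is how the ~4 vCPU / ~16 GiB steady-state floor cited in the docs is derived.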
Reference: carverauto/serviceradar!2487