2038 feat apache age integration #2492

Merged
mfreeman451 merged 16 commits from refs/pull/2492/head into main 2025-12-02 20:09:50 +00:00
mfreeman451 commented 2025-12-01 02:52:40 +00:00 (Migrated from github.com)
Owner

Imported from GitHub pull request.

Original GitHub pull request: #2039
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/2039
Original created: 2025-12-01T02:52:40Z
Original updated: 2025-12-02T20:10:12Z
Original head: carverauto/serviceradar:2038-feat-apache-age-integration
Original base: main
Original merged: 2025-12-02T20:09:50Z by @mfreeman451

User description

IMPORTANT: Please sign the Developer Certificate of Origin

Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include
a DCO sign-off statement indicating the DCO acceptance in one commit message. Here
is an example DCO Signed-off-by line in a commit message:

Signed-off-by: J. Doe <j.doe@domain.com>

Describe your changes

Code checklist before requesting a review

  • I have signed the DCO?
  • The build completes without errors?
  • All tests are passing when running make test?

PR Type

Enhancement


Description

  • Integrate Apache AGE graph database for device/service/collector relationships

  • Add AGE graph writer to ingest device updates into relationship graph

  • Bootstrap AGE graph schema with nodes and edges in CNPG migrations

  • Enable feature flag control for AGE graph writes via environment variable


Diagram Walkthrough

flowchart LR
  A["Device Updates"] -->|ProcessBatchDeviceUpdates| B["DeviceRegistry"]
  B -->|WriteGraph| C["AGEGraphWriter"]
  C -->|cypher MERGE| D["AGE Graph<br/>serviceradar"]
  D -->|Nodes| E["Device<br/>Service<br/>Collector"]
  D -->|Edges| F["REPORTED_BY<br/>HOSTS_SERVICE<br/>TARGETS"]
  G["ENABLE_AGE_GRAPH_WRITES<br/>env flag"] -->|ageGraphEnabled| B
  H["CNPG Migration"] -->|create_graph| D

File Walkthrough

Relevant files
Configuration changes
2 files
config.go
Add AGE runtime defaults to CNPG configuration                     
+19/-0   
00000000000012_age_graph_bootstrap.up.sql
Bootstrap AGE extension and graph schema in CNPG                 
+102/-0 
Enhancement
3 files
server.go
Wire AGE graph writer and feature flag to registry             
+16/-0   
age_graph_writer.go
Implement AGE graph writer for device relationship ingestion
+387/-0 
registry.go
Add graph writer option and invoke on batch updates           
+12/-0   
Tests
1 files
age_graph_writer_test.go
Add unit tests for AGE graph parameter building                   
+69/-0   
Documentation
4 files
proposal.md
Define AGE relationship graph feature proposal and scope 
+17/-0   
design.md
Detail AGE graph design decisions and migration plan         
+28/-0   
spec.md
Define AGE graph requirements and test scenarios                 
+129/-0 
tasks.md
Outline implementation tasks for AGE integration                 
+46/-0   

Imported from GitHub pull request. Original GitHub pull request: #2039 Original author: @mfreeman451 Original URL: https://github.com/carverauto/serviceradar/pull/2039 Original created: 2025-12-01T02:52:40Z Original updated: 2025-12-02T20:10:12Z Original head: carverauto/serviceradar:2038-feat-apache-age-integration Original base: main Original merged: 2025-12-02T20:09:50Z by @mfreeman451 --- ### **User description** ## IMPORTANT: Please sign the Developer Certificate of Origin Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include a [DCO sign-off statement]( https://developercertificate.org/) indicating the DCO acceptance in one commit message. Here is an example DCO Signed-off-by line in a commit message: ``` Signed-off-by: J. Doe <j.doe@domain.com> ``` ## Describe your changes ## Issue ticket number and link ## Code checklist before requesting a review - [ ] I have signed the DCO? - [ ] The build completes without errors? - [ ] All tests are passing when running make test? ___ ### **PR Type** Enhancement ___ ### **Description** - Integrate Apache AGE graph database for device/service/collector relationships - Add AGE graph writer to ingest device updates into relationship graph - Bootstrap AGE graph schema with nodes and edges in CNPG migrations - Enable feature flag control for AGE graph writes via environment variable ___ ### Diagram Walkthrough ```mermaid flowchart LR A["Device Updates"] -->|ProcessBatchDeviceUpdates| B["DeviceRegistry"] B -->|WriteGraph| C["AGEGraphWriter"] C -->|cypher MERGE| D["AGE Graph<br/>serviceradar"] D -->|Nodes| E["Device<br/>Service<br/>Collector"] D -->|Edges| F["REPORTED_BY<br/>HOSTS_SERVICE<br/>TARGETS"] G["ENABLE_AGE_GRAPH_WRITES<br/>env flag"] -->|ageGraphEnabled| B H["CNPG Migration"] -->|create_graph| D ``` <details> <summary><h3> File Walkthrough</h3></summary> <table><thead><tr><th></th><th align="left">Relevant files</th></tr></thead><tbody><tr><td><strong>Configuration changes</strong></td><td><details><summary>2 files</summary><table> <tr> <td><strong>config.go</strong><dd><code>Add AGE runtime defaults to CNPG configuration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2039/files#diff-a2dcdf3bac7ca59032d52f7a0635b5dbcfcb83a6f07cc45db8f57fba0cc33c30">+19/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td><strong>00000000000012_age_graph_bootstrap.up.sql</strong><dd><code>Bootstrap AGE extension and graph schema in CNPG</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2039/files#diff-fa8a4a01fcedff3d25fa425f261cfd24f7a1c31d2fe10a644ffa1bf90c57cae6">+102/-0</a>&nbsp; </td> </tr> </table></details></td></tr><tr><td><strong>Enhancement</strong></td><td><details><summary>3 files</summary><table> <tr> <td><strong>server.go</strong><dd><code>Wire AGE graph writer and feature flag to registry</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2039/files#diff-a12d34beb8640efbbedad28f8610fd58afeea457572d82e4fd295ab2a6bb8ee6">+16/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td><strong>age_graph_writer.go</strong><dd><code>Implement AGE graph writer for device relationship ingestion</code></dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2039/files#diff-2e6c0a5c048582201ffb983889ab9fab6fb1f8d64b91ee3fb65479b2fb2c6195">+387/-0</a>&nbsp; </td> </tr> <tr> <td><strong>registry.go</strong><dd><code>Add graph writer option and invoke on batch updates</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2039/files#diff-cb61d8f79451b9541de4a8cc0811523a68d15452b2f5971c7618ea5b423cf4ec">+12/-0</a>&nbsp; &nbsp; </td> </tr> </table></details></td></tr><tr><td><strong>Tests</strong></td><td><details><summary>1 files</summary><table> <tr> <td><strong>age_graph_writer_test.go</strong><dd><code>Add unit tests for AGE graph parameter building</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2039/files#diff-583d7b3d194efad05641ae58185bccf093ffb0e83d3fa2bc9a6352c503874abd">+69/-0</a>&nbsp; &nbsp; </td> </tr> </table></details></td></tr><tr><td><strong>Documentation</strong></td><td><details><summary>4 files</summary><table> <tr> <td><strong>proposal.md</strong><dd><code>Define AGE relationship graph feature proposal and scope</code>&nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2039/files#diff-3fa8ff1eb6c2b79b5937f3b7e97d0eaa429475802e0e136d70e61f3ce8b996e1">+17/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td><strong>design.md</strong><dd><code>Detail AGE graph design decisions and migration plan</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2039/files#diff-5e5a5da59cb9f0b4f7a8fc3c492f1be70a9292fc0533fe4fc103a85f27f3c105">+28/-0</a>&nbsp; &nbsp; </td> </tr> <tr> <td><strong>spec.md</strong><dd><code>Define AGE graph requirements and test scenarios</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2039/files#diff-68425fc01e3047a9fd103cfcb917fa84bdc3f2781539b29a5f0f983753d57627">+129/-0</a>&nbsp; </td> </tr> <tr> <td><strong>tasks.md</strong><dd><code>Outline implementation tasks for AGE integration</code>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </dd></td> <td><a href="https://github.com/carverauto/serviceradar/pull/2039/files#diff-8bd336677c2048834707d9f86bcb964404304815695a2ebd7273203c57cb7c92">+46/-0</a>&nbsp; &nbsp; </td> </tr> </table></details></td></tr></tr></tbody></table> </details> ___
qodo-code-review[bot] commented 2025-12-01 02:53:14 +00:00 (Migrated from github.com)
Author
Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2039#issuecomment-3594310807
Original created: 2025-12-01T02:53:14Z

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🟡
🎫 #2038
🟢 Bootstrap an Apache AGE graph named serviceradar in CNPG with AGE extension enabled and
usable idempotently.
Ensure connections default to AGE search_path and graph_path for core/SRQL sessions.
Define node and edge labels for Device, Service, Collector, Interface, Capability,
CheckerDefinition and relationships like HOSTS_SERVICE, RUNS_CHECKER, TARGETS,
HAS_INTERFACE, CONNECTS_TO, PROVIDES_CAPABILITY, REPORTED_BY.
Create property indexes on canonical IDs for key vertex labels when supported.
Add a graph writer in core registry to emit device/service/collector relationships using
Cypher MERGE into AGE.
Ensure collector-owned services/checkers are modeled as Service nodes attached to
Collector nodes without creating phantom Device nodes for collector hosts.
Add provenance edges (REPORTED_BY) from Devices to Collectors for updates.
Control graph writes with a feature flag environment variable.
Tolerate AGE failures without blocking registry ingestion.
Provide unit tests covering ingestion behavior and collector-vs-target handling.
Ensure migrations execute correctly across target environments (CNPG versions) and confirm
AGE availability at runtime.
Validate that property indexes are created effectively and improve query performance in
real deployments.
End-to-end verification that no phantom devices appear in UI and that graph writes are
correctly gated by the feature flag in production-like setups.
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Missing audit logs: The new graph write path performs merges into AGE without emitting structured audit logs
capturing actor/context, making it unclear who/what triggered critical graph mutations.

Referred Code
func (w *ageGraphWriter) WriteGraph(ctx context.Context, updates []*models.DeviceUpdate) {
	if w == nil || w.executor == nil {
		return
	}
	params := buildAgeGraphParams(updates)
	if params == nil {
		return
	}

	payload, err := json.Marshal(params)
	if err != nil {
		if w.log != nil {
			w.log.Warn().Err(err).Msg("age graph: failed to marshal params")
		}
		return
	}

	if _, err := w.executor.ExecuteQuery(ctx, ageGraphMergeQuery, payload); err != nil && w.log != nil {
		w.log.Warn().Err(err).Msg("age graph: failed to merge batch")
	}
}

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Limited error context: Errors from AGE writes are only warned without retry/backoff/metrics and lack operation
context (e.g., counts), which may lead to silent drift if AGE is unavailable.

Referred Code

	if _, err := w.executor.ExecuteQuery(ctx, ageGraphMergeQuery, payload); err != nil && w.log != nil {
		w.log.Warn().Err(err).Msg("age graph: failed to merge batch")
	}
}

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Input validation gaps: The graph ingestion builds Cypher parameters from updates without explicit validation for
allowed characters in IDs/types, relying on parameterization but lacking
canonicalization/constraints that could prevent malformed graph data.

Referred Code
func upsertCollector(store map[string]ageGraphCollector, id, serviceType, ip, hostname string) {
	id = strings.TrimSpace(id)
	if id == "" {
		return
	}

	entry, exists := store[id]
	if !exists {
		store[id] = ageGraphCollector{ID: id, Type: strings.TrimSpace(serviceType), IP: strings.TrimSpace(ip), Hostname: strings.TrimSpace(hostname)}
		return
	}

	if entry.Type == "" {
		entry.Type = strings.TrimSpace(serviceType)
	}
	if entry.IP == "" {
		entry.IP = strings.TrimSpace(ip)
	}
	if entry.Hostname == "" {
		entry.Hostname = strings.TrimSpace(hostname)
	}


 ... (clipped 2 lines)

Learn more about managing compliance generic rules or creating your own custom rules

  • Update
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
- Requires Further Human Verification
🏷️ - Compliance label
Imported GitHub PR comment. Original author: @qodo-code-review[bot] Original URL: https://github.com/carverauto/serviceradar/pull/2039#issuecomment-3594310807 Original created: 2025-12-01T02:53:14Z --- ## PR Compliance Guide 🔍 <!-- https://github.com/carverauto/serviceradar/commit/ef5b01fc6b7d77a3f02a761321a587d7e312f0dc --> Below is a summary of compliance checks for this PR:<br> <table><tbody><tr><td colspan='2'><strong>Security Compliance</strong></td></tr> <tr><td>🟢</td><td><details><summary><strong>No security concerns identified</strong></summary> No security vulnerabilities detected by AI analysis. Human verification advised for critical code. </details></td></tr> <tr><td colspan='2'><strong>Ticket Compliance</strong></td></tr> <tr><td>🟡</td> <td> <details> <summary>🎫 <a href=https://github.com/carverauto/serviceradar/issues/2038>#2038</a></summary> <table width='100%'><tbody> <tr><td rowspan=10>🟢</td> <td>Bootstrap an Apache AGE graph named serviceradar in CNPG with AGE extension enabled and <br>usable idempotently.</td></tr> <tr><td>Ensure connections default to AGE search_path and graph_path for core/SRQL sessions.</td></tr> <tr><td>Define node and edge labels for Device, Service, Collector, Interface, Capability, <br>CheckerDefinition and relationships like HOSTS_SERVICE, RUNS_CHECKER, TARGETS, <br>HAS_INTERFACE, CONNECTS_TO, PROVIDES_CAPABILITY, REPORTED_BY.</td></tr> <tr><td>Create property indexes on canonical IDs for key vertex labels when supported.</td></tr> <tr><td>Add a graph writer in core registry to emit device/service/collector relationships using <br>Cypher MERGE into AGE.</td></tr> <tr><td>Ensure collector-owned services/checkers are modeled as Service nodes attached to <br>Collector nodes without creating phantom Device nodes for collector hosts.</td></tr> <tr><td>Add provenance edges (REPORTED_BY) from Devices to Collectors for updates.</td></tr> <tr><td>Control graph writes with a feature flag environment variable.</td></tr> <tr><td>Tolerate AGE failures without blocking registry ingestion.</td></tr> <tr><td>Provide unit tests covering ingestion behavior and collector-vs-target handling.</td></tr> <tr><td rowspan=3>⚪</td> <td>Ensure migrations execute correctly across target environments (CNPG versions) and confirm <br>AGE availability at runtime.</td></tr> <tr><td>Validate that property indexes are created effectively and improve query performance in <br>real deployments.</td></tr> <tr><td>End-to-end verification that no phantom devices appear in UI and that graph writes are <br>correctly gated by the feature flag in production-like setups.</td></tr> </tbody></table> </details> </td></tr> <tr><td colspan='2'><strong>Codebase Duplication Compliance</strong></td></tr> <tr><td>⚪</td><td><details><summary><strong>Codebase context is not defined </strong></summary> Follow the <a href='https://qodo-merge-docs.qodo.ai/core-abilities/rag_context_enrichment/'>guide</a> to enable codebase context checks. </details></td></tr> <tr><td colspan='2'><strong>Custom Compliance</strong></td></tr> <tr><td rowspan=3>🟢</td><td> <details><summary><strong>Generic: Meaningful Naming and Self-Documenting Code</strong></summary><br> **Objective:** Ensure all identifiers clearly express their purpose and intent, making code <br>self-documenting<br> **Status:** Passed<br> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td> <details><summary><strong>Generic: Secure Error Handling</strong></summary><br> **Objective:** To prevent the leakage of sensitive system information through error messages while <br>providing sufficient detail for internal debugging.<br> **Status:** Passed<br> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td> <details><summary><strong>Generic: Secure Logging Practices</strong></summary><br> **Objective:** To ensure logs are useful for debugging and auditing without exposing sensitive <br>information like PII, PHI, or cardholder data.<br> **Status:** Passed<br> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td rowspan=3>⚪</td> <td><details> <summary><strong>Generic: Comprehensive Audit Trails</strong></summary><br> **Objective:** To create a detailed and reliable record of critical system actions for security analysis <br>and compliance.<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/2039/files#diff-2e6c0a5c048582201ffb983889ab9fab6fb1f8d64b91ee3fb65479b2fb2c6195R67-R87'><strong>Missing audit logs</strong></a>: The new graph write path performs merges into AGE without emitting structured audit logs <br>capturing actor/context, making it unclear who/what triggered critical graph mutations.<br> <details open><summary>Referred Code</summary> ```go func (w *ageGraphWriter) WriteGraph(ctx context.Context, updates []*models.DeviceUpdate) { if w == nil || w.executor == nil { return } params := buildAgeGraphParams(updates) if params == nil { return } payload, err := json.Marshal(params) if err != nil { if w.log != nil { w.log.Warn().Err(err).Msg("age graph: failed to marshal params") } return } if _, err := w.executor.ExecuteQuery(ctx, ageGraphMergeQuery, payload); err != nil && w.log != nil { w.log.Warn().Err(err).Msg("age graph: failed to merge batch") } } ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td><details> <summary><strong>Generic: Robust Error Handling and Edge Case Management</strong></summary><br> **Objective:** Ensure comprehensive error handling that provides meaningful context and graceful <br>degradation<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/2039/files#diff-2e6c0a5c048582201ffb983889ab9fab6fb1f8d64b91ee3fb65479b2fb2c6195R83-R87'><strong>Limited error context</strong></a>: Errors from AGE writes are only warned without retry/backoff/metrics and lack operation <br>context (e.g., counts), which may lead to silent drift if AGE is unavailable.<br> <details open><summary>Referred Code</summary> ```go if _, err := w.executor.ExecuteQuery(ctx, ageGraphMergeQuery, payload); err != nil && w.log != nil { w.log.Warn().Err(err).Msg("age graph: failed to merge batch") } } ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td><details> <summary><strong>Generic: Security-First Input Validation and Data Handling</strong></summary><br> **Objective:** Ensure all data inputs are validated, sanitized, and handled securely to prevent <br>vulnerabilities<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/2039/files#diff-2e6c0a5c048582201ffb983889ab9fab6fb1f8d64b91ee3fb65479b2fb2c6195R185-R207'><strong>Input validation gaps</strong></a>: The graph ingestion builds Cypher parameters from updates without explicit validation for <br>allowed characters in IDs/types, relying on parameterization but lacking <br>canonicalization/constraints that could prevent malformed graph data.<br> <details open><summary>Referred Code</summary> ```go func upsertCollector(store map[string]ageGraphCollector, id, serviceType, ip, hostname string) { id = strings.TrimSpace(id) if id == "" { return } entry, exists := store[id] if !exists { store[id] = ageGraphCollector{ID: id, Type: strings.TrimSpace(serviceType), IP: strings.TrimSpace(ip), Hostname: strings.TrimSpace(hostname)} return } if entry.Type == "" { entry.Type = strings.TrimSpace(serviceType) } if entry.IP == "" { entry.IP = strings.TrimSpace(ip) } if entry.Hostname == "" { entry.Hostname = strings.TrimSpace(hostname) } ... (clipped 2 lines) ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td align="center" colspan="2"> - [ ] Update <!-- /compliance --update_compliance=true --> </td></tr></tbody></table> <details><summary>Compliance status legend</summary> 🟢 - Fully Compliant<br> 🟡 - Partial Compliant<br> 🔴 - Not Compliant<br> ⚪ - Requires Further Human Verification<br> 🏷️ - Compliance label<br> </details>
qodo-code-review[bot] commented 2025-12-01 02:54:33 +00:00 (Migrated from github.com)
Author
Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2039#issuecomment-3594312876
Original created: 2025-12-01T02:54:33Z

PR Code Suggestions

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
High-level
Consider a more robust data consistency strategy

The current approach of writing to the primary database and then separately to
the AGE graph can cause data inconsistency if the graph write fails. Implement a
more robust mechanism, like the transactional outbox pattern, to ensure updates
are reliably propagated.

Examples:

pkg/registry/registry.go [348-356]
	if err := r.db.PublishBatchDeviceUpdates(ctx, batch); err != nil {
		return fmt.Errorf("failed to publish device updates: %w", err)
	}

	r.applyRegistryStore(canonicalized, tombstones)

	if r.graphWriter != nil {
		r.graphWriter.WriteGraph(ctx, canonicalized)
	}
pkg/registry/age_graph_writer.go [84-86]
	if _, err := w.executor.ExecuteQuery(ctx, ageGraphMergeQuery, payload); err != nil && w.log != nil {
		w.log.Warn().Err(err).Msg("age graph: failed to merge batch")
	}

Solution Walkthrough:

Before:

// in pkg/registry/registry.go ProcessBatchDeviceUpdates()

// 1. Write to primary database stream
err := r.db.PublishBatchDeviceUpdates(ctx, batch)
if err != nil {
    return err
}
r.applyRegistryStore(canonicalized, tombstones)

// 2. Separately, write to AGE graph.
// This is a "fire-and-forget" call.
// If this fails, the graph becomes inconsistent with the primary DB.
if r.graphWriter != nil {
    r.graphWriter.WriteGraph(ctx, canonicalized)
}

After:

// Using a transactional outbox pattern

// In ProcessBatchDeviceUpdates()
// 1. Start a database transaction
tx := db.Begin()

// 2. Write device updates to primary tables within the transaction
tx.PublishBatchDeviceUpdates(ctx, batch)

// 3. Write the event/message for the graph to an "outbox" table within the same transaction
tx.InsertIntoOutboxTable(graphUpdatePayload)

// 4. Commit the transaction
tx.Commit()

// A separate processor will then reliably read from the outbox table
// and write to the AGE graph, with retries and dead-letter queueing.

Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies a critical data consistency issue between the primary database and the new AGE graph, as the current implementation lacks transactional guarantees for dual writes, which is a significant architectural risk.

High
Possible issue
Process graph writes asynchronously for resilience

Refactor WriteGraph to process updates asynchronously in a goroutine, preventing
the main registry flow from being blocked and improving resilience to transient
database errors.

pkg/registry/age_graph_writer.go [67-87]

 func (w *ageGraphWriter) WriteGraph(ctx context.Context, updates []*models.DeviceUpdate) {
-	if w == nil || w.executor == nil {
-		return
-	}
-	params := buildAgeGraphParams(updates)
-	if params == nil {
+	if w == nil || w.executor == nil || len(updates) == 0 {
 		return
 	}
 
-	payload, err := json.Marshal(params)
-	if err != nil {
-		if w.log != nil {
-			w.log.Warn().Err(err).Msg("age graph: failed to marshal params")
+	// Process in a separate goroutine to avoid blocking the caller.
+	go func() {
+		// Create a new context to avoid cancellation from the parent context.
+		detachedCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+		defer cancel()
+
+		params := buildAgeGraphParams(updates)
+		if params == nil {
+			return
 		}
-		return
-	}
 
-	if _, err := w.executor.ExecuteQuery(ctx, ageGraphMergeQuery, payload); err != nil && w.log != nil {
-		w.log.Warn().Err(err).Msg("age graph: failed to merge batch")
-	}
+		payload, err := json.Marshal(params)
+		if err != nil {
+			if w.log != nil {
+				w.log.Warn().Err(err).Msg("age graph: failed to marshal params")
+			}
+			return
+		}
+
+		if _, err := w.executor.ExecuteQuery(detachedCtx, ageGraphMergeQuery, payload); err != nil && w.log != nil {
+			w.log.Warn().Err(err).Str("payload", string(payload)).Msg("age graph: failed to merge batch")
+		}
+	}()
 }
  • Apply / Chat
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies that synchronous graph writes could block critical path processing and proposes a reasonable asynchronous pattern to improve resilience and performance.

Medium
General
Make new feature opt-in by default

Modify the ageGraphEnabled function to default to false when the
ENABLE_AGE_GRAPH_WRITES environment variable is unset, making the new graph
writer feature opt-in.

pkg/core/server.go [77-87]

 func ageGraphEnabled() bool {
 	val := strings.TrimSpace(strings.ToLower(os.Getenv("ENABLE_AGE_GRAPH_WRITES")))
 	switch val {
-	case "", "1", "true", "yes", "on":
+	case "1", "true", "yes", "on":
 		return true
-	case "0", "false", "no", "off":
+	default:
+		// Default to false for safety if the variable is unset, empty, or has an unrecognized value.
 		return false
-	default:
-		return true
 	}
 }
  • Apply / Chat
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies that the feature is opt-out by default and proposes making it opt-in, which is a safer approach for introducing new functionality and aligns with best practices for feature flagging.

Low
  • Update
Imported GitHub PR comment. Original author: @qodo-code-review[bot] Original URL: https://github.com/carverauto/serviceradar/pull/2039#issuecomment-3594312876 Original created: 2025-12-01T02:54:33Z --- ## PR Code Suggestions ✨ <!-- ef5b01f --> Explore these optional code suggestions: <table><thead><tr><td><strong>Category</strong></td><td align=left><strong>Suggestion&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </strong></td><td align=center><strong>Impact</strong></td></tr><tbody><tr><td rowspan=1>High-level</td> <td> <details><summary>Consider a more robust data consistency strategy</summary> ___ **The current approach of writing to the primary database and then separately to <br>the AGE graph can cause data inconsistency if the graph write fails. Implement a <br>more robust mechanism, like the transactional outbox pattern, to ensure updates <br>are reliably propagated.** ### Examples: <details> <summary> <a href="https://github.com/carverauto/serviceradar/pull/2039/files#diff-cb61d8f79451b9541de4a8cc0811523a68d15452b2f5971c7618ea5b423cf4ecR348-R356">pkg/registry/registry.go [348-356]</a> </summary> ```go if err := r.db.PublishBatchDeviceUpdates(ctx, batch); err != nil { return fmt.Errorf("failed to publish device updates: %w", err) } r.applyRegistryStore(canonicalized, tombstones) if r.graphWriter != nil { r.graphWriter.WriteGraph(ctx, canonicalized) } ``` </details> <details> <summary> <a href="https://github.com/carverauto/serviceradar/pull/2039/files#diff-2e6c0a5c048582201ffb983889ab9fab6fb1f8d64b91ee3fb65479b2fb2c6195R84-R86">pkg/registry/age_graph_writer.go [84-86]</a> </summary> ```go if _, err := w.executor.ExecuteQuery(ctx, ageGraphMergeQuery, payload); err != nil && w.log != nil { w.log.Warn().Err(err).Msg("age graph: failed to merge batch") } ``` </details> ### Solution Walkthrough: #### Before: ```go // in pkg/registry/registry.go ProcessBatchDeviceUpdates() // 1. Write to primary database stream err := r.db.PublishBatchDeviceUpdates(ctx, batch) if err != nil { return err } r.applyRegistryStore(canonicalized, tombstones) // 2. Separately, write to AGE graph. // This is a "fire-and-forget" call. // If this fails, the graph becomes inconsistent with the primary DB. if r.graphWriter != nil { r.graphWriter.WriteGraph(ctx, canonicalized) } ``` #### After: ```go // Using a transactional outbox pattern // In ProcessBatchDeviceUpdates() // 1. Start a database transaction tx := db.Begin() // 2. Write device updates to primary tables within the transaction tx.PublishBatchDeviceUpdates(ctx, batch) // 3. Write the event/message for the graph to an "outbox" table within the same transaction tx.InsertIntoOutboxTable(graphUpdatePayload) // 4. Commit the transaction tx.Commit() // A separate processor will then reliably read from the outbox table // and write to the AGE graph, with retries and dead-letter queueing. ``` <details><summary>Suggestion importance[1-10]: 9</summary> __ Why: The suggestion correctly identifies a critical data consistency issue between the primary database and the new AGE graph, as the current implementation lacks transactional guarantees for dual writes, which is a significant architectural risk. </details></details></td><td align=center>High </td></tr><tr><td rowspan=1>Possible issue</td> <td> <details><summary>Process graph writes asynchronously for resilience<!-- not_implemented --></summary> ___ **Refactor <code>WriteGraph</code> to process updates asynchronously in a goroutine, preventing <br>the main registry flow from being blocked and improving resilience to transient <br>database errors.** [pkg/registry/age_graph_writer.go [67-87]](https://github.com/carverauto/serviceradar/pull/2039/files#diff-2e6c0a5c048582201ffb983889ab9fab6fb1f8d64b91ee3fb65479b2fb2c6195R67-R87) ```diff func (w *ageGraphWriter) WriteGraph(ctx context.Context, updates []*models.DeviceUpdate) { - if w == nil || w.executor == nil { - return - } - params := buildAgeGraphParams(updates) - if params == nil { + if w == nil || w.executor == nil || len(updates) == 0 { return } - payload, err := json.Marshal(params) - if err != nil { - if w.log != nil { - w.log.Warn().Err(err).Msg("age graph: failed to marshal params") + // Process in a separate goroutine to avoid blocking the caller. + go func() { + // Create a new context to avoid cancellation from the parent context. + detachedCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second) + defer cancel() + + params := buildAgeGraphParams(updates) + if params == nil { + return } - return - } - if _, err := w.executor.ExecuteQuery(ctx, ageGraphMergeQuery, payload); err != nil && w.log != nil { - w.log.Warn().Err(err).Msg("age graph: failed to merge batch") - } + payload, err := json.Marshal(params) + if err != nil { + if w.log != nil { + w.log.Warn().Err(err).Msg("age graph: failed to marshal params") + } + return + } + + if _, err := w.executor.ExecuteQuery(detachedCtx, ageGraphMergeQuery, payload); err != nil && w.log != nil { + w.log.Warn().Err(err).Str("payload", string(payload)).Msg("age graph: failed to merge batch") + } + }() } ``` - [ ] **Apply / Chat** <!-- /improve --apply_suggestion=1 --> <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: The suggestion correctly identifies that synchronous graph writes could block critical path processing and proposes a reasonable asynchronous pattern to improve resilience and performance. </details></details></td><td align=center>Medium </td></tr><tr><td rowspan=1>General</td> <td> <details><summary>Make new feature opt-in by default<!-- not_implemented --></summary> ___ **Modify the <code>ageGraphEnabled</code> function to default to <code>false</code> when the <br><code>ENABLE_AGE_GRAPH_WRITES</code> environment variable is unset, making the new graph <br>writer feature opt-in.** [pkg/core/server.go [77-87]](https://github.com/carverauto/serviceradar/pull/2039/files#diff-a12d34beb8640efbbedad28f8610fd58afeea457572d82e4fd295ab2a6bb8ee6R77-R87) ```diff func ageGraphEnabled() bool { val := strings.TrimSpace(strings.ToLower(os.Getenv("ENABLE_AGE_GRAPH_WRITES"))) switch val { - case "", "1", "true", "yes", "on": + case "1", "true", "yes", "on": return true - case "0", "false", "no", "off": + default: + // Default to false for safety if the variable is unset, empty, or has an unrecognized value. return false - default: - return true } } ``` - [ ] **Apply / Chat** <!-- /improve --apply_suggestion=2 --> <details><summary>Suggestion importance[1-10]: 6</summary> __ Why: The suggestion correctly identifies that the feature is opt-out by default and proposes making it opt-in, which is a safer approach for introducing new functionality and aligns with best practices for feature flagging. </details></details></td><td align=center>Low </td></tr> <tr><td align="center" colspan="2"> - [ ] Update <!-- /improve_multi --more_suggestions=true --> </td><td></td></tr></tbody></table>
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
carverauto/serviceradar!2492
No description provided.