carverauto/serviceradar

Fork 0

fixing flakey test #2550

Merged

mfreeman451 merged 1 commit from refs/pull/2550/head into main

2025-12-12 04:35:28 +00:00

mfreeman451 commented

2025-12-12 04:34:57 +00:00

(Migrated from github.com)

Owner

Imported from GitHub pull request.

Original GitHub pull request: #2111
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/2111
Original created: 2025-12-12T04:34:57Z
Original updated: 2025-12-12T04:36:12Z
Original head: carverauto/serviceradar:update/fix_flakey_kv_tests
Original base: main
Original merged: 2025-12-12T04:35:28Z by @mfreeman451

User description

IMPORTANT: Please sign the Developer Certificate of Origin

Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include
a DCO sign-off statement indicating the DCO acceptance in one commit message. Here
is an example DCO Signed-off-by line in a commit message:

Signed-off-by: J. Doe <j.doe@domain.com>

Describe your changes

Issue ticket number and link

Code checklist before requesting a review

I have signed the DCO?
The build completes without errors?
All tests are passing when running make test?

PR Type

Tests

Description

Increased timeout values in flaky test cases
Expanded watch channel buffer from 1 to 8 to prevent blocking
Adjusted timing thresholds to improve test reliability

Diagram Walkthrough

flowchart LR
  A["Test Timeouts"] -->|increased| B["50ms → 500ms<br/>25ms → 100ms<br/>30ms → 100ms"]
  C["Watch Channel"] -->|buffer expanded| D["1 → 8 capacity"]
  B --> E["Improved Test Stability"]
  D --> E

File Walkthrough

Relevant files

Tests

kv_client_test.go

Increase test timeouts and watch channel buffer

pkg/config/kv_client_test.go

Increased timeout in TestKVManagerStartWatchReloadsOnAnyChange from
50ms to 500ms for reload detection
Increased timeout in same test from 25ms to 100ms for no-reload
scenario
Increased timeout in TestKVManagerStartWatchReappliesPinnedOverlay
from 30ms to 100ms
Expanded watchKVStore.Watch channel buffer from 1 to 8 to prevent emit
operations from blocking

+4/-4

Imported from GitHub pull request. Original GitHub pull request: #2111 Original author: @mfreeman451 Original URL: https://github.com/carverauto/serviceradar/pull/2111 Original created: 2025-12-12T04:34:57Z Original updated: 2025-12-12T04:36:12Z Original head: carverauto/serviceradar:update/fix_flakey_kv_tests Original base: main Original merged: 2025-12-12T04:35:28Z by @mfreeman451 --- ### **User description** ## IMPORTANT: Please sign the Developer Certificate of Origin Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include a [DCO sign-off statement]( https://developercertificate.org/) indicating the DCO acceptance in one commit message. Here is an example DCO Signed-off-by line in a commit message: ``` Signed-off-by: J. Doe <j.doe@domain.com> ``` ## Describe your changes ## Issue ticket number and link ## Code checklist before requesting a review - [ ] I have signed the DCO? - [ ] The build completes without errors? - [ ] All tests are passing when running make test? ___ ### **PR Type** Tests ___ ### **Description** - Increased timeout values in flaky test cases - Expanded watch channel buffer from 1 to 8 to prevent blocking - Adjusted timing thresholds to improve test reliability ___ ### Diagram Walkthrough ```mermaid flowchart LR A["Test Timeouts"] -->|increased| B["50ms → 500ms<br/>25ms → 100ms<br/>30ms → 100ms"] C["Watch Channel"] -->|buffer expanded| D["1 → 8 capacity"] B --> E["Improved Test Stability"] D --> E ``` <details><summary><h3>File Walkthrough</h3></summary> <table><thead><tr><th></th><th align="left">Relevant files</th></tr></thead><tbody><tr><td><strong>Tests</strong></td><td><table> <tr> <td> <details> <summary><strong>kv_client_test.go</strong><dd><code>Increase test timeouts and watch channel buffer</code>                    </dd></summary> <hr> pkg/config/kv_client_test.go <ul><li>Increased timeout in <code>TestKVManagerStartWatchReloadsOnAnyChange</code> from <br>50ms to 500ms for reload detection<br> <li> Increased timeout in same test from 25ms to 100ms for no-reload <br>scenario<br> <li> Increased timeout in <code>TestKVManagerStartWatchReappliesPinnedOverlay</code> <br>from 30ms to 100ms<br> <li> Expanded <code>watchKVStore.Watch</code> channel buffer from 1 to 8 to prevent emit <br>operations from blocking</ul> </details> </td> <td><a href="https://github.com/carverauto/serviceradar/pull/2111/files#diff-09814dd9ab05f9d40f40974f297806d71fe2eca5f8120492c9f79311153d3aef">+4/-4</a>      </td> </tr> </table></td></tr></tbody></table> </details> ___

qodo-code-review[bot] commented

2025-12-12 04:35:16 +00:00

(Migrated from github.com)

Author

Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2111#issuecomment-3644869811
Original created: 2025-12-12T04:35:16Z

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢	No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
⚪	🎫 No ticket provided Create ticket/issue
Codebase Duplication Compliance
⚪	Codebase context is not defined Follow the guide to enable codebase context checks.
Custom Compliance
🟢	Generic: Meaningful Naming and Self-Documenting Code Objective: Ensure all identifiers clearly express their purpose and intent, making code self-documenting Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Secure Error Handling Objective: To prevent the leakage of sensitive system information through error messages while providing sufficient detail for internal debugging. Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Secure Logging Practices Objective: To ensure logs are useful for debugging and auditing without exposing sensitive information like PII, PHI, or cardholder data. Status: Passed Learn more about managing compliance generic rules or creating your own custom rules
⚪	Generic: Comprehensive Audit Trails Objective: To create a detailed and reliable record of critical system actions for security analysis and compliance. Status: No audit scope: The changes only adjust test timeouts and a buffered channel size without introducing or affecting audit logging of critical actions, so compliance cannot be determined from this diff. Referred Code store.emit([]byte(`{"listen_addr":":9090"}`)) select { case <-reloads: case <-time.After(500 * time.Millisecond): t.Fatalf("timed out waiting for reload triggered by listen_addr change") } if snap := cfg.snapshot(); snap.ListenAddr != ":9090" { t.Fatalf("expected config to apply listen_addr change, got %q", snap.ListenAddr) } store.emit([]byte(`{"listen_addr":":9090"}`)) select { case <-reloads: t.Fatalf("unexpected reload when config payload did not change") case <-time.After(100 * time.Millisecond): } } ... (clipped 141 lines) Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Robust Error Handling and Edge Case Management Objective: Ensure comprehensive error handling that provides meaningful context and graceful degradation Status: Timeout handling: Tests use fixed timeouts and rely on timing-based selects without explicit handling for slow environments, which may still be flaky though the change increases thresholds. Referred Code select { case <-reloads: case <-time.After(500 * time.Millisecond): t.Fatalf("timed out waiting for reload triggered by listen_addr change") } if snap := cfg.snapshot(); snap.ListenAddr != ":9090" { t.Fatalf("expected config to apply listen_addr change, got %q", snap.ListenAddr) } store.emit([]byte(`{"listen_addr":":9090"}`)) select { case <-reloads: t.Fatalf("unexpected reload when config payload did not change") case <-time.After(100 * time.Millisecond): } } func TestKVManagerStartWatchReappliesPinnedOverlay(t *testing.T) { initialKV := []byte(`{"listen_addr":":8080","logging":{"level":"info"}}`) ... (clipped 39 lines) Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Security-First Input Validation and Data Handling Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent vulnerabilities Status: Test-only changes: The diff only modifies test timeouts and a channel buffer and does not touch input validation or data handling paths, so security validation compliance cannot be assessed from this change. Referred Code `func (s *watchKVStore) Watch(ctx context.Context, _ string) (<-chan []byte, error) { s.mu.Lock() if s.watchCh == nil { s.watchCh = make(chan []byte, 8) // Larger buffer to prevent emit from blocking s.readyOnce.Do(func() {` Learn more about managing compliance generic rules or creating your own custom rules

Compliance status legend

🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

Imported GitHub PR comment. Original author: @qodo-code-review[bot] Original URL: https://github.com/carverauto/serviceradar/pull/2111#issuecomment-3644869811 Original created: 2025-12-12T04:35:16Z --- ## PR Compliance Guide 🔍  Below is a summary of compliance checks for this PR:<br> <table><tbody><tr><td colspan='2'><strong>Security Compliance</strong></td></tr> <tr><td>🟢</td><td><details><summary><strong>No security concerns identified</strong></summary> No security vulnerabilities detected by AI analysis. Human verification advised for critical code. </details></td></tr> <tr><td colspan='2'><strong>Ticket Compliance</strong></td></tr> <tr><td>⚪</td><td><details><summary>🎫 <strong>No ticket provided </strong></summary> - [ ] Create ticket/issue  </details></td></tr> <tr><td colspan='2'><strong>Codebase Duplication Compliance</strong></td></tr> <tr><td>⚪</td><td><details><summary><strong>Codebase context is not defined </strong></summary> Follow the <a href='https://qodo-merge-docs.qodo.ai/core-abilities/rag_context_enrichment/'>guide</a> to enable codebase context checks. </details></td></tr> <tr><td colspan='2'><strong>Custom Compliance</strong></td></tr> <tr><td rowspan=3>🟢</td><td> <details><summary><strong>Generic: Meaningful Naming and Self-Documenting Code</strong></summary><br> **Objective:** Ensure all identifiers clearly express their purpose and intent, making code <br>self-documenting<br> **Status:** Passed<br> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td> <details><summary><strong>Generic: Secure Error Handling</strong></summary><br> **Objective:** To prevent the leakage of sensitive system information through error messages while <br>providing sufficient detail for internal debugging.<br> **Status:** Passed<br> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td> <details><summary><strong>Generic: Secure Logging Practices</strong></summary><br> **Objective:** To ensure logs are useful for debugging and auditing without exposing sensitive <br>information like PII, PHI, or cardholder data.<br> **Status:** Passed<br> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td rowspan=3>⚪</td> <td><details> <summary><strong>Generic: Comprehensive Audit Trails</strong></summary><br> **Objective:** To create a detailed and reliable record of critical system actions for security analysis <br>and compliance.<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/2111/files#diff-09814dd9ab05f9d40f40974f297806d71fe2eca5f8120492c9f79311153d3aefR330-R491'><strong>No audit scope</strong></a>: The changes only adjust test timeouts and a buffered channel size without introducing or <br>affecting audit logging of critical actions, so compliance cannot be determined from this <br>diff.<br> <details open><summary>Referred Code</summary> ```go store.emit([]byte(`{"listen_addr":":9090"}`)) select { case <-reloads: case <-time.After(500 * time.Millisecond): t.Fatalf("timed out waiting for reload triggered by listen_addr change") } if snap := cfg.snapshot(); snap.ListenAddr != ":9090" { t.Fatalf("expected config to apply listen_addr change, got %q", snap.ListenAddr) } store.emit([]byte(`{"listen_addr":":9090"}`)) select { case <-reloads: t.Fatalf("unexpected reload when config payload did not change") case <-time.After(100 * time.Millisecond): } } ... (clipped 141 lines) ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td><details> <summary><strong>Generic: Robust Error Handling and Edge Case Management</strong></summary><br> **Objective:** Ensure comprehensive error handling that provides meaningful context and graceful <br>degradation<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/2111/files#diff-09814dd9ab05f9d40f40974f297806d71fe2eca5f8120492c9f79311153d3aefR332-R391'><strong>Timeout handling</strong></a>: Tests use fixed timeouts and rely on timing-based selects without explicit handling for <br>slow environments, which may still be flaky though the change increases thresholds.<br> <details open><summary>Referred Code</summary> ```go select { case <-reloads: case <-time.After(500 * time.Millisecond): t.Fatalf("timed out waiting for reload triggered by listen_addr change") } if snap := cfg.snapshot(); snap.ListenAddr != ":9090" { t.Fatalf("expected config to apply listen_addr change, got %q", snap.ListenAddr) } store.emit([]byte(`{"listen_addr":":9090"}`)) select { case <-reloads: t.Fatalf("unexpected reload when config payload did not change") case <-time.After(100 * time.Millisecond): } } func TestKVManagerStartWatchReappliesPinnedOverlay(t *testing.T) { initialKV := []byte(`{"listen_addr":":8080","logging":{"level":"info"}}`) ... (clipped 39 lines) ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td><details> <summary><strong>Generic: Security-First Input Validation and Data Handling</strong></summary><br> **Objective:** Ensure all data inputs are validated, sanitized, and handled securely to prevent <br>vulnerabilities<br> **Status:** <br><a href='https://github.com/carverauto/serviceradar/pull/2111/files#diff-09814dd9ab05f9d40f40974f297806d71fe2eca5f8120492c9f79311153d3aefR483-R487'><strong>Test-only changes</strong></a>: The diff only modifies test timeouts and a channel buffer and does not touch input <br>validation or data handling paths, so security validation compliance cannot be assessed <br>from this change.<br> <details open><summary>Referred Code</summary> ```go func (s *watchKVStore) Watch(ctx context.Context, _ string) (<-chan []byte, error) { s.mu.Lock() if s.watchCh == nil { s.watchCh = make(chan []byte, 8) // Larger buffer to prevent emit from blocking s.readyOnce.Do(func() { ``` </details> > Learn more about managing compliance <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#configuration-options'>generic rules</a> or creating your own <a href='https://qodo-merge-docs.qodo.ai/tools/compliance/#custom-compliance'>custom rules</a> </details></td></tr> <tr><td align="center" colspan="2">   </td></tr></tbody></table> <details><summary>Compliance status legend</summary> 🟢 - Fully Compliant<br> 🟡 - Partial Compliant<br> 🔴 - Not Compliant<br> ⚪ - Requires Further Human Verification<br> 🏷️ - Compliance label<br> </details>

qodo-code-review[bot] commented

2025-12-12 04:36:12 +00:00

(Migrated from github.com)

Author

Owner

Imported GitHub PR comment.

Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2111#issuecomment-3644871358
Original created: 2025-12-12T04:36:12Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
General	Use a non-blocking channel read Replace the timed wait (`time.After`) with a non-blocking channel read using a `select` with a `default` case to deterministically check for an unexpected reload. pkg/config/kv_client_test.go [342-348] store.emit([]byte(`{"listen_addr":":9090"}`)) select { case <-reloads: t.Fatalf("unexpected reload when config payload did not change") - case <-time.After(100 * time.Millisecond): + default: + // No reload, which is the expected behavior. } `[To ensure code accuracy, apply this suggestion manually]` Suggestion importance[1-10]: 7 __ Why: The suggestion correctly identifies that using `time.After` to check for the absence of a signal is a source of test flakiness and proposes a more robust, non-blocking `select` with a `default` case.	Medium
More

Imported GitHub PR comment. Original author: @qodo-code-review[bot] Original URL: https://github.com/carverauto/serviceradar/pull/2111#issuecomment-3644871358 Original created: 2025-12-12T04:36:12Z --- ## PR Code Suggestions ✨  Explore these optional code suggestions: <table><thead><tr><td><strong>Category</strong></td><td align=left><strong>Suggestion                                                                                                                                    </strong></td><td align=center><strong>Impact</strong></td></tr><tbody><tr><td rowspan=1>General</td> <td> <details><summary>Use a non-blocking channel read</summary> ___ **Replace the timed wait (<code>time.After</code>) with a non-blocking channel read using a <br><code>select</code> with a <code>default</code> case to deterministically check for an unexpected reload.** [pkg/config/kv_client_test.go [342-348]](https://github.com/carverauto/serviceradar/pull/2111/files#diff-09814dd9ab05f9d40f40974f297806d71fe2eca5f8120492c9f79311153d3aefR342-R348) ```diff store.emit([]byte(`{"listen_addr":":9090"}`)) select { case <-reloads: t.Fatalf("unexpected reload when config payload did not change") - case <-time.After(100 * time.Millisecond): + default: + // No reload, which is the expected behavior. } ``` `[To ensure code accuracy, apply this suggestion manually]` <details><summary>Suggestion importance[1-10]: 7</summary> __ Why: The suggestion correctly identifies that using `time.After` to check for the absence of a signal is a source of test flakiness and proposes a more robust, non-blocking `select` with a `default` case. </details></details></td><td align=center>Medium </td></tr> <tr><td align="center" colspan="2"> - [ ] More  </td><td></td></tr></tbody></table>