reconnection work #2320
No reviewers
Labels
No labels
1week
2weeks
Failed compliance check
IP cameras
NATS
Possible security concern
Review effort 1/5
Review effort 2/5
Review effort 3/5
Review effort 4/5
Review effort 5/5
UI
aardvark
accessibility
amd64
api
arm64
auth
back-end
bgp
blog
bug
build
checkers
ci-cd
cleanup
cnpg
codex
core
dependencies
device-management
documentation
duplicate
dusk
ebpf
enhancement
eta 1d
eta 1hr
eta 3d
eta 3hr
feature
fieldsurvey
github_actions
go
good first issue
help wanted
invalid
javascript
k8s
log-collector
mapper
mtr
needs-triage
netflow
network-sweep
observability
oracle
otel
plug-in
proton
python
question
reddit
redhat
research
rperf
rperf-checker
rust
sdk
security
serviceradar-agent
serviceradar-agent-gateway
serviceradar-web
serviceradar-web-ng
siem
snmp
sysmon
topology
ubiquiti
wasm
wontfix
zen-engine
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
carverauto/serviceradar!2320
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "refs/pull/2320/head"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Imported from GitHub pull request.
Original GitHub pull request: #1761
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/1761
Original created: 2025-10-14T20:56:46Z
Original updated: 2025-10-14T21:03:14Z
Original head: carverauto/serviceradar:1760-bugflowgger-crashes-if-it-cant-reach-nats
Original base: main
Original merged: 2025-10-14T21:03:10Z by @mfreeman451
PR Type
Enhancement, Bug fix
Description
Add configurable retry logic with exponential backoff for NATS connection failures
Implement graceful error handling to prevent crashes when NATS is unreachable
Add three new configuration parameters for connection retry behavior
Refactor connection logic to support unlimited or limited retry attempts
Diagram Walkthrough
File Walkthrough
nats_output.rs
Implement NATS connection retry with exponential backoffcmd/flowgger/src/flowgger/output/nats_output.rs
connect_attempts,connect_initial_backoff,connect_max_backoff) toNATSConfigconnect_with_retry()method with exponential backoff logicconnect()toconnect_once()for single connection attemptsflowgger.toml
Add retry configuration parameters to example configcmd/flowgger/flowgger.toml
nats_connect_attemptsconfiguration (0 = unlimited retries)nats_connect_initial_backoff_msconfiguration (default 500ms)nats_connect_max_backoff_msconfiguration (default 30000ms)flowgger.toml
Add retry configuration parameters to packaging configpackaging/flowgger/config/flowgger.toml
nats_connect_attemptsconfiguration (0 = unlimited retries)nats_connect_initial_backoff_msconfiguration (default 500ms)nats_connect_max_backoff_msconfiguration (default 30000ms)Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1761#issuecomment-3403591711
Original created: 2025-10-14T20:57:21Z
PR Compliance Guide 🔍
Below is a summary of compliance checks for this PR:
Verbose error logging
Description: The code logs connection failures using eprintln! which writes to stderr without
structured logging or rate limiting, potentially exposing sensitive connection details and
causing log flooding during outages.
nats_output.rs [179-214]
Referred Code
🎫 #1760
Codebase context is not defined
Follow the guide to enable codebase context checks.
No custom compliance provided
Follow the guide to enable custom compliance check.
Compliance status legend
🟢 - Fully Compliant🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1761#issuecomment-3403594243
Original created: 2025-10-14T20:58:17Z
PR Code Suggestions ✨
Explore these optional code suggestions:
Replace custom retry logic with a standard library
Replace the custom exponential backoff logic with a standard library like the
backoffcrate. This will add jitter to prevent "thundering herd" issues andsimplify the implementation.
Examples:
cmd/flowgger/src/flowgger/output/nats_output.rs [165-199]
Solution Walkthrough:
Before:
After:
Suggestion importance[1-10]: 8
__
Why: The suggestion correctly identifies a significant design flaw—the lack of jitter in the custom retry logic—and proposes a robust, standard solution using the
backoffcrate, which improves production-readiness.Ensure max backoff is not less than initial
To prevent unexpected behavior, ensure
connect_max_backoffis always greaterthan or equal to
connect_initial_backoffby adjusting it if a user configures itto be smaller.
cmd/flowgger/src/flowgger/output/nats_output.rs [103-105]
Suggestion importance[1-10]: 6
__
Why: The suggestion correctly identifies a potential user misconfiguration and proposes a reasonable change to handle it gracefully, making the new retry feature more robust and less surprising.
Simplify backoff calculation logic
Simplify the exponential backoff calculation by replacing the verbose
checked_mul,unwrap_or, andminchain with a more directmin(backoff * 2,self.cfg.connect_max_backoff).cmd/flowgger/src/flowgger/output/nats_output.rs [192-195]
Suggestion importance[1-10]: 5
__
Why: The suggestion correctly points out that the backoff calculation can be simplified, improving code readability and conciseness without sacrificing safety in this context.