nats retry connection fixes #2330
Reference
carverauto/serviceradar!2330
Imported from GitHub pull request.
Original GitHub pull request: #1788
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/1788
Original created: 2025-10-16T14:20:44Z
Original updated: 2025-10-16T15:05:08Z
Original head: carverauto/serviceradar:bug/event-writer-failing-nats
Original base: main
Original merged: 2025-10-16T15:05:04Z by @mfreeman451
PR Type
Bug fix, Enhancement
Description
Enhanced NATS connection retry logic with fatal error detection and automatic reconnection
Added comprehensive error handling for context cancellation and connection failures
Introduced test coverage for consumer and service reconnection scenarios
Refactored service lifecycle management with improved shutdown and configuration updates
File Walkthrough
consumer.go
Add error handling and fatal error detection to consumer
pkg/consumers/db-event-writer/consumer.go
ProcessMessages to return error instead of void for proper error propagation
pullConsumer interface to enable testing and dependency injection
isFatalFetchError and isContextError helper functions for error classification
error detection
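The PR text names isFatalFetchError and isContextError but does not show their bodies. The following is only a minimal sketch of how such classification helpers could look. The sentinel errors (errConnClosed, errConsumerDeleted, errFetchTimeout) are hypothetical stand-ins, not the errors the PR or the NATS client actually uses.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Hypothetical sentinel errors standing in for whatever the NATS client
// reports; the PR's real list of fatal errors is not shown in the source.
var (
	errConnClosed      = errors.New("connection closed")
	errConsumerDeleted = errors.New("consumer deleted")
	errFetchTimeout    = errors.New("fetch timeout") // assumed transient
)

// isContextError reports whether err stems from context cancellation or
// an expired deadline.
func isContextError(err error) bool {
	return errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded)
}

// isFatalFetchError reports whether err means the consumer cannot recover
// without a fresh connection (under the assumptions above).
func isFatalFetchError(err error) bool {
	if err == nil || isContextError(err) {
		return false
	}
	return errors.Is(err, errConnClosed) || errors.Is(err, errConsumerDeleted)
}

func main() {
	wrapped := fmt.Errorf("fetch: %w", errConnClosed)
	fmt.Println(isFatalFetchError(wrapped))         // wrapped fatal error
	fmt.Println(isFatalFetchError(errFetchTimeout)) // transient error
	fmt.Println(isContextError(context.Canceled))   // shutdown signal
}
```

Using errors.Is rather than direct comparison keeps the classification working when callers wrap the underlying error with extra context.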
consumer_test.go
Add unit tests for consumer fatal error handling
pkg/consumers/db-event-writer/consumer_test.go
fakePullConsumer and fakeMessageBatch test doubles for consumer testing
TestConsumerProcessMessagesReturnsFatalError to verify fatal error propagation
service_test.go
Add integration test for service reconnection logic
pkg/consumers/db-event-writer/service_test.go
TestServiceRunReconnectsAfterFatalError to verify reconnection behavior
attempts
service.go
Implement automatic reconnection and lifecycle management
pkg/consumers/db-event-writer/service.go
Start to use ensureConsumer and run loop for automatic reconnection
connectFactory field for dependency injection and testability
run method with retry logic that handles fatal errors and reconnects
ensureConsumer, establishConnection, setConnection, and resetConnection for connection lifecycle management
Stop with proper shutdown timeout and connection cleanup
UpdateConfig to properly restart service with new configuration
sleepWithContext helper for context-aware delays
createConnection (formerly Start logic) with enhanced NATS connection handlers
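As a rough illustration of the run loop and sleepWithContext described above, the sketch below retries a connect step and a processing step until the context is cancelled or processing finishes cleanly. The function signatures, the fixed delay, the errFatal sentinel, and the demoReconnect helper are assumptions made for this example; the PR's actual code is not shown in the source.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errFatal is a hypothetical stand-in for a fatal consumer error.
var errFatal = errors.New("fatal consumer error")

// sleepWithContext waits for d, or returns early with ctx.Err() if the
// context is cancelled first.
func sleepWithContext(ctx context.Context, d time.Duration) error {
	timer := time.NewTimer(d)
	defer timer.Stop()
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-timer.C:
		return nil
	}
}

// run connects, processes, and reconnects after any failure until ctx is
// cancelled or process returns nil.
func run(ctx context.Context, delay time.Duration,
	connect, process func(context.Context) error) error {
	for {
		if err := ctx.Err(); err != nil {
			return err
		}
		if err := connect(ctx); err == nil {
			if err := process(ctx); err == nil {
				return nil
			}
		}
		// Connect or process failed: back off, then try again.
		if err := sleepWithContext(ctx, delay); err != nil {
			return err
		}
	}
}

// demoReconnect fails processing once, then succeeds, and reports how many
// connection attempts were needed.
func demoReconnect() (attempts int, err error) {
	connect := func(context.Context) error { attempts++; return nil }
	process := func(context.Context) error {
		if attempts == 1 {
			return errFatal
		}
		return nil
	}
	err = run(context.Background(), time.Millisecond, connect, process)
	return attempts, err
}

func main() {
	fmt.Println(demoReconnect())
}
```

A timer with a select on ctx.Done() is used instead of time.Sleep so that shutdown is not delayed by a pending backoff.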
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1788#issuecomment-3411154603
Original created: 2025-10-16T14:21:24Z
PR Compliance Guide 🔍
Below is a summary of compliance checks for this PR:
No security concerns identified
No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
🎫 No ticket provided
Codebase context is not defined
Follow the guide to enable codebase context checks.
No custom compliance provided
Follow the guide to enable custom compliance check.
Compliance status legend
🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1788#issuecomment-3411162054
Original created: 2025-10-16T14:22:53Z
PR Code Suggestions ✨
Explore these optional code suggestions:
Use context cancellation for test shutdown
In TestServiceRunReconnectsAfterFatalError, use context cancellation to gracefully stop the service's run loop instead of returning context.Canceled from the mock connectFactory.
pkg/consumers/db-event-writer/service_test.go [33-50]
Suggestion importance[1-10]: 7
Why: The suggestion proposes a more robust and idiomatic way to control the test's lifecycle by using context cancellation, which better reflects real-world shutdown scenarios.
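The idea behind this suggestion can be sketched as follows. This is not the PR's test: runLoop is a simplified stand-in for the service's run loop, and the fake session function and simulateTest helper are hypothetical. The point is only that the loop ends because the context was cancelled, not because the fake returned a special error value.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// runLoop is a simplified stand-in for a reconnecting run loop: it starts
// a new session after every failure and exits only when ctx is done.
func runLoop(ctx context.Context, session func(context.Context) error) error {
	for {
		if err := ctx.Err(); err != nil {
			return err
		}
		_ = session(ctx) // any session error leads to another attempt
	}
}

// simulateTest mimics the body of a test: the fake session always fails,
// and the context is cancelled once the second attempt has been observed.
func simulateTest() (attempts int, err error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	fake := func(context.Context) error {
		attempts++
		if attempts == 2 {
			cancel() // cancellation, not the returned error, stops the loop
		}
		return errors.New("fatal fetch error")
	}

	err = runLoop(ctx, fake)
	return attempts, err
}

func main() {
	attempts, err := simulateTest()
	fmt.Println(attempts, err)
}
```

In a real test the same checks would be assertions: two attempts were made, and the loop returned the context's cancellation error.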
Refactor duplicated error handling logic
Refactor the duplicated error handling for c.consumer.Fetch and msgs.Error() into a new helper method to improve maintainability.
pkg/consumers/db-event-writer/consumer.go [120-169]
[To ensure code accuracy, apply this suggestion manually]
Suggestion importance[1-10]: 6
Why: The suggestion correctly identifies duplicated error handling logic and proposes a valid refactoring into a helper function, which would improve code maintainability and reduce redundancy.
Remove redundant connection closing logic
Remove the redundant connection closing block from setConnection as callers are already responsible for closing the connection via resetConnection.
pkg/consumers/db-event-writer/service.go [214-225]
Suggestion importance[1-10]: 4
Why: The suggestion correctly identifies that the connection closing logic in setConnection is redundant because all call paths already handle closing the connection via resetConnection.
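A minimal sketch of the division of responsibility this suggestion describes, assuming a mutex-guarded connection field: setConnection only stores the new connection, while resetConnection closes and clears the old one. The closer interface, fakeConn type, service struct layout, and demoSwap helper are all hypothetical; the real types in service.go are not shown in the source.

```go
package main

import (
	"fmt"
	"sync"
)

// closer is a hypothetical minimal view of a connection.
type closer interface{ Close() }

// fakeConn records whether Close was called.
type fakeConn struct{ closed bool }

func (c *fakeConn) Close() { c.closed = true }

// service holds the current connection behind a mutex.
type service struct {
	mu   sync.Mutex
	conn closer
}

// setConnection only stores the new connection. It assumes every caller
// has already released the previous one through resetConnection.
func (s *service) setConnection(c closer) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.conn = c
}

// resetConnection clears the current connection and closes it, if any.
// Close runs outside the lock so a slow close does not block other callers.
func (s *service) resetConnection() {
	s.mu.Lock()
	conn := s.conn
	s.conn = nil
	s.mu.Unlock()
	if conn != nil {
		conn.Close()
	}
}

// demoSwap replaces one connection with another and reports which of the
// two ended up closed.
func demoSwap() (oldClosed, newClosed bool) {
	s := &service{}
	first, second := &fakeConn{}, &fakeConn{}
	s.setConnection(first)
	s.resetConnection()
	s.setConnection(second)
	return first.closed, second.closed
}

func main() {
	fmt.Println(demoSwap())
}
```

This layout only stays leak-free if the stated assumption holds for every call path; a caller that skips resetConnection would leave the old connection open.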