k8s updates, nats updates for otel to create stream if missing, poller #2319
Imported from GitHub pull request.
Original GitHub pull request: #1759
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/1759
Original created: 2025-10-14T18:42:54Z
Original updated: 2025-10-14T20:29:37Z
Original head: carverauto/serviceradar:update/stability_core_updates
Original base: main
Original merged: 2025-10-14T20:29:34Z by @mfreeman451
User description
stability fixes
PR Type
Enhancement, Bug fix
Description
Enhanced NATS stream configuration with retention policies (max_bytes, max_age)
Added thread-safe core client access with nil checks in poller
Increased Kubernetes resource limits for core, KV, and NATS services
Improved error handling for unavailable core client connections
Diagram Walkthrough
File Walkthrough
2 files
Add thread-safe core client access and error handling
Add file permissions for Kong configuration
2 files
Add NATS stream retention configuration fields
Implement stream retention policy updates and validation
7 files
Make wait-for service attempts configurable via environment
Add memory limits and CPU constraints to services
Configure NATS stream retention parameters
Remove NATS health check and add retention settings
Increase CPU and memory resource limits
Increase CPU resource limits for KV service
Increase CPU resource limits for NATS service
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1759#issuecomment-3403152723
Original created: 2025-10-14T18:43:41Z
PR Compliance Guide 🔍
Below is a summary of compliance checks for this PR:
Resource exhaustion
Description: The stream retention limits use very large defaults (max_bytes ≈ 2 GiB, max_age ≈ 30
minutes) which could still allow high storage growth if subjects proliferate; ensure
quotas align with storage capacity and that unauthorized config escalation is not possible
through untrusted config sources.
nats_output.rs [68-136]
DoS via infinite wait
Description: Allowing unlimited or very high connection wait attempts via environment variables
(default 0 meaning potentially infinite) can cause denial of service or stuck startup if
services are unreachable; consider sane upper bounds or timeouts.
entrypoint-db-event-writer.sh [52-76]
🎫 No ticket provided
Codebase context is not defined
Follow the guide to enable codebase context checks.
No custom compliance provided
Follow the guide to enable custom compliance check.
Compliance status legend
🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1759#issuecomment-3403155904
Original created: 2025-10-14T18:44:52Z
PR Code Suggestions ✨
Explore these optional code suggestions:
✅ Restore a sensible default for retries
Suggestion Impact: The commit changed NATS_ATTEMPTS to use a non-zero default via DEFAULT_WAIT_ATTEMPTS (60 by default), addressing the issue of 0 retries and restoring a resilient default, though not exactly to 30.
In entrypoint-db-event-writer.sh, restore the default number of NATS connection attempts to 30 instead of 0 to ensure service startup is resilient.
docker/compose/entrypoint-db-event-writer.sh [52]
[Suggestion processed] Suggestion importance [1-10]: 8
Why: The suggestion correctly identifies that changing the default number of wait attempts from 30 to 0 makes the service startup brittle and likely to fail in environments with non-deterministic startup order.
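The retry-default concern can be illustrated with a small helper that turns a raw environment value into a bounded attempt count. The actual entrypoint is a shell script; this Go sketch only mirrors the idea. The NATS_ATTEMPTS variable name and the default of 60 come from the suggestion impact above, while resolveAttempts and the upper bound of 600 are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

const (
	defaultWaitAttempts = 60  // default reported in the suggestion impact
	maxWaitAttempts     = 600 // hypothetical upper bound to avoid near-infinite waits
)

// resolveAttempts maps a raw environment value to a bounded attempt count.
// Empty, non-numeric, zero, or negative values fall back to the default;
// values above the bound are clamped.
func resolveAttempts(raw string) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return defaultWaitAttempts
	}
	if n > maxWaitAttempts {
		return maxWaitAttempts
	}
	return n
}

func main() {
	// os.Getenv returns "" when the variable is unset, which selects the default.
	fmt.Println("attempts from environment:", resolveAttempts(os.Getenv("NATS_ATTEMPTS")))

	for _, raw := range []string{"", "0", "30", "100000"} {
		fmt.Printf("%q -> %d\n", raw, resolveAttempts(raw))
	}
}
```

Treating 0 as "use the default" rather than "retry forever" addresses both the brittle-startup concern raised here and the unbounded-wait concern from the compliance check.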
Simplify function by using struct's client
Refactor reportToCoreStreaming to be a method on Poller and use the struct's p.coreClient directly, removing the need to pass the client as an argument and eliminating a redundant nil check.
pkg/poller/poller.go [618-626]
[To ensure code accuracy, apply this suggestion manually] Suggestion importance [1-10]: 5
Why: The suggestion correctly points out redundant code and an opportunity to simplify the function signature, which improves code clarity and maintainability.