bug(core): can't save metrics #616
Labels
No labels
Reference
carverauto/serviceradar#616
Imported from GitHub.
Original GitHub issue: #1882
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1882
Original created: 2025-10-24T19:19:49Z
Describe the bug
Seeing these errors in the OTEL logs:

CRITICAL DB WRITE ERROR: Failed to flush/StoreMetrics

To Reproduce
Steps to reproduce the behavior:
Imported GitHub comment.
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1882#issuecomment-3444584950
Original created: 2025-10-24T19:20:34Z
Proton is out of disk again even though we gave it 1 TB in k8s??
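When a volume fills like this, a quick first check is to rank what is actually consuming the space inside the pod. A minimal sketch, assuming the `/var/lib/proton` data dir from the follow-up comment; the `top_usage` helper name is made up, and you would run this via `kubectl exec` against the real pod:

```shell
# Rank the biggest first-level directories under the Proton data dir.
# -x stays on one filesystem, -d 1 limits depth, -k reports sizes in KiB.
top_usage() {
  du -x -d 1 -k "${1:-/var/lib/proton}" 2>/dev/null | sort -rn | head
}

# Inside the pod, this would show whether nativelog or MergeTree data
# dominates the PVC.
top_usage
```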
Imported GitHub comment.
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1882#issuecomment-3445753394
Original created: 2025-10-25T04:17:23Z
Latest updates on Proton disk pressure:

- Found the PVC usage dominated by Proton's native logstore (`/var/lib/proton/nativelog`) rather than MergeTree data. The `unified_devices` stream alone was holding ~186 GiB of backlog despite 3d table TTLs.
- Tightened the packaged Proton config so nativelog retention aligns with data TTLs (3 days) and caps at ~50 GiB; also reduced segment size to 256 MiB so streams stop pre-allocating 4 GiB chunks.
- Built and pushed `ghcr.io/carverauto/serviceradar-proton:sha-385b06cbd38c` with the new config and rolled it out to the demo (prod) namespace.
- Scaled Proton down, cleared the stale nativelog folders on the PVC, and brought the deployment back up on the new image. Disk usage is now ~5 MB and climbing slowly under the enforced cap.

Next steps:

- Monitor `du -sh /var/lib/proton/nativelog/log/default` over the next few days; the backlog should plateau well under 50 GiB.
- (Optional) Add monitoring/alerts for the Proton PVC so we get early warnings if backlog creeps again.

Docs/runbook now note that the nativelog purge is a recovery-only step; normal ops shouldn't need repeated manual cleanups.
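The monitoring step above can be sketched as a small cron-able script. The path and the 50 GiB cap come from the comment; everything else (the 80% warning threshold, the env-var override) is illustrative, and a real deployment would more likely export this as a metric:

```shell
#!/bin/sh
# Warn when Proton's nativelog backlog approaches the configured cap.
NATIVELOG_DIR="${NATIVELOG_DIR:-/var/lib/proton/nativelog/log/default}"
CAP_KIB=$((50 * 1024 * 1024))   # 50 GiB cap, expressed in KiB
WARN_PCT=80                     # warn once backlog passes 80% of the cap

# du -sk prints usage in KiB; take the first field, default to 0 if the
# directory does not exist (e.g. right after a fresh rollout).
used_kib=$(du -sk "$NATIVELOG_DIR" 2>/dev/null | awk '{print $1}')
used_kib=${used_kib:-0}

pct=$(( used_kib * 100 / CAP_KIB ))
echo "nativelog backlog: ${used_kib} KiB (${pct}% of cap)"

if [ "$pct" -ge "$WARN_PCT" ]; then
  echo "WARNING: nativelog backlog above ${WARN_PCT}% of cap" >&2
  exit 1
fi
```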