bug(core): identitymap #595
Reference
carverauto/serviceradar#595
Imported from GitHub.
Original GitHub issue: #1846
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1846
Original created: 2025-10-22T06:15:29Z
Describe the bug
Seeing these messages in our OTEL logs again:

Ignoring corrupt canonical identity entry in KV

Seems like a regression of #1842. It is possibly also related to our restarts: agents or other services re-registering themselves, e.g. after their pod got a new IP address. We are also noticing that on almost every restart of our k8s deployment in the demo namespace, we accumulate an extra device in the inventory. We should have 50,002 devices, but after several restarts we're at roughly 50,011.
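The inventory drift described above is consistent with registrations being keyed by something unstable, such as the pod IP. A minimal, hypothetical Go sketch (not ServiceRadar's actual registry code) of why an IP-keyed map accumulates entries across restarts while a canonical-device-ID-keyed map does not:

```go
package main

import "fmt"

// inventory is a toy device registry keyed by an arbitrary string.
type inventory map[string]string

// register upserts a device under key; an unseen key grows the inventory.
func (inv inventory) register(key, deviceID string) { inv[key] = deviceID }

func main() {
	byIP, byID := inventory{}, inventory{}

	// Initial deployment: agent pod at 10.42.111.102.
	byIP.register("10.42.111.102", "agent-1")
	byID.register("agent-1", "agent-1")

	// After a k8s restart the pod gets a new IP but is the same device.
	byIP.register("10.42.111.250", "agent-1")
	byID.register("agent-1", "agent-1")

	fmt.Println(len(byIP), len(byID)) // prints "2 1": the IP-keyed map drifted
}
```

If the registry falls back to IP-keyed identity when the canonical mapping fails to decode (the "corrupt canonical identity entry" warning), the two symptoms would be connected.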
Imported GitHub comment.
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1846#issuecomment-3430820694
Original created: 2025-10-22T07:25:48Z
Quick update: after today's roll we dug into the canonical identity corruption reports. Core had previously logged repeated warnings on the affected key, so we tailed that key via the tools pod and also wrote a small Go scanner that walked ~37k canonical entries in the bucket. Everything currently in the bucket unmarshals cleanly, and the watch output shows the poller agent rewriting the canonical payload with the expected proto. We haven't reproduced the bad wire format again post-restart, so the working theory is that we caught a transient malformed publish that was later overwritten. Next step is to instrument/trace the poller → registry pipeline so we can capture the exact writer if the corruption resurfaces.
Imported GitHub comment.
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1846#issuecomment-3430822006
Original created: 2025-10-22T07:26:17Z
Follow-up detail: the key we were chasing was device_canonical_map/device-id/default=3A10.42.111.102 and the bucket was serviceradar-kv; both decoded fine during the scan.
Imported GitHub comment.
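The "=3A" in that key is most likely not corruption but an escape: NATS JetStream KV keys permit only a restricted character set, so a literal ":" (hex 0x3A) has to be encoded before it can appear in a key. A hypothetical helper pair illustrating that convention (the exact allowed character set used here is an assumption, not ServiceRadar's actual code):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// escapeKeyPart hex-escapes bytes outside an assumed allowed set as "=XX",
// so e.g. "default:10.42.111.102" becomes "default=3A10.42.111.102".
func escapeKeyPart(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case c >= 'a' && c <= 'z', c >= 'A' && c <= 'Z', c >= '0' && c <= '9',
			c == '-', c == '_', c == '.', c == '/':
			b.WriteByte(c)
		default:
			fmt.Fprintf(&b, "=%02X", c)
		}
	}
	return b.String()
}

// unescapeKeyPart reverses escapeKeyPart, turning "=XX" back into raw bytes.
func unescapeKeyPart(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); {
		if s[i] == '=' && i+3 <= len(s) {
			if v, err := strconv.ParseUint(s[i+1:i+3], 16, 8); err == nil {
				b.WriteByte(byte(v))
				i += 3
				continue
			}
		}
		b.WriteByte(s[i])
		i++
	}
	return b.String()
}

func main() {
	fmt.Println(unescapeKeyPart("default=3A10.42.111.102")) // prints "default:10.42.111.102"
}
```

If that reading is right, the stored key decodes to device-id "default:10.42.111.102", i.e. a partition-qualified IP, which ties back to the IP-churn theory in the description.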
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1846#issuecomment-3432926966
Original created: 2025-10-22T15:12:04Z
Closing, can't repro.