BuildKeysFromRecord omits alias metadata, leaving stale alias keys undeleted #690
Labels
No labels
1week
2weeks
Failed compliance check
IP cameras
NATS
Possible security concern
Review effort 1/5
Review effort 2/5
Review effort 3/5
Review effort 4/5
Review effort 5/5
UI
aardvark
accessibility
amd64
api
arm64
auth
back-end
bgp
blog
bug
build
checkers
ci-cd
cleanup
cnpg
codex
core
dependencies
device-management
documentation
duplicate
dusk
ebpf
enhancement
eta 1d
eta 1hr
eta 3d
eta 3hr
feature
fieldsurvey
github_actions
go
good first issue
help wanted
invalid
javascript
k8s
log-collector
mapper
mtr
needs-triage
netflow
network-sweep
observability
oracle
otel
plug-in
proton
python
question
reddit
redhat
research
rperf
rperf-checker
rust
sdk
security
serviceradar-agent
serviceradar-agent-gateway
serviceradar-web
serviceradar-web-ng
siem
snmp
sysmon
topology
ubiquiti
wasm
wontfix
zen-engine
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
carverauto/serviceradar#690
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Imported from GitHub.
Original GitHub issue: #2152
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/2152
Original created: 2025-12-16T05:18:46Z
Summary
BuildKeysFromRecordfails to reconstruct alias-related identity keys (from_alias_last_seen_service_id,_alias_last_seen_ip,service_alias:*, andip_alias:*metadata fields) that were originally created byBuildKeys.BuildKeysFromRecordto reconstruct the keys that should exist. SinceBuildKeysFromRecordomits alias keys, these keys are never identified as stale and are never deleted, even when they should be.3a5787ac("stop KV from growing out of control").Code with bug
In
pkg/identitymap/identitymap.go, theBuildKeysFromRecordfunction only reconstructs a subset of metadata fields:The
metaKeyslist on line 151 does not include any of the alias-related metadata fields thatBuildKeysuses (lines 97-119):_alias_last_seen_service_id_alias_last_seen_ipservice_alias:ip_alias:Evidence
Example
Consider a device update with alias metadata:
When
BuildKeys(update)is called, it creates 13 keys including:KindDeviceID: "tenant-a:host-device"(the main device ID)KindDeviceID: "serviceradar:agent:k8s-agent"(from_alias_last_seen_service_id)KindDeviceID: "serviceradar:poller:k8s"(fromservice_alias:serviceradar:poller:k8s)KindIP: "10.0.0.5"(main IP)KindIP: "10.0.0.8"(from_alias_last_seen_ip)KindIP: "10.0.0.9"(fromip_alias:10.0.0.9)KindPartitionIP: "tenant-a:10.0.0.5"KindPartitionIP: "tenant-a:10.0.0.8"KindPartitionIP: "tenant-a:10.0.0.9"These keys are stored in the KV store pointing to the canonical record.
However, when the record is later retrieved and
BuildKeysFromRecordis called (inidentity_publisher.go:521), only 3 keys are reconstructed:KindDeviceID: "tenant-a:host-device"KindIP: "10.0.0.5"KindPartitionIP: "tenant-a:10.0.0.5"The 6 alias-related keys are missing from the reconstruction.
Inconsistency within the codebase
Reference code: BuildKeys in pkg/identitymap/identitymap.go (lines 97-119)
Current code: BuildKeysFromRecord in pkg/identitymap/identitymap.go (lines 131-163)
Contradiction
BuildKeysprocesses four categories of alias metadata fields to create identity keys:_alias_last_seen_service_id→ creates aKindDeviceIDkey_alias_last_seen_ip→ createsKindIPandKindPartitionIPkeysservice_alias:*prefixed keys → createsKindDeviceIDkeysip_alias:*prefixed keys → createsKindIPandKindPartitionIPkeysHowever,
BuildKeysFromRecordonly reconstructs metadata forarmis_device_id,integration_id,integration_type, andnetbox_device_id. None of the alias-related fields are included in the reconstruction, causingBuildKeysto produce a different set of keys when called fromBuildKeysFromRecordversus when called directly with the original update.This violates the documented purpose of
BuildKeysFromRecord: to "reconstruct the identity keys for a canonical record" (line 131). The function does not actually reconstruct all the keys that were originally created.Failing test
Test script
Test output
The test clearly shows that
BuildKeysFromRecordproduces only 7 keys whileBuildKeysproduces 13 keys for the same device. The 6 missing keys are all alias-related keys.Full context
The identity map system is used to maintain a canonical mapping of device identities across the ServiceRadar platform. When a device is discovered or updated, multiple identity keys are created (by IP address, MAC address, device ID, integration IDs, etc.) that all point to the same canonical device record stored in a NATS JetStream KV store.
The
identityPublisherinpkg/registry/identity_publisher.gois responsible for publishing these mappings to the KV store. When publishing an update for a device:BuildKeys(update)to generate all identity keys for the new update (lines 216-222)existingIdentitySnapshot)existingIdentitySnapshot, it callsBuildKeysFromRecord(record)to reconstruct what keys should exist based on the stored record (line 521)This stale key deletion mechanism was explicitly added in commit
3a5787acon Oct 16, 2025 to "stop KV from growing out of control."However, on Nov 4, 2025 (commit
5223ac8c), alias support was added toBuildKeysto track device aliases (when a device is known by multiple service IDs or IP addresses). This allows the system to create additional lookup keys for aliased identities. The commit added support for:_alias_last_seen_service_id: The last service ID this device was seen as_alias_last_seen_ip: The last IP address this device was seen atservice_alias:*: Historical service IDs this device has been known asip_alias:*: Historical IP addresses this device has been associated withThe problem is that
BuildKeysFromRecordwas not updated to handle these alias fields. Additionally, thebuildIdentityAttributesfunction inpkg/registry/identity_publisher.go(lines 408-450) does not store alias metadata in the record's attributes, so even ifBuildKeysFromRecordtried to reconstruct them, the data wouldn't be available.This means:
3a5787acwas meant to preventThe device alias feature is actively used in the codebase (see
pkg/core/alias_events.go,pkg/devicealias/alias.go) to track when devices are seen under different identities, which is an important feature for device tracking in complex networks.Why has this bug gone undetected?
This bug has gone undetected for several reasons:
Silent failure: The bug doesn't cause any errors or exceptions. The identity publisher successfully creates the alias keys and successfully publishes updates. The only symptom is that stale keys are not deleted, which is not immediately observable.
Gradual accumulation: The effect of the bug is a gradual accumulation of stale keys over time. In a test environment or during initial deployment, the KV store might not grow large enough to be noticed. Only in production with sustained use would the KV store growth become apparent.
Temporal separation: The bug was introduced by the interaction of two commits 19 days apart:
BuildKeysFromRecordwas addedBuildKeysEach commit in isolation worked correctly. The bug only manifested when both features interacted.
No end-to-end test: The existing test
TestBuildKeysFromRecord(lines 91-124 ofpkg/identitymap/identitymap_test.go) only tests the basic functionality without alias metadata. It doesn't verify thatBuildKeysFromRecordproduces the same keys asBuildKeysfor devices with aliases.Different code paths: The creation of keys (via
BuildKeys) and the reconstruction of keys (viaBuildKeysFromRecord) happen in different code paths. During normal operation, keys are created successfully. The reconstruction only happens when checking for stale keys, and its failure is not visible to the application - it simply results in stale keys not being deleted.Monitoring gap: The KV store metrics likely track total entries and growth rate, but wouldn't necessarily distinguish between legitimate growth (more devices) and bug-related growth (accumulating stale alias keys).
Recommended fix
The fix requires two changes:
Store alias metadata in attributes: Update
buildIdentityAttributesinpkg/registry/identity_publisher.goto store alias-related metadata fields in the record's attributes. This ensures the data is available for reconstruction.Reconstruct alias metadata: Update
BuildKeysFromRecordinpkg/identitymap/identitymap.goto extract and reconstruct alias metadata fields from the record's attributes, similar to how it currently handlesarmis_device_id,integration_id, etc.Here's a sketch of the fix for
BuildKeysFromRecord:Note: This fix also requires updating
buildIdentityAttributesto actually store these fields, which it currently doesn't do.Imported GitHub comment.
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/2152#issuecomment-3663247422
Original created: 2025-12-17T01:49:02Z
closing as completed