Update/troubleshooting sysmonvm #2318
Imported from GitHub pull request.
Original GitHub pull request: #1758
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/1758
Original created: 2025-10-14T03:37:27Z
Original updated: 2025-10-14T06:28:44Z
Original head: carverauto/serviceradar:update/troubleshooting_sysmonvm
Original base: main
Original merged: 2025-10-14T06:28:41Z by @mfreeman451
PR Type
Enhancement
Description
Enhanced CPU metrics with cluster-level aggregation and core labeling
Added caching mechanism for hostfreq data collection with 2-second TTL
Improved data validation and error handling for frequency measurements
Fixed indentation and formatting issues in service initialization
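As a rough illustration of the cluster-level aggregation and core labeling described above, the sketch below groups per-core frequencies by a cluster name derived from each core's label. The label format ("ECPU0", "PCPU1"), the field names Hz and MeanHz, the digit-stripping rule, and the aggregateClusters helper are all assumptions made for this example; the PR's actual clusterFromLabel and struct definitions may differ.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// CoreFrequency is a simplified stand-in for the struct named in the PR.
// Only Label and Cluster come from the PR text; the other fields are assumed.
type CoreFrequency struct {
	CoreID  int
	Label   string
	Cluster string
	Hz      float64
}

// ClusterFrequency holds one aggregated value per cluster (assumed shape).
type ClusterFrequency struct {
	Name   string
	MeanHz float64
}

// clusterFromLabel is a hypothetical implementation: it strips the trailing
// core index from a label such as "PCPU3" and returns "PCPU".
func clusterFromLabel(label string) string {
	return strings.TrimRight(strings.TrimSpace(label), "0123456789")
}

// aggregateClusters averages core frequencies per cluster and skips
// readings that are not positive, as a simple form of validation.
func aggregateClusters(cores []CoreFrequency) []ClusterFrequency {
	sums := map[string]float64{}
	counts := map[string]int{}
	for _, c := range cores {
		name := c.Cluster
		if name == "" {
			name = clusterFromLabel(c.Label)
		}
		if name == "" || c.Hz <= 0 {
			continue
		}
		sums[name] += c.Hz
		counts[name]++
	}
	out := make([]ClusterFrequency, 0, len(sums))
	for name, sum := range sums {
		out = append(out, ClusterFrequency{Name: name, MeanHz: sum / float64(counts[name])})
	}
	// Sort by name so the result is deterministic despite map iteration order.
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	cores := []CoreFrequency{
		{CoreID: 0, Label: "ECPU0", Hz: 1.0e9},
		{CoreID: 1, Label: "ECPU1", Hz: 2.0e9},
		{CoreID: 2, Label: "PCPU0", Hz: 3.0e9},
	}
	for _, c := range aggregateClusters(cores) {
		fmt.Printf("%s mean=%.0f Hz\n", c.Name, c.MeanHz)
	}
}
```

Averaging is only one possible aggregation; a maximum or a per-cluster residency-weighted value would fit the same structure.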
Diagram Walkthrough
File Walkthrough
service.go
Enhanced CPU metrics with cluster support
pkg/checker/sysmonvm/service.go
Label and Cluster fields to CPU metrics for core identification
Clusters field to status payload structure
freqCollector initialization
collector.go
Extended frequency data structures with cluster support
pkg/cpufreq/collector.go
Label and Cluster fields to CoreFrequency struct
ClusterFrequency struct for cluster-level metrics
Snapshot to include both cores and clusters
hostfreq as new data source option
hostfreq_darwin.go
Added caching and cluster frequency collection for Darwin
pkg/cpufreq/hostfreq_darwin.go
enforcement
clusterFromLabel function to extract cluster names from core labels
metrics.go
Extended metrics models with cluster-level CPU data
pkg/models/metrics.go
Label and Cluster fields to CPUMetric struct
CPUClusterMetric struct with comprehensive documentation
Clusters field to SysmonMetrics with omitempty tag
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1758#issuecomment-3399988343
Original created: 2025-10-14T03:38:02Z
PR Compliance Guide 🔍
Below is a summary of compliance checks for this PR:
Cross-tenant data leak
Description: The in-memory cache hostfreqCache stores the latest frequency snapshot globally without size/expiry enforcement beyond TTL and without context scoping, which could expose cross-tenant data if used in multi-tenant agents or long-lived processes sharing the same address space.
hostfreq_darwin.go [226-246]
Referred Code
🎫 No ticket provided
Codebase context is not defined
Follow the guide to enable codebase context checks.
No custom compliance provided
Follow the guide to enable custom compliance check.
Compliance status legend
🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1758#issuecomment-3399989774
Original created: 2025-10-14T03:39:03Z
PR Code Suggestions ✨
Explore these optional code suggestions:
Simplify data models to reduce redundancy
Remove redundant metadata fields like Timestamp, HostID, and AgentID from the CPUMetric and CPUClusterMetric models. This data should be placed in a higher-level response structure to reduce payload size and improve data normalization.
Examples:
pkg/models/metrics.go [73-92]
pkg/models/metrics.go [94-109]
Solution Walkthrough:
Before:
After:
Suggestion importance[1-10]: 8
Why: The suggestion correctly identifies the introduction of redundant fields in core data models, which is a significant design flaw that could lead to data bloat and inconsistencies, even if not currently populated.
✅ Remove cloning on write to cache
Suggestion Impact: The commit removed the hostfreq cache entirely (including both read- and write-side cloning), replacing it with a buffered sampler. This makes the specific write-side clone removal moot by eliminating the caching mechanism, but it addresses the underlying inefficiency more broadly.
code diff:
Modify hostfreqCacheStore to store the snapshot pointer directly instead of a clone, as cloning is already handled on read, thus removing the redundant write-side clone.
pkg/cpufreq/hostfreq_darwin.go [240-246]
[Suggestion processed]
Suggestion importance[1-10]: 7
Why: The suggestion correctly identifies a redundant clone on write to the cache and proposes removing it, which is a standard and more efficient caching pattern.
✅ Avoid redundant snapshot cloning for efficiency
Suggestion Impact: The commit refactored caching to a sampler and, after creating a snapshot, records it and returns snapshotClone(snapshot) directly, avoiding a redundant cache read/clone. This aligns with the suggestion’s intent.
code diff:
In collectViaHostfreq, avoid a redundant snapshot clone by returning a clone of the local snapshot variable directly, instead of reading it back from the cache.
pkg/cpufreq/hostfreq_darwin.go [139-146]
[Suggestion processed]
Suggestion importance[1-10]: 6
Why: The suggestion correctly identifies a redundant cloning operation and proposes a valid fix that improves performance by avoiding an unnecessary cache read and lock cycle.