feat: basic health-checker #218
Imported from GitHub.
Original GitHub issue: #605
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/605
Original created: 2025-04-14T13:43:26Z
Monitors HTTP or gRPC health endpoints (e.g., /health, /livez, /readyz) on services, checking availability and parsing the JSON status (e.g., {"status": "healthy"}). Reports health (true/false) and key details (e.g., component statuses) to the agent.
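A minimal sketch of the HTTP half of that flow, to make the proposal concrete. Everything named here is an assumption for illustration, not existing ServiceRadar code: the `Result` type, the `parseHealth` and `checkHTTP` helpers, the `status`/`healthy` field and value, and the `localhost:8080` URL. The gRPC path and reporting to the agent are left out.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// Result is a hypothetical shape for what the checker would report to the agent.
type Result struct {
	Healthy bool
	Details map[string]any
}

// parseHealth decodes a JSON health body and compares one top-level
// string field against the expected value.
func parseHealth(body []byte, field, want string) (Result, error) {
	var doc map[string]any
	if err := json.Unmarshal(body, &doc); err != nil {
		return Result{}, fmt.Errorf("decode health response: %w", err)
	}
	got, _ := doc[field].(string) // missing or non-string field yields ""
	return Result{Healthy: got == want, Details: doc}, nil
}

// checkHTTP performs one GET against url and parses the response body.
func checkHTTP(ctx context.Context, c *http.Client, url, field, want string) (Result, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return Result{}, err
	}
	resp, err := c.Do(req)
	if err != nil {
		return Result{}, err // endpoint unreachable or timed out
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return Result{Healthy: false}, nil
	}
	// Cap the read so a misbehaving endpoint cannot exhaust memory.
	body, err := io.ReadAll(io.LimitReader(resp.Body, 64<<10))
	if err != nil {
		return Result{}, err
	}
	return parseHealth(body, field, want)
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	client := &http.Client{Timeout: 5 * time.Second}
	res, err := checkHTTP(ctx, client, "http://localhost:8080/health", "status", "healthy")
	if err != nil {
		fmt.Println("unhealthy:", err)
		return
	}
	fmt.Printf("healthy=%v details=%v\n", res.Healthy, res.Details)
}
```

Keeping the parsing separate from the transport means the same `parseHealth` step could be reused for any endpoint path, with the field name and expected value coming from configuration.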
Value Proposition:
Unique Niche: Targets modern microservices (e.g., Kubernetes apps, APIs) with health endpoints, which sysmon (system-level) and rperf (network) don’t cover. Unlike snmp (devices) or dusk (blockchain), it’s application-focused.
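Lightweight: Fetches a small JSON response (~100 bytes) every 30s and stores minimal data (e.g., healthy: true, details: {"db": "ok"}). At 2,880 polls/day that is about 0.3 MB/day/host, or roughly 29 GB/day of raw responses for 100,000 hosts, so fitting SQLite’s ~24 GB/day figure assumes the stored record is smaller than the raw response (about 83 bytes per poll).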
Proxmox Fit: Monitors containerized apps or APIs on Proxmox (e.g., LXC containers), complementing sysmon’s ZFS/CPU metrics.
Security: Supports mTLS for gRPC to the agent and HTTP basic auth/TLS for endpoints, aligning with tls-security.md.
Ease of Use: Simple HTTP GET or gRPC call, no complex parsing (unlike Prometheus metrics). Configurable endpoints and expected status fields.
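The storage estimate in the Lightweight point can be checked with a few lines of arithmetic. This sketch assumes the issue's own inputs (100-byte responses, a 30s interval, 100,000 hosts) and counts raw response bytes only; the `bytesPerHostPerDay` helper is hypothetical, and SQLite row and index overhead are not modeled.

```go
package main

import "fmt"

// bytesPerHostPerDay returns the raw response volume one host produces per day
// for a given response size and polling interval.
func bytesPerHostPerDay(responseBytes, intervalSeconds int) int {
	pollsPerDay := 86400 / intervalSeconds
	return responseBytes * pollsPerDay
}

func main() {
	perHost := bytesPerHostPerDay(100, 30) // 2,880 polls/day
	fleet := perHost * 100000              // 100,000 hosts
	fmt.Printf("per host: %.2f MB/day\n", float64(perHost)/1e6)
	fmt.Printf("fleet: %.1f GB/day\n", float64(fleet)/1e9)
}
```

Under these assumptions the per-host figure rounds to the quoted 0.3 MB/day, while the fleet total comes out somewhat above ~24 GB/day, so the stored record size and retention policy would decide whether the budget holds.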