Architecture Refactor: Move device state to registry, treat Proton as OLAP warehouse #639

Closed
opened 2026-03-28 04:26:53 +00:00 by mfreeman451 · 11 comments
Owner

Imported from GitHub.

Original GitHub issue: #1924
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1924
Original created: 2025-11-05T03:32:25Z


Problem Statement

ServiceRadar currently treats Proton (a stream processing database) as the primary source of truth for device state, an approach that causes performance problems and doesn't scale beyond tens of thousands of devices. While the tactical CTE query fix (#1921) reduced Proton CPU from 3986m to ~1000m, we're still fundamentally doing the wrong thing: hitting Proton for every device lookup, stats query, and inventory search.

Current issues:

  • Proton CPU baseline: ~1000m (1 core) just for normal operations
  • Device lookups read 640k+ rows to find latest state
  • Dashboard stats issue live count() queries on 50k devices
  • Inventory search does full table scans with metadata map extraction
  • Collector capability derived by scraping metadata keys
  • No audit trail for "when did device X last have ICMP capability?"

Vision

Establish a proper layered data architecture:

  • Hot tier: In-memory device registry for current state (μs latency)
  • Warm tier: Search index for inventory queries (ms latency)
  • Cold tier: Proton for time-series analytics and audit logs (s latency acceptable)

Proton should only answer questions like:

  • "Show me ICMP RTT for device X over last 7 days"
  • "What devices were discovered in the last hour?"
  • "Run this exploratory SRQL query across historical data"

Proton should not answer questions like:

  • "Does device X exist?" → Use registry
  • "What's the hostname of device X?" → Use registry
  • "How many devices have ICMP collectors?" → Use stats cache
  • "Search for devices matching 'foo'" → Use search index

Detailed Plan

See `newarch_plan.md` for comprehensive implementation details including:

  • Retrospective (how we got here, why tactical fix wasn't enough)
  • 6-phase refactor with code examples
  • Sprint breakdown (10 weeks)
  • Success metrics and rollback plan

Implementation Phases

Phase 1: Device Registry Service (Week 1-2)

Goal: Canonical in-memory device graph

  • [x] Define `DeviceRecord` schema in `pkg/registry/device.go`
  • [x] Implement `DeviceRegistry` with in-memory map + RWMutex
  • [x] Hydrate from Proton on startup (`HydrateRegistryFromProton`)
  • [x] Update `DeviceManager.UpsertDevice()` to write to both Proton + Registry
  • [x] Unit tests for registry operations

Success: Registry hydrates from Proton, stays in sync with new updates
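
The map-plus-RWMutex shape could look something like this minimal sketch (field names and methods here are illustrative, not the actual `pkg/registry` API):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// DeviceRecord is the canonical in-memory view of one device.
// Fields here are illustrative, not the real schema.
type DeviceRecord struct {
	DeviceID string
	Hostname string
	IP       string
	LastSeen time.Time
}

// DeviceRegistry is the hot tier: a plain map guarded by an RWMutex,
// so concurrent reads (the common case) never block each other.
type DeviceRegistry struct {
	mu      sync.RWMutex
	devices map[string]*DeviceRecord
}

func NewDeviceRegistry() *DeviceRegistry {
	return &DeviceRegistry{devices: make(map[string]*DeviceRecord)}
}

// Upsert records the latest state; the caller also writes to Proton
// so the audit trail is preserved (the Phase 1 dual-write).
func (r *DeviceRegistry) Upsert(rec *DeviceRecord) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.devices[rec.DeviceID] = rec
}

// Get answers "does device X exist / what is its state?" without Proton.
func (r *DeviceRegistry) Get(deviceID string) (*DeviceRecord, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	rec, ok := r.devices[deviceID]
	return rec, ok
}

func main() {
	r := NewDeviceRegistry()
	r.Upsert(&DeviceRecord{DeviceID: "dev-1", Hostname: "edge-01", IP: "10.0.0.5", LastSeen: time.Now()})
	if rec, ok := r.Get("dev-1"); ok {
		fmt.Println(rec.Hostname) // edge-01
	}
}
```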

Phase 2: First-Class Collector Capabilities (Week 3-4)

Goal: Stop deriving capability from metadata

  • [x] Define `CollectorCapability` schema
  • [x] Implement `CapabilityIndex` in `pkg/registry/capabilities.go`
  • [x] Update agent/poller registration to emit capabilities
  • [x] Update API `/devices/{id}/collectors` to use registry
  • [x] Remove all metadata scraping (`_alias_last_seen_service_id`, etc.)
  • [x] Update UI to use new capabilities API

Success: Collector status from explicit records, not metadata inference

Phase 3: Stats Aggregator (Week 5)

Goal: Pre-aggregate dashboard metrics

  • [ ] Implement `StatsAggregator` that runs every 10 seconds
  • [ ] Add `StatsSnapshot` cache to registry
  • [ ] Create `/api/stats` endpoint
  • [ ] Update dashboard tiles to call `/api/stats`
  • [ ] Remove SRQL stat card queries from UI

Success: Dashboard loads in <10ms, no Proton queries for stats
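
One possible shape for the aggregator, assuming a compute callback that walks the registry rather than Proton (all names and fields are illustrative):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// StatsSnapshot is the pre-aggregated payload a /api/stats handler
// would serve. Fields are illustrative.
type StatsSnapshot struct {
	TotalDevices  int
	OnlineDevices int
	GeneratedAt   time.Time
}

// StatsAggregator recomputes the snapshot on a ticker (every 10s in the
// plan) so dashboard reads are a cheap lock-protected copy, never a query.
type StatsAggregator struct {
	mu       sync.RWMutex
	snapshot StatsSnapshot
	compute  func() StatsSnapshot // walks the in-memory registry
}

func NewStatsAggregator(compute func() StatsSnapshot) *StatsAggregator {
	return &StatsAggregator{compute: compute}
}

// Run refreshes on the given interval until stop is closed.
func (a *StatsAggregator) Run(interval time.Duration, stop <-chan struct{}) {
	a.Refresh()
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			a.Refresh()
		case <-stop:
			return
		}
	}
}

// Refresh recomputes and swaps in a fresh snapshot.
func (a *StatsAggregator) Refresh() {
	snap := a.compute()
	snap.GeneratedAt = time.Now()
	a.mu.Lock()
	a.snapshot = snap
	a.mu.Unlock()
}

// Snapshot is what the stats endpoint returns: a copy, no Proton query.
func (a *StatsAggregator) Snapshot() StatsSnapshot {
	a.mu.RLock()
	defer a.mu.RUnlock()
	return a.snapshot
}

func main() {
	agg := NewStatsAggregator(func() StatsSnapshot {
		return StatsSnapshot{TotalDevices: 50000, OnlineDevices: 48211}
	})
	agg.Refresh()
	fmt.Println(agg.Snapshot().TotalDevices) // 50000
}
```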

Phase 4: Search Index (Week 6-7)

Goal: Fast inventory search without table scans

  • [ ] Implement in-memory trigram index in `pkg/search/trigram.go`
  • [ ] Integrate with `DeviceRegistry.Upsert()` to update index
  • [ ] Add `/api/devices/search?q=...` endpoint
  • [ ] Update inventory UI to use search API (OPEN – next engineer)
  • [ ] Remove `SELECT ... LIKE ...` queries from codebase

Success: Inventory search returns in <50ms for any query
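
A toy version of trigram indexing to illustrate the idea (the real `pkg/search/trigram.go` will differ; ranking here is plain trigram-overlap count):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// trigrams returns the distinct lowercase 3-grams of s.
func trigrams(s string) []string {
	s = strings.ToLower(s)
	seen := make(map[string]bool)
	var out []string
	for i := 0; i+3 <= len(s); i++ {
		g := s[i : i+3]
		if !seen[g] {
			seen[g] = true
			out = append(out, g)
		}
	}
	return out
}

// TrigramIndex maps each trigram to the device IDs whose searchable
// text contains it; no table scan is ever needed at query time.
type TrigramIndex struct {
	postings map[string]map[string]bool
}

func NewTrigramIndex() *TrigramIndex {
	return &TrigramIndex{postings: make(map[string]map[string]bool)}
}

// Add indexes a device's searchable text (hostname, IP, etc.).
func (ix *TrigramIndex) Add(deviceID, text string) {
	for _, g := range trigrams(text) {
		if ix.postings[g] == nil {
			ix.postings[g] = make(map[string]bool)
		}
		ix.postings[g][deviceID] = true
	}
}

// Search returns device IDs ordered by descending trigram overlap with q.
func (ix *TrigramIndex) Search(q string) []string {
	scores := make(map[string]int)
	for _, g := range trigrams(q) {
		for id := range ix.postings[g] {
			scores[id]++
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool {
		if scores[ids[i]] != scores[ids[j]] {
			return scores[ids[i]] > scores[ids[j]]
		}
		return ids[i] < ids[j] // stable tie-break
	})
	return ids
}

func main() {
	ix := NewTrigramIndex()
	ix.Add("dev-1", "edge-router-01 10.0.0.5")
	ix.Add("dev-2", "core-switch-02 10.0.0.6")
	fmt.Println(ix.Search("router")) // [dev-1]
}
```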

Phase 5: Capability Matrix (Week 8-9)

Goal: Model Device ⇄ Service ⇄ Capability explicitly

  • [ ] Define `device_capabilities` stream in Proton for audit trail
  • [ ] Implement `CapabilityMatrix` in `pkg/registry/matrix.go`
  • [ ] Update agent heartbeats to report capability checks
  • [ ] Create capability monitoring/alerting
  • [ ] Dashboard shows capability status + last-seen

Success: Can answer "when did device X last have successful ICMP?" without manual queries

Phase 6: Proton Boundary Enforcement (Week 10)

Goal: Ensure all state queries hit registry, not Proton

  • [ ] Audit all `db.*` calls in `pkg/core/api`
  • [ ] Replace device state queries with registry lookups
  • [ ] Add linter rule / middleware to prevent non-analytics Proton queries
  • [ ] Document "when to use Proton vs registry" guidelines
  • [ ] Final performance validation

Success: Proton CPU <200m under normal load
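
One hypothetical shape for the enforcement layer: a query guard that a DB wrapper or linter could apply. The stream names and rule below are invented for illustration, not the actual policy:

```go
package main

import (
	"fmt"
	"strings"
)

// stateStreams are streams that only analytics code may scan directly;
// everything else must use the registry. Names are illustrative.
var stateStreams = []string{"unified_devices", "device_updates"}

// guardQuery rejects device-state scans issued outside an explicit
// analytics context, pointing callers at the registry instead.
func guardQuery(sql string, analyticsCtx bool) error {
	if analyticsCtx {
		return nil
	}
	lower := strings.ToLower(sql)
	for _, s := range stateStreams {
		if strings.Contains(lower, s) {
			return fmt.Errorf("query touches state stream %q: use the registry instead", s)
		}
	}
	return nil
}

func main() {
	err := guardQuery("SELECT * FROM table(unified_devices)", false)
	fmt.Println(err != nil) // true: blocked outside analytics context
}
```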

Success Metrics

Performance Targets

  • Registry lookups: <1ms (currently 500ms+ from Proton)
  • Dashboard stats: <10ms (currently 500ms+ from live count())
  • Inventory search: <50ms (currently 1-5s from table scan)
  • Proton CPU baseline: <200m (currently ~1000m)

Data Quality

  • Collector capability accuracy: 100% (explicit records vs inferred)
  • Audit trail: All capability changes logged to Proton
  • No stale data: Registry TTL/refresh keeps cache current

Developer Experience

  • Query clarity: `registry.Get(deviceID)` not SQL
  • Testability: Registry is mockable
  • Debuggability: Capability matrix shows exact state + history

Rollback Plan

Each phase is independently deployable with feature flags:

```go
const (
	UseRegistry        = true // Phase 1
	UseCapabilityIndex = true // Phase 2
	UseStatsCache      = true // Phase 3
	UseSearchIndex     = true // Phase 4
)
```

If any phase has issues, disable the flag and fall back to Proton queries (slower but functional).

Related Issues

  • #1921 - Original Proton performance crisis (tactical CTE fix applied)

Open Questions

  1. Registry persistence: Should we persist registry snapshots to disk for faster restarts?
  2. Registry size: At 1M devices, in-memory registry ≈ 1-2GB. Acceptable?
  3. Search sophistication: Do we need Elastic's query DSL, or is trigram enough?
  4. Capability staleness: How long before we mark a collector capability as "stale"?
  5. Multi-region: How does registry sync across clusters?

References

  • `newarch_plan.md` - Full implementation details with code examples
  • `debug.md` - Performance investigation notes
  • Commit `85733a09` - Tactical CTE query fix
  • Commit `65e5d947` - Architecture plan documentation

Imported GitHub comment.

Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1924#issuecomment-3489467643
Original created: 2025-11-05T05:41:42Z


Phase 1 progress recap

  • Added DeviceRecord and in-memory store (pkg/registry/device.go, device_store.go) with ID/IP/MAC indexes and cache snapshot helpers.
  • Boot hydration now bulk-loads Proton state and rebuilds indexes/search on startup (pkg/registry/hydrate.go, pkg/core/server.go).
  • Registry ProcessBatchDeviceUpdates keeps the hot cache in sync on every update/tombstone and exposes cache-backed getters used by API/device manager code paths.
  • Device detail endpoint now favors the registry snapshot, with Proton as a fallback for cache misses (pkg/core/api/server.go).
  • Introduced a trigram-based search index, wired it into the registry, and ranked results with relevance + recency before handing unified views back to the API (pkg/registry/trigram_index.go, pkg/registry/registry.go).

What’s left (next engineer hand-off)

  1. Web UI integration
    • web/src/components/Devices/DeviceList (and any inventory/search routes) should call the new registry search endpoint instead of fan-out SRQL queries. Surfacing metrics_summary, alias_history, and collector capability blobs that the API now attaches will require mapping the new fields in the React data loader and updating DeviceRow renderers.
    • Highlight matches using the ranked order: carry the trigram score from the API (extend the response to include score), display an inline badge for exact hostname/IP hits, and preserve the existing status filters.
    • Audit client-side filtering to ensure it doesn’t reintroduce Proton calls—update web/src/lib/api.ts to use the registry-backed /api/devices list/search endpoints.
  2. Search telemetry & UX polish
    • Emit a lightweight metric (Prometheus counter + histogram) from the API search path (pkg/core/api/server.go) capturing query length, match count, and latency so we can validate the <50 ms target under load.
    • Add a UI-level empty state when no results are returned, and surface the total count (API already has everything needed once we extend the response).
  3. Follow-on cache consumers
    • Update any remaining backend paths that still hit db.GetUnifiedDevices... (e.g., identity lookup, mapper publisher) to rely on DeviceRegistry.SearchDevices/GetDeviceRecord to avoid Proton reads.
    • Gate search and cache features behind a feature flag so we can roll out gradually; add config plumbing in pkg/core/server.go + pkg/core/api/server.go and note it in docs/docs/agents.md.

Once those are in place, we can iterate on Phase 2 (capability index) with a warmed-up UI and telemetry to prove the search latency/success metrics.


Imported GitHub comment.

Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1924#issuecomment-3489640042
Original created: 2025-11-05T06:52:40Z


Update after Phase 1/2 rollout:

  • Proton returned to ~4 cores within 10 minutes of redeploy (pod metrics: serviceradar-proton-654fbcbcbf-bqdxs at 3993m CPU / 3018Mi memory).
  • Query profiling shows the dominant query, issued once per minute by the SRQL/Proton OCaml client, scans ~15.6M rows / 5.9GB per run, so the Observability dashboard still hammers Proton.
  • The CTE-based device lookups introduced in Phase 1 are still invoked hundreds of times per half hour (totalling ~3.7e8 rows read). They're better than the old pattern but remain an expensive fallback because SRQL routes keep hitting Proton instead of the registry cache.
  • There are still Code 210 exceptions for giant clauses generated from SRQL filters (e.g. 100+ IPs or Armis IDs), which cause retries and more table scans.

To address the remaining load we extended the plan with Phase 3b (Critical Log Rollups) so the web dashboards consume a dedicated log digest instead of the raw scan, and we tightened Sprint 6 tasks to force SRQL/device lookups through the registry and search index. That should eliminate the hot queries once Phases 3-6 are complete.


Imported GitHub comment.

Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1924#issuecomment-3489724548
Original created: 2025-11-05T07:26:04Z


Status update from the architecture refactor work:

  • Log digest cache landed: added pkg/core/log_digest.go with a capped ring buffer + 1h/24h counters, hydrated every 30s from Proton via the new DBLogDigestSource helper. Core start-up now wires the aggregator and keeps it refreshed until shutdown.
  • New critical log APIs: exposed /api/logs/critical and /api/logs/critical/counters (protected routes); the handlers serve the in-memory digest so fatal/error widgets no longer hit SRQL.
  • Frontend wired to cache: web/src/services/dataService.ts fetches the new endpoints and supplies CriticalLogsWidget with typed data + counters; accompanying unit coverage mocks the API responses.
  • Plan/doc cleanup: Phase 3b items are checked off in newarch_plan.md to reflect the cache + API + UI work.
  • Validation: go test ./pkg/core/... and npm run lint are green.

Remaining for Phase 3b: stream-driven hydration (instead of snapshots) and feature-flag plumbing once we’re ready to roll this out broadly.


Imported GitHub comment.

Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1924#issuecomment-3492120109
Original created: 2025-11-05T16:17:11Z


Phase 3b status update:

  • Landed the log digest aggregator, tailer, and persistence plumbing; feature flag (features.use_log_digest) is now available in config.
  • Built/published ghcr.io/carverauto/serviceradar-core@sha256:4124f3f298f13c1d2425725bbca80c8bc2e902a93074e2e3849a24103b6e1be9 and rolled the demo deployment to that image.
  • During the rollout, enabling UseLogDigest in the demo cluster prevented the HTTP listener from ever becoming ready (readiness probe stayed red). For now the flag is set to false in the runtime config so the new build can serve traffic.
  • Follow-up: debug why enabling the log digest stream blocks readiness before we flip the flag back on.


Imported GitHub comment.

Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1924#issuecomment-3492120711
Original created: 2025-11-05T16:17:20Z


Phase 3b status update:

  • Landed the log digest aggregator, tailer, and persistence plumbing; feature flag (features.use_log_digest) is now exposed in config.
  • Built/published ghcr.io/carverauto/serviceradar-core@sha256:4124f3f298f13c1d2425725bbca80c8bc2e902a93074e2e3849a24103b6e1be9 and rolled the demo deployment to that image.
  • During the rollout, enabling UseLogDigest in the demo cluster prevented the HTTP listener from ever becoming ready (readiness probe stayed red). For now the flag is set to false in the runtime config so the new build can serve traffic.
  • Follow-up: debug why enabling the log digest stream blocks readiness before we flip the flag back on.

Imported GitHub comment.

Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1924#issuecomment-3492222622
Original created: 2025-11-05T16:36:04Z


Re-enabled the log digest path in demo and rolled the cluster:

  • Updated serviceradar-config to set features.use_log_digest=true, then rebuilt/pushed ghcr.io/carverauto/serviceradar-core (sha256:ab992d84af2ad9500ce0c4d37c2f7b3231eb76a145c267acdb0a205388c0bb9b, tag sha-057b69fdcc8cb45a3d1e46ffb395d910474d897a).
  • Applied the refreshed ConfigMap and set the deployment image to the new tag; rollout completed and the pod is healthy (serviceradar-core-c8cf58f59-dcgvb reached 1/1 ready in ~70s).
  • Startup logs confirm the async bootstrap now times out without blocking readiness, so the HTTP listener comes up cleanly.

Follow-up: Proton is rejecting the streaming tail with code: 62 ... Syntax error ... EMIT CHANGES; the aggregator is retrying with exponential backoff. We’ll need to adjust the tail query so the digest stays up to date once the flag stays on.


Imported GitHub comment.

Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1924#issuecomment-3492446609
Original created: 2025-11-05T17:20:38Z


Validated the streaming log tailer end-to-end:

  • Rebuilt/pushed ghcr.io/carverauto/serviceradar-core@sha256:c587c6cadf6b1e26182ae93641c42d75d236e93a3c0d76b41267140cee379355 and rolled the demo core deployment.
  • Injected a synthetic fatal log row via Proton (Phase3b log-digest test) to exercise the digest path.
  • Queried /api/logs/critical and /api/logs/critical/counters with an admin JWT; the API served the new entry directly from the in-memory digest, confirming the stream keeps up without relapsing to Proton.
  • Noted the one-time bootstrap timeout (expected with the async snapshot), but the streaming consumer now stays connected with no further EMIT CHANGES syntax errors.

Follow-up: none for Phase 3b tailer; next we can look at trimming that bootstrap timeout if it shows up in SLOs.


Imported GitHub comment.

Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1924#issuecomment-3492559591
Original created: 2025-11-05T17:46:51Z


Follow-up cleanup from the streaming rollout:

  • Fixed the COUNT(*) scan type in the service registry (uint64 instead of int), rebuilt/pushed ghcr.io/carverauto/serviceradar-core@sha256:8170567691819242005bddd711f6c7635ed49b2f02ce66704ead70b8d210f278, and rolled the demo core deployment.
  • The poller heartbeat warnings (converting UInt64 to *int is unsupported) are gone; /api/logs/critical still returns the latest fatal log from the digest stream.

With the log digest tailer feeding cleanly and the poller cache check fixed, Phase 3b is fully green. Next up is only ongoing monitoring.


Imported GitHub comment.

Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1924#issuecomment-3492579691
Original created: 2025-11-05T17:51:45Z


Proton connection pressure is cleared up:

  • Raised the Proton client pool ceilings (max open connections to 60, max idle to 30, streaming helper bumped to 10) via core image sha256:85a9f7f4860f99b1ce0bd182a44880af4505c712f28a63e8c89eb1a60363c78a and rolled serviceradar-core in demo.
  • The prior proton: acquire conn timeout errors during edge onboarding / poller cache refresh are no longer appearing after the redeploy; log tailer and registry operations now run without starving the pool.

Remaining noisy log is the legacy poller DELETE syntax (tracked separately). Otherwise the new connection ceiling keeps the registry + onboarding flows happy.


Imported GitHub comment.

Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1924#issuecomment-3493548590
Original created: 2025-11-05T21:24:17Z


Updates from today:

  • Added registry/Proton cross-checks on hydration and on every stats refresh. If the in-memory registry ever diverges from table(unified_devices) we now log both counts plus a sample of missing device_ids (pkg/registry/hydrate.go, pkg/registry/diagnostics.go, pkg/core/stats_aggregator.go). No mismatches yet; hydration is reporting 50,007 devices while Proton currently reports 50,009.
  • Fixed poller delete syntax (ALTER STREAM instead of ALTER TABLE) so the new diagnostics would not spam with Proton errors.
  • The analytics UI now defaults back to /api/stats for its top-line device counts and only falls back to SRQL if the cache is empty. The tile is still bouncing between ~49.5k and 50k because Kong is rejecting internal SRQL calls with 401 (“Unauthorized”), so the fallback path only succeeds intermittently. That explains the eventual consistency we were seeing earlier.
  • Confirmed SRQL queries succeed when run directly through the Proton SQL endpoint, so the outstanding issue is between serviceradar-web → serviceradar-kong rather than the registry cache itself.

Next steps:

  1. Debug why /api/query via serviceradar-kong:8000 is unauthorised and either fix the auth headers or point the internal client straight at the OCaml SRQL service.
  2. Once SRQL is reliable again, remove the temporary fallback and rely solely on the cached stats (while keeping the new diagnostics in place to catch regressions).

Imported GitHub comment.

Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1924#issuecomment-3493900823
Original created: 2025-11-05T22:39:57Z


Observed another skew in the analytics "Total Devices" tile after today's core deploy. The value climbed to ~72k even though Proton and the registry both still report ~50k devices.

What we have already done:

  • Filtered out ServiceRadar component IDs (poller/agent/checker) inside pkg/core/stats_aggregator.go and rolled the new core image across the demo namespace.
  • Confirmed /api/stats is live and the analytics dashboard queries it first, only falling back to SRQL when the cache turns up empty or zero.

Current working theories:

  1. The frontend is still hitting the SRQL fallback path because the cached snapshot occasionally comes back as 0, and the SRQL query (in:devices time:last_7d stats:"count() as total") over-counts versioned rows.
  2. The registry snapshot may still contain duplicate aliases we are missing, so the aggregator is counting more than the canonical Proton total (need to compare registry.SnapshotRecords() length vs. Proton again).
  3. The stats cache might be racing with hydration—during startup it can return the zero-value snapshot, forcing the fallback path and sticking the inflated SRQL number in the React state.

Next actions before another roll-out:

  • Capture /api/stats responses alongside the fallback SRQL payload when the UI shows the inflated number (e.g. log both in the browser console or add telemetry in dataService.fetchAllAnalyticsData).
  • Instrument the stats aggregator to log the registry snapshot length every refresh and surface whether the cache returned zero (so we know if hypothesis #3 is real).
  • If the fallback is the culprit, either hard-disable it now that /api/stats is GA, or change the SRQL to respect _merged_into / _deleted so the count matches Proton.

I updated newarch_plan.md to capture these investigations so we do not repeat the same fixes.
