Implement zero-touch onboarding across the ServiceRadar stack #619

Closed
opened 2026-03-28 04:26:30 +00:00 by mfreeman451 · 1 comment

Imported from GitHub.

Original GitHub issue: #1891
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1891
Original created: 2025-10-25T21:32:00Z


Summary

  • Deliver zero/low-touch onboarding across the full ServiceRadar stack (Core services, NATS/DataSvc, SRQL, Proton, OTEL collector, trapd, flowgger, agents, pollers, checkers, edge NATS leaf nodes).
  • Make container/Kubernetes installs frictionless by auto-configuring SPIFFE/SPIRE and service credentials during deployment.
  • Provide automation-friendly scripts for bare-metal and edge rollouts that minimize manual configuration.

Background

  • docs/docs/agent-onboarding-plan.md (updated 2025-10-25) now outlines stack-wide requirements, highlights the absence of onboarding APIs/CLI/UX, and documents gaps in SPIFFE packaging for both core and edge services.
  • Today every service, including the demo environment, depends on pre-baked configs and static TLS material; no centralized approval or identity issuance exists.

Tasks

  • [x] #1892
  • [ ] Finalize a product brief capturing API contracts, datastore schema, audit logging, and UX states for onboarding every service role.
  • [ ] Implement Core onboarding service endpoints with persistence, token validation, SPIFFE entry creation hooks, and regression coverage.
  • [x] Extend all runtimes (core services, agents, pollers, checkers, NATS leaf nodes) plus serviceradar-cli to perform the bootstrap handshake, manage returned credentials, and rotate/revoke enrollment packs.
  • [ ] Build admin UI workflows for pending registrations, bulk approvals, template assignment, and activity history, with policy controls for auto-approval by role/site.
  • [x] Harden the security story: define bootstrap credential lifecycle, ship SPIRE server/agent packaging for container and bare-metal targets, and document fallbacks when SPIRE is unavailable.
  • [ ] Finalize reusable configuration templates and schema validation, wiring KV/Data Service interactions, NATS routing, and service-specific defaults into the approval flow.
  • [x] Update Docker/K8s manifests and Helm/Compose assets so installs auto-bootstrap via SPIFFE with zero manual secret editing.
  • [ ] Ship bare-metal/VM automation (scripts, Ansible/Terraform roles) that install the stack, register services with SPIRE, and verify successful activation end to end.
  • [ ] Add demo environment automation that exercises the full onboarding path as part of CI/release validation.

Acceptance Criteria

  • SPIFFE/SPIRE is deployed and documented in the demo namespace; core services successfully authenticate via SPIFFE as a baseline.
  • Core onboarding endpoints exist with tests, docs, and audit logs; SPIFFE entries are created automatically on approval.
  • Every runtime can start with only an endpoint and a bootstrap token, obtain SPIFFE or mTLS credentials, and reconnect without manual file edits.
  • serviceradar-cli supports enrollment pack create/rotate/revoke and stack bootstrap commands with documentation and examples.
  • Web UI exposes pending registrations, approval history, bulk actions, and policy-based auto-approval.
  • Container/K8s deployments come up fully authenticated without manual secret injection; bare-metal scripts configure SPIRE and services with minimal input.
  • Demo namespace automation validates the zero-touch workflow and is integrated into the release checklist.
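
The "policy-based auto-approval" criterion could be approached along the lines of the following Go sketch. It assumes a first-match rule list keyed on role and site with `*` as a wildcard, falling back to manual review. The `Rule`, `Registration`, and `Evaluate` names are hypothetical, and the issue does not specify how policies are actually modeled.

```go
package main

import "fmt"

// Decision is the outcome of evaluating a registration against policy.
type Decision string

const (
	Approve Decision = "approve"
	Pending Decision = "pending" // left for an admin to review
)

// Rule is a hypothetical auto-approval rule; "*" matches any value.
type Rule struct {
	Role     string
	Site     string
	Decision Decision
}

// Registration is a hypothetical pending enrollment request.
type Registration struct {
	Role string
	Site string
}

func matches(pattern, value string) bool {
	return pattern == "*" || pattern == value
}

// Evaluate returns the decision of the first matching rule,
// or Pending when no rule matches.
func Evaluate(rules []Rule, r Registration) Decision {
	for _, rule := range rules {
		if matches(rule.Role, r.Role) && matches(rule.Site, r.Site) {
			return rule.Decision
		}
	}
	return Pending
}

func main() {
	rules := []Rule{
		{Role: "checker", Site: "*", Decision: Approve},
		{Role: "poller", Site: "lab", Decision: Approve},
	}
	fmt.Println(Evaluate(rules, Registration{Role: "checker", Site: "edge-7"})) // approve
	fmt.Println(Evaluate(rules, Registration{Role: "poller", Site: "lab"}))     // approve
	fmt.Println(Evaluate(rules, Registration{Role: "poller", Site: "prod"}))    // pending
}
```

Defaulting to `Pending` means an unmatched registration is never approved silently, which fits the requirement that approvals be visible in the UI and the audit log.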

Imported GitHub comment.

Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1891#issuecomment-3813918899
Original created: 2026-01-28T21:02:49Z


closing, stale
