feat(agent): Pulse - agent attestation #1006

Open
opened 2026-03-28 04:30:45 +00:00 by mfreeman451 · 0 comments

Imported from GitHub.

Original GitHub issue: #2787
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/2787
Original created: 2026-02-11T04:06:06Z


This PRD outlines the implementation of Pulse, a cryptographic attestation layer for the serviceradar-agent.

While the Fortress suite provides kernel-level defense, Pulse ensures that the "heartbeat" signals reaching the Gateway are authentic, haven't been tampered with, and accurately represent the security state of the agent.


PRD: ServiceRadar "Pulse" (Cryptographic Attestation & Signed Heartbeats)

Target Component: pkg/agent/pulse & serviceradar-gateway
Identity Provider: Existing mTLS Client Certificates
Security Level: Anti-Spoofing / Anti-Blinding

1. Executive Summary

Pulse is a security protocol that upgrades standard agent heartbeats to Signed Attestations. It prevents "Blinding" attacks—where an attacker kills the agent and mocks its heartbeat to hide an intrusion—and "Replay" attacks. Pulse cryptographically ties the agent's identity to its current security posture, including the integrity of its eBPF programs and binary.

2. Security Objectives

  • Non-Repudiation: Every status update and heartbeat must be signed by the agent’s unique mTLS private key.
  • Anti-Replay: Every heartbeat must carry a high-resolution timestamp and a nonce so that an intercepted heartbeat cannot be re-broadcast later.
  • State Attestation: The heartbeat must include a "Postcheck" of the agent's environment (hashes of running eBPF programs and the agent binary).
  • Blinding Detection: The Gateway must trigger an alert if a signed heartbeat is missed or if the signature validation fails, treating it as a "Device Compromised" event.

3. Architecture

3.1 The Signature Chain

Pulse leverages the agent's pre-provisioned mTLS certificate:

  1. Payload Construction: The agent gathers health metrics and eBPF state.
  2. Hashing: A SHA-256 hash is generated for the current serviceradar-agent binary and the pinned eBPF ELF objects.
  3. Signing: The agent signs the entire bundle (Status + State Hashes + Timestamp + Nonce) using its mTLS Private Key.
  4. Verification: The Gateway uses the Public Key associated with that Agent ID to verify the signature.

3.2 Attestation Data Points

The "Pulse" payload must include:

  • AgentID & Partition: Identifiers.
  • Timestamp (UTC): Prevents delay/replay.
  • Nonce: A unique 128-bit number per pulse.
  • Binary Hash: The SHA-256 hash of the running agent's binary on disk.
  • eBPF Snapshot: A map from each pinned BPF program name to a hash of that program's instructions.
  • Fortress Status: Boolean flag indicating if the BPF LSM Supervisor is active.
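
One possible shape for this payload is sketched below. The struct, its field names, and the JSON encoding are assumptions made for illustration; the PRD sends the pulse inside `proto.GatewayStatusRequest` and does not define a wire format for the signed bytes.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"time"
)

// PulsePayload mirrors the data points listed above (hypothetical field names).
type PulsePayload struct {
	AgentID        string            `json:"agent_id"`
	Partition      string            `json:"partition"`
	Timestamp      time.Time         `json:"timestamp"`
	Nonce          string            `json:"nonce"` // 128 bits, hex-encoded
	BinaryHash     string            `json:"binary_hash"`
	EBPFSnapshot   map[string]string `json:"ebpf_snapshot"` // program name -> hash
	FortressActive bool              `json:"fortress_active"`
}

// newNonce returns 16 random bytes (128 bits) as a hex string.
func newNonce() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// encode serializes the payload into the bytes that would be signed.
// encoding/json emits struct fields in declaration order and sorts map keys,
// so the same payload always yields the same bytes.
func (p PulsePayload) encode() ([]byte, error) {
	return json.Marshal(p)
}

func main() {
	nonce, err := newNonce()
	if err != nil {
		panic(err)
	}
	p := PulsePayload{
		AgentID:        "agent-1",
		Partition:      "default",
		Timestamp:      time.Now().UTC(),
		Nonce:          nonce,
		BinaryHash:     "<sha256 of agent binary>",
		EBPFSnapshot:   map[string]string{"shield": "<hash>", "sentinel": "<hash>"},
		FortressActive: true,
	}
	out, err := p.encode()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Whatever encoding is chosen, the agent and the Gateway must agree on the exact bytes that are signed, which is why a deterministic serialization matters here.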

4. Technical Requirements

4.1 Agent-Side (Go)

  • Key Access: Pulse must use the same private key used for gRPC mTLS communication. It must never expose this key or write it to logs.
  • Efficiency: Signatures (Ed25519 or RSA, depending on the certificate type) must be generated in under 1 ms to avoid blocking the PushLoop.
  • Self-Audit: The agent must read /proc/self/exe to compute its own binary hash for the pulse.

4.2 Gateway-Side (Elixir)

  • Validation Logic: The Gateway must verify the signature of every heartbeat.
  • State Tracking: Store the "Last Known Good" binary hash. If an agent's binary hash changes without a coordinated update event, trigger a Severity 1 Security Alert.
  • Drift Detection: If the timestamp in the pulse drifts more than X seconds from the Gateway time, reject the pulse.

5. Protocol Flow

  1. Agent Ticks: Every 30s (default PushInterval), the Pulse service generates the attestation payload.
  2. BPF Check: Pulse queries /sys/fs/bpf/serviceradar/ to verify Shield and Sentinel are still pinned.
  3. Sign & Send: The payload is signed and sent as part of the proto.GatewayStatusRequest.
  4. Gateway Audit:
    • Signature Valid? (Yes/No)
    • State Hash matches expected? (Yes/No)
    • LSM Supervisor Active? (Yes/No)
  5. Result: If every check is "Yes," the status is accepted. If any check is "No," the Gateway ignores the performance data and issues a Security Mitigation command to the rest of the network (e.g., "Isolate this IP via other agents' Shield firewalls").
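
The audit in steps 4 and 5 reduces to a small decision function. The sketch below is one way to express it; `AuditResult`, `Verdict`, and `decide` are hypothetical names, and the mitigation command itself is out of scope.

```go
package main

import "fmt"

// AuditResult holds the three yes/no answers from the Gateway audit.
type AuditResult struct {
	SignatureValid bool
	StateHashMatch bool
	LSMActive      bool
}

// Verdict is the outcome of the audit.
type Verdict int

const (
	Accept   Verdict = iota // status is accepted
	Mitigate                // performance data ignored, mitigation issued
)

// decide accepts only when every check passed; otherwise it returns
// Mitigate along with the names of the checks that failed.
func decide(r AuditResult) (Verdict, []string) {
	var failed []string
	if !r.SignatureValid {
		failed = append(failed, "signature")
	}
	if !r.StateHashMatch {
		failed = append(failed, "state hash")
	}
	if !r.LSMActive {
		failed = append(failed, "LSM supervisor")
	}
	if len(failed) > 0 {
		return Mitigate, failed
	}
	return Accept, nil
}

func main() {
	v, failed := decide(AuditResult{SignatureValid: true, StateHashMatch: true, LSMActive: false})
	fmt.Println("mitigate:", v == Mitigate, "failed checks:", failed)
}
```

Returning the failed check names alongside the verdict gives the alert something specific to report.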

6. Implementation Constraints

  1. No Key Hardcoding: Only use the standard filesystem paths for mTLS certs defined in ServerConfig.
  2. Graceful Key Rotation: Pulse must handle certificate renewal seamlessly without interrupting the attestation flow.
  3. Payload Size: The signed header/metadata should not add more than 1KB to the status push.

7. Success Metrics

  • Integrity: An attacker manually kills the serviceradar-agent and attempts to send a curl request to the Gateway status endpoint; the Gateway rejects it because it lacks a valid mTLS signature and the correct "State Attestation."
  • Transparency: Users can see the "Attestation Status" of every edge device in the Web-NG dashboard.
  • Zero False Positives: Clock synchronization (NTP) must be verified to ensure drift doesn't cause legitimate agents to be flagged as compromised.

8. Failure Modes

  • Clock Out-of-Sync: If an edge device's clock fails, the Gateway must provide a specific "Clock Desync" error rather than "Compromised."
  • Private Key Stolen: If the private key is stolen, the attacker can sign pulses, but the Binary Hash check will still catch them if they try to run a modified agent. This provides "Defense in Depth."