bug(sync): failed to write config #604

Closed
opened 2026-03-28 04:26:18 +00:00 by mfreeman451 · 1 comment
Owner

Imported from GitHub.

Original GitHub issue: #1864
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1864
Original created: 2025-10-23T05:49:23Z


Describe the bug

error: failed to write sweep config chunk 11 to agents/k8s-agent/checkers/sweep/sweep_chunk_11.json: all 3 retries failed: rpc error: code = Internal desc = failed to put many: failed to put key agents/k8s-agent/checkers/sweep/sweep_chunk_11.json: nats: nats: API error: code=503 err_code=10077 description=message block data missing
network_count: 0
device_target_count: 50000
config_key: agents/k8s-agent/checkers/sweep/sweep.json
[1] 2025/10/23 05:45:00.213090 [WRN] Filestore [KV_serviceradar-kv] loadBlock error: message block data missing
[1] 2025/10/23 05:45:00.213107 [ERR] JetStream failed to store a msg on stream 'SERVICERADAR > KV_serviceradar-kv': message block data missing
[1] 2025/10/23 05:45:00.523550 [DBG] 10.42.111.127:53538 - cid:97 - "v1.32.0:go:NATS CLI Version 0.1.3" - "SERVICERADAR/user:CN=serviceradar-debug-client,OU=Kubernetes,O=ServiceRadar,L=San Francisco,ST=CA,C=US" - Client Ping Timer
[1] 2025/10/23 05:45:00.523563 [DBG] 10.42.111.127:53538 - cid:97 - "v1.32.0:go:NATS CLI Version 0.1.3" - "SERVICERADAR/user:CN=serviceradar-debug-client,OU=Kubernetes,O=ServiceRadar,L=San Francisco,ST=CA,C=US" - Delaying PING due to remote client data or ping 2s ago
[1] 2025/10/23 05:45:01.215103 [WRN] Filestore [KV_serviceradar-kv] loadBlock error: message block data missing
[1] 2025/10/23 05:45:01.215119 [ERR] JetStream failed to store a msg on stream 'SERVICERADAR > KV_serviceradar-kv': message block data missing
[1] 2025/10/23 05:45:03.217768 [WRN] Filestore [KV_serviceradar-kv] loadBlock error: message block data missing
[1] 2025/10/23 05:45:03.217786 [ERR] JetStream failed to store a msg on stream 'SERVICERADAR > KV_serviceradar-kv': message block data missing
Log Details
Trace ID: -
Span ID: -
Service Version: 1.0.0
Service Instance: -
Scope: sync
Severity Number: 13
Attributes
error: failed to write sweep config chunk 11 to agents/k8s-agent/checkers/sweep/sweep_chunk_11.json: all 3 retries failed: rpc error: code = Internal desc = failed to put many: failed to put key agents/k8s-agent/checkers/sweep/sweep_chunk_11.json: nats: nats: API error: code=503 err_code=10077 description=message block data missing


Author
Owner

Imported GitHub comment.

Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/1864#issuecomment-3435336301
Original created: 2025-10-23T06:14:25Z


Root cause: Armis sweep chunk writes were still emitting ~2MB payloads, which JetStream rejected as message block data missing. Added a size-aware chunker in pkg/sync/integrations/armis/config.go that caps each KV entry at ~512KiB (with adaptive halving when we encounter oversized device metadata) and a regression test (TestDefaultKVWriter_WriteSweepConfigChunks) that fails if we regress.

Validation: ran go test ./pkg/sync/..., then pushed ghcr.io/carverauto/serviceradar-sync:sha-5036ca2f580004986152a1bdbff04416692cde22 and rolled the demo deployment (kubectl rollout status deployment/serviceradar-sync -n demo).

Next step: keep an eye on sync logs for any residual JetStream 503s now that the new chunking logic is live.
