bug(agent): agent still has old poller behavior #789

Closed
opened 2026-03-28 04:28:34 +00:00 by mfreeman451 · 1 comment
Owner

Imported from GitHub.

Original GitHub issue: #2390
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/2390
Original created: 2026-01-20T02:20:09Z


Describe the bug

02:18:01.521 [info] StatusHandler received: service_type=sweep source=status service=network_sweep
02:18:01.522 [info] StatusHandler received: service_type=grpc source=status service=db-event-writer
02:18:01.522 [info] StatusHandler received: service_type=grpc source=status service=flowgger
02:18:01.522 [info] StatusHandler received: service_type=grpc source=status service=mapper
02:18:01.523 [info] StatusHandler received: service_type=grpc source=status service=rperf-checker
02:18:01.523 [info] StatusHandler received: service_type=grpc source=status service=snmp
02:18:01.524 [info] StatusHandler received: service_type=grpc source=status service=trapd
02:18:01.524 [info] StatusHandler received: service_type=grpc source=status service=zen

{"level":"info","component":"agent","service":"mapper","address":"serviceradar-mapper:50056","attempt":1,"maxRetries":3,"time":"2026-01-20T02:22:20Z","message":"Connecting to service"}
{"level":"info","component":"agent","service":"mapper","time":"2026-01-20T02:22:20Z","message":"Connected successfully"}
{"level":"error","component":"agent","error":"health check failed: health check failed: all 3 retries failed: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 10.43.119.99:50056: connect: connection refused\"","service":"mapper","healthy":false,"time":"2026-01-20T02:22:23Z","message":"Health check failed"}
{"level":"info","component":"agent","mode":"spiffe","time":"2026-01-20T02:22:26Z","message":"Creating security provider"}
{"level":"info","component":"agent","workloadSocket":"unix:/run/spire/sockets/agent.sock","time":"2026-01-20T02:22:26Z","message":"Initializing SPIFFE security provider"}
{"level":"info","component":"agent","service":"snmp","address":"serviceradar-snmp-checker:50054","attempt":1,"maxRetries":3,"time":"2026-01-20T02:22:26Z","message":"Connecting to service"}
{"level":"info","component":"agent","service":"snmp","time":"2026-01-20T02:22:26Z","message":"Connected successfully"}
{"level":"error","component":"agent","error":"health check failed: health check failed: all 3 retries failed: rpc error: code = Unavailable desc = name resolver error: produced zero addresses","service":"snmp","healthy":false,"time":"2026-01-20T02:22:29Z","message":"Health check failed"}

The agent still appears to be connecting to the other checkers and health-checking them, as if it were a poller. That is not the behavior we want now: the agent's only job is to forward data to the agent-gateway, and external checkers will eventually be updated to push status/results to the agent.
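For illustration, here is a rough Go sketch of the push-only shape described above. It is not the actual ServiceRadar code: `StatusUpdate`, `Gateway`, `Agent`, `Push`, and `recordingGateway` are all hypothetical names invented for this example, and the real agent-gateway transport is not shown. The point is only that the agent exposes an entry point checkers call, and forwards what it receives, without ever dialing a checker itself.

```go
package main

import "fmt"

// StatusUpdate is a hypothetical message a checker would push to the agent.
type StatusUpdate struct {
	Service string
	Healthy bool
	Message string
}

// Gateway is a hypothetical stand-in for the agent-gateway client.
type Gateway interface {
	Forward(u StatusUpdate) error
}

// Agent only relays what it is handed; it holds no checker addresses
// and never opens a connection to a checker.
type Agent struct {
	gw Gateway
}

// Push is the entry point a checker would call to report its own status.
func (a *Agent) Push(u StatusUpdate) error {
	if u.Service == "" {
		return fmt.Errorf("status update has no service name")
	}
	return a.gw.Forward(u)
}

// recordingGateway collects forwarded updates so the sketch runs without a network.
type recordingGateway struct {
	received []StatusUpdate
}

func (g *recordingGateway) Forward(u StatusUpdate) error {
	g.received = append(g.received, u)
	return nil
}

func main() {
	gw := &recordingGateway{}
	agent := &Agent{gw: gw}

	// A checker reports its own status; the agent just forwards it.
	if err := agent.Push(StatusUpdate{Service: "snmp", Healthy: true, Message: "ok"}); err != nil {
		fmt.Println("push failed:", err)
		return
	}
	fmt.Println("forwarded updates:", len(gw.received))
}
```

With this shape there is no retry/health-check loop in the agent at all, so errors like the "connection refused" and "produced zero addresses" entries above would have no code path to come from.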

mfreeman451 added this to the 1.1.0 milestone 2026-03-28 04:28:34 +00:00
Author
Owner

Imported GitHub comment.

Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/issues/2390#issuecomment-3771112526
Original created: 2026-01-20T05:31:49Z


Closed: removed the legacy checker registry/poller path in the agent. The agent no longer attempts gRPC checker health/status polling. Deploy the latest agent build to confirm that core logs no longer show gRPC status updates.
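One possible way to do that confirmation is to scan a captured chunk of core log for the `StatusHandler received: service_type=grpc` lines quoted in the issue. The Go sketch below assumes the log format is exactly as shown above; `grpcStatusLines` is a hypothetical helper written for this comment, not part of ServiceRadar.

```go
package main

import (
	"fmt"
	"strings"
)

// grpcStatusLines returns the lines of a core log that record a gRPC status
// update, matching the "StatusHandler received: service_type=grpc" format
// quoted in this issue. The format is an assumption taken from those logs.
func grpcStatusLines(log string) []string {
	var hits []string
	for _, line := range strings.Split(log, "\n") {
		if strings.Contains(line, "StatusHandler received:") &&
			strings.Contains(line, "service_type=grpc") {
			hits = append(hits, strings.TrimSpace(line))
		}
	}
	return hits
}

func main() {
	// Two lines copied from the issue: one sweep update, one gRPC update.
	sample := "02:18:01.521 [info] StatusHandler received: service_type=sweep source=status service=network_sweep\n" +
		"02:18:01.522 [info] StatusHandler received: service_type=grpc source=status service=mapper\n"

	hits := grpcStatusLines(sample)
	fmt.Println("grpc status updates found:", len(hits))
	for _, h := range hits {
		fmt.Println(h)
	}
}
```

After deploying the new agent build, an empty result over a fresh log window would be consistent with the fix; `service_type=sweep` lines are expected to remain.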
