fix zen rule crash #2657
Imported from GitHub pull request.
Original GitHub pull request: #2274
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/2274
Original created: 2026-01-12T19:29:39Z
Original updated: 2026-01-12T22:12:00Z
Original head: carverauto/serviceradar:bug/core-zen-rule-crash
Original base: staging
Original merged: 2026-01-12T22:11:59Z by @mfreeman451
User description
IMPORTANT: Please sign the Developer Certificate of Origin
Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include
a DCO sign-off statement indicating the DCO acceptance in one commit message. Here
is an example DCO Signed-off-by line in a commit message:
Describe your changes
Issue ticket number and link
Code checklist before requesting a review
PR Type
Bug fix
Description
Added exception handling to prevent crashes in zen rule sync
Improved error handling for JSON encoding failures
Added logging for unexpected results and exceptions
Ensured sync operations always return :ok status
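The four changes above can be pictured with a minimal sketch. It is not the code from this PR: the module name, the `encoder` argument (standing in for `Jason.encode/1`), and the `publish` function are all assumptions made so the example runs without external dependencies.

```elixir
defmodule ZenRuleSyncSketch do
  require Logger

  # Hypothetical stand-in for the sync entry point. `encoder` plays the role
  # of Jason.encode/1 and must return {:ok, json} or {:error, reason};
  # `publish` is a made-up function that delivers the encoded rule.
  def sync_rule(rule, encoder, publish) do
    try do
      with {:ok, json} <- encoder.(rule),
           :ok <- publish.(json) do
        :ok
      else
        {:error, reason} ->
          Logger.warning("zen rule sync failed: #{inspect(reason)}")

        unexpected ->
          Logger.warning("zen rule sync returned unexpected result: #{inspect(unexpected)}")
      end
    rescue
      error ->
        Logger.error("zen rule sync raised: #{Exception.message(error)}")
    end

    # Whatever happened above, the caller sees :ok and keeps running.
    :ok
  end
end
```

Calling `ZenRuleSyncSketch.sync_rule(rule, encoder, publish)` with an encoder that returns `{:error, reason}` or one that raises still yields `:ok`; the failure is visible only in the log.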
Diagram Walkthrough
File Walkthrough
zen_rule_sync.ex
Add exception handling and JSON encoding error recovery
elixir/serviceradar_core/lib/serviceradar/observability/zen_rule_sync.ex
Exception handling added in sync_rule_with_logging/2
Jason.encode! changed to Jason.encode with proper error handling in sync_rule_impl/3
Jason.EncodeError handled in the with clause
Always returns :ok to prevent process crashes
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2274#issuecomment-3740160383
Original created: 2026-01-12T19:30:10Z
PR Compliance Guide 🔍
Below is a summary of compliance checks for this PR:
Sensitive data in logs
Description: The new logging paths use inspect(unexpected) and Exception.format(..., __STACKTRACE__), which may leak sensitive data (e.g., payload contents, tokens, or internal state present in exception messages/stacktraces) into logs if failures occur during Zen rule sync.
zen_rule_sync.ex [227-244]
Referred Code
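The referred code itself is not reproduced here. As an illustration of the concern only, the sketch below shows one way to reduce what a log line can carry: report the exception type instead of its message or stacktrace, and cap how much of an unexpected term is printed. The module and function names are hypothetical. Note that `:limit` and `:printable_limit` only bound the size of the output; they do not redact secrets that happen to appear early in the term.

```elixir
defmodule SafeLogSketch do
  # Describe an exception by its type only, leaving out the message and
  # stacktrace, which may embed payload contents or tokens.
  def describe_exception(%{__exception__: true, __struct__: module}) do
    "exception=" <> inspect(module)
  end

  # Describe an unexpected term with a cap on how much of it is printed.
  def describe_term(term) do
    inspect(term, limit: 5, printable_limit: 64)
  end
end
```

A log call could then interpolate `SafeLogSketch.describe_exception(error)` where the full formatted exception would otherwise go.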
🎫 No ticket provided
Generic: Meaningful Naming and Self-Documenting Code
Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting
Status: Passed
Generic: Robust Error Handling and Edge Case Management
Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation
Status: Passed
Generic: Security-First Input Validation and Data Handling
Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities
Status: Passed
Generic: Comprehensive Audit Trails
Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.
Status:
Missing actor context: New log entries for zen rule reconciliation include tenant_id and rule_id but do not include an explicit user/actor identifier, which may be required to reconstruct who initiated the action.
Referred Code
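One possible way to carry an actor alongside tenant_id and rule_id is to pass it as Logger metadata. The sketch below is an assumption, not code from the PR: the module name, the function name, and the `:system` default for runs that no user triggered are all hypothetical.

```elixir
defmodule AuditMetaSketch do
  # Build Logger metadata for a reconciliation log entry. `actor` defaults to
  # :system (an assumed convention) for runs that no user triggered.
  def reconcile_metadata(tenant_id, rule_id, actor \\ :system) do
    [tenant_id: tenant_id, rule_id: rule_id, actor: actor]
  end
end
```

The resulting keyword list can be given as the second argument to `Logger.info/2`. Whether the `actor` key then appears in the output depends on which metadata keys the logger formatter is configured to print.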
Generic: Secure Error Handling
Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.
Status:
Potential detail leakage: The new error tuple includes Exception.message(error) for JSON encode failures, which may expose internal details to upstream callers depending on how this error is surfaced.
Referred Code
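A hedged illustration of the alternative this check points toward: return a stable, generic reason to the caller and keep any detail in a local log line. The names below (`EncodeErrorSketch`, `encode_rule/2`, the `:rule_encode_failed` atom) are hypothetical, and `encoder` again stands in for `Jason.encode/1`.

```elixir
defmodule EncodeErrorSketch do
  require Logger

  # `encoder` stands in for Jason.encode/1: {:ok, json} or {:error, reason}.
  def encode_rule(rule, encoder) do
    case encoder.(rule) do
      {:ok, json} ->
        {:ok, json}

      {:error, reason} ->
        # Only the error type goes to the local log; the caller receives a
        # stable atom that carries no exception message.
        Logger.debug("rule encode failed: #{describe(reason)}")
        {:error, :rule_encode_failed}
    end
  end

  defp describe(reason) when is_exception(reason), do: inspect(reason.__struct__)
  defp describe(_reason), do: "non-exception error"
end
```

Upstream callers can still branch on `{:error, :rule_encode_failed}` without ever seeing the text of the underlying exception.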
Generic: Secure Logging Practices
Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.
Status:
Sensitive data in logs: Logging Exception.format(..., __STACKTRACE__) and inspect(unexpected) may capture and emit sensitive data from exceptions/return values into logs depending on what data is present in the stacktrace or unexpected term.
Referred Code
Compliance status legend
🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/2274#issuecomment-3740164488
Original created: 2026-01-12T19:31:22Z
PR Code Suggestions ✨
Explore these optional code suggestions:
Error handling may hide systemic problems
The current error handling always returns :ok, which can hide systemic issues. Implement a circuit breaker to detect and react to repeated failures, preventing silent, indefinite failure loops.
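The suggestion does not specify a design, so the following is only a simplified sketch of the idea: a pure data structure that counts consecutive failures and reports itself open once a threshold is reached. The module name, the default threshold of 3, and the absence of a half-open state or timers are all assumptions; a production circuit breaker would usually need more.

```elixir
defmodule SyncBreakerSketch do
  # Tracks consecutive failures; the breaker opens once `threshold` is reached.
  defstruct failures: 0, threshold: 3, state: :closed

  def new(threshold \\ 3), do: %__MODULE__{threshold: threshold}

  # A success resets the counter and closes the breaker.
  def record(%__MODULE__{} = breaker, :ok) do
    %{breaker | failures: 0, state: :closed}
  end

  # A failure increments the counter and opens the breaker at the threshold.
  def record(%__MODULE__{} = breaker, {:error, _reason}) do
    failures = breaker.failures + 1
    state = if failures >= breaker.threshold, do: :open, else: breaker.state
    %{breaker | failures: failures, state: state}
  end

  def open?(%__MODULE__{state: state}), do: state == :open
end
```

A sync loop could feed each real outcome into `record/2` while still returning :ok to its caller, and raise an alert or pause syncing when `open?/1` becomes true.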
Examples:
elixir/serviceradar_core/lib/serviceradar/observability/zen_rule_sync.ex [213-245]
Solution Walkthrough:
Before:
After:
Suggestion importance[1-10]: 9
Why: This is a critical design-level suggestion that correctly identifies a significant risk introduced by the PR (masking systemic issues by always returning :ok) and proposes a valid architectural pattern to mitigate it.
Avoid catching exit signals in rescue
Modify the rescue block to only catch Exceptions, not all throwables. This prevents interference with OTP supervision mechanisms by not catching exit signals.
elixir/serviceradar_core/lib/serviceradar/observability/zen_rule_sync.ex [236-244]
Suggestion importance[1-10]: 7
Why: The suggestion correctly points out that a bare rescue can interfere with OTP supervision principles by catching exit signals. Specifying rescue e in Exception is a significant improvement for robustness and adherence to Elixir best practices.
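For readers weighing this suggestion, it may help to check how Elixir's try construct separates the cases. In the language as documented, rescue handles raised exceptions, while exits and throws are handled by catch. Exit signals from linked processes arrive asynchronously and are not intercepted by try at all. The sketch below uses a hypothetical module name and exists only to show that behavior.

```elixir
defmodule RescueVsCatchSketch do
  # Runs `fun` and reports how it terminated.
  def classify(fun) do
    try do
      fun.()
      :returned
    rescue
      # Raised exceptions land here.
      _error -> :rescued
    catch
      # Exits and throws are not seen by rescue; they need catch.
      :exit, _reason -> :caught_exit
      :throw, _value -> :caught_throw
    end
  end
end
```

Calling `RescueVsCatchSketch.classify(fn -> exit(:shutdown) end)` reaches the catch clause, not the rescue clause, which is the distinction relevant to the suggestion above.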