Bug/srql analytics #2418
No reviewers
Labels
No labels
1week
2weeks
Failed compliance check
IP cameras
NATS
Possible security concern
Review effort 1/5
Review effort 2/5
Review effort 3/5
Review effort 4/5
Review effort 5/5
UI
aardvark
accessibility
amd64
api
arm64
auth
back-end
bgp
blog
bug
build
checkers
ci-cd
cleanup
cnpg
codex
core
dependencies
device-management
documentation
duplicate
dusk
ebpf
enhancement
eta 1d
eta 1hr
eta 3d
eta 3hr
feature
fieldsurvey
github_actions
go
good first issue
help wanted
invalid
javascript
k8s
log-collector
mapper
mtr
needs-triage
netflow
network-sweep
observability
oracle
otel
plug-in
proton
python
question
reddit
redhat
research
rperf
rperf-checker
rust
sdk
security
serviceradar-agent
serviceradar-agent-gateway
serviceradar-web
serviceradar-web-ng
siem
snmp
sysmon
topology
ubiquiti
wasm
wontfix
zen-engine
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
carverauto/serviceradar!2418
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "refs/pull/2418/head"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Imported from GitHub pull request.
Original GitHub pull request: #1950
Original author: @mfreeman451
Original URL: https://github.com/carverauto/serviceradar/pull/1950
Original created: 2025-11-18T04:47:55Z
Original updated: 2025-11-18T08:48:16Z
Original head: carverauto/serviceradar:bug/srql_analytics
Original base: main
Original merged: 2025-11-18T08:47:50Z by @mfreeman451
User description
IMPORTANT: Please sign the Developer Certificate of Origin
Thank you for your contribution to ServiceRadar. Please note, when contributing, the developer must include
a DCO sign-off statement indicating the DCO acceptance in one commit message. Here
is an example DCO Signed-off-by line in a commit message:
Describe your changes
Issue ticket number and link
Code checklist before requesting a review
PR Type
Enhancement, Bug fix
Description
Extended SRQL query engine with support for new entity types:
Interfaces,Services,OtelMetrics,TraceSummaries, andTracesAdded statistics aggregation support across logs, traces, metrics, and interfaces queries with flexible grouping and filtering
Implemented PostgreSQL TLS/SSL connection support with custom connection manager and root certificate configuration
Enhanced authorization handling in web API with cookie fallback for access tokens
Added open-ended absolute time ranges for queries with only start or end bounds
Improved shell script portability in Docker entrypoint with better string handling and SSL configuration via environment variables
Added comprehensive data models for new query entities with JSON serialization support
Extended parser to recognize new entity types and stats expressions
Updated Kubernetes configuration for OTel integration, debug logging, and ingress routing
Added SRQL service documentation covering authentication, rate limiting, and deployment checklist
Diagram Walkthrough
File Walkthrough
13 files
route.ts
Improve authorization header resolution with cookie fallbackweb/src/app/api/query/route.ts
resolveAuthorizationHeader()function to handle multipleauthorization header sources
Authorizationheader (case-insensitive) andfalls back to
accessTokencookieheader resolution
trace_summaries.rs
Add trace summaries query execution modulerust/srql/src/query/trace_summaries.rs
generation
compatibility
duration comparisons
logs.rs
Add statistics aggregation support to logs queriesrust/srql/src/query/logs.rs
build_stats_query()function foraggregations
LogsStatsExprenum supportingcount()andgroup_uniq_array()expressionsfield support
otel_metrics.rs
Add OpenTelemetry metrics query modulerust/srql/src/query/otel_metrics.rs
attributes, and performance metrics
by service name
interfaces.rs
Add discovered interfaces query modulerust/srql/src/query/interfaces.rs
address arrays
attributes
models.rs
Add data models for new query entitiesrust/srql/src/models.rs
DiscoveredInterfaceRowstruct with JSON serialization fornetwork interface data
TraceSpanRowstruct for OpenTelemetry trace span dataServiceStatusRowstruct for service availability statusOtelMetricRowstruct for OpenTelemetry metricsTraceSummaryRowstruct withQueryableByNamefor trace summaryaggregations
traces.rs
Add OpenTelemetry traces query modulerust/srql/src/query/traces.rs
codes
comparisons
services.rs
Add service status query modulerust/srql/src/query/services.rs
agent/poller IDs
time.rs
Support open-ended absolute time rangesrust/srql/src/time.rs
AbsoluteOpenEndvariant for time ranges with only start boundAbsoluteOpenStartvariant for time ranges with only end boundresolve()method to handle open-ended ranges using currenttime or MIN_UTC
db.rs
Add PostgreSQL TLS/SSL connection supportrust/srql/src/db.rs
AsyncDieselConnectionManagerwith customPgConnectionManagerfor TLS support
PgTlsenum supporting both unencrypted and Rustls-basedconnections
ManageConnectiontrait for connection pooling withvalidation
PGSSLROOTCERTenvironment variable
parser.rs
Extend parser with new entities and stats supportrust/srql/src/parser.rs
Interfaces,Services,OtelMetrics,TraceSummaries,Tracesstatsfield toQueryAstfor aggregation expressionsmod.rs
Integrate new query modules into query enginerust/srql/src/query/mod.rs
interfaces,otel_metrics,services,trace_summaries,tracesexecute_query()to route new entity types to respectivehandlers
translate_query()to generate debug SQL for new entitiesstatsfield toQueryPlanstructschema.rs
Add database schema definitions for new tablesrust/srql/src/schema.rs
service_statustablediscovered_interfacestable witharray support
otel_tracestableotel_metricstable6 files
config.rs
Add PostgreSQL SSL certificate configurationrust/srql/src/config.rs
pg_ssl_root_certoptional field toAppConfigPGSSLROOTCERTenvironment variableconfigmap.yaml
Update Kubernetes configuration for OTel and API routingk8s/demo/base/configmap.yaml
/api/queryendpointserviceradar-otel:4317kustomization.yaml
Update staging deployment image tagsk8s/demo/staging/kustomization.yaml
serviceradar-webimage tag to specific SHA commit hashserviceradar-srqlimage tag to match web service commitBUILD.bazel
Enable Procedural Macro Dependencies in Bazel Buildrust/srql/BUILD.bazel
proc_macro_depstorust_librarytarget to support proceduralmacros from dependencies
proc_macro_depstorust_binarytarget for consistent macrosupport in the binary
serviceradar-srql.yaml
Enable Debug Logging and Backtrace for SRQL Servicek8s/demo/base/serviceradar-srql.yaml
RUST_BACKTRACE=1environment variable for enhanced errordiagnostics
RUST_LOG=srql=debugenvironment variable to enable debug-levellogging for the srql module
ingress.yaml
Update Ingress Backend Service Configurationk8s/demo/staging/ingress.yaml
serviceradar-kongtoserviceradar-web8000to3000for the root pathrouting
1 files
server.rs
Add error logging to query handlerrust/srql/src/server.rs
1 files
entrypoint-srql.sh
Improve shell script portability and SSL configurationdocker/compose/entrypoint-srql.sh
urlencode()function to handle string length correctly usingwc-cprintf -vwithodfor hex encoding to improve portabilityescape_conn_value()function for libpq connection stringescaping
SSL settings
PGSSLROOTCERTandPGSSLMODE1 files
srql-service.md
SRQL Service Configuration and Security Documentationdocs/docs/srql-service.md
authentication and rate limiting
SRQL_API_KEYandSRQL_API_KEY_KV_KEYenvironment variablesfor API key authentication
SRQL_RATE_LIMIT_MAXandSRQL_RATE_LIMIT_WINDOW_SECSand verification steps
1 files
Cargo.toml
Add Database and Async Trait Dependenciesrust/srql/Cargo.toml
async-traitdependency for async trait support(
bb8,tokio-postgres,tokio-postgres-rustls)rustls,rustls-pemfile) forsecure database connections
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1950#issuecomment-3545019641
Original created: 2025-11-18T04:48:58Z
PR Compliance Guide 🔍
(Compliance updated until commit
github.com/carverauto/serviceradar@9d3aedb45f)Below is a summary of compliance checks for this PR:
Insecure auth fallback
Description: Authorization header is populated from the 'accessToken' cookie without HttpOnly/secure
validation, enabling bearer token theft via client-writable cookies or CSRF header
injection paths; ensure the cookie is HttpOnly/SameSite and only used in trusted contexts.
route.ts [86-91]
Referred Code
SQL injection risk
Description: Raw SQL is constructed via string concatenation and manual placeholder rewriting; although
values are bound, the dynamic JSON key assembly and column names for stats may permit SQL
injection if alias/field parsing is bypassed—verify strict validation and avoid
concatenating untrusted identifiers.
logs.rs [159-168]
Referred Code
SQL injection risk
Description: Stats SQL string concatenates response keys, group column names, and ORDER BY expressions;
while values are bound, concatenated identifiers increase SQL injection surface if parsing
is bypassed—confirm whitelist of group fields and aliases cannot contain unsafe
characters.
otel_metrics.rs [296-323]
Referred Code
SQL injection risk
Description: Dynamic stats JSON is built by interpolating parsed aliases into SQL and aggregations;
despite parameter binding for values, alias-to-SQL insertion can allow injection if
validation misses edge cases—ensure aliases are strictly alphanumeric/underscore and
expressions are fully validated.
trace_summaries.rs [170-185]
Referred Code
Sensitive data exposure
Description: The JSON output duplicates 'attributes' into both 'attributes' and 'raw_data' with
different sources, which can expose unfiltered trace span data unintentionally; verify
that sensitive attributes are sanitized before serialization.
models.rs [206-229]
Referred Code
🎫 No ticket provided
Codebase context is not defined
Follow the guide to enable codebase context checks.
Generic: Meaningful Naming and Self-Documenting Code
Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting
Status: Passed
Generic: Comprehensive Audit Trails
Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.
Status:
Error logging: The added error logging records failures, but it is unclear whether all critical actions
and their outcomes (e.g., auth decisions, query execution context) are consistently logged
across new handlers and stats paths.
Referred Code
Generic: Robust Error Handling and Edge Case Management
Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation
Status:
Edge cases: Custom SQL builders perform parsing and binding but some branches return generic invalid
request errors and may not validate extremely large IN lists or pagination bounds,
requiring verification of limits and edge-case handling.
Referred Code
Generic: Secure Error Handling
Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.
Status:
User error detail: Several paths return specific validation messages (e.g., unsupported fields/expressions)
which may be user-facing via the API; confirm these are mapped to safe, generic client
errors while detailed info is only logged.
Referred Code
Generic: Secure Logging Practices
Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.
Status:
Bind logging: On query failure, code logs SQL text and bind values which could expose sensitive data
from filters; verify logs redact or omit sensitive bind contents.
Referred Code
Generic: Security-First Input Validation and Data Handling
Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities
Status:
Cookie auth: The new authorization resolver falls back to a Bearer token from the accessToken cookie;
confirm cookie security attributes and server-side validation to prevent token misuse.
Referred Code
Compliance status legend
🟢 - Fully Compliant🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label
Previous compliance checks
Compliance check up to commit 5f5004f
CSRF via cookie-to-bearer
Description: The new authorization resolution falls back to the 'accessToken' cookie and blindly
converts it to a 'Bearer' header, which may enable CSRF on the backend if the SRQL
endpoint trusts the Authorization header and CORS/CSRF protections are not enforced for
this internal call.
route.ts [86-91]
Referred Code
DB resource exhaustion
Description: The dynamic SQL stats filters construct 'IN/NOT IN' lists by expanding multiple '?'
placeholders based on user-provided lists; while values are still bound safely, the
absence of a maximum list length or total placeholder cap can be abused to create very
large queries and cause database resource exhaustion (DoS).
logs.rs [441-520]
Referred Code
DB resource exhaustion
Description: The stats query builder expands user-provided list filters into many placeholders without
bounding list size, allowing excessively large 'IN/NOT IN' clauses that may degrade
database performance or cause timeouts (potential DoS).
otel_metrics.rs [412-429]
Referred Code
Unbounded input size
Description: The 'ip_addresses' array filter permits arbitrarily large lists supplied by users and uses
array containment; without limits on list size or request validation, this can lead to
heavy query execution and indexing pressure resulting in DoS.
interfaces.rs [186-209]
Referred Code
Expensive aggregation DoS
Description: Stats mode parses user-supplied expressions and builds a single JSON payload but returns
it wrapped in a one-element array; while SQL is parameterized, complex stats strings and
lack of size limits on 'stats' input or filters can be leveraged to generate expensive
aggregation queries causing database load (DoS).
trace_summaries.rs [161-183]
Referred Code
Unbounded time range
Description: Allowing open-ended absolute time ranges (start to 'now' or 'MIN_UTC' to end) without
server-side bounds can enable extremely large scans over time-series tables, risking heavy
DB load and timeouts.
time.rs [62-80]
Referred Code
🎫 No ticket provided
Codebase context is not defined
Follow the guide to enable codebase context checks.
Generic: Meaningful Naming and Self-Documenting Code
Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting
Status: Passed
Generic: Secure Error Handling
Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.
Status: Passed
Generic: Secure Logging Practices
Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.
Status: Passed
Generic: Security-First Input Validation and Data Handling
Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities
Status: Passed
Generic: Comprehensive Audit Trails
Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.
Status:
Limited auditing: New query paths (Interfaces, Services, OtelMetrics, Traces, TraceSummaries) add database
actions without explicit audit logging of who performed queries or their outcomes, which
may be acceptable if auditing exists elsewhere.
Referred Code
Generic: Robust Error Handling and Edge Case Management
Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation
Status:
Error context: Several parsing/validation errors return generic InvalidRequest messages without including
offending field/value, which may hinder debugging though may be intentional to avoid
sensitive exposure.
Referred Code
Imported GitHub PR comment.
Original author: @qodo-code-review[bot]
Original URL: https://github.com/carverauto/serviceradar/pull/1950#issuecomment-3545021973
Original created: 2025-11-18T04:50:10Z
PR Code Suggestions ✨
Latest suggestions up to
9d3aedbRedact DATABASE_URL in logs
Redact the database credentials from the
SRQL_DATABASE_URLbefore logging it toprevent leaking secrets.
docker/compose/entrypoint-srql.sh [151-152]
Suggestion importance[1-10]: 9
__
Why: This suggestion correctly identifies a critical security vulnerability where database credentials are leaked into logs, and the proposed fix effectively redacts the sensitive information.
Validate Bearer auth scheme
Validate the
Authorizationheader to ensure it uses theBearerscheme. Rejectany other format to avoid forwarding potentially invalid or malicious headers.
web/src/app/api/query/route.ts [78-92]
Suggestion importance[1-10]: 8
__
Why: This is a valuable security hardening suggestion that ensures only
Bearertokens are forwarded, preventing potential issues with malformed or unexpectedAuthorizationheader schemes in the upstream service.Correct empty IN list handling
Correct the handling of empty
INandNOT INfilter lists. An emptyINlistshould result in a
1=0(always false) condition, while an emptyNOT INlistshould result in
1=1(always true).rust/srql/src/query/otel_metrics.rs [389-429]
Suggestion importance[1-10]: 8
__
Why: The suggestion correctly identifies a logical bug where an empty
INclause resolves to1=1, effectively ignoring the filter. This could lead to unintended data exposure or incorrect query results, making it a significant correctness fix.Harden TLS root cert loading
Improve the robustness of TLS root certificate loading by validating that at
least one certificate is loaded, using a more appropriate API for parsing, and
providing clearer error messages for failures.
rust/srql/src/db.rs [93-106]
Suggestion importance[1-10]: 7
__
Why: The suggestion correctly identifies a lack of robustness in the new TLS certificate loading logic and provides a more resilient implementation with clearer error handling, which is important for production stability.
Avoid MIN_UTC fallback and clarify errors
Refine the
AbsoluteOpenStarttime range logic to provide a more specific errormessage instead of falling back to
MIN_UTC, which can cause a generic rangelength error.
rust/srql/src/time.rs [30-83]
[To ensure code accuracy, apply this suggestion manually]Suggestion importance[1-10]: 6
__
Why: The suggestion correctly identifies an edge case in the new time range parsing logic that could lead to confusing error messages and improves the code by handling it explicitly.
Fix NULL-safe NOT IN semantics
Modify the SQL generation for
NotEq,NotLike, andNotIntext filters to beNULL-safe by including an
IS NULLcheck, preventing rows withNULLvalues frombeing unintentionally filtered out.
rust/srql/src/query/trace_summaries.rs [256-303]
Suggestion importance[1-10]: 7
__
Why: The suggestion correctly identifies that
NOT INand<>operators in SQL can produce unexpected results withNULLvalues, and proposes a valid fix to make these comparisons NULL-safe, which improves query correctness.Remove conflicting duplicate field
Remove the redundant
raw_datafield from the JSON output ofTraceSpanRow::into_json. Theattributesfield already provides the same data.rust/srql/src/models.rs [205-230]
Suggestion importance[1-10]: 5
__
Why: The suggestion correctly points out that the
raw_datafield is a redundant and potentially confusing duplication of theattributesfield, improving the clarity and consistency of the JSON output.Previous suggestions
✅ Suggestions up to commit
5f5004fUnify query generation using Diesel
Standardize query generation by using the Diesel query builder for all queries,
including statistics. This will eliminate inconsistent raw SQL string
manipulation and duplicated logic across different query handlers.
Examples:
rust/srql/src/query/trace_summaries.rs [130-202]
rust/srql/src/query/logs.rs [122-166]
Solution Walkthrough:
Before:
After:
Suggestion importance[1-10]: 9
__
Why: The suggestion correctly identifies a major design flaw where multiple query handlers use inconsistent and duplicated raw SQL string building, which increases maintenance overhead and risk, making it a critical issue to address.
Fix incorrect multi-byte character encoding
Refactor the
urlencodefunction to correctly handle multi-byte characters byiterating over bytes instead of characters, which prevents incorrect encoding.
docker/compose/entrypoint-srql.sh [20-43]
Suggestion importance[1-10]: 9
__
Why: The suggestion correctly identifies a critical bug in the new
urlencodeimplementation that breaks encoding for multi-byte characters and provides a robust fix.✅
Correctly handle empty IN filterSuggestion Impact:
The commit adds clauses.push("1=0".to_string()); when the IN filter values are empty, implementing the suggested behavior.code diff:
Handle an empty
INfilter by adding an always-false condition like1=0to thequery, ensuring it correctly returns no results.
rust/srql/src/query/trace_summaries.rs [282-289]
Suggestion importance[1-10]: 7
__
Why: The suggestion correctly identifies a bug where an empty
INclause is ignored, leading to incorrect query results, and proposes a valid fix.✅
Avoid hardcoding sensitive API keysSuggestion Impact:
The commit removed the hardcoded "x-api-key" from headers and added logic in the setup script to inject the API key dynamically into logging.otel.headers from a secret-derived variable, avoiding hardcoding in the configmap.code diff:
Replace the hardcoded OpenTelemetry API key in
configmap.yamlwith a referenceto a Kubernetes secret to avoid security risks.
k8s/demo/base/configmap.yaml [241-243]
[Suggestion processed]Suggestion importance[1-10]: 8
__
Why: The suggestion addresses a security risk by advising against a hardcoded placeholder API key and correctly recommends using a Kubernetes secret instead.