Alerts
Build alerts from readiness first, then add cluster, capacity, and service-plane error signals.
Neuwerk exposes alerting signals, but it does not ship a bundled alert rule pack. Build alerts from the runtime surfaces that already exist.
First Alert To Create
If you only create one urgent alert per node, start with:
/ready returning 503
Readiness is the best first-line signal because it already combines several conditions that matter to operators.
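As a minimal sketch of that first alert, the Python below probes /ready and flags a node only after several consecutive 503 responses. The function names, the base URL argument, the three-probe threshold, and the choice to report an unreachable node as 0 are all assumptions made for illustration; they are not part of Neuwerk.

```python
import urllib.error
import urllib.request


def ready_status(base_url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status of GET {base_url}/ready, or 0 if unreachable."""
    url = base_url.rstrip("/") + "/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still the signal.
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout. Reported as 0 here
        # (an assumption of this sketch) so it can be alerted on separately.
        return 0


def should_alert(recent_statuses: list[int], threshold: int = 3) -> bool:
    """True when the last `threshold` probes all returned 503."""
    if len(recent_statuses) < threshold:
        return False
    return all(code == 503 for code in recent_statuses[-threshold:])
```

A scheduler would append each ready_status result to a per-node history and page when should_alert returns True. Requiring consecutive failures is one way to keep a single slow probe from paging anyone.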
Alert Classes
The most useful alert groups are:
- node availability: /health and /ready
- cluster health: cluster, policy_replication, and Raft-related metrics
- dataplane capacity: active flows and NAT port utilization
- service-plane failures: TLS interception and fail-closed counters
- integration failures: integration error counters when cloud or other integrations are active
Better For Alerts
High-signal alert inputs include:
- readiness failures
- sustained cluster peer errors
- sustained NAT port saturation
- sustained svc_fail_closed_total
- integration error counters
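"Sustained" in the list above can be made concrete in more than one way. The sketch below shows one hedged interpretation in Python: a counter such as svc_fail_closed_total counts as sustained when it rises in every recent scrape interval, and a utilization gauge counts as sustained when it stays at or above a threshold for several samples. The function names, window sizes, and the 0.9 threshold in the usage note are illustrative assumptions, not values from Neuwerk.

```python
def counter_sustained_increase(samples: list[float], min_intervals: int = 3) -> bool:
    """True when a counter rose in each of the last `min_intervals` intervals.

    `samples` are successive readings of a monotonically increasing counter,
    oldest first. A counter reset (value drops) breaks the streak.
    """
    if len(samples) < min_intervals + 1:
        return False
    window = samples[-(min_intervals + 1):]
    return all(later > earlier for earlier, later in zip(window, window[1:]))


def gauge_sustained_above(
    samples: list[float], threshold: float, min_samples: int = 3
) -> bool:
    """True when the last `min_samples` gauge readings are all >= threshold."""
    if len(samples) < min_samples:
        return False
    return all(value >= threshold for value in samples[-min_samples:])
```

For example, gauge_sustained_above(readings, 0.9) would flag NAT port utilization that has held at 90% or more for three samples in a row, while a single spike would not.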
Better For Dashboards Or Investigation
These are usually better as dashboards or trend alerts than immediate pages:
- total DNS query volume
- total deny volume
- general HTTP request volume
- other workload-shaped counters
A spike in deny counters may mean a problem, or it may mean policy is working exactly as designed.
Use Stats And Audit After The Alert Fires
After an alert fires, move to:
- GET /api/v1/stats
- GET /api/v1/audit/findings
Those two surfaces help distinguish a component failure from a deliberate policy outcome.
If audit is unavailable, verify performance mode before assuming the audit subsystem is broken.
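A small helper can collect both surfaces right after an alert fires. The Python below is a sketch under stated assumptions: the two paths come from this page, but the bearer-token header, the JSON response body, and the helper names are hypothetical and should be checked against your deployment.

```python
import json
import urllib.request
from typing import Any, Optional

# Paths taken from this page; everything else in this sketch is an assumption.
TRIAGE_PATHS = ("/api/v1/stats", "/api/v1/audit/findings")


def triage_urls(base_url: str) -> list[str]:
    """Build the two follow-up URLs for a node's API base URL."""
    return [base_url.rstrip("/") + path for path in TRIAGE_PATHS]


def fetch_json(url: str, token: Optional[str] = None, timeout: float = 5.0) -> Any:
    """GET a URL and decode the body as JSON.

    The Authorization header format is an assumption; adjust it to whatever
    authentication your API actually requires.
    """
    request = urllib.request.Request(url, method="GET")
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling fetch_json on each entry of triage_urls(base_url) gives a responder both views side by side. An error on the audit URL alone is a cue to check performance mode, as noted above.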