S4E On-Prem provides comprehensive observability through metrics, logs, and health endpoints. This guide covers setting up monitoring infrastructure and configuring log collection for your deployment.
Observability Stack
S4E On-Prem is designed to integrate with standard Kubernetes observability tools:
| Component | Purpose | Recommended Tool |
|---|---|---|
| Metrics | Performance monitoring, resource usage, scan throughput | Prometheus + Grafana |
| Logging | Centralized log aggregation and search | ELK stack (Elasticsearch, Logstash, Kibana) or Loki |
| Alerting | Proactive notifications for anomalies and failures | Grafana Alerting or Alertmanager |
| Tracing | Distributed request tracing across services | Jaeger (optional) |
Prometheus Metrics
Service Metrics
All S4E services expose Prometheus-compatible metrics at the /metrics endpoint. Key metrics include:
s4e-core
| Metric | Type | Description |
|---|---|---|
| s4e_http_requests_total | Counter | Total HTTP requests by method, path, and status code |
| s4e_http_request_duration_seconds | Histogram | Request latency distribution |
| s4e_active_sessions | Gauge | Number of active user sessions |
| s4e_api_errors_total | Counter | API errors by type and endpoint |
Workers (scan, crawler, dispatcher)
| Metric | Type | Description |
|---|---|---|
| s4e_scan_jobs_total | Counter | Total scan jobs processed |
| s4e_scan_jobs_active | Gauge | Currently running scan jobs |
| s4e_scan_duration_seconds | Histogram | Scan execution time by scan type |
| s4e_crawler_urls_discovered | Counter | Total URLs discovered by the crawler |
| s4e_queue_messages_consumed | Counter | Messages consumed from RabbitMQ |
Infrastructure
| Metric | Type | Description |
|---|---|---|
| pg_stat_activity_count | Gauge | Active PostgreSQL connections |
| rabbitmq_queue_messages | Gauge | Messages in each RabbitMQ queue |
| redis_connected_clients | Gauge | Connected Redis clients |
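The counters and histograms listed above are usually consumed as rates and percentiles rather than raw values. The following is a minimal sketch of a Prometheus rule file that derives them, not an artifact shipped with S4E. The group name and the recorded series names are invented for illustration, and the latency rule assumes the histogram exposes the conventional _bucket series with an le label.

```yaml
# Sketch only: group and record names are illustrative, not shipped with S4E.
groups:
  - name: s4e-derived-metrics
    rules:
      # HTTP requests per second over the last 5 minutes, all series combined.
      - record: s4e:http_requests:rate5m
        expr: sum(rate(s4e_http_requests_total[5m]))
      # API errors per second over the last 5 minutes.
      - record: s4e:api_errors:rate5m
        expr: sum(rate(s4e_api_errors_total[5m]))
      # 95th percentile request latency, computed from the histogram buckets.
      - record: s4e:http_request_duration_seconds:p95
        expr: histogram_quantile(0.95, sum by (le) (rate(s4e_http_request_duration_seconds_bucket[5m])))
```

Recording these once keeps dashboard queries cheap and gives alert rules a stable series to reference.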
Prometheus Configuration
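Add a ServiceMonitor resource for each S4E service. ServiceMonitor is a custom resource provided by the Prometheus Operator, so the operator must be installed in the cluster: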
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: s4e-core-monitor
  namespace: s4e
spec:
  selector:
    matchLabels:
      app: s4e-core
  endpoints:
    - port: http
      path: /metrics
      interval: 30s
Helm integration
S4E Helm charts include optional ServiceMonitor templates. Enable them by setting metrics.serviceMonitor.enabled: true in your values file.
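Written out in a values file, the dotted key above nests as shown in this short sketch. Only the enabled flag is documented in this guide; consult the chart's default values for any further ServiceMonitor options before adding them.

```yaml
# values.yaml (excerpt)
metrics:
  serviceMonitor:
    enabled: true   # renders the chart's optional ServiceMonitor templates
```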
Grafana Dashboards
S4E provides pre-built Grafana dashboard JSON files covering:
- Platform Overview -- service health, request rates, error rates.
- Scan Activity -- scan throughput, queue depths, worker utilization.
- Infrastructure -- database connections, Redis memory, RabbitMQ message rates.
- Resource Usage -- CPU, memory, and storage consumption per service.
Import the dashboard JSON files from the S4E release artifacts or configure them through ArgoCD.
Logging
Log Format
All S4E services emit structured JSON logs:
{
  "timestamp": "2025-01-15T10:23:45.123Z",
  "level": "INFO",
  "service": "s4e-core",
  "module": "auth",
  "message": "User login successful",
  "user_id": 42,
  "ip": "10.0.1.15",
  "request_id": "abc-123-def"
}
Log Levels
| Level | Usage |
|---|---|
| DEBUG | Detailed diagnostic information (disabled in production by default) |
| INFO | Normal operational events (startup, request processing, scan completion) |
| WARNING | Unexpected but recoverable situations |
| ERROR | Failures requiring attention (connection errors, scan failures) |
| CRITICAL | System-level failures requiring immediate action |
Configure the log level per service via the LOG_LEVEL environment variable.
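As an illustration, the variable can be set in the container spec of a service's Deployment. This is a sketch under assumptions: the container name is hypothetical, and if you deploy through the Helm charts, the chart may expose its own value for this setting, in which case manual edits to the manifest would be overwritten on the next upgrade.

```yaml
# Container spec excerpt for a Deployment; the container name is illustrative.
containers:
  - name: s4e-core
    env:
      - name: LOG_LEVEL
        value: "WARNING"   # one of DEBUG, INFO, WARNING, ERROR, CRITICAL
```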
ELK Stack Integration
Filebeat DaemonSet
Deploy Filebeat as a DaemonSet to collect container logs from all Kubernetes nodes:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: logging
spec:
  selector:
    matchLabels:
      app: filebeat
  template:
    metadata:
      labels:
        app: filebeat  # must match spec.selector.matchLabels
    spec:
      containers:
        - name: filebeat
          image: docker.elastic.co/beats/filebeat:8.12.0
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: containers
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: containers
          hostPath:
            path: /var/lib/docker/containers
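The DaemonSet above does not include a Filebeat configuration. The following is a minimal sketch of a filebeat.yml, typically mounted from a ConfigMap, that reads container logs and expands each S4E JSON log line into top-level fields. The log path and the Elasticsearch address (borrowed from the Fluent Bit example later in this guide) are assumptions; adjust them for your cluster.

```yaml
# filebeat.yml (sketch): paths and output host are assumptions for illustration.
filebeat.inputs:
  - type: container
    paths:
      - /var/log/containers/*.log

processors:
  # Each S4E log line is a JSON object; expand it into top-level fields
  # (level, service, request_id, ...) so they are searchable in Kibana.
  - decode_json_fields:
      fields: ["message"]
      target: ""
      overwrite_keys: true

output.elasticsearch:
  hosts: ["elasticsearch.logging.svc:9200"]
```

This sketch leaves index naming at Filebeat's defaults, so writing to indices that match the s4e-* patterns described in the next section requires additional index settings.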
Kibana Index Patterns
Create index patterns in Kibana for S4E logs:
- s4e-core-* -- API and authentication logs
- s4e-scan-* -- Scan execution logs
- s4e-crawler-* -- Crawl pipeline logs
- s4e-* -- All S4E service logs combined
Fluentd / Fluent Bit Alternative
If you use Fluent Bit instead of Filebeat to ship logs to Elasticsearch, configure a parser filter to decode S4E JSON logs:
[FILTER]
    Name     parser
    Match    kube.s4e-*
    Key_Name log
    Parser   json

[OUTPUT]
    Name  es
    Match kube.s4e-*
    Host  elasticsearch.logging.svc
    Port  9200
    Index s4e-logs
Health Checks
Liveness and Readiness Probes
All S4E services expose health endpoints used by Kubernetes probes:
| Endpoint | Purpose |
|---|---|
| /health/live | Liveness check -- is the process running? |
| /health/ready | Readiness check -- can the service handle requests? |
The readiness probe verifies connectivity to required dependencies (database, Redis, RabbitMQ) before marking the pod as ready.
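For reference, probes against these endpoints might be declared as in the sketch below. The port name and all timing values are assumptions chosen for illustration; the manifests or Helm charts shipped with S4E may already define probes with different settings.

```yaml
# Container spec excerpt; port name and timing values are assumptions.
livenessProbe:
  httpGet:
    path: /health/live
    port: http
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health/ready
    port: http
  periodSeconds: 10
  failureThreshold: 3
```

Because the readiness endpoint depends on the database, Redis, and RabbitMQ, a failing readiness probe removes the pod from Service endpoints without restarting it, whereas a failing liveness probe causes the container to be restarted.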
Monitoring Health
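Create alerts for service unavailability. The rule below fires when Prometheus has been unable to scrape a target in the s4e namespace for five minutes: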
groups:
  - name: s4e-health
    rules:
      - alert: S4EServiceDown
        expr: up{namespace="s4e"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "S4E service {{ $labels.job }} is down"
Alerting thresholds
Tune alert thresholds based on your deployment's normal behavior. Start with conservative thresholds and adjust as you establish baselines.
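As one example of a threshold-based rule, the sketch below alerts on the ratio of API errors to HTTP requests. The 5% threshold, the 10-minute window, and the group and alert names are placeholders rather than recommended values, and the expression assumes that every API error corresponds to a request counted in s4e_http_requests_total.

```yaml
# Sketch only: threshold, duration, and names are placeholders to tune.
groups:
  - name: s4e-thresholds
    rules:
      - alert: S4EHighApiErrorRatio
        # Fires when more than 5% of requests fail, sustained for 10 minutes.
        expr: |
          sum(rate(s4e_api_errors_total[5m]))
            / sum(rate(s4e_http_requests_total[5m])) > 0.05
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "S4E API error ratio above 5%"
```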
Best Practices
- Retain logs for compliance -- configure log retention policies that meet your regulatory requirements (typically 90-365 days).
- Use request IDs -- the request_id field enables end-to-end request tracing across services.
- Monitor queue depths -- rising RabbitMQ queue depths indicate worker capacity issues.
- Set up PagerDuty or OpsGenie -- route critical alerts to your on-call rotation.
- Dashboard rotation -- display the Platform Overview dashboard on a wall monitor in your operations center.
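The queue depth recommendation above can be expressed as an alert rule. This is a sketch under assumptions: the threshold of 1000 messages and the 15-minute window are placeholders to adjust against your own baseline, and the queue label used in the summary depends on how your RabbitMQ exporter labels its series.

```yaml
# Sketch only: threshold, duration, and the queue label are assumptions.
groups:
  - name: s4e-queues
    rules:
      - alert: S4EQueueBacklog
        # Fires when a queue holds more than 1000 messages for 15 minutes.
        expr: rabbitmq_queue_messages > 1000
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "RabbitMQ queue {{ $labels.queue }} has a backlog above 1000 messages"
```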