RabbitMQ
S4E On-Prem uses RabbitMQ as the primary message broker for asynchronous inter-service communication. All scan jobs, crawl pipeline stages, and event notifications flow through RabbitMQ queues.
Overview
RabbitMQ serves as the backbone for S4E's event-driven architecture:
- Scan dispatch -- scan requests are published to queues and consumed by worker services.
- Crawl pipeline -- multi-stage crawl operations pass through a chain of queues.
- Event notifications -- triggers and actions are coordinated through message passing.
- Work distribution -- messages are distributed across multiple worker replicas for parallel processing.
Deployment Options
Option 1: In-Cluster RabbitMQ (Helm Subchart)
The S4E Helm chart includes a RabbitMQ subchart:
# s4e-values.yaml
rabbitmq:
  enabled: true
  auth:
    username: s4e_mq
    password: "<strong-password>"
  persistence:
    enabled: true
    size: 20Gi
    storageClass: ssd
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 2000m
      memory: 4Gi
Option 2: External RabbitMQ
Connect to an existing RabbitMQ cluster:
rabbitmq:
  enabled: false
core:
  env:
    RABBITMQ_HOST: "rabbitmq-cluster.messaging.internal"
    RABBITMQ_PORT: "5672"
    RABBITMQ_USER: "s4e_mq"
    RABBITMQ_VHOST: "s4e"
  secrets:
    RABBITMQ_PASS: "<rabbitmq-password>"
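To illustrate how a client might turn these variables into a connection string, here is a minimal sketch that builds an AMQP URI. The function name `amqp_url_from_env` and the fallback values (`localhost`, `guest`, `/`) are assumptions made for the example, not S4E's actual code. The point it shows is that the user, password, and vhost need percent-encoding, since a password containing `@` or `/` would otherwise break the URI.

```python
import os
from urllib.parse import quote


def amqp_url_from_env(env=None):
    """Build an AMQP URI from the RABBITMQ_* variables shown above.

    Fallback values are illustrative assumptions, not S4E defaults.
    """
    env = os.environ if env is None else env
    host = env.get("RABBITMQ_HOST", "localhost")
    port = int(env.get("RABBITMQ_PORT", "5672"))
    # safe="" forces reserved characters such as "@" and "/" to be escaped.
    user = quote(env.get("RABBITMQ_USER", "guest"), safe="")
    password = quote(env.get("RABBITMQ_PASS", "guest"), safe="")
    # The vhost is a single path segment, so the default vhost "/" becomes "%2F".
    vhost = quote(env.get("RABBITMQ_VHOST", "/"), safe="")
    return f"amqp://{user}:{password}@{host}:{port}/{vhost}"
```

Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to check in isolation.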
Option 3: Clustered RabbitMQ
For high availability, deploy a three-node RabbitMQ cluster so that queues can be replicated across nodes:
rabbitmq:
  enabled: true
  replicaCount: 3
  clustering:
    enabled: true
  auth:
    username: s4e_mq
    password: "<strong-password>"
Quorum queues
RabbitMQ 3.8+ supports quorum queues, which provide better data safety than classic mirrored queues. S4E automatically uses quorum queues when available.
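As a rough illustration of what "using quorum queues" means at the protocol level, the queue type is selected through the `x-queue-type` argument when a queue is declared. The sketch below only builds that argument dictionary; the helper name `queue_arguments` is hypothetical, and pairing the queue with the `s4e.dead_letter` exchange is an assumption about how S4E wires its queues, not something this page confirms.

```python
def queue_arguments(use_quorum: bool, dead_letter_exchange: str = "s4e.dead_letter") -> dict:
    """Build the optional arguments passed when a queue is declared.

    Hypothetical helper: the real S4E declaration logic is not shown in this page.
    """
    # Rejected or expired messages are republished to this exchange.
    args = {"x-dead-letter-exchange": dead_letter_exchange}
    # The queue type is fixed at declaration time and cannot be changed later.
    args["x-queue-type"] = "quorum" if use_quorum else "classic"
    return args
```

Because the type cannot be changed after declaration, moving an existing classic queue to a quorum queue generally means declaring a new queue and migrating consumers to it.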
Queue Architecture
Exchange Topology
S4E uses a topic exchange model:
| Exchange | Type | Purpose |
|---|---|---|
| `s4e.scan` | Topic | Scan job routing |
| `s4e.crawl` | Topic | Crawler pipeline stages |
| `s4e.events` | Topic | System events and notifications |
| `s4e.actions` | Topic | Action and playbook execution |
| `s4e.dead_letter` | Fanout | Failed message collection |
Queue Definitions
Scan Queues
| Queue | Routing Key | Consumer |
|---|---|---|
| `scan.dispatch` | `scan.dispatch.*` | s4e-dispatcher |
| `scan.execute` | `scan.execute.*` | s4e-scan |
| `scan.results` | `scan.results.*` | s4e-core |
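The routing keys above rely on topic-exchange wildcards: `*` matches exactly one dot-separated word and `#` matches zero or more words. The following sketch reimplements that matching rule in plain Python to show which keys a binding such as `scan.dispatch.*` would accept. The concrete routing keys in the usage example (`scan.dispatch.web` and so on) are hypothetical, and the broker's own matcher is what applies in practice.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if routing_key matches an AMQP topic binding pattern."""
    p = pattern.split(".")
    k = routing_key.split(".") if routing_key else []

    def match(i: int, j: int) -> bool:
        if i == len(p):
            # Pattern exhausted: it matches only if the key is exhausted too.
            return j == len(k)
        if p[i] == "#":
            # "#" may absorb zero or more words.
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j == len(k):
            return False
        # "*" stands for exactly one word; anything else must match literally.
        return (p[i] == "*" or p[i] == k[j]) and match(i + 1, j + 1)

    return match(0, 0)
```

For example, `topic_matches("scan.dispatch.*", "scan.dispatch.web")` is true, while a key with an extra segment such as `scan.dispatch.web.deep` does not match, because `*` never spans more than one word.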
Crawler Pipeline Queues
| Queue | Routing Key | Consumer | Stage |
|---|---|---|---|
| `crawl.ffuf` | `crawl.ffuf` | s4e-crawler | Directory fuzzing |
| `crawl.katana` | `crawl.katana` | s4e-crawler | Deep crawling |
| `crawl.api_doc` | `crawl.api_doc` | s4e-crawler | API doc parsing |
| `crawl.url_unifier` | `crawl.url_unifier` | s4e-crawler | URL dedup |
| `crawl.pii` | `crawl.pii` | s4e-crawler | PII detection |
| `crawl.enrichment` | `crawl.enrichment` | s4e-crawler | Result enrichment |
| `crawl.finisher` | `crawl.finisher` | s4e-crawler | Pipeline finalization |
Virtual Hosts
For multi-tenant or environment isolation, configure separate vhosts:
rabbitmqctl add_vhost s4e_production
rabbitmqctl add_vhost s4e_staging
rabbitmqctl set_permissions -p s4e_production s4e_mq ".*" ".*" ".*"
Configuration
Consumer Settings
| Variable | Description | Default |
|---|---|---|
| `RABBITMQ_PREFETCH_COUNT` | Messages fetched per consumer before acknowledgment | 10 |
| `RABBITMQ_HEARTBEAT` | Heartbeat interval (seconds) | 60 |
| `RABBITMQ_CONNECTION_TIMEOUT` | Connection timeout (seconds) | 30 |
| `RABBITMQ_RETRY_DELAY` | Delay between connection retry attempts (seconds) | 5 |
| `RABBITMQ_MAX_RETRIES` | Maximum connection retry attempts | 10 |
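The settings above could be read and applied along the following lines. This is a sketch under stated assumptions: `ConsumerSettings` and `connect_with_retry` are hypothetical names, the `connect` callable stands in for whatever client library opens the connection, and `RABBITMQ_MAX_RETRIES` is interpreted as the number of retries after the first attempt, which the table does not specify.

```python
import os
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsumerSettings:
    # Defaults mirror the table above.
    prefetch_count: int = 10
    heartbeat: int = 60
    connection_timeout: int = 30
    retry_delay: int = 5
    max_retries: int = 10

    @classmethod
    def from_env(cls, env=None):
        env = os.environ if env is None else env
        return cls(
            prefetch_count=int(env.get("RABBITMQ_PREFETCH_COUNT", cls.prefetch_count)),
            heartbeat=int(env.get("RABBITMQ_HEARTBEAT", cls.heartbeat)),
            connection_timeout=int(env.get("RABBITMQ_CONNECTION_TIMEOUT", cls.connection_timeout)),
            retry_delay=int(env.get("RABBITMQ_RETRY_DELAY", cls.retry_delay)),
            max_retries=int(env.get("RABBITMQ_MAX_RETRIES", cls.max_retries)),
        )


def connect_with_retry(connect, settings, sleep=time.sleep):
    """Call connect() until it succeeds or the retry budget is spent."""
    attempts = settings.max_retries + 1  # assumption: first try plus the retries
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # budget exhausted: surface the last error
            sleep(settings.retry_delay)
```

Injecting `sleep` makes the retry loop testable without real delays. With the defaults, a broker that never comes up is abandoned after roughly 50 seconds of waiting plus the time spent in each connection attempt.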
Message Durability
All S4E queues are configured as durable by default:
- Messages are persisted to disk.
- Queues survive broker restarts.
- Consumer acknowledgments ensure at-least-once delivery.
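At-least-once delivery means a message can arrive more than once, for instance when a worker crashes after processing but before acknowledging. Consumers therefore benefit from being idempotent. The sketch below shows one simple way to achieve that by remembering processed message IDs. The class name, the `message_id` parameter, and the in-memory set are all assumptions for illustration; a real deployment with several replicas would need a shared store, and this page does not describe how S4E workers handle duplicates.

```python
class IdempotentHandler:
    """Wrap a handler so that a redelivered message is processed only once."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # illustrative only: lost on restart, not shared across replicas

    def handle(self, message_id, body):
        """Return True if the message was processed, False if it was a duplicate."""
        if message_id in self._seen:
            return False  # duplicate: safe to acknowledge without reprocessing
        self._handler(body)
        # Record the ID only after success, so a failure still allows a retry.
        self._seen.add(message_id)
        return True
```

In either case the caller would acknowledge the message afterwards. If the wrapped handler raises, the ID is not recorded and the message can be retried.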
Dead Letter Handling
Messages that still fail processing after the configured retry count are routed to the dead letter exchange, `s4e.dead_letter`.
Monitor the `dead_letter_queue` for messages that require manual investigation.
Dead letter accumulation
Regularly monitor the dead letter queue depth. Accumulating dead letters indicate persistent processing failures that need attention.
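When the broker itself dead-letters a message, RabbitMQ attaches an `x-death` header: a list of entries, each recording the originating `queue`, the `reason` (such as `rejected` or `expired`), and a `count`. The sketch below groups dead letters by origin to help spot which stage is failing. It assumes the headers have already been decoded into Python dictionaries and that S4E's dead letters carry this header. The function name and sample data are hypothetical.

```python
from collections import Counter


def summarize_dead_letters(messages_headers):
    """Total dead-lettering events per (queue, reason) across many messages."""
    totals = Counter()
    for headers in messages_headers:
        # Messages published straight to the exchange carry no x-death header.
        for death in headers.get("x-death", []):
            totals[(death["queue"], death["reason"])] += death.get("count", 1)
    return totals
```

A summary dominated by one queue, for example `("scan.execute", "rejected")`, suggests starting the investigation with that consumer's logs.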
Performance Tuning
Prefetch Count
The prefetch count caps how many unacknowledged messages the broker delivers to a consumer at one time:
| Workload | Recommended Prefetch |
|---|---|
| Fast tasks (< 1 second) | 20-50 |
| Medium tasks (1-30 seconds) | 5-10 |
| Slow tasks (> 30 seconds) | 1-3 |
S4E scan workers should use a lower prefetch count because scan operations are long-running. Crawler pipeline stages can use a higher prefetch count for throughput.
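The table can be expressed as a small lookup, which may be handy when prefetch is derived from measured task durations. The function name is hypothetical, and treating durations of exactly 1 and 30 seconds as "medium" is an assumption, since the table leaves those boundaries open.

```python
def recommended_prefetch(avg_task_seconds: float) -> tuple:
    """Return the (low, high) prefetch range suggested for a task duration."""
    if avg_task_seconds < 0:
        raise ValueError("duration must be non-negative")
    if avg_task_seconds < 1:
        return (20, 50)  # fast tasks: keep the consumer's buffer full
    if avg_task_seconds <= 30:
        return (5, 10)   # medium tasks
    return (1, 3)        # slow tasks: avoid hoarding work other replicas could take
```

The reasoning behind the low end for slow tasks is that prefetched messages sit unacknowledged on one consumer and cannot be picked up by idle replicas in the meantime.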
Memory and Disk Alarms
Configure RabbitMQ resource limits:
# rabbitmq.conf
vm_memory_high_watermark.relative = 0.6
vm_memory_high_watermark_paging_ratio = 0.5
disk_free_limit.relative = 2.0
When the memory watermark is reached, RabbitMQ stops accepting new messages from publishers, which causes backpressure on S4E services.
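To make these ratios concrete, the sketch below converts them into absolute thresholds. It assumes RabbitMQ detects the 4Gi container memory limit from the in-cluster example as its total memory, which depends on the RabbitMQ version and runtime. The function name is hypothetical.

```python
GIB = 1024 ** 3


def alarm_thresholds(total_memory_bytes, watermark=0.6, paging_ratio=0.5, disk_relative=2.0):
    """Translate the relative settings from rabbitmq.conf into byte values."""
    memory_alarm = total_memory_bytes * watermark
    return {
        # Publishers are blocked once memory use exceeds this value.
        "memory_alarm_bytes": int(memory_alarm),
        # Paging to disk begins at this fraction of the watermark.
        "paging_starts_bytes": int(memory_alarm * paging_ratio),
        # The disk alarm fires when free disk space falls below this value.
        "disk_alarm_free_bytes": int(total_memory_bytes * disk_relative),
    }
```

With 4 GiB of memory this yields a memory alarm at about 2.4 GiB, paging from about 1.2 GiB, and a disk alarm when less than 8 GiB is free. On the 20Gi volume from the in-cluster example, that would leave roughly 12 GiB usable before publishers are blocked.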
Connection Limits
Ensure the maximum channel count accommodates all S4E service replicas.
Monitoring
Management UI
The RabbitMQ Management UI listens on port 15672.
Once that port is reachable from your machine, access it at http://localhost:15672 with your configured credentials.
Key Metrics
| Metric | Description | Alert Threshold |
|---|---|---|
| `queue_messages_ready` | Messages waiting for consumers | > 1000 for > 5 minutes |
| `queue_messages_unacknowledged` | Messages being processed | > 500 for > 10 minutes |
| `consumers` | Active consumer count | Drops to 0 |
| `message_rates.publish` | Message publish rate | Sudden drop to 0 |
| `message_rates.deliver` | Message delivery rate | Falls below publish rate consistently |
| `mem_used` | Memory consumption | > 80% of watermark |
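Several thresholds above are duration-based ("> 1000 for > 5 minutes"), so a single high reading should not fire an alert. The sketch below shows one way to evaluate such a rule over a series of samples. The function name and the sample format (timestamp in seconds, value) are assumptions. In practice this logic would usually live in the monitoring system's alert rules rather than in application code.

```python
def sustained_above(samples, threshold, duration_seconds):
    """True if the most recent samples stayed above threshold for the duration.

    samples: list of (unix_timestamp, value) pairs, oldest first.
    """
    if not samples:
        return False
    end = samples[-1][0]
    breach_start = None
    # Walk backwards from the newest sample until the breach is interrupted.
    for ts, value in reversed(samples):
        if value <= threshold:
            break
        breach_start = ts
    return breach_start is not None and end - breach_start >= duration_seconds
```

For `queue_messages_ready`, the rule from the table corresponds to `sustained_above(samples, 1000, 300)`. A brief dip below the threshold resets the window, which avoids alerting on a queue that is draining in bursts.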
Prometheus Integration
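Enable the RabbitMQ Prometheus plugin (`rabbitmq_prometheus`), for example with `rabbitmq-plugins enable rabbitmq_prometheus` on the broker.
Metrics are then exposed at http://rabbitmq:15692/metrics. With the Prometheus Operator, a ServiceMonitor can scrape them: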
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: rabbitmq-monitor
  namespace: s4e
spec:
  selector:
    matchLabels:
      app: rabbitmq
  endpoints:
    - port: prometheus
      path: /metrics
      interval: 30s
Troubleshooting
| Issue | Cause | Solution |
|---|---|---|
| Queue depth growing | Consumers not keeping up | Scale worker replicas or increase prefetch |
| Connection refused | RabbitMQ pod not ready | Check pod status and readiness probe |
| Memory alarm triggered | High message volume | Increase memory limits or add consumers |
| Messages in dead letter queue | Processing failures | Inspect message content and worker logs |
| Split-brain in cluster | Network partition | Follow RabbitMQ partition handling procedure |
Next Steps
- Queue optimization -- advanced queue tuning for high throughput.
- Worker services -- understand the message consumers.
- Environment variables -- complete configuration reference.