This guide covers backup strategies, restore procedures, and disaster recovery planning for S4E On-Prem deployments.


Backup Strategy

What to Back Up

| Component | Data Type | Priority | Method |
|---|---|---|---|
| PostgreSQL | All application data (users, assets, scans, results) | Critical | pg_dump / continuous archiving |
| MongoDB | Crawl results, raw scan output | High | mongodump |
| RabbitMQ | Queue definitions (not message data) | Medium | Definition export |
| Redis | Cache data (ephemeral, can be rebuilt) | Low | RDB snapshot (optional) |
| Helm values | Deployment configuration | Critical | Git repository |
| TLS certificates | Ingress certificates | High | Certificate management system |
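
As a rough illustration of the methods listed above, the following sketch wraps the three data-store backups in shell functions. It is not an S4E-provided script: the host names, user, database name, MongoDB URI, and `/var/backups/s4e` directory are placeholders, and it assumes the client tools can reach each service directly (in a Kubernetes cluster you would more likely run them through `kubectl exec` or a backup job). `DRY_RUN` defaults to `1`, so the functions only print the commands they would run.

```shell
#!/bin/sh
# Sketch only: hosts, credentials, database names, and paths are placeholders.
set -eu

BACKUP_DIR="${BACKUP_DIR:-/var/backups/s4e}"   # hypothetical target directory
STAMP="${STAMP:-$(date +%Y%m%d-%H%M%S)}"       # timestamp used in file names
DRY_RUN="${DRY_RUN:-1}"                        # 1 = print commands, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

backup_postgres() {
  # Custom-format dump (-Fc), which pg_restore can restore selectively.
  run pg_dump -h "${PGHOST:-postgres}" -U "${PGUSER:-s4e}" -Fc \
    -f "$BACKUP_DIR/postgres-$STAMP.dump" "${PGDATABASE:-s4e}"
}

backup_mongodb() {
  # Single gzip-compressed archive of everything reachable via MONGO_URI.
  run mongodump --uri="${MONGO_URI:-mongodb://mongodb:27017}" --gzip \
    --archive="$BACKUP_DIR/mongodb-$STAMP.archive.gz"
}

backup_rabbitmq() {
  # Exports definitions (users, vhosts, queues, policies), not messages.
  run rabbitmqctl export_definitions "$BACKUP_DIR/rabbitmq-$STAMP.json"
}
```

Calling `backup_postgres`, `backup_mongodb`, and `backup_rabbitmq` with `DRY_RUN=0` would execute the commands. Redis is left out here because its cache data can be rebuilt.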

Configuration as code

Store all Helm values, ArgoCD applications, and Kubernetes manifests in a Git repository. This is the most reliable backup for your deployment configuration.

Backup Schedule

Recommended practice

S4E does not manage backups on your behalf. The schedule below is a recommendation for on-prem deployments — implement it using your preferred backup tooling.

| Component | Recommended Frequency | Recommended Retention |
|---|---|---|
| PostgreSQL (full) | Daily | 30 days |
| PostgreSQL (WAL archiving) | Continuous | 7 days |
| MongoDB | Daily | 30 days |
| RabbitMQ definitions | Weekly | 4 weeks |
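
One possible way to enforce the retention periods above for file-based dumps is an age-based cleanup that runs after each backup. The sketch below is only an illustration: the `prune_backups` helper, the backup directory, the file-name patterns, and the wrapper script named in the crontab comment are all hypothetical. It covers dump files only and does not attempt to manage WAL archives.

```shell
#!/bin/sh
# Sketch only: directory layout and file-name patterns are hypothetical.
set -eu

# Delete regular files in DIR matching PATTERN whose modification time is
# more than DAYS days in the past.
prune_backups() {
  dir="$1"; pattern="$2"; days="$3"
  find "$dir" -type f -name "$pattern" -mtime +"$days" -exec rm -f {} +
}

# Example retention calls matching the table above (run after each backup):
#   prune_backups /var/backups/s4e 'postgres-*.dump' 30
#   prune_backups /var/backups/s4e 'mongodb-*.archive.gz' 30
#   prune_backups /var/backups/s4e 'rabbitmq-*.json' 28
#
# Example crontab entry for a daily 02:00 run of a hypothetical wrapper script:
#   0 2 * * * /usr/local/bin/s4e-backup.sh
```

Matching on a file-name pattern rather than deleting everything in the directory keeps unrelated files safe if the backup directory is shared.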

Recovery Verification Checklist

After any recovery operation, verify:

  • [ ] All pods are Running and Ready.
  • [ ] API health endpoint returns HTTP 200: curl -i https://s4e.company.com/api/health/ready
  • [ ] User login succeeds.
  • [ ] Asset list loads correctly.
  • [ ] A test scan can be initiated and completes.
  • [ ] Historical scan results are accessible.
  • [ ] RabbitMQ queues are created and consumers are connected.
  • [ ] Scheduled scans are still configured.
  • [ ] Monitoring and alerting are operational.
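
The first two items of the checklist above can be partly automated. The following is a minimal sketch, assuming `kubectl` access to the cluster and a namespace called `s4e` (a placeholder); the function names are hypothetical. It parses the default `kubectl get pods --no-headers` columns (NAME, READY, STATUS, ...) and treats `Completed` pods as acceptable. The remaining items (login, asset list, test scan, and so on) still need to be checked by hand.

```shell
#!/bin/sh
# Sketch only: namespace and URL are passed in by the caller.
set -eu

# Reads `kubectl get pods --no-headers` output on stdin and prints the names
# of pods that are neither Completed nor Running with all containers ready.
unready_pods() {
  awk 'NF == 0 { next }
       $3 == "Completed" { next }
       {
         split($2, r, "/")
         if ($3 != "Running" || r[1] != r[2]) print $1
       }'
}

# Succeeds only if the URL answers with HTTP status 200.
check_health() {
  url="$1"
  code=$(curl -s -o /dev/null -w '%{http_code}' "$url") || code=000
  [ "$code" = "200" ]
}

verify_recovery() {
  ns="$1"; url="$2"
  pods=$(kubectl get pods -n "$ns" --no-headers) || return 1
  [ -n "$pods" ] || { printf 'No pods found in namespace %s\n' "$ns" >&2; return 1; }
  bad=$(printf '%s\n' "$pods" | unready_pods)
  if [ -n "$bad" ]; then
    printf 'Pods not ready:\n%s\n' "$bad" >&2
    return 1
  fi
  check_health "$url" || { printf 'Health check failed: %s\n' "$url" >&2; return 1; }
  printf 'Automated checks passed; complete the remaining items manually.\n'
}
```

A typical invocation would be `verify_recovery s4e https://s4e.company.com/api/health/ready`, with the namespace adjusted to your deployment.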

Next Steps