Nightly S3 Backup with Retention via AWS CLI

The AWS CLI's aws s3 sync command copies only new and changed files, which makes it efficient for nightly directory backups to S3. Pairing it with an S3 lifecycle rule handles retention automatically, without extra scripting. Adding a CronJobPro heartbeat closes the gap between "cron fired" and "backup actually completed successfully."

Schedule

0 2 * * *

Every day at 2:00 AM server local time
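
As a complete crontab entry, the schedule might look like the sketch below. The script path is the one used later in this guide. The PATH line and the helper name cron_entry are assumptions: cron typically runs jobs with a minimal PATH, which may not include the directory where the aws binary is installed.

```shell
# Print a crontab fragment for the nightly backup.
# Assumption: aws is installed in /usr/local/bin; adjust PATH if yours differs.
cron_entry() {
    local script="${1:-/usr/local/bin/s3-backup.sh}"
    printf 'PATH=/usr/local/bin:/usr/bin:/bin\n'
    printf '0 2 * * * %s\n' "$script"
}

cron_entry
```

The two printed lines can be pasted into the editor opened by crontab -e.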

Setup

  1. Install and configure the AWS CLI

    Install the AWS CLI v2 on your server following the official AWS instructions for your OS. Run aws configure and supply an IAM access key, secret, default region, and output format json. The IAM user or role needs s3:PutObject, s3:GetObject, s3:ListBucket, and s3:DeleteObject on your target bucket.

  2. Create the S3 bucket and set a lifecycle rule

    Create your bucket in the AWS console or with aws s3api create-bucket. In the bucket's Management tab, add a lifecycle rule: set the prefix to backups/ and configure Expiration to delete objects older than your retention window (e.g. 30 days). Expiration is counted from each object's creation date in S3, so a file that stays unchanged for the whole window is expired and then uploaded again by the next sync. This handles pruning automatically without shell scripting.

  3. Create the backup script on your server

    Save the script below to /usr/local/bin/s3-backup.sh and run chmod +x /usr/local/bin/s3-backup.sh. Edit the SOURCE_DIR, BUCKET, and HEARTBEAT_URL variables at the top to match your setup. The script pings the heartbeat URL only after a successful sync; on a sync failure it calls the /fail URL instead and exits non-zero.

  4. Register a CronJobPro heartbeat monitor

    In CronJobPro create a new Heartbeat monitor, set the period to 24 hours and the grace period to 30 minutes. Copy the generated URL (https://cronjobpro.com/ping/<token>) and paste it into the HEARTBEAT_URL variable in your script. Configure your alert channels (email, Slack, etc.) on the monitor.

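  5. Add the cron entry

    Run crontab -e as the user that owns the backup directory and add an entry that combines the schedule above with the path to your script, for example 0 2 * * * /usr/local/bin/s3-backup.sh. That user must be able to write to /var/log/s3-backup.log, because the script aborts if it cannot append to its log file. After the first scheduled run, check /var/log/s3-backup.log and confirm that a ping arrived in your CronJobPro monitor history.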
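
The lifecycle rule from step 2 can also be applied from the command line. The sketch below writes a rule document to a temporary file and shows, as a comment, the s3api call that would upload it. The bucket name my-backups-bucket is the script's example value; the rule ID expire-old-backups and the helper name write_lifecycle_json are hypothetical.

```shell
# Write an S3 lifecycle configuration that expires objects under a prefix.
# Usage: write_lifecycle_json <prefix> <days> <output-file>
write_lifecycle_json() {
    local prefix="$1" days="$2" out="$3"
    cat > "$out" <<EOF
{
  "Rules": [
    {
      "ID": "expire-old-backups",
      "Filter": { "Prefix": "${prefix}" },
      "Status": "Enabled",
      "Expiration": { "Days": ${days} }
    }
  ]
}
EOF
}

LIFECYCLE_FILE="$(mktemp)"
write_lifecycle_json "backups/" 30 "$LIFECYCLE_FILE"
cat "$LIFECYCLE_FILE"

# Apply it (the caller needs s3:PutLifecycleConfiguration on the bucket):
#   aws s3api put-bucket-lifecycle-configuration \
#       --bucket my-backups-bucket \
#       --lifecycle-configuration "file://$LIFECYCLE_FILE"
```

Keeping the rule in a file makes the retention window reviewable alongside the backup script.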

The script

bash

#!/usr/bin/env bash
set -euo pipefail

# --- Configuration ---
SOURCE_DIR="/var/data/myapp"          # Local directory to back up
BUCKET="s3://my-backups-bucket/backups"  # S3 destination prefix
LOG_FILE="/var/log/s3-backup.log"
HEARTBEAT_URL="https://cronjobpro.com/ping/YOUR_TOKEN_HERE"

# --- Helpers ---
log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG_FILE"; }

log "Starting S3 backup: $SOURCE_DIR -> $BUCKET"

# --- Sync ---
if aws s3 sync "$SOURCE_DIR" "$BUCKET" \
    --storage-class STANDARD_IA \
    --delete \
    --exclude '*.tmp' \
    --exclude '*.swp' \
    2>&1 | tee -a "$LOG_FILE"; then

    log "Backup completed successfully."

    # Notify CronJobPro heartbeat on success
    curl -fsS --max-time 10 --retry 3 "$HEARTBEAT_URL" > /dev/null 2>&1 || \
        log "WARNING: heartbeat ping failed (backup itself succeeded)"
else
    EXIT_CODE=$?
    log "ERROR: aws s3 sync failed with exit code $EXIT_CODE"
    # Report failure to CronJobPro so alert fires immediately
    curl -fsS --max-time 10 "${HEARTBEAT_URL}/fail" > /dev/null 2>&1 || true
    exit "$EXIT_CODE"
fi

Monitor it

The Heartbeat monitor from step 4 expects a ping every 24 hours, with a grace period of 30 minutes. The script calls https://cronjobpro.com/ping/YOUR_TOKEN_HERE only after aws s3 sync exits successfully, so a missed or late ping means the backup did not run, is still running, or failed without being able to report it. On a sync failure the script calls /ping/YOUR_TOKEN_HERE/fail immediately, which triggers an alert right away rather than waiting for the grace period to expire. Configure alert destinations (email, Slack, Discord, PagerDuty, or others) on the monitor so your on-call contact is notified whichever way the job breaks. Check the monitor's ping history each week to confirm that timing is consistent; a ping that regularly arrives near the grace-period boundary may indicate that the source directory has grown and the schedule or grace period needs adjusting.

Frequently asked questions

Will --delete remove files from S3 that I deleted locally by mistake?

Yes. The --delete flag mirrors deletions from the source, which is what you want for an exact sync but dangerous if files are accidentally removed locally. If you want S3 to act as a safety net, omit --delete and rely solely on the S3 lifecycle rule for expiry, accepting that deleted-locally files stay in S3 until the retention period ends.
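
If you keep --delete, one way to reduce that risk is to preview deletions with the CLI's --dryrun flag before the real sync. The sketch below counts the delete lines in dry-run output and checks them against a limit. The function names and the limit of 50 are hypothetical, and it assumes dry-run lines have the form "(dryrun) delete: s3://...".

```shell
# Count planned deletions in `aws s3 sync --delete --dryrun` output (read from stdin).
count_planned_deletes() {
    # grep -c prints 0 but exits 1 when nothing matches; keep the 0 and succeed.
    grep -c '^(dryrun) delete:' || true
}

# Succeed only if the number of planned deletions is within the limit.
# Usage: deletes_within_limit <count> <max>
deletes_within_limit() {
    [ "$1" -le "$2" ]
}

# Example wiring inside the backup script (not executed here):
#   n=$(aws s3 sync "$SOURCE_DIR" "$BUCKET" --delete --dryrun | count_planned_deletes)
#   deletes_within_limit "$n" 50 || { echo "refusing to delete $n objects" >&2; exit 1; }
```

A guard like this turns an accidental mass deletion into a failed run, which the heartbeat monitor then reports.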

How do I back up multiple directories?

Call aws s3 sync once per directory inside the same script, each pointing to a distinct prefix in the bucket (e.g. backups/db and backups/uploads). Only send the heartbeat ping after all sync calls succeed; wrap the whole block in a function that exits on first failure so a partial backup is reported as a failure.
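
A minimal sketch of that pattern follows. The function name backup_all and the "local_dir|s3_url" argument format are hypothetical, and the directories in the usage comment are illustrative.

```shell
# Sync each "local_dir|s3_url" pair in order; stop at the first failure.
# Returns 0 only if every sync succeeded.
backup_all() {
    local pair src dest
    for pair in "$@"; do
        src="${pair%%|*}"    # text before the first "|"
        dest="${pair#*|}"    # text after the first "|"
        if ! aws s3 sync "$src" "$dest" --storage-class STANDARD_IA; then
            echo "ERROR: sync failed for $src" >&2
            return 1
        fi
    done
    return 0
}

# Example wiring (not executed here); ping only after all syncs succeed:
#   if backup_all "/var/data/db|s3://my-backups-bucket/backups/db" \
#                 "/var/data/uploads|s3://my-backups-bucket/backups/uploads"; then
#       curl -fsS --max-time 10 "$HEARTBEAT_URL" > /dev/null
#   else
#       curl -fsS --max-time 10 "${HEARTBEAT_URL}/fail" > /dev/null || true
#       exit 1
#   fi
```

Because the function returns on the first failed sync, later directories are skipped and the run is reported as a failure rather than a partial success.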

What IAM permissions does the backup user actually need?

At minimum: s3:PutObject and s3:GetObject on the bucket objects, s3:ListBucket on the bucket itself, and s3:DeleteObject if you use --delete. Avoid granting s3:* or permissions on other buckets. Consider using an IAM role attached to your EC2 instance instead of a long-lived access key to reduce credential exposure.
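
Those permissions could be expressed as an IAM policy along the lines of the sketch below. The helper name backup_policy_json is hypothetical, and the bucket name and prefix are the example values from the script; treat the output as a starting point to review, not a finished policy.

```shell
# Print a narrowly scoped IAM policy for the backup user.
# Usage: backup_policy_json <bucket-name> <prefix>
backup_policy_json() {
    local bucket="$1" prefix="$2"
    cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::${bucket}/${prefix}/*"
    }
  ]
}
EOF
}

backup_policy_json "my-backups-bucket" "backups"
```

Remove s3:DeleteObject from the second statement if you run the sync without --delete.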

Why use STANDARD_IA storage class instead of STANDARD?

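STANDARD_IA (Infrequent Access) costs roughly 45% less per GB stored than STANDARD, although exact prices vary by region, and is appropriate for backups you read rarely. The trade-offs are a minimum storage duration of 30 days, a minimum billable object size of 128 KB, and a per-GB retrieval fee. It suits nightly backups of mostly stable files well but would be wasteful for data that is accessed or overwritten frequently, or that consists mainly of very small files.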
