
Python Cron Jobs: APScheduler, Celery & Schedule Guide

Python gives you at least half a dozen ways to run code on a schedule. Some are dead simple but fragile. Others are production-hardened but take real effort to set up. This guide walks through every major option so you can pick the right one for your project.

The Landscape of Python Scheduling

Before diving into code, it helps to understand the broad categories. Python scheduling solutions fall into three buckets:

  • In-process schedulers run inside your Python application. They are easy to set up and work well for single-process scripts, but they die when the process dies. Examples: the schedule library and APScheduler.
  • Distributed task queues like Celery offload work to separate worker processes and use a message broker (Redis, RabbitMQ) to coordinate. This is the standard for production applications that need reliability, retries, and horizontal scaling.
  • External schedulers live entirely outside your Python code. System crontab, systemd timers, Kubernetes CronJobs, and HTTP-based schedulers like CronJobPro fall into this category. Your Python code just exposes an endpoint or script, and the scheduler triggers it.

1. The schedule Library

The schedule library is the simplest option. It has zero dependencies beyond Python itself, and the API reads like plain English.

pip install schedule
import schedule
import time
import requests

def sync_inventory():
    """Pull latest inventory from supplier API."""
    resp = requests.get(
        "https://api.supplier.com/inventory",
        headers={"Authorization": "Bearer sk-xxx"}
    )
    if resp.status_code == 200:
        # process data...
        print(f"Synced {len(resp.json())} items")

def send_daily_report():
    """Generate and email daily sales report."""
    # ... build report logic ...
    print("Report sent")

# Define schedules
schedule.every(15).minutes.do(sync_inventory)
schedule.every().day.at("09:00").do(send_daily_report)
schedule.every().monday.at("06:00").do(send_daily_report)

# Run the event loop
while True:
    schedule.run_pending()
    time.sleep(1)

When to use it: quick scripts, personal projects, one-off automation on a server where you have a long-running process.

Limitations: no persistence (if the process restarts, missed jobs are lost), no timezone support by default, no retry logic, no concurrency control. It is blocking -- the entire event loop waits for each job to complete before checking the next.

2. APScheduler (Advanced Python Scheduler)

APScheduler bridges the gap between the simplicity of the schedule library and the complexity of Celery. It supports cron-style expressions, interval triggers, date triggers, and multiple job stores (memory, SQLAlchemy, MongoDB, Redis).

pip install apscheduler

BackgroundScheduler

The most common pattern is BackgroundScheduler, which runs in a background thread and does not block your main application:

from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.cron import CronTrigger
from apscheduler.triggers.interval import IntervalTrigger

def cleanup_temp_files():
    import glob, os
    files = glob.glob("/tmp/myapp_*.tmp")
    for f in files:
        os.remove(f)
    print(f"Cleaned up {len(files)} temp files")

def check_ssl_certificates():
    # ... check expiry dates ...
    print("SSL check complete")

scheduler = BackgroundScheduler()

# Cron-style: every day at 3:00 AM
scheduler.add_job(
    cleanup_temp_files,
    CronTrigger(hour=3, minute=0),
    id="cleanup",
    name="Clean temp files",
    replace_existing=True,
)

# Interval: every 6 hours
scheduler.add_job(
    check_ssl_certificates,
    IntervalTrigger(hours=6),
    id="ssl_check",
    name="SSL certificate check",
)

scheduler.start()

# Your main application continues to run
# (e.g., a Flask or FastAPI server)
# scheduler.shutdown() when done

CronTrigger Syntax

APScheduler's CronTrigger accepts the same fields you know from standard cron expressions:

Parameter    | Cron Equivalent | Example
minute       | Field 1         | */15
hour         | Field 2         | 9-17
day          | Field 3         | 1,15
month        | Field 4         | 1-6
day_of_week  | Field 5         | mon-fri

You can also use CronTrigger.from_crontab("*/15 9-17 * * 1-5") to parse a standard five-field expression directly. This is useful when reading schedules from a config file or database.

Persistent Job Stores

By default APScheduler stores jobs in memory. If the process restarts, everything is gone. For production, configure a persistent job store:

from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore
from apscheduler.schedulers.background import BackgroundScheduler

jobstores = {
    "default": SQLAlchemyJobStore(
        url="postgresql://user:pass@localhost/mydb"
    )
}

scheduler = BackgroundScheduler(jobstores=jobstores)
scheduler.start()
# Jobs survive restarts. APScheduler catches up on
# missed executions based on misfire_grace_time.

3. Celery Periodic Tasks (Celery Beat)

Celery is the de facto standard for distributed task processing in Python. Its scheduling component, Celery Beat, sends tasks to the queue at defined intervals. Worker processes pick them up and execute them independently.

pip install "celery[redis]"
# celery_app.py
from celery import Celery
from celery.schedules import crontab

app = Celery("myapp", broker="redis://localhost:6379/0")

@app.task
def generate_report():
    """Heavy report generation task."""
    # ... generate PDF, send email ...
    return "Report generated"

@app.task
def sync_external_data():
    """Pull data from third-party API."""
    # ... sync logic ...
    return "Data synced"

@app.task
def cleanup_expired_sessions():
    """Remove sessions older than 30 days."""
    # ... database cleanup ...
    return "Cleanup done"

# Beat schedule
app.conf.beat_schedule = {
    "daily-report": {
        "task": "celery_app.generate_report",
        "schedule": crontab(hour=8, minute=0),
    },
    "sync-every-15-min": {
        "task": "celery_app.sync_external_data",
        "schedule": crontab(minute="*/15"),
    },
    "nightly-cleanup": {
        "task": "celery_app.cleanup_expired_sessions",
        "schedule": crontab(hour=2, minute=30),
    },
}

app.conf.timezone = "UTC"

To run it, you need two processes:

# Terminal 1: Start the worker
celery -A celery_app worker --loglevel=info

# Terminal 2: Start the beat scheduler
celery -A celery_app beat --loglevel=info

Why Celery: tasks execute in separate worker processes, so a slow report does not block your data sync. You get built-in retries, result backends, priority queues, rate limiting, and monitoring through Flower. Tasks are distributed across multiple workers and even multiple servers.

Why not Celery: it requires a message broker (Redis or RabbitMQ), at least two additional processes (worker + beat), and meaningful operational overhead. For a small application that needs to run one or two tasks on a schedule, Celery is overkill.

4. Django Cron Jobs

Django projects have two main paths for scheduled tasks:

Management Commands + Crontab

The simplest and most reliable pattern is to write a Django management command and trigger it from system crontab:

# myapp/management/commands/send_digest.py
from datetime import timedelta

from django.core.management.base import BaseCommand
from django.core.mail import send_mail
from django.utils import timezone

from myapp.models import User, Activity

class Command(BaseCommand):
    help = "Send daily activity digest to all users"

    def handle(self, *args, **options):
        users = User.objects.filter(
            is_active=True,
            email_digest=True
        )
        for user in users:
            activities = Activity.objects.filter(
                user=user,
                created_at__gte=timezone.now() - timedelta(days=1)
            )
            if activities.exists():
                send_mail(
                    subject="Your daily digest",
                    message=self.build_digest(activities),
                    from_email="digest@myapp.com",
                    recipient_list=[user.email],
                )
        self.stdout.write(
            self.style.SUCCESS(f"Sent digest to {users.count()} users")
        )

# crontab -e
0 8 * * * cd /var/www/myapp && /usr/bin/python manage.py send_digest

This approach has no additional dependencies. It works with any hosting that gives you crontab access. The management command gets the full Django ORM, settings, and everything else.

Django + Celery Beat

For larger Django applications, Celery is the standard choice. The django-celery-beat package stores schedules in the database and provides an admin interface for managing them without code changes:

pip install django-celery-beat

# settings.py
INSTALLED_APPS = [
    ...
    "django_celery_beat",
]

# Run migrations to create schedule tables
# python manage.py migrate

Non-technical team members can then create and modify schedules through the Django admin panel, without deploying code.

5. Flask-APScheduler

For Flask applications, Flask-APScheduler integrates APScheduler directly into the Flask app context. Jobs have access to the application config and database connections.

pip install flask-apscheduler
from flask import Flask
from flask_apscheduler import APScheduler

app = Flask(__name__)

class Config:
    SCHEDULER_API_ENABLED = True
    JOBS = [
        {
            "id": "data_sync",
            "func": "tasks:sync_products",
            "trigger": "cron",
            "minute": "*/30",
        },
        {
            "id": "health_check",
            "func": "tasks:ping_dependencies",
            "trigger": "interval",
            "seconds": 300,
        },
    ]

app.config.from_object(Config)
scheduler = APScheduler()
scheduler.init_app(app)
scheduler.start()

Watch out for multiple workers. If you run Flask with Gunicorn using 4 workers, APScheduler starts 4 times, and every job runs 4 times. You need to either limit to a single worker, use a lock (database or Redis), or use the --preload flag and start the scheduler only in the master process.
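A minimal sketch of the lock approach on a Unix host, using an exclusive file lock so only the first worker to acquire it starts the scheduler (the lock path is an arbitrary choice):

```python
import fcntl

def try_acquire_scheduler_lock(path="/tmp/myapp_scheduler.lock"):
    """Return an open file handle if we got the lock, else None."""
    f = open(path, "w")
    try:
        # Non-blocking exclusive lock; held until the handle is closed
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return f
    except BlockingIOError:
        f.close()
        return None

lock_handle = try_acquire_scheduler_lock()
if lock_handle is not None:
    # Only this worker starts APScheduler; the others skip it.
    # Keep lock_handle referenced for the lifetime of the process.
    print("scheduler started in this worker")
```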

6. Docker and Container Environments

Containers add a wrinkle to every approach above. A container is designed to run a single process. When that process stops, the container stops. This means:

  • System crontab does not exist inside most containers. You would need to install cron in the container and run it as the main process or alongside your app using a process manager like supervisord. This is workable but clunky.
  • In-process schedulers (schedule, APScheduler) work fine in containers as long as the container keeps running. But container orchestrators may restart containers at any time, and missed jobs are lost unless you use a persistent job store.
  • Celery works well in Docker because each component (app, worker, beat) runs in its own container. This is the standard production pattern.
  • Kubernetes CronJobs are the native way to schedule tasks in k8s. They spin up a Pod on schedule, run the job, and tear down the Pod. No long-running scheduler process needed.
# docker-compose.yml with Celery
services:
  web:
    build: .
    command: gunicorn myapp.wsgi:application -b 0.0.0.0:8000

  worker:
    build: .
    command: celery -A myapp worker --loglevel=info

  beat:
    build: .
    command: celery -A myapp beat --loglevel=info

  redis:
    image: redis:7-alpine
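The Kubernetes CronJob approach mentioned above looks like this as a manifest (the image name, command, and schedule are illustrative):

```yaml
# k8s CronJob: spins up a Pod on schedule, no long-running scheduler
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
spec:
  schedule: "0 2 * * *"        # every day at 02:00
  concurrencyPolicy: Forbid    # skip a run if the previous one is still going
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: report
              image: myapp:latest
              command: ["python", "generate_report.py"]
          restartPolicy: OnFailure
```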

Comparison Table

Here is how the main options stack up against each other:

Feature          | schedule | APScheduler | Celery Beat | External
Setup complexity | Low      | Medium      | High        | Low
Cron syntax      | No       | Yes         | Yes         | Yes
Persistence      | No       | Optional    | Yes         | Yes
Retries          | No       | Manual      | Built-in    | Built-in
Distributed      | No       | No          | Yes         | Yes
Monitoring       | No       | Basic       | Flower      | Dashboard

7. When to Use an External HTTP Scheduler

All the libraries above share a common problem: you are responsible for keeping the scheduling infrastructure running. If your APScheduler process crashes at 2 AM, your nightly backup does not run. If your Celery Beat container gets OOM-killed, nobody knows until Monday.

An external HTTP scheduler decouples scheduling from your application entirely. You expose your tasks as HTTP endpoints in your Python app:

# FastAPI example
import secrets

from fastapi import FastAPI, Header, HTTPException

app = FastAPI()
API_SECRET = "your-webhook-secret"  # load from an env var in production

def verify_key(x_api_key: str) -> None:
    # constant-time comparison avoids leaking the key via timing
    if not secrets.compare_digest(x_api_key, API_SECRET):
        raise HTTPException(status_code=401)

@app.post("/tasks/generate-report")
async def generate_report(x_api_key: str = Header(...)):
    verify_key(x_api_key)
    # ... generate report ...
    return {"status": "ok", "rows_processed": 1847}

@app.post("/tasks/sync-inventory")
async def sync_inventory(x_api_key: str = Header(...)):
    verify_key(x_api_key)
    # ... sync logic ...
    return {"status": "ok", "items_synced": 423}

Then configure CronJobPro to call those endpoints on the schedule you need. The scheduler runs on separate infrastructure, so it keeps working even if your application has issues. You get automatic retries, execution history, failure alerts, and timezone-aware scheduling without writing any additional code.

This pattern works especially well for serverless and PaaS deployments (Vercel, Render, Railway, Fly.io) where you do not have access to system crontab or persistent background processes.

How to Choose

Use this decision guide based on your situation:

Quick script or personal project?

Use the schedule library. Minimal setup, gets the job done.

Single-server app that needs cron syntax?

Use APScheduler with a persistent job store. Production-quality without the operational overhead of Celery.

Multi-server app with heavy background processing?

Use Celery Beat. You probably need Celery for async tasks anyway, so add Beat for scheduling.

Serverless, PaaS, or you want zero scheduler maintenance?

Use an external HTTP scheduler like CronJobPro. Expose endpoints, configure schedules in the dashboard, and never think about scheduler infrastructure again.

Key Takeaways

  • The schedule library is perfect for simple scripts but lacks persistence and error handling.
  • APScheduler supports cron expressions, multiple job stores, and runs in a background thread alongside your application.
  • Celery Beat is the production standard for distributed Python applications but requires a message broker and multiple processes.
  • Django management commands with system crontab remain the simplest reliable pattern for Django projects.
  • External HTTP schedulers eliminate scheduler infrastructure entirely and work with any deployment model.

Skip the scheduler infrastructure

Expose your Python tasks as HTTP endpoints and let CronJobPro handle scheduling, retries, and monitoring. Free for up to 5 jobs.