Retries, timeouts, and the circuit breaker
Every CronJobPro job sends an HTTP request from our servers on schedule. A 2xx response counts as success; a non-2xx status, a timeout, a connection error, or a DNS failure counts as a failed execution. This article explains the settings that control what happens when a request fails: the request timeout, retries, the overlap policy, the circuit breaker, and the alert threshold.
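The success rule above can be written as a small predicate. This is a minimal sketch for illustration only, not CronJobPro's code: the function name `is_successful` and the `error` parameter are hypothetical, and it assumes the status checked is the final one the request ends with.

```python
from typing import Optional


def is_successful(status_code: Optional[int], error: Optional[str] = None) -> bool:
    """Return True only for a 2xx response with no transport-level error.

    `error` is a hypothetical field standing in for a timeout, connection
    error, or DNS failure, in which case no status code is available.
    """
    if error is not None:
        return False
    return status_code is not None and 200 <= status_code <= 299
```

Under this rule, a `204` counts as success, while a `404`, a `500`, or a timeout with no status at all counts as a failed execution.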
Request timeout
The request timeout is how long, in seconds, CronJobPro waits for a response before giving up and marking the execution as failed. Set it a bit higher than your endpoint's realistic worst-case response time so a slow-but-healthy run is not falsely flagged, but low enough that a truly stuck request fails fast and frees the slot. If your endpoint kicks off long work, have it return 2xx quickly and do the heavy lifting in the background.
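One way to return 2xx quickly is to have the endpoint enqueue the work and acknowledge it immediately. The sketch below is framework-agnostic and uses only the Python standard library. The names `handle_trigger`, `jobs`, and `completed` are hypothetical, and the choice of `202 Accepted` is an assumption (any 2xx status would count as success).

```python
import queue
import threading

# In-process work queue; a real service might use a durable queue instead.
jobs: queue.Queue = queue.Queue()
completed: list = []


def worker() -> None:
    """Drain the queue in the background, one job at a time."""
    while True:
        job_id = jobs.get()
        try:
            # Placeholder for the slow work (report generation, sync, ...).
            completed.append(job_id)
        finally:
            jobs.task_done()


threading.Thread(target=worker, daemon=True).start()


def handle_trigger(job_id: str) -> int:
    """Called by the HTTP handler: enqueue the work and acknowledge at once."""
    jobs.put(job_id)
    return 202  # Accepted: a 2xx status, returned well within the timeout
```

With this shape the request timeout only has to cover the enqueue step, not the work itself. The trade-off is that a 2xx now means "accepted", so failures in the background work need their own monitoring.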
Retries and retry strategy
When an execution fails, CronJobPro retries it up to the max retries you configure, spacing attempts according to the retry strategy you choose. Retries are durable, so they survive platform restarts. Pick the strategy that matches why your endpoint tends to fail.
| Strategy | Delay between attempts | Good for |
|---|---|---|
| Exponential | Grows after each attempt | Endpoints that may be overloaded or rate-limited; backing off gives them room to recover |
| Linear | Increases by a fixed step each attempt | A gradual, predictable ramp-up |
| Fixed | The same delay every time | Short, transient blips where a steady, quick retry is fine |
Keep max retries modest. A handful of attempts smooths over transient errors; piling on retries mainly delays the alert you need to see and burns time on an endpoint that is genuinely down.
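To make the three strategies in the table above concrete, here is a minimal sketch of how the delays could be computed. The 30-second base, the doubling factor, and the one-hour cap are illustrative assumptions, not CronJobPro's actual values, and `retry_delay` is a hypothetical helper.

```python
def retry_delay(strategy: str, attempt: int, base: float = 30.0, cap: float = 3600.0) -> float:
    """Delay in seconds before retry number `attempt` (1-based).

    `base` and `cap` are assumed values chosen for illustration.
    """
    if attempt < 1:
        raise ValueError("attempt must be 1 or greater")
    if strategy == "exponential":
        delay = base * 2 ** (attempt - 1)  # doubles after each attempt
    elif strategy == "linear":
        delay = base * attempt             # grows by a fixed step
    elif strategy == "fixed":
        delay = base                       # the same every time
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return min(delay, cap)


for name in ("exponential", "linear", "fixed"):
    print(name, [retry_delay(name, n) for n in range(1, 5)])
```

With these assumed values, four retries wait 30, 60, 120, and 240 seconds under exponential; 30, 60, 90, and 120 under linear; and 30 each time under fixed. The exponential total grows quickly, which is one reason to keep max retries small.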
Overlap policy: allow vs skip
The overlap policy decides what happens when the next scheduled run arrives while the previous run is still in flight. Allow lets the new run start anyway, so executions can overlap. Skip cancels the new run for that tick and waits for the next scheduled time.
Use the Skip policy whenever a job is not safe to run twice at once, such as report generation, syncs, or anything that mutates shared data. It prevents a slow run from stacking on top of itself.
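The difference between the two policies shows up in a small simulation. This sketch assumes a fixed schedule interval and known run durations. `started_ticks` is a hypothetical helper, not part of any CronJobPro API.

```python
def started_ticks(policy: str, interval: float, durations: list) -> list:
    """Return the indices of scheduled ticks that actually start a run.

    Tick i fires at time i * interval; durations[i] is how long that run
    would take if it started. Under "skip", a tick that fires while an
    earlier run is still in flight is dropped.
    """
    started = []
    busy_until = 0.0
    for i, duration in enumerate(durations):
        tick_time = i * interval
        if policy == "skip" and tick_time < busy_until:
            continue  # previous run still in flight: skip this tick
        started.append(i)
        busy_until = max(busy_until, tick_time + duration)
    return started


# A job scheduled every 60 s whose first run takes 150 s.
print(started_ticks("allow", 60, [150, 10, 10, 10]))
print(started_ticks("skip", 60, [150, 10, 10, 10]))
```

Under Allow all four ticks start, so the second and third runs overlap the slow first one. Under Skip only the first and fourth ticks start, because the ticks at 60 s and 120 s arrive while the 150-second run is still in flight.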
The circuit breaker
When a job keeps failing despite retries, the circuit breaker auto-disables it so CronJobPro stops hammering a dead endpoint. It then auto-probes the job in a half-open state, sending an occasional test run; once a probe succeeds, the breaker closes and the job resumes its normal schedule. You are notified when a job is disabled this way, and again on recovery, via your configured channels.
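The closed, open, and half-open cycle described above follows the general circuit-breaker pattern, which can be sketched as a small state machine. This is not CronJobPro's implementation: the class, the `failure_threshold` parameter, and its default of 5 are assumptions, and how often probes are sent is left out entirely.

```python
class CircuitBreaker:
    """Generic circuit-breaker sketch with three states."""

    def __init__(self, failure_threshold: int = 5) -> None:
        self.failure_threshold = failure_threshold  # assumed, not documented
        self.consecutive_failures = 0
        self.state = "closed"  # closed = job runs on its normal schedule

    def begin_probe(self) -> None:
        """Move a disabled job to half-open so one test run can be sent."""
        if self.state == "open":
            self.state = "half_open"

    def record(self, success: bool) -> str:
        """Record the outcome of a run or probe and return the new state."""
        if success:
            self.consecutive_failures = 0
            self.state = "closed"  # a successful probe closes the breaker
        else:
            self.consecutive_failures += 1
            failed_probe = self.state == "half_open"
            if failed_probe or self.consecutive_failures >= self.failure_threshold:
                self.state = "open"  # job is disabled until the next probe
        return self.state
```

A usage example: with `failure_threshold=3`, three failed runs in a row open the breaker, a failed probe keeps it open, and the first successful probe closes it and resets the failure count.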
Alert threshold: suppress noisy alerts
The alert threshold suppresses failure alerts until a job has failed N consecutive times. The default is 1, so you are alerted on the very first failure. Raise it to ride out the occasional one-off blip without a notification, but keep it low enough that a real outage still reaches you quickly. Recovery and circuit-breaker alerts are sent independently of this setting.
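The effect of the threshold can be seen by replaying a run history. The sketch below assumes an alert fires once, at the moment a failure streak reaches the threshold. Whether further alerts follow later in the same streak is not stated here, so that behavior is left out. `failure_alert_indices` is a hypothetical helper.

```python
def failure_alert_indices(results: list, threshold: int = 1) -> list:
    """Return the indices of executions that would trigger a failure alert.

    `results` is a list of booleans, True for a successful execution.
    Assumption: one alert per streak, sent when the streak hits `threshold`.
    """
    alerts = []
    streak = 0
    for i, ok in enumerate(results):
        if ok:
            streak = 0  # any success resets the consecutive-failure count
            continue
        streak += 1
        if streak == threshold:
            alerts.append(i)
    return alerts


history = [False, True, False, False, False, True, False]
print(failure_alert_indices(history, threshold=1))
print(failure_alert_indices(history, threshold=3))
```

For this history, a threshold of 1 alerts on executions 0, 2, and 6 (the start of each failure streak), while a threshold of 3 alerts only on execution 4, the third consecutive failure. The isolated failures at 0 and 6 stay silent.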
Recommended starting point
1. Open the job. Go to your job in the dashboard and open its settings.
2. Set a realistic timeout. Use a value slightly above your endpoint's realistic worst-case response time.
3. Add a few retries. Choose exponential for endpoints prone to overload or rate limiting, and fixed for quick transient blips. Keep max retries small.
4. Pick the overlap policy. Choose Skip for any job that should not run twice at once, and Allow only when concurrent runs are safe.
5. Tune the alert threshold. Leave it at 1 for critical jobs, and raise it slightly for flaky endpoints to cut noise.
Configure where these alerts go under Settings > Notifications, then confirm everything end to end by triggering a manual run with Run now. Note that a manual Run now counts toward your plan's daily-run quota.