Monitor Sidekiq Cron Jobs (sidekiq-cron & sidekiq-scheduler)
Recurring jobs in Sidekiq are usually driven by sidekiq-cron's background poller thread or by sidekiq-scheduler's scheduler thread. Both run inside the Sidekiq process and depend on Redis state that can vanish without any error appearing in your application logs. When the scheduling thread crashes or Redis evicts a key under memory pressure, your daily report or nightly sync job simply stops running: no exception, no alert, no trace in Sidekiq's web UI. An external heartbeat monitor is the most reliable way to detect this class of silent failure, because it does not depend on the scheduler itself being healthy.
Why Sidekiq Scheduled Jobs Fail Without Warning
Both sidekiq-cron and sidekiq-scheduler store scheduling state in Redis and run their own internal threads inside the Sidekiq process. That architecture creates several failure paths that produce no exception in your application and no failed job in the Sidekiq web UI — the job simply never gets enqueued. Sidekiq's built-in monitoring only shows you jobs that reached a queue; it cannot detect jobs that were never enqueued in the first place. Internal log scraping, APM dashboards, and Kubernetes liveness probes all share the same blind spot: they verify the process is alive, not that scheduling is still happening.
- The sidekiq-cron Poller thread runs inside the Sidekiq process as a Ruby Thread. If an unhandled exception or a gem compatibility issue kills that thread (a documented real-world case involves connection_pool 3.x raising ArgumentError inside the thread), the Sidekiq process stays up and continues processing regular jobs while all cron scheduling silently stops.
- sidekiq-cron stores job metadata in a Redis sorted set (zset). If your Redis instance uses any eviction policy other than noeviction, Redis can silently evict those keys under memory pressure. The job definitions disappear and no new runs are ever enqueued, with no error anywhere.
- sidekiq-scheduler stores the entire schedule under a single Redis key (sidekiq:scheduler:schedules by default). With multiple Sidekiq processes or a rolling deploy, the last process to start overwrites that key, and any jobs present only in the previous schedule are lost until the next full restart that loads a complete config.
- A Sidekiq restart during a rolling deploy creates a window — sometimes only seconds — where no scheduler is active. sidekiq-cron's default reschedule_grace_period catches only jobs that were missed within the last 60 seconds, so a daily job whose window falls inside a longer outage is simply skipped without any catch-up.
- When sidekiq-cron is configured across multiple applications sharing the same Redis instance without key namespacing, one application's schedule can overwrite another's, silently dropping the other application's scheduled jobs.
- A job whose perform method raises an exception every time it runs will appear as a Sidekiq failure (visible in the Retries tab), but a job whose class file is missing, whose constant is not loaded, or whose schedule YAML is malformed after a deploy will fail at the enqueue step, producing no Sidekiq job record at all.
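The first failure path, a scheduling thread that dies while the process keeps running, is easy to reproduce in plain Ruby. The sketch below is a toy model only: run_doomed_poller is a hypothetical stand-in for a poller thread and is not sidekiq-cron code. It shows that an unhandled exception ends the thread while the main thread carries on.

```ruby
# Toy model of a scheduler thread dying inside an otherwise healthy process.
# `run_doomed_poller` is a hypothetical name; it does not use sidekiq-cron.
def run_doomed_poller
  poller = Thread.new do
    # Silence Ruby's default stderr report so the demo output stays readable.
    Thread.current.report_on_exception = false
    raise ArgumentError, 'simulated failure inside the poller thread'
  end

  begin
    poller.join # join re-raises the thread's exception in the caller
  rescue ArgumentError
    # A real poller thread is never joined, so nothing observes this error.
  end

  poller
end

poller = run_doomed_poller
puts "poller alive?   #{poller.alive?}"         # false
puts "poller status:  #{poller.status.inspect}" # nil means it died from an exception
puts "main thread ok? #{Thread.main.alive?}"    # true
```

Nothing in the process signals that the thread is gone unless something checks it explicitly, which is why a check from outside the process is useful.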
The External Heartbeat Approach: Catch What Sidekiq Cannot
An external heartbeat monitor (also called a dead-man's switch) inverts the monitoring model: instead of an outside system polling your application, your job actively signals a monitoring service on every successful completion. CronJobPro issues you a unique ping URL — https://cronjobpro.com/ping/<token> — and you configure an expected period plus a grace window. If the ping does not arrive within that window, CronJobPro fires an alert to your chosen channel (email, Slack, Discord, Teams, PagerDuty, Opsgenie, or webhook). You can also call https://cronjobpro.com/ping/<token>/fail to signal a known failure explicitly, or https://cronjobpro.com/ping/<token>/exitcode/<n> to pass a numeric exit code. This approach catches every failure mode described above: a dead poller thread, an evicted Redis key, a skipped window during a deploy, a missing job class — all produce the same outcome from the monitor's perspective: the ping does not arrive, and you are alerted. Internal Sidekiq metrics, APM agents, and Kubernetes probes cannot catch these because they all observe the process, not the specific scheduled action.
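The three URL shapes mentioned above can be built by one small helper. This is a minimal sketch: the method name heartbeat_url, the outcome symbols, and the token abc123 are illustrative assumptions, and only the URL patterns themselves come from the text above.

```ruby
PING_BASE = 'https://cronjobpro.com/ping'

# Builds a ping URL for a given outcome.
# `token` is the monitor token; 'abc123' below is a made-up placeholder.
def heartbeat_url(token, outcome = :success, exit_code: nil)
  base = "#{PING_BASE}/#{token}"

  case outcome
  when :success
    base
  when :fail
    "#{base}/fail"
  when :exitcode
    raise ArgumentError, 'exit_code is required for :exitcode' if exit_code.nil?

    "#{base}/exitcode/#{Integer(exit_code)}"
  else
    raise ArgumentError, "unknown outcome: #{outcome.inspect}"
  end
end

puts heartbeat_url('abc123')                          # https://cronjobpro.com/ping/abc123
puts heartbeat_url('abc123', :fail)                   # https://cronjobpro.com/ping/abc123/fail
puts heartbeat_url('abc123', :exitcode, exit_code: 2) # https://cronjobpro.com/ping/abc123/exitcode/2
```

Keeping URL construction in one place avoids string concatenation scattered across worker classes.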
Add a heartbeat to Sidekiq
1. Create a heartbeat monitor in CronJobPro
In your CronJobPro dashboard, create a new Heartbeat monitor. Set the schedule to match your job's cron expression (for example, every day at 02:00 UTC). Because the ping is sent at the end of the run, the grace period must cover the job's typical run time plus any time it spends waiting in the queue; five to ten minutes is reasonable for most nightly jobs, and longer-running jobs need more. CronJobPro will generate a unique ping URL in the form https://cronjobpro.com/ping/<token>.
2. Store the ping URL as an environment variable
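Add the ping URL to your application's environment configuration. In Rails this is typically an entry in your credentials file or an environment variable; use one variable per monitored job, such as the CRONJOBPRO_NIGHTLY_REPORT_PING_URL used in the example worker. Never commit the token to source control: it uniquely identifies your monitor, and anyone who has the URL can send pings to it.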
3. Add the ping call at the end of your perform method
In your Sidekiq worker class, call the ping URL using Ruby's built-in Net::HTTP (or Faraday / HTTParty if already in your stack) as the very last action inside the perform method, after all business logic has completed successfully. Placing it at the end ensures the ping only fires on genuine success — not on partial completion.
4. Handle the ping failure gracefully
Wrap the ping HTTP call in a rescue block so that a transient network failure when contacting CronJobPro does not itself raise an exception and cause Sidekiq to retry the job. Log the failure, but do not let it disrupt the job's own error semantics. If you want to report an explicit failure, call the /fail endpoint inside your existing rescue block before re-raising.
5. Verify end-to-end in staging
Trigger the job manually in staging with Sidekiq::Client.push or by calling MyJob.perform_async, then confirm in the CronJobPro dashboard that the monitor status flips to OK. Next, deliberately comment out the ping line and run the job again — confirm that after the grace window expires the monitor fires an alert. This two-step check validates both the success path and the silent-failure detection path.
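Step 4 protects the job from a failing ping, but a slow ping can still hold a Sidekiq thread: Net::HTTP's default open and read timeouts are 60 seconds each. The sketch below shows one way to add explicit timeouts. The method name ping_heartbeat_with_timeout, the 5-second defaults, and the HEARTBEAT_LOGGER constant are illustrative assumptions, not part of CronJobPro or Sidekiq.

```ruby
require 'net/http'
require 'uri'
require 'logger'

# Hypothetical logger for the sketch; in Rails you would likely use Rails.logger.
HEARTBEAT_LOGGER = Logger.new($stderr)

# Sends a GET to the heartbeat URL with short timeouts.
# Returns true on a 2xx response and false otherwise; it never raises.
def ping_heartbeat_with_timeout(url, open_timeout: 5, read_timeout: 5)
  return false if url.nil? || url.empty?

  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https',
                             open_timeout: open_timeout,
                             read_timeout: read_timeout) do |http|
    http.get(uri.request_uri)
  end

  response.is_a?(Net::HTTPSuccess)
rescue StandardError => e
  # A monitoring problem must not change the outcome of the job itself.
  HEARTBEAT_LOGGER.warn("heartbeat ping failed: #{e.class}: #{e.message}")
  false
end
```

Returning a boolean instead of raising lets the caller decide whether a failed ping is worth logging at a higher level, without affecting Sidekiq's retry behaviour.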
```ruby
# app/workers/nightly_report_worker.rb
#
# sidekiq-cron schedule (config/initializers/sidekiq.rb or sidekiq.yml):
#   NightlyReportWorker:
#     cron: '0 2 * * *'
#     class: NightlyReportWorker
#
# sidekiq-scheduler schedule (config/sidekiq_scheduler.yml):
#   nightly_report:
#     cron: '0 2 * * *'
#     class: NightlyReportWorker
#     queue: default
require 'net/http'
require 'uri'

class NightlyReportWorker
  include Sidekiq::Worker
  sidekiq_options queue: :default, retry: 3

  HEARTBEAT_URL = ENV.fetch('CRONJOBPRO_NIGHTLY_REPORT_PING_URL', nil)

  def perform
    # --- your business logic here ---
    generate_and_deliver_report
    # --- end business logic ---

    # Ping the heartbeat monitor on success.
    # This must be the last action so it only fires when the job
    # fully completed without raising an exception.
    ping_heartbeat
  rescue StandardError
    # Signal an explicit failure so CronJobPro alerts immediately
    # rather than waiting for the grace window to expire.
    ping_heartbeat_fail
    raise # re-raise so Sidekiq records the failure and retries
  end

  private

  def generate_and_deliver_report
    # Replace with your actual work.
    Rails.logger.info '[NightlyReportWorker] generating report'
  end

  def ping_heartbeat
    return unless HEARTBEAT_URL

    uri = URI.parse(HEARTBEAT_URL)
    Net::HTTP.get(uri)
  rescue StandardError => e
    # Do not let a monitoring ping failure affect the job's own outcome.
    Rails.logger.warn "[NightlyReportWorker] heartbeat ping failed: #{e.message}"
  end

  def ping_heartbeat_fail
    return unless HEARTBEAT_URL

    fail_uri = URI.parse("#{HEARTBEAT_URL}/fail")
    Net::HTTP.get(fail_uri)
  rescue StandardError => e
    Rails.logger.warn "[NightlyReportWorker] heartbeat fail-ping failed: #{e.message}"
  end
end
```
Frequently asked questions
Will Sidekiq automatically re-run a cron job that was missed during a deploy?
Only partially. sidekiq-cron has a reschedule_grace_period (60 seconds by default): after the process restarts, it catches up on a missed run only if the scheduled time falls within that window. If the outage lasts longer than the grace period, the run is skipped, and for a daily job the next attempt is a full day later. You can increase the grace period with Sidekiq::Cron.configure { |c| c.reschedule_grace_period = 600 }, but this does not help if the poller thread has silently died while the process itself is still running.
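The arithmetic behind the grace period can be modelled in a few lines. This is a simplified sketch of the rule as described here, not sidekiq-cron's actual implementation; the method name caught_up? and the example timestamps are assumptions chosen for illustration.

```ruby
# Simplified model: a missed run is recovered only if the scheduler is back
# within `grace_period` seconds of the scheduled time.
def caught_up?(scheduled_at:, scheduler_back_at:, grace_period: 60)
  delay = scheduler_back_at - scheduled_at # seconds, as a Float
  delay.between?(0, grace_period)
end

scheduled = Time.utc(2024, 1, 1, 2, 0, 0) # arbitrary example: a 02:00 UTC nightly job

# Deploy finished 45 seconds after the scheduled time: run is recovered.
puts caught_up?(scheduled_at: scheduled, scheduler_back_at: scheduled + 45)   # true

# Deploy took 5 minutes: run is skipped with the default 60-second window.
puts caught_up?(scheduled_at: scheduled, scheduler_back_at: scheduled + 300)  # false

# The same 5-minute outage is covered once the window is raised to 600 seconds.
puts caught_up?(scheduled_at: scheduled, scheduler_back_at: scheduled + 300,
                grace_period: 600)                                            # true
```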
Can I use Sidekiq Pro or the Sidekiq web UI to detect that a scheduled job stopped running?
No. The Sidekiq web UI shows jobs that have been enqueued, are processing, or have failed. If a scheduled job is never enqueued — because the poller thread crashed, Redis evicted the job's key, or the scheduler process is down — there is no job record in Sidekiq at all. There is nothing to show in the web UI. This is the core reason external heartbeat monitoring is necessary.
What Redis eviction policy should I use for Sidekiq?
Redis should be configured with maxmemory-policy noeviction when used for Sidekiq. With any other policy (allkeys-lru, volatile-lru, etc.), Redis can silently evict keys that Sidekiq and sidekiq-cron depend on — including the sorted sets that store scheduled job state — when Redis reaches its memory limit. The Sidekiq documentation explicitly states that any setting other than noeviction may result in jobs being lost.
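A deploy-time check can flag an unsafe policy before it causes a problem. The sketch below only classifies a policy string; it assumes you have already obtained the value, for example from the output of redis-cli CONFIG GET maxmemory-policy. The method name eviction_policy_warning is hypothetical, and some managed Redis services restrict the CONFIG command.

```ruby
SAFE_EVICTION_POLICY = 'noeviction'

# Returns nil when the policy is safe for Sidekiq, otherwise a warning string.
def eviction_policy_warning(policy)
  normalized = policy.to_s.strip.downcase
  return nil if normalized == SAFE_EVICTION_POLICY

  "Redis maxmemory-policy is '#{normalized}'; keys may be evicted under " \
    "memory pressure. Expected '#{SAFE_EVICTION_POLICY}'."
end

%w[noeviction allkeys-lru volatile-lru].each do |policy|
  puts eviction_policy_warning(policy) || "#{policy}: ok"
end
```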
Why does running multiple Sidekiq processes cause sidekiq-scheduler to lose jobs?
sidekiq-scheduler stores all schedule definitions under a single Redis key that is not scoped to an individual process. When a second process starts, it writes its own schedule to that key, overwriting whatever the first process had stored. If the two processes have different schedules loaded — which can happen during a rolling deploy where old and new application versions overlap — jobs present only in the overwritten schedule disappear until the next restart. The recommended mitigation is to ensure all processes load an identical schedule, and to use different Redis key_prefix values or separate Redis databases when running multiple applications on the same Redis instance.
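The overwrite can be illustrated with a toy model in which a Hash stands in for Redis and a single key holds the whole schedule. Nothing here calls sidekiq-scheduler; SCHEDULE_KEY, boot_process, lost_jobs, and the job names are all hypothetical.

```ruby
# Toy model of last-writer-wins on a single shared schedule key.
SCHEDULE_KEY = 'schedules' # placeholder key name for the model

# Each booting process writes its whole schedule, replacing what was there.
def boot_process(store, schedule)
  store[SCHEDULE_KEY] = schedule.dup
end

# Job names present before the overwrite but missing afterwards.
def lost_jobs(before, after)
  before.keys - after.keys
end

old_release = { 'nightly_report' => '0 2 * * *' }
new_release = { 'nightly_report' => '0 2 * * *', 'hourly_sync' => '0 * * * *' }

store = {}
boot_process(store, new_release) # new version starts and registers hourly_sync
boot_process(store, old_release) # an old-version process restarts last and overwrites

puts lost_jobs(new_release, store[SCHEDULE_KEY]).inspect # ["hourly_sync"]
```

In this model the job added by the new release disappears as soon as any process with the older schedule boots, which is why loading an identical schedule in every process matters.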
How do I test that my heartbeat monitor will actually alert me?
The most reliable test is to comment out the ping call in a staging environment, trigger the job manually, and wait for the grace period to expire — at which point CronJobPro should fire an alert through your configured channel. Alternatively, you can temporarily change the expected period in the CronJobPro dashboard to one minute, then stop your Sidekiq process entirely and confirm the alert arrives within the grace window. Always restore the correct schedule settings after testing.