April 1, 2026 6 min read
Cron job overlap: what it is, why it happens, and how to detect it
When a cron job starts before the previous instance has finished, you get overlapping runs. Here's why this happens in production, the damage it causes, and how to catch it automatically.
Your cron job is scheduled to run every 15 minutes. It normally finishes in 3 minutes. Then one night, a slow database query causes it to run for 20 minutes — and cron starts a new instance at the 15-minute mark while the previous one is still running.
Now you have two instances of the same job running simultaneously. Both are processing the same queue. Both are writing to the same tables. Both complete and report success.
This is cron job overlap, and it causes some of the most confusing production bugs you will encounter: duplicate records, race conditions, data inconsistencies, and resource exhaustion — none of which come with a clear error message pointing back to the root cause.
Why cron doesn't prevent this by default
Cron has no concept of job state. It fires a command at a scheduled time. Whether the previous instance of that command is still running is not something cron checks. From cron's perspective, every execution is independent.
This means that if a job takes longer than its schedule interval — for any reason — you get overlap. A job scheduled every 5 minutes that takes 8 minutes will have two instances running simultaneously for 3 minutes out of every 5-minute cycle. A daily job that suddenly takes 26 hours will overlap with the next day's run.
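The arithmetic above generalizes in a few lines. This is a toy model, assuming a fixed start interval and a fixed runtime; `overlap_profile` is a name made up here for illustration:

```python
import math

def overlap_profile(interval_min: float, duration_min: float) -> tuple[int, float]:
    """Steady-state behavior of a fixed-interval job with a fixed runtime.

    Returns (max concurrent instances, minutes of overlap per cycle).
    """
    # A new instance starts every interval_min and lives for duration_min,
    # so at any moment up to ceil(duration / interval) instances are alive.
    max_concurrent = math.ceil(duration_min / interval_min)
    # Overlap per cycle is how far the runtime spills past the interval,
    # capped at the full cycle length once runs stack continuously.
    overlap_per_cycle = min(max(duration_min - interval_min, 0.0), interval_min)
    return max_concurrent, overlap_per_cycle

# The example from the text: scheduled every 5 minutes, takes 8 minutes.
print(overlap_profile(5, 8))   # (2, 3.0): two instances, 3 min overlap per cycle
print(overlap_profile(15, 3))  # (1, 0.0): finishes well inside its interval
```

Once the runtime exceeds twice the interval, the overlap is continuous and three or more instances run at once.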
The scenarios where overlap causes real damage
Duplicate record creation. Two instances of a data import job both query the source and insert records. Without proper deduplication logic (which most import jobs don't have), you get twice the records.
Double-sending emails or notifications. Two instances of an email dispatch job both process the same outbox. Users receive the same email twice. Depending on what the email contains, this can range from annoying to embarrassing to legally problematic.
Race conditions in cleanup jobs. Two instances of a cleanup or archiving job both identify the same set of records as eligible for deletion. Both attempt to delete them. One fails with a "not found" error, which may or may not be caught and logged.
Resource exhaustion. Long-running jobs that hold database connections, file locks, or external API sessions can exhaust available resources when multiple instances stack up. A job that holds a Postgres advisory lock will block all subsequent instances from running, causing a queue of stacked processes.
Cascading overlap. If a job overlaps and takes even longer because of the resource contention, the next scheduled run also overlaps, creating a growing pile of concurrent instances that compounds the problem.
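A toy simulation makes the compounding effect visible. All numbers here, and the linear slowdown model, are assumptions chosen for illustration:

```python
def simulate_cascade(interval, base_duration, cycles, slowdown_per_peer=1.0):
    """Toy model of cascading overlap: each instance's runtime grows with
    the number of instances already running when it starts (contention)."""
    ends = []  # end time of each instance started so far
    max_concurrent = 0
    for k in range(cycles):
        start = k * interval
        peers = sum(1 for e in ends if e > start)  # still-running instances
        duration = base_duration * (1 + slowdown_per_peer * peers)
        ends.append(start + duration)
        # Concurrency peaks at start times: peers plus the new instance
        max_concurrent = max(max_concurrent, peers + 1)
    return max_concurrent

# A job that barely overlaps: 6-minute runtime on a 5-minute schedule.
print(simulate_cascade(5, 6, 8, slowdown_per_peer=0))    # no contention: steady
print(simulate_cascade(5, 6, 8, slowdown_per_peer=1.0))  # contention: pile grows
```

Without contention the job stabilizes at two concurrent instances; with contention each extra instance slows the next, and the pile keeps growing.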
Preventing overlap in your code
The standard approach is a distributed lock. Before doing any work, your job attempts to acquire a lock. If the lock is held by another instance, the new instance exits immediately:
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL!);

const LOCK_KEY = 'job:nightly-sync:lock';
const LOCK_TTL_MS = 30 * 60 * 1000; // 30 minutes

async function nightlySync() {
  // Attempt to acquire the lock: SET with PX and NX is a single atomic command.
  // ioredis takes the expiry option before the NX flag.
  const acquired = await redis.set(LOCK_KEY, '1', 'PX', LOCK_TTL_MS, 'NX');
  if (!acquired) {
    console.log('Previous instance still running, skipping this execution');
    return;
  }
  try {
    await doActualWork();
  } finally {
    await redis.del(LOCK_KEY);
  }
}
The TTL on the lock is important: if the job crashes without releasing the lock, the TTL ensures it eventually expires and future runs are not permanently blocked.
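If the job only ever runs on one host, you don't need a distributed lock at all: an OS-level file lock gives the same skip-if-running behavior. A sketch assuming Linux or macOS, with an arbitrary lock file path; `try_acquire` is an illustrative name:

```python
import fcntl

def try_acquire(path: str):
    """Return an open file handle holding an exclusive lock,
    or None if another process already holds it."""
    f = open(path, 'w')
    try:
        # Non-blocking exclusive lock: raises BlockingIOError if held elsewhere
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return f
    except BlockingIOError:
        f.close()
        return None

lock = try_acquire('/tmp/nightly-sync.lock')
if lock is None:
    print('Previous instance still running, skipping')
else:
    try:
        pass  # do_actual_work() would go here
    finally:
        lock.close()  # closing the handle releases the flock
```

The kernel releases the lock automatically if the process dies, so no TTL is needed. This is the same idea behind the classic `flock(1)` wrapper in a crontab entry.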
In Python with a similar pattern:
import os

import redis

r = redis.Redis.from_url(os.environ['REDIS_URL'])

def acquire_lock(key, ttl_seconds=1800):
    # SET with nx=True and ex=ttl is atomic: returns None if the key exists
    return r.set(key, '1', nx=True, ex=ttl_seconds)

def release_lock(key):
    r.delete(key)

def nightly_sync():
    if not acquire_lock('job:nightly-sync:lock'):
        print('Previous instance still running, skipping')
        return
    try:
        do_actual_work()
    finally:
        release_lock('job:nightly-sync:lock')
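One subtlety both examples gloss over: if the TTL expires mid-run, a second instance acquires the lock, and the first instance's unconditional delete then removes the second instance's lock. Storing a unique token and deleting only on a match avoids this. A sketch of the logic with an in-memory stand-in for Redis (`FakeRedis` and `run_with_lock` are illustrative names; against real Redis the compare-and-delete must be done atomically, typically via a Lua script):

```python
import uuid

class FakeRedis:
    """In-memory stand-in for Redis, just for this sketch (no TTL expiry)."""
    def __init__(self):
        self.store = {}

    def set_nx(self, key, value):
        if key in self.store:
            return False
        self.store[key] = value
        return True

    def compare_and_delete(self, key, value):
        # Against real Redis: GET + DEL in one atomic Lua script
        if self.store.get(key) == value:
            del self.store[key]
            return True
        return False

r = FakeRedis()

def run_with_lock(key, work):
    token = str(uuid.uuid4())  # unique per instance
    if not r.set_nx(key, token):
        return 'skipped'
    try:
        work()
    finally:
        # Delete the lock only if we still own it: if our TTL had expired
        # and another instance took over, its token won't match ours
        r.compare_and_delete(key, token)
    return 'ran'
```

The token check makes the release a no-op when the lock has changed hands, so a slow first instance can no longer unlock a newer one.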
Detecting overlap with external monitoring
Locking prevents new instances from running when a previous one is active. But it doesn't tell you when overlap was detected and a run was skipped. For production observability, you want to know:
- How often is the job being skipped because the previous run is still active?
- Is the job consistently running longer than its schedule interval?
External monitoring catches this differently. When a monitoring service receives a start ping for a job that already has an active run (a previous start ping with no corresponding success or fail), it records the previous run as overlapped and can alert you.
This is useful even when you have distributed locks in place, because it surfaces a pattern you might otherwise miss: the job is running long enough that overlap would occur if locking weren't in place, which means the job is trending toward a performance problem that will eventually cause issues.
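The detection rule described above is simple to model: a start ping for a job that already has an active run flags the previous run as overlapped. A sketch of the general idea (illustrative only, not Crontify's actual implementation):

```python
class OverlapDetector:
    """Minimal model of overlap detection from start/finish pings."""
    def __init__(self):
        self.active = {}    # job_id -> run id of the currently active run
        self.overlaps = []  # (job_id, run id) pairs flagged as overlapped
        self._counter = 0

    def start_ping(self, job_id):
        if job_id in self.active:
            # A new run started while the previous one never reported a finish
            self.overlaps.append((job_id, self.active[job_id]))
        self._counter += 1
        self.active[job_id] = self._counter

    def finish_ping(self, job_id):
        # Covers both success and fail: either one closes the active run
        self.active.pop(job_id, None)

d = OverlapDetector()
d.start_ping('nightly-sync')
d.finish_ping('nightly-sync')  # clean run: nothing flagged
d.start_ping('nightly-sync')
d.start_ping('nightly-sync')   # second start before a finish -> overlap
print(d.overlaps)              # [('nightly-sync', 2)]
```

Note that the job's own lock-and-skip logic never reaches this code path unless the skipped instance still sends a ping, which is exactly why the monitoring side catches what the lock hides.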
In Crontify, overlapped runs are recorded in the run history with a distinct status. You can see at a glance how often a job is running past its expected window, and set up alerts when overlaps exceed a threshold.
Instrumenting for overlap detection is the same as standard start/success/fail monitoring:
import { CrontifyMonitor } from '@crontify/sdk';

const monitor = new CrontifyMonitor({
  apiKey: process.env.CRONTIFY_API_KEY!,
  monitorId: 'your-monitor-id',
});

await monitor.wrap(async () => {
  await doActualWork();
});
wrap() sends the start ping automatically. If Crontify receives a new start ping while a run is already in the running state, it marks the previous run as overlapped and records the event.
The combination that works
The complete solution for cron job overlap has two parts:
Prevention (in your code): a distributed lock that prevents multiple instances from executing simultaneously. Redis with SET NX PX is the standard approach. Set the lock TTL above the job's worst-case runtime, or a slow-but-healthy run can lose its lock mid-execution.
Detection (in your monitoring): external monitoring that records when a new start ping arrives while a previous run is still active. This surfaces performance degradation trends before they become incidents, and gives you a historical record of how often overlap occurs.
Neither part is sufficient alone. The lock prevents damage but produces no visibility. The monitoring produces visibility but does nothing to prevent the overlap.
Crontify is free for up to 5 monitors — no credit card required.