Bug 1080490 - /usr/lib/cron/run-crons miscalculates cron.daily triggers
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Basesystem
Version: Current
Hardware: Other Other
Priority: P5 - None
Severity: Normal
Target Milestone: ---
QA Contact: E-mail List
Reported: 2018-02-11 20:38 UTC by Achim Gratz
Modified: 2021-09-16 07:08 UTC (History)

Attachments
proposed patch (1.64 KB, patch)
2018-12-09 13:46 UTC, Achim Gratz

Description Achim Gratz 2018-02-11 20:38:01 UTC
The part of the script that tries to figure out if it should run the cron.daily scripts miscalculates the condition for find.  Instead of finding scripts that were not run yet today, it looks for scripts that were not run in the last 24 hours.  If for whatever reason your cron.daily was run 23:59 yesterday, it will not run today.

The correct way to deal with this is to either calculate how many minutes we are into the current day and run anything older than that again (similar to what cron.monthly is doing) or drop a cookie with 'touch -d "today 0:00" cookie' and run if not newer than that (TIME="-not -newer cookie").  Making that safe using mkstemp is left as an exercise for the reader.
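The two options above can be sketched roughly as follows. This is only an illustration, assuming GNU date, touch and find; the function names are made up for this example and do not come from run-crons. Note that -newer compares modification times, whereas run-crons itself keys on ctime.

```shell
# Option 1: minutes elapsed since local midnight, usable as a bound for find -cmin.
minutes_into_day() {
    local now midnight
    now=$(date +%s)
    midnight=$(date -d 'today 0:00' +%s)
    echo $(( (now - midnight) / 60 ))
}

# Option 2: a reference file ("cookie") stamped with today's midnight.
# mktemp avoids a predictable file name in a shared temp directory.
make_midnight_cookie() {
    local cookie
    cookie=$(mktemp) || return 1
    touch -d 'today 0:00' "$cookie"
    echo "$cookie"
}

# Succeeds when the stamp file is missing or has not been touched since
# midnight, i.e. the job has not run yet today.
needs_daily_run() {
    local stamp="$1" cookie result
    [ -e "$stamp" ] || return 0
    cookie=$(make_midnight_cookie) || return 1
    result=$(find "$stamp" -maxdepth 0 ! -newer "$cookie")
    rm -f "$cookie"
    [ -n "$result" ]
}
```

With this, a stamp last touched at 23:59 yesterday counts as "not run today", which is the case the current 24-hour window gets wrong.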

The cron.weekly section suffers from the same problem, but it's probably hard to notice.

The cron.monthly section would probably get easier with the cookie file method as well.  It is currently also making the mistake of dragging the current time of day into the result.
Comment 1 Achim Gratz 2018-02-12 18:43:03 UTC
Come to think of it, the beginning of the day should probably be computed in UTC, as that is for instance what ntpd uses to roll-over its statistics on.  A configurable offset to that would probably be nice, too.  Using UTC has the additional advantage that the DST nonsense never enters the picture.
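A rough sketch of a UTC day boundary with a configurable offset, assuming GNU date. CRON_USE_UTC and CRON_DAY_OFFSET_MIN are hypothetical variable names chosen for this example; nothing in the thread defines them.

```shell
# Hypothetical settings, named here only for illustration.
CRON_USE_UTC=${CRON_USE_UTC:-yes}               # "yes": day boundaries in UTC
CRON_DAY_OFFSET_MIN=${CRON_DAY_OFFSET_MIN:-0}   # shift the boundary by N minutes

# Print the epoch second at which the current "cron day" started.
day_start_epoch() {
    local start now
    if [ "$CRON_USE_UTC" = yes ]; then
        start=$(date -u -d 'today 0:00' +%s)
    else
        start=$(date -d 'today 0:00' +%s)
    fi
    start=$(( start + CRON_DAY_OFFSET_MIN * 60 ))
    now=$(date +%s)
    # Before today's shifted boundary we are still in the previous period.
    # Stepping back 86400 s is exact in UTC and an approximation in local
    # time on DST change days.
    [ "$start" -le "$now" ] || start=$(( start - 86400 ))
    echo "$start"
}
```

In UTC mode the result is always a whole multiple of 86400 seconds plus the offset, so DST never shifts it.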
Comment 2 Achim Gratz 2018-03-13 19:13:16 UTC
I went for this incantation in /usr/lib/cron/run-crons:

                        # run as usual   
                        else
                          TIME="-cmin +$((1 + ($($DATE +%s)-$($DATE -d today' '${CRON_DAILY_NOT_BEFORE:-04:00} +%s))/60))" 
                        fi ;;
          cron.weekly)  TIME="-cmin +10080 -or -cmin 10080"  ;;

DATE is defined so that it can be switched to use UTC or the local timezone by a sysconfig variable and CRON_DAILY_NOT_BEFORE should also be a sysconfig variable.
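For readers who want to try the arithmetic outside the script, here is the same computation split into steps. It assumes GNU date; CRON_USE_UTC is a hypothetical name for the sysconfig switch, since the thread does not name it, and daily_find_args is likewise made up for this sketch.

```shell
# Print the find(1) arguments for cron.daily, using the same arithmetic as
# the TIME= line above.
daily_find_args() {
    local date_cmd now not_before
    # Hypothetical switch between UTC and local time.
    if [ "${CRON_USE_UTC:-no}" = yes ]; then
        date_cmd="date -u"
    else
        date_cmd="date"
    fi
    now=$($date_cmd +%s)
    not_before=$($date_cmd -d "today ${CRON_DAILY_NOT_BEFORE:-04:00}" +%s)
    # Minutes since today's threshold, plus one. The value is negative when
    # called before the threshold has been reached.
    echo "-cmin +$(( 1 + (now - not_before) / 60 ))"
}
```

Calling it with CRON_DAILY_NOT_BEFORE=00:00 and CRON_USE_UTC=yes yields a bound between +1 and +1440, depending on the time of day.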
Comment 3 Achim Gratz 2018-03-13 19:26:37 UTC
One more comment on that script:

Stuff like "-cmin +10080 -or -cmin 10080" is uselessly taking up space, mental capacity and runtime to figure out what it does, I'd just write "-cmin +10079" or "-cmin +$((10080-1))".
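A quick way to compare the two spellings on scratch files, assuming GNU touch and find. Since ctime cannot be set backwards from userspace, -mmin stands in for -cmin here; the file names are made up, and -o is used as the portable form of -or. Behaviour for a file sitting exactly on the minute boundary depends on how find rounds and is not exercised by this sketch.

```shell
WEEK_MIN=$(( 7 * 24 * 60 ))                       # 10080
OLD_FORM="-mmin +$WEEK_MIN -o -mmin $WEEK_MIN"    # original spelling
NEW_FORM="-mmin +$(( WEEK_MIN - 1 ))"             # suggested spelling

dir=$(mktemp -d)
touch -d '10000 minutes ago' "$dir/recent"        # younger than a week
touch -d '10100 minutes ago' "$dir/stale"         # older than a week

# The unquoted variables are split into words on purpose, as in run-crons.
old_hits=$(find "$dir" -type f \( $OLD_FORM \) | sort)
new_hits=$(find "$dir" -type f $NEW_FORM | sort)

echo "old form: $old_hits"
echo "new form: $new_hits"
rm -rf "$dir"
```

Both forms should select only the file that is older than a week.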
Comment 4 Achim Gratz 2018-12-09 13:46:59 UTC
Created attachment 792240 [details]
proposed patch

So here's what I've been using for almost a year now in the hope that it'll get things finally moving towards fixing the problem.
Comment 5 Achim Gratz 2019-10-13 07:42:50 UTC
(In reply to Achim Gratz from comment #4)
> Created attachment 792240 [details]
> proposed patch
> 
> So here's what I've been using for almost a year now in the hope that it'll
> get things finally moving towards fixing the problem.

Can this be fixed?  I'm getting tired of having to re-patch the file each time an update changes it back to the buggy version.
Comment 6 Achim Gratz 2020-06-15 18:46:58 UTC
(In reply to Achim Gratz from comment #5)
> Can this be fixed?  I'm getting tired of having to re-patch the file each
> time an update changes it back to the buggy version.

Ping?
Comment 7 Petr Gajdos 2020-06-19 11:31:19 UTC
Hello Achim,

thanks for the patch. I am sorry, I am quite new to this topic, so I will perhaps have basic questions.

(1) For what reason do you actually need the CRON_DAILY_NOT_BEFORE feature?

(2) What happens when run-crons script is invoked more than one minute before 
    ${CRON_DAILY_NOT_BEFORE:-04:00}?

(3) Do I understand correctly that you want to achieve (without 
    CRON_DAILY_NOT_BEFORE implementation):

      find -daystart -ctime 0

(In reply to Achim Gratz from comment #0)
> The part of the script that tries to figure out if it should run the
> cron.daily scripts miscalculates the condition for find.  Instead of finding
> scripts that were not run yet today, it looks for scripts that were not run
> in the last 24 hours.  If for whatever reason your cron.daily was run 23:59
> yesterday, it will not run today.

(4) In any case, in my opinion this behavioral change seems important enough to be discussed on opensuse-factory@. Could you please raise it there?
Comment 8 Petr Gajdos 2020-06-19 12:52:30 UTC
(In reply to Petr Gajdos from comment #7)
>       find -daystart -ctime 0

        find -daystart -ctime +0

Apologies for the typo.
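The effect of -daystart with +0 can be checked on scratch files. This is a sketch assuming GNU find and touch; -mtime stands in for -ctime because ctime cannot be backdated, and the file names are invented for the example.

```shell
dir=$(mktemp -d)
touch -d 'yesterday 23:59' "$dir/ran-yesterday"
touch "$dir/ran-today"

# -daystart only affects tests that follow it on the command line.
# With it, +0 selects files last modified before the start of today.
pending=$(find "$dir" -type f -daystart -mtime +0)
echo "$pending"
rm -rf "$dir"
```

Only the file stamped at 23:59 yesterday is reported. GNU find evaluates -daystart in the local time zone.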
Comment 9 Achim Gratz 2020-06-20 06:42:05 UTC
(In reply to Petr Gajdos from comment #7)
> (1) For what reason do you actually need the CRON_DAILY_NOT_BEFORE feature?

It allows you to ensure that any potentially CPU intensive cronjobs don't all start at the same time when the machine is continuously running.  This somewhat mimics the delay specifications from anacron.

> (2) What happens when run-crons script is invoked more than one minute
> before 
>     ${CRON_DAILY_NOT_BEFORE:-04:00}?

It will not trigger then, but only the next time at or after 04:00 UTC.

> (3) Do I understand correctly that you want to achieve (without 
>     CRON_DAILY_NOT_BEFORE implementation):
> 
>       find -daystart -ctime 0

I haven't tested that option, but provided it can be made to work in UTC it should indeed fix the original problem of logrotation not getting started depending on what the two previous boot times were.  It doesn't solve the problem of the same job potentially getting started twice in quick succession if I happen to boot the machine just before midnight.

> (4) In any case, in my opinion this behavioral change seems important
> enough to be discussed on opensuse-factory@. Could you please raise it there?

The script obviously assumes a machine that occasionally sleeps but otherwise gets used at regular times and completely misses the possibility that it might get started at irregular times on different days as well.  So that's clearly a bug that needs fixing.  It is not directly documented what the original specification of this script was, but generally it's obviously supposed to run a cron job that was prevented from running at whatever would have been the scheduled time if the machine was running 24/7.  The sensible way to put it would be "run it as early as possible within the time period specified, but not more frequently than once each period".  That potentially bunches a lot of jobs to all run simultaneously when rebooting after an extended downtime, and due to the way the timestamping currently works this will perpetuate if the machine keeps running, unless you introduce some mechanism to shift the schedule back on what would have been the original cron grid.
Comment 10 Petr Gajdos 2020-06-25 12:17:54 UTC
(In reply to Achim Gratz from comment #9)
> (In reply to Petr Gajdos from comment #7)
> > (1) For what reason do you actually need the CRON_DAILY_NOT_BEFORE feature?
> 
> It allows you to ensure that any potentially CPU intensive cronjobs don't
> all start at the same time when the machine is continuously running.  This
> somewhat mimics the delay specifications from anacron.

Which cronjobs do you want to separate?
 
> > (2) What happens when run-crons script is invoked more than one minute
> > before 
> >     ${CRON_DAILY_NOT_BEFORE:-04:00}?
> 
> It will not trigger then, but only the next time at or after 04:00 UTC.

# date
Wed Jun 24 11:44:31 CEST 2020
# TIME="-cmin +$((1 + ($(date +%s)-$(date -d 'today 13:00' +%s))/60))"
# echo $TIME
-cmin +-74
# touch /var/spool/cron/lastrun/cron.daily
# find /var/spool/cron/lastrun/cron.daily -cmin +-74
/var/spool/cron/lastrun/cron.daily
#

Hmm, if that sounds correct to you, then I am still misunderstanding the concept.
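The transcript above shows the bound going negative before the threshold, at which point find matches a freshly touched stamp. One possible guard, sketched under the assumption of GNU date, is to skip the check entirely until the threshold has passed. The function name and the optional "now" argument (there only to make the sketch testable) are hypothetical.

```shell
# Print find(1) arguments for cron.daily, or fail with no output when
# today's threshold has not been reached yet.
# $1 (optional): current time as epoch seconds; defaults to the real time.
daily_find_args_guarded() {
    local now not_before
    now=${1:-$(date +%s)}
    not_before=$(date -d "today ${CRON_DAILY_NOT_BEFORE:-04:00}" +%s)
    if [ "$now" -lt "$not_before" ]; then
        return 1    # too early today: the caller should skip cron.daily
    fi
    echo "-cmin +$(( 1 + (now - not_before) / 60 ))"
}
```

A caller could then do something like `if TIME=$(daily_find_args_guarded); then ...; fi` and never hand a negative value to find.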

> > (4) In any case, in my opinion this behavioral change seems important
> > enough to be discussed on opensuse-factory@. Could you please raise it there?
> 
> The script obviously assumes a machine that occasionally sleeps but
> otherwise gets used at regular times and completely misses the possibility
> that it might get started at irregular times on different days as well.  So
> that's clearly a bug that needs fixing.  It is not directly documented what
> the original specification of this script was, but generally it's obviously
> supposed to run a cron job that was prevented from running at whatever would
> have been the scheduled time if the machine was running 24/7.  The sensible
> way to put it would be "run it as early as possible within the time
> period specified, but not more frequently than once each period".  That
> potentially bunches a lot of jobs to all run simultaneously when rebooting
> after an extended downtime and due to the way the timestamping currently
> works this will perpetuate if the machine keeps running unless you introduce
> some mechanism to shift the schedule back on what would have been the
> original cron grid.

Perhaps time to think about run-parts? Like:
https://github.com/cronie-crond/crontabs
https://superuser.com/questions/402781/what-is-run-parts-in-etc-crontab-and-how-do-i-use-it
A few features are missing, though.
Comment 11 Kristyna Streitova 2020-06-25 12:48:36 UTC
(In reply to Petr Gajdos from comment #10)

> Perhaps time to think about run-parts? Like:
> https://github.com/cronie-crond/crontabs
> https://superuser.com/questions/402781/what-is-run-parts-in-etc-crontab-and-
> how-do-i-use-it
> A few features are missing, though.

I agree. The run-crons script has been here for more than 20 years without bigger changes. Almost no one wanted to touch it as it's rather complex, with a lot of magic constants, strange undocumented behaviour and so on. Also, being dependent on ctime brings lots of trouble (see e.g. bug#972984, bug#980873), as antivirus or backup software can change ctime during scanning/backup, which breaks the correct functioning of run-crons.

Upstream (which is Red Hat now) and other distros who use cronie (e.g. Fedora, Arch, Gentoo) use run-parts. It would make sense to try to be as close to the upstream as possible and provide unified behaviour that users know from other distros.
Comment 12 Danilo Spinella 2021-08-30 08:18:28 UTC
run-parts is now available in openSUSE TW, provided by the debianutils package. Using run-parts fixes this issue as it doesn't rely on ctime.

To use run-parts in cron, in file /etc/crontab, replace the line containing run-crons with the following:

@hourly         root      run-parts /etc/cron.hourly
@daily          root      run-parts /etc/cron.daily
@weekly         root      run-parts /etc/cron.weekly
@monthly        root      run-parts /etc/cron.monthly
Comment 13 Achim Gratz 2021-09-15 17:11:10 UTC
The suggested change doesn't actually solve the problem, since run-parts simply addresses "run all the scripts that are found in <directory>" and nothing else.

1. Just using run-parts reintroduces the problem that all these jobs get run at fixed times (or rather not, since usually the computer isn't running at the time it's scheduled in crontab).

2. To address the original problem you need to use anacron, which makes it necessary to modify crontab like this:

@hourly         root      run-parts /etc/cron.hourly
@daily          root      test -x /usr/sbin/anacron || run-parts /etc/cron.daily
@weekly         root      test -x /usr/sbin/anacron || run-parts /etc/cron.weekly
@monthly        root      test -x /usr/sbin/anacron || run-parts /etc/cron.monthly

The start of anacron is triggered in cron.hourly, which can create pretty big delays (the original frequency of checking run-crons was every fifteen minutes).  I suggest anacron should be started via @reboot and not in cron.daily.
Comment 14 Danilo Spinella 2021-09-16 07:08:00 UTC
From anacron(8):

> Anacron is used to execute commands periodically, with a
> frequency specified in days.  Unlike cron(8), it does not assume
> that the machine is running continuously.  Hence, it can be used
> on machines that are not running 24 hours a day to control
> regular jobs as daily, weekly, and monthly jobs.

cron is for machines continuously running, anacron is for the other ones.

In your case, if you need to run scripts in /etc/cron.hourly, use /etc/anacrontab with run-parts and remove the entry in /etc/crontab. anacrontab(5) shows a good starting point.
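As a starting point, the following sketch writes a candidate anacrontab to a scratch file for review before anything is installed. ANACRONTAB_DRAFT is a hypothetical variable used only here, and the delay values are illustrative. As the man page excerpt above says, anacron periods are specified in days, so only the daily, weekly and monthly directories are listed.

```shell
# Write a draft anacrontab to a scratch file; nothing under /etc is touched.
ANACRONTAB_DRAFT=${ANACRONTAB_DRAFT:-$(mktemp)}

cat > "$ANACRONTAB_DRAFT" <<'EOF'
SHELL=/bin/sh
PATH=/sbin:/bin:/usr/sbin:/usr/bin
# period(days)  delay(min)  job-identifier  command
1               5           cron.daily      run-parts /etc/cron.daily
7               25          cron.weekly     run-parts /etc/cron.weekly
@monthly        45          cron.monthly    run-parts /etc/cron.monthly
EOF

echo "draft written to $ANACRONTAB_DRAFT"
```

After review, the draft could be copied to /etc/anacrontab and the corresponding run-parts lines removed from /etc/crontab, as suggested above.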