If users do not enable a service like systemd-time-wait-sync.target
(because they don't want to delay boot for external events, such as an
NTP sync), then timers should still take the the weaker time-set.target
feature into account, so that the clock is at least monotonic.
Hence, order timer units after both of the targets: time-sync.target
*and* time-set.target. That way, the right thing will happen regardless
if people have no NTP server (and thus also no
systemd-time-wait-sync.service or equivalent) or, only have an NTP
server (and no systemd-time-wait-sync.service), or have both.
Ordering after time-set.target is basically "free". The logic it is
backed by should be instant, without communication with the outside
going on. It's useful still so that time servers that implement the
timestamp from /var/ logic can run in later boot.
FixedRandomDelay=yes will use
`siphash24(sd_id128_get_machine() || MANAGER_IS_SYSTEM(m) || getuid() || u->id)`,
where || is concatenation, instead of a random number to choose a value between
0 and RandomizedDelaySec= as the timer delay.
This essentially sets up a fixed, but seemingly random, offset for each timer
iteration rather than having a random offset recalculated each time it fires.
Closes#10355
Co-author: Anita Zhang <the.anitazha@gmail.com>
In 4c2ef32767 we enabled propagating
triggered unit state to the triggering unit for service units in more
load states, so that we don't accidentally stop tracking state
correctly.
Do the same for our other triggering unit states: automounts, paths, and
timers.
Also, make this an assertion rather than a simple test. After all it
should never happen that we get called for half-loaded units or units of
the wrong type. The load routines should already have made this
impossible.
When the RTC time at boot is off in the future by a few days, OnCalendar=
timers will be scheduled based on the time at boot. But if the time has been
adjusted since boot, the timers will end up scheduled way in the future, which
may cause them not to fire as shortly or often as expected.
Update the logic so that the time will be adjusted based on monotonic time.
We do that by calculating the adjusted manager startup realtime from the
monotonic time stored at that time, by comparing that time with the realtime
and monotonic time of the current time.
Added a test case to validate this works as expected. The test case creates a
QEMU virtual machine with the clock 3 days in the future. Then we adjust the
clock back 3 days, and test creating a timer with an OnCalendar= for every 15
minutes. We also check the manager startup timestamp from both `systemd-analyze
dump` and from D-Bus.
Test output without the corresponding code changes that fix the issue:
Timer elapse outside of the expected 20 minute window.
next_elapsed=1594686119
now=1594426921
time_delta=259198
With the code changes in, the test passes as expected.
In all the other cases, I think the code was clearer with the static table.
Here, not so much. And because of the existing dump code, the vtables cannot
be made static and need to remain exported. I still think it's worth to do the
change to have the cmdline introspection, but I'm disappointed with how this
came out.
Similar, refuse triggering deps on units that cannot trigger.
And rework how we ignore After= dependencies on device units, to work
the same way.
See: #14142
No functional change, just adjusting code to follow the same pattern
everywhere. In particular, never call _verify() on an already loaded unit,
but return early from the caller instead. This makes the code a bit easier
to follow.
unit_load_fragment_and_dropin() and unit_load_fragment_and_dropin_optional()
are really the same, with one minor difference in behaviour. Let's drop
the second function.
"_optional" in the name suggests that it's the "dropin" part that is optional.
(Which it is, but in this case, we mean the fragment to be optional.)
I think the new version with a flag is easier to understand.
timer units maintain state on disk (the persistent touch file), hence
let's expose cleaning it up generically with the new cleaning operation
for units.
This is a much simpler implementation as for the service unit type:
instead of forking out a worker process we just remove the touch file
directly. That should be OK since we only need to remove a single
(empty) file, instead of a recursive user-controlled directory tree.
Fixes: #4930
Just some renaming, no change in behaviour.
Background: I'd like to add more functions unit_test_xyz() that test
various things, hence let's streamline the naming a bit.
Fixes#5659.
Currently, if Persistent=true and the machine is off at the scheduled time of the timer unit, the timer
will be triggered immediately at the next boot even if RandomizedDelaySec= is specified.
As a result, if multiple timers meet that condition, they will be triggered at the same time and too
much CPU/IO work makes boot slow down.
With this commit, if the scheduled time of the persistent timer has already elapsed at boot,
set the time when systemd first started as the scheduled time and RandomizedDelaySec= is applied to it.
This allows clients to follow our internal state changes safely.
Previously, quick state changes (for example, when we restart a unit due
to Restart= after it quickly transitioned through DEAD/FAILED states)
would be coalesced into one bus signal event, with this change there's
the guarantee that all state changes after the unit was announced ones
are reflected on th bus.
Note we only do this kind of guaranteed flushing only for unit state
changes, not for other unit property changes, where clients still have
to expect coalescing. This is because the unit state is a very
important, high-level concept.
Fixes: #10185
We keep a mark whether a single-shot timer was triggered in the caller's
variable initial. When such a timer elapses while we are
serializing/deserializing the inner state, we consider the timer
incorrectly as elapsed and don't trigger it later.
This patch exploits last_trigger timestamp that we already serialize,
hence we can eliminate the argument initial completely.
A reproducer for OnBootSec= timers:
cat >repro.c <<EOD
/*
* Compile: gcc repro.c -o repro
* Run: ./repro
*/
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>
int main(int argc, char *argv[]) {
char command[1024];
int pause;
struct timespec now;
while (1) {
usleep(rand() % 200000); // prevent periodic repeats
clock_gettime(CLOCK_MONOTONIC, &now);
printf("%i\n", now.tv_sec);
system("rm -f $PWD/mark");
snprintf(command, 1024, "systemd-run --user --on-boot=%i --timer-property=AccuracySec=100ms "
"touch $PWD/mark", now.tv_sec + 1);
system(command);
system("systemctl --user list-timers");
pause = (1000000000 - now.tv_nsec)/1000 - 70000; // fiddle to hit the middle of reloading
usleep(pause > 0 ? pause : 0);
system("systemctl --user daemon-reload");
sync();
sleep(2);
if (open("./mark", 0) < 0)
if (errno == ENOENT) {
printf("mark file does not exist\n");
break;
}
}
return 0;
}
EOD
There is difference between time set by the user and real elapsed time because of accuracy feature.
If you change the system date(or time) between these times, the timer drops.
You can easily reproduce it with the following command.
-----------------------------------------------------------
$ systemd-run --on-active=3s ls; sleep 3; date -s "`date`"
-----------------------------------------------------------
In the following command, the problem is rarely reproduced. But it exists.
---------------------------------------------------------------------------------------------
$ systemd-run --on-active=3s --timer-property=AccuracySec=1us ls ; sleep 1; date -s "`date`"
---------------------------------------------------------------------------------------------
Note : Global AccuracySec value.
----------------------------------------------------------------------
$ cat /etc/systemd/system.conf
DefaultTimerAccuracySec=1min
----------------------------------------------------------------------
Let's be more careful with what we serialize: let's ensure we never
serialize strings that are longer than LONG_LINE_MAX, so that we know we
can read them back with read_line(…, LONG_LINE_MAX, …) safely.
In order to implement this all serialization functions are move to
serialize.[ch], and internally will do line size checks. We'd rather
skip a serialization line (with a loud warning) than write an overly
long line out. Of course, this is just a second level protection, after
all the data we serialize shouldn't be this long in the first place.
While we are at it also clean up logging: while serializing make sure to
always log about errors immediately. Also, (void)ify all calls we don't
expect errors in (or catch errors as part of the general
fflush_and_check() at the end.
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
This adds a flags parameter to unit_notify() which can be used to pass
additional notification information to the function. We the make the old
reload_failure boolean parameter one of these flags, and then add a new
flag that let's unit_notify() if we are configured to restart the
service.
Note that this adjusts behaviour of systemd to match what the docs say.
Fixes: #8398
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
E.g. if you have a monthly event and you set the computer clock back one
year, we can allow the next 12 monthly events to happen naturally. In fact
we already do this when you start a Persistent=yes timer, we just need to
apply the same logic when it's running and we notice the system clock
being set backwards.
On timejumps, including suspend, timer_time_change() calls for a
re-calculation of the next elapse. Sadly I'm not quite sure what the
intended effect of this was! Because it was not managing to fire
OnCalendar= timers which fired during the suspend... unless the timer had
already fired once before.
Reported, entirely correctly as far as I can see, on stackexchange:
https://unix.stackexchange.com/questions/351829/systemd-timer-that-expired-while-suspended
/* If we know the last time this was
* triggered, schedule the job based relative
- * to that. If we don't just start from
- * now. */
+ * to that. If we don't, just start from
+ * the activation time. */
The same code is called for both the initial calculation and this
re-calculation. If we're _not_ already active, then this is before the
activation time has been recorded in the unit, so just use the current
time as before. The new code is mechanically adapted from the same
logic for `OnActiveSec=` (case TIMER_ACTIVE in the code which follows).
Tested with `date --set`.
Motivations:
* Rotate monitoring data from Atop into files which are named per-day.
Fedora currently implements this with a cron job that runs at midnight,
but that didn't handle suspend correctly either.
* unbound-anchor.timer on Fedora, is used to update DNSSEC "root trust
anchor" daily, before the TTL expires. It uses OnCalendar=daily
AccuracySec=24h. Which is a bit suspect because the TTL is 2 days, but I
think it has the right general idea.
None of the other timer settings are correct, because they would not
account for time spent in suspend. Unless you set WakeSystem
(this feature is currently undocumented).
* So in general, we can expect to see people using OnCalendar= for the same
cases as cron.daily and cron.monthly. Which use anacron to keep track of
jobs which should be run even if the system was down at the time.
Timers which are configured to run more frequently than that, are
unlikely to mind if they get run slightly more often that the writer
realized, relative to the amount of time the system was really running.
* From the user report above: "I only want to use remind to show a desktop
notification, it seems excessive to wake up the computer for that. Also,
I would like to get the reminder first thing in the morning, so the
OnActiveSec doesn't help with that."
We have two variables `b` and `base`. `b` is declared within limited
scope; `base` is declared at the top of the function. However `base`
is actually only used within a scope which is exclusive of `b`. Clarify
by moving `base` inside the limited scope as well.
(Also `base` doesn't need initializing any more than `b` does. The
declaration of `base` is now immediately followed by a case analysis of
`v->base`, which serves almost exclusively to determine the value of
`base`).
When a unit job finishes early (e.g. when fork(2) fails) triggered unit goes
through states
stopped->failed (or failed->failed),
in case a ExecStart= command fails unit passes through
stopped->starting->failed.
The former transition doesn't result in unit active/inactive timestamp being
updated and timer (OnUnitActiveSec= or OnUnitInactiveSec=) would use an expired
timestamp triggering immediately again (repeatedly).
This patch exploits timer's last trigger timestamp to ensure the timer isn't
triggered more frequently than OnUnitActiveSec=/OnUnitInactiveSec= period.
Steps to reproduce:
0) Create sample units:
cat >~/.config/systemd/user/looper.service <<EOD
[Service]
ExecStart=/usr/bin/sleep 2
EOD
cat >~/.config/systemd/user/looper.timer <<EOD
[Timer]
AccuracySec=5
OnUnitActiveSec=5
EOD
1) systemctl --user daemon-reload
2) systemctl --user start looper.timer
# to have first activation timestamp/sentinel
systemctl --user start looper.service
o Observe the service is being regularly triggered.
3) systemctl set-property user@$UID.service TasksMax=2
o Observe the tight looping as long as the looper.service cannot be started.
Ref: #5969
This makes things quite a bit more systematic I think, as we can
systematically operate on all timestamps, for example for the purpose of
serialization/deserialization.
This rework doesn't necessarily make things shorter in the individual
lines, but it does reduce the line count a bit.
(This is useful particularly when we want to add additional timestamps,
for example to solve #7023)
This replaces the dependencies Set* objects by Hashmap* objects, where
the key is the depending Unit, and the value is a bitmask encoding why
the specific dependency was created.
The bitmask contains a number of different, defined bits, that indicate
why dependencies exist, for example whether they are created due to
explicitly configured deps in files, by udev rules or implicitly.
Note that memory usage is not increased by this change, even though we
store more information, as we manage to encode the bit mask inside the
value pointer each Hashmap entry contains.
Why this all? When we know how a dependency came to be, we can update
dependencies correctly when a configuration source changes but others
are left unaltered. Specifically:
1. We can fix UDEV_WANTS dependency generation: so far we kept adding
dependencies configured that way, but if a device lost such a
dependency we couldn't them again as there was no scheme for removing
of dependencies in place.
2. We can implement "pin-pointed" reload of unit files. If we know what
dependencies were created as result of configuration in a unit file,
then we know what to flush out when we want to reload it.
3. It's useful for debugging: "systemd-analyze dump" now shows
this information, helping substantially with understanding how
systemd's dependency tree came to be the way it came to be.