Upon an incoming connection for an accepting socket, we'd create a unit like
foo@0.service, then figure out that the instance name should be e.g. "0-41-0",
and then add the name foo@0-41-0.service to the unit. This obviously violates
the rule that any service needs to have a constance instance part.
So let's reverse the order: we first determine the instance name and then
create the unit with the correct name from the start.
There are two cases where we don't know the instance name:
- analyze-verify: we just do a quick check that the instance unit can be
created. So let's use a bogus instance string.
- selinux: the code wants to load the service unit to extract the ExecStart path
and query it for the selinux label. Do the same as above.
Note that in both cases it is possible that the real unit that is loaded could
be different than the one with the bogus instance value, for example if there
is a dropin for a specific instance name. We can't do much about this, since we
can't figure out the instance name in advance. The old code had the same
shortcoming.
Only log at LOG_INFO level, i.e. make this informational. During start
let's leave it at LOG_WARNING though.
Of course, it's ugly leaving processes around like that either in start
or in stop, but at start its more dangerous than on stop, so be tougher
there.
In all the other cases, I think the code was clearer with the static table.
Here, not so much. And because of the existing dump code, the vtables cannot
be made static and need to remain exported. I still think it's worth to do the
change to have the cmdline introspection, but I'm disappointed with how this
came out.
In subsequent commits, calls to if_nametoindex() will be replaced by a wrapper
that falls back to alternative name resolution over netlink. netlink support
requires libsystemd (for sd-netlink), and we don't want to add any functions
that require netlink in basic/. So stuff that calls if_nametoindex() for user
supplied interface names, and everything that depends on that, needs to be
moved.
It makes sense to filter state changes for some load states that
shouldn't happen, but the common cases should be accepted, because they
might happen during runtime when "systemctl daemon-reload" is issued and
unit files changed state in between. Otherwise we lose events.
Fixes: #4708
Similar, refuse triggering deps on units that cannot trigger.
And rework how we ignore After= dependencies on device units, to work
the same way.
See: #14142
chase_symlinks() would return negative on error, and either a non-negative status
or a non-negative fd when CHASE_OPEN was given. This made the interface quite
complicated, because dependning on the flags used, we would get two different
"types" of return object. Coverity was always confused by this, and flagged
every use of chase_symlinks() without CHASE_OPEN as a resource leak (because it
would this that an fd is returned). This patch uses a saparate output parameter,
so there is no confusion.
(I think it is OK to have functions which return either an error or an fd. It's
only returning *either* an fd or a non-fd that is confusing.)
No functional change, just adjusting code to follow the same pattern
everywhere. In particular, never call _verify() on an already loaded unit,
but return early from the caller instead. This makes the code a bit easier
to follow.
unit_load_fragment_and_dropin() and unit_load_fragment_and_dropin_optional()
are really the same, with one minor difference in behaviour. Let's drop
the second function.
"_optional" in the name suggests that it's the "dropin" part that is optional.
(Which it is, but in this case, we mean the fragment to be optional.)
I think the new version with a flag is easier to understand.
This is the most basic consumer of the new systemd-vs-kernel checker,
both acting as a reasonable standalone exerciser of the code, and also
as a way for easy inspection of deviations from systemd internal state.
v2:
- if RestartKillSignal= is not specified, fall back to KillSignal=. This is necessary
to preserve backwards compatibility (and keep KillSignal= generally useful).
Some PIDs can remain in the watched list even though their processes have
exited since a long time. It can easily happen if the main process of a forking
service manages to spawn a child before the control process exits for example.
However when a pid is about to be mapped to a unit by calling unit_watch_pid(),
the caller usually knows if the pid should belong to this unit exclusively: if
we just forked() off a child, then we can be sure that its PID is otherwise
unused. In this case we take this opportunity to remove any stalled PIDs from
the watched process list.
If we learnt about a PID in any other form (for example via PID file, via
searching, MAINPID= and so on), then we can't assume anything.
Just some renaming, no change in behaviour.
Background: I'd like to add more functions unit_test_xyz() that test
various things, hence let's streamline the naming a bit.
Similar to the cgroup magic we nowadays do when listening to sockets, to
assign them the right bpf programs, let's also do the same and join the
specified netns in the child process.
This allows people to listen in sockets in specific namespaces, or join
multiple services and socket units together to live in the same
namespace.
This allows clients to follow our internal state changes safely.
Previously, quick state changes (for example, when we restart a unit due
to Restart= after it quickly transitioned through DEAD/FAILED states)
would be coalesced into one bus signal event, with this change there's
the guarantee that all state changes after the unit was announced ones
are reflected on th bus.
Note we only do this kind of guaranteed flushing only for unit state
changes, not for other unit property changes, where clients still have
to expect coalescing. This is because the unit state is a very
important, high-level concept.
Fixes: #10185
This splits the "environment" field of Manager into two:
transient_environment and client_environment. The former is generated
from configuration file, kernel cmdline, environment generators. The
latter is the one the user can control with "systemctl set-environment"
and similar.
Both sets are merged transparently whenever needed. Separating the two
sets has the benefit that we can safely flush out the former while
keeping the latter during daemon reload cycles, so that env var settings
from env generators or configuration files do not accumulate, but
dynamic API changes are kept around.
Note that this change is not entirely transparent to users: if the user
first uses "set-environment" to override a transient variable, and then
uses "unset-environment" to unset it again things will revert to the
original transient variable now, while previously the variable was fully
removed. This change in behaviour should not matter too much though I
figure.
Fixes: #9972
Let's be more careful with what we serialize: let's ensure we never
serialize strings that are longer than LONG_LINE_MAX, so that we know we
can read them back with read_line(…, LONG_LINE_MAX, …) safely.
In order to implement this all serialization functions are move to
serialize.[ch], and internally will do line size checks. We'd rather
skip a serialization line (with a loud warning) than write an overly
long line out. Of course, this is just a second level protection, after
all the data we serialize shouldn't be this long in the first place.
While we are at it also clean up logging: while serializing make sure to
always log about errors immediately. Also, (void)ify all calls we don't
expect errors in (or catch errors as part of the general
fflush_and_check() at the end.
All over the place we define local variables for the various sockopts
that take a bool-like "int" value. Sometimes they are const, sometimes
static, sometimes both, sometimes neither.
Let's clean this up, introduce a common const variable "const_int_one"
(as well as one matching "const_int_zero") and use it everywhere, all
acorss the codebase.