This adds some paranoia code that moves some of the fds we allocate for
longer periods of times to fds > 2 if they are allocated below this
boundary. This is a paranoid safety thing, in order to avoid that
external code might end up erroneously use our fds under the assumption
they were valid stdin/stdout/stderr. Think: some app closes
stdin/stdout/stderr and then invokes 'fprintf(stderr, …' which causes
writes on our fds.
This both adds the helper to do the moving as well as ports over a
number of users to this new logic. Since we don't want to litter all our
code with invocations of this I tried to strictly focus on fds we keep
open for long periods of times only and only in code that is frequently
loaded into foreign programs (under the assumptions that in our own
codebase we are smart enough to always keep stdin/stdout/stderr
allocated to avoid this pitfall). Specifically this means all code used
by NSS and our sd-xyz API:
1. our logging APIs
2. sd-event
3. sd-bus
4. sd-resolve
5. sd-netlink
This changed was inspired by this:
https://github.com/systemd/systemd/issues/8075#issuecomment-363689755
This shows that apparently IRL there are programs that do close
stdin/stdout/stderr, and we should accomodate for that.
Note that this won't fix any bugs, this just makes sure that buggy
programs are less likely to interfere with out own code.
It is often the case that a file descriptor and its corresponding IO
sd_event_source share a life span. When this is the case, developers will
have to unref the event source and close the file descriptor. Instead, we
can just have the event source take ownership of the file descriptor and
close it when the event source is freed. This is especially useful when
combined with cleanup attributes and sd_event_source_unrefp().
This patch adds two new public functions:
sd_event_source_get_io_fd_own()
sd_event_source_set_io_fd_own()
Currently, sd-event supports the ability to have a thread-local default
event loop. However, this is less useful than it can be since all functions
which require an sd_event* as input require the caller to pass it. This
patch adds a new macro which allows the developer to pass a constant
SD_EVENT_DEFAULT instead. This reduces work for the caller.
For example:
r = sd_event_default(&e);
r = sd_event_add_io(e, ...);
sd_event_unref(e);
Becomes:
r = sd_event_add_io(SD_EVENT_DEFAULT, ...);
If no thread-local default event loop exists, the function calls will
return -ENOPKG. No event loop will ever be implicitly created.
log.h really should only include the bare minimum of other headers, as
it is really pulled into pretty much everything else and already in
itself one of the most basic pieces of code we have.
Let's hence drop inclusion of:
1. sd-id128.h because it's entirely unneeded in current log.h
2. errno.h, dito.
3. sys/signalfd.h which we can replace by a simple struct forward
declaration
4. process-util.h which was needed for getpid_cached() which we now hide
in a funciton log_emergency_level() instead, which nicely abstracts
the details away.
5. sys/socket.h which was needed for struct iovec, but a simple struct
forward declaration suffices for that too.
Ultimately this actually makes our source tree larger (since users of
the functionality above must now include it themselves, log.h won't do
that for them), but I think it helps to untangle our web of includes a
tiny bit.
(Background: I'd like to isolate the generic bits of src/basic/ enough
so that we can do a git submodule import into casync for it)
This moves pretty much all uses of getpid() over to getpid_raw(). I
didn't specifically check whether the optimization is worth it for each
replacement, but in order to keep things simple and systematic I
switched over everything at once.
The code make the following assertion: when freeing a event loop object
(usually it's done after exiting from the main event loop), no signal events
are still queued and are pending.
This assertion can be found in event_unmask_signal_data() with
"assert(!d->current);" assertion.
It appears that this assertion can be wrong at least in a specific case
described below.
Consider the following example which is inspired from udev: a process defines 3
source events: 2 are created by sd_event_add_signal() and 1 is created by
sd_event_add_post().
1. the process receives the 2 signals consecutively so that signal 'A' source
event is queued and pending. Consequently the post source event is also
queued and pending. This is done by sd_event_wait().
2. The callback for signal 'A' is called by sd_event_dispatch().
3. The next call to sd_event_wait() will queue signal 'B' source event.
4. The callback for the post source event is called and calls sd_event_exit().
5. the event loop is exited.
6. freeing the event loop object will lead to the assertion failure in
event_unmask_signal_data().
This patch simply removes this assertion as it doesn't seem to be a
bug if the signal data still reference a signal source at this point.
If a callback of an event source returns an error, then the event source
might already be half-destroyed, if the callback dropped all refs.
Hence, don't assume that the type is still valid, and save it before we
issue the callback.
This extends the existing event loop iteration counter to 64bit, and exposes it
via a new function sd_event_get_iteration(). This is helpful for cases like
issue #3612. After all, since we maintain the counter anyway, we might as well
expose it.
(This also fixes an unrelated issue in the man page for sd_event_wait() where
micro and milliseconds got mixed up)
Before we invoke now(CLOCK_BOOTTIME), let's make sure we actually have that
clock, since now() will otherwise hit an assert.
Specifically, let's refuse CLOCK_BOOTTIME early in sd-event if the kernel
doesn't actually support it.
This is a follow-up for #3037, and specifically:
https://github.com/systemd/systemd/pull/3037#issuecomment-210199167
It was added in 2.6.39, and causes an assertion to fail when running in mock
hosted on 2.6.32-based RHEL-6:
Assertion 'clock_gettime(map_clock_id(clock_id), &ts) == 0' failed at systemd/src/basic/time-util.c:70, function now(). Aborting.
Throughout the tree there's spurious use of spaces separating ++ and --
operators from their respective operands. Make ++ and -- operator
consistent with the majority of existing uses; discard the spaces.
This should simplify handling of time events in clients and is in-line with the USEC_INFINITY macro we already have.
This way setting a timeout to 0 indicates "elapse immediately", and a timeout of USEC_INFINITY "elapse never".
sd_event_now() is a public function, so we must check all
arguments for validity. Update man page and add tests.
Sample debug message:
Assertion 'IN_SET(clock, CLOCK_REALTIME, CLOCK_REALTIME_ALARM, CLOCK_MONOTONIC, CLOCK_BOOTTIME, CLOCK_BOOTTIME_ALARM)' failed at src/libsystemd/sd-event/sd-event.c:2719, function sd_event_now(). Ignoring.
Set SD_EVENT_PROFILE_DELAYS to activate accounting and periodic logging
of the distribution of delays between sd_event_run() calls.
Time spent in dispatching as well as time spent outside of
sd_event_run() is measured and accounted for. Every 5 seconds a
logarithmic histogram loop iteration delays since 5 seconds previous is
logged.
This is useful in identifying the frequency and magnitude of latencies
affecting the event loop, which should be kept to a minimum.
Printing the pointer variable really doesn't help, so drop that.
Instead, add a string lookup table for the EventSourceType enum, and print
the type of event source in case of errors.
GLIB has recently started to officially support the gcc cleanup
attribute in its public API, hence let's do the same for our APIs.
With this patch we'll define an xyz_unrefp() call for each public
xyz_unref() call, to make it easy to use inside a
__attribute__((cleanup())) expression. Then, all code is ported over to
make use of this.
The new calls are also documented in the man pages, with examples how to
use them (well, I only added docs where the _unref() call itself already
had docs, and the examples, only cover sd_bus_unrefp() and
sd_event_unrefp()).
This also renames sd_lldp_free() to sd_lldp_unref(), since that's how we
tend to call our destructors these days.
Note that this defines no public macro that wraps gcc's attribute and
makes it easier to use. While I think it's our duty in the library to
make our stuff easy to use, I figure it's not our duty to make gcc's own
features easy to use on its own. Most likely, client code which wants to
make use of this should define its own:
#define _cleanup_(function) __attribute__((cleanup(function)))
Or similar, to make the gcc feature easier to use.
Making this logic public has the benefit that we can remove three header
files whose only purpose was to define these functions internally.
See #2008.
We already have a state RUNNING and EXITING when we dispatch regular and
exit callbacks. Let's introduce a new state called PREPARING that is
active while we invoke preparation callbacks. This way we have a state
each for all three kinds of event handlers.
The states are currently not documented, hence let's add a new state to
the end, before we start documenting this.
Let's make _ref() calls happy when NULL is passed to them, and simply
return NULL without any assertion logic. This makes them nicely
symmetric to the _unref() calls which also are happy to take NULL and
become NOPs then.
There are more than enough calls doing string manipulations to deserve
its own files, hence do something about it.
This patch also sorts the #include blocks of all files that needed to be
updated, according to the sorting suggestions from CODING_STYLE. Since
pretty much every file needs our string manipulation functions this
effectively means that most files have sorted #include blocks now.
Also touches a few unrelated include files.
pthread APIs (unlike the rest of libc) return their errors as positive
error codes directly from the functions, rather than using errno. Let's
make sure we always handle things that way.
Currently, we guarantee that if two event-sources with the same priority
fire at the same time, they're always dispatched in the same order. While
this might sound nice in theory, there's is little benefit in providing
stability on that level. We have no control over the order the events are
reported, hence, we cannot guarantee that we get notified about both at
the same time.
By dropping the stability guarantee, we loose roughly 10% Heap swaps in
the prioq on a desktop cold-boot. Krzysztof Kotlenga even reported up to
20% on his tests. This sounds worth optimizing, so drop the stability
guarantee.
We should never access the "signal" part of the event source unless the
event source is actually for a signal. In this case it's a child pid
handler however, hence make sure to use the right signal.
This is a fix for PR #1177, which in turn was a fix for
9da4cb2be2.
RT signals operate in a queue, and we should be careful to never merge
two queued signals into one. Hence, makes sure we only ever dequeue a
single signal at a time and leave the remaining ones queued in the
signalfd. In order to implement correct priorities for the signals
introduce one signalfd per priority, so that we only process the highest
priority signal at a time.
Previously, if the event loop never ran before sd_event_now() would
fail. With this change it will instead fall back to invoking now(). This
way, the function cannot fail anymore, except for programming error when
invoking it with wrong parameters.
This takes into account the fact that many callers did not handle the
error condition correctly, and if the callers did, then they kept simply
invoking now() as fall back on their own. Hence let's shorten the code
using this call, and make things more robust, and let's just fall back
to now() internally.
Whether now() is used or the cache timestamp may still be detected via
the return value of sd_event_now(). If > 0 is returned, then the fall
back to now() was used, if == 0 is returned, then the cached value was
returned.
This patch also simplifies many of the invocations of sd_event_now():
the manual fall back to now() can be removed. Also, in cases where the
call is invoked withing void functions we can now protect the invocation
via assert_se(), acknowledging the fact that the call cannot fail
anymore except for programming errors with the parameters.
This change is inspired by #841.
If we call EPOLL_CTL_DEL, we *REALLY* expect the file-descriptor to be
present in that given epoll-set. We actually track such state via our
s->io.registered flag, so it better be true.
Make sure if that's not true, we treat it similar to assert_return() (ie.,
print a loud warning).