Systemd

Commit Graph

Author	SHA1	Message	Date
Lennart Poettering	4b58153dd2	core: add "invocation ID" concept to service manager This adds a new invocation ID concept to the service manager. The invocation ID identifies each runtime cycle of a unit uniquely. A new randomized 128bit ID is generated each time a unit moves from an inactive to an activating or active state. The primary use case for this concept is to connect the runtime data PID 1 maintains about a service with the offline data the journal stores about it. Previously we'd use the unit name plus start/stop times, which however is highly racy since the journal will generally process log data after the service already ended. The "invocation ID" kinda matches the "boot ID" concept of the Linux kernel, except that it applies to an individual unit instead of the whole system. The invocation ID is passed to the activated processes as an environment variable. It is additionally stored as an extended attribute on the cgroup of the unit. The latter is used by journald to automatically retrieve it for each logged message and attach it to the log entry. The environment variable is very easily accessible, even for unprivileged services. OTOH the extended attribute is only accessible to privileged processes (this is because cgroupfs only supports the "trusted." xattr namespace, not "user."). The environment variable may be altered by services, the extended attribute may not be, hence is the better choice for the journal. Note that reading the invocation ID off the extended attribute from journald is racy, similar to the way reading the unit name for a logging process is. This patch adds APIs to read the invocation ID to sd-id128: sd_id128_get_invocation() may be used in a similar fashion to sd_id128_get_boot(). PID1's own logging is updated to always include the invocation ID when it logs information about a unit. A new bus call GetUnitByInvocationID() is added that allows retrieving a bus path to a unit by its invocation ID. The bus path is built using the invocation ID, thus providing a path for referring to a unit that is valid only for the current runtime cycle of it. Outlook for the future: should the kernel eventually allow passing of cgroup information along AF_UNIX/SOCK_DGRAM messages via a unique cgroup id, then we can alter the invocation ID to be generated as hash from that rather than entirely randomly. This way we can derive the invocation race-freely from the messages.	2016-10-07 20:14:38 +02:00
Lennart Poettering	398a50cdd1	journal: fix format string used for usec_t	2016-10-07 20:14:38 +02:00
Lennart Poettering	d473176a74	journal: complete slice info in journal metadata We are already attaching the system slice information to log messages, now add the user slice info too, as well as the object slice info.	2016-10-07 20:14:38 +02:00
Felix Zhang	dd8352659c	journal: fix typo in comment (#4176)	2016-09-18 11:14:50 +02:00
Zbigniew Jędrzejewski-Szmek	43688c49d1	tree-wide: rename config_parse_many to …_nulstr In preparation for adding a version which takes a strv.	2016-09-16 10:32:03 -04:00
Vito Caputo	6431c7e216	journal: add/use flushed_flag_is_set() helper (#4041) Minor cleanup suggested by Lennart.	2016-08-26 17:51:13 +02:00
Vito Caputo	929eeb5498	journal: implicitly flush to var on recovery (#4028) When the system journal becomes re-opened post-flush with the runtime journal open, it implies we've recovered from something like an ENOSPC situation where the system journal rotate had failed, leaving the system journal closed, causing the runtime journal to be opened post-flush. For the duration of the unavailable system journal, we log to the runtime journal. But when the system journal gets opened (space made available, for example), we need to close the runtime journal before new journal writes will go to the system journal. Calling server_flush_to_var() after opening the system journal with a runtime journal present, post-flush, achieves this while preserving the runtime journal's contents in the system journal. The combination of the present flushed flag file and the runtime journal being open is a state where we should be logging to the system journal, so it's appropriate to resume doing so once we've successfully opened the system journal.	2016-08-25 17:37:57 +02:00
Zbigniew Jędrzejewski-Szmek	61755fdae0	journald: do not create split journals for dynamic users Dynamic users should be treated like system users, and their logs should end up in the main system journal.	2016-08-18 23:34:40 -04:00
Vito Caputo	105bdb46b4	journal: ensure open journals from find_journal() (#3973) If journals get into a closed state like when rotate fails due to ENOSPC, when space is made available it currently goes unnoticed leaving the journals in a closed state indefinitely. By calling system_journal_open() on entry to find_journal() we ensure the journal has been opened/created if possible. Also moved system_journal_open() up to after open_journal(), before find_journal(). Fixes https://github.com/systemd/systemd/issues/3968	2016-08-17 14:51:07 +02:00
Lennart Poettering	3bbaff3e08	tree-wide: use sd_id128_is_null() instead of sd_id128_equal where appropriate It's a bit easier to read because shorter. Also, most likely a tiny bit faster.	2016-07-22 12:38:08 +02:00
Zbigniew Jędrzejewski-Szmek	2ed968802c	tree-wide: get rid of selinux_context_t (#3732) `9eb9c93275` deprecated selinux_context_t. Replace with a simple char* everywhere. Alternative fix for #3719.	2016-07-15 18:44:02 +02:00
Torstein Husebø	61233823aa	treewide: fix typos and remove accidental repetition of words	2016-07-11 16:18:43 +02:00
Lennart Poettering	fc2fffe770	tree-wide: introduce new SOCKADDR_UN_LEN() macro, and use it everywhere The macro determines the right length of an AF_UNIX "struct sockaddr_un" to pass to connect() or bind(). It automatically figures out if the socket refers to an abstract namespace socket, or a socket in the file system, and properly handles the full length of the path field. This macro is not only safer, but also simpler to use, than the usual offsetof() + strlen() logic.	2016-05-05 22:24:36 +02:00
Lennart Poettering	5d1ce25728	sd-journal: add API for opening journal files or directories by fd Also, expose this via the "journalctl --file=-" syntax for STDIN. This feature remains undocumented though, as it is probably not too useful in real-life as this still requires fds that support mmapping and seeking, i.e. does not work for pipes, for which reading from STDIN is most commonly used.	2016-04-25 15:24:46 +02:00
Zbigniew Jędrzejewski-Szmek	ccddd104fc	tree-wide: use mdash instead of two minuses	2016-04-21 23:00:13 -04:00
Zbigniew Jędrzejewski-Szmek	6e1045e538	journald: rewrite function with switch, fix handling of -ESHUTDOWN The comments and the log messages are next to one another, so it's easier to check that the messages match the comments. The sign was omitted in the check for -ESHUTDOWN, so it was never matched.	2016-04-16 18:40:21 -04:00
Vito Caputo	9ed794a32d	tree-wide: minor formatting inconsistency cleanups	2016-02-23 14:20:34 -08:00
Vito Caputo	b58c888f30	journal: defer journal closes on rotate When we rotate journals, we must set offline and close the current one, but don't generally need to wait for this to complete. Instead, we'll initiate an asynchronous offline via journal_file_set_offline(oldfile, false), and add the file to a per-server set of deferred closes to be closed later when they won't block. There's one complication however; journal_file_open() via journal_file_verify_header() assumes that any writable journal in the online state is the product of an unclean shutdown or other form of corruption. Thus there's a need for journal_file_open() to be aware of deferred closes and synchronize with their completion when opening preexisting journals for writing. To facilitate this the deferred closes set is supplied to the journal_file_open() function where the deferred closes may be closed synchronously before verifying the header in such circumstances.	2016-02-19 18:50:20 -08:00
Vito Caputo	ac2e41f510	journal: asynchronous journal_file_set_offline() This adds a wait flag to journal_file_set_offline(), when false the offline is performed asynchronously in a separate thread. When wait is true, if an asynchronous offline is already in-progress it is restarted and waited for. Otherwise the offline is performed synchronously without the use of a thread. journal_file_set_online() cancels or waits for the asynchronous offline to complete if in-flight, depending on where in the offline process the thread happens to be. If the thread is in the fsync() phase, it is cancelled and waiting is unnecessary. Otherwise, the thread is joined before proceeding. A new offline_state member is added to JournalFile which is used via atomic operations for communicating between the offline thread and the journal_file_set_{offline,online}() functions.	2016-02-19 18:50:20 -08:00
Vito Caputo	69a3a6fd3d	journal: add void cast to journal_file_close() calls	2016-02-19 18:50:16 -08:00
Daniel Mack	b26fa1a2fb	tree-wide: remove Emacs lines from all files This should be handled fine now by .dir-locals.el, so no need to carry that stuff in every file.	2016-02-10 13:41:57 +01:00
Vito Caputo	089ed40bf4	journal: remove template from open_journal args None of the callers take advantage of this parameter, it's always NULL, this is just a private helper function to simplify the call sites so drop the template parameter altogether. If a caller emerges later who needs it, it can be restored.	2016-02-05 21:30:53 -08:00
Tom Gundersen	9766c16bd0	Merge pull request #2440 from poettering/journal-fix journald: minor fixes	2016-01-26 18:16:48 +01:00
Lennart Poettering	4850d39ab7	journald: add a couple of static asserts checking logging constants Whenever we include a log level or facility in a journal string field, make sure the compiler checks for us that that's actually the right thing to do.	2016-01-26 14:43:24 +01:00
Lennart Poettering	1d35b2d6e2	Merge pull request #2424 from keszybz/journald-disk-usage Journald disk usage	2016-01-26 14:20:45 +01:00
Lennart Poettering	e167d7fd8d	journald: minor fixes This primarily contains some minor coding style fixups for `7a24f3bf2f` and earlier changes. Specifically: * Don't log at log levels above LOG_DEBUG from "library" code like journal-file.c * Don't negate errno values before passing them to log_debug_errno(), as the call can handle this fine anyway * Cast some calls we knowingly ignore the return values of to (void) * Don't clobber function call-by-ref return values on failure * Don't mix function calls and variable declarations in one line There's also one more relevant change: when failing to enqueue a journal change fs event, we'll run it immediately.	2016-01-26 14:13:30 +01:00
Zbigniew Jędrzejewski-Szmek	32917e3388	journald: restore oom safety v2: - use xsprintf	2016-01-25 10:53:51 -05:00
Zbigniew Jędrzejewski-Szmek	9d5a981398	Merge pull request #2318 from vcaputo/coalesce-ftruncates-redux journal: coalesce ftruncate()s in 250ms windows	2016-01-23 22:09:51 -05:00
Zbigniew Jędrzejewski-Szmek	282c5c4e42	journald: use structured message + catalog entry for disk usage The format of the journald disk usage log entry was changed back and forth a few times. It is annoying to have a very verbose message, but if it is short it is hard to understand. But we have a tool for this, the catalogue. $ journalctl -x -u systemd-journald Jan 23 18:48:50 rawhide systemd-journald[891]: Runtime journal (/run/log/journal/) is 8.0M, max 196.2M, 188.2M free. -- Subject: Disk space used by the journal -- Defined-By: systemd -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- Runtime journal (/run/log/journal/) is currently using 8.0M. -- Maximum allowed usage is set to 196.2M. -- Leaving at least 294.3M free (of currently available 1.9G of disk space). -- Enforced usage limit is thus 196.2M, of which 188.2M are still available. -- -- The limits controlling how much disk space is used by the journal may -- be configured with SystemMaxUse=, SystemKeepFree=, SystemMaxFileSize=, -- RuntimeMaxUse=, RuntimeKeepFree=, RuntimeMaxFileSize= settings in -- /etc/systemd/journald.conf. See journald.conf(5) for details. Jan 23 18:48:50 rawhide systemd-journald[891]: System journal (/var/log/journal/) is 480.1M, max 1.6G, 1.2G free. -- Subject: Disk space used by the journal -- Defined-By: systemd -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- -- System journal (/var/log/journal/) is currently using 480.1M. -- Maximum allowed usage is set to 1.6G. -- Leaving at least 2.5G free (of currently available 5.8G of disk space). -- Enforced usage limit is thus 1.6G, of which 1.2G are still available. -- -- The limits controlling how much disk space is used by the journal may -- be configured with SystemMaxUse=, SystemKeepFree=, SystemMaxFileSize=, -- RuntimeMaxUse=, RuntimeKeepFree=, RuntimeMaxFileSize= settings in -- /etc/systemd/journald.conf. See journald.conf(5) for details.	2016-01-23 19:49:00 -05:00
Zbigniew Jędrzejewski-Szmek	8a03c9ef74	journald: allow additional payload in server_driver_message The code to format the iovec is shared with log.c. All call sites to server_driver_message are changed to include the additional "MESSAGE=" part, but the new functionality is not used and change in functionality is not expected. iovec is preallocated, so the maximum number of messages is limited. In server_driver_message N_IOVEC_PAYLOAD_FIELDS is currently set to 1. New code is not oom safe, it will fail if memory cannot be allocated. This will be fixed in a subsequent commit.	2016-01-23 19:49:00 -05:00
Zbigniew Jędrzejewski-Szmek	ff82c36c79	journald: do not free uninitialized pointer in error path	2016-01-18 15:21:28 -05:00
Vito Caputo	7a24f3bf2f	journal: coalesce ftruncate()s in 250ms windows Prior to this change every journal append causes an ftruncate() for the sake of inotify propagation of the mmap-based writes. With this change the notification is deferred up to ~250ms, coalescing any repeated journal writes during the deferred period into a single ftruncate(). The ftruncate() call isn't free and doing it on every append adds unnecessary overhead and latency in the journald event loop. Introduces journal_file_enable_post_change_timer() which manages a timer on the provided sd-event instance for scheduling coalesced ftruncates. The ftruncate() behavior is unchanged unless journal_file_enable_post_change_timer() is called on the JournalFile. While not a tremendous improvement, profiling systemd-journald event loop latencies using instrumentation as introduced by `34b8751` it was observed that coalescing the ftruncates was low-hanging fruit worth pursuing. Note orders 12 and 13 shifting left into order 11 and order 6 dipping into order 5: Unmodified: log2(us) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ----------------------------------------------------------- [10685.414572] 0 0 0 0 38 602 61 2 290 60 1643 2554 13 1 4 1 0 0 1 [10690.415114] 0 0 0 0 0 646 54 7 309 44 2073 2148 17 1 3 0 0 0 1 [10695.415509] 0 0 0 0 1 650 73 3 324 37 2071 2270 9 0 0 1 0 1 0 [10700.416297] 0 0 0 0 0 659 50 4 318 38 2111 2152 6 0 1 0 0 1 1 [10705.417136] 0 0 0 0 2 660 48 4 320 38 2129 2146 12 1 1 0 0 1 1 [10710.489114] 0 0 0 0 0 673 38 3 321 37 1925 2339 7 0 0 0 0 1 1 [10715.489613] 0 0 0 0 3 656 64 8 317 48 2365 2007 7 0 0 0 0 0 1 Coalesced: log2(us) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ----------------------------------------------------------- [ 6169.161360] 0 0 0 1 24 786 54 11 389 24 4192 771 6 4 0 0 1 0 1 [ 6174.161705] 0 0 0 1 18 800 35 6 380 27 3977 893 3 1 0 0 1 0 1 [ 6179.162741] 0 0 0 1 28 768 51 4 391 16 3998 831 5 3 0 0 0 0 2 [ 6184.162856] 0 0 0 0 19 770 60 2 376 26 3795 1004 9 5 1 0 1 0 1 [ 6189.163279] 0 0 0 0 28 761 49 7 372 27 3729 1056 3 2 0 0 1 0 1 [ 6194.164255] 0 0 0 0 25 785 49 7 394 19 3996 908 6 3 2 0 0 0 1 [ 6199.164658] 0 0 0 0 29 797 35 5 389 18 3995 898 3 4 1 1 1 0 1 The remaining high-order delays are a result of the synchronous fsyncs in systemd-journald, beyond the scope of this commit.	2016-01-14 16:36:07 -08:00
David Herrmann	de418eb91c	Merge pull request #2053 from poettering/selinux-fix Two unrelated fixes	2015-11-30 19:30:03 +01:00
Zbigniew Jędrzejewski-Szmek	5c3bde3fa8	journal: move the gist of server_fix_perms to acl-util.[hc] Most of the function is moved to acl-util.c to make it possible to add tests in a subsequent commit. Setting of the mode in server_fix_perms is removed: - we either just created the file ourselves, and the permissions had better be right, - or the file was already there, and we should not modify the permissions. server_fix_perms is renamed to server_fix_acls to better reflect new meaning, and made static because it is only used in one file.	2015-11-27 23:32:32 -05:00
Lennart Poettering	6355e75610	selinux: split up mac_selinux_have() from mac_selinux_use() Let's distinguish the cases where our code takes an active role in selinux management, or just passively reports whatever selinux properties are set. mac_selinux_have() now checks whether selinux is around for the passive stuff, and mac_selinux_use() for the active stuff. The latter checks the former, plus also checks UID == 0, under the assumption that only when we run privileged selinux management really makes sense. Fixes: #1941	2015-11-27 20:28:13 +01:00
Daniel Mack	a57246551a	Merge pull request #1926 from phomes/include-order-libudev tree-wide: group include of libudev.h with sd-*	2015-11-17 09:36:25 +01:00
Thomas Hindoe Paaboel Andersen	b4bbcaa9c4	tree-wide: group include of libudev.h with sd-*	2015-11-17 07:06:08 +01:00
Lennart Poettering	65c1d46b09	journald: trivial simplification	2015-11-17 00:52:10 +01:00
Lennart Poettering	4a0b58c4a3	tree-wide: use right cast macros for UIDs, GIDs and PIDs	2015-11-17 00:52:10 +01:00
Thomas Hindoe Paaboel Andersen	cf0fbc49e6	tree-wide: sort includes Sort the includes according to the new coding style.	2015-11-16 22:09:36 +01:00
Lennart Poettering	4de2402b60	journald: fix accuracy of watchdog timer event Adding 3/4th of the watchdog frequency as accuracy on top of 1/2 of the watchdog frequency means we might end up at 5/4th of the frequency which means we might miss the message from time to time. Maybe fixes #1804	2015-11-12 12:33:10 +01:00
Lennart Poettering	33d52ab92f	journald: rework --sync/--rotate logic to use CLOCK_MONOTONIC timestamp files Previously, we'd rely on the mtime timestamps of the touch files to see if our sync/rotation requests were already suppressed. This means we rely on CLOCK_REALTIME timestamps. With this patch we instead store the CLOCK_MONOTONIC timestamp in the touch files, and avoid relying on mtime. This should make things more reliable when the clock or underlying mtime granularity is not very good. This also adds warning messages if writing any of the flag files fails.	2015-11-12 11:17:01 +01:00
Lennart Poettering	dbd6e31cf9	journalctl: make --rotate synchronous, too Of course, ideally we'd just use normal synchronous bus calls, but this is out of the question as long as we rely on dbus-daemon (which logs to journald, and thus cannot use to avoid cyclic sync loops). Hence, instead, reuse the wait logic already implemented for --sync, and use a signal in one direction, and a mtime watch file for the reply.	2015-11-11 14:29:10 +01:00
Lennart Poettering	94b6551662	journalctl: add new --sync switch for syncing the journal to disk With this new "--sync" switch we add a synchronous way to sync everything queued to disk, and return only after that's complete. This command gives the guarantee that anything queued before has hit the disk before the command returns. While we are at it, also improve the man pages and help text for journalctl a bit.	2015-11-11 13:39:18 +01:00
Lennart Poettering	17afc8f27b	journald: be less picky when receiving epoll events The event might be flagged with stuff we don't expect, hence don't be needlessly picky, just rely on the kernel passing us sensible events.	2015-11-10 21:04:39 +01:00
Lennart Poettering	b374689c02	journald: dispatch SIGTERM/SIGINT with a low priority Let's make sure to process all queued log data before exiting, so that we don't unnecessarily lose messages when shutting down. https://github.com/systemd/systemd/pull/1812#issuecomment-155149871	2015-11-10 17:36:52 +01:00
Lennart Poettering	75eb615480	defs: rework CONF_DIRS_NULSTR() macro The macro is generically useful for putting together search paths, hence let's make it truly generic, by dropping the implicit ".d" appending it does, and leave that to the caller. Also rename it from CONF_DIRS_NULSTR() to CONF_PATHS_NULSTR(), since it's not strictly about dirs that way, but any kind of file system path. Also, mark CONF_DIR_SPLIT_USR() as internal macro by renaming it to _CONF_PATHS_SPLIT_USR() so that the leading underscore indicates that it's internal.	2015-11-10 17:31:31 +01:00
Michal Schmidt	709f6e46a3	treewide: use the negative error codes returned by our functions Our functions return negative error codes. Do not rely on errno being set after calling our own functions.	2015-11-05 13:44:06 +01:00
Lennart Poettering	119e9655dc	journal: restore watchdog support	2015-11-03 17:45:12 +01:00
Lennart Poettering	e22aa3d328	journald: never block when sending messages on NOTIFY_SOCKET socket Otherwise we might run into deadlocks, when journald blocks on the notify socket on PID 1, and PID 1 blocks on IPC to dbus-daemon and dbus-daemon blocks on logging to journald. Break this cycle by making sure that journald never ever blocks on PID 1. Note that this change disables event loop watchdog support, as these messages are sent in blocking style by sd-event. That should not be a big loss though, as people reported frequent problems with the watchdog hitting journald on excessively slow IO. Fixes: #1505.	2015-11-01 22:12:29 +01:00
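
The ftruncate() coalescing described in commit 7a24f3bf2f can be illustrated with a small simulation. This is a sketch in Python, not journald's C implementation: the function name `count_ftruncates` and the millisecond timestamps are hypothetical, and the model assumes that the first append after a flush arms a timer which fires exactly `window_ms` later.

```python
def count_ftruncates(append_times_ms, window_ms=250):
    """Count ftruncate() calls for a sorted list of append timestamps (ms).

    window_ms=None models the old behaviour: one ftruncate() per append.
    Otherwise the first append after a flush arms a timer, and every append
    that arrives before the timer fires shares that single ftruncate().
    (Hypothetical model for illustration, not journald code.)
    """
    if window_ms is None:
        return len(append_times_ms)

    calls = 0
    deadline = None  # time at which the pending timer fires, if one is armed
    for t in append_times_ms:
        if deadline is not None and t < deadline:
            continue  # timer already armed: this append is coalesced
        # No timer is pending, or the previous one has already fired.
        calls += 1
        deadline = t + window_ms
    return calls
```

Under these assumptions, appends at 0, 10, 20, 300, 310 and 900 ms cost three ftruncate() calls with the 250 ms window, compared with six when every append triggers its own call.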


214 Commits
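
The arithmetic behind commit 4de2402b60 (watchdog timer accuracy) can be checked with a short worked example. This is only a sketch: the helper names are hypothetical, the interval is normalized to 1, and the 1/4 accuracy in the usage note is an assumed value for contrast, not the value the commit actually chose.

```python
from fractions import Fraction


def latest_fire_time(interval, delay_fraction, accuracy_fraction):
    """Latest moment the timer may fire: scheduled delay plus allowed slack."""
    return interval * (delay_fraction + accuracy_fraction)


def can_miss_watchdog(interval, delay_fraction, accuracy_fraction):
    """True if the timer is allowed to fire after the watchdog interval ends."""
    return latest_fire_time(interval, delay_fraction, accuracy_fraction) > interval
```

With a delay of 1/2 and an accuracy of 3/4 of the interval, `latest_fire_time(Fraction(1), Fraction(1, 2), Fraction(3, 4))` evaluates to 5/4, which is past the deadline, as the commit message states. An assumed accuracy of 1/4 would keep the latest fire time at 3/4 of the interval.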