Systemd

Commit Graph

Author	SHA1	Message	Date
Vito Caputo	104fc4be11	mmap-cache: bind prot(ection) to MMapFileDescriptor There are no mmap_cache_get() users that actually deviate prot from the JournalFile's f->prot. So there's no point in making this a separate parameter to mmap_cache_get(), nor is there any need to store it in JournalFile's f->prot. Instead just pass it to mmap_cache_add_fd() at MMapFileDescriptor creation, storing it in there for the mmap() callers, which already receive MMapFileDescriptor. For functions receiving both an MMapFileDescriptor and prot, the prot argument has been simply removed and call sites updated. Formalizing this fd:prot binding at the public API also enables discarding the prot check in window_matches(), which is a hot function on long window lists, so a minor CPU efficiency gain should be had there as seen with the past removal of the fd check. Unnoticeable for uncached journals, but maybe a little runtime improvement when cached in specific circumstances. window_matches_fd() has also been simplified to treat the MMapFileDescriptor * as equivalent to its fd and prot.	2020-12-10 13:03:31 +01:00
Yu Watanabe	db9ecf0501	license: LGPL-2.1+ -> LGPL-2.1-or-later	2020-11-09 13:23:58 +09:00
Lennart Poettering	d80b051cea	tree-wide: add new HAVE_COMPRESSION compile time flag let's simplify the checks for ZSTD/LZ4/XZ As suggested: https://github.com/systemd/systemd/pull/16096#discussion_r440705585	2020-06-25 15:02:45 +02:00
Lennart Poettering	8653185a9e	journal: support zstd compression for large objects in journal files	2020-06-25 15:02:18 +02:00
Lennart Poettering	4ce534f4cd	journal: use a different hash function for each journal file This adds a new (incompatible) feature to journal files: if enabled the hash function used for the hash tables is no longer jenkins hash with a zero key, but siphash keyed by the file uuid that is included in the file header anyway. This should make our hash tables more robust against collision attacks, as long as the attacker has no read access to the journal files. We switch from jenkins to siphash simply because it's more well-known and we standardize for the rest of our codebase onto it. This is hardening in order to make collision attacks harder for clients that can forge log messages but have no read access to the logs. It has no effect on clients that have read access.	2020-06-25 15:01:45 +02:00
Lennart Poettering	a76560915f	journal-file: use FLAGS_SET where appropriate	2020-06-25 15:00:44 +02:00
Michal Sekletár	28ca867abd	sd-journal: close journal files that were deleted by journald before we've setup inotify watch Fixes #14695	2020-02-05 18:34:52 +01:00
Yu Watanabe	627df1dc42	journal: use cleanup attribute at one more place	2019-05-28 18:07:18 +09:00
Zbigniew Jędrzejewski-Szmek	f2dc22b447	headers: add missing includes Fixes #12125.	2019-03-28 19:59:56 +01:00
Zbigniew Jędrzejewski-Szmek	ca78ad1de9	headers: remove unneeded includes from util.h This means we need to include many more headers in various files that simply included util.h before, but it seems cleaner to do it this way.	2019-03-27 11:53:12 +01:00
Zbigniew Jędrzejewski-Szmek	a03d43593c	journal: fix sort order of header includes	2018-11-20 07:27:37 +01:00
Lennart Poettering	6812765891	journal-file: refactor journal_file_open_reliably() Let's split out the part that actually renames the file in case we can't open it into a new function journal_file_dispose(). This way we can reuse the function in other cases where we want to open a file but can't.	2018-10-25 21:43:09 +02:00
Lennart Poettering	7a4d21ad20	journal-file: refactor journal_file_rotate() Let's split the function in three: the part where we archive the old file into journal_file_archive(), and the part where we initiate the deferred closing into journal_file_initiate_close(). journal_file_rotate() then simply becomes a wrapper around these two calls, and the opening of the new journal file. This is useful so that we can archive journal files without having to open new ones, i.e. to do only the archival part of the rotation, without the rotation part.	2018-10-25 21:43:09 +02:00
Lennart Poettering	0c69794138	tree-wide: remove Lennart's copyright lines These lines are generally out-of-date, incomplete and unnecessary. With SPDX and git repository much more accurate and fine grained information about licensing and authorship is available, hence let's drop the per-file copyright notice. Of course, removing copyright lines of others is problematic, hence this commit only removes my own lines and leaves all others untouched. It might be nicer if sooner or later those could go away too, making git the only and accurate source of authorship information.	2018-06-14 10:20:20 +02:00
Lennart Poettering	818bf54632	tree-wide: drop 'This file is part of systemd' blurb This part of the copyright blurb stems from the GPL use recommendations: https://www.gnu.org/licenses/gpl-howto.en.html The concept appears to originate in times where version control was per file, instead of per tree, and was a way to glue the files together. Ultimately, we nowadays don't live in that world anymore, and this information is entirely useless anyway, as people are very welcome to copy these files into any projects they like, and they shouldn't have to change bits that are part of our copyright header for that. Hence, let's just get rid of this old cruft, and shorten our codebase a bit.	2018-06-14 10:20:20 +02:00
Zbigniew Jędrzejewski-Szmek	d180c34998	journal: allow boot_id to be passed to journal_append_entry() In this commit, this is done only in testing code, i.e. there is no functional change apart from tests.	2018-05-31 14:30:23 +02:00
Zbigniew Jędrzejewski-Szmek	5a271b08b3	journal: remove unused args from journal_file_copy_entry()	2018-05-31 14:30:23 +02:00
Zbigniew Jędrzejewski-Szmek	11a1589223	tree-wide: drop license boilerplate Files which are installed as-is (any .service and other unit files, .conf files, .policy files, etc), are left as is. My assumption is that SPDX identifiers are not yet that well known, so it's better to retain the extended header to avoid any doubt. I also kept any copyright lines. We can probably remove them, but it'd be nice to obtain explicit acks from all involved authors before doing that.	2018-04-06 18:58:55 +02:00
Lennart Poettering	d9a43665eb	Merge pull request #8313 from alexgartrell/compression-threshold Compression threshold	2018-03-21 12:37:54 +01:00
Lennart Poettering	ffe535e43e	journal-file: drop unused tail_entry_monotonic_valid field. As pointed out by Matthijs van Duin: https://lists.freedesktop.org/archives/systemd-devel/2018-March/040499.html	2018-03-20 23:31:11 +01:00
Alex Gartrell	57850536d5	journal: provide compress_threshold_bytes parameter Previously the compression threshold was hardcoded to 512, which meant that smaller values wouldn't be compressed. This left some storage savings on the table, so instead, we make that number tunable.	2018-03-20 11:48:52 -07:00
Lennart Poettering	858749f731	sd-journal: properly handle inotify queue overflow This adds proper handling of IN_Q_OVERFLOW: when the inotify queue runs over we'll reiterate all directories we are looking at. At the same time we'll mark all files and directories we encounter that way with a generation counter we first increased. All files and directories not marked like this are then unloaded. With this logic we do the best when the inotify queue overflows: we synchronize our in-memory state again with what's on disk. This contains some refactoring of the directory logic, to share more code between uuid directories and "root" directories and generally make things a bit more readable by splitting things up into smaller bits. See: #7998 #8032	2018-02-12 11:07:55 +01:00
Zbigniew Jędrzejewski-Szmek	f916819053	journal: use new helpers with journal_file_close journal_file_close_set() is not necessary anymore.	2017-11-28 21:34:50 +01:00
Zbigniew Jędrzejewski-Szmek	53e1b68390	Add SPDX license identifiers to source files under the LGPL This follows what the kernel is doing, c.f. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.	2017-11-19 19:08:15 +01:00
Zbigniew Jędrzejewski-Szmek	349cc4a507	build-sys: use #if Y instead of #ifdef Y everywhere The advantage is that if the name is misspelled, cpp will warn us. $ git grep -Ee "conf.set\('(HAVE|ENABLE)_" -l|xargs sed -r -i "s/conf.set\('(HAVE|ENABLE)_/conf.set10('\1_/" $ git grep -Ee '#ifn?def (HAVE|ENABLE)' -l|xargs sed -r -i 's/#ifdef (HAVE|ENABLE)/#if \1/; s/#ifndef (HAVE|ENABLE)/#if ! \1/;' $ git grep -Ee 'if.*defined\(HAVE' -l|xargs sed -i -r 's/defined\((HAVE_[A-Z0-9_]*)\)/\1/g' $ git grep -Ee 'if.*defined\(ENABLE' -l|xargs sed -i -r 's/defined\((ENABLE_[A-Z0-9_]*)\)/\1/g' + manual changes to meson.build squash! build-sys: use #if Y instead of #ifdef Y everywhere v2: - fix incorrect setting of HAVE_LIBIDN2	2017-10-04 12:09:29 +02:00
Vito Caputo	be7cdd8ec9	journal: explicitly add fds to mmap-cache (#6307) This way we have a MMapFileDescriptor reference external to the cache, and can supply the handle directly to mmap_cache_get(), eliminating hashmap lookups entirely from the hot path.	2017-07-10 19:24:56 -04:00
Vito Caputo	8eb851711f	journal: set STATE_ARCHIVED as part of offlining (#2740) The only code path which makes a journal durable is via journal_file_set_offline(). When we perform a rotate the journal's header->state is being set to STATE_ARCHIVED prior to journal_file_set_offline() being called. In journal_file_set_offline(), we short-circuit the entire offline when f->header->state != STATE_ONLINE. This all results in none of the journal_file_set_offline() fsync() calls being reached when rotate archives a journal, so archived journals are never explicitly made durable. What we do now is instead of setting the f->header->state to STATE_ARCHIVED directly in journal_file_rotate() prior to journal_file_close(), we set an archive flag in f->archive for the journal_file_set_offline() machinery to honor by committing STATE_ARCHIVED instead of STATE_OFFLINE when set. Prior to this, rotated journals were never getting fsync() explicitly performed on them, since journal_file_set_offline() short-circuited. Obviously this is undesirable, and depends entirely on the underlying filesystem as to how much durability was achieved when simply closing the file. Note that this problem existed prior to the recent asynchronous fsync changes, but those changes do facilitate our performing this durable offline on rotate without blocking, regardless of the underlying filesystem sync-on-close semantics.	2016-04-27 08:29:43 +02:00
Lennart Poettering	5d1ce25728	sd-journal: add API for opening journal files or directories by fd Also, expose this via the "journalctl --file=-" syntax for STDIN. This feature remains undocumented though, as it is probably not too useful in real-life as this still requires fds that support mmaping and seeking, i.e. does not work for pipes, for which reading from STDIN is most commonly used.	2016-04-25 15:24:46 +02:00
Vito Caputo	b58c888f30	journal: defer journal closes on rotate When we rotate journals, we must set offline and close the current one, but don't generally need to wait for this to complete. Instead, we'll initiate an asynchronous offline via journal_file_set_offline(oldfile, false), and add the file to a per-server set of deferred closes to be closed later when they won't block. There's one complication however; journal_file_open() via journal_file_verify_header() assumes that any writable journal in the online state is the product of an unclean shutdown or other form of corruption. Thus there's a need for journal_file_open() to be aware of deferred closes and synchronize with their completion when opening preexisting journals for writing. To facilitate this the deferred closes set is supplied to the journal_file_open() function where the deferred closes may be closed synchronously before verifying the header in such circumstances.	2016-02-19 18:50:20 -08:00
Vito Caputo	ac2e41f510	journal: asynchronous journal_file_set_offline() This adds a wait flag to journal_file_set_offline(), when false the offline is performed asynchronously in a separate thread. When wait is true, if an asynchronous offline is already in-progress it is restarted and waited for. Otherwise the offline is performed synchronously without the use of a thread. journal_file_set_online() cancels or waits for the asynchronous offline to complete if in-flight, depending on where in the offline process the thread happens to be. If the thread is in the fsync() phase, it is cancelled and waiting is unnecessary. Otherwise, the thread is joined before proceeding. A new offline_state member is added to JournalFile which is used via atomic operations for communicating between the offline thread and the journal_file_set_{offline,online}() functions.	2016-02-19 18:50:20 -08:00
Daniel Mack	b26fa1a2fb	tree-wide: remove Emacs lines from all files This should be handled fine now by .dir-locals.el, so no need to carry that stuff in every file.	2016-02-10 13:41:57 +01:00
Vito Caputo	7a24f3bf2f	journal: coalesce ftruncate()s in 250ms windows Prior to this change every journal append causes an ftruncate() for the sake of inotify propagation of the mmap-based writes. With this change the notification is deferred up to ~250ms, coalescing any repeated journal writes during the deferred period into a single ftruncate(). The ftruncate() call isn't free and doing it on every append adds unnecessary overhead and latency in the journald event loop. Introduces journal_file_enable_post_change_timer() which manages a timer on the provided sd-event instance for scheduling coalesced ftruncates. The ftruncate() behavior is unchanged unless journal_file_enable_post_change_timer() is called on the JournalFile. While not a tremendous improvement, profiling systemd-journald event loop latencies using instrumentation as introduced by `34b8751` it was observed that coalescing the ftruncates was low-hanging fruit worth pursuing. Note orders 12 and 13 shifting left into order 11 and order 6 dipping into order 5: Unmodified: log2(us) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ----------------------------------------------------------- [10685.414572] 0 0 0 0 38 602 61 2 290 60 1643 2554 13 1 4 1 0 0 1 [10690.415114] 0 0 0 0 0 646 54 7 309 44 2073 2148 17 1 3 0 0 0 1 [10695.415509] 0 0 0 0 1 650 73 3 324 37 2071 2270 9 0 0 1 0 1 0 [10700.416297] 0 0 0 0 0 659 50 4 318 38 2111 2152 6 0 1 0 0 1 1 [10705.417136] 0 0 0 0 2 660 48 4 320 38 2129 2146 12 1 1 0 0 1 1 [10710.489114] 0 0 0 0 0 673 38 3 321 37 1925 2339 7 0 0 0 0 1 1 [10715.489613] 0 0 0 0 3 656 64 8 317 48 2365 2007 7 0 0 0 0 0 1 Coalesced: log2(us) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ----------------------------------------------------------- [ 6169.161360] 0 0 0 1 24 786 54 11 389 24 4192 771 6 4 0 0 1 0 1 [ 6174.161705] 0 0 0 1 18 800 35 6 380 27 3977 893 3 1 0 0 1 0 1 [ 6179.162741] 0 0 0 1 28 768 51 4 391 16 3998 831 5 3 0 0 0 0 2 [ 6184.162856] 0 0 0 0 19 770 60 2 376 26 3795 1004 9 5 1 0 1 0 1 [ 6189.163279] 0 0 0 0 28 761 49 7 372 27 3729 1056 3 2 0 0 1 0 1 [ 6194.164255] 0 0 0 0 25 785 49 7 394 19 3996 908 6 3 2 0 0 0 1 [ 6199.164658] 0 0 0 0 29 797 35 5 389 18 3995 898 3 4 1 1 1 0 1 The remaining high-order delays are a result of the synchronous fsyncs in systemd-journald, beyond the scope of this commit.	2016-01-14 16:36:07 -08:00
Thomas Hindoe Paaboel Andersen	71d35b6b55	tree-wide: sort includes in *.h This is a continuation of the previous include sort patch, which only sorted for .c files.	2015-11-18 23:09:02 +01:00
Lennart Poettering	d1afbcd221	journal: fix error handling when compressing journal objects Let's make sure we handle compression errors properly, and don't misunderstand an error for success. Also, let's actually compress things if lz4 is enabled. Fixes #1662.	2015-10-24 13:19:42 +02:00
Lennart Poettering	8580d1f73d	journal: rework vacuuming logic Implement a maximum limit on number of journal files to keep around. Enforcing a limit is useful on this since our performance when viewing pays a heavy penalty for each journal file to interleave. This setting is turned on now by default, and set to 100. Also, actually implement what `348ced9097` promised: use whatever we find on disk at startup as lower bound on how much disk space we can use. That commit introduced some provisions to implement this, but actually never did. This also adds "journalctl --vacuum-files=" to vacuum files on disk by their number explicitly.	2015-10-02 23:21:59 +02:00
Lennart Poettering	804ae586d4	journal: make journal_file_close() return NULL The way it is customary everywhere else in our sources.	2015-10-02 22:36:33 +02:00
Lennart Poettering	dade37d403	journal: avoid mapping empty data and field hash tables When a new journal file is created we write the header first, then sync and only then create the data and field hash tables in them. That means to other processes it might appear that the files have a valid header but not data and field hash tables. Our reader code should be able to deal with this. With this change we'll not map the two hash tables right-away after opening a file for reading anymore (because that will of course fail if the objects are missing), but delay this until the first time we access them. On top of that, when we want to look something up in the hash tables and we notice they aren't initialized yet, we consider them empty. This improves handling of some journal files reported in #487.	2015-07-24 01:55:45 +02:00
Michal Schmidt	950c07d421	journal: make skipping of exhausted journal files effective again Commit `668c965af` "journal: skipping of exhausted journal files is bad if direction changed" fixed a correctness issue, but it also significantly limited the cases where the optimization that skips exhausted journal files could apply. As a result, some journalctl queries are much slower in v219 than in v218. (e.g. queries where a "--since" cutoff should have quickly eliminated older journal files from consideration, but didn't.) If already in the initial iteration find_location_with_matches() finds no entry, the journal file's location is not updated. This is fine, except that: - We must update at least f->last_direction. The optimization relies on it. Let's separate that from journal_file_save_location() and update it immediately after the direction checks. - The optimization was conditional on "f->current_offset > 0", but it would always be 0 in this scenario. This check is unnecessary for the optimization.	2015-02-25 17:32:27 +01:00
Zbigniew Jędrzejewski-Szmek	a2341f6836	Move DEFINE_TRIVIAL_CLEANUP_FUNC to macro.h This removes the need for various header files to include the (relatively heavyweight) util.h.	2015-01-18 19:06:48 -05:00
Lennart Poettering	f27a386430	journald: whenever we rotate a file, btrfs defrag it Our write pattern is quite awful for CoW file systems (btrfs...), as we keep updating file parts in the beginning of the file. This results in fragmented journal files. Hence: when rotating files, defragment them, since at that point we know that no further write accesses will be made.	2015-01-06 20:31:40 +01:00
Lennart Poettering	2678031a17	journald: when we detect the journal file we are about to write to has been deleted, rotate https://bugzilla.redhat.com/show_bug.cgi?id=1171719	2015-01-05 02:57:36 +01:00
Lennart Poettering	fa6ac76083	journald: process SIGBUS for the memory maps we set up Even though we use fallocate() it appears that file systems like btrfs will trigger SIGBUS on certain low-disk-space situation. We should handle that, hence catch the signal, add it to a list of invalidated pages, and replace the page with an empty memory area. After each write check if SIGBUS was triggered, and consider the write invalid if it was. This should make journald a lot more robust with file systems where fallocate() is not reliable, for example all CoW file systems (btrfs...), where changing written data can fail with disk full errors. https://bugzilla.redhat.com/show_bug.cgi?id=1045810	2015-01-05 01:40:51 +01:00
Michal Schmidt	f534928ad7	journal: journal_file_next_entry() does not need pointer to current Object The current offset is sufficient information.	2014-12-18 14:41:22 +01:00
Michal Schmidt	6e693b42dc	journal: optimize iteration by skipping exhausted files If from a previous iteration we know we are at the end of a journal file, don't bother looking into the file again. This is complicated by the fact that the EOF does not have to be permanent (think of "journalctl -f"). So we also check if the number of entries in the journal file changed. This optimization has a similar effect as "journal: optimize iteration: skip whole files behind current location" had.	2014-12-18 14:29:46 +01:00
Michal Schmidt	d8ae66d7fa	journal: compare candidate entries using JournalFiles' locations When comparing the locations of candidate entries, we can rely on the location information stored in struct JournalFile.	2014-12-18 12:26:00 +01:00
Michal Schmidt	6573ef05a3	journal: keep per-JournalFile location info during iteration In next_beyond_location() when we find a candidate entry in a journal file, save its location information in struct JournalFile. The purpose of remembering the locations of candidate entries is to be able to save work in the next iteration. This patch does only the remembering part. LOCATION_SEEK means the location identifies a candidate entry. When a winner is picked from among candidates, it becomes LOCATION_DISCRETE. LOCATION_TAIL here signifies we've iterated the file to the end (or the beginning in the case of reversed direction).	2014-12-18 12:17:20 +01:00
Michal Schmidt	1fc605b0e1	journal: abstract the resetting of JournalFile's location	2014-12-18 11:56:19 +01:00
Michal Schmidt	99cc7653a8	journal: move definition of LocationType to journal-file.h In preparation for individual JournalFiles maintaining a location of their own.	2014-12-18 11:53:39 +01:00
Michal Schmidt	14499361a5	journal: delete unused function journal_file_skip_entry() Its only caller is a test.	2014-12-18 11:53:08 +01:00
Michal Schmidt	ae2adbcd09	journal: delete unused function journal_file_move_to_entry_by_offset()	2014-12-18 11:47:13 +01:00

104 Commits