Commit graph

75 commits

Author SHA1 Message Date
Lennart Poettering b4cccbc13a cgroup: change cg_unified() to possibly return errors again
We use our cgroup APIs in various contexts, including from our libraries
sd-login, sd-bus. As we don#t control those environments we can't rely
that the unified cgroup setup logic succeeds, and hence really shouldn't
assert on it.

This more or less reverts 415fc41cea.
2017-02-24 17:52:58 +01:00
Tejun Heo 415fc41cea core: simplify cg_[all_]unified()
cg_[all_]unified() test whether a specific controller or all controllers are on
the unified hierarchy.  While what's being asked is a simple binary question,
the callers must assume that the functions may fail any time, which
unnecessarily complicates their usages.  This complication is unnecessary.
Internally, the test result is cached anyway and there are only a few places
where the test actually needs to be performed.

This patch simplifies cg_[all_]unified().

* cg_[all_]unified() are updated to return bool.  If the result can't be
  decided, assertion failure is triggered.  Error handlings from their callers
  are dropped.

* cg_unified_flush() is updated to calculate the new result synchrnously and
  return whether it succeeded or not.  Places which need to flush the test
  result are updated to test for failure.  This ensures that all the following
  cg_[all_]unified() tests succeed.

* Places which expected possible cg_[all_]unified() failures are updated to
  call and test cg_unified_flush() before calling cg_[all_]unified().  This
  includes functions used while setting up mounts during boot and
  manager_setup_cgroup().
2017-02-18 17:51:13 -05:00
Zbigniew Jędrzejewski-Szmek d3e8277d50 cgtop: use common function to query cgroup root
show_cgroup_get_root_and_warn is renamed to show_cgroup_get_path_and_warn
because it now optionally allows querying a non-root path.

This removes duplicated code and teaches cgtop to combine
-M with a root prefix:

$ systemd-cgtop -M myprecious /system.slice
...
2017-02-01 20:29:09 -05:00
Zbigniew Jędrzejewski-Szmek 605405c6cc tree-wide: drop NULL sentinel from strjoin
This makes strjoin and strjoina more similar and avoids the useless final
argument.

spatch -I . -I ./src -I ./src/basic -I ./src/basic -I ./src/shared -I ./src/shared -I ./src/network -I ./src/locale -I ./src/login -I ./src/journal -I ./src/journal -I ./src/timedate -I ./src/timesync -I ./src/nspawn -I ./src/resolve -I ./src/resolve -I ./src/systemd -I ./src/core -I ./src/core -I ./src/libudev -I ./src/udev -I ./src/udev/net -I ./src/udev -I ./src/libsystemd/sd-bus -I ./src/libsystemd/sd-event -I ./src/libsystemd/sd-login -I ./src/libsystemd/sd-netlink -I ./src/libsystemd/sd-network -I ./src/libsystemd/sd-hwdb -I ./src/libsystemd/sd-device -I ./src/libsystemd/sd-id128 -I ./src/libsystemd-network --sp-file coccinelle/strjoin.cocci --in-place $(git ls-files src/*.c)

git grep -e '\bstrjoin\b.*NULL' -l|xargs sed -i -r 's/strjoin\((.*), NULL\)/strjoin(\1)/'

This might have missed a few cases (spatch has a really hard time dealing
with _cleanup_ macros), but that's no big issue, they can always be fixed
later.
2016-10-23 11:43:27 -04:00
Tejun Heo ca2f6384aa core: rename cg_unified() to cg_all_unified()
A following patch will update cgroup handling so that the systemd controller
(/sys/fs/cgroup/systemd) can use the unified hierarchy even if the kernel
resource controllers are on the legacy hierarchies.  This would require
distinguishing whether all controllers are on cgroup v2 or only the systemd
controller is.  In preparation, this patch renames cg_unified() to
cg_all_unified().

This patch doesn't cause any functional changes.
2016-08-15 18:13:36 -04:00
Tejun Heo 66ebf6c0a1 core: add cgroup CPU controller support on the unified hierarchy
Unfortunately, due to the disagreements in the kernel development community,
CPU controller cgroup v2 support has not been merged and enabling it requires
applying two small out-of-tree kernel patches.  The situation is explained in
the following documentation.

 https://git.kernel.org/cgit/linux/kernel/git/tj/cgroup.git/tree/Documentation/cgroup-v2-cpu.txt?h=cgroup-v2-cpu

While it isn't clear what will happen with CPU controller cgroup v2 support,
there are critical features which are possible only on cgroup v2 such as
buffered write control making cgroup v2 essential for a lot of workloads.  This
commit implements systemd CPU controller support on the unified hierarchy so
that users who choose to deploy CPU controller cgroup v2 support can easily
take advantage of it.

On the unified hierarchy, "cpu.weight" knob replaces "cpu.shares" and "cpu.max"
replaces "cpu.cfs_period_us" and "cpu.cfs_quota_us".  [Startup]CPUWeight config
options are added with the usual compat translation.  CPU quota settings remain
unchanged and apply to both legacy and unified hierarchies.

v2: - Error in man page corrected.
    - CPU config application in cgroup_context_apply() refactored.
    - CPU accounting now works on unified hierarchy.
2016-08-07 09:45:39 -04:00
Lennart Poettering bdc49d4491 cgtop: minimize aux variable scope 2016-06-06 22:04:33 +02:00
Alessandro Puccetti 0738b927a1 cgtop: fix ret pointer usage (#3443) 2016-06-06 12:38:23 +02:00
Alessandro Puccetti 308253c5a2 cgtop: add option to show a single cgroup subtree (#3413)
When many services are running, it was difficult to see only the interesting ones.
This patch allows to show only the subtree of interest.
2016-06-05 13:42:37 -04:00
Lennart Poettering ac96418b4f pager: don't start pager if the terminal is explicitly set to TERM=dumb
As suggested here:

https://bugs.freedesktop.org/show_bug.cgi?id=64737#c8

This adds a new call terminal_is_dumb() and makes use of this where
appropriate.
2016-05-30 18:23:54 +02:00
Tejun Heo 13c31542cc core: add io controller support on the unified hierarchy
On the unified hierarchy, blkio controller is renamed to io and the interface
is changed significantly.

* blkio.weight and blkio.weight_device are consolidated into io.weight which
  uses the standardized weight range [1, 10000] with 100 as the default value.

* blkio.throttle.{read|write}_{bps|iops}_device are consolidated into io.max.
  Expansion of throttling features is being worked on to support
  work-conserving absolute limits (io.low and io.high).

* All stats are consolidated into io.stats.

This patchset adds support for the new interface.  As the interface has been
revamped and new features are expected to be added, it seems best to treat it
as a separate controller rather than trying to expand the blkio settings
although we might add automatic translation if only blkio settings are
specified.

* io.weight handling is mostly identical to blkio.weight[_device] handling
  except that the weight range is different.

* Both read and write bandwidth settings are consolidated into
  CGroupIODeviceLimit which describes all limits applicable to the device.
  This makes it less painful to add new limits.

* "max" can be used to specify the maximum limit which is equivalent to no
  config for max limits and treated as such.  If a given CGroupIODeviceLimit
  doesn't contain any non-default configs, the config struct is discarded once
  the no limit config is applied to cgroup.

* lookup_blkio_device() is renamed to lookup_block_device().

Signed-off-by: Tejun Heo <htejun@fb.com>
2016-05-05 16:43:06 -04:00
Naohiro Aota 80b2ab4bc0 cgtop: initialize `ours' to NULL properly (#3139)
Running cgtop on a system, which lacks expecting stat file, results in a
segfault. For example, a system with blkio tree but without cfq io scheduler,
lacks "blkio.io_service_bytes".

When the targeting cgroup's file does not exist, process() returns 0 and
also does not modify `*ret' value (which is `*ours'). As a result,
callers of refresh_one() can have bogus pointer, which result in SEGV.

This patch just properly initialize the variable to NULL.
2016-04-28 11:41:50 -04:00
Lennart Poettering dd422d1e5b tree-wide: make more global variables static
let's export as little as we can
2016-02-13 12:28:28 +01:00
Daniel Mack b26fa1a2fb tree-wide: remove Emacs lines from all files
This should be handled fine now by .dir-locals.el, so need to carry that
stuff in every file.
2016-02-10 13:41:57 +01:00
Daniel Mack d054f0a4d4 tree-wide: use xsprintf() where applicable
Also add a coccinelle receipt to help with such transitions.
2016-01-12 15:36:32 +01:00
Lennart Poettering 4afd3348c7 tree-wide: expose "p"-suffix unref calls in public APIs to make gcc cleanup easy
GLIB has recently started to officially support the gcc cleanup
attribute in its public API, hence let's do the same for our APIs.

With this patch we'll define an xyz_unrefp() call for each public
xyz_unref() call, to make it easy to use inside a
__attribute__((cleanup())) expression. Then, all code is ported over to
make use of this.

The new calls are also documented in the man pages, with examples how to
use them (well, I only added docs where the _unref() call itself already
had docs, and the examples, only cover sd_bus_unrefp() and
sd_event_unrefp()).

This also renames sd_lldp_free() to sd_lldp_unref(), since that's how we
tend to call our destructors these days.

Note that this defines no public macro that wraps gcc's attribute and
makes it easier to use. While I think it's our duty in the library to
make our stuff easy to use, I figure it's not our duty to make gcc's own
features easy to use on its own. Most likely, client code which wants to
make use of this should define its own:

       #define _cleanup_(function) __attribute__((cleanup(function)))

Or similar, to make the gcc feature easier to use.

Making this logic public has the benefit that we can remove three header
files whose only purpose was to define these functions internally.

See #2008.
2015-11-27 19:19:36 +01:00
Lennart Poettering b5efdb8af4 util-lib: split out allocation calls into alloc-util.[ch] 2015-10-27 13:45:53 +01:00
Lennart Poettering 6bedfcbb29 util-lib: split string parsing related calls from util.[ch] into parse-util.[ch] 2015-10-27 13:25:55 +01:00
Lennart Poettering 3ffd4af220 util-lib: split out fd-related operations into fd-util.[ch]
There are more than enough to deserve their own .c file, hence move them
over.
2015-10-25 13:19:18 +01:00
Lennart Poettering 266f3e269d bus-util: rename bus_open_transport() to bus_connect_transport()
In sd-bus, the sd_bus_open_xyz() family of calls allocates a new bus,
while sd_bus_default_xyz() family tries to reuse the thread's default
bus. bus_open_transport() sometimes internally uses the former,
sometimes the latter family, but suggests it only calls the former via
its name. Hence, let's avoid this confusion, and generically rename the
call to bus_connect_transport().

Similar for all related calls.

And while we are at it, also change cgls + cgtop to do direct systemd
connections where possible, since all they do is talk to systemd itself.
2015-09-29 21:55:52 +02:00
Lennart Poettering 3f6fd1ba65 util: introduce common version() implementation and use it everywhere
This also allows us to drop build.h from a ton of files, hence do so.
Since we touched the #includes of those files, let's order them properly
according to CODING_STYLE.
2015-09-29 21:08:37 +02:00
Lennart Poettering 08edf879ed cgtop: make sure help text doesn't cause main contents to move
Let's always keep space for the full help text. (We used to do that, but
recently another line of help was added which broke this.)
2015-09-22 16:31:02 +02:00
Lennart Poettering 1fc464f6fb cgtop: underline table header
Let's underline the header line of the table shown by cgtop, how it is
customary for tables. In order to do this, let's introduce new ANSI
underline macros, and clean up the existing ones as side effect.
2015-09-22 16:30:42 +02:00
Evgeny Vereshchagin 96a6426f30 cgtop: add -M/--machine 2015-09-21 12:04:45 +00:00
Daniel Mack a18f3caa56 Merge pull request #1239 from poettering/cgroup-pids
core: add support for the "pids" cgroup controller
2015-09-10 19:11:29 +02:00
Lennart Poettering 03a7b521e3 core: add support for the "pids" cgroup controller
This adds support for the new "pids" cgroup controller of 4.3 kernels.
It allows accounting the number of tasks in a cgroup and enforcing
limits on it.

This adds two new setting TasksAccounting= and TasksMax= to each unit,
as well as a gloabl option DefaultTasksAccounting=.

This also updated "cgtop" to optionally make use of the new
kernel-provided accounting.

systemctl has been updated to show the number of tasks for each service
if it is available.

This patch also adds correct support for undoing memory limits for units
using a MemoryLimit=infinity syntax. We do the same for TasksMax= now
and hence keep things in sync here.
2015-09-10 18:41:06 +02:00
Lennart Poettering 59f448cf15 tree-wide: never use the off_t unless glibc makes us use it
off_t is a really weird type as it is usually 64bit these days (at least
in sane programs), but could theoretically be 32bit. We don't support
off_t as 32bit builds though, but still constantly deal with safely
converting from off_t to other types and back for no point.

Hence, never use the type anymore. Always use uint64_t instead. This has
various benefits, including that we can expose these values directly as
D-Bus properties, and also that the values parse the same in all cases.
2015-09-10 18:16:18 +02:00
Lennart Poettering efdb02375b core: unified cgroup hierarchy support
This patch set adds full support the new unified cgroup hierarchy logic
of modern kernels.

A new kernel command line option "systemd.unified_cgroup_hierarchy=1" is
added. If specified the unified hierarchy is mounted to /sys/fs/cgroup
instead of a tmpfs. No further hierarchies are mounted. The kernel
command line option defaults to off. We can turn it on by default as
soon as the kernel's APIs regarding this are stabilized (but even then
downstream distros might want to turn this off, as this will break any
tools that access cgroupfs directly).

It is possibly to choose for each boot individually whether the unified
or the legacy hierarchy is used. nspawn will by default provide the
legacy hierarchy to containers if the host is using it, and the unified
otherwise. However it is possible to run containers with the unified
hierarchy on a legacy host and vice versa, by setting the
$UNIFIED_CGROUP_HIERARCHY environment variable for nspawn to 1 or 0,
respectively.

The unified hierarchy provides reliable cgroup empty notifications for
the first time, via inotify. To make use of this we maintain one
manager-wide inotify fd, and each cgroup to it.

This patch also removes cg_delete() which is unused now.

On kernel 4.2 only the "memory" controller is compatible with the
unified hierarchy, hence that's the only controller systemd exposes when
booted in unified heirarchy mode.

This introduces a new enum for enumerating supported controllers, plus a
related enum for the mask bits mapping to it. The core is changed to
make use of this everywhere.

This moves PID 1 into a new "init.scope" implicit scope unit in the root
slice. This is necessary since on the unified hierarchy cgroups may
either contain subgroups or processes but not both. PID 1 hence has to
move out of the root cgroup (strictly speaking the root cgroup is the
only one where processes and subgroups are still allowed, but in order
to support containers nicey, we move PID 1 into the new scope in all
cases.) This new unit is also used on legacy hierarchy setups. It's
actually pretty useful on all systems, as it can then be used to filter
journal messages coming from PID 1, and so on.

The root slice ("-.slice") is now implicitly created and started (and
does not require a unit file on disk anymore), since
that's where "init.scope" is located and the slice needs to be started
before the scope can.

To check whether we are in unified or legacy hierarchy mode we use
statfs() on /sys/fs/cgroup. If the .f_type field reports tmpfs we are in
legacy mode, if it reports cgroupfs we are in unified mode.

This patch set carefuly makes sure that cgls and cgtop continue to work
as desired.

When invoking nspawn as a service it will implicitly create two
subcgroups in the cgroup it is using, one to move the nspawn process
into, the other to move the actual container processes into. This is
done because of the requirement that cgroups may either contain
processes or other subgroups.
2015-09-01 23:52:27 +02:00
Lennart Poettering 9660efb82f cgtop: properly show "/" instead of empty string in cgroup list 2015-09-01 17:20:56 +02:00
Lennart Poettering dcd7199082 cgtop: rework error handling
Never report errors twice.
2015-08-31 13:29:46 +02:00
Lennart Poettering 7fcfb7ee2f cgtop: allow toggling of --recursive= and -k at runtime 2015-08-31 13:20:44 +02:00
Lennart Poettering 3cb5beea0c cgtop: recursively count cgroup member tasks
When showing the number of tasks in a cgroup, recursively count tasks in
child cgroups and include them in the number. This ensures that the
number of tasks is cummulative the same way as memory, cpu and IO
resources are.

Old behaviour can be restored by passing the new --recursive=no switch.
2015-08-31 13:20:44 +02:00
Lennart Poettering 41ba8b6e69 cgtop: ignore kernel threads when counting tasks
However, allow them to be counted in by specifying -k
2015-08-31 13:20:44 +02:00
Lennart Poettering 03af6492f0 cgtop: show resource usage relative to cgroup root only
This way the output is restricted to cgroups from a container when run
in one.
2015-08-31 13:20:43 +02:00
Lennart Poettering 45d7a8bb6c cgtop: major modernizations
In preparation of the unified cgroup support, let's clean up cgtop:

a) rework time code to be based on "nsec_t" rather than "struct timespec"

b) Introduce long option --order= for selecting ordering

c) count number of processes only in the main hierarchy, don't bother
   with the controller hierarchies. We don't allow orthogonal
   hierarchies in systemd anymore, hence there's no point to check the
   other hierarchies.

d) Deal with non-monotonic cpuacct values (see #749)

e) When sorting groups, don't do prefix compare when ordering by number
   of tasks, since this is not accumulative for all children.

f) Actually make --cpu without parameter work

g) Don't output control characters when we get them as input.

Fixes #749.
2015-08-28 02:27:29 +02:00
Umut Tezduyar Lindskog 97b845b0fc cgtop: include missing signal.h for sigwinch 2015-07-17 10:39:06 +02:00
Charles Duffy 1d84ae050c cgtop: IO readings are valid if any data is available, even if unchanged since last tick
Emit "0" rather than "-" if no change in IO values are seen for a process since
last tick, so long as accounting has registered content at all.
2015-06-10 18:06:28 -05:00
Charles Duffy dcc7aacd67 cgtop: more sensible flushing behavior w/ non-TTY output
- Explicitly flush stdout before sleep between iterations
- Only clear user keystrokes when output is to TTY
- Add a newline between output batches when output is not to TTY
2015-06-09 19:39:16 -05:00
Charles Duffy 780fe62eca cgtop: allow user to force looping behavior even in non-TTY mode 2015-06-09 19:39:16 -05:00
Charles Duffy a2c9f63136 cgtop: raw output option (disable conversion to human-readable units) 2015-06-09 19:39:16 -05:00
Ronny Chevalier 288a74cce5 shared: add terminal-util.[ch] 2015-04-11 00:34:02 +02:00
Umut Tezduyar Lindskog 510c4a0f1e cgtop: fix assert when not on tty
systemd-cgtop --dept=1 -b -n 10 -d 0.1 | cat

Assertion 'new_length >= 3' failed at src/shared/util.c:3 \
595, function ellipsize_mem(). Aborting.
Aborted (core dumped)

(David: add comment)
2015-03-11 16:59:53 +01:00
Michal Schmidt da927ba997 treewide: no need to negate errno for log_*_errno()
It corrrectly handles both positive and negative errno values.
2014-11-28 13:29:21 +01:00
Michal Schmidt 0a1beeb642 treewide: auto-convert the simple cases to log_*_errno()
As a followup to 086891e5c1 "log: add an "error" parameter to all
low-level logging calls and intrdouce log_error_errno() as log calls
that take error numbers", use sed to convert the simple cases to use
the new macros:

find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\("(.*)%s"(.*), strerror\(-([a-zA-Z_]+)\)\);/log_\1_errno(-\4, "\2%m"\3);/'

Multi-line log_*() invocations are not covered.
And we also should add log_unit_*_errno().
2014-11-28 12:04:41 +01:00
Michal Schmidt 2d5c93c7af install, cgtop: adjust hashmap_move_one() callers for -ENOMEM possibility
That hashmap_move_one() currently cannot fail with -ENOMEM is an
implementation detail, which is not possible to guarantee in general.
Hashmap implementations based on anything else than chaining of
individual entries may have to allocate.

hashmap_move_one will not fail with -ENOMEM if a proper reservation has
been made beforehand. Use reservations in install.c.

In cgtop.c simply propagate the error instead of asserting.
2014-10-23 17:38:02 +02:00
Michal Schmidt d5099efc47 hashmap: introduce hash_ops to make struct Hashmap smaller
It is redundant to store 'hash' and 'compare' function pointers in
struct Hashmap separately. The functions always comprise a pair.
Store a single pointer to struct hash_ops instead.

systemd keeps hundreds of hashmaps, so this saves a little bit of
memory.
2014-09-15 16:08:50 +02:00
Zbigniew Jędrzejewski-Szmek 601185b43d Unify parse_argv style
getopt is usually good at printing out a nice error message when
commandline options are invalid. It distinguishes between an unknown
option and a known option with a missing arg. It is better to let it
do its job and not use opterr=0 unless we actually want to suppress
messages. So remove opterr=0 in the few places where it wasn't really
useful.

When an error in options is encountered, we should not print a lengthy
help() and overwhelm the user, when we know precisely what is wrong
with the commandline. In addition, since help() prints to stdout, it
should not be used except when requested with -h or --help.

Also, simplify things here and there.
2014-08-03 21:46:07 -04:00
Lennart Poettering 39883f622f make gcc shut up
If -flto is used then gcc will generate a lot more warnings than before,
among them a number of use-without-initialization warnings. Most of them
without are false positives, but let's make them go away, because it
doesn't really matter.
2014-02-19 17:53:50 +01:00
Lennart Poettering 43a99a7afe build-sys: minor fixes found with cppcheck 2013-12-25 19:00:38 +01:00
Lennart Poettering eb9da376d7 clients: unify how we invoke getopt_long()
Among other things this makes sure we always expose a --version command
and show it in the help texts.
2013-11-06 18:28:39 +01:00