Systemd

Author	SHA1	Message	Date
Zbigniew Jędrzejewski-Szmek	74ad38ff0e	Merge pull request #3160 from htejun/cgroup-fixes-rev2 Cgroup fixes.	2016-05-07 15:08:57 -04:00
Tejun Heo	13c31542cc	core: add io controller support on the unified hierarchy On the unified hierarchy, blkio controller is renamed to io and the interface is changed significantly. * blkio.weight and blkio.weight_device are consolidated into io.weight which uses the standardized weight range [1, 10000] with 100 as the default value. * blkio.throttle.{read|write}_{bps|iops}_device are consolidated into io.max. Expansion of throttling features is being worked on to support work-conserving absolute limits (io.low and io.high). * All stats are consolidated into io.stats. This patchset adds support for the new interface. As the interface has been revamped and new features are expected to be added, it seems best to treat it as a separate controller rather than trying to expand the blkio settings although we might add automatic translation if only blkio settings are specified. * io.weight handling is mostly identical to blkio.weight[_device] handling except that the weight range is different. * Both read and write bandwidth settings are consolidated into CGroupIODeviceLimit which describes all limits applicable to the device. This makes it less painful to add new limits. * "max" can be used to specify the maximum limit which is equivalent to no config for max limits and treated as such. If a given CGroupIODeviceLimit doesn't contain any non-default configs, the config struct is discarded once the no limit config is applied to cgroup. * lookup_blkio_device() is renamed to lookup_block_device(). Signed-off-by: Tejun Heo <htejun@fb.com>	2016-05-05 16:43:06 -04:00
Lennart Poettering	d8fdc62037	core: use an AF_UNIX/SOCK_DGRAM socket for cgroup agent notification dbus-daemon currently uses a backlog of 30 on its D-bus system bus socket. On overloaded systems this means that only 30 connections may be queued without dbus-daemon processing them before further connection attempts fail. Our cgroups-agent binary so far used D-Bus for its messaging, and hitting this limit hence may result in us losing cgroup empty messages. This patch adds a separate cgroup agent socket of type AF_UNIX/SOCK_DGRAM. Since sockets of these types need no connection set up, no listen() backlog applies. Our cgroup-agent binary will hence simply block as long as it can't enqueue its datagram message, so that we are less likely to lose cgroup empty messages. This also rearranges the ordering of the processing of SIGCHLD signals, service notification messages (sd_notify()...) and the two types of cgroup notifications (inotify for the unified hierarchy support, and agent for the classic hierarchy support). We now always process events for these in the following order: 1. service notification messages (SD_EVENT_PRIORITY_NORMAL-7) 2. SIGCHLD signals (SD_EVENT_PRIORITY_NORMAL-6) 3. cgroup inotify and cgroup agent (SD_EVENT_PRIORITY_NORMAL-5) This is because when receiving SIGCHLD we invalidate PID information, which we need to process the service notification messages which are bound to PIDs. Hence the order between the first two items. And we want to process SIGCHLD metadata to detect whether a service is gone, before using cgroup notifications, to decide when a service is gone, since the former carries more useful metadata. Related to this: https://bugs.freedesktop.org/show_bug.cgi?id=95264 https://github.com/systemd/systemd/issues/1961	2016-05-05 12:37:04 +02:00
Tejun Heo	ccf78df1fc	core: make unit_has_mask_realized() consider controller enable state unit_has_mask_realized() determines whether the specified unit has its cgroups set up properly given the desired target_mask; however, on the unified hierarchy, controllers need to be enabled explicitly for children and the mask of enabled controllers can deviate from target_mask. Only considering target_mask in unit_has_mask_realized() can lead to false positives and skipping enabling the requested controllers. This patch adds unit->cgroup_enabled_mask to track which controllers are enabled and updates unit_has_mask_realized() to also consider enable_mask. Signed-off-by: Tejun Heo <htejun@fb.com>	2016-04-30 16:12:54 -04:00
Lennart Poettering	463d0d1569	core: remove ManagerRunningAs enum Previously, we had two enums ManagerRunningAs and UnitFileScope, that were mostly identical and converted from one to the other all the time. The latter had one more value UNIT_FILE_GLOBAL however. Let's simplify things, and remove ManagerRunningAs and replace it by UnitFileScope everywhere, thus making the translation unnecessary. Introduce two new macros MANAGER_IS_SYSTEM() and MANAGER_IS_USER() to simplify checking if we are running in the system or the user context.	2016-04-12 13:43:30 +02:00
Tejun Heo	ab2c3861dc	core: update populated event handling in unified hierarchy Earlier during the development of unified hierarchy, the populated event was reported through the dedicated "cgroup.populated" file; however, the interface was updated so that it's reported through the "populated" field of "cgroup.events" file. Update populated event handling logic accordingly.	2016-03-26 12:05:57 -04:00
Daniel Mack	50f48ad37a	cgroup: remove support for NetClass= directive Support for net_cls.class_id through the NetClass= configuration directive has been added in v227 in preparation for a per-unit packet filter mechanism. However, it turns out the kernel people have decided to deprecate the net_cls and net_prio controllers in v2. Tejun provides a comprehensive justification for this in his commit, which has landed during the merge window for kernel v4.5: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=bd1060a1d671 As we're aiming for full support for the v2 cgroup hierarchy, we can no longer support this feature. Userspace tools such as nftables are moving over to setting rules that are specific to the full cgroup path of a task, which obsoletes these controllers anyway. This commit removes support for tweaking details in the net_cls controller, but keeps the NetClass= directive around for legacy compatibility reasons.	2016-02-10 16:38:56 +01:00
Daniel Mack	b26fa1a2fb	tree-wide: remove Emacs lines from all files This should be handled fine now by .dir-locals.el, so no need to carry that stuff in every file.	2016-02-10 13:41:57 +01:00
Lennart Poettering	077ba06eaa	core: don't generate warnings when write access to the cgroup fs fails in --user due to EACCES After all, in the classic hierarchy that's pretty much the default case.	2015-11-17 00:52:10 +01:00
Lennart Poettering	7760171904	util-lib: move inotify-related definitions to fs-util.[ch]	2015-10-27 14:58:05 +01:00
Lennart Poettering	6bc73acb01	process-util: rename get_parent_of_pid() → get_process_ppid() In order to match the other get_process_xyz() calls.	2015-10-27 14:01:48 +01:00
Lennart Poettering	b5efdb8af4	util-lib: split out allocation calls into alloc-util.[ch]	2015-10-27 13:45:53 +01:00
Lennart Poettering	8b43440b7e	util-lib: move string table stuff into its own string-table.[ch]	2015-10-27 13:25:56 +01:00
Lennart Poettering	0d39fa9c69	util-lib: move more file I/O related calls into fileio.[ch]	2015-10-27 13:25:55 +01:00
Lennart Poettering	6bedfcbb29	util-lib: split string parsing related calls from util.[ch] into parse-util.[ch]	2015-10-27 13:25:55 +01:00
Lennart Poettering	3ffd4af220	util-lib: split out fd-related operations into fd-util.[ch] There are more than enough to deserve their own .c file, hence move them over.	2015-10-25 13:19:18 +01:00
Lennart Poettering	07630cea1f	util-lib: split our string related calls from util.[ch] into its own file string-util.[ch] There are more than enough calls doing string manipulations to deserve its own files, hence do something about it. This patch also sorts the #include blocks of all files that needed to be updated, according to the sorting suggestions from CODING_STYLE. Since pretty much every file needs our string manipulation functions this effectively means that most files have sorted #include blocks now. Also touches a few unrelated include files.	2015-10-24 23:05:02 +02:00
Daniel Mack	32ee7d3309	cgroup: add support for net_cls controllers Add a new config directive called NetClass= to CGroup enabled units. Allowed values are positive numbers for fixed assignments and "auto" for picking a free value automatically, for which we need to keep track of dynamically assigned net class IDs of units. Introduce a hash table for this, and also record the last ID that was given out, so the allocator can start its search for the next 'hole' from there. This could eventually be optimized with something like an irb. The class IDs up to 65536 are considered reserved and won't be assigned automatically by systemd. This barrier can be made a config directive in the future. Values set in unit files are stored in the CGroupContext of the unit and considered read-only. The actually assigned number (which may have been chosen dynamically) is stored in the unit itself and is guaranteed to remain stable as long as the unit is active. In the CGroup controller, set the configured CGroup net class to net_cls.classid. Multiple units may share the same net class ID, and those which do are linked together.	2015-09-16 00:21:55 +02:00
Lennart Poettering	e7ab4d1ac9	cgroup: unify how we invalidate cgroup controller settings Let's make sure that we follow the same codepaths when adjusting a cgroup property via the dbus SetProperty() call, and when we execute the StartupCPUShares= effect.	2015-09-11 18:31:50 +02:00
Lennart Poettering	d53d94743c	core: refactor cpu shares/blockio weight cgroup logic Let's stop using the "unsigned long" type for weights/shares, and let's just use uint64_t for this, as that's what we expose on the bus. Unify parsers, and always validate the range for these fields. Correct the default blockio weight to 500, since that's what the kernel actually uses. When parsing the weight/shares settings from unit files accept the empty string as a way to reset the weight/shares value. When getting it via the bus, uniformly map (uint64_t) -1 to unset. Open up StartupCPUShares= and StartupBlockIOWeight= to transient units.	2015-09-11 18:31:49 +02:00
Lennart Poettering	03a7b521e3	core: add support for the "pids" cgroup controller This adds support for the new "pids" cgroup controller of 4.3 kernels. It allows accounting the number of tasks in a cgroup and enforcing limits on it. This adds two new settings TasksAccounting= and TasksMax= to each unit, as well as a global option DefaultTasksAccounting=. This also updates "cgtop" to optionally make use of the new kernel-provided accounting. systemctl has been updated to show the number of tasks for each service if it is available. This patch also adds correct support for undoing memory limits for units using a MemoryLimit=infinity syntax. We do the same for TasksMax= now and hence keep things in sync here.	2015-09-10 18:41:06 +02:00
Lennart Poettering	3905f12713	cgroups: make sure the "devices" controller's enum is named the same way as the controller in the kernel Follow-up to `5bf8002a3a`.	2015-09-08 18:15:50 +02:00
Lennart Poettering	19af675e99	cgroups: delegation to unprivileged services is safe in the unified hierarchy Delegation to unprivileged processes is safe in the unified hierarchy, hence allow it. This has the benefit of permitting "systemd --user" instances to further partition their resources between user services.	2015-09-04 09:23:07 +02:00
Lennart Poettering	b3ac818be8	core: split up manager_get_unit_by_pid() Let's move the actual cgroup part of it into a new separate function manager_get_unit_by_pid_cgroup(), and then make manager_get_unit_by_pid() just a wrapper that also checks the two pid hashmaps. Then, let's make sure the various calls that want to deliver events to the owners of a PID check both hashmaps and the cgroup and deliver the event to each of them. OTOH make sure bus calls like GetUnitByPID() continue to check the PID hashmaps first and the cgroup only as fallback.	2015-09-04 09:07:31 +02:00
Lennart Poettering	fea72cc033	macro: introduce new PID_TO_PTR macros and make use of them This adds a new PID_TO_PTR() macro, plus PTR_TO_PID() and makes use of it wherever we maintain processes in a hash table. Previously we sometimes used LONG_TO_PTR() and other times ULONG_TO_PTR() for that, hence let's make this more explicit and clean up things.	2015-09-04 09:07:30 +02:00
Thomas Hindoe Paaboel Andersen	b3c5bad3d6	tree-wide: fix indentation	2015-09-02 20:46:42 +02:00
Lennart Poettering	efdb02375b	core: unified cgroup hierarchy support This patch set adds full support for the new unified cgroup hierarchy logic of modern kernels. A new kernel command line option "systemd.unified_cgroup_hierarchy=1" is added. If specified the unified hierarchy is mounted to /sys/fs/cgroup instead of a tmpfs. No further hierarchies are mounted. The kernel command line option defaults to off. We can turn it on by default as soon as the kernel's APIs regarding this are stabilized (but even then downstream distros might want to turn this off, as this will break any tools that access cgroupfs directly). It is possible to choose for each boot individually whether the unified or the legacy hierarchy is used. nspawn will by default provide the legacy hierarchy to containers if the host is using it, and the unified otherwise. However it is possible to run containers with the unified hierarchy on a legacy host and vice versa, by setting the $UNIFIED_CGROUP_HIERARCHY environment variable for nspawn to 1 or 0, respectively. The unified hierarchy provides reliable cgroup empty notifications for the first time, via inotify. To make use of this we maintain one manager-wide inotify fd, and add each cgroup to it. This patch also removes cg_delete() which is unused now. On kernel 4.2 only the "memory" controller is compatible with the unified hierarchy, hence that's the only controller systemd exposes when booted in unified hierarchy mode. This introduces a new enum for enumerating supported controllers, plus a related enum for the mask bits mapping to it. The core is changed to make use of this everywhere. This moves PID 1 into a new "init.scope" implicit scope unit in the root slice. This is necessary since on the unified hierarchy cgroups may either contain subgroups or processes but not both. PID 1 hence has to move out of the root cgroup (strictly speaking the root cgroup is the only one where processes and subgroups are still allowed, but in order to support containers nicely, we move PID 1 into the new scope in all cases.) This new unit is also used on legacy hierarchy setups. It's actually pretty useful on all systems, as it can then be used to filter journal messages coming from PID 1, and so on. The root slice ("-.slice") is now implicitly created and started (and does not require a unit file on disk anymore), since that's where "init.scope" is located and the slice needs to be started before the scope can. To check whether we are in unified or legacy hierarchy mode we use statfs() on /sys/fs/cgroup. If the .f_type field reports tmpfs we are in legacy mode, if it reports cgroupfs we are in unified mode. This patch set carefully makes sure that cgls and cgtop continue to work as desired. When invoking nspawn as a service it will implicitly create two subcgroups in the cgroup it is using, one to move the nspawn process into, the other to move the actual container processes into. This is done because of the requirement that cgroups may either contain processes or other subgroups.	2015-09-01 23:52:27 +02:00
Lennart Poettering	5fe8876b32	core: when looking for the unit for a process, look at the PID hashmaps first It's cheaper than going to cgroupfs, and also usually the better choice since it's not racy and can map PIDs even if they were moved to a different unit.	2015-09-01 18:47:46 +02:00
Lennart Poettering	6f883237f1	cgroup: drop "ignore_self" argument from cg_is_empty() In all cases where the function (or cg_is_empty_recursive()) is called, ignoring the calling process is actually wrong, as a process keeps a cgroup busy regardless of whether it's the current one or another. Hence, let's simplify things and drop the "ignore_self" parameter.	2015-09-01 18:37:01 +02:00
Lennart Poettering	e9db43d591	units: enable waiting for unit termination in certain cases The legacy cgroup hierarchy does not support reliable empty notifications in containers and if there are left-over subgroups in a cgroup. This makes it hard to correctly wait for them running empty, and thus we previously disabled this logic entirely. With this change we explicitly check for the container case, and whether the unit is a "delegation" unit (i.e. one where programs may create their own subgroups). If we are neither in a container, nor operating on a delegation unit, cgroup empty notifications become reliable and thus we start waiting for the empty notifications again. This doesn't really fix the general problem around cgroup notifications but reduces the effect around it. (This also reorders #include lines by their focus, as suggested in CODING_STYLE. We have to add "virt.h", so let's do that at the right place.) Also see #317.	2015-09-01 17:44:17 +02:00
Lennart Poettering	35b7ff80e2	unit: add new macros to test for unit contexts	2015-08-31 13:20:43 +02:00
Lennart Poettering	b2c23da8ce	core: rename SystemdRunningAs to ManagerRunningAs It's primarily just a property of the Manager object after all, and we try to refer to PID 1 as "manager" instead of "systemd", hence let's stick to this here too.	2015-05-11 22:51:49 +02:00
Ronny Chevalier	0b452006de	shared: add process-util.[ch]	2015-04-10 23:54:49 +02:00
Lennart Poettering	5ad096b3f1	core: expose consumed CPU time per unit This adds support for showing the accumulated consumed CPU time per-unit in the "systemctl status" output. The property is also readable via the bus.	2015-03-02 12:15:25 +01:00
Zbigniew Jędrzejewski-Szmek	a3bd89ea99	core/cgroup: fix embarrassing typo https://github.com/docker/docker/issues/10280	2015-01-31 23:03:56 -05:00
Torstein Husebø	cc98b3025e	treewide: fix multiple typos	2015-01-26 10:39:47 -05:00
Daniel Mack	71c2687360	cgroup: fix typo	2015-01-19 18:34:17 +01:00
Zbigniew Jędrzejewski-Szmek	7539904965	cgroup: memory limits on / are not supported	2015-01-05 19:04:10 -05:00
Zbigniew Jędrzejewski-Szmek	6da139137e	cgroup: fix error message systemd[1]: Failed to set memory.limit_in_bytes on : Invalid argument	2015-01-05 19:04:10 -05:00
Lennart Poettering	714e2e1d56	cgroup: downgrade log messages when we cannot write to cgroup trees that are mounted read-only	2015-01-05 01:40:51 +01:00
Lennart Poettering	7b3fd6313c	scope: make attachment of initial PIDs a bit more robust	2014-12-10 22:06:44 +01:00
Lennart Poettering	0cd385d318	core: don't migrate PIDs for units that may contain subcgroups, do this only for leaf units Otherwise a slice or delegation unit might move PIDs around ignoring the fact that it is attached to a subcgroup.	2014-12-10 20:38:24 +01:00
Lennart Poettering	b1491eba40	core: rename unit_destroy_cgroup() to unit_destroy_cgroup_if_empty() since it's not quite as destructive as it sounds nowadays	2014-12-09 02:31:42 +01:00
Ross Lagerwall	dab5bf8599	cgroup: Handle error when destroying cgroup If a cgroup fails to be destroyed (most likely because there are still processes running as part of a service after the main pid exits), don't free and remove the cgroup unit from the manager. This fixes a regression introduced by the cgroup rework in v205 where systemd would forget about processes still running after the unit becomes inactive. (This can happen when the main pid exits and KillMode=process or none).	2014-12-09 02:28:09 +01:00
Michal Schmidt	4a62c710b6	treewide: another round of simplifications Using the same scripts as in `f647962d64` "treewide: yet more log_*_errno + return simplifications".	2014-11-28 19:57:32 +01:00
Michal Schmidt	56f64d9576	treewide: use log_*_errno whenever %m is in the format string If the format string contains %m, clearly errno must have a meaningful value, so we might as well use log_*_errno to have ERRNO= logged. Using: find . -name '*.[ch]' | xargs sed -r -i -e \ 's/log_(debug|info|notice|warning|error|emergency)\((".*%m.*")/log_\1_errno(errno, \2/' Plus some whitespace, linewrap, and indent adjustments.	2014-11-28 19:49:27 +01:00
Michal Schmidt	23bbb0de4e	treewide: more log_*_errno + return simplifications	2014-11-28 18:24:30 +01:00
Michal Schmidt	da927ba997	treewide: no need to negate errno for log_*_errno() It correctly handles both positive and negative errno values.	2014-11-28 13:29:21 +01:00
Michal Schmidt	0a1beeb642	treewide: auto-convert the simple cases to log_*_errno() As a followup to `086891e5c1` "log: add an "error" parameter to all low-level logging calls and introduce log_error_errno() as log calls that take error numbers", use sed to convert the simple cases to use the new macros: find . -name '*.[ch]' | xargs sed -r -i -e \ 's/log_(debug|info|notice|warning|error|emergency)\("(.*)%s"(.*), strerror\(-([a-zA-Z_]+)\)\);/log_\1_errno(-\4, "\2%m"\3);/' Multi-line log_*() invocations are not covered. And we also should add log_unit_*_errno().	2014-11-28 12:04:41 +01:00
Lennart Poettering	a931ad47a8	core: introduce new Delegate=yes/no property controlling creation of cgroup subhierarchies For privileged units this resource control property ensures that the processes have all controllers systemd manages enabled. For unprivileged services (those with User= set) this ensures that access rights to the service cgroup are granted to the user in question, to create further subgroups. Note that this only applies to the name=systemd hierarchy though, as access to other controllers is not safe for unprivileged processes. Delegate=yes should be set for container scopes where a systemd instance inside the container shall manage the hierarchies below its own cgroup and have access to all controllers. Delegate=yes should also be set for user@.service, so that systemd --user can run, controlling its own cgroup tree. This commit changes machined, systemd-nspawn@.service and user@.service to set this boolean, in order to ensure that container management will just work, and the user systemd instance can run fine.	2014-11-05 18:49:14 +01:00
Zbigniew Jędrzejewski-Szmek	b1d6dcf5a5	Do not format USEC_INFINITY as NULL systemctl would print 'CPUQuotaPerSecUSec=(null)' for no limit. This does not look right. Since USEC_INFINITY is one of the valid values, format_timespan() could return NULL, and we should wrap every use of it in strna() or similar. But most callers didn't do that, and it seems more robust to return a string ("infinity") that makes sense most of the time, even if in some places the result will not be grammatically correct.	2014-09-29 11:09:39 -04:00
Lennart Poettering	d81afec1c9	core: split up "starting" manager state into "initializing" and "starting" We'll stay in "initializing" until basic.target has been reached, at which point we will enter "starting". This is preparation so that we can change the startup timeout to only apply to the first phase of startup, not the full procedure.	2014-08-22 18:10:31 +02:00
Lennart Poettering	1aeab12b19	cgroup: only generate warnings if actually writing to cgroup attributes failed	2014-08-15 18:14:37 +02:00
Lennart Poettering	6b2f67b31c	cgroup: downgrade log messages about non-existent cgroup attributes to LOG_DEBUG	2014-08-15 11:57:07 +02:00
Kay Sievers	3a43da2832	time-util: add and use USEC/NSEC_INFINITY	2014-07-29 13:20:20 +02:00
Zbigniew Jędrzejewski-Szmek	0d8c31ff72	test-engine: fix access to unit load path Also add a bit of debugging output to help diagnose problems, add missing units, and simplify cppflags. Move test-engine to normal tests from manual tests, it should now work without destroying the system.	2014-07-20 19:48:16 -04:00
Lennart Poettering	9a05490933	cgroups: simplify CPUQuota= logic Only accept cpu quota values in percentages, get rid of period definition. It's not clear whether the CFS period controllable per-cgroup even has a future in the kernel, hence let's simplify all this, hardcode the period to 100ms and only accept percentage based quota values.	2014-05-22 11:53:12 +09:00
Lennart Poettering	637f421e5c	cgroups: always propagate controller membership to siblings, for all controllers This is the behaviour the kernel cgroup rework exposes for all controllers, hence let's do this already now for all cases.	2014-05-22 07:50:03 +09:00
Lennart Poettering	db785129c9	cgroup: rework startup logic Introduce a (unsigned long) -1 as "unset" state for cpu shares/block io weights, and keep the startup unit set around all the time.	2014-05-22 07:13:56 +09:00
WaLyong Cho	95ae05c0e7	core: add startup resource control option Similar to CPUShares= and BlockIOWeight= respectively, but the specified weight is only assigned during startup. After startup, each control group attribute is re-assigned the weight given by CPUShares=weight and BlockIOWeight=weight. If CPUShares= or BlockIOWeight= is not specified, the attribute is re-assigned its default value (default cpu.shares=1024, blkio.weight=1000). If only CPUShares=weight or BlockIOWeight=weight is specified, that implies StartupCPUShares=weight and StartupBlockIOWeight=weight.	2014-05-22 07:13:56 +09:00
Łukasz Stelmach	cd7affaeea	core: check the right variable for failed open()	2014-05-08 13:24:34 +02:00
Kay Sievers	99a17ada9c	core: require cgroups filesystem to be available We should no longer pretend that we can run in any sensible way without the kernel supporting us with cgroups functionality.	2014-05-05 18:52:36 +02:00
Lennart Poettering	b2f8b02ec2	core: expose CFS CPU time quota as high-level unit properties	2014-04-25 13:27:25 +02:00
Lennart Poettering	7d711efb9c	core: make sure we can combine DevicePolicy=closed with PrivateDevices=yes if PrivateDevices=yes is used we need to make sure we can still create /dev/null and so on.	2014-03-19 22:00:43 +01:00
Lennart Poettering	03e334a1c7	util: replace close_nointr_nofail() by a more useful safe_close() safe_close() automatically becomes a NOP when a negative fd is passed, and returns -1 unconditionally. This makes it easy to write lines like this: fd = safe_close(fd); Which will close an fd if it is open, and reset the fd variable correctly. By making use of this new scheme we can drop > 200 lines of code that were required to test for non-negative fds or to reset the closed fd variable afterwards.	2014-03-18 19:31:34 +01:00
Lennart Poettering	e41969e3d1	core: support globbing matches in DeviceAllow= when checking for device groups	2014-03-11 17:43:41 +01:00
Lennart Poettering	01efdf13a6	cgroup: certain cgroup attributes are not available in the root cgroup, hence don't bother	2014-02-24 03:38:58 +01:00
Lennart Poettering	90060676c4	cgroup: Extend DeviceAllow= syntax to whitelist groups of devices, not just particular devices nodes	2014-02-22 03:05:34 +01:00
Lennart Poettering	d4fdc205a4	update TODO	2014-02-19 18:20:12 +01:00
Jan Engelhardt	73e231abde	doc: update punctuation Resolve spotted issues related to missing or extraneous commas, dashes.	2014-02-17 19:03:07 -05:00
Lennart Poettering	03b90d4bad	core: find the closest parent slice that has a specific cgroup controller enabled when enabling/disabling cgroup controllers for units	2014-02-17 15:49:21 +01:00
Lennart Poettering	bc432dc7eb	core: rework cgroup mask propagation Previously a cgroup setting down tree would result in cgroup membership additions being propagated up the tree and to the siblings, however a unit could never lose cgroup memberships again. With this change we'll make sure that both cgroup additions and removals propagate properly.	2014-02-17 15:49:21 +01:00
David Strauss	6414b7c981	cgroups: Cache controller masks and optimize queues.	2013-11-22 11:22:47 +10:00
Zbigniew Jędrzejewski-Szmek	a94042fa9b	systemd: fix memory leak in cgroup code If the unit already was in the hashmap, path would be leaked.	2013-11-09 19:02:53 -05:00
David Strauss	f366954523	Comment spelling fixes.	2013-11-06 20:03:18 +10:00
Lennart Poettering	15c60e99a9	cgroup: run PID 1 in the root cgroup This way cleaning up the cgroup tree on shutdown is a lot easier since we are in the root dir. Also PID 1 was previously artificially placed in system.slice, even though our rule actually was not to have processes in slices. The root slice otoh is magic anyway, so having PID 1 in there sounds less surprising. Of course, this means that PID 1 is scheduled against the three top-level slices.	2013-11-06 02:12:21 +01:00
Lennart Poettering	71fda00f32	list: make our list macros a bit easier to use by not requiring type spec on each invocation We can determine the list entry type via the typeof() gcc construct, and so we should make the macros much shorter to use.	2013-10-14 06:11:19 +02:00
Lennart Poettering	13b84ec7df	cgroup: if we do a cgroup operation then do something on all supported controllers Previously we did operations like attach, trim or migrate only on the controllers that were enabled for a specific unit. With this change we will now do them for all supported controllers, and fall back to all possible prefix paths if the specified paths do not exist. This fixes issues if a controller is being disabled for a unit where it was previously enabled, and makes sure that all processes stay as "far down" the tree as groups exist.	2013-09-25 03:38:17 +02:00
Lennart Poettering	e58cec11e6	cgroup: always enable memory.use_hierarchy= for all cgroups in the memory hierarchy The non-hierarchical mode contradicts the whole idea of a cgroup tree so let's not support this. In the future the kernel will only support the hierarchical logic anyway.	2013-09-23 16:02:31 -05:00
Lennart Poettering	ddca82aca0	cgroup: get rid of MemorySoftLimit= The cgroup attribute memory.soft_limit_in_bytes is unlikely to stay around in the kernel for good, so let's not expose it for now. We can readd something like it later when the kernel guys decided on a final API for this.	2013-09-17 14:58:00 -05:00
Gao feng	112a7f4696	cgroup: add missing equals for BlockIOWeight	2013-09-16 09:19:00 -04:00
Lukas Nykryn	81c68af03f	core/cgroup: first print then free	2013-09-13 14:40:58 +02:00
Gao feng	6a94f2e938	cgroup: fix incorrectly setting memory cgroup If the memory_limit of a unit is -1, we should write "-1" to the file memory.limit_in_bytes, not (uint64_t) -1; otherwise memory.limit_in_bytes will be set to zero.	2013-09-13 14:32:14 +02:00
Gao feng	84121bc2ee	cgroup: correct the log information it should be memory.soft_limit_in_bytes.	2013-09-13 14:32:14 +02:00
Gao feng	15b4a7548f	cgroup: add the missing setting of variable's value set the value of variable "r" to the return value of cg_set_attribute.	2013-09-13 14:32:14 +02:00
Harald Hoyer	b58b8e11c5	Do not realloc strings, which are already in the hashmap as keys This prevents corruption of the hashmap, because we would free() the keys in the hashmap, if the unit is already in there, with the same cgroup path.	2013-08-28 16:02:57 +02:00
Harald Hoyer	3d040cf244	Revert "cgroup.c: check return value of unit_realize_cgroup_now()" This reverts commit `1f11a0cdfe`.	2013-08-28 16:02:39 +02:00
Harald Hoyer	1f11a0cdfe	cgroup.c: check return value of unit_realize_cgroup_now() do not recurse further, if unit_realize_cgroup_now() failed	2013-08-23 18:46:51 +02:00
Lennart Poettering	8e7076caae	cgroup: split out per-device BlockIOWeight= setting into BlockIODeviceWeight= This way we can nicely map the configuration directive to properties and back, without requiring two different signatures for the same property.	2013-07-11 20:40:18 +02:00
Lennart Poettering	8a84192905	cgroup: don't ever try to destroy the cgroup of the root slice The root slice is after all the root cgroup, so don't attempt to delete it.	2013-07-11 18:49:52 +02:00
Lennart Poettering	be2c1bd2a8	cgroup: don't move systemd into system.slice when running as --user instance	2013-07-11 18:49:52 +02:00
Lennart Poettering	376dd21dc0	cgroup: downgrade error message when we cannot remove a cgroup to debug Some units set KillMode=none to survive the initrd→rootfs transition. We cannot remove their cgroups, but that shouldn't really be considered an issue, so let's downgrade the error message.	2013-07-10 23:41:03 +02:00
Lennart Poettering	06025d9148	core: don't consider a unit's cgroup empty if only a subcgroup runs empty	2013-07-02 16:24:13 +02:00
Lennart Poettering	b56c28c31a	cgroup: implicitly add units to GC queue when their cgroups run empty	2013-07-01 00:17:59 +02:00
Lennart Poettering	0a1eb06d9a	cgroup: readd proper cgroup empty tracking	2013-07-01 00:17:59 +02:00
Lennart Poettering	4ad490007b	core: general cgroup rework Replace the very generic cgroup hookup with a much simpler one. With this change only the high-level cgroup settings remain, the ability to set arbitrary cgroup attributes is removed, so is support for adding units to arbitrary cgroup controllers or setting arbitrary paths for them (especially paths that are different for the various controllers). This also introduces a new -.slice root slice, that is the parent of system.slice and friends. This enables easy admin configuration of root-level cgroup properties. This replaces DeviceDeny= by DevicePolicy=, and implicitly adds in /dev/null, /dev/zero and friends if DeviceAllow= is used (unless this is turned off by DevicePolicy=).	2013-06-27 04:17:34 +02:00
Lennart Poettering	9444b1f20e	logind: add infrastructure to keep track of machines, and move to slices - This changes all logind cgroup objects to use slice objects rather than fixed cgroup locations. - logind can now collect minimal information about running VMs/containers. As fixed cgroup locations can no longer be used we need an entity that keeps track of machine cgroups in whatever slice they might be located. Since logind already keeps track of users, sessions and seats this is a trivial addition. - nspawn will now register with logind and pass various bits of metadata along. A new option "--slice=" has been added to place the container in a specific slice. - loginctl gained commands to list, introspect and terminate machines. - user.slice and machine.slice will now be pulled in by logind.service, since only logind.service requires this slice.	2013-06-20 03:49:59 +02:00
Lennart Poettering	a016b9228f	core: add new .slice unit type for partitioning systems In order to prepare for the kernel cgroup rework, let's introduce a new unit type to systemd, the "slice". Slices can be arranged in a tree and are useful to partition resources freely and hierarchically by the user. Each service unit can now be assigned to one of these slices, and later on login users and machines may too. Slices translate pretty directly to the cgroup hierarchy, and the various objects can be assigned to any of the slices in the tree.	2013-06-17 21:36:51 +02:00
Lennart Poettering	7027ff61a3	nspawn: introduce the new /machine/ tree in the cgroup tree and move containers there Containers will now carry a label (normally derived from the root directory name, but configurable by the user), and the container's root cgroup is /machine/<label>. This label is called "machine name", and can cover both containers and VMs (as soon as libvirt also makes use of /machine/). libsystemd-login can be used to query the machine name from a process. This patch also includes numerous clean-ups for the cgroup code.	2013-04-16 04:41:21 +02:00
Lennart Poettering	a32360f1a5	core: always create /user and /machine top-level cgroup dirs This allows clients to put inotify watches on these trees to watch for state changes, without having to wait until these dirs are created. This introduces the new top-level /machine cgroup dir as canonical location where OS containers and VMs shall be located (as discussed with the libvirt folks).	2013-04-15 21:59:04 +02:00
Lennart Poettering	974efc4658	cgroup: always keep access mode of 'tasks' and 'cgroup.procs' files in cgroup directories in sync	2013-04-08 18:22:47 +02:00
Lennart Poettering	8e70580bb0	cgroup: minor optimization	2013-03-22 15:46:49 +01:00
Lennart Poettering	246aa6dd9d	core: add bus API and systemctl commands for altering cgroup parameters during runtime	2013-01-14 21:24:57 +01:00
Zbigniew Jędrzejewski-Szmek	67445f4e22	core: move ManagerRunningAs to shared Note: I did s/MANAGER/SYSTEMD/ everywhere, even though it makes the patch quite verbose. Nevertheless, keeping MANAGER prefix in some places, and SYSTEMD prefix in others would just lead to confusion down the road. Better to rip off the band-aid now.	2012-09-18 19:53:34 +02:00
Shawn Landden	0d0f0c50d3	log.h: new log_oom() -> int -ENOMEM, use it also a number of minor fixups and bug fixes: spelling, oom errors that didn't print errors, not properly forwarding error codes, few more consistency issues, et cetera	2012-07-26 11:48:26 +02:00
Shawn Landden	669241a076	use "Out of memory." consistently (or with "\n") glibc/glib both use "out of memory" consistently so maybe we should consider that instead of this. Eliminates one string out of a number of binaries. Also fixes extra newline in udev/scsi_id	2012-07-25 11:23:57 +02:00
Lennart Poettering	b7def68494	util: rename join() to strjoin() This is to match strappend() and the other string related functions.	2012-07-13 13:41:01 +02:00
Kay Sievers	9eb977db5b	util: split-out path-util.[ch]	2012-05-08 02:33:10 +02:00
Lennart Poettering	88f3e0c91f	service: explicitly remove control/ subcgroup after each control command The kernel will only notify us of cgroups running empty if no subcgroups exist anymore. Hence make sure we don't leave our own control/ subcgroup around longer than necessary. https://bugzilla.redhat.com/show_bug.cgi?id=818381	2012-05-03 21:54:44 +02:00
Lennart Poettering	b59e246565	logind: remove redundant entries from logind's default controller lists too	2012-04-16 19:15:00 +02:00
Lennart Poettering	9156e799a2	manager: remove unavailable/redundant entries from default controllers list	2012-04-16 18:59:07 +02:00
Lennart Poettering	3474ae3c7e	cgroup: if a controller is not available don't try to create cgroups in its hierarchy	2012-04-16 18:59:07 +02:00
Lennart Poettering	ecedd90fcd	service: place control command in subcgroup control/ Previously, we were brutally and unconditionally killing all processes in a service's cgroup before starting the service anew, in order to ensure that StartPre lines cannot be misused to spawn long-running processes. On logind-less systems this has the effect that restarting sshd necessarily kills all active ssh sessions, which is usually not desirable. With this patch control processes for a service are placed in a sub-cgroup called "control/". When starting a service anew we simply kill this cgroup, but not the main cgroup, in order to avoid killing any long-running non-control processes from previous runs. https://bugzilla.redhat.com/show_bug.cgi?id=805942	2012-04-13 23:29:59 +02:00
Lennart Poettering	5430f7f2bc	relicense to LGPLv2.1 (with exceptions) We finally got the OK from all contributors with non-trivial commits to relicense systemd from GPL2+ to LGPL2.1+. Some udev bits continue to be GPL2+ for now, but we are looking into relicensing them too, to allow free copy/paste of all code within systemd. The bits that used to be MIT continue to be MIT. The big benefit of the relicensing is that closed source code may now link against libsystemd-login.so and friends.	2012-04-12 00:24:39 +02:00
Kay Sievers	b30e2f4c18	move libsystemd_core.la sources into core/	2012-04-11 16:03:51 +02:00

265 commits