This reverts commit b45c068dd8.
I think the idea was generally sound, but didn't take into account the
limitations of show-environment and how it is used. People expect to be able to
eval systemctl show-environment output in bash, and no escaping syntax is
defined for environment *names* (we only do escaping for *values*). We could
skip such problematic variables in 'systemctl show-environment', and only allow
them to be inherited directly. But this would be confusing and ugly.
The original motivation for this change was that various import operations
would fail. a4ccce22d9 changed systemctl to filter
invalid variables in import-environment.
https://gitlab.gnome.org/GNOME/gnome-session/-/issues/71 does a similar change
in GNOME. So those problematic variables should not cause failures, but just
be silently ignored.
Finally, the environment block is becoming a dumping ground. In my gnome
session 'systemctl show-environment --user' includes stuff like PWD, FPATH
(from zsh), SHLVL=0 (no idea what that is). This is not directly related to
variable names (since all those are allowed under the stricter rules too), but
I think we should start pushing people away from running import-environment and
towards importing only select variables.
https://github.com/systemd/systemd/pull/17188#issuecomment-708676511
There was some confusion about what POSIX says about variable names:
names shall not contain the character '='. For values to be portable
across systems conforming to POSIX.1-2008, the value shall be composed
of characters from the portable character set (except NUL and as
indicated below).
i.e. it allows almost all ASCII in variable names (without NUL and DEL and
'='). OTOH, it says that *utilities* use a smaller set of characters:
Environment variable names used by the utilities in the Shell and
Utilities volume of POSIX.1-2008 consist solely of uppercase letters,
digits, and the <underscore> ( '_' ) from the characters defined in
Portable Character Set and do not begin with a digit.
When enforcing variable names in environment blocks, we need to use this
first definition, so that we can propagate all valid variables.
I think having non-printable characters in variable names is too much, so
I took out the whitespace stuff from the first definition.
OTOH, when we use *shell syntax*, for example doing variable expansion,
it seems enough to support expansion of variables that the shell would allow.
Fixes#14878,
https://bugzilla.redhat.com/show_bug.cgi?id=1754395,
https://bugzilla.redhat.com/show_bug.cgi?id=1879216.
Whenever I see EXTRACT_QUOTES, I'm always confused whether it means to
leave the quotes in or to take them out. Let's say "unquote", like we
say "cunescape".
Let's be more careful with what we serialize: let's ensure we never
serialize strings that are longer than LONG_LINE_MAX, so that we know we
can read them back with read_line(…, LONG_LINE_MAX, …) safely.
In order to implement this all serialization functions are move to
serialize.[ch], and internally will do line size checks. We'd rather
skip a serialization line (with a loud warning) than write an overly
long line out. Of course, this is just a second level protection, after
all the data we serialize shouldn't be this long in the first place.
While we are at it also clean up logging: while serializing make sure to
always log about errors immediately. Also, (void)ify all calls we don't
expect errors in (or catch errors as part of the general
fflush_and_check() at the end.
Let's clean up the failure codepaths, by using _cleanup_.
This relies on the new behaviour of env_append() introduced in the
previous commit that guarantess the list always remains properly NULL
terminated
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
Previously we were a bit sloppy with the index and size types of arrays,
we'd regularly use unsigned. While I don't think this ever resulted in
real issues I think we should be more careful there and follow a
stricter regime: unless there's a strong reason not to use size_t for
array sizes and indexes, size_t it should be. Any allocations we do
ultimately will use size_t anyway, and converting forth and back between
unsigned and size_t will always be a source of problems.
Note that on 32bit machines "unsigned" and "size_t" are equivalent, and
on 64bit machines our arrays shouldn't grow that large anyway, and if
they do we have a problem, however that kind of overly large allocation
we have protections for usually, but for overflows we do not have that
so much, hence let's add it.
So yeah, it's a story of the current code being already "good enough",
but I think some extra type hygiene is better.
This patch tries to be comprehensive, but it probably isn't and I missed
a few cases. But I guess we can cover that later as we notice it. Among
smaller fixes, this changes:
1. strv_length()' return type becomes size_t
2. the unit file changes array size becomes size_t
3. DNS answer and query array sizes become size_t
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=76745
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
The environment variables we've serialized can quite possibly contain
characters outside the set allowed by env_assignment_is_valid(). In
fact, my environment seems to contain a couple of these:
* TERMCAP set by screen contains a '\x7f' character
* BASH_FUNC_module%% variable has a '%' character in name
Strict check of environment variables name and value certainly makes sense for
unit files, but not so much for deserialization of values we already had
in our environment.
Apart from bugs (as in #6152), this can happen if we ever make
our requirements for environment entries more stringent. As with
the rest of deserialization, we should just warn and continue.
Sometimes it's useful to provide a default value during an environment
expansion, if the environment variable isn't already set.
For instance $XDG_DATA_DIRS is suppose to default to:
/usr/local/share/:/usr/share/
if it's not yet set. That means callers wishing to augment
XDG_DATA_DIRS need to manually add those two values.
This commit changes replace_env to support the following shell
compatible default value syntax:
XDG_DATA_DIRS=/foo:${XDG_DATA_DIRS:-/usr/local/share/:/usr/share}
Likewise, it's useful to provide an alternate value during an
environment expansion, if the environment variable isn't already set.
For instance, $LD_LIBRARY_PATH will inadvertently search the current
working directory if it starts or ends with a colon, so the following
is usually wrong:
LD_LIBRARY_PATH=/foo/lib:${LD_LIBRARY_PATH}
To address that, this changes replace_env to support the following
shell compatible alternate value syntax:
LD_LIBRARY_PATH=/foo/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
[zj: gate the new syntax under REPLACE_ENV_ALLOW_EXTENDED switch, so
existing callers are not modified.]
(Only in environment.d files.)
We have only basic compatibility with shell syntax, but specifying variables
without using braces is probably more common, and I think a lot of people would
be surprised if this didn't work.
merge_env_file is a new function, that's like load_env_file, but takes a
pre-existing environment as an input argument. New environment entries are
merged. Variable expansion is performed.
Falling back to the process environment is supported (when a flag is set).
Alternatively this could be implemented as passing an additional fallback
environment array, but later on we're adding another flag to allow braceless
expansion, and the two flags can be combined in one arg, so there's less
stuff to pass around.
strempty() converts a NULL value to empty string, so
that it can be passed on to functions that don't support NULL.
replace_env calls strempty before passing its value on to strappend.
strappend supports NULL just fine, though, so this commit drops the
strempty call.
If an environment array has duplicates, strv_env_get_n returns
the results for the first match. This is wrong, because later
entries in the environment are supposed to replace earlier
entries.
strv_env_replace was calling env_match(), which in effect allowed multiple
values for the same key to be inserted into the environment block. That's
pointless, because APIs to access variables only return a single value (the
latest entry), so it's better to keep the block clean, i.e. with just a single
entry for each key.
Add a new helper function that simply tests if the part before '=' is equal in
two strings and use that in strv_env_replace.
In load_env_file_push, use strv_env_replace to immediately replace the previous
assignment with a matching name.
Afaict, none of the callers are materially affected by this change, but it
seems like some pointless work was being done, if the same value was set
multiple times. We'd go through parsing and assigning the value for each
entry. With this change, we handle just the last one.
This protocol is generally useful, we might just as well reuse it for the
env. generators.
The implementation is changed a bit: instead of making a new strv and freeing
the old one, just mutate the original. This is much faster with larger arrays,
while in fact atomicity is preserved, since we only either insert the new
entry or not, without being in inconsistent state.
v2:
- fix confusion with return value