glibc/sysdeps/unix/sysv/linux/internal-signals.h

92 lines
2.7 KiB
C
Raw Normal View History

/* Special use of signals internally. Linux version.
Copyright (C) 2014-2018 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
#ifndef __INTERNAL_SIGNALS_H
# define __INTERNAL_SIGNALS_H
posix: New Linux posix_spawn{p} implementation This patch implements a new posix_spawn{p} implementation for Linux. The main difference is it uses the clone syscall directly with CLONE_VM and CLONE_VFORK flags and a direct allocated stack. The new stack and start function solves most the vfork limitation (possible parent clobber due stack spilling). The remaning issue are related to signal handling: 1. That no signal handlers must run in child context, to avoid corrupt parent's state. 2. Child must synchronize with parent to enforce stack deallocation and to possible return execv issues. The first one is solved by blocking all signals in child, even NPTL-internal ones (SIGCANCEL and SIGSETXID). The second issue is done by a stack allocation in parent and a synchronization with using a pipe or waitpid (in case or error). The pipe has the advantage of allowing the child signal an exec error (checked with new tst-spawn2 test). There is an inherent race condition in pipe2 usage for architectures that do not support the syscall directly. In such cases the a pipe plus fctnl is used instead and it may lead to file descriptor leak in parent (as decribed by fcntl documentation). The child process stack is allocate with a mmap with MAP_STACK flag using default architecture stack size. Although it is slower than use a stack buffer from parent, it allows some slack for the compatibility code to run scripts with no shebang (which may use a buffer with size depending of argument list count). Performance should be similar to the vfork default posix implementation and way faster than fork path (vfork on mostly linux ports are basically clone with CLONE_VM plus CLONE_VFORK). The only difference is the syscalls required for the stack allocation/deallocation. It fixes BZ#10354, BZ#14750, and BZ#18433. Tested on i386, x86_64, powerpc64le, and aarch64. [BZ #14750] [BZ #10354] [BZ #18433] * include/sched.h (__clone): Add hidden prototype. (__clone2): Likewise. * include/unistd.h (__dup): Likewise. * posix/Makefile (tests): Add tst-spawn2. * posix/tst-spawn2.c: New file. * sysdeps/posix/dup.c (__dup): Add hidden definition. * sysdeps/unix/sysv/linux/aarch64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/alpha/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/arm/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/hppa/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/i386/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/ia64/clone2.S (__clone): Likewise. * sysdeps/unix/sysv/linux/m68k/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/microblaze/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/mips/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/nios2/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/sh/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/tile/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/x86_64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/nptl-signals.h (____nptl_is_internal_signal): New function. * sysdeps/unix/sysv/linux/spawni.c: New file.
2016-01-19 20:33:32 +01:00
#include <signal.h>
Remove __need macros from signal.h. The types affected are __sig_atomic_t, sig_atomic_t, __sigset_t, sigset_t, sigval_t, sigevent_t, and siginfo_t. __sig_atomic_t is a scalar, so it's now directly available from bits/types.h. The others get bits/types/ headers. Side effects include: There have been small changes to which non-signal headers expose which subset of the signal-related types. A couple of architectures' nested siginfo_t fields had to be renamed to prevent undesired macro expansion. Internal code that wants to manipulate signal masks must now include <sigsetops.h> (which is not installed) and should be aware that __sigaddset, __sigandset, __sigdelset, __sigemptyset, and __sigorset no longer return a value (unlike the public API). Relatedly, the public signal.h no longer declares any of those functions. The obsolete sigmask() macro no longer has a system-specific definition -- in the cases where it matters, it didn't work anyway. New Linux architectures should create bits/siginfo-arch.h and/or bits/siginfo-consts-arch.h to customize their siginfo_t, rather than duplicating everything in bits/siginfo.h (which no longer exists). Add new __SI_* macros if necessary. Ports to other operating systems are strongly encouraged to generalize this scheme further. * bits/sigevent-consts.h * bits/siginfo-consts.h * bits/types/__sigset_t.h * bits/types/sigevent_t.h * bits/types/siginfo_t.h * sysdeps/unix/sysv/linux/bits/sigevent-consts.h * sysdeps/unix/sysv/linux/bits/siginfo-consts.h * sysdeps/unix/sysv/linux/bits/types/__sigset_t.h * sysdeps/unix/sysv/linux/bits/types/sigevent_t.h * sysdeps/unix/sysv/linux/bits/types/siginfo_t.h: New system-dependent bits headers. * sysdeps/unix/sysv/linux/bits/siginfo-arch.h * sysdeps/unix/sysv/linux/bits/siginfo-consts-arch.h * sysdeps/unix/sysv/linux/ia64/bits/siginfo-arch.h * sysdeps/unix/sysv/linux/ia64/bits/siginfo-consts-arch.h * sysdeps/unix/sysv/linux/mips/bits/siginfo-arch.h * sysdeps/unix/sysv/linux/sparc/bits/siginfo-arch.h * sysdeps/unix/sysv/linux/tile/bits/siginfo-arch.h * sysdeps/unix/sysv/linux/tile/bits/siginfo-consts-arch.h * sysdeps/unix/sysv/linux/x86/bits/siginfo-arch.h: New Linux-only system-dependent bits headers. * signal/bits/types/sig_atomic_t.h * signal/bits/types/sigset_t.h * signal/bits/types/sigval_t.h: New non-system-dependent bits headers. * sysdeps/generic/sigsetops.h * sysdeps/unix/sysv/linux/sigsetops.h: New internal headers. * include/bits/types/sig_atomic_t.h * include/bits/types/sigset_t.h * include/bits/types/sigval_t.h: New wrappers. * signal/sigsetops.h * bits/siginfo.h * bits/sigset.h * sysdeps/unix/sysv/linux/bits/siginfo.h * sysdeps/unix/sysv/linux/bits/sigset.h * sysdeps/unix/sysv/linux/ia64/bits/siginfo.h * sysdeps/unix/sysv/linux/mips/bits/siginfo.h * sysdeps/unix/sysv/linux/s390/bits/siginfo.h * sysdeps/unix/sysv/linux/sparc/bits/siginfo.h * sysdeps/unix/sysv/linux/tile/bits/siginfo.h * sysdeps/unix/sysv/linux/x86/bits/siginfo.h: Deleted. * signal/Makefile, sysdeps/unix/sysv/linux/Makefile: Update lists of installed headers. * posix/bits/types.h: Define __sig_atomic_t here. * signal/signal.h: Use the new bits headers; no need to handle __need_sig_atomic_t nor __need_sigset_t. Don't use __sigmask to define sigmask. * include/signal.h: No need to handle __need_sig_atomic_t nor __need_sigset_t. Don't define __sigemptyset. * io/sys/poll.h, setjmp/setjmp.h * sysdeps/arm/sys/ucontext.h, sysdeps/generic/sys/ucontext.h * sysdeps/i386/sys/ucontext.h, sysdeps/m68k/sys/ucontext.h * sysdeps/mach/hurd/i386/bits/sigcontext.h * sysdeps/mips/sys/ucontext.h, sysdeps/powerpc/novmxsetjmp.h * sysdeps/pthread/bits/sigthread.h * sysdeps/unix/sysv/linux/hppa/sys/ucontext.h * sysdeps/unix/sysv/linux/m68k/sys/ucontext.h * sysdeps/unix/sysv/linux/mips/sys/ucontext.h * sysdeps/unix/sysv/linux/nios2/sys/ucontext.h * sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h * sysdeps/unix/sysv/linux/s390/sys/ucontext.h * sysdeps/unix/sysv/linux/sh/sys/ucontext.h * sysdeps/unix/sysv/linux/sparc/sys/ucontext.h * sysdeps/unix/sysv/linux/tile/sys/ucontext.h * sysdeps/unix/sysv/linux/x86/sys/ucontext.h: Use bits/types/__sigset_t.h. * misc/sys/select.h, posix/spawn.h * sysdeps/unix/sysv/linux/powerpc/sys/ucontext.h * sysdeps/unix/sysv/linux/sys/epoll.h * sysdeps/unix/sysv/linux/sys/signalfd.h: Use bits/types/sigset_t.h. * resolv/netdb.h, rt/mqueue.h: Use bits/types/sigevent_t.h. * rt/aio.h: Use bits/types/sigevent_t.h and bits/sigevent-consts.h. * socket/sys/socket.h: Don't include bits/sigset.h. * login/utmp_file.c, shadow/lckpwdf.c, signal/sigandset.c * signal/sigisempty.c, stdlib/abort.c, sysdeps/posix/profil.c * sysdeps/posix/sigignore.c, sysdeps/posix/sigintr.c * sysdeps/posix/signal.c, sysdeps/posix/sigset.c * sysdeps/posix/sprofil.c, sysdeps/posix/sysv_signal.c * sysdeps/unix/sysv/linux/nptl-signals.h: Include sigsetops.h. * signal/sigaddset.c, signal/sigandset.c, signal/sigdelset.c * signal/sigorset.c, stdlib/abort.c, sysdeps/posix/sigignore.c * sysdeps/posix/signal.c, sysdeps/posix/sigset.c: __sigaddset, __sigandset, __sigdelset, __sigemptyset, __sigorset now return no value. * signal/sigaddset.c, signal/sigdelset.c, signal/sigismem.c Include <errno.h>, <signal.h>, and <sigsetops.h> instead of "sigsetops.h". * signal/sigsetops.c: Explicitly define __sigismember, __sigaddset, and __sigdelset as compatibility symbols. * signal/Versions: Correct commentary on __sigpause, __sigaddset, __sigdelset, __sigismember. * inet/rcmd.c: Include sigsetops.h. Convert old code using __sigblock/__sigsetmask to use __sigprocmask and friends.
2016-12-02 01:24:13 +01:00
#include <sigsetops.h>
Filter out NPTL internal signals (BZ #22391) This patch filters out the internal NPTL signals (SIGCANCEL/SIGTIMER and SIGSETXID) from signal functions. GLIBC on Linux requires both signals to proper implement pthread cancellation, posix timers, and set*id posix thread synchronization. And not filtering out the internal signal is troublesome: - A conformant program on a architecture that does not filter out the signals might inadvertently disable pthread asynchronous cancellation, set*id synchronization or posix timers. - It might also to security issues if SIGSETXID is masked and set*id functions are called (some threads might have effective user or group id different from the rest). The changes are basically: - Change __is_internal_signal to bool and used on all signal function that has a signal number as input. Also for signal function which accepts signals sets (sigset_t) it assumes that canonical function were used to add/remove signals which lead to some input simplification. - Fix tst-sigset.c to avoid check for SIGCANCEL/SIGTIMER and SIGSETXID. It is rewritten to check each signal indidually and to check realtime signals using canonical macros. - Add generic __clear_internal_signals and __is_internal_signal version since both symbols are used on generic implementations. - Remove superflous sysdeps/nptl/sigfillset.c. - Remove superflous SIGTIMER handling on Linux __is_internal_signal since it is the same of SIGCANCEL. - Remove dangling define and obvious comment on nptl/sigaction.c. Checked on x86_64-linux-gnu. [BZ #22391] * nptl/sigaction.c (__sigaction): Use __is_internal_signal to check for internal nptl signals. * nptl/sigaction.c (__sigaction): Likewise. * signal/sigaddset.c (sigaddset): Likewise. * signal/sigdelset.c (sigdelset): Likewise. * sysdeps/posix/signal.c (__bsd_signal): Likewise. * sysdeps/posix/sigset.c (sigset): Call and check sigaddset return value. * signal/sigfillset.c (sigfillset): User __clear_internal_signals to filter out internal nptl signals. * signal/tst-sigset.c (do_test): Check ech signal indidually and also check realtime signals using standard macros. * sysdeps/generic/internal-signals.h (__clear_internal_signals, __is_internal_signal, __libc_signal_block_all, __libc_signal_block_app, __libc_signal_restore_set): New functions. * sysdeps/nptl/sigfillset.c: Remove file. * sysdeps/unix/sysv/linux/internal-signals.h (__is_internal_signal): Change return to bool. (__clear_internal_signals): Remove SIGTIMER clean since it is equal to SIGCANEL on Linux. * sysdeps/unix/sysv/linux/sigtimedwait.c (__sigtimedwait): Assume signal set was constructed using standard functions. Reported-by: Yury Norov <ynorov@caviumnetworks.com>
2017-11-01 02:10:37 +01:00
#include <stdbool.h>
#include <sysdep.h>
posix: New Linux posix_spawn{p} implementation This patch implements a new posix_spawn{p} implementation for Linux. The main difference is it uses the clone syscall directly with CLONE_VM and CLONE_VFORK flags and a direct allocated stack. The new stack and start function solves most the vfork limitation (possible parent clobber due stack spilling). The remaning issue are related to signal handling: 1. That no signal handlers must run in child context, to avoid corrupt parent's state. 2. Child must synchronize with parent to enforce stack deallocation and to possible return execv issues. The first one is solved by blocking all signals in child, even NPTL-internal ones (SIGCANCEL and SIGSETXID). The second issue is done by a stack allocation in parent and a synchronization with using a pipe or waitpid (in case or error). The pipe has the advantage of allowing the child signal an exec error (checked with new tst-spawn2 test). There is an inherent race condition in pipe2 usage for architectures that do not support the syscall directly. In such cases the a pipe plus fctnl is used instead and it may lead to file descriptor leak in parent (as decribed by fcntl documentation). The child process stack is allocate with a mmap with MAP_STACK flag using default architecture stack size. Although it is slower than use a stack buffer from parent, it allows some slack for the compatibility code to run scripts with no shebang (which may use a buffer with size depending of argument list count). Performance should be similar to the vfork default posix implementation and way faster than fork path (vfork on mostly linux ports are basically clone with CLONE_VM plus CLONE_VFORK). The only difference is the syscalls required for the stack allocation/deallocation. It fixes BZ#10354, BZ#14750, and BZ#18433. Tested on i386, x86_64, powerpc64le, and aarch64. [BZ #14750] [BZ #10354] [BZ #18433] * include/sched.h (__clone): Add hidden prototype. (__clone2): Likewise. * include/unistd.h (__dup): Likewise. * posix/Makefile (tests): Add tst-spawn2. * posix/tst-spawn2.c: New file. * sysdeps/posix/dup.c (__dup): Add hidden definition. * sysdeps/unix/sysv/linux/aarch64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/alpha/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/arm/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/hppa/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/i386/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/ia64/clone2.S (__clone): Likewise. * sysdeps/unix/sysv/linux/m68k/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/microblaze/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/mips/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/nios2/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/sh/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/tile/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/x86_64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/nptl-signals.h (____nptl_is_internal_signal): New function. * sysdeps/unix/sysv/linux/spawni.c: New file.
2016-01-19 20:33:32 +01:00
/* The signal used for asynchronous cancelation. */
#define SIGCANCEL __SIGRTMIN
/* Signal needed for the kernel-supported POSIX timer implementation.
We can reuse the cancellation signal since we can distinguish
cancellation from timer expirations. */
#define SIGTIMER SIGCANCEL
/* Signal used to implement the setuid et.al. functions. */
#define SIGSETXID (__SIGRTMIN + 1)
posix: New Linux posix_spawn{p} implementation This patch implements a new posix_spawn{p} implementation for Linux. The main difference is it uses the clone syscall directly with CLONE_VM and CLONE_VFORK flags and a direct allocated stack. The new stack and start function solves most the vfork limitation (possible parent clobber due stack spilling). The remaning issue are related to signal handling: 1. That no signal handlers must run in child context, to avoid corrupt parent's state. 2. Child must synchronize with parent to enforce stack deallocation and to possible return execv issues. The first one is solved by blocking all signals in child, even NPTL-internal ones (SIGCANCEL and SIGSETXID). The second issue is done by a stack allocation in parent and a synchronization with using a pipe or waitpid (in case or error). The pipe has the advantage of allowing the child signal an exec error (checked with new tst-spawn2 test). There is an inherent race condition in pipe2 usage for architectures that do not support the syscall directly. In such cases the a pipe plus fctnl is used instead and it may lead to file descriptor leak in parent (as decribed by fcntl documentation). The child process stack is allocate with a mmap with MAP_STACK flag using default architecture stack size. Although it is slower than use a stack buffer from parent, it allows some slack for the compatibility code to run scripts with no shebang (which may use a buffer with size depending of argument list count). Performance should be similar to the vfork default posix implementation and way faster than fork path (vfork on mostly linux ports are basically clone with CLONE_VM plus CLONE_VFORK). The only difference is the syscalls required for the stack allocation/deallocation. It fixes BZ#10354, BZ#14750, and BZ#18433. Tested on i386, x86_64, powerpc64le, and aarch64. [BZ #14750] [BZ #10354] [BZ #18433] * include/sched.h (__clone): Add hidden prototype. (__clone2): Likewise. * include/unistd.h (__dup): Likewise. * posix/Makefile (tests): Add tst-spawn2. * posix/tst-spawn2.c: New file. * sysdeps/posix/dup.c (__dup): Add hidden definition. * sysdeps/unix/sysv/linux/aarch64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/alpha/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/arm/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/hppa/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/i386/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/ia64/clone2.S (__clone): Likewise. * sysdeps/unix/sysv/linux/m68k/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/microblaze/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/mips/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/nios2/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/sh/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/tile/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/x86_64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/nptl-signals.h (____nptl_is_internal_signal): New function. * sysdeps/unix/sysv/linux/spawni.c: New file.
2016-01-19 20:33:32 +01:00
/* Return is sig is used internally. */
Filter out NPTL internal signals (BZ #22391) This patch filters out the internal NPTL signals (SIGCANCEL/SIGTIMER and SIGSETXID) from signal functions. GLIBC on Linux requires both signals to proper implement pthread cancellation, posix timers, and set*id posix thread synchronization. And not filtering out the internal signal is troublesome: - A conformant program on a architecture that does not filter out the signals might inadvertently disable pthread asynchronous cancellation, set*id synchronization or posix timers. - It might also to security issues if SIGSETXID is masked and set*id functions are called (some threads might have effective user or group id different from the rest). The changes are basically: - Change __is_internal_signal to bool and used on all signal function that has a signal number as input. Also for signal function which accepts signals sets (sigset_t) it assumes that canonical function were used to add/remove signals which lead to some input simplification. - Fix tst-sigset.c to avoid check for SIGCANCEL/SIGTIMER and SIGSETXID. It is rewritten to check each signal indidually and to check realtime signals using canonical macros. - Add generic __clear_internal_signals and __is_internal_signal version since both symbols are used on generic implementations. - Remove superflous sysdeps/nptl/sigfillset.c. - Remove superflous SIGTIMER handling on Linux __is_internal_signal since it is the same of SIGCANCEL. - Remove dangling define and obvious comment on nptl/sigaction.c. Checked on x86_64-linux-gnu. [BZ #22391] * nptl/sigaction.c (__sigaction): Use __is_internal_signal to check for internal nptl signals. * nptl/sigaction.c (__sigaction): Likewise. * signal/sigaddset.c (sigaddset): Likewise. * signal/sigdelset.c (sigdelset): Likewise. * sysdeps/posix/signal.c (__bsd_signal): Likewise. * sysdeps/posix/sigset.c (sigset): Call and check sigaddset return value. * signal/sigfillset.c (sigfillset): User __clear_internal_signals to filter out internal nptl signals. * signal/tst-sigset.c (do_test): Check ech signal indidually and also check realtime signals using standard macros. * sysdeps/generic/internal-signals.h (__clear_internal_signals, __is_internal_signal, __libc_signal_block_all, __libc_signal_block_app, __libc_signal_restore_set): New functions. * sysdeps/nptl/sigfillset.c: Remove file. * sysdeps/unix/sysv/linux/internal-signals.h (__is_internal_signal): Change return to bool. (__clear_internal_signals): Remove SIGTIMER clean since it is equal to SIGCANEL on Linux. * sysdeps/unix/sysv/linux/sigtimedwait.c (__sigtimedwait): Assume signal set was constructed using standard functions. Reported-by: Yury Norov <ynorov@caviumnetworks.com>
2017-11-01 02:10:37 +01:00
static inline bool
__is_internal_signal (int sig)
posix: New Linux posix_spawn{p} implementation This patch implements a new posix_spawn{p} implementation for Linux. The main difference is it uses the clone syscall directly with CLONE_VM and CLONE_VFORK flags and a direct allocated stack. The new stack and start function solves most the vfork limitation (possible parent clobber due stack spilling). The remaning issue are related to signal handling: 1. That no signal handlers must run in child context, to avoid corrupt parent's state. 2. Child must synchronize with parent to enforce stack deallocation and to possible return execv issues. The first one is solved by blocking all signals in child, even NPTL-internal ones (SIGCANCEL and SIGSETXID). The second issue is done by a stack allocation in parent and a synchronization with using a pipe or waitpid (in case or error). The pipe has the advantage of allowing the child signal an exec error (checked with new tst-spawn2 test). There is an inherent race condition in pipe2 usage for architectures that do not support the syscall directly. In such cases the a pipe plus fctnl is used instead and it may lead to file descriptor leak in parent (as decribed by fcntl documentation). The child process stack is allocate with a mmap with MAP_STACK flag using default architecture stack size. Although it is slower than use a stack buffer from parent, it allows some slack for the compatibility code to run scripts with no shebang (which may use a buffer with size depending of argument list count). Performance should be similar to the vfork default posix implementation and way faster than fork path (vfork on mostly linux ports are basically clone with CLONE_VM plus CLONE_VFORK). The only difference is the syscalls required for the stack allocation/deallocation. It fixes BZ#10354, BZ#14750, and BZ#18433. Tested on i386, x86_64, powerpc64le, and aarch64. [BZ #14750] [BZ #10354] [BZ #18433] * include/sched.h (__clone): Add hidden prototype. (__clone2): Likewise. * include/unistd.h (__dup): Likewise. * posix/Makefile (tests): Add tst-spawn2. * posix/tst-spawn2.c: New file. * sysdeps/posix/dup.c (__dup): Add hidden definition. * sysdeps/unix/sysv/linux/aarch64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/alpha/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/arm/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/hppa/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/i386/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/ia64/clone2.S (__clone): Likewise. * sysdeps/unix/sysv/linux/m68k/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/microblaze/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/mips/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/nios2/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/sh/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/tile/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/x86_64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/nptl-signals.h (____nptl_is_internal_signal): New function. * sysdeps/unix/sysv/linux/spawni.c: New file.
2016-01-19 20:33:32 +01:00
{
return (sig == SIGCANCEL) || (sig == SIGSETXID);
posix: New Linux posix_spawn{p} implementation This patch implements a new posix_spawn{p} implementation for Linux. The main difference is it uses the clone syscall directly with CLONE_VM and CLONE_VFORK flags and a direct allocated stack. The new stack and start function solves most the vfork limitation (possible parent clobber due stack spilling). The remaning issue are related to signal handling: 1. That no signal handlers must run in child context, to avoid corrupt parent's state. 2. Child must synchronize with parent to enforce stack deallocation and to possible return execv issues. The first one is solved by blocking all signals in child, even NPTL-internal ones (SIGCANCEL and SIGSETXID). The second issue is done by a stack allocation in parent and a synchronization with using a pipe or waitpid (in case or error). The pipe has the advantage of allowing the child signal an exec error (checked with new tst-spawn2 test). There is an inherent race condition in pipe2 usage for architectures that do not support the syscall directly. In such cases the a pipe plus fctnl is used instead and it may lead to file descriptor leak in parent (as decribed by fcntl documentation). The child process stack is allocate with a mmap with MAP_STACK flag using default architecture stack size. Although it is slower than use a stack buffer from parent, it allows some slack for the compatibility code to run scripts with no shebang (which may use a buffer with size depending of argument list count). Performance should be similar to the vfork default posix implementation and way faster than fork path (vfork on mostly linux ports are basically clone with CLONE_VM plus CLONE_VFORK). The only difference is the syscalls required for the stack allocation/deallocation. It fixes BZ#10354, BZ#14750, and BZ#18433. Tested on i386, x86_64, powerpc64le, and aarch64. [BZ #14750] [BZ #10354] [BZ #18433] * include/sched.h (__clone): Add hidden prototype. (__clone2): Likewise. * include/unistd.h (__dup): Likewise. * posix/Makefile (tests): Add tst-spawn2. * posix/tst-spawn2.c: New file. * sysdeps/posix/dup.c (__dup): Add hidden definition. * sysdeps/unix/sysv/linux/aarch64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/alpha/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/arm/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/hppa/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/i386/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/ia64/clone2.S (__clone): Likewise. * sysdeps/unix/sysv/linux/m68k/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/microblaze/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/mips/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/nios2/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/sh/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/tile/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/x86_64/clone.S (__clone): Likewise. * sysdeps/unix/sysv/linux/nptl-signals.h (____nptl_is_internal_signal): New function. * sysdeps/unix/sysv/linux/spawni.c: New file.
2016-01-19 20:33:32 +01:00
}
/* Remove internal glibc signal from the mask. */
static inline void
__clear_internal_signals (sigset_t *set)
{
__sigdelset (set, SIGCANCEL);
__sigdelset (set, SIGSETXID);
}
#define SIGALL_SET \
((__sigset_t) { .__val = {[0 ... _SIGSET_NWORDS-1 ] = -1 } })
/* Block all signals, including internal glibc ones. */
static inline int
__libc_signal_block_all (sigset_t *set)
{
INTERNAL_SYSCALL_DECL (err);
return INTERNAL_SYSCALL (rt_sigprocmask, err, 4, SIG_BLOCK, &SIGALL_SET,
set, _NSIG / 8);
}
/* Block all application signals (excluding internal glibc ones). */
static inline int
__libc_signal_block_app (sigset_t *set)
{
sigset_t allset = SIGALL_SET;
__clear_internal_signals (&allset);
INTERNAL_SYSCALL_DECL (err);
return INTERNAL_SYSCALL (rt_sigprocmask, err, 4, SIG_BLOCK, &allset, set,
_NSIG / 8);
}
/* Restore current process signal mask. */
static inline int
__libc_signal_restore_set (const sigset_t *set)
{
INTERNAL_SYSCALL_DECL (err);
return INTERNAL_SYSCALL (rt_sigprocmask, err, 4, SIG_SETMASK, set, NULL,
_NSIG / 8);
}
/* Used to communicate with signal handler. */
extern struct xid_command *__xidcmd attribute_hidden;
#endif