Commit Graph

39287 Commits

Author SHA1 Message Date
Félix Baylac-Jacqué 1502e8dab9
Some notes 2022-10-07 07:34:07 +02:00
Aurelien Jarno 7e8283170c x86-64: Require BMI1/BMI2 for AVX2 strrchr and wcsrchr implementations
The AVX2 strrchr and wcsrchr implementation uses the 'blsmsk'
instruction which belongs to the BMI1 CPU feature and the 'shrx'
instruction, which belongs to the BMI2 CPU feature.

Fixes: df7e295d18 ("x86: Optimize {str|wcs}rchr-avx2")
Partially resolves: BZ #29611

Reviewed-by: Noah Goldstein  <goldstein.w.n@gmail.com>
2022-10-03 23:46:11 +02:00
Aurelien Jarno 3c0c78afab x86-64: Require BMI2 and LZCNT for AVX2 memrchr implementation
The AVX2 memrchr implementation uses the 'shlxl' instruction, which
belongs to the BMI2 CPU feature and uses the 'lzcnt' instruction, which
belongs to the LZCNT CPU feature.

Fixes: af5306a735 ("x86: Optimize memrchr-avx2.S")
Partially resolves: BZ #29611

Reviewed-by: Noah Goldstein  <goldstein.w.n@gmail.com>
2022-10-03 23:46:11 +02:00
Aurelien Jarno e3e7fab7fe x86-64: Require BMI2 for AVX2 (raw|w)memchr implementations
The AVX2 memchr, rawmemchr and wmemchr implementations use the 'bzhi'
and 'sarx' instructions, which belongs to the BMI2 CPU feature.

Fixes: acfd088a19 ("x86: Optimize memchr-avx2.S")
Partially resolves: BZ #29611

Reviewed-by: Noah Goldstein  <goldstein.w.n@gmail.com>
2022-10-03 23:46:11 +02:00
Aurelien Jarno f31a5a884e x86-64: Require BMI2 for AVX2 wcs(n)cmp implementations
The AVX2 wcs(n)cmp implementations use the 'bzhi' instruction, which
belongs to the BMI2 CPU feature.

NB: It also uses the 'tzcnt' BMI1 instruction, but it is executed as BSF
as BSF if the CPU doesn't support TZCNT, and produces the same result
for non-zero input.

Partially fixes: b77b06e0e2 ("x86: Optimize strcmp-avx2.S")
Partially resolves: BZ #29611

Reviewed-by: Noah Goldstein  <goldstein.w.n@gmail.com>
2022-10-03 23:46:11 +02:00
Aurelien Jarno fc7de1d9b9 x86-64: Require BMI2 for AVX2 strncmp implementation
The AVX2 strncmp implementations uses the 'bzhi' instruction, which
belongs to the BMI2 CPU feature.

NB: It also uses the 'tzcnt' BMI1 instruction, but it is executed as BSF
as BSF if the CPU doesn't support TZCNT, and produces the same result
for non-zero input.

Partially fixes: b77b06e0e2 ("x86: Optimize strcmp-avx2.S")
Partially resolves: BZ #29611

Reviewed-by: Noah Goldstein  <goldstein.w.n@gmail.com>
2022-10-03 23:46:11 +02:00
Aurelien Jarno 4d64c64457 x86-64: Require BMI2 for AVX2 strcmp implementation
The AVX2 strcmp implementation uses the 'bzhi' instruction, which
belongs to the BMI2 CPU feature.

NB: It also uses the 'tzcnt' BMI1 instruction, but it is executed as BSF
as BSF if the CPU doesn't support TZCNT, and produces the same result
for non-zero input.

Partially fixes: b77b06e0e2 ("x86: Optimize strcmp-avx2.S")
Partially resolves: BZ #29611

Reviewed-by: Noah Goldstein  <goldstein.w.n@gmail.com>
2022-10-03 23:46:11 +02:00
Aurelien Jarno 10f79d3670 x86-64: Require BMI2 for AVX2 str(n)casecmp implementations
The AVX2 str(n)casecmp implementations use the 'bzhi' instruction, which
belongs to the BMI2 CPU feature.

NB: It also uses the 'tzcnt' BMI1 instruction, but it is executed as BSF
as BSF if the CPU doesn't support TZCNT, and produces the same result
for non-zero input.

Partially fixes: b77b06e0e2 ("x86: Optimize strcmp-avx2.S")
Partially resolves: BZ #29611

Reviewed-by: Noah Goldstein  <goldstein.w.n@gmail.com>
2022-10-03 23:46:11 +02:00
Aurelien Jarno b80f16adbd x86: include BMI1 and BMI2 in x86-64-v3 level
The "System V Application Binary Interface AMD64 Architecture Processor
Supplement" mandates the BMI1 and BMI2 CPU features for the x86-64-v3
level.

Reviewed-by: Noah Goldstein  <goldstein.w.n@gmail.com>
2022-10-03 23:46:11 +02:00
Noah Goldstein 653c12c7d8 x86: Cleanup pthread_spin_{try}lock.S
Save a jmp on the lock path coming from an initial failure in
pthread_spin_lock.S.  This costs 4-bytes of code but since the
function still fits in the same number of 16-byte blocks (default
function alignment) it does not have affect on the total binary size
of libc.so (unchanged after this commit).

pthread_spin_trylock was using a CAS when a simple xchg works which
is often more expensive.

Full check passes on x86-64.
2022-10-03 14:13:49 -07:00
Noah Goldstein 10c779f44a Benchtests: Add bench for pthread_spin_{try}lock and mutex_trylock
Reuses infrastructure from previous pthread_mutex_lock benchmarks to
test other performance sensitive functions.
2022-10-03 14:13:49 -07:00
Adhemerval Zanella 114e299ca6 x86: Remove .tfloat usage
Some compiler does not support it (such as clang integrated assembler)
neither gcc emits it.
2022-10-03 14:03:21 -03:00
Yu Chien Peter Lin 365b3af67e nptl: Convert tst-setuid2 to test-driver
Use <support/test-driver.c> and replace pthread calls to its xpthread
equivalents.

Signed-off-by: Yu Chien Peter Lin <peterlin@andestech.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-10-03 11:19:36 -03:00
Yu Chien Peter Lin 3bea50ccbc support: Add xpthread_cond_signal wrapper
Signed-off-by: Yu Chien Peter Lin <peterlin@andestech.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-10-03 11:19:36 -03:00
John David Anglin b7bd94068e hppa: Fix initialization of dp register [BZ 29635]
After upgrading glibc to Debian 2.35-1, gdb faulted on
startup and dropped core in a function call in the main
application.  This was caused by not initializing the
global dp register for the main application early enough.

Restore the code to initialize dp in _dl_start_user.
It was removed when code was added to initialize dp in
elf_machine_runtime_setup.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
2022-10-01 19:49:25 +00:00
Joseph Myers ba70f69598 Fix iseqsig for _FloatN and _FloatNx in C++ with GCC 13
With GCC 13, _FloatN and _FloatNx types, when they exist, are distinct
types like they are in C with GCC 7 and later, rather than typedefs
for types such as float, double or long double.

This breaks the templated iseqsig implementation for C++ in <math.h>,
when used with types that were previously implemented as aliases.  Add
the necessary definitions for _Float32, _Float64, _Float128 (when the
same format as long double), _Float32x and _Float64x in this case, so
that iseqsig can be used with such types in C++ with GCC 13 as it
could with previous GCC versions.

Also add tests for calling iseqsig in C++ with arguments of such types
(more minimal than existing tests, so that they can work with older
GCC versions and without relying on any C++ library support for the
types or on hardcoding details of their formats).  The LDBL_MANT_DIG
!= 106 conditionals on some tests are because the type-generic
comparison macros have undefined behavior when neither argument has a
type whose set of values is a subset of those for the type of the
other argument, which applies when one argument is IBM long double and
the other is an IEEE format wider than binary64.

Tested with build-many-glibcs.py glibcs build for aarch64-linux-gnu
i686-linux-gnu mips-linux-gnu mips64-linux-gnu-n32 powerpc-linux-gnu
powerpc64le-linux-gnu x86_64-linux-gnu.
2022-09-30 19:36:41 +00:00
Adhemerval Zanella 609c9d0951 malloc: Do not clobber errno on __getrandom_nocancel (BZ #29624)
Use INTERNAL_SYSCALL_CALL instead of INLINE_SYSCALL_CALL.  This
requires emulate the semantic for hurd call (so __arc4random_buf
uses the fallback).

Checked on x86_64-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2022-09-30 15:25:15 -03:00
Adhemerval Zanella 13db9ee2cb stdlib: Fix __getrandom_nocancel type and arc4random usage (BZ #29638)
Using an unsigned type prevents the fallback to be used if kernel
does not support getrandom syscall.

Checked on x86_64-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2022-09-30 15:24:49 -03:00
Xi Ruoyao 8b10727a9a LoongArch: Add static PIE support
If the compiler is new enough, enable static PIE support.  In the static
PIE version of _start (in rcrt1.o), use la.pcrel instead of la.got
because in a static PIE we cannot use GOT entries until the dynamic
relocations for GOT are resolved.
2022-09-30 11:51:58 +08:00
Noah Goldstein 5eb21c62ce Benchtest: Add additional benchmarks for strlen and strnlen
Current benchmarks are missing many cases in the mid-length range
which is often the hottest size range.
2022-09-28 20:16:04 -07:00
Noah Goldstein b0969fa53a x86: Fix wcsnlen-avx2 page cross length comparison [BZ #29591]
Previous implementation was adjusting length (rsi) to match
bytes (eax), but since there is no bound to length this can cause
overflow.

Fix is to just convert the byte-count (eax) to length by dividing by
sizeof (wchar_t) before the comparison.

Full check passes on x86-64 and build succeeds w/ and w/o multiarch.
2022-09-28 20:15:16 -07:00
Joseph Myers 3e5760fcb4 Update _FloatN header support for C++ in GCC 13
GCC 13 adds support for _FloatN and _FloatNx types in C++, so breaking
the installed glibc headers that assume such support is not present.
GCC mostly works around this with fixincludes, but that doesn't help
for building glibc and its tests (glibc doesn't itself contain C++
code, but there's C++ code built for tests).  Update glibc's
bits/floatn-common.h and bits/floatn.h headers to handle the GCC 13
support directly.

In general the changes match those made by fixincludes, though I think
the ones in sysdeps/powerpc/bits/floatn.h, where the header tests
__LDBL_MANT_DIG__ == 113 or uses #elif, wouldn't match the existing
fixincludes patterns.

Some places involving special C++ handling in relation to _FloatN
support are not changed.  There's no need to change the
__HAVE_FLOATN_NOT_TYPEDEF definition (also in a form that wouldn't be
matched by the fixincludes fixes) because it's only used in relation
to macro definitions using features not supported for C++
(__builtin_types_compatible_p and _Generic).  And there's no need to
change the inline function overloads for issignaling, iszero and
iscanonical in C++ because cases where types have the same format but
are no longer compatible types are handled automatically by the C++
overload resolution rules.

This patch also does not change the overload handling for iseqsig, and
there I think changes *are* needed, beyond those in this patch or made
by fixincludes.  The way that overload is defined, via a template
parameter to a structure type, requires overloads whenever the types
are incompatible, even if they have the same format.  So I think we
need to add overloads with GCC 13 for every supported _FloatN and
_FloatNx type, rather than just having one for _Float128 when it has a
different ABI to long double as at present (but for older GCC, such
overloads must not be defined for types that end up defined as
typedefs for another type).

Tested with build-many-glibcs.py: compilers build for
aarch64-linux-gnu ia64-linux-gnu mips64-linux-gnu powerpc-linux-gnu
powerpc64le-linux-gnu x86_64-linux-gnu; glibcs build for
aarch64-linux-gnu ia64-linux-gnu i686-linux-gnu mips-linux-gnu
mips64-linux-gnu-n32 powerpc-linux-gnu powerpc64le-linux-gnu
x86_64-linux-gnu.
2022-09-28 20:10:08 +00:00
Samuel Thibault d7f32c9958 hurd: Fix typo 2022-09-28 19:21:44 +02:00
Jörg Sonnenberger c9226c03da get_nscd_addresses: Fix subscript typos [BZ #29605]
Fix the subscript on air->family, which was accidentally set to COUNT
when it should have remained as I.

Resolves: BZ #29605

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-09-28 12:47:10 -04:00
Samuel Thibault 7de3f0a96c hurd: Increase SOMAXCONN to 4096
Notably fakeroot-tcp may introduce a lot of parallel connections.
2022-09-27 23:37:42 +02:00
Wilco Dijkstra 22f4ab2d20 Use atomic_exchange_release/acquire
Rename atomic_exchange_rel/acq to use atomic_exchange_release/acquire
since these map to the standard C11 atomic builtins.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-09-26 16:58:08 +01:00
Florian Weimer c02e29a0ba nss: Use shared prefix in IPv4 address in tst-reload1
Otherwise, sorting based on the longest-matching prefix in
getaddrinfo can reorder the addresses in ways the test does not
expect, depending on the IPv4 address of the host.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-09-23 19:30:57 +02:00
Florian Weimer fecf481f0f nss: Enhance tst-reload1 coverage and logging
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-09-23 19:30:57 +02:00
Wilco Dijkstra 4a07fbb689 Use C11 atomics instead of atomic_decrement_and_test
Replace atomic_decrement_and_test with atomic_fetch_add_relaxed.
These are simple counters which do not protect any shared data from
concurrent accesses. Also remove the unused file cond-perf.c.

Passes regress on AArch64.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-09-23 15:59:56 +01:00
Wilco Dijkstra d1babeb32d Use C11 atomics instead of atomic_increment(_val)
Replace atomic_increment and atomic_increment_val with atomic_fetch_add_relaxed.
One case in sem_post.c uses release semantics (see comment above it).
The others are simple counters and do not protect any shared data from
concurrent accesses.

Passes regress on AArch64.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-09-23 15:59:56 +01:00
Wilco Dijkstra 8114b95cef Use C11 atomics instead of atomic_and/or
Remove the 4 uses of atomic_and and atomic_or with atomic_fetch_and_acquire
and atomic_fetch_or_acquire. This is preserves existing implied semantics,
however relaxed MO on FUTEX_OWNER_DIED accesses may be correct.

Passes regress on AArch64.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-09-23 15:59:56 +01:00
Qingqing Li 774d43f27d malloc: Print error when oldsize is not equal to the current size.
This is used to detect errors early.  The read of the oldsize is
not protected by any lock, so check this value to avoid causing
bigger mistakes.

Reviewed-by: DJ Delorie <dj@redhat.com>
2022-09-22 15:32:56 -04:00
Adhemerval Zanella Netto de477abcaa Use '%z' instead of '%Z' on printf functions
The Z modifier is a nonstandard synonymn for z (that predates z
itself) and compiler might issue an warning for in invalid
conversion specifier.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-09-22 08:48:04 -03:00
Florian Weimer 340097d0b5 elf: Extract glibcelf constants from <elf.h>
The need to maintain elf/elf.h and scripts/glibcelf.py in parallel
results in a backporting hazard: they need to be kept in sync to
avoid elf/tst-glibcelf consistency check failures.  glibcelf (unlike
tst-glibcelf) does not use the C implementation to extract constants.
This applies the additional glibcpp syntax checks to <elf.h>.

This  changereplaces the types derived from Python enum types with
custom types _TypedConstant, _IntConstant, and _FlagConstant.  These
types have fewer safeguards, but this also allows incremental
construction and greater flexibility for grouping constants among
the types.  Architectures-specific named constants are now added
as members into their superclasses (but value-based lookup is
still restricted to generic constants only).

Consequently, check_duplicates in elf/tst-glibcelf has been adjusted
to accept differently-named constants of the same value if their
subtypes are distinct.  The ordering check for named constants
has been dropped because they are no longer strictly ordered.

Further test adjustments: Some of the type names are different.
The new types do not support iteration (because it is unclear
whether iteration should cover the all named values (including
architecture-specific constants), or only the generic named values),
so elf/tst-glibcelf now uses by_name explicit (to get all constants).
PF_HP_SBP and PF_PARISC_SBP are now of distinct types (PfHP and
PfPARISC), so they are how both present on the Python side.  EM_NUM
and PT_NUM are filtered (which was an oversight in the old
conversion).

The new version of glibcelf should also be compatible with earlier
Python versions because it no longer depends on the enum module and its
advanced features.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-09-22 12:10:41 +02:00
Florian Weimer e6e6184bed scripts: Enhance glibcpp to do basic macro processing
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-09-22 12:10:41 +02:00
Florian Weimer f40c7887d3 scripts: Extract glibcpp.py from check-obsolete-constructs.py
The C tokenizer is useful separately.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-09-22 12:10:41 +02:00
Alistair Francis 2e81493fa6 riscv: Remove RV32 floating point functions
We don't need RV32 specific floating point functions, instead make them
generic for RISC-V.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-09-21 14:37:43 -04:00
Alistair Francis 73e9fe43ac riscv: Consolidate the libm-test-ulps
Both RV32 and RV64 should have the same libm-test-ulps, so consolidate
them into a single file.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2022-09-21 14:37:13 -04:00
Samuel Thibault 385f2ecda9 hurd: Fix SIOCADD/DELRT ioctls
The hurd network stack uses struct ifrtreq rather than ortentry.
2022-09-21 19:58:44 +02:00
Samuel Thibault b84199eb18 hurd: Drop struct rtentry and in6_rtmsg
These were cargo-culted, they are not used at all in Hurd interfaces.
2022-09-21 19:58:44 +02:00
Damien Zammit 9ba0f010a6 hurd: Add _IOT_ifrtreq to <net/route.h>
So that we can use struct ifrtreq in ioctls.
2022-09-21 19:58:44 +02:00
Adhemerval Zanella fd36873ff9 elf: Use C11 atomics on _dl_mcount
All atomic operation are counters, so relaxed MO should be suffice.

Checked on x86_64-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2022-09-21 10:57:47 -03:00
Samuel Thibault c0c9092f75 hurd: Use IF_NAMESIZE rather than IFNAMSIZ
The latter is not available without __USE_MISC.
2022-09-21 08:51:50 +02:00
Damien Zammit ffd0b295d9 hurd: Add ifrtreq structure to net/route.h
As used by the hurdish route ioctls.
2022-09-21 00:42:13 +02:00
John David Anglin fa47e8e6df hppa: undef __ASSUME_SET_ROBUST_LIST
QEMU does not support support set_robust_list. Thus, we need
to enable detection of set_robust_list system call.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
2022-09-20 20:14:14 +00:00
Adhemerval Zanella 85a3228744 linux: Use same type for MMAP2_PAGE_UNIT
It avoid a possible compiler warning where right size of operator
is converted from a negative value to unsigned.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-09-20 10:57:40 -03:00
Adhemerval Zanella aeb4d2e981 m68k: Enforce 4-byte alignment on internal locks (BZ #29537)
A new internal definition, __LIBC_LOCK_ALIGNMENT, is used to force
the 4-byte alignment only for m68k, other architecture keep the
natural alignment of the type used internally (and hppa does not
require 16-byte alignment for kernel-assisted CAS).

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-09-20 10:56:54 -03:00
Florian Weimer c75d20b5b2 nss: Fix tst-nss-files-hosts-long on single-stack hosts (bug 24816)
getent implicitly passes AI_ADDRCONFIG to getaddrinfo by default.
Use --no-addrconfig to suppress that, so that both IPv4 and IPv6
lookups succeed even if the address family is not supported by the
host.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-09-20 13:09:55 +02:00
Florian Weimer a623f13adf nss: Implement --no-addrconfig option for getent
The ahosts, ahostsv4, ahostsv6 commands unconditionally pass
AI_ADDRCONFIG to getaddrinfo, which is not always desired.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-09-20 13:09:25 +02:00
Florian Weimer f97905f246 gconv: Use 64-bit interfaces in gconv_parseconfdir (bug 29583)
It's possible that inode numbers are outside the 32-bit range.
The existing code only handles the in-libc case correctly, and
still uses the legacy interfaces when building iconv.

Suggested-by: Helge Deller <deller@gmx.de>
2022-09-20 12:12:43 +02:00