Support C2X printf %b, %B

C2X adds a printf %b format (see
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2630.pdf>, accepted
for C2X), for outputting integers in binary.  It also has recommended
practice for a corresponding %B format (like %b, but %#B starts the
output with 0B instead of 0b).  Add support for these formats to
glibc.

One existing test uses %b as an example of an unknown format, to test
how glibc printf handles unknown formats; change that to %v.  Use of
%b and %B as user-registered format specifiers continues to work (and
we already have a test that covers that, tst-printfsz.c).

Note that C2X also has scanf %b support, plus support for binary
constants starting 0b in strtol (base 0 and 2) and scanf %i (strtol
base 0 and scanf %i coming from a previous paper that added binary
integer literals).  I intend to implement those features in a separate
patch or patches; as discussed in the thread starting at
<https://sourceware.org/pipermail/libc-alpha/2020-December/120414.html>,
they will be more complicated because they involve adding extra public
symbols to ensure compatibility with existing code that might not
expect 0b constants to be handled by strtol base 0 and 2 and scanf %i,
whereas simply adding a new format specifier poses no such
compatibility concerns.

Note that the actual conversion from integer to string uses existing
code in _itoa.c.  That code has special cases for bases 8, 10 and 16,
probably so that the compiler can optimize division by an integer
constant in the code for those bases.  If desired such special cases
could easily be added for base 2 as well, but that would be an
optimization, not actually needed for these printf formats to work.

Tested for x86_64 and x86.  Also tested with build-many-glibcs.py for
aarch64-linux-gnu with GCC mainline to make sure that the test does
indeed build with GCC 12 (where format checking warnings are enabled
for most of the test).
Joseph Myers 2021-11-10 15:52:21 +00:00
parent 3387c40a8b
commit 309548bec3
11 changed files with 245 additions and 31 deletions

NEWS

@ -51,6 +51,10 @@ Major new features:
* The ISO C2X macro _PRINTF_NAN_LEN_MAX has been added to <stdio.h>.
+* printf-family functions now support the %b format for output of
+  integers in binary, as specified in draft ISO C2X, and the %B variant
+  of that format recommended by draft ISO C2X.
* A new DSO sorting algorithm has been added in the dynamic linker that uses
topological sorting by depth-first search (DFS), solving performance issues
of the existing sorting algorithm when encountering particular circular


@ -1664,9 +1664,9 @@ an @code{int} argument should be printed in decimal notation, the
the @samp{%%} conversion to print a literal @samp{%} character.
There are also conversions for printing an integer argument as an
-unsigned value in octal, decimal, or hexadecimal radix (@samp{%o},
-@samp{%u}, or @samp{%x}, respectively); or as a character value
-(@samp{%c}).
+unsigned value in binary, octal, decimal, or hexadecimal radix
+(@samp{%b}, @samp{%o}, @samp{%u}, or @samp{%x}, respectively); or as a
+character value (@samp{%c}).
Floating-point numbers can be printed in normal, fixed-point notation
using the @samp{%f} conversion or in exponential notation using the
@ -1825,6 +1825,13 @@ Conversions}, for details. @samp{%d} and @samp{%i} are synonymous for
output, but are different when used with @code{scanf} for input
(@pxref{Table of Input Conversions}).
+@item @samp{%b}, @samp{%B}
+Print an integer as an unsigned binary number. @samp{%b} uses
+lower-case @samp{b} with the @samp{#} flag and @samp{%B} uses
+upper-case. @samp{%b} is an ISO C2X feature; @samp{%B} is an
+extension recommended by ISO C2X. @xref{Integer Conversions}, for
+details.
@item @samp{%o}
Print an integer as an unsigned octal number. @xref{Integer
Conversions}, for details.
@ -1901,15 +1908,17 @@ simply ignored; this is sometimes useful.
@subsection Integer Conversions
This section describes the options for the @samp{%d}, @samp{%i},
-@samp{%o}, @samp{%u}, @samp{%x}, and @samp{%X} conversion
+@samp{%b}, @samp{%B}, @samp{%o}, @samp{%u}, @samp{%x}, and @samp{%X} conversion
specifications. These conversions print integers in various formats.
The @samp{%d} and @samp{%i} conversion specifications both print an
-@code{int} argument as a signed decimal number; while @samp{%o},
-@samp{%u}, and @samp{%x} print the argument as an unsigned octal,
+@code{int} argument as a signed decimal number; while @samp{%b}, @samp{%o},
+@samp{%u}, and @samp{%x} print the argument as an unsigned binary, octal,
decimal, or hexadecimal number (respectively). The @samp{%X} conversion
specification is just like @samp{%x} except that it uses the characters
-@samp{ABCDEF} as digits instead of @samp{abcdef}.
+@samp{ABCDEF} as digits instead of @samp{abcdef}. The @samp{%B}
+conversion specification is just like @samp{%b} except that, with the
+@samp{#} flag, the output starts with @samp{0B} instead of @samp{0b}.
The following flags are meaningful:
@ -1931,7 +1940,9 @@ includes a sign, this flag is ignored if you supply both of them.
@item @samp{#}
For the @samp{%o} conversion, this forces the leading digit to be
@samp{0}, as if by increasing the precision. For @samp{%x} or
-@samp{%X}, this prefixes a leading @samp{0x} or @samp{0X} (respectively)
+@samp{%X}, this prefixes a leading @samp{0x} or @samp{0X}
+(respectively) to the result. For @samp{%b} or @samp{%B}, this
+prefixes a leading @samp{0b} or @samp{0B} (respectively)
to the result. This doesn't do anything useful for the @samp{%d},
@samp{%i}, or @samp{%u} conversions. Using this flag produces output
which can be parsed by the @code{strtoul} function (@pxref{Parsing of
@ -1957,7 +1968,8 @@ characters at all are produced.
Without a type modifier, the corresponding argument is treated as an
@code{int} (for the signed conversions @samp{%i} and @samp{%d}) or
-@code{unsigned int} (for the unsigned conversions @samp{%o}, @samp{%u},
+@code{unsigned int} (for the unsigned conversions @samp{%b},
+@samp{%B}, @samp{%o}, @samp{%u},
@samp{%x}, and @samp{%X}). Recall that since @code{printf} and friends
are variadic, any @code{char} and @code{short} arguments are
automatically converted to @code{int} by the default argument


@ -70,7 +70,8 @@ tests := tstscanf test_rdwr test-popen tstgetln test-fseek \
tst-vfprintf-width-prec-alloc \
tst-printf-fp-free \
tst-printf-fp-leak \
-test-strerr
+test-strerr \
+tst-printf-binary
test-srcs = tst-unbputc tst-printf tst-printfsz-islongdouble


@ -328,6 +328,8 @@ __parse_one_specmb (const UCHAR_T *format, size_t posn,
case L'o':
case L'X':
case L'x':
+case L'B':
+case L'b':
#if LONG_MAX != LONG_LONG_MAX
if (spec->info.is_long_double)
spec->data_arg_type = PA_INT|PA_FLAG_LONG_LONG;


@ -0,0 +1,130 @@
/* Test binary printf formats.
Copyright (C) 2021 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, see
<https://www.gnu.org/licenses/>. */
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>
#include <libc-diag.h>
#include <support/check.h>
/* GCC does not know the %b or %B formats before GCC 12. */
DIAG_PUSH_NEEDS_COMMENT;
#if !__GNUC_PREREQ (12, 0)
DIAG_IGNORE_NEEDS_COMMENT (11, "-Wformat");
DIAG_IGNORE_NEEDS_COMMENT (11, "-Wformat-extra-args");
#endif
#define CHECK_PRINTF(EXPECTED, FMT, ...) \
  do \
    { \
      int ret = SNPRINTF (buf, sizeof buf / sizeof buf[0], L_(FMT), \
                          __VA_ARGS__); \
      TEST_COMPARE_STRING_MACRO (buf, L_(EXPECTED)); \
      TEST_COMPARE (ret, STRLEN (L_(EXPECTED))); \
    } \
  while (0)

static int
do_test (void)
{
  CHAR buf[1024];
  CHECK_PRINTF ("0", "%b", 0u);
  CHECK_PRINTF ("0", "%B", 0u);
  CHECK_PRINTF ("0", "%#b", 0u);
  CHECK_PRINTF ("0", "%#B", 0u);
  CHECK_PRINTF ("1", "%b", 1u);
  CHECK_PRINTF ("1", "%B", 1u);
  CHECK_PRINTF ("10", "%b", 2u);
  CHECK_PRINTF ("10", "%B", 2u);
  CHECK_PRINTF ("11", "%b", 3u);
  CHECK_PRINTF ("11", "%B", 3u);
  CHECK_PRINTF ("10000111011001010100001100100001", "%b", 0x87654321);
  CHECK_PRINTF ("10000111011001010100001100100001", "%B", 0x87654321);
  CHECK_PRINTF ("100001100100001", "%hb", (int) 0x87654321);
  CHECK_PRINTF ("100001100100001", "%hB", (int) 0x87654321);
  CHECK_PRINTF ("100001", "%hhb", (int) 0x87654321);
  CHECK_PRINTF ("100001", "%hhB", (int) 0x87654321);
  CHECK_PRINTF ("10000111011001010100001100100001", "%lb", 0x87654321ul);
  CHECK_PRINTF ("10000111011001010100001100100001", "%lB", 0x87654321ul);
  CHECK_PRINTF ("11111110110111001011101010011001"
                "10000111011001010100001100100001", "%llb",
                0xfedcba9987654321ull);
  CHECK_PRINTF ("11111110110111001011101010011001"
                "10000111011001010100001100100001", "%llB",
                0xfedcba9987654321ull);
#if LONG_WIDTH >= 64
  CHECK_PRINTF ("11111110110111001011101010011001"
                "10000111011001010100001100100001", "%lb",
                0xfedcba9987654321ul);
  CHECK_PRINTF ("11111110110111001011101010011001"
                "10000111011001010100001100100001", "%lB",
                0xfedcba9987654321ul);
#endif
  CHECK_PRINTF (" 1010", "%5b", 10u);
  CHECK_PRINTF (" 1010", "%5B", 10u);
  CHECK_PRINTF ("01010", "%05b", 10u);
  CHECK_PRINTF ("01010", "%05B", 10u);
  CHECK_PRINTF ("1011 ", "%-5b", 11u);
  CHECK_PRINTF ("1011 ", "%-5B", 11u);
  CHECK_PRINTF ("0b10011", "%#b", 19u);
  CHECK_PRINTF ("0B10011", "%#B", 19u);
  CHECK_PRINTF ("   0b10011", "%#10b", 19u);
  CHECK_PRINTF ("   0B10011", "%#10B", 19u);
  CHECK_PRINTF ("0b00010011", "%0#10b", 19u);
  CHECK_PRINTF ("0B00010011", "%0#10B", 19u);
  CHECK_PRINTF ("0b00010011", "%#010b", 19u);
  CHECK_PRINTF ("0B00010011", "%#010B", 19u);
  CHECK_PRINTF ("0b10011   ", "%#-10b", 19u);
  CHECK_PRINTF ("0B10011   ", "%#-10B", 19u);
  CHECK_PRINTF ("00010011", "%.8b", 19u);
  CHECK_PRINTF ("00010011", "%.8B", 19u);
  CHECK_PRINTF ("0b00010011", "%#.8b", 19u);
  CHECK_PRINTF ("0B00010011", "%#.8B", 19u);
  CHECK_PRINTF ("       00010011", "%15.8b", 19u);
  CHECK_PRINTF ("       00010011", "%15.8B", 19u);
  CHECK_PRINTF ("00010011       ", "%-15.8b", 19u);
  CHECK_PRINTF ("00010011       ", "%-15.8B", 19u);
  CHECK_PRINTF ("     0b00010011", "%#15.8b", 19u);
  CHECK_PRINTF ("     0B00010011", "%#15.8B", 19u);
  CHECK_PRINTF ("0b00010011     ", "%-#15.8b", 19u);
  CHECK_PRINTF ("0B00010011     ", "%-#15.8B", 19u);
  /* GCC diagnoses ignored flags.  */
  DIAG_PUSH_NEEDS_COMMENT;
  DIAG_IGNORE_NEEDS_COMMENT (12, "-Wformat");
  /* '0' flag ignored with '-'.  */
  CHECK_PRINTF ("1011 ", "%0-5b", 11u);
  CHECK_PRINTF ("1011 ", "%0-5B", 11u);
  CHECK_PRINTF ("0b10011   ", "%#0-10b", 19u);
  CHECK_PRINTF ("0B10011   ", "%#0-10B", 19u);
  /* '0' flag ignored with precision.  */
  CHECK_PRINTF ("       00010011", "%015.8b", 19u);
  CHECK_PRINTF ("       00010011", "%015.8B", 19u);
  CHECK_PRINTF ("     0b00010011", "%0#15.8b", 19u);
  CHECK_PRINTF ("     0B00010011", "%0#15.8B", 19u);
  DIAG_POP_NEEDS_COMMENT;
  /* Test positional argument handling.  */
  CHECK_PRINTF ("test 1011 test2 100010001000100010001000100010001",
                "%2$s %1$b %4$s %3$llb", 11u, "test", 0x111111111ull, "test2");
  return 0;
}

DIAG_POP_NEEDS_COMMENT;
#include <support/test-driver.c>


@ -0,0 +1,25 @@
/* Test binary printf formats. Narrow string version.
Copyright (C) 2021 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, see
<https://www.gnu.org/licenses/>. */
#define SNPRINTF snprintf
#define TEST_COMPARE_STRING_MACRO TEST_COMPARE_STRING
#define STRLEN strlen
#define CHAR char
#define L_(C) C
#include <tst-printf-binary-main.c>


@ -91,7 +91,7 @@ I am ready for my first lesson today.";
fmtst2chk("%0*.*x");
#ifndef BSD
-printf("bad format:\t\"%b\"\n");
+printf("bad format:\t\"%v\"\n");
printf("nil pointer (padded):\t\"%10p\"\n", (void *) NULL);
#endif


@ -39,7 +39,7 @@ cat <<'EOF' |
%0*x: `0012'
%*.*x: `0012'
%0*.*x: `0012'
-bad format: "%b"
+bad format: "%v"
nil pointer (padded): "     (nil)"
decimal negative: "-2345"
octal negative: "37777773327"
@ -153,7 +153,7 @@ cat <<'EOF' |
%0*x: `0012'
%*.*x: `0012'
%0*.*x: `0012'
-bad format: "%b"
+bad format: "%v"
nil pointer (padded): "     (nil)"
decimal negative: "-2345"
octal negative: "37777773327"


@ -390,7 +390,7 @@ static const uint8_t jump_table[] =
/* '4' */ 8, /* '5' */ 8, /* '6' */ 8, /* '7' */ 8,
/* '8' */ 8, /* '9' */ 8, 0, 0,
0, 0, 0, 0,
-0, /* 'A' */ 26, 0, /* 'C' */ 25,
+0, /* 'A' */ 26, /* 'B' */ 30, /* 'C' */ 25,
0, /* 'E' */ 19, /* F */ 19, /* 'G' */ 19,
0, /* 'I' */ 29, 0, 0,
/* 'L' */ 12, 0, 0, 0,
@ -398,7 +398,7 @@ static const uint8_t jump_table[] =
0, 0, 0, 0,
/* 'X' */ 18, 0, /* 'Z' */ 13, 0,
0, 0, 0, 0,
-0, /* 'a' */ 26, 0, /* 'c' */ 20,
+0, /* 'a' */ 26, /* 'b' */ 30, /* 'c' */ 20,
/* 'd' */ 15, /* 'e' */ 19, /* 'f' */ 19, /* 'g' */ 19,
/* 'h' */ 10, /* 'i' */ 15, /* 'j' */ 28, 0,
/* 'l' */ 11, /* 'm' */ 24, /* 'n' */ 23, /* 'o' */ 17,
@ -444,7 +444,7 @@ static const uint8_t jump_table[] =
#define STEP0_3_TABLE \
/* Step 0: at the beginning. */ \
-static JUMP_TABLE_TYPE step0_jumps[30] = \
+static JUMP_TABLE_TYPE step0_jumps[31] = \
{ \
REF (form_unknown), \
REF (flag_space), /* for ' ' */ \
@ -476,9 +476,10 @@ static const uint8_t jump_table[] =
REF (mod_ptrdiff_t), /* for 't' */ \
REF (mod_intmax_t), /* for 'j' */ \
REF (flag_i18n), /* for 'I' */ \
+REF (form_binary), /* for 'B', 'b' */ \
}; \
/* Step 1: after processing width. */ \
-static JUMP_TABLE_TYPE step1_jumps[30] = \
+static JUMP_TABLE_TYPE step1_jumps[31] = \
{ \
REF (form_unknown), \
REF (form_unknown), /* for ' ' */ \
@ -509,10 +510,11 @@ static const uint8_t jump_table[] =
REF (form_floathex), /* for 'A', 'a' */ \
REF (mod_ptrdiff_t), /* for 't' */ \
REF (mod_intmax_t), /* for 'j' */ \
-REF (form_unknown) /* for 'I' */ \
+REF (form_unknown), /* for 'I' */ \
+REF (form_binary), /* for 'B', 'b' */ \
}; \
/* Step 2: after processing precision. */ \
-static JUMP_TABLE_TYPE step2_jumps[30] = \
+static JUMP_TABLE_TYPE step2_jumps[31] = \
{ \
REF (form_unknown), \
REF (form_unknown), /* for ' ' */ \
@ -543,10 +545,11 @@ static const uint8_t jump_table[] =
REF (form_floathex), /* for 'A', 'a' */ \
REF (mod_ptrdiff_t), /* for 't' */ \
REF (mod_intmax_t), /* for 'j' */ \
-REF (form_unknown) /* for 'I' */ \
+REF (form_unknown), /* for 'I' */ \
+REF (form_binary), /* for 'B', 'b' */ \
}; \
/* Step 3a: after processing first 'h' modifier. */ \
-static JUMP_TABLE_TYPE step3a_jumps[30] = \
+static JUMP_TABLE_TYPE step3a_jumps[31] = \
{ \
REF (form_unknown), \
REF (form_unknown), /* for ' ' */ \
@ -577,10 +580,11 @@ static const uint8_t jump_table[] =
REF (form_unknown), /* for 'A', 'a' */ \
REF (form_unknown), /* for 't' */ \
REF (form_unknown), /* for 'j' */ \
-REF (form_unknown) /* for 'I' */ \
+REF (form_unknown), /* for 'I' */ \
+REF (form_binary), /* for 'B', 'b' */ \
}; \
/* Step 3b: after processing first 'l' modifier. */ \
-static JUMP_TABLE_TYPE step3b_jumps[30] = \
+static JUMP_TABLE_TYPE step3b_jumps[31] = \
{ \
REF (form_unknown), \
REF (form_unknown), /* for ' ' */ \
@ -611,12 +615,13 @@ static const uint8_t jump_table[] =
REF (form_floathex), /* for 'A', 'a' */ \
REF (form_unknown), /* for 't' */ \
REF (form_unknown), /* for 'j' */ \
-REF (form_unknown) /* for 'I' */ \
+REF (form_unknown), /* for 'I' */ \
+REF (form_binary), /* for 'B', 'b' */ \
}
#define STEP4_TABLE \
/* Step 4: processing format specifier. */ \
-static JUMP_TABLE_TYPE step4_jumps[30] = \
+static JUMP_TABLE_TYPE step4_jumps[31] = \
{ \
REF (form_unknown), \
REF (form_unknown), /* for ' ' */ \
@ -647,7 +652,8 @@ static const uint8_t jump_table[] =
REF (form_floathex), /* for 'A', 'a' */ \
REF (form_unknown), /* for 't' */ \
REF (form_unknown), /* for 'j' */ \
-REF (form_unknown) /* for 'I' */ \
+REF (form_unknown), /* for 'I' */ \
+REF (form_binary), /* for 'B', 'b' */ \
}
/* Before invoking this macro, process_arg_int etc. macros have to be
@ -706,6 +712,14 @@ static const uint8_t jump_table[] =
LABEL (form_hexa): \
/* Unsigned hexadecimal integer. */ \
base = 16; \
+goto LABEL (unsigned_number); \
+/* NOTREACHED */ \
+\
+LABEL (form_binary): \
+/* Unsigned binary integer. */ \
+base = 2; \
+goto LABEL (unsigned_number); \
+/* NOTREACHED */ \
\
LABEL (unsigned_number): /* Unsigned number of base BASE. */ \
\
@ -803,8 +817,8 @@ static const uint8_t jump_table[] =
{ \
width -= workend - string + prec; \
\
-if (number.word != 0 && alt && base == 16) \
-/* Account for 0X hex marker. */ \
+if (number.word != 0 && alt && (base == 16 || base == 2)) \
+/* Account for 0X, 0x, 0B or 0b hex or binary marker. */ \
width -= 2; \
\
if (is_negative || showsign || space) \
@ -823,7 +837,7 @@ static const uint8_t jump_table[] =
else if (space) \
outchar (L_(' ')); \
\
-if (number.word != 0 && alt && base == 16) \
+if (number.word != 0 && alt && (base == 16 || base == 2)) \
{ \
outchar (L_('0')); \
outchar (spec); \
@ -854,7 +868,7 @@ static const uint8_t jump_table[] =
--width; \
} \
\
-if (number.word != 0 && alt && base == 16) \
+if (number.word != 0 && alt && (base == 16 || base == 2)) \
{ \
outchar (L_('0')); \
outchar (spec); \


@ -52,7 +52,8 @@ tests := tst-wcstof wcsmbs-tst1 tst-wcsnlen tst-btowc tst-mbrtowc \
tst-c16c32-1 wcsatcliff tst-wcstol-locale tst-wcstod-nan-locale \
tst-wcstod-round test-char-types tst-fgetwc-after-eof \
tst-wcstod-nan-sign tst-c16-surrogate tst-c32-state \
-$(addprefix test-,$(strop-tests)) tst-mbstowcs
+$(addprefix test-,$(strop-tests)) tst-mbstowcs \
+tst-wprintf-binary
include ../Rules


@ -0,0 +1,25 @@
/* Test binary printf formats. Wide string version.
Copyright (C) 2021 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, see
<https://www.gnu.org/licenses/>. */
#define SNPRINTF swprintf
#define TEST_COMPARE_STRING_MACRO TEST_COMPARE_STRING_WIDE
#define STRLEN wcslen
#define CHAR wchar_t
#define L_(C) L ## C
#include "../stdio-common/tst-printf-binary-main.c"