glibc/sysdeps/sparc/sparc32/divrem.m4
Ulrich Drepper f41c80910d Update.
1997-08-24 12:24  Ulrich Drepper  <drepper@cygnus.com>

	* configure.in (INSTALL): Quote `$'.

	* libc.map: Add __xpg_basename.

	* csu/Makefile (initfini.s): Disable optimization.

	* elf/dl-deps.c: Implement handling of DL_FILTER.

	* elf/dl-load.c (_dl_init_paths): Add error check.

	* intl/finddomain.c (_nl_find_domain): Correct comment.
	* intl/localealias.c: Include <bits/libc-lock.h> not <libc-lock.h>.

	* libio/stdio.h: Make {,v}snprintf available if __USE_BSD.
	Change extern inline functions to work correctly in C++.

	* locale/iso-4217.def: Update for more recent ISO 4217 version.

	* locale/loadlocale.c (_nl_load_locale): Add cast.

	* manual/message.texi: Finish gettext section.

	* posix/getopt_init.c: Don't use relative #include path.
	(__getopt_clean_environment): Change function to take pointer to
	environment as argument.  Optimize generation of test string a bit.
	* sysdeps/unix/sysv/linux/init-first.c: Call __getopt_clean_environment
	with additional argument.

	* posix/glob.c: Add prototype for next_brace_sub.

	* sysdeps/generic/dl-sysdep.c: Recognize AT_BASE value on auxiliary
	vector.

	* sysdeps/i386/dl-machine.h (elf_machine_load_address): Rewrite
	to not generate relocation entry.  Suggested by Richard Henderson.
	(ELF_MACHINE_BEFORE_RTLD_RELOC): Removed.
	(elf_machine_runtime_setup): Add .aligns.

	* sysdeps/i386/fpu/fraiseexcpt.c: Add volatile to asms.

	* sysdeps/i386/fpu/bits/mathinline.h: Partially undo change of
	1997-08-14 03:14.  gcc 2.7.2* is really broken in some aspects.

	* sysdeps/standalone/i386/i386.h: Clean up asm statements a bit.
	* sysdeps/standalone/i960/i960ca.h: Likewise.

1997-08-22 19:04  Richard Henderson  <rth@cygnus.com>

	* elf/rtld.c (_dl_start): Init _dl_rtld_map.l_opencount due to
	undocumented test addition in _dl_map_object.

	Support ET_EXEC versions of ld.so, for debugging at least:

	* elf/dl-load.c (_dl_map_object): add_name_to_object could get
	called despite the DT_SONAME != NULL test, segfaulting.  Simplify
	the code here as well.
	* elf/dl-lookup.c (do_lookup): Skip objects with no symtab.
	(_dl_setup_hash): Likewise for hash tables.
	* elf/dl-version.c (_dl_check_map_versions): Likewise for strtabs.
	* elf/rtld.c (_dl_start): Likewise for rpath.
	(_dl_rtld_libname2): New variable.
	(dl_main): Use it to add an soname for ourselves when we don't have
	one of our own.  Base it on the target's .interp.
	(dl_main): Again, skip printing of objects that don't have strtabs.

	Sparc 32 merge:

	* elf/dl-runtime.c (ELF_FIXUP_RETURN_VALUE): Provide default value.
	(fixup): Simplify code.  Use ELF_FIXUP_RETURN_VALUE.
	(profile_fixup): Likewise, though this still needs fixing for
	Sparc32 and PPC.
	* sysdeps/powerpc/dl-machine.h: Transmute ELF_FIXUP_RETURNS_ADDRESS
	to ELF_FIXUP_RETURN_VALUE.

	* sysdeps/sparc/sparc32/dl-machine.h: Implement lazy relocation.
	Fix up _dl_start_user to handle _dl_skip_args properly.
	Use _dl_hwcap to determine if "flush" is available/needed.

	* sysdeps/sparc/configure.in: Remove.  It doesn't actually do
	anything anymore, and what it did do is done somewhere else.
	* sysdeps/sparc/configure: Likewise.

	* sysdeps/sparc/fpu/bits/mathdef.h (FP_ILOGB0, FP_ILOGBNAN): New.

	* sysdeps/sparc/fpu/fraiseexcpt.c: Rearrange for smaller code.

	* sysdeps/sparc/sparc32/Makefile: Fix sparc->sparc/sparc32 bits
	in divrem expansions.

	* sysdeps/unix/sysv/linux/sparc/sparc32/sysdep.h (END, LOC): New
	definitions for assembly syntax differences.

	* sysdeps/sparc/sparc32/__longjmp.S: %g6,%g7 are reserved to the
	"system".  Use %g2,%g3 instead.  Use new local label macro.
	* sysdeps/sparc/sparc32/add_n.S: Use <sysdep.h> and ENTRY, END,
	and LOC for proper assembly headers/footers.
	* sysdeps/sparc/sparc32/addmul_1.S: Likewise.
	* sysdeps/sparc/sparc32/alloca.S: Likewise.
	* sysdeps/sparc/sparc32/dotmul.S: Likewise.
	* sysdeps/sparc/sparc32/lshift.S: Likewise.
	* sysdeps/sparc/sparc32/mul_1.S: Likewise.
	* sysdeps/sparc/sparc32/rshift.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv8/addmul_1.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv8/mul_1.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv8/submul_1.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv8/udiv_qrnnd.S: Likewise.
	* sysdeps/sparc/sparc32/sub_n.S: Likewise.
	* sysdeps/sparc/sparc32/submul_1.S: Likewise.
	* sysdeps/sparc/sparc32/udiv_qrnnd.S: Likewise.
	* sysdeps/sparc/sparc32/umul.S: Likewise.
	* sysdeps/sparc/sparc32/divrem.m4: Likewise.
	* sysdeps/sparc/sparc32/rem.S: Regenerate.
	* sysdeps/sparc/sparc32/sdiv.S: Regenerate.
	* sysdeps/sparc/sparc32/udiv.S: Regenerate.
	* sysdeps/sparc/sparc32/urem.S: Regenerate.

	* sysdeps/sparc/sparc32/sparcv8/dotmul.S: New file.
	* sysdeps/sparc/sparc32/sparcv8/rem.S: New file.
	* sysdeps/sparc/sparc32/sparcv8/sdiv.S: New file.
	* sysdeps/sparc/sparc32/sparcv8/udiv.S: New file.
	* sysdeps/sparc/sparc32/sparcv8/umul.S: New file.
	* sysdeps/sparc/sparc32/sparcv8/urem.S: New file.

	* sysdeps/sparc/sparc32/bsd-_setjmp.S: Dike out.
	* sysdeps/sparc/sparc32/bsd-setjmp.S: Likewise.
	* sysdeps/sparc/sparc32/setjmp.S: Add _setjmp and setjmp entry points.

	* sysdeps/unix/sysv/linux/sparc/sparc32/__sigtrampoline.S:
	Clean up PIC code.

	* sysdeps/sparc/sparc32/elf/start.S: New file, slightly modified
	from the sparc64 version.
	* sysdeps/sparc/sparc32/elf/start.c: Removed.

	* sysdeps/unix/sysv/linux/sparc/sparc32/init-first.h: Rewrite in
	assembly based on the sparc64 version.

	* sysdeps/sparc/sparc32/fpu/bits/fenv.h: Duh.  Use proper syntax
	for manipulating %fsr.
	* sysdeps/sparc/sparc32/fpu/fpu_control.h: Make IEEE conformance
	be the default.

	* elf/elf.h (HWCAP_SPARC_*): New definitions.
	* elf/rtld.c (_dl_hwcap): New variable.
	* sysdeps/generic/dl-sysdep.c (_dl_sysdep_start): Record AT_HWCAP.

	* sysdeps/unix/sysv/linux/sparc/sparc32/getpagesize.c: New file.
	Attempt to get hold of the page size based on what we might have
	been told at startup time in _dl_pagesize.  This will be obsolete
	when I finish the kernel hooks for a proper sysconf(), stay tuned.

	Sparc 64 merge:

	* sysdeps/sparc/sparc64/dl-machine.h (ELF_FIXUP_RETURN_VALUE): New.
	Figure out the right thing to return based on the .plt format.

	* sysdeps/sparc/sparc64/fpu/fpu_control.h: Update comment.

	* sysdeps/unix/sysv/linux/sparc/sparc64/bits/types.h (__dev_t):
	Should have been 64 bits wide.

	* sysdeps/unix/sysv/linux/sparc/sparc64/init-first.h: sll->sllx,
	optimize for branch delay slot usage.

1997-08-22  Andreas Schwab  <schwab@issan.informatik.uni-dortmund.de>

	* csu/Makefile ($(objpfx)crt%.o): Fix a missing *.so -> *.os
	change.

1997-08-20  Andreas Jaeger  <aj@arthur.rhein-neckar.de>

	* math/libm-test.c (identities): Change epsilon.

	* sysdeps/i386/fpu/bits/mathinline.h: Correct arguments to fabs,
	fabsf, fabsl, __fabsl.

	* sysdeps/libm-i387/e_remainderl.S: Pop extra value from FPU stack.
	* sysdeps/libm-ieee754/s_csinhl.c: Include <fenv.h>.
1997-08-24 10:55:18 +00:00


/*
 * Division and remainder, from Appendix E of the Sparc Version 8
 * Architecture Manual, with fixes from Gordon Irlam.
 */

/*
 * Input: dividend and divisor in %o0 and %o1 respectively.
 *
 * m4 parameters:
 *  NAME	name of function to generate
 *  OP		OP=div => %o0 / %o1; OP=rem => %o0 % %o1
 *  S		S=true => signed; S=false => unsigned
 *
 * Algorithm parameters:
 *  N		how many bits per iteration we try to get (4)
 *  WORDSIZE	total number of bits (32)
 *
 * Derived constants:
 *  TOPBITS	number of bits in the top `decade' of a number
 *
 * Important variables:
 *  Q		the partial quotient under development (initially 0)
 *  R		the remainder so far, initially the dividend
 *  ITER	number of main division loop iterations required;
 *		equal to ceil(log2(quotient) / N).  Note that this
 *		is the log base (2^N) of the quotient.
 *  V		the current comparand, initially divisor*2^(ITER*N-1)
 *
 * Cost:
 *  Current estimate for non-large dividend is
 *	ceil(log2(quotient) / N) * (10 + 7N/2) + C
 *  A large dividend is one greater than 2^(31-TOPBITS) and takes a
 *  different path, as the upper bits of the quotient must be developed
 *  one bit at a time.
 */
define(N, `4')dnl
define(WORDSIZE, `32')dnl
define(TOPBITS, eval(WORDSIZE - N*((WORDSIZE-1)/N)))dnl
dnl
define(dividend, `%o0')dnl
define(divisor, `%o1')dnl
define(Q, `%o2')dnl
define(R, `%o3')dnl
define(ITER, `%o4')dnl
define(V, `%o5')dnl
dnl
dnl m4 reminder: ifelse(a,b,c,d) => if a is b, then c, else d
define(T, `%g1')dnl
define(SC, `%g2')dnl
ifelse(S, `true', `define(SIGN, `%g3')')dnl
dnl
dnl This is the recursive definition for developing quotient digits.
dnl
dnl Parameters:
dnl	$1	the current depth, 1 <= $1 <= N
dnl	$2	the current accumulation of quotient bits
dnl	N	max depth
dnl
dnl We add a new bit to $2 and either recurse or insert the bits in
dnl the quotient.  R, Q, and V are inputs and outputs as defined above;
dnl the condition codes are expected to reflect the input R, and are
dnl modified to reflect the output R.
dnl
define(DEVELOP_QUOTIENT_BITS,
`	! depth $1, accumulated bits $2
	bl	LOC($1.eval(2**N+$2))
	 srl	V,1,V
	! remainder is positive
	subcc	R,V,R
	ifelse($1, N,
	`	b	9f
		 add	Q, ($2*2+1), Q
	', `	DEVELOP_QUOTIENT_BITS(incr($1), `eval(2*$2+1)')')
LOC($1.eval(2**N+$2)):
	! remainder is negative
	addcc	R,V,R
	ifelse($1, N,
	`	b	9f
		 add	Q, ($2*2-1), Q
	', `	DEVELOP_QUOTIENT_BITS(incr($1), `eval(2*$2-1)')')
	ifelse($1, 1, `9:')')dnl
#include <sysdep.h>
#include <sys/trap.h>
ENTRY(NAME)
ifelse(S, `true',
`	! compute sign of result; if neither is negative, no problem
	orcc	divisor, dividend, %g0	! either negative?
	bge	2f			! no, go do the divide
ifelse(OP, `div',
`	 xor	divisor, dividend, SIGN	! compute sign in any case',
`	 mov	dividend, SIGN		! sign of remainder matches dividend')
	tst	divisor
	bge	1f
	 tst	dividend
	! divisor is definitely negative; dividend might also be negative
	bge	2f			! if dividend not negative...
	 sub	%g0, divisor, divisor	! in any case, make divisor nonneg
1:	! dividend is negative, divisor is nonnegative
	sub	%g0, dividend, dividend	! make dividend nonnegative
2:
')
	! Ready to divide.  Compute size of quotient; scale comparand.
	orcc	divisor, %g0, V
	bne	1f
	 mov	dividend, R

	! Divide by zero trap.  If it returns, return 0 (about as
	! wrong as possible, but that is what SunOS does...).
	ta	ST_DIV0
	retl
	 clr	%o0

1:
	cmp	R, V			! if divisor exceeds dividend, done
	blu	LOC(got_result)		! (and algorithm fails otherwise)
	 clr	Q
	sethi	%hi(1 << (WORDSIZE - TOPBITS - 1)), T
	cmp	R, T
	blu	LOC(not_really_big)
	 clr	ITER

	! `Here the dividend is >= 2**(31-N) or so.  We must be careful here,
	! as our usual N-at-a-shot divide step will cause overflow and havoc.
	! The number of bits in the result here is N*ITER+SC, where SC <= N.
	! Compute ITER in an unorthodox manner: know we need to shift V into
	! the top decade: so do not even bother to compare to R.'
1:
	cmp	V, T
	bgeu	3f
	 mov	1, SC
	sll	V, N, V
	b	1b
	 add	ITER, 1, ITER

	! Now compute SC.
2:	addcc	V, V, V
	bcc	LOC(not_too_big)
	 add	SC, 1, SC

	! We get here if the divisor overflowed while shifting.
	! This means that R has the high-order bit set.
	! Restore V and subtract from R.
	sll	T, TOPBITS, T	! high order bit
	srl	V, 1, V		! rest of V
	add	V, T, V
	b	LOC(do_single_div)
	 sub	SC, 1, SC

LOC(not_too_big):
3:	cmp	V, R
	blu	2b
	 nop
	be	LOC(do_single_div)
	 nop
	/* NB: these are commented out in the V8-Sparc manual as well */
	/* (I do not understand this) */
	! V > R: went too far: back up 1 step
	!	srl	V, 1, V
	!	dec	SC
	! do single-bit divide steps
	!
	! We have to be careful here.  We know that R >= V, so we can do the
	! first divide step without thinking.  BUT, the others are conditional,
	! and are only done if R >= 0.  Because both R and V may have the high-
	! order bit set in the first step, just falling into the regular
	! division loop will mess up the first time around.
	! So we unroll slightly...
LOC(do_single_div):
	subcc	SC, 1, SC
	bl	LOC(end_regular_divide)
	 nop
	sub	R, V, R
	mov	1, Q
	b	LOC(end_single_divloop)
	 nop
LOC(single_divloop):
	sll	Q, 1, Q
	bl	1f
	 srl	V, 1, V
	! R >= 0
	sub	R, V, R
	b	2f
	 add	Q, 1, Q
1:	! R < 0
	add	R, V, R
	sub	Q, 1, Q
2:
LOC(end_single_divloop):
	subcc	SC, 1, SC
	bge	LOC(single_divloop)
	 tst	R
	b,a	LOC(end_regular_divide)

LOC(not_really_big):
1:
	sll	V, N, V
	cmp	V, R
	bleu	1b
	 addcc	ITER, 1, ITER
	be	LOC(got_result)
	 sub	ITER, 1, ITER

	tst	R	! set up for initial iteration
LOC(divloop):
	sll	Q, N, Q
	DEVELOP_QUOTIENT_BITS(1, 0)
LOC(end_regular_divide):
	subcc	ITER, 1, ITER
	bge	LOC(divloop)
	 tst	R
	bl,a	LOC(got_result)
	! non-restoring fixup here (one instruction only!)
ifelse(OP, `div',
`	 sub	Q, 1, Q
', `	 add	R, divisor, R
')

LOC(got_result):
ifelse(S, `true',
`	! check to see if answer should be < 0
	tst	SIGN
	bl,a	1f
	 ifelse(OP, `div', `sub %g0, Q, Q', `sub %g0, R, R')
1:')
	retl
	 ifelse(OP, `div', `mov Q, %o0', `mov R, %o0')

END(NAME)
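
For readers who want to check the arithmetic outside of SPARC assembly, the non-restoring scheme used above can be modelled in C. This is only a rough sketch under stated assumptions, not the routine the file generates: `nonrestoring_divrem` and `struct divrem_result` are hypothetical names, the model works one bit per step (N = 1) over a fixed 32 steps instead of scaling V and unrolling N = 4 steps, it handles the unsigned case only, and it returns 0 for a zero divisor as a stand-in for the `ta ST_DIV0` trap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not part of divrem.m4.  */
struct divrem_result
{
  uint32_t quot;
  uint32_t rem;
};

/* Radix-2 model of non-restoring division (unsigned case only).
   Every step emits a quotient digit of +1 or -1 depending on the sign
   of the running remainder, and a single fixup at the end makes the
   remainder nonnegative, as in the "non-restoring fixup" above.  */
struct divrem_result
nonrestoring_divrem (uint32_t dividend, uint32_t divisor)
{
  struct divrem_result res = { 0, 0 };

  /* Assumption: stand-in for the divide-by-zero trap; the assembly
     returns 0 if the trap handler returns.  */
  if (divisor == 0)
    return res;

  int64_t r = dividend;                 /* R: remainder so far */
  int64_t v = (int64_t) divisor << 31;  /* V: comparand, fully scaled */
  int64_t q = 0;                        /* Q: partial quotient */

  for (int i = 0; i < 32; i++)
    {
      if (r >= 0)
        {
          /* remainder is nonnegative: subtract, digit +1 */
          r -= v;
          q = 2 * q + 1;
        }
      else
        {
          /* remainder is negative: add back, digit -1 */
          r += v;
          q = 2 * q - 1;
        }
      v >>= 1;
    }

  /* Final fixup: one correction step if the remainder ended negative.  */
  if (r < 0)
    {
      q -= 1;
      r += divisor;
    }

  /* Invariant of the scheme: dividend == q * divisor + r, 0 <= r < divisor.  */
  assert (q * (int64_t) divisor + r == (int64_t) dividend);
  assert (r >= 0 && r < (int64_t) divisor);

  res.quot = (uint32_t) q;
  res.rem = (uint32_t) r;
  return res;
}
```

Under these assumptions, `nonrestoring_divrem (100, 7)` gives a quotient of 14 and a remainder of 2. The assembly reaches the same result in fewer steps because it shifts V only as far as the dividend requires and develops N quotient bits per loop iteration.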