glibc/sysdeps/sparc/sparc32/sub_n.S
Ulrich Drepper f41c80910d Update.
1997-08-24 12:24  Ulrich Drepper  <drepper@cygnus.com>

	* configure.in (INSTALL): Quote `$'.

	* libc.map: Add __xpg_basename.

	* csu/Makefile (initfini.s): Disable optimization.

	* elf/dl-deps.c: Implement handling of DL_FILTER.

	* elf/dl-load.c (_dl_init_paths): Add error check.

	* intl/finddomain.c (_nl_find_domain): Correct comment.
	* intl/localealias.c: Include <bits/libc-lock.h> not <libc-lock.h>.

	* libio/stdio.h: Make {,v}snprintf available if __USE_BSD.
	Change extern inline functions to work correctly in C++.

	* locale/iso-4217.def: Update for more recent ISO 4217 version.

	* locale/loadlocale.c (_nl_load_locale): Add cast.

	* manual/message.texi: Finish gettext section.

	* posix/getopt_init.c: Don't use relative #include path.
	(__getopt_clean_environment): Change function to take pointer to
	environment as argument.  Optimize generation of test string a bit.
	* sysdeps/unix/sysv/linux/init-first.c: Call __getopt_clean_environment
	with additional argument.

	* posix/glob.c: Add prototype for next_brace_sub.

	* sysdeps/generic/dl-sysdep.c: Recognize AT_BASE value on auxiliary
	vector.

	* sysdeps/i386/dl-machine.h (elf_machine_load_address): Rewrite
	to not generate relocation entry.  Suggested by Richard Henderson.
	(ELF_MACHINE_BEFORE_RTLD_RELOC): Removed.
	(elf_machine_runtime_setup): Add .aligns.

	* sysdeps/i386/fpu/fraiseexcpt.c: Add volatile to asms.

	* sysdeps/i386/fpu/bits/mathinline.h: Partially undo change of
	1997-08-14 03:14.  gcc 2.7.2* is really broken in some aspects.

	* sysdeps/standalone/i386/i386.h: Clean up asm statements a bit.
	* sysdeps/standalone/i960/i960ca.h: Likewise.

1997-08-22 19:04  Richard Henderson  <rth@cygnus.com>

	* elf/rtld.c (_dl_start): Init _dl_rtld_map.l_opencount due to
	undocumented test addition in _dl_map_object.

	Support ET_EXEC versions of ld.so, for debugging at least:

	* elf/dl-load.c (_dl_map_object): Add_name_to_object could get
	called despite the DT_SONAME != NULL test, segfaulting.  Simplify
	the code here as well.
	* elf/dl-lookup.c (do_lookup): Skip objects with no symtab.
	(_dl_setup_hash): Likewise for hash tables.
	* elf/dl-version.c (_dl_check_map_versions): Likewise for strtabs.
	* elf/rtld.c (_dl_start): Likewise for rpath.
	(_dl_rtld_libname2): New variable.
	(dl_main): Use it to add an soname for ourselves when we don't have
	one of our own.  Base it on the target's .interp.
	(dl_main): Again, skip printing of objects that don't have strtabs.

	Sparc 32 merge:

	* elf/dl-runtime.c (ELF_FIXUP_RETURN_VALUE): Provide default value.
	(fixup): Simplify code.  Use ELF_FIXUP_RETURN_VALUE.
	(profile_fixup): Likewise, though this still needs fixing for
	Sparc32 and PPC.
	* sysdeps/powerpc/dl-machine.h: Transmute ELF_FIXUP_RETURNS_ADDRESS
	to ELF_FIXUP_RETURN_VALUE.

	* sysdeps/sparc/sparc32/dl-machine.h: Implement lazy relocation.
	Fix up _dl_start_user to handle _dl_skip_args properly.
	Use _dl_hwcap to determine if "flush" is available/needed.

	* sysdeps/sparc/configure.in: Remove.  It doesn't actually do
	anything anymore, and what it did do is done somewhere else.
	* sysdeps/sparc/configure: Likewise.

	* sysdeps/sparc/fpu/bits/mathdef.h (FP_ILOGB0, FP_ILOGBNAN): New.

	* sysdeps/sparc/fpu/fraiseexcpt.c: Rearrange for smaller code.

	* sysdeps/sparc/sparc32/Makefile: Fix sparc->sparc/sparc32 bits
	in divrem expansions.

	* sysdeps/unix/sysv/linux/sparc/sparc32/sysdep.h (END, LOC): New
	definitions for assembly syntax differences.

	* sysdeps/sparc/sparc32/__longjmp.S: %g6,%g7 are reserved to the
	"system".  Use %g2,%g3 instead.  Use new local label macro.
	* sysdeps/sparc/sparc32/add_n.S: Use <sysdep.h> and ENTRY, END,
	and LOC for proper assembly headers/footers.
	* sysdeps/sparc/sparc32/addmul_1.S: Likewise.
	* sysdeps/sparc/sparc32/alloca.S: Likewise.
	* sysdeps/sparc/sparc32/dotmul.S: Likewise.
	* sysdeps/sparc/sparc32/lshift.S: Likewise.
	* sysdeps/sparc/sparc32/mul_1.S: Likewise.
	* sysdeps/sparc/sparc32/rshift.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv8/addmul_1.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv8/mul_1.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv8/submul_1.S: Likewise.
	* sysdeps/sparc/sparc32/sparcv8/udiv_qrnnd.S: Likewise.
	* sysdeps/sparc/sparc32/sub_n.S: Likewise.
	* sysdeps/sparc/sparc32/submul_1.S: Likewise.
	* sysdeps/sparc/sparc32/udiv_qrnnd.S: Likewise.
	* sysdeps/sparc/sparc32/umul.S: Likewise.
	* sysdeps/sparc/sparc32/divrem.m4: Likewise.
	* sysdeps/sparc/sparc32/rem.S: Regenerate.
	* sysdeps/sparc/sparc32/sdiv.S: Regenerate.
	* sysdeps/sparc/sparc32/udiv.S: Regenerate.
	* sysdeps/sparc/sparc32/urem.S: Regenerate.

	* sysdeps/sparc/sparc32/sparcv8/dotmul.S: New file.
	* sysdeps/sparc/sparc32/sparcv8/rem.S: New file.
	* sysdeps/sparc/sparc32/sparcv8/sdiv.S: New file.
	* sysdeps/sparc/sparc32/sparcv8/udiv.S: New file.
	* sysdeps/sparc/sparc32/sparcv8/umul.S: New file.
	* sysdeps/sparc/sparc32/sparcv8/urem.S: New file.

	* sysdeps/sparc/sparc32/bsd-_setjmp.S: Dike out.
	* sysdeps/sparc/sparc32/bsd-setjmp.S: Likewise.
	* sysdeps/sparc/sparc32/setjmp.S: Add _setjmp and setjmp entry points.

	* sysdeps/unix/sysv/linux/sparc/sparc32/__sigtrampoline.S:
	Clean up PIC code.

	* sysdeps/sparc/sparc32/elf/start.S: New file, slightly modified
	from the sparc64 version.
	* sysdeps/sparc/sparc32/elf/start.c: Removed.

	* sysdeps/unix/sysv/linux/sparc/sparc32/init-first.h: Rewrite in
	assembly based on the sparc64 version.

	* sysdeps/sparc/sparc32/fpu/bits/fenv.h: Duh.  Use proper syntax
	for manipulating %fsr.
	* sysdeps/sparc/sparc32/fpu/fpu_control.h: Make IEEE conformance
	be the default.

	* elf/elf.h (HWCAP_SPARC_*): New definitions.
	* elf/rtld.c (_dl_hwcap): New variable.
	* sysdeps/generic/dl-sysdep.c (_dl_sysdep_start): Record AT_HWCAP.

	* sysdeps/unix/sysv/linux/sparc/sparc32/getpagesize.c: New file.
	Attempt to get hold of the page size based on what we might have
	been told at startup time in _dl_pagesize.  This will be obsolete
	when I finish the kernel hooks for a proper sysconf(), stay tuned.

	Sparc 64 merge:

	* sysdeps/sparc/sparc64/dl-machine.h (ELF_FIXUP_RETURN_VALUE): New.
	Figure out the right thing to return based on the .plt format.

	* sysdeps/sparc/sparc64/fpu/fpu_control.h: Update comment.

	* sysdeps/unix/sysv/linux/sparc/sparc64/bits/types.h (__dev_t):
	Should have been 64-bits wide.

	* sysdeps/unix/sysv/linux/sparc/sparc64/init-first.h: sll->sllx,
	optimize for branch delay slot usage.

1997-08-22  Andreas Schwab  <schwab@issan.informatik.uni-dortmund.de>

	* csu/Makefile ($(objpfx)crt%.o): Fix a missing *.so -> *.os
	change.

1997-08-20  Andreas Jaeger  <aj@arthur.rhein-neckar.de>

	* math/libm-test.c (identities): Change epsilon.

	* sysdeps/i386/fpu/bits/mathinline.h: Correct arguments to fabs,
	fabsf, fabsl, __fabsl.

	* sysdeps/libm-i387/e_remainderl.S: Pop extra value from FPU stack.
	* sysdeps/libm-ieee754/s_csinhl.c: Include <fenv.h>.
1997-08-24 10:55:18 +00:00


! SPARC __mpn_sub_n -- Subtract two limb vectors of the same length > 0 and
! store difference in a third limb vector.
!
! Copyright (C) 1995, 1996 Free Software Foundation, Inc.
!
! This file is part of the GNU MP Library.
!
! The GNU MP Library is free software; you can redistribute it and/or modify
! it under the terms of the GNU Library General Public License as published by
! the Free Software Foundation; either version 2 of the License, or (at your
! option) any later version.
!
! The GNU MP Library is distributed in the hope that it will be useful, but
! WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
! or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Library General Public
! License for more details.
!
! You should have received a copy of the GNU Library General Public License
! along with the GNU MP Library; see the file COPYING.LIB. If not, write to
! the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
! MA 02111-1307, USA.
! INPUT PARAMETERS
#define RES_PTR %o0
#define S1_PTR %o1
#define S2_PTR %o2
#define SIZE %o3
#include <sysdep.h>
ENTRY(__mpn_sub_n)
xor S2_PTR,RES_PTR,%g1
andcc %g1,4,%g0
bne LOC(1) ! branch if alignment differs
nop
! ** V1a **
andcc RES_PTR,4,%g0 ! RES_PTR unaligned? Side effect: cy=0
be LOC(v1) ! if no, branch
nop
/* Subtract least significant limb separately to align RES_PTR and S2_PTR */
ld [S1_PTR],%g4
add S1_PTR,4,S1_PTR
ld [S2_PTR],%g2
add S2_PTR,4,S2_PTR
add SIZE,-1,SIZE
subcc %g4,%g2,%o4
st %o4,[RES_PTR]
add RES_PTR,4,RES_PTR
LOC(v1):
addx %g0,%g0,%o4 ! save cy in register
cmp SIZE,2 ! if SIZE < 2 ...
bl LOC(end2) ! ... branch to tail code
subcc %g0,%o4,%g0 ! restore cy
ld [S1_PTR+0],%g4
addcc SIZE,-10,SIZE
ld [S1_PTR+4],%g1
ldd [S2_PTR+0],%g2
blt LOC(fin1)
subcc %g0,%o4,%g0 ! restore cy
/* Subtract blocks of 8 limbs until fewer than 8 limbs remain */
LOC(loop1):
subxcc %g4,%g2,%o4
ld [S1_PTR+8],%g4
subxcc %g1,%g3,%o5
ld [S1_PTR+12],%g1
ldd [S2_PTR+8],%g2
std %o4,[RES_PTR+0]
subxcc %g4,%g2,%o4
ld [S1_PTR+16],%g4
subxcc %g1,%g3,%o5
ld [S1_PTR+20],%g1
ldd [S2_PTR+16],%g2
std %o4,[RES_PTR+8]
subxcc %g4,%g2,%o4
ld [S1_PTR+24],%g4
subxcc %g1,%g3,%o5
ld [S1_PTR+28],%g1
ldd [S2_PTR+24],%g2
std %o4,[RES_PTR+16]
subxcc %g4,%g2,%o4
ld [S1_PTR+32],%g4
subxcc %g1,%g3,%o5
ld [S1_PTR+36],%g1
ldd [S2_PTR+32],%g2
std %o4,[RES_PTR+24]
addx %g0,%g0,%o4 ! save cy in register
addcc SIZE,-8,SIZE
add S1_PTR,32,S1_PTR
add S2_PTR,32,S2_PTR
add RES_PTR,32,RES_PTR
bge LOC(loop1)
subcc %g0,%o4,%g0 ! restore cy
LOC(fin1):
addcc SIZE,8-2,SIZE
blt LOC(end1)
subcc %g0,%o4,%g0 ! restore cy
/* Subtract blocks of 2 limbs until fewer than 2 limbs remain */
LOC(loope1):
subxcc %g4,%g2,%o4
ld [S1_PTR+8],%g4
subxcc %g1,%g3,%o5
ld [S1_PTR+12],%g1
ldd [S2_PTR+8],%g2
std %o4,[RES_PTR+0]
addx %g0,%g0,%o4 ! save cy in register
addcc SIZE,-2,SIZE
add S1_PTR,8,S1_PTR
add S2_PTR,8,S2_PTR
add RES_PTR,8,RES_PTR
bge LOC(loope1)
subcc %g0,%o4,%g0 ! restore cy
LOC(end1):
subxcc %g4,%g2,%o4
subxcc %g1,%g3,%o5
std %o4,[RES_PTR+0]
addx %g0,%g0,%o4 ! save cy in register
andcc SIZE,1,%g0
be LOC(ret1)
subcc %g0,%o4,%g0 ! restore cy
/* Subtract last limb */
ld [S1_PTR+8],%g4
ld [S2_PTR+8],%g2
subxcc %g4,%g2,%o4
st %o4,[RES_PTR+8]
LOC(ret1):
retl
addx %g0,%g0,%o0 ! return borrow-out from most significant limb
LOC(1): xor S1_PTR,RES_PTR,%g1
andcc %g1,4,%g0
bne LOC(2)
nop
! ** V1b **
andcc RES_PTR,4,%g0 ! RES_PTR unaligned? Side effect: cy=0
be LOC(v1b) ! if no, branch
nop
/* Subtract least significant limb separately to align RES_PTR and S1_PTR */
ld [S2_PTR],%g4
add S2_PTR,4,S2_PTR
ld [S1_PTR],%g2
add S1_PTR,4,S1_PTR
add SIZE,-1,SIZE
subcc %g2,%g4,%o4
st %o4,[RES_PTR]
add RES_PTR,4,RES_PTR
LOC(v1b):
addx %g0,%g0,%o4 ! save cy in register
cmp SIZE,2 ! if SIZE < 2 ...
bl LOC(end2) ! ... branch to tail code
subcc %g0,%o4,%g0 ! restore cy
ld [S2_PTR+0],%g4
addcc SIZE,-10,SIZE
ld [S2_PTR+4],%g1
ldd [S1_PTR+0],%g2
blt LOC(fin1b)
subcc %g0,%o4,%g0 ! restore cy
/* Subtract blocks of 8 limbs until fewer than 8 limbs remain */
LOC(loop1b):
subxcc %g2,%g4,%o4
ld [S2_PTR+8],%g4
subxcc %g3,%g1,%o5
ld [S2_PTR+12],%g1
ldd [S1_PTR+8],%g2
std %o4,[RES_PTR+0]
subxcc %g2,%g4,%o4
ld [S2_PTR+16],%g4
subxcc %g3,%g1,%o5
ld [S2_PTR+20],%g1
ldd [S1_PTR+16],%g2
std %o4,[RES_PTR+8]
subxcc %g2,%g4,%o4
ld [S2_PTR+24],%g4
subxcc %g3,%g1,%o5
ld [S2_PTR+28],%g1
ldd [S1_PTR+24],%g2
std %o4,[RES_PTR+16]
subxcc %g2,%g4,%o4
ld [S2_PTR+32],%g4
subxcc %g3,%g1,%o5
ld [S2_PTR+36],%g1
ldd [S1_PTR+32],%g2
std %o4,[RES_PTR+24]
addx %g0,%g0,%o4 ! save cy in register
addcc SIZE,-8,SIZE
add S1_PTR,32,S1_PTR
add S2_PTR,32,S2_PTR
add RES_PTR,32,RES_PTR
bge LOC(loop1b)
subcc %g0,%o4,%g0 ! restore cy
LOC(fin1b):
addcc SIZE,8-2,SIZE
blt LOC(end1b)
subcc %g0,%o4,%g0 ! restore cy
/* Subtract blocks of 2 limbs until fewer than 2 limbs remain */
LOC(loope1b):
subxcc %g2,%g4,%o4
ld [S2_PTR+8],%g4
subxcc %g3,%g1,%o5
ld [S2_PTR+12],%g1
ldd [S1_PTR+8],%g2
std %o4,[RES_PTR+0]
addx %g0,%g0,%o4 ! save cy in register
addcc SIZE,-2,SIZE
add S1_PTR,8,S1_PTR
add S2_PTR,8,S2_PTR
add RES_PTR,8,RES_PTR
bge LOC(loope1b)
subcc %g0,%o4,%g0 ! restore cy
LOC(end1b):
subxcc %g2,%g4,%o4
subxcc %g3,%g1,%o5
std %o4,[RES_PTR+0]
addx %g0,%g0,%o4 ! save cy in register
andcc SIZE,1,%g0
be LOC(ret1b)
subcc %g0,%o4,%g0 ! restore cy
/* Subtract last limb */
ld [S2_PTR+8],%g4
ld [S1_PTR+8],%g2
subxcc %g2,%g4,%o4
st %o4,[RES_PTR+8]
LOC(ret1b):
retl
addx %g0,%g0,%o0 ! return borrow-out from most significant limb
! ** V2 **
/* If we come here, the alignment of S1_PTR and RES_PTR as well as the
alignment of S2_PTR and RES_PTR differ. Since there are only two ways
things can be aligned (that we care about) we now know that S1_PTR and
S2_PTR have the same alignment. */
LOC(2): cmp SIZE,1
be LOC(jone)
nop
andcc S1_PTR,4,%g0 ! S1_PTR unaligned? Side effect: cy=0
be LOC(v2) ! if no, branch
nop
/* Subtract least significant limb separately to align S1_PTR and S2_PTR */
ld [S1_PTR],%g4
add S1_PTR,4,S1_PTR
ld [S2_PTR],%g2
add S2_PTR,4,S2_PTR
add SIZE,-1,SIZE
subcc %g4,%g2,%o4
st %o4,[RES_PTR]
add RES_PTR,4,RES_PTR
LOC(v2):
addx %g0,%g0,%o4 ! save cy in register
addcc SIZE,-8,SIZE
blt LOC(fin2)
subcc %g0,%o4,%g0 ! restore cy
/* Subtract blocks of 8 limbs until fewer than 8 limbs remain */
LOC(loop2):
ldd [S1_PTR+0],%g2
ldd [S2_PTR+0],%o4
subxcc %g2,%o4,%g2
st %g2,[RES_PTR+0]
subxcc %g3,%o5,%g3
st %g3,[RES_PTR+4]
ldd [S1_PTR+8],%g2
ldd [S2_PTR+8],%o4
subxcc %g2,%o4,%g2
st %g2,[RES_PTR+8]
subxcc %g3,%o5,%g3
st %g3,[RES_PTR+12]
ldd [S1_PTR+16],%g2
ldd [S2_PTR+16],%o4
subxcc %g2,%o4,%g2
st %g2,[RES_PTR+16]
subxcc %g3,%o5,%g3
st %g3,[RES_PTR+20]
ldd [S1_PTR+24],%g2
ldd [S2_PTR+24],%o4
subxcc %g2,%o4,%g2
st %g2,[RES_PTR+24]
subxcc %g3,%o5,%g3
st %g3,[RES_PTR+28]
addx %g0,%g0,%o4 ! save cy in register
addcc SIZE,-8,SIZE
add S1_PTR,32,S1_PTR
add S2_PTR,32,S2_PTR
add RES_PTR,32,RES_PTR
bge LOC(loop2)
subcc %g0,%o4,%g0 ! restore cy
LOC(fin2):
addcc SIZE,8-2,SIZE
blt LOC(end2)
subcc %g0,%o4,%g0 ! restore cy
LOC(loope2):
ldd [S1_PTR+0],%g2
ldd [S2_PTR+0],%o4
subxcc %g2,%o4,%g2
st %g2,[RES_PTR+0]
subxcc %g3,%o5,%g3
st %g3,[RES_PTR+4]
addx %g0,%g0,%o4 ! save cy in register
addcc SIZE,-2,SIZE
add S1_PTR,8,S1_PTR
add S2_PTR,8,S2_PTR
add RES_PTR,8,RES_PTR
bge LOC(loope2)
subcc %g0,%o4,%g0 ! restore cy
LOC(end2):
andcc SIZE,1,%g0
be LOC(ret2)
subcc %g0,%o4,%g0 ! restore cy
/* Subtract last limb */
LOC(jone):
ld [S1_PTR],%g4
ld [S2_PTR],%g2
subxcc %g4,%g2,%o4
st %o4,[RES_PTR]
LOC(ret2):
retl
addx %g0,%g0,%o0 ! return borrow-out from most significant limb
END(__mpn_sub_n)
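
For readers checking the assembly above, the following is a minimal C sketch of what the routine appears to compute: a limb-by-limb subtraction with borrow propagation, returning the final borrow. It is a reference model only, not the glibc or GMP implementation. The names `limb_t`, `ref_sub_n`, and `ref_sub_n_path` are hypothetical and introduced here for illustration, and a 32-bit limb is assumed to match sparc32. The second function mirrors the bit-2 alignment tests the assembly uses to choose between the V1a, V1b, and V2 code paths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference model; assumes 32-bit limbs as on sparc32. */
typedef uint32_t limb_t;

/* Compute res[i] = s1[i] - s2[i] - borrow for i in [0, n), least
   significant limb first, and return the borrow out of the top limb. */
limb_t ref_sub_n(limb_t *res, const limb_t *s1, const limb_t *s2, size_t n)
{
    limb_t borrow = 0;
    assert(n > 0);                      /* the header requires length > 0 */
    for (size_t i = 0; i < n; i++) {
        limb_t a = s1[i];
        limb_t b = s2[i];
        limb_t d = a - b;               /* wraps modulo 2^32 */
        limb_t b1 = a < b;              /* borrow from a - b */
        limb_t b2 = d < borrow;         /* borrow from removing incoming borrow */
        res[i] = d - borrow;
        borrow = b1 | b2;               /* at most one of the two can be set */
    }
    return borrow;
}

/* Classify which path the alignment tests would select:
   0 = V1a (S2_PTR and RES_PTR agree in bit 2),
   1 = V1b (S1_PTR and RES_PTR agree in bit 2),
   2 = V2  (neither agrees with RES_PTR, so S1_PTR and S2_PTR agree). */
int ref_sub_n_path(const void *res, const void *s1, const void *s2)
{
    uintptr_t r = (uintptr_t) res;
    uintptr_t a = (uintptr_t) s1;
    uintptr_t b = (uintptr_t) s2;
    if (((b ^ r) & 4) == 0)
        return 0;
    if (((a ^ r) & 4) == 0)
        return 1;
    return 2;
}
```

A model like this can serve as an oracle when testing the assembly: run both on the same inputs, including operands placed at addresses that differ in bit 2, and compare the result vector and the returned borrow.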