Even though clang comes with a number of internal CUDA wrapper headers,
compiling sample CUDA programs will result in errors similar to:
In file included from <built-in>:1:
In file included from /usr/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:204:
/usr/home/arr/cuda/var/cuda-repo-10-0-local-10.0.130-410.48/usr/local/cuda-10.0//include/crt/math_functions.hpp:2910:7: error: no matching function for call to '__isnan'
if (__isnan(a)) {
^~~~~~~
/usr/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:460:16: note: candidate function not viable: call to __device__ function from __host__ function
__DEVICE__ int __isnan(double __a) { return __nv_isnand(__a); }
^
CUDA expects __isnan() and __isnanf() declarations to be available,
which are glibc specific extensions, equivalent to the regular isnan()
and isnanf().
To provide these, define __isnan() and __isnanf() as aliases of the
already existing static inline functions __inline_isnan() and
__inline_isnanf() from math.h.
Reported by: arrowd
PR: 241550
MFC after: 1 week
The primary benefit of these functions is that argument
reduction is done once instead of twice in independent
calls to sin() and cos().
* lib/msun/Makefile:
. Add s_sincos[fl].c to the build.
. Add sincos.3 documentation.
. Add appropriate MLINKS.
* lib/msun/Symbol.map:
. Expose sincos[fl] symbols in dynamic libm.so.
* lib/msun/man/sincos.3:
. Documentation for sincos[fl].
* lib/msun/src/k_sincos.h:
. Kernel for sincos() function. This merges the individual kernels
for sin() and cos(). The merger offered an opportunity to re-arrange
the individual kernels for better performance.
* lib/msun/src/k_sincosf.h:
. Kernel for sincosf() function. This merges the individual kernels
for sinf() and cosf(). The merger offered an opportunity to re-arrange
the individual kernels for better performance.
* lib/msun/src/k_sincosl.h:
. Kernel for sincosl() function. This merges the individual kernels
for sinl() and cosl(). The merger offered an opportunity to re-arrange
the individual kernels for better performance.
* lib/msun/src/math.h:
. Add prototytpes for sincos[fl]().
* lib/msun/src/math_private.h:
. Add RETURNV macros. This is needed to reset fpsetprec on I386
hardware for a function with type void.
* lib/msun/src/s_sincos.c:
. Implementation of sincos() where sin() and cos() were merged into
one routine and possibly re-arranged for better performance.
* lib/msun/src/s_sincosf.c:
. Implementation of sincosf() where sinf() and cosf() were merged into
one routine and possibly re-arranged for better performance.
* lib/msun/src/s_sincosl.c:
. Implementation of sincosl() where sinl() and cosl() were merged into
one routine and possibly re-arranged for better performance.
PR: 215977, 218300
Submitted by: Steven G. Kargl <sgk@troutmask.apl.washington.edu>
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D10765
. Hook e_lgammal[_r].c to the build.
. Create man page links for lgammal[-r].3.
* Symbol.map:
. Sort lgammal to its rightful place.
. Add FBSD_1.4 section for the new lgamal_r symbol.
* ld128/e_lgammal_r.c:
. 128-bit implementataion of lgammal_r().
* ld80/e_lgammal_r.c:
. Intel 80-bit format implementation of lgammal_r().
* src/e_lgamma.c:
. Expose lgammal as a weak reference to lgamma for platforms
where long double is mapped to double.
* src/e_lgamma_r.c:
. Use integer literal constants instead of real literal constants.
Let compiler(s) do the job of conversion to the appropriate type.
. Expose lgammal_r as a weak reference to lgamma_r for platforms
where long double is mapped to double.
* src/e_lgammaf_r.c:
. Fixed the Cygnus Support conversion of e_lgamma_r.c to float.
This includes the generation of new polynomial and rational
approximations with fewer terms. For each approximation, include
a comment on an estimate of the accuracy over the relevant domain.
. Use integer literal constants instead of real literal constants.
Let compiler(s) do the job of conversion to the appropriate type.
This allows the removal of several explicit casts of double values
to float.
* src/e_lgammal.c:
. Wrapper for lgammal() about lgammal_r().
* src/imprecise.c:
. Remove the lgamma.
* src/math.h:
. Add a prototype for lgammal_r().
* man/lgamma.3:
. Document the new functions.
Reviewed by: bde
and tgammal in libm. These functions are part of ISO/IEC 9899:1999
and their prototypes should have been moved into the appropriate
__ISO_C_VISIBLE >= 1999 section. After moving the prototypes,
remnants of r236148 can be removed.
PR: standards/191754
Reviewed by: bde
. Add s_erfl.c to building libm.
. Add MLINKS for erfl.3 and erfcl.3.
* Symbol.map:
. Move erfl and erfcl to their proper location.
* ld128/s_erfl.c:
. Implementations of erfl and erfcl in the IEEE 754 128-bit format.
* ld80/s_erfl.c:
. Implementations of erfl and erfcl in the Intel 80-bit format.
* man/erf.3:
. Document the new functions.
. While here, remove an incomplete sentence.
* src/imprecise.c:
. Remove the stupidity of mapping erfl and erfcl to erf and erfc.
* src/math.h:
. Move the declarations of erfl and erfcl to their proper place.
* src/s_erf.c:
. For architectures where double and long double are the same
floating point format, use weak references to map erfl to
erf and ercl to erfc.
Reviewed by: bde (many earlier versions)
This includes:
o All directories named *ia64*
o All files named *ia64*
o All ia64-specific code guarded by __ia64__
o All ia64-specific makefile logic
o Mention of ia64 in comments and documentation
This excludes:
o Everything under contrib/
o Everything under crypto/
o sys/xen/interface
o sys/sys/elf_common.h
Discussed at: BSDcan
. Hook coshl, sinhl, and tanhl into libm.
. Create symbolic links for corresponding manpages.
. While here remove a nearby extraneous space.
* Symbol.map:
* src/math.h:
. Move coshl, sinhl, and tanhl to their proper locations.
* man/cosh.3:
* man/sinh.3:
* man/tanh.3:
. Update the manpages.
* src/e_cosh.c:
* src/e_sinh.c:
* src/s_tanh.c:
. Add weak reference for LBDL_MANT_DIG==53 targets.
* src/imprecise.c:
. Remove the coshl, sinhl, and tanhl kludge.
* src/e_coshl.c:
. ld80 and ld128 implementation of coshl().
* src/e_sinhl.c:
. ld80 and ld128 implementation of sinhl().
* src/s_tanhl.c:
. ld80 and ld128 implementation of tanhl().
Obtained from: bde (mostly), das and kargl
as a fairly faithful implementation of the algorithm found in
PTP Tang, "Table-driven implementation of the Expm1 function
in IEEE floating-point arithmetic," ACM Trans. Math. Soft., 18,
211-222 (1992).
Over the last 18-24 months, the code has under gone significant
optimization and testing.
Reviewed by: bde
Obtained from: bde (most of the optimizations)
format. These implementations are based on
PTP Tang, "Table-driven implementation of the exponential function
in IEEE floating-point arithmetic," ACM Trans. Math. Soft., 15,
144-157 (1989).
PR: standards/152415
Submitted by: kargl
Reviewed by: bde, das
Approved by: das (mentor)
no longer "fast" on sparc64. (It really wasn't to begin with, since
the old implementation was using long doubles, and long doubles are
emulated in software on sparc64.)
macro expand to __isnanf() instead of isnanf() for float arguments.
This change is needed because isnanf() isn't declared in strict POSIX
or C99 mode.
Compatibility note: Apps using isnan(float) that are compiled after
this change won't link against an older libm.
Reported by: Florian Forster <octo@verplant.org>
long doubles (i386, amd64, ia64) and one for machines with 128-bit
long doubles (sparc64). Other platforms use the double version.
I've only done runtime testing on i386.
Thanks to bde@ for helpful discussions and bugfixes.
adds two new directories in msun: ld80 and ld128. These are for
long double functions specific to the 80-bit long double format
used on x86-derived architectures, and the 128-bit format used on
sparc64, respectively.
sparc64's 128-bit long doubles.
- Define FP_FAST_FMAL for ia64.
- Prototypes for fmal, frexpl, ldexpl, nextafterl, nexttoward{,f,l},
scalblnl, and scalbnl.
flags, so they are not pure. Remove the __pure2 annotation from them.
I believe that the following routines and their float and long double
counterparts are the only ones here that can be __pure2:
copysign is* fabs finite fmax fmin fpclassify ilogb nan signbit
When gcc supports FENV_ACCESS, perhaps there will be a new annotation
that allows the other functions to be considered pure when FENV_ACCESS
is off.
Discussed with: bde