Pure C floating point problems

Post by Nyh » Thu Mar 02, 2017 3:53 pm

While working on some interesting C code with Pure C as my primary C compiler (I just love the Pure Debugger for debugging my programs; yes, sometimes there are bugs in my C code), I found some bugs in the Pure C library.

1 the function ldexp(x, p) doesn't work as expected when x == 0.0. The expected result is that x is still 0.0, but depending on the argument you get NaN or +/-Inf. I think p is added to the exponent part of the floating point number even when the mantissa is all zeros, turning zero into one of the special values NaN, -Inf or +Inf.

2 the constant DBL_MIN (1.681051571556046753E-4932) will be converted to 0.0 by the compiler.

3 the constant DBL_MAX (1.189731495357231765E+4932) will be converted to +Inf by the compiler.
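For the ldexp() problem in item 1, one possible workaround is a small wrapper that never passes zero to the library. This is only a sketch: the name safe_ldexp is made up for illustration and is not part of the Pure C library, and it assumes that the bug is triggered only when x == 0.0, as described above.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical wrapper, not part of the Pure C library.
   Assumption: the library ldexp() misbehaves only for x == 0.0,
   so zero is returned directly and everything else is passed on. */
double safe_ldexp(double x, int p)
{
    if (x == 0.0)
        return x;           /* also keeps the sign of -0.0 */
    return ldexp(x, p);
}

/* Small self-check of the behaviour expected from ldexp(). */
void safe_ldexp_selftest(void)
{
    assert(safe_ldexp(0.0, 100) == 0.0);
    assert(safe_ldexp(0.0, -100) == 0.0);
    assert(safe_ldexp(1.0, 3) == 8.0);
}
```

Calling safe_ldexp() wherever the code used ldexp() leaves non-zero arguments untouched, so it should not change results on compilers where ldexp() already handles zero correctly.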

The decimal values of the constants are correct: the smallest positive number, 0x0.1000000000000000*2^-16383, has 1.681051571556046753E-4932 as its decimal representation, and the largest positive number, 0x0.FFFFFFFFFFFFFFFF*2^16384, is indeed 1.189731495357231765E+4932. It just doesn't work the other way around: the compiler does not convert these decimal values back into the right bit patterns.

There is a workaround for the values of DBL_MIN and DBL_MAX; the compiler seems to be afraid of decimal points. Using:
#define DBL_MIN (1.0/5948657476786158825275E+4910)
#define DBL_MAX 118973149535723176499E+4912
will generate the correct bit representations for DBL_MIN and DBL_MAX.

The same error is also in LDBL_MIN and LDBL_MAX; correct them with:
#define LDBL_MIN (1.0/5948657476786158825275E+4910)
#define LDBL_MAX 118973149535723176499E+4912
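A quick way to see whether the limit constants come out as usable numbers is a runtime check like the one below. It is a minimal sketch that avoids C99 helpers such as isinf(), since those may not be available in an older compiler. The function names are invented for this example, and the check only tells you that a value is positive, finite and not NaN, not that every bit is right.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper: returns 1 if x is positive, non-zero,
   finite and not NaN, otherwise 0. */
int is_finite_positive(long double x)
{
    if (x != x)
        return 0;           /* NaN is the only value not equal to itself */
    if (!(x > 0.0L))
        return 0;           /* zero or negative */
    if (x * 0.5L == x)
        return 0;           /* +Inf is unchanged by halving */
    return 1;
}

/* Returns 1 if all four limit constants look sane. */
int float_limits_ok(void)
{
    return is_finite_positive(DBL_MIN)
        && is_finite_positive(DBL_MAX)
        && is_finite_positive(LDBL_MIN)
        && is_finite_positive(LDBL_MAX);
}

/* Aborts via assert() if one of the constants is 0.0, Inf or NaN. */
void check_float_limits(void)
{
    assert(float_limits_ok());
}
```

With the broken definitions I would expect float_limits_ok() to return 0, because DBL_MIN is 0.0 and DBL_MAX is +Inf; with the replacement definitions it should return 1.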

Hans Wessels
