Version française

GNU Linux and the Extended Precision on x86 Processors

[Document initially written in 2004.]


The traditional FPU of the x86 processors (a.k.a. x87) can be configured to round its results either in extended precision (64-bit significands) or in double precision (53-bit significands); internal rounding in single precision is also supported, but not used in practice. Some operating systems, such as FreeBSD, NetBSD up to version 6, and Microsoft Windows (but not always?), chose to configure the processor so that, by default, it rounds in double precision. On the other hand, under GNU Linux, the rounding is done in extended precision (I do not know any other system having the same behavior by default). This is a bad choice for the reasons given below.

Note: the SSE2 instructions, used by 64-bit systems in particular, are not concerned by these problems.

More up-to-date information may be found on Wikipedia.

The Goal of the Extended Precision

Before listing the drawbacks of the extended precision, one should note that the (optional) extended precision was introduced in the IEEE-754 standard essentially to allow an accurate double-precision implementation of the elementary functions of the mathematical library (exponential, logarithm, trigonometric and hyperbolic functions, etc), thus for very particular cases.

The file /usr/include/fpu_control.h under Linux/x86 says:

#define _FPU_EXTENDED 0x300     /* libm requires double extended precision.  */

but tests (with mpcheck in particular) show that the math library does not seem to need extended precision: double precision seems to be sufficient. Or does anyone have a counter-example? Update: As shown in glibc bug 706, the pow function is inaccurate in some cases, and even more when the processor is configured in double precision.

The Double Rounding

Even if in a programming language, a type corresponding to the double precision is used (for instance, the type double on most C implementations), a processor computing in extended precision internally will not necessarily give the same results as with a processor computing in double precision. For instance, let us consider the two floating-point double-precision numbers x = 253 + 2 and y = 1 − 2−16, and their sum in the rounding-to-nearest mode z = x + y. On a system working in double precision, the result is the rounding of 253 + 3 − 2−16, that is, 253 + 2 (closer to the exact result than the following machine number 253 + 4 is). On a system working in extended precision, the result will be first rounded in extended precision, that is, 253 + 3, then rounded in double precision (when stored to memory, for instance), that is, 253 + 4 (by using the even rounding rule, since 253 + 3 is the middle of two consecutive machine numbers). Therefore the obtained final result is different.

Note: this phenomenon, called double rounding, occurs in the rounding-to-nearest mode only, but this is the rounding mode that is used in general.

Portability Problems

First the extended precision leads to portability problems, even if much care is taken at the type level, in particular because the support for extended precision is optional, and in practice, many processors do not support it. Also note that the extended precision is not completely normalized. Even if obtaining exactly the same result on two different machines is not regarded as important and the requirements are linked only to the accuracy of the results, the extended precision is useless, since a portable program must assume the worst case.

If the portability is not important and the software needs to compute in extended precision on some given machine (by using the long double type in C, for instance), the software can still configure the processor in rounding to extended precision (in a non-portable way unfortunately, but anyway, the portability has already been sacrificed).

Concerning the portability, it is thus preferable to have everywhere a processor whose default behavior is to round to the double precision, without any major drawback for the programs needing extended precision.

Accuracy Problems

The goal of the extended precision is to allow to have more accurate basic operations in order to obtain more accurate final results. But most programs have not been written specially for the extended precision, and in general, one gets no guarantee. For instance, many programs written in C use the type double, and even if the intermediate computations are carried out in extended precision, some results may automatically be converted to double precision, and these conversions cannot be controlled.

The extended precision can be needed by some software. But in this case, it is important to be able to prove the error bounds and not to rely on assumptions which could be wrong. In particular, using the extended precision with the type double will not allow the user to obtain better certified error bounds; on the contrary, the bounds may be worse because of the double-rounding problem. The only solution is to write non-portable code (as told above), whether the default precision is the double one or the extended one.

For the other software, the choice of the default precision has little importance, but rounding in double precision is preferable concerning the proofs related to the error bounds.

Problems with Some Algorithms

Some algorithms are based on the correct-rounding property and may become completely wrong if a double rounding can occur. Concerning this point, my paper The Euclidean division implemented with a floating-point division and a floor gives an example particularly useful for programs written in ECMAScript (often referred to as Javascript) or using XPath.

Moreover, a same floating-point expression evaluated twice can yield different results and make a comparison fail. It is particularly true with GCC, which does not conform to some points of the ISO C standard. Several people encountered these problems; see the GCC bug 323 and the duplicates (which are in fact not always real duplicates, since the problem is not always the same, even though the first cause is the extended precision), and my program tst-ieee754.c. Note that GCC bug 323 has been fixed in GCC 4.5, as long as the right compiler options are used, like asking to conform to the C standard (see also the corresponding patch): GCC now supports handling floating-point excess precision arising from use of the x87 floating-point unit in a way that conforms to ISO C99. This is enabled with -fexcess-precision=standard and with standards conformance options such as -std=c99, and may be disabled using -fexcess-precision=fast.

Problems Related to Standards Conformance

Finally, some standards such as Java and XPath (and also XSLT, as it is partly based on XPath) require IEEE double precision with correct rounding, in order to obtain completely reproducible results on different machines or software. Default rounding in extended precision makes their implementation non portable (as explained above); and in practice, most implementations do not change the internal precision of the processor, thus are buggy (see below).

Note: Java also has a single-precision type (float), but the choice between the rounding in double precision and the rounding in extended precision does not solve the rounding problems concerning this type; in both cases, without special code, the implementation would be incorrect on this point. However one may assume that no developers concerned in the numerical behavior of their program use the float type for the calculations, especially as this would not be faster.

Bugs Related to the Choice of Extended Precision by Default

XSLT Processors

Test the XPath arithmetic with this XSLT stylesheet, to be applied to any XML file, on itself for instance. The following XSLT processors give an incorrect result (message Not a conforming XPath implementation.) under Linux/x86:

Java Virtual Machines (JVMs)

Test the arithmetic of your JVM with this Java class (Java source). Correct result:

$ java test
z = 9.007199254740994E15
d = 0.0

Incorrect result, when the processor rounds in extended precision:

$ java test
z = 9.007199254740996E15
d = 2.0

The GNU interpreter gij (last tested version: 4.3.3) gives this incorrect result under Linux/x86. Bug 255525 on the Debian BTS. Bug 16122 on GCC Bugzilla.

This bug was fixed in the following JVMs:

Note: Sun's and IBM's JVMs always give a correct result.

For more information about Java numerics: Improving Java for Numerical Computation (in particular, see Section Use of floating-point hardware) on the JavaNumerics web site, and Updates to the Java Language Specification for JDK Release 1.2 Floating Point.

The problem of the single precision remains (is it often used?). Developers of JIT compilers may be interested in the article Optimizing precision overhead for x86 processors by Takeshi Ogasawara, Hideaki Komatsu and Toshio Nakatani, in Software — Practice and Experience, 34(9):875–893, July 2004.

Javascript (ECMAScript)

Test of the Javascript arithmetic of your browser: . The number 0 is the correct result. If you get 2, this probably means that the Javascript implementation of your browser uses extended precision, and this is incorrect (for more information, see Section 8.5 of the ECMAScript Language Specification). If you get nothing, this means that your browser does not support Javascript or Javascript has been disabled.

The following web browsers give the incorrect result 2 under Linux/x86:

  • Mozilla (last tested version: 1.8b Gecko/20050114), but see below concerning Firefox. Bug 309797 on the Debian BTS.

  • Opera (last tested version: 11.00).

The bug has apparently been fixed in Firefox. Version 1.5 Gecko/20060110 gave an incorrect result (idem until Gecko 1.9.1), but recent versions such as Firefox 3.5.6 and later seem to behave correctly. Bug 264912 on Bugzilla (now closed).


A critical bug has been found in PHP, but it is related to the wider exponent range (and GCC bug 323), rather than the extended precision (the code in question runs in the double-precision mode). See below.

Problems due to the Wider Exponent Range

In fact, the configuration of the x87 FPU in double-precision mode does not solve all the problems. Indeed, the x87 registers also have a wider exponent range. This means that to hope to be portable, one (in general, the compiler) should store the value into memory to really obtain a result in the double-precision format, but with a loss of efficiency. Then, a value in the subnormal range of the double-precision format will still be on 53 bits in an x87 register instead of having a reduced precision, and storing the value into memory will yield a double rounding. It is possible to handle these cases correctly, but this is no longer so simple. And though these cases are rare (they are only underflow cases), they can occur in practice, as shown below.

A bug in PHP due to the wider exponent range was found by Rick Regan on 2010-12-30: PHP Hangs On Numeric Value 2.2250738585072011e-308. See also his bug report and his explanations. In fact, the real bug is GCC bug 323, but PHP needs to avoid it. Consequence for a web server: possible denial of service.