### Re: 43S News

Posted: **Tue May 19, 2020 12:17 pm**

We're not using math.h, primarily because it uses a binary floating point format. Calculators use decimal floating point because their users mostly work in decimal. The differences can be a surprise at times: something as simple as 0.3 cannot be represented exactly in binary, but it can in decimal.
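To make the 0.3 point concrete, here is a small sketch using Python's standard `decimal` and `fractions` modules as a stand-in. This is not the calculator's code, and the names `dec` and `bin_val` are just illustrative.

```python
from decimal import Decimal
from fractions import Fraction

# A decimal float stores 0.3 as the coefficient 3 with exponent -1: exact.
dec = Decimal("0.3")

# A binary double can only store the nearest representable value to 0.3.
bin_val = 0.3

print(Fraction(dec) == Fraction(3, 10))      # decimal value is exactly 3/10
print(Fraction(bin_val) == Fraction(3, 10))  # binary value is not
print(Decimal(bin_val))                      # the exact value the double holds

# The classic surprise: binary arithmetic misses, decimal arithmetic does not.
print(0.1 + 0.2 == 0.3)
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))
```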

Sine and cosine are calculated using a Taylor expansion after the argument has been reduced to the first quadrant of the circle. The series converges rapidly here without numeric stability issues.
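A rough sketch of that approach in Python's `decimal` module, for anyone who wants to play with it. Everything here is an assumption for illustration: the function names, the six guard digits, the hard-coded 50-decimal value of 𝜋 and the quadrant bookkeeping are mine, not what the firmware does. It is only sensible for modest arguments and up to about 40 output digits, since 𝜋 is a fixed constant.

```python
from decimal import Decimal, localcontext

# pi to 50 decimals; limits this sketch to roughly 40 usable digits.
PI = Decimal("3.14159265358979323846264338327950288419716939937510")

def taylor_sin(x):
    """Sum x - x^3/3! + x^5/5! - ... until a term no longer changes the sum."""
    term = x
    total = x
    n = 1
    x2 = x * x
    while True:
        term = -term * x2 / ((n + 1) * (n + 2))
        n += 2
        new_total = total + term
        if new_total == total:
            return total
        total = new_total

def sin_decimal(x, digits=34):
    """Sine of x: reduce to the first quadrant, then use the Taylor series."""
    with localcontext() as ctx:
        ctx.prec = digits + 6          # a few guard digits (arbitrary choice)
        r = Decimal(x) % (2 * PI)      # decimal remainder keeps the sign of x
        if r < 0:
            r += 2 * PI
        q, r = divmod(r, PI / 2)       # q = quadrant index, r = offset within it
        q = int(q) % 4
        if q % 2 == 1:                 # quadrants 1 and 3: reflect the offset
            r = PI / 2 - r
        s = taylor_sin(r)              # r is now within [0, pi/2]
        if q >= 2:                     # lower half of the circle: negative sine
            s = -s
        ctx.prec = digits
        return +s                      # round to the requested digits

print(sin_decimal("0.5"))
print(sin_decimal("5.5"))
```

With the argument at most 𝜋/2 the terms shrink quickly and never get much larger than the result, which is why the plain series is good enough here.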

The different precisions of reals exist mostly for accuracy: by calculating intermediate results in higher precision, it is easy to get a correct answer. The number of digits required depends on the operation. By spending a lot more effort on the arithmetic, the number of additional digits could be greatly reduced. Still, it is easier to brute force than to think.
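A toy illustration of the brute force idea, again with Python's `decimal` module rather than the real code. The helper name `third_times_three` and the 16 and 20 digit figures are picked only for the demonstration.

```python
from decimal import Decimal, localcontext

def third_times_three(work_digits, out_digits=16):
    """Compute (1/3)*3 with work_digits of precision, then round to out_digits."""
    with localcontext() as ctx:
        ctx.prec = work_digits
        y = Decimal(1) / Decimal(3) * 3   # intermediate result at working precision
        ctx.prec = out_digits
        return +y                         # unary plus rounds to the output precision

print(third_times_three(16))  # 0.9999999999999999 (no guard digits)
print(third_times_three(20))  # 1.000000000000000  (four guard digits)
```

Carrying four extra digits through the intermediate steps is enough for the final rounding to land on the right answer, with no cleverness needed in the arithmetic itself.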

The exception is the largest representation, which is present because the underlying mathematics library (decNumber) has a quirk when calculating remainders. Enough digits need to be carried to represent both the result and the original argument -- e.g. 1E100 modulo .345 requires 100 + the desired number of output digits (so upwards of 140) to get the correct answer. When calculating the trigonometric functions, the argument needs to be reduced modulo 2𝜋. A (hypothetical) argument of the order of 1E300 that happens to be very close to a multiple of 2𝜋 could result in an answer of the order of 1E-300, meaning 600+ digits.
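The remainder issue can be reproduced with Python's `decimal` module, which follows the same General Decimal Arithmetic specification as decNumber. Treat this as an analogy under that assumption, not as a statement about decNumber's exact behaviour. The helper `remainder_with_digits` is a made-up name.

```python
from decimal import Decimal, InvalidOperation, localcontext

def remainder_with_digits(x, m, digits):
    """Return x mod m using `digits` of precision, or None if that is too few."""
    with localcontext() as ctx:
        ctx.prec = digits
        ctx.traps[InvalidOperation] = True   # make sure the failure raises
        try:
            return Decimal(x) % Decimal(m)
        except InvalidOperation:
            # The integer quotient does not fit in `digits` digits.
            return None

# 34 digits cannot hold the roughly 100 digit quotient, so the operation fails.
print(remainder_with_digits("1E100", "0.345", 34))

# With upwards of 140 digits the remainder comes out exactly.
print(remainder_with_digits("1E100", "0.345", 140))
```

In this Python version the failure is at least loud. Either way, the working precision has to scale with the size of the argument, which is the reason for the oversized representation.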
