Flutterby™! : Bug of the day

Bug of the day

2014-02-06 00:55:05.885153+00 by Dan Lyke 12 comments

Bug of the day: 0.00000000000000711 became 7.11E-15 became $7.11. Lessons: never use floating point for accounting; typed languages are good.
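The post doesn't say how the value actually got mangled. One plausible path, assumed here purely for illustration, is a float being turned into text in scientific notation and a naive "dollars.cents" pattern then picking out the mantissa. The `extract_amount` helper is hypothetical, not anything from the system in question.

```python
import re

def extract_amount(text):
    """Hypothetical naive parser: grab the first thing that looks like dollars.cents."""
    match = re.search(r"\d+\.\d\d", text)
    return match.group(0) if match else None

tiny = 0.00000000000000711      # floating-point residue that should have been zero
as_text = str(tiny)             # Python switches to scientific notation here
amount = extract_amount(as_text)

print(as_text)                  # 7.11e-15
print("$" + amount)             # $7.11
```

A typed decimal or integer-cents value would never have passed through a string with an exponent in it, which is roughly the point of both lessons.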

[ related topics: Boats ]

Comments in ascending chronological order:

#Comment Re: made: 2014-02-06 01:09:39.758876+00 by: meuon

That one is blatant, the smaller rounding errors will drive you nuts.

#Comment Re: made: 2014-02-06 01:35:20.731415+00 by: Jack William Bell

Thus why COBOL had a native BCD type. No rounding errors. Ever.

IBM even cooked BCD into the microcode for some of their big-iron CPUs. Plus EBCDIC encoding supported it directly, so conversions to printable format were simple.

#Comment Re: made: 2014-02-06 13:50:32.279896+00 by: TheSHAD0W

I've always handled this by storing currency amounts as integers, in cents. Convert to floating point only for calculations that require it, then back to integer.

#Comment Re: made: 2014-02-06 14:42:26.910805+00 by: Jack William Bell

"I've always handled this by storing currency amounts as integers, in cents. Convert to floating point only for calculations that require it, then back to integer."

You can still get rounding errors that way. And you can get a different kind of rounding error when multiplying everything by 100 and then doing all your calculations using integers. Basically everything you do using a standard binary representation (other than simple addition/subtraction) is subject to rounding errors.

There are tricks to avoid the problem. And maybe the errors never reach the digits that matter for whatever you are doing. But when it comes to working with big datasets of currency values, BCD is really the only way to get it right.
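A minimal sketch of the kind of drift being described, using Python's `decimal` module as a stand-in for BCD. That substitution is an assumption: `decimal` is software base-ten arithmetic, not hardware BCD, but it shows the same contrast with binary floating point.

```python
from decimal import Decimal

# Add ten cents ten times using binary floating point.
float_total = sum([0.10] * 10)

# The same sum using decimal arithmetic, built from strings so no
# binary approximation sneaks in on the way.
decimal_total = sum([Decimal("0.10")] * 10, Decimal("0"))

print(float_total == 1.0)                 # False
print(decimal_total == Decimal("1.00"))   # True
```

Decimal types still have to round on operations like division; the difference is that they round in base ten, at the places an accountant expects.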

#Comment Re: made: 2014-02-06 16:54:30.221591+00 by: Dan Lyke

I definitely remember both the 6502 and the IBM 360 having BCD math as a part of their assembly language.

I also remember the one and only time that assumptions about currency bit me, though now I can't remember if it was floating point or whether I was smart enough to use integers. I'm thinking it was the latter. At any rate, I remember that it was a $5.11 difference in a $70k or so line item, because I foolishly assumed that f(a+b) = f(a)+f(b), where f(x) was approximately x*y and y was between 0 and 1.
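A small worked example of why f(a+b) and f(a)+f(b) can disagree once rounding to whole cents is involved. The 7% rate and the two 105-cent line items are made-up numbers chosen to show the effect, not the figures from the original incident.

```python
from decimal import Decimal, ROUND_HALF_UP

RATE = Decimal("0.07")  # hypothetical rate y, with 0 < y < 1

def f(cents):
    """Apply the rate to an integer cent amount and round to whole cents."""
    scaled = Decimal(cents) * RATE
    return int(scaled.quantize(Decimal("1"), rounding=ROUND_HALF_UP))

a, b = 105, 105           # two line items, in cents

whole = f(a + b)          # 210 * 0.07 = 14.70, rounds to 15
parts = f(a) + f(b)       # 105 * 0.07 = 7.35, rounds to 7, twice

print(whole, parts)       # 15 14
```

One cent per pair of items is invisible on a single invoice and adds up quickly over thousands of them.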

The "convert to floating point and back" thing can work, but you have to be careful there about precision, and still make sure that all of your addition and subtraction of sums is done in integers.

But the other thing here is that it's reaffirming my feeling that untyped languages are dangerous for large systems. Yesterday we had two bugs show up in under an hour that were related to people (not me) not thinking clearly about types. Having straitjackets can help a lot when you've got people with CS degrees writing code.

#Comment Re: made: 2014-02-07 00:23:55.492977+00 by: spc476

Quite a few CPUs had BCD support, like the MC6809, MC68000, and the x86 line. I'm not sure if any modern RISC-based CPU even has BCD support these days.

#Comment Re: made: 2014-02-08 18:32:21.272202+00 by: John Anderson

Dan, you should read "Moonpig: a billing system that doesn't suck" -- also relevant to the "Why I hate Perl" post, but I'm linking it here because of the "how to do accounting math properly" part.

#Comment Re: made: 2014-02-10 23:39:17.945075+00 by: meuon

John: Good read. I think we already solved most of his issues but I need to study it much more carefully. Thanks.

#Comment Re: made: 2014-02-11 01:29:56.678145+00 by: John Anderson

Meuon: you may also be able to find video of the talk MJD did at the Pittsburgh Perl Workshop this year, which is where that article came from. It's a pretty entertaining talk.

#Comment Re: made: 2014-02-11 15:06:42.51508+00 by: Dan Lyke

Just started reading, and as described, his currency calculation scheme may have a hidden gotcha, depending on how his Perl is compiled, that he probably won't see until an accountant is reading quarterlies and wondering why the pennies don't match.

'cause 2 billion millicents is only $20k, and there are plenty of current production machines where:

perl -MPOSIX -le 'print LONG_MAX'

prints 2147483647, not 9223372036854775807. But if he knows that the code will only ever run on 64 bit machines, he's probably okay.
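The arithmetic behind the "$20k" figure, sketched in Python. It assumes a millicent is one thousandth of a cent, so there are 100,000 millicents to the dollar; `max_dollars` is an illustrative helper, not part of any billing system.

```python
MILLICENTS_PER_DOLLAR = 100 * 1000   # 100 cents per dollar, 1000 millicents per cent

INT32_MAX = 2**31 - 1    # 2147483647
INT64_MAX = 2**63 - 1    # 9223372036854775807

def max_dollars(limit):
    """Largest whole-dollar amount that fits in millicents under the given limit."""
    return limit // MILLICENTS_PER_DOLLAR

print(max_dollars(INT32_MAX))   # 21474
print(max_dollars(INT64_MAX))   # 92233720368547
```

So a 32-bit signed integer tops out a little over $21k, while a 64-bit one has room for roughly $92 trillion.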

#Comment Re: made: 2014-02-12 01:25:55.771414+00 by: TheSHAD0W

Uhm. Can't you just test the value of LONG_MAX in the program and throw an error if it's low?
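A rough sketch of what such a startup check could look like, written in Python rather than Perl. Both function names are hypothetical, and since Python's own integers don't overflow, the guard is only illustrating the idea of refusing to run on a platform whose native integers are too narrow.

```python
import ctypes

def long_max():
    """Largest value of the platform's C long, derived from its size in bytes."""
    bits = 8 * ctypes.sizeof(ctypes.c_long)
    return 2 ** (bits - 1) - 1

def require_wide_integers(available, needed=2**63 - 1):
    """Raise if the available integer range is too small for the ledger."""
    if available < needed:
        raise RuntimeError(
            f"integer range too small: max {available}, need {needed}"
        )

print("this platform's LONG_MAX:", long_max())

try:
    require_wide_integers(2147483647)   # what a 32-bit long would report
except RuntimeError as err:
    print("refusing to start:", err)
```

In real use the call would be `require_wide_integers(long_max())` at the top of the program, before any money is touched.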

#Comment Re: made: 2014-02-12 14:35:23.860588+00 by: Dan Lyke

Yeah, it's just that suddenly you've created COBOL's 150 lines of boilerplate.

Now to be fair, these days every Perl script that isn't running on some outdated server starts with

use Modern::Perl;

or some site-specific alternative. I guess my point is that it's programmer cognitive load. And I'm not getting the feeling that someone whose idea of optimally using a relational database to store general ledger data is "make it a key-value store and serialize everything into GUID indexed blobs" is really thinking through all of the C-style details.

Abstractions are fantastic when they reduce it, not so much when they increase it, and all abstractions are leaky. When the abstraction covers up details in a way that I don't have to worry about, that's great. But too often those details leak out in ways that get me.