Integer overflow

I’ve been looking at a server that has Progress OpenEdge installed (10.1C). I recently found a bug in it, which is worth knowing about if you use the same software. More generally, it’s a mistake that’s easy to make in any computer program, so it’s worth understanding so that you can avoid it.

I often see warning messages in the Application log whenever OpenEdge has to increase one of its startup parameters, e.g. PROGRESS 5407:
Usr WARNING: -nb exceeded. Automatically increasing from 120 to 152. (5407)

It’s annoying to see several messages like that, particularly after a reboot, but it’s pretty harmless. However, there’s a more serious issue with event ID 5409, which refers to the “mmax” variable. A typical message would look like this:
Usr WARNING: -mmax exceeded. Automatically increasing from 12643 to 12648. (5409)

Unfortunately, this warning gets logged about 3000 times every second, so it floods the log and pushes out any older log entries. It wouldn’t be so bad if it eventually stopped, but it goes into an infinite loop. The numbers keep increasing, then something odd happens at about 32,000:

Usr WARNING: -mmax exceeded. Automatically increasing from 32749 to 32750. (5409)
Usr WARNING: -mmax exceeded. Automatically increasing from 32750 to 32757. (5409)
Usr WARNING: -mmax exceeded. Automatically increasing from 32757 to -(. (5409)
Usr WARNING: -mmax exceeded. Automatically increasing from -( to -32758. (5409)
Usr WARNING: -mmax exceeded. Automatically increasing from -32758 to -32757. (5409)
Usr WARNING: -mmax exceeded. Automatically increasing from -32757 to -32754. (5409)

I haven’t seen the source code, but anyone who’s familiar with binary can guess what’s happening here. They’re presumably using a 16-bit signed integer to store the value, but the “resize” logic keeps increasing it as if it were an unsigned integer with room to grow.

With two’s complement notation, the highest value you can store in a 16-bit signed integer is 0111 1111 1111 1111 (binary) = 32,767 (decimal). If this were an unsigned integer, you could add 1 to get 1000 0000 0000 0000 (binary) = 32,768 (decimal). However, using signed integers, 1000 0000 0000 0000 (binary) = -32,768 (decimal).
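To make that concrete, here’s a minimal sketch that reads the same 16-bit pattern both ways. Python is used purely for illustration (I haven’t seen how OpenEdge does it), and the two helper functions are names I’ve made up for this example.

```python
def as_unsigned16(bits: int) -> int:
    """Read the low 16 bits as an unsigned integer (0 to 65,535)."""
    return bits & 0xFFFF


def as_signed16(bits: int) -> int:
    """Read the low 16 bits as a two's complement signed integer."""
    bits &= 0xFFFF
    # If the top bit is set, the value is negative: subtract 2**16.
    return bits - 0x10000 if bits & 0x8000 else bits


highest = 0b0111_1111_1111_1111
wrapped = highest + 1  # 0b1000_0000_0000_0000

print(as_unsigned16(highest), as_signed16(highest))  # 32767 32767
print(as_unsigned16(wrapped), as_signed16(wrapped))  # 32768 -32768
```

The bits are identical in both cases; only the interpretation of the top bit changes.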

The effect of this is that the value of mmax goes up to 32,767, then flips to -32,768, goes up to 0, keeps going up to 32,767, and flips to -32,768 again. This cycle perpetuates until someone manually intervenes, e.g. by rebooting the server.
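The cycle can be reproduced with a small simulation. This is only a sketch of my guess at the behaviour, not the actual OpenEdge logic: `wrap_int16` and `count_wraps` are hypothetical names, and the fixed step size is an assumption (the real log shows the increments varying).

```python
def wrap_int16(value: int) -> int:
    """Reduce value to the range of a 16-bit two's complement integer."""
    return (value + 0x8000) % 0x10000 - 0x8000


def count_wraps(start: int, step: int, iterations: int) -> int:
    """Increase a 16-bit counter repeatedly; return how often it flips negative."""
    value, wraps = start, 0
    for _ in range(iterations):
        new_value = wrap_int16(value + step)
        if new_value < value:  # an "increase" that made the value smaller
            wraps += 1
        value = new_value
    return wraps


print(wrap_int16(32767 + 1))         # -32768
print(count_wraps(0, 1, 65536))      # 1
```

Nothing in the loop ever stops it, which matches what I see in the log: the value just keeps going round.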

I don’t know whether this bug has been fixed by a service pack, because the front-end software I use has only been tested against this version (i.e. I can’t upgrade OpenEdge). However, I think it’s rather embarrassing for any recent software to have a bug like this, considering that overflow errors are often linked to security flaws.

I learnt about two’s complement when I was 18 (back in 1993), in the first year of my CompSci undergrad degree. When people discuss IT recruitment, they often ask whether you need a degree, since lots of people can pick up vocational skills on their own. However, this is the type of situation where I benefit from the academic theory, which probably won’t be covered by books like “Teach yourself Java in 24 hours”.

Even if you don’t understand the underlying principles, many programming languages can generate an error (or throw an exception) if you try to store a value that’s too big, although some (C and Java, for example) don’t check integer arithmetic by default. So, I wonder whether the Progress developers are using a language without that check, or have turned it off, e.g. to improve performance?
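As an illustration of the difference between an unchecked and a checked store, Python’s standard library happens to offer both. This says nothing about what Progress actually uses; it just shows the two behaviours side by side.

```python
import ctypes
import struct

# ctypes does no overflow checking, so the value wraps silently,
# much like an unchecked 16-bit variable would.
unchecked = ctypes.c_int16(32767 + 1)
print(unchecked.value)  # -32768

# struct checks the range for the "h" (signed 16-bit) format
# and raises an error instead of wrapping.
try:
    struct.pack("<h", 32767 + 1)
except struct.error as err:
    print("rejected:", err)
```

With the checked version, the problem shows up as an error at the point where it happens, rather than as a nonsensical negative number later on.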

Anyway, I think there are 3 changes that they should make:
1) When you reach the biggest possible number, stop trying to increment it!
2) If it doesn’t make sense for this value to be negative, use an unsigned integer.
3) If the 16-bit range is too small for you, use a 32-bit number instead.
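
A rough sketch of how those changes might fit together is below. It’s in Python for illustration only, and `increase_limit`, its parameters and the constants are all hypothetical names of mine, not anything from OpenEdge.

```python
INT16_MAX = 32767      # largest value a signed 16-bit integer can hold
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def increase_limit(current: int, needed: int, ceiling: int = INT16_MAX) -> int:
    """Return the new limit, clamped so that it never exceeds ceiling."""
    if current >= ceiling:
        # Change 1: once the maximum is reached, stop instead of wrapping.
        raise OverflowError(f"limit is already at its maximum ({ceiling})")
    return min(max(current, needed), ceiling)


print(increase_limit(32757, 32760))                     # 32760
print(increase_limit(32757, 32770))                     # 32767
print(increase_limit(32757, 32770, ceiling=INT32_MAX))  # 32770
```

The last call shows change 3: with a 32-bit ceiling, the same request fits comfortably. Either way, the caller gets one clear error at the limit instead of an endless stream of warnings.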

I can’t report bugs to Progress directly, but the people who supplied the front-end application will notify them on my behalf, so hopefully they’ll fix it.
