Archive for June 2009
18
An old JavaScript implementation bug – parseInt
Posted by Bruno França dos Reis in JavaScript
Hello!
Today at the company where I work, a colleague said he had found some very, very strange behaviour in JavaScript. I think he was parsing strings representing dates by means of the parseInt function.
The problem is that his strings were like “23/04/2008”, “17/08/2005”, and so on. If you split those strings on the slash character and then call parseInt on each part, you end up doing this:
var day = parseInt("17");
var month = parseInt("08");
var year = parseInt("2005");
Those who expect to get:
- day = 17,
- month = 8, and
- year = 2005,
are in for a surprise when the script runs in most browsers. What you actually get is:
- day = 17,
- month = 0, and
- year = 2005.
Playing around for less than a minute is enough to find out what is happening: JavaScript sees that the string starts with a “0” and, being very smart (sarcasm), decides to parse it as octal. Since “8” is not a valid octal digit, parsing stops right after the leading zero and the result is 0.
Googling (or Binging, nowadays…) a bit, you will see very old forum threads discussing this bug. Some people say “JavaScript is broken”, others reply “JavaScript is not broken. RTFM.”, and then the usual flame war starts, with people invoking flame war laws and all.
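The octal reading can be reproduced on any engine by passing radix 8 explicitly, which is what those browsers do implicitly for strings with a leading zero. A minimal sketch (the variable names are only for illustration):

```javascript
// Force the octal interpretation by passing radix 8 explicitly.
var octal17 = parseInt("17", 8); // 1*8 + 7 = 15
var octal07 = parseInt("07", 8); // 7 is a valid octal digit, so the result is 7
var octal08 = parseInt("08", 8); // "8" is not an octal digit, only "0" is parsed: 0
var octal09 = parseInt("09", 8); // same story for "9": 0

console.log(octal17, octal07, octal08, octal09); // 15 7 0 0
```

This also suggests why such a bug can go unnoticed for a long time: “01” to “07” give the same value in octal and in decimal, so only “08” and “09” go wrong.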
So, what is really happening here? Who is correct?
The ECMA-262 standard
As I tend to dislike unfounded discussions, my first reaction was to look for a copy of the ECMA-262 standard, which can be downloaded here: http://www.ecma-international.org/publications/standards/Ecma-262.htm.
Page 77, section 15.1.2.2, gives the definition of the parseInt function. THAT is authoritative information, isn’t it?
So, the definition of parseInt, which actually takes 2 arguments (string and radix), reads:
The parseInt function produces an integer value dictated by interpretation of the contents of the string argument according to the specified radix. Leading whitespace in the string is ignored. If radix is undefined or 0, it is assumed to be 10 except when the number begins with the character pairs 0x or 0X, in which case a radix of 16 is assumed. Any radix-16 number may also optionally begin with the character pairs 0x or 0X.
As you can see, the standard states that JavaScript plays the smart guy when dealing with strings starting with “0x”, that is, hexadecimal string representations of numbers. Now, where does it say JavaScript should also be smart about octals?
Conclusion: JavaScript is indeed not implemented in the standard way in most browsers nowadays.
Actually, the spec, on the next page, permits implementations to interpret strings beginning with a “0” that is not followed by an “x” or “X” as octal. However, it encourages them to interpret such strings as decimal. Why would leading browsers diverge from the official recommendation?
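The radix rule described in the spec can be sketched as a small function. This is only an illustration: inferRadix and its allowOctal flag are hypothetical names, not part of any engine or of the standard. The flag models the discretion the spec leaves to implementations.

```javascript
// Hypothetical helper: the radix parseInt would assume when none is given.
// allowOctal models the choice the spec leaves open for a leading "0".
function inferRadix(str, allowOctal) {
  // Skip leading whitespace and an optional sign, as parseInt does.
  var s = String(str).replace(/^\s+/, "").replace(/^[+-]/, "");
  if (/^0[xX]/.test(s)) {
    return 16; // "0x" or "0X" prefix: hexadecimal
  }
  if (allowOctal && /^0/.test(s)) {
    return 8; // permitted by the spec, but discouraged
  }
  return 10; // the encouraged default
}

console.log(inferRadix("0x1F", false)); // 16
console.log(inferRadix("08", false));   // 10
console.log(inferRadix("08", true));    // 8
```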
Workarounds
A number of simple workarounds are available. First, you can simply use the radix argument and specify it should be 10:
var day = parseInt("08", 10)
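Applied to the date strings from the beginning of the post, this could look like the sketch below. The helper name parseDateParts is made up for the example, and it assumes well-formed “dd/mm/yyyy” input.

```javascript
// Hypothetical helper: split a "dd/mm/yyyy" string and parse each part as decimal.
function parseDateParts(text) {
  var parts = text.split("/");
  return {
    day: parseInt(parts[0], 10),
    month: parseInt(parts[1], 10),
    year: parseInt(parts[2], 10)
  };
}

var parsed = parseDateParts("17/08/2005");
console.log(parsed.day, parsed.month, parsed.year); // 17 8 2005
```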
Another solution, the one I prefer (and use all the time) when converting strings to numbers, is:
var day = "08" - 0
Yes, you subtract the INTEGER 0 from a string. To see why this is valid, go to page 31, section 9.3.1 of the same document and read about the ToNumber function, then read section 11.6.2 on page 50, about the subtraction operator. It first converts both operands to numbers with ToNumber, and then calculates the result.
The point is that, at least in Firefox, the implementation of ToNumber (which handles “0x” the same way parseInt does) works correctly with strings such as “08”, and parses it to 8.
Now, I ask: why does Firefox implement ToNumber correctly but give unexpected results for parseInt, when the spec recommends using base 10?
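The two approaches are not interchangeable on every input, though. The sketch below (variable names are illustrative) shows a few cases where subtracting 0 and calling parseInt with radix 10 agree, and a few where they differ:

```javascript
// Conversion through the subtraction operator (ToNumber).
var fromDecimal = "08" - 0;   // 8: leading zeros are read as decimal
var fromHex = "0x1A" - 0;     // 26: the hexadecimal prefix is honoured
var fromJunk = "08px" - 0;    // NaN: the whole string must be numeric
var fromEmpty = "" - 0;       // 0: the empty string converts to 0

// The same inputs through parseInt with an explicit radix.
var intJunk = parseInt("08px", 10); // 8: parsing stops at the first invalid character
var intEmpty = parseInt("", 10);    // NaN: nothing to parse

console.log(fromDecimal, fromHex, fromJunk, fromEmpty, intJunk, intEmpty);
// 8 26 NaN 0 8 NaN
```

Note also that the subtraction keeps fractional parts ("1.5" - 0 gives 1.5), so it converts to a number rather than to an integer.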
Comment!
