The relevant IEEE standard defines a numeric constant NaN (not a number) and prescribes that NaN should compare as not equal to itself. Why is that?
All the languages I'm familiar with implement this rule. But it often causes significant problems, for example unexpected behavior when NaN is stored in a container, when NaN is in the data that is being sorted, etc. Not to mention, the vast majority of programmers expect any object to be equal to itself (before they learn about NaN), so surprising them adds to the bugs and confusion.
IEEE standards are well thought out, so I am sure there is a good reason why NaN comparing as equal to itself would be bad. I just can't figure out what it is.
My original answer (from 4 years ago) criticizes the decision from the modern-day perspective without understanding the context in which the decision was made. As such, it doesn't answer the question.
The correct answer is given here:
NaN != NaN originated out of two pragmatic considerations: [...] There was no isnan( ) predicate at the time that NaN was formalized in the 8087 arithmetic; it was necessary to provide programmers with a convenient and efficient means of detecting NaN values that didn’t depend on programming languages providing something like isnan( ) which could take many years
There was one disadvantage to that approach: it made NaN less useful in many situations unrelated to numerical computation. For example, much later when people wanted to use NaN to represent missing values and put them in hash-based containers, they couldn't do it.
If the committee had foreseen future use cases and considered them important enough, they could have gone for the more verbose !(x<x & x>x) instead of x!=x as a test for NaN. However, their focus was more pragmatic and narrow: providing the best solution for numeric computation, and as such they saw no issue with their approach.
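The self-inequality test, and the kind of container trouble it later caused, can be sketched in JavaScript. This is only an illustration: `isNaNBySelfCompare` is a hypothetical helper name, not something taken from the standard or from the quoted answer.

```javascript
// Hypothetical helper: detects NaN using only the guaranteed
// "NaN is not equal to itself" rule, without a library isNaN().
function isNaNBySelfCompare(x) {
  return x !== x;
}

const values = [1.5, NaN, 0];

// indexOf compares with strict equality (===), so it can never find NaN.
const foundByIndexOf = values.indexOf(NaN) !== -1;

// includes compares with SameValueZero, which treats NaN as equal to itself.
const foundByIncludes = values.includes(NaN);

console.log(isNaNBySelfCompare(NaN)); // true
console.log(isNaNBySelfCompare(1.5)); // false
console.log(foundByIndexOf);          // false
console.log(foundByIncludes);         // true
```

The two lookups disagree only because they use different notions of equality, which is the trade-off described above.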
===
Original answer:
I am sorry, much as I appreciate the thought that went into the top-voted answer, I disagree with it. NaN does not mean "undefined" - see http://www.cs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF, page 7 (search for the word "undefined"). As that document confirms, NaN is a well-defined concept.
Furthermore, the IEEE approach was to follow the regular rules of mathematics as much as possible, and when they couldn't, to follow the rule of "least surprise" - see https://stackoverflow.com/a/1573715/336527. Any mathematical object is equal to itself, so the rules of mathematics would imply that NaN == NaN should be True. I cannot see any valid and powerful reason to deviate from such a major mathematical principle (not to mention the less important rules of trichotomy of comparison, etc.).
As a result, my conclusion is as follows.
IEEE committee members did not think this through very clearly, and made a mistake. Since very few people understood the IEEE committee approach, or cared about what exactly the standard says about NaN (to wit: most compilers' treatment of NaN violates the IEEE standard anyway), nobody raised an alarm. Hence, this mistake is now embedded in the standard. It is unlikely to be fixed, since such a fix would break a lot of existing code.
Edit: Here is one post from a very informative discussion. Note: to get an unbiased view you have to read the entire thread, as Guido takes a different view to that of some other core developers. However, Guido is not personally interested in this topic, and largely follows Tim Peters' recommendation. If anyone has Tim Peters' arguments in favor of NaN != NaN, please add them in comments; they have a good chance to change my opinion.
NaN violate trichotomy makes sense, but like you I see no reasonable semantic justification for not having == define an equivalence relation when its operands are both of the same type (going a little further, I think languages should explicitly disallow comparisons between things of different types--even when implicit conversions exist--if such comparisons cannot implement an equivalence relation). The concept of an equivalence relation is so fundamental in both programming and mathematics, it seems crazy to violate it - supercat 2013-09-18 19:40
NaN != NaN is deduced beyond saying it's needed to distinguish NaN from non-NaNs absent library support like isnan() - tmyklebu 2014-02-13 19:59
NaN==NaN return something other than true or false would have been problematic, but given that (a<b) does not necessarily equal !(a>=b), I see no reason that (a==b) must necessarily equal !(a!=b). Having NaN==NaN and NaN!=NaN both return false would allow code which needs either definition of equality to use the one it needs - supercat 2014-04-29 23:28
The accepted answer is 100% without question WRONG. Not halfway wrong or even slightly wrong. I fear this issue is going to confuse and mislead programmers for a long time to come when this question pops up in searches.
NaN is designed to propagate through all calculations, infecting them like a virus, so if somewhere in your deep, complex calculations you hit upon a NaN, you don't bubble out a seemingly sensible answer. Otherwise by identity NaN/NaN should equal 1, along with all the other consequences like (NaN/NaN)==1, (NaN*1)==NaN, etc. If you imagine that your calculations went wrong somewhere (rounding produced a zero denominator, yielding NaN), etc then you could get wildly incorrect (or worse: subtly incorrect) results from your calculations with no obvious indicator as to why.
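A minimal JavaScript sketch of that propagation, assuming `0 / 0` as the source of the bad intermediate value (the variable names are illustrative only):

```javascript
// An invalid operation produces NaN...
const bad = 0 / 0;

// ...and every arithmetic result that depends on it is NaN as well.
const results = [
  bad + 1,
  bad * 1,
  bad / bad,      // NaN, not 1
  Math.sqrt(bad),
];

const allNaN = results.every(Number.isNaN);

console.log(allNaN);          // true
console.log(bad / bad === 1); // false
```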
There are also really good reasons for NaNs in calculations when probing the value of a mathematical function; one of the examples given in the linked document is finding the zeros() of a function f(). It is entirely possible that in the process of probing the function with guess values that you will probe one where the function f() yields no sensible result. This allows zeros() to see the NaN and continue its work.
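As a rough illustration of that idea (not the zeros() routine from the linked document), here is a hypothetical bracketing scan that keeps going when a probe returns NaN. The name `bracketZeros`, its parameters, and the fixed-step strategy are all assumptions made for this sketch.

```javascript
// Hypothetical sketch: scan [lo, hi] in `steps` equal steps and report
// the intervals where f changes sign (or touches zero). Probes where f
// yields NaN are skipped instead of aborting the search.
function bracketZeros(f, lo, hi, steps) {
  const brackets = [];
  let prevX = null;
  let prevY = null;
  for (let i = 0; i <= steps; i++) {
    const x = lo + (hi - lo) * (i / steps);
    const y = f(x);
    if (Number.isNaN(y)) {
      // No sensible value here: forget the previous probe and move on.
      prevX = null;
      prevY = null;
      continue;
    }
    if (prevY !== null && prevY * y <= 0) {
      brackets.push([prevX, x]);
    }
    prevX = x;
    prevY = y;
  }
  return brackets;
}

// Math.log returns NaN for negative arguments; its only zero is at x = 1.
const found = bracketZeros((x) => Math.log(x), -1, 3, 8);
console.log(found);
```

Because the probes at negative x simply come back as NaN, the scan needs no exception handling to get past them and still reports intervals around x = 1.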
The alternative to NaN is to trigger an exception as soon as an illegal operation is encountered (also called a signal or a trap). Besides the massive performance penalties you might encounter, at the time there was no guarantee that the CPUs would support it in hardware or the OS/language would support it in software; everyone was their own unique snowflake in handling floating-point. IEEE decided to explicitly handle it in software as the NaN values so it would be portable across any OS or programming language. Correct floating point algorithms are generally correct across all floating point implementations, whether that be node.js or COBOL (hah).
In theory, you don't have to set specific #pragma directives, set crazy compiler flags, catch the correct exceptions, or install special signal handlers to make what appears to be the identical algorithm actually work correctly. Unfortunately some language designers and compiler writers have been really busy undoing this feature to the best of their abilities.
Please read some of the information about the history of IEEE 754 floating point. Also this answer on a similar question where a member of the committee responded: What is the rationale for all comparisons returning false for IEEE754 NaN values?
"An Interview with the Old Man of Floating-Point"
"History of IEEE Floating-Point Format"
What every computer scientist should know about floating point arithmetic
NaN + 1 != 0, or NaN * 1 > 0, it returns True or False as if everything was fine. Therefore, you can't rely on NaN protecting you from problems if you plan to use comparison operators. Given that comparisons won't help you "propagate" NaNs, why not at least make them sensical? As things stand, they mess up the use cases of NaN in dictionaries, they make sort unstable, etc.
Also, a minor mistake in your answer. NaN/NaN == 1 would not evaluate True if I had my way - max 2014-05-17 11:03
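The point about comparisons can be checked directly. A small JavaScript sketch (variable names are illustrative): arithmetic on NaN yields NaN, but every comparison collapses to an ordinary boolean.

```javascript
const x = NaN;

const notEqualZero = x + 1 !== 0; // comparison result, not NaN
const greaterThanZero = x * 1 > 0;
const atMostZero = x * 1 <= 0;

console.log(notEqualZero);    // true
console.log(greaterThanZero); // false
console.log(atMostZero);      // false
```

Both ordered comparisons are false, so a branch such as `if (value > 0) ... else ...` silently takes the else path for NaN.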
Many commenters have argued that it would be more useful to preserve reflexivity of equality and trichotomy on the grounds that adopting NaN != NaN doesn’t seem to preserve any familiar axiom. I confess to having some sympathy for this viewpoint, so I thought I would revisit this answer and provide a bit more context.
So maybe, dear Sir, you might consider being a bit less forceful in your statements - max 2014-05-17 11:16
isnan() at the time, which is a valid reason why the decision was taken. However, I can't see any reason that is still valid today, except that it would be a very bad idea to change the semantics - Sven Marnach 2015-10-08 11:22
x and y are NaN so comparing them as equal would be an invalid answer - russbishop 2016-02-10 19:40
Well, log(-1) gives NaN, and acos(2) also gives NaN. Does that mean that log(-1) == acos(2)? Clearly not. Hence it makes perfect sense that NaN is not equal to itself.
Revisiting this almost two years later, here's a "NaN-safe" comparison function:
function compare(a,b) {
    return a == b || (isNaN(a) && isNaN(b));
}
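One caveat with a comparison of this shape: the global isNaN() coerces its argument to a number first, so two unrelated non-numeric strings would also count as "both NaN". A hedged sketch of a stricter variant follows; `compareStrict` is a hypothetical name, and using Number.isNaN with === is a swapped-in choice, not the answer's original method.

```javascript
// Hypothetical stricter variant: no type coercion, and only real NaN
// values are treated as equal to each other.
function compareStrict(a, b) {
  return a === b || (Number.isNaN(a) && Number.isNaN(b));
}

console.log(compareStrict(NaN, NaN));       // true
console.log(compareStrict(1, 1));           // true
console.log(compareStrict('abc', 'xyz'));   // false

// The coercing global isNaN() reports true for both strings:
console.log(isNaN('abc') && isNaN('xyz'));  // true

// Object.is also treats NaN as equal to itself (but separates 0 and -0).
console.log(Object.is(NaN, NaN));           // true
```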
NaN just means undefined. By definition, you can’t reasonably assert that some undefined value is equal to another one - Jon Purdy 2012-04-05 18:52
log function and the acos function, then all negative values past -1 would be considered an intersection. Interestingly, Infinity == Infinity is true, despite the fact that the same can't be said in actual mathematics - Niet the Dark Absol 2012-04-05 19:05
1 + 3 = 4 and 2 + 2 = 4. Does that mean that 1 + 3 = 2 + 2? Clearly yes. Hence your answer does not make perfect sense - borisdiakur 2012-07-20 21:33
log, sqrt, or parseInt) that is called with two different arguments - you expect f(a)=f(b) <=> a=b. Now -3!=2 <=> acos(-3)!=acos(2) => NaN!=NaN - Bergi 2013-04-02 13:08
log(-1) != log(-1) does not make sense. So neither NaN equals NaN nor NaN does not equal NaN makes sense in all cases. Arguably, it'd make more sense if NaN == NaN evaluated to something representing unknown, but then == wouldn't return a boolean - Tim Goodman 2013-06-28 16:39
f(a)=f(b) <=> a=b this would only hold if f was an injective function. But f could be constant. What one can expect is that a=b => f(a)=f(b), but then your argument does not apply - Max 2013-09-09 01:12
NaN==NaN should return false, why should that imply that NaN!=NaN should return true? Given that NaN < NaN and NaN >= NaN both return false, even though they're "opposite" conditions, why could not NaN == NaN and NaN != NaN both return true - supercat 2014-02-11 00:40
NaN was obtained - Niet the Dark Absol 2014-02-11 00:42
== to return false, would not the fact that they might be equal be adequate reason for != to also return false? The idea of equivalence relations is a very useful one, but I know of no nice way using only IEEE operations to test whether x and y are equivalent except (!(x < y) && !(x > y)); being able to test !(x == y) would seem much cleaner - supercat 2014-02-11 02:21
!= operator is redundant with ==, while the other "opposite" forms allow a single test to specify the desired behavior when comparing NaN with itself [e.g. !(a > b) is equivalent to a <= b except that the former returns true for NaN and the latter false]. If NaN != NaN were false, then code which wanted to test equivalence could test !(a != b) rather than having to use !(a < b) && !(b < a) - supercat 2014-04-29 23:20
a==b to be equivalent to !(a!=b) but not also compel a>b to be equivalent to !(a<=b)? Also, can you suggest any usage cases where NaN!=NaN is helpful, which are anywhere near as common as checking whether a value has been stored in a set? If a language can't support two different meanings for equality, specifying that code which wants to test if two things are equal non-NaNs should use (x<=y)||(x>=y) would seem less annoying than saying that what would otherwise be type-agnostic equivalence-testing code must special-case floats - supercat 2014-04-29 23:51
log(-1) and acos(2) examples make it clear why NaN shouldn't be equal to itself. I think it's better to have two separate idioms when testing for equality of two NaNs. One is symbolic equality (or equivalence) and the other is absolute exactness (or equality or identity). In my opinion, NaN is equivalent to NaN but not equal to NaN. log(-1) and acos(2) can be considered the same in meaning but not exact in value - nawfal 2015-11-24 11:39
NaN == NaN; true, NaN != NaN; false, NaN === NaN; false, NaN !== NaN; false. As supercat says I don't get why PHP, JS etc have designed it like NaN == NaN; false and NaN != NaN; true. The same question arises for Nulls as well. Some platforms consider them equal, some not. If only we could have one opinion on this - nawfal 2015-11-24 11:42
NaNs and Nulls before attempting to do stuff with them : - Niet the Dark Absol 2015-11-24 12:44
var x = some_func(y) we are considering x to be equal to RHS. On the other hand 1 + 3 is mathematically equal to 4. It's a side-effect that the addition function or operator returns a value that is also equal. 1 + 3 is equal to 4. log(-1) results in NaN. To make it amply clear, even though log(-1) results in NaN, it should be noted log(-1) == NaN fails in many languages. Whether this should fail is the question - nawfal 2015-11-24 13:23
1 + 3 and 2 + 2 result in 4, and 4 == 4. That's why 1 + 3 = 2 + 2. log(-1) and acos(2) result in NaN but is the value NaN equal to NaN? The == in 1st case operates on two numbers. Their behavior is well defined. NaN is not a number and the debate is on what should be the behavior. Just because some operator operates on two operands of one kind in one way doesn't mean it should operate on operands of any kind the same way. To assert NaN == NaN because 4 == 4 leads to a circular argument - nawfal 2015-11-24 13:27
NaN is IdK (I don't Know) - JLRishe 2017-10-27 05:53
A nice property is: if x == x returns false, then x is NaN.
(One can use this property to check if x is NaN or not.)
a and b could have used !(a != b) - supercat 2014-02-11 00:38
Try this:
var a = 'asdf';
var b = null;
var intA = parseInt(a);
var intB = parseInt(b);
console.log(intA); //logs NaN
console.log(intB); //logs NaN
console.log(intA==intB);// logs false
If intA == intB were true, that might lead you to conclude that a==b, which it clearly isn't.
Another way to look at it is that NaN just gives you information about what something ISN'T, not what it is. For example, if I say 'an apple is not a gorilla' and 'an orange is not a gorilla', would you conclude that 'an apple'=='an orange'?
a=16777216f, b=0.25, and c=0.125, should the fact that a+b == a+c be taken to imply that b==c? Or merely that the two calculations yield indistinguishable results? Why should not sqrt(-1) and (0.0/0.0) be considered indistinguishable, absent a means of distinguishing them - supercat 2014-04-29 23:25
== in order to preserve transitivity - Alexander 2015-06-29 16:19
Actually, there is a concept in mathematics known as “unity” values. These values are extensions that are carefully constructed to reconcile outlying problems in a system. For example, you can think of ring at infinity in the complex plane as being a point or a set of points, and some formerly pretentious problems go away. There are other examples of this with respect to cardinalities of sets where you can demonstrate that you can pick the structure of the continuum of infinities so long as |P(A)| > |A| and nothing breaks.
DISCLAIMER: I am only working from my vague memory of some interesting caveats from my math studies. I apologize if I did a woeful job of representing the concepts I alluded to above.
If you want to believe that NaN is a solitary value, then you are probably going to be unhappy with some of the results like the equality operator not working the way you expect/want. However, if you choose to believe that NaN is more of a continuum of “badness” represented by a solitary placeholder, then you are perfectly happy with the behavior of the equality operator. In other words, you lose sight of the fish you caught in the sea but you catch another that looks the same but is just as smelly.
== that is not reflexive, symmetric and transitive; it's unfortunate that Python won't stop him. But when Python itself makes == non-reflexive, and you can't even override it, this is a complete disaster from both practical viewpoint (container membership) and elegance/mental clarity viewpoint - max 2013-03-19 20:42