What is the type of the null literal?

The C# 2.0 specification says:
"The null literal evaluates to the null value, which is used to denote a reference not pointing at any object or array, or the absence of a value. The null type has a single value, which is the null value."
But no later version of the specification contains this language. So what then is the type of the null literal expression?

It doesn't have one; the current specification never says what the type of a null literal is. It says that a null literal can be converted to any reference type, pointer type, or nullable value type, but on its own, considered outside of the context which performs that conversion, it has no type.
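A minimal sketch of what that means in practice. The class and method names here are hypothetical, chosen only for illustration. The null literal converts to whatever type its context asks for, and the two commented-out lines are rejected by current Microsoft compilers because no context supplies a type (the pointer case is omitted since it needs an unsafe context):

```csharp
using System;

public static class NullLiteralConversions
{
    // Two overloads; the type the argument is converted to decides which one runs.
    public static string Describe(string s) => "string";
    public static string Describe(int? n) => "int?";

    public static string Demo()
    {
        string s = null;   // null converted to a reference type
        int? n = null;     // null converted to a nullable value type

        // var x = null;   // does not compile: there is no type to infer
        // Describe(null); // does not compile: both conversions exist, neither is better

        // A cast supplies the context that the bare literal lacks.
        return Describe((string)null) + "," + Describe((int?)null)
             + "," + (s == null && n == null);
    }
}
```

Calling `NullLiteralConversions.Demo()` returns `"string,int?,True"`: each cast picks a different overload even though both arguments are spelled `null`.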
When Mads and I were sorting out the exact wording of various parts of the specification for C# 3.0 we realized that the null type was bizarre. It is a type with only one value -- or is it? Is the value of a null nullable int really the same as the value of a null string? And don't values of nullable value type already have a type, namely, the nullable value type? [1] So already this is very confusing.
Worse, the null type is a type that Reflection knows nothing about; there's a Type object associated with void which has no values at all, but none associated with the null type. It is a type that doesn't have a proper name, is in no namespace, that GetType() never returns, that you can't specify as the type of a local variable or field or method return type or anything.
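The Reflection side of this can be checked directly. The following is a small sketch with hypothetical class and method names: `typeof(void)` yields a real `Type` object, whereas asking a null value for its type throws rather than returning anything resembling a "null type".

```csharp
using System;

public static class NullTypeReflectionDemo
{
    // void has a Type object even though it has no values.
    public static string VoidTypeName() => typeof(void).FullName;

    // GetType() on a null reference throws; no "null type" is ever returned.
    public static bool GetTypeOnNullReferenceThrows()
    {
        object o = null;
        try { o.GetType(); return false; }
        catch (NullReferenceException) { return true; }
    }

    // A null nullable int boxes to a null reference, so GetType() throws here too.
    public static bool GetTypeOnNullNullableThrows()
    {
        int? n = null;
        try { n.GetType(); return false; }
        catch (NullReferenceException) { return true; }
    }

    // A non-null nullable int reports the underlying type, not a null type.
    public static bool NonNullNullableReportsInt32()
    {
        int? m = 5;
        return m.GetType() == typeof(int);
    }
}
```

`VoidTypeName()` returns `"System.Void"`, and the other three methods all return `true`.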
In short, it really is a type that is there for completionists: it ensures that every compile-time expression can be said to have a type. Except that C# already had expressions that had no type: method groups in C# 1.0, anonymous methods in C# 2.0 and lambdas in C# 3.0 all also have no type. If all those things can have no type, clearly the null literal need not have a type either. Therefore we removed references to the useless "null type" in the C# 3.0 specification.
As an implementation detail, the Microsoft implementations of C# 1.0 through 5.0 all do have an internal object to represent the "null type". They also have objects to represent the non-existing types of lambdas, anonymous methods and method groups. This implementation choice has a number of pros and cons. On the pro side, the compiler can ask for the type of any expression and get an answer. On the con side, it means that bugs in the type analysis that really ought to have crashed the compiler, and hence been found early by testing, sometimes instead cause semantic changes in programs!
My favourite example of that is that it is possible in C# 2.0 to use the illegal expression null ?? null. A careful reading of the specification shows that this expression should fail to compile. But due to a bug, the compiler fails to flag it as an erroneous usage of the ?? operator, and goes on to infer that the type of this expression is the null type, even though that expression is not a null literal. That error then goes on to cause many other downstream bugs as the type analyzer tries to make sense of the expression.
In Roslyn we debated what to do about this; if I recall correctly the final decision was to make two APIs, one which asks "what is the type of this expression?", and one which asks "what is the type of this expression given a certain context?". In the first case, the null literal expression has no type and so null is returned; in the second, the type that the null literal is being converted to can be returned.
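As a hedged illustration of those two questions: in the public Roslyn API as it exists today, `SemanticModel.GetTypeInfo` returns a single `TypeInfo` value whose `Type` and `ConvertedType` properties answer them respectively. That is one call rather than two APIs, so treat the mapping onto the design described above as an assumption. The sketch needs the Microsoft.CodeAnalysis.CSharp NuGet package, and the class name, method name and sample source text are made up for the example.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class NullLiteralTypeInfoDemo
{
    public static (string Type, string ConvertedType) Inspect()
    {
        // A tiny program in which a null literal is converted to string.
        SyntaxTree tree = CSharpSyntaxTree.ParseText("class C { string s = null; }");

        CSharpCompilation compilation = CSharpCompilation.Create(
            "Demo",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });

        SemanticModel model = compilation.GetSemanticModel(tree);

        // Find the null literal expression in the syntax tree.
        LiteralExpressionSyntax nullLiteral = tree.GetRoot()
            .DescendantNodes()
            .OfType<LiteralExpressionSyntax>()
            .First(n => n.IsKind(SyntaxKind.NullLiteralExpression));

        TypeInfo info = model.GetTypeInfo(nullLiteral);

        // Type: the expression on its own. ConvertedType: the expression in context.
        return (info.Type?.ToDisplayString() ?? "(no type)",
                info.ConvertedType?.ToDisplayString() ?? "(no type)");
    }
}
```

With this input, `Type` should come back null (reported here as `"(no type)"`) while `ConvertedType` is `string`, matching the distinction between the literal considered alone and the literal considered in its conversion context.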
[1] The reader who critically notes that it is question-begging to ask whether values of a given type have a type ought to instead applaud my consistency. Tautologies are by definition consistent.