Math from scratch, part two: zero and one

Last time I said that I was going to choose the "sequence of bits" technique for representing a natural number. At this point I don't think I need to go into detail about how bits work; you all know that each position represents the addition of a power of two multiplied by the value of the bit at that position, blah blah blah. Let's instead concentrate on the technical details. I'm going to represent natural numbers using the following recursive definition:
  • Zero is a natural number
  • One is a natural number
  • A zero or one bit, called the least significant bit, followed by a natural number that ends in one, is a natural number.
"Followed by" is a bit counterintuitive here. In our left-to-right reading culture we normally think of the stuff that "follows" as being to the right, but we normally also write the least significant digit to the right. When I'm writing out a sequence of bits in this series I'll use the standard "least significant digit on the right" convention and you can just remember that the tail of the linked list is logically to the left, not the right. I'll try to always write the tail variable to the left of the head variable so that it is more clear in the code as well.
So anyways, basically what we have here is a linked list of bits going from least to most significant that never ends in zero, except for the base case of zero of course.
We could use bool as the bit, and even define bool constants ZeroBit and OneBit, but hey, we're doing this from scratch, right? Let's just make objects. Like I said before, this series is pedagogical, not pragmatic. We're not trying to save on space here and in fact are consuming over a hundred bits of memory per actual bit represented. Of course we will only create two actual bit objects; every zero bit is identical to every other, and the same for one. But the reference to a bit object in each link in the linked list is pretty large.1
Normally I would want numbers to be value types since they are logically values. But it is hard to make a linked list of value types, so we'll go with a class.
Enough chatting, let's write some code.
namespace ImmutableNumbers
{
    public sealed class Natural
    {
        private sealed class Bit
        {
            public override string ToString()
            {
                return this == Natural.ZeroBit ? "0" : "1";
            }
        }
        private static readonly Bit ZeroBit = new Bit();
        private static readonly Bit OneBit = new Bit();

        public static readonly Natural Zero = new Natural(null, ZeroBit);
        public static readonly Natural One = new Natural(null, OneBit);

        private Natural tail;
        private Bit head;

        private Natural(Natural tail, Bit head)
        {
            this.head = head;
            this.tail = tail;
        }

        private static Natural Create(Natural tail, Bit head)
        {
            if (ReferenceEquals(tail, null) || ReferenceEquals(tail, Zero))
                return head == ZeroBit ? Zero : One;
            else
                return new Natural(tail, head);
        }
    }
}
OK, let's review. We've got a linked list of bits. We have a private factory method that maintains two nice invariants: first, it ensures that any Natural that is logically one is reference equal to One. Second, by ensuring the same invariant for Zero it also ensures that no non-zero Natural ever ends in a zero bit. Put another way, a null tail is equivalent to an infinite sequence of zero bits, and this is the only way to represent an infinite string of zero bits. (Again, with the exception of the zero value.)
A few more interesting things about this code:
  • I added a ToString method to Bit solely for debugging purposes. This will be one of the few uses of a built-in type other than object and bool.
  • I'm going to be adding overloaded equality operators to Natural soon, so to make it crystal clear when reference equality is intended, I'm going to spell it out every time. (But not on bits; bit equality is always reference equality.)
  • The recursive math algorithms break down nicely into mutually exclusive cases. I'm going to use the if(...) return ... else if(...) return ... else return ...; pattern throughout.
  • The call sites above are the only ones where the constructor will be invoked. From within the class, the factory method will always used. From outside the class, we'll expose Zero and One; once we have addition, that's all that is necessary to make a number of any size.
  • As we'll see next time, the convention for users of this class is that a null Natural is an error. The convention within the class will be that a null Natural is an infinite string of zero bits running off to the left.
  • Natural is sealed. I have no by-design scenario in which a user will extend this type, and I am not planning to get myself into the logical knot of "an integer is a special kind of natural" or vice versa.2
There's a saying in computer science that the only numbers that matter are zero, one and "arbitrarily many". We've got the first two. Next time on FAIC: we'll get the third by implementing addition.
  1. It is interesting to consider other ways to solve this problem. For example, we could have made two derived classes, OneBitNatural and ZeroBitNatural, and then used the runtime type of the object to determine the value of the least significant bit. That would get us down to probably less than 100 bits per bit represented, but it still pales in comparison to the density of a byte array.
  2. An integer can have a value that is not the value of any natural, so an integer class cannot be derived from the natural class. One could build a type hierarchy in which an immutable natural is a special kind of integer without violating the Liskov Substitution Principle, but the whole point here is that I want naturals to be foundational.
Previous
Next Post »