Description
I had a bet with a co-worker that C# would outperform F# for a simple counting exercise. This turned out to be quite a surprise for many reasons. In this blog post I will try to explain why.Problem
Sum the numbers from 1 to n using brute force (can't use the (1 + N)*(N/2) equation). Do this in both C# and F# and compare the speed. Since F# needs to do this recursively I thought this should be a no contest win for C#.Using F#
We initially wrote the code without using tail recursion which immediately ended with a stack overflow. So we went back and wrote it doing it the functional way, with tail recursion. Here is what we came up with.<span
class="codeKeyword">module</span> FSharpLib
<span class="codeKeyword">let</span> sumNum fromNum toNum =
<span class="codeKeyword">let</span> <span
class="codeKeyword">rec</span> sumNumTail fromNum toNum result =
<span class="codeKeyword">if</span> fromNum >
toNum then
result
<span class="codeKeyword">else</span>
sumNumTail (fromNum+1L) toNum (result+fromNum)
sumNumTail fromNum toNum 0L
Using C# - Attempt 1
You'll see later why I say attempt 1. But, this is the simplest C# code I could think of.<span
class="codeKeyword">public</span> <span
class="codeKeyword">static</span> <span
class="codeKeyword">long</span> CSharpForLoop(<span
class="codeKeyword">long</span> fromNumber, <span
class="codeKeyword">long</span> toNumber) {
<span class="codeKeyword">long</span> result = 0;
<span class="codeKeyword">for</span> (<span
class="codeKeyword">long</span> i = fromNumber; i <=
toNumber; i++) {
result += i;
}
<span class="codeKeyword">return</span> result;
}
The Results - Attempt 1
This is the results of running the above code using a fromNumber of 1 and a toNumber of 10000000 and running it 10 times alternating between C# and F# to filter out start-up times, etc.F# | 0.8606902s | 11% faster |
---|---|---|
C# | 0.9556198s |
I ran it like 20 times because I couldn't believe it. I looked at the C# code for any possible optimizations I could add. But, being so simple there wasn't a lot I could do.
Reflector is an amazing tool
I turned to my good friend Reflector to see what was going on. Here is what the F# code looked like when viewing it as C# code.<span
class="codeKeyword">public</span> <span
class="codeKeyword">static</span> <span
class="codeKeyword">long</span> sumNum(<span
class="codeKeyword">long</span> fromNum, <span
class="codeKeyword">long</span> toNum)
{
FastFunc<<span class="codeKeyword">long</span>,
FastFunc<<span class="codeKeyword">long</span>,
FastFunc<<span class="codeKeyword">long</span>,
<span
class="codeKeyword">long</span>>>>
sumNumTail = <span class="codeKeyword">new</span> clo@7();
<span class="codeKeyword">return</span>
FastFunc<<span class="codeKeyword">long</span>,
<span
class="codeKeyword">long</span>>.InvokeFast3<<span
class="codeKeyword">long</span>, <span
class="codeKeyword">long</span>>(sumNumTail, fromNum,
toNum, 0L);
}
and clo@7.Invoke looks like
<span
class="codeKeyword">public</span> <span
class="codeKeyword">override</span> <span
class="codeKeyword">long</span> Invoke(<span
class="codeKeyword">long</span> fromNum, <span
class="codeKeyword">long</span> toNum, <span
class="codeKeyword">long</span> result)
{
<span class="codeKeyword">while</span> (fromNum <=
toNum)
{
result += fromNum;
toNum = toNum;
fromNum += 1L;
}
<span class="codeKeyword">return</span> result;
}
They used a while loop instead of a for loop. OK, I can do that. But, what happened to tail, I thought that tail is why the CLR is so good at functional language implementations.
C# - Attempt 2
<span
class="codeKeyword">public</span> <span
class="codeKeyword">static</span> <span
class="codeKeyword">long</span> CSharpWhileLoop(<span
class="codeKeyword">long</span> fromNumber, <span
class="codeKeyword">long</span> toNumber) {
<span class="codeKeyword">long</span> result = 0;
<span class="codeKeyword">while</span> (fromNumber
<= toNumber)
{
result += fromNumber;
<span class="codeComment">// I left out the toNum = toNum
because it didn't do anything.</span>
fromNumber += 1L;
}
<span class="codeKeyword">return</span> result;
}
and the results (I've got you now F#)...
F# | 0.8606902s |
---|---|
C# For Loop | 0.9556198s |
C# While Loop | 0.9105592s |
Oh wait... DAMN! What could F# be doing to make it almost 10% faster than C#.
To the IL code Batman
This is the IL for the F# code. L_0000: nop <span class="codeComment">// while loop</span>
L_0001: nop
L_0002: ldarg.1 <span class="codeComment">// fromNum <= toNum</span>
L_0003: ldarg.2
L_0004: ble.s L_0009
L_0006: nop <span class="codeComment">// return result</span>
L_0007: ldarg.3
L_0008: ret
L_0009: nop <span class="codeComment">// result += fromNum</span>
L_000a: ldarg.1
L_000b: ldc.i8 1
L_0014: add
L_0015: ldarg.2 <span class="codeComment">// fromNum += 1L</span>
L_0016: ldarg.3
L_0017: ldarg.1
L_0018: add
L_0019: starg.s result <span class="codeComment">// loop</span>
L_001b: starg.s toNum
L_001d: starg.s fromNum
L_001f: br.s L_0000
This is the IL for the C# while loop.
L_0000: nop <span class="codeComment">// long result = 0L</span>
L_0001: ldc.i4.0
L_0002: conv.i8
L_0003: stloc.0
L_0004: br.s L_0012 <span class="codeComment">// while loop</span>
L_0006: nop <span class="codeComment">// result += fromNumber</span>
L_0007: ldloc.0
L_0008: ldarg.0
L_0009: add
L_000a: stloc.0 <span class="codeComment">// fromNumber += 1L</span>
L_000b: ldarg.0
L_000c: ldc.i4.1
L_000d: conv.i8
L_000e: add
L_000f: starg.s fromNumber
L_0011: nop
L_0012: ldarg.0 <span class="codeComment">// fromNumber <= toNumber</span>
L_0013: ldarg.1
L_0014: cgt
L_0016: ldc.i4.0
L_0017: ceq
L_0019: stloc.2
L_001a: ldloc.2
L_001b: brtrue.s L_0006 <span class="codeComment">// loop</span>
L_001d: ldloc.0
L_001e: stloc.1
L_001f: br.s L_0021
L_0021: ldloc.1
L_0022: ret
What just happened.
Wow, the C# compiler is quite a bit more verbose. First of all, C# makes the while loop into a do/while loop. Secondly, in the loop condition C# creates a temporary variable to store the comparison and then checks for 'true' ("brtrue.s"). If you look at the F# IL you'll see the use of the "ble.s" instructions (branch if less or equal) instead. Finally we see that C# uses a few "conv.i8" instructions (convert to 8 byte/64-bit int) which I can only guess adds to it's defeat.Conclusion
So is F# just a more performant language or is C# just compiling to more robust IL code which as F# matures will end up suffering the same fate? I personally think (hope actually) the later is true. I'm not an expert in compilers but I would have to assume the C# compiler team is using every trick in the book to make highly performant IL code without sacrificing robustness and I can only assume that F# will get to a point where it needs to make the same concessions to handle some obscure cases.Sign up here with your email
ConversionConversion EmoticonEmoticon