Why does adding an extra field to struct greatly improves its performance?

Spread the love

Question Description

I noticed that a struct wrapping a single float is significantly slower than using a float directly, with approximately half of the performance.

using System;
using System.Diagnostics;

struct Vector1 {

    public float X;

    public Vector1(float x) {
        X = x;
    }

    public static Vector1 operator +(Vector1 a, Vector1 b) {
        a.X = a.X + b.X;
        return a;
    }
}

However, upon adding an additional ‘extra’ field, some magic seems to happen and performance once again becomes more reasonable:

struct Vector1Magic {

    public float X;
    private bool magic;

    public Vector1Magic(float x) {
        X = x;
        magic = true;
    }

    public static Vector1Magic operator +(Vector1Magic a, Vector1Magic b) {
        a.X = a.X + b.X;
        return a;
    }
}

The code I used to benchmark these is as follows:

class Program {
    static void Main(string[] args) {
        int iterationCount = 1000000000;
        var sw = new Stopwatch();
        sw.Start();
        var total = 0.0f;
        for (int i = 0; i < iterationCount; i++) {
            var v = (float) i;
            total = total + v;
        }
        sw.Stop();
        Console.WriteLine("Float time was {0} for {1} iterations.", sw.Elapsed, iterationCount);
        Console.WriteLine("total = {0}", total);
        sw.Reset();
        sw.Start();
        var totalV = new Vector1(0.0f);
        for (int i = 0; i < iterationCount; i++) {
            var v = new Vector1(i);
            totalV += v;
        }
        sw.Stop();
        Console.WriteLine("Vector1 time was {0} for {1} iterations.", sw.Elapsed, iterationCount);
        Console.WriteLine("totalV = {0}", totalV);
        sw.Reset();
        sw.Start();
        var totalVm = new Vector1Magic(0.0f);
        for (int i = 0; i < iterationCount; i++) {
            var vm = new Vector1Magic(i);
            totalVm += vm;
        }
        sw.Stop();
        Console.WriteLine("Vector1Magic time was {0} for {1} iterations.", sw.Elapsed, iterationCount);
        Console.WriteLine("totalVm = {0}", totalVm);
        Console.Read();
    }
}

With the benchmark results:

Float time was 00:00:02.2444910 for 1000000000 iterations.
Vector1 time was 00:00:04.4490656 for 1000000000 iterations.
Vector1Magic time was 00:00:02.2262701 for 1000000000 iterations.

Compiler/environment settings:
OS: Windows 10 64 bit
Toolchain: VS2017
Framework: .Net 4.6.2
Target: Any CPU Prefer 32 bit

If 64 bit is set as the target, our results are more predictable, but significantly worse than what we see with Vector1Magic on the 32 bit target:

Float time was 00:00:00.6800014 for 1000000000 iterations.
Vector1 time was 00:00:04.4572642 for 1000000000 iterations.
Vector1Magic time was 00:00:05.7806399 for 1000000000 iterations.

For the real wizards, I've included a dump of the IL here: https://pastebin.com/sz2QLGEx

Further investigation indicates that this seems to be specific to the windows runtime, as the mono compiler produces the same IL.

On the mono runtime, both struct variants have roughly 2x slower performance compared to the raw float. This is quite a bit different to the performance we see on .Net.

What's going on here?

*Note this question originally included a flawed benchmark process (Thanks Max Payne for pointing this out), and has been updated to more accurately reflect the timings.

Practice As Follows

This should not happen. This is obviously some sort of missalignment that forces the JIT to not work like it should.

struct Vector1 //Works fast in 32 Bit 
{
    public double X;
}

struct Vector1 //Works fast in 64 Bit and 32 Bit
{
    public double X;
    public double X2;
}

You also must call:
Console.WriteLine(total); that increases the time exactly to Vector1Magic time which makes sense. The question still holds, why Vector1 is so slow.

Maybe structs are not optimized for sizeof(foo) < 64 Bit in 64 bit mode.

It seems that this was ansered 7 years ago:
Why is 16 byte the recommended size for struct in C#?

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.