A response to the blog post "{n} times faster than C". Our final program achieved a speedup of 128x (36 GiB/s throughput) by reformulating the problem and leveraging SIMD intrinsics.
As a fan of functional programming, I find it validating in a way to see the more functional approach being faster. The reason for not wanting to mutate is that it’s easier to reason about pure code. Usually this is for the programmer’s benefit, but it can be good for the compiler too, as we see here. Obviously there are many cases where it is faster to mutate (many data structures can benefit from mutation), but there is this general assumption that FP is slower, which isn’t exactly true either.