I have the utmost respect for the ffmpeg developers; writing ASM is a skill I do not possess. I do have to wonder, though: wouldn’t it be easier for cross-platform compatibility to write in C instead? I have always understood that C generally compiles almost directly to assembly with little to no abstraction overhead, and it would not require platform-specific ASM code. What is the logic in choosing ASM over C? I have no doubt there is a good reason.
In 99 cases out of 100, you won’t be able to hand craft assembly better than a good compiler can - partly due to compilers being much better and partly due to the skill level required. 20 or 30 years ago compilers weren’t as good and a reasonably competent person could craft more optimised assembly but these days compilers are pretty damn good and you need some extra level of ability to best the compiler.
However, there’s still that 1 time out of 100 and given how resource intensive ffmpeg is, it’s worth spending that extra time to hyper optimise the code because it’ll pay off massively.
The reason is SIMD instructions / vectorized assembly instructions with consideration for delay slots, instruction latency, memory access times etc., for which GCC and Clang optimizers are both terrible and cannot automatically transform C code to them in any but simple cases.
This is also a reason why specialized DSP processors with SIMD capabilities have dedicated proprietary compilers for them.
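To make the SIMD point concrete, here is a minimal sketch of a kernel typical of video code: the sum of absolute differences (SAD) over 16 pixels. The function names `sad16_c` and `sad16_sse2` are made up for illustration and this is not FFmpeg's actual code. It also uses compiler intrinsics rather than hand-written assembly, so treat it only as a rough picture of the gap between the scalar loop and the vector instruction, assuming an x86 target with SSE2.

```c
#include <stdint.h>
#include <stdlib.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Portable scalar version: one byte pair per loop iteration. */
static unsigned sad16_c(const uint8_t *a, const uint8_t *b)
{
    unsigned sum = 0;
    for (int i = 0; i < 16; i++)
        sum += (unsigned)abs(a[i] - b[i]);
    return sum;
}

#if defined(__SSE2__)
/* SSE2 version (only built when the compiler targets SSE2):
 * PSADBW handles all 16 byte pairs in a single instruction. */
static unsigned sad16_sse2(const uint8_t *a, const uint8_t *b)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* Result holds two partial sums, one in the low 16 bits of each 64-bit half. */
    __m128i s = _mm_sad_epu8(va, vb);
    return (unsigned)_mm_cvtsi128_si32(s) + (unsigned)_mm_extract_epi16(s, 4);
}
#endif
```

Whether a given compiler turns the scalar loop into the same instruction on its own depends on the compiler version and flags, which is easy to check on Godbolt.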
The logic is that it’s much faster which is important for code that runs on a large portion of the world’s devices. Pretty much anything to do with video is using ffmpeg. From a set top box, to your phone, computer, YouTube & Netflix, even on Mars.
Video processing is hard, and when you’re processing that much data a 10x speedup is huge. That’s why it’s written in assembly. And there are really no downsides to it, because the original implementation is in C (cross-platform), and then there are handmade assembly versions for each specific platform (performance). Win-win.
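The "C reference plus per-platform fast versions" layout described above can be sketched with a table of function pointers. Everything here is hypothetical (`DSPFuncs`, `dsp_init`, `sum_bytes_c`, `sum_bytes_fast`, and the `have_fast_path` flag are invented names), and the "fast" routine is just unrolled C standing in for what would really be a hand-written assembly file.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*sum_fn)(const uint8_t *buf, size_t len);

/* Portable reference implementation: always available. */
static uint32_t sum_bytes_c(const uint8_t *buf, size_t len)
{
    uint32_t total = 0;
    for (size_t i = 0; i < len; i++)
        total += buf[i];
    return total;
}

/* Stand-in for a platform-specific routine (hypothetical; plain unrolled C here). */
static uint32_t sum_bytes_fast(const uint8_t *buf, size_t len)
{
    uint32_t t0 = 0, t1 = 0, t2 = 0, t3 = 0;
    size_t i = 0;
    for (; i + 4 <= len; i += 4) {
        t0 += buf[i];
        t1 += buf[i + 1];
        t2 += buf[i + 2];
        t3 += buf[i + 3];
    }
    for (; i < len; i++)   /* leftover tail bytes */
        t0 += buf[i];
    return t0 + t1 + t2 + t3;
}

/* Table of kernels; callers only ever go through the pointers. */
typedef struct {
    sum_fn sum_bytes;
} DSPFuncs;

/* Start from the C version, then override when a faster one is usable. */
static void dsp_init(DSPFuncs *f, int have_fast_path)
{
    f->sum_bytes = sum_bytes_c;
    if (have_fast_path)
        f->sum_bytes = sum_bytes_fast;
}
```

The point of the design is that the C version doubles as the fallback and as the reference the optimized versions can be tested against.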
Not to mention size. Assembly is so incredibly small without all the code interpretation and library overhead. I remember some of the old warez scene exes for DOS that were only a few kilobytes but turned out to be a huge, video-quality intro, some lasting minutes rather than a few seconds.
If by ‘String/Quantum’ you mean String Theory and quantum physics then you are wrong on the latter (and somewhat even the former). Quantum physics doesn’t replace classical physics, nor are they necessarily in opposition, and quantum physics is as much a theory as classical physics, so bashing one for being a ‘theory’ applies just as much to the other. And quantum physics is certainly in common use, as you simply can’t do anything at the atomic level without it. For example, any modern computer would not be able to function if quantum physics wasn’t used to inform its design; in the same vein, a modern computer would not function if classical physics wasn’t used to design it. It’s important to remember that the word ‘theory’ in this context doesn’t mean unproven; rather, it describes a collection of confirmed, falsifiable explanations of the natural world.
As for String Theory, it shouldn’t be thought of as equivalent in scale to quantum physics; it’s really just an optional framework within quantum physics that attempts to describe the fundamental nature of particles in a way that supports quantum gravity. Due to this, its usage is confined to theoretical physics and depends on which aspects of a system are being investigated, but it’s still used in some situations as it’s one of the best supported tools available.
I guess my main point is that quantum physics isn’t a fringe theory that shows up only in theoretical work; it’s very much a requirement for all fields and is thereby prevalent and very much in common use. I have a CS degree and many of my courses touched on quantum mechanics, from pnp/npn transistor design to quantum-annealing/gate proof cryptography, without getting too into the mechanics/math as we were not physicists.
sethboy66’s response is very good, but I’ll summarize it here in case it helps. In science, the word “theory” basically means “explanation.” Some explanations are proven wrong, but others have a lot of evidence going for them. Quantum mechanics is a theory that’s basically proven, and it’s commonly used all the time, such as in the computer you used to write that comment. String theory is not really proven, but it’s basically an extension to the existing body of quantum mechanics.
Typically inline assembly is written in an #ifdef block with a C/C++ alternative provided. Since the assembly is machine-specific, the devs need to write it for all the processor families they want to optimize for.
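A minimal sketch of that pattern, assuming GCC or Clang extended inline assembly on x86 and a made-up helper name `bswap32`: the preprocessor picks the assembly path where it applies and falls back to plain C everywhere else.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t bswap32(uint32_t x)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* x86 path: a single BSWAP instruction operating in place on a register. */
    __asm__("bswap %0" : "+r"(x));
    return x;
#else
    /* Portable fallback: shifts and masks, works on any C compiler. */
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
#endif
}
```

Both branches have to produce identical results, which is why the C version is worth keeping even on platforms where it is never selected.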
I think it would be neat if, as something gained popularity, more and more of it were re-written in optimized assembly. I mainly work in .NET, which performs fine for what it is, but there are some libraries like Dapper (which is a micro-ORM) which emit IL directly for their hot paths, which is incredibly difficult to do but results in it being insanely fast compared to what you could do in purely managed .NET. I’m sure if it were written in assembly it would be an order of magnitude faster than that.
I have always understood that C generally compiles almost directly to assembly with little to no abstraction overhead, and it would not require platform-specific ASM code.
You have always understood incorrectly then. I’d recommend taking a trip over to Godbolt and looking at the assembly output from C code. Play around with compiler options and see the (often MASSIVE!) changes. That alone should tell you that it doesn’t compile “almost directly to assembly”.
But then note something different. Count the different instructions used by the C compiler. Then look at the number of instructions available in an average CISC processor. Huge swaths of the instruction set, especially the more esoteric, but performance-oriented instructions for very specific use cases, are typically not touched by the compiler.
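One small illustration of a special-purpose instruction, assuming GCC or Clang: counting set bits. The loop below is what straightforward C looks like, while x86 has a dedicated POPCNT instruction that the compiler exposes through `__builtin_popcount`. The function names are made up for this sketch, and whether the compiler actually emits the single instruction depends on the target flags (for example `-mpopcnt`).

```c
#include <stdint.h>

/* Plain C: clear the lowest set bit until none remain. */
static unsigned popcount_loop(uint32_t x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1;   /* drops the lowest set bit */
        n++;
    }
    return n;
}

/* Ask the compiler for its bit-count builtin when it has one. */
static unsigned popcount_builtin(uint32_t x)
{
#if defined(__GNUC__)
    return (unsigned)__builtin_popcount(x);
#else
    return popcount_loop(x);   /* no builtin available: reuse the loop */
#endif
}
```

Pasting both functions into Godbolt and toggling the target flags is a quick way to see when the dedicated instruction shows up and when it doesn't.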
In the very, very, very ancient days of C, the C compiler compiled almost directly to assembly. Specifically PDP-11 assembly. And any processor that was similar to the PDP-11 had similar mappings available. This hasn’t been the case, however, for likely longer than you’ve been alive.
Assembly is probably the closest thing to magic humans have ever created.
(I’m disqualifying String/Quantum as they are “theories” and not in common use)
I have that same thought about par files; I just can’t figure out how they do what they do.
EDIT: sorry that was the wrong page, I meant to link this one: https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html
http://timelessname.com/elfbin/
It’s worth a read to have your mind blown about what you can do when you hand optimise assembly.
Great reads! Thanks for posting.