Bug of the day: 2022-02-28

There is this deinterlacing algorithm called yadif. It is very fast, but also very CPU-expensive. For this reason, it also has some ASM optimisations. GStreamer had support for yadif but not for the ASM optimisations, so I had previously adapted them from FFmpeg, which had both. Now, two years later, I had found a small bug: In some video, which involved some static graphics overlaid upon fast-moving content, the lines were jumping up and down (“bobbing”). However, that happened only in GStreamer’s ASM optimisations. Not in the plain C code. Not in FFmpeg’s ASM optimisations.

Having to look over some ASM code that you wrote two years ago already requires a significant amount of bravery. To make things easier, though, I had an equivalent C implementation, a reference ASM code that worked fine, and a lot of comments in my own code. Or at least I thought that these things would make it easier.

So I started looking at the implementation, remembering what I had done, checking if it does what the comments say it does, checking if the end result corresponds to the C implementation, and also comparing it with FFmpeg’s code. I started double-checking and triple-checking everything. It was all correct.

All my differences were in entry points. FFmpeg has fewer input parameters and then calculates some intermediate values, such as the value of the same pixel on the previous line, or on the previous frame. However, GStreamer calculates those in the deinterlacer base class, so I had more input parameters. I thought that maybe one of those was used improperly, but they were all correct. So I was really at a loss.

What do we do when we have no idea what’s wrong with the code? Change the functionality of random parts and see what breaks. By doing this, I slowly figured out that the value of some variable was too high in some special case (the diff parameter when mode is 0). However, there are many steps involved in that calculation, and nothing makes a lot of sense until the very end. Just to make sure, I quadruple-checked that part of the code. Nope, correct. I thought, maybe I’m accidentally messing with a register that’s needed later. Nope, I’m not.

At this point I decided to take a step back and look at the more inconspicuous parts of the code. While looking around, I noticed this macro:

%macro LOAD 2
    movh      %1, %2
    punpcklbw %1, m7
%endmacro

This loads some value into a register and then interleaves it with zeroes. We are adding pixel values, so we need to make sure that the carry doesn’t accidentally spill over into neighbouring pixels. This makes the assumption that m7 is zero. Indeed, I remember this being set to zero early on. But let’s make sure…

LOAD         m7, [bzeroq]

Ah-HA! The end effect would be that the value of m7 is interleaved with itself instead of with zeroes. That was indeed only in the mode==0 special case, and directly influenced the result of the diff parameter.

bzeroq is one of those entry-point parameters, that FFmpeg calculates in the ASM code but GStreamer gets as input. FFmpeg calculates that value earlier on, uses it once, then puts it into the stack for later. I had decided, no need to go via the stack when I can load it directly. Turns out… I can’t.

Going via the stack like FFmpeg did solve my bug.

https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/1816/diffs?commit_id=fbeecb9e5567b5822e93ea50fd28f820cf7bbdaf