One of the most tedious things in life is converting unrolled, unmacro'd FFT code to use butterfly and complex multiplication macros, just in case you ever want to integerize it and don't want to maintain two transform codebases.
This must be exactly what threading core rope ROMs would have been like.
Spending hours converting a mere 16 lines of complex multiplies to use macros, only to make everything undetectably inaccurate except in the one corner case you didn't test for.
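
For anyone wondering what that looks like, here is a rough sketch of the idea, not the actual code being converted. The names (`fft_t`, `FFT_FIXED`, `MULSUB`, `MULADD`, `CMUL`, `BF`, `fft_macros_selfcheck`) are made up for illustration, and the Q31 format and rounding are assumptions. The temporaries are there so that in-place use (output aliasing an input) doesn't quietly corrupt the result.

```c
#include <assert.h>
#include <stdint.h>

#ifdef FFT_FIXED                      /* hypothetical build switch */
typedef int32_t fft_t;                /* assumed Q31 fixed point */
/* Rounded Q31 sum/difference of two products. Assumes >> on a negative
 * int64_t is an arithmetic shift and that inputs stay clear of INT32_MIN. */
#define MULSUB(a, b, c, d) \
    ((fft_t)(((int64_t)(a) * (b) - (int64_t)(c) * (d) + (1LL << 30)) >> 31))
#define MULADD(a, b, c, d) \
    ((fft_t)(((int64_t)(a) * (b) + (int64_t)(c) * (d) + (1LL << 30)) >> 31))
#else
typedef float fft_t;
#define MULSUB(a, b, c, d) ((a) * (b) - (c) * (d))
#define MULADD(a, b, c, d) ((a) * (b) + (c) * (d))
#endif

/* (dre + i*dim) = (are + i*aim) * (bre + i*bim).
 * Both parts are computed before either output is written. */
#define CMUL(dre, dim, are, aim, bre, bim) do {    \
        fft_t re_ = MULSUB(are, bre, aim, bim);    \
        fft_t im_ = MULADD(are, bim, aim, bre);    \
        (dre) = re_;                               \
        (dim) = im_;                               \
    } while (0)

/* Butterfly: x = a - b, y = a + b, safe when x or y aliases a or b. */
#define BF(x, y, a, b) do {                        \
        fft_t d_ = (a) - (b);                      \
        fft_t s_ = (a) + (b);                      \
        (x) = d_;                                  \
        (y) = s_;                                  \
    } while (0)

/* In-place butterfly check; small integers work in both builds. */
static void fft_macros_selfcheck(void)
{
    fft_t a = 5, b = 3;
    BF(a, b, a, b);
    assert(a == 2 && b == 8);
}
```

Without the temporaries, `CMUL(re, im, re, im, ...)` would overwrite `re` before the imaginary part reads it, which is exactly the kind of bug that only shows up in the one call site you didn't test.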
