A fast 12x12 IDCT has been developed to provide a 'djpeg -scale 3/2' option for
2:3 (8:12) image upscaling.
This is interesting because it is the first non-integer upsampling feature in the
IJG context.
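The 12-point DCT matrix, of which only the first 8 rows (one per input
coefficient) are needed, is:
col   0     1     2     3     4     5     6     7     8     9    10    11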
index
                                       :
  /  C6    C6    C6    C6    C6    C6  : C6    C6    C6    C6    C6    C6  \
  |                                    :                                   |
  |  C1    C3    C5    C7    C9    C11 :-C11  -C9   -C7   -C5   -C3   -C1  |
  |                                    :                                   |
  |  C2    C6    C10  -C10  -C6   -C2  :-C2   -C6   -C10   C10   C6    C2  |
  |                                    :                                   |
  |  C3    C9   -C9   -C3   -C3   -C9  : C9    C3    C3    C9   -C9   -C3  |
  |                                    :                                   |
  |  C4    0    -C4   -C4    0     C4  : C4    0    -C4   -C4    0     C4  |
  |                                    :                                   |
  |  C5   -C9   -C1   -C11   C3    C7  :-C7   -C3    C11   C1    C9   -C5  |
  |                                    :                                   |
  |  C6   -C6   -C6    C6    C6   -C6  :-C6    C6    C6   -C6   -C6    C6  |
  |                                    :                                   |
  |  C7   -C3   -C11   C1   -C9   -C5  : C5    C9   -C1    C11   C3   -C7  |
  |....................................:...................................|
  |                                    :                                   |
  where  Ck = cos(k*pi/24)
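The entries of this table can be regenerated mechanically, which is a handy
cross-check when copying signs by hand. The following is a minimal sketch, not
part of the original derivation; the names reduce_angle, label and
matches_cosine are illustrative. It folds cos(m*pi/24) back into the range
C0..C12 using the usual cosine symmetries.

```python
import math

SIZE = 12  # 12-point transform, so Ck = cos(k*pi/24)

def reduce_angle(m):
    """Map cos(m*pi/24) to (sign, k) with 0 <= k <= 12 and value sign*Ck."""
    m %= 4 * SIZE                 # cos(m*pi/24) has period 48 in m
    if m > 2 * SIZE:              # cos(2*pi - x) == cos(x)
        m = 4 * SIZE - m
    if m > SIZE:                  # cos(pi - x) == -cos(x)
        return -1, 2 * SIZE - m
    return 1, m

def label(row, col):
    """Symbolic matrix entry for frequency index `row`, sample index `col`."""
    if row == 0:
        return "C6"               # DC row carries the 1/sqrt(2) factor, i.e. C6
    sign, k = reduce_angle((2 * col + 1) * row)
    if k == SIZE:
        return "0"                # C12 = cos(pi/2) = 0
    return ("-" if sign < 0 else "") + f"C{k}"

def matches_cosine(row, col):
    """Numeric check that the folded entry equals cos((2*col+1)*row*pi/24)."""
    sign, k = reduce_angle((2 * col + 1) * row)
    direct = math.cos((2 * col + 1) * row * math.pi / (2 * SIZE))
    folded = sign * math.cos(k * math.pi / (2 * SIZE))
    return math.isclose(folded, direct, abs_tol=1e-12)

# Print the first 8 rows in the same order as the table.
for row in range(8):
    print(" ".join(f"{label(row, col):>4}" for col in range(SIZE)))
```

Row 3, for example, comes out as C3 C9 -C9 -C3 -C3 -C9 C9 C3 C3 C9 -C9 -C3;
odd rows are antisymmetric about the centre line and even rows symmetric.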
Now the IDCT is the transpose of the DCT, hence
col   0     1     2     3     4     5     6     7
index
  /  C6    C1    C2    C3    C4    C5    C6    C7   \
  |  C6    C3    C6    C9    0    -C9   -C6   -C3   |
  |  C6    C5    C10  -C9   -C4   -C1   -C6   -C11  |
  |  C6    C7   -C10  -C3   -C4   -C11   C6    C1   |
  |  C6    C9   -C6   -C3    0     C3    C6   -C9   |
  |  C6    C11  -C2   -C9    C4    C7   -C6   -C5   |
  |-------------------------------------------------|
  |  C6   -C11  -C2    C9    C4   -C7   -C6    C5   |
  |  C6   -C9   -C6    C3    0    -C3    C6    C9   |
  |  C6   -C7   -C10   C3   -C4    C11   C6   -C1   |
  |  C6   -C5    C10   C9   -C4    C1   -C6    C11  |
  |  C6   -C3    C6   -C9    0     C9   -C6    C3   |
  \  C6   -C1    C2   -C3    C4   -C5    C6   -C7   /
With ck = sqrt(2) * Ck and C6 = 1/sqrt(2) we get
col   0     1     2     3     4     5     6     7
index
  /  1     c1    c2    c3    c4    c5    1     c7   \
  |  1     c3    1     c9    0    -c9   -1    -c3   |
  |  1     c5    c10  -c9   -c4   -c1   -1    -c11  |
  |  1     c7   -c10  -c3   -c4   -c11   1     c1   |
  |  1     c9   -1    -c3    0     c3    1    -c9   |
  |  1     c11  -c2   -c9    c4    c7   -1    -c5   |
  |-------------------------------------------------|
  |  1    -c11  -c2    c9    c4   -c7   -1     c5   |
  |  1    -c9   -1     c3    0    -c3    1     c9   |
  |  1    -c7   -c10   c3   -c4    c11   1    -c1   |
  |  1    -c5    c10   c9   -c4    c1   -1     c11  |
  |  1    -c3    1    -c9    0     c9   -1     c3   |
  \  1    -c1    c2   -c3    c4   -c5    1    -c7   /
  where  c1  = 1.402114769
         c2  = 1.366025404
         c3  = 1.306562965
         c4  = 1.224744871
         c5  = 1.121971054
         [c6 = 1 not needed]
         c7  = 0.860918669
         [c8 not needed]
         c9  = 0.541196100
         c10 = 0.366025404
         c11 = 0.184591911
Even part optimization (columns 0, 2, 4, 6):
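A minimal sketch of one way to evaluate the even columns, under an assumed
grouping that is not taken from the text (all names are illustrative). It
relies on c10 = c2 - 1, which can be read off the constant list, so that only
two multiplications (by c2 and by c4) remain. The even columns are symmetric
about the dashed line, so only output rows 0..5 need to be computed.

```python
import math

def scaled_cos(k):
    """ck = sqrt(2) * cos(k*pi/24)."""
    return math.sqrt(2.0) * math.cos(k * math.pi / 24)

c2, c4, c10 = scaled_cos(2), scaled_cos(4), scaled_cos(10)

def even_part_direct(x0, x2, x4, x6):
    """Even-column contribution to output rows 0..5, read off the matrix."""
    return [
        x0 + c2 * x2 + c4 * x4 + x6,
        x0 + x2 - x6,
        x0 + c10 * x2 - c4 * x4 - x6,
        x0 - c10 * x2 - c4 * x4 + x6,
        x0 - x2 + x6,
        x0 - c2 * x2 + c4 * x4 - x6,
    ]

def even_part_fast(x0, x2, x4, x6):
    """Same values with two multiplications (assumed grouping)."""
    a = c4 * x4              # multiplication 1
    b = c2 * x2              # multiplication 2
    hi, lo = x0 + a, x0 - a  # x0 +/- c4*x4
    d = x2 - x6              # rows 1 and 4 need no multiplication
    s = b + x6               # c2*x2 + x6
    r = b - x2 - x6          # c10*x2 - x6, because c10 == c2 - 1
    return [hi + s, x0 + d, lo + r, lo - r, x0 - d, hi - s]
```

Rows 6..11 of the even part repeat rows 5..0, so no further work is needed
for them.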
Odd part optimization (columns 1, 3, 5, 7):
Rows 1 and 4 form a 'rotation' expression (see fast 4x4 IDCT derivation) which
spans the full odd columns block (1, 3, 5, 7). It is 'normalized' by
substituting x1 - x7 and x3 - x5, which leaves only the factors c3 and c9:
  row 1:  c3 * (x1 - x7) + c9 * (x3 - x5)
  row 4:  c9 * (x1 - x7) - c3 * (x3 - x5)
Such a rotation takes 3 multiplications.
Column 3 has just 2 multipliers (c3 and c9), so the products c3 * x3 and
c9 * x3 are computed once and reused in rows 0, 2, 3 and 5.
The remaining elements (columns 1, 5, 7 in rows 0, 2, 3, 5) can be reduced to
8 multiplications.
This gives us 3 (rotation) + 2 (column 3) + 8 = 13 multiplications in the odd
part calculation.
Note that the rotation with c3 and c9 is again the same as in the even part of
the 8x8 point LL&M IDCT algorithm.
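The count of 13 can be checked with a small sketch. The grouping of the 8
remaining multiplications below is one workable choice and is an assumption,
since the text does not spell it out; all function and constant names are
illustrative. The direct form reads the odd columns of rows 0..5 straight off
the matrix and serves as the reference.

```python
import math

def scaled_cos(k):
    """ck = sqrt(2) * cos(k*pi/24)."""
    return math.sqrt(2.0) * math.cos(k * math.pi / 24)

c1, c3, c5, c7, c9, c11 = (scaled_cos(k) for k in (1, 3, 5, 7, 9, 11))

def odd_part_direct(x1, x3, x5, x7):
    """Odd-column contribution to output rows 0..5, read off the matrix."""
    return [
        c1 * x1 + c3 * x3 + c5 * x5 + c7 * x7,
        c3 * x1 + c9 * x3 - c9 * x5 - c3 * x7,
        c5 * x1 - c9 * x3 - c1 * x5 - c11 * x7,
        c7 * x1 - c3 * x3 - c11 * x5 + c1 * x7,
        c9 * x1 - c3 * x3 + c3 * x5 - c9 * x7,
        c11 * x1 - c9 * x3 + c7 * x5 - c5 * x7,
    ]

# Constant combinations, precomputed once (assumed grouping).
K_C5_M_C7 = c5 - c7
K_C1_M_C5 = c1 - c5
K_C7_P_C11 = c7 + c11
K_X5 = c1 + c5 - c7 - c11
K_C1_P_C11 = c1 + c11
K_C7_M_C11 = c7 - c11
K_C5_P_C7 = c5 + c7
K_C3_M_C9 = c3 - c9
K_C3_P_C9 = c3 + c9

def odd_part_fast(x1, x3, x5, x7):
    """Same values with 13 multiplications: 2 + 8 + 3."""
    # Column 3: 2 multiplications, reused in rows 0, 2, 3, 5.
    a = c3 * x3
    b = c9 * x3
    # Remaining elements of rows 0, 2, 3, 5: 8 multiplications.
    s = x1 + x5
    t = c7 * (s + x7)
    u = t + K_C5_M_C7 * s
    v = K_C7_P_C11 * (x5 + x7)
    o0 = u + a + K_C1_M_C5 * x1
    o2 = u - v - b - K_X5 * x5
    o3 = t - v - a + K_C1_P_C11 * x7
    o5 = t - b - K_C7_M_C11 * x1 - K_C5_P_C7 * x7
    # Rotation for rows 1 and 4: 3 multiplications.
    p = x1 - x7
    q = x3 - x5
    r = c9 * (p + q)
    o1 = r + K_C3_M_C9 * p    # c3*p + c9*q
    o4 = r - K_C3_P_C9 * q    # c9*p - c3*q
    return [o0, o1, o2, o3, o4, o5]
```

Since the odd columns are antisymmetric about the dashed line, output row n
(n = 0..5) is even[n] + odd[n] and output row 11-n is even[n] - odd[n].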