Optimized tensor permutation

Just a memo at this point.

No code is available, but they claimed:

> Compared to established frameworks like HPTT [27], NumPy, and PyTorch, GenTT demonstrates remarkable speedups of up to 38× in specialized cases and 5× in general scenarios.

@c.groth @terasakisatoshi @SatoshiMorita @yjkao


Among AI-generated papers, this one is not of the best quality. Figures 6 and 9 are still not readable. I could generate better ones in 5 hours ;D

The error bars in Figure 9 are very large. Is the resolution of Figure 5 low?
The reported speedup is great, but I'd like to verify it myself; unfortunately, the code does not seem to be available.

Very good point. It is also difficult for me to understand the plots.


For Fig. 6, the inset y-axis runs from 0 to 60, which is not consistent with the region quoted. It is also unclear what the speedup is measured against.

Fig. 9 is even worse. The main text implies that the speedup is w.r.t. NumPy, so NumPy shows a 16× speedup w.r.t. NumPy.

It looks like an AI-generated fake paper that lacks coherence.

OK, let us ignore the paper. Thank you for your “analysis”!

I was originally interested in implementing a pure Rust library for tensor permutations.
We could follow the strategy of HPTT or Strided.jl.
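As a starting point, a naive out-of-place permutation over a flat row-major buffer might look like the sketch below. The function names and the per-element index arithmetic are illustrative only; HPTT and Strided.jl build on this idea with cache blocking and vectorization, which this sketch does not attempt:

```rust
/// Row-major strides for a given shape, e.g. shape [2, 3] -> strides [3, 1].
fn strides(shape: &[usize]) -> Vec<usize> {
    let mut s = vec![1; shape.len()];
    for i in (0..shape.len().saturating_sub(1)).rev() {
        s[i] = s[i + 1] * shape[i + 1];
    }
    s
}

/// Out-of-place axis permutation of a row-major tensor.
/// `perm[d]` names the source axis that becomes output axis `d`,
/// so output axis d has extent shape[perm[d]].
fn permute(input: &[f64], shape: &[usize], perm: &[usize]) -> (Vec<f64>, Vec<usize>) {
    let out_shape: Vec<usize> = perm.iter().map(|&p| shape[p]).collect();
    let in_strides = strides(shape);
    let out_strides = strides(&out_shape);
    let mut out = vec![0.0; input.len()];
    for (lin, &v) in input.iter().enumerate() {
        // Decode the multi-index from the input's linear offset,
        // then re-encode it with the output's strides.
        let mut off = 0;
        for (d, &p) in perm.iter().enumerate() {
            let idx = (lin / in_strides[p]) % shape[p];
            off += idx * out_strides[d];
        }
        out[off] = v;
    }
    (out, out_shape)
}

fn main() {
    // Transpose a 2x3 matrix stored row-major: out[j, i] = in[i, j].
    let input = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0];
    let (out, out_shape) = permute(&input, &[2, 3], &[1, 0]);
    println!("{:?} {:?}", out_shape, out);
}
```

The division/modulo per element is what the optimized libraries avoid: HPTT instead fixes a small tile of the two innermost (input- and output-contiguous) axes and moves it with dense, vectorizable loops, which would be the natural next step for a Rust port.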