This repo aims to provide a collection of efficient Triton-based implementations of state-of-the-art linear attention models. Pull requests are welcome!