Abstract: Recently, the sparsely-gated Mixture-of-Experts (MoE) architecture has garnered significant attention. To benefit a wider audience, fine-tuning MoE models on more affordable clusters, which ...