IBM/ModuleFormer

ModuleFormer is a Mixture-of-Experts (MoE) architecture with two types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language Models (MoLM) ranging in scale from 4 billion to 8 billion parameters.
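
To illustrate the general idea behind feedforward experts in a sparse MoE layer, here is a minimal sketch of top-k routing in plain Python. It is a generic illustration and not ModuleFormer's actual code: the function names (`route_top_k`, `moe_forward`), the toy scalar experts, and the router scores are all hypothetical, and the choice of top-k routing with softmax over the selected scores is an assumption rather than something stated in the description above.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    # Renormalize over the selected experts only (an assumed design choice).
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the outputs of the selected experts; the rest are skipped."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))

# Hypothetical toy experts: scalar functions standing in for feedforward blocks.
experts = [lambda x: 2.0 * x, lambda x: x + 1.0, lambda x: -x, lambda x: x * x]
router_logits = [0.1, 2.0, -1.0, 2.0]  # made-up router scores for one token

selected = route_top_k(router_logits, k=2)
output = moe_forward(3.0, experts, router_logits, k=2)
print(selected, output)
```

Only the two selected experts are evaluated for the token, which is what keeps the compute cost of a sparse MoE layer below that of running every expert.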

Stars: 226
Language: Python