QwenLM/online_merging_optimizers
Implementations of the online merging optimizers proposed in the paper "Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment".
Stars: 81
Language: Python