News

Welcome to the official repository of RegMix, a new approach to optimizing data mixtures for large language model (LLM) pre-training! Join our Discord for more discussions!

mixture_config: Tools for ...
This project contains the code for the paper with the above title, accepted at NeurIPS 2020. The file RPCA.py contains an implementation of the algorithm and the simulations from the paper. This ...