Chemical space sampling with novel active learning cuts training data cost by an order of magnitude

Vivin Vinod and Peter Zaspel have developed a faster, more efficient way to train machine learning models on complex chemical data. In their new paper, “LFaB: Low Fidelity as Bias for Active Learning in the Chemical Configuration Space,” published in the Journal of Chemical Theory and Computation, they introduce a bias-based active learning strategy. The method uses a lower-fidelity output as a proxy for model bias when choosing which samples to add to the training set, and it substantially outperforms both traditional variance-based active learning and standard random sampling. LFaB is also highly adaptable, proving effective across a wide range of chemical properties, including excitation energies.
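
To make the general idea concrete, the following toy sketch shows what a bias-style selection rule could look like. It is not the algorithm from the paper: the one-dimensional "configuration space", the functions f_high and f_low, the kernel-weighted-average model, and the rule "pick the pool point where the model disagrees most with the cheap value" are all assumptions made purely for illustration.

```python
import math
import random


def f_high(x):
    """Hypothetical expensive reference property (toy stand-in)."""
    return math.sin(6.0 * x) + 0.3 * x


def f_low(x):
    """Hypothetical cheap, less accurate approximation of f_high (toy stand-in)."""
    return 0.8 * math.sin(6.0 * x)


def predict(x, train_x, train_y, width=0.05):
    """Toy model: Gaussian kernel-weighted average of the training labels."""
    weights = [math.exp(-((x - t) ** 2) / (2.0 * width ** 2)) for t in train_x]
    total = sum(weights)
    if total < 1e-300:
        # Numerically far from every training point: fall back to the label mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def select_next(pool, train_x, train_y):
    """Assumed acquisition rule: largest gap between model and cheap value."""
    return max(pool, key=lambda x: abs(predict(x, train_x, train_y) - f_low(x)))


def run_active_learning(n_start=3, n_queries=12, seed=0):
    """Grow a training set one point at a time using the rule above."""
    rng = random.Random(seed)
    pool = [i / 199 for i in range(200)]
    train_x = rng.sample(pool, n_start)
    pool = [x for x in pool if x not in train_x]
    train_y = [f_high(x) for x in train_x]
    for _ in range(n_queries):
        x_new = select_next(pool, train_x, train_y)
        pool.remove(x_new)
        train_x.append(x_new)
        # Only the selected point is labelled with the expensive function.
        train_y.append(f_high(x_new))
    return train_x, train_y


if __name__ == "__main__":
    xs, ys = run_active_learning()
    print(f"labelled {len(xs)} points with the expensive function")
```

In this sketch the cheap value is evaluated for every candidate in the pool, while the expensive one is computed only for the points actually selected. That asymmetry is what would make a low-fidelity signal attractive as a selection criterion.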