Prioritized training on points that are learnable, worth learning, and not yet learned

Mindermann, Sören, Razzak, Muhammed, Xu, Winnie, *Kirsch, Andreas*, Sharma, Mrinank, Morisot, Adrien, Gomez, Aidan N., Farquhar, Sebastian, Brauner, Jan, and Gal, Yarin
*In SubSetML: Subset Selection in Machine Learning: From Theory to Practice (ICML Workshop)*
2021

Information theory is important to machine learning, but the notation for information-theoretic quantities is sometimes opaque. The right notation can convey valuable intuitions and concisely express new ideas. We propose such a notation for machine learning users and expand it to include information-theoretic quantities between events (outcomes) and random variables. We apply this notation to a popular information-theoretic acquisition function in Bayesian active learning, which selects the most informative (unlabelled) samples to be labelled by an expert. We demonstrate the value of our notation when extending the acquisition function to the core-set problem, which consists of selecting the most informative samples *given* the labels.