# research blog

### Unifying Approaches in Active Learning and Active Sampling

Our paper “Unifying Approaches in Active Learning and Active Sampling via Fisher Information and Information-Theoretic Quantities” was recently published in TMLR.

### Assessing Generalization via Disagreement

Our paper “A Note on ‘Assessing Generalization of SGD via Disagreement’” was published in TMLR this week and serves as both a short reproduction and a review note. It engages with the claims in “Assessing Generalization of SGD via Disagreement” by Jiang et al. (2022), which received an ICLR 2022 spotlight. We would like to thank the authors for constructively engaging with our note on OpenReview.

### Stirling's Approximation for Binomial Coefficients

On page 2 of MacKay (2003), the following straightforward approximation for a binomial coefficient is introduced: $$\log \binom{N}{r} \simeq (N-r) \log \frac{N}{N-r} + r \log \frac{N}{r}.$$ The derivation in the book is short but not very intuitive, although it feels like it should be. Information theory would be the likely candidate to provide such an intuition. But information-theoretic quantities like entropies only apply to random variables, not to fixed observations. Or do they?
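As a quick sanity check, the approximation can be compared against the exact value for moderately large $N$. The sketch below is mine, not MacKay's; the helper names are illustrative:

```python
from math import comb, log

def log_binom_exact(N, r):
    # Exact natural log of the binomial coefficient via math.comb.
    return log(comb(N, r))

def log_binom_stirling(N, r):
    # MacKay's approximation:
    # log C(N, r) ≈ (N - r) log(N / (N - r)) + r log(N / r),
    # which equals N * H(r / N) with H the entropy in nats.
    return (N - r) * log(N / (N - r)) + r * log(N / r)

N, r = 1000, 300
exact = log_binom_exact(N, r)
approx = log_binom_stirling(N, r)
print(exact, approx)  # the two agree to within about 1% for these values
```

The approximation is an upper bound (the familiar $\binom{N}{r} \le 2^{N H(r/N)}$), and the gap shrinks relative to the total as $N$ grows.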

### Better intuition for information theory

The following blog post is based on Yeung’s beautiful paper “A new outlook on Shannon’s information measures”: it shows how we can use concepts from set theory, like unions, intersections, and differences, to capture information-theoretic expressions in a form that is both intuitive and correct.

The paper shows that one can indeed construct a signed measure that consistently maps the sets we intuitively draw to their information-theoretic counterparts.

This can help develop new intuitions and insights when solving problems using information theory and inform new research. In particular, our paper “BatchBALD: Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning” was informed by such insights.

### MNIST by zip

tl;dr: We can use compression algorithms (like the well-known zip file compression) for machine learning, specifically for classifying handwritten digits (MNIST). Code available: https://github.com/BlackHC/mnist_by_zip.
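The core idea can be sketched in a few lines: a sample likely belongs to the class whose training data compresses it best, i.e. whose compressed size grows least when the sample is appended. This is a minimal toy sketch of the principle using `zlib`, not the repo's actual implementation; the function names and toy data are mine:

```python
import zlib

def compressed_size(data: bytes) -> int:
    # Length of the DEFLATE-compressed data at maximum compression.
    return len(zlib.compress(data, 9))

def classify(sample: bytes, class_corpora: dict) -> str:
    # For each class, measure how much the compressed size grows when
    # the sample is appended to that class's corpus. The smallest
    # increase means the sample shares the most structure with that class.
    def cost(label):
        corpus = class_corpora[label]
        return compressed_size(corpus + sample) - compressed_size(corpus)
    return min(class_corpora, key=cost)

# Toy demo with byte strings standing in for serialized pixel data.
corpora = {
    "a": b"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "b": b"bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb",
}
print(classify(b"aaaa", corpora))
```

For real MNIST images one would concatenate serialized training images per digit class and classify test images the same way; accuracy is far below a neural network's, but it is a striking demonstration that compression is a form of learning.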