This proof (more like an interesting justification) is taken from page 2 of (MacKay 2003).
$$ P(r \mid \lambda) = \frac{e^{-\lambda}\lambda^{r}}{r!} $$
For large $\lambda$, the Poisson distribution is well approximated by a Gaussian with mean $\lambda$ and variance $\lambda$; evaluating both at the mode $r = \lambda$, where the Gaussian density equals $\frac{1}{\sqrt{ 2\pi\lambda }}$, gives
$$ e^{-\lambda} \frac{\lambda^{\lambda}}{\lambda!} \approx \frac{1}{\sqrt{ 2\pi \lambda }} \implies \lambda! \approx \lambda^{\lambda}e^{-\lambda}\sqrt{ 2\pi \lambda }, $$
which finishes the derivation of the approximation.
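As a quick numerical sanity check, here is a minimal Python sketch (mine, not from MacKay); `math.lgamma` gives the exact $\ln \lambda!$ without overflow:

```python
import math

# Compare the exact ln(lambda!) against the log of Stirling's approximation:
# lambda! ~ lambda^lambda * e^(-lambda) * sqrt(2*pi*lambda)
for lam in (5, 50, 500):
    exact = math.lgamma(lam + 1)  # ln(lambda!)
    stirling = lam * math.log(lam) - lam + 0.5 * math.log(2 * math.pi * lam)
    print(f"lambda={lam:4d}  exact={exact:12.4f}  stirling={stirling:12.4f}  "
          f"diff={exact - stirling:.2e}")
```

The gap shrinks like $\frac{1}{12\lambda}$, which is the next term of the full Stirling series.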
Approximation of the binomial
A quick derivation with Stirling's approximation gives a nice approximation for the log of the binomial coefficient:
$$ \ln \binom{N}{r} \equiv \ln \frac{N!}{(N - r)!\, r!} \approx \ln \frac{N^{N} \sqrt{ 2\pi N }}{(N - r)^{N - r} \sqrt{ 2\pi (N - r) }\, r^{r} \sqrt{ 2\pi r }} $$
(the factors $e^{-N}$, $e^{-(N-r)}$ and $e^{-r}$ cancel)
$$ = N\ln N + \frac{1}{2}\ln(2\pi N) - (N - r)\ln(N - r) - r\ln r - \frac{1}{2}\ln(4\pi^{2}(N - r)r) $$
$$ = (N - r) \ln\left( \frac{N}{N - r} \right) + r \ln \left( \frac{N}{r} \right) - \frac{1}{2}\ln\left( 2\pi N \frac{N-r}{N} \frac{r}{N} \right). $$
With the binary entropy function
$$ H_{2}(x) = x \ln\left( \frac{1}{x} \right) + (1- x)\ln\left( \frac{1}{1 - x} \right) $$
and $x = \frac{r}{N}$, the first two terms are exactly $N H_{2}(x)$, so
$$ \ln \binom{N}{r} \approx N H_{2}\left( \frac{r}{N} \right) - \frac{1}{2}\ln\left( 2\pi N \frac{N-r}{N} \frac{r}{N} \right), $$
i.e. the leading term grows linearly in $N$ while the correction grows only like $\ln N$ (equivalently, a $\frac{1}{\sqrt{ N }}$ prefactor on the binomial coefficient itself once you exponentiate).
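The same kind of check works for the entropy approximation. A minimal Python sketch (the helper names `h2` and `log_binom_exact` are mine, for illustration only):

```python
import math

def h2(x):
    """Binary entropy in nats: x ln(1/x) + (1-x) ln(1/(1-x))."""
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def log_binom_exact(n, r):
    """Exact ln C(n, r) via log-gamma, avoiding huge factorials."""
    return math.lgamma(n + 1) - math.lgamma(r + 1) - math.lgamma(n - r + 1)

for n, r in [(20, 5), (100, 30), (1000, 400)]:
    x = r / n
    # N * H2(r/N) minus the (1/2) ln(2 pi N x (1-x)) correction term
    approx = n * h2(x) - 0.5 * math.log(2 * math.pi * n * x * (1 - x))
    exact = log_binom_exact(n, r)
    print(f"N={n:5d} r={r:4d}  exact={exact:10.4f}  approx={approx:10.4f}  "
          f"diff={exact - approx:.2e}")
```

Even at $N = 20$ the two agree to a couple of decimal places, and the error keeps shrinking as $N$ grows.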
References
[1] D. J. C. MacKay, "Information Theory, Inference and Learning Algorithms", Cambridge University Press, 2003.