Model Selection with AIC & BIC

Machine Learning Quick Reads
3 min read · Mar 10, 2021

AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) sound like twin brothers. What are the differences between them, and how can we apply them?

Photo by Andrew Seaman on Unsplash

Q: Which model is the best?

Is it the most complex model, one that fits the data perfectly and captures every single detail? Or should it be a simpler model that fits the big picture and omits unnecessary details?

The best model should be complex enough to capture the key details of the data, yet not so complex that it overfits.

A good model does not overfit, yet is flexible enough to be generalized.

A model that finds the right balance will generalize well. But how can we evaluate a model's complexity and goodness of fit numerically? AIC and BIC are tools we can use for this.

Akaike Information Criterion & Bayesian Information Criterion

AIC = 2k − 2ln(L)

BIC = k·ln(n) − 2ln(L)

where k, the number of parameters, captures the complexity of a model; ln(L), the maximized log-likelihood of the model on the data, captures the goodness of fit; and n is the number of data points.
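As a minimal sketch (not from the article), both criteria can be computed directly from k, ln(L), and n; the only difference is the penalty term:

```python
import math

def aic(k: int, log_l: float) -> float:
    # AIC = 2k - 2ln(L): the penalty grows linearly with parameter count
    return 2 * k - 2 * log_l

def bic(k: int, log_l: float, n: int) -> float:
    # BIC = k*ln(n) - 2ln(L): the penalty also grows with sample size,
    # so BIC punishes extra parameters harder once n > e^2 (~7.4)
    return k * math.log(n) - 2 * log_l

# Hypothetical model: 3 parameters, log-likelihood -120.5, 100 observations
print(aic(3, -120.5))       # 247.0
print(bic(3, -120.5, 100))  # slightly larger, since ln(100) > 2
```

Because ln(n) exceeds 2 for any dataset with more than about 7 points, BIC tends to favor simpler models than AIC does.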

A model with a lower AIC and BIC provides a better balance between goodness of fit and complexity, and is therefore preferred.


Written by Machine Learning Quick Reads

Lead Author: Yaokun Lin, Actuary | ML Practitioner | Apply Tomorrow's Technology to Solve Today's Problems
