How can you interpret the coefficients of a logistic regression model?

Posted on December 19, 2016

By: Bart Baesens, Seppe vanden Broucke

This Q&A first appeared in Data Science Briefings, the DataMiningApps newsletter, as a “Free Tweet Consulting Experience”, where we answer a data science or analytics question of 140 characters maximum. Want to submit your own question? Just tweet us @DataMiningApps. Want to remain anonymous? Then send us a direct message and we’ll keep all your details private. Subscribe now for free if you want to be the first to receive our articles and stay up to date on data science news, or follow us @DataMiningApps.

You asked: How can you interpret the coefficients of a logistic regression model?

Our answer:

Logistic regression estimates the following model:

$$P(Y = 1 \mid X_1, \ldots, X_N) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \cdots + \beta_N X_N)}}$$

where Y corresponds to fraud, default, churn, response, etc., and X₁, …, X_N are the predictors (e.g. age, income, etc.). This can be reformulated in terms of the odds as follows:

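$$\frac{P(Y = 1 \mid X_1, \ldots, X_N)}{P(Y = 0 \mid X_1, \ldots, X_N)} = e^{\beta_0 + \beta_1 X_1 + \cdots + \beta_N X_N}$$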

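The log(odds) or logit then becomes:

$$\ln\left(\frac{P(Y = 1 \mid X_1, \ldots, X_N)}{P(Y = 0 \mid X_1, \ldots, X_N)}\right) = \beta_0 + \beta_1 X_1 + \cdots + \beta_N X_N$$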
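
In code, the logit is simply the linear predictor, and the probability and odds follow from it. The sketch below checks this numerically; the coefficient and predictor values are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical coefficients and inputs, for illustration only (not from the article).
beta_0 = -3.0            # intercept
betas = [0.05, 0.8]      # beta_1, beta_2
x = [40.0, 1.0]          # predictor values X_1, X_2

# Logit: the linear predictor beta_0 + beta_1*X_1 + ... + beta_N*X_N
logit = beta_0 + sum(b * xi for b, xi in zip(betas, x))

# Probability of Y = 1 from the logistic function
p = 1.0 / (1.0 + math.exp(-logit))

# Odds = P(Y = 1) / P(Y = 0); this should equal exp(logit)
odds = p / (1.0 - p)

print(f"logit = {logit:.4f}, P(Y=1) = {p:.4f}, odds = {odds:.4f}")

# The three formulations describe the same model.
assert math.isclose(odds, math.exp(logit))
assert math.isclose(math.log(odds), logit)
```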

To interpret a logistic regression model, one can calculate the odds ratio. Suppose variable X_i (e.g. age, income, etc.) increases by one unit while all other variables are kept constant (ceteris paribus). The new logit then becomes the old logit with β_i added. Likewise, the new odds become the old odds multiplied by e^(β_i). The latter represents the odds ratio, i.e. the multiplicative change in the odds when X_i increases by 1 (ceteris paribus). Hence,

β_i > 0 implies e^(β_i) > 1, so the odds and probability increase with X_i
β_i < 0 implies e^(β_i) < 1, so the odds and probability decrease with X_i
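
The odds-ratio interpretation can be verified with a small numeric sketch. The intercept, the coefficients and the predictor value below are hypothetical; only the relationship between the old and new odds matters.

```python
import math

def odds(logit):
    """Odds P(Y=1)/P(Y=0) implied by a logit value."""
    return math.exp(logit)

def probability(logit):
    """P(Y=1) implied by a logit value."""
    return 1.0 / (1.0 + math.exp(-logit))

def odds_ratio_for_unit_increase(beta_0, beta_i, x_i):
    """Ratio of the odds at x_i + 1 to the odds at x_i, other predictors fixed."""
    old_logit = beta_0 + beta_i * x_i
    new_logit = beta_0 + beta_i * (x_i + 1.0)
    return odds(new_logit) / odds(old_logit)

# Hypothetical values, for illustration only.
beta_0 = -3.0
x_i = 40.0

# Positive coefficient: odds ratio above 1, probability goes up.
beta_pos = 0.05
ratio_pos = odds_ratio_for_unit_increase(beta_0, beta_pos, x_i)
assert math.isclose(ratio_pos, math.exp(beta_pos))
assert ratio_pos > 1.0
assert probability(beta_0 + beta_pos * (x_i + 1.0)) > probability(beta_0 + beta_pos * x_i)

# Negative coefficient: odds ratio below 1, probability goes down.
beta_neg = -0.05
ratio_neg = odds_ratio_for_unit_increase(beta_0, beta_neg, x_i)
assert math.isclose(ratio_neg, math.exp(beta_neg))
assert ratio_neg < 1.0
assert probability(beta_0 + beta_neg * (x_i + 1.0)) < probability(beta_0 + beta_neg * x_i)

print(f"odds ratio (beta = {beta_pos}): {ratio_pos:.4f}")
print(f"odds ratio (beta = {beta_neg}): {ratio_neg:.4f}")
```

Note that the odds ratio does not depend on the starting value of X_i or on the intercept, whereas the change in probability does.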

Another way of interpreting a logistic regression model is by calculating the doubling amount. This represents the change in X_i required to double the odds of the primary outcome. Since a change of d units in X_i multiplies the odds by e^(β_i·d), setting this factor equal to 2 gives a doubling amount of log(2)/β_i for variable X_i, where log denotes the natural logarithm.
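
As a minimal sketch, the doubling amount can be computed and checked as follows. The function name `doubling_amount` and the coefficient value are hypothetical, introduced only for this example.

```python
import math

def doubling_amount(beta_i):
    """Change in X_i that doubles the odds, i.e. ln(2) / beta_i."""
    if beta_i == 0:
        raise ValueError("The odds do not depend on X_i when beta_i is 0.")
    return math.log(2) / beta_i

# Hypothetical coefficient, for illustration only.
beta_i = 0.05
d = doubling_amount(beta_i)

# Check: changing X_i by d multiplies the odds by exp(beta_i * d), which should be 2.
assert math.isclose(math.exp(beta_i * d), 2.0)

print(f"doubling amount for beta_i = {beta_i}: {d:.2f}")
```

With this coefficient, X_i has to increase by roughly 13.9 units to double the odds. For a negative β_i the result is negative, meaning X_i has to decrease by that amount.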