Business Applications and Limitations of Analytical Credit Scoring

Contributed by: Bart Baesens

This article first appeared in Data Science Briefings, the DataMiningApps newsletter.


Throughout the past few decades, banks have gathered plenty of information describing the default behavior of their customers.  Examples are historical information about a customer’s date of birth, gender, income, employment status, etc.  All this data has been stored in large (e.g., relational) databases or data warehouses.  On top of this, banks have accumulated a lot of business experience with their credit products.  As an example, many credit experts do a pretty good job of discriminating between low-risk and high-risk mortgages using their business expertise only.  The aim of credit scoring is to analyze both sources of information in more detail and come up with an analytical score reflecting the creditworthiness of a customer.  Two types of credit scoring can be distinguished: application scoring and behavioral scoring.

The purpose of application scoring is to come up with a credit score which reflects the default risk of a customer at the moment of loan application.  This is a very important scoring mechanism as it will help decide whether the credit should be accepted or rejected.  In order to build an application scorecard, one first needs to define the concept of default.  It could be based upon profit, amount owed, negative net present value or number of months in payment arrears.  With the introduction of the Basel capital accords, the default definition has now been set to 90 days in payment arrears.  Note however that, in some countries, this definition has been overruled.  In the United States, for example, in retail credit for residential mortgages, the default definition is 180 days, for qualifying revolving exposures it’s also 180 days, and for other retail exposures, it’s 120 days.  Once the default definition has been fixed, the next step is to collect the variables that can be used to predict default.  Two different types of information can be distinguished: application variables and bureau variables.
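The day-count definitions above can be expressed as a simple lookup.  The following is a minimal Python sketch, not a regulatory implementation: the dictionary keys and the function name are hypothetical labels chosen for illustration, and treating "more than N days" as the trigger is an assumption about the boundary case.

```python
# Days-in-arrears thresholds as quoted in the paragraph above.
# The keys are illustrative labels, not official regulatory terms.
DEFAULT_THRESHOLD_DAYS = {
    "basel": 90,
    "us_residential_mortgage": 180,
    "us_qualifying_revolving": 180,
    "us_other_retail": 120,
}


def is_default(days_in_arrears, exposure_type="basel"):
    """Return True when arrears exceed the threshold for the exposure type.

    Assumption: default is triggered strictly after the threshold
    (e.g., day 91 under the 90-day definition).
    """
    return days_in_arrears > DEFAULT_THRESHOLD_DAYS[exposure_type]
```

For example, a loan that is 150 days in arrears would be flagged under the 120-day definition but not under the 180-day one.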

Application variables are provided to the bank by the applicant upon loan application.  Popular examples are: age, gender, marital status, income, time at residence, time at employment, time in industry, first digit of postal code, residential status, employment status, lifestyle code, existing client (Y/N), number of years as client, number of products internally, total liabilities, total debt, total debt service ratio, gross debt service ratio, revolving debt/total debt, and number of credit cards.  All these variables are internally available to the bank.  They can be complemented by bureau variables.  Bureau variables are obtained from credit bureaus which are external to the bank.  A credit bureau is an organization that assembles and aggregates credit information from various financial institutions or banks.  It can collect positive credit information, negative credit information, or both, depending upon the country in which it operates.  Usually, credit bureaus provide two sources of information.  The first is raw bureau data such as: number of previous delinquencies, total amount of credit outstanding, previous delinquency history, time at credit bureau, total credit bureau inquiries, time since last credit bureau inquiry, inquiries in the last 3/6/12 months, etc.  The second is bureau credit scores, which the credit bureaus build from this raw bureau data.  These bureau scores can be sold to banks, which can then use them in their application scoring models.  Credit bureaus are all around these days.  In the US, popular bureaus are Experian, Equifax and TransUnion which each cover their own geographical region.  All three provide a FICO score which ranges between 300 and 850, with higher scores reflecting better credit quality.

Behavioral scoring is another statistical credit scoring approach which analyses the behavior of already acquired credit customers.  Behavioral scoring models are typically constructed using a 24 month time frame.  12 months are taken to measure and quantify all the information which will be used as predictors, and a subsequent 12 months to determine the default status.  Behavioral scoring is dynamic since it summarizes the behavior into various dynamic variables such as average checking account balance, maximum checking account balance, trend in checking account balance, delinquency history, etc.
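The 24-month set-up described above can be sketched in Python as follows.  This is a minimal illustration under stated assumptions: the function names, the input layout (24 monthly values, oldest first), the use of a least-squares slope as the "trend", and the 90-day default threshold are all choices made here for illustration rather than a prescribed method.

```python
from statistics import mean


def behavioral_features(balances):
    """Summarize 12 month-end checking balances (oldest first)."""
    if len(balances) != 12:
        raise ValueError("expected 12 monthly balances")
    months = list(range(12))
    month_mean = mean(months)
    balance_mean = mean(balances)
    # Ordinary least-squares slope of balance on month index = "trend".
    covariance = sum((m - month_mean) * (b - balance_mean)
                     for m, b in zip(months, balances))
    variance = sum((m - month_mean) ** 2 for m in months)
    return {
        "avg_balance": balance_mean,
        "max_balance": max(balances),
        "balance_trend": covariance / variance,
    }


def build_training_row(balances, arrears_days, default_threshold=90):
    """Build one modelling row from 24 months of data (oldest first).

    Months 1-12 form the observation window (predictors);
    months 13-24 form the performance window (default status).
    """
    if len(balances) != 24 or len(arrears_days) != 24:
        raise ValueError("expected 24 monthly values per series")
    row = behavioral_features(balances[:12])
    # Assumption: default means arrears exceeding the threshold at any
    # point during the performance window.
    row["default"] = any(d > default_threshold for d in arrears_days[12:])
    return row
```

Keeping the two windows strictly separate ensures that the predictors are measured before the outcome they are meant to predict.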

As already mentioned, the most important usage of application scores is to decide upon loan approval.  They can also be used for pricing purposes.  Risk-based pricing sets the price or other characteristics (e.g., loan term, collateral) of the loan based upon the perceived risk as measured by the application score.  A lower score will imply a higher interest rate.  Hence, subprime loans (e.g., having a FICO score of less than 620) will come with higher rates and fees.

Behavioral scores can be used for various business purposes.  First, they can be used for marketing applications.  The behavioral scores can be segmented and each of the segments can then be individually approached with targeted mailings.  Another usage is for up-, down- or cross-selling.  Up-selling means that you want to sell more of the same product.  Think about credit cards or credit lines for example.  In case of a good behavioral score and thus low credit risk, the bank may consider increasing the credit limit, thereby generating more revenue.  Down-selling means selling less of the same product.  So, in case of a bad behavioral score, the bank may consider mitigating its potential loss by lowering the credit limit.  Finally, cross-selling means selling other products.  For example, if the customer has a good behavioral score on his/her mortgage, the bank may try to sell some additional insurance products.

Although the idea of using behavioral scores to set credit limits sounds reasonable, there has been some debate in the literature about whether this is appropriate or not.  In their book, Credit Scoring and Its Applications (2002), Thomas et al. argue that one should be careful when using behavioral scores for limit setting.  Their reasoning goes as follows.  A behavioral credit score is calculated under a given operating policy and credit limit.  Hence, using the behavioral score to change the credit limit changes the very conditions under which the score was estimated, which undermines its validity.  To further illustrate this, they came up with an analogy.  Suppose that only those people who have no or few accidents when they drive a car at 30 mph in the towns should be allowed to drive at 70 mph on the motorways.  Clearly, this is not sound reasoning since other skills may be required to drive faster.  Similarly, a good behavioral score under a low credit limit does not guarantee that the score will remain good under a high credit limit, since other characteristics or skills might be needed to manage accounts with large credit limits.  However, despite this argument, behavioral scores are commonly used in the industry to manage credit limits.  Typically, the behavioral score will be categorized into bands, whereby each band will correspond to a specific credit limit.
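The banding approach mentioned above amounts to a lookup from score to limit.  A minimal Python sketch follows; the band edges and credit limits are hypothetical numbers chosen purely for illustration, and in practice they would be set by the bank's own policy.

```python
from bisect import bisect_right

# Hypothetical band boundaries and limits, for illustration only.
# A score below 500 falls in the first band, 500-599 in the second, etc.
BAND_LOWER_EDGES = [500, 600, 700]
CREDIT_LIMITS = [0, 1000, 2500, 5000]


def credit_limit_for(score):
    """Map a behavioral score to the credit limit of its band."""
    # bisect_right counts how many lower edges the score has reached,
    # which is exactly the index of the band the score belongs to.
    return CREDIT_LIMITS[bisect_right(BAND_LOWER_EDGES, score)]
```

Using one more limit than there are edges guarantees that every score, however low or high, falls into exactly one band.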

Behavioral scores can also be used to authorize accounts to go in excess of their credit limit.  A gradually decreasing behavioral score could be an early warning signal for looming credit problems, which can be very useful information from a pro-active debt collection perspective.  It allows the bank to develop a collection strategy by working out actions that might prevent default.

Both application and behavioral scores will also be used for risk management in a Basel II/III context.  More specifically, they will be used as key inputs to estimate the default rate on a loan portfolio, which will then be used to calculate the expected losses (covered by provisions) and unexpected losses (covered by capital).  They can also be helpful for securitization purposes by slicing and dicing a credit portfolio into tranches with similar risk.

Besides financial institutions, other organizations can also use credit scores to support their business decisions.  For example, electricity and telecom companies can use credit scores in their pricing or contracting policies.  Employers can use them to get a better idea of the profile of job applicants, whereas landlords can get a better idea about the solvency of their future renters.  Insurance companies can use credit scores to set insurance premiums or to decide whose insurance applications to accept.  Note that some of these applications are controversial and subject to debate.

The widespread use of both application and behavioral scorecards has made them a key decision support tool in modern risk measurement and management.  However, both scoring methods do face a number of limitations.

A first limitation concerns the data that is used to estimate credit scoring models.  Since data are the major, and in most cases the only, ingredient to build these models, their quality and predictive ability are key to the models’ success.  The quality of the data refers, e.g., to the number of missing values and outliers, and the recency and representativeness of the data.  Data quality issues can be difficult to detect without specific domain knowledge, but have an important impact on the scorecard development and resulting risk measures.  The availability of high-quality data is a very important prerequisite for building good credit scoring models.  However, the data need not only be of high quality, but should be predictive as well, in the sense that the captured characteristics are related to the customer defaulting or not.  Before constructing a scorecard, we need to thoroughly reflect on why a customer defaults and which characteristics could potentially be related to this.  Customers may default because of unknown reasons or information not available to the financial institution, thereby posing another limitation to the performance of credit scoring models.  The statistical techniques used in developing credit scoring models typically assume a data set of sufficient size containing enough defaults.  This may not always be the case for specific types of portfolios where only limited data is available, or only a low number of defaults is observed.  For these types of portfolios, one may have to rely on alternative risk assessment methods such as expert judgment.

Financial institutions should also be aware that scorecards have only a limited lifetime.  The populations on which they were estimated will typically vary throughout time because of changing economic conditions or new strategic actions (e.g., new customer segments targeted, new credit products introduced) undertaken by the bank.  This is often referred to as population drift and will require the financial institution to rebuild its scorecards if the default risk in the new population differs substantially from that in the population that was used to build the old scorecard.
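One common way to quantify such drift, not discussed in this article but widely used in scorecard monitoring, is the population stability index (PSI), which compares the share of customers per score band at development time with the share observed today.  The Python sketch below is a minimal illustration; the function name, the input layout (counts per band) and the small floor used to avoid taking the logarithm of zero are assumptions made here.

```python
import math


def population_stability_index(expected_counts, actual_counts, floor=1e-6):
    """PSI between development-time and current score-band counts.

    PSI = sum over bands of (actual% - expected%) * ln(actual% / expected%).
    """
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both populations must use the same score bands")
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    psi = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        # Floor the shares so that empty bands do not cause log(0).
        expected_share = max(expected / expected_total, floor)
        actual_share = max(actual / actual_total, floor)
        psi += (actual_share - expected_share) * math.log(
            actual_share / expected_share)
    return psi
```

A value of zero means the two distributions are identical; larger values indicate a stronger shift and may signal that a rebuild should be considered.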

Many credit bureaus have nowadays started disclosing how their bureau scores (e.g., FICO scores) are computed in order to encourage customers to improve their financial profile, and hence increase their success in getting credit.  Since this gives customers the tools to polish up their scores and make them look “good” in future credit applications, this may trigger new types of default risk (and fraud), thereby invalidating the original scorecard and necessitating more frequent rebuilds.

Introducing credit scoring into an organization requires serious investments in information and communication technology (ICT, hardware and software), personnel training and support facilities.  The total cost needs to be carefully considered beforehand and compared with future benefits, which may be hard to quantify.

A final criticism concerns the fact that most credit scoring systems only model default risk, i.e., the risk that a customer runs into payment arrears on one of his/her financial obligations.  Default risk is, however, only one type of credit risk.  Besides default risk, credit risk also entails recovery risk and exposure risk.
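The three risk types are commonly quantified as the probability of default (PD), the loss given default (LGD) and the exposure at default (EAD), which combine into the expected loss.  The Python sketch below illustrates that standard decomposition; the function name and the input checks are choices made here for illustration, and a credit score by itself only informs the PD term.

```python
def expected_loss(pd, lgd, ead):
    """Expected loss = PD x LGD x EAD.

    pd  -- probability of default, between 0 and 1 (default risk)
    lgd -- loss given default as a fraction of exposure (recovery risk)
    ead -- exposure at default in currency units (exposure risk)
    """
    if not 0.0 <= pd <= 1.0:
        raise ValueError("pd must lie between 0 and 1")
    if not 0.0 <= lgd <= 1.0:
        raise ValueError("lgd must lie between 0 and 1")
    if ead < 0:
        raise ValueError("ead must be non-negative")
    return pd * lgd * ead
```

For instance, a loan with a 2% probability of default, a 45% loss given default and an exposure of 10,000 has an expected loss of about 90.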

Liked this article? Read more in: Baesens B., Roesch D., Scheule H., Credit Risk Analytics – Measurement Techniques, Applications and Examples in SAS, Wiley, 2016.