An Economic Perspective on Fraud Analytics: Calculating ROI of Fraud Detection Systems

Posted on December 8, 2015

Contributed by: Bart Baesens, Véronique Van Vlasselaer, Wouter Verbeke

This article first appeared in Data Science Briefings, the DataMiningApps newsletter.

The importance of, and need for, effective fraud detection and prevention systems are highlighted by some recent figures that indicate the estimated size and financial impact of fraud:

A typical organization loses 5% of its revenues to fraud each year (www.acfe.com)
The total cost of insurance fraud (non-health insurance) in the U.S.A. is estimated to be more than $40 billion per year (www.fbi.gov)
Fraud is costing the U.K. £73 billion a year (National Fraud Authority)
Credit card companies “lose approximately seven cents per every hundred dollars of transactions due to fraud” (Andrew Schrage, Money Crashers Personal Finance, 2012)
The average size of the informal economy, as a percent of official GNI in the year 2000, in developing countries is 41%, in transition countries 38%, and in OECD countries 18% (Schneider, 2002)

Even though these numbers are rough estimates rather than exact measurements, they are based on evidence and indicate the importance and impact of the phenomenon, and therefore also the need for organizations and governments to actively fight and prevent fraud with all the means at their disposal. They also suggest that investing in fraud detection and prevention systems is likely worthwhile, since a significant financial return on investment can be made. However, estimating the return on investment in analytical approaches to fighting fraud is not straightforward: it requires an assessment of the total cost of ownership of the analytical models, the full impact of fraud on the organization, and the total utility of fraud detection and investigation.

Total Cost of Ownership

The Total Cost of Ownership (TCO) of a fraud analytical model refers to the cost of owning and operating the analytical model over its expected lifetime, from inception to retirement. It should consider both quantitative and qualitative costs and is a key input for strategic decisions about how to optimally invest in fraud analytics. The costs involved can be decomposed into acquisition costs, ownership and operation costs, and post ownership costs, as illustrated with some examples in the table below:

Example costs for calculating Total Cost of Ownership (TCO)

Acquisition costs

Software costs including initial purchase, upgrade, intellectual property and licensing fees
Hardware costs including initial purchase price and maintenance
Network and security costs
Data costs including costs for purchasing external data
Model developer costs such as salaries and training

Ownership and operation costs

Model migration and change management costs
Model setup costs
Model execution costs
Model monitoring costs
Support costs (troubleshooting, helpdesk, …)
Insurance costs
Model staffing costs such as salaries and training
Model upgrade costs
Model downtime costs

Post ownership costs

De-installation and disposal costs
Replacement costs

The goal of TCO analysis is to get a comprehensive view of all costs involved. From an economic perspective, this should also include the timing of the costs through proper discounting, using e.g. the weighted average cost of capital (WACC) as the discount rate. Furthermore, it should help identify any potential hidden and/or sunk costs. In many fraud analytical projects, the combined cost of hardware and software is subordinate to the people cost that comes with the development and usage of the analytical models (e.g. training, employment and management costs). TCO analysis also makes it possible to pinpoint cost problems before they become material. For example, the change management costs of migrating from a legacy fraud model to a new analytical fraud model are often largely underestimated. TCO analysis is a key input for strategic decisions such as vendor selection, buy versus lease decisions, in- versus outsourcing, overall budgeting and capital calculation. Note that when making these investment decisions, it is also very important to include the benefits in the analysis, since TCO only considers the cost perspective.
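The discounting step mentioned above can be sketched as follows. This is a minimal illustration, assuming costs are aggregated per year and discounted at a constant yearly WACC; the function name `discounted_tco` and all figures are hypothetical.

```python
def discounted_tco(costs_by_year, wacc):
    """Present value of a stream of yearly costs.

    costs_by_year[t] is the total cost incurred in year t (t = 0 is today).
    wacc is the yearly discount rate, e.g. 0.08 for 8%.
    """
    return sum(cost / (1.0 + wacc) ** year
               for year, cost in enumerate(costs_by_year))


# Hypothetical figures: acquisition in year 0, ownership and operation
# in years 1-3, post ownership costs in year 4.
yearly_costs = [500_000, 200_000, 200_000, 200_000, 50_000]
tco = discounted_tco(yearly_costs, wacc=0.08)

print(f"Undiscounted total: {sum(yearly_costs):,.0f}")
print(f"Discounted TCO:     {tco:,.0f}")
```

With a positive discount rate, costs that fall later in the model's lifetime weigh less, so the discounted TCO is lower than the plain sum of the yearly costs.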

Return on Investment

Return on Investment (ROI) is defined as the ratio of a return (benefit or net profit) over the investment of resources that generated this return. Both the return and the investment are typically expressed in monetary units, whereas the ROI is calculated as a percentage. In this section we discuss how to calculate the ROI of fraud detection, which may be less straightforward to calculate than the ROI of a financial product, but nonetheless can provide useful insights to an organization.

The returns of a fraud detection system depend on the number of cases investigated and the fraction of those that are effectively fraudulent. Note that this fraction is a property of the fraud detection system and depends on the power of the system to detect fraudulent cases.

The optimal amount of resources to allocate to fraud investigation, and hence the size of the sample to investigate, is the amount that maximizes the total utility associated with inspecting a sample. This sample can be selected either as the top fraction of most suspicious cases, i.e. those with the highest scores assigned by a detection model, or as the top fraction of cases with the highest expected fraud amount, defined as the probability of being fraudulent times the estimated fraud amount.
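The two selection strategies can lead to different samples, as the following sketch shows. The `Case` class, the `top_fraction` helper, and the scores and amounts are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass
class Case:
    case_id: str
    fraud_probability: float       # score from the detection model, in [0, 1]
    estimated_fraud_amount: float  # estimated loss if the case is fraudulent

    @property
    def expected_fraud_amount(self) -> float:
        # Probability of being fraudulent times the estimated fraud amount.
        return self.fraud_probability * self.estimated_fraud_amount


def top_fraction(cases, fraction, key):
    """Return the top `fraction` of cases, ranked in descending order by `key`."""
    n_selected = int(round(fraction * len(cases)))
    return sorted(cases, key=key, reverse=True)[:n_selected]


cases = [
    Case("A", 0.90, 100.0),
    Case("B", 0.20, 5_000.0),
    Case("C", 0.60, 1_000.0),
    Case("D", 0.05, 200.0),
]

by_score = top_fraction(cases, 0.5, key=lambda c: c.fraud_probability)
by_expected_amount = top_fraction(cases, 0.5, key=lambda c: c.expected_fraud_amount)

print([c.case_id for c in by_score])            # ['A', 'C']
print([c.case_id for c in by_expected_amount])  # ['B', 'C']
```

In this toy data, case A is the most suspicious but involves a small amount, so it drops out of the sample once cases are ranked by expected fraud amount.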

The utility of different outcomes is expressed as a net monetary value, either positive or negative, representing the costs and benefits to an organization (of any nature, both economic and non-economic, yet always expressed in monetary units) associated with the decision to inspect or not to inspect either a fraudulent or non-fraudulent case.
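As a sketch of how these utilities add up to a total utility, the example below sums a utility per (decision, outcome) pair over the number of cases in each pair. All utility values and case counts are hypothetical. Inspection costs are deliberately left out of the utilities here, on the assumption that they are accounted for separately as fraud handling costs.

```python
# Hypothetical net utility, in monetary units, of each (decision, outcome) pair.
utility = {
    ("inspect", "fraud"): 900.0,          # e.g. loss prevented or recovered
    ("inspect", "non-fraud"): -50.0,      # e.g. customer friction from a needless inspection
    ("not inspect", "fraud"): -1_000.0,   # e.g. undetected fraud loss
    ("not inspect", "non-fraud"): 0.0,
}

# Hypothetical number of cases falling into each pair.
counts = {
    ("inspect", "fraud"): 30,
    ("inspect", "non-fraud"): 70,
    ("not inspect", "fraud"): 20,
    ("not inspect", "non-fraud"): 880,
}


def total_utility(counts, utility):
    """Sum the utility of every case, grouped by (decision, outcome) pair."""
    return sum(n_cases * utility[pair] for pair, n_cases in counts.items())


print(total_utility(counts, utility))  # 3500.0
```

Varying the size of the inspected sample changes the counts, and therefore the total utility; the optimal sample is the one for which this total is highest.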

The investment required to generate the total returns or total utility can be assumed equal to the total cost of ownership. The total cost of ownership includes costs of diverse nature, covering the full investment required to build, operate, and maintain a fraud detection system. However, the total cost of ownership does not include costs related to the resources required to further act upon the outputs of the detection system, i.e. inspecting and handling suspicious cases. These latter costs together will be denoted as the Total Cost of Fraud Handling, and include inspection costs, legal costs, etc. Clearly, calculating the total cost of fraud handling may be a cumbersome task, yet it is indispensable for calculating the ROI.

Hence, we get to a general ROI formula for assessing investments in fraud detection and prevention, which can be fine-tuned to the specific setting of any organization:
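
Based on the definitions above, one plausible form of this formula, assuming that the return is the total utility net of the Total Cost of Fraud Handling and that the investment equals the Total Cost of Ownership, is:

```latex
\mathrm{ROI} = \frac{\text{Return}}{\text{Investment}}
             = \frac{\text{Total Utility} - \text{Total Cost of Fraud Handling}}{\text{Total Cost of Ownership}}
```

Under this reading, and with purely hypothetical figures, a total utility of 3,000,000, a total cost of fraud handling of 500,000 and a total cost of ownership of 1,000,000 would give an ROI of 2.5, or 250%. If the utilities already net out inspection and handling costs, the second term in the numerator should be dropped to avoid double counting.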

Conclusion

Fraud has a significant impact on organizations of all sorts and sizes. Estimating the size of the impact in terms of financial losses is difficult and the resulting figures are typically rather sensitive to the underlying assumptions. Yet, calculating the returns of investing in a powerful fraud detection system can and definitely should be done to evaluate whether the system is delivering value to the organization as well as to quantify how much value. This article briefly introduces a formula to calculate ROI in this setting, indicating different sources of costs and benefits to be taken into account.

Further detailed information on this topic and on how to optimize fraud detection and prevention efforts, as well as a methodology to calculate capital requirements to cover fraud losses, can be found in a book written by the authors of this article, entitled Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques: A Guide to Data Science for Fraud Detection, published in August 2015 by Wiley in the SAS Business Series. The book presents methods and techniques to develop powerful fraud detection and prevention systems using a data driven approach, from A to Z.