Commit

Update README.md
arnobotha authored Apr 2, 2024
1 parent 6fec1be commit 6965e85
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -1,7 +1,7 @@
# TruEnd-Procedure
[![DOI](https://zenodo.org/badge/695033824.svg)](https://zenodo.org/doi/10.5281/zenodo.10908342)

- A novel procedure is presented for finding the true but latent endpoints within the repayment histories of individual loans. The monthly observations beyond these true endpoints are false, largely due to operational or system failures that otherwise delay the timely closure of loans, thereby corrupting the eventual dataset. However, these false observations are difficult to detect at scale since each affected loan history might have a different sequence of zero (or very small) month-end balances that persist towards the end. Identifying and discarding these trails of diminutive balances will depend on the exact definition of a "small balance". Our procedure can find such an optimised balance-definition that is neither too small nor too large, such that we retain neither false history nor discard credible history across all loans. We demonstrate this procedure using a dataset of residential mortgages, as provided by a large South African bank, and we isolate the ideal small-balance definition for this portfolio. Evidently, the affected loans within this portfolio are both remarkably prevalent and have excess histories that are surprisingly long. These excess histories ruin the timing of certain risk events such as write-off and early settlement, thereby compromising any subsequent time-to-event model such as survival analysis, as demonstrated. Discarding these excess histories demonstrably improves the accuracy of both the predicted timing and severity of risk events, without impacting the monetary value of the portfolio in any material way. In turn, the resulting estimates of credit losses are lower and less biased, which augurs well for raising accurate credit impairments under the IFRS 9 accounting standard. Our work therefore highlights and solves a data problem that essentially amounts to measurement error, thereby underscoring the pivotal role of data preparation in producing credible forecasts of credit risk.
+ A novel procedure is presented for finding the true but latent endpoints within the repayment histories of individual loans. The monthly observations beyond these true endpoints are false, largely due to operational failures that delay account closure, thereby corrupting the eventual dataset. Detecting these false observations is difficult at scale since each affected loan history might have a different sequence of zero (or very small) month-end balances that persist towards the end. Identifying these trails of diminutive balances would require an exact definition of a "small balance", which can be found using our procedure. We demonstrate this procedure and isolate the ideal small-balance definition using residential mortgages from a large South African bank. Evidently, corrupted loans are remarkably prevalent and have excess histories that are surprisingly long, which ruin the timing of certain risk events and compromise any subsequent time-to-event model such as survival analysis. Excess histories can be discarded using the ideal small-balance definition, which demonstrably improves the accuracy of both the predicted timing and severity of risk events, without materially impacting the monetary value of the portfolio. The resulting estimates of credit losses are lower and less biased, which augurs well for raising accurate credit impairments under the IFRS 9 accounting standard. Our work therefore highlights and solves a data problem, which underscores the pivotal role of data preparation in producing credible forecasts of credit risk.

## Structure
This R-codebase can be run sequentially using the file numbering itself as a structure. Delinquency measures are algorithmically defined in **DelinqM.R** as data-driven functions, which may be valuable to the practitioner outside of the study's current scope. The TruEnd-procedure and its set of functions are defined in **TruEnd.R**, which may also be valuable to the practitioner beyond the current scope.
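
To make the idea of a trailing run of "small" month-end balances concrete, here is a minimal R sketch. It is not taken from **TruEnd.R**: the function name `trailing_small_run`, its arguments, and the example balances are all hypothetical, and the sketch assumes a single loan's balances are non-negative, free of missing values, and ordered by month. It only counts the trailing months at or below a candidate threshold; it does not attempt the optimisation over thresholds that the README describes.

```r
# Hypothetical helper (illustrative only, not part of TruEnd.R).
# Counts how many month-end balances at the END of a loan's history
# are at or below a candidate "small balance" threshold.
trailing_small_run <- function(balance, threshold) {
  stopifnot(is.numeric(balance), !anyNA(balance),
            is.numeric(threshold), length(threshold) == 1, threshold >= 0)
  small <- balance <= threshold
  # Empty history, or every month is "small": the whole history is the run
  if (length(small) == 0 || all(small)) return(length(small))
  # Position of the last month whose balance exceeds the threshold
  last_large <- max(which(!small))
  length(balance) - last_large
}

# Made-up balances for one loan, oldest month first
balance <- c(1000, 600, 250, 0.4, 0, 0, 0.1)

# With a threshold of 1, the last four months form the trailing run
n_excess <- trailing_small_run(balance, threshold = 1)

# One possible treatment (an assumption): drop the trailing run entirely
credible <- head(balance, length(balance) - n_excess)
```

With a threshold of 0 the same history has no trailing run at all, because the final balance of 0.1 exceeds it. This illustrates why the choice of small-balance definition matters. Whether the true endpoint should sit at the last large balance or at the first small one is a design choice this sketch leaves open.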
