An insurance policy is an arrangement in which a company agrees to compensate the policyholder for specified damage in return for payment of a specified premium. Because many people pay the premium but only a few of them suffer vehicle accidents or damage and receive compensation, everyone shares the risk of everyone else. The insurance business therefore rests on the existence of risks and the desire to avoid them. Data-based quantification of risk and uncertainty plays a crucial role in this field, and this is where machine learning comes into the picture. We aim to build a model that predicts whether a customer would be interested in buying vehicle insurance, based on information such as:
1. Demographics: Gender, Age, Vehicle code
2. Vehicle: Vehicle Age, Damage
3. Policy: Premium
This would help the insurance company in optimizing its business model and revenue.
To predict whether a customer will be interested in buying vehicle insurance, we need to train a classifier on the data. We tried three models for this:
1. Logistic Regression using scikit-learn
2. Logistic Regression from scratch
3. Support Vector Machine using scikit-learn
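To illustrate the second option, here is a minimal sketch of logistic regression trained from scratch with batch gradient descent. It is not the project's actual implementation: the function names, the learning rate, the epoch count, and the toy rows (a scaled age and a vehicle-damage flag) are all assumptions made up for the example.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic_regression(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias).

    lr and epochs are illustrative defaults, not tuned values.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one row: p(y=1 | x) - y.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Step against the averaged gradient.
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(X, w, b):
    # Predicted probability of the positive class for each row.
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

# Hypothetical toy rows: [scaled_age, had_vehicle_damage]; label 1 = interested.
X_toy = [[0.2, 0], [0.3, 0], [0.4, 0], [0.5, 1], [0.6, 1], [0.8, 1]]
y_toy = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic_regression(X_toy, y_toy)
preds = [1 if p >= 0.5 else 0 for p in predict_proba(X_toy, w, b)]
```

On real data the features would typically be encoded and scaled first, and a scikit-learn model such as `LogisticRegression` or `SVC` could be fitted on the same inputs for comparison.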
After obtaining the confusion matrix values, we also plotted the ROC curve from the True Positive Rate and False Positive Rate.
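The relationship between the confusion matrix and the ROC curve can be sketched as follows. This is an illustrative example under stated assumptions, not the project's code: the helper names and the four labels and scores are made up, and it assumes both classes are present so that neither rate divides by zero.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels 0/1."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1:
            fn += 1
        elif p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp

def roc_points(y_true, scores):
    """(FPR, TPR) pairs obtained by sweeping the decision threshold downwards."""
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= thr else 0 for s in scores]
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
        fpr = fp / (fp + tn)  # False Positive Rate
        tpr = tp / (tp + fn)  # True Positive Rate
        points.append((fpr, tpr))
    return points

def auc(points):
    # Area under the ROC curve by the trapezoidal rule.
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical labels and predicted probabilities.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

cm_at_half = confusion_matrix(y_true, [1 if s >= 0.5 else 0 for s in scores])
curve = roc_points(y_true, scores)
area = auc(curve)
```

Plotting the first element of each pair in `curve` on the x-axis against the second on the y-axis gives the ROC curve; a curve closer to the top-left corner indicates a better classifier.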
If you have any queries regarding this project, feel free to email yash.p4@ahduni.edu.in, priyank.h@ahduni.edu.in, samarth.s@ahduni.edu.in, or shaili.g@ahduni.edu.in.