This project focuses on predicting loan approvals using a dataset with various applicant details. The process includes data preprocessing, feature engineering, model training, and evaluation using Decision Tree and Naive Bayes classifiers.
- loan-train.csv: Training dataset
- loan-test.csv: Test dataset
- loan_approval_prediction.ipynb: Jupyter notebook with detailed analysis and code
- Data Preprocessing
  - Handling missing values
  - Data normalization (log transformation)
- Feature Engineering
  - Categorical data encoding
  - Feature scaling
- Model Training
  - Decision Tree Classifier
  - Naive Bayes Classifier
- Model Evaluation
  - Accuracy calculation
  - Performance metrics analysis
- Data Manipulation
  - pandas for data manipulation
  - numpy for numerical operations
- Machine Learning
  - sklearn for model training, evaluation, and preprocessing
- Data Visualization
  - Matplotlib for histogram plotting of transformed features
- Clone the repository:
  git clone https://github.com/crazyNerrd/Loan-Approval.git
- Navigate to the project directory (git names it after the repository):
  cd Loan-Approval
- Install the required packages:
  pip install -r requirements.txt
- Open the Jupyter notebook:
  jupyter notebook loan_approval_prediction.ipynb
- Handle missing values using mode and mean imputation.
- Apply log transformation to LoanAmount and TotalIncome for normalization.
- Encode categorical variables using LabelEncoder.
- Scale features using StandardScaler.
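
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration on a toy DataFrame, not the notebook's actual code: only `LoanAmount` and `TotalIncome` are named in this README, so the `Gender` column, the `_log` suffix, and the sample values are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Toy stand-in for loan-train.csv. "Gender" and all values are hypothetical;
# only LoanAmount and TotalIncome are column names taken from this README.
df = pd.DataFrame({
    "Gender": ["Male", "Female", None, "Male", "Female", "Male"],
    "LoanAmount": [120.0, None, 66.0, 250.0, 141.0, 95.0],
    "TotalIncome": [5849.0, 6091.0, 3000.0, 12500.0, 6000.0, 4200.0],
})

# Mode imputation for the categorical column, mean imputation for the numeric one.
df["Gender"] = df["Gender"].fillna(df["Gender"].mode()[0])
df["LoanAmount"] = df["LoanAmount"].fillna(df["LoanAmount"].mean())

# Log transformation to compress the long right tail of amounts and incomes.
df["LoanAmount_log"] = np.log(df["LoanAmount"])
df["TotalIncome_log"] = np.log(df["TotalIncome"])

# Encode the categorical column as integer labels.
encoder = LabelEncoder()
df["Gender"] = encoder.fit_transform(df["Gender"])

# Standardize each feature to zero mean and unit variance.
feature_cols = ["Gender", "LoanAmount_log", "TotalIncome_log"]
scaler = StandardScaler()
X = scaler.fit_transform(df[feature_cols])
```

Keeping the fitted `encoder` and `scaler` objects around makes it possible to apply exactly the same transformation to new data later.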
- Split the dataset into training and testing sets using train_test_split.
- Train Decision Tree and Naive Bayes classifiers.
- Evaluate models' accuracy using metrics from sklearn.
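
A minimal sketch of the split, train, and evaluate steps above, run on synthetic data so it is self-contained. Several details here are assumptions rather than facts from the notebook: the README does not say which Naive Bayes variant is used (`GaussianNB` is chosen because the features are continuous after scaling), and the 80/20 split and `random_state` values are illustrative.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed feature matrix and approval labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Hold out 20% of the rows for evaluation (split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# GaussianNB is an assumed choice of Naive Bayes variant.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {scores[name]:.3f}")
```

Accuracy alone can mislead when approvals and rejections are imbalanced; `sklearn.metrics.classification_report` or `confusion_matrix` would give a fuller picture.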
- Apply the same preprocessing steps to the test dataset.
- Make predictions using the trained Naive Bayes classifier.
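
One way to apply the same preprocessing to the test set is to reuse the imputation values, encoder, and scaler learned from the training data instead of refitting them. The sketch below assumes that approach; the `Area` and `Loan_Status` columns, the `prepare` helper, the sample values, and the use of `GaussianNB` are all hypothetical and may differ from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Toy stand-ins for loan-train.csv and loan-test.csv (hypothetical values;
# "Area" and "Loan_Status" are assumed column names).
train = pd.DataFrame({
    "Area": ["Urban", "Rural", "Urban", "Semiurban", "Rural", "Urban"],
    "LoanAmount": [120.0, 66.0, 250.0, 141.0, 95.0, 180.0],
    "TotalIncome": [5849.0, 3000.0, 12500.0, 6000.0, 4200.0, 9000.0],
    "Loan_Status": ["Y", "N", "Y", "Y", "N", "Y"],
})
test = pd.DataFrame({
    "Area": ["Rural", None, "Urban"],
    "LoanAmount": [110.0, None, 200.0],
    "TotalIncome": [5720.0, 4576.0, 10000.0],
})

NUMERIC = ["LoanAmount", "TotalIncome"]
CATEGORICAL = "Area"

# Learn imputation values and the label encoding from the training data only.
fill = {
    CATEGORICAL: train[CATEGORICAL].mode()[0],
    "LoanAmount": train["LoanAmount"].mean(),
    "TotalIncome": train["TotalIncome"].mean(),
}
encoder = LabelEncoder().fit(train[CATEGORICAL])


def prepare(df):
    """Impute, log-transform, and encode using training-set statistics."""
    out = df.fillna(fill).copy()
    out[NUMERIC] = np.log(out[NUMERIC])
    out[CATEGORICAL] = encoder.transform(out[CATEGORICAL])
    return out[[CATEGORICAL] + NUMERIC]


# Fit the scaler and the classifier on the training data.
scaler = StandardScaler().fit(prepare(train))
model = GaussianNB().fit(scaler.transform(prepare(train)), train["Loan_Status"])

# Transform the test data with the already-fitted objects, then predict.
predictions = model.predict(scaler.transform(prepare(test)))
print(predictions)
```

Fitting the encoder or scaler again on the test set would map the same raw value to different numbers than the model saw during training, so the fitted objects are reused here.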
- Display accuracy scores for both Decision Tree and Naive Bayes classifiers.
- Provide insights on model performance and potential improvements.