This repository explores predicting health insurance premiums from individual characteristics. The code uses the Random Forest algorithm, with Python libraries such as scikit-learn and pandas for data manipulation and analysis.

The dataset used for this task was collected from Kaggle. It contains the following information about each person: age, gender, Body Mass Index (BMI), number of children, whether the person smokes, the region where the person lives, and the insurance premium charges.
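To make the column layout described above concrete, here is a minimal sketch that builds a few made-up rows and converts the text columns to numbers, which a tree-based regressor in scikit-learn needs. The column names (`age`, `sex`, `bmi`, `children`, `smoker`, `region`, `charges`) and every value are assumptions for illustration only; they are not rows from the Kaggle file, and the repository's own preprocessing may differ.

```python
import pandas as pd

# Hypothetical rows that mimic the described columns; all values are invented.
data = pd.DataFrame({
    "age": [19, 33, 45, 52],
    "sex": ["female", "male", "male", "female"],
    "bmi": [27.9, 22.7, 30.1, 35.4],
    "children": [0, 1, 2, 3],
    "smoker": ["yes", "no", "no", "yes"],
    "region": ["southwest", "northwest", "southeast", "northeast"],
    "charges": [16000.0, 4500.0, 8200.0, 41000.0],
})

# Map the two binary text columns to 0/1.
data["sex"] = data["sex"].map({"male": 1, "female": 0})
data["smoker"] = data["smoker"].map({"yes": 1, "no": 0})

# One-hot encode the region column (one indicator column per region).
encoded = pd.get_dummies(data, columns=["region"], dtype=int)

print(encoded.head())
```

One-hot encoding is used for `region` here because the regions have no natural order; a simple integer mapping would also work with tree-based models.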
How the Dataset Looks:
Regression Function:
Final Results:
- scikit-learn: Provides the RandomForestRegressor used to predict premium charges.
- Pandas: Facilitates efficient data loading, manipulation, and exploration through DataFrames.
- NumPy: Enables numerical computations and array operations for feature engineering and model training.
- Plotly Express: Creates interactive visualizations such as histograms and pie charts for data exploration.
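As a rough sketch of how the libraries listed above can fit together, the example below trains a `RandomForestRegressor` on synthetic data and reports the mean absolute error on a held-out split. The feature names, the formula used to generate `charges`, and the hyperparameters (`n_estimators=100`, `test_size=0.2`) are all assumptions chosen for illustration; they are not taken from the repository's code or from the Kaggle dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset; every value here is generated.
rng = np.random.default_rng(0)
n = 200
features = pd.DataFrame({
    "age": rng.integers(18, 65, size=n),
    "sex": rng.integers(0, 2, size=n),      # assumed encoding: 1 = male, 0 = female
    "bmi": rng.uniform(18.0, 40.0, size=n),
    "smoker": rng.integers(0, 2, size=n),   # assumed encoding: 1 = smoker, 0 = non-smoker
})

# Invented target: charges grow with age and BMI and jump for smokers.
charges = (
    250.0 * features["age"]
    + 300.0 * features["bmi"]
    + 20000.0 * features["smoker"]
    + rng.normal(0.0, 1000.0, size=n)
)

# Hold out 20% of the rows to check the model on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    features, charges, test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
print("MAE on synthetic data:", mean_absolute_error(y_test, predictions))
```

Fixing `random_state` keeps the split and the forest reproducible between runs, which makes it easier to compare experiments.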
- Download the repository.
- Create a new folder and extract the files into it.
- Install the required packages with `pip install -r requirements.txt`.
- Once the requirements are installed, run the code to train and experiment with the model.