Random Forest

@Harbringe Harbringe released this 15 Mar 03:51
· 4 commits to main since this release
28f8833

Model Information

  • Model Type: Random Forest Classifier
  • Accuracy: 90.8%
  • Precision: 81%
  • Description: This model is a Random Forest Classifier trained on the dataset provided by PowerCo. It achieved an accuracy of 90.8% and a precision of 81% in predicting customer churn. Random Forest is an ensemble learning technique that combines multiple decision trees during training and outputs the class that is the mode of the classes predicted by the individual trees.
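The majority-vote idea behind the forest can be sketched in a few lines (illustrative only; the votes below are made up, not output from the released model):

```python
from collections import Counter

# Hypothetical hard votes from three individual decision trees
# for a single customer (1 = churn, 0 = no churn)
tree_votes = [1, 0, 1]

# The forest's output is the mode (majority class) of the trees' votes
prediction = Counter(tree_votes).most_common(1)[0][0]
print(prediction)  # -> 1
```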

How to use

To use the machine learning model for predicting customer churn, follow these steps:

  1. Import the necessary libraries and load the trained model using joblib.
  2. Preprocess the input data:
    • Apply a log10 transformation to the columns with high skewness: 'cons_12m', 'cons_gas_12m', 'cons_last_month', 'forecast_cons_12m', 'forecast_cons_year', 'forecast_meter_rent_12m', 'imp_cons', 'net_margin'.
    • Normalize the numeric columns (the log transform must come first: standardization produces negative values, for which log10 is undefined).
    • One-hot encode categorical columns using get_dummies.

Here's an example code snippet:

import pandas as pd
import numpy as np
import joblib
from sklearn.preprocessing import StandardScaler

# Load the trained model
model = joblib.load('path_to_your_model/model.joblib')

# Load the dataset
data = pd.read_csv('path_to_your_dataset/dataset.csv')

# Log10 transformation for columns with high skewness
# (applied before scaling: standardized values can be negative,
# and log10 is undefined for values <= 0 even after the +1 shift)
skewed_cols = ['cons_12m', 'cons_gas_12m', 'cons_last_month', 'forecast_cons_12m',
               'forecast_cons_year', 'forecast_meter_rent_12m', 'imp_cons', 'net_margin']
data[skewed_cols] = np.log10(data[skewed_cols] + 1)

# Normalize the numeric columns
# (in production, reuse the scaler fitted on the training data
# instead of refitting on new data)
scaler = StandardScaler()
data[skewed_cols] = scaler.fit_transform(data[skewed_cols])

# One-hot encode categorical columns
categorical_cols = ['sales_channels', 'origin_up']  # replace with your dataset's categorical columns
data = pd.get_dummies(data, columns=categorical_cols)

# Prepare the input features
X = data.drop(columns=['churn_label'])  # Assuming 'churn_label' is the target column

# Make predictions
y_pred = model.predict(X)

# Print predictions
print(y_pred)

Replace 'path_to_your_model/model.joblib' and 'path_to_your_dataset/dataset.csv' with the actual paths to your trained model and dataset files, respectively. Also, update categorical_cols with the actual categorical columns from your dataset.
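If you have true churn labels for your data, you can check metrics like the accuracy and precision quoted above with scikit-learn (a sketch; y_true and y_pred below are made-up placeholders, not real results):

```python
from sklearn.metrics import accuracy_score, precision_score

# Hypothetical labels and predictions (1 = churn, 0 = no churn);
# in practice, use your dataset's labels and the model's output
y_true = [0, 1, 0, 1, 1, 0]
y_pred = [0, 1, 0, 0, 1, 0]

print(f"Accuracy:  {accuracy_score(y_true, y_pred):.1%}")   # Accuracy:  83.3%
print(f"Precision: {precision_score(y_true, y_pred):.1%}")  # Precision: 100.0%
```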