This project implements a machine learning system for predicting Customer Lifetime Value (CLV) in banking, including data generation, model training, and a web-based prediction API.
The system consists of three main components:
- Synthetic data generation for banking customers
- Machine learning model for CLV prediction
- FastAPI-based web service with an interactive UI
.
├── api.py              # FastAPI web service implementation
├── data_gen.py         # Synthetic data generation script
├── model.py            # CLV prediction model implementation
├── index.html          # Web interface
├── requirements.txt    # Project dependencies
├── data/               # Generated datasets
│   ├── customers.csv
│   ├── products.csv
│   ├── transactions.csv
│   └── customer_metrics.csv
└── models/             # Trained model artifacts
    └── clv_model_latest/
        ├── model.joblib
        ├── scaler.joblib
        ├── label_encoders.joblib
        ├── feature_names.joblib
        └── feature_importance.joblib
- Data Generation
  - Realistic synthetic customer data
  - Transaction history generation
  - Product holdings simulation
  - Configurable parameters for data size and date ranges
- Machine Learning Model
  - Gradient Boosting Regressor
  - Feature engineering pipeline
  - Model persistence and versioning
  - Confidence score calculation
- Web API & Interface
  - RESTful endpoints for predictions
  - Interactive web UI for data input
  - Real-time CLV predictions
  - Error handling and validation
- Clone the repository:
git clone https://github.com/Ismat-Samadov/clv_model.git
cd clv_model
- Create and activate a virtual environment (recommended):
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
Run the data generation script:
python data_gen.py
This creates synthetic banking data in the data/ directory.
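As a rough illustration of what a generation step like this can look like, the sketch below draws a few customer records using only the standard library. The field names mirror the example request body in this README. The value ranges, the category lists beyond "North" and "Online", and the function names `generate_customers` and `to_csv` are assumptions for illustration and are not taken from data_gen.py.

```python
import csv
import io
import random

# Hypothetical category values; only "North" and "Online" appear in this README.
REGIONS = ["North", "South", "East", "West"]
CHANNELS = ["Online", "Branch", "Referral"]


def generate_customers(n, seed=42):
    """Return n synthetic customer records as a list of dicts."""
    rng = random.Random(seed)  # local, seeded RNG so repeated runs match
    customers = []
    for customer_id in range(1, n + 1):
        customers.append({
            "customer_id": customer_id,
            "age": rng.randint(18, 80),                        # assumed range
            "income": round(rng.lognormvariate(11, 0.5), 2),   # assumed skewed income
            "credit_score": rng.randint(300, 850),             # assumed range
            "tenure_months": rng.randint(1, 240),              # assumed range
            "region": rng.choice(REGIONS),
            "acquisition_channel": rng.choice(CHANNELS),
        })
    return customers


def to_csv(rows):
    """Serialize the records to CSV text, header row first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


customers = generate_customers(5)
print(to_csv(customers))
```

Seeding a local `random.Random` instance keeps the output reproducible without touching the global random state, which helps when comparing model runs on the same synthetic data.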
The model is trained automatically the first time the API runs. To train it manually:
from model import BankingCLVModel
# customers_df, transactions_df, products_df and metrics_df are assumed to be
# pandas DataFrames loaded from the CSV files in data/
model = BankingCLVModel()
model.train(customers_df, transactions_df, products_df, metrics_df)
Run the FastAPI server:
uvicorn api:app --reload
The server will start at http://localhost:8000.
Open http://localhost:8000 in your web browser to access the CLV prediction interface.
Predicts CLV for a given customer profile.
Example request body:
{
  "customer_id": 1,
  "age": 35,
  "income": 75000,
  "credit_score": 720,
  "tenure_months": 24,
  "region": "North",
  "acquisition_channel": "Online",
  "products": [
    {
      "product_type": "Savings",
      "start_date": "2023-01-01T00:00:00",
      "balance": 5000,
      "status": "Active"
    }
  ],
  "transactions": [
    {
      "transaction_date": "2024-01-01T10:00:00",
      "transaction_type": "Deposit",
      "amount": 1000,
      "channel": "Online"
    }
  ]
}
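To send a body like this from Python, a client could be sketched with the standard library as follows. The `/predict` path is a placeholder assumed for illustration, since this README does not spell out the route; check api.py for the actual one. The helper name `build_prediction_request` is also hypothetical.

```python
import json
import urllib.request

# Host and port come from the uvicorn command in this README.
# The "/predict" path is an ASSUMED placeholder; check api.py for the real route.
PREDICT_URL = "http://localhost:8000/predict"

payload = {
    "customer_id": 1,
    "age": 35,
    "income": 75000,
    "credit_score": 720,
    "tenure_months": 24,
    "region": "North",
    "acquisition_channel": "Online",
    "products": [
        {
            "product_type": "Savings",
            "start_date": "2023-01-01T00:00:00",
            "balance": 5000,
            "status": "Active",
        }
    ],
    "transactions": [
        {
            "transaction_date": "2024-01-01T10:00:00",
            "transaction_type": "Deposit",
            "amount": 1000,
            "channel": "Online",
        }
    ],
}


def build_prediction_request(url, body):
    """Build a JSON POST request without sending it."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_prediction_request(PREDICT_URL, payload)

# Sending the request requires the server to be running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The send step is left commented out so the snippet runs without a live server. The shape of the response is not shown here because this README does not document it.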
Returns the health status of the API and model.
The CLV prediction model:
- Uses a Gradient Boosting Regressor
- Incorporates customer demographics, product holdings, and transaction patterns
- Provides confidence scores for predictions
- Includes feature importance analysis
- Customer demographics (age, income, credit score)
- Account tenure
- Transaction patterns
- Product holdings
- Regional indicators
- Channel preferences
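A minimal sketch of how inputs like these might be flattened into a single feature row is shown below. The feature names and aggregations (`transaction_count`, `avg_amount`, `preferred_channel`, and so on) and the function name `build_features` are assumptions for illustration; the actual pipeline in model.py may compute different features.

```python
from collections import Counter
from statistics import mean


def build_features(customer):
    """Flatten one customer profile into a dict of model inputs (illustrative)."""
    transactions = customer.get("transactions", [])
    products = customer.get("products", [])

    amounts = [t["amount"] for t in transactions]
    channel_counts = Counter(t["channel"] for t in transactions)
    active = [p for p in products if p["status"] == "Active"]

    return {
        # Demographics and tenure are passed through unchanged.
        "age": customer["age"],
        "income": customer["income"],
        "credit_score": customer["credit_score"],
        "tenure_months": customer["tenure_months"],
        # Transaction patterns.
        "transaction_count": len(amounts),
        "total_amount": sum(amounts),
        "avg_amount": mean(amounts) if amounts else 0.0,
        # Product holdings.
        "active_product_count": len(active),
        "total_balance": sum(p["balance"] for p in active),
        # Categorical values; these would still need encoding before training.
        "region": customer["region"],
        "preferred_channel": (
            channel_counts.most_common(1)[0][0] if channel_counts else None
        ),
    }


example = {
    "age": 35,
    "income": 75000,
    "credit_score": 720,
    "tenure_months": 24,
    "region": "North",
    "products": [{"product_type": "Savings", "balance": 5000, "status": "Active"}],
    "transactions": [
        {"transaction_type": "Deposit", "amount": 1000, "channel": "Online"}
    ],
}
features = build_features(example)
print(features)
```

Categorical fields such as `region` are left as strings here. The `label_encoders.joblib` artifact in the project tree suggests the real model encodes them before fitting.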
Key configuration options are available in the respective Python files:
- data_gen.py: Data generation parameters
- model.py: Model hyperparameters
- api.py: API settings and validation rules
The system logs important events and errors to api.log. Configure logging levels in api.py.
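One way a file logger with an adjustable level could be set up is sketched below using the standard `logging` module. The logger name `clv_api`, the format string, and the function name `configure_logging` are assumptions; the actual setup in api.py may differ.

```python
import logging


def configure_logging(log_file="api.log", level=logging.INFO):
    """Attach a file handler to a named logger and set its level."""
    logger = logging.getLogger("clv_api")  # hypothetical logger name
    logger.setLevel(level)
    if not logger.handlers:  # avoid stacking handlers if called twice
        handler = logging.FileHandler(log_file)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


logger = configure_logging()
logger.info("API started")               # written to api.log
logger.debug("Verbose diagnostic data")  # skipped at INFO level
```

Passing `level=logging.DEBUG` makes the logger record the debug message as well, which is useful during development.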
- Modify the data generation in data_gen.py
- Update feature engineering in model.py
- Add new endpoints in api.py
- Update the web interface in index.html
Run the development server with:
uvicorn api:app --reload --port 8000
- Input validation for all API endpoints
- Error handling for invalid data
- Confidence score calculation for predictions
- Rate limiting for API endpoints (TODO)
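The input validation mentioned above can be illustrated with a small check that collects errors instead of failing on the first one. FastAPI services commonly declare such rules as Pydantic models; this sketch uses plain Python so it runs without dependencies. The numeric bounds and the function name `validate_customer` are assumptions, not the rules enforced in api.py.

```python
from datetime import datetime

# ASSUMED bounds, chosen for illustration only.
BOUNDS = {
    "age": (18, 120),
    "income": (0, float("inf")),
    "credit_score": (300, 850),
    "tenure_months": (0, 1200),
}


def validate_customer(payload):
    """Return a list of error messages; an empty list means the payload passed."""
    errors = []
    for field, (low, high) in BOUNDS.items():
        value = payload.get(field)
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{field}: expected a number")
        elif not low <= value <= high:
            errors.append(f"{field}: {value} is outside [{low}, {high}]")

    for i, txn in enumerate(payload.get("transactions", [])):
        try:
            datetime.fromisoformat(txn["transaction_date"])
        except (KeyError, TypeError, ValueError):
            errors.append(
                f"transactions[{i}].transaction_date: expected an ISO 8601 timestamp"
            )
    return errors


good = {
    "age": 35,
    "income": 75000,
    "credit_score": 720,
    "tenure_months": 24,
    "transactions": [{"transaction_date": "2024-01-01T10:00:00"}],
}
bad = {
    "age": 5,
    "income": 75000,
    "credit_score": 900,
    "tenure_months": 24,
    "transactions": [{"transaction_date": "yesterday"}],
}
good_errors = validate_customer(good)
bad_errors = validate_customer(bad)
print(good_errors)
print(bad_errors)
```

Returning every problem at once lets the web UI highlight all invalid fields in a single round trip.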
Major dependencies include:
- FastAPI
- scikit-learn
- pandas
- numpy
- joblib
- uvicorn
See requirements.txt for the complete list.
- Fork the repository
- Create a feature branch
- Commit changes
- Push to the branch
- Create a Pull Request
Ismat Samadov