The project was created as part of the "ML Service Development: From Idea to MVP" (RU) course, run by the team of the online master's programme "Machine Learning and Data-Intensive Systems" at the Faculty of Computer Science of the Higher School of Economics.
Dataset | Model file | Streamlit web-application
- Credit_scoring.ipynb: the main Jupyter Notebook of the project, in which the data analysis and model building were carried out
- streamlit_app/app.py: the main file of the Streamlit application, which runs the web interface of the model
- streamlit_app/model.py: a script that loads the model and predicts the target variable
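The load-and-predict step described for streamlit_app/model.py can be sketched roughly as follows. This is only an illustration: the helper names `load_model` and `predict`, the file name `model.pkl`, and the toy two-feature model are assumptions, not the project's actual code.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.ensemble import RandomForestClassifier


def load_model(path):
    """Load a pickled scikit-learn estimator from disk (hypothetical helper)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def predict(model, rows):
    """Return the predicted class and the positive-class probability per row."""
    labels = model.predict(rows)
    probabilities = model.predict_proba(rows)[:, 1]
    return labels, probabilities


# Toy stand-in model: the real project trains on a credit scoring dataset.
toy_rows = [[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]]
toy_labels = [0, 0, 1, 1]
toy_model = RandomForestClassifier(n_estimators=10, random_state=0)
toy_model.fit(toy_rows, toy_labels)

# Save the model to a temporary file, then load it back as the app would.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.pkl"  # assumed file name
    with open(model_path, "wb") as f:
        pickle.dump(toy_model, f)
    loaded = load_model(model_path)

labels, probabilities = predict(loaded, [[0.1, 0.1], [0.95, 0.9]])
```

Unpickling executes arbitrary code, so a model file should only be loaded from a trusted source.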
- Pandas and NumPy libraries were used for data processing.
- Data analysis and graphing were performed using the Matplotlib and Seaborn libraries.
- The scikit-learn library was used for machine learning, in particular:
  - the RandomForestClassifier classification model
  - RandomizedSearchCV for finding the optimal hyperparameters of the model
  - functions for estimating the key model metrics
  - MinMaxScaler for feature scaling
- The pickle library was used to save the model.
- Using the Streamlit framework, a web service was created to interact with the model.
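A minimal sketch of how the scikit-learn pieces listed above can fit together is shown below. The synthetic dataset, the search space, the number of search iterations, and the choice of accuracy and ROC AUC as metrics are all assumptions made for illustration; the notebook's actual data, parameters, and metrics may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the credit scoring data (assumption).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale features to [0, 1], then classify with a random forest.
pipeline = Pipeline(
    [
        ("scaler", MinMaxScaler()),
        ("clf", RandomForestClassifier(random_state=0)),
    ]
)

# Illustrative search space; the values are assumptions.
param_distributions = {
    "clf__n_estimators": [20, 50, 100],
    "clf__max_depth": [3, 5, None],
    "clf__min_samples_leaf": [1, 2, 4],
}

# Randomly sample 5 hyperparameter combinations with 3-fold cross-validation.
search = RandomizedSearchCV(
    pipeline,
    param_distributions,
    n_iter=5,
    cv=3,
    scoring="roc_auc",
    random_state=0,
)
search.fit(X_train, y_train)

# Evaluate the best pipeline on the held-out test split.
best_model = search.best_estimator_
accuracy = accuracy_score(y_test, best_model.predict(X_test))
roc_auc = roc_auc_score(y_test, best_model.predict_proba(X_test)[:, 1])
print(search.best_params_, accuracy, roc_auc)
```

Wrapping the scaler and the classifier in one Pipeline means the scaler is refit on each training fold during the search, which avoids leaking validation data into the scaling step.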
To start the web interface, install the requirements and run the app.py file with the streamlit tool:
pip install -r requirements.txt
streamlit run ./streamlit_app/app.py
The application will then be available at http://localhost:8501/