Cancer is one of the leading causes of death worldwide, responsible for nearly one in six deaths, and it affects many families around the globe. In 2022, approximately 20 million new cases of cancer were diagnosed worldwide, and 9.7 million people died from the disease.
The purpose of this project is to develop an interactive tool for cancer diagnosis classification by implementing a pipeline with a neural network model (MLP Classifier) and a machine learning model (XGBoost). The project includes an API built with FastAPI and a Streamlit interface that lets users enter their features and receive a classification result in real time. In the Streamlit interface, the user can choose between the two models, enter their data, and get the result as an image message showing whether they have a lower or higher likelihood of cancer, together with a reminder of the importance of prevention: early diagnosis is essential to managing and treating cancer effectively and increases the chances of a successful outcome.
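As a rough illustration of the pipeline idea described above, the sketch below chains feature scaling and an MLP classifier with scikit-learn. It is not the project's actual code: the synthetic data, the number of features, and every hyperparameter are assumptions chosen only to keep the example self-contained. The notebook linked below contains the real training code.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cancer dataset
# (assumption: 8 numeric features and a binary diagnosis label).
X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scaling and the classifier live in one object, so the same
# preprocessing is applied at training time and at prediction time.
pipeline = Pipeline(
    steps=[
        ("scaler", StandardScaler()),
        # Hypothetical hyperparameters, not the project's settings.
        ("model", MLPClassifier(hidden_layer_sizes=(32, 16),
                                max_iter=1000, random_state=42)),
    ]
)

pipeline.fit(X_train, y_train)
predictions = pipeline.predict(X_test)      # array of 0/1 class labels
accuracy = pipeline.score(X_test, y_test)   # mean accuracy on held-out data
print(f"Held-out accuracy: {accuracy:.2f}")
```

Because the final step is just another estimator, an XGBoost classifier could presumably be swapped in for the `"model"` step without touching the rest of the pipeline.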
- Project visualization: https://cancer-classification.streamlit.app/
The Streamlit app is available at the link above. It can also be run locally with the following command in the Anaconda Prompt:
streamlit run cancer-app.py
- Streamlit code: https://github.com/raquelcolares/Cancer_Classification/blob/main/streamlit/cancer-app.py
- Project Notebook: https://github.com/raquelcolares/Cancer_Classification/tree/main/notebook
- Dataset: https://github.com/raquelcolares/Cancer_Classification/tree/main/dataset
- FastAPI: https://github.com/raquelcolares/Cancer_Classification/blob/main/cancer-api.py
The API code is available at the link above. It can also be run locally with the following command in the Anaconda Prompt:
fastapi dev cancer-api.py
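Once the development server is running, it can be queried from any HTTP client. The sketch below uses only the Python standard library and rests on several assumptions that are not confirmed by this README: the `/predict` path, the `model` and `features` keys, and the example feature names are all hypothetical, so check `cancer-api.py` for the real route and request schema. The address `http://127.0.0.1:8000` is the default used by `fastapi dev`.

```python
import json
import urllib.request

# Default address of `fastapi dev`; the "/predict" path is a hypothetical route.
API_URL = "http://127.0.0.1:8000/predict"


def build_request(features: dict, model: str = "xgboost") -> urllib.request.Request:
    """Build a JSON POST request; the payload layout is an assumed schema."""
    payload = json.dumps({"model": model, "features": features}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(features: dict, model: str = "xgboost", timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response (requires a running server)."""
    with urllib.request.urlopen(build_request(features, model), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Illustrative feature names only; the real ones come from the dataset.
example_features = {"Age": 50, "BMI": 24.5}
request = build_request(example_features, model="mlp")
print(request.get_method(), request.full_url)
```

With the server running, `classify(example_features)` would send the request and return the decoded JSON body. FastAPI also serves interactive documentation at `/docs`, which lists the actual endpoints and their expected fields.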
World Health Organization - WHO. (2024). World Cancer Day 2024. Available at: https://www.emro.who.int/media/news/world-cancer-day-2024.html#:~:text=Cancer%20is%20a%20major%20contributor,deaths%20from%20cancer%20in%202022. (Accessed: 11 November 2024).
National Cancer Institute. (2015). Risk Factors for Cancer. Available at: https://www.cancer.gov/about-cancer/causes-prevention/risk. (Accessed: 11 November 2024).
Professor Kaveh Bakhtiyari. (2024). Lectures: Neural Networks, Applied Machine Learning II, Algorithms and Data Structure. LaSalle College.
Keserer E. (2024). What Are Machine Learning Pipelines, and Why Are They Important? Available at: https://www.akkio.com/post/what-are-machine-learning-pipelines-and-why-are-they-important#:~:text=Pipelining%20is%20important%20because%20it,train%20your%20data%20more%20effectively. (Accessed: 11 November 2024).
Kaggle. Cancer Prediction Dataset. Available at: https://www.kaggle.com/datasets/rabieelkharoua/cancer-prediction-dataset/ (Accessed: 10 November 2024).