I'm a data scientist with a background in mechanical engineering and a strong passion for mathematics, physics, science, and technology. After gaining initial work experience as a mechanical engineer, I realized my true interest lies in helping companies make better business decisions through data-driven insights. I believe that data is one of the greatest assets a company has, and my goal is to help businesses unlock its full potential.
Currently, I am seeking a junior data scientist or data analyst role where I can apply my skills to real-world business challenges. My experience in both engineering and data science allows me to bridge the gap between technical expertise and business impact.
- Languages: Python, SQL
- Machine Learning: Scikit-learn, XGBoost, Keras, TensorFlow
- Data Manipulation: Pandas, NumPy
- Data Visualization: Matplotlib, Seaborn, Power BI
- Tools: Jupyter Notebooks, Git, Docker
- Key Areas: Data Visualization, Dynamic Dashboard Creation, Data Cleaning and Preprocessing, Time Series Forecasting, Clustering, Feature Engineering, Classification Models, Regression Models
- Description: A real-world business challenge involving sales forecasting at the product-store level using XGBoost. Clustering techniques were used to group products and stores to optimize marketing and inventory management strategies.
- Technologies: Python, XGBoost, KMeans, Scikit-learn, Pandas, Power BI
- Key Results: Implemented a sales forecasting model with an RMSE of 7-10 units and identified key clusters for more efficient stock management.
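
For readers unfamiliar with the metric, here is a minimal sketch of how an RMSE in sales units is computed. The numbers are made up for illustration and are not project data; `rmse`, `actual_units`, and `predicted_units` are hypothetical names.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical weekly unit sales for one product-store pair (not project data).
actual_units = [120, 135, 128, 150]
predicted_units = [112, 140, 135, 141]

# The result is expressed in the same unit as the target (units sold).
error = rmse(actual_units, predicted_units)
print(f"RMSE: {error:.2f} units")
```

Because RMSE squares each error before averaging, a few large misses weigh more heavily than many small ones, which is usually the behavior you want when stock-outs are costly.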
- Description: This project analyzes public grants issued by the Barcelona City Council, focusing on the distribution of funds across various sectors and forecasting future grant amounts. The forecasting model was designed to predict the total grants for September 2024 based on historical data.
- Technologies: Python, Pandas, XGBoost, CatBoost, ARIMA, Matplotlib, Seaborn
- Key Results:
  - XGBoost was chosen as the best-performing model, with an RMSE of 441,278.
  - The analysis revealed that sectors like Culture and Sports have been prioritized since 2020, while sectors such as Urban Development and Social Rights have seen decreasing investments.
  - The predictive model is being used as a tool for financial planning, providing valuable insights into the allocation of public resources.
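
One common way to feed a time series to a tree-based model such as XGBoost is to turn past values into lag features. The sketch below shows that general idea only; it is an assumption for illustration, not necessarily the feature set used in this project, and `make_lag_rows` and `monthly_totals` are hypothetical.

```python
def make_lag_rows(series, n_lags):
    """Turn a 1-D series into (features, target) rows.

    Each row uses the previous `n_lags` values as features and the
    current value as the target.
    """
    rows = []
    for i in range(n_lags, len(series)):
        features = series[i - n_lags:i]
        target = series[i]
        rows.append((features, target))
    return rows


# Hypothetical monthly grant totals (not project data).
monthly_totals = [100.0, 120.0, 90.0, 150.0, 130.0]

rows = make_lag_rows(monthly_totals, n_lags=3)
for features, target in rows:
    print(features, "->", target)
```

The resulting rows can be passed to any regressor; the most recent `n_lags` observations then serve as the input for predicting the next period.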
- Description: A deep learning project that focuses on image classification using Convolutional Neural Networks (CNNs). Various structural modifications and techniques like data augmentation were applied to improve model performance.
- Technologies: Python, Keras, TensorFlow, CNN, Data Augmentation
- Key Results: Achieved 90.7% accuracy after implementing data augmentation techniques.
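
As a rough illustration of what data augmentation means, the sketch below mirrors each image horizontally to double the training set. A flip is just one common transformation, chosen here as an assumption; the transformations actually used in the project are not listed above, and `horizontal_flip`, `augment`, and `tiny_image` are hypothetical names.

```python
def horizontal_flip(image):
    """Mirror a 2-D image (a list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]


def augment(images):
    """Return the original images plus one flipped copy of each."""
    return images + [horizontal_flip(img) for img in images]


# A hypothetical 2x3 grayscale "image" used only to show the effect.
tiny_image = [
    [0, 1, 2],
    [3, 4, 5],
]

augmented = augment([tiny_image])
print(len(augmented))  # original plus its flipped copy
```

In a Keras workflow the same idea is typically applied on the fly during training rather than by materializing copies, so the model sees a slightly different variant of each image every epoch.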
- LinkedIn: https://www.linkedin.com/in/simone-solieri/
- Email: simone.solieri@gmail.com