Data-Analysis-and-ML

My projects and checkpoints for the University of Edinburgh data analysis and machine learning (DAML) graduate course.

In total, I worked on 3 projects and 14 checkpoints. The projects were focused on exploring key stages of the particle physics analysis pipeline of the ATLAS and CMS experiments. The projects are in the form of Jupyter Notebook reports.

The checkpoints were based on self-contained analysis tasks that were explored along with the lectures. The projects contain all of the output files needed to interpret the results. Only the initial datasets are not included due to their size.

Report1: Dark machines anomaly detection challenge

Designed an autoencoder (AE) for unsupervised anomaly detection
Performed a hyperparameter search, training each model on the SM sample and evaluating it on a mixed SM and BSM sample
Chose the AE with the highest ROC area under the curve, averaged over all BSM samples
Found the anomaly thresholds that maximise the significance improvement
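
The model selection and threshold steps above can be sketched as follows. This is a minimal illustration, not the code from the report: it assumes each event gets a single anomaly score (for example the AE reconstruction error), that higher means more anomalous, and that the significance improvement is defined as signal efficiency divided by the square root of background efficiency. The function names and the toy scores are invented for the example.

```python
import math

def roc_auc(scores_sm, scores_bsm):
    """Probability that a random BSM event scores above a random SM event (ties count 1/2)."""
    wins = 0.0
    for s in scores_bsm:
        for b in scores_sm:
            if s > b:
                wins += 1.0
            elif s == b:
                wins += 0.5
    return wins / (len(scores_bsm) * len(scores_sm))

def best_threshold(scores_sm, scores_bsm):
    """Scan every observed score as a cut and return (threshold, SI) with the largest SI.

    SI = eps_S / sqrt(eps_B), where eps_S and eps_B are the fractions of BSM and SM
    events with score >= threshold (assumed definition, see the text above).
    """
    best = (None, 0.0)
    for t in sorted(set(scores_sm) | set(scores_bsm)):
        eps_s = sum(s >= t for s in scores_bsm) / len(scores_bsm)
        eps_b = sum(b >= t for b in scores_sm) / len(scores_sm)
        if eps_b == 0:
            continue  # SI is undefined when no background survives the cut
        si = eps_s / math.sqrt(eps_b)
        if si > best[1]:
            best = (t, si)
    return best

# Toy anomaly scores, invented for illustration only.
sm = [0.1, 0.2, 0.3, 0.4]
bsm = [0.35, 0.5, 0.6, 0.7]

auc = roc_auc(sm, bsm)
threshold, si = best_threshold(sm, bsm)
print(auc, threshold, si)
```

On real samples a minimum background efficiency is usually imposed during the scan, since cuts that leave only a handful of SM events give a noisy SI.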

Report2: Geant4 custom detector build, detection and classification analysis

Base set-up

Fired truth electrons, protons, photons and neutrons at a homogeneous electromagnetic calorimeter (10 cm lead-glass) and a hadronic sampling calorimeter (3 cm lead, 30 cm liquid argon, 5 stacks)
Fired 5000 particles of each type, with the energy incremented by 5 MeV each time, starting at 200 MeV and finishing at about 25 GeV, for 20000 events in total. Simulation precision was tuned to speed up computation.
Processed the detector hits and used them to train a neural network to classify the four particle types. The base detector cannot distinguish the charge of particles, so protons and neutrons are mistagged as each other, as are electrons and photons. The best validation accuracy is 0.5818, which is better than the 0.25 expected from random guessing among four classes.
Calibration quality is very good for electrons and photons because they are completely stopped. Protons and neutrons are not fully contained in the detector, so their calibration quality is poor.
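
The energy scan and the guessing baseline quoted above can be checked with a few lines of arithmetic. This is only a sketch of the bookkeeping, assuming the first particle of each type is fired at exactly 200 MeV and every later one is 5 MeV higher; the variable names are invented, and the particle labels follow the usual Geant4 naming.

```python
N_PER_TYPE = 5000      # particles fired per type (from the text)
E_START_MEV = 200      # first energy in the scan
E_STEP_MEV = 5         # increment between consecutive particles
PARTICLES = ["e-", "proton", "gamma", "neutron"]

# One energy per fired particle, shared by all four particle types.
energies_mev = [E_START_MEV + E_STEP_MEV * i for i in range(N_PER_TYPE)]

# Every (particle, energy) pair is one simulated event.
events = [(p, e) for p in PARTICLES for e in energies_mev]

# A classifier that guesses uniformly among the classes is right 1/4 of the time.
random_guess_accuracy = 1 / len(PARTICLES)

print(len(events), energies_mev[0], energies_mev[-1], random_guess_accuracy)
```

Under this assumption the last energy is 25195 MeV, which matches the "25 GeV" end point after rounding, and the four scans add up to the 20000 events.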

Enhanced set-up

Improvements: add a silicon tracking layer (1 cm) before the ECAL, increase the lead to liquid argon ratio (5:25), increase the depth of the HCAL (from 5 to 8 layers) and add a 0.05 Tesla magnetic field in the x direction.
Effects, in the same order: distinguish charged from neutral particles, increase the stopping power of the HCAL, add detector material to contain the particles better, and distinguish the sign of the charge.
Repeated the event simulation for another 20000 events, processed the detector read-out and trained a neural network.
The best validation accuracy is now 0.9984. The symmetry in the detector activation between electrons and photons, and between neutrons and protons, is removed, so the network can easily discriminate between the four particles.
Calibration quality and energy resolution for neutrons and protons have improved significantly due to the increased stopping power of the HCAL.
Classification could be improved further with some pre-selection criteria.
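
To give a feel for how a magnetic field separates charges, the sketch below uses the standard relation r [m] = p [GeV/c] / (0.3 q B [T]) for a particle moving perpendicular to the field. It is not taken from the report: the 0.2 GeV/c momentum and the 0.5 m path length are hypothetical values chosen for illustration, the function names are invented, and the overall sign convention for the shift is arbitrary. The point is only that opposite charges bend in opposite directions while neutral particles do not bend at all.

```python
import math

def bending_radius_m(p_gev, b_tesla, charge=1):
    """Radius of curvature in metres for momentum p (GeV/c) perpendicular to field B (T)."""
    return p_gev / (0.299792458 * abs(charge) * b_tesla)

def sideways_shift_m(p_gev, b_tesla, path_m, charge):
    """Small-angle sideways displacement after path_m metres; the sign follows the charge."""
    r = bending_radius_m(p_gev, b_tesla, charge)
    return math.copysign(path_m ** 2 / (2.0 * r), charge)

B_FIELD_T = 0.05   # field strength quoted in the text
PATH_M = 0.5       # hypothetical path length inside the field

r_low = bending_radius_m(0.2, B_FIELD_T)
shift_negative = sideways_shift_m(0.2, B_FIELD_T, PATH_M, charge=-1)
shift_positive = sideways_shift_m(0.2, B_FIELD_T, PATH_M, charge=+1)

print(r_low, shift_negative, shift_positive)
```

Because the radius grows linearly with momentum, the displacement shrinks at the high end of the energy scan, so the charge information is strongest for the softest particles.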

Report3: 1 TeV BSM Higgs search

Manual cuts followed by NN classification on variables that are only weakly correlated with the invariant mass, to avoid background sculpting
Curve fitting using the Minuit optimiser to find the best distribution shape: polynomial (H0) vs Gaussian + polynomial (H1)
Fitting the polynomial coefficients, Gaussian width and signal fraction using the invariant mass
Obtaining the chi-squared difference between H0 and H1 and converting it into a statistical significance (final result 7.29 sigma)
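
The last step, turning a chi-squared difference into a significance, can be sketched as below. This is an illustration under stated assumptions rather than the procedure used in the report: it assumes Wilks' theorem applies, so the chi-squared difference follows a chi-squared distribution whose degrees of freedom equal the number of extra free parameters in H1, and it uses a two-sided Gaussian conversion so that one extra parameter gives Z = sqrt(delta chi-squared). Only 1 or 2 extra parameters are supported because those cases have closed forms in the standard library; the function names are invented.

```python
import math
from statistics import NormalDist

def chi2_sf(x, dof):
    """Survival function P(chi2 > x) for 1 or 2 degrees of freedom (closed forms)."""
    if dof == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if dof == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only dof = 1 or 2 are implemented in this sketch")

def significance(delta_chi2, extra_params=1):
    """Gaussian-equivalent significance of a chi-squared difference between nested fits."""
    p_value = chi2_sf(delta_chi2, extra_params)
    # Two-sided convention: Z such that 2 * (1 - Phi(Z)) equals the p-value.
    return -NormalDist().inv_cdf(p_value / 2.0)

z_one = significance(25.0, extra_params=1)   # one extra parameter
z_two = significance(25.0, extra_params=2)   # same improvement, two extra parameters

print(z_one, z_two)
```

The same chi-squared improvement is worth less when H1 has more free parameters, which is why the number of extra parameters has to be fixed before quoting a result in sigma.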

StefKats/Data-Analysis-and-ML
