Heuristics drift time #52


Merged · 45 commits · May 6, 2025

Conversation

@willquinn (Contributor) commented Feb 14, 2025

Features added:

  • HPGe scalar field (LH5) specification and routines to read it into memory
  • HPGe drift-time processor, using ak.transform()
  • drift-time heuristic processor, based on Awkward arrays and sped up with Numba
  • new units.py module with first routines for handling physical units
  • unit tests
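
For reference, the per-event reduction such a drift-time heuristic typically performs can be sketched in plain NumPy. This is a minimal sketch, not the reboost implementation: the actual processor works on Awkward arrays with a Numba-compiled kernel, and the function name, field names, and weighting scheme here are hypothetical.

```python
import numpy as np

def drift_time_heuristic(t_drift, edep, offsets):
    """Energy-weighted mean drift time per event (hypothetical signature).

    t_drift, edep : flat per-step drift times and deposited energies
    offsets       : event boundaries, as in an Awkward ListOffsetArray
    """
    out = np.full(len(offsets) - 1, np.nan)
    for i in range(len(out)):  # in reboost this loop would be Numba-compiled
        sl = slice(offsets[i], offsets[i + 1])
        etot = np.sum(edep[sl])
        if etot > 0:
            out[i] = np.sum(t_drift[sl] * edep[sl]) / etot
    return out
```

The offsets-based layout mirrors how Awkward stores jagged event data, which is what makes a flat Numba kernel applicable in the first place.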

Outstanding tasks:

  • full profiling to benchmark the processing speed
  • validation of the distributions on high-statistics test data

@willquinn marked this pull request as draft February 14, 2025 01:27
@willquinn (Contributor, author)

I will add docstrings to all these functions.

@tdixon97 (Collaborator) commented Feb 14, 2025

  • I think it needs a lot more documentation.
  • We also need to think about which parts of this are general and which are specific: I want to build some basic functionality that lets people do similar things in the future.
  • I would also like much more extensive testing, e.g. some simple cases.
  • Regarding the clustering: do you understand what MPP does? We need to implement that as well.

@willquinn (Contributor, author)

I can use the cases I defined in r90 and add some more simple tests, though the test files only contain 10 events, so they are simple already. I just need to work out what output to expect.

@tdixon97 (Collaborator)

I suggest some more trivial cases where the output is known, e.g. a uniform drift time, a single step, and so on.
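
Such trivial cases are cheap to encode as unit tests. A sketch of what they might look like, assuming the heuristic reduces to an energy-weighted mean drift time (a guess about the implementation; `weighted_drift_time` is a hypothetical stand-in for the real processor):

```python
import numpy as np

def weighted_drift_time(t_drift, edep):
    # stand-in for the real processor: energy-weighted mean drift time
    t, e = np.asarray(t_drift), np.asarray(edep)
    return np.sum(t * e) / np.sum(e)

def test_uniform_drift_time():
    # a uniform drift time must come back unchanged, whatever the weights
    assert weighted_drift_time([250.0] * 5, [1.0, 2.0, 0.5, 3.0, 1.5]) == 250.0

def test_single_step():
    # a single step returns its own drift time regardless of its energy
    assert weighted_drift_time([123.0], [0.5]) == 123.0
```

The value of these cases is that the expected output follows from the definition alone, with no reference data needed.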

@willquinn (Contributor, author)

Regarding the MPP clustering: I checked your function and compared it to the one in MPP, and it seems to be the same. The way I interpret the algorithm in cluster.py is that you check distances and track IDs. I set all steps in the event to the same track ID by using the event ID.

@tdixon97 (Collaborator)

But does MPP cluster together steps in different tracks?

@willquinn (Contributor, author) commented Feb 14, 2025

Here is the MPP clustering code, slightly modified by me, but the main function is the same:

void MPPStepsClusterer::SensVolProcess()
{
  // First figure out what order we want to go through fSteps and fActiveness in. We want to go through the steps in numerical order so that tracks will be grouped together and in order
  fNSensVol++;
  vector<size_t> i_order(fSteps->size());
  iota(i_order.begin(), i_order.end(), 0);
  sort(i_order.begin(), i_order.end(), [this](size_t lhs, size_t rhs) { return make_pair((*fSteps)[lhs]->GetTrackID(), (*fSteps)[lhs]->GetStepNumber()) < make_pair((*fSteps)[rhs]->GetTrackID(), (*fSteps)[rhs]->GetStepNumber()); });
  
  fClusters.clear();
  fNClusters=0;
  fNSteps.clear();
  fClusterSteps.clear();
  fClusterEnergy.clear();
  fClusterActiveness.clear();
  fClusterSize.clear();
  fClusterPositionX.clear();
  fClusterPositionY.clear();
  fClusterPositionZ.clear();
  
  const MGTMCStepData* laststep = NULL;
  size_t icluster = 0;
  vector<vector<size_t> > cluster_steps;
  cluster_steps.reserve(fSteps->size());
  
  // Loop through the steps and add them to the correct cluster
  for(size_t i : i_order) {
    const MGTMCStepData* thisstep = (*fSteps)[i];

    // If this is a prestep, seek out the cluster/step that spawned it
    if(thisstep->GetIsPreStep()) {
      laststep = NULL;
      // loop through each step in each cluster until we find the parent
      for(size_t jcluster=0; jcluster<cluster_steps.size(); jcluster++) {
        for(size_t iStep : cluster_steps[jcluster]) {
          const MGTMCStepData* step = (*fSteps)[iStep];
          if(thisstep->GetParentTrackID() == step->GetTrackID()
            && TMath::Abs(thisstep->GetT() - step->GetT()) < fClusterTime
            && (thisstep->GetPositionVector()-step->GetPositionVector()).Mag() < fClusterDistance) {
            icluster = jcluster;
            laststep = step;
            break;
          }
        }
      if(laststep) break;
      }

      // do not place presteps in clusters
      continue;
    }
    
    // if this step is not within the cluster distance/time of the last step (or there is no last step), start a new cluster
    if(!(laststep
         && ( thisstep->GetTrackID() == laststep->GetTrackID()
              || thisstep->GetParentTrackID() == laststep->GetTrackID() )
         && thisstep->GetStepLength() < fClusterDistance
         && TMath::Abs(laststep->GetT()-thisstep->GetT()) < fClusterTime) ) {
      cluster_steps.emplace_back();
      cluster_steps.back().reserve(fSteps->size());
      icluster = cluster_steps.size()-1;
      fNClusters++;
    }
    // Add this step to the cluster if it is not a prestep. All non-pre-steps are guaranteed to be in a cluster
    cluster_steps[icluster].push_back(i);
    fNTotSteps++;
    laststep = thisstep;
  }
  
  // now make clusters that are averages of lists of steps
  for(size_t iClust=0; iClust<cluster_steps.size(); iClust++) {
    vector<size_t>& steps = cluster_steps[iClust];
    
    // Get new cluster from cache (or create new one)
    if(fClusterIt==fClusterCache.end())
      fClusterIt = fClusterCache.emplace(fClusterIt);
    MGTMCStepData& clust = *fClusterIt;
    fClusterIt++;
    // initialize cluster with first step
    clust = *((*fSteps)[steps[0]]);
    clust.SetEdep(0.);
    fClusters.push_back(&clust);
    fClusterSteps.emplace_back(0);
    fClusterEnergy.push_back(0.);
    fClusterActiveness.push_back(0.);
    fClusterSize.push_back(0.);
    fClusterPositionX.push_back(0.);
    fClusterPositionY.push_back(0.);
    fClusterPositionZ.push_back(0.);
    
    // loop through other steps and average them in
    for(size_t iStep : steps) {
      const MGTMCStepData* step = (*fSteps)[iStep];
      fClusterSteps[iClust].push_back(step);
      
      double activeness = ( fActiveness!=NULL ? (*fActiveness)[iStep] : 1. );
      clust.SetEdep( clust.GetEdep()+step->GetEdep() );
      double w = ( clust.GetEdep()==0 ? 1. : step->GetEdep()/clust.GetEdep());
      
      fClusterEnergy[iClust] += step->GetEdep()*activeness;
      fClusterActiveness[iClust] += w*(activeness - fClusterActiveness[iClust]);
      fClusterSize[iClust] += (1.-w) * ( (step->GetPositionVector()-clust.GetPositionVector()).Mag2() - fClusterSize[iClust] );
      
      clust.SetT( w*step->GetT() + (1.-w)*clust.GetT());
      clust.SetX( w*step->GetX() + (1.-w)*clust.GetX());
      clust.SetY( w*step->GetY() + (1.-w)*clust.GetY());
      clust.SetZ( w*step->GetZ() + (1.-w)*clust.GetZ());
      clust.SetLocalX( w*step->GetLocalX() + (1.-w)*clust.GetLocalX());
      clust.SetLocalY( w*step->GetLocalY() + (1.-w)*clust.GetLocalY());
      clust.SetLocalZ( w*step->GetLocalZ() + (1.-w)*clust.GetLocalZ());
    }
    if(fWindowT0) clust.SetTimeOffset(clust.GetT() - *fWindowT0);
    fNSteps.push_back(fClusterSteps[iClust].size());
    fClusterSize[iClust] = sqrt(fClusterSize[iClust]);
    fClusterPositionX[iClust] = clust.GetLocalX();
    fClusterPositionY[iClust] = clust.GetLocalY();
    fClusterPositionZ[iClust] = clust.GetLocalZ();
  }

  fTotalNClusters += fNClusters;
  if(GetVerbosity() > 0) {
    for(size_t iC=0; iC<fClusters.size(); iC++) {
      const MGTMCStepData* clust = fClusters[iC];
      printf("  %4lu  %8d  %5lu  %6lu  %8.2e  (%6.2f, %6.2f, %6.2f)  %7.2f  %5.3f  %8.2e\n",
             IWindow(), SensVolID(), iC, fNSteps[iC],
             clust->GetTimeOffset(), clust->GetLocalX(), clust->GetLocalY(),
             clust->GetLocalZ(), clust->GetEdep()*1000,
             fClusterActiveness[iC], fClusterSize[iC]);
    }
  }
}
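
The core decision in the loop above, whether a step may join the running cluster or a new cluster must be opened, boils down to a track-relation check plus distance and time thresholds. A stripped-down Python sketch of just that logic (presteps and the averaging pass are omitted, and the dict field names are made up for illustration):

```python
def cluster_steps(steps, cluster_distance, cluster_time):
    """Group steps sequentially, mirroring the MPP condition: a step joins
    the current cluster only if it belongs to the last step's track (or is
    its child), its step length is below cluster_distance, and its time is
    within cluster_time of the previous step; otherwise a new cluster opens.

    Each step is a dict with keys: track_id, parent_id, step_len, t.
    """
    clusters, last = [], None
    for step in steps:
        same_track = last is not None and (
            step["track_id"] == last["track_id"]
            or step["parent_id"] == last["track_id"]
        )
        close = (
            last is not None
            and step["step_len"] < cluster_distance
            and abs(step["t"] - last["t"]) < cluster_time
        )
        if not (same_track and close):
            clusters.append([])  # start a new cluster
        clusters[-1].append(step)
        last = step
    return clusters
```

Note that, as in the C++ loop, steps of a daughter track can be merged into the parent's cluster because the parent-track relation is accepted, which answers the question about clustering across tracks.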

@gipert (Member) commented Mar 19, 2025

Hi Willie, what's the status here?

@willquinn (Contributor, author)

I have restructured some of the code and added some docstrings.

My implementation of some clustering needs some work; I'm also not sure if it's in the right place.

@gipert (Member) commented Apr 30, 2025

[image attachment]

@willquinn (Contributor, author)

What is the spacing on the grid?

@gipert (Member) commented Apr 30, 2025

Roughly 1 mm (to generate some map quickly). Anyway, have a look at the script I uploaded.

codecov bot commented Apr 30, 2025

Codecov Report

Attention: Patch coverage is 77.67857% with 25 lines in your changes missing coverage. Please review.

Project coverage is 69.98%. Comparing base (e1626c8) to head (a8f11cb).
Report is 2 commits behind head on main.

Files with missing lines     Patch %   Lines
src/reboost/hpge/psd.py       59.25%   22 Missing ⚠️
src/reboost/hpge/utils.py     92.00%    2 Missing ⚠️
src/reboost/units.py          96.96%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #52      +/-   ##
==========================================
+ Coverage   69.55%   69.98%   +0.43%     
==========================================
  Files          25       27       +2     
  Lines        1849     1959     +110     
==========================================
+ Hits         1286     1371      +85     
- Misses        563      588      +25     


@gipert (Member) commented May 2, 2025

Got the code working, even if I don't like it so much, as I had to make an extra copy (np.asarray()). Will need to evaluate performance and decide what to do.
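
For context on why the extra copy matters: np.asarray is zero-copy only when its input is already a NumPy array of the requested dtype; materializing data from another container generally forces an allocation. A quick NumPy-only illustration of these copy semantics (this is generic NumPy behavior, not the reboost code path itself):

```python
import numpy as np

a = np.arange(5, dtype=np.float64)

view = np.asarray(a)  # already a matching ndarray: no copy is made
assert np.shares_memory(a, view)

cast = np.asarray(a, dtype=np.float32)  # dtype conversion forces a copy
assert not np.shares_memory(a, cast)

from_list = np.asarray([0.0, 1.0, 2.0])  # a Python list always allocates
assert not np.shares_memory(a, from_list)
```

Whether the Awkward-to-NumPy conversion here can be made zero-copy depends on the array's layout (regular and contiguous vs. jagged), which is presumably what the performance evaluation will show.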

@gipert marked this pull request as ready for review May 2, 2025 15:08
@gipert merged commit 48d331c into legend-exp:main May 6, 2025
15 checks passed
5 participants