Skip to content

Heuristics drift time #52

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 45 commits into from
May 6, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
45 commits
Select commit Hold shift + click to select a range
629f2db
first commit of drift time class that reads drift time maps and can c…
willquinn Feb 13, 2025
95ccbb8
first commit of drift time heuristic code
willquinn Feb 13, 2025
f5bfd15
some bug fixes
willquinn Feb 14, 2025
dc0d08b
test files of appropriate file size
willquinn Feb 14, 2025
69c4ce6
drift time maps now in lh5
willquinn Feb 14, 2025
5b3aeb6
Initial commit of dt heuristic test
willquinn Feb 14, 2025
b31b00c
bug fix on lh5 import
willquinn Feb 14, 2025
3e78ca9
test works
willquinn Feb 14, 2025
1dc5298
restructuring
willquinn Feb 14, 2025
a0b37bf
making the drift time map reading generic
willquinn Feb 14, 2025
63cc5ee
Merge branch 'legend-exp:main' into heuristics_drift_time
willquinn Mar 7, 2025
3ca785d
Merge branch 'legend-exp:main' into heuristics_drift_time
willquinn Mar 19, 2025
fe5e816
some large changes to make reading a drift time map more generic
willquinn Mar 19, 2025
82f907c
added some doc strings and fixed some bugs
willquinn Mar 19, 2025
3bf8c5e
updates
willquinn Apr 7, 2025
a252ae1
Merge branch 'legend-exp:main' into heuristics_drift_time
willquinn Apr 7, 2025
d13a83f
prettifing
willquinn Apr 7, 2025
808a2e2
changed the format of the file so there is only onle files with all t…
willquinn Apr 8, 2025
8460bf5
removed the clustering
willquinn Apr 9, 2025
f101c90
removed clustering
willquinn Apr 9, 2025
fde7e73
removed clustering
willquinn Apr 9, 2025
6da248f
removed the drfit time file. Drift times is now its own processor. Dt…
willquinn Apr 9, 2025
67063ac
Fixed some of the doc strings. drift_time is a new processor, removed…
willquinn Apr 9, 2025
130bc95
Added a relative position function and fixed some interpolation funct…
willquinn Apr 9, 2025
a8a75ad
Fixed the tests so they work with the new format of the processors
willquinn Apr 9, 2025
c60010b
fix files path issues
willquinn Apr 9, 2025
baf0c8d
add generic way to read gridded scalar fields on (r,z) hpge plane
gipert Apr 25, 2025
71ed655
pass kwargs to RegularGridInterpolator
gipert Apr 29, 2025
4f2009f
Merge remote-tracking branch 'origin/main' into heuristics_drift_time
gipert Apr 29, 2025
f5fe9e6
add julia script to make test dt map and simulation
gipert Apr 29, 2025
1c05dc0
add some tests
gipert Apr 29, 2025
beece14
adjust dt map grid
gipert Apr 30, 2025
92c0a13
parallelize dt map gen script
gipert Apr 30, 2025
f0d6a17
re-write the drift_time processor
gipert Apr 30, 2025
9effb8f
add awkward-numba implementation of the dt-heuristic processor
gipert May 2, 2025
347020c
Merge branch 'main' into heuristics_drift_time
gipert May 2, 2025
ef2e117
fix docs
gipert May 2, 2025
a3d43e6
fix numba proc
gipert May 2, 2025
96cafcf
add gitattributes
gipert May 2, 2025
8e912ba
add gitattributes
gipert May 2, 2025
206bb1b
Merge branch 'heuristics_drift_time' of github.com:willquinn/reboost …
gipert May 2, 2025
cda11f3
use detector from test data
gipert May 5, 2025
198806a
add routine to convert pint units to valid lh5 lgdo attrs
gipert May 5, 2025
fa675f7
fix tests
gipert May 5, 2025
a8f11cb
several changes
gipert May 6, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitattributes
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
*.gdml linguist-generated=true
1 change: 1 addition & 0 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ repos:
- id: mixed-line-ending
- id: name-tests-test
args: ["--pytest-test-first"]
exclude: tests/hpge/simulation/.*.py
- id: requirements-txt-fixer
- id: trailing-whitespace

Expand Down
4 changes: 3 additions & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@ dependencies = [
"pandas",
"matplotlib",
"pygama",
"pyg4ometry",
]
dynamic = [
"version",
Expand Down Expand Up @@ -75,7 +76,7 @@ test = [
"legend-pygeom-hpges",
"legend-pygeom-tools",
"pyg4ometry",
"pylegendtestdata",
"pylegendtestdata>=0.6",

]

Expand Down Expand Up @@ -146,6 +147,7 @@ ignore = [
"D213", # Multi-line docstring summary should start at the first line
"D401", # Summary does not need to be in imperative mood
"D413", # No blank line after last section in docstring
"PLC2401", # We like non-ASCII characters for math
]
isort.required-imports = ["from __future__ import annotations"]
# see also napoleon config in docs/source/conf.py
Expand Down
188 changes: 176 additions & 12 deletions src/reboost/hpge/psd.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,29 +3,33 @@
import logging

import awkward as ak
import numba
import numpy as np
from lgdo import Array
import pint
import pyg4ometry
from lgdo import Array, VectorOfVectors
from numpy.typing import ArrayLike, NDArray

from .. import units
from ..units import ureg as u
from .utils import HPGeScalarRZField

log = logging.getLogger(__name__)


def r90(edep: ak.Array, xloc: ak.Array, yloc: ak.Array, zloc: ak.Array) -> Array:
"""Computes R90 for each hit in a ged.
"""R90 HPGe pulse shape heuristic.

Parameters
----------
edep
awkward array of energy
array of energy.
xloc
awkward array of x coordinate position
array of x coordinate position.
yloc
awkward array of y coordinate position
array of y coordinate position.
zloc
awkward array of z coordinate position

Returns
-------
r90
array of z coordinate position.
"""
tot_energy = ak.sum(edep, axis=-1, keepdims=True)

Expand All @@ -44,15 +48,15 @@ def eweight_mean(field, energy):
sorted_dist = dist[sorted_indices]
sorted_edep = edep[sorted_indices]

def cumsum(layout, **_kwargs):
def _ak_cumsum(layout, **_kwargs):
if layout.is_numpy:
return ak.contents.NumpyArray(np.cumsum(layout.data))

return None

# Calculate the cumulative sum of energies for each event
cumsum_edep = ak.transform(
cumsum, sorted_edep
_ak_cumsum, sorted_edep
) # Implement cumulative sum over whole jagged array
if len(edep) == 1:
cumsum_edep_corrected = cumsum_edep
Expand All @@ -72,3 +76,163 @@ def cumsum(layout, **_kwargs):
r90 = sorted_dist[r90_indices]

return Array(ak.flatten(r90).to_numpy())


def drift_time(
xloc: ArrayLike,
yloc: ArrayLike,
zloc: ArrayLike,
dt_map: HPGeScalarRZField,
coord_offset: pint.Quantity | pyg4ometry.gdml.Position = (0, 0, 0) * u.m,
) -> VectorOfVectors:
"""Calculates drift times for each step (cluster) in an HPGe detector.

Parameters
----------
xloc
awkward array of x coordinate position.
yloc
awkward array of y coordinate position.
zloc
awkward array of z coordinate position.
dt_map
the drift time map.
coord_offset
this `(x, y, z)` coordinates will be subtracted to (xloc, yloc, zloc)`
before drift time computation. The length units must be the same as
`xloc`, `yloc` and `zloc`.
"""
# sanitize coord_offset
coord_offset = units.pg4_to_pint(coord_offset)

# unit handling (for matching with drift time map units)
xu, yu = [units.units_convfact(data, dt_map.r_units) for data in (xloc, yloc)]
zu = units.units_convfact(zloc, dt_map.z_units)

# unwrap LGDOs
xloc, yloc, zloc = [units.unwrap_lgdo(data)[0] for data in (xloc, yloc, zloc)]

# awkward transform to apply the drift time map to the step coordinates
def _ak_dt_map(layouts, **_kwargs):
if layouts[0].is_numpy and layouts[1].is_numpy:
return ak.contents.NumpyArray(
dt_map.φ(np.stack([layouts[0].data, layouts[1].data], axis=1))
)

return None

# transform coordinates
xloc = xu * xloc - coord_offset[0].to(dt_map.r_units).m
yloc = yu * yloc - coord_offset[1].to(dt_map.r_units).m
zloc = zu * zloc - coord_offset[2].to(dt_map.z_units).m

# evaluate the drift time
dt_values = ak.transform(
_ak_dt_map,
np.sqrt(xloc**2 + yloc**2),
zloc,
)

return VectorOfVectors(
dt_values,
attrs={"units": units.unit_to_lh5_attr(dt_map.φ_units)},
)


def drift_time_heuristic(
drift_time: ArrayLike,
edep: ArrayLike,
) -> Array:
"""HPGe drift-time-based pulse-shape heuristic.

See :func:`_drift_time_heuristic_impl` for a description of the algorithm.

Parameters
----------
drift_time
drift time of charges originating from steps/clusters. Can be
calculated with :func:`drift_time`.
edep
energy deposited in step/cluster (same shape as `drift_time`).
"""
# extract LGDO data and units
drift_time, t_units = units.unwrap_lgdo(drift_time)
edep, e_units = units.unwrap_lgdo(edep)

# we want to attach the right units to the dt heuristic, if possible
attrs = {}
if t_units is not None and e_units is not None:
attrs["units"] = units.unit_to_lh5_attr(t_units / e_units)

return Array(_drift_time_heuristic_impl(drift_time, edep), attrs=attrs)


@numba.njit(cache=True)
def _drift_time_heuristic_impl(
dt: ak.Array,
edep: ak.Array,
) -> NDArray:
r"""Low-level implementation of the HPGe drift-time-based pulse-shape heuristic.

Accepts Awkward arrays and uses Numba to speed up the computation.

For each hit (collection of steps), the drift times and corresponding
energies are sorted in ascending order. The function finds the optimal
split point :math:`m` that maximizes the *identification metric*:

.. math::

I = \frac{|T_1 - T_2|}{E_\text{s}(E_1, E_2)}

where:

.. math::

T_1 = \frac{\sum_{i < m} t_i E_i}{\sum_{i < m} E_i}
\quad \text{and} \quad
T_2 = \frac{\sum_{i \geq m} t_i E_i}{\sum_{i \geq m} E_i}

are the energy-weighted mean drift times of the two groups.

.. math::

E_\text{scale}(E_1, E_2) = \frac{1}{\sqrt{E_1 E_2}}

is the scaling factor.

The function iterates over all possible values of :math:`m` and selects the
maximum `I` as the drift time heuristic value.
"""
dt_heu = np.zeros(len(dt))

# loop over hits
for i in range(len(dt)):
t = np.asarray(dt[i])
e = np.asarray(edep[i])

valid_idx = np.where(e > 0)[0]
if len(valid_idx) < 2:
continue

t = t[valid_idx]
e = e[valid_idx]

sort_idx = np.argsort(t)
t = t[sort_idx]
e = e[sort_idx]

max_id_metric = 0
for j in range(1, len(t)):
e1 = np.sum(e[:j])
e2 = np.sum(e[j:])

t1 = np.sum(t[:j] * e[:j]) / e1
t2 = np.sum(t[j:] * e[j:]) / e2

id_metric = abs(t1 - t2) * np.sqrt(e1 * e2)

max_id_metric = max(max_id_metric, id_metric)

dt_heu[i] = max_id_metric

return dt_heu
78 changes: 78 additions & 0 deletions src/reboost/hpge/utils.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
from __future__ import annotations

from typing import Callable, NamedTuple

import lgdo
import numpy as np
import pint
from dbetto import AttrsDict
from lgdo import lh5
from scipy.interpolate import RegularGridInterpolator


class HPGeScalarRZField(NamedTuple):
"""A scalar field defined in the cylindrical-like (r, z) HPGe plane."""

φ: Callable
"Scalar field, function of the coordinates (r, z)."
r_units: pint.Unit
"Physical units of the coordinate `r`."
z_units: pint.Unit
"Physical units of the coordinate `z`."
φ_units: pint.Unit
"Physical units of the field."


def get_hpge_scalar_rz_field(
filename: str, obj: str, field: str, out_of_bounds_val: int | float = np.nan, **kwargs
) -> HPGeScalarRZField:
"""Create an interpolator for a gridded scalar HPGe field defined on `(r, z)`.

Reads from disk the following data structure: ::

FILENAME/
└── OBJ · struct{r,z,FIELD}
├── r · array<1>{real} ── {'units': 'UNITS'}
├── z · array<1>{real} ── {'units': 'UNITS'}
└── FIELD · array<2>{real} ── {'units': 'UNITS'}

where ``FILENAME``, ``OBJ`` and ``FIELD`` are provided as
arguments to this function. `obj` is a :class:`~lgdo.types.struct.Struct`,
`r` and `z` are one dimensional arrays specifying the radial and z
coordinates of the rectangular grid — not the coordinates of each single
grid point. In this coordinate system, the center of the p+ contact surface
is at `(0, 0)`, with the p+ contact facing downwards. `field` is instead a
two-dimensional array specifying the field value at each grid point. The
first and second dimensions are `r` and `z`, respectively. NaN values are
interpreted as points outside the detector profile in the `(r, z)` plane.

Before returning a :class:`HPGeScalarRZField`, the gridded field is fed to
:class:`scipy.interpolate.RegularGridInterpolator`.

Parameters
----------
filename
name of the LH5 file containing the gridded scalar field.
obj
name of the HDF5 dataset where the data is saved.
field
name of the HDF5 dataset holding the field values.
out_of_bounds_val
value to use to replace NaNs in the field values.
"""
data = lh5.read(obj, filename)

if not isinstance(data, lgdo.Struct):
msg = f"{obj} in {filename} is not an LGDO Struct"
raise ValueError(msg)

data = AttrsDict(
{
k: np.nan_to_num(data[k].view_as("np", with_units=True), nan=out_of_bounds_val)
for k in ("r", "z", field)
}
)

interpolator = RegularGridInterpolator((data.r.m, data.z.m), data[field].m, **kwargs)

return HPGeScalarRZField(interpolator, data.r.u, data.z.u, data[field].u)
Loading
Loading