
[Enhancement]: Performance benchmarking #729

Open
tomvothecoder opened this issue Jan 29, 2025 · 0 comments
Labels
type: enhancement New enhancement request

Comments


tomvothecoder commented Jan 29, 2025

Is your feature request related to a problem?

We should run performance benchmarks of xCDAT across different workflows and data types/sizes.

Examples include (WIP):

  • High resolution data

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

Performance is a two-part problem:

  1. Users need to chunk their datasets appropriately, based on the shape of the data, to get optimal performance.
    • This is often the bottleneck for most users; it is not trivial to work out optimal chunk sizes.
    • Setting up a Dask cluster also helps with performance monitoring, but it is another barrier to entry.
    • Is there a general way to chunk across different datasets of varying sizes/dimensions?
  2. Is xCDAT optimized to work on these chunks? We use Xarray APIs for core operations (e.g., grouping), which operate in parallel on Dask arrays.
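On point 1, a minimal sketch of one possible chunking heuristic, using only the standard library. The function name (`suggest_chunks`) and the leading-dimension-only strategy are assumptions for illustration, not an xCDAT API; the ~100 MB chunk target follows Dask's general guidance on chunk sizing.

```python
import math

def suggest_chunks(shape, dtype_size=8, target_bytes=100 * 2**20):
    """Suggest per-dimension chunk lengths so each chunk is near
    ``target_bytes`` (roughly 100 MB, per Dask's general guidance).

    Chunks only the leading (time-like) dimension and leaves the
    spatial dimensions whole, which suits per-timestep workflows.
    """
    # Bytes in one "slab": a single step along the leading dimension.
    slab_bytes = dtype_size * math.prod(shape[1:])
    # How many leading-dimension steps fit within the target chunk size.
    steps = max(1, target_bytes // slab_bytes)
    return (min(shape[0], steps),) + tuple(shape[1:])

# e.g. ten years of daily 0.25-degree float64 data:
# (time, lat, lon) = (3650, 721, 1440)
print(suggest_chunks((3650, 721, 1440)))  # → (12, 721, 1440)
```

The result could then be passed as the `chunks` argument when opening a dataset with xarray (e.g. `xr.open_dataset(path, chunks={"time": 12})`), though whether a general heuristic like this holds up across varying dataset shapes is exactly the open question above.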