
[Enhancement]: Performance benchmarking #729

Open
tomvothecoder opened this issue Jan 29, 2025 · 0 comments
Labels
type: enhancement New enhancement request

Comments


tomvothecoder commented Jan 29, 2025

Is your feature request related to a problem?

We should run performance benchmarks of xCDAT across different workflows and data types/sizes.

Examples include (WIP):

  • High resolution data

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

Performance is a two-part problem:

  1. Users need to chunk their datasets appropriately, based on the shape of the data, to get optimal performance.
    • This is often the bottleneck for most users; it is not trivial to work out optimal chunk sizes.
    • Setting up a Dask cluster also helps with performance monitoring, but it is another barrier to entry.
    • Is there a general way to chunk across different datasets of varying sizes/dimensions?
  2. Is xCDAT optimized to work on these chunks? We use Xarray APIs for core operations (e.g., grouping), which operate in parallel on Dask arrays.
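On point 1, a minimal sketch of one possible chunking heuristic, using only the standard library. The function name (`suggest_chunks`) and the leading-dimension-only strategy are assumptions for illustration, not an xCDAT API; the ~100 MB chunk target follows Dask's general guidance on chunk sizing.

```python
import math

def suggest_chunks(shape, dtype_size=8, target_bytes=100 * 2**20):
    """Suggest per-dimension chunk lengths so each chunk is near
    ``target_bytes`` (roughly 100 MB, per Dask's general guidance).

    Chunks only the leading (time-like) dimension and leaves the
    spatial dimensions whole, which suits per-timestep workflows.
    """
    # Bytes in one "slab": a single step along the leading dimension.
    slab_bytes = dtype_size * math.prod(shape[1:])
    # How many leading-dimension steps fit within the target chunk size.
    steps = max(1, target_bytes // slab_bytes)
    return (min(shape[0], steps),) + tuple(shape[1:])

# e.g. ten years of daily 0.25-degree float64 data:
# (time, lat, lon) = (3650, 721, 1440)
print(suggest_chunks((3650, 721, 1440)))  # → (12, 721, 1440)
```

The result could then be passed as the `chunks` argument when opening a dataset with xarray (e.g. `xr.open_dataset(path, chunks={"time": 12})`), though whether a general heuristic like this holds up across varying dataset shapes is exactly the open question above.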