$R^2$ has a similar (if not the same) formula as NSE. Note that $R^2$'s context is different: it is popularly (but not exclusively) used for linear regression, which imposes the restriction that the model cannot perform worse than the null model (the mean). Hence, typically $0 \leq R^2 \leq 1$, although in reality the codomains of the two functions are the same: $-\infty < R^2 = \text{NSE} \leq 1$ (citation required).
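To make the shared formula concrete, here is a minimal sketch (function name `nse` is illustrative, not the `scores` API) of the quantity both metrics compute: one minus the model's squared error over the squared error of the mean baseline.

```python
import numpy as np

def nse(obs: np.ndarray, sim: np.ndarray) -> float:
    """Nash-Sutcliffe efficiency: 1 - SSE(model) / SSE(mean baseline).

    Algebraically the same expression as the common R^2 formula:
    1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

obs = np.array([1.0, 2.0, 3.0, 4.0])
print(nse(obs, obs))                      # perfect model -> 1.0
print(nse(obs, np.full(4, obs.mean())))   # mean "null" model -> 0.0
```

A model worse than the mean baseline yields a negative value, which is where the $-\infty$ end of the codomain comes from.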
From an implementation point of view, there is:
an opportunity to refactor: include $R^2$ and reuse the same logic for NSE.
a further opportunity, if we proceed with 1., to make `mse` more efficient, e.g. using `numba` or `rust`, as this function seems to be pivotal in the construction of both metrics.
(note: the use of `rust`, `numba`, or even `opencl` for heterogeneous compute is something I briefly explored with the Fractions Skill Score (FSS) and found significant performance gains, but it currently lives in an experimental branch)
Note
Incidentally - and this needs verifying - this is also the reduced form of the square of Pearson's correlation ($\rho^2$), if the model used is a least squares regression (LSR). This is because the objective of LSR is to minimize the covariance between $E_i$ and $X_i$, where $X_i$ is the observation and $E_i = Y_i - X_i$, given the fitted model datum $Y_i$. Doing so reduces $\rho^2$ to $R^2$.
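A quick numerical check (not a proof) of that claim: for an ordinary least-squares fit with an intercept, $R^2$ computed as $1 - \text{SSE}/\text{SST}$ coincides with the squared Pearson correlation between observations and fitted values.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
obs = 2.0 * x + rng.normal(scale=0.1, size=x.size)  # synthetic observations

# Ordinary least-squares fit: obs ~ a*x + b
a, b = np.polyfit(x, obs, 1)
fit = a * x + b

# R^2 via the NSE-style formula
sse = np.sum((obs - fit) ** 2)
sst = np.sum((obs - obs.mean()) ** 2)
r2 = 1.0 - sse / sst

# Squared Pearson correlation between obs and the fitted values
rho2 = np.corrcoef(obs, fit)[0, 1] ** 2

print(np.isclose(r2, rho2))  # True
```

For a non-least-squares model (e.g. a physical model), the two quantities generally diverge, which is what makes comparing them informative.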
This may have interesting applications when comparing both scores, $R^2$ (or NSE) and $\rho^2$, for a "naive" optimizer like least squares versus a physical model, where we may be able to understand how much explainability a model is able to capture over least squares (or linear) regression, and under what circumstances.
i.e. while $R^2$ tells us how much of the variance in the data is explained by a model, comparing it with an LSR fit will tell us how much of the variance attributed to "non-linearity" is explained by the model (noting that this is not a guarantee, as the scores can be similar despite different things being optimized: one model may excel at linear patterns and the other at non-linear ones, though I suspect bringing $\rho$ into the picture may help with that). It is likely that something like this already exists in the literature though.
#773 i.e. Spearman's correlation uses Pearson's correlation, e.g. via `xr.corr`. But this is also used by the Pearson's implementation in `scores`; it may be worth consolidating other areas where these kinds of similarities exist, especially for more basic scoring functions like `mse` and `xr.corr`.
In other words, I think it's worth separating the low-level computations from the higher-level scores (on a case-by-case basis, initially anyway). This would be especially useful if, down the line, we want to provide alternative `numba` or `rust` backends.
A similar example can be seen in the fractions skill score (`fss`), where the "integral sum" is actually computed separately; in theory it can be refactored out and used on its own by any higher-level score that needs to perform multidimensional sliding-window sums.
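To illustrate the kind of low-level primitive meant here, below is a minimal sketch of a 2-D "integral sum" (summed-area table). Function names are illustrative and not taken from `scores`; the idea is that once the table is built, any window sum is an O(1) lookup, which is what makes it reusable across sliding-window scores.

```python
import numpy as np

def integral_table(field: np.ndarray) -> np.ndarray:
    """Cumulative 2-D sum, zero-padded so indexing needs no edge cases."""
    table = np.zeros((field.shape[0] + 1, field.shape[1] + 1))
    table[1:, 1:] = np.cumsum(np.cumsum(field, axis=0), axis=1)
    return table

def window_sum(table: np.ndarray, i: int, j: int, h: int, w: int) -> float:
    """Sum of field[i:i+h, j:j+w] in O(1) via four table lookups."""
    return (table[i + h, j + w] - table[i, j + w]
            - table[i + h, j] + table[i, j])

field = np.arange(16.0).reshape(4, 4)
table = integral_table(field)
print(window_sum(table, 1, 1, 2, 2))      # 30.0
print(field[1:3, 1:3].sum())              # 30.0, same result
```

A higher-level score like FSS would build the table once per field and then evaluate every window position against it, rather than re-summing each window.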
Originally posted by @nikeethr in #815 (comment)