ddf diff can not handle multiple on_key option #131

Open
semio opened this issue Apr 16, 2021 · 2 comments

Comments

@semio
Owner

semio commented Apr 16, 2021

Some diff stats, such as rval, require grouping the data by one dimension (usually geo) before computing the stats. But sometimes there is more than one candidate for the grouping dimension. For example, in SG we have datapoints by geo and by global/regions, so some datapoints files should be grouped by global/region first. But I can only supply one value to the on_key option of ddf diff, which causes an error.

@jheeffer

jheeffer commented Apr 16, 2021

Isn't it always grouped by all keys except time, if time is in the key? So within each group only time varies in the key?

I guess rval only makes sense when the data is a time series?
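
A minimal pandas sketch of that idea, for illustration only. It is not the ddf diff implementation: the helper names `group_columns` and `rval_per_group`, the column names and the sample values are all hypothetical, and it assumes the key has at least one non-time column.

```python
import pandas as pd


def group_columns(key, time_col="time"):
    """Return every key column except the time column (hypothetical helper)."""
    return [k for k in key if k != time_col]


def rval_per_group(old, new, key, indicator, time_col="time"):
    """Pearson r between old and new values, one value per group.

    Assumes the key contains at least one non-time column.
    """
    old_col, new_col = indicator + "_old", indicator + "_new"
    # Align old and new observations on the full primary key.
    merged = old.merge(new, on=key, suffixes=("_old", "_new"))
    groups = group_columns(key, time_col)
    return merged.groupby(groups)[[old_col, new_col]].apply(
        lambda g: g[old_col].corr(g[new_col])
    )


# Made-up sample data keyed by geo and time.
key = ["geo", "time"]
old = pd.DataFrame({
    "geo": ["swe"] * 3 + ["nor"] * 3,
    "time": [2000, 2001, 2002] * 2,
    "population": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
})
new = old.assign(population=[2.0, 4.0, 6.0, 3.0, 2.0, 1.0])

per_geo = rval_per_group(old, new, key, "population")
print(per_geo)
```

Because the grouping columns are derived from the key, the same call would work for a file keyed by `["global", "time"]` without a separate on_key value.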

@semio
Owner Author

semio commented Apr 20, 2021

No, rval can be used to compare other types of data. Our goal is to tell how different the new and old datapoints are, so in fact we just need to ensure that we are comparing the same observation for each datapoint, which means that there is no need to do any grouping at all.
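
A rough sketch of what "no grouping" could look like, assuming the old and new datapoints are pandas DataFrames that share the same key columns. The function name `rval_all`, the column names and the sample values are hypothetical and not taken from ddf_utils.

```python
import pandas as pd


def rval_all(old, new, key, indicator):
    """Pearson r over all datapoints that exist in both versions."""
    # The inner merge on the full key pairs each old observation with the
    # same observation in the new data; unmatched rows are dropped.
    merged = old.merge(new, on=key, how="inner", suffixes=("_old", "_new"))
    return merged[indicator + "_old"].corr(merged[indicator + "_new"])


# Made-up sample data keyed by global and time.
key = ["global", "time"]
old = pd.DataFrame({
    "global": ["world"] * 4,
    "time": [2000, 2001, 2002, 2003],
    "population": [1.0, 2.0, 3.0, 4.0],
})
new = pd.DataFrame({
    "global": ["world"] * 5,
    "time": [2000, 2001, 2002, 2003, 2004],  # 2004 exists only in new
    "population": [1.0, 2.0, 3.0, 5.0, 9.0],
})

r = rval_all(old, new, key, "population")
print(round(r, 3))
```

Since the merge uses whatever key the file has, no on_key value is needed here.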

I guess I grouped them by country and calculated the average rval to show the average diff across all countries. But I am not sure whether the average is a better indicator than the rval over all datapoints. Also, it seems the average rval is not meaningful, see https://www.researchgate.net/post/average_of_Pearson_correlation_coefficient_values
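
A small made-up example of how the two summaries can disagree. The data and column names are invented for illustration, and this only shows that the numbers differ, not which one is the better indicator.

```python
import pandas as pd

# Already-aligned old/new values for two hypothetical countries.
merged = pd.DataFrame({
    "geo": ["a"] * 3 + ["b"] * 3,
    "old": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    "new": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
})

# rval per country, then averaged.
per_geo = merged.groupby("geo")[["old", "new"]].apply(
    lambda g: g["old"].corr(g["new"])
)
mean_r = per_geo.mean()

# rval over all datapoints at once, without grouping.
pooled_r = merged["old"].corr(merged["new"])

print(mean_r, pooled_r)
```

Here each country is perfectly correlated with itself, so the average rval is 1, while the rval over all datapoints is much lower because country b changed scale.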

I suggest that we remove the grouping for now and, if necessary, check with our statistician to see which indicators should be used.
