Some information about sizing #686
Left some comments/suggestions but LGTM, thanks for writing it!
## Statistical Power

Statistical power is the probability that the experiment measures a stat sig impact, given that such an impact really exists ([ref](<https://en.wikipedia.org/wiki/Power_(statistics)>)). It is a function of the analysis (metric & statistical test), the design (number of branches), the number of users in the experiment, and the size of the effect.
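To make the dependence on sample size and effect size concrete, here is a minimal sketch of a power calculation. It assumes a two-sided, two-sample z-test under a normal approximation with equal branch sizes; this is an illustrative stand-in and not necessarily the test Jetstream runs. The function name `two_sample_power` and the example effect size are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(effect_size: float, n_per_branch: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test (an assumed test).

    effect_size is the difference in means divided by the (assumed common)
    standard deviation; n_per_branch is the number of users in each branch.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the effect really exists.
    shift = effect_size * sqrt(n_per_branch / 2)
    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical standardized effect of 0.02 at three branch sizes.
    for n in (1_000, 10_000, 100_000):
        print(n, round(two_sample_power(0.02, n), 3))
```

With no true effect, the function returns the false positive rate `alpha`, which is a quick sanity check on the formula. The number of branches is not modeled here; the sketch compares a single treatment branch against control.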
Should we expand terms like "stat sig" or assume that the audience of these pages understands what they mean?
docs/deep-dives/data/sizing.md
Outdated
### Commentary on absolute size

Statistical power generally (for the statistical tests that Jetstream performs) scales proportionally to the square root of the number of users. As a result, the power gains from increasing the size of an experiment depend on the actual size being considered. That is to say that going from 1% -> 2% has a much larger impact on the power than going from 9% -> 10%. One corollary of this is that, as a general rule, the power gains of going beyond 10% of users per branch are rather minimal. Generally, a 20% experiment (10%/branch) has only slightly higher power than a 100% experiment. There are of course exceptions depending on the absolute size of the targeted population, the smaller the targeted population, the less this rule applies.
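The diminishing returns described above can be sketched numerically. The example below assumes a two-sided, two-sample z-test under a normal approximation, a hypothetical targeted population of 1,000,000 users, and a hypothetical standardized effect size of 0.02; none of these values come from Jetstream, and the helper `power_at_fraction` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

POPULATION = 1_000_000  # hypothetical size of the targeted population
EFFECT_SIZE = 0.02      # hypothetical standardized effect (mean diff / std dev)
ALPHA = 0.05


def power_at_fraction(fraction_per_branch: float) -> float:
    """Approximate power when each branch enrolls this fraction of the population."""
    n_per_branch = round(POPULATION * fraction_per_branch)
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - ALPHA / 2)
    # The shift of the test statistic grows with the square root of n.
    shift = EFFECT_SIZE * sqrt(n_per_branch / 2)
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    for fraction in (0.01, 0.02, 0.09, 0.10, 0.50):
        print(f"{fraction:.0%} per branch: power ~ {power_at_fraction(fraction):.3f}")
```

Under these assumptions, the step from 1% to 2% per branch adds far more power than the step from 9% to 10%, and 10% per branch comes out only slightly below 50% per branch. Shrinking `POPULATION` or `EFFECT_SIZE` moves the point where the curve flattens, which is the exception the paragraph mentions for small targeted populations.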
> a 20% experiment (10%/branch) has only slightly higher power than a 100% experiment
Should this say slightly "lower" power?
I'm a bit confused here -- the title says Commentary on Absolute Size, but this seems to be about proportional size in percentages, and then says that there are exceptions for absolute size but doesn't go into details. Should this be titled Commentary on Relative Size? Or maybe I'm just not fully understanding the rest of the paragraph.
The goal was to give some examples of absolute experiment sizes (e.g., 1%, 10%) and talk about how the power changes as those change.
Gotcha, just a misunderstanding on terminology then -- I expected "absolute" to refer to actual numbers, not percentages. I think not worth worrying about.
Co-authored-by: Mike Williams <102263964+mikewilli@users.noreply.github.com>