BUG: Groupby.sum, DataFrame.sum and Series.sum for object type should be NA instead of 0 for all-nan values #60458

Draft · wants to merge 5 commits into main

Conversation

snitish
Contributor

@snitish snitish commented Dec 1, 2024

Contributor

github-actions bot commented Jan 3, 2025

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Jan 3, 2025
@snitish snitish closed this Jan 15, 2025
@snitish snitish deleted the groupby_sum_str_bug branch February 6, 2025 19:46
@rhshadrach
Member

rhshadrach commented Feb 15, 2025

@snitish - This looks good to me if you'd like to reopen. I'm handling the string case in #60936

@snitish
Contributor Author

snitish commented Feb 15, 2025

@rhshadrach Sure, I'd be happy to. There was some debate on the original thread on whether the default sum value for object dtype should be NA or "" (empty string). Any thoughts on this?

@snitish snitish restored the groupby_sum_str_bug branch February 15, 2025 18:13
@snitish snitish reopened this Feb 15, 2025
@snitish snitish changed the title BUG: Groupby sum for object type should be None instead of 0 for all … BUG: Groupby sum for object type should be NA instead of 0 for all-nan values Feb 15, 2025
@snitish snitish marked this pull request as draft February 15, 2025 18:37
@rhshadrach
Member

rhshadrach commented Feb 15, 2025

> There was some debate on the original thread on whether the default sum value for object dtype should be NA or "" (empty string). Any thoughts on this?

For object dtype, I'm only seeing support for NA in that thread. Am I missing any contrary opinions? The only hesitation I see is from @Dr-Irv on how this impacts users, but I do not see an easy deprecation path and I am comfortable with listing this in the breaking changes for 3.0.

When I say NA here, I mean "the NA value for the given dtype". For object dtype, the NA value is np.nan.
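To make the intended semantics concrete, here is a minimal pure-Python sketch of "sum of an all-missing object group is NA rather than 0". It is not the pandas implementation and does not call pandas; the helper names `is_missing`, `object_sum`, and `groupby_object_sum` are hypothetical, and returning NA for an empty input as well is an assumption of the sketch rather than something settled in this thread.

```python
import math


def is_missing(value):
    """Return True for None or a float NaN (illustrative missing-value check)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def object_sum(values, na_value=math.nan):
    """Add up object values, skipping missing ones.

    If no non-missing value remains, return na_value instead of the integer 0.
    Treating an empty input the same way is an assumption of this sketch.
    """
    present = [v for v in values if not is_missing(v)]
    if not present:
        return na_value
    total = present[0]
    for v in present[1:]:
        total = total + v  # for strings this is concatenation
    return total


def groupby_object_sum(keys, values):
    """Group values by key (first-seen order) and apply object_sum per group."""
    groups = {}
    for key, value in zip(keys, values):
        groups.setdefault(key, []).append(value)
    return {key: object_sum(group) for key, group in groups.items()}


# Group "a" holds strings; group "b" holds only missing values.
result = groupby_object_sum(["a", "a", "b", "b"], ["x", "y", math.nan, None])
print(result["a"])              # concatenated strings for group "a"
print(math.isnan(result["b"]))  # whether group "b" came back as NaN
```

The `na_value` parameter defaults to NaN to mirror the point above that the NA value for object dtype is np.nan; passing `""` instead would model the empty-string alternative raised earlier in the thread.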

@snitish
Contributor Author

snitish commented Feb 18, 2025

Thanks for approving, @WillAyd. This PR still doesn't address the issue for DataFrame.sum() and Series.sum(). I'm working on that.

@snitish snitish changed the title BUG: Groupby sum for object type should be NA instead of 0 for all-nan values BUG: Groupby.sum, DataFrame.sum and Series.sum for object type should be NA instead of 0 for all-nan values Feb 18, 2025

3 participants