Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix area targets file specification and add targets file documentation #253

Merged
merged 7 commits into from
Oct 25, 2024

Conversation

martinholmer
Copy link
Collaborator

Fixes #246.

@martinholmer martinholmer marked this pull request as draft October 24, 2024 20:55
@martinholmer martinholmer marked this pull request as ready for review October 24, 2024 21:31
@martinholmer martinholmer requested a review from donboyd5 October 24, 2024 21:31
Copy link
Collaborator

@donboyd5 donboyd5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see the comment I made in conversations.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martinholmer This is great, but it would be good to have a phone call - maybe next week? - to try to anticipate a few needs. Here are a few comments:

  • I wonder if the elif row.count == 1: # count units with any variable amount option is important. This will be the same for every variable in a masked group and should be the same as the count of the overall number of returns in a masked group. So maybe it is duplicative.

  • We might want a future capability that allows us to specify which Congressional session's boundaries to use, 117 or 118, given that we will have targets for both. (For my own use, I will prepare a file that has an indicator variable, session and I will filter on this variable before writing a targets file.) It might be nice to have the ability to have either kind of CD. Or maybe you just want to leave it up to the user to decide which session to put in the file?

  • For some variables, at some point we may want the ability to isolate counts and amounts for positive or negative values rather than just for nonzero values. For example, nationally, we targeted positive capital gains separately from negative capital gains (losses). This can be important. This can be handled either of two ways, neither of which we currently allow: (1) through a targets file categorical variable such as count, by having a category for positive and one for negative, or (2) by us creating programmatically custom variables before masking, such as c01000_positive and c01000_negative (where c01000_postive is c01000 * (c01000 > 0, etc.) and allowing users to include target rows for custome variables, which they'd need to know about.

  • Which raises the question of which variables are allowed. The documentation says any Tax-Calculator variable is allowed, which I like, although I think right now we only allow input variables and c00100. To prepare properly for SALT, I think eventually we'll want to allow the variables that start with "c" that are related to SALT and to itemized deductions. Also, when we did national targeting, we ran into an issue where some published aggregates were transformations created by adding several Tax-Calculator variables and to target them we needed to create the sum.

In our next phone call perhaps we could discuss which if any capabilities we need related to the above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@donboyd5, Thanks for the extensive review of PR #253.

@martinholmer martinholmer merged commit c3cbb51 into master Oct 25, 2024
2 checks passed
@martinholmer martinholmer deleted the fix-targets-spec branch October 25, 2024 16:02
donboyd5 pushed a commit to donboyd5/tax-microdata-benchmarking that referenced this pull request Nov 6, 2024
Fix area targets file specification and add targets file documentation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants