-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix area targets file specification and add targets file documentation #253
Merged
Merged
Changes from 4 commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
b68044e
Add count 2 option to area targets files
martinholmer e83496e
Add more targets file documentation to areas/targets/README.md
martinholmer 7a24355
Revise targets file documentation in areas/targets/README.md
martinholmer e29d71f
Revise targets file documentation in areas/targets/README.md
martinholmer 389be86
Update in response to code review
martinholmer 4d8d6f3
Update valid_area code in the create_area_weights.py script
martinholmer ad32feb
Format revision in tmd/areas/targets/README.md documentation
martinholmer File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,40 @@ | ||
# areas / targets | ||
|
||
Contains sub-national area targets files. | ||
|
||
|
||
# Targets File Content | ||
|
||
An areas targets file is a CSV-formatted file with its first row | ||
containing column names and its second row containing the area | ||
population target. Each subsequent row contains another target. | ||
Rows after the first two that start with a `#` character are | ||
considered comments and are skipped. | ||
|
||
Here are the column names and their valid values: | ||
|
||
- **`varname`**: any Tax-Calculator variable name. | ||
|
||
- **`count`**: integer in [0,2] range: count==0 implies dollar total | ||
of varname is tabulated, count==1 implies number of | ||
tax units with any value of varname is tabulated, | ||
count==2 implies number of tax units with non-zero | ||
values of varname is tabulated. | ||
|
||
- **`scope`**: integer in [0,2] range: scope==0 implies all tax units, | ||
scope==1 implies PUF-derived filing units, and | ||
scope==2 implies CPS-derived filing units. | ||
|
||
- **`agilo`**: float representing lower bound of the AGI range (which | ||
is included in the range) that is tabulated. | ||
|
||
- **`agihi`**: float representing upper bound of the AGI range (which | ||
is excluded from the range) that is tabulated. | ||
|
||
- **`fstatus`**: integer in [0,5] range: fstatus=0 implies all tax | ||
units are tabulated, other fstatus values imply just | ||
the tax units with the Tax-Calculator `MARS` variable | ||
equal to fstatus are included in the tabulation. | ||
|
||
- **`target`**: target amount (dollars if count==0 or number of | ||
tax units if count==1 or count==2). |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martinholmer This is great, but it would be good to have a phone call - maybe next week? - to try to anticipate a few needs. Here are a few comments:
I wonder if the
elif row.count == 1: # count units with any variable amount
option is important. This will be the same for every variable in a masked group and should be the same as the count of the overall number of returns in a masked group. So maybe it is duplicative.We might want a future capability that allows us to specify which Congressional session's boundaries to use, 117 or 118, given that we will have targets for both. (For my own use, I will prepare a file that has an indicator variable,
session
and I will filter on this variable before writing a targets file.) It might be nice to have the ability to have either kind of CD. Or maybe you just want to leave it up to the user to decide which session to put in the file?For some variables, at some point we may want the ability to isolate counts and amounts for positive or negative values rather than just for nonzero values. For example, nationally, we targeted positive capital gains separately from negative capital gains (losses). This can be important. This can be handled either of two ways, neither of which we currently allow: (1) through a targets file categorical variable such as count, by having a category for positive and one for negative, or (2) by us creating programmatically custom variables before masking, such as c01000_positive and c01000_negative (where c01000_postive is c01000 * (c01000 > 0, etc.) and allowing users to include target rows for custome variables, which they'd need to know about.
Which raises the question of which variables are allowed. The documentation says any Tax-Calculator variable is allowed, which I like, although I think right now we only allow input variables and c00100. To prepare properly for SALT, I think eventually we'll want to allow the variables that start with "c" that are related to SALT and to itemized deductions. Also, when we did national targeting, we ran into an issue where some published aggregates were transformations created by adding several Tax-Calculator variables and to target them we needed to create the sum.
In our next phone call perhaps we could discuss which if any capabilities we need related to the above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@donboyd5, Thanks for the extensive review of PR #253.