-
Notifications
You must be signed in to change notification settings - Fork 142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose configuration for handling coded values like -666666666 #75
base: master
Are you sure you want to change the base?
Conversation
raise NullValueException('Unhandled coded value: ', str(v)) | ||
else: | ||
return func(v) | ||
return null_wrapper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another (maybe simpler) way to do this would be to avoid the decorator abstraction and use a "checking" function instead. I liked the decorator since it felt to me like a semantic way of indicating that this checking logic is a high-level modification that can be applied somewhat abstractly to the other type casting functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd be a bit more comfortable with this architecture if we had more cases than just the -666666666
. That said I'm comfortable to go with this approach until we accumulate more cases, and then we can potentially refactor if appropriate.
@@ -158,6 +197,10 @@ def get(self, fields, geo, year=None, **kwargs): | |||
|
|||
@retry_on_transient_error | |||
def query(self, fields, geo, year=None, **kwargs): | |||
cast_nulls = kwargs.get('cast_nulls', True) | |||
if cast_nulls not in [True, False]: | |||
raise CensusException('cast_nulls argument must be True or False') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not totally confident in cast_nulls
as the best name for the configuration argument that adjusts this behavior. In particular, it doesn't communicate that the alternative is to raise an error. Still, it was the best I had.
census/core.py
Outdated
# rest of the row values for context, so flag the | ||
# error and continue the iteration | ||
error = True | ||
result = {header: item} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some additional context for why we need to finish this inner loop before raising the error is that the results generated by for d in data
look something like this:
{'var_name': -666666666, 'state': 42, 'county': 11}
The inner loop for header, cast, item in zip(headers, types, d)
processes each of these elements in turn, returning a formatted "row" for the final results. The problem with raising an error in the inner loop is that we'll only know one of these header: item
pairs (like var_name: -666666666
or 'state': 42
) but unless you know all of the elements it's impossible to determine which API call raised the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Jean!
cast them to null. | ||
""" | ||
# This call should return a value of -666666666 | ||
return_val = self._client.acs5.state_county_tract('B19081_001E', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want this to be the default behavior? I think the default should be the behavior of cast_null=True, but I'm persuadable.
""" | ||
Test casting -666666666 values to null. | ||
""" | ||
return_val = self._client.acs5.state_county_tract('B19081_001E', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also test that a warning is emitted that a value was cast to a null. Particularly if we decide to make this the default behavior.
raise NullValueException('Unhandled coded value: ', str(v)) | ||
else: | ||
return func(v) | ||
return null_wrapper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd be a bit more comfortable with this architecture if we had more cases than just the -666666666
. That said I'm comfortable to go with this approach until we accumulate more cases, and then we can potentially refactor if appropriate.
cast_nulls=True) | ||
self.assertEqual(return_val[0]['B19081_001E'], None) | ||
|
||
def test_bad_cast_nulls_argument(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great to have these tests!
Summary
Adjust the
Client.query
method to hande coded values, allowing the user to either:Set option 1) as the default. Implement these choices for the value
-666666666
, the only one I've encountered in the wild so far.Closes #72.
Testing instructions