-
Notifications
You must be signed in to change notification settings - Fork 864
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix all mypy typechecking errors, add a lot of type annotations #1399
Conversation
oprypin
commented
Nov 1, 2023
- Closes Full type annotations #1389
Functions that don't have any type annotations aren't checked by mypy, they are skipped for now in this commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got much less that half way through this and keep seeing the same issues reoccurring and gave up. See my comments below and please make the adjustments everywhere, not just where I've noted them.
@@ -426,7 +426,9 @@ def get_items(self, block: str) -> list[str]: | |||
if not items and self.TAG == 'ol': | |||
# Detect the integer value of first list item | |||
INTEGER_RE = re.compile(r'(\d+)') | |||
self.STARTSWITH = INTEGER_RE.match(m.group(1)).group() | |||
int_match = INTEGER_RE.match(m.group(1)) | |||
assert int_match is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you are trying to accomplish here. Does this make it possible for Markdown content to cause an error? If so, that would be unacceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With lines 429-431 reverted:
$ mypy markdown
markdown/blockprocessors.py:429: error: Item "None" of "Match[str] | None" has no attribute "group" [union-attr]
Found 1 error in 1 file (checked 33 source files)
-
The code before:
self.STARTSWITH = RE.match(string).group()
where
RE.match(string)
can beNone
, andNone.group()
is an error -
The code after:
int_match = RE.match(string) assert int_match is not None self.STARTSWITH = int_match.group()
The old code doesn't care to check for the None
case where it would violently error with AttributeError
. mypy doesn't let it slide and exposes a potential bug.
The new code directly checks for it and errors with a clearer message.
There is no change regarding which situations an error does or does not happen.
TBD: It is worth checking whether this bug case can actually happen (in current code already). In any case we should be happy that mypy flagged this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, now we know about a bug which we didn't know about before. That is good. But this is not the way to address it. Under no circumstances should source text cause Markdown to raise an error. In fact, that is one of the primary goals of the project as documented on the home page. Therefore, this is not the proper way to fix the bug. The error needs to be silenced and some reasonable text needs to be included in the output (depending on the conditions under which the issue arises).
output_file = writer(output, errors="xmlcharrefreplace") | ||
output_file.write(html) | ||
output_writer = writer(output, errors="xmlcharrefreplace") | ||
output_writer.write(html) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why this change was made at all. Note that the Contributing Guide states:
Legacy code which does not follow the guidelines should only be updated if and when other changes (bug fix, feature addition, etc.) are being made to that section of code. While new features should be given names that follow modern Python naming conventions, existing names should be preserved to avoid backward incompatible changes.
I realize that in this instance the variable name is not exposed outside of this method so there is no concern over a backward incompatible change, but the general principle remains. Please, let's refrain from making unnecessary changes just for personal preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
100% of code body changes in this pull request are in response to error messages produced by mypy. Don't need to tell me this regarding unrelated changes because I'm always the first one to say this as well.
This particular change is because mypy doesn't like that these two differently-typed values share the same variable name.
With lines 425-429 reverted:
$ mypy markdown
markdown/core.py:427: error: Incompatible types in assignment (expression has type "StreamReader", variable has type "StreamReaderWriter") [assignment]
Found 1 error in 1 file (checked 33 source files)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I embrace Python not being strongly typed. If that means making changes like this, then no thank you.
def markdownFromFile(**kwargs: Any): | ||
def markdownFromFile( | ||
*, | ||
input: str | BinaryIO | None = None, | ||
output: str | BinaryIO | None = None, | ||
encoding: str | None = None, | ||
**kwargs: Any | ||
) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't recall why the specific keyword parameters were removed some years ago, but why did you feel the need to add them back in? I'm not opposed to it if there is a good reason related to this scope of this PR, but I would like to here the reason.
However, I am more concerned about why you added position arguments (*)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function's signature doesn't change, I'm just formalizing the de-facto signature. The *
is to start named-only parameters, not positional-only parameters. They were also named-only before.
sibling = self.current_sibling | ||
prev_sibling = self.current_sibling | ||
block, the_rest = self.detab(block, self.content_indent) | ||
self.current_sibling = None | ||
self.content_indent = 0 | ||
return sibling, block, the_rest | ||
return prev_sibling, block, the_rest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, we are making unnecessary out-of-scope changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change was made in response to mypy errors. Again this is because mypy doesn't like that a variable name is reused with different types in different places. Which is really a limitation of mypy, most other type checkers would deal with this perfectly fine.
With lines 78-82 reverted:
$ mypy markdown
markdown/extensions/admonition.py:84: error: Incompatible types in assignment (expression has type "Element | None", variable has type "Element") [assignment]
markdown/extensions/admonition.py:87: error: Incompatible types in assignment (expression has type "None", variable has type "Element") [assignment]
markdown/extensions/admonition.py:101: error: Incompatible types in assignment (expression has type "Element | None", variable has type "Element") [assignment]
markdown/extensions/admonition.py:113: error: Incompatible types in assignment (expression has type "None", variable has type "Element") [assignment]
@@ -143,6 +148,7 @@ def run(self, parent, blocks): | |||
p.text = title | |||
p.set('class', self.CLASSNAME_TITLE) | |||
else: | |||
assert sibling is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this raise an error based on Markdown input? If so, this is unacceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code already has a problem where sibling
might be None and sibling.tag
will be an error. mypy correctly detects and reports it, and requires us to make an explicit assertion, which indicates that we either know that this will never actually happen, or perhaps don't care if it does but then the error will be more explicit.
TBD: It is worth checking whether this bug case can actually happen (in current code already). In any case we should be happy that mypy flagged this.
With line 151 reverted:
$ mypy markdown
markdown/extensions/admonition.py:152: error: Item "None" of "Element | None" has no attribute "tag" [union-attr]
markdown/extensions/admonition.py:152: error: Item "None" of "Element | None" has no attribute "text" [union-attr]
markdown/extensions/admonition.py:153: error: Item "None" of "Element | None" has no attribute "text" [union-attr]
markdown/extensions/admonition.py:154: error: Item "None" of "Element | None" has no attribute "text" [union-attr]
markdown/extensions/admonition.py:155: error: Argument 1 to "SubElement" has incompatible type "Element | None"; expected "Element" [arg-type]
markdown/extensions/admonition.py:158: error: Incompatible types in assignment (expression has type "Element | None", variable has type "Element") [assignment]
""" Find code blocks and store in `htmlStash`. """ | ||
blocks = root.iter('pre') | ||
for block in blocks: | ||
if len(block) == 1 and block[0].tag == 'code': | ||
local_config = self.config.copy() | ||
text = block[0].text | ||
assert text is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another assert....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another place where the code currently assumes that something is present there and violently errors if there's no match. mypy doesn't let it slide.
TBD: It is worth checking whether this bug case can actually happen (in current code already). In any case we should be happy that mypy flagged this.
With lines 275-278 reverted:
$ mypy markdown
markdown/extensions/codehilite.py:276: error: Argument 1 to "code_unescape" of "HiliteTreeprocessor" has incompatible type "str | None"; expected "str" [arg-type]
Found 1 error in 1 file (checked 33 source files)
|
||
raw_block = blocks.pop(0) | ||
m = self.RE.search(raw_block) | ||
assert m is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another assert...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another place where the code currently assumes that the match will happen. mypy doesn't let it slide. I see here that it is 100% safe to assume that the match actually happens, based on the outcome of test
. But it is OK that mypy forces us to write out this assumption explicitly, both so the writer double-checks this and that the reader doesn't need to think that maybe this is a bug.
With line 47 reverted:
$ mypy markdown
markdown/extensions/def_list.py:48: error: Item "None" of "Match[str] | None" has no attribute "start" [union-attr]
markdown/extensions/def_list.py:49: error: Item "None" of "Match[str] | None" has no attribute "end" [union-attr]
markdown/extensions/def_list.py:56: error: Item "None" of "Match[str] | None" has no attribute "group" [union-attr]
markdown/extensions/def_list.py:58: error: Item "None" of "Match[str] | None" has no attribute "group" [union-attr]
Found 4 errors in 1 file (checked 33 source files)
if not terms and sibling is None: | ||
# This is not a definition item. Most likely a paragraph that | ||
# starts with a colon at the beginning of a document or list. | ||
blocks.insert(0, raw_block) | ||
return False | ||
if not terms and sibling.tag == 'p': | ||
# The previous paragraph contains the terms | ||
state = 'looselist' | ||
terms = sibling.text.split('\n') | ||
parent.remove(sibling) | ||
# Acquire new sibling | ||
sibling = self.lastChild(parent) | ||
else: | ||
state = 'list' | ||
state = 'list' | ||
if not terms: | ||
if sibling is None: | ||
# This is not a definition item. Most likely a paragraph that | ||
# starts with a colon at the beginning of a document or list. | ||
blocks.insert(0, raw_block) | ||
return False | ||
if sibling.tag == 'p': | ||
# The previous paragraph contains the terms | ||
state = 'looselist' | ||
assert sibling.text is not None | ||
terms = sibling.text.split('\n') | ||
parent.remove(sibling) | ||
# Acquire new sibling | ||
sibling = self.lastChild(parent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to be out-of-scope as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is again in response to mypy errors, unfortunately these kind of gymnastics are necessary sometimes to make it happy.
With lines 61-75 reverted:
$ mypy markdown
markdown/extensions/def_list.py:66: error: Item "None" of "Element | None" has no attribute "tag" [union-attr]
markdown/extensions/def_list.py:69: error: Item "None" of "Element | None" has no attribute "text" [union-attr]
markdown/extensions/def_list.py:69: error: Item "None" of "str | Any | None" has no attribute "split" [union-attr]
markdown/extensions/def_list.py:70: error: Argument 1 to "remove" of "Element" has incompatible type "Element | None"; expected "Element" [arg-type]
The adjustments would basically mean reverting the changes, but almost none of them can be reverted, because then mypy errors appear. So this would just mean giving up trying to check the code with mypy. |
|
Python not being strongly typed if a feature which I embrace. I will continue to write my code that way. Anything which forces me to write strongly typed code in a language which is not strongly typed is unwelcome. I see type annotations as documentation. The code dictates the annotations, not the other way around. Of course, over time as updates and changes are made to the code, the annotations could get out-of-sync with the code. Therefore, it is reasonable to include a type checker in the CI to ensure that doesn't happen. However, the type checker should only be checking the already defined annotations. There is no need to type check an unannotated private variable which is never used outside of a single method (for example). If mypy cannot serve that purpose, then it is the wrong tool for the job. |
I am going to close this in favor of #1401. I suspect that mypy is not the right fit for this project. However, if it can be configured to fit, then this can be reopened. Or, if a different tool is proposed, that should be in a new PR. |
I'll want to sync this branch to latest if #1401 is merged, and then leave this for posterity. Not sure if I'll be able to do that if it's closed. |
Yeah mypy is pretty bad! |
Okay, I'll leave it open for now. |
I think if you are using types, using mypy is very important to keep things from getting broken. I also agree that there is nothing wrong with limiting the checks and/or doing per line ignores for particularly tricky things that can be resolved later if there is interest. |
I have taken a step back and taken a fresh look at the overall matter of annotations and type checking. I think perhaps I dove in with a few misunderstandings and miscomprehensions. Sorry about that; especially if I have led anyone down a path of doing work I would never approve. That was not my intention. So to be clear. upon reflection, the following are my general expectations.
There are a few things we will need to do to meet the above goals. Item 1 above requires no action expect to perhaps remove some proposed changes. Item 2 is mostly complete. Thanks for the hard work of everyone involved. Item 3 seems to be the problem. Originally I thought that mypy could meet our needs. After all, the page entitled Using mypy with an existing codebase in their documentation talks about how the tool can be used to progressively work through your code. However, on closer inspection, I see that the idea is to work through the code to gradually make it strongly typed. In fact, the suggestion is to do so before any annotations are added. That seems to suggest that perhaps mypy can only check annotations if the code is strongly typed. That is not a useful tool for our purpose. Unfortunately, I do not have any answers for how we can address this item. It occurs to me that it might not actually be possible to accomplish what I am proposing here. |
I did the merge, can be closed now :D |