
typeform: remove a regex call, rearrange checks #21459

Merged
JelleZijlstra merged 5 commits into python:master from JelleZijlstra:isidentifier
May 10, 2026

Conversation

@JelleZijlstra
Member

Related to #21262 (comment): I haven't benchmarked, but this is likely to be faster.


@JelleZijlstra changed the title from "typeform: remove a regex call" to "typeform: remove a regex call, rearrange checks" on May 10, 2026
Collaborator

@hauntsaninja hauntsaninja left a comment


Thanks! Good chance to see if the benchmarking tech in sterliakov/mypy-issues#297 measures something :-)

Comment thread mypy/semanal.py Outdated
elif (
    str_value == ""
    or str_value.isspace()
    or (len(str_value) == 1 and str_value in ".,/:*-=[]\\")
Collaborator


Where does the need for this change come from?

Member Author


According to Codex it accounted for a few % of all strings in mypy.

Member Author


The general theory is that it's good to use simple len/containment checks, which mypyc optimizes, so that we can avoid the slower regex call.
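To illustrate the idea, here is a minimal sketch comparing a regex pre-filter with the equivalent plain string checks. The pattern `_TRIVIAL_RE` and both function names are hypothetical and written only for this example; this is not the regex or the code that mypy actually uses.

```python
import re

# Hypothetical pattern (an assumption, not mypy's): matches strings that are
# empty, whitespace-only, or exactly one of the listed punctuation characters.
_TRIVIAL_RE = re.compile(r"\s*\Z|[.,/:*\-=\[\]\\]\Z")


def is_trivial_regex(s: str) -> bool:
    """Regex-based version of the check."""
    return _TRIVIAL_RE.match(s) is not None


def is_trivial_plain(s: str) -> bool:
    """Same check using only equality, isspace(), len() and containment."""
    return s == "" or s.isspace() or (len(s) == 1 and s in ".,/:*-=[]\\")


# A few ASCII samples on which the two versions are expected to agree.
SAMPLES = ["", " ", "\t\n", ".", ",", "[", "\\", "..", "int", "a.b", " x", "-"]
```

Whether the plain version is faster in practice would need measuring, for example with `timeit` on interpreted code or on a mypyc-compiled build.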

Member

@ilevkivskyi ilevkivskyi left a comment


Let's try this! I have just one question.

Comment thread mypy/semanal.py Outdated
elif (
    str_value == ""
    or str_value.isspace()
    or (len(str_value) == 1 and str_value in ".,/:*-=[]\\")
Member


Thinking a bit more about this: since we already check for an identifier above, are there any 1-char strings that represent a valid Python type? It seems to me we can simply skip all 1-char strings.

Member Author


Good point. I think the shortest possible non-identifier valid type expression is actually 3 characters (a.b or a|b), so we can throw out all short strings here.

Member Author


Actually two-character strings aren't safe because something like T is valid. But len(str_value) < 2 should be safe.
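As a rough sketch of the resulting ordering (identifier check first, then rejecting anything shorter than two characters), something like the following could work. The names `worth_parsing_as_type` and `one_char_counterexamples` are hypothetical and not taken from mypy; the `ast`-based check covers only printable ASCII and only asks whether a string parses as a non-constant expression, which is a weaker question than whether it is a valid type.

```python
import ast
import string


def worth_parsing_as_type(str_value: str) -> bool:
    """Hypothetical pre-filter, not mypy's actual function."""
    if str_value.isidentifier():
        return True  # e.g. "int" or "T": handled by the earlier identifier check
    if len(str_value) < 2:
        return False  # "", " ", ".", "1": too short to be a type expression
    return True  # two characters or more: leave the decision to the real parser


def parses_as_non_constant_expr(s: str) -> bool:
    """True if s parses as an expression other than a plain literal."""
    try:
        node = ast.parse(s, mode="eval").body
    except (SyntaxError, ValueError):
        return False
    return not isinstance(node, ast.Constant)


def one_char_counterexamples() -> list[str]:
    """Printable ASCII 1-char non-identifiers that still parse as a non-literal."""
    return [
        c
        for c in string.printable
        if not c.isidentifier() and parses_as_non_constant_expr(c)
    ]
```

If `one_char_counterexamples()` comes back empty, that supports the claim for ASCII input that no one-character non-identifier string needs further parsing.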


@github-actions
Contributor

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@JelleZijlstra JelleZijlstra merged commit c0cced3 into python:master May 10, 2026
25 checks passed
@JelleZijlstra JelleZijlstra deleted the isidentifier branch May 10, 2026 21:49


3 participants