This package attempts to map statements in type-inferred code back to the source code as written by the programmer. The intention is to use this in Cthulhu to present the results of inference in an easier-to-digest form. There are, of course, potential additional applications of this source-mapping, which is why it is developed as a semi-independent package. Co-authored-by: Shuhei Kadowaki <[email protected]>
Showing 8 changed files with 1,100 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
MIT License | ||
|
||
Copyright (c) 2023 Tim Holy <[email protected]> and contributors | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in all | ||
copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
SOFTWARE. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
name = "TypedSyntax" | ||
uuid = "d265eb64-f81a-44ad-a842-4247ee1503de" | ||
authors = ["Tim Holy <[email protected]> and contributors"] | ||
version = "1.0.0" | ||
|
||
[deps] | ||
CodeTracking = "da1fd8a2-8d9e-5ec2-8556-3022fb5608a2" | ||
JuliaSyntax = "70703baa-626e-46a2-a12c-08ffd08c73b4" | ||
|
||
[compat] | ||
CodeTracking = "1" | ||
JuliaSyntax = "0.3.2" | ||
julia = "1" | ||
|
||
[extras] | ||
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40" | ||
|
||
[targets] | ||
test = ["Test"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,133 @@ | ||
# TypedSyntax | ||
|
||
This package aims to map types, as determined via type-inference, back to the source code as written by the developer. It can be used to understand program behavior and identify causes of "type instability" (inference failures) without the need to read [intermediate representations](https://docs.julialang.org/en/v1/devdocs/ast/) of Julia code. | ||
|
||
This package is built on [JuliaSyntax](https://github.com/JuliaLang/JuliaSyntax.jl) and extends it by attaching type annotations to the nodes of its syntax trees. Here's a demo: | ||
|
||
```julia | ||
julia> using TypedSyntax | ||
|
||
julia> f(x, y, z) = x + y * z; | ||
|
||
julia> node = TypedSyntaxNode(f, (Float64, Int, Float32)) | ||
line:col│ tree │ type | ||
1:1 │[=] │Float64 | ||
1:1 │ [call] | ||
1:1 │ f | ||
1:3 │ x │Float64 | ||
1:6 │ y │Int64 | ||
1:9 │ z │Float32 | ||
1:13 │ [call-i] │Float64 | ||
1:14 │ x │Float64 | ||
1:16 │ + | ||
1:17 │ [call-i] │Float32 | ||
1:18 │ y │Int64 | ||
1:20 │ * | ||
1:22 │ z │Float32 | ||
``` | ||
|
||
The right-hand column is the new information added by `TypedSyntaxNode`, indicating the type assigned to each variable or function call. | |
|
||
You can also display this in a form closer to the original source code, but with type annotations: | |
|
||
```julia | ||
julia> printstyled(stdout, node; hide_type_stable=false) | ||
f(x::Float64, y::Int64, z::Float32)::Float64 = (x::Float64 + (y::Int64 * z::Float32)::Float32)::Float64 | ||
``` | ||
|
||
`hide_type_stable=true` (the default) suppresses printing of concrete types; set it to `false` to see every type. | |
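As a rough cross-check that does not depend on TypedSyntax, Base's own reflection tools draw the same distinction. The sketch below assumes that "type stable" corresponds to `isconcretetype` returning `true`; that is an illustrative assumption, not a statement about the test the package applies internally.

```julia
# Sketch using only Base; `f` is the function from the demo above.
f(x, y, z) = x + y * z

# Inferred return type when every argument type is concrete.
rt_concrete = only(Base.return_types(f, (Float64, Int, Float32)))

# Inferred return type when one argument is only known to be a `Real`.
rt_abstract = only(Base.return_types(f, (Float64, Int, Real)))

# Assumption: an annotation counts as "type stable" when its type is concrete.
println(rt_concrete, " concrete? ", isconcretetype(rt_concrete))
println(rt_abstract, " concrete? ", isconcretetype(rt_abstract))
```

Under that assumption, annotations of the first kind would be hidden by default, while annotations of the second kind would always be shown.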
|
||
The default is aimed at identifying sources of "type instability" (poor inferrability): | ||
|
||
```julia | ||
julia> printstyled(stdout, TypedSyntaxNode(f, (Float64, Int, Real))) | ||
``` | ||
|
||
which produces | ||
|
||
<code>f(x, y, z::<b>Real</b>)::<b>Any</b> = (x + (y * z::<b>Real</b>)::<b>Any</b>)::<b>Any</b></code> | ||
|
||
The boldfaced text above is typically printed in color in the REPL: | ||
|
||
- red indicates non-concrete types | ||
- yellow indicates a "small union" of concrete types. These usually pose no issues, unless there are too many combinations of such unions. | ||
|
||
Printing with color can be suppressed with the keyword argument `iswarn=false`. | ||
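The color legend above can be approximated with a few lines of Base-only code. `classify_type` is a hypothetical helper, not part of TypedSyntax, and the cutoff of three union members is an assumption chosen for illustration; the package may use a different rule.

```julia
# Hypothetical helper (not part of TypedSyntax): classify a type roughly the
# way the color legend describes.
function classify_type(@nospecialize(T))
    isconcretetype(T) && return :concrete      # no highlighting needed
    if T isa Union
        parts = Base.uniontypes(T)             # split the Union into its members
        # Assumption: "small" means at most 3 members, all of them concrete.
        if length(parts) <= 3 && all(isconcretetype, parts)
            return :small_union                # would be shown in yellow
        end
    end
    return :nonconcrete                        # would be shown in red
end

classify_type(Float64)                 # :concrete
classify_type(Union{Float64, Int64})   # :small_union
classify_type(Real)                    # :nonconcrete
classify_type(Any)                     # :nonconcrete
```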
|
||
## Caveats | ||
|
||
TypedSyntax aims for accuracy, but there are a number of factors that pose challenges. | ||
First, anonymous and internal functions appear as part of the source text, but internally Julia handles these as separate type-inferred methods, and these are hidden from the annotator. | ||
Therefore, in | ||
|
||
```julia | ||
julia> sumfirst(c) = sum(x -> first(x), c); # better to use `sum(first, c)` but this is just an illustration | ||
|
||
julia> printstyled(stdout, TypedSyntaxNode(sumfirst, (Vector{Any},))) | ||
sumfirst(c)::Any = sum(x -> first(x), c)::Any | ||
``` | ||
|
||
`x` and `first(x)` both have type `Any`, but they are not annotated as such because they are hidden inside the anonymous function. | ||
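The hidden types can still be inspected by treating the anonymous function as a callable in its own right. This is a minimal sketch using only Base; `firstof` is a stand-in name introduced here for the closure in `sumfirst`.

```julia
# Sketch using only Base: the closure is its own callable and is inferred
# separately from the function that creates it.
firstof = x -> first(x)          # stand-in for the anonymous function above

# Elements of a Vector{Any} are only known to be `Any` ...
elt = eltype(Vector{Any})

# ... so inference of the closure body has little to work with.
rt = only(Base.return_types(firstof, (elt,)))
println("element type: ", elt, ", inferred return type: ", rt)
```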
|
||
Second, this package works by attempting to "reconstruct history": starting from the type-inferred code, it tries to map calls back to the source. It would be much safer to instead keep track of the source during inference, but at present this is not possible (see [this Julia issue](https://github.com/JuliaLang/julia/issues/31162)). There are cases where this mapping fails: for example, with | ||
|
||
```julia | ||
julia> function summer(list) | ||
s = 0 | ||
for x in list | ||
s += x | ||
end | ||
return s | ||
end; | ||
``` | ||
then (on Julia 1.9) | ||
```julia | ||
julia> tsn, mappings = TypedSyntax.tsn_and_mappings(summer, (Vector{Float64},)); | ||
|
||
julia> hcat(tsn.typedsource.code, mappings) | ||
16×2 Matrix{Any}: | ||
:(_4 = 0) Union{TreeNode{SyntaxData}, TreeNode{TypedSyntaxData}}[] | ||
:(_2) Union{TreeNode{SyntaxData}, TreeNode{TypedSyntaxData}}[list] | ||
:(_3 = Base.iterate(%2)) Union{TreeNode{SyntaxData}, TreeNode{TypedSyntaxData}}[(= x list)] | ||
:(_3 === nothing) Union{TreeNode{SyntaxData}, TreeNode{TypedSyntaxData}}[] | ||
:(Base.not_int(%4)) Union{TreeNode{SyntaxData}, TreeNode{TypedSyntaxData}}[] | ||
:(goto %16 if not %5) Union{TreeNode{SyntaxData}, TreeNode{TypedSyntaxData}}[] | ||
:(_3::Tuple{Float64, Int64}) Union{TreeNode{SyntaxData}, TreeNode{TypedSyntaxData}}[] | ||
:(_5 = Core.getfield(%7, 1)) Union{TreeNode{SyntaxData}, TreeNode{TypedSyntaxData}}[] | ||
:(Core.getfield(%7, 2)) Union{TreeNode{SyntaxData}, TreeNode{TypedSyntaxData}}[] | ||
:(_4 = _4 + _5) Union{TreeNode{SyntaxData}, TreeNode{TypedSyntaxData}}[(+= s x)] | ||
:(_3 = Base.iterate(%2, %9)) Union{TreeNode{SyntaxData}, TreeNode{TypedSyntaxData}}[] | ||
:(_3 === nothing) Union{TreeNode{SyntaxData}, TreeNode{TypedSyntaxData}}[] | ||
:(Base.not_int(%12)) Union{TreeNode{SyntaxData}, TreeNode{TypedSyntaxData}}[] | ||
:(goto %16 if not %13) Union{TreeNode{SyntaxData}, TreeNode{TypedSyntaxData}}[] | ||
:(goto %7) Union{TreeNode{SyntaxData}, TreeNode{TypedSyntaxData}}[] | ||
:(return _4) Union{TreeNode{SyntaxData}, TreeNode{TypedSyntaxData}}[s] | ||
``` | ||
The left column contains the statements of the type-inferred code, the right column the mappings back to the source. | ||
You can see that the majority of these mappings are empty, indicating either no good match or that there were multiple possible matches. This is because lowering changes the implementation so significantly that there are few calls that relate directly to the source. | ||
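To see how far lowering moves the code away from the source, the lowered form can be inspected directly with Base's `code_lowered`. The sketch below is independent of TypedSyntax and only counts statements that mention `iterate`, a call that appears nowhere in the source text of `summer`.

```julia
function summer(list)
    s = 0
    for x in list
        s += x
    end
    return s
end

# Lowered (not yet type-inferred) code for the method.
ci = only(code_lowered(summer, (Vector{Float64},)))

# Count the lowered statements that mention `iterate`; the `for` loop in the
# source is rewritten in terms of it.
n_iterate = count(stmt -> occursin("iterate", string(stmt)), ci.code)
println(length(ci.code), " lowered statements, ", n_iterate, " mention `iterate`")
```

The exact number of statements depends on the Julia version, which is one reason a statement-to-source mapping has to be reconstructed rather than read off.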
|
||
Nevertheless, many statements in the source can be annotated: | ||
|
||
```julia | ||
julia> tsn | ||
line:col│ tree │ type | ||
1:1 │[function] │Union{Float64, Int64} | ||
1:10 │ [call] | ||
1:10 │ summer | ||
1:17 │ list │Vector{Float64} | ||
1:22 │ [block] | ||
2:5 │ [=] | ||
2:5 │ s | ||
2:9 │ 0 | ||
3:5 │ [for] | ||
3:8 │ [=] │Union{Nothing, Tuple{Float64, Int64}} | ||
3:9 │ x | ||
3:14 │ list │Vector{Float64} | ||
3:18 │ [block] | ||
4:9 │ [+=] │Float64 | ||
4:9 │ s │Float64 | ||
4:14 │ x │Float64 | ||
6:5 │ [return] │Union{Float64, Int64} | ||
6:12 │ s │Union{Float64, Int64} | ||
``` | ||
This is largely because the named variables alone provide considerable information. |
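The `Union{Float64, Int64}` annotations on `s` and on the return value follow from `s = 0` starting out as an `Int`. The sketch below reproduces that with Base's `return_types` and compares it with `summer_stable`, a hypothetical variant (not from the package) that initializes `s` with a zero of the element type.

```julia
function summer(list)
    s = 0                      # an Int, regardless of the element type
    for x in list
        s += x
    end
    return s
end

# Hypothetical variant: start from a zero of the element type instead.
function summer_stable(list)
    s = zero(eltype(list))
    for x in list
        s += x
    end
    return s
end

rt_original = only(Base.return_types(summer, (Vector{Float64},)))
rt_variant  = only(Base.return_types(summer_stable, (Vector{Float64},)))
println("summer:        ", rt_original)
println("summer_stable: ", rt_variant)
```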
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
module TypedSyntax | ||
|
||
using Core: CodeInfo, MethodInstance, SlotNumber, SSAValue | ||
using Core.Compiler: TypedSlot | ||
using JuliaSyntax: JuliaSyntax, TreeNode, AbstractSyntaxData, SyntaxData, SyntaxNode, GreenNode, AbstractSyntaxNode, SyntaxHead, SourceFile, | ||
head, kind, child, children, haschildren, untokenize, first_byte, last_byte, source_line, source_location, | ||
sourcetext, @K_str, @KSet_str, is_infix_op_call, is_prefix_op_call, is_prec_assignment, is_operator, is_literal | ||
using Base.Meta: isexpr | ||
using CodeTracking | ||
|
||
export TypedSyntaxNode | ||
|
||
include("node.jl") | ||
include("show.jl") | ||
|
||
end |
ed51dac
@JuliaRegistrator register subdir=TypedSyntax
Registration pull request created: JuliaRegistries/General/78815
After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.
This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via: