Add @autostruct macro #22

Merged
merged 13 commits into from
Oct 29, 2024

mcabbott
Member

@mcabbott mcabbott commented Oct 23, 2024

This is an alternative to @compact for easily defining layers. Instead of rolling everything into one, @autostruct function MyModel takes the constructor function and magically defines the corresponding struct. You must still define the forward pass (m::MyModel)(x) = ... as usual.

The struct has an internal name. Iff you change the line MyModel(dense1, dense2), it will define a new struct to match the new definition. The binding MyModel = var"MyModel#13" is not const, but this only affects construction: calling the model is still type-stable. (The struct always has type parameters.)

julia> using Fluxperimental, Flux

julia> @autostruct function MyModel(d::Int)
          dense1, dense2 = [Dense(d=>d, tanh) for _ in 1:2]    # arbitrary code here, not just keyword-like
          dense2.bias[:] .= 1/d
          return MyModel(dense1, dense2)  # demand this be very simple, no = signs allowed (return optional)
       end
var"MyModel#13"

julia> function (m::MyModel)(x)  # forward pass looks just like a normal struct
         y = m.dense1(x)
         z = m.dense2(y)
         (x .+ y .+ z)./3
       end

julia> my = Chain(MyModel(2), Dense(2=>2))  # at present, always compact printing, from @layer MyModel
Chain(
  MyModel(...),                         # 12 parameters
  Dense(2 => 2),                        # 6 parameters
)                   # Total: 6 arrays, 18 parameters, 384 bytes.

julia> @inferred my(randn32(2,3))
2×3 Matrix{Float32}:
 -0.339922   -1.27762    -0.645441
  0.0119689   0.0943767   0.0357405

julia> Flux.trainable(m::MyModel) = (; m.dense1)  # no macro magic for this

julia> MyModel(2)
MyModel(...)        # 6 parameters, plus 6 non-trainable
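
For readers who want to see the moving parts, here is a rough hand-written stand-in for what the description above says the macro defines: a parametric struct under an internal name, a non-const binding to it, and the constructor attached to the internal type. It is a sketch only, not the actual expansion. The name var"MyModel#1" and the plain matrices standing in for Dense layers are assumptions, chosen so that it runs without Flux; the real macro also applies @layer.

```julia
# Hand-written stand-in, NOT the macro's real output.
# Assumptions: the internal name var"MyModel#1", and plain matrices
# in place of Flux's Dense layers.
struct var"MyModel#1"{T1, T2}
    dense1::T1
    dense2::T2
end

# Non-const binding from the friendly name to the internal struct.
MyModel = var"MyModel#1"

# The user's constructor becomes a method of the internal type.
function var"MyModel#1"(d::Int)
    dense1 = fill(1.0f0, d, d)
    dense2 = fill(2.0f0, d, d)
    return MyModel(dense1, dense2)   # resolved through the binding above
end

# Forward pass, written against the friendly name as in the PR example.
function (m::MyModel)(x)
    y = tanh.(m.dense1 * x)
    z = tanh.(m.dense2 * y)
    return (x .+ y .+ z) ./ 3
end

m = MyModel(2)
out = m(ones(Float32, 2, 3))
```

Because the struct is fully parameterised, the call m(x) dispatches on a concrete type even though the name MyModel is an ordinary global.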

@MilesCranmer
Contributor

Nice!

@CarloLucibello
Member

Niiiice! Comments:

  1. Could it become @auto struct MyModel(d::Int)? Eventually, there could also be a @auto mutable struct MyModel(d::Int)
  2. Can it be decomposed into 2 macros, say @autoflux and @auto? @auto does everything not flux-related, and should be a standalone package which every julian would find helpful. @autoflux will just call @auto and then add the @layer magic.
  3. We should have a prototyping version and one for when the fields are settled and we don't care anymore about compatibility with Revise, so that the struct name can be MyModel for real at some point.

@mcabbott
Member Author

mcabbott commented Oct 24, 2024

  1. Could it become @auto struct MyModel(d::Int)?

My reservation is that the code inside is function-body code, not code you would normally find under struct. In fact, I thought things like this might not parse... but they do:

julia> :(@mac struct Something
            a, b = eachcol(rand(3,3))
            for i in 1:3
              a[i] = i^2
            end
            return a+b
         end)
:(#= REPL[165]:1 =# @mac struct Something
          #= REPL[165]:2 =#
          (a, b) = eachcol(rand(3, 3))
          #= REPL[165]:3 =#
          for i = 1:3
              #= REPL[165]:4 =#
              a[i] = i ^ 2
              #= REPL[165]:5 =#
          end
          #= REPL[165]:6 =#
          return a + b
      end)

julia> struct Something
                   a, b = eachcol(rand(3,3))
                   for i in 1:3
                     a[i] = i^2
                   end
                   return a+b
                end
3-element Vector{Float64}:
 1.3967202370899579
 4.078791996072846
 9.943963169620071

julia> methods(Something)
# 0 methods for type constructor

Nevertheless this seems a bit strange to me. I like that if you delete the macro, and instead supply the struct definition yourself, then the function is a valid constructor. It's not a particularly magical macro.

(Whether the name @autostruct is right I don't know... not attached to it.)

  2. Can it be decomposed into 2 macros

It would certainly be easy to provide two macros -- say sharing the same inner function _autostruct just with some keyword difference.

IDK if this is the right package for a non-Flux one though. Do you have non-Flux use cases in mind?

Literally composing macros tends to work badly. But one can pretend by having in e.g. @auto @layer function MyLayer the first macro @auto look for the other macro (which has not yet been expanded, need not even be defined) as a symbol to do something different.
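
The "look for the other macro as a symbol" trick can be sketched as follows. This is only an illustration of the mechanism, not code from this PR: @auto here is a toy macro, @layer is deliberately left undefined, and the returned tuple is a placeholder for the definitions a real macro would emit.

```julia
# Sketch only: `@auto` is a toy stand-in, and `@layer` is never defined.
macro auto(ex)
    if Meta.isexpr(ex, :macrocall) && ex.args[1] === Symbol("@layer")
        # The inner macro call arrives unexpanded: its args are the macro
        # name, a LineNumberNode, then the macro's own arguments.
        def = ex.args[end]
        variant = :layer
    else
        def = ex
        variant = :plain
    end
    Meta.isexpr(def, :function) || error("expected a function definition")
    name = def.args[1].args[1]   # `MyLayer` in `function MyLayer(d) ... end`
    # A real macro would emit struct and method definitions here; this one
    # only reports which branch it took and for which name.
    return QuoteNode((variant, name))
end

with_layer = @auto @layer function MyLayer(d) d end
without    = @auto function Plain(d) d end
```

Since the outer macro receives the inner call as an unexpanded expression and does not return it, the inner macro is never looked up.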

  3. ... and one for when the fields are settled

If this ends up being useful, then I agree that might be a tidier final situation. I wonder if it ever matters besides being tidy -- will there be situations when performance differs, etc?

Instead of a variant of the macro, the other way to make things permanent is to delete the macro, and add an explicit @concrete struct MyModel; dense1; dense2; end. The constructor function and the forward pass will not need to change at all.

@CarloLucibello
Member

CarloLucibello commented Oct 24, 2024

thought things like this might not parse...

wow, weird stuff, that should be avoided. The @auto struct MyModel(d::Int) scenario would require the first argument of the expression to be a :call.

I like the fact that @auto struct MyModel(d::Int) says concisely "I'm constructing a struct and its constructor at the same time" without looking too weird, and that it can be readily extended to mutable struct.

I like that if you delete the macro, and instead supply the struct definition yourself, then the function is a valid constructor. It's not a particularly magical macro.

That's a good point. Wouldn't be hard to replace @auto struct with function but I'm ok with current proposal as well.

@CarloLucibello
Member

CarloLucibello commented Oct 24, 2024

IDK if this is the right package for a non-Flux one though. Do you have non-Flux use cases in mind?

I was just looking forward to having a general-purpose package hosted by FluxML but also anywhere else.
It would be revise-friendly as https://github.com/BeastyBlacksmith/ProtoStructs.jl but also more powerful right?
The use case is anyone developing a type-heavy package.

Also minimalistic versions like

@autostruct A(x, y)
@autostruct B(x::Int, y = 2)

would be useful.

This makes me think that we could allow the syntax return MyModel(dense1::Dense, dense2::Dense) for type-constraining. This would be a second order thing left for the future.

@mcabbott
Member Author

could allow the syntax return MyModel(dense1::Dense, dense2::Dense)

Could do.

My examples all have only one constructor, and one reason to add constraints is to check other constructors. Doing the obvious thing does not work:

julia> MyModel(d1::Int, d2::Int) = MyModel(Dense(d1 => d2), Dense(d2 => d1))  # does not work
ERROR: cannot define function MyModel; it already has a value

julia> MyModel
var"MyModel#13"

julia> var"MyModel#13"(d1::Int, d2::Int) = MyModel(Dense(d1 => d2), Dense(d2 => d1))  # does work!
var"MyModel#13"

julia> MyModel(2, 3)
MyModel(...)        # 9 parameters, plus 8 non-trainable

If you have multiple constructors all using the macro... same struct if they agree exactly on field names, but if not, you'll get different structs. I think?

but also more powerful right? Use cases is anyone developing a type-heavy package.

I guess we can keep one eye on general-purpose usefulness. What I think this can't do is be type-stable through construction -- which ProtoStructs.jl manages:

julia> @proto struct Tmp
         x::Int
         y::Any
       end

julia> @code_warntype (z -> Tmp(1,z).y)([3 4.])
MethodInstance for (::var"#21#22")(::Matrix{Float64})
  from (::var"#21#22")(z) @ Main REPL[13]:1
Arguments
  #self#::Core.Const(var"#21#22"())
  z::Matrix{Float64}
Body::Matrix{Float64}

@MilesCranmer
Contributor

MilesCranmer commented Oct 24, 2024

I think making prototyping possible would be important for this. Similarly to proto structs, maybe just have the macro turn it into a named tuple wrapper? That way, every single struct would be

struct {name}{T<:NamedTuple}
    __x::T
end
Base.getproperty(…)= # access fields of __x

and you would simply update the tuple passed.

Then you can re-run it again with different numbers of arguments, and it won’t matter
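
A minimal runnable version of that idea might look like the following. ProtoLayer is a hypothetical name standing in for {name}, and the keyword constructor is an assumption about how the tuple would be supplied; this is not what the PR implements.

```julia
# Sketch of the NamedTuple-wrapper idea; `ProtoLayer` is a hypothetical name.
struct ProtoLayer{T<:NamedTuple}
    __x::T
end

# Keyword constructor: whatever fields are passed become the NamedTuple.
ProtoLayer(; kw...) = ProtoLayer((; kw...))

# Forward property access to the wrapped NamedTuple.
Base.getproperty(p::ProtoLayer, s::Symbol) = getproperty(getfield(p, :__x), s)
Base.propertynames(p::ProtoLayer) = propertynames(getfield(p, :__x))

# Two different "field sets", with no struct redefinition in between.
a = ProtoLayer(weight = [1.0 2.0; 3.0 4.0])
b = ProtoLayer(weight = [1.0 2.0; 3.0 4.0], bias = [0.5, 0.5])
```

Both a and b are instances of the same ProtoLayer wrapper; only the NamedTuple type parameter differs.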

@CarloLucibello
Member

CarloLucibello commented Oct 24, 2024

There were some problems with Zygote + Functors with that approach:
FluxML/Functors.jl#46

@CarloLucibello
Member

I think making prototyping possible would be important for this.

I don't know if that was clear, but this PR's approach is fine for prototyping

@CarloLucibello
Member

name could be @autolayer. Or just @layer?

@CarloLucibello
Member

speculative: we can also add a macro @save_args to be used inside the constructor that adds two namedtuple fields, _args and _kwargs, containing the positional and keyword arguments. This allows easy storing of the hyperparameters and nicer show.

@mcabbott
Member Author

mcabbott commented Oct 24, 2024

Re printing... there's now an option to do @layer :expand MyModel, which produces this -- not yet perfect:

julia> @autostruct :expand function MyModel(d::Int)
...

julia> my = Chain(MyModel(2), Dense(2=>2))
Chain(
  ##MyModel#270(
    Dense(2 => 2, tanh),                # 6 parameters
    Dense(2 => 2, tanh),                # 6 parameters
  ),
  Dense(2 => 2),                        # 6 parameters
)                   # Total: 6 arrays, 18 parameters, 384 bytes.

IDK if that should be here, or if we should just make it so that you can call both macros (currently this doesn't work):

@autostruct function MyModel(d::Int)
...
@layer :expand MyModel

constructor that adds two namedtuple fields, _args and _kwargs containing positional and keywords arguments.

The macro can see these arguments already (it just doesn't look). It could store their values as a special field of the generated struct, to print MyModel(2) in this case.

However, I'm a little scared of edge cases this will invite. Someone will do @autostruct function Model((in, out)::Pair; bias::Union{Nothing,Vector}=zeros(100)) and perhaps worse things? Maybe it can be smart enough to give up on anything worse than a tuple of integers? Still it adds complexity... and maybe for now we should keep it simple.
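
To make the simple case concrete, here is a hand-written sketch of storing the constructor arguments in a special field and using them for printing. Nothing here is generated by the macro: Recorded is a hypothetical type, the _args field name is borrowed from the suggestion above, and only a tuple of integers is handled.

```julia
# Hypothetical layer that remembers its constructor arguments in `_args`.
struct Recorded{T, A<:Tuple}
    weight::T
    _args::A
end

function Recorded(d::Int)
    weight = fill(1 / d, d, d)
    return Recorded(weight, (d,))   # keep the arguments alongside the data
end

# Print the call that built the object instead of its contents.
Base.show(io::IO, r::Recorded) = print(io, "Recorded(", join(r._args, ", "), ")")

printed = repr(Recorded(2))   # "Recorded(2)"
```

The edge cases mentioned above (destructured arguments, keyword defaults that are arrays) are exactly what this sketch does not attempt to cover.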

@mcabbott
Member Author

mcabbott commented Oct 28, 2024

Here's a failure case of the current code:

julia> @autostruct :expand function New2(a, b)
           A = Dense(a => b)
           B = Dense(b => a)
           return New2(A, B)
       end
var"##New2#262"

julia> New2(1, 2)
ERROR: MethodError: no method matching Dense(::Pair{Dense{typeof(identity), Matrix{…}, Vector{…}}, Dense{typeof(identity), Matrix{…}, Vector{…}}})
Stacktrace:
 [1] var"##New2#262"(a::Dense{typeof(identity), Matrix{…}, Vector{…}}, b::Dense{typeof(identity), Matrix{…}, Vector{…}})
   @ Main ./REPL[4]:2
 [2] var"##New2#262"(a::Int64, b::Int64)
   @ Main ./REPL[4]:4

julia> methods(New2)  # only the first seems to be called
# 2 methods for type constructor:
 [1] var"##New2#262"(a, b)
     @ REPL[4]:1
 [2] var"##New2#262"(A::var"T#1", B::var"T#2") where {var"T#1", var"T#2"}
     @ ~/.julia/dev/Fluxperimental/src/autostruct.jl:110

julia> @autostruct :expand function New3(a::Int, b::Int)
           A = Dense(a => b)
           B = Dense(b => a)
           return New3(A, B)
       end
var"##New3#263"

julia> New3(4, 5)  # with a::Int, there is no ambiguity
New3(
  Dense(4 => 5),                        # 25 parameters
  Dense(5 => 4),                        # 24 parameters
)                   # Total: 4 arrays, 49 parameters, 404 bytes.
  • The constructor written by the macro could call New2{Dense, Dense}(A, B). But then Functors will also have to somehow call that, not New2(A, B). And the printed form won't parse.

  • Or the macro could demand some type restriction, in cases where the number of arguments match the number of fields? But that doesn't guarantee no conflict.

Or, as done in 12270a0, it can add enough extra fields which store nothing to remove ambiguity:

julia> @autostruct :expand function New2(a, b)
           A = Dense(a => b)
           B = Dense(b => a)
           return New2(A, B)
       end
var"##New2#276"

julia> New2(1, 2)
New2(
  Dense(1 => 2),                        # 4 parameters
  Dense(2 => 1),                        # 3 parameters
  nothing,
)                   # Total: 4 arrays, 7 parameters, 236 bytes.

julia> methods(New2)
# 2 methods for type constructor:
 [1] var"##New2#276"(a, b)
     @ REPL[54]:1
 [2] var"##New2#276"(A::var"T#1", B::var"T#2", _nothing_1::Nothing) where {var"T#1", var"T#2"}
     @ ~/.julia/dev/Fluxperimental/src/autostruct.jl:125
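
The same arrangement can be written out by hand, which may make the arity argument easier to see. This is a sketch under assumptions: Padded is a hypothetical name, and zero-filled matrices stand in for the Dense layers so that it runs without Flux.

```julia
# Hand-written analogue of the extra-field fix; `Padded` is a hypothetical name.
struct Padded{T1, T2}
    A::T1
    B::T2
    _nothing_1::Nothing   # stores nothing; only changes the constructor's arity
end

# User-facing constructor with two untyped arguments. The default
# constructor now takes three arguments, so the two cannot be confused.
function Padded(a, b)
    A = fill(0.0, b, a)
    B = fill(0.0, a, b)
    return Padded(A, B, nothing)
end

p = Padded(1, 2)
```

The two-argument call always reaches the user's method, and the three-argument call inside it always reaches the default constructor.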

@@ -12,7 +12,7 @@ Zygote = "e88e6eb3-aa80-5325-afca-941959d7151f"

[compat]
Compat = "4"
-Flux = "0.13.7, 0.14"
+Flux = "0.14.23"
Member

Suggested change
-Flux = "0.14.23"
+Flux = "0.14"

Member Author

As I guess you saw, it extends some printing methods which are only recently added.

Member

right, 0.14.23 was correct.

Since Flux has julia v1.10 lower bound, we should do the same here.

Member Author

Thanks for tagging, maybe it will work now...

@CarloLucibello
Member

I would be conservative and require the user to specify some argument's type in the constructor when it takes the same number of arguments as the struct has fields. This is what you would do when not using / removing @autostruct. No strong feelings against the nothing option though.

@mcabbott
Member Author

It's true that we could leave it up to you. But perhaps it's more mysterious when things fail and there is no visible struct causing the ambiguity?

Perhaps nothing may cause issues if you replace this with a real struct & then try to load a saved Flux.state.

@mcabbott mcabbott merged commit 197d494 into master Oct 29, 2024
3 checks passed
@mcabbott mcabbott deleted the autostruct branch October 29, 2024 19:35
3 participants