Is ContinuousTransformerWrapper needed? #284

Open
mpeven opened this issue Oct 28, 2024 · 2 comments

mpeven commented Oct 28, 2024

Maybe I'm missing something, but I'm wondering whether ContinuousTransformerWrapper is essentially just a special case of TransformerWrapper, like this:

import torch
from x_transformers import TransformerWrapper

class ContinuousTransformerWrapper(TransformerWrapper):
    def __init__(self, in_dim, emb_dim, out_dim, **kwargs):
        super().__init__(
            num_tokens=0,
            # project continuous inputs into the model dimension instead of
            # looking up a discrete token embedding
            token_emb=torch.nn.Identity() if in_dim == emb_dim else torch.nn.Linear(in_dim, emb_dim, bias=False),
            logits_dim=out_dim,
            **kwargs
        )
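
For reference, here is how the proposed subclass might be used on continuous inputs. This is a hypothetical sketch, not taken from the library docs: it assumes TransformerWrapper accepts an arbitrary module for token_emb and that the remaining kwargs (max_seq_len, attn_layers) pass through unchanged.

import torch
from x_transformers import Decoder

# hypothetical usage of the subclass above
model = ContinuousTransformerWrapper(
    in_dim=32,
    emb_dim=512,
    out_dim=100,
    max_seq_len=1024,
    attn_layers=Decoder(dim=512, depth=6, heads=8),
)

x = torch.randn(1, 1024, 32)   # continuous features instead of token ids
out = model(x)                 # (1, 1024, 100)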

lucidrains commented Oct 29, 2024

@mpeven yes indeed, with an MSE loss in its autoregressive wrapper

it started off pretty simple, but then people wanted more and more features from the discrete TransformerWrapper

maybe a merging is overdue
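
For context, a rough sketch of that idea (purely illustrative, not the library's actual API): where a discrete autoregressive wrapper computes a cross-entropy over token logits, the continuous version shifts the sequence by one step and regresses the next vector with MSE.

import torch
import torch.nn.functional as F
from torch import nn

class ContinuousAutoregressiveSketch(nn.Module):
    # illustrative only: wraps a continuous transformer and trains it to
    # predict the next continuous vector with an MSE objective
    def __init__(self, net):
        super().__init__()
        self.net = net  # e.g. the ContinuousTransformerWrapper above

    def forward(self, x):
        # x: (batch, seq, dim) continuous features
        inp, target = x[:, :-1], x[:, 1:]
        pred = self.net(inp)          # (batch, seq - 1, dim)
        return F.mse_loss(pred, target)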

lucidrains commented

@mpeven in the beginning, i was also expecting a lot more diverging research, but it hasn't panned out that way

if anything i think the new practice may be to add denoising diffusion heads (which warrants a completely new wrapper)
