how long convolution ensures causal language modeling #47
Comments
I noticed in another response that you mentioned zero padding was applied to the kernel. Could you point me to where this step is performed in the code? Looking forward to your reply.
Hello, I noticed that long_conv_kernel.py has self.L = L*2 if not causal else L, so should we set L = L for the causal case? This seems inconsistent with the explanations elsewhere. I'm confused; could you kindly explain how causality works for Longconv?
That looks like a bug. That code is only used for LRA, so it might affect some of those results. I don't believe it's used anywhere else.
So should we set L = 2 * L for the causal case? Could you kindly explain how that works for causal modeling? It seems no explicit padding is applied in the code. Thank you.
Actually I remembered that this is not how this code works. self.L = L
creates a kernel of length L that gets padded implicitly up to 2L later on.
self.L = 2L creates a kernel of length 2L. The FFT is still of length 2L,
so this is actually not a bidirectional kernel.
See this blog post: the 2L version is equivalent to the “wrap it around” case in
the blog; it computes a circular convolution.
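To make the difference concrete, here is a minimal NumPy sketch of the two cases described above. It is not the repository's code: fft_conv, k_short, and k_long are hypothetical names, and the kernels are random values chosen only for illustration. It assumes the convolution is computed with an FFT of length 2L and that only the first L outputs are kept.

```python
import numpy as np

def fft_conv(u, k, n):
    """Convolve u with k using an FFT of length n; keep the first len(u) outputs."""
    U = np.fft.rfft(u, n=n)  # rfft zero-pads its input up to length n
    K = np.fft.rfft(k, n=n)
    return np.fft.irfft(U * K, n=n)[: len(u)]

L = 8
rng = np.random.default_rng(0)
u = rng.standard_normal(L)
k_short = rng.standard_normal(L)      # kernel of length L, implicitly padded to 2L
k_long = rng.standard_normal(2 * L)   # kernel of length 2L, nothing left to pad

# Case 1: length-L kernel, FFT of length 2L.
# The zero padding leaves no room for wrap-around, so the result equals
# the direct causal convolution y[t] = sum_{s <= t} k[s] * u[t - s].
y_short = fft_conv(u, k_short, n=2 * L)
causal_matches = np.allclose(y_short, np.convolve(u, k_short)[:L])

# Causality probe: change only the last input and see whether earlier outputs move.
u_perturbed = u.copy()
u_perturbed[-1] += 1.0

short_is_causal = np.allclose(
    y_short[:-1], fft_conv(u_perturbed, k_short, n=2 * L)[:-1]
)

# Case 2: length-2L kernel, FFT still of length 2L.
# This is a circular convolution: the second half of the kernel wraps around
# and multiplies inputs at later positions.
y_long = fft_conv(u, k_long, n=2 * L)
long_is_causal = np.allclose(
    y_long[:-1], fft_conv(u_perturbed, k_long, n=2 * L)[:-1]
)

print(causal_matches, short_is_causal, long_is_causal)
```

In this sketch the length-L kernel needs no explicit padding step, because requesting an FFT longer than the input pads it automatically. That would explain why no explicit padding call shows up when searching the code.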
Hello, I would like to know how long convolution ensures causal language modeling. It seems that I couldn't find any explicit padding applied in the code.