The issue
I am trying to understand how gradients are computed for Burgers, implemented by FDM_Burgers() in train_utils/losses.py.
It is clear that you are using finite difference (FD) to compute $u_t$, ut = (u[:, 2:, :] - u[:, :-2, :]) / (2 * dt). For $u_x$, you are computing it in Fourier space as described in the paper. However, it seems to me that what you are doing here (i.e., only one round of FFT, wavenumber multiplication, and IFFT) is insufficient; for example, the pointwise activation functions are not included at all. I do not understand why $u_x$ and $u_{xx}$ could be computed in such a simple way.
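To be concrete, here is a minimal, self-contained sketch of the operation I am referring to: differentiating a sampled periodic field by one FFT, a multiplication by the wavenumbers, and one IFFT. This is not the repository's code; the field, grid size, and domain length below are my own toy choices, purely to illustrate the operation.

import torch

# Toy example (not the repo code): spectral derivative of a periodic field
# sampled on nx points over a domain of length L.
nx, L = 128, 1.0
x = torch.arange(nx) * (L / nx)
u = torch.sin(2 * torch.pi * x)                      # sample field u(x)

u_h = torch.fft.fft(u)                               # one FFT
k = 2 * torch.pi * torch.fft.fftfreq(nx, d=L / nx)   # angular wavenumbers
ux = torch.fft.ifft(1j * k * u_h).real               # multiply by ik, one IFFT
uxx = torch.fft.ifft(-(k ** 2) * u_h).real           # multiply by -k^2 for u_xx

# For u = sin(2*pi*x) on [0, 1), ux should match 2*pi*cos(2*pi*x) closely.
print(torch.allclose(ux, 2 * torch.pi * torch.cos(2 * torch.pi * x), atol=1e-3))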
Benchmark with autograd
To understand this question, I checked the results against autograd. This is what I did:
1. extend the returns of FDM_Burgers() and PINO_loss() in train_utils/losses.py to expose the gradient outputs;
2. compare the FD and autograd results in the training method train_2d_burger() in train_utils/train_2d.py;
3. apply a minor fix in train_burgers.py to run the debug on CPU (a hypothetical sketch of this change follows right after this list).
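The step-3 change is only about device selection, so it can be summarized in one line; the snippet below is a hypothetical version of it (the exact edit is in the attached train_burgers.py), falling back to CPU when CUDA is unavailable:

import torch

# Hypothetical illustration of the step-3 change (see the attached
# train_burgers.py for the actual edit): pick CPU when no GPU is available,
# so the debug prints in step 2 can run on a laptop.
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')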
For quick reference, this is my code for the FD vs autograd comparison in step 2:
for x, y in train_loader:
    # make x require grad
    x.requires_grad = True
    x, y = x.to(rank), y.to(rank)
    out = model(x).reshape(y.shape)
    data_loss = myloss(out, y)

    ##################### BENCHMARK ut, ux #####################
    # results from FDM
    loss_u, loss_f, Du, ut, ux, uxx = PINO_loss(out, x[:, 0, :, 0], v)
    from torch.autograd import grad
    g_AD = grad(out.sum(), x, create_graph=True)[0]
    # from datasets.py
    # Xs = torch.stack([Xs, gridx.repeat([n_sample, self.T, 1]),
    #                   gridt.repeat([n_sample, 1, self.s])], dim=3)
    ux_AD = g_AD[:, :, :, 1]  # x coordinates -> second dim
    ut_AD = g_AD[:, :, :, 2]  # t coordinates -> third dim
    print('Difference for ut')
    print(ut_AD[0, 1:-1] - ut[0])
    print('\n\nDifference for ux')
    print(ux_AD[0] - ux[0])
    assert False, 'Stop for debug'
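On top of printing raw elementwise differences, a single relative-error number per derivative makes the gap easier to quantify. The helper below is my own addition (rel_l2 is a hypothetical name, not from the repo) and reuses the ut, ux, ut_AD, ux_AD tensors from the loop above:

# Optional summary (my own helper, not from the repo): relative L2 error between
# the FDM/spectral derivatives and the autograd ones. ut uses interior points
# only, because the central difference drops the first and last time steps.
def rel_l2(a, b):
    return (torch.norm(a - b) / torch.norm(b)).item()

print('relative L2 error, ut:', rel_l2(ut[0], ut_AD[0, 1:-1]))
print('relative L2 error, ux:', rel_l2(ux[0], ux_AD[0]))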
If you replace the original source files with the attached three files and run the training script, you should be able to get outputs similar to the following:
As we can see, the differences between FD and autograd for $u_t$ are quite small, as expected, which also implies that I am using autograd correctly in train_2d_burger(). However, the differences for $u_x$ are exceedingly large, which seems to support my doubt that FDM_Burgers() is insufficient for $u_x$.