When to use slimconv? #2

Open

Senwang98 opened this issue Jan 9, 2021 · 8 comments

Senwang98 commented Jan 9, 2021

Hi,
Thanks for your work. If I use slimconv in my model, do I need to replace every conv with slimconv?
In other words, when should slimconv replace a regular conv?

Owner

JiaxiongQ commented Jan 9, 2021

Thanks for your interest. You can use our slimconv to replace a normal 3x3 conv, but the layer that follows should preferably be a 1x1 conv that raises the feature dimension again.

Author

Senwang98 commented Jan 15, 2021

Hi, @JiaxiongQ
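Does slim_conv have to be followed immediately by a 1x1 conv? For example, my feature map has 64 channels. Could I take the 48-channel feature map that comes out of slim_conv, do some further processing on it, and only then apply a 1x1 conv to restore 64 channels? The only goal is to reduce parameters.

If a 1x1 conv is applied directly after slim_conv, there seems to be no advantage in computation or parameter count?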

Owner

JiaxiongQ commented Jan 17, 2021 via email

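That works too. Anything is fine as long as the channel count after slimconv's channel reduction matches what you have configured for the following layers.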
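On the parameter question above, a rough count may help. The sketch below assumes the `myconv_3x3R` definition that the owner posts later in this thread, and it assumes a BatchNorm after the plain 3x3 conv and after the restoring 1x1 conv so the comparison is like for like (that BatchNorm placement is an assumption made here, not something stated in the thread). The helper names are hypothetical.

```python
def conv_params(c_in, c_out, k, bias=False):
    """Weights (and optional bias) of a k x k convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def bn_params(c):
    """Learnable scale and shift of a BatchNorm2d layer."""
    return 2 * c


def slimconv_params(c):
    """Learnable parameters of myconv_3x3R(c) as posted in this thread."""
    # Channel-attention branch: 1x1 conv, BN, 1x1 conv (the second conv has a bias).
    fc = (conv_params(c, c // 8, 1) + bn_params(c // 8)
          + conv_params(c // 8, c, 1, bias=True))
    # Upper path: 3x3 conv on C/2 channels, then BN.
    upper = conv_params(c // 2, c // 2, 3) + bn_params(c // 2)
    # Lower path: 1x1 conv C/2 -> C/4, BN, 3x3 conv on C/4 channels, BN.
    lower = (conv_params(c // 2, c // 4, 1) + bn_params(c // 4)
             + conv_params(c // 4, c // 4, 3) + bn_params(c // 4))
    return fc + upper + lower


c = 64
plain = conv_params(c, c, 3) + bn_params(c)            # regular 3x3 conv + BN
slim = slimconv_params(c)                              # outputs 3C/4 channels
restore = conv_params(3 * c // 4, c, 1) + bn_params(c)  # 1x1 conv back to C + BN
print(plain, slim, slim + restore)
```

Under these assumptions, with C = 64 the plain 3x3 conv has 36,992 parameters while slimconv followed directly by a 1x1 conv has 16,464, so the pair is still smaller than the layer it replaces.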


zsureuk commented Feb 12, 2021 (edited)

Hi @JiaxiongQ
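Hello, and first of all thank you for proposing such an excellent convolution structure.
I tried to reproduce your ResNet-20 results, so I modified the SC-ResNet.py file you shared to make it run on the CIFAR dataset. I restored the final 3/4C channels directly to the full C channels (that is, I expanded the lower path from 1/4C to 1/2C). However, the ResNet-20 accuracy I obtained (90.55%) is well below the accuracy of the base model (92.04%).
My training settings were: SGD with weight decay = 5e-4, batch size = 128, initial learning rate = 0.1, multiplied by 0.1 every 50 epochs.
The same thing happens with ResNet-56. Is anything in my settings missing or different from what you used?
Could you share your CIFAR training script and model file?
I look forward to your reply.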

Owner

JiaxiongQ commented Feb 13, 2021 via email

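Thank you for your interest.
1. Our model needs more training epochs at each learning-rate stage. We suggest multiplying the learning rate by 0.1 every 80 epochs.
2. Our earlier experiments all targeted the res bottleneck. The attachment contains a modification for the res block, which gives some improvement in experiments on CIFAR-100.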
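The suggested schedule can be sketched as a plain step decay. This is only an illustration: the initial learning rate of 0.1 comes from the question above, the 80-epoch step from this reply, and the function name and the sampled epochs are hypothetical (the thread does not state the total number of epochs).

```python
def step_lr(epoch, base_lr=0.1, step_size=80, gamma=0.1):
    """Learning rate at a given epoch under step decay.

    The rate is multiplied by `gamma` once every `step_size` epochs.
    """
    return base_lr * gamma ** (epoch // step_size)


# Sample a few epochs around the decay boundaries.
for epoch in (0, 79, 80, 159, 160):
    print(epoch, step_lr(epoch))
```

In PyTorch the same schedule corresponds to `torch.optim.lr_scheduler.StepLR(optimizer, step_size=80, gamma=0.1)`, stepped once per epoch.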


Owner

JiaxiongQ commented Feb 13, 2021 via email

import torch.nn as nn
import torch.nn.functional as F

# myconv_3x3R is posted later in this thread; LambdaLayer is not defined in this snippet.


class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, in_planes, planes, stride=1, option='A', cnt=0):
        super(BasicBlock, self).__init__()
        if cnt > -1:
            # SlimConv variant: outputs 3/4 * in_planes channels and applies its own BatchNorm.
            self.conv1 = myconv_3x3R(in_planes, stride=stride, kernel_size=3, padding=1, bias=False)
        else:
            # As posted, conv2 below still expects 3/4 * in_planes input channels,
            # so only the cnt > -1 path is shape-consistent.
            self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)  # not used in forward() as posted
        # Regular 3x3 conv that maps the 3/4 * in_planes channels to planes.
        self.conv2 = nn.Conv2d(in_planes//2 + in_planes//4, planes, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)

        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != planes:
            if option == 'A':
                # For CIFAR10, the ResNet paper uses option A:
                # subsample spatially and zero-pad the channel dimension.
                self.shortcut = LambdaLayer(
                    lambda x: F.pad(x[:, :, ::2, ::2], (0, 0, 0, 0, planes//4, planes//4), "constant", 0))
            elif option == 'B':
                self.shortcut = nn.Sequential(
                    nn.Conv2d(in_planes, self.expansion * planes, kernel_size=1, stride=stride, bias=False),
                    nn.BatchNorm2d(self.expansion * planes)
                )

    def forward(self, x):
        # out = F.relu(self.bn1(self.conv1(x)))
        out = F.relu(self.conv1(x))
        out = self.bn2(self.conv2(out))
        out += self.shortcut(x)
        out = F.relu(out)
        return out

On Sat, Feb 13, 2021 at 10:50 AM Jiaxiong Qiu wrote:
> There is a problem with the attachment above; please refer to this one instead.

zsureuk commented Feb 13, 2021

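@JiaxiongQ Thank you very much for your reply.
In your attachment I found these two lines:
if cnt>-1:
    self.conv1 = myconv_3x3R(in_planes,stride=stride, kernel_size=3, padding=1, bias=False)
Is myconv_3x3R the same as class slim_conv_3x3(nn.Module), or is it a separately defined convolution? Also, does cnt have any special meaning?
For a CIFAR ResNet, where each BasicBlock contains only two 3x3 convolutions, this means only the first 3x3 convolution is changed, and the second 3x3 is a regular convolution that restores the C channels, right?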

Owner

JiaxiongQ commented Feb 15, 2021 via email

Right. cnt is just an extra hyperparameter for trading off performance against params/FLOPs. myconv_3x3R is the name that was used before the code release. On basic_block it looks like this:

import torch
import torch.nn as nn


class myconv_3x3R(nn.Module):
    def __init__(self, in_planes, kernel_size=3, padding=1, bias=False, stride=1, dilation=1):
        super(myconv_3x3R, self).__init__()
        self.stride = stride
        l1 = 2
        l2 = 4
        # Lower path: 1x1 conv C/2 -> C/4, then 3x3 conv on C/4 channels.
        self.conv2_2 = nn.Sequential(
            nn.Conv2d(in_planes // l1, in_planes // l2, kernel_size=1, bias=False),
            nn.BatchNorm2d(in_planes // l2),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_planes // l2, in_planes // l2, kernel_size=kernel_size, stride=stride,
                      padding=padding, bias=False, dilation=dilation),
            nn.BatchNorm2d(in_planes // l2)
        )
        # Upper path: 3x3 conv on C/2 channels.
        self.conv2_1 = nn.Sequential(
            nn.Conv2d(in_planes // l1, in_planes // l1, kernel_size=kernel_size, stride=stride,
                      padding=padding, bias=False, dilation=dilation),
            nn.BatchNorm2d(in_planes // l1)
        )
        # Channel attention: one weight in (0, 1) per input channel.
        self.fc = nn.Sequential(
            nn.Conv2d(in_planes, in_planes // 8, kernel_size=1, bias=False),
            nn.BatchNorm2d(in_planes // 8),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_planes // 8, in_planes, kernel_size=1),
            nn.Sigmoid()
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.l1 = l1

    def forward(self, x):
        out = x
        _, c, _, _ = out.size()
        w = self.pool(out)
        w = self.fc(w)
        w_f = torch.flip(w, [1])  # same weights, reversed along the channel axis
        out1 = w * out
        out2 = w_f * out
        # Fold each reweighted tensor from C to C/2 channels by adding its two halves.
        fs1 = torch.split(out1, c // 2, 1)
        fs2 = torch.split(out2, c // 2, 1)
        ft1 = fs1[0] + fs1[1]
        ft2 = fs2[0] + fs2[1]
        out2_1 = self.conv2_1(ft1)  # C/2 channels
        out2_2 = self.conv2_2(ft2)  # C/4 channels
        out = torch.cat((out2_1, out2_2), 1)  # 3C/4 channels in total
        return out
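The reweight, flip and fold steps in `forward` can be traced by hand on a toy input. The sketch below is an illustration only: it uses plain Python lists, lets one scalar stand in for each whole channel, and picks arbitrary weights where the real module would use the sigmoid outputs of `self.fc`. The function names are hypothetical.

```python
def fold_halves(v):
    """Add the second half of the channel list onto the first half (C -> C/2)."""
    half = len(v) // 2
    return [a + b for a, b in zip(v[:half], v[half:])]


def reweight_and_fold(x, w):
    """Mirror the w / flip(w) reweighting and the split-and-add step."""
    w_flipped = w[::-1]                               # torch.flip(w, [1])
    out1 = [wi * xi for wi, xi in zip(w, x)]          # w * out
    out2 = [wi * xi for wi, xi in zip(w_flipped, x)]  # w_f * out
    return fold_halves(out1), fold_halves(out2)


x = [1.0, 2.0, 3.0, 4.0]      # four "channels"
w = [0.5, 0.25, 0.75, 0.125]  # stand-in attention weights
ft1, ft2 = reweight_and_fold(x, w)
print(ft1, ft2)
```

Both folded results have C/2 entries. They differ only because the second one pairs each channel with the weight taken from the mirrored position, so the upper and lower paths receive differently weighted mixtures of the same input.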

