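Code Execution and Output Analysis
- Model structure
The custom MyVgg model stacks five vggLayer blocks. Each block applies two 3×3 convolutions (stride 1, padding 1), each followed by a ReLU, and ends with a 2×2 max pooling. An adaptive average pooling layer (AdaptiveAvgPool2d) then fixes the feature map at 7×7, and three fully connected layers perform the classification.
The layout mirrors torchvision's vgg13: two convolutions per stage with the same channel widths, and an adaptive average pooling layer in front of the classifier. The main differences are that MyVgg's classifier has no Dropout layers and that it applies a ReLU to the final output.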
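To make the role of the adaptive pooling layer concrete, here is a minimal pure-Python sketch of how an adaptive average pool picks its averaging windows along one axis. It assumes the floor/ceil bin rule that PyTorch's adaptive pooling is generally described as using; `adaptive_bins` is a hypothetical helper written for this illustration, not a PyTorch function.

```python
def adaptive_bins(in_size, out_size):
    """Half-open index ranges [start, end) averaged into each output cell."""
    bins = []
    for i in range(out_size):
        start = (i * in_size) // out_size          # floor(i * in / out)
        end = -((-(i + 1) * in_size) // out_size)  # ceil((i + 1) * in / out)
        bins.append((start, end))
    return bins


# A 7-wide map pooled to 7 cells: every bin holds one element (identity).
print(adaptive_bins(7, 7))
# A 14-wide map (e.g. from a 448x448 input) pooled to 7 cells: bins of width 2.
print(adaptive_bins(14, 7))
```

Under this rule, a 224×224 input reaches the layer as a 7×7 map and passes through unchanged, while larger inputs are averaged down to 7×7 so that the first fully connected layer always receives 25088 features.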
- Tensor sizes in the forward pass
Input: (1, 3, 224, 224)
Each 3×3 convolution with stride 1 and padding 1 preserves the spatial size, and each 2×2 max pooling halves it:
$$H_{\text{out}} = W_{\text{out}} = \frac{224 - 3 + 2 \times 1}{1} + 1 = 224 \quad \text{(convolution keeps the size)}$$
$$H_{\text{out}} = W_{\text{out}} = \frac{224}{2} = 112 \quad \text{(pooling halves the size)}$$
  - layer1: input (1, 3, 224, 224); after two convolutions + ReLU and pooling (stride 2): (1, 64, 112, 112)
  - layer2: input (1, 64, 112, 112); after pooling: (1, 128, 56, 56)
  - layer3: input (1, 128, 56, 56); after pooling: (1, 256, 28, 28)
  - layer4: input (1, 256, 28, 28); after pooling: (1, 512, 14, 14)
  - layer5: input (1, 512, 14, 14); after pooling: (1, 512, 7, 7)
  - adapool: adaptive pooling to (7, 7). For a 224×224 input the feature map is already 7×7, so it passes through unchanged.
  - Fully connected layers: the adaptive pooling output (1, 512, 7, 7) is flattened to (1, 25088), since 512 × 7 × 7 = 25088. It then passes through three fully connected layers, giving a final output of (1, 1000).
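The size bookkeeping above can be reproduced without PyTorch. The following is a small sketch using the standard convolution and pooling output-size formulas; the helper names (`conv_out`, `pool_out`, `trace_shapes`) are made up for this example and are not part of the original script.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output length of an unpadded max pooling along one spatial axis."""
    return (size - kernel) // stride + 1


def trace_shapes(size=224, channels=(64, 128, 256, 512, 512)):
    """Return (channels, height, width) after each of the five blocks."""
    shapes = []
    for out_ch in channels:
        size = conv_out(conv_out(size))  # two 3x3 convolutions keep the size
        size = pool_out(size)            # one 2x2 max pooling halves it
        shapes.append((out_ch, size, size))
    return shapes


shapes = trace_shapes()
for i, shape in enumerate(shapes, start=1):
    print(f"layer{i}: {shape}")

c, h, w = shapes[-1]
print("flattened features:", c * h * w)  # 25088
```

Calling `trace_shapes(size=448)` shows why the adaptive pooling layer matters: layer5 then yields a 14×14 map, which would no longer match the 25088 inputs expected by fc1 without it.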
- Parameter statistics
The get_parameter_number function counts the parameters of the whole model and of individual layers. nn.Conv2d and nn.Linear include a bias term by default, so the reported counts contain the biases as well as the weights.
  - Whole model (MyVgg)
    - Total parameters (convolutional and fully connected layers): 133,047,848
    - Trainable parameters: 133,047,848
  - layer1
    - conv1: 3 × 3 × 3 × 64 = 1,728 weights + 64 biases = 1,792
    - conv2: 3 × 3 × 64 × 64 = 36,864 weights + 64 biases = 36,928
    - Total: 1,792 + 36,928 = 38,720
  - layer1.conv1
    - First convolution only: 3 × 3 × 3 × 64 + 64 = 1,792
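As a cross-check on these counts, the arithmetic can be written out in plain Python. This is a sketch that assumes the layer sizes listed in MyVgg and the default bias terms; `conv_params`, `linear_params`, `block_params` and `model_params` are illustrative helpers, not functions from the script.

```python
def conv_params(in_ch, out_ch, kernel=3, bias=True):
    """Weights (k * k * in * out) plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + (out_ch if bias else 0)


def linear_params(in_f, out_f, bias=True):
    """Weights (in * out) plus one bias per output feature."""
    return in_f * out_f + (out_f if bias else 0)


# (in, mid, out) channels of the five vggLayer blocks, then the three FC layers
BLOCKS = [(3, 64, 64), (64, 128, 128), (128, 256, 256),
          (256, 512, 512), (512, 512, 512)]
FCS = [(25088, 4096), (4096, 4096), (4096, 1000)]


def block_params(in_ch, mid_ch, out_ch, bias=True):
    return (conv_params(in_ch, mid_ch, bias=bias)
            + conv_params(mid_ch, out_ch, bias=bias))


def model_params(bias=True):
    conv_total = sum(block_params(*b, bias=bias) for b in BLOCKS)
    fc_total = sum(linear_params(*f, bias=bias) for f in FCS)
    return conv_total + fc_total


print(conv_params(3, 64))                   # 1792
print(block_params(3, 64, 64))              # 38720
print(block_params(3, 64, 64, bias=False))  # 38592 (weights only)
print(model_params())                       # 133047848
```

The weights-only figure of 38,592 for layer1 is what results when the 128 bias terms of its two convolutions are left out.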
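Suggested code improvements
- Redundant pooling call
In forward, x = self.adapool(x) is called twice in a row. The second call receives a map that is already 7×7 and returns it unchanged, so it can simply be removed.
- Final activation
The output of the last fully connected layer is also passed through ReLU, which zeroes every negative logit. For classification, return the raw logits instead (nn.CrossEntropyLoss applies log-softmax internally) and apply Softmax only when probabilities are needed at inference time.
- Comparing parameter counts with the reference VGG
Print the parameter counts of vgg and MyVgg to confirm that they match. Loading pretrained weights is a further check, although the parameter names differ between the two models, so the state-dict keys would have to be remapped first.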
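The effect of the final ReLU can be seen with a few numbers. The sketch below uses plain Python and a made-up three-class logit vector (the values are hypothetical, not taken from the model) to compare softmax probabilities computed from raw logits with those computed after a ReLU.

```python
import math


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


logits = [-2.0, -0.5, -1.0]  # hypothetical raw outputs of fc3

probs_raw = softmax(logits)
probs_relu = softmax(relu(logits))

print("from raw logits: ", [round(p, 3) for p in probs_raw])
print("after final ReLU:", [round(p, 3) for p in probs_relu])
```

When all logits are negative, the ReLU maps them to zero and the softmax becomes uniform, so the ranking between classes is lost. Raw logits keep class 1 as the clear winner.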
Sample output
Assuming the script runs without errors, its output ends with the tensor size and the three parameter counts:
torch.Size([1, 1000])
{'Total': 133047848, 'Trainable': 133047848}
{'Total': 38720, 'Trainable': 38720}
{'Total': 1792, 'Trainable': 1792}
Source code
import torch
import torch.nn as nn
import torchvision.models as models

# Reference VGG13 from torchvision, printed for comparison
vgg = models.vgg13()
print(vgg)


class vggLayer(nn.Module):
    """One VGG stage: two 3x3 convolutions, each followed by ReLU, then 2x2 max pooling."""

    def __init__(self, in_cha, mid_cha, out_cha):
        super(vggLayer, self).__init__()
        self.relu = nn.ReLU()
        self.pool = nn.MaxPool2d(2)
        self.conv1 = nn.Conv2d(in_cha, mid_cha, 3, 1, 1)  # kernel 3, stride 1, padding 1
        self.conv2 = nn.Conv2d(mid_cha, out_cha, 3, 1, 1)

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu(x)
        x = self.conv2(x)
        x = self.relu(x)
        x = self.pool(x)
        return x


class MyVgg(nn.Module):
    def __init__(self):
        super(MyVgg, self).__init__()
        self.layer1 = vggLayer(3, 64, 64)
        self.layer2 = vggLayer(64, 128, 128)
        self.layer3 = vggLayer(128, 256, 256)
        self.layer4 = vggLayer(256, 512, 512)
        self.layer5 = vggLayer(512, 512, 512)
        self.adapool = nn.AdaptiveAvgPool2d(7)
        self.relu = nn.ReLU()
        self.fc1 = nn.Linear(25088, 4096)
        self.fc2 = nn.Linear(4096, 4096)
        self.fc3 = nn.Linear(4096, 1000)

    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = self.layer5(x)
        x = self.adapool(x)
        x = self.adapool(x)  # redundant: the feature map is already 7x7
        x = x.view(x.size()[0], -1)  # flatten to (batch, 25088)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.fc3(x)
        x = self.relu(x)  # ReLU on the final output zeroes negative logits
        return x


myVgg = MyVgg()

# Dummy batch containing one 224x224 RGB image
img = torch.zeros((1, 3, 224, 224))
out = myVgg(img)
print(out.size())


def get_parameter_number(model):
    """Return the total and trainable parameter counts of a module."""
    total_num = sum(p.numel() for p in model.parameters())
    trainable_num = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return {'Total': total_num, 'Trainable': trainable_num}


print(get_parameter_number(myVgg))
print(get_parameter_number(myVgg.layer1))
print(get_parameter_number(myVgg.layer1.conv1))

# Uncomment to compare with the torchvision reference model
# print(get_parameter_number(vgg))




