Currently, PyTorch (via torchvision) provides these ResNet models:

'resnet101': '',
'resnet152': '',
'resnet18': '',
'resnet34': '',
'resnet50': '',
'resnext101_32x8d': '',
'resnext50_32x4d': ''

You may wonder what the actual Python ResNet code looks like. You can pull it up with the inspect module:

import inspect
from torchvision.models import resnet18

model = resnet18(pretrained=False)
init_source = inspect.getsource(model.__init__)
forward_source = inspect.getsource(model.forward)
print(init_source)
print(forward_source)

The __init__ method looks like this:

def __init__(self, block, layers, num_classes=1000, zero_init_residual=False,
             groups=1, width_per_group=64, replace_stride_with_dilation=None,
             norm_layer=None):
    super(ResNet, self).__init__()
    if norm_layer is None:
        norm_layer = nn.BatchNorm2d
    self._norm_layer = norm_layer

    self.inplanes = 64
    self.dilation = 1
    if replace_stride_with_dilation is None:
        # each element in the tuple indicates if we should replace
        # the 2x2 stride with a dilated convolution instead
        replace_stride_with_dilation = [False, False, False]
    if len(replace_stride_with_dilation) != 3:
        raise ValueError("replace_stride_with_dilation should be None "
                            "or a 3-element tuple, got {}".format(replace_stride_with_dilation))
    self.groups = groups
    self.base_width = width_per_group
    self.conv1 = nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, padding=3,
                           bias=False)
    self.bn1 = norm_layer(self.inplanes)
    self.relu = nn.ReLU(inplace=True)
    self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
    self.layer1 = self._make_layer(block, 64, layers[0])
    self.layer2 = self._make_layer(block, 128, layers[1], stride=2,
                                   dilate=replace_stride_with_dilation[0])
    self.layer3 = self._make_layer(block, 256, layers[2], stride=2,
                                   dilate=replace_stride_with_dilation[1])
    self.layer4 = self._make_layer(block, 512, layers[3], stride=2,
                                   dilate=replace_stride_with_dilation[2])
    self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
    self.fc = nn.Linear(512 * block.expansion, num_classes)

    for m in self.modules():
        if isinstance(m, nn.Conv2d):
            nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
        elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
            nn.init.constant_(m.weight, 1)
            nn.init.constant_(m.bias, 0)

    # Zero-initialize the last BN in each residual branch,
    # so that the residual branch starts with zeros, and each residual block behaves like an identity.
    # This improves the model by 0.2~0.3% according to https://arxiv.org/abs/1706.02677
    if zero_init_residual:
        for m in self.modules():
            if isinstance(m, Bottleneck):
                nn.init.constant_(m.bn3.weight, 0)
            elif isinstance(m, BasicBlock):
                nn.init.constant_(m.bn2.weight, 0)

The forward method looks like this:

def forward(self, x):
    x = self.conv1(x)
    x = self.bn1(x)
    x = self.relu(x)
    x = self.maxpool(x)

    x = self.layer1(x)
    x = self.layer2(x)
    x = self.layer3(x)
    x = self.layer4(x)

    x = self.avgpool(x)
    x = x.reshape(x.size(0), -1)
    x = self.fc(x)

    return x

The PyTorch ResNet code here also shows that we have two building-block modules:

  • BasicBlock
  • Bottleneck

The first is used for ResNet18 and ResNet34, and the latter for all the other (deeper) architectures.

Both BasicBlock and Bottleneck have the identity (skip) connection, as explained here.

BasicBlock uses only conv3x3, while Bottleneck combines conv3x3 and conv1x1 convolutions (kernel sizes 3 and 1).
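For reference, torchvision implements these as two small helper functions. The sketch below mirrors the source; note bias=False, since a BatchNorm layer follows each convolution and would cancel the bias anyway:

```python
import torch.nn as nn

def conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1):
    """3x3 convolution with padding; no bias because BatchNorm follows."""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=dilation, groups=groups, bias=False,
                     dilation=dilation)

def conv1x1(in_planes, out_planes, stride=1):
    """1x1 convolution; Bottleneck uses it to shrink/expand channels cheaply."""
    return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride,
                     bias=False)
```

The 1x1 convolutions are what make Bottleneck cheaper: they reduce the channel count before the expensive 3x3 convolution and expand it again afterwards.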

What may be interesting to alter and analyze

One thing that may be altered is the order of operations inside both BasicBlock and Bottleneck:

    out = self.conv1(x)
    out = self.bn1(out)
    out = self.relu(out)

You could reorder this to:

    out = self.conv1(x)
    out = self.relu(out)
    out = self.bn1(out)

Arguably, normalizing at the end, after the activation, also makes sense: it hands a regularized signal to the next layer.

Also, why not use a stride-2 convolution instead of max pooling? Only one max pooling layer appears, early in the ResNet stem (the encoding head).