Implementing classic CNN architectures in Keras (VGGNet, InceptionNet, ResNet, and more)

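Many classic network architectures have emerged over the development of convolutional neural networks. This article introduces five of them: LeNet, AlexNet, VGGNet, InceptionNet, and ResNet.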

LeNet

LeNet, proposed by Yann LeCun in 1998, is one of the earliest convolutional neural networks. Its architecture is shown in Figure 1.

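The LeNet variant implemented here is a fairly rudimentary convolutional architecture built around two convolutional layers. Its input is a 32*32*3 tensor, that is, a 32*32-pixel color image. The first convolutional layer has 6 kernels of size 5*5, and its output goes straight into a max-pooling layer. The second convolutional layer has 16 kernels of size 5*5, and its output likewise goes straight into a max-pooling layer. Three fully connected layers follow, with 120, 84, and 10 neurons respectively.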

Figure 2 shows the structure of each layer of the network and its parameters.
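The feature-map sizes implied by this description can be checked by hand. The sketch below is only an illustration: the helper names `conv_out` and `lenet_shapes` are hypothetical, and it assumes a 32*32*3 input and Keras' default 'valid' padding for both the convolution and pooling layers.

```python
def conv_out(size, kernel, stride=1, padding="valid"):
    """Output size of a conv or pooling layer along one spatial dimension."""
    if padding == "same":
        return -(-size // stride)          # ceil(size / stride)
    return (size - kernel) // stride + 1   # 'valid': no padding


def lenet_shapes(size=32, channels=3):
    """Trace (name, height, width, channels) through the conv/pool stages."""
    layers = [
        ("c1", 5, 1, 6),      # 6 kernels of 5*5
        ("p1", 2, 2, None),   # 2*2 max pooling, stride 2
        ("c2", 5, 1, 16),     # 16 kernels of 5*5
        ("p2", 2, 2, None),   # 2*2 max pooling, stride 2
    ]
    shapes = []
    for name, kernel, stride, filters in layers:
        size = conv_out(size, kernel, stride)
        if filters is not None:   # pooling keeps the channel count
            channels = filters
        shapes.append((name, size, size, channels))
    return shapes


shapes = lenet_shapes()
for name, h, w, c in shapes:
    print(f"{name}: {h}x{w}x{c}")

# Number of features that Flatten passes to the first Dense layer
flat = shapes[-1][1] * shapes[-1][2] * shapes[-1][3]
print("flatten:", flat)   # flatten: 400
```

Under these assumptions the spatial size goes 32, 28, 14, 10, 5, so the first fully connected layer receives 5*5*16 = 400 features.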

The code is as follows:

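# Imports shared by all the code in this article
import os

import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import (Activation, BatchNormalization, Conv2D, Dense,
                                     Dropout, Flatten, GlobalAveragePooling2D, MaxPool2D)


class LeNet5(Model):
    def __init__(self):
        super(LeNet5, self).__init__()
        # Two convolution + max-pooling stages
        self.c1 = Conv2D(filters=6, kernel_size=(5, 5), activation='sigmoid')
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2)
        self.c2 = Conv2D(filters=16, kernel_size=(5, 5), activation='sigmoid')
        self.p2 = MaxPool2D(pool_size=(2, 2), strides=2)
        # Three fully connected layers
        self.flatten = Flatten()
        self.f1 = Dense(120, activation='sigmoid')
        self.f2 = Dense(84, activation='sigmoid')
        self.f3 = Dense(10, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.p1(x)
        x = self.c2(x)
        x = self.p2(x)
        x = self.flatten(x)
        x = self.f1(x)
        x = self.f2(x)
        y = self.f3(x)
        return y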

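# x_train, y_train, x_test and y_test are assumed to be loaded beforehand
# (the article does not show the data-loading step).
model = LeNet5()
model.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False), metrics=['sparse_categorical_accuracy'])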
checkpoint_save_path = "./checkpoint/LeNet5.ckpt"

if os.path.exists(checkpoint_save_path + '.index'):
        print('-------------load the model-----------------')
        model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path, save_weights_only=True,save_best_only=True)

history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1, callbacks=[cp_callback])

model.summary()

AlexNet

AlexNet appeared in 2012. It resembles LeNet in some respects, but the scale of the network is much larger. Its architecture is sketched in Figure 3.

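Compared with LeNet, AlexNet increases the number of convolutional layers to 5. In this implementation the first two convolutional layers are each followed by BatchNormalization(), and the activation function changes from sigmoid to relu. Figure 4 shows the structure of each layer and the parameters set for it.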
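One common motivation for the switch from sigmoid to relu is how the activation's derivative behaves when many layers are chained. The sketch below is a deliberately simplified illustration, not part of the article's model: it ignores the weights entirely and only multiplies the activation derivatives of `depth` layers, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    """Derivative of the sigmoid; its maximum is 0.25, reached at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)


def relu_grad(z):
    """Derivative of relu: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0


def chained_grad(grad_fn, z, depth):
    """Product of `depth` identical activation derivatives (weights ignored)."""
    return grad_fn(z) ** depth


depth = 5  # AlexNet's five convolutional layers
sig = chained_grad(sigmoid_grad, 0.0, depth)  # best case for sigmoid
rel = chained_grad(relu_grad, 1.0, depth)     # any positive pre-activation
print(sig)  # 0.0009765625
print(rel)  # 1.0
```

Even in the best case, the sigmoid factor shrinks the gradient by at least 4x per layer, whereas relu passes it through unchanged for active units.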

The code is as follows:

class AlexNet8(Model):
    def __init__(self):
        super(AlexNet8, self).__init__()
        # Conv layers 1 and 2: conv -> BN -> relu -> max pooling
        self.c1 = Conv2D(filters=96, kernel_size=(3, 3))
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.p1 = MaxPool2D(pool_size=(3, 3), strides=2)
        self.c2 = Conv2D(filters=256, kernel_size=(3, 3))
        self.b2 = BatchNormalization()
        self.a2 = Activation('relu')
        self.p2 = MaxPool2D(pool_size=(3, 3), strides=2)
        # Conv layers 3 to 5, followed by one max pooling
        self.c3 = Conv2D(filters=384, kernel_size=(3, 3), padding='same', activation='relu')
        self.c4 = Conv2D(filters=384, kernel_size=(3, 3), padding='same', activation='relu')
        self.c5 = Conv2D(filters=256, kernel_size=(3, 3), padding='same', activation='relu')
        self.p3 = MaxPool2D(pool_size=(3, 3), strides=2)
        # Three fully connected layers with dropout in between
        self.flatten = Flatten()
        self.f1 = Dense(2048, activation='relu')
        self.d1 = Dropout(0.5)
        self.f2 = Dense(2048, activation='relu')
        self.d2 = Dropout(0.5)
        self.f3 = Dense(10, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.p1(x)
        x = self.c2(x)
        x = self.b2(x)
        x = self.a2(x)
        x = self.p2(x)
        x = self.c3(x)
        x = self.c4(x)
        x = self.c5(x)
        x = self.p3(x)
        x = self.flatten(x)
        x = self.f1(x)
        x = self.d1(x)
        x = self.f2(x)
        x = self.d2(x)
        y = self.f3(x)
        return y

model = AlexNet8()
model.compile(optimizer='adam',loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),metrics=['sparse_categorical_accuracy'])
checkpoint_save_path = "./checkpoint/AlexNet8.ckpt"

if os.path.exists(checkpoint_save_path + '.index'):
        print('-------------load the model-----------------')
        model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,save_weights_only=True,save_best_only=True)

history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1,callbacks=[cp_callback])

model.summary()

VGGNet16

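The biggest improvement in VGGNet16 is its depth: it goes from AlexNet's 8 layers in total to 16, which gives the network greater expressive power. VGGNet uses only small 3*3 kernels throughout; in practice, stacks of such small kernels have proven to work better than single large kernels.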
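Part of the usual explanation for this is that a stack of 3*3 convolutions covers the same receptive field as one larger kernel while using fewer weights. The sketch below checks that arithmetic; it is only an illustration, with hypothetical helper names, biases ignored, and an arbitrarily chosen channel count of 64 for both input and output.

```python
def conv_weights(kernel, channels):
    """Weights of one conv layer with `channels` inputs and outputs (no bias)."""
    return kernel * kernel * channels * channels


def stacked_receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convolutions."""
    return 1 + layers * (kernel - 1)


channels = 64  # illustrative value

# Two 3*3 layers see the same 5*5 region as one 5*5 layer
assert stacked_receptive_field(3, 2) == 5
two_3x3 = 2 * conv_weights(3, channels)
one_5x5 = conv_weights(5, channels)
print(two_3x3, one_5x5)    # 73728 102400

# Three 3*3 layers see the same 7*7 region as one 7*7 layer
assert stacked_receptive_field(3, 3) == 7
three_3x3 = 3 * conv_weights(3, channels)
one_7x7 = conv_weights(7, channels)
print(three_3x3, one_7x7)  # 110592 200704
```

The stacked version also applies a nonlinearity after every 3*3 layer, which a single large kernel cannot do.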

The architecture of VGGNet16 is shown in the figure below:

The code is as follows:

class VGG16(Model):
    def __init__(self):
        super(VGG16, self).__init__()
        # Stage 1: two 64-filter conv layers (conv -> BN -> relu), pooling, dropout
        self.c1 = Conv2D(filters=64, kernel_size=(3, 3), padding='same')
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.c2 = Conv2D(filters=64, kernel_size=(3, 3), padding='same')
        self.b2 = BatchNormalization()
        self.a2 = Activation('relu')
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d1 = Dropout(0.2)
        # Stage 2: two 128-filter conv layers
        self.c3 = Conv2D(filters=128, kernel_size=(3, 3), padding='same')
        self.b3 = BatchNormalization()
        self.a3 = Activation('relu')
        self.c4 = Conv2D(filters=128, kernel_size=(3, 3), padding='same')
        self.b4 = BatchNormalization()
        self.a4 = Activation('relu')
        self.p2 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d2 = Dropout(0.2)
        # Stage 3: three 256-filter conv layers
        self.c5 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
        self.b5 = BatchNormalization()
        self.a5 = Activation('relu')
        self.c6 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
        self.b6 = BatchNormalization()
        self.a6 = Activation('relu')
        self.c7 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
        self.b7 = BatchNormalization()
        self.a7 = Activation('relu')
        self.p3 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d3 = Dropout(0.2)
        # Stage 4: three 512-filter conv layers
        self.c8 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b8 = BatchNormalization()
        self.a8 = Activation('relu')
        self.c9 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b9 = BatchNormalization()
        self.a9 = Activation('relu')
        self.c10 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b10 = BatchNormalization()
        self.a10 = Activation('relu')
        self.p4 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d4 = Dropout(0.2)
        # Stage 5: three 512-filter conv layers
        self.c11 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b11 = BatchNormalization()
        self.a11 = Activation('relu')
        self.c12 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b12 = BatchNormalization()
        self.a12 = Activation('relu')
        self.c13 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b13 = BatchNormalization()
        self.a13 = Activation('relu')
        self.p5 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d5 = Dropout(0.2)
        # Classifier: three fully connected layers with dropout in between
        self.flatten = Flatten()
        self.f1 = Dense(512, activation='relu')
        self.d6 = Dropout(0.2)
        self.f2 = Dense(512, activation='relu')
        self.d7 = Dropout(0.2)
        self.f3 = Dense(10, activation='softmax')

    def call(self, x):
        # Stage 1
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.c2(x)
        x = self.b2(x)
        x = self.a2(x)
        x = self.p1(x)
        x = self.d1(x)
        # Stage 2
        x = self.c3(x)
        x = self.b3(x)
        x = self.a3(x)
        x = self.c4(x)
        x = self.b4(x)
        x = self.a4(x)
        x = self.p2(x)
        x = self.d2(x)
        # Stage 3
        x = self.c5(x)
        x = self.b5(x)
        x = self.a5(x)
        x = self.c6(x)
        x = self.b6(x)
        x = self.a6(x)
        x = self.c7(x)
        x = self.b7(x)
        x = self.a7(x)
        x = self.p3(x)
        x = self.d3(x)
        # Stage 4
        x = self.c8(x)
        x = self.b8(x)
        x = self.a8(x)
        x = self.c9(x)
        x = self.b9(x)
        x = self.a9(x)
        x = self.c10(x)
        x = self.b10(x)
        x = self.a10(x)
        x = self.p4(x)
        x = self.d4(x)
        # Stage 5
        x = self.c11(x)
        x = self.b11(x)
        x = self.a11(x)
        x = self.c12(x)
        x = self.b12(x)
        x = self.a12(x)
        x = self.c13(x)
        x = self.b13(x)
        x = self.a13(x)
        x = self.p5(x)
        x = self.d5(x)
        # Classifier
        x = self.flatten(x)
        x = self.f1(x)
        x = self.d6(x)
        x = self.f2(x)
        x = self.d7(x)
        y = self.f3(x)
        return y

model = VGG16()

model.compile(optimizer='adam',loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),metrics=['sparse_categorical_accuracy'])
checkpoint_save_path = "./checkpoint/VGG16.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
        print('-------------load the model-----------------')
        model.load_weights(checkpoint_save_path)
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,save_weights_only=True,save_best_only=True)

history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1, callbacks=[cp_callback])

model.summary()

InceptionNet

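InceptionNet appeared in 2014. It improves the network's capability by increasing its width (the horizontal direction), a different direction from VGGNet's approach of stacking convolutional layers (the vertical direction). The figure below shows the basic InceptionNet unit, which can be thought of as a single convolutional layer of the network.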

Two classes can be defined here. The first is a standardized convolutional layer; its code is as follows:

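class ConvBNRelu(Model):
    def __init__(self, ch, kernelsz=3, strides=1, padding='same'):
        super(ConvBNRelu, self).__init__()
        self.model = tf.keras.models.Sequential([
            Conv2D(ch, kernelsz, strides=strides, padding=padding),
            BatchNormalization(),
            Activation('relu')
        ])

    def call(self, x, training=None):
        # Forward the training flag instead of hard-coding training=False.
        # With training=True, BN normalizes with the current batch's mean and variance
        # and updates its moving averages; with training=False (inference), it uses
        # the moving averages accumulated over training, which works better for inference.
        x = self.model(x, training=training)
        return x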

With the standardized convolutional layer (ConvBNRelu) defined, the basic InceptionNet unit can now be defined. Its code is as follows:

class InceptionBlk(Model):
    def __init__(self, ch, strides=1):
        super(InceptionBlk, self).__init__()
        self.ch = ch
        self.strides = strides
        self.c1 = ConvBNRelu(ch, kernelsz=1, strides=strides)    # leftmost branch: 1*1 conv
        self.c2_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)  # second branch from the left: 1*1 conv (yellow in the figure)
        self.c2_2 = ConvBNRelu(ch, kernelsz=3, strides=1)        # second branch from the left: 3*3 conv (blue in the figure)
        self.c3_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)  # third branch from the left: 1*1 conv (yellow in the figure)
        self.c3_2 = ConvBNRelu(ch, kernelsz=5, strides=1)        # third branch from the left: 5*5 conv (blue in the figure)
        self.p4_1 = MaxPool2D(3, strides=1, padding='same')      # fourth branch from the left: max pooling (red in the figure)
        self.c4_2 = ConvBNRelu(ch, kernelsz=1, strides=strides)  # fourth branch from the left: 1*1 conv (yellow in the figure)

    def call(self, x):
        x1 = self.c1(x)
        x2_1 = self.c2_1(x)
        x2_2 = self.c2_2(x2_1)
        x3_1 = self.c3_1(x)
        x3_2 = self.c3_2(x3_1)
        x4_1 = self.p4_1(x)
        x4_2 = self.c4_2(x4_1)
        # Concatenate the four branches along the channel axis
        x = tf.concat([x1, x2_2, x3_2, x4_2], axis=3)
        return x

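Next, build an InceptionNet with two Blocks, each containing two basic InceptionBlk units. The strides setting differs between the two units of a Block: the first uses strides=2 and the second uses strides=1. This means that each Block halves the feature-map size, so the number of kernels is doubled correspondingly. The detailed architecture is shown in the figure below.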
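The size and channel bookkeeping described above can be traced without TensorFlow. The sketch below is an illustration under stated assumptions: `inception10_trace` is a hypothetical helper, the input is assumed to be 32*32, every layer uses 'same' padding, and each InceptionBlk outputs four concatenated branches of `ch` channels each.

```python
import math


def inception10_trace(num_blocks=2, init_ch=16, size=32):
    """Return (stage, spatial size, channels) for each stage of the network."""
    trace = [("c1", size, init_ch)]  # initial 3*3 conv, stride 1, 'same' padding
    out_channels = init_ch
    for block_id in range(num_blocks):
        for layer_id in range(2):
            strides = 2 if layer_id == 0 else 1
            size = math.ceil(size / strides)  # 'same' padding: ceil(size / strides)
            channels = 4 * out_channels       # four branches, concatenated
            trace.append((f"block{block_id}_unit{layer_id}", size, channels))
        out_channels *= 2                     # double the kernels for the next Block
    return trace


trace = inception10_trace()
for stage, size, channels in trace:
    print(f"{stage}: {size}x{size}x{channels}")
```

With these assumptions the feature map goes from 32*32*16 to 16*16*64 after the first Block and 8*8*128 after the second.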

The implementation is as follows:

class Inception10(Model):
    def __init__(self, num_blocks, num_classes, init_ch=16, **kwargs):
        super(Inception10, self).__init__(**kwargs)
        self.in_channels = init_ch
        self.out_channels = init_ch
        self.num_blocks = num_blocks
        self.init_ch = init_ch
        self.c1 = ConvBNRelu(init_ch)
        self.blocks = tf.keras.models.Sequential()
        for block_id in range(num_blocks):
            for layer_id in range(2):
                if layer_id == 0:
                    block = InceptionBlk(self.out_channels, strides=2)
                else:
                    block = InceptionBlk(self.out_channels, strides=1)
                # Add every unit, so this call must sit inside the inner loop
                self.blocks.add(block)
            # Double out_channels after each Block
            self.out_channels *= 2
        self.p1 = GlobalAveragePooling2D()
        self.f1 = Dense(num_classes, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.blocks(x)
        x = self.p1(x)
        y = self.f1(x)
        return y

model = Inception10(num_blocks=2, num_classes=10)
model.compile(optimizer='adam',loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),metrics=['sparse_categorical_accuracy'])
checkpoint_save_path = "./checkpoint/Inception10.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
        print('-------------load the model-----------------')
        model.load_weights(checkpoint_save_path)
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,save_weights_only=True,save_best_only=True)
history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1, callbacks=[cp_callback])

model.summary()

Alternatively, InceptionNet can be implemented without classes. The code is as follows:

import tensorflow as tf

# Define an Inception module
def InceptionModule(inputs):
    # Branch 1: 1*1 conv
    branch1 = tf.keras.layers.Conv2D(64, (1, 1), padding='same', activation='relu')(inputs)
    # Branch 2: 1*1 conv followed by 3*3 conv
    branch2 = tf.keras.layers.Conv2D(64, (1, 1), padding='same', activation='relu')(inputs)
    branch2 = tf.keras.layers.Conv2D(96, (3, 3), padding='same', activation='relu')(branch2)
    # Branch 3: 1*1 conv followed by 5*5 conv
    branch3 = tf.keras.layers.Conv2D(64, (1, 1), padding='same', activation='relu')(inputs)
    branch3 = tf.keras.layers.Conv2D(64, (5, 5), padding='same', activation='relu')(branch3)
    # Branch 4: 3*3 max pooling followed by 1*1 conv
    branch4 = tf.keras.layers.MaxPooling2D((3, 3), strides=(1, 1), padding='same')(inputs)
    branch4 = tf.keras.layers.Conv2D(32, (1, 1), padding='same', activation='relu')(branch4)
    # Merge the four branches along the channel axis
    outputs = tf.keras.layers.concatenate([branch1, branch2, branch3, branch4], axis=-1)
    return outputs

# Define an InceptionNet
def InceptionNet(input_shape, num_classes):
    # Input layer
    inputs = tf.keras.layers.Input(shape=input_shape)
    # First stage
    x = tf.keras.layers.Conv2D(64, (7, 7), strides=(2, 2), padding='same', activation='relu')(inputs)
    x = tf.keras.layers.MaxPooling2D((3, 3), strides=(2, 2), padding='same')(x)
    # Second stage
    x = tf.keras.layers.Conv2D(64, (1, 1), padding='same', activation='relu')(x)
    x = tf.keras.layers.Conv2D(192, (3, 3), padding='same', activation='relu')(x)
    x = tf.keras.layers.MaxPooling2D((3, 3), strides=(2, 2), padding='same')(x)
    # Third stage: three Inception modules
    x = InceptionModule(x)
    x = InceptionModule(x)
    x = InceptionModule(x)
    # Global average pooling layer
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    # Output layer
    outputs = tf.keras.layers.Dense(num_classes, activation='softmax')(x)
    # Create the model
    model = tf.keras.Model(inputs=inputs, outputs=outputs)
    return model
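As a rough sanity check on the functional version, the output channels and parameter count of one InceptionModule can be worked out by hand. The sketch below is only an illustration: the helper names are hypothetical, the filter counts are copied from the InceptionModule code above, and 192 input channels are assumed, as produced by the second stage of InceptionNet.

```python
def conv_params(kernel, c_in, c_out):
    """Parameters of a Conv2D layer with bias: k*k*c_in*c_out + c_out."""
    return kernel * kernel * c_in * c_out + c_out


def inception_module_stats(c_in):
    """Return (output channels, parameter count) of one InceptionModule."""
    # Each branch is a list of (kernel size, filters) conv layers
    branches = {
        "branch1": [(1, 64)],
        "branch2": [(1, 64), (3, 96)],
        "branch3": [(1, 64), (5, 64)],
        "branch4": [(1, 32)],  # the max pooling before it has no parameters
    }
    params = 0
    out_channels = 0
    for layers in branches.values():
        c = c_in
        for kernel, filters in layers:
            params += conv_params(kernel, c, filters)
            c = filters
        out_channels += c  # concatenation adds up the branch widths
    return out_channels, params


out_channels, params = inception_module_stats(c_in=192)
print(out_channels, params)  # 256 201088
```

The module always outputs 64 + 96 + 64 + 32 = 256 channels regardless of its input width, so the second and third modules in InceptionNet each receive 256 input channels.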

ResNet

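A problem with the classic convolutional networks above is that test accuracy stops improving as more and more layers are added, which is mainly attributed to vanishing gradients. The core idea of ResNet is to add residual skip connections between layers, which carry information from earlier layers forward and mitigate vanishing gradients, so that much deeper networks can be trained. Its structure is sketched below.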
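A scalar toy model gives some intuition for why a skip connection helps. A plain layer y = F(x) contributes a factor F'(x) to the backpropagated gradient, while a residual layer y = F(x) + x contributes 1 + F'(x). The sketch below is a simplified illustration only: the function names are hypothetical and every layer is assumed to have the same small local derivative of 0.1.

```python
import math


def plain_chain(local_grads):
    """Gradient through stacked plain layers: product of F'(x)."""
    return math.prod(local_grads)


def residual_chain(local_grads):
    """Gradient through stacked residual layers: product of (1 + F'(x))."""
    return math.prod(1.0 + d for d in local_grads)


local_grads = [0.1] * 20  # 20 layers, each with a small local derivative

plain = plain_chain(local_grads)
residual = residual_chain(local_grads)
print(f"plain: {plain:.3e}")        # vanishes, on the order of 1e-20
print(f"residual: {residual:.3f}")  # stays above 1
```

The identity term keeps each factor close to 1 even when F'(x) is tiny, so the gradient reaching the early layers does not collapse toward zero.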

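The figure above contains both solid and dashed lines. The difference between them is whether the dimensions match: a solid line is a shortcut whose input and output already have the same dimensions, so x is added directly, while a dashed line is a shortcut whose dimensions differ, so x is first passed through a 1*1 convolution to make the dimensions match. The calculation is shown in the figure below.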
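The dimension-matching step can be sketched with NumPy alone. A 1*1 convolution with stride 2 and no bias amounts to subsampling every second pixel and then mixing channels with a matrix. The example below is an illustration under stated assumptions: `projection_shortcut` is a hypothetical helper, the tensors are random stand-ins rather than real activations, and batch normalization is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)


def projection_shortcut(x, weight, stride):
    """1*1 convolution with the given stride and no bias.

    x: (H, W, C_in) feature map, weight: (C_in, C_out) channel-mixing matrix.
    """
    return x[::stride, ::stride, :] @ weight


x = rng.normal(size=(32, 32, 64))     # block input
f_x = rng.normal(size=(16, 16, 128))  # stand-in for F(x) after a stride-2 block
w = rng.normal(size=(64, 128))        # 1*1 kernel: 64 -> 128 channels

# x and F(x) have different shapes, so x cannot be added directly
shortcut = projection_shortcut(x, w, stride=2)
out = np.maximum(f_x + shortcut, 0.0)  # F(x) + Wx, then relu

print(shortcut.shape)  # (16, 16, 128)
print(out.shape)       # (16, 16, 128)
```

When the input and output shapes already agree, the projection is unnecessary and the block simply computes relu(F(x) + x).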

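The code for the ResNet block takes a parameter residual_path that indicates whether the shortcut's dimensions need to be adjusted (True when they differ). The code is as follows: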

class ResnetBlock(Model):
    def __init__(self, filters, strides=1, residual_path=False):
        super(ResnetBlock, self).__init__()
        self.filters = filters
        self.strides = strides
        self.residual_path = residual_path
        self.c1 = Conv2D(filters, (3, 3), strides=strides, padding='same', use_bias=False)
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.c2 = Conv2D(filters, (3, 3), strides=1, padding='same', use_bias=False)
        self.b2 = BatchNormalization()
        # When residual_path is True, downsample the input with a 1x1 convolution
        # so that x has the same dimensions as F(x) and the two can be added
        if residual_path:
            self.down_c1 = Conv2D(filters, (1, 1), strides=strides, padding='same', use_bias=False)
            self.down_b1 = BatchNormalization()
        self.a2 = Activation('relu')

    def call(self, inputs):
        residual = inputs  # residual is the input itself, i.e. residual = x
        # Pass the input through conv, BN, and activation layers to compute F(x)
        x = self.c1(inputs)
        x = self.b1(x)
        x = self.a1(x)
        x = self.c2(x)
        y = self.b2(x)
        if self.residual_path:
            residual = self.down_c1(inputs)
            residual = self.down_b1(residual)
        # The output is the sum of the two parts, F(x) + x or F(x) + Wx, passed through the activation
        out = self.a2(y + residual)
        return out

Finally, ResnetBlock can be used to build the ResNet architecture. The code is as follows:

class ResNet18(Model):
    def __init__(self, block_list, initial_filters=64):  # block_list: number of ResnetBlock units in each block
        super(ResNet18, self).__init__()
        self.num_blocks = len(block_list)  # total number of blocks
        self.block_list = block_list
        self.out_filters = initial_filters
        self.c1 = Conv2D(self.out_filters, (3, 3), strides=1, padding='same', use_bias=False)
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.blocks = tf.keras.models.Sequential()
        # Build the ResNet structure
        for block_id in range(len(block_list)):  # index of the resnet block
            for layer_id in range(block_list[block_id]):  # index of the unit within the block
                if block_id != 0 and layer_id == 0:  # downsample the input of every block except the first
                    block = ResnetBlock(self.out_filters, strides=2, residual_path=True)
                else:
                    block = ResnetBlock(self.out_filters, residual_path=False)
                self.blocks.add(block)  # add the finished unit to the network
            self.out_filters *= 2  # the next block has twice as many kernels as this one
        self.p1 = tf.keras.layers.GlobalAveragePooling2D()
        self.f1 = tf.keras.layers.Dense(10, activation='softmax', kernel_regularizer=tf.keras.regularizers.l2())

    def call(self, inputs):
        x = self.c1(inputs)
        x = self.b1(x)
        x = self.a1(x)
        x = self.blocks(x)
        x = self.p1(x)
        y = self.f1(x)
        return y

model = ResNet18([2, 2, 2, 2])
model.compile(optimizer='adam',loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),metrics=['sparse_categorical_accuracy'])
checkpoint_save_path = "./checkpoint/ResNet18.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
        print('-------------load the model-----------------')
        model.load_weights(checkpoint_save_path)
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,save_weights_only=True,save_best_only=True)
history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1,callbacks=[cp_callback])
model.summary()
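The layer schedule that ResNet18([2, 2, 2, 2]) is meant to produce can be traced without TensorFlow. The sketch below mirrors the intended construction loop, with the filter count doubling once per block rather than once per unit; `resnet_schedule` is a hypothetical helper written for illustration, not part of the model code.

```python
def resnet_schedule(block_list, initial_filters=64):
    """Return (filters, strides, residual_path) for every ResnetBlock unit."""
    schedule = []
    filters = initial_filters
    for block_id, num_units in enumerate(block_list):
        for layer_id in range(num_units):
            # Only the first unit of every block except the first downsamples
            downsample = block_id != 0 and layer_id == 0
            strides = 2 if downsample else 1
            schedule.append((filters, strides, downsample))
        filters *= 2  # doubled once per block
    return schedule


schedule = resnet_schedule([2, 2, 2, 2])
for unit in schedule:
    print(unit)

# Stem conv + two 3*3 convs per unit + final dense layer
weighted_layers = 1 + 2 * len(schedule) + 1
print(weighted_layers)  # 18
```

The count of 18 excludes the 1x1 shortcut convolutions, which is how the "18" in the name is usually derived.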

An implementation without classes is as follows:

import tensorflow as tf

# Define a residual block
def ResidualBlock(x, filters, downsample=False):
    shortcut = x
    stride = (1, 1)
    # If downsampling is required, downsample the input on the shortcut branch
    if downsample:
        stride = (2, 2)
        shortcut = tf.keras.layers.Conv2D(filters, (1, 1), strides=stride, padding='same')(shortcut)
        shortcut = tf.keras.layers.BatchNormalization()(shortcut)
    # Main branch
    x = tf.keras.layers.Conv2D(filters, (3, 3), strides=stride, padding='same')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Activation('relu')(x)
    x = tf.keras.layers.Conv2D(filters, (3, 3), strides=(1, 1), padding='same')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    # Add the main-branch output to the shortcut
    x = tf.keras.layers.add([x, shortcut])
    x = tf.keras.layers.Activation('relu')(x)
    return x

# Define a ResNet model
def ResNet(input_shape, num_classes):
    inputs = tf.keras.layers.Input(shape=input_shape)
    # Stem (preprocessing) layers
    x = tf.keras.layers.ZeroPadding2D(padding=(3, 3))(inputs)
    x = tf.keras.layers.Conv2D(64, (7, 7), strides=(2, 2))(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Activation('relu')(x)
    x = tf.keras.layers.MaxPooling2D((3, 3), strides=(2, 2))(x)
    # Residual block group 1
    x = ResidualBlock(x, filters=64)
    x = ResidualBlock(x, filters=64)
    # Residual block group 2
    x = ResidualBlock(x, filters=128, downsample=True)
    x = ResidualBlock(x, filters=128)
    # Residual block group 3
    x = ResidualBlock(x, filters=256, downsample=True)
    x = ResidualBlock(x, filters=256)
    # Residual block group 4
    x = ResidualBlock(x, filters=512, downsample=True)
    x = ResidualBlock(x, filters=512)
    # Global average pooling layer
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    # Output layer
    outputs = tf.keras.layers.Dense(num_classes, activation='softmax')(x)
    # Create the model
    model = tf.keras.Model(inputs=inputs, outputs=outputs)
    return model

Overall, ResNet greatly increased network depth, and in 2015 it brought the top-5 error rate on ImageNet image recognition down to 3.57%.