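Many classic architectures have emerged over the development of convolutional neural networks (CNNs). This article introduces five of them: LeNet, AlexNet, VGGNet, InceptionNet, and ResNet.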
- LeNet
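LeNet, proposed by Yann LeCun in 1998, is one of the earliest convolutional neural networks. Its architecture is shown in Figure 1.
LeNet is a fairly primitive convolutional architecture built mainly from two convolutional layers. In the version implemented here, the input is a 32*32*3 tensor, i.e. a 32*32-pixel color image (the original LeNet-5 took 32*32 grayscale images). The first convolutional layer has six 5*5 kernels, and its output goes straight into a pooling layer, which here uses max pooling (the original network used an average-style subsampling layer). The second convolutional layer has sixteen 5*5 kernels, and its output again goes straight into a max-pooling layer. Three fully connected layers follow, with 120, 84, and 10 neurons respectively.
Figure 2 shows the structure of each layer and the corresponding parameters.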
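To make the layer sizes concrete, the following sketch traces the feature-map size and parameter count through the network described above. It is a plain-Python back-of-the-envelope calculation rather than part of the original code. It assumes a 32*32*3 input and 'valid' padding with stride 1 for the convolutions (the Keras defaults), and the helper names `conv_out`, `conv_params`, and `dense_params` are made up for illustration.

```python
def conv_out(size, kernel, stride=1):
    """Output width/height of a 'valid' convolution or pooling window."""
    return (size - kernel) // stride + 1


def conv_params(kernel, in_ch, out_ch):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch


def dense_params(n_in, n_out):
    """Weight matrix plus one bias per output unit."""
    return n_in * n_out + n_out


size, ch = 32, 3   # assumed input: 32*32 pixels, 3 color channels
trace = []         # rows of (layer kind, spatial size, channels/units, parameters)

# conv 5*5 (6 kernels) -> pool 2*2 -> conv 5*5 (16 kernels) -> pool 2*2
for out_ch in (6, 16):
    params = conv_params(5, ch, out_ch)
    size, ch = conv_out(size, 5), out_ch
    trace.append(("conv", size, ch, params))
    size = conv_out(size, 2, stride=2)
    trace.append(("pool", size, ch, 0))

flat = size * size * ch   # number of features after Flatten
units = flat
for n_out in (120, 84, 10):
    trace.append(("dense", 1, n_out, dense_params(units, n_out)))
    units = n_out

for name, s, c, p in trace:
    print(f"{name:5s} size={s:2d} channels={c:3d} params={p}")
print("flattened features:", flat)
print("total parameters:", sum(row[3] for row in trace))
```

Under these assumptions the two convolution/pooling stages shrink each side from 32 to 28, 14, 10, and finally 5 pixels, so the first fully connected layer receives 5*5*16 = 400 features.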
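The code is as follows:
import os
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense


class LeNet5(Model):
    def __init__(self):
        super(LeNet5, self).__init__()
        # Stage 1: six 5*5 kernels, then 2*2 max pooling
        self.c1 = Conv2D(filters=6, kernel_size=(5, 5), activation='sigmoid')
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2)
        # Stage 2: sixteen 5*5 kernels, then 2*2 max pooling
        self.c2 = Conv2D(filters=16, kernel_size=(5, 5), activation='sigmoid')
        self.p2 = MaxPool2D(pool_size=(2, 2), strides=2)
        # Classifier: three fully connected layers
        self.flatten = Flatten()
        self.f1 = Dense(120, activation='sigmoid')
        self.f2 = Dense(84, activation='sigmoid')
        self.f3 = Dense(10, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.p1(x)
        x = self.c2(x)
        x = self.p2(x)
        x = self.flatten(x)
        x = self.f1(x)
        x = self.f2(x)
        y = self.f3(x)
        return y


# x_train, y_train, x_test and y_test are assumed to be loaded beforehand;
# the article does not show the data pipeline.
model = LeNet5()
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

# Resume from an earlier checkpoint if one exists. The '.ckpt' / '.index' naming
# is the TF 2.x checkpoint format; Keras 3 expects a '.weights.h5' suffix instead.
checkpoint_save_path = "./checkpoint/LeNet5.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

# Save weights only, and only when the monitored validation loss improves
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)
history = model.fit(x_train, y_train, batch_size=32, epochs=5,
                    validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()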
- AlexNet
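AlexNet appeared in 2012. It resembles LeNet, but the network is much larger. Its architecture is sketched in Figure 3.
Compared with LeNet, AlexNet increases the number of convolutional layers to five. In the implementation here, each of the first two convolutional layers is followed by BatchNormalization() (the original paper used local response normalization), and the activation function changes from sigmoid to ReLU. Dropout is applied after each of the first two fully connected layers. Figure 4 shows the structure of each layer and the parameters set for it.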
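One common motivation for moving from sigmoid to ReLU is the size of the derivative that backpropagation multiplies through each layer. The sketch below compares the two in plain Python. It is a scalar illustration only: it ignores the weight matrices, which also enter the real gradient, and the function names are chosen here for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    """Derivative of the sigmoid; it peaks at 0.25 when z == 0."""
    s = sigmoid(z)
    return s * (1.0 - s)


def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0


for z in (0.0, 2.0, 5.0):
    print(f"z={z}: sigmoid'={sigmoid_grad(z):.4f}, relu'={relu_grad(z):.1f}")

# Best case for sigmoid: every one of 8 layers sits at z == 0.
depth = 8
best_case_sigmoid = sigmoid_grad(0.0) ** depth
relu_active = relu_grad(1.0) ** depth
print("sigmoid factor over 8 layers:", best_case_sigmoid)
print("ReLU factor over 8 active layers:", relu_active)
```

Even in the most favorable case the sigmoid factor shrinks by a factor of four per layer, whereas an active ReLU unit passes the gradient through unchanged.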
The code is as follows:
import os
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import (Conv2D, BatchNormalization, Activation,
                                     MaxPool2D, Flatten, Dense, Dropout)


class AlexNet8(Model):
    def __init__(self):
        super(AlexNet8, self).__init__()
        # Conv layer 1: conv -> BN -> ReLU -> max pooling
        self.c1 = Conv2D(filters=96, kernel_size=(3, 3))
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.p1 = MaxPool2D(pool_size=(3, 3), strides=2)
        # Conv layer 2: conv -> BN -> ReLU -> max pooling
        self.c2 = Conv2D(filters=256, kernel_size=(3, 3))
        self.b2 = BatchNormalization()
        self.a2 = Activation('relu')
        self.p2 = MaxPool2D(pool_size=(3, 3), strides=2)
        # Conv layers 3-5: 'same' padding, ReLU built into the layer
        self.c3 = Conv2D(filters=384, kernel_size=(3, 3), padding='same', activation='relu')
        self.c4 = Conv2D(filters=384, kernel_size=(3, 3), padding='same', activation='relu')
        self.c5 = Conv2D(filters=256, kernel_size=(3, 3), padding='same', activation='relu')
        self.p3 = MaxPool2D(pool_size=(3, 3), strides=2)
        # Classifier: two dense layers with dropout, then the output layer
        self.flatten = Flatten()
        self.f1 = Dense(2048, activation='relu')
        self.d1 = Dropout(0.5)
        self.f2 = Dense(2048, activation='relu')
        self.d2 = Dropout(0.5)
        self.f3 = Dense(10, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.p1(x)

        x = self.c2(x)
        x = self.b2(x)
        x = self.a2(x)
        x = self.p2(x)

        x = self.c3(x)
        x = self.c4(x)
        x = self.c5(x)
        x = self.p3(x)

        x = self.flatten(x)
        x = self.f1(x)
        x = self.d1(x)
        x = self.f2(x)
        x = self.d2(x)
        y = self.f3(x)
        return y


# x_train, y_train, x_test and y_test are assumed to be loaded beforehand.
model = AlexNet8()
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

# Resume from an earlier checkpoint if one exists
checkpoint_save_path = "./checkpoint/AlexNet8.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)
history = model.fit(x_train, y_train, batch_size=32, epochs=5,
                    validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()
- VGGNet16
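The biggest improvement in VGGNet16 is depth: it goes from AlexNet's 8 layers in total to 16 (13 convolutional layers plus 3 fully connected layers), which gives the network greater representational power. VGGNet uses only small 3*3 kernels; in practice, stacks of such small kernels have proven to work better than large kernels.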
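A standard argument for small kernels is that a stack of 3*3 convolutions covers the same receptive field as one larger kernel while using fewer weights and adding extra nonlinearities between the layers. The sketch below works through the arithmetic. It assumes stride-1 convolutions whose input and output both have the same number of channels (64 is an arbitrary choice), and the helper names are invented for this example.

```python
def stacked_receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convolutions."""
    return 1 + layers * (kernel - 1)


def stacked_weights(kernel, layers, channels):
    """Weight count (biases ignored) when input and output both have `channels` maps."""
    return layers * kernel * kernel * channels * channels


channels = 64   # assumed channel count, chosen only for illustration
for layers, big_kernel in ((2, 5), (3, 7)):
    rf = stacked_receptive_field(3, layers)
    small = stacked_weights(3, layers, channels)
    large = stacked_weights(big_kernel, 1, channels)
    print(f"{layers} x 3*3: receptive field {rf}*{rf}, {small} weights; "
          f"one {big_kernel}*{big_kernel}: {large} weights")
```

Two stacked 3*3 layers see a 5*5 region with 18C² weights instead of 25C², and three see a 7*7 region with 27C² instead of 49C², where C is the channel count.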
The architecture of VGGNet16 is shown in the figure below:
The code is as follows:
import os
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import (Conv2D, BatchNormalization, Activation,
                                     MaxPool2D, Flatten, Dense, Dropout)


class VGG16(Model):
    def __init__(self):
        super(VGG16, self).__init__()
        # Block 1: two 64-kernel conv layers (conv -> BN -> ReLU), pooling, dropout
        self.c1 = Conv2D(filters=64, kernel_size=(3, 3), padding='same')
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.c2 = Conv2D(filters=64, kernel_size=(3, 3), padding='same')
        self.b2 = BatchNormalization()
        self.a2 = Activation('relu')
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d1 = Dropout(0.2)

        # Block 2: two 128-kernel conv layers
        self.c3 = Conv2D(filters=128, kernel_size=(3, 3), padding='same')
        self.b3 = BatchNormalization()
        self.a3 = Activation('relu')
        self.c4 = Conv2D(filters=128, kernel_size=(3, 3), padding='same')
        self.b4 = BatchNormalization()
        self.a4 = Activation('relu')
        self.p2 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d2 = Dropout(0.2)

        # Block 3: three 256-kernel conv layers
        self.c5 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
        self.b5 = BatchNormalization()
        self.a5 = Activation('relu')
        self.c6 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
        self.b6 = BatchNormalization()
        self.a6 = Activation('relu')
        self.c7 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
        self.b7 = BatchNormalization()
        self.a7 = Activation('relu')
        self.p3 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d3 = Dropout(0.2)

        # Block 4: three 512-kernel conv layers
        self.c8 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b8 = BatchNormalization()
        self.a8 = Activation('relu')
        self.c9 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b9 = BatchNormalization()
        self.a9 = Activation('relu')
        self.c10 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b10 = BatchNormalization()
        self.a10 = Activation('relu')
        self.p4 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d4 = Dropout(0.2)

        # Block 5: three 512-kernel conv layers
        self.c11 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b11 = BatchNormalization()
        self.a11 = Activation('relu')
        self.c12 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b12 = BatchNormalization()
        self.a12 = Activation('relu')
        self.c13 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b13 = BatchNormalization()
        self.a13 = Activation('relu')
        self.p5 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d5 = Dropout(0.2)

        # Classifier: two dense layers with dropout, then the output layer
        self.flatten = Flatten()
        self.f1 = Dense(512, activation='relu')
        self.d6 = Dropout(0.2)
        self.f2 = Dense(512, activation='relu')
        self.d7 = Dropout(0.2)
        self.f3 = Dense(10, activation='softmax')

    def call(self, x):
        # Block 1
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.c2(x)
        x = self.b2(x)
        x = self.a2(x)
        x = self.p1(x)
        x = self.d1(x)

        # Block 2
        x = self.c3(x)
        x = self.b3(x)
        x = self.a3(x)
        x = self.c4(x)
        x = self.b4(x)
        x = self.a4(x)
        x = self.p2(x)
        x = self.d2(x)

        # Block 3
        x = self.c5(x)
        x = self.b5(x)
        x = self.a5(x)
        x = self.c6(x)
        x = self.b6(x)
        x = self.a6(x)
        x = self.c7(x)
        x = self.b7(x)
        x = self.a7(x)
        x = self.p3(x)
        x = self.d3(x)

        # Block 4
        x = self.c8(x)
        x = self.b8(x)
        x = self.a8(x)
        x = self.c9(x)
        x = self.b9(x)
        x = self.a9(x)
        x = self.c10(x)
        x = self.b10(x)
        x = self.a10(x)
        x = self.p4(x)
        x = self.d4(x)

        # Block 5
        x = self.c11(x)
        x = self.b11(x)
        x = self.a11(x)
        x = self.c12(x)
        x = self.b12(x)
        x = self.a12(x)
        x = self.c13(x)
        x = self.b13(x)
        x = self.a13(x)
        x = self.p5(x)
        x = self.d5(x)

        # Classifier
        x = self.flatten(x)
        x = self.f1(x)
        x = self.d6(x)
        x = self.f2(x)
        x = self.d7(x)
        y = self.f3(x)
        return y


# x_train, y_train, x_test and y_test are assumed to be loaded beforehand.
model = VGG16()
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

# Resume from an earlier checkpoint if one exists
checkpoint_save_path = "./checkpoint/VGG16.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)
history = model.fit(x_train, y_train, batch_size=32, epochs=5,
                    validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()
- InceptionNet
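InceptionNet (GoogLeNet) was introduced in 2014 and published in 2015. It improves the network's capability by increasing its width. Compared with VGGNet, which stacks convolutional layers (the vertical direction), this is a different direction (horizontal). The figure below shows the basic InceptionNet unit, which can be thought of as a single convolutional layer of the network.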
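Two classes can be defined here. The first is a standardized convolutional layer, with the following code:
import os
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import (Conv2D, BatchNormalization, Activation,
                                     MaxPool2D, GlobalAveragePooling2D, Dense)


class ConvBNRelu(Model):
    def __init__(self, ch, kernelsz=3, strides=1, padding='same'):
        super(ConvBNRelu, self).__init__()
        self.model = tf.keras.models.Sequential([
            Conv2D(ch, kernelsz, strides=strides, padding=padding),
            BatchNormalization(),
            Activation('relu')
        ])

    def call(self, x, training=None):
        # With training=True, BN normalizes with the current batch's mean and variance
        # and updates its moving averages. With training=False, it normalizes with the
        # moving averages accumulated during training, which works better for inference.
        # The flag is passed through instead of being hard-coded to False; otherwise
        # the moving averages would never be updated.
        x = self.model(x, training=training)
        return x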
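The difference between the two batch-normalization modes mentioned in the comment can be seen in a small scalar model. The sketch below is a simplified stand-in, not the Keras layer: it omits the learnable scale and shift, works on a list of numbers rather than tensors, and the class name `ToyBatchNorm` is hypothetical. The default momentum of 0.99 and epsilon of 1e-3 mirror the Keras defaults.

```python
import math


class ToyBatchNorm:
    """Scalar batch normalization without the learnable scale and shift."""

    def __init__(self, momentum=0.99, eps=1e-3):
        self.momentum = momentum
        self.eps = eps
        self.moving_mean = 0.0   # starting values of a freshly built layer
        self.moving_var = 1.0

    def __call__(self, batch, training):
        if training:
            # Use the statistics of the current batch and update the moving averages
            mean = sum(batch) / len(batch)
            var = sum((v - mean) ** 2 for v in batch) / len(batch)
            m = self.momentum
            self.moving_mean = m * self.moving_mean + (1 - m) * mean
            self.moving_var = m * self.moving_var + (1 - m) * var
        else:
            # Use the moving averages accumulated so far
            mean, var = self.moving_mean, self.moving_var
        return [(v - mean) / math.sqrt(var + self.eps) for v in batch]


bn = ToyBatchNorm()
batch = [1.0, 2.0, 3.0]
train_out = bn(batch, training=True)    # normalized with the batch's own statistics
infer_out = bn(batch, training=False)   # normalized with the moving averages
print("training mode: ", train_out)
print("inference mode:", infer_out)
print("moving mean/var:", bn.moving_mean, bn.moving_var)
```

In training mode the output is centered on zero. In inference mode, after only one update, the moving averages are still close to their starting values, so the same batch is barely shifted. A layer that is only ever called with training=False never leaves those starting values.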
With the standardized convolutional layer (ConvBNRelu) defined, the basic InceptionNet unit can now be defined. Its code is as follows:
class InceptionBlk(Model):
    def __init__(self, ch, strides=1):
        super(InceptionBlk, self).__init__()
        self.ch = ch
        self.strides = strides
        # Branch 1 (leftmost in the figure): 1*1 convolution
        self.c1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        # Branch 2: 1*1 convolution (yellow in the figure), then 3*3 convolution (blue)
        self.c2_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        self.c2_2 = ConvBNRelu(ch, kernelsz=3, strides=1)
        # Branch 3: 1*1 convolution (yellow), then 5*5 convolution (blue)
        self.c3_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        self.c3_2 = ConvBNRelu(ch, kernelsz=5, strides=1)
        # Branch 4: 3*3 max pooling (red), then 1*1 convolution (yellow)
        self.p4_1 = MaxPool2D(3, strides=1, padding='same')
        self.c4_2 = ConvBNRelu(ch, kernelsz=1, strides=strides)

    def call(self, x):
        x1 = self.c1(x)
        x2_1 = self.c2_1(x)
        x2_2 = self.c2_2(x2_1)
        x3_1 = self.c3_1(x)
        x3_2 = self.c3_2(x3_1)
        x4_1 = self.p4_1(x)
        x4_2 = self.c4_2(x4_1)
        # Concatenate the four branches along the channel axis
        x = tf.concat([x1, x2_2, x3_2, x4_2], axis=3)
        return x
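Next, build an InceptionNet from two blocks, each containing two basic InceptionBlk units. Within a block the two units use different stride settings: in the code, the first unit uses stride=2 and the second uses stride=1. Each block therefore halves the feature-map size, and the number of kernels is doubled for the next block to compensate. The architecture is shown in the figure below.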
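The effect of the strides and the channel doubling can be traced without building the model. The sketch below follows the block loop of Inception10 in plain Python. It assumes a 32*32 input, 'same' padding (so a stride of 2 halves the size, rounding up), and that each InceptionBlk concatenates four branches of `ch` channels each. The function name `inception10_plan` is made up for this example.

```python
import math


def inception10_plan(input_size=32, num_blocks=2, init_ch=16):
    """Return (block_id, layer_id, stride, branch_ch, out_size, out_channels) per unit."""
    size, ch = input_size, init_ch
    plan = []
    for block_id in range(num_blocks):
        for layer_id in range(2):
            stride = 2 if layer_id == 0 else 1   # first unit of each block downsamples
            size = math.ceil(size / stride)      # 'same' padding
            out_channels = 4 * ch                # four branches concatenated
            plan.append((block_id, layer_id, stride, ch, size, out_channels))
        ch *= 2                                  # double the kernels for the next block
    return plan


for row in inception10_plan():
    print("block {} layer {}: stride={} branch_ch={} -> {}x{} with {} channels"
          .format(row[0], row[1], row[2], row[3], row[4], row[4], row[5]))
```

With these assumptions the feature map goes from 32*32 to 16*16 after the first block and 8*8 after the second, while the concatenated output grows from 64 to 128 channels.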
The implementation is as follows:
class Inception10(Model):
    def __init__(self, num_blocks, num_classes, init_ch=16, **kwargs):
        super(Inception10, self).__init__(**kwargs)
        self.in_channels = init_ch
        self.out_channels = init_ch
        self.num_blocks = num_blocks
        self.init_ch = init_ch
        # Stem: one 3*3 conv -> BN -> ReLU layer
        self.c1 = ConvBNRelu(init_ch)
        self.blocks = tf.keras.models.Sequential()
        for block_id in range(num_blocks):
            for layer_id in range(2):
                if layer_id == 0:
                    # First unit of the block halves the feature-map size
                    block = InceptionBlk(self.out_channels, strides=2)
                else:
                    block = InceptionBlk(self.out_channels, strides=1)
                self.blocks.add(block)
            # Double the number of kernels for the next block
            self.out_channels *= 2
        self.p1 = GlobalAveragePooling2D()
        self.f1 = Dense(num_classes, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.blocks(x)
        x = self.p1(x)
        y = self.f1(x)
        return y


# x_train, y_train, x_test and y_test are assumed to be loaded beforehand.
model = Inception10(num_blocks=2, num_classes=10)
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

# Resume from an earlier checkpoint if one exists
checkpoint_save_path = "./checkpoint/Inception10.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)
history = model.fit(x_train, y_train, batch_size=32, epochs=5,
                    validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()
InceptionNet can also be implemented without classes, using plain functions. The code is as follows:
import tensorflow as tf
from tensorflow.keras import layers


# Define one Inception module
def InceptionModule(inputs):
    # Branch 1: 1*1 convolution
    branch1 = layers.Conv2D(64, (1, 1), padding='same', activation='relu')(inputs)
    # Branch 2: 1*1 convolution, then 3*3 convolution
    branch2 = layers.Conv2D(64, (1, 1), padding='same', activation='relu')(inputs)
    branch2 = layers.Conv2D(96, (3, 3), padding='same', activation='relu')(branch2)
    # Branch 3: 1*1 convolution, then 5*5 convolution
    branch3 = layers.Conv2D(64, (1, 1), padding='same', activation='relu')(inputs)
    branch3 = layers.Conv2D(64, (5, 5), padding='same', activation='relu')(branch3)
    # Branch 4: 3*3 max pooling, then 1*1 convolution
    branch4 = layers.MaxPooling2D((3, 3), strides=(1, 1), padding='same')(inputs)
    branch4 = layers.Conv2D(32, (1, 1), padding='same', activation='relu')(branch4)
    # Merge the four branches along the channel axis
    outputs = layers.concatenate([branch1, branch2, branch3, branch4], axis=-1)
    return outputs


# Define an InceptionNet
def InceptionNet(input_shape, num_classes):
    # Input layer
    inputs = layers.Input(shape=input_shape)
    # Stage 1
    x = layers.Conv2D(64, (7, 7), strides=(2, 2), padding='same', activation='relu')(inputs)
    x = layers.MaxPooling2D((3, 3), strides=(2, 2), padding='same')(x)
    # Stage 2
    x = layers.Conv2D(64, (1, 1), padding='same', activation='relu')(x)
    x = layers.Conv2D(192, (3, 3), padding='same', activation='relu')(x)
    x = layers.MaxPooling2D((3, 3), strides=(2, 2), padding='same')(x)
    # Stage 3: three Inception modules
    x = InceptionModule(x)
    x = InceptionModule(x)
    x = InceptionModule(x)
    # Global average pooling
    x = layers.GlobalAveragePooling2D()(x)
    # Output layer
    outputs = layers.Dense(num_classes, activation='softmax')(x)
    # Create the model
    model = tf.keras.Model(inputs=inputs, outputs=outputs)
    return model
- ResNet
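One problem the classic networks above run into is that test accuracy stops improving as more and more layers are added, which is attributed mainly to vanishing gradients. The core idea of ResNet is to add skip connections across layers so that the layers learn a residual: earlier information is carried forward and the vanishing-gradient problem is reduced, which makes much deeper networks trainable. Its structure is sketched below.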
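Why a skip connection helps can be seen from the derivative: for y = x + F(x), dy/dx = 1 + dF/dx, so the gradient always has a direct path even when dF/dx is small. The sketch below is a scalar toy, not a real network: it assumes every layer has the same small local derivative (0.1, an arbitrary value) and simply multiplies the factors over 20 layers. The function name is chosen for this example.

```python
def chain_gradient(local_grads, residual=False):
    """Multiply per-layer derivatives; a skip connection adds 1 to each factor."""
    total = 1.0
    for g in local_grads:
        total *= (1.0 + g) if residual else g
    return total


local = [0.1] * 20   # assumed: each layer's own derivative dF/dx is small
plain = chain_gradient(local)
skip = chain_gradient(local, residual=True)
print(f"plain stack:    {plain:.3e}")
print(f"with skip path: {skip:.3e}")
```

In the plain stack the product collapses toward zero, while with the skip path each factor stays above 1, so the signal reaching the early layers does not vanish.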
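The figure above contains both solid and dashed lines. A solid line means the input and the output of the block already have the same dimensions and can be added directly. A dashed line means the dimensions differ, so the input first passes through a 1*1 convolution to make them match. The computation steps are shown in the figure below.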
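Which blocks need the dashed-line (projection) shortcut can be worked out from the layout alone. The sketch below does this in plain Python for a ResNet18-style layout with two residual blocks in each of four groups. It assumes a 32*32 input, a stride-1 stem with 64 kernels, 'same' padding, and that the first block of every group except the first uses stride 2 while the kernel count doubles per group. The function name `resnet_plan` is hypothetical.

```python
import math


def resnet_plan(block_list, input_size=32, initial_filters=64):
    """Return (group, layer, filters, stride, needs_projection, out_size) per residual block."""
    size, filters, in_ch = input_size, initial_filters, initial_filters
    plan = []
    for block_id, n_layers in enumerate(block_list):
        for layer_id in range(n_layers):
            stride = 2 if (block_id != 0 and layer_id == 0) else 1
            # The shortcut can be added directly only if neither the spatial size
            # nor the channel count changes inside the block.
            needs_projection = stride != 1 or in_ch != filters
            size = math.ceil(size / stride)   # 'same' padding
            plan.append((block_id, layer_id, filters, stride, needs_projection, size))
            in_ch = filters
        filters *= 2
    return plan


for group, layer, filters, stride, proj, size in resnet_plan([2, 2, 2, 2]):
    line = "dashed (1*1 conv)" if proj else "solid (identity)"
    print(f"group {group} block {layer}: {filters} kernels, stride {stride}, "
          f"{size}x{size}, shortcut: {line}")
```

Only the first block of groups 2 to 4 changes the dimensions, so only those three blocks need the 1*1 convolution on the shortcut.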
The ResNet block code takes a parameter, residual_path, that indicates whether the dimensions of the input and the block output differ (True means the shortcut needs a 1*1 convolution). The code is as follows:
import os
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation


class ResnetBlock(Model):
    def __init__(self, filters, strides=1, residual_path=False):
        super(ResnetBlock, self).__init__()
        self.filters = filters
        self.strides = strides
        self.residual_path = residual_path
        self.c1 = Conv2D(filters, (3, 3), strides=strides, padding='same', use_bias=False)
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.c2 = Conv2D(filters, (3, 3), strides=1, padding='same', use_bias=False)
        self.b2 = BatchNormalization()
        # When residual_path is True, downsample the input with a 1*1 convolution
        # so that x has the same dimensions as F(x) and the two can be added.
        if residual_path:
            self.down_c1 = Conv2D(filters, (1, 1), strides=strides, padding='same', use_bias=False)
            self.down_b1 = BatchNormalization()
        self.a2 = Activation('relu')

    def call(self, inputs):
        residual = inputs  # the shortcut starts as the input itself: residual = x
        # Pass the input through conv, BN and activation layers to compute F(x)
        x = self.c1(inputs)
        x = self.b1(x)
        x = self.a1(x)
        x = self.c2(x)
        y = self.b2(x)
        if self.residual_path:
            residual = self.down_c1(inputs)
            residual = self.down_b1(residual)
        # The output is the sum of both parts, F(x)+x or F(x)+Wx, followed by the activation
        out = self.a2(y + residual)
        return out
Finally, ResnetBlock can be used to build the ResNet architecture. The code is as follows:
class ResNet18(Model):
    def __init__(self, block_list, initial_filters=64):
        # block_list gives the number of ResnetBlock units in each group
        super(ResNet18, self).__init__()
        self.num_blocks = len(block_list)  # number of groups
        self.block_list = block_list
        self.out_filters = initial_filters
        self.c1 = Conv2D(self.out_filters, (3, 3), strides=1, padding='same', use_bias=False)
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.blocks = tf.keras.models.Sequential()
        # Build the ResNet structure
        for block_id in range(len(block_list)):  # index of the group
            for layer_id in range(block_list[block_id]):  # index of the unit in the group
                if block_id != 0 and layer_id == 0:
                    # Downsample at the input of every group except the first
                    block = ResnetBlock(self.out_filters, strides=2, residual_path=True)
                else:
                    block = ResnetBlock(self.out_filters, residual_path=False)
                self.blocks.add(block)  # append the unit to the network
            self.out_filters *= 2  # the next group uses twice as many kernels
        self.p1 = tf.keras.layers.GlobalAveragePooling2D()
        self.f1 = tf.keras.layers.Dense(10, activation='softmax',
                                        kernel_regularizer=tf.keras.regularizers.l2())

    def call(self, inputs):
        x = self.c1(inputs)
        x = self.b1(x)
        x = self.a1(x)
        x = self.blocks(x)
        x = self.p1(x)
        y = self.f1(x)
        return y


# x_train, y_train, x_test and y_test are assumed to be loaded beforehand.
model = ResNet18([2, 2, 2, 2])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

# Resume from an earlier checkpoint if one exists
checkpoint_save_path = "./checkpoint/ResNet18.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)
history = model.fit(x_train, y_train, batch_size=32, epochs=5,
                    validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()
An implementation without classes is as follows:
import tensorflow as tf
from tensorflow.keras import layers


# Define one residual block
def ResidualBlock(x, filters, downsample=False):
    shortcut = x
    stride = (1, 1)
    # If downsampling is needed, project the shortcut with a strided 1*1 convolution
    if downsample:
        stride = (2, 2)
        shortcut = layers.Conv2D(filters, (1, 1), strides=stride, padding='same')(shortcut)
        shortcut = layers.BatchNormalization()(shortcut)
    # Main branch
    x = layers.Conv2D(filters, (3, 3), strides=stride, padding='same')(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.Conv2D(filters, (3, 3), strides=(1, 1), padding='same')(x)
    x = layers.BatchNormalization()(x)
    # Add the main-branch output to the shortcut
    x = layers.add([x, shortcut])
    x = layers.Activation('relu')(x)
    return x


# Define a ResNet model
def ResNet(input_shape, num_classes):
    inputs = layers.Input(shape=input_shape)
    # Stem
    x = layers.ZeroPadding2D(padding=(3, 3))(inputs)
    x = layers.Conv2D(64, (7, 7), strides=(2, 2))(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.MaxPooling2D((3, 3), strides=(2, 2))(x)
    # Residual group 1
    x = ResidualBlock(x, filters=64)
    x = ResidualBlock(x, filters=64)
    # Residual group 2
    x = ResidualBlock(x, filters=128, downsample=True)
    x = ResidualBlock(x, filters=128)
    # Residual group 3
    x = ResidualBlock(x, filters=256, downsample=True)
    x = ResidualBlock(x, filters=256)
    # Residual group 4
    x = ResidualBlock(x, filters=512, downsample=True)
    x = ResidualBlock(x, filters=512)
    # Global average pooling
    x = layers.GlobalAveragePooling2D()(x)
    # Output layer
    outputs = layers.Dense(num_classes, activation='softmax')(x)
    # Create the model
    model = tf.keras.Model(inputs=inputs, outputs=outputs)
    return model
Overall, ResNet greatly increased network depth, and in the 2015 ImageNet image-recognition challenge it brought the top-5 error rate down to 3.57%.