TensorFlow 2: Classifying CIFAR-10 with a Convolutional Neural Network

2020-10-02

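Background

Revisiting the basics to learn something new.

This post uses the CIFAR-10 dataset to check how a simple convolutional neural network performs on an image classification problem.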

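Introduction

Convolutional neural networks are well suited to image data. Trained on the MNIST handwritten-digit dataset, this model can reach 99% accuracy, but on CIFAR-10 it reaches only about 67-68% (67.0% on the test set in the run recorded under Results). Later posts will use more complex network architectures or transfer learning to improve the accuracy.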

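Dataset

CIFAR-10 contains 60,000 images in 10 classes: 50,000 for training and 10,000 for testing. Unlike MNIST, the images are in color. Each image has shape 32x32x3, where 3 is the number of R/G/B channels. The color of each pixel is determined by its three R/G/B values, each in the range 0-255.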
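
To make this layout concrete, here is a minimal sketch that uses a randomly generated NumPy array as a stand-in for a small batch of CIFAR-10 images, so it runs without downloading anything. The name `fake_batch` and the batch size of 4 are assumptions for illustration only; they are not part of the original script.

```python
import numpy as np

# Hypothetical stand-in for 4 CIFAR-10 images: random uint8 pixels, not real data.
rng = np.random.default_rng(0)
fake_batch = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

print(fake_batch.shape)  # (4, 32, 32, 3): (images, height, width, channels)
print(fake_batch.dtype)  # uint8, so every value lies in 0-255

# One pixel is an (R, G, B) triple taken along the last axis.
r, g, b = fake_batch[0, 0, 0]

# Each image holds 32 * 32 * 3 = 3072 values.
values_per_image = fake_batch[0].size

# Dividing by 255.0 maps the 0-255 range onto 0-1 and yields floats.
scaled = fake_batch / 255.0
print(scaled.min() >= 0.0, scaled.max() <= 1.0)  # True True
```

The division at the end is the same scaling the script applies to the real training and test arrays before training.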

A First Try

#!/usr/bin/python3

import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import layers, datasets, models

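# Step 1: prepare the dataset
#   Like MNIST, CIFAR-10 contains 60,000 images, here in 10 classes:
#   50,000 for training and 10,000 for testing.
#   Unlike MNIST, the images are in color, and each image has shape 32x32x3.
#   3 is the number of R/G/B channels; each pixel's color is given by its three R/G/B values, each in the range 0-255.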
(train_x, train_y), (test_x, test_y) = datasets.cifar10.load_data()

#   Take a look at some of the images
#   (they can also be viewed at https://geektutu.com/post/tf2doc-cnn-cifar10.html)
# plt.figure(figsize=(5, 3))
# plt.subplots_adjust(hspace=0.1)
# for n in range(15):
#     plt.subplot(3, 5, n+1)
#     plt.imshow(train_x[n])
#     plt.axis('off')
# _ = plt.suptitle("CIFAR-10 Example")

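# Step 2: preprocessing
#   The data must be preprocessed before training. Pixel values lie in 0-255 and are scaled to 0-1. The training and test sets must go through the same processing.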
train_x, test_x = train_x / 255.0, test_x / 255.0
# print('train_x shape:', train_x.shape, 'test_x shape:', test_x.shape)
# (50000, 32, 32, 3), (10000, 32, 32, 3)

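# Step 3: build the model
#   Convolutional layers
#     For each image, the CNN takes a 3-D tensor (image_height, image_width, color_channels), given as input_shape.
#     Each convolutional layer is built with tf.keras.layers.Conv2D.
#       Conv2D is given two positional arguments here: the first is the number of kernels (filters), the second is the kernel size.
#     The 1st and 2nd convolutional layers are each followed by a max pooling layer (MaxPooling2D). Max pooling takes the maximum value of an image region as that region's pooled value;
#       another common pooling operation is average pooling, which takes the mean of the region instead.
#     After each convolution or pooling step, the width and height of the image shrink (Conv2D uses no padding by default).
#       If the image height is h and the kernel size is m, the height after convolution is h1 = h - m + 1.
#       If the height before pooling is h1 and the pooling window size is s, the height after pooling is h1 / s rounded down (e.g. 13 -> 6).
#       In the model.summary() output, the (32, 32) input becomes (30, 30) after 32 kernels of size 3x3, and then (15, 15) after pooling.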

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

# model.summary()
# Model: "sequential"
# _________________________________________________________________
# Layer (type)                 Output Shape              Param #
# =================================================================
# conv2d (Conv2D)              (None, 30, 30, 32)        896
# _________________________________________________________________
# max_pooling2d (MaxPooling2D) (None, 15, 15, 32)        0
# _________________________________________________________________
# conv2d_1 (Conv2D)            (None, 13, 13, 64)        18496
# _________________________________________________________________
# max_pooling2d_1 (MaxPooling2 (None, 6, 6, 64)          0
# _________________________________________________________________
# conv2d_2 (Conv2D)            (None, 4, 4, 64)          36928
# =================================================================
# Total params: 56,320
# Trainable params: 56,320
# Non-trainable params: 0
# _________________________________________________________________

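#   Fully connected layers
#     The goal is to classify images, i.e. to output a 1-D vector of length 10 whose k-th value is the probability that the input image belongs to class k.
#     The 3-D output of the convolutional layers therefore has to be flattened to 1-D before it is fed to the Dense (fully connected) layers.
#     tf.keras.layers.Flatten() does the flattening.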
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

# model.summary()
# Model: "sequential"
# _________________________________________________________________
# Layer (type)                 Output Shape              Param #
# =================================================================
# conv2d (Conv2D)              (None, 30, 30, 32)        896
# _________________________________________________________________
# max_pooling2d (MaxPooling2D) (None, 15, 15, 32)        0
# _________________________________________________________________
# conv2d_1 (Conv2D)            (None, 13, 13, 64)        18496
# _________________________________________________________________
# max_pooling2d_1 (MaxPooling2 (None, 6, 6, 64)          0
# _________________________________________________________________
# conv2d_2 (Conv2D)            (None, 4, 4, 64)          36928
# _________________________________________________________________
# flatten (Flatten)            (None, 1024)              0
# _________________________________________________________________
# dense (Dense)                (None, 64)                65600
# _________________________________________________________________
# dense_1 (Dense)              (None, 10)                650
# =================================================================
# Total params: 122,570
# Trainable params: 122,570
# Non-trainable params: 0
# _________________________________________________________________

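# Step 4: compile the model
#   Before the model can be trained, a few settings have to be supplied when it is compiled.
#   Loss function - measures how far the predictions are from the labels during training; training minimizes it to steer the model in the right direction.
#   Optimizer - the algorithm used to update the model's parameters.
#   Metrics - used to monitor the training and testing steps; this example uses accuracy, the fraction of images classified correctly.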
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

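# Step 5: train the model
#   Training a neural network usually involves the following steps.
#     Feed in the training data, train_x and train_y.
#     The model learns to associate images with labels.
#     The model then predicts on the test set test_x, and the predictions are checked against test_y (done in step 6).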
model.fit(train_x, train_y, epochs=5)

# Step 6: evaluate accuracy
#   How does the model do on the test set?
test_loss, test_acc = model.evaluate(test_x, test_y)
print('\nTest accuracy:', test_acc)
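
The shape and parameter figures quoted in the model.summary() comments can be checked by hand. The sketch below is plain Python, not TensorFlow; the helper names are hypothetical. It assumes what the script relies on implicitly: no padding and stride 1 for the convolutions, and a pooling stride equal to the pooling window.

```python
def conv_out(size, kernel):
    """Side length after an unpadded, stride-1 convolution: h - m + 1."""
    return size - kernel + 1

def pool_out(size, window):
    """Side length after pooling with stride == window: h1 / s rounded down."""
    return size // window

def conv_params(kernel, in_channels, filters):
    """kernel * kernel * in_channels weights plus one bias, per filter."""
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(inputs, units):
    """One weight per input plus one bias, per unit."""
    return (inputs + 1) * units

size, channels = 32, 3
sizes, total = [], 0
# (filters, followed_by_pooling) for the three Conv2D layers in the script
for filters, pooled in [(32, True), (64, True), (64, False)]:
    total += conv_params(3, channels, filters)
    size = conv_out(size, 3)
    sizes.append(size)
    if pooled:
        size = pool_out(size, 2)
        sizes.append(size)
    channels = filters

flat = size * size * channels  # length of the Flatten output
total += dense_params(flat, 64) + dense_params(64, 10)

print(sizes)  # [30, 15, 13, 6, 4]
print(flat)   # 1024
print(total)  # 122570
```

The printed values match the Output Shape column and the "Total params" line of the second summary.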

Results

Epoch 1/5
1563/1563 [==============================] - 29s 19ms/step - loss: 1.5245 - accuracy: 0.4427
Epoch 2/5
1563/1563 [==============================] - 32s 21ms/step - loss: 1.1699 - accuracy: 0.5871
Epoch 3/5
1563/1563 [==============================] - 31s 20ms/step - loss: 1.0131 - accuracy: 0.6448
Epoch 4/5
1563/1563 [==============================] - 31s 20ms/step - loss: 0.9127 - accuracy: 0.6810
Epoch 5/5
1563/1563 [==============================] - 29s 19ms/step - loss: 0.8376 - accuracy: 0.7058
313/313 [==============================] - 1s 4ms/step - loss: 0.9497 - accuracy: 0.6701

Test accuracy: 0.6700999736785889
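
The loss and accuracy columns in the log above can be reproduced by hand on a toy batch. The sketch below is a plain-Python approximation of what the sparse_categorical_crossentropy loss and the accuracy metric compute. The probabilities and labels are made-up values, the helper names are hypothetical rather than Keras functions, and details such as Keras's numerical clipping are left out.

```python
import math

def sparse_categorical_crossentropy(probs, labels):
    """Mean of -log(probability the model assigned to the true class)."""
    return sum(-math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class equals the label."""
    hits = sum(1 for p, y in zip(probs, labels) if p.index(max(p)) == y)
    return hits / len(labels)

# Made-up softmax outputs for 3 images over 3 classes (each row sums to 1).
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
]
labels = [0, 2, 1]  # integer class ids, as in train_y / test_y

loss = sparse_categorical_crossentropy(probs, labels)
acc = accuracy(probs, labels)
print(round(loss, 4))  # 0.5946
print(acc)             # 2 of 3 predictions are correct
```

The third sample shows why the two numbers can move differently: it counts as a miss for accuracy, yet the 0.4 assigned to the true class still keeps its contribution to the loss moderate.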

References

TensorFlow 2 Chinese documentation - Classifying CIFAR-10 with a convolutional neural network