用于 Fashion-MNIST 服装分类的深度学习 CNN

作者 Jason Brownlee 于 2020年8月28日发布在深度学习计算机视觉 71

Fashion-MNIST服装分类问题是计算机视觉和深度学习中一个新的标准数据集。

尽管该数据集相对简单，但它可以作为学习和练习如何从头开始开发、评估和使用深度卷积神经网络进行图像分类的基础。这包括如何开发一个健壮的测试工具来估计模型的性能，如何探索模型的改进，以及如何保存模型并在之后加载它以对新数据进行预测。

在本教程中，您将发现如何从头开始开发一个卷积神经网络来进行服装分类。

完成本教程后，您将了解：

如何开发一个测试工具，以对模型进行稳健评估，并为分类任务建立性能基线。
如何探索基线模型的扩展以改进学习和模型容量。
如何开发一个最终模型，评估最终模型的性能，并使用它对新图像进行预测。

开始您的项目，阅读我的新书深度学习计算机视觉，其中包含分步教程和所有示例的Python源代码文件。

让我们开始吧。

更新于2019年6月：修复了模型定义在CV循环之外的细微错误。更新了结果（感谢Aditya）。
2019 年 10 月更新：更新至 Keras 2.3 和 TensorFlow 2.0。

How to Develop a Deep Convolutional Neural Network From Scratch for Fashion MNIST Clothing Classification

如何从头开始为Fashion MNIST服装分类开发深度卷积神经网络
图片来自 Zdrovit Skurcz，保留部分权利。

教程概述

本教程分为五个部分；它们是：

Fashion MNIST 服装分类
模型评估方法
如何开发基线模型
如何开发改进模型
如何完成模型并进行预测

想通过深度学习实现计算机视觉成果吗？

立即参加我为期7天的免费电子邮件速成课程（附示例代码）。

点击注册，同时获得该课程的免费PDF电子书版本。

Fashion MNIST 服装分类

相比于MNIST数据集，Fashion-MNIST数据集被提议为一个更具挑战性的替代数据集。

它包含60,000个小的、28x28像素的灰度图像，描绘了10种类型的服装，如鞋子、T恤、连衣裙等。所有0-9的整数到类别标签的映射如下所示。

0: T恤/上衣
1: 裤子
2: 套衫
3: 连衣裙
4: 外套
5: 凉鞋
6: 衬衫
7: 运动鞋
8: 包
9: 踝靴

它比MNIST更具挑战性，使用深度学习卷积神经网络在保留的测试数据集上可以达到大约90%到95%的分类准确率。

下面的示例使用Keras API加载Fashion-MNIST数据集，并绘制训练数据集中前九张图像的图。

# example of loading the fashion mnist dataset
from matplotlib import pyplot
from keras.datasets import fashion_mnist
# load dataset
(trainX, trainy), (testX, testy) = fashion_mnist.load_data()
# summarize loaded dataset
print('Train: X=%s, y=%s' % (trainX.shape, trainy.shape))
print('Test: X=%s, y=%s' % (testX.shape, testy.shape))
# plot first few images
for i in range(9):
	# define subplot
	pyplot.subplot(330 + 1 + i)
	# plot raw pixel data
	pyplot.imshow(trainX[i], cmap=pyplot.get_cmap('gray'))
# show the figure
pyplot.show()

# 加载fashion mnist数据集的示例

from matplotlib import pyplot

from keras.datasets import fashion_mnist

# 加载数据集

(trainX, trainy), (testX, testy) = fashion_mnist.load_data()

# 总结已加载的数据集

print('Train: X=%s, y=%s' % (trainX.shape, trainy.shape))

print('Test: X=%s, y=%s' % (testX.shape, testy.shape))

# 绘制前几张图片

for i in range(9):

# 定义子图

pyplot.subplot(330 + 1 + i)

# 绘制原始像素数据

pyplot.imshow(trainX[i], cmap=pyplot.get_cmap('gray'))

# 显示图

pyplot.show()

运行此示例将加载Fashion-MNIST训练集和测试集，并打印它们的形状。

我们可以看到训练数据集中有 60,000 个样本，测试数据集中有 10,000 个样本，并且图像确实是 28x28 像素的正方形。

Train: X=(60000, 28, 28), y=(60000,)
Test: X=(10000, 28, 28), y=(10000,)

1 2	训练集：X=(60000, 28, 28), y=(60000,) 测试集：X=(10000, 28, 28), y=(10000,)

还绘制了数据集中前九张图像的图，显示这些图像确实是服装的灰度照片。

Plot of a Subset of Images From the Fashion-MNIST Dataset

Fashion-MNIST 数据集图像子集图

模型评估方法

相比于被现代卷积神经网络“解决”的MNIST数据集的广泛使用，Fashion MNIST数据集是作为其响应而开发的。

Fashion-MNIST被提议作为MNIST的替代品，虽然它还没有被完全解决，但通常可以实现10%或更低的错误率。与MNIST一样，它可以作为开发和练习使用卷积神经网络解决图像分类方法论的有用起点。

我们可以从头开始开发一个新模型，而不是回顾关于数据集上表现良好的模型的文献。

该数据集已经有了定义明确的训练集和测试集供我们使用。

为了估计模型在给定训练运行中的性能，我们可以进一步将训练集划分为训练集和验证集。然后可以绘制每次运行在训练集和验证集上的性能，以提供学习曲线并深入了解模型如何学习问题。

Keras API通过在训练模型时向model.fit()函数指定“validation_data”参数来支持这一点，该参数将返回一个对象，该对象描述了在每个训练周期中模型在选定的损失和指标上的性能。

# record model performance on a validation dataset during training
history = model.fit(..., validation_data=(valX, valY))

1 2	# 在训练期间记录模型在验证集上的性能 history = model.fit(..., validation_data=(valX, valY))

为了普遍估计模型在该问题上的性能，我们可以使用k折交叉验证，可能使用5折交叉验证。这将考虑模型在训练集和测试集差异以及学习算法的随机性方面的方差。模型的性能可以取为k折的平均性能，并附带标准差，如果需要，可用于估计置信区间。

我们可以使用scikit-learn API中的KFold类来实现给定神经网络模型的k折交叉验证评估。有多种方法可以实现这一点，尽管我们可以选择一种灵活的方法，即KFold仅用于指定每个分割使用的行索引。

# example of k-fold cv for a neural net
data = ...
# prepare cross validation
kfold = KFold(5, shuffle=True, random_state=1)
# enumerate splits
for train_ix, test_ix in kfold.split(data):
        model = ...
	...

# 神经网络k折交叉验证示例

数据 = ...

# 准备交叉验证

kfold = KFold(5, shuffle=True, random_state=1)

# 枚举划分

for train_ix, test_ix in kfold.split(data):

model = ...

...

我们将保留实际的测试数据集，并将其用作对我们最终模型的评估。

如何开发基线模型

第一步是开发一个基线模型。

这是至关重要的，因为它不仅涉及开发测试工具的基础设施，以便我们可以根据数据集评估我们设计的任何模型，而且通过任何改进都可以与之相比，它建立了模型在该问题上的性能基线。

测试工具的设计是模块化的，我们可以为每个部分开发一个单独的函数。这使得测试工具的某个方面可以根据需要进行修改或互换，而与其余部分分开。

我们可以用五个关键元素来开发这个测试工具。它们是数据集的加载、数据集的准备、模型的定义、模型的评估和结果的呈现。

加载数据集

我们对数据集有一些了解。

例如，我们知道图像都经过预分割（例如，每张图像包含一个衣物），所有图像都具有相同的28x28像素的方形尺寸，并且图像是灰度图。因此，我们可以加载图像并将数据数组重塑为包含单个颜色通道。

# load dataset
(trainX, trainY), (testX, testY) = fashion_mnist.load_data()
# reshape dataset to have a single channel
trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
testX = testX.reshape((testX.shape[0], 28, 28, 1))

# 加载数据集

(trainX, trainY), (testX, testY) = fashion_mnist.load_data()

# 将数据集重塑为单通道

trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))

testX = testX.reshape((testX.shape[0], 28, 28, 1))

我们还知道有 10 个类别，并且这些类别表示为唯一的整数。

因此，我们可以对每个样本的类别元素进行独热编码，将整数转换为10个元素的二进制向量，其中类别值索引处为1。我们可以使用to_categorical()实用函数来实现这一点。

# one hot encode target values
trainY = to_categorical(trainY)
testY = to_categorical(testY)

# 独热编码目标值

trainY = to_categorical(trainY)

testY = to_categorical(testY)

load_dataset() 函数实现了这些行为，可用于加载数据集。

# load train and test dataset
def load_dataset():
	# load dataset
	(trainX, trainY), (testX, testY) = fashion_mnist.load_data()
	# reshape dataset to have a single channel
	trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
	testX = testX.reshape((testX.shape[0], 28, 28, 1))
	# one hot encode target values
	trainY = to_categorical(trainY)
	testY = to_categorical(testY)
	return trainX, trainY, testX, testY

# 加载训练和测试数据集

def load_dataset():

# 加载数据集

(trainX, trainY), (testX, testY) = fashion_mnist.load_data()

# 将数据集重塑为单通道

trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))

testX = testX.reshape((testX.shape[0], 28, 28, 1))

# 独热编码目标值

trainY = to_categorical(trainY)

testY = to_categorical(testY)

return trainX, trainY, testX, testY

准备像素数据

我们知道数据集中每个图像的像素值是介于黑色和白色之间的无符号整数，即0到255。

我们不知道如何最好地缩放像素值进行建模，但我们知道需要进行某种缩放。

一个好的起点是标准化灰度图像的像素值，例如将它们重新缩放到[0,1]的范围。这包括首先将数据类型从无符号整数转换为浮点数，然后将像素值除以最大值。

# convert from integers to floats
train_norm = train.astype('float32')
test_norm = test.astype('float32')
# normalize to range 0-1
train_norm = train_norm / 255.0
test_norm = test_norm / 255.0

# 将整数转换为浮点数

train_norm = train.astype('float32')

test_norm = test.astype('float32')

# 归一化到 0-1 范围

train_norm = train_norm / 255.0

test_norm = test_norm / 255.0

下面的 prep_pixels() 函数实现了这些行为，并提供了需要缩放的训练集和测试集的像素值。

# scale pixels
def prep_pixels(train, test):
	# convert from integers to floats
	train_norm = train.astype('float32')
	test_norm = test.astype('float32')
	# normalize to range 0-1
	train_norm = train_norm / 255.0
	test_norm = test_norm / 255.0
	# return normalized images
	return train_norm, test_norm

# 缩放像素

def prep_pixels(train, test):

# 将整数转换为浮点数

train_norm = train.astype('float32')

test_norm = test.astype('float32')

# 归一化到0-1范围

train_norm = train_norm / 255.0

test_norm = test_norm / 255.0

# 返回归一化图像

return train_norm, test_norm

此函数必须在任何建模之前调用，以准备像素值。

定义模型

接下来，我们需要为该问题定义一个基线卷积神经网络模型。

该模型有两个主要方面：由卷积层和池化层组成的特征提取前端，以及将进行预测的分类器后端。

对于卷积前端，我们可以从一个具有小滤镜尺寸（3x3）和适中滤镜数量（32）的单个卷积层开始，然后是最大池化层。然后可以将滤镜图展平，为分类器提供特征。

考虑到该问题是一个多类别分类问题，我们知道需要一个具有10个节点的输出层来预测图像属于10个类别中每个类别的概率分布。这还需要使用softmax激活函数。在特征提取器和输出层之间，我们可以添加一个全连接层来解释特征，这里使用100个节点。

所有层都将使用ReLU激活函数和He权重初始化方案，这两者都是最佳实践。

我们将使用保守的随机梯度下降优化器配置，学习率为0.01，动量为0.9。将优化分类交叉熵损失函数，这对于多类别分类是合适的，并且我们将监控分类准确率指标，该指标是合适的，因为我们有10个类别中每个类别具有相同数量的样本。

下面的define_model()函数将定义并返回此模型。

# define cnn model
def define_model():
	model = Sequential()
	model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
	model.add(MaxPooling2D((2, 2)))
	model.add(Flatten())
	model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
	model.add(Dense(10, activation='softmax'))
	# compile model
	opt = SGD(lr=0.01, momentum=0.9)
	model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
	return model

# 定义cnn模型

def define_model():

model = Sequential()

model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))

model.add(MaxPooling2D((2, 2)))

model.add(Flatten())

model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))

model.add(Dense(10, activation='softmax'))

# 编译模型

opt = SGD(lr=0.01, momentum=0.9)

model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

return model

评估模型

模型定义后，我们需要对其进行评估。

模型将使用5折交叉验证进行评估。选择k=5是为了同时提供重复评估的基线，并且不会太大导致需要很长的运行时间。每个测试集将占训练数据集的20%，或大约12,000个样本，接近该问题实际测试集的大小。

训练数据集在分割之前会被打乱，并且每次都会进行样本打乱，以便我们评估的任何模型在每一折中都具有相同的训练集和测试集，从而提供公平的比较。

我们将使用32个样本的默认批次大小，训练基线模型10个训练周期。每一折的测试集将用于在训练运行的每个周期中评估模型，以便我们之后可以创建学习曲线，并在运行结束时评估模型，以便我们可以估计模型的性能。因此，我们将跟踪每个运行产生的历史记录以及该折的分类准确率。

下面的evaluate_model()函数实现了这些行为，它接受训练数据集作为参数，并返回一个可以稍后汇总的准确率分数和训练历史列表。

# evaluate a model using k-fold cross-validation
def evaluate_model(dataX, dataY, n_folds=5):
	scores, histories = list(), list()
	# prepare cross validation
	kfold = KFold(n_folds, shuffle=True, random_state=1)
	# enumerate splits
	for train_ix, test_ix in kfold.split(dataX):
		# define model
		model = define_model()
		# select rows for train and test
		trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]
		# fit model
		history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0)
		# evaluate model
		_, acc = model.evaluate(testX, testY, verbose=0)
		print('> %.3f' % (acc * 100.0))
		# append scores
		scores.append(acc)
		histories.append(history)
	return scores, histories

# 使用k折交叉验证评估模型

def evaluate_model(dataX, dataY, n_folds=5):

scores, histories = list(), list()

# 准备交叉验证

kfold = KFold(n_folds, shuffle=True, random_state=1)

# 枚举划分

for train_ix, test_ix in kfold.split(dataX):

# 定义模型

model = define_model()

# 选择训练和测试的行

trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]

# 拟合模型

history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0)

# 评估模型

_, acc = model.evaluate(testX, testY, verbose=0)

print('> %.3f' % (acc * 100.0))

# 追加分数

scores.append(acc)

histories.append(history)

return scores, histories

5. 呈现结果

模型评估完成后，我们可以呈现结果。

需要呈现两个关键方面：模型在训练期间的学习行为的诊断，以及模型性能的估计。这些可以通过单独的函数来实现。

首先，诊断涉及创建线图，显示模型在k折交叉验证的每个折叠期间在训练集和测试集上的性能。这些图对于了解模型是否过拟合、欠拟合或与数据集拟合良好非常有用。

我们将创建一个包含两个子图的单个图形，一个用于损失，一个用于准确率。蓝色线条将表示模型在训练集上的性能，橙色线条将表示在保留测试集上的性能。下面的summarize_diagnostics()函数在收集到的训练历史记录的帮助下创建并显示此图。

# plot diagnostic learning curves
def summarize_diagnostics(histories):
	for i in range(len(histories)):
		# plot loss
		pyplot.subplot(211)
		pyplot.title('Cross Entropy Loss')
		pyplot.plot(histories[i].history['loss'], color='blue', label='train')
		pyplot.plot(histories[i].history['val_loss'], color='orange', label='test')
		# plot accuracy
		pyplot.subplot(212)
		pyplot.title('Classification Accuracy')
		pyplot.plot(histories[i].history['accuracy'], color='blue', label='train')
		pyplot.plot(histories[i].history['val_accuracy'], color='orange', label='test')
	pyplot.show()

# 绘制诊断学习曲线

def summarize_diagnostics(histories):

for i in range(len(histories)):

# 绘制损失

pyplot.subplot(211)

pyplot.title('交叉熵损失')

pyplot.plot(histories[i].history['loss'], color='blue', label='train')

pyplot.plot(histories[i].history['val_loss'], color='orange', label='test')

# 绘制准确率

pyplot.subplot(212)

pyplot.title('分类准确率')

pyplot.plot(histories[i].history['accuracy'], color='blue', label='train')

pyplot.plot(histories[i].history['val_accuracy'], color='orange', label='test')

pyplot.show()

接下来，通过计算均值和标准差来总结交叉验证过程中收集到的分类准确率分数。这可以估算模型在该数据集上训练的平均预期性能，以及平均方差的估算。我们还将通过创建和显示箱形图和须状图来总结分数的分布。

下面的 `summarize_performance()` 函数为给定模型评估期间收集的分数列表实现了此功能。

# summarize model performance
def summarize_performance(scores):
	# print summary
	print('Accuracy: mean=%.3f std=%.3f, n=%d' % (mean(scores)*100, std(scores)*100, len(scores)))
	# box and whisker plots of results
	pyplot.boxplot(scores)
	pyplot.show()

# 总结模型性能

def summarize_performance(scores):

# 打印摘要

print('Accuracy: mean=%.3f std=%.3f, n=%d' % (mean(scores)*100, std(scores)*100, len(scores)))

# box and whisker plots of results

pyplot.boxplot(scores)

pyplot.show()

完整示例

我们需要一个函数来驱动测试工具。

这涉及调用所有已定义的函数。

# run the test harness for evaluating a model
def run_test_harness():
	# load dataset
	trainX, trainY, testX, testY = load_dataset()
	# prepare pixel data
	trainX, testX = prep_pixels(trainX, testX)
	# evaluate model
	scores, histories = evaluate_model(trainX, trainY)
	# learning curves
	summarize_diagnostics(histories)
	# summarize estimated performance
	summarize_performance(scores)

# 运行测试工具以评估 cifar10 数据集上的模型

def run_test_harness():

# 加载数据集

trainX, trainY, testX, testY = load_dataset()

# 准备像素数据

trainX, testX = prep_pixels(trainX, testX)

# 评估模型

scores, histories = evaluate_model(trainX, trainY)

# 学习曲线

summarize_diagnostics(histories)

# summarize estimated performance

summarize_performance(scores)

现在我们有了所有需要的东西；下面列出了在 MNIST 数据集上使用卷积神经网络的基线模型的完整代码示例。

# baseline cnn model for fashion mnist
from numpy import mean
from numpy import std
from matplotlib import pyplot
from sklearn.model_selection import KFold
from keras.datasets import fashion_mnist
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Dense
from keras.layers import Flatten
from keras.optimizers import SGD

# load train and test dataset
def load_dataset():
	# load dataset
	(trainX, trainY), (testX, testY) = fashion_mnist.load_data()
	# reshape dataset to have a single channel
	trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
	testX = testX.reshape((testX.shape[0], 28, 28, 1))
	# one hot encode target values
	trainY = to_categorical(trainY)
	testY = to_categorical(testY)
	return trainX, trainY, testX, testY

# scale pixels
def prep_pixels(train, test):
	# convert from integers to floats
	train_norm = train.astype('float32')
	test_norm = test.astype('float32')
	# normalize to range 0-1
	train_norm = train_norm / 255.0
	test_norm = test_norm / 255.0
	# return normalized images
	return train_norm, test_norm

# define cnn model
def define_model():
	model = Sequential()
	model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
	model.add(MaxPooling2D((2, 2)))
	model.add(Flatten())
	model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
	model.add(Dense(10, activation='softmax'))
	# compile model
	opt = SGD(lr=0.01, momentum=0.9)
	model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
	return model

# evaluate a model using k-fold cross-validation
def evaluate_model(dataX, dataY, n_folds=5):
	scores, histories = list(), list()
	# prepare cross validation
	kfold = KFold(n_folds, shuffle=True, random_state=1)
	# enumerate splits
	for train_ix, test_ix in kfold.split(dataX):
		# define model
		model = define_model()
		# select rows for train and test
		trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]
		# fit model
		history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0)
		# evaluate model
		_, acc = model.evaluate(testX, testY, verbose=0)
		print('> %.3f' % (acc * 100.0))
		# append scores
		scores.append(acc)
		histories.append(history)
	return scores, histories

# plot diagnostic learning curves
def summarize_diagnostics(histories):
	for i in range(len(histories)):
		# plot loss
		pyplot.subplot(211)
		pyplot.title('Cross Entropy Loss')
		pyplot.plot(histories[i].history['loss'], color='blue', label='train')
		pyplot.plot(histories[i].history['val_loss'], color='orange', label='test')
		# plot accuracy
		pyplot.subplot(212)
		pyplot.title('Classification Accuracy')
		pyplot.plot(histories[i].history['accuracy'], color='blue', label='train')
		pyplot.plot(histories[i].history['val_accuracy'], color='orange', label='test')
	pyplot.show()

# summarize model performance
def summarize_performance(scores):
	# print summary
	print('Accuracy: mean=%.3f std=%.3f, n=%d' % (mean(scores)*100, std(scores)*100, len(scores)))
	# box and whisker plots of results
	pyplot.boxplot(scores)
	pyplot.show()

# run the test harness for evaluating a model
def run_test_harness():
	# load dataset
	trainX, trainY, testX, testY = load_dataset()
	# prepare pixel data
	trainX, testX = prep_pixels(trainX, testX)
	# evaluate model
	scores, histories = evaluate_model(trainX, trainY)
	# learning curves
	summarize_diagnostics(histories)
	# summarize estimated performance
	summarize_performance(scores)

# entry point, run the test harness
run_test_harness()

100

101

102

103

104

105

106

107

108

109

# baseline cnn model for fashion mnist

from numpy import mean

from numpy import std

from matplotlib import pyplot

from sklearn.model_selection import KFold

from keras.datasets import fashion_mnist

from keras.utils import to_categorical

from keras.models import Sequential

从 keras.layers 导入 Conv2D

从 keras.layers 导入 MaxPooling2D

from keras.layers import Dense

from keras.layers import Flatten

from keras.optimizers import SGD

# 加载训练和测试数据集

def load_dataset():

# 加载数据集

(trainX, trainY), (testX, testY) = fashion_mnist.load_data()

# 将数据集重塑为单通道

trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))

testX = testX.reshape((testX.shape[0], 28, 28, 1))

# 独热编码目标值

trainY = to_categorical(trainY)

testY = to_categorical(testY)

return trainX, trainY, testX, testY

# 缩放像素

def prep_pixels(train, test):

# 将整数转换为浮点数

train_norm = train.astype('float32')

test_norm = test.astype('float32')

# 归一化到0-1范围

train_norm = train_norm / 255.0

test_norm = test_norm / 255.0

# 返回归一化图像

return train_norm, test_norm

# 定义cnn模型

def define_model():

model = Sequential()

model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))

model.add(MaxPooling2D((2, 2)))

model.add(Flatten())

model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))

model.add(Dense(10, activation='softmax'))

# 编译模型

opt = SGD(lr=0.01, momentum=0.9)

model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

return model

# 使用k折交叉验证评估模型

def evaluate_model(dataX, dataY, n_folds=5):

scores, histories = list(), list()

# 准备交叉验证

kfold = KFold(n_folds, shuffle=True, random_state=1)

# 枚举划分

for train_ix, test_ix in kfold.split(dataX):

# 定义模型

model = define_model()

# 选择训练和测试的行

trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]

# 拟合模型

history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0)

# 评估模型

_, acc = model.evaluate(testX, testY, verbose=0)

print('> %.3f' % (acc * 100.0))

# 追加分数

scores.append(acc)

histories.append(history)

return scores, histories

# 绘制诊断学习曲线

def summarize_diagnostics(histories):

for i in range(len(histories)):

# 绘制损失

pyplot.subplot(211)

pyplot.title('交叉熵损失')

pyplot.plot(histories[i].history['loss'], color='blue', label='train')

pyplot.plot(histories[i].history['val_loss'], color='orange', label='test')

# 绘制准确率

pyplot.subplot(212)

pyplot.title('分类准确率')

pyplot.plot(histories[i].history['accuracy'], color='blue', label='train')

pyplot.plot(histories[i].history['val_accuracy'], color='orange', label='test')

pyplot.show()

# 总结模型性能

def summarize_performance(scores):

# 打印摘要

print('Accuracy: mean=%.3f std=%.3f, n=%d' % (mean(scores)*100, std(scores)*100, len(scores)))

# box and whisker plots of results

pyplot.boxplot(scores)

pyplot.show()

# 运行测试工具以评估 cifar10 数据集上的模型

def run_test_harness():

# 加载数据集

trainX, trainY, testX, testY = load_dataset()

# 准备像素数据

trainX, testX = prep_pixels(trainX, testX)

# 评估模型

scores, histories = evaluate_model(trainX, trainY)

# 学习曲线

summarize_diagnostics(histories)

# summarize estimated performance

summarize_performance(scores)

# 入口点，运行测试工具

run_test_harness()

运行示例会为交叉验证过程的每个折叠打印分类准确率。这有助于了解模型评估的进展情况。

注意：由于算法或评估程序的随机性，或者数值精度的差异，您的结果可能有所不同。请考虑运行示例几次并比较平均结果。

我们可以看到，在每个折叠中，基线模型都达到了低于 10% 的错误率，在某些情况下达到了 98% 和 99% 的准确率。这些都是不错的结果。

> 91.200
> 91.217
> 90.958
> 91.242
> 91.317

> 91.200

> 91.217

> 90.958

> 91.242

> 91.317

接下来，将显示学习曲线图，从而深入了解模型在每个折叠中的学习行为。

在这种情况下，我们可以看到模型通常能很好地拟合，训练和测试学习曲线正在收敛。可能有一些轻微过拟合的迹象。

Loss-and-Accuracy-Learning-Curves-for-the-Baseline-Model-on-the-Fashion-MNIST-Dataset-During-k-Fold-Cross-Validation

接下来，计算模型性能的摘要。我们可以看到，在此情况下，模型的估计技能约为 96%，这令人印象深刻。

Accuracy: mean=91.187 std=0.121, n=5

1	Accuracy: mean=91.187 std=0.121, n=5

最后，创建箱形图和须状图来总结准确率分数的分布。

Box-and-Whisker-Plot-of-Accuracy-Scores-for-the-Baseline-Model-on-the-Fashion-MNIST-Dataset-Evaluated-Using-k-Fold-Cross-Validation

正如我们所料，分布范围在九十年代初。

现在我们有了一个健壮的测试框架和一个性能良好的基线模型。

如何开发改进模型

我们可以通过多种方式探索改进基线模型的方法。

我们将关注那些通常能带来改进的领域，即所谓的“容易摘到的果实”。第一种是改变卷积操作以添加填充，第二种是在此基础上增加滤波器的数量。

Padding Convolutions

为卷积操作添加填充通常可以提高模型性能，因为更多的输入图像或特征图有机会参与或贡献于输出。

默认情况下，卷积操作使用“*valid*”填充，这意味着仅在可能的情况下应用卷积。这可以更改为“*same*”填充，以便在输入周围添加零值，使输出与输入的大小相同。

...
model.add(Conv2D(32, (3, 3), padding='same', activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))

1 2	... model.add(Conv2D(32, (3, 3), padding='same', activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))

为了完整起见，下面提供了包含填充更改的完整代码列表。

# model with padded convolutions for the fashion mnist dataset
from numpy import mean
from numpy import std
from matplotlib import pyplot
from sklearn.model_selection import KFold
from keras.datasets import fashion_mnist
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Dense
from keras.layers import Flatten
from keras.optimizers import SGD

# load train and test dataset
def load_dataset():
	# load dataset
	(trainX, trainY), (testX, testY) = fashion_mnist.load_data()
	# reshape dataset to have a single channel
	trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
	testX = testX.reshape((testX.shape[0], 28, 28, 1))
	# one hot encode target values
	trainY = to_categorical(trainY)
	testY = to_categorical(testY)
	return trainX, trainY, testX, testY

# scale pixels
def prep_pixels(train, test):
	# convert from integers to floats
	train_norm = train.astype('float32')
	test_norm = test.astype('float32')
	# normalize to range 0-1
	train_norm = train_norm / 255.0
	test_norm = test_norm / 255.0
	# return normalized images
	return train_norm, test_norm

# define cnn model
def define_model():
	model = Sequential()
	model.add(Conv2D(32, (3, 3), padding='same', activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
	model.add(MaxPooling2D((2, 2)))
	model.add(Flatten())
	model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
	model.add(Dense(10, activation='softmax'))
	# compile model
	opt = SGD(lr=0.01, momentum=0.9)
	model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
	return model

# evaluate a model using k-fold cross-validation
def evaluate_model(dataX, dataY, n_folds=5):
	scores, histories = list(), list()
	# prepare cross validation
	kfold = KFold(n_folds, shuffle=True, random_state=1)
	# enumerate splits
	for train_ix, test_ix in kfold.split(dataX):
		# define model
		model = define_model()
		# select rows for train and test
		trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]
		# fit model
		history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0)
		# evaluate model
		_, acc = model.evaluate(testX, testY, verbose=0)
		print('> %.3f' % (acc * 100.0))
		# append scores
		scores.append(acc)
		histories.append(history)
	return scores, histories

# plot diagnostic learning curves
def summarize_diagnostics(histories):
	for i in range(len(histories)):
		# plot loss
		pyplot.subplot(211)
		pyplot.title('Cross Entropy Loss')
		pyplot.plot(histories[i].history['loss'], color='blue', label='train')
		pyplot.plot(histories[i].history['val_loss'], color='orange', label='test')
		# plot accuracy
		pyplot.subplot(212)
		pyplot.title('Classification Accuracy')
		pyplot.plot(histories[i].history['accuracy'], color='blue', label='train')
		pyplot.plot(histories[i].history['val_accuracy'], color='orange', label='test')
	pyplot.show()

# summarize model performance
def summarize_performance(scores):
	# print summary
	print('Accuracy: mean=%.3f std=%.3f, n=%d' % (mean(scores)*100, std(scores)*100, len(scores)))
	# box and whisker plots of results
	pyplot.boxplot(scores)
	pyplot.show()

# run the test harness for evaluating a model
def run_test_harness():
	# load dataset
	trainX, trainY, testX, testY = load_dataset()
	# prepare pixel data
	trainX, testX = prep_pixels(trainX, testX)
	# evaluate model
	scores, histories = evaluate_model(trainX, trainY)
	# learning curves
	summarize_diagnostics(histories)
	# summarize estimated performance
	summarize_performance(scores)

# entry point, run the test harness
run_test_harness()

100

101

102

103

104

105

106

107

108

109

# model with padded convolutions for the fashion mnist dataset

from numpy import mean

from numpy import std

from matplotlib import pyplot

from sklearn.model_selection import KFold

from keras.datasets import fashion_mnist

from keras.utils import to_categorical

from keras.models import Sequential

从 keras.layers 导入 Conv2D

从 keras.layers 导入 MaxPooling2D

from keras.layers import Dense

from keras.layers import Flatten

from keras.optimizers import SGD

# 加载训练和测试数据集

def load_dataset():

# 加载数据集

(trainX, trainY), (testX, testY) = fashion_mnist.load_data()

# 将数据集重塑为单通道

trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))

testX = testX.reshape((testX.shape[0], 28, 28, 1))

# 独热编码目标值

trainY = to_categorical(trainY)

testY = to_categorical(testY)

return trainX, trainY, testX, testY

# 缩放像素

def prep_pixels(train, test):

# 将整数转换为浮点数

train_norm = train.astype('float32')

test_norm = test.astype('float32')

# 归一化到0-1范围

train_norm = train_norm / 255.0

test_norm = test_norm / 255.0

# 返回归一化图像

return train_norm, test_norm

# 定义cnn模型

def define_model():

model = Sequential()

model.add(Conv2D(32, (3, 3), padding='same', activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))

model.add(MaxPooling2D((2, 2)))

model.add(Flatten())

model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))

model.add(Dense(10, activation='softmax'))

# 编译模型

opt = SGD(lr=0.01, momentum=0.9)

model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

return model

# 使用k折交叉验证评估模型

def evaluate_model(dataX, dataY, n_folds=5):

scores, histories = list(), list()

# 准备交叉验证

kfold = KFold(n_folds, shuffle=True, random_state=1)

# 枚举划分

for train_ix, test_ix in kfold.split(dataX):

# 定义模型

model = define_model()

# 选择训练和测试的行

trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]

# 拟合模型

history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0)

# 评估模型

_, acc = model.evaluate(testX, testY, verbose=0)

print('> %.3f' % (acc * 100.0))

# 追加分数

scores.append(acc)

histories.append(history)

return scores, histories

# 绘制诊断学习曲线

def summarize_diagnostics(histories):

for i in range(len(histories)):

# 绘制损失

pyplot.subplot(211)

pyplot.title('交叉熵损失')

pyplot.plot(histories[i].history['loss'], color='blue', label='train')

pyplot.plot(histories[i].history['val_loss'], color='orange', label='test')

# 绘制准确率

pyplot.subplot(212)

pyplot.title('分类准确率')

pyplot.plot(histories[i].history['accuracy'], color='blue', label='train')

pyplot.plot(histories[i].history['val_accuracy'], color='orange', label='test')

pyplot.show()

# 总结模型性能

def summarize_performance(scores):

# 打印摘要

print('Accuracy: mean=%.3f std=%.3f, n=%d' % (mean(scores)*100, std(scores)*100, len(scores)))

# box and whisker plots of results

pyplot.boxplot(scores)

pyplot.show()

# 运行测试工具以评估 cifar10 数据集上的模型

def run_test_harness():

# 加载数据集

trainX, trainY, testX, testY = load_dataset()

# 准备像素数据

trainX, testX = prep_pixels(trainX, testX)

# 评估模型

scores, histories = evaluate_model(trainX, trainY)

# 学习曲线

summarize_diagnostics(histories)

# summarize estimated performance

summarize_performance(scores)

# 入口点，运行测试工具

run_test_harness()

再次运行示例会报告交叉验证过程中每个折叠的模型性能。

注意：由于算法或评估程序的随机性，或者数值精度的差异，您的结果可能有所不同。请考虑运行示例几次并比较平均结果。

与仅使用 same 填充相比，我们可以看到模型性能可能略有提高。

> 90.875
> 91.442
> 91.242
> 91.275
> 91.450

> 90.875

> 91.442

> 91.242

> 91.275

> 91.450

创建了学习曲线图。与基线模型一样，我们可能会看到一些轻微的过拟合。这可以通过使用正则化或训练更少的 epoch 来解决。

Loss-and-Accuracy-Learning-Curves-for-the-Same-Padding-on-the-Fashion-MNIST-Dataset-During-k-Fold-Cross-Validation

接下来，将展示模型的估计性能，与基线模型的 91.187% 相比，模型的平均准确率略有下降，为 91.257%。

这可能是一个真实的效果，也可能不是，因为它在标准差的范围内。也许进行更多的实验重复可以弄清楚这一点。

Accuracy: mean=91.257 std=0.209, n=5

1	Accuracy: mean=91.257 std=0.209, n=5

Box-and-Whisker-Plot-of-Accuracy-Scores-for-Same-Padding-on-the-Fashion-MNIST-Dataset-Evaluated-Using-k-Fold-Cross-Validation

Increasing Filters

增加卷积层中使用的滤波器数量通常可以提高性能，因为它能为从输入图像中提取特征提供更多机会。

这在非常小的滤波器（如 3×3 像素）被使用时尤其重要。

在此更改中，我们可以将卷积层中的滤波器数量从 32 个增加到 64 个（加倍）。我们还将利用使用“*same*”填充可能带来的改进。

...
model.add(Conv2D(64, (3, 3), padding='same', activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))

1 2	... model.add(Conv2D(64, (3, 3), padding='same', activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))

为了完整起见，下面提供了包含填充更改的完整代码列表。

# model with double the filters for the fashion mnist dataset
from numpy import mean
from numpy import std
from matplotlib import pyplot
from sklearn.model_selection import KFold
from keras.datasets import fashion_mnist
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Dense
from keras.layers import Flatten
from keras.optimizers import SGD

# load train and test dataset
def load_dataset():
	# load dataset
	(trainX, trainY), (testX, testY) = fashion_mnist.load_data()
	# reshape dataset to have a single channel
	trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
	testX = testX.reshape((testX.shape[0], 28, 28, 1))
	# one hot encode target values
	trainY = to_categorical(trainY)
	testY = to_categorical(testY)
	return trainX, trainY, testX, testY

# scale pixels
def prep_pixels(train, test):
	# convert from integers to floats
	train_norm = train.astype('float32')
	test_norm = test.astype('float32')
	# normalize to range 0-1
	train_norm = train_norm / 255.0
	test_norm = test_norm / 255.0
	# return normalized images
	return train_norm, test_norm

# define cnn model
def define_model():
	model = Sequential()
	model.add(Conv2D(64, (3, 3), padding='same', activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
	model.add(MaxPooling2D((2, 2)))
	model.add(Flatten())
	model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
	model.add(Dense(10, activation='softmax'))
	# compile model
	opt = SGD(lr=0.01, momentum=0.9)
	model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
	return model

# evaluate a model using k-fold cross-validation
def evaluate_model(dataX, dataY, n_folds=5):
	scores, histories = list(), list()
	# prepare cross validation
	kfold = KFold(n_folds, shuffle=True, random_state=1)
	# enumerate splits
	for train_ix, test_ix in kfold.split(dataX):
		# define model
		model = define_model()
		# select rows for train and test
		trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]
		# fit model
		history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0)
		# evaluate model
		_, acc = model.evaluate(testX, testY, verbose=0)
		print('> %.3f' % (acc * 100.0))
		# append scores
		scores.append(acc)
		histories.append(history)
	return scores, histories

# plot diagnostic learning curves
def summarize_diagnostics(histories):
	for i in range(len(histories)):
		# plot loss
		pyplot.subplot(211)
		pyplot.title('Cross Entropy Loss')
		pyplot.plot(histories[i].history['loss'], color='blue', label='train')
		pyplot.plot(histories[i].history['val_loss'], color='orange', label='test')
		# plot accuracy
		pyplot.subplot(212)
		pyplot.title('Classification Accuracy')
		pyplot.plot(histories[i].history['accuracy'], color='blue', label='train')
		pyplot.plot(histories[i].history['val_accuracy'], color='orange', label='test')
	pyplot.show()

# summarize model performance
def summarize_performance(scores):
	# print summary
	print('Accuracy: mean=%.3f std=%.3f, n=%d' % (mean(scores)*100, std(scores)*100, len(scores)))
	# box and whisker plots of results
	pyplot.boxplot(scores)
	pyplot.show()

# run the test harness for evaluating a model
def run_test_harness():
	# load dataset
	trainX, trainY, testX, testY = load_dataset()
	# prepare pixel data
	trainX, testX = prep_pixels(trainX, testX)
	# evaluate model
	scores, histories = evaluate_model(trainX, trainY)
	# learning curves
	summarize_diagnostics(histories)
	# summarize estimated performance
	summarize_performance(scores)

# entry point, run the test harness
run_test_harness()

100

101

102

103

104

105

106

107

108

109

# model with double the filters for the fashion mnist dataset

from numpy import mean

from numpy import std

from matplotlib import pyplot

from sklearn.model_selection import KFold

from keras.datasets import fashion_mnist

from keras.utils import to_categorical

from keras.models import Sequential

从 keras.layers 导入 Conv2D

从 keras.layers 导入 MaxPooling2D

from keras.layers import Dense

from keras.layers import Flatten

from keras.optimizers import SGD

# 加载训练和测试数据集

def load_dataset():

# 加载数据集

(trainX, trainY), (testX, testY) = fashion_mnist.load_data()

# 将数据集重塑为单通道

trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))

testX = testX.reshape((testX.shape[0], 28, 28, 1))

# 独热编码目标值

trainY = to_categorical(trainY)

testY = to_categorical(testY)

return trainX, trainY, testX, testY

# 缩放像素

def prep_pixels(train, test):

# 将整数转换为浮点数

train_norm = train.astype('float32')

test_norm = test.astype('float32')

# 归一化到0-1范围

train_norm = train_norm / 255.0

test_norm = test_norm / 255.0

# 返回归一化图像

return train_norm, test_norm

# 定义cnn模型

def define_model():

model = Sequential()

model.add(Conv2D(64, (3, 3), padding='same', activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))

model.add(MaxPooling2D((2, 2)))

model.add(Flatten())

model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))

model.add(Dense(10, activation='softmax'))

# 编译模型

opt = SGD(lr=0.01, momentum=0.9)

model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

return model

# 使用k折交叉验证评估模型

def evaluate_model(dataX, dataY, n_folds=5):

scores, histories = list(), list()

# 准备交叉验证

kfold = KFold(n_folds, shuffle=True, random_state=1)

# 枚举划分

for train_ix, test_ix in kfold.split(dataX):

# 定义模型

model = define_model()

# 选择训练和测试的行

trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]

# 拟合模型

history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0)

# 评估模型

_, acc = model.evaluate(testX, testY, verbose=0)

print('> %.3f' % (acc * 100.0))

# 追加分数

scores.append(acc)

histories.append(history)

return scores, histories

# 绘制诊断学习曲线

def summarize_diagnostics(histories):

for i in range(len(histories)):

# 绘制损失

pyplot.subplot(211)

pyplot.title('交叉熵损失')

pyplot.plot(histories[i].history['loss'], color='blue', label='train')

pyplot.plot(histories[i].history['val_loss'], color='orange', label='test')

# 绘制准确率

pyplot.subplot(212)

pyplot.title('分类准确率')

pyplot.plot(histories[i].history['accuracy'], color='blue', label='train')

pyplot.plot(histories[i].history['val_accuracy'], color='orange', label='test')

pyplot.show()

# 总结模型性能

def summarize_performance(scores):

# 打印摘要

print('Accuracy: mean=%.3f std=%.3f, n=%d' % (mean(scores)*100, std(scores)*100, len(scores)))

# box and whisker plots of results

pyplot.boxplot(scores)

pyplot.show()

# 运行测试工具以评估 cifar10 数据集上的模型

def run_test_harness():

# 加载数据集

trainX, trainY, testX, testY = load_dataset()

# 准备像素数据

trainX, testX = prep_pixels(trainX, testX)

# 评估模型

scores, histories = evaluate_model(trainX, trainY)

# 学习曲线

summarize_diagnostics(histories)

# summarize estimated performance

summarize_performance(scores)

# 入口点，运行测试工具

run_test_harness()

再次运行示例会报告交叉验证过程中每个折叠的模型性能。

注意：由于算法或评估程序的随机性，或者数值精度的差异，您的结果可能有所不同。请考虑运行示例几次并比较平均结果。

每个折叠的分数可能表明与基线模型和单独使用 same 填充相比，性能有所提高。

> 90.917
> 90.908
> 90.175
> 91.158
> 91.408

> 90.917

> 90.908

> 90.175

> 91.158

> 91.408

创建了学习曲线图，在这种情况下，模型在问题上仍然具有合理的拟合度，并且有一些运行过拟合的小迹象。

Loss-and-Accuracy-Learning-Curves-for-the-More-Filters-and-Padding-on-the-Fashion-MNIST-Dataset-During-k-Fold-Cross-Validation

接下来，将展示模型的估计性能，与基线模型的 90.913% 相比，性能略有提高至 91.257%。

同样，该变化仍在标准差的范围内，并且不清楚该效应是否真实。

Accuracy: mean=90.913 std=0.412, n=5

1	Accuracy: mean=90.913 std=0.412, n=5

如何完成模型并进行预测

只要我们有想法、时间和资源来测试它们，模型改进过程就可以继续下去。

在某个时候，必须选择并采纳最终的模型配置。在这种情况下，我们将保持简单，并将基线模型用作最终模型。

首先，我们将完成模型，通过在整个训练数据集上拟合模型并将模型保存到文件以供以后使用。然后，我们将加载模型并在保留的测试数据集上评估其性能，以了解所选模型在实践中的表现如何。最后，我们将使用保存的模型对单个图像进行预测。

保存最终模型

最终模型通常拟合所有可用数据，例如所有训练和测试数据集的组合。

在本教程中，我们特意预留了一个测试数据集，以便我们能够估算最终模型的性能，这在实践中是一个好主意。因此，我们将仅在训练数据集上拟合模型。

# fit model
model.fit(trainX, trainY, epochs=10, batch_size=32, verbose=0)

1 2	# 拟合模型 model.fit(trainX, trainY, epochs=10, batch_size=32, verbose=0)

拟合后，我们可以通过调用模型上的 `save()` 函数并将指定的文件名传递进去，将最终模型保存到 h5 文件中。

# save model
model.save('final_model.h5')

1 2	# 保存模型 model.save('final_model.h5')

注意：保存和加载 Keras 模型需要您的工作站上安装了 h5py 库。

下面列出了在训练数据集上拟合最终模型并将其保存到文件的完整示例。

# save the final model to file
from keras.datasets import fashion_mnist
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Dense
from keras.layers import Flatten
from keras.optimizers import SGD

# load train and test dataset
def load_dataset():
	# load dataset
	(trainX, trainY), (testX, testY) = fashion_mnist.load_data()
	# reshape dataset to have a single channel
	trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
	testX = testX.reshape((testX.shape[0], 28, 28, 1))
	# one hot encode target values
	trainY = to_categorical(trainY)
	testY = to_categorical(testY)
	return trainX, trainY, testX, testY

# scale pixels
def prep_pixels(train, test):
	# convert from integers to floats
	train_norm = train.astype('float32')
	test_norm = test.astype('float32')
	# normalize to range 0-1
	train_norm = train_norm / 255.0
	test_norm = test_norm / 255.0
	# return normalized images
	return train_norm, test_norm

# define cnn model
def define_model():
	model = Sequential()
	model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
	model.add(MaxPooling2D((2, 2)))
	model.add(Flatten())
	model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
	model.add(Dense(10, activation='softmax'))
	# compile model
	opt = SGD(lr=0.01, momentum=0.9)
	model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
	return model

# run the test harness for evaluating a model
def run_test_harness():
	# load dataset
	trainX, trainY, testX, testY = load_dataset()
	# prepare pixel data
	trainX, testX = prep_pixels(trainX, testX)
	# define model
	model = define_model()
	# fit model
	model.fit(trainX, trainY, epochs=10, batch_size=32, verbose=0)
	# save model
	model.save('final_model.h5')

# entry point, run the test harness
run_test_harness()

# 将最终模型保存到文件

from keras.datasets import fashion_mnist

from keras.utils import to_categorical

from keras.models import Sequential

从 keras.layers 导入 Conv2D

从 keras.layers 导入 MaxPooling2D

from keras.layers import Dense

from keras.layers import Flatten

from keras.optimizers import SGD

# 加载训练和测试数据集

def load_dataset():

# 加载数据集

(trainX, trainY), (testX, testY) = fashion_mnist.load_data()

# 将数据集重塑为单通道

trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))

testX = testX.reshape((testX.shape[0], 28, 28, 1))

# 独热编码目标值

trainY = to_categorical(trainY)

testY = to_categorical(testY)

return trainX, trainY, testX, testY

# 缩放像素

def prep_pixels(train, test):

# 将整数转换为浮点数

train_norm = train.astype('float32')

test_norm = test.astype('float32')

# 归一化到0-1范围

train_norm = train_norm / 255.0

test_norm = test_norm / 255.0

# 返回归一化图像

return train_norm, test_norm

# 定义cnn模型

def define_model():

model = Sequential()

model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))

model.add(MaxPooling2D((2, 2)))

model.add(Flatten())

model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))

model.add(Dense(10, activation='softmax'))

# 编译模型

opt = SGD(lr=0.01, momentum=0.9)

model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

return model

# 运行测试工具以评估 cifar10 数据集上的模型

def run_test_harness():

# 加载数据集

trainX, trainY, testX, testY = load_dataset()

# 准备像素数据

trainX, testX = prep_pixels(trainX, testX)

# 定义模型

model = define_model()

# 拟合模型

model.fit(trainX, trainY, epochs=10, batch_size=32, verbose=0)

# 保存模型

model.save('final_model.h5')

# 入口点，运行测试工具

run_test_harness()

运行此示例后，您将在当前工作目录中获得一个名为 `final_model.h5` 的 1.2MB 文件。

评估最终模型

我们现在可以加载最终模型并在保留测试数据集上对其进行评估。

如果我们有兴趣向项目利益相关者展示所选模型的性能，我们可能会这样做。

模型可以通过 *load_model()* 函数加载。

下面列出了加载保存的模型并在测试数据集上对其进行评估的完整示例。

# evaluate the deep model on the test dataset
from keras.datasets import fashion_mnist
from keras.models import load_model
from keras.utils import to_categorical

# load train and test dataset
def load_dataset():
	# load dataset
	(trainX, trainY), (testX, testY) = fashion_mnist.load_data()
	# reshape dataset to have a single channel
	trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
	testX = testX.reshape((testX.shape[0], 28, 28, 1))
	# one hot encode target values
	trainY = to_categorical(trainY)
	testY = to_categorical(testY)
	return trainX, trainY, testX, testY

# scale pixels
def prep_pixels(train, test):
	# convert from integers to floats
	train_norm = train.astype('float32')
	test_norm = test.astype('float32')
	# normalize to range 0-1
	train_norm = train_norm / 255.0
	test_norm = test_norm / 255.0
	# return normalized images
	return train_norm, test_norm

# run the test harness for evaluating a model
def run_test_harness():
	# load dataset
	trainX, trainY, testX, testY = load_dataset()
	# prepare pixel data
	trainX, testX = prep_pixels(trainX, testX)
	# load model
	model = load_model('final_model.h5')
	# evaluate model on test dataset
	_, acc = model.evaluate(testX, testY, verbose=0)
	print('> %.3f' % (acc * 100.0))

# entry point, run the test harness
run_test_harness()

# 在测试数据集上评估深度模型

from keras.datasets import fashion_mnist

from keras.models import load_model

从 keras.utils 导入 to_categorical

# 加载训练和测试数据集

def load_dataset():

# 加载数据集

(trainX, trainY), (testX, testY) = fashion_mnist.load_data()

# 将数据集重塑为单通道

trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))

testX = testX.reshape((testX.shape[0], 28, 28, 1))

# 独热编码目标值

trainY = to_categorical(trainY)

testY = to_categorical(testY)

return trainX, trainY, testX, testY

# 缩放像素

def prep_pixels(train, test):

# 将整数转换为浮点数

train_norm = train.astype('float32')

test_norm = test.astype('float32')

# 归一化到0-1范围

train_norm = train_norm / 255.0

test_norm = test_norm / 255.0

# 返回归一化图像

return train_norm, test_norm

# 运行测试工具以评估 cifar10 数据集上的模型

def run_test_harness():

# 加载数据集

trainX, trainY, testX, testY = load_dataset()

# 准备像素数据

trainX, testX = prep_pixels(trainX, testX)

# 加载模型

model = load_model('final_model.h5')

# 在测试数据集上评估模型

_, acc = model.evaluate(testX, testY, verbose=0)

print('> %.3f' % (acc * 100.0))

# 入口点，运行测试工具

run_test_harness()

运行示例将加载保存的模型并在保留测试数据集上评估模型。

计算并打印模型在测试数据集上的分类准确率。

注意：由于算法或评估程序的随机性，或者数值精度的差异，您的结果可能有所不同。请考虑运行示例几次并比较平均结果。

在这种情况下，我们可以看到模型达到了 90.990% 的准确率，或者分类错误略低于 10%，这还不错。

> 90.990

> 90.990

进行预测

我们可以使用保存的模型对新图像进行预测。

该模型假定新图像是灰度图，它们已被分割，使得一个图像包含黑色背景上的一个居中衣物，并且图像大小是正方形，大小为 28×28 像素。

下面是从 MNIST 测试数据集中提取的一张图像。您可以将其保存在当前工作目录中，文件名设置为 `sample_image.png`。

Sample Clothing (Pullover)

Download Image (sample_image.png)

我们将这张图片假定为全新的、未见过的图片，并以所需方式进行准备，看看如何使用保存的模型来预测图像所代表的整数。在此示例中，我们预期是类别“*2*”代表“*Pullover*”（也称为套头衫）。

首先，我们可以加载图像，强制其为灰度格式，并强制其大小为 28×28 像素。然后可以调整加载的图像大小，使其具有单个通道并代表数据集中的单个样本。`load_image()` 函数实现了这一点，并将返回已加载的、可用于分类的图像。

重要的是，像素值的准备方式与拟合最终模型时训练数据集的像素值准备方式相同，在本例中为归一化。

# load and prepare the image
def load_image(filename):
	# load the image
	img = load_img(filename, grayscale=True, target_size=(28, 28))
	# convert to array
	img = img_to_array(img)
	# reshape into a single sample with 1 channel
	img = img.reshape(1, 28, 28, 1)
	# prepare pixel data
	img = img.astype('float32')
	img = img / 255.0
	return img

# 加载并准备图像

def load_image(filename):

# 加载图像

img = load_img(filename, grayscale=True, target_size=(28, 28))

# 转换为数组

img = img_to_array(img)

# reshape into a single sample with 1 channel

img = img.reshape(1, 28, 28, 1)

# 准备像素数据

img = img.astype('float32')

img = img / 255.0

return img

接下来，我们可以像上一节一样加载模型，然后调用 `predict_classes()` 函数来预测图像中的衣物。

# predict the class
result = model.predict_classes(img)

1 2	# 预测类别 result = model.predict_classes(img)

完整的示例如下所示。

# make a prediction for a new image.
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.models import load_model

# load and prepare the image
def load_image(filename):
	# load the image
	img = load_img(filename, grayscale=True, target_size=(28, 28))
	# convert to array
	img = img_to_array(img)
	# reshape into a single sample with 1 channel
	img = img.reshape(1, 28, 28, 1)
	# prepare pixel data
	img = img.astype('float32')
	img = img / 255.0
	return img

# load an image and predict the class
def run_example():
	# load the image
	img = load_image('sample_image.png')
	# load model
	model = load_model('final_model.h5')
	# predict the class
	result = model.predict_classes(img)
	print(result[0])

# entry point, run the example
run_example()

# 为新图像进行预测。

from keras.preprocessing.image import load_img

from keras.preprocessing.image import img_to_array

from keras.models import load_model

# 加载并准备图像

def load_image(filename):

# 加载图像

img = load_img(filename, grayscale=True, target_size=(28, 28))

# 转换为数组

img = img_to_array(img)

# reshape into a single sample with 1 channel

img = img.reshape(1, 28, 28, 1)

# 准备像素数据

img = img.astype('float32')

img = img / 255.0

return img

# 加载图像并预测类别

def run_example():

# 加载图像

img = load_image('sample_image.png')

# 加载模型

model = load_model('final_model.h5')

# 预测类别

result = model.predict_classes(img)

print(result[0])

# 入口点，运行示例

run_example()

运行示例后，会先加载并准备图像，加载模型，然后正确预测加载的图像代表的是套头衫或类别“2”。

2

扩展

本节列出了一些您可能希望探索的扩展本教程的想法。

正则化。探索正则化如何影响模型性能，与基线模型相比，例如权重衰减、提前停止和 Dropout。
调整学习率。探索不同的学习率如何影响模型性能，与基线模型相比，例如 0.001 和 0.0001。
调整模型深度。探索为模型添加更多层如何影响模型性能，与基线模型相比，例如另一组卷积和池化层，或分类器部分中的另一个密集层。

如果您探索了这些扩展中的任何一个，我很想知道。
请在下面的评论中发布您的发现。

进一步阅读

如果您想深入了解，本节提供了更多关于该主题的资源。

API

文章

Fashion-MNIST GitHub Repository

总结

在本教程中，您从头开始学习了如何为服装分类开发卷积神经网络。

具体来说，你学到了：

如何开发一个测试工具，以对模型进行稳健评估，并为分类任务建立性能基线。
如何探索基线模型的扩展以改进学习和模型容量。
如何开发一个最终模型，评估最终模型的性能，并使用它对新图像进行预测。

你有什么问题吗？
在下面的评论中提出你的问题，我会尽力回答。

How to Develop a CNN From Scratch for CIFAR-10 Photo Classification

71 Responses to Deep Learning CNN for Fashion-MNIST Clothing Classification

Alvaro del Castillo May 11, 2019 at 9:02 pm #

Jason，这是一个非常出色的教程。我跟着学了全部内容，现在很多神奇的东西都变成了关于驱动所有深度学习的机制的真实知识。另外，非常感谢你在 Kaggle 上的课程。你值得我花钱，所以你的书很快就会出现在我的虚拟书架上。

您的代码有任何许可证策略吗？我很想将其用作起点，因此如果我可以用开源许可证（MIT、Apache 或 GPL）重新使用它，我将以此为基础。

回复
- Jason Brownlee May 12, 2019 at 6:41 am #
  
  谢谢，很高兴对您有帮助。
  
  是的，您可以在您的项目中使用该代码，我在这里有更详细的说明
  https://machinelearning.org.cn/faq/single-faq/can-i-use-your-code-in-my-own-project
  
  回复
Aditya Janaswamy June 13, 2019 at 8:58 pm #

在您构建基线模型的章节中，`evaluate_model()` 函数将模型作为参数。这样做会导致在所有 k 折分割中使用相同的模型，每次都对前一次迭代产生的权重进行训练。这也是为什么我们看到模型在 5 次迭代中的准确率稳步提高的原因。

根据我对 k 折交叉验证的理解，模型应该为每一次迭代的新训练数据和测试数据重新训练。否则，我们就有过拟合的风险，因为模型最终会学习所有提供的数据，而实际上并没有训练模型 5 次。

回复
- Jason Brownlee June 14, 2019 at 6:43 am #
  
  哎呀，这看起来像个 bug！谢谢。
  
  我会安排时间修复它。
  
  回复
  - Aditya Janaswamy June 14, 2019 at 7:27 pm #
    
    可以理解，修复 bug 将消除 k 折交叉验证中获得的 99% 以上的准确率。然而，获得的准确率现在与在测试集上获得的准确率一致。
    
    正如您所提到的，我添加了另一个卷积层，并将密集层增加了 40% 的 Dropout，准确率达到了 92.3%。
    
    将填充从 valid 更改为 same 导致准确率略有下降，而不是提高（90.81% -> 90.05%）。虽然我没有通过再次运行来重现这个结果，但在我看来，这并没有显著改变结果。
    
    回复
    - Jason Brownlee June 15, 2019 at 6:28 am #
      
      干得好！
      
      更新：我已修复教程中的 bug 并更新了结果。
      
      回复
Muhammad khalid June 17, 2019 at 10:56 pm #

有完整的代码链接吗？谢谢

回复
- Jason Brownlee June 18, 2019 at 6:40 am #
  
  是的，完整的代码直接在上面的教程中提供。
  
  你到底遇到了什么问题？
  
  回复
Ludwik August 25, 2019 at 12:27 pm #

我最近遇到了一个大问题。
我的 CNN 和 MLP 模型出现了奇怪的过拟合现象；
在训练过程中，验证损失增加，但验证准确率也增加。
最终我得到的模型的得分比相应的不过拟合模型要好……
这是期望的行为吗？最终重要的是预测，而不是损失函数的结果……

回复
- Jason Brownlee August 26, 2019 at 6:08 am #
  
  这很有趣。
  
  这可能是由于数据集太小且不具代表性，或者问题过于简单。
  
  也许这篇帖子会有帮助
  https://machinelearning.org.cn/learning-curves-for-diagnosing-machine-learning-model-performance/
  
  回复
Khalid October 2, 2019 at 3:49 am #

谢谢，伙计，写得真好。

我有一个问题，你能推荐一些其他的模型吗？例如，在分类之后，使用一些定制的算法进行推荐，这些算法使用单个输入。

谢谢

回复
- Jason Brownlee October 2, 2019 at 8:04 am #
  
  抱歉，我不明白，你能详细说明一下吗？
  
  回复
  - khalid October 7, 2019 at 12:08 am #
    
    嗨
    你能将这段代码合并到你的模型中吗？例如，对于预测，你能添加 knn、cosine 或其他模块吗？
    
    谢谢
    https://github.com/samgrassi01/Cosine-Similarity-Classifier/blob/master/Superior%20K-NN%20Classifier.ipynb
    
    回复
    - Jason Brownlee October 7, 2019 at 8:29 am #
      
      抱歉，我没有能力为你合并代码。
      
      回复
Terance November 13, 2019 at 2:17 pm #

在 `evaluate_model` 部分，语句 `model = define_model()` 应该移到 `for train_ix, test_ix in kfold.split(dataX):` 之前。因此，模型准确率应该是

> 90.842
> 94.500
> 97.117
> 99.317
> 99.908

回复
- Jason Brownlee 2019年11月14日上午7:56 #
  
  不，模型必须在每次评估时重新定义（重新初始化）。
  
  否则，评估将继续从最后一组权重进行。
  
  回复
  - Thenerd 2021年2月5日上午3:42 #
    
    Jason，如果你为每次迭代重新定义模型，那么在最终的测试数据上使用的是哪个模型？我只看到一个针对整个训练数据的model.fit，但这是哪个模型？我们是不是不应该重新定义模型，然后对整个训练数据进行model.fit，然后再应用于测试数据。
    
    回复
    - Jason Brownlee 2021年2月5日上午5:48 #
      
      最终模型可以拟合所有数据并用于对新数据进行预测，更多关于最终模型的信息请参见此处
      https://machinelearning.org.cn/train-final-machine-learning-model/
      
      回复
Rachana 2019年11月13日下午6:22 #

我遇到了KeyError: ‘accuracy’
在执行summarize_diagnostics时

回复
- Jason Brownlee 2019年11月14日上午7:58 #
  
  听到这个消息我很难过，我这里有一些建议可能有所帮助
  https://machinelearning.org.cn/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  回复
  - Amrith Ravindra 2019年12月19日上午10:26 #
    
    嗨，Jason，
    
    我认为这个bug源于你在‘summarize_diagnostics()’函数中的以下代码行使用了键“accuracy”而不是“acc”
    
    pyplot.plot(histories[i].history[‘accuracy’], color=’blue’, label=’train’)
    
    实际上，它应该是
    
    pyplot.plot(histories[i].history[‘acc’], color=’blue’, label=’train’)
    
    我说的对吗？
    因为你在前面的函数中已经将键“acc”附加到了你的histories列表中。
    
    回复
    - Jason Brownlee 2019年12月19日下午12:50 #
      
      不，Keras 2.3中的API发生了变化。我建议更新到最新版本的Keras。
      
      回复
Yansen Xiao 2019年11月25日上午7:04 #

Jason，这是一个很棒的教程。我对机器学习是新手，但我能够按照你描述的成功运行一切。我花了30分钟运行#带有填充卷积的CNN模型用于fashion mnist数据集或#带有双倍过滤器的CNN模型用于fashion mnist数据集，但只用了几秒钟运行final_model.h5。我使用的是Windows 10和Intel 4核CPU。我测试了mnist的一些图像，一切都运行得很完美。然而，当我使用我自己的相似物品的图像，给定一个黑色背景，准确率非常低。有什么建议吗？（我也是机器学习和编码的新手）。

回复
- Jason Brownlee 2019年11月25日下午2:05 #
  
  谢谢！做得好。
  
  也许你的图像与训练期间使用的图像差异太大？
  
  回复
Tarandeep 2019年12月5日上午11:35 #

Jason，感谢你的教程！
一个问题是，是否可以使用这个Fashion MNIST数据集来查找其他图像中的物体（T恤/连衣裙/鞋子…）？例如，如果我有服装网站的库存照片，那么是否有可能训练一个神经网络来查找这些库存照片中的鞋子？

回复
- Jason Brownlee 2019年12月5日下午1:20 #
  
  并不是真的。
  
  我推荐使用目标识别算法
  https://machinelearning.org.cn/object-recognition-with-deep-learning/
  
  回复
Tarandeep Singh 2019年12月7日上午4:31 #

谢谢，我会仔细研究的。

回复
- Jason Brownlee 2019年12月7日上午5:41 #
  
  不客气。
  
  回复
Adeel H 2019年12月11日上午7:25 #

你好 Jason，

我正在做一个视觉搜索引擎，所以给定一张图片，我需要找到视觉上相似的图片。我只使用了预训练模型vgg16来生成图像向量，并使用nmslib来计算分数。我的图片种类繁多，从艺术品到产品图片、漫画等。我发现当我搜索运动鞋时，会出现不相关的物品。用这些时尚图像重新训练现有的vgg16模型，以提高对时尚物品的敏感度，是否合理，因为vgg16可能没有足够的这些图像？但我发现，一张包含很多内容的尺寸较大的图像（例如，模特穿着运动鞋跑步）的向量，可能与28像素的训练图像的向量非常不同……或者我遗漏了什么基本概念？

从你上面给Tarandeep的回答来看，这似乎不是要采取的方法，而是像你建议的那样，我应该看看图片中是否有运动鞋，以及整张图片是否是运动鞋。

如果你在另一篇文章或课程中涵盖了相关主题，任何链接都有帮助，或者看起来我应该直接买你的书。

感谢任何输入

回复
- Jason Brownlee 2019年12月11日下午1:37 #
  
  听起来是一个很棒的项目！
  
  尝试一个新的训练为自编码器的模型，或者一个分类模型可能会很有趣。
  
  尝试其他预训练模型也可能很有趣。
  
  回复
Adeel M Hasan 2019年12月12日上午5:49 #

感谢关于自编码器的建议，我将尝试一下。

除了Keras中包含的那些知名预训练模型之外，您知道还有哪些值得尝试的吗？我发现Clarifai.com上的通用模型非常好，但它的API对于我们的量来说很昂贵。我怀疑即使是商业级别的（除了API服务）也没有那么好的东西。

总的来说，您建议我如何在进行最近邻搜索时将图像向量数据与一些其他元数据结合起来？元数据可以是用户为图像分配的类别，以及提取的对象。您的书籍中是否涵盖了类似内容？

回复
- Jason Brownlee 2019年12月12日上午6:34 #
  
  这听起来是个不错的方法。
  
  不，我在任何教程中都没有这个确切的问题。
  
  回复
sush kulk 2020年3月27日上午12:28 #

Fashion MNIST Cnn 在 Pycharm 中？

回复
- Jason Brownlee 2020年3月27日上午6:14 #
  
  我建议不要使用IDE，而是使用文本编辑器，原因如下
  https://machinelearning.org.cn/faq/single-faq/why-dont-use-or-recommend-notebooks
  
  回复
Jyoti 2020年4月5日下午10:21 #

我正在尝试通过迁移学习（VGG 16）进行训练，输入图像尺寸为28×28，我该如何将其更改为224,224,64？或者我能否以某种方式更改vgg16模型的输入层尺寸？我无法使用model.fit_generator_from_directory，因为它需要训练图像和测试图像的路径……然而，我直接从Keras访问了数据，并通过load_data访问了它！您能否帮助我如何更改输入图像的形状或修改vgg16的输入层？

回复
- Jason Brownlee 2020年4月6日上午6:05 #
  
  是的，您可以更改预训练VGG模型的输入形状，也许可以从这里开始
  https://machinelearning.org.cn/how-to-use-transfer-learning-when-developing-convolutional-neural-network-models/
  
  回复
Preethi 2020年5月7日下午12:09 #

在这个教程中，您使用的是哪种CNN模型？（LeNet、ResNet、VGG Net……）

回复
- Jason Brownlee 2020年5月7日下午1:31 #
  
  从零开始开发的自定义CNN模型。
  
  回复
Tushar 2020年7月22日下午7:56 #

该模型在从网上下载的非灰度图像上的表现非常糟糕。它们是彩色图像，我需要模型来分类这些图像。你能帮帮我吗？

回复
- Jason Brownlee 2020年7月23日上午6:05 #
  
  该模型是为灰度图像设计的。
  
  如果处理彩色图像，您将需要一个不同的模型，例如
  https://machinelearning.org.cn/how-to-develop-a-cnn-from-scratch-for-cifar-10-photo-classification/
  
  回复
lkforward 2020年7月27日上午7:09 #

嗨，Jason，

感谢您的帖子！

打印模型准确率时可能有一个拼写错误：它不是以百分比表示，所以不需要*100。这是我对函数进行的修改

# 总结模型性能
def summarize_performance(scores)
# print summary
print(f”Accuracy: mean={np.mean(scores): .3f} std={np.std(scores): .3f}, n={len(scores)}”)
# box and whisker plots of results
pyplot.boxplot(scores)
pyplot.title(“Model Accuracy on Validation Set”)
pyplot.ylabel(“Accuracy”)
pyplot.show()

回复
- Jason Brownlee 2020年7月27日下午1:04 #
  
  我们正在打印分类准确率，它是一个比率，可以乘以100得到百分比。
  
  你为什么说有拼写错误？
  
  回复
TGS 2020年8月1日下午7:45 #

一如既往，这是一个很棒的教程，Jason。

我在load_img()函数时遇到问题。我一直收到错误消息：“ValueError: cannot reshape array of size 2352 into shape (28,28,1)”，尽管设置了grayscale='True'并且还使用了color_mode='gray'，这似乎是更新的设置。这意味着，图像在加载时具有形状[28,28,3]。

最后，我不得不在将其转换为数组后使用img=img[:,:,0]，然后才能工作。不确定我哪里出错了，以及其他人是否也遇到了同样的问题。

代码如下

new= img_to_array(new)
new=new[:,:,0]
new= new.reshape(28, 28, 1)
new= new.astype(‘float32’) / 255.0

此外，我尝试了外部灰度图像，这些图像是从谷歌图片中随机挑选的，即套头衫、运动鞋等。模型在预测这些图像方面非常糟糕。

回复
- Jason Brownlee 2020年8月2日上午5:42 #
  
  听到您遇到这个问题，我很遗憾，这或许能帮到您。
  https://machinelearning.org.cn/faq/single-faq/why-does-the-code-in-the-tutorial-not-work-for-me
  
  回复
Preethi 2020年9月6日下午3:36 #

很棒的教程，
您能否帮助我使用RNN架构来实现这一点？

回复
- Jason Brownlee 2020年9月7日上午8:25 #
  
  RNN不适用于此数据集，请看这里
  https://machinelearning.org.cn/when-to-use-mlp-cnn-and-rnn-neural-networks/
  
  回复
SK 2020年11月18日上午7:45 #

我正在研究一个时尚连衣裙模块，我想在我的模块中使用fashion mnist数据集的权重，并将其与resnet或inceptionV3等架构一起使用。我想知道是否有办法做到这一点。如果可以，请给我一些代码，我是深度学习新手。任何帮助都将非常感激！

回复
- Jason Brownlee 2020年11月18日下午1:07 #
  
  这应该是可行的，尽管您必须重新定义输入层。
  
  这会有帮助
  https://machinelearning.org.cn/how-to-use-transfer-learning-when-developing-convolutional-neural-network-models/
  
  回复
SK 2020年11月18日下午3:12 #

inception = InceptionV3(input_shape=IMAGE_SIZE + [3], weights=’fashion_mnist’, include_top=False), 这样做可以吗？
还有，fashion_mnist图像是灰度的，而我的数据集图像是彩色的，这会成为问题吗？
我找不到fashion_mnist权重并导入到我的模块中，请帮忙。

回复
- Jason Brownlee 2020年11月19日上午7:42 #
  
  您必须进行测试才能找出答案。
  
  回复
Szymon 2021年1月7日下午11:47 #

嗨，为什么你在K-Fold交叉验证中使用相同的验证集和测试集？这难道不是一个很大的错误吗？你甚至没有使用10000张测试集图片。我说得对吗？

回复
- Jason Brownlee 2021年1月8日上午5:46 #
  
  这是我在这里回答的一个常见问题
  https://machinelearning.org.cn/faq/single-faq/why-do-you-use-the-test-dataset-as-the-validation-dataset
  
  回复
Shobi 2021年3月13日下午1:21 #

嗨，Jason，

非常感谢您的精彩文章。

您能否指导我关于增加图像尺寸的问题？如果准确率在提高，增加图像尺寸是否是个好主意？该数据集报告的准确率为95%，但使用原始尺寸32。如果我通过调整图像大小获得相同的准确率呢？这种方法值得吗？我正在使用迁移学习模型，有些模型需要最小96的图像尺寸，所以这就是为什么需要增加图像尺寸……您如何看待调整图像大小的方法？

谢谢！

回复
- Jason Brownlee 2021年3月14日上午5:23 #
  
  不客气。
  
  这将有助于调整图像大小
  https://machinelearning.org.cn/how-to-load-and-manipulate-images-for-deep-learning-in-python-with-pil-pillow/
  
  回复
Ziar 2021年7月17日上午4:28 #

您有这个流程图吗？

回复
- Jason Brownlee 2021年7月17日上午5:25 #
  
  不行。
  
  回复
khaled mohamed 2021年8月4日上午2:10 #

感谢这篇有趣的论文，我想问您一个问题，我想创建一个计算机视觉模型，其中模型首先接收用户的图片并进行分析，输出是评估用户着装的评分（满分100），以及颜色搭配是否合适。我想在机器学习和深度学习中创建这个模型。
由于我是初学者，您会建议我学习什么来构建这个模型？

回复
- Jason Brownlee 2021年8月4日上午5:15 #
  
  也许您可以从现有的目标检测模型开始，并针对您的需求进行调整。
  
  回复
  - khaled mohamed 2021年8月5日上午5:10 #
    
    这篇文章中的模型不这样做吗？？
    
    回复
sarya 2021年9月7日下午11:59 #

通过编写以下代码：trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix], history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0) and _, acc = model.evaluate(testX, testY, verbose=0)我们正在为验证和评估使用相同的数据集，我说的对吗？我尝试了这段代码，当我运行它时，我得到了相同的值，这是这些代码行的目的吗？还是两个数据集应该不同（正如我怀疑的）？

回复
- Adrian Tam 2021年9月8日上午1:59 #
  
  是的，您说得对。这就是为什么人们会使用train-test-validation split。
  
  回复
Trent 2022年10月11日上午9:18 #

嗨，刚看到这个教程，谢谢！
我在尝试遵循教程时遇到了一个问题，pyplot.show()出现语法错误

for i in range(9)
… pyplot.subplot(330 + 1 + i)
… pyplot.imshow(trainX[i], cmap=pyplot.get_cmap(‘gray’))
… pyplot.show()
文件“”，第4行
pyplot.show()
^
SyntaxError: invalid syntax

我甚至复制粘贴了代码，但错误仍然是相同的。
我做错了什么？

回复
- James Carmichael 2022年10月12日上午8:00 #
  
  您好 Trent…非常欢迎！我建议您手动输入代码以避免复制粘贴问题。此外，您可能还想尝试在Google Colab中输入代码。让我们知道您发现了什么！
  
  回复
Bruno 2023年6月26日上午8:03 #

你好，
我正在做一个学校项目，需要比较我在MLP中为Fashion MNIST使用1、2和3个隐藏层时获得的结果。根据我对您的代码的理解，在定义模型时只需重复以下行两次

model.add(Conv2D(64, (3, 3), padding = ‘same’,activation=’relu’, kernel_initializer=’he_uniform’, input_shape=(28, 28, 1)))

我说的对吗？

感谢您的帮助！

回复
- James Carmichael 2023年6月26日上午8:35 #
  
  您好 Bruno…请继续您的想法，并让我们知道您的发现。
  
  回复
no_name 2023年8月1日上午2:16 #

如果我想用这个数据集之外的图像进行测试，我该怎么做？

回复
- James Carmichael 2023年8月1日上午9:08 #
  
  您好 no_name…模型训练完成后，您可以使用“验证”数据集对其进行验证。
  
  https://machinelearning.org.cn/difference-test-validation-datasets/
  
  回复
  - no_name 2023年8月1日下午8:41 #
    
    是否有任何方法可以分类Fashion MNIST数据集之外的图像？我已经尝试了数据集中的图像，并取得了相对不错的结果。但是，对于数据集之外的图像，模型似乎效果不佳。
    
    回复
no_name 2023年8月1日下午11:29 #

当我使用这个模型测试Fashion MNIST数据集内的图像时，结果相对准确。然而，当我使用数据集外的图像时，模型似乎无法识别结果。有什么方法可以克服这个问题吗？

回复
Maya 2024年2月4日下午6:46 #

嗨 Jason，感谢您提供的清晰简洁的教程。它帮助我清楚地了解了CNN模型。我正在研究一个CNN-SNN架构，其中CNN（带有3个CNN层）在MNIST数据集上进行训练，然后替换中间的CNN层为SNN层，并通过冻结第一层再次进行训练。您是否有任何教程或文章能帮助我解决这种类型的问题，因为我是这方面的新手，找不到任何相关资源。

回复
- James Carmichael 2024年2月5日上午7:45 #
  
  嗨 Maya…非常欢迎！以下资源提供了对这类网络实际应用的深刻见解。
  
  https://www.frontiersin.org/articles/10.3389/fnins.2022.759900/full
  
  回复

导航

用于 Fashion-MNIST 服装分类的深度学习 CNN

教程概述

想通过深度学习实现计算机视觉成果吗？

Fashion MNIST 服装分类

模型评估方法

如何开发基线模型

加载数据集

准备像素数据

定义模型

评估模型

5. 呈现结果

完整示例

如何开发改进模型

Padding Convolutions

Increasing Filters

如何完成模型并进行预测

保存最终模型

评估最终模型

进行预测

扩展

进一步阅读

API

文章

总结

立即开发用于视觉的深度学习模型！

在几分钟内开发您自己的视觉模型

最终将深度学习引入您的视觉项目

关于此主题的更多信息

71 Responses to Deep Learning CNN for Fashion-MNIST Clothing Classification

Leave a Reply Click here to cancel reply.