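Building a Softmax Classifier for Images in PyTorch

By Muhammad Asad Iqbal Khan on April 8, 2023 in Deep Learning with PyTorch

A softmax classifier is a type of classifier used in supervised learning. It is an important building block of deep learning networks and a very popular choice among deep learning practitioners for classification.

A softmax classifier is suited to multiclass classification: it outputs a probability for each class.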
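
To make "a probability for each class" concrete, here is a minimal sketch of the softmax function in plain Python. The three scores are made-up values for illustration, not outputs from the model in this tutorial.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtracting the maximum leaves the result unchanged
    # but avoids overflow in exp() for large scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]   # hypothetical raw scores for three classes
probs = softmax(scores)
print(probs)               # the largest score gets the largest probability
print(sum(probs))
```

The class with the highest score always receives the highest probability, so picking the largest raw score and picking the largest probability give the same prediction.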

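This tutorial will teach you how to build a softmax classifier for image data. You will learn how to prepare the dataset, and then how to implement a softmax classifier using PyTorch. In particular, you will learn:

- About the Fashion-MNIST dataset.
- How you can use a softmax classifier for images in PyTorch.
- How to build and train a multiclass image classifier in PyTorch.
- How to plot the results after model training.

Kick-start your project with my book Deep Learning with PyTorch. It provides self-study tutorials with working code.

Let's get started.

Building a Softmax Classifier for Images in PyTorch.
Photo by Joshua J. Cotten. Some rights reserved.

Overview

This tutorial is in three parts; they are:

- Preparing the Dataset
- Building the Model
- Training the Model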

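Preparing the Dataset

The dataset you will use here is Fashion-MNIST. It is a preprocessed and well-organized dataset of 70,000 images, with 60,000 images for training and 10,000 images for testing.

Each sample in the dataset is a $28 \times 28$ pixel grayscale image, for a total of 784 pixels. The dataset has 10 classes; each image is labeled as a fashion item and associated with an integer label from 0 to 9.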
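
As a small illustration of these numbers, the sketch below flattens a stand-in image into 784 values and maps an integer label to a class name. The tensor of zeros is a placeholder for a real image, and the name list follows the label order published with Fashion-MNIST.

```python
import torch

# Class names in label order 0..9, as published with Fashion-MNIST.
CLASS_NAMES = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

image = torch.zeros(1, 28, 28)   # placeholder for one grayscale image
flat = image.view(-1)            # one row of 28 * 28 = 784 pixel values

print(flat.shape)
print(CLASS_NAMES[9])            # name for integer label 9
```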

This dataset can be loaded from torchvision. To make training faster, we limit the dataset to 4000 samples.

from torchvision import datasets

train_data = datasets.FashionMNIST('data', train=True, download=True)
train_data = list(train_data)[:4000]
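
Note that `list(train_data)` reads all 60,000 samples before the slice keeps the first 4000. If that is too slow, one alternative is `torch.utils.data.Subset`, which only fetches items when they are accessed. The sketch below uses a small toy dataset as a stand-in so that it runs without a download; it is not the code used in the rest of this tutorial.

```python
import torch
from torch.utils.data import Subset, TensorDataset

# Toy stand-in for the real dataset: 100 fake 28x28 images with labels 0..9.
full = TensorDataset(torch.zeros(100, 1, 28, 28), torch.arange(100) % 10)

# Keep only the first 40 samples; items are fetched lazily on access.
small = Subset(full, range(40))

print(len(small))
image, label = small[0]
print(image.shape, label)
```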

The first time you fetch the Fashion-MNIST dataset, you will see torchvision download it from the internet and save it to a local directory named data.

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz
  0%|          | 0/26421880 [00:00<?, ?it/s]
Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz
  0%|          | 0/29515 [00:00<?, ?it/s]
Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz
  0%|          | 0/4422102 [00:00<?, ?it/s]
Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz
  0%|          | 0/5148 [00:00<?, ?it/s]
Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw

The train_data above is a list of tuples, each holding an image (as a PIL image object) and an integer label.

Let's plot the first 10 images in the dataset with matplotlib.

import matplotlib.pyplot as plt

# plot the first 10 images in the training data
for i, (img, label) in enumerate(train_data[:10]):
    plt.subplot(4, 3, i+1)
    plt.imshow(img, cmap="gray")

plt.show()

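You should see the first ten images drawn in a grid.

PyTorch needs the data as PyTorch tensors. You therefore need to convert it by applying a transform, namely ToTensor() from the torchvision transforms. The torchvision dataset API can apply this transform transparently.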

from torchvision import datasets, transforms

# download and apply the transform
train_data = datasets.FashionMNIST('data', train=True, download=True, transform=transforms.ToTensor())
train_data = list(train_data)[:4000]

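Before building the model, let's also split the data into training and validation sets, with the first 3500 images for training and the rest for validation. Normally we would shuffle the data before splitting, but we skip that step here to keep the code short.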

# splitting the dataset into train and validation sets
train_data, val_data = train_data[:3500], train_data[3500:]
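
If you do want a shuffled split, one option is `torch.utils.data.random_split`. The sketch below is an alternative to the slicing above, not what the rest of this tutorial uses; the dataset is a toy stand-in and the seed value 42 is arbitrary.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Toy stand-in for the 4000 samples used in this tutorial.
data = TensorDataset(torch.zeros(4000, 1, 28, 28), torch.arange(4000) % 10)

# A seeded generator makes the shuffled split reproducible.
generator = torch.Generator().manual_seed(42)
train_part, val_part = random_split(data, [3500, 500], generator=generator)

print(len(train_part), len(val_part))
```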

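Building the Model

To build a custom softmax module for image classification, we will use nn.Module from the PyTorch library. To keep things simple, we build a model with just one layer.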

import torch

# build custom softmax module
class Softmax(torch.nn.Module):
    def __init__(self, n_inputs, n_outputs):
        super().__init__()
        self.linear = torch.nn.Linear(n_inputs, n_outputs)

    def forward(self, x):
        pred = self.linear(x)
        return pred
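
Note that forward() returns the raw scores of the linear layer (often called logits), not probabilities. As a sketch of how you could turn them into probabilities when you need them, the example below passes four random stand-in images through the same model and applies `torch.softmax`. The class is repeated only so that the snippet runs on its own.

```python
import torch

class Softmax(torch.nn.Module):
    """Same single-layer model as above, repeated so this snippet is self-contained."""
    def __init__(self, n_inputs, n_outputs):
        super().__init__()
        self.linear = torch.nn.Linear(n_inputs, n_outputs)

    def forward(self, x):
        return self.linear(x)

model = Softmax(784, 10)
batch = torch.rand(4, 1, 28, 28)            # 4 random stand-in images
logits = model(batch.view(-1, 28 * 28))     # raw scores, shape (4, 10)
probs = torch.softmax(logits, dim=1)        # each row now sums to 1

print(logits.shape)
print(probs.sum(dim=1))
print(probs.argmax(dim=1))                  # predicted class index per image
```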

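Now, let's instantiate the model object. It takes a one-dimensional vector of 784 values as input and predicts one of 10 classes. Let's also check how the parameters are initialized.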

# call Softmax Classifier
model_softmax = Softmax(784, 10)
print(model_softmax.state_dict())

The weights are randomly initialized, so your values will differ, but the structure of the output should look like this:

OrderedDict([('linear.weight',
              tensor([[-0.0344,  0.0334, -0.0278,  ..., -0.0232,  0.0198, -0.0123],
                      [-0.0274, -0.0048, -0.0337,  ..., -0.0340,  0.0274, -0.0091],
                      [ 0.0078, -0.0057,  0.0178,  ..., -0.0013,  0.0322, -0.0219],
                      ...,
                      [ 0.0158, -0.0139, -0.0220,  ..., -0.0054,  0.0284, -0.0058],
                      [-0.0142, -0.0268,  0.0172,  ...,  0.0099, -0.0145, -0.0154],
                      [-0.0172, -0.0224,  0.0016,  ...,  0.0107,  0.0147,  0.0252]])),
             ('linear.bias',
              tensor([-0.0156,  0.0061,  0.0285,  0.0065,  0.0122, -0.0184, -0.0197,  0.0128,
                       0.0251,  0.0256]))])
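
The printed tensors are truncated, so it can help to check the shapes directly. The sketch below creates the same kind of layer on its own and counts its parameters. The bound of 1/sqrt(784) reflects the default initialization of nn.Linear in recent PyTorch versions as far as I know; treat it as an assumption rather than a guarantee.

```python
import torch

linear = torch.nn.Linear(784, 10)   # the same kind of layer the model wraps

for name, param in linear.named_parameters():
    print(name, tuple(param.shape))

n_params = sum(p.numel() for p in linear.parameters())
print(n_params)                     # 784 * 10 weights + 10 biases

# Assumed default init range: uniform in [-1/sqrt(784), 1/sqrt(784)].
bound = 1 / 784 ** 0.5
print(bound, linear.weight.abs().max().item())
```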

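Training the Model

We will train the model with stochastic gradient descent and cross-entropy loss, with the learning rate fixed at 0.01. To make training easier, we also wrap the training and validation sets in data loaders with a batch size of 16.

from torch.utils.data import DataLoader

# define loss, optimizer, and dataloader for train and validation sets
optimizer = torch.optim.SGD(model_softmax.parameters(), lr=0.01)
criterion = torch.nn.CrossEntropyLoss()
batch_size = 16
train_loader = DataLoader(dataset=train_data, batch_size=batch_size)
val_loader = DataLoader(dataset=val_data, batch_size=batch_size)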
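
The model has no explicit softmax layer because PyTorch's CrossEntropyLoss expects raw scores and applies log-softmax internally. The sketch below checks this on random stand-in scores and labels by comparing the loss with the two-step computation.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(5, 10)              # stand-in model outputs for 5 samples
labels = torch.tensor([0, 3, 9, 1, 7])   # stand-in integer class labels

loss_ce = torch.nn.CrossEntropyLoss()(logits, labels)

# Same value in two explicit steps: log-softmax, then negative log-likelihood.
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

print(loss_ce.item(), loss_manual.item())
```

Adding a softmax layer to the model and then using CrossEntropyLoss would apply the normalization twice, which is why the linear output is passed to the loss directly.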

Now, let's put everything together and train our model for 200 epochs.

epochs = 200
Loss = []
acc = []
for epoch in range(epochs):
    for i, (images, labels) in enumerate(train_loader):
        optimizer.zero_grad()
        outputs = model_softmax(images.view(-1, 28*28))
        loss = criterion(outputs, labels)
        # Loss.append(loss.item())
        loss.backward()
        optimizer.step()
    Loss.append(loss.item())
    correct = 0
    for images, labels in val_loader:
        outputs = model_softmax(images.view(-1, 28*28))
        _, predicted = torch.max(outputs.data, 1)
        correct += (predicted == labels).sum()
    accuracy = 100 * (correct.item()) / len(val_data)
    acc.append(accuracy)
    if epoch % 10 == 0:
        print('Epoch: {}. Loss: {}. Accuracy: {}'.format(epoch, loss.item(), accuracy))
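
The validation pass above works, but it still tracks gradients that are never used. As a sketch of a common alternative, the hypothetical helper `evaluate` below wraps the same accuracy computation in `torch.no_grad()`. It is tested here on random toy data with an untrained linear layer, so the printed accuracy is meaningless beyond showing that the function runs.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader):
    """Return accuracy in percent over all samples in `loader` (hypothetical helper)."""
    model.eval()                      # switch off training-only behavior, if any
    correct, total = 0, 0
    with torch.no_grad():             # no gradients are needed for evaluation
        for images, labels in loader:
            outputs = model(images.view(-1, 28 * 28))
            predicted = outputs.argmax(dim=1)
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
    model.train()                     # restore training mode
    return 100 * correct / total

# Toy check with random data and an untrained linear model.
toy_model = torch.nn.Linear(784, 10)
toy_data = TensorDataset(torch.rand(64, 1, 28, 28), torch.randint(0, 10, (64,)))
accuracy = evaluate(toy_model, DataLoader(toy_data, batch_size=16))
print(accuracy)
```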

You should see the progress printed once every 10 epochs.

Epoch: 0. Loss: 1.0223602056503296. Accuracy: 67.2
Epoch: 10. Loss: 0.5806267857551575. Accuracy: 78.4
Epoch: 20. Loss: 0.5087125897407532. Accuracy: 81.2
Epoch: 30. Loss: 0.46658074855804443. Accuracy: 82.0
Epoch: 40. Loss: 0.4357391595840454. Accuracy: 82.4
Epoch: 50. Loss: 0.4111904203891754. Accuracy: 82.8
Epoch: 60. Loss: 0.39078089594841003. Accuracy: 83.4
Epoch: 70. Loss: 0.37331104278564453. Accuracy: 83.4
Epoch: 80. Loss: 0.35801735520362854. Accuracy: 83.4
Epoch: 90. Loss: 0.3443795442581177. Accuracy: 84.2
Epoch: 100. Loss: 0.33203184604644775. Accuracy: 84.2
Epoch: 110. Loss: 0.32071244716644287. Accuracy: 84.0
Epoch: 120. Loss: 0.31022894382476807. Accuracy: 84.2
Epoch: 130. Loss: 0.30044111609458923. Accuracy: 84.4
Epoch: 140. Loss: 0.29124370217323303. Accuracy: 84.6
Epoch: 150. Loss: 0.28255513310432434. Accuracy: 84.6
Epoch: 160. Loss: 0.2743147313594818. Accuracy: 84.4
Epoch: 170. Loss: 0.26647457480430603. Accuracy: 84.2
Epoch: 180. Loss: 0.2589966356754303. Accuracy: 84.2
Epoch: 190. Loss: 0.2518490254878998. Accuracy: 84.2

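As you can see, the loss decreases steadily, while the validation accuracy rises quickly and then levels off at around 84% to 85%. Keep in mind that the loss printed here is that of the last batch in each epoch, not an average over the epoch. If you use more data and train for more epochs, the accuracy may improve further. Now let's look at the plots of loss and accuracy.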
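
If you would rather track the average loss over a whole epoch than the loss of its last batch, one way is to accumulate the batch losses as you go. The sketch below shows the idea on random toy data; the model, data, and number of epochs are stand-ins and not the setup used above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
# Toy stand-ins for the model, data, loss, and optimizer of this tutorial.
model = torch.nn.Linear(784, 10)
data = TensorDataset(torch.rand(64, 784), torch.randint(0, 10, (64,)))
loader = DataLoader(data, batch_size=16)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

epoch_losses = []
for epoch in range(3):
    running, seen = 0.0, 0
    for inputs, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()
        # Weight each batch loss by its size so the result is a per-sample mean.
        running += loss.item() * labels.size(0)
        seen += labels.size(0)
    epoch_losses.append(running / seen)

print(epoch_losses)
```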

First, the loss plot:

plt.plot(Loss)
plt.xlabel("no. of epochs")
plt.ylabel("total loss")
plt.show()

The plot should show the loss decreasing over the epochs.

And this is the model accuracy plot:

plt.plot(acc)
plt.xlabel("no. of epochs")
plt.ylabel("total accuracy")
plt.show()

The plot should show the accuracy rising quickly and then leveling off.
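
If you prefer both curves in one figure, a possible sketch with two side-by-side axes follows. The short lists hold a few rounded values copied from the log above, used only as placeholders for the full Loss and acc lists, and the output file name is hypothetical.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")                 # render off-screen; remove this line to use a window
import matplotlib.pyplot as plt

# Rounded values for epochs 0, 10, 20, 30 from the log above (placeholders).
Loss = [1.02, 0.58, 0.51, 0.47]
acc = [67.2, 78.4, 81.2, 82.0]

fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))
ax_loss.plot(Loss)
ax_loss.set_xlabel("no. of epochs")
ax_loss.set_ylabel("loss")
ax_acc.plot(acc)
ax_acc.set_xlabel("no. of epochs")
ax_acc.set_ylabel("validation accuracy (%)")
fig.tight_layout()

# Hypothetical output location in the system's temporary directory.
out_path = os.path.join(tempfile.gettempdir(), "training_curves.png")
fig.savefig(out_path)
```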

Putting everything together, the following is the complete code.

import torch
import matplotlib.pyplot as plt
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# download and apply the transform
train_data = datasets.FashionMNIST('data', train=True, download=True, transform=transforms.ToTensor())
train_data = list(train_data)[:4000]

# splitting the dataset into train and validation sets
train_data, val_data = train_data[:3500], train_data[3500:]

# build custom softmax module
class Softmax(torch.nn.Module):
    def __init__(self, n_inputs, n_outputs):
        super(Softmax, self).__init__()
        self.linear = torch.nn.Linear(n_inputs, n_outputs)

    def forward(self, x):
        pred = self.linear(x)
        return pred

# call Softmax Classifier
model_softmax = Softmax(784, 10)
print(model_softmax.state_dict())

# define loss, optimizer, and dataloader for train and validation sets
optimizer = torch.optim.SGD(model_softmax.parameters(), lr = 0.01)
criterion = torch.nn.CrossEntropyLoss()
batch_size = 16
train_loader = DataLoader(dataset = train_data, batch_size = batch_size)
val_loader = DataLoader(dataset = val_data, batch_size = batch_size)

epochs = 200
Loss = []
acc = []
for epoch in range(epochs):
    for i, (images, labels) in enumerate(train_loader):
        optimizer.zero_grad()
        outputs = model_softmax(images.view(-1, 28*28))
        loss = criterion(outputs, labels)
        # Loss.append(loss.item())
        loss.backward()
        optimizer.step()
    Loss.append(loss.item())
    correct = 0
    for images, labels in val_loader:
        outputs = model_softmax(images.view(-1, 28*28))
        _, predicted = torch.max(outputs.data, 1)
        correct += (predicted == labels).sum()
    accuracy = 100 * (correct.item()) / len(val_data)
    acc.append(accuracy)
    if epoch % 10 == 0:
        print('Epoch: {}. Loss: {}. Accuracy: {}'.format(epoch, loss.item(), accuracy))

plt.plot(Loss)
plt.xlabel("no. of epochs")
plt.ylabel("total loss")
plt.show()

plt.plot(acc)
plt.xlabel("no. of epochs")
plt.ylabel("total accuracy")
plt.show()

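Summary

In this tutorial, you learned how to build a softmax classifier for image data. In particular, you learned:

- About the Fashion-MNIST dataset.
- How you can use a softmax classifier for images in PyTorch.
- How to build and train a multiclass image classifier in PyTorch.
- How to plot the results after model training.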

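4 Responses to Building a Softmax Classifier for Images in PyTorch

Dhavan Rathore January 9, 2023 at 8:41 pm #

Well written and clearly explained.

Dhavan Rathore January 9, 2023 at 8:42 pm #

Yes, well said.

Andrew October 27, 2023 at 2:41 am #

I'm a bit confused: if I'm not mistaken, this is not a softmax classifier, it is an SVM trained with BCE loss. There is no softmax layer here, that is, no actual normalization that makes the outputs of the linear layer a probability distribution. But maybe I'm wrong, and maybe the softmax layer is not required. Does anyone know?

- James Carmichael October 27, 2023 at 9:39 am #
  
  Hi Andrew... The following resource may be of help:
  
  https://machinelearning.org.cn/introduction-to-softmax-classifier-in-pytorch/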
