Introduction to Softmax Classifier in PyTorch

By Muhammad Asad Iqbal Khan on April 8, 2023 in Deep Learning with PyTorch

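While a logistic regression classifier is used for binary classification, the softmax classifier is a supervised learning algorithm that is mostly used when multiple classes are involved.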

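The softmax classifier works by assigning a probability to each class. The probabilities are normalized so that they sum to 1 across all classes, and the class with the highest probability is taken as the prediction.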

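Similarly, the softmax function transforms the outputs of the neurons into a probability distribution over the classes. It has the following properties:

It is related to the logistic sigmoid, which is used in probabilistic modeling and has similar properties.
Its outputs lie between 0 and 1, where 0 corresponds to an impossible event and 1 corresponds to an event that is certain to occur.
The softmax output for an input x can be interpreted as the predicted probability that a particular class is selected given x.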
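
As a minimal sketch of the properties listed above, the snippet below computes the softmax by hand for a made-up vector of three scores and compares it with `torch.softmax`. The variable names and the example scores are illustrative assumptions and are not part of the tutorial's model.

```python
import torch

# Hypothetical raw scores (logits) for three classes
scores = torch.tensor([2.0, 1.0, 0.1])

# Softmax by hand: exponentiate, then divide by the sum of the exponentials
manual_probs = torch.exp(scores) / torch.exp(scores).sum()

# The built-in version; dim=0 because `scores` is a 1-D tensor
builtin_probs = torch.softmax(scores, dim=0)

print(manual_probs)        # every entry lies between 0 and 1
print(builtin_probs)       # matches the hand-written version
print(manual_probs.sum())  # the entries sum to 1
```

The largest score receives the largest probability, so taking the most probable class gives the same answer as taking the largest raw score.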

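In this tutorial, we'll build a one-dimensional softmax classifier and explore its functionality. In particular, we'll learn:

How you can use a softmax classifier for multiclass classification.
How to build and train a softmax classifier in PyTorch.
How to analyze the results of the model on test data.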

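Kick-start your project with my book Deep Learning with PyTorch. It provides self-study tutorials with working code.

Let's get started.

Introduction to Softmax Classifier in PyTorch.
Photo by Julia Caesar. Some rights reserved.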

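Overview

This tutorial is in four parts; they are:

Preparing the Dataset
Loading the Dataset into DataLoader
Building the Model with nn.Module
Training the Classifier

Preparing the Dataset

First, let's build our dataset class to generate some data samples. Unlike in the previous experiments, you will generate data for multiple classes. Then you will train the softmax classifier on these data samples and later use it to make predictions on test data.

Below, we generate data for four classes based on a single input variable.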

import torch
from torch.utils.data import Dataset

class toy_data(Dataset):
    "The data for multi-class classification"
    def __init__(self):
        # single input
        self.x = torch.arange(-3, 3, 0.1).view(-1, 1)
        # multi-class output
        self.y = torch.zeros(self.x.shape[0])
        self.y[(self.x > -2.0)[:, 0] * (self.x < 0.0)[:, 0]] = 1
        self.y[(self.x >= 0.0)[:, 0] * (self.x < 2.0)[:, 0]] = 2
        self.y[(self.x >= 2.0)[:, 0]] = 3
        self.y = self.y.type(torch.LongTensor)
        self.len = self.x.shape[0]

    def __getitem__(self, idx):
        "accessing one element in the dataset by index"
        return self.x[idx], self.y[idx] 

    def __len__(self):
        "size of the entire dataset"
        return self.len
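
To make the labeling rule in the dataset class easier to follow, here is a small sketch that applies the same interval rules to five hand-picked inputs. The inputs are illustrative, and the sketch combines the boolean masks with `&` rather than `*`; for boolean tensors both act as a logical AND.

```python
import torch

# A few hand-picked inputs covering every interval (illustrative values)
x = torch.tensor([-3.0, -1.0, 0.0, 1.0, 2.5]).view(-1, 1)

# Start with every sample in class 0, then overwrite by interval
y = torch.zeros(x.shape[0])
y[((x > -2.0) & (x < 0.0))[:, 0]] = 1   # -2 < x < 0
y[((x >= 0.0) & (x < 2.0))[:, 0]] = 2   # 0 <= x < 2
y[(x >= 2.0)[:, 0]] = 3                 # x >= 2

# CrossEntropyLoss expects integer class indices, hence the cast
y = y.type(torch.LongTensor)

print(y)  # tensor([0, 1, 2, 2, 3])
```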


Let's create the data object and check the first ten data samples and their labels.

# Create the dataset object and check a few samples
data = toy_data()
print("first ten data samples: ", data.x[0:10])
print("first ten data labels: ", data.y[0:10])


This prints:

first ten data samples:  tensor([[-3.0000],
        [-2.9000],
        [-2.8000],
        [-2.7000],
        [-2.6000],
        [-2.5000],
        [-2.4000],
        [-2.3000],
        [-2.2000],
        [-2.1000]])
first ten data labels:  tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])


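Build the Softmax Model with `nn.Module`

You will use PyTorch's nn.Module to build a custom softmax module. It is similar to the custom module you built in the earlier logistic regression tutorial. So what is the difference here? First, you previously used 1 in place of n_outputs for binary classification, while here we define four classes for multiclass classification. Second, in the forward() function, the model does not apply the logistic function to make a prediction. It returns the raw outputs of the linear layer (the logits), and the softmax itself is applied later, inside the cross-entropy loss used for training.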

class Softmax(torch.nn.Module):
    "custom softmax module"
    def __init__(self, n_inputs, n_outputs):
        super().__init__()
        self.linear = torch.nn.Linear(n_inputs, n_outputs)

    def forward(self, x):
        pred = self.linear(x)
        return pred


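Now, let's create the model object. It takes a one-dimensional vector as input and makes predictions for four different classes. Let's also check how the parameters are initialized.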

# call Softmax Classifier
model_softmax = Softmax(1, 4)
model_softmax.state_dict()


This prints the following (the parameters are initialized randomly, so your values will differ):

OrderedDict([('linear.weight',
              tensor([[-0.0075],
                      [ 0.5364],
                      [-0.8230],
                      [-0.7359]])),
             ('linear.bias', tensor([-0.3852,  0.2682, -0.0198,  0.7929]))])
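
The model outputs four raw scores per sample rather than probabilities. As a hedged sketch of how those scores relate to class probabilities and to a single predicted label, the snippet below uses a bare `torch.nn.Linear(1, 4)` as a stand-in for the module above, with an arbitrary seed and three made-up inputs (all assumptions for illustration).

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only to make the run repeatable

# Stand-in for the tutorial's model: one input feature, four classes
linear = torch.nn.Linear(1, 4)

# Three made-up inputs, shaped (batch, features)
x = torch.tensor([[-2.5], [0.5], [2.5]])

with torch.no_grad():
    logits = linear(x)                    # raw scores, shape (3, 4)
    probs = torch.softmax(logits, dim=1)  # normalize across the 4 classes

print(probs.sum(dim=1))      # each row sums to 1
print(logits.argmax(dim=1))  # predicted class taken from the raw scores
print(probs.argmax(dim=1))   # same predicted class taken from the probabilities
```

Because softmax preserves the ordering of the scores, the predicted class can be read directly from the logits, which is why the four outputs reduce to one label per sample.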



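Training the Model

Combined with stochastic gradient descent, you will use cross-entropy loss for model training and set the learning rate to 0.01. You will load the data into a data loader and set the batch size to 2.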

...
from torch.utils.data import DataLoader

# define loss, optimizer, and dataloader
optimizer = torch.optim.SGD(model_softmax.parameters(), lr = 0.01)
criterion = torch.nn.CrossEntropyLoss()
train_loader = DataLoader(dataset = data, batch_size = 2)
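
Since the model's forward() returns raw scores, it can help to see where the softmax enters during training. `torch.nn.CrossEntropyLoss` applies a log-softmax to its input internally, so it should be given logits. The sketch below checks this on made-up logits and targets (illustrative values only) by comparing it with an explicit log-softmax followed by the negative log-likelihood loss.

```python
import torch
import torch.nn.functional as F

# Made-up logits for a batch of 2 samples and 4 classes, plus target labels
logits = torch.tensor([[1.0, 2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0, 0.3]])
targets = torch.tensor([1, 2])

# What the tutorial uses: cross entropy applied directly to the logits
ce = torch.nn.CrossEntropyLoss()(logits, targets)

# The same quantity written out: log-softmax, then negative log-likelihood
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(ce.item(), manual.item())  # the two values agree
```

This is why adding a softmax layer to the model before this loss would apply the normalization twice.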


Now that everything is set, let's train the model for 100 epochs.

# Train the model
Loss = []
epochs = 100
for epoch in range(epochs):
    for x, y in train_loader:
        optimizer.zero_grad()
        y_pred = model_softmax(x)
        loss = criterion(y_pred, y)
        Loss.append(loss.item())  # store a plain float, not the graph-attached tensor
        loss.backward()
        optimizer.step()
print("Done!")
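
The loop above records one loss value per batch. One possible way to monitor training, shown here as a sketch under stated assumptions, is to average the batch losses within each epoch. The snippet rebuilds the toy data with a `TensorDataset`, uses a bare `torch.nn.Linear(1, 4)` in place of the Softmax module, fixes an arbitrary seed, and trains for 50 epochs; the names `epoch_losses` and `running` are hypothetical.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)  # arbitrary seed for repeatability

# Rebuild the toy data with the same interval rules as the tutorial
x = torch.arange(-3, 3, 0.1).view(-1, 1)
y = torch.zeros(x.shape[0], dtype=torch.long)
y[((x > -2.0) & (x < 0.0))[:, 0]] = 1
y[((x >= 0.0) & (x < 2.0))[:, 0]] = 2
y[(x >= 2.0)[:, 0]] = 3

loader = DataLoader(TensorDataset(x, y), batch_size=2)
model = torch.nn.Linear(1, 4)  # stand-in for the Softmax module
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = torch.nn.CrossEntropyLoss()

epoch_losses = []  # one average loss per epoch, stored as plain floats
epochs = 50
for epoch in range(epochs):
    running = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
        running += loss.item() * xb.shape[0]  # weight by batch size
    epoch_losses.append(running / len(loader.dataset))

print(f"first epoch: {epoch_losses[0]:.4f}, last epoch: {epoch_losses[-1]:.4f}")
```

Calling `loss.item()` keeps only the numeric value, so the list does not hold on to the computation graph of every batch.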


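Once the training loop has finished, you call the max() method on the model's output to make predictions. The argument 1 selects dimension 1, the class dimension, so max() returns the largest score in each row together with its column index, and that index is the predicted class.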

# Make predictions on test data
pred_model =  model_softmax(data.x)
_, y_pred = pred_model.max(1)
print("model predictions on test data:", y_pred)
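
To illustrate what `max(1)` returns, here is a small sketch on a made-up score matrix with three samples and four classes. The values are illustrative and chosen so the largest entry in each row is easy to spot.

```python
import torch

# Made-up scores: 3 samples (rows) by 4 classes (columns)
scores = torch.tensor([[0.1, 2.0, 0.3, 0.4],
                       [1.5, 0.2, 0.1, 0.0],
                       [0.0, 0.1, 0.2, 3.0]])

# max(1) reduces over dimension 1 (the class dimension) and returns a pair:
# the largest value in each row and the column index where it occurs
values, indices = scores.max(1)

print(values)   # largest score per row: 2.0, 1.5, 3.0
print(indices)  # predicted classes: 1, 0, 3

# argmax(dim=1) returns only the indices
print(torch.equal(indices, scores.argmax(dim=1)))  # True
```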


From above, you should see:

model predictions on test data: tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
        2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3])


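These are the model's predictions on the test data, which here is the same toy dataset the model was trained on.

Let's also check the accuracy of the model.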

# check model accuracy
correct = (data.y == y_pred).sum().item()
acc = correct / len(data)
print("model accuracy: ", acc)
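
Overall accuracy can hide which classes the model gets wrong. As a hedged sketch, the snippet below computes both the overall accuracy and a per-class accuracy on made-up labels and predictions; the tensors `y_true` and `y_hat` and the list `per_class` are hypothetical names, not outputs of the model above.

```python
import torch

# Made-up labels and predictions for eight samples (illustrative only)
y_true = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
y_hat = torch.tensor([0, 1, 1, 1, 2, 2, 3, 3])

# Overall accuracy: fraction of positions where prediction equals label
overall = (y_true == y_hat).float().mean().item()
print("overall accuracy:", overall)  # 0.875

# Accuracy restricted to the samples that truly belong to each class
per_class = []
for c in range(4):
    mask = y_true == c
    per_class.append((y_hat[mask] == c).float().mean().item())
print("per-class accuracy:", per_class)  # [0.5, 1.0, 1.0, 1.0]
```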


In this case, you may see:

model accuracy:  0.9833333333333333


In this simple model, you can see the accuracy approach 1 if you train it longer.

Putting everything together, the following is the complete code.

import torch
from torch.utils.data import Dataset, DataLoader

class toy_data(Dataset):
    "The data for multi-class classification"
    def __init__(self):
        # single input
        self.x = torch.arange(-3, 3, 0.1).view(-1, 1)
        # multi-class output
        self.y = torch.zeros(self.x.shape[0])
        self.y[(self.x > -2.0)[:, 0] * (self.x < 0.0)[:, 0]] = 1
        self.y[(self.x >= 0.0)[:, 0] * (self.x < 2.0)[:, 0]] = 2
        self.y[(self.x >= 2.0)[:, 0]] = 3
        self.y = self.y.type(torch.LongTensor)
        self.len = self.x.shape[0]

    def __getitem__(self, idx):
        "accessing one element in the dataset by index"
        return self.x[idx], self.y[idx] 

    def __len__(self):
        "size of the entire dataset"
        return self.len

# Create the dataset object and check a few samples
data = toy_data()
print("first ten data samples: ", data.x[0:10])
print("first ten data labels: ", data.y[0:10])

class Softmax(torch.nn.Module):
    "custom softmax module"
    def __init__(self, n_inputs, n_outputs):
        super().__init__()
        self.linear = torch.nn.Linear(n_inputs, n_outputs)

    def forward(self, x):
        pred = self.linear(x)
        return pred

# call Softmax Classifier
model_softmax = Softmax(1, 4)
model_softmax.state_dict()

# define loss, optimizer, and dataloader
optimizer = torch.optim.SGD(model_softmax.parameters(), lr=0.01)
criterion = torch.nn.CrossEntropyLoss()
train_loader = DataLoader(dataset=data, batch_size=2)

# Train the model
Loss = []
epochs = 100
for epoch in range(epochs):
    for x, y in train_loader:
        optimizer.zero_grad()
        y_pred = model_softmax(x)
        loss = criterion(y_pred, y)
        Loss.append(loss.item())  # store a plain float, not the graph-attached tensor
        loss.backward()
        optimizer.step()
print("Done!")

# Make predictions on test data
pred_model =  model_softmax(data.x)
_, y_pred = pred_model.max(1)
print("model predictions on test data:", y_pred)

# check model accuracy
correct = (data.y == y_pred).sum().item()
acc = correct / len(data)
print("model accuracy: ", acc)


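Summary

In this tutorial, you learned how to build a simple one-dimensional softmax classifier. In particular, you learned:

How you can use a softmax classifier for multiclass classification.
How to build and train a softmax classifier in PyTorch.
How to analyze the results of the model on test data.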


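4 Responses to "Introduction to Softmax Classifier in PyTorch"

John William O'Meara July 3, 2023 at 3:39 am #

Correct me if I'm wrong, but this doesn't seem to implement a softmax classifier at all.
The custom class named "Softmax()" has only a linear layer and no softmax activation.
Is this an oversight?

- James Carmichael July 3, 2023 at 7:59 am #

  Hi John… This tutorial introduces the concept. If you use it, please let us know and share your findings from your own projects.

  - JC November 24, 2024 at 10:02 am #

    Your output dimension is 4, but your prediction only needs 1 number. I don't see how you get from 4 to 1.

    - James Carmichael November 25, 2024 at 7:25 am #

      Hi JC… Can you run the code to confirm your output? Please keep us posted on your progress.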
