Creating Custom Layers and Loss Functions in PyTorch

By Nahla Davies on February 11, 2025 in Practical Machine Learning

Image by Editor | Midjourney

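Creating custom layers and loss functions in PyTorch is a fundamental skill for building flexible, optimized deep learning models. PyTorch ships with a large library of predefined layers and loss functions, but in some cases tailoring these elements to your specific problem can yield better performance and interpretability.

With that in mind, we'll walk through the essentials of creating and integrating custom layers and loss functions in PyTorch, using code snippets and practical insights.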

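Understanding the Need for Custom Components

PyTorch's predefined modules and functions are powerful, but real-world problems often demand more than the standard tools offer. Custom layers and loss functions can:

Handle domain-specific requirements: tasks involving irregular data structures or specialized metrics, for example, may benefit from unique transformations or evaluation methods.
Improve model performance: tailoring a layer or loss function to your specific problem can lead to better convergence, higher accuracy, or lower computational cost.
Incorporate domain knowledge: embedding domain-specific insights directly into the model can improve interpretability and keep the model aligned with real-world scenarios.

Custom layers and loss functions may be overkill for basic use cases, but they are well suited to industries such as healthcare and logistics. Finance is another field that may see a surge in PyTorch usage. Even a seemingly simple task like extracting data from invoices means handling irregular data, an area where computer vision models have made great strides.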

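Custom Layers in PyTorch

Custom layers let you define transformations or operations that don't exist in PyTorch's standard library. This is useful for tasks with unusual data-processing requirements, such as modeling irregular patterns or applying domain-specific logic.

Step 1: Define the Layer Class

In PyTorch, a custom layer is implemented by subclassing torch.nn.Module and defining two key methods:

__init__: initializes the parameters or submodules the layer uses.
forward: defines the forward-pass logic.

Here is an example of a custom linear layer: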

import torch
import torch.nn as nn

class CustomLinear(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(CustomLinear, self).__init__()
        self.weight = nn.Parameter(torch.randn(output_dim, input_dim))
        self.bias = nn.Parameter(torch.randn(output_dim))
    def forward(self, x):
        return torch.matmul(x, self.weight.T) + self.bias

# Example usage
x = torch.randn(10, 5)  # Batch of 10 samples, each with 5 features
custom_layer = CustomLinear(input_dim=5, output_dim=3)
output = custom_layer(x)

print(output.shape)
# Output >> torch.Size([10, 3])

This layer performs a linear transformation but is fully customizable, so you can adapt it further if needed.
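One way to sanity-check a hand-written layer like this is to compare it with PyTorch's built-in functional equivalent. The sketch below repeats the CustomLinear definition so it runs on its own, then checks that its output matches torch.nn.functional.linear given the same weight and bias, and that both tensors are registered as trainable parameters. The tolerance passed to allclose is an arbitrary choice for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CustomLinear(nn.Module):
    def __init__(self, input_dim, output_dim):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(output_dim, input_dim))
        self.bias = nn.Parameter(torch.randn(output_dim))

    def forward(self, x):
        return torch.matmul(x, self.weight.T) + self.bias

layer = CustomLinear(input_dim=5, output_dim=3)
x = torch.randn(10, 5)

# F.linear computes x @ weight.T + bias, the same formula as forward()
reference = F.linear(x, layer.weight, layer.bias)
matches_reference = torch.allclose(layer(x), reference, atol=1e-5)

# Attributes wrapped in nn.Parameter are picked up by named_parameters()
param_shapes = {name: tuple(p.shape) for name, p in layer.named_parameters()}

print(matches_reference)
print(param_shapes)
```

Because the weights are registered through nn.Parameter, an optimizer built from layer.parameters() will update them without any extra wiring.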

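Step 2: Add Advanced Functionality

Custom layers can also include non-linear transformations or specialized operations. For example, a custom ReLU layer with a configurable threshold might look like this: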

class ThresholdReLU(nn.Module):
    def __init__(self, threshold=0.0):
        super(ThresholdReLU, self).__init__()
        self.threshold = threshold
    def forward(self, x):
        return torch.where(x > self.threshold, x, torch.zeros_like(x))

# Example usage
relu_layer = ThresholdReLU(threshold=0.5)
x = torch.tensor([[-1.0, 0.3], [0.6, 1.2]])
output = relu_layer(x)

print(output)
# Output >> tensor([[0.0000, 0.0000], [0.6000, 1.2000]])

This illustrates PyTorch's flexibility for implementing domain-specific operations.
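It can also be worth checking how gradients flow through a layer like this. The sketch below repeats the ThresholdReLU definition and backpropagates through it: inputs above the threshold receive a gradient of 1, and inputs that were zeroed out receive 0. Note that threshold is stored as a plain Python float here, so it acts as a fixed hyperparameter, not a learned parameter.

```python
import torch
import torch.nn as nn

class ThresholdReLU(nn.Module):
    def __init__(self, threshold=0.0):
        super().__init__()
        self.threshold = threshold  # plain float, not trainable

    def forward(self, x):
        return torch.where(x > self.threshold, x, torch.zeros_like(x))

relu_layer = ThresholdReLU(threshold=0.5)
x = torch.tensor([[-1.0, 0.3], [0.6, 1.2]], requires_grad=True)

# Summing produces a scalar, so backward() can be called directly
relu_layer(x).sum().backward()

# Gradient is 1 where the input passed the threshold and 0 elsewhere
print(x.grad)
```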

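Step 3: Integrate Custom Layers

Custom layers integrate seamlessly into a model when you include them as submodules of a larger architecture. For example: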

class CustomModel(nn.Module):
    def __init__(self):
        super(CustomModel, self).__init__()
        self.layer1 = nn.Linear(5, 10)
        self.custom_layer = CustomLinear(10, 3)
        self.output_layer = nn.Linear(3, 1)
    def forward(self, x):
        x = torch.relu(self.layer1(x))
        x = self.custom_layer(x)
        return self.output_layer(x)

model = CustomModel()

This modular approach keeps custom components maintainable and reusable.
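A quick way to confirm that a custom layer is wired into the model is to list the model's parameters. The sketch below repeats the CustomLinear and CustomModel definitions so it is self-contained, then inspects the parameter names, the total parameter count, and the output shape for a small batch. The batch size of 4 is an arbitrary choice.

```python
import torch
import torch.nn as nn

class CustomLinear(nn.Module):
    def __init__(self, input_dim, output_dim):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(output_dim, input_dim))
        self.bias = nn.Parameter(torch.randn(output_dim))

    def forward(self, x):
        return torch.matmul(x, self.weight.T) + self.bias

class CustomModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(5, 10)
        self.custom_layer = CustomLinear(10, 3)
        self.output_layer = nn.Linear(3, 1)

    def forward(self, x):
        x = torch.relu(self.layer1(x))
        x = self.custom_layer(x)
        return self.output_layer(x)

model = CustomModel()

# Parameters of submodules are prefixed with the attribute name they were assigned to
param_names = [name for name, _ in model.named_parameters()]
num_params = sum(p.numel() for p in model.parameters())
out = model(torch.randn(4, 5))

print(param_names)
print(num_params)   # (5*10 + 10) + (10*3 + 3) + (3*1 + 1) = 97
print(out.shape)
```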

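Custom Loss Functions

Custom loss functions become essential when predefined options such as mean squared error or cross-entropy don't match your model's specific requirements. This section focuses on tasks that need non-standard distance metrics or domain-specific evaluation criteria.

Step 1: Define the Loss Class

As with custom layers, a custom loss function is implemented by subclassing torch.nn.Module. The key is to define a forward method that computes the loss from its inputs.

Here is an example of a custom loss function that penalizes large outputs: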

class CustomLoss(nn.Module):
    def __init__(self):
        super(CustomLoss, self).__init__()
    def forward(self, predictions, targets):
        mse_loss = torch.mean((predictions - targets) ** 2)
        penalty = torch.mean(predictions ** 2)
        return mse_loss + 0.1 * penalty

# Example usage
predictions = torch.randn(10, 1)
targets = torch.randn(10, 1)
loss_fn = CustomLoss()
loss = loss_fn(predictions, targets)
print(loss)

The penalty term encourages smaller output values, which is useful in some regression problems.
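The 0.1 factor in front of the penalty is hard-coded above. One possible variation, sketched below, exposes it as a constructor argument so the strength of the penalty can be tuned. The class name PenalizedMSELoss and the penalty_weight argument are hypothetical names introduced for this illustration, and the seed is arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PenalizedMSELoss(nn.Module):
    """MSE plus a weighted penalty on output magnitude (hypothetical name)."""

    def __init__(self, penalty_weight=0.1):
        super().__init__()
        self.penalty_weight = penalty_weight

    def forward(self, predictions, targets):
        mse_loss = torch.mean((predictions - targets) ** 2)
        penalty = torch.mean(predictions ** 2)
        return mse_loss + self.penalty_weight * penalty

torch.manual_seed(0)
predictions = torch.randn(10, 1)
targets = torch.randn(10, 1)

plain = PenalizedMSELoss(penalty_weight=0.0)(predictions, targets)
penalized = PenalizedMSELoss(penalty_weight=0.1)(predictions, targets)

# With a zero weight the loss reduces to ordinary MSE
print(torch.allclose(plain, F.mse_loss(predictions, targets)))
# The penalty term is non-negative, so it can only increase the loss
print(bool(penalized >= plain))
```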

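Step 2: Extend the Functionality

You can also design loss functions around more complex metrics. For example, consider a custom loss function that combines MAE (mean absolute error) with cosine similarity: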

class CombinedLoss(nn.Module):
    def __init__(self):
        super(CombinedLoss, self).__init__()
    def forward(self, predictions, targets):
        mae_loss = torch.mean(torch.abs(predictions - targets))
        # dim=0 compares predictions and targets along the batch axis, giving one similarity per output column
        cosine_loss = 1 - torch.nn.functional.cosine_similarity(predictions, targets, dim=0).mean()
        return mae_loss + cosine_loss

# Example usage
loss_fn = CombinedLoss()
loss = loss_fn(predictions, targets)
print(loss)

This flexibility makes it possible to combine several metrics for tasks that need nuanced evaluation criteria.
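When the model outputs a vector per sample, it may make more sense to compute the cosine similarity per sample and to weight the two terms explicitly. The following is a sketch under those assumptions, not a replacement for the version above: the class name WeightedCombinedLoss and the alpha argument are hypothetical, and the (10, 4) shapes are chosen only for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedCombinedLoss(nn.Module):
    def __init__(self, alpha=0.5):
        super().__init__()
        # Share of the loss given to MAE; the remainder goes to the cosine term
        self.alpha = alpha

    def forward(self, predictions, targets):
        mae_loss = torch.mean(torch.abs(predictions - targets))
        # dim=1 yields one similarity per sample, computed across the feature axis
        cosine_loss = 1 - F.cosine_similarity(predictions, targets, dim=1).mean()
        return self.alpha * mae_loss + (1 - self.alpha) * cosine_loss

torch.manual_seed(0)
loss_fn = WeightedCombinedLoss(alpha=0.7)
targets = torch.randn(10, 4)

# Identical predictions: both terms vanish (up to floating-point error)
perfect = loss_fn(targets.clone(), targets)
# Sign-flipped predictions: cosine similarity is -1, so the cosine term is 2
opposite = loss_fn(-targets, targets)

print(perfect.item(), opposite.item())
```

Weighting the terms matters because MAE depends on the scale of the targets while the cosine term is always bounded between 0 and 2.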

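Combining Custom Layers and Losses

Finally, let's look at an example that integrates a custom layer and a custom loss function into a simple model: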

class ExampleModel(nn.Module):
    def __init__(self):
        super(ExampleModel, self).__init__()
        self.custom_layer = CustomLinear(5, 3)
        self.output_layer = nn.Linear(3, 1)
    def forward(self, x):
        x = torch.relu(self.custom_layer(x))
        return self.output_layer(x)

# Data
inputs = torch.randn(100, 5)
targets = torch.randn(100, 1)

# Model, Loss, Optimizer
model = ExampleModel()
loss_fn = CustomLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Training Loop
for epoch in range(50):
    optimizer.zero_grad()
    predictions = model(inputs)
    loss = loss_fn(predictions, targets)
    loss.backward()
    optimizer.step()
    if epoch % 10 == 0:
        print(f"Epoch {epoch}, Loss: {loss.item()}")
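To confirm that a loop like this is actually learning, it can help to record the loss at each step and evaluate once more with gradient tracking turned off. The sketch below repeats the definitions used in this article so that it runs on its own. The fixed seed and the history list are additions for illustration, not part of the original example.

```python
import torch
import torch.nn as nn

class CustomLinear(nn.Module):
    def __init__(self, input_dim, output_dim):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(output_dim, input_dim))
        self.bias = nn.Parameter(torch.randn(output_dim))

    def forward(self, x):
        return torch.matmul(x, self.weight.T) + self.bias

class CustomLoss(nn.Module):
    def forward(self, predictions, targets):
        mse_loss = torch.mean((predictions - targets) ** 2)
        penalty = torch.mean(predictions ** 2)
        return mse_loss + 0.1 * penalty

class ExampleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.custom_layer = CustomLinear(5, 3)
        self.output_layer = nn.Linear(3, 1)

    def forward(self, x):
        x = torch.relu(self.custom_layer(x))
        return self.output_layer(x)

torch.manual_seed(0)  # arbitrary seed, only for repeatable runs
inputs = torch.randn(100, 5)
targets = torch.randn(100, 1)

model = ExampleModel()
loss_fn = CustomLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

history = []  # loss value recorded at every step
for epoch in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()
    history.append(loss.item())

# Evaluate on the same data without building a computation graph
model.eval()
with torch.no_grad():
    final_loss = loss_fn(model(inputs), targets).item()

print(f"first: {history[0]:.4f}, final: {final_loss:.4f}")
```

Since the targets here are random noise, the loss cannot approach zero; the point is only that it should end lower than it started.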

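Conclusion

Creating custom layers and loss functions in PyTorch lets you design highly tailored and effective models. This capability helps you tackle unusual challenges and get better performance out of your deep learning workflows.

When working on your own custom layers and loss functions, keep these debugging and optimization tips in mind:

Validate components in isolation: use synthetic data to verify that custom layers and loss functions behave as intended.
Use PyTorch's tooling: verify gradients with torch.autograd.gradcheck and profile performance with torch.profiler.
Optimize the implementation: refactor compute-heavy operations into vectorized implementations for better performance.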
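As a minimal sketch of the first two tips, the snippet below runs torch.autograd.gradcheck on the CustomLinear layer and CustomLoss function from this article, using small synthetic inputs. gradcheck compares analytical gradients against finite-difference estimates and expects double-precision inputs, so the layer and tensors are converted to float64. The eps and atol values and the tensor shapes are illustrative choices.

```python
import torch
import torch.nn as nn
from torch.autograd import gradcheck

class CustomLinear(nn.Module):
    def __init__(self, input_dim, output_dim):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(output_dim, input_dim))
        self.bias = nn.Parameter(torch.randn(output_dim))

    def forward(self, x):
        return torch.matmul(x, self.weight.T) + self.bias

class CustomLoss(nn.Module):
    def forward(self, predictions, targets):
        mse_loss = torch.mean((predictions - targets) ** 2)
        penalty = torch.mean(predictions ** 2)
        return mse_loss + 0.1 * penalty

torch.manual_seed(0)

# Check the layer's gradient with respect to its input (float64 required)
layer = CustomLinear(5, 3).double()
x = torch.randn(4, 5, dtype=torch.double, requires_grad=True)
layer_ok = gradcheck(layer, (x,), eps=1e-6, atol=1e-4)

# Check the loss gradient with respect to the predictions only;
# targets do not require grad, so gradcheck skips them
predictions = torch.randn(4, 1, dtype=torch.double, requires_grad=True)
targets = torch.randn(4, 1, dtype=torch.double)
loss_ok = gradcheck(CustomLoss(), (predictions, targets), eps=1e-6, atol=1e-4)

print(layer_ok, loss_ok)
```

gradcheck only covers the tensors passed in as inputs, so this verifies gradients with respect to x and predictions, not with respect to the layer's own weight and bias.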

Combining this flexibility with PyTorch's rich ecosystem helps keep your models scalable, interpretable, and aligned with the specific needs of your application.