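How to Grid Search Hyperparameters for PyTorch Models

By Adrian Tam on April 8, 2023 in Deep Learning with PyTorch

The "weights" of a neural network are referred to as "parameters" in PyTorch code, and they are fine-tuned by the optimizer during training. Hyperparameters, by contrast, are the settings of a neural network that are fixed by design and not tuned by training. Examples are the number of hidden layers and the choice of activation functions. Hyperparameter optimization is a big part of deep learning. The reason is that neural networks are notoriously difficult to configure, and a lot of parameters need to be set. On top of that, individual models can be very slow to train.

In this post, you will discover how to use the grid search capability from the scikit-learn Python machine learning library to tune the hyperparameters of PyTorch deep learning models. After reading this post, you will know:

How to wrap PyTorch models for use in scikit-learn and how to use grid search
How to grid search common neural network parameters, such as learning rate, dropout rate, epochs, and number of neurons
How to define your own hyperparameter tuning experiments

Kick-start your project with my book Deep Learning with PyTorch. It provides self-study tutorials with working code.

Let's get started.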

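Photo by brandon siu. Some rights reserved.

Overview

In this post, you will see how to use the scikit-learn grid search capability, with a suite of examples that you can copy and paste into your own project as a starting point. Below is a list of the topics we are going to cover:

How to use PyTorch models in scikit-learn
How to use grid search in scikit-learn
How to tune batch size and number of epochs
How to tune the optimization algorithm
How to tune learning rate and momentum
How to tune network weight initialization
How to tune activation functions
How to tune dropout regularization
How to tune the number of neurons in the hidden layer

How to Use PyTorch Models in scikit-learn

PyTorch models can be used in scikit-learn if wrapped with skorch. This takes advantage of the duck-typing feature of Python to make a PyTorch model provide an API similar to a scikit-learn model, so everything in scikit-learn can work along. In skorch, there are NeuralNetClassifier for classification neural networks and NeuralNetRegressor for regression neural networks. You may need to run the following command to install the module: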

pip install skorch

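To use these wrappers, you must define your PyTorch model as a class using nn.Module, then pass the name of the class to the module argument when constructing the NeuralNetClassifier class. For example: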

class MyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        ...

    def forward(self, x):
        ...
        return x

# create the skorch wrapper
model = NeuralNetClassifier(
    module=MyClassifier
)

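The constructor for the NeuralNetClassifier class can take default arguments that are passed on to the calls to model.fit() (the way to invoke a training loop in scikit-learn models), such as the number of epochs and the batch size. For example: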

model = NeuralNetClassifier(
    module=MyClassifier,
    max_epochs=150,
    batch_size=10
)

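The constructor for the NeuralNetClassifier class can also take new arguments that are passed to your model class' constructor, but you have to prepend them with module__ (two underscores). These new arguments may carry a default value in the constructor, but they will be overridden when the wrapper instantiates the model. For example: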

import torch.nn as nn
from skorch import NeuralNetClassifier

class SonarClassifier(nn.Module):
    def __init__(self, n_layers=3):
        super().__init__()
        self.layers = []
        self.acts = []
        for i in range(n_layers):
            self.layers.append(nn.Linear(60, 60))
            self.acts.append(nn.ReLU())
            self.add_module(f"layer{i}", self.layers[-1])
            self.add_module(f"act{i}", self.acts[-1])
        self.output = nn.Linear(60, 1)

    def forward(self, x):
        for layer, act in zip(self.layers, self.acts):
            x = act(layer(x))
        x = self.output(x)
        return x

model = NeuralNetClassifier(
    module=SonarClassifier,
    max_epochs=150,
    batch_size=10,
    module__n_layers=2
)

You can verify the result by initializing a model and printing it:

print(model.initialize())

In this example, you should see:

<class 'skorch.classifier.NeuralNetClassifier'>[initialized](
  module_=SonarClassifier(
    (layer0): Linear(in_features=60, out_features=60, bias=True)
    (act0): ReLU()
    (layer1): Linear(in_features=60, out_features=60, bias=True)
    (act1): ReLU()
    (output): Linear(in_features=60, out_features=1, bias=True)
  ),
)

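How to Use Grid Search in scikit-learn

Grid search is a model hyperparameter optimization technique. It simply exhausts all combinations of the hyperparameters and finds the one that gives the best score. In scikit-learn, this technique is provided in the GridSearchCV class. When constructing this class, you must provide a dictionary of hyperparameters to evaluate in the param_grid argument. This is a map of the model parameter name and an array of values to try.

By default, the score that is optimized comes from the estimator's own score() method, which is accuracy for a classifier; other scores can be specified in the scoring argument of the GridSearchCV constructor. The GridSearchCV process will then construct and evaluate one model for each combination of parameters. Cross-validation is used to evaluate each individual model. Recent scikit-learn releases default to 5-fold cross-validation (older releases used 3-fold), and you can override this by specifying the cv argument to the GridSearchCV constructor. The examples in this post set cv=3 explicitly.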
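
To make the cost of an exhaustive search concrete, here is a minimal plain-Python sketch of the enumeration that a grid search performs. It is not scikit-learn's implementation; the helper name enumerate_grid is hypothetical, and the grid values are only illustrative.

```python
from itertools import product

def enumerate_grid(param_grid):
    """Return every combination of the values in param_grid as a list of dicts."""
    names = sorted(param_grid)  # fixed order so the output is reproducible
    combos = product(*(param_grid[name] for name in names))
    return [dict(zip(names, values)) for values in combos]

# illustrative grid: 6 batch sizes x 3 epoch counts
param_grid = {
    'batch_size': [10, 20, 40, 60, 80, 100],
    'max_epochs': [10, 50, 100],
}
candidates = enumerate_grid(param_grid)
cv = 3  # number of cross-validation folds
n_fits = len(candidates) * cv  # one training run per candidate per fold

print(len(candidates), "candidates,", n_fits, "training runs")
```

With 18 candidates and 3 folds, the search trains 54 models, plus one final refit on the full data when the refit option is left at its default. Adding one more hyperparameter with five values would multiply that count by five, which is why grids are usually kept small.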

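Below is an example of defining a simple grid search. Note that the skorch wrapper names the number of epochs max_epochs:

param_grid = {
    'max_epochs': [10,20,30]
}
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(X, y)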

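By setting the n_jobs argument in the GridSearchCV constructor to -1, the process will use all cores on your machine. Otherwise, the grid search process will only run in a single thread, which is slower on multi-core CPUs.

Once completed, you can access the outcome of the grid search in the result object returned from grid.fit(). The best_score_ member provides access to the best score observed during the optimization procedure, and best_params_ describes the combination of parameters that achieved the best results. You can learn more about the GridSearchCV class in the scikit-learn API documentation.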

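Problem Description

Now that you know how to use PyTorch models with scikit-learn and how to use grid search in scikit-learn, let's look at a bunch of examples.

All examples will be demonstrated on a small standard machine learning dataset called the Pima Indians onset of diabetes classification dataset. This is a small dataset with all numerical attributes that is easy to work with.

As you proceed through the examples in this post, you will aggregate the best parameters. This is not the best way to grid search because parameters can interact, but it is good for demonstration purposes.

How to Tune Batch Size and Number of Epochs

In this first simple example, you will look at tuning the batch size and number of epochs used when fitting the network.

The batch size in iterative gradient descent is the number of patterns shown to the network before the weights are updated. It is also an optimization in the training of the network, defining how many patterns to read at a time and keep in memory.

The number of epochs is the number of times the entire training dataset is shown to the network during training. Some networks are sensitive to the batch size, such as LSTM recurrent neural networks and convolutional neural networks.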
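
As a rough illustration of how batch size and epochs translate into weight updates, the sketch below counts optimizer steps. The helper name count_updates and the sample count of 512 are assumptions chosen for illustration; the actual number of training rows depends on how the data is split.

```python
import math

def count_updates(n_samples, batch_size, epochs):
    """Number of weight updates when the last, smaller batch is kept."""
    updates_per_epoch = math.ceil(n_samples / batch_size)
    return updates_per_epoch * epochs

n_samples = 512  # hypothetical training-set size
for batch_size in [10, 100]:
    for epochs in [10, 100]:
        total = count_updates(n_samples, batch_size, epochs)
        print(f"batch_size={batch_size:3d} epochs={epochs:3d} -> {total} updates")
```

A small batch size combined with many epochs gives the optimizer far more updates than a large batch size with few epochs, which is one reason these two settings are usually tuned together.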

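Here you will evaluate a suite of different mini-batch sizes from 10 to 100 (10, 20, 40, 60, 80, and 100), together with 10, 50, and 100 training epochs.

The full code listing is provided below: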

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from skorch import NeuralNetClassifier
from sklearn.model_selection import GridSearchCV

# load the dataset, split into input (X) and output (y) variables
dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
X = dataset[:,0:8]
y = dataset[:,8]
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)

# PyTorch classifier
class PimaClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(8, 12)
        self.act = nn.ReLU()
        self.output = nn.Linear(12, 1)
        self.prob = nn.Sigmoid()

    def forward(self, x):
        x = self.act(self.layer(x))
        x = self.prob(self.output(x))
        return x

# create model with skorch
model = NeuralNetClassifier(
    PimaClassifier,
    criterion=nn.BCELoss,
    optimizer=optim.Adam,
    verbose=False
)

# define the grid search parameters
param_grid = {
    'batch_size': [10, 20, 40, 60, 80, 100],
    'max_epochs': [10, 50, 100]
}
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(X, y)

# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Running this example produces the following output:

Best: 0.714844 using {'batch_size': 10, 'max_epochs': 100}
0.665365 (0.020505) with: {'batch_size': 10, 'max_epochs': 10}
0.588542 (0.168055) with: {'batch_size': 10, 'max_epochs': 50}
0.714844 (0.032369) with: {'batch_size': 10, 'max_epochs': 100}
0.671875 (0.022326) with: {'batch_size': 20, 'max_epochs': 10}
0.696615 (0.008027) with: {'batch_size': 20, 'max_epochs': 50}
0.714844 (0.019918) with: {'batch_size': 20, 'max_epochs': 100}
0.666667 (0.009744) with: {'batch_size': 40, 'max_epochs': 10}
0.687500 (0.033603) with: {'batch_size': 40, 'max_epochs': 50}
0.707031 (0.024910) with: {'batch_size': 40, 'max_epochs': 100}
0.667969 (0.014616) with: {'batch_size': 60, 'max_epochs': 10}
0.694010 (0.036966) with: {'batch_size': 60, 'max_epochs': 50}
0.694010 (0.042473) with: {'batch_size': 60, 'max_epochs': 100}
0.670573 (0.023939) with: {'batch_size': 80, 'max_epochs': 10}
0.674479 (0.020752) with: {'batch_size': 80, 'max_epochs': 50}
0.703125 (0.026107) with: {'batch_size': 80, 'max_epochs': 100}
0.680990 (0.014382) with: {'batch_size': 100, 'max_epochs': 10}
0.670573 (0.013279) with: {'batch_size': 100, 'max_epochs': 50}
0.687500 (0.017758) with: {'batch_size': 100, 'max_epochs': 100}

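You can see that a batch size of 10 with 100 epochs achieved the best result of about 71% accuracy. A batch size of 20 with 100 epochs reached the same mean accuracy with a smaller standard deviation, so you should take the standard deviation into account as well as the mean.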
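
Since several configurations score within noise of each other, one way to read the table is to list every configuration whose mean lies within one standard deviation of the best. The sketch below does this on a few rows copied from the output above; the variable names and the one-standard-deviation rule are illustrative choices, not part of scikit-learn.

```python
# (mean accuracy, standard deviation, parameters), copied from the output above
results = [
    (0.714844, 0.032369, {'batch_size': 10, 'max_epochs': 100}),
    (0.714844, 0.019918, {'batch_size': 20, 'max_epochs': 100}),
    (0.707031, 0.024910, {'batch_size': 40, 'max_epochs': 100}),
    (0.588542, 0.168055, {'batch_size': 10, 'max_epochs': 50}),
    (0.665365, 0.020505, {'batch_size': 10, 'max_epochs': 10}),
]

# best mean first; ties are broken by the smaller standard deviation
ranked = sorted(results, key=lambda r: (-r[0], r[1]))
best_mean, best_std, best_params = ranked[0]

# keep every configuration whose mean is within one std of the best mean
threshold = best_mean - best_std
close = [r for r in ranked if r[0] >= threshold]

for mean, std, params in close:
    print("%f (%f) with: %r" % (mean, std, params))
```

On a real run, the same lists can be read from grid_result.cv_results_ using the mean_test_score, std_test_score, and params keys, as the summary loop in the listing does.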

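How to Tune the Training Optimization Algorithm

All deep learning libraries should offer a variety of optimization algorithms. PyTorch is no exception.

In this example, you will tune the optimization algorithm used to train the network, each with default parameters.

This is an odd example because often, you will choose one approach a priori and instead focus on tuning its parameters on your problem (see the next example). Here, you will evaluate a selection of the optimization algorithms available in PyTorch.

The full code listing is provided below: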

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from skorch import NeuralNetClassifier
from sklearn.model_selection import GridSearchCV

# load the dataset, split into input (X) and output (y) variables
dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
X = dataset[:,0:8]
y = dataset[:,8]
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)

# PyTorch classifier
class PimaClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(8, 12)
        self.act = nn.ReLU()
        self.output = nn.Linear(12, 1)
        self.prob = nn.Sigmoid()

    def forward(self, x):
        x = self.act(self.layer(x))
        x = self.prob(self.output(x))
        return x

# create model with skorch
model = NeuralNetClassifier(
    PimaClassifier,
    criterion=nn.BCELoss,
    max_epochs=100,
    batch_size=10,
    verbose=False
)

# define the grid search parameters
param_grid = {
    'optimizer': [optim.SGD, optim.RMSprop, optim.Adagrad, optim.Adadelta,
                  optim.Adam, optim.Adamax, optim.NAdam],
}
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(X, y)

# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Running this example produces the following output:

Best: 0.721354 using {'optimizer': <class 'torch.optim.adamax.Adamax'>}
0.674479 (0.036828) with: {'optimizer': <class 'torch.optim.sgd.SGD'>}
0.700521 (0.043303) with: {'optimizer': <class 'torch.optim.rmsprop.RMSprop'>}
0.682292 (0.027126) with: {'optimizer': <class 'torch.optim.adagrad.Adagrad'>}
0.572917 (0.051560) with: {'optimizer': <class 'torch.optim.adadelta.Adadelta'>}
0.714844 (0.030758) with: {'optimizer': <class 'torch.optim.adam.Adam'>}
0.721354 (0.019225) with: {'optimizer': <class 'torch.optim.adamax.Adamax'>}
0.709635 (0.024360) with: {'optimizer': <class 'torch.optim.nadam.NAdam'>}

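The results suggest that the Adamax optimization algorithm is the best, with a score of about 72% accuracy.

It is worth mentioning that GridSearchCV will recreate your model often so every trial is independent. This is possible because the NeuralNetClassifier wrapper knows the name of the class for your PyTorch model and instantiates one for you upon request.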
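
The idea can be pictured with a toy wrapper that stores the model class and its arguments rather than a model instance. This is only a conceptual sketch in plain Python; TinyWrapper and ToyModel are hypothetical names, and skorch's real wrapper does considerably more.

```python
class ToyModel:
    """Stand-in for an nn.Module subclass; holds one 'weight'."""
    def __init__(self, n_layers=3):
        self.n_layers = n_layers
        self.weight = 0.0  # freshly initialized state

class TinyWrapper:
    """Keeps the class and constructor arguments, builds instances on demand."""
    def __init__(self, module, **module_kwargs):
        self.module = module
        self.module_kwargs = module_kwargs

    def initialize(self):
        # a brand-new instance every time, so no state leaks between trials
        return self.module(**self.module_kwargs)

wrapper = TinyWrapper(ToyModel, n_layers=2)
first = wrapper.initialize()
first.weight = 1.5             # pretend the first trial trained the model
second = wrapper.initialize()  # the next trial starts from scratch
print(second.weight, second.n_layers)
```

Passing a class instead of an instance is what makes this possible: an already-built model would carry its trained weights from one trial into the next.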

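How to Tune Learning Rate and Momentum

It is common to pre-select an optimization algorithm to train your network and tune its parameters.

By far, the most common optimization algorithm is plain old Stochastic Gradient Descent (SGD) because it is so well understood. In this example, you will look at optimizing the SGD learning rate and momentum parameters.

The learning rate controls how much to update the weights at the end of each batch, and the momentum controls how much the previous update influences the current weight update.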
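
To see how the two settings interact, here is a minimal sketch of the momentum update in plain Python, applied to the one-dimensional function f(w) = w**2. It follows the form used by torch.optim.SGD with its other options left at their defaults (velocity = momentum * velocity + gradient, then w = w - lr * velocity); the function name sgd_momentum and the test problem are illustrative assumptions.

```python
def sgd_momentum(w, lr, momentum, steps):
    """Minimize f(w) = w**2 with SGD plus momentum; returns the final w."""
    velocity = 0.0
    for _ in range(steps):
        grad = 2.0 * w                        # derivative of w**2
        velocity = momentum * velocity + grad  # accumulate past gradients
        w = w - lr * velocity                  # step against the velocity
    return w

start = 1.0
plain = sgd_momentum(start, lr=0.01, momentum=0.0, steps=50)
with_momentum = sgd_momentum(start, lr=0.01, momentum=0.9, steps=50)
print("no momentum:  ", plain)
print("momentum 0.9: ", with_momentum)
```

On this toy problem, the run with momentum ends closer to the minimum at 0 for the same learning rate, which illustrates why the two values are best tuned together rather than one at a time.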

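You will try a suite of small standard learning rates from 0.001 to 0.3 and momentum values from 0.0 to 0.8 in steps of 0.2, as well as 0.9 (because it can be a popular value in practice). In plain PyTorch, you set the learning rate and momentum when creating the optimizer, whose first argument is the iterable of parameters to optimize (for example, model.parameters() of an nn.Module):

optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)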

With the skorch wrapper, you can route the parameters to the optimizer with the prefix optimizer__.
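
The double-underscore convention can be pictured as splitting one flat dictionary of settings by prefix. The helper below, split_by_prefix, is a hypothetical plain-Python illustration of that idea and not skorch's actual code.

```python
def split_by_prefix(params, prefixes=('module', 'optimizer')):
    """Group 'prefix__name' keys under their prefix; other keys go to 'net'."""
    routed = {prefix: {} for prefix in prefixes}
    routed['net'] = {}
    for key, value in params.items():
        prefix, sep, name = key.partition('__')
        if sep and prefix in routed and prefix != 'net':
            routed[prefix][name] = value   # e.g. optimizer__lr -> lr
        else:
            routed['net'][key] = value     # e.g. max_epochs stays as is
    return routed

params = {
    'max_epochs': 100,
    'module__n_layers': 2,
    'optimizer__lr': 0.001,
    'optimizer__momentum': 0.9,
}
routed = split_by_prefix(params)
print(routed['optimizer'])
```

Because the routing is based purely on the key name, the same flat keys can be used both in the wrapper's constructor and in the param_grid dictionary given to GridSearchCV.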

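Generally, it is a good idea to also include the number of epochs in an optimization like this, as there is a dependency between the amount of learning per batch (learning rate), the number of updates per epoch (batch size), and the number of epochs.

The full code listing is provided below: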

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from skorch import NeuralNetClassifier
from sklearn.model_selection import GridSearchCV

# load the dataset, split into input (X) and output (y) variables
dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
X = dataset[:,0:8]
y = dataset[:,8]
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)

# PyTorch classifier
class PimaClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(8, 12)
        self.act = nn.ReLU()
        self.output = nn.Linear(12, 1)
        self.prob = nn.Sigmoid()

    def forward(self, x):
        x = self.act(self.layer(x))
        x = self.prob(self.output(x))
        return x

# create model with skorch
model = NeuralNetClassifier(
    PimaClassifier,
    criterion=nn.BCELoss,
    optimizer=optim.SGD,
    max_epochs=100,
    batch_size=10,
    verbose=False
)

# define the grid search parameters
param_grid = {
    'optimizer__lr': [0.001, 0.01, 0.1, 0.2, 0.3],
    'optimizer__momentum': [0.0, 0.2, 0.4, 0.6, 0.8, 0.9],
}
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(X, y)

# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Running this example produces the following output:

Best: 0.682292 using {'optimizer__lr': 0.001, 'optimizer__momentum': 0.9}
0.648438 (0.016877) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.0}
0.671875 (0.017758) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.2}
0.674479 (0.022402) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.4}
0.677083 (0.011201) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.6}
0.679688 (0.027621) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.8}
0.682292 (0.026557) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.9}
0.671875 (0.019918) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.0}
0.648438 (0.024910) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.2}
0.546875 (0.143454) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.4}
0.567708 (0.153668) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.6}
0.552083 (0.141790) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.8}
0.451823 (0.144561) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.9}
0.348958 (0.001841) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.0}
0.450521 (0.142719) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.2}
0.450521 (0.142719) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.4}
0.450521 (0.142719) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.6}
0.348958 (0.001841) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.8}
0.348958 (0.001841) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.9}
0.444010 (0.136265) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.0}
0.450521 (0.142719) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.2}
0.348958 (0.001841) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.4}
0.552083 (0.141790) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.6}
0.549479 (0.142719) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.8}
0.651042 (0.001841) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.9}
0.552083 (0.141790) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.0}
0.348958 (0.001841) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.2}
0.450521 (0.142719) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.4}
0.552083 (0.141790) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.6}
0.450521 (0.142719) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.8}
0.450521 (0.142719) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.9}

You can see that, with SGD, the best results were achieved using a learning rate of 0.001 and a momentum of 0.9, with an accuracy of about 68%.

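How to Tune Network Weight Initialization

Neural network weight initialization used to be simple: use small random values.

Now there is a suite of different techniques to choose from. You can get a laundry list from the torch.nn.init documentation.

In this example, you will look at tuning the selection of network weight initialization by evaluating a selection of the available techniques.

You will use the same weight initialization method on each layer. Ideally, it may be better to use different weight initialization schemes according to the activation function used on each layer. In the example below, you will use a rectifier for the hidden layer and a sigmoid for the output layer because the predictions are binary. Weight initialization is implicit in PyTorch models. Therefore, you need to write your own logic to initialize the weights, after the layer is created but before it is used. Let's modify the PyTorch model as follows: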

# PyTorch classifier
class PimaClassifier(nn.Module):
    def __init__(self, weight_init=torch.nn.init.xavier_uniform_):
        super().__init__()
        self.layer = nn.Linear(8, 12)
        self.act = nn.ReLU()
        self.output = nn.Linear(12, 1)
        self.prob = nn.Sigmoid()
        # manually init weights
        weight_init(self.layer.weight)
        weight_init(self.output.weight)

    def forward(self, x):
        x = self.act(self.layer(x))
        x = self.prob(self.output(x))
        return x

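A weight_init argument is added to the PimaClassifier class, and it expects one of the initializers from torch.nn.init. In GridSearchCV, you need to use the module__ prefix to make NeuralNetClassifier route the parameter to the model's class constructor.

The full code listing is provided below: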

import numpy as np
import torch
import torch.nn as nn
import torch.nn.init as init
import torch.optim as optim
from skorch import NeuralNetClassifier
from sklearn.model_selection import GridSearchCV

# load the dataset, split into input (X) and output (y) variables
dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
X = dataset[:,0:8]
y = dataset[:,8]
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)

# PyTorch classifier
class PimaClassifier(nn.Module):
    def __init__(self, weight_init=init.xavier_uniform_):
        super().__init__()
        self.layer = nn.Linear(8, 12)
        self.act = nn.ReLU()
        self.output = nn.Linear(12, 1)
        self.prob = nn.Sigmoid()
        # manually init weights
        weight_init(self.layer.weight)
        weight_init(self.output.weight)

    def forward(self, x):
        x = self.act(self.layer(x))
        x = self.prob(self.output(x))
        return x

# create model with skorch
model = NeuralNetClassifier(
    PimaClassifier,
    criterion=nn.BCELoss,
    optimizer=optim.Adamax,
    max_epochs=100,
    batch_size=10,
    verbose=False
)

# define the grid search parameters
param_grid = {
    'module__weight_init': [init.uniform_, init.normal_, init.zeros_,
                           init.xavier_normal_, init.xavier_uniform_,
                           init.kaiming_normal_, init.kaiming_uniform_]
}
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(X, y)

# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Running this example produces the following output:

Best: 0.697917 using {'module__weight_init': <function kaiming_uniform_ at 0x112020c10>}
0.348958 (0.001841) with: {'module__weight_init': <function uniform_ at 0x1120204c0>}
0.602865 (0.061708) with: {'module__weight_init': <function normal_ at 0x112020550>}
0.652344 (0.003189) with: {'module__weight_init': <function zeros_ at 0x112020820>}
0.691406 (0.030758) with: {'module__weight_init': <function xavier_normal_ at 0x112020af0>}
0.592448 (0.171589) with: {'module__weight_init': <function xavier_uniform_ at 0x112020a60>}
0.563802 (0.152971) with: {'module__weight_init': <function kaiming_normal_ at 0x112020ca0>}
0.697917 (0.013279) with: {'module__weight_init': <function kaiming_uniform_ at 0x112020c10>}

The best results were achieved with a He-uniform weight initialization scheme, achieving a performance of about 70%.
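
For a sense of scale, the sketch below computes the range from which the two uniform schemes draw weights for the hidden layer nn.Linear(8, 12). The formulas are those given in the torch.nn.init documentation with default arguments as I understand them (gain 1 for Xavier; fan-in mode with a = 0 for Kaiming, so the gain is the square root of 2); the helper names are hypothetical.

```python
import math

def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
    """Weights are drawn from U(-bound, bound)."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))

def kaiming_uniform_bound(fan_in, a=0.0):
    """He-uniform bound in fan-in mode for a leaky ReLU with negative slope a."""
    gain = math.sqrt(2.0 / (1.0 + a ** 2))
    return gain * math.sqrt(3.0 / fan_in)

fan_in, fan_out = 8, 12  # the hidden layer nn.Linear(8, 12)
print("Xavier uniform bound: %.4f" % xavier_uniform_bound(fan_in, fan_out))
print("He uniform bound:     %.4f" % kaiming_uniform_bound(fan_in))
```

For this layer the He-uniform range is wider than the Xavier-uniform range, since it depends only on the number of inputs and is scaled up to compensate for the rectifier zeroing out part of its input.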

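How to Tune the Neuron Activation Function

The activation function controls the nonlinearity of individual neurons and when to fire.

Generally, the rectifier activation function is the most popular. Before it, the sigmoid and tanh functions were the usual choices, and these functions may still be more suitable for different problems.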
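
As a quick reminder of how these functions differ, here is a plain-Python sketch of the three classic activations evaluated at a few inputs. It uses only the math module; the helper names are illustrative.

```python
import math

def relu(x):
    return max(0.0, x)                  # zero for negative inputs, identity otherwise

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))   # squashes into (0, 1)

def tanh(x):
    return math.tanh(x)                 # squashes into (-1, 1)

for x in [-2.0, 0.0, 2.0]:
    print("x=%5.1f  relu=%.4f  sigmoid=%.4f  tanh=%.4f"
          % (x, relu(x), sigmoid(x), tanh(x)))
```

The rectifier is unbounded for positive inputs, while sigmoid and tanh saturate, which is part of why the best choice can depend on the problem and on how the inputs are scaled.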

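In this example, you will evaluate some of the activation functions available in PyTorch. You will only use these functions in the hidden layer, as you require a sigmoid activation function in the output for the binary classification problem. Similar to the previous example, this is an argument to the constructor of the model class, and you will use the module__ prefix for the GridSearchCV parameter grid.

Generally, it is a good idea to prepare data to the range of the different transfer functions, which you will not do in this case.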
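
One simple way to do such preparation is min-max scaling of each input column into a fixed range. The sketch below is plain Python on a made-up column of values; the function name and the data are illustrative only and are not taken from the dataset.

```python
def min_max_scale(values, low=0.0, high=1.0):
    """Linearly rescale values into [low, high]; constant columns map to low."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        return [low for _ in values]
    scale = (high - low) / (v_max - v_min)
    return [low + (v - v_min) * scale for v in values]

column = [50.0, 100.0, 150.0, 200.0]       # made-up feature values
print(min_max_scale(column))               # range [0, 1], e.g. for sigmoid
print(min_max_scale(column, -1.0, 1.0))    # range [-1, 1], e.g. for tanh
```

In a real experiment, the scaling statistics should be computed on the training portion only and then applied to the held-out portion, so that no information leaks across the cross-validation split.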

The full code listing is provided below:

import numpy as np
import torch
import torch.nn as nn
import torch.nn.init as init
import torch.optim as optim
from skorch import NeuralNetClassifier
from sklearn.model_selection import GridSearchCV

# load the dataset, split into input (X) and output (y) variables
dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
X = dataset[:,0:8]
y = dataset[:,8]
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)

# PyTorch classifier
class PimaClassifier(nn.Module):
    def __init__(self, activation=nn.ReLU):
        super().__init__()
        self.layer = nn.Linear(8, 12)
        self.act = activation()
        self.output = nn.Linear(12, 1)
        self.prob = nn.Sigmoid()
        # manually init weights
        init.kaiming_uniform_(self.layer.weight)
        init.kaiming_uniform_(self.output.weight)

    def forward(self, x):
        x = self.act(self.layer(x))
        x = self.prob(self.output(x))
        return x

# create model with skorch
model = NeuralNetClassifier(
    PimaClassifier,
    criterion=nn.BCELoss,
    optimizer=optim.Adamax,
    max_epochs=100,
    batch_size=10,
    verbose=False
)

# define the grid search parameters
param_grid = {
    'module__activation': [nn.Identity, nn.ReLU, nn.ELU, nn.ReLU6,
                           nn.GELU, nn.Softplus, nn.Softsign, nn.Tanh,
                           nn.Sigmoid, nn.Hardsigmoid]
}
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(X, y)

# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Running this example produces the following output.

Best: 0.699219 using {'module__activation': <class 'torch.nn.modules.activation.ReLU'>}
0.687500 (0.025315) with: {'module__activation': <class 'torch.nn.modules.linear.Identity'>}
0.699219 (0.011049) with: {'module__activation': <class 'torch.nn.modules.activation.ReLU'>}
0.674479 (0.035849) with: {'module__activation': <class 'torch.nn.modules.activation.ELU'>}
0.621094 (0.063549) with: {'module__activation': <class 'torch.nn.modules.activation.ReLU6'>}
0.674479 (0.017566) with: {'module__activation': <class 'torch.nn.modules.activation.GELU'>}
0.558594 (0.149189) with: {'module__activation': <class 'torch.nn.modules.activation.Softplus'>}
0.675781 (0.014616) with: {'module__activation': <class 'torch.nn.modules.activation.Softsign'>}
0.619792 (0.018688) with: {'module__activation': <class 'torch.nn.modules.activation.Tanh'>}
0.643229 (0.019225) with: {'module__activation': <class 'torch.nn.modules.activation.Sigmoid'>}
0.636719 (0.022326) with: {'module__activation': <class 'torch.nn.modules.activation.Hardsigmoid'>}

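The results show that the ReLU activation function achieved the best result, with an accuracy of about 70%.

How to Tune Dropout Regularization

In this example, you will tune the dropout rate used for regularization, to limit overfitting and improve the model's ability to generalize.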
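
As a reminder of what the dropout rate controls, here is a minimal pure-Python sketch of "inverted" dropout: each activation is zeroed with probability equal to the rate, and the survivors are scaled up by 1/(1 - rate), which is also how nn.Dropout behaves during training. The function name and the rejection of a rate of 1.0 are choices made for this illustration.

```python
import random

def inverted_dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale survivors by 1/(1-rate)."""
    if not 0.0 <= rate < 1.0:
        # A rate of 1.0 would drop every activation, so this sketch rejects it
        raise ValueError("rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

rng = random.Random(0)          # fixed seed so the sketch is repeatable
activations = [1.0] * 10
dropped = inverted_dropout(activations, rate=0.5, rng=rng)
print(dropped)                  # a mix of 0.0 and 2.0
```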

Dropout tends to work best when combined with a weight constraint, such as the max norm constraint implemented in the forward pass function.
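
The idea of a max norm constraint can be sketched in isolation as follows. Note that this is a variant, not the code from the listing: the listing computes norms along dim=0 (one norm per input column), whereas this sketch constrains each neuron's incoming weight vector (one row of the weight matrix), which is another common convention. The helper name apply_max_norm is hypothetical.

```python
import torch
import torch.nn as nn

def apply_max_norm(linear, max_norm):
    """Rescale rows of linear.weight in place so each L2 norm is <= max_norm."""
    with torch.no_grad():
        # weight has shape (out_features, in_features): one row per neuron
        norms = linear.weight.norm(2, dim=1, keepdim=True)
        # rows already within the limit get a scale of 1 (left unchanged)
        scale = torch.clamp(max_norm / (norms + 1e-8), max=1.0)
        linear.weight.mul_(scale)

torch.manual_seed(0)
layer = nn.Linear(8, 12)
with torch.no_grad():
    layer.weight.mul_(10.0)     # inflate the weights so the constraint is active
apply_max_norm(layer, max_norm=2.0)
print(layer.weight.norm(2, dim=1))
```

A constraint like this is typically applied after each optimizer step or, as in the listing, at the start of the forward pass.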

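This means tuning both the dropout rate and the weight constraint. You will try dropout rates between 0.0 and 0.9 (1.0 does not make sense) and max norm weight constraint values between 1 and 5.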
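
Searching two parameters at once multiplies the number of models to train. As a rough sketch using only the standard library, the following enumerates the combinations for this grid and counts the training runs under 3-fold cross-validation. GridSearchCV does this enumeration for you; the code here is only meant to show the cost.

```python
from itertools import product

param_grid = {
    'module__weight_constraint': [1.0, 2.0, 3.0, 4.0, 5.0],
    'module__dropout_rate': [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
}
cv_folds = 3

# Build every combination of the two parameter lists
names = sorted(param_grid)
combinations = [
    dict(zip(names, values))
    for values in product(*(param_grid[name] for name in names))
]

# Each combination is trained once per fold
# (GridSearchCV also refits the best one on all data by default)
total_fits = len(combinations) * cv_folds
print(len(combinations), total_fits)  # 50 150
```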

The full code listing is provided below.

import numpy as np
import torch
import torch.nn as nn
import torch.nn.init as init
import torch.optim as optim
from skorch import NeuralNetClassifier
from sklearn.model_selection import GridSearchCV

# load the dataset, split into input (X) and output (y) variables
dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
X = dataset[:,0:8]
y = dataset[:,8]
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)

# PyTorch classifier
class PimaClassifier(nn.Module):
    def __init__(self, dropout_rate=0.5, weight_constraint=1.0):
        super().__init__()
        self.layer = nn.Linear(8, 12)
        self.act = nn.ReLU()
        self.dropout = nn.Dropout(dropout_rate)
        self.output = nn.Linear(12, 1)
        self.prob = nn.Sigmoid()
        self.weight_constraint = weight_constraint
        # manually init weights
        init.kaiming_uniform_(self.layer.weight)
        init.kaiming_uniform_(self.output.weight)

    def forward(self, x):
        # maxnorm weight before actual forward pass
        with torch.no_grad():
            norm = self.layer.weight.norm(2, dim=0, keepdim=True).clamp(min=self.weight_constraint / 2)
            desired = torch.clamp(norm, max=self.weight_constraint)
            self.layer.weight *= (desired / norm)
        # actual forward pass
        x = self.act(self.layer(x))
        x = self.dropout(x)
        x = self.prob(self.output(x))
        return x

# create model with skorch
model = NeuralNetClassifier(
    PimaClassifier,
    criterion=nn.BCELoss,
    optimizer=optim.Adamax,
    max_epochs=100,
    batch_size=10,
    verbose=False
)

# define the grid search parameters
param_grid = {
    'module__weight_constraint': [1.0, 2.0, 3.0, 4.0, 5.0],
    'module__dropout_rate': [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
}
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(X, y)

# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Running this example produces the following output.

Best: 0.701823 using {'module__dropout_rate': 0.1, 'module__weight_constraint': 2.0}
0.669271 (0.015073) with: {'module__dropout_rate': 0.0, 'module__weight_constraint': 1.0}
0.692708 (0.035132) with: {'module__dropout_rate': 0.0, 'module__weight_constraint': 2.0}
0.589844 (0.170180) with: {'module__dropout_rate': 0.0, 'module__weight_constraint': 3.0}
0.561198 (0.151131) with: {'module__dropout_rate': 0.0, 'module__weight_constraint': 4.0}
0.688802 (0.021710) with: {'module__dropout_rate': 0.0, 'module__weight_constraint': 5.0}
0.697917 (0.009744) with: {'module__dropout_rate': 0.1, 'module__weight_constraint': 1.0}
0.701823 (0.016367) with: {'module__dropout_rate': 0.1, 'module__weight_constraint': 2.0}
0.694010 (0.010253) with: {'module__dropout_rate': 0.1, 'module__weight_constraint': 3.0}
0.686198 (0.025976) with: {'module__dropout_rate': 0.1, 'module__weight_constraint': 4.0}
0.679688 (0.026107) with: {'module__dropout_rate': 0.1, 'module__weight_constraint': 5.0}
0.701823 (0.029635) with: {'module__dropout_rate': 0.2, 'module__weight_constraint': 1.0}
0.682292 (0.014731) with: {'module__dropout_rate': 0.2, 'module__weight_constraint': 2.0}
0.701823 (0.009744) with: {'module__dropout_rate': 0.2, 'module__weight_constraint': 3.0}
0.701823 (0.026557) with: {'module__dropout_rate': 0.2, 'module__weight_constraint': 4.0}
0.687500 (0.015947) with: {'module__dropout_rate': 0.2, 'module__weight_constraint': 5.0}
0.686198 (0.006639) with: {'module__dropout_rate': 0.3, 'module__weight_constraint': 1.0}
0.656250 (0.006379) with: {'module__dropout_rate': 0.3, 'module__weight_constraint': 2.0}
0.565104 (0.155608) with: {'module__dropout_rate': 0.3, 'module__weight_constraint': 3.0}
0.700521 (0.028940) with: {'module__dropout_rate': 0.3, 'module__weight_constraint': 4.0}
0.669271 (0.012890) with: {'module__dropout_rate': 0.3, 'module__weight_constraint': 5.0}
0.661458 (0.018688) with: {'module__dropout_rate': 0.4, 'module__weight_constraint': 1.0}
0.669271 (0.017566) with: {'module__dropout_rate': 0.4, 'module__weight_constraint': 2.0}
0.652344 (0.006379) with: {'module__dropout_rate': 0.4, 'module__weight_constraint': 3.0}
0.680990 (0.037783) with: {'module__dropout_rate': 0.4, 'module__weight_constraint': 4.0}
0.692708 (0.042112) with: {'module__dropout_rate': 0.4, 'module__weight_constraint': 5.0}
0.666667 (0.006639) with: {'module__dropout_rate': 0.5, 'module__weight_constraint': 1.0}
0.652344 (0.011500) with: {'module__dropout_rate': 0.5, 'module__weight_constraint': 2.0}
0.662760 (0.007366) with: {'module__dropout_rate': 0.5, 'module__weight_constraint': 3.0}
0.558594 (0.146610) with: {'module__dropout_rate': 0.5, 'module__weight_constraint': 4.0}
0.552083 (0.141826) with: {'module__dropout_rate': 0.5, 'module__weight_constraint': 5.0}
0.548177 (0.141826) with: {'module__dropout_rate': 0.6, 'module__weight_constraint': 1.0}
0.653646 (0.013279) with: {'module__dropout_rate': 0.6, 'module__weight_constraint': 2.0}
0.661458 (0.008027) with: {'module__dropout_rate': 0.6, 'module__weight_constraint': 3.0}
0.553385 (0.142719) with: {'module__dropout_rate': 0.6, 'module__weight_constraint': 4.0}
0.669271 (0.035132) with: {'module__dropout_rate': 0.6, 'module__weight_constraint': 5.0}
0.662760 (0.015733) with: {'module__dropout_rate': 0.7, 'module__weight_constraint': 1.0}
0.636719 (0.024910) with: {'module__dropout_rate': 0.7, 'module__weight_constraint': 2.0}
0.550781 (0.146818) with: {'module__dropout_rate': 0.7, 'module__weight_constraint': 3.0}
0.537760 (0.140094) with: {'module__dropout_rate': 0.7, 'module__weight_constraint': 4.0}
0.542969 (0.138144) with: {'module__dropout_rate': 0.7, 'module__weight_constraint': 5.0}
0.565104 (0.148654) with: {'module__dropout_rate': 0.8, 'module__weight_constraint': 1.0}
0.657552 (0.008027) with: {'module__dropout_rate': 0.8, 'module__weight_constraint': 2.0}
0.428385 (0.111418) with: {'module__dropout_rate': 0.8, 'module__weight_constraint': 3.0}
0.549479 (0.142719) with: {'module__dropout_rate': 0.8, 'module__weight_constraint': 4.0}
0.648438 (0.005524) with: {'module__dropout_rate': 0.8, 'module__weight_constraint': 5.0}
0.540365 (0.136861) with: {'module__dropout_rate': 0.9, 'module__weight_constraint': 1.0}
0.605469 (0.053083) with: {'module__dropout_rate': 0.9, 'module__weight_constraint': 2.0}
0.553385 (0.139948) with: {'module__dropout_rate': 0.9, 'module__weight_constraint': 3.0}
0.549479 (0.142719) with: {'module__dropout_rate': 0.9, 'module__weight_constraint': 4.0}
0.595052 (0.075566) with: {'module__dropout_rate': 0.9, 'module__weight_constraint': 5.0}

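You can see that a dropout rate of 10% and a weight constraint of 2.0 resulted in the best accuracy of about 70%.

How to Tune the Number of Neurons in the Hidden Layer

The number of neurons in a layer is an important parameter to tune. Generally, the number of neurons in a layer controls the representational capacity of the network, at least at that point in the topology.

In theory, by the universal approximation theorem, a sufficiently large single-layer network can approximate any other neural network.

In this example, you will tune the number of neurons in the single hidden layer. You will try the values 1, 5, 10, 15, 20, 25, and 30. A larger network requires more training, so ideally the batch size and the number of epochs should be tuned together with the number of neurons.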
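
One simple way to see how the neuron count changes the size of the network is to count its trainable parameters. The sketch below does this arithmetic for a network with 8 inputs, one hidden layer, and 1 output, counting weights plus biases in each fully connected layer. The function name is illustrative.

```python
def count_parameters(n_inputs, n_neurons, n_outputs=1):
    """Weights plus biases of a network with one fully connected hidden layer."""
    hidden = n_inputs * n_neurons + n_neurons      # hidden layer weights + biases
    output = n_neurons * n_outputs + n_outputs     # output layer weights + biases
    return hidden + output

for n in [1, 5, 10, 15, 20, 25, 30]:
    print(n, count_parameters(8, n))
```

For this shape the total works out to 10 * n + 1, so the model grows linearly with the number of hidden neurons.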

The full code listing is provided below.

import numpy as np
import torch
import torch.nn as nn
import torch.nn.init as init
import torch.optim as optim
from skorch import NeuralNetClassifier
from sklearn.model_selection import GridSearchCV

# load the dataset, split into input (X) and output (y) variables
dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
X = dataset[:,0:8]
y = dataset[:,8]
X = torch.tensor(X, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)

class PimaClassifier(nn.Module):
    def __init__(self, n_neurons=12):
        super().__init__()
        self.layer = nn.Linear(8, n_neurons)
        self.act = nn.ReLU()
        self.dropout = nn.Dropout(0.1)
        self.output = nn.Linear(n_neurons, 1)
        self.prob = nn.Sigmoid()
        self.weight_constraint = 2.0
        # manually init weights
        init.kaiming_uniform_(self.layer.weight)
        init.kaiming_uniform_(self.output.weight)

    def forward(self, x):
        # maxnorm weight before actual forward pass
        with torch.no_grad():
            norm = self.layer.weight.norm(2, dim=0, keepdim=True).clamp(min=self.weight_constraint / 2)
            desired = torch.clamp(norm, max=self.weight_constraint)
            self.layer.weight *= (desired / norm)
        # actual forward pass
        x = self.act(self.layer(x))
        x = self.dropout(x)
        x = self.prob(self.output(x))
        return x

# create model with skorch
model = NeuralNetClassifier(
    PimaClassifier,
    criterion=nn.BCELoss,
    optimizer=optim.Adamax,
    max_epochs=100,
    batch_size=10,
    verbose=False
)

# define the grid search parameters
param_grid = {
    'module__n_neurons': [1, 5, 10, 15, 20, 25, 30]
}
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(X, y)

# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Running this example produces the following output.

Best: 0.708333 using {'module__n_neurons': 30}
0.654948 (0.003683) with: {'module__n_neurons': 1}
0.666667 (0.023073) with: {'module__n_neurons': 5}
0.694010 (0.014382) with: {'module__n_neurons': 10}
0.682292 (0.014382) with: {'module__n_neurons': 15}
0.707031 (0.028705) with: {'module__n_neurons': 20}
0.703125 (0.030758) with: {'module__n_neurons': 25}
0.708333 (0.015733) with: {'module__n_neurons': 30}

You can see that the best result was achieved with 30 neurons in the hidden layer, with an accuracy of about 71%.
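
It can help to put these scores in perspective. The sketch below takes the mean and standard deviation values from the output above and converts each mean accuracy into an approximate count of correctly classified samples, assuming the dataset has 768 rows split into three equal folds (an assumption about the data file, not something the output states). It also flags which configurations fall within one standard deviation of the best score.

```python
N_SAMPLES = 768  # assumption: number of rows in the dataset

# n_neurons -> (mean accuracy, standard deviation), copied from the output above
results = {
    1: (0.654948, 0.003683),
    5: (0.666667, 0.023073),
    10: (0.694010, 0.014382),
    15: (0.682292, 0.014382),
    20: (0.707031, 0.028705),
    25: (0.703125, 0.030758),
    30: (0.708333, 0.015733),
}

# Rank configurations from best to worst mean accuracy
ranked = sorted(results.items(), key=lambda item: item[1][0], reverse=True)
best_n, (best_mean, best_std) = ranked[0]

close_to_best = []
for n, (mean, std) in ranked:
    correct = round(mean * N_SAMPLES)           # approximate correct predictions
    within = (best_mean - mean) <= best_std     # within one std of the best?
    if within:
        close_to_best.append(n)
    print(f"n_neurons={n:2d} correct={correct} within_one_std={within}")
```

Under that assumption, the gap between 30 and 20 neurons is a single sample, so the ranking among the top configurations should be read with caution.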

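Tips for Hyperparameter Optimization

This section lists some handy tips to consider when tuning the hyperparameters of your neural network.

K-fold cross-validation. You can see that the results in the examples in this post show some variance. The examples use 3-fold cross-validation (cv=3), but k=5 or k=10 may be more stable. Choose your cross-validation configuration carefully so that your results are stable.
Review the whole grid. Do not just focus on the best result; review the whole grid of results and look for trends to support configuration decisions. Keep in mind that a larger grid has more combinations and takes longer to evaluate.
Parallelize. Use all your cores if you can; neural networks are slow to train and you often want to try many different parameters. Consider running the search on a cloud platform such as AWS.
Use a sample of your dataset. Because networks are slow to train, try training them on a smaller sample of your training dataset, just to get an idea of the general direction of the parameters rather than the optimal configuration.
Start with coarse grids. Start with a coarse grid and zoom into a finer grid once you can narrow the scope.
Do not transfer results. Results are generally problem specific. Try to avoid favorite configurations on each new problem that you see. The optimal results you discover on one problem are unlikely to transfer to your next project. Instead, look for broader trends, such as the number of layers or relationships between parameters.
Reproducibility is a problem. Even if you seed the random number generators (the listings in this post do not), the results are not 100% reproducible. Reproducibility when grid searching wrapped PyTorch models is more involved than what is presented in this post.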
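
The coarse-to-fine tip can be sketched as a small helper that builds a finer grid between the neighbors of the best coarse value. The function name, the dropout-style values, and the choice of winner below are all hypothetical and only illustrate the idea.

```python
def refine_around(grid, best, points=5):
    """Return `points` evenly spaced values spanning the neighbors of `best`."""
    i = grid.index(best)
    lower = grid[max(i - 1, 0)]
    upper = grid[min(i + 1, len(grid) - 1)]
    step = (upper - lower) / (points - 1)
    # round to hide floating point noise such as 0.30000000000000004
    return [round(lower + k * step, 6) for k in range(points)]

coarse_grid = [0.0, 0.2, 0.4, 0.6, 0.8]
best_coarse = 0.2               # hypothetical winner of the coarse search
fine_grid = refine_around(coarse_grid, best_coarse)
print(fine_grid)                # [0.0, 0.1, 0.2, 0.3, 0.4]
```

The finer grid can then be passed to a second grid search in place of the coarse one.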

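Further Reading

This section lists some resources if you want to go deeper into the topic.

skorch documentation
torch.nn from PyTorch
GridSearchCV from scikit-learn

Summary

In this post, you discovered how to tune the hyperparameters of your deep learning networks in Python using PyTorch and scikit-learn.
Specifically, you learned:

How to wrap PyTorch models for use in scikit-learn and how to use grid search.
How to grid search a suite of standard neural network parameters for PyTorch models.
How to design your own hyperparameter optimization experiments.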

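6 Responses to How to Grid Search Hyperparameters for PyTorch Models

Aminul October 12, 2023 at 5:49 am #

Hi,
In this case, can I use a PyTorch model with GPU support?

- James Carmichael October 12, 2023 at 8:28 am #
  
  Hi Aminul... The following resource is a great starting point for using the GPU in PyTorch.
  
  https://medium.com/ai%C2%B3-theory-practice-business/use-gpu-in-your-pytorch-code-676a67faed09

Yaswanth November 22, 2023 at 3:33 am #

Hi,

The code fails at the grid.fit(x,y) line. What could be the error?

- James Carmichael November 22, 2023 at 10:30 am #
  
  Hi Yaswanth... Please let us know the exact wording of the error you are seeing. That will help us assist you better.

Lukas January 23, 2024 at 1:22 am #

Hi,

I just want to say that these guides on machinelearningmastery.com have been very valuable to me in learning machine learning. I have only been at it for 2 years, but I am excited about the future.

Thank you very much.

All the best

- James Carmichael January 23, 2024 at 9:17 am #
  
  Thank you for the feedback, Lukas! We greatly appreciate your support!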
