7 Matplotlib Tricks to Better Visualize Your Machine Learning Models

Image by Author | ChatGPT

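Introduction

Visualizing model performance is an essential part of the machine learning workflow. Many practitioners can create basic plots, but turning those simple charts into insightful visualizations that clearly tell the story of a model's interpretations and predictions is a skill that sets great professionals apart. As the foundational plotting tool in the scientific and computational Python ecosystem, the Matplotlib library is packed with features that can help you achieve this.

This tutorial provides 7 practical Matplotlib tricks that will help you better understand, evaluate, and present your machine learning models. We'll move beyond the default settings to create visualizations that are not only attractive but also informative. These techniques are designed to integrate smoothly into your workflow alongside libraries such as NumPy and Scikit-learn.

The assumption here is that you are already familiar with Matplotlib and its general usage, as we won't cover that here. Instead, we'll focus on how to improve your code in 7 specific scenarios related to machine learning tasks.

Because we treat each code solution independently, be prepared to see import matplotlib.pyplot as plt many times today 🙂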

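1. Applying Professional Styles for Instant Polish

Matplotlib's default look can sometimes feel a little... dated. A simple and effective fix is to use Matplotlib's built-in style sheets. With a single line of code, you can apply a professional style that mimics the aesthetics of popular tools such as R's ggplot or the Seaborn library. This instantly improves readability and visual appeal.

Let's see what difference a style sheet makes. We'll start with a basic scatter plot and then apply the 'seaborn-v0_8-whitegrid' style.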
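The original listing is not reproduced here, so the following is only a minimal sketch of the idea. It assumes synthetic data generated with NumPy (the variable names and the side-by-side layout are illustrative choices, not the author's), and it falls back to the default style if 'seaborn-v0_8-whitegrid' is not available in your Matplotlib version.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data, made up purely for illustration
rng = np.random.default_rng(42)
x = rng.normal(size=100)
y = 2 * x + rng.normal(scale=0.5, size=100)

# Style sheet names differ across Matplotlib versions, so fall back
# to the default style if the requested one is not installed
style = "seaborn-v0_8-whitegrid"
if style not in plt.style.available:
    style = "default"

fig = plt.figure(figsize=(10, 4))

# Left panel: Matplotlib's default look
ax_default = fig.add_subplot(1, 2, 1)
ax_default.scatter(x, y)
ax_default.set_title("Default style")

# Right panel: the same data, with the axes created under the style sheet
with plt.style.context(style):
    ax_styled = fig.add_subplot(1, 2, 2)
    ax_styled.scatter(x, y)
    ax_styled.set_title(f"Style: {style}")

fig.tight_layout()
```

Call plt.show() or fig.savefig(...) to view the result. To restyle every figure in a script rather than a single panel, a one-line plt.style.use(style) at the top does the same job globally.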

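The resulting visualization is shown below:

Applying professional styles for instant polish

As you can see, applying the style adds a grid, changes the font, and adjusts the overall color scheme, making the chart easier to interpret.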

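2. Visualizing Classifier Decision Boundaries

Understanding how a classification model separates the data is essential. A decision boundary plot shows the regions of the feature space that the model associates with each class. This visualization is a valuable tool for diagnosing how a model generalizes and where it is likely to make mistakes.

We'll train a support vector machine (SVM) on the classic Iris dataset and plot its decision boundaries. To keep the plot two-dimensional, we'll use only two features. The trick is to create a grid of points, have the model predict a class for every point, and then draw the colored regions with plt.contourf().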
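The grid-and-predict steps described above could be sketched roughly as follows. The choice of the first two Iris features, the RBF kernel, the grid resolution, and the color map are assumptions made for illustration rather than the author's exact settings.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

# Keep only the first two features (sepal length and width) so we can plot in 2D
iris = load_iris()
X = iris.data[:, :2]
y = iris.target

# Kernel and gamma are illustrative choices
clf = SVC(kernel="rbf", gamma="scale").fit(X, y)

# Build a dense grid that covers the data with a small margin
x_min, x_max = X[:, 0].min() - 0.5, X[:, 0].max() + 0.5
y_min, y_max = X[:, 1].min() - 0.5, X[:, 1].max() + 0.5
xx, yy = np.meshgrid(
    np.linspace(x_min, x_max, 300),
    np.linspace(y_min, y_max, 300),
)

# Predict a class for every grid point, then reshape back to the grid
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

fig, ax = plt.subplots(figsize=(7, 5))
# One filled band per class label (0, 1, 2); vmin/vmax keep the
# regions on the same color scale as the scatter points
ax.contourf(xx, yy, Z, levels=[-0.5, 0.5, 1.5, 2.5],
            cmap="viridis", alpha=0.3, vmin=0, vmax=2)
scatter = ax.scatter(X[:, 0], X[:, 1], c=y, cmap="viridis",
                     edgecolors="k", vmin=0, vmax=2)
ax.set_xlabel(iris.feature_names[0])
ax.set_ylabel(iris.feature_names[1])
ax.set_title("SVM decision boundaries on two Iris features")
ax.legend(handles=scatter.legend_elements()[0],
          labels=list(iris.target_names), title="Species")
fig.tight_layout()
```

A finer grid gives smoother boundaries at the cost of more predictions; 300 points per axis is usually plenty for a static figure.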

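Here is the visualization of our classifier's decision boundaries:

Visualizing classifier decision boundaries

This plot shows how the SVM classifier partitions the feature space to separate the three Iris species.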

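3. Plotting a Clear Receiver Operating Characteristic Curve

The receiver operating characteristic (ROC) curve is a standard tool for evaluating binary classifiers. An ROC plot shows the true positive rate against the false positive rate at various threshold settings. The area under the curve (AUC) condenses the model's performance into a single number. A good ROC plot should include the AUC score and a baseline for comparison.

Let's use Scikit-learn to compute the ROC curve points and the AUC, then plot them clearly with Matplotlib. Adding a label that contains the AUC score makes the plot self-contained and easy to understand.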
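One way to put this together is sketched below. The synthetic dataset from make_classification and the logistic regression model are stand-ins chosen for illustration; any binary classifier that exposes predicted probabilities or decision scores would work the same way.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic binary classification problem (an assumption for illustration)
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# ROC analysis needs scores, not hard labels: use the positive-class probability
y_score = model.predict_proba(X_test)[:, 1]
fpr, tpr, _ = roc_curve(y_test, y_score)
roc_auc = auc(fpr, tpr)

fig, ax = plt.subplots(figsize=(6, 6))
# Put the AUC in the legend label so the figure is self-explanatory
ax.plot(fpr, tpr, lw=2, label=f"ROC curve (AUC = {roc_auc:.2f})")
# Diagonal baseline: the expected curve for random guessing
ax.plot([0, 1], [0, 1], linestyle="--", color="gray", label="Chance (AUC = 0.50)")
ax.set_xlim(0.0, 1.0)
ax.set_ylim(0.0, 1.05)
ax.set_xlabel("False positive rate")
ax.set_ylabel("True positive rate")
ax.set_title("Receiver operating characteristic")
ax.legend(loc="lower right")
fig.tight_layout()
```

Always compute the curve on held-out data, as done here with the test split; an ROC curve drawn from training data tends to look better than the model deserves.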

The resulting ROC curve plot is shown below:

Plotting a clear receiver operating characteristic curve

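4. Building an Annotated Confusion Matrix Heatmap

A confusion matrix is a table that summarizes the performance of a classification model. The raw numbers are useful, but a heatmap makes patterns, such as which classes are frequently confused, much faster to spot. Annotating the heatmap with the actual counts provides both a quick visual summary and precise detail.

We'll use Matplotlib's imshow() function to create the heatmap, then loop over the matrix to add a text label to each cell.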
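A minimal sketch of this approach follows. The Iris dataset, the logistic regression classifier, and the rule for switching the text color on dark cells are assumptions added for illustration.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Dataset and model are illustrative choices
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42, stratify=iris.target
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, model.predict(X_test))
names = iris.target_names

fig, ax = plt.subplots(figsize=(6, 5))
im = ax.imshow(cm, cmap="Blues")
fig.colorbar(im, ax=ax)

ax.set_xticks(range(len(names)))
ax.set_xticklabels(names)
ax.set_yticks(range(len(names)))
ax.set_yticklabels(names)
ax.set_xlabel("Predicted label")
ax.set_ylabel("True label")
ax.set_title("Confusion matrix")

# Annotate every cell with its count; use white text on dark cells
threshold = cm.max() / 2
for i in range(cm.shape[0]):
    for j in range(cm.shape[1]):
        ax.text(j, i, str(cm[i, j]), ha="center", va="center",
                color="white" if cm[i, j] > threshold else "black")

fig.tight_layout()
```

Note that imshow() places row i at vertical position i and column j at horizontal position j, which is why the text call uses (j, i) rather than (i, j).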

Here is the resulting confusion matrix, which is quick to interpret:

Building an annotated confusion matrix heatmap

5. Highlighting Feature Importance

For many models, especially tree-based ensembles like random forests or gradient boosting, we can extract a measure of how important each feature was in making predictions. Visualizing these scores helps in understanding the model’s behavior and guiding feature selection efforts. A horizontal bar chart is often the best choice for this task.

We’ll train a RandomForestClassifier, extract the feature importances, and display them in a sorted horizontal bar chart for easy comparison.
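As a rough sketch of those steps, assuming the Wine dataset bundled with Scikit-learn (the dataset and the hyperparameters are illustrative choices, not taken from the original listing):

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier

# Dataset and hyperparameters are assumptions for illustration
wine = load_wine()
forest = RandomForestClassifier(n_estimators=200, random_state=42)
forest.fit(wine.data, wine.target)

# Impurity-based importances; they sum to 1 across all features
importances = forest.feature_importances_

# Sort ascending: barh() draws the first bar at the bottom,
# so the most important feature ends up at the top
order = np.argsort(importances)
sorted_names = [wine.feature_names[i] for i in order]
sorted_importances = importances[order]

fig, ax = plt.subplots(figsize=(8, 6))
ax.barh(sorted_names, sorted_importances)
ax.set_xlabel("Importance")
ax.set_title("Random forest feature importances")
fig.tight_layout()
```

Impurity-based importances can overstate features with many distinct values, so it is worth cross-checking surprising rankings with another method such as permutation importance.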

Let’s take a look at the feature importances plotted

Highlighting feature importance

6. Plotting Diagnostic Learning Curves

Learning curves are a powerful tool for diagnosing whether a model is suffering from a bias problem (underfitting) or a variance problem (overfitting). They show the model’s performance on the training set and the validation set as a function of the number of training samples.

We’ll use Scikit-learn’s learning_curve utility to generate the scores and then plot them. A key trick here is to also plot the standard deviation of the scores to understand the stability of the model’s performance.
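The sketch below shows one way to do this. The digits dataset, the SVC estimator with gamma=0.001, and the five training set sizes are assumptions chosen to keep the example small and reasonably quick.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import learning_curve
from sklearn.svm import SVC

# Dataset and estimator are illustrative choices
X, y = load_digits(return_X_y=True)

# Scores have shape (number of training sizes, number of CV folds)
train_sizes, train_scores, val_scores = learning_curve(
    SVC(gamma=0.001), X, y, cv=5, train_sizes=np.linspace(0.1, 1.0, 5)
)

# Average over folds, and keep the spread to show stability
train_mean = train_scores.mean(axis=1)
train_std = train_scores.std(axis=1)
val_mean = val_scores.mean(axis=1)
val_std = val_scores.std(axis=1)

fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(train_sizes, train_mean, "o-", label="Training score")
ax.plot(train_sizes, val_mean, "o-", label="Cross-validation score")

# Shaded bands: one standard deviation above and below each mean
ax.fill_between(train_sizes, train_mean - train_std,
                train_mean + train_std, alpha=0.2)
ax.fill_between(train_sizes, val_mean - val_std,
                val_mean + val_std, alpha=0.2)

ax.set_xlabel("Number of training samples")
ax.set_ylabel("Accuracy")
ax.set_title("Learning curve")
ax.legend(loc="lower right")
fig.tight_layout()
```

When reading the plot, a large persistent gap between the two curves suggests overfitting, while two curves that converge at a low score suggest underfitting.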

This is the resulting learning curve plot

Plotting diagnostic learning curves

7. Creating a Gallery of Models with Subplots

There are times when you will want to compare the performance of several different models. Placing their visualizations side-by-side in a single figure makes this comparison direct and efficient. Matplotlib’s subplot functionality is perfect for creating this kind of “model gallery.”

We’ll create a grid of plots, with each subplot showing the decision boundary for a different classifier on the same dataset.
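A compact sketch of such a gallery is shown below. The two-moons dataset from make_moons and the four classifiers are assumptions picked for illustration; the pattern is the same for any set of models that share a 2D feature space.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic 2D dataset (an assumption for illustration)
X, y = make_moons(n_samples=200, noise=0.25, random_state=42)

# Models to compare; names and settings are illustrative
classifiers = {
    "Logistic Regression": LogisticRegression(),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
    "SVM (RBF kernel)": SVC(gamma="scale"),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# One shared grid so every panel covers exactly the same region
xx, yy = np.meshgrid(
    np.linspace(X[:, 0].min() - 0.5, X[:, 0].max() + 0.5, 200),
    np.linspace(X[:, 1].min() - 0.5, X[:, 1].max() + 0.5, 200),
)
grid = np.c_[xx.ravel(), yy.ravel()]

fig, axes = plt.subplots(2, 2, figsize=(10, 8), sharex=True, sharey=True)

for ax, (name, clf) in zip(axes.ravel(), classifiers.items()):
    clf.fit(X, y)
    Z = clf.predict(grid).reshape(xx.shape)
    # One filled band per class label (0 and 1)
    ax.contourf(xx, yy, Z, levels=[-0.5, 0.5, 1.5], cmap="coolwarm", alpha=0.3)
    ax.scatter(X[:, 0], X[:, 1], c=y, cmap="coolwarm", edgecolors="k", s=20)
    ax.set_title(f"{name} (training accuracy {clf.score(X, y):.2f})")

fig.suptitle("Decision boundaries of four classifiers")
fig.tight_layout()
```

Sharing the axes with sharex and sharey keeps the panels directly comparable, since every model is drawn over the same region at the same scale.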

Here is the gallery of the different classifiers' decision boundaries:

Creating a gallery of models with subplots

Summary

Mastering these 7 Matplotlib tricks will significantly enhance your ability to analyze, diagnose, and communicate the results of your machine learning models. Effective visualization is not only about creating pretty pictures; it’s about crafting and presenting a deeper intuition for how models work and conveying complex findings in a clear, impactful way. By moving beyond default plots and thoughtfully crafting your visualizations, you can accelerate your own understanding and more effectively share your insights with others.

Machine Learning Mastery is part of Guiding Tech Media, a leading digital media publisher focused on helping people understand technology.