yikit.visualize package

Visualization tools for machine learning models.

This module provides utilities for visualizing model results, including permutation importance, learning curves, distribution plots, and matplotlib configuration helpers.

class yikit.visualize.SummarizePI(importances)

Bases: object

get_data()
get_figure(fontfamily: str | None = None)

yikit.visualize.get_dist_figure(y_dist, y_true=None, keep_y_range=True, return_axis=False, verbose=True, titles=[], fontfamily: str | None = None)

Get a figure of the predicted distributions.

Parameters:
  • y_dist (ngboost.distns.Distn object) – generated by ngb.pred_dist(X)

  • y_true (1d array, optional) – true target values corresponding to the samples in y_dist, by default None

  • keep_y_range (bool, optional) – keep y range equal or not, by default True

  • return_axis (bool, optional) – return axis or not, by default False

  • verbose (bool, optional) – draw a progress bar or not, by default True

  • titles (1d list, optional) – titles whose length == n_samples, by default []

  • fontfamily (str, optional) – font family to use for the figure, by default None

Return type:

matplotlib.pyplot.Figure | (matplotlib.pyplot.Figure, matplotlib.pyplot.Axis)

yikit.visualize.get_learning_curve_gb(estimator, fontfamily: str | None = None, return_axis: bool = False)

Plot learning curve for gradient boosting models.

This function visualizes the learning curve (evaluation scores over boosting iterations) for NGBoost or LightGBM models. It plots the evaluation scores stored in the estimator's evals_result or evals_result_ attribute.

Parameters:
  • estimator (NGBRegressor, NGBClassifier, LGBMRegressor, or LGBMClassifier) – A fitted gradient boosting model with an evals_result or evals_result_ attribute containing evaluation history.

  • fontfamily (str or None, optional) – Font family to use for the plot. If None, uses default font. Default is None.

  • return_axis (bool, optional) – Whether to return the matplotlib axis object along with the figure. Default is False.

Returns:

If return_axis=False, returns only the figure. If return_axis=True, returns a (figure, axis) tuple.

Return type:

matplotlib.pyplot.Figure or tuple

Raises:

TypeError – If the estimator is not an NGBoost or LightGBM model.
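
The attribute lookup described above can be sketched roughly as follows. This is an illustration, not the library's implementation: the helper name find_eval_history and the FakeBooster stand-in are hypothetical, the sketch duck-types on the attribute name rather than checking the estimator class, and it assumes the history is a nested mapping of dataset name to metric name to per-iteration scores.

```python
from typing import Dict, List


def find_eval_history(estimator) -> Dict[str, List[float]]:
    """Return {"dataset/metric": scores} from a fitted booster-like object.

    Hypothetical helper: looks for either attribute name mentioned in the
    parameter description and raises TypeError when neither is usable.
    """
    for name in ("evals_result", "evals_result_"):
        history = getattr(estimator, name, None)
        if isinstance(history, dict) and history:
            break
    else:
        # Neither attribute held a non-empty dict.
        raise TypeError(
            f"{type(estimator).__name__} has no evaluation history attribute"
        )

    # Flatten the assumed {dataset: {metric: [scores]}} layout, one curve per key.
    curves = {}
    for dataset, metrics in history.items():
        for metric, scores in metrics.items():
            curves[f"{dataset}/{metric}"] = list(scores)
    return curves


class FakeBooster:
    """Stand-in for a fitted model; the numbers are made up for illustration."""

    evals_result_ = {"valid_0": {"l2": [1.0, 0.6, 0.45]}}


curves = find_eval_history(FakeBooster())
```

Each entry in curves would then correspond to one line in the plot, with the list index as the iteration number.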

Examples

>>> from yikit.visualize import get_learning_curve_gb
>>> from ngboost import NGBRegressor
>>> import numpy as np
>>> X = np.random.randn(100, 10)
>>> y = np.random.randn(100)
>>> model = NGBRegressor()
>>> model.fit(X, y)
>>> fig = get_learning_curve_gb(model)

yikit.visualize.get_learning_curve_optuna(study: Study, loc='best', fontfamily: str | None = None, return_axis: bool = False)

Plot learning curve for Optuna optimization study.

This function visualizes the optimization history of an Optuna study, showing both the objective values of all trials and the best value found so far at each trial.
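
The "best value found so far" curve is a running minimum (or maximum) over the trial values. A minimal sketch, assuming a single-objective study; the helper name best_so_far and the sample values are hypothetical and not part of yikit:

```python
from itertools import accumulate


def best_so_far(values, direction="minimize"):
    """Running best of a sequence of objective values."""
    pick = min if direction == "minimize" else max
    # accumulate(values, pick) yields pick(previous_best, current_value) at each step.
    return list(accumulate(values, pick))


# Made-up objective values for five trials.
trial_values = [9.0, 4.0, 6.25, 1.0, 2.25]
running_best = best_so_far(trial_values)  # [9.0, 4.0, 4.0, 1.0, 1.0]
```

With a real study, the input could be collected as [t.value for t in study.trials if t.value is not None], since failed or pruned trials may carry no value.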

Parameters:
  • study (optuna.study.Study) – An Optuna study object containing the optimization history.

  • loc (str, optional) – Legend location. Can be any valid matplotlib legend location string. Default is ‘best’.

  • fontfamily (str or None, optional) – Font family to use for the plot. If None, uses default font. Default is None.

  • return_axis (bool, optional) – Whether to return the matplotlib axis object along with the figure. Default is False.

Returns:

If return_axis=False, returns only the figure. If return_axis=True, returns a (figure, axis) tuple.

Return type:

matplotlib.pyplot.Figure or tuple

Examples

>>> from yikit.visualize import get_learning_curve_optuna
>>> import optuna
>>>
>>> def objective(trial):
...     x = trial.suggest_float('x', -10, 10)
...     return (x - 2) ** 2
>>>
>>> study = optuna.create_study()
>>> study.optimize(objective, n_trials=100)
>>> fig = get_learning_curve_optuna(study)

yikit.visualize.set_font(fontfamily: str | None = None, fontsize: int = 13)

Set matplotlib font family and size.

This function configures the default font family and size for matplotlib plots. If no font family is specified, it uses a platform-appropriate default.

Parameters:
  • fontfamily (str or None, default=None) – Font family to use. If None, uses the platform default: ‘Helvetica’ on macOS, ‘DejaVu Sans’ on Windows/Linux.

  • fontsize (int, default=13) – Font size to use for all text elements.
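
The platform-dependent default can be sketched as below. The helper resolve_font_family and its system argument are hypothetical and exist only to make the rule testable; the sketch assumes macOS is detected through platform.system() returning "Darwin", which may differ from how set_font detects it.

```python
import platform
from typing import Optional


def resolve_font_family(fontfamily: Optional[str] = None,
                        system: Optional[str] = None) -> str:
    """Pick the font family following the documented defaults."""
    if fontfamily is not None:
        return fontfamily  # an explicit choice always wins
    system = system or platform.system()
    # platform.system() reports "Darwin" on macOS.
    return "Helvetica" if system == "Darwin" else "DejaVu Sans"


mac_default = resolve_font_family(system="Darwin")   # "Helvetica"
linux_default = resolve_font_family(system="Linux")  # "DejaVu Sans"
explicit = resolve_font_family("Arial")              # "Arial"
```

The resolved name would typically be written to matplotlib's rcParams["font.family"], and the size to rcParams["font.size"]; whether set_font touches exactly these keys is an assumption here.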

Examples

>>> from yikit.visualize import set_font
>>> set_font(fontfamily="Arial", fontsize=14)
>>> # Now all matplotlib plots will use Arial font at size 14

yikit.visualize.with_custom_matplotlib_settings(fontfamily: str | None = None, fontsize: int = 13, restore: bool = True)

Decorator to apply custom matplotlib settings temporarily.

This decorator applies custom font family and size settings to matplotlib for the duration of the decorated function execution. After execution, the original settings are restored if restore=True.
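
The save, apply, and restore pattern behind such a decorator can be sketched without matplotlib. In this illustration a plain dict named SETTINGS stands in for matplotlib.rcParams, and with_temporary_settings is a hypothetical name; it is not the library's code.

```python
import functools

# Stand-in for matplotlib.rcParams so the sketch runs without matplotlib.
SETTINGS = {"font.family": "DejaVu Sans", "font.size": 10}


def with_temporary_settings(overrides, restore=True):
    """Decorator factory: apply overrides while the wrapped function runs."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            saved = dict(SETTINGS)       # snapshot the current settings
            SETTINGS.update(overrides)   # apply the temporary values
            try:
                return func(*args, **kwargs)
            finally:
                if restore:
                    # Put the snapshot back even if func raised.
                    SETTINGS.clear()
                    SETTINGS.update(saved)

        return wrapper

    return decorator


@with_temporary_settings({"font.family": "Helvetica", "font.size": 14})
def current_font():
    return SETTINGS["font.family"], SETTINGS["font.size"]


inside = current_font()
after = (SETTINGS["font.family"], SETTINGS["font.size"])
```

Restoring in a finally clause keeps the global state clean when the decorated function raises. For ad hoc use, matplotlib's own rc_context context manager offers similar temporary overrides.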

Parameters:
  • fontfamily (str or None, optional) – Font family to use. If None, uses the platform default: ‘Helvetica’ on macOS, ‘DejaVu Sans’ on Windows/Linux. Default is None.

  • fontsize (int, optional) – Font size to use for all text elements. Default is 13.

  • restore (bool, optional) – Whether to restore original matplotlib settings after function execution. Default is True.

Returns:

Decorator function that wraps the target function.

Return type:

Callable

Examples

>>> from yikit.visualize import with_custom_matplotlib_settings
>>> import matplotlib.pyplot as plt
>>>
>>> @with_custom_matplotlib_settings(fontfamily="Helvetica", fontsize=14)
... def plot_data():
...     plt.plot([1, 2, 3], [1, 4, 9])
...     return plt.gcf()
>>>
>>> fig = plot_data()  # Uses Helvetica font at size 14
>>> # After function execution, original settings are restored

yikit.visualize.yyplot(y_true: ArrayLike, y_pred: ArrayLike, *, labels: Sequence[str | None] | None = None, metrics: Sequence[Literal['r2', 'rmse', 'mae', 'mse']] = ('r2', 'rmse'), ax: Axes | None = None, alpha: float = 0.05) → Axes
yikit.visualize.yyplot(y_train: ArrayLike, y_pred_on_train: ArrayLike, y_test: ArrayLike, y_pred_on_test: ArrayLike, *, labels: Sequence[str] | None = ('train', 'test'), metrics: Sequence[Literal['r2', 'rmse', 'mae', 'mse']] = ('r2', 'rmse'), ax: Axes | None = None, alpha: float = 0.05) → Axes
yikit.visualize.yyplot(y_train: ArrayLike, y_pred_on_train: ArrayLike, y_val: ArrayLike, y_pred_on_val: ArrayLike, y_test: ArrayLike, y_pred_on_test: ArrayLike, *, labels: Sequence[str | None] | None = ('train', 'val', 'test'), metrics: Sequence[Literal['r2', 'rmse', 'mae', 'mse']] = ('r2', 'rmse'), ax: Axes | None = None, alpha: float = 0.05) → Axes
yikit.visualize.yyplot(*y_data: ArrayLike, labels: Sequence[str | None] | None = None, metrics: Sequence[Literal['r2', 'rmse', 'mae', 'mse']] = ('r2', 'rmse'), ax: Axes | None = None, alpha: float = 0.05) → Axes

Plot true vs. predicted values for one or more data sets.

This function draws a scatter plot of multiple (y_true, y_pred) pairs and overlays the identity line \(y = x\) as a reference. It computes summary regression metrics for each pair and displays them as text in the top-left of the plot. Typical usage is to compare model performance on different splits such as train, validation, and test.
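
For reference, the four metric names correspond to the following standard definitions. This is a dependency-free sketch; the helper regression_metrics is hypothetical, and the library may compute these differently (for example via scikit-learn).

```python
import math


def regression_metrics(y_true, y_pred):
    """Standard R^2, RMSE, MAE, and MSE for two equal-length sequences."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 = 1 - SS_res / SS_tot; undefined when y_true is constant (ss_tot == 0).
    r2 = 1.0 - (mse * n) / ss_tot
    return {"r2": r2, "rmse": math.sqrt(mse), "mae": mae, "mse": mse}


scores = regression_metrics([1.0, 2.0, 3.0], [0.8, 2.1, 2.9])
# r2 ≈ 0.97, rmse ≈ 0.1414, mae ≈ 0.1333, mse ≈ 0.02
```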

Parameters:
  • *y_data (ArrayLike) – A sequence of arrays interpreted as consecutive pairs (y_true_0, y_pred_0, y_true_1, y_pred_1, ...). The length of y_data must be even, and each pair must be broadcastable to the same shape.

  • labels (Sequence of str or None, optional) – Labels for each (y_true, y_pred) pair, used in the legend and metric annotations. If None (default), labels are automatically set to ("train", "test") for two data sets, to ("train", "val", "test") for three data sets, or to all-None for other numbers of data sets.

  • metrics (Sequence of {"r2", "rmse", "mae", "mse"}, optional) – Metrics to compute for each data set. Each metric is shown in the annotation box together with the corresponding label (if provided). Defaults to ("r2", "rmse").

  • ax (matplotlib.axes.Axes, optional) – Existing Axes on which to draw the plot. If None, a new figure and axes are created.

  • alpha (float, optional) – Relative margin added to the data range when determining plot limits. The default is 0.05 (5% margin on each side).
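
One plausible reading of the alpha margin is sketched below. It assumes the margin is alpha times the combined range of all plotted values and that the same limits are used on both axes; the helper padded_limits is hypothetical.

```python
def padded_limits(*arrays, alpha=0.05):
    """Return (low, high) limits padded by alpha * data range on each side."""
    values = [v for arr in arrays for v in arr]
    lo, hi = min(values), max(values)
    margin = alpha * (hi - lo)
    return lo - margin, hi + margin


# Combined range is 0.8 to 3.0, so each side gets 0.05 * 2.2 = 0.11 of padding.
limits = padded_limits([1.0, 2.0, 3.0], [0.8, 2.1, 2.9], alpha=0.05)
```

Using identical limits for x and y keeps the identity line on the diagonal of the plot.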

Returns:

The axes object with the scatter plot and annotations.

Return type:

matplotlib.axes.Axes

Raises:

ValueError – If no data is provided, or if the number of positional arguments is odd (i.e., there is an unmatched y_true or y_pred), or if the length of labels does not match the number of data sets, or if an unknown metric name is given in metrics.
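
The pairing and validation rules for the positional arguments can be sketched as follows. The helper pair_up is hypothetical and only mirrors the documented behaviour (even number of arrays, default labels for two or three data sets, ValueError otherwise); it is not the library's code.

```python
def pair_up(*y_data, labels=None):
    """Group positional arrays into (label, (y_true, y_pred)) entries."""
    if not y_data:
        raise ValueError("no data provided")
    if len(y_data) % 2:
        raise ValueError("expected an even number of arrays (y_true, y_pred pairs)")

    # Even positions hold y_true, odd positions hold the matching y_pred.
    pairs = list(zip(y_data[0::2], y_data[1::2]))

    if labels is None:
        defaults = {2: ("train", "test"), 3: ("train", "val", "test")}
        labels = defaults.get(len(pairs), (None,) * len(pairs))
    if len(labels) != len(pairs):
        raise ValueError("labels must have one entry per data set")
    return list(zip(labels, pairs))


single = pair_up([1, 2], [1.1, 1.9])
split = pair_up([1, 2], [1.1, 1.9], [3, 4], [2.8, 4.2])
```

Here single carries the label None, while split picks up the default "train" and "test" labels.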

Examples

Plot a single data set:

>>> import numpy as np
>>> from yikit.visualize import yyplot
>>> y_true = np.array([1.0, 2.0, 3.0])
>>> y_pred = np.array([0.8, 2.1, 2.9])
>>> ax = yyplot(y_true, y_pred)

Plot train and test data sets:

>>> y_train, y_pred_train = np.arange(5), np.arange(5) + 0.1
>>> y_test, y_pred_test = np.arange(5), np.arange(5) - 0.2
>>> ax = yyplot(y_train, y_pred_train, y_test, y_pred_test,
...             labels=("train", "test"))