yikit.visualize package
Visualization tools for machine learning models.
This module provides utilities for visualizing model results, including permutation importance, learning curves, distribution plots, and matplotlib configuration helpers.
- class yikit.visualize.SummarizePI(importances)
Bases: object
- get_data()
- get_figure(fontfamily: str | None = None)
- yikit.visualize.get_dist_figure(y_dist, y_true=None, keep_y_range=True, return_axis=False, verbose=True, titles=[], fontfamily: str | None = None)
Get a figure of the predicted distributions.
- Parameters:
y_dist (ngboost.distns.Distn object) – generated by ngb.pred_dist(X)
y_true (1d array, optional) – true target values, by default None
keep_y_range (bool, optional) – whether to keep the y range equal, by default True
return_axis (bool, optional) – whether to return the axis along with the figure, by default False
verbose (bool, optional) – whether to draw a progress bar, by default True
titles (1d list, optional) – titles, with length == n_samples, by default []
fontfamily (str, optional) – font family to use, by default None
- Return type:
matplotlib.pyplot.Figure | (matplotlib.pyplot.Figure, matplotlib.axes.Axes)
- yikit.visualize.get_learning_curve_gb(estimator, fontfamily: str | None = None, return_axis: bool = False)
Plot learning curve for gradient boosting models.
This function visualizes the learning curve (evaluation scores over iterations) for NGBoost or LightGBM models. It plots the evaluation scores stored in the estimator's evals_result or evals_result_ attribute.
- Parameters:
estimator (NGBRegressor, NGBClassifier, LGBMRegressor, or LGBMClassifier) – A fitted gradient boosting model with an evals_result or evals_result_ attribute containing evaluation history.
fontfamily (str or None, optional) – Font family to use for the plot. If None, uses default font. Default is None.
return_axis (bool, optional) – Whether to return the matplotlib axis object along with the figure. Default is False.
- Returns:
If return_axis=False, returns only the figure. If return_axis=True, returns (figure, axis) tuple.
- Return type:
matplotlib.pyplot.Figure or tuple
- Raises:
TypeError – If the estimator is not an NGBoost or LightGBM model.
Examples
>>> from yikit.visualize import get_learning_curve_gb
>>> from ngboost import NGBRegressor
>>> import numpy as np
>>> X = np.random.randn(100, 10)
>>> y = np.random.randn(100)
>>> model = NGBRegressor()
>>> model.fit(X, y)
>>> fig = get_learning_curve_gb(model)
- yikit.visualize.get_learning_curve_optuna(study: Study, loc='best', fontfamily: str | None = None, return_axis: bool = False)
Plot learning curve for Optuna optimization study.
This function visualizes the optimization history of an Optuna study, showing both the objective values of all trials and the best value found so far at each trial.
- Parameters:
study (optuna.study.Study) – An Optuna study object containing the optimization history.
loc (str, optional) – Legend location. Can be any valid matplotlib legend location string. Default is ‘best’.
fontfamily (str or None, optional) – Font family to use for the plot. If None, uses default font. Default is None.
return_axis (bool, optional) – Whether to return the matplotlib axis object along with the figure. Default is False.
- Returns:
If return_axis=False, returns only the figure. If return_axis=True, returns (figure, axis) tuple.
- Return type:
matplotlib.pyplot.Figure or tuple
Examples
>>> from yikit.visualize import get_learning_curve_optuna
>>> import optuna
>>>
>>> def objective(trial):
...     x = trial.suggest_float('x', -10, 10)
...     return (x - 2) ** 2
>>>
>>> study = optuna.create_study()
>>> study.optimize(objective, n_trials=100)
>>> fig = get_learning_curve_optuna(study)
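The "best value found so far" curve described above is a running optimum over the trial values. The following is a minimal standalone sketch of that idea, not yikit's implementation: the helper name `best_so_far` and its `direction` argument are hypothetical, and the trial values are made up for illustration.

```python
from itertools import accumulate


def best_so_far(values, direction="minimize"):
    """Return the best objective value seen up to and including each trial.

    Hypothetical helper for illustration; not part of yikit or Optuna.
    """
    if direction not in ("minimize", "maximize"):
        raise ValueError(f"unknown direction: {direction!r}")
    pick = min if direction == "minimize" else max
    # accumulate() carries the running best forward through the trials.
    return list(accumulate(values, pick))


# Made-up objective values from five consecutive trials.
trial_values = [9.0, 4.0, 6.25, 1.0, 2.25]
running_best = best_so_far(trial_values)
```

Plotting `trial_values` as points and `running_best` as a step line would give the two series the function description mentions.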
- yikit.visualize.set_font(fontfamily: str | None = None, fontsize: int = 13)
Set matplotlib font family and size.
This function configures the default font family and size for matplotlib plots. If no font family is specified, it uses a platform-appropriate default.
- Parameters:
fontfamily (str or None, default=None) – Font family to use. If None, uses the platform default: 'Helvetica' on macOS, 'DejaVu Sans' on Windows/Linux.
fontsize (int, default=13) – Font size to use for all text elements.
Examples
>>> from yikit.visualize import set_font
>>> set_font(fontfamily="Arial", fontsize=14)
>>> # Now all matplotlib plots will use Arial font at size 14
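The platform-dependent default described for `fontfamily` can be sketched as follows. This is an illustration under assumptions, not the library's code: the helper names `default_fontfamily` and `resolve_font` are hypothetical, detecting macOS via `platform.system() == "Darwin"` is one common approach, and the returned dictionary merely uses matplotlib's `font.family` and `font.size` rcParams keys as a plausible target.

```python
import platform


def default_fontfamily(system=None):
    """Pick a default font family for the given (or current) platform.

    Hypothetical helper; the mapping follows the documented defaults.
    """
    if system is None:
        system = platform.system()  # "Darwin" on macOS
    return "Helvetica" if system == "Darwin" else "DejaVu Sans"


def resolve_font(fontfamily=None, fontsize=13, system=None):
    """Return the settings that would be applied, without touching matplotlib."""
    if fontfamily is None:
        fontfamily = default_fontfamily(system)
    # Assumption: these rcParams keys are what a font helper would set.
    return {"font.family": fontfamily, "font.size": fontsize}


settings = resolve_font(system="Darwin")
```

An explicit `fontfamily` always wins over the platform default, which matches the documented behaviour of passing `fontfamily="Arial"`.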
- yikit.visualize.with_custom_matplotlib_settings(fontfamily: str | None = None, fontsize: int = 13, restore: bool = True)
Decorator to apply custom matplotlib settings temporarily.
This decorator applies custom font family and size settings to matplotlib for the duration of the decorated function execution. After execution, the original settings are restored if restore=True.
- Parameters:
fontfamily (str or None, optional) – Font family to use. If None, uses the platform default: 'Helvetica' on macOS, 'DejaVu Sans' on Windows/Linux. Default is None.
fontsize (int, optional) – Font size to use for all text elements. Default is 13.
restore (bool, optional) – Whether to restore original matplotlib settings after function execution. Default is True.
- Returns:
Decorator function that wraps the target function.
- Return type:
Callable
Examples
>>> from yikit.visualize import with_custom_matplotlib_settings
>>> import matplotlib.pyplot as plt
>>>
>>> @with_custom_matplotlib_settings(fontfamily="Helvetica", fontsize=14)
... def plot_data():
...     plt.plot([1, 2, 3], [1, 4, 9])
...     return plt.gcf()
>>>
>>> fig = plot_data()  # Uses Helvetica font at size 14
>>> # After function execution, original settings are restored
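The apply-then-restore behaviour of such a decorator can be sketched without matplotlib. In this illustration a plain dictionary named `SETTINGS` stands in for `matplotlib.rcParams`, and the decorator name `with_temporary_settings` is hypothetical; how yikit actually saves and restores settings is not shown in this page, so treat this as one possible pattern only.

```python
import functools

# Stand-in for matplotlib.rcParams (an assumption made for this sketch).
SETTINGS = {"font.family": "DejaVu Sans", "font.size": 10}


def with_temporary_settings(overrides, restore=True):
    """Decorator factory: apply `overrides` while the wrapped function runs."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            saved = dict(SETTINGS)  # snapshot before changing anything
            SETTINGS.update(overrides)
            try:
                return func(*args, **kwargs)
            finally:
                # Runs even if func raises, so settings do not leak.
                if restore:
                    SETTINGS.clear()
                    SETTINGS.update(saved)

        return wrapper

    return decorator


@with_temporary_settings({"font.size": 14})
def current_font_size():
    return SETTINGS["font.size"]


inside = current_font_size()    # value seen while the override is active
after = SETTINGS["font.size"]   # value once the call has returned
```

Using `try`/`finally` keeps the restore step tied to the function call itself, so an exception raised while plotting would not leave the overrides in place.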
- yikit.visualize.yyplot(y_true: ArrayLike, y_pred: ArrayLike, *, labels: Sequence[str | None] | None = None, metrics: Sequence[Literal['r2', 'rmse', 'mae', 'mse']] = ('r2', 'rmse'), ax: Axes | None = None, alpha: float = 0.05) -> Axes
- yikit.visualize.yyplot(y_train: ArrayLike, y_pred_on_train: ArrayLike, y_test: ArrayLike, y_pred_on_test: ArrayLike, *, labels: Sequence[str] | None = ('train', 'test'), metrics: Sequence[Literal['r2', 'rmse', 'mae', 'mse']] = ('r2', 'rmse'), ax: Axes | None = None, alpha: float = 0.05) -> Axes
- yikit.visualize.yyplot(y_train: ArrayLike, y_pred_on_train: ArrayLike, y_val: ArrayLike, y_pred_on_val: ArrayLike, y_test: ArrayLike, y_pred_on_test: ArrayLike, *, labels: Sequence[str | None] | None = ('train', 'val', 'test'), metrics: Sequence[Literal['r2', 'rmse', 'mae', 'mse']] = ('r2', 'rmse'), ax: Axes | None = None, alpha: float = 0.05) -> Axes
- yikit.visualize.yyplot(*y_data: ArrayLike, labels: Sequence[str | None] | None = None, metrics: Sequence[Literal['r2', 'rmse', 'mae', 'mse']] = ('r2', 'rmse'), ax: Axes | None = None, alpha: float = 0.05) -> Axes
Plot true vs. predicted values for one or more data sets.
This function draws a scatter plot of multiple (y_true, y_pred) pairs and overlays the identity line \(y = x\) as a reference. It computes summary regression metrics for each pair and displays them as text in the top-left of the plot. Typical usage is to compare model performance on different splits such as train, validation, and test.
- Parameters:
*y_data (ArrayLike) – A sequence of arrays interpreted as consecutive pairs (y_true_0, y_pred_0, y_true_1, y_pred_1, ...). The length of y_data must be even, and each pair must be broadcastable to the same shape.
labels (Sequence of str or None, optional) – Labels for each (y_true, y_pred) pair, used in the legend and metric annotations. If None (default), labels are automatically set to ("train", "test") for two data sets, to ("train", "val", "test") for three data sets, or to all-None for other numbers of data sets.
metrics (Sequence of {"r2", "rmse", "mae", "mse"}, optional) – Metrics to compute for each data set. Each metric is shown in the annotation box together with the corresponding label (if provided). Defaults to ("r2", "rmse").
ax (matplotlib.axes.Axes, optional) – Existing Axes on which to draw the plot. If None, a new figure and axes are created.
alpha (float, optional) – Relative margin added to the data range when determining plot limits. The default is 0.05 (5% margin on each side).
- Returns:
The axes object with the scatter plot and annotations.
- Return type:
matplotlib.axes.Axes
- Raises:
ValueError – If no data is provided, or if the number of positional arguments is odd (i.e., there is an unmatched y_true or y_pred), or if the length of labels does not match the number of data sets, or if an unknown metric name is given in metrics.
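The pairing and label rules described above can be sketched as a small standalone function. This is an illustration of the documented behaviour, not yikit's source: the helper name `pair_up` is hypothetical, and the error messages are made up.

```python
def pair_up(*y_data, labels=None):
    """Group positional arrays into (label, (y_true, y_pred)) entries.

    Hypothetical helper mirroring the documented rules for *y_data and labels.
    """
    if not y_data:
        raise ValueError("no data provided")
    if len(y_data) % 2 != 0:
        raise ValueError("expected an even number of arrays (y_true, y_pred pairs)")

    # Consecutive arguments form one data set: (y_true_0, y_pred_0), ...
    pairs = [(y_data[i], y_data[i + 1]) for i in range(0, len(y_data), 2)]

    if labels is None:
        # Documented defaults: named splits for 2 or 3 data sets, else all None.
        defaults = {2: ("train", "test"), 3: ("train", "val", "test")}
        labels = defaults.get(len(pairs), (None,) * len(pairs))
    if len(labels) != len(pairs):
        raise ValueError("labels must have one entry per data set")

    return list(zip(labels, pairs))


paired = pair_up([1, 2], [1.1, 1.9], [3, 4], [2.8, 4.2])
default_labels = [label for label, _ in paired]
```

With four positional arrays and no `labels`, the two resulting data sets pick up the default `("train", "test")` labels.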
Examples
Plot a single data set:
>>> import numpy as np
>>> from yikit.visualize import yyplot
>>> y_true = np.array([1.0, 2.0, 3.0])
>>> y_pred = np.array([0.8, 2.1, 2.9])
>>> ax = yyplot(y_true, y_pred)
Plot train and test data sets:
>>> y_train, y_pred_train = np.arange(5), np.arange(5) + 0.1
>>> y_test, y_pred_test = np.arange(5), np.arange(5) - 0.2
>>> ax = yyplot(y_train, y_pred_train, y_test, y_pred_test,
...             labels=("train", "test"))
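For reference, the four metric names accepted by `metrics` correspond to standard regression metrics. The sketch below computes them with the usual textbook definitions in plain Python; it assumes yikit follows these definitions, which this page does not confirm, and the helper name `regression_metrics` is hypothetical.

```python
import math


def regression_metrics(y_true, y_pred, metrics=("r2", "rmse")):
    """Compute the requested metrics for one (y_true, y_pred) pair.

    Hypothetical helper using standard definitions; not yikit's implementation.
    """
    known = ("r2", "rmse", "mae", "mse")
    unknown = [name for name in metrics if name not in known]
    if unknown:
        raise ValueError(f"unknown metric(s): {unknown}")
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares

    values = {
        "mse": ss_res / n,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
        # R^2 is undefined when y_true is constant; report NaN in that case.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }
    return {name: values[name] for name in metrics}


scores = regression_metrics(
    [1.0, 2.0, 3.0], [1.0, 2.0, 4.0], metrics=("r2", "rmse", "mae", "mse")
)
```

Here only the last prediction is off by one, so the residual sum of squares is 1 against a total sum of squares of 2, giving an R^2 of 0.5.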