Explainability module

cherrypick.explain.explainer(model, data, impact_type: Literal['pos', 'neg', 'all'] = 'all')

Compute SHAP-based feature importance and return sorted impact values.

This function uses SHAP’s TreeExplainer to calculate feature contributions for a given model and dataset. It aggregates SHAP values across samples and (if applicable) across multiple classes, returning feature importance based on absolute SHAP magnitudes.

Parameters:
  • model (object) – A trained tree-based model compatible with shap.TreeExplainer (e.g., XGBoost, LightGBM, RandomForest).

  • data (pandas.DataFrame) – Input dataset for which SHAP values are computed. Must contain only feature columns (no target column).

  • impact_type ({'pos', 'neg', 'all'}, default='all') –

    Type of feature impact to return:

    • pos : Features with positive contribution

    • neg : Features with negative contribution

    • all : All features based on absolute SHAP values

Returns:

  • result (pandas.DataFrame) – Sorted DataFrame containing feature importance:

    • all → columns: ['Features', 'Overall_Impact']

    • pos → columns: ['Features', 'Positive_Impact']

    • neg → columns: ['Features', 'Negative_Impact']

  • shap_values (shap.Explanation) – Raw SHAP explanation object containing per-sample contributions.

Notes

  • For multi-class models, SHAP values are averaged across classes.

  • Feature importance is computed using mean absolute SHAP values.

  • SHAP values are also stored globally in _shap_val.
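The aggregation described in these notes can be sketched with NumPy alone. This is a minimal illustration, not the library's actual implementation: the helper name aggregate_shap, the assumed array layout (n_samples, n_features, n_classes), the order of operations (average over classes first, then mean absolute value over samples), and the descending sort are all assumptions.

```python
import numpy as np
import pandas as pd


def aggregate_shap(values, feature_names):
    """Hypothetical helper: collapse SHAP values into one score per feature."""
    values = np.asarray(values, dtype=float)
    if values.ndim == 3:
        # Assumed layout (n_samples, n_features, n_classes): average over classes.
        values = values.mean(axis=2)
    # Mean absolute SHAP value per feature, taken over samples.
    overall = np.abs(values).mean(axis=0)
    result = pd.DataFrame(
        {"Features": list(feature_names), "Overall_Impact": overall}
    )
    # Assumed ordering: largest impact first.
    return result.sort_values("Overall_Impact", ascending=False).reset_index(drop=True)
```

With a toy array of two samples, two features, and two classes, a feature whose per-class contributions cancel out ends up with an overall impact of zero under these assumptions, which is worth keeping in mind when reading multi-class results.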

Raises:

  • ValueError – If impact_type is not one of {'pos', 'neg', 'all'}.
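The relationship between impact_type, the result column named under Returns, and the ValueError can be sketched as follows. The helper impact_column and the IMPACT_COLUMNS mapping are hypothetical and only mirror the documented behaviour; the wording of the error message is an assumption.

```python
# Column names as listed under Returns; the mapping itself is illustrative.
IMPACT_COLUMNS = {
    "all": "Overall_Impact",
    "pos": "Positive_Impact",
    "neg": "Negative_Impact",
}


def impact_column(impact_type="all"):
    """Hypothetical helper: map impact_type to the documented result column."""
    try:
        return IMPACT_COLUMNS[impact_type]
    except KeyError:
        # Mirrors the documented ValueError for unsupported values.
        raise ValueError(
            f"impact_type must be one of {sorted(IMPACT_COLUMNS)}, got {impact_type!r}"
        ) from None
```

Callers that accept impact_type from user input may want to wrap the call to explainer in a try/except ValueError block for the same reason.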

Examples

>>> result, shap_vals = explainer(model, X_test, impact_type='all')
>>> result

cherrypick.explain.summary_plot(data)

Summary plot of feature contributions across all classes.

cherrypick.explain.bar_plot(n_classes)

Bar plot of feature contributions for each class.

cherrypick.explain.tree_plot(model, feature_names, size: tuple)