Explainability module

cherrypick.explain.explainer(model, data, impact_type: Literal['pos', 'neg', 'all'] = 'all')

Compute SHAP-based feature importance and return sorted impact values.

This function uses SHAP’s TreeExplainer to calculate feature contributions for a given model and dataset. It aggregates SHAP values across samples and (if applicable) across multiple classes, returning feature importance based on absolute SHAP magnitudes.
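The aggregation described above can be sketched with NumPy alone. This is a minimal illustration, not cherrypick's actual implementation: the helper name `mean_abs_importance` is hypothetical, the class axis is assumed to be the last one, and taking absolute values before averaging over classes is an assumption about the order of operations.

```python
import numpy as np

def mean_abs_importance(shap_values):
    """Collapse SHAP values to one importance score per feature.

    Accepts an array shaped (n_samples, n_features) or, for multi-class
    models, (n_samples, n_features, n_classes).
    """
    values = np.abs(np.asarray(shap_values, dtype=float))
    if values.ndim == 3:
        # Average over the class axis (assumed to be the last axis).
        values = values.mean(axis=2)
    # Average over samples, leaving one score per feature.
    return values.mean(axis=0)

# Toy input: 2 samples, 3 features, single output.
toy = np.array([[0.5, -1.0, 0.0],
                [-0.5, 3.0, 0.2]])
importance = mean_abs_importance(toy)
```

Because magnitudes are averaged, a feature whose contributions flip sign between samples (the first column of `toy`) still receives a non-zero score.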

Parameters:

model (object) – A trained tree-based model compatible with shap.TreeExplainer (e.g., XGBoost, LightGBM, RandomForest).
data (pandas.DataFrame) – Input dataset for which SHAP values are computed. Must contain only feature columns (no target column).
impact_type ({'pos', 'neg', 'all'}, default='all') –
Type of feature impact to return:
- pos : Features with positive contribution
- neg : Features with negative contribution
- all : All features based on absolute SHAP values

Returns:

result (pandas.DataFrame) – Sorted DataFrame containing feature importance:
- all → columns: ['Features', 'Overall_Impact']
- pos → columns: ['Features', 'Positive_Impact']
- neg → columns: ['Features', 'Negative_Impact']
shap_values (shap.Explanation) – Raw SHAP explanation object containing per-sample contributions.
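One possible reading of how impact_type selects and ranks features is sketched below. The documentation does not say how positive and negative contributions are determined, so this sketch assumes the sign of each feature's mean signed SHAP value decides it. The function name `rank_by_impact` is hypothetical, and it returns a list of (feature, score) pairs rather than a DataFrame to keep the example dependency-light.

```python
import numpy as np

def rank_by_impact(feature_names, shap_values, impact_type="all"):
    """Rank features under an assumed interpretation of impact_type."""
    values = np.asarray(shap_values, dtype=float)  # (n_samples, n_features)
    if impact_type == "all":
        scores = np.abs(values).mean(axis=0)       # mean |SHAP| per feature
        keep = np.ones(len(feature_names), dtype=bool)
    elif impact_type == "pos":
        scores = values.mean(axis=0)               # signed mean per feature
        keep = scores > 0
    elif impact_type == "neg":
        scores = values.mean(axis=0)
        keep = scores < 0
    else:
        raise ValueError("impact_type must be 'pos', 'neg', or 'all'")
    pairs = [
        (name, float(score))
        for name, score, k in zip(feature_names, scores, keep)
        if k
    ]
    # Largest magnitude first, regardless of sign.
    return sorted(pairs, key=lambda pair: abs(pair[1]), reverse=True)

names = ["age", "income", "tenure"]
toy = [[0.5, -1.0, 0.125],
       [0.25, -3.0, -0.125]]
overall = rank_by_impact(names, toy, "all")
positive = rank_by_impact(names, toy, "pos")
negative = rank_by_impact(names, toy, "neg")
```

Under this assumption a feature whose signed mean is exactly zero (here `tenure`) appears only in the 'all' ranking.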

Notes

For multi-class models, SHAP values are averaged across classes.
Feature importance is computed using mean absolute SHAP values.
SHAP values are also stored globally in _shap_val.

Raises:

ValueError – If impact_type is not one of {'pos', 'neg', 'all'}.
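A check of this kind can be derived directly from the Literal annotation in the signature. The snippet below is a sketch only: `ImpactType` and `check_impact_type` are hypothetical names, and the library's actual validation and error message may differ.

```python
from typing import Literal, get_args

# Mirrors the annotation shown in the function signature.
ImpactType = Literal["pos", "neg", "all"]

def check_impact_type(impact_type):
    """Return impact_type unchanged, or raise ValueError if unsupported."""
    allowed = get_args(ImpactType)  # ('pos', 'neg', 'all')
    if impact_type not in allowed:
        raise ValueError(
            f"impact_type must be one of {allowed}, got {impact_type!r}"
        )
    return impact_type
```

Reading the allowed values from the annotation keeps the type hint and the runtime check from drifting apart.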

Examples

>>> from cherrypick.explain import explainer
>>> result, shap_vals = explainer(model, X_test, impact_type='all')
>>> result

cherrypick.explain.summary_plot(data): Summary plot of feature contributions across all classes.

cherrypick.explain.bar_plot(n_classes): Bar plot of feature contributions for each class.

cherrypick.explain.tree_plot(model, feature_names, size: tuple)