Explainability module
- cherrypick.explain.explainer(model, data, impact_type: Literal['pos', 'neg', 'all'] = 'all')
Compute SHAP-based feature importance and return sorted impact values.
This function uses SHAP’s TreeExplainer to calculate feature contributions for a given model and dataset. It aggregates SHAP values across samples and (if applicable) across multiple classes, returning feature importance based on absolute SHAP magnitudes.
- Parameters:
model (object) – A trained tree-based model compatible with shap.TreeExplainer (e.g., XGBoost, LightGBM, RandomForest).
data (pandas.DataFrame) – Input dataset for which SHAP values are computed. Must contain only feature columns (no target column).
impact_type ({'pos', 'neg', 'all'}, default='all') –
Type of feature impact to return:
pos: Features with positive contribution
neg: Features with negative contribution
all: All features based on absolute SHAP values
- Returns:
result (pandas.DataFrame) – Sorted DataFrame containing feature importance:
all → columns: ['Features', 'Overall_Impact']
pos → columns: ['Features', 'Positive_Impact']
neg → columns: ['Features', 'Negative_Impact']
shap_values (shap.Explanation) – Raw SHAP explanation object containing per-sample contributions.
Notes
For multi-class models, SHAP values are averaged across classes.
Feature importance is computed using mean absolute SHAP values.
SHAP values are also stored globally in _shap_val.
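The aggregation described in these notes can be sketched with NumPy and pandas alone. This is a minimal illustration, not the library's actual implementation: the helper name aggregate_shap, the assumed array layout (n_samples, n_features, n_classes), and the order of operations (average across classes first, then take the mean absolute value across samples) are all assumptions.

```python
import numpy as np
import pandas as pd


def aggregate_shap(values, feature_names):
    """Hypothetical helper: collapse raw SHAP values into one score per feature.

    Assumes `values` has shape (n_samples, n_features) for single-output
    models or (n_samples, n_features, n_classes) for multi-class models.
    """
    values = np.asarray(values, dtype=float)

    # Multi-class case: average the contributions across the class axis.
    if values.ndim == 3:
        values = values.mean(axis=2)

    # Mean absolute SHAP value per feature, taken across samples.
    overall = np.abs(values).mean(axis=0)

    result = pd.DataFrame(
        {"Features": list(feature_names), "Overall_Impact": overall}
    )
    # Most influential features first.
    return result.sort_values("Overall_Impact", ascending=False).reset_index(drop=True)


# Toy input: 2 samples, 2 features, 2 classes (made-up numbers).
toy_values = np.array(
    [
        [[1.0, -1.0], [2.0, 2.0]],
        [[-1.0, 1.0], [4.0, 0.0]],
    ]
)
toy_result = aggregate_shap(toy_values, ["a", "b"])
```

Note that averaging signed values across classes before taking the absolute value lets opposing class contributions cancel, as feature "a" does in the toy input. Whether explainer behaves this way is not stated here.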
- Raises:
ValueError – If impact_type is not one of {'pos', 'neg', 'all'}.
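The kind of guard that produces this error might look like the sketch below. The names VALID_IMPACT_TYPES and check_impact_type are hypothetical and the error message is illustrative. Only the accepted values and the exception type come from the documentation above.

```python
# Accepted values, as listed under Parameters.
VALID_IMPACT_TYPES = ("pos", "neg", "all")


def check_impact_type(impact_type):
    """Hypothetical guard: reject any value outside the documented set."""
    if impact_type not in VALID_IMPACT_TYPES:
        raise ValueError(
            f"impact_type must be one of {VALID_IMPACT_TYPES}, got {impact_type!r}"
        )
    return impact_type


# A caller can catch the error and fall back to the documented default.
try:
    chosen = check_impact_type("positive")
except ValueError:
    chosen = "all"
```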
Examples
>>> result, shap_vals = explainer(model, X_test, impact_type='all')
>>> result
- cherrypick.explain.summary_plot(data)
Draw a summary plot of feature contributions across all classes.
- cherrypick.explain.bar_plot(n_classes)
Draw a bar plot of feature contributions for each class.
- cherrypick.explain.tree_plot(model, feature_names, size: tuple)