
Question: How to perform global explanation on an EBM where the bins have been modified with the GAM changer #592

Closed
HenrikSmith opened this issue Dec 20, 2024 · 7 comments


@HenrikSmith

Hello everyone,

First of all, thank you so much for your work; the InterpretML package continues to be a great tool for us!

I have been trying to generate global explanations for EBMs that have been modified using GAM Changer. This works as long as only the contributions are altered and the bins are left untouched. However, as soon as a bin boundary moves or the overall number of bins changes, I keep getting a ValueError. What am I missing here? Is there any way to fix this?

From .../Lib/site-packages/interpret/glassbox/_ebm/_ebm.py, lines 1528-1533:

        upper_bound = max(upper_bound, np.max(scores))
    else:
        lower_bound = min(lower_bound, np.min(scores - errors))
        upper_bound = max(upper_bound, np.max(scores + errors))

    bounds = (lower_bound, upper_bound)

ValueError: operands could not be broadcast together with shapes (10,) (9,)
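For context, the broadcast failure can be reproduced in isolation (a hypothetical sketch; the array names and shapes only mirror the error message above, where the edited model has 10 score entries but 9 stale error-bar entries):

```python
import numpy as np

# Stand-ins for the arrays inside _ebm.py: after editing bins in GAM Changer,
# the per-bin scores can end up with a different length than the error bars.
scores = np.zeros(10)  # scores for 10 bins after the edit
errors = np.zeros(9)   # standard deviations for the original 9 bins

try:
    lower_bound = np.min(scores - errors)  # broadcasting (10,) with (9,) fails
except ValueError as e:
    print(e)  # operands could not be broadcast together with shapes (10,) (9,)
```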
@paulbkoch
Collaborator

Try setting:

ebm.standard_deviations_ = None

@HenrikSmith
Author

Hello Paul, thanks for your reply. I tried this, but unfortunately I'm now getting a "TypeError: 'NoneType' object is not iterable". Can you shed some light on what happens in the background?

@paulbkoch
Collaborator

Ok, then try:

del ebm.standard_deviations_

I'm not really sure what's going on in the GAM Changer code to cause this, but it appears the dimensionality of the standard_deviations_ attribute differs from that of the term_scores_ attribute in the resulting model, which is an error.

You should be able to delete the standard_deviations_ attribute, though, as it's optional; it contains the error bars shown in the graphs. If you're modifying the graphs, I would say the error bars are no longer a fair representation of how the model was constructed on the original data, so it would probably make sense to delete them.
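Since the attribute is optional, a guarded delete avoids an AttributeError if it has already been removed (a minimal sketch; SimpleNamespace stands in here for a fitted EBM instance):

```python
from types import SimpleNamespace

# Stand-in for a fitted ExplainableBoostingClassifier/Regressor.
ebm = SimpleNamespace(standard_deviations_=None)

# Guard with hasattr so the delete is safe to run more than once.
if hasattr(ebm, "standard_deviations_"):
    del ebm.standard_deviations_  # drops the optional error-bar data
```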

You might want to raise this as an issue in the GAMChanger repo where I believe the underlying issue resides. I haven't really looked at that codebase and it's maintained by Jay Wang.

@HenrikSmith
Author

Hey Paul, thank you very much for your feedback! I looked a bit more into the issue. Unfortunately, "del ebm.standard_deviations_" didn't solve it either; I'm getting an AttributeError saying the object has no attribute 'standard_deviations_'. You're right, though: the error bars are no longer a fair representation, and GAM Changer seems to acknowledge that by setting these values to zero. I managed to fix it by overwriting standard_deviations_ with a zero-filled array of the same length as in the original model. Unfortunately, that only helps if the number of bins is exactly the same as in the original model. As soon as I add bins via GAM Changer, I can't explain or predict anything with the model. I'll try to raise this as an issue in the GAMChanger repo, thank you for the advice!
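The workaround described above could be sketched roughly like this (hedged: attribute names follow interpret's EBM, but SimpleNamespace stands in for the edited model, whose first error-bar array has gone stale at 9 entries versus 10 scores):

```python
import numpy as np
from types import SimpleNamespace

# Stand-in for a GAM Changer-edited model with a stale error-bar array.
ebm = SimpleNamespace(
    term_scores_=[np.arange(10.0), np.arange(5.0)],
    standard_deviations_=[np.zeros(9), np.zeros(5)],
)

# Rebuild each standard-deviation array as zeros shaped like the matching
# term_scores_ entry, so the lengths line up again.
ebm.standard_deviations_ = [np.zeros_like(s) for s in ebm.term_scores_]
```

As noted, this only restores plotting when the bin counts already agree with term_scores_; it does not fix the deeper bin_weights_ mismatch.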

@HenrikSmith
Author

HenrikSmith commented Jan 7, 2025

Hey Paul, I dug a little deeper into the code. What happens is that, if the number of bins for an input variable is increased, GAM Changer appears to update the scores successfully, but not the bin_weights_. So in _ebm.py, the line computing the importances, "importances[i] = np.average(scores, weights=self.bin_weights_[i])", fails due to a dimension mismatch: I get a "TypeError: Axis must be specified when shapes of a and weights differ".
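That mismatch can be illustrated in isolation (a hypothetical repro; the shapes assume term_scores_ grew to 10 bins while bin_weights_ kept the original 9 entries):

```python
import numpy as np

# Stand-ins for the importance computation in _ebm.py after a bin was added.
scores = np.zeros(10)    # per-bin scores after GAM Changer added a bin
bin_weights = np.ones(9)  # bin_weights_ still has the original 9 entries

try:
    np.average(scores, weights=bin_weights)  # shapes (10,) vs (9,)
except TypeError as e:
    print(e)  # Axis must be specified when shapes of a and weights differ.
```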

@paulbkoch
Collaborator

Ah, thanks for the detailed explanation @HenrikSmith. This makes sense now.

@HenrikSmith
Author

I may have found the root causes and documented my findings here: interpretml/gam-changer#15 (comment)
