Design - Improve MAPIE and mlFlow interaction #454

Open
simon-hirsch opened this issue May 28, 2024 · 3 comments
Labels
MAPIE v1: Indicate that the issue will be resolved with the release of MAPIE v1
Other or internal: If no other grey tag is relevant or if issue from the MAPIE team

Comments

@simon-hirsch

Hi! This is a bit of a general question / suggestion. I have trouble working with MAPIE and mlflow for experiment / model tracking. That is a bit of a pity, because it limits the usability of an otherwise nice library.

Is your feature request related to a problem? Please describe.

The model.predict() output of Tuple[Array, Tuple[Array, Array]] is not super self-explanatory and a bit cumbersome when it comes to further downstream processing, especially with mlflow experiment tracking / deployment.

Suggestion / possible solution (but very open for discussion)

A relatively straightforward solution would be to have the model return a dict such as {"mean": Array, "lower": Array, "upper": Array}. That way it is clear what is what, and this ought to be accepted by mlflow's infer_signature() (I've monkey-patched my estimator to check this). To avoid breaking changes, one could add an output_format parameter to the estimator class.
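As a rough illustration of the dict-shaped output suggested above, here is a minimal sketch of a converter that could sit between predict() and mlflow. The helper name to_interval_dict and its alpha_index argument are hypothetical and not part of MAPIE. It assumes the interval array has shape (n_samples, 2, n_alpha), with the lower bound first and the upper bound second, and it uses stand-in arrays instead of a fitted estimator.

```python
import numpy as np


def to_interval_dict(y_pred, y_pis, alpha_index=0):
    """Convert a (y_pred, y_pis) pair into a dict of named 1-D arrays.

    Assumption: y_pred has shape (n_samples,) and y_pis has shape
    (n_samples, 2, n_alpha), with the lower bound at index 0 and the
    upper bound at index 1 of the second axis.
    """
    y_pred = np.asarray(y_pred)
    y_pis = np.asarray(y_pis)
    if y_pis.ndim != 3 or y_pis.shape[1] != 2:
        raise ValueError(
            f"expected shape (n_samples, 2, n_alpha), got {y_pis.shape}"
        )
    return {
        "mean": y_pred,
        "lower": y_pis[:, 0, alpha_index],
        "upper": y_pis[:, 1, alpha_index],
    }


# Stand-in arrays with the assumed shapes (no MAPIE needed to run this).
y_pred = np.array([1.0, 2.0, 3.0])
y_pis = np.stack([y_pred - 0.5, y_pred + 0.5], axis=1)[:, :, np.newaxis]  # (3, 2, 1)

out = to_interval_dict(y_pred, y_pis)
print({name: arr.shape for name, arr in out.items()})
```

With mlflow installed, a dict like this could then be passed as the model output to mlflow.models.infer_signature(); that step is left out here so the sketch runs with NumPy alone.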

Did somebody find other ways to work well with MAPIE and mlflow apart from monkey patching? Appreciate any input :)

Cheers, Simon

@LacombeLouis
Collaborator

Hey @simon-hirsch,
thank you for this issue; it seems your monkey patch works around the problem for the moment! This is not something we had taken into account. We do have a very specific structure for the output of conformal predictions. Also note that for some models you can pass multiple alphas to model.predict(), so the last dimension of the interval array depends on the number of alphas:

y_pred, y_pis = mapie_regressor.predict(X_test, alpha=0.2)
print(y_pred.shape)
print(y_pis.shape)

# output
(250,)
(250, 2, 1)

and

y_pred, y_pis = mapie_regressor.predict(X_test, alpha=[0.2, 0.3])
print(y_pred.shape)
print(y_pis.shape)

# output
(250,)
(250, 2, 2)

This is a comment we will take into account for future changes, so thank you!
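To make the multi-alpha case concrete, the sketch below splits an interval array of shape (n_samples, 2, n_alpha) into one named 1-D array per bound and per alpha. The function intervals_by_alpha and the key format "lower_<alpha>" / "upper_<alpha>" are illustrative assumptions, not something MAPIE provides; a zero-filled array stands in for real predictions.

```python
import numpy as np


def intervals_by_alpha(y_pis, alphas):
    """Split an array of shape (n_samples, 2, n_alpha) into named 1-D arrays.

    Assumption: index 0 of the second axis is the lower bound, index 1 the
    upper bound, and the last axis follows the order of `alphas`.
    """
    y_pis = np.asarray(y_pis)
    if y_pis.ndim != 3 or y_pis.shape[1:] != (2, len(alphas)):
        raise ValueError(
            f"expected shape (n_samples, 2, {len(alphas)}), got {y_pis.shape}"
        )
    named = {}
    for k, alpha in enumerate(alphas):
        named[f"lower_{alpha}"] = y_pis[:, 0, k]
        named[f"upper_{alpha}"] = y_pis[:, 1, k]
    return named


# Stand-in for the (250, 2, 2) interval array from the example above.
y_pis = np.zeros((250, 2, 2))
named = intervals_by_alpha(y_pis, alphas=[0.2, 0.3])
print(sorted(named))
```

Each value has shape (250,), so the result is a flat mapping from a descriptive name to a column, which tends to be easier to log or serialize than a 3-D array.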

@jawadhussein462 added the "Other or internal" and "MAPIE v1" labels on Nov 7, 2024
@jawadhussein462
Collaborator

Hello,

This issue will be addressed with the release of MAPIE v1.

The output of model.predict(), currently structured as Tuple[Array, Tuple[Array, Array]], will be split across two distinct methods:

  • model.predict() for point predictions, with output shape (n_samples,)
  • model.predict_set() for interval predictions, with output shape (n_samples, 2)
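Until such a split is available, the two-method layout described above can be approximated with a thin adapter around a tuple-returning estimator. This is only a sketch under stated assumptions: SplitOutputWrapper and _DummyTupleModel are hypothetical names, the wrapped object is assumed to return (y_pred, y_pis) with y_pis of shape (n_samples, 2, n_alpha), and nothing here reflects how MAPIE v1 is actually implemented.

```python
import numpy as np


class SplitOutputWrapper:
    """Hypothetical adapter exposing point and interval predictions separately.

    Wraps any object whose predict(X, alpha=...) returns (y_pred, y_pis),
    with y_pis of shape (n_samples, 2, n_alpha).
    """

    def __init__(self, model, alpha):
        self.model = model
        self.alpha = alpha  # a single float, e.g. 0.2

    def predict(self, X):
        y_pred, _ = self.model.predict(X, alpha=self.alpha)
        return np.asarray(y_pred)  # shape (n_samples,)

    def predict_set(self, X):
        _, y_pis = self.model.predict(X, alpha=self.alpha)
        return np.asarray(y_pis)[:, :, 0]  # shape (n_samples, 2)


class _DummyTupleModel:
    """Stand-in for a fitted tuple-returning estimator (not a conformal model)."""

    def predict(self, X, alpha):
        y_pred = np.asarray(X, dtype=float).sum(axis=1)
        y_pis = np.stack([y_pred - 1.0, y_pred + 1.0], axis=1)[:, :, np.newaxis]
        return y_pred, y_pis


X = np.ones((4, 3))
wrapped = SplitOutputWrapper(_DummyTupleModel(), alpha=0.2)
points = wrapped.predict(X)
intervals = wrapped.predict_set(X)
print(points.shape, intervals.shape)
```

Note that the adapter calls the underlying predict() once per method, which is simple but does the work twice if both outputs are needed.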

@simon-hirsch
Author

Cool, looking forward to it. Do you also plan to support multiple sets at once, i.e. something along the lines of estimator.predict_sets(X, widths=[0.5, 0.75, 0.9]) with output shape (n, 6)?
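For reference, an (n, 6) layout like the one asked about can be obtained from a 3-D interval array with a transpose and a reshape. The sketch below is an assumption-laden illustration: flatten_intervals is a hypothetical helper, the input is assumed to have shape (n_samples, 2, n_sets), and the column order [lower_0, upper_0, lower_1, upper_1, ...] is one possible convention, not one confirmed by the maintainers.

```python
import numpy as np


def flatten_intervals(y_pis):
    """Reshape (n_samples, 2, n_sets) into (n_samples, 2 * n_sets).

    Assumed column order: [lower_0, upper_0, lower_1, upper_1, ...].
    """
    y_pis = np.asarray(y_pis)
    if y_pis.ndim != 3 or y_pis.shape[1] != 2:
        raise ValueError(
            f"expected shape (n_samples, 2, n_sets), got {y_pis.shape}"
        )
    n_samples = y_pis.shape[0]
    # Move the bound axis last so each set's (lower, upper) pair is contiguous.
    return y_pis.transpose(0, 2, 1).reshape(n_samples, -1)


# Stand-in intervals for three sets, centred on zero with growing half-widths.
half = np.array([1.0, 2.0, 3.0])
lower = -np.tile(half, (5, 1))  # (5, 3)
upper = np.tile(half, (5, 1))   # (5, 3)
y_pis = np.stack([lower, upper], axis=1)  # (5, 2, 3)

flat = flatten_intervals(y_pis)
print(flat.shape)
```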


5 participants