
[BUG, ENH] Make sure classifier is new every time and add optional caching to classifier-based CI tests #93

Open · adam2392 opened this issue Jan 12, 2023 · 0 comments
Labels: conditional-independence (Related to CI testing)

Problem

The classifier-based CI tests are appealing because they are generally non-parametric, but they are slow: each call re-fits a classifier from scratch. Since the tests are built from modular components, we should optionally cache predictions and the statistics derived from them.

For example:

  1. CCMI: we can cache the MI estimates of I(X;Y) and I(X;Y,Z) so that they can be re-used when needed (see the sketch after this list)
  2. CCIT: store the metric (and optionally the biased metric) values for each combination of (X, Y, Z); the ordering of X and Y does not matter here
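
As a concrete illustration of item 1, here is a minimal sketch of an MI-estimate cache keyed by the (frozen) variable sets on each side of the MI term. The `MICache` class and the `mi_estimator` callable are hypothetical stand-ins, not an existing API of this library:

```python
from typing import Callable, Dict, FrozenSet, Tuple

MIKey = Tuple[FrozenSet[str], FrozenSet[str]]

class MICache:
    """Sketch: memoize MI estimates so repeated queries skip the classifier fit."""

    def __init__(self, mi_estimator: Callable):
        self._mi_estimator = mi_estimator
        self._cache: Dict[MIKey, float] = {}

    def estimate(self, data, left, right) -> float:
        """Return a cached MI estimate, computing (and storing) it on a miss."""
        key = (frozenset(left), frozenset(right))
        if key not in self._cache:
            # The expensive step: each MI estimate re-fits a classifier.
            self._cache[key] = self._mi_estimator(data, left, right)
        return self._cache[key]

# Usage: I(X;Y) and I(X;Y,Z) are each fit once, then re-used.
# cache = MICache(mi_estimator=my_classifier_mi)
# cache.estimate(df, {"x"}, {"y"})        # I(X;Y)
# cache.estimate(df, {"x"}, {"y", "z"})   # I(X;Y,Z)
```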

Another issue is that we should ensure the classification model is refit from scratch each time ci_estimator.test(...) is called. Downstream classification models in scikit-learn might support warm starts, which could silently carry fitted state from one call into the next; a sketch of the guarantee follows.
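
A minimal sketch of that guarantee uses scikit-learn's `clone`, which constructs a brand-new, unfitted estimator from the stored prototype's hyperparameters. The class and method names here are illustrative, not this library's API:

```python
from sklearn.base import clone
from sklearn.ensemble import RandomForestClassifier

class ClassifierCITest:
    """Sketch of a CI test that never re-uses a fitted classifier."""

    def __init__(self, clf=None):
        # Store only an unfitted prototype; never call fit() on it directly.
        self._clf_prototype = clf if clf is not None else RandomForestClassifier()

    def test(self, X, y):
        # clone() returns a new estimator with the same hyperparameters but
        # no fitted state, so warm_start flags or leftover attributes from a
        # previous test(...) call cannot leak into this one.
        clf = clone(self._clf_prototype)
        clf.fit(X, y)  # always a from-scratch fit
        return clf.score(X, y)  # placeholder for the real test statistic
```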

Proposed solution

Create a nested dictionary keyed on the variables involved in each test (see the sketch after the lists below).

For CCMI:

  • X variable
  • Y variable
  • Z variable (can be None to indicate no conditioning)

For CCIT:

  • X variable
  • Y variable
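
A minimal sketch of these keys, with illustrative names and one cache per test type: CCMI nests X → Y → Z with `None` marking the unconditional case, while CCIT folds X and Y into a single order-insensitive frozenset so that both orderings hit the same entry. The stored values below are placeholders:

```python
from collections import defaultdict

# CCMI: nested dict X -> Y -> Z -> cached estimate; Z may be None.
ccmi_cache = defaultdict(lambda: defaultdict(dict))
ccmi_cache["x"]["y"][None] = 0.12                # no conditioning
ccmi_cache["x"]["y"][frozenset({"z"})] = 0.05    # conditioned on Z

# CCIT: the ordering of X and Y does not matter, so key on an
# order-insensitive frozenset of the two variables.
ccit_cache = {}

def ccit_key(x, y, z=None):
    return (frozenset({x, y}), frozenset(z) if z else None)

ccit_cache[ccit_key("x", "y")] = 0.42
assert ccit_key("x", "y") == ccit_key("y", "x")  # same cache slot
```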