A Python package for training stratified machine learning models with Laplacian regularization.
Based on and inspired by these great papers:
- Tuck, Jonathan, Shane Barratt, and Stephen Boyd. "A distributed method for fitting Laplacian regularized stratified models."
- Tuck, Jonathan, and Stephen Boyd. "Fitting Laplacian regularized stratified Gaussian models."
- Tuck, Jonathan, and Stephen Boyd. "Eigen-stratified models."
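
For orientation, the fitting problem studied in the first paper above has roughly the following form. This is a sketch in the papers' notation, not this package's API, and constant factors on the Laplacian term vary by convention:

$$
\underset{\theta_1,\dots,\theta_K}{\text{minimize}} \quad \sum_{k=1}^{K} \bigl( \ell_k(\theta_k) + r(\theta_k) \bigr) + \sum_{i<j} W_{ij} \, \lVert \theta_i - \theta_j \rVert_2^2
$$

Here $K$ is the number of strata, $\theta_k$ is the model parameter for stratum $k$, $\ell_k$ is the loss on the data that falls in stratum $k$, $r$ is a local regularizer, and $W_{ij} \ge 0$ are the symmetric edge weights of the regularization graph, so neighboring strata are encouraged to have similar parameters.
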
A work in progress.
- fix cvxpy installation in nox
- run pre-commit in nox (or, if it's too hard, in ci)
- dependabot
- uncomment stratified_models/__init__.py?
- coverage report and badge
- pack
- replace pandas with dask/polars
- replace numpy with dask arrays/jax/torch
- implement a multi-threaded/multi-processing version of ADMM (Rust or Dask?)
- implement the prox of circle and path graphs using FFT
- util to discretize a continuous feature and automatically build the matching path graph, with bins of:
  - constant width
  - percentile
- util to normalize target and regressors before optimization
- util to create common models:
  - ridge
  - lasso
  - OLS
  - logistic regression
  - SVM? (would require computing the prox of the hinge loss)
- out-of-the-box integration with Optuna for hyperparameter optimization
- util for featureless (non-parametric) prediction
- construct arbitrary regularization graphs without networkx
- a score function that approximates leave-one-out cross-validation, or SURE. Relevant papers:
- more losses:
  - Poisson negative log-likelihood
  - hinge
  - Huber
- (major) probabilistic and non-scalar predictions
  - normal
  - CDF
  - logits
- more options for proxes of networkx laplacians:
  - sparse eigh
  - CG with a diagonal preconditioner
- (major) eigen stratified models (constrain theta to low graph-frequencies)
- (major) graph learning, e.g., "Joint Graph Learning and Model Fitting in Laplacian Regularized Stratified Models"
- smart initialization strategies:
  - no stratification
  - no stratification, then train a stratified model using only the prediction
- FFT for fast circle graph prox
- DCT for fast path graph prox
- look into fast algorithms for tree graph proxes
- no regularization
- multiple local regularizations
- more than 2 graphs
- no graphs
- no data (expect theta=0)
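
The FFT item for the circle graph prox can be sketched as follows. This is a minimal illustration, not code from this package: the function names `cycle_laplacian` and `prox_cycle_fft` and the parameter `tau` are hypothetical, the graph is assumed to be an unweighted cycle on at least 3 nodes, and the prox is assumed to reduce to solving `(I + tau * L) theta = v`, with `tau` absorbing the regularization weight and the ADMM penalty. The Laplacian of a cycle is a circulant matrix, so the discrete Fourier transform diagonalizes it and the solve costs O(K log K) per column instead of a dense factorization.

```python
import numpy as np


def cycle_laplacian(k: int) -> np.ndarray:
    """Dense Laplacian of an unweighted cycle graph on k >= 3 nodes."""
    lap = 2.0 * np.eye(k)
    idx = np.arange(k)
    lap[idx, (idx + 1) % k] -= 1.0  # edge to the next node
    lap[idx, (idx - 1) % k] -= 1.0  # edge to the previous node
    return lap


def prox_cycle_fft(v: np.ndarray, tau: float) -> np.ndarray:
    """Solve (I + tau * L) theta = v for the cycle Laplacian L using the FFT.

    v has shape (k, m): one row of parameters per stratum.
    """
    k = v.shape[0]
    # Eigenvalues of the circulant cycle Laplacian: 2 - 2 cos(2 pi j / k).
    eigvals = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(k) / k)
    v_hat = np.fft.fft(v, axis=0)
    theta_hat = v_hat / (1.0 + tau * eigvals)[:, None]
    # The input is real and the filter is symmetric, so the result is real.
    return np.fft.ifft(theta_hat, axis=0).real


# Compare against a dense linear solve on random data.
rng = np.random.default_rng(0)
v = rng.normal(size=(8, 3))
tau = 0.7
theta_fft = prox_cycle_fft(v, tau)
theta_dense = np.linalg.solve(np.eye(8) + tau * cycle_laplacian(8), v)
print(np.allclose(theta_fft, theta_dense))
```

The same idea should carry over to the path graph by swapping the FFT for a DCT, since a path Laplacian is diagonalized by a cosine transform; that variant is not shown here.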