FEAT - Add quadratic datafit with no access to the target #249
Comments
Thanks @QB3, that sounds doable. How would we pass the current design? Thoughts, @Badr-MOUFAD @QB3?
It is true that the form of the problem is a bit different, but it fits our framework, as the datafit can still be written in a form it supports. As you mentioned, @mathurin, while it breaks our conventions, I don't think it would break the code. Thinking about it, once we add support for this case, we will have three cases in total that don't abide by the conventions.
There's a working (without intercept) implementation with minimal overhead (we inherit most of the stuff from `Quadratic`) in #250. I still need to get the intercept update step right, but it was not painful at all to implement!
Thanks @QB3, this would be great! Just to add a bit more context: in statistical genetics, we have extremely high-dimensional data (tens of millions of features) that we'd like to use to predict a trait (e.g. blood cholesterol levels) or a disease (e.g. diabetes). Due to privacy, the large cohorts that collect the data don't release the underlying data, so we have to resort to approximations. Given this, two options are possible for the implementation.
Happy to help with the implementation or testing.
Description of the feature
Exact feature
Solve the optimization problem with a quadratic datafit, with no access to $y$ but with access to $X^\top y$.
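To illustrate why $X^\top y$ is enough, here is a minimal proximal gradient (ISTA) sketch in plain NumPy. It assumes an $\ell_1$ penalty with strength `lmbda` (the penalty is an assumption made for the example) and the datafit $\frac{1}{2n} \| X \beta \|^2 - \frac{1}{n} \beta^\top X^\top y$. The names `ista_no_target` and `soft_threshold` are hypothetical and are not part of the library.

```python
import numpy as np


def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)


def ista_no_target(X, Xty, lmbda, n_iter=500):
    """Hypothetical helper: L1-penalized least squares using only X and X^T y."""
    n_samples, n_features = X.shape
    # Lipschitz constant of the datafit gradient: ||X||_2^2 / n
    lipschitz = np.linalg.norm(X, ord=2) ** 2 / n_samples
    beta = np.zeros(n_features)
    for _ in range(n_iter):
        # gradient of ||X beta||^2 / (2n) - beta^T X^T y / n; y never appears
        grad = (X.T @ (X @ beta) - Xty) / n_samples
        beta = soft_threshold(beta - grad / lipschitz, lmbda / lipschitz)
    return beta


rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
y = rng.standard_normal(100)  # used here only to simulate X^T y
Xty = X.T @ y

# smallest penalty strength for which the solution is all zeros
lmbda_max = np.max(np.abs(Xty)) / X.shape[0]
beta_zero = ista_no_target(X, Xty, lmbda_max)
# without penalty, the iterates approach the least squares solution
beta_ls = ista_no_target(X, Xty, 0.0)
```

The solver only ever touches `X` and `Xty`, so the target could come from an external estimate of $X^\top y$.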
Additional context
I have been discussing with @shz9 about implementing a specific datafit for genomic applications (@shz9 is finishing his PhD on statistical analysis of genomics data). From what I understood, genomics data are sensitive: one does not have access to the target $y$. One only has access to the design matrix $X$ and an estimate of $X^\top y$ (usually estimated from another dataset).
Steps
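I guess we have to add the datafit $$\frac{1}{2n} \| X \beta \|^2 - \frac{1}{n} \beta^\top X^\top y \enspace ,$$ which is the usual quadratic datafit $\frac{1}{2n} \| y - X \beta \|^2$ up to the constant $\frac{1}{2n} \| y \|^2$, and handle the fact that there is no $y$ provided.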
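As a quick sanity check, the sketch below evaluates such a datafit and its gradient from $X$ and $X^\top y$ only, and compares them with the usual quadratic datafit. It takes the sign convention $\frac{1}{2n} \| X \beta \|^2 - \frac{1}{n} \beta^\top X^\top y$, which is what expanding $\frac{1}{2n} \| y - X \beta \|^2$ gives once the constant is dropped. The function names are hypothetical and do not reflect the library's datafit API.

```python
import numpy as np


def datafit_value(X, Xty, beta):
    """||X beta||^2 / (2n) - beta^T (X^T y) / n, computed without y."""
    n_samples = X.shape[0]
    Xbeta = X @ beta
    return Xbeta @ Xbeta / (2 * n_samples) - beta @ Xty / n_samples


def datafit_gradient(X, Xty, beta):
    """Gradient with respect to beta: (X^T X beta - X^T y) / n."""
    n_samples = X.shape[0]
    return (X.T @ (X @ beta) - Xty) / n_samples


rng = np.random.default_rng(0)
n_samples, n_features = 50, 10
X = rng.standard_normal((n_samples, n_features))
y = rng.standard_normal(n_samples)  # only used to build X^T y and the reference
beta = rng.standard_normal(n_features)
Xty = X.T @ y

# reference quantities from the usual quadratic datafit, which needs y
full_value = np.sum((y - X @ beta) ** 2) / (2 * n_samples)
constant = y @ y / (2 * n_samples)
grad_ref = -X.T @ (y - X @ beta) / n_samples

# the two values differ only by the constant ||y||^2 / (2n)
gap = full_value - constant - datafit_value(X, Xty, beta)
```

Since the two objectives differ by a constant, they share the same gradient and the same minimizers, so a solver does not need $y$ itself.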