It'd be a nice feature if it would recognize str/enum/cat columns and automatically convert them to dummies.
You'd need a for loop over the inputs that checks each one's type and, if it's one of those types, converts it to dummies. Since df.to_dummies already exists, that shouldn't be too bad.
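To make the type-checking loop concrete, here is a minimal sketch in plain Python rather than polars. The function name to_dummies and the dict-of-lists layout are stand-ins invented for illustration; a real implementation would presumably inspect the polars dtypes and call df.to_dummies instead.

```python
def to_dummies(columns):
    """Expand string-valued columns into 0/1 dummy columns; pass others through.

    `columns` is a hypothetical dict mapping column name -> list of values.
    """
    out = {}
    for name, values in columns.items():
        if all(isinstance(v, str) for v in values):
            # One indicator column per distinct level, named <column>_<level>.
            for level in sorted(set(values)):
                out[f"{name}_{level}"] = [1 if v == level else 0 for v in values]
        else:
            # Numeric columns are left untouched.
            out[name] = list(values)
    return out


data = {"color": ["red", "green", "red"], "height": [1.5, 2.0, 0.5]}
dummies = to_dummies(data)
```

Here color is expanded into color_green and color_red, while height passes through as a float column.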
For the interactions, on the user side I'm thinking it'd be like
The only way I can think of to implement that would be to have ih wrap the columns to be interacted in a pl.struct. Then on the Rust side we can check whether a column is a struct column; if it is, it first checks whether it should make dummies out of the sub-columns and then multiplies the columns in the struct to form a single new column.
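The multiply step could look roughly like this. It is only a Python stand-in for what would happen on the Rust side: the struct is modelled as a dict of equal-length lists, and the name interact is hypothetical.

```python
from math import prod


def interact(struct):
    """Multiply the fields of a struct-like mapping row by row into one column."""
    fields = list(struct.values())
    return [prod(row) for row in zip(*fields)]


# A dummy column times a float column gives the interaction regressor.
struct = {"color_red": [1, 0, 1], "height": [1.5, 2.0, 0.5]}
red_height = interact(struct)
```

The result keeps height where the dummy is 1 and is zero elsewhere, which is what a single dummy-by-float interaction term should look like.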
But wait, there's more: the helper function wouldn't merely be a wrapper for pl.struct. It could have parameters that let you type the columns only once, so instead of the above you could do df.select(pds.lin_reg_report(*ih("color", "height", include_main=True), target="width", add_bias=True))
Additionally, it could have a product parameter (or maybe call it cartesian): if the user puts in 3 or more variables and it's True, it would do all the interactions between the 3; if set to False, it would only do the 3 together.
which would have regressors: red_round_height, red_round, red_square_height, red_square, red_height, ..., green_height. (I don't want to type any more examples, but the colors turn into dummies, the shape turns into dummies, height stays a float, and then all the possible combinations between them become their own interactions.)
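One way to enumerate those regressor names is sketched below in plain Python. Everything here is an assumption for illustration: the function interaction_names, the all_orders flag (standing in for the proposed product/cartesian parameter, renamed to avoid clashing with itertools.product), and the particular levels listed for color and shape.

```python
from itertools import combinations, product


def interaction_names(levels, all_orders=True):
    """Build interaction regressor names.

    `levels` maps each variable to its regressor names: the dummy levels for
    a categorical variable, or just [name] for a numeric one.
    """
    variables = list(levels)
    # all_orders=True: every pairwise and higher-order interaction.
    # all_orders=False: only the single interaction of all variables together.
    sizes = range(2, len(variables) + 1) if all_orders else [len(variables)]
    names = []
    for k in sizes:
        for group in combinations(variables, k):
            for combo in product(*(levels[v] for v in group)):
                names.append("_".join(combo))
    return names


levels = {
    "color": ["red", "green"],     # assumed dummy levels
    "shape": ["round", "square"],  # assumed dummy levels
    "height": ["height"],          # numeric, stays a single column
}
all_terms = interaction_names(levels, all_orders=True)
only_full = interaction_names(levels, all_orders=False)
```

With all_orders=True this produces names such as red_round, red_height and red_round_height; with all_orders=False only the three-way terms like red_round_height remain.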