Hi Ramesh,

Problem: Multivariate linear regression.

I am trying to use a ColumnTransformer with a OneHotEncoder on a data set that has 8 nominal categorical columns plus more than 350 columns that are binary only (these are passed through with remainder='passthrough'). After fitting the linear regression, I tried to call predict on X_test_transformed, which was produced with the same column transformer used for fit_transform on X_train, but the shapes (number of columns) differ between X_train_transformed and X_test_transformed. I am stuck at this point and unable to move forward. Kindly advise.
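To make the question concrete, here is a minimal sketch of the pattern as I understand it is meant to work, on toy data rather than my real data set. The column names (city, flag1, flag2) and values are made up, and handle_unknown="ignore" is an assumption on my part for categories that appear only in the test set. The transformer is fitted on X_train only and then reused with transform (not fit_transform) on X_test:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import OneHotEncoder

# Toy data; all column names and values are hypothetical.
X_train = pd.DataFrame({
    "city": ["A", "B", "C", "A"],   # nominal categorical column
    "flag1": [0, 1, 0, 1],          # binary columns, passed through unchanged
    "flag2": [1, 1, 0, 0],
})
y_train = [10.0, 12.0, 9.0, 11.0]

X_test = pd.DataFrame({
    "city": ["B", "D"],             # "D" never appears in the training data
    "flag1": [1, 0],
    "flag2": [0, 1],
})

ct = ColumnTransformer(
    transformers=[
        # Assumption: unseen test categories are encoded as all zeros.
        ("onehot", OneHotEncoder(handle_unknown="ignore"), ["city"]),
    ],
    remainder="passthrough",
)

# Learn the categories from the training data only.
X_train_transformed = ct.fit_transform(X_train)
# Reuse the fitted transformer; calling fit_transform here would re-learn
# the categories from X_test and could change the number of columns.
X_test_transformed = ct.transform(X_test)

model = LinearRegression().fit(X_train_transformed, y_train)
predictions = model.predict(X_test_transformed)

print(X_train_transformed.shape, X_test_transformed.shape)
```

In this toy case both transformed matrices have the same number of columns, which is what I expected to see on my real data as well.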

I then tried applying the column transformer before the split and was able to fit the linear regression, but I got a negative r2 score with standard linear regression. It improves slightly with regularisation, but not by much.

r2 scores on the test set:

Linear -1.3 exp-23

Ridge 0.43

Lasso 0.47

Elasticnet 0.38
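For reference, a minimal sketch of how a comparison like this could be run with the encoder placed inside a Pipeline, so that the split happens first and the encoder only ever sees the training rows. This is not my actual code: the synthetic data, the column names (cat1, cat2, bin1, bin2) and the alpha values are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data; names and coefficients are hypothetical.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "cat1": rng.choice(["a", "b", "c"], size=n),
    "cat2": rng.choice(["x", "y"], size=n),
    "bin1": rng.integers(0, 2, size=n),
    "bin2": rng.integers(0, 2, size=n),
})
y = 2.0 * X["bin1"] + 3.0 * (X["cat1"] == "a") + rng.normal(0, 0.5, size=n)

# Split first, so nothing about the test rows leaks into the encoder.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

def make_model(regressor):
    """Build a pipeline: one-hot encode nominal columns, then regress."""
    encoder = ColumnTransformer(
        [("onehot", OneHotEncoder(handle_unknown="ignore"), ["cat1", "cat2"])],
        remainder="passthrough",
    )
    return make_pipeline(encoder, regressor)

# Regularisation strengths are placeholder values, not tuned.
models = {
    "Linear": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.01),
    "ElasticNet": ElasticNet(alpha=0.01, l1_ratio=0.5),
}

scores = {}
for name, regressor in models.items():
    pipeline = make_model(regressor).fit(X_train, y_train)
    scores[name] = r2_score(y_test, pipeline.predict(X_test))
    print(f"{name}: test r2 = {scores[name]:.3f}")
```

Because the pipeline calls fit_transform on the training rows and transform on the test rows internally, the column counts stay aligned without handling the transformed matrices by hand.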

Can we do feature engineering on entirely categorical variables to improve the scores?

I would appreciate your suggestions.