Choice of variables

Discussion in 'Big Data and Analytics' started by Narayana Surya, Jan 23, 2019.

  1. Narayana Surya

    Narayana Surya Well-Known Member

    Feb 27, 2018

    Can you please let me know how I can choose which variables impact the output in nonlinear models
    (e.g. decision tree, random forest)? For linear models we generally choose the impactful variables by correlation, but can we rely on correlation when the data is nonlinear?

    Now let us consider the file below as an example:

    Can you please let me know which model I should use for the attached file, and how to find out which independent variables determine the dependent variable when the variables are categorical?



  2. Vikas Kumar_18

    Vikas Kumar_18 Well-Known Member
    Simplilearn Support Alumni

    Dec 17, 2018
    No one can tell you which specific ML model to use just by looking at the data. As I already said in the previous discussion, drop the highly correlated attributes, build all the models you know, and compare the results. For categorical variables, first create dummy variables, which can be done in Python with label encoding and one-hot encoding; then redo the correlation plot and drop the highly correlated attributes. Search online for a detailed explanation of label encoding and one-hot encoding.
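
    A minimal sketch of the encoding and correlation step, assuming pandas and NumPy. The toy data, the column names, the 0.9 threshold, and the helper `drop_highly_correlated` are all hypothetical and only illustrate the idea:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0, 175.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8, 68.9],  # near-duplicate of height_cm
    "city": ["A", "B", "A", "C", "B", "C"],              # categorical attribute
    "age": [23, 45, 31, 52, 38, 27],
})

# One-hot encode the categorical column so it can enter a correlation matrix.
encoded = pd.get_dummies(df, columns=["city"], dtype=float)


def drop_highly_correlated(frame, threshold=0.9):
    """Drop one column from every pair whose absolute correlation exceeds threshold."""
    corr = frame.corr().abs()
    # Keep only the upper triangle so each pair is checked once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return frame.drop(columns=to_drop), to_drop


reduced, dropped = drop_highly_correlated(encoded)
print("dropped:", dropped)
print("kept:", list(reduced.columns))
```

    One-hot encoding is used here because label encoding assigns arbitrary integers to the categories, and a correlation computed on those integers would treat them as ordered.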

    You can use a correlation plot to select the important attributes before building both linear and nonlinear models, as already discussed in the previous thread.
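
    Once the attributes are selected, the "build the models you know and compare" step could look roughly like this. It is a sketch under assumptions: it uses scikit-learn, synthetic data from `make_classification` in place of the attached file, and three models chosen only as examples:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the already-selected attributes (assumption: a
# classification problem; the real file is not available here).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# One linear and two nonlinear candidates, picked only for illustration.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model on the same data.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}

# Print the models from best to worst mean accuracy.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

    Cross-validation is used so that every model is scored on the same held-out folds instead of on the data it was trained on.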
