Need help in Project Completion: Attrition Analysis

Discussion in 'Masters Program - Customers only' started by Debajyoti Das_1, Aug 14, 2017.

  1. Debajyoti Das_1

    Alumni

    Joined:
    Apr 19, 2017
    Messages:
    5
    Likes Received:
    0
    Hi,

    I am enrolled in the Masters in Data Science Program.

    I have started working on the project Attrition Analysis based on SAS, for the Data Science with SAS course.

    I have not been able to make much progress on it. I am following the instructions in the 'Analysis' section. I have done the frequency analysis of the dataset, but I am not sure what kind of descriptive statistics are required. Do we need to plot a histogram for the given dataset?

    Also, after running the logistic regression, I am stuck on how to find the Max & Min values of the probability of churn. The analysis asks us to create a new dataset with the sum of all the "churned" employees above the cut-off. I am at a complete loss as to how to achieve this. Please advise.

    I am attaching the code that I have written so far.
     

    #1
  2. Jitu Moni Das

    Jitu Moni Das Member
    Alumni

    Joined:
    Jun 5, 2017
    Messages:
    8
    Likes Received:
    1
    Regarding "how to find the Max & Min values of the probability of churn": use PROC FREQ, as it will give the maximum and minimum values for the employees who left.
     
    #2
  3. Debajyoti Das_1

    Alumni

    Hi,
    Thanks for your reply. I have already used PROC FREQ in the second step, and then PROC CORR for descriptive analytics. Finally, I used PROC LOGISTIC for the logistic regression.

    Do I need to use PROC FREQ on the output dataset of PROC LOGISTIC?

    When I first ran PROC FREQ on the Retain_Indicator variable, it returned the number of employees who have left and who have been retained as 22 and 28 respectively.

    I am still not sure how to determine the Max & Min values of the probability of churn. I have also attached my progress so far. Can you please guide me further?

    Code:
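    /* Path to the project workbook. */
    FILENAME REFFILE '/folders/myfolders/Data Science SAS/Attrition Analysis.xlsx';
    
    /* Import the spreadsheet into WORK.Attrition, taking column names from the first row. */
    PROC IMPORT DATAFILE=REFFILE
        DBMS=XLSX
        OUT=WORK.Attrition;
        GETNAMES=YES;
    RUN;
    
    /* Sort by the outcome variable. */
    PROC SORT DATA=Attrition;
        BY Retain_Indicator;
    RUN;
    
    /* Count of employees for each value of Retain_Indicator. */
    PROC FREQ DATA=Attrition;
        TABLE Retain_Indicator;
    RUN;
    
    /* Correlations, plus a scatter-plot matrix with histograms on the diagonal. */
    ODS GRAPHICS ON;
    PROC CORR DATA=Attrition PLOTS=MATRIX(HISTOGRAM);
        VAR Retain_Indicator Relocation_Indicator Sex_Indicator Marital_Status;
    RUN;
    ODS GRAPHICS OFF;
    
    /* Logistic regression of the outcome on the three predictors. */
    PROC LOGISTIC DATA=Attrition;
        MODEL Retain_Indicator = Relocation_Indicator Sex_Indicator Marital_Status;
    RUN;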
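
    One possible way to get the maximum and minimum churn probabilities asked about above, sketched under assumptions: the OUTPUT statement of PROC LOGISTIC can write each employee's predicted probability to a dataset, and PROC MEANS can then report the smallest and largest value. The names WORK.Scored and Churn_Prob are made up for illustration, and EVENT='0' assumes that Retain_Indicator = 0 marks an employee who left; change it if the coding is the other way round.

    ```sas
    /* Assumes WORK.Attrition has been imported as in the code above. */
    PROC LOGISTIC DATA=Attrition;
        /* EVENT='0' is an assumption: 0 = employee left (churned). */
        MODEL Retain_Indicator(EVENT='0') = Relocation_Indicator Sex_Indicator Marital_Status;
        /* Write one predicted probability per employee to a new dataset. */
        OUTPUT OUT=WORK.Scored P=Churn_Prob;
    RUN;

    /* Smallest and largest predicted probability of churn. */
    PROC MEANS DATA=WORK.Scored N MIN MAX;
        VAR Churn_Prob;
    RUN;
    ```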
     
    #3
  4. Nidhi_52

    Nidhi_52 New Member

    Joined:
    Jul 18, 2017
    Messages:
    1
    Likes Received:
    0
    I am facing similar problems. I have also run the same frequency code, which I believe is the descriptive statistics they want. The problem is that they are asking for the maximum and minimum probabilities and asking us to create a dataset for them. In addition, in the logistic regression all my independent variables have a p-value greater than 5%, so I am at a loss as to what to report.
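
    For the descriptive step, one option beyond a one-way frequency table is to cross-tabulate each predictor against the outcome. This is only a sketch: the variable names are taken from the code posted earlier in this thread, and whether this is what the project brief expects is an assumption.

    ```sas
    /* Cross-tabulate each predictor against Retain_Indicator, with a chi-square test. */
    PROC FREQ DATA=Attrition;
        TABLES (Relocation_Indicator Sex_Indicator Marital_Status) * Retain_Indicator
               / CHISQ NOCOL NOPERCENT;
    RUN;

    /* Basic summary statistics for the same variables. */
    PROC MEANS DATA=Attrition N MEAN STD MIN MAX;
        VAR Retain_Indicator Relocation_Indicator Sex_Indicator Marital_Status;
    RUN;
    ```

    The row percentages show how the outcome splits within each level of a predictor. With a small sample, expected cell counts may be low; for 2x2 tables the CHISQ option also prints Fisher's exact test.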
     
    #4
  5. Debajyoti Das_1

    Alumni

    Precisely. The same dilemma here.
    1. All variables have a p-value greater than 5%.
    2. How do we obtain the Max & Min values of the probability of churn?
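
    If the brief only requires reporting the fitted model as it comes out, one way to collect the numbers for the write-up is to save the result tables of PROC LOGISTIC with ODS OUTPUT. A minimal sketch; the output dataset names WORK.Estimates and WORK.Global_Tests are hypothetical.

    ```sas
    /* Save the coefficient table and the global null hypothesis tests. */
    ODS OUTPUT ParameterEstimates=WORK.Estimates
               GlobalTests=WORK.Global_Tests;
    PROC LOGISTIC DATA=Attrition;
        MODEL Retain_Indicator = Relocation_Indicator Sex_Indicator Marital_Status;
    RUN;

    /* Per-variable estimates with their Wald chi-square p-values. */
    PROC PRINT DATA=WORK.Estimates;
    RUN;

    /* Likelihood ratio, score, and Wald tests that all slopes are zero. */
    PROC PRINT DATA=WORK.Global_Tests;
    RUN;
    ```

    The global tests ask whether the predictors are jointly related to the outcome, which is a separate question from each individual p-value.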
     
    #5
  6. Nikhil Kamath_1

    Alumni

    Joined:
    Apr 14, 2017
    Messages:
    4
    Likes Received:
    0
    Hi,

    I am stuck here too! All the independent variables have a p-value greater than 5%, so I am unable to reach a conclusion for the regression, and the last couple of questions seem very unfamiliar. Can someone please help out here?
     
    #6
  7. Niwas Kumar

    Niwas Kumar Member
    Alumni

    Joined:
    Aug 4, 2017
    Messages:
    2
    Likes Received:
    0
    That's correct. Even if you reduce the model, the p-values indicate no relationship between the predictors and the response. Therefore, either the data supplied is wrong or we need to report what comes out (i.e., no significant relationship between the predictors and the response). On a small dataset with 50 observations, do we need to create train and test samples, fit the model on the train sample, and then evaluate it on the test sample? I tried doing that, but the p-values again lead to not rejecting H0.
     
    #7
  8. Ushdeep Singh

    Ushdeep Singh Member
    Alumni

    Joined:
    Jun 19, 2017
    Messages:
    3
    Likes Received:
    0
    Hello all,

    Is there any progress on the Attrition Analysis project? I am stuck here too, in PROC LOGISTIC, as there are no significant variables.
     
    #8
  9. Priyanka_Mehta

    Priyanka_Mehta Well-Known Member
    Simplilearn Support

    Joined:
    May 25, 2017
    Messages:
    491
    Likes Received:
    34
    #9
  10. Priyanka_Mehta

    Priyanka_Mehta Well-Known Member
    Simplilearn Support

  11. Divya Vellanki

    Joined:
    Aug 22, 2017
    Messages:
    7
    Likes Received:
    0
    Hi ,

    Can someone help me out with what needs to be done if all the independent variables are insignificant? The WebEx recording doesn't cover this part.
     
    #11
  12. _20103

    _20103 Member
    Alumni

    Joined:
    Jan 11, 2018
    Messages:
    3
    Likes Received:
    0
    Hi Priyanka,

    Neither of the links is working.
     
    #12
  13. _20103

    _20103 Member
    Alumni

    Hi Priyanka,

    I am stuck on the step below. Kindly advise.
    The analysis asks us to create a new dataset with the sum of all the "churned" employees above the cut-off.
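
    One reading of this step, sketched under assumptions: score every employee with the fitted model, keep those whose predicted churn probability exceeds the cut-off, and sum the employees in that group who actually left. The cut-off of 0.5, the names WORK.Scored, Churn_Prob, and WORK.Churn_Summary, and the coding Retain_Indicator = 0 for an employee who left are all assumptions, since the thread states none of them.

    ```sas
    %LET cutoff = 0.5;  /* hypothetical cut-off; use the one given in the brief */

    /* Fit the model and write each employee's predicted churn probability. */
    PROC LOGISTIC DATA=Attrition;
        /* EVENT='0' is an assumption: 0 = employee left (churned). */
        MODEL Retain_Indicator(EVENT='0') = Relocation_Indicator Sex_Indicator Marital_Status;
        OUTPUT OUT=WORK.Scored P=Churn_Prob;
    RUN;

    /* New dataset: employees above the cut-off, and how many of them left. */
    PROC SQL;
        CREATE TABLE WORK.Churn_Summary AS
        SELECT COUNT(*) AS N_Above_Cutoff,
               SUM(Retain_Indicator = 0) AS N_Churned_Above_Cutoff
        FROM WORK.Scored
        WHERE Churn_Prob > &cutoff;
    QUIT;

    PROC PRINT DATA=WORK.Churn_Summary;
    RUN;
    ```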
     
    #13
