Data_Science_with_R

Discussion in 'Big Data and Analytics' started by _22998, Mar 18, 2018.

  1. _22998

    _22998 Member

    Joined:
    Feb 13, 2018
    Messages:
    11
    Likes Received:
    5
    Hi All,
    This thread is for Data_Science_with_R, which started on 10 March.
     
    #1
  2. Jitendra Budiya

    Alumni Customer

    Joined:
    Oct 9, 2017
    Messages:
    12
    Likes Received:
    2
    #2
    vjshanky and _24450 like this.
  3. _24420

    _24420 Member

    Joined:
    Feb 27, 2018
    Messages:
    9
    Likes Received:
    1
    Hi All
    I have an issue in RStudio Desktop: when I use Import Dataset, my .csv file doesn't show up. What should I do?
     
    #3
  4. Soumya Chandra

    Joined:
    Dec 16, 2017
    Messages:
    2
    Likes Received:
    0
    Hi! What type of projects do we need to submit for the Data Science with R course?
     
    #4
  5. Jitendra Budiya

    Alumni Customer

    Joined:
    Oct 9, 2017
    Messages:
    12
    Likes Received:
    2
    hp <- read.csv(file.choose(), header = TRUE)
    Use this line of code to import the dataset. If you don't see the file chooser dialog, the window may be hidden behind RStudio;
    press Alt+Tab to check the open windows. I hope this will work.
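
A minimal sketch of the same import, assuming you would rather pass a path than use the dialog. The temporary file and its columns (name, score) are made up so the example runs anywhere; in practice you would point csv_path at your own file.

```r
# Write a tiny CSV to a temporary file (illustrative data, not from the course).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,score", "a,10", "b,20"), csv_path)

# In an interactive session you could pick the file by hand instead:
# csv_path <- file.choose()

# header = TRUE treats the first line as column names.
hp <- read.csv(csv_path, header = TRUE)

str(hp)   # check that the columns and types look right
head(hp)  # look at the first rows
```

Running str() right after the import is a quick way to confirm the file was actually read before moving on.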
     
    #5
  6. Jitendra Budiya

    Alumni Customer

    Joined:
    Oct 9, 2017
    Messages:
    12
    Likes Received:
    2
    Simple projects that apply the algorithms discussed in the live classes, with a slightly creative approach, will be sufficient for the project submission.
     
    #6
  7. vjshanky

    vjshanky Member

    Joined:
    Mar 25, 2018
    Messages:
    2
    Likes Received:
    0
    In the "Projects for R" attachment in Simplilearn (Data science with R _ Downloads) you will see 4 zip folders (Insurance, Retail, Internet & Healthcare). You will have to submit one of those projects.
     
    #7
  8. _23889

    _23889 New Member

    Joined:
    Feb 21, 2018
    Messages:
    1
    Likes Received:
    0
    Hi All,

    Can you tell me where our trainer Nimisha attached the R code from the classroom training?

    Let me know the updates.

    Thanks,
    Manju
     
    #8
  9. Jeny George

    Jeny George Member
    Alumni

    Joined:
    Dec 22, 2017
    Messages:
    4
    Likes Received:
    0
    @Nimisha Pandey Hi Nimisha, I just wanted to clarify when the Data Science with R course (started Mar 10th) is actually ending, because I'm a little worried about the project submission time. The WebEx recordings say Mar 10th through Apr 8th, but the scheduled classes only run until Apr 1st.
     
    #9
  10. _24420

    _24420 Member

    Joined:
    Feb 27, 2018
    Messages:
    9
    Likes Received:
    1

    Thanks, Jitendra, for the solution.
     
    #10
    Jitendra Budiya likes this.
  11. Jitendra Budiya

    Alumni Customer

    Joined:
    Oct 9, 2017
    Messages:
    12
    Likes Received:
    2
    You're welcome.
     
    #11
  12. Jitendra Budiya

    Alumni Customer

    Joined:
    Oct 9, 2017
    Messages:
    12
    Likes Received:
    2
    Jeny, the course will end on 1st April according to the LMS, but the trainer can extend the batch's classes. So if Nimisha ma'am wants to take a few more sessions, we will get links by email to attend the classes and download the recordings.
     
    #12
  13. Jeny George

    Jeny George Member
    Alumni

    Joined:
    Dec 22, 2017
    Messages:
    4
    Likes Received:
    0
    @Nimisha Pandey So the dates mentioned in WebEx are irrelevant?
     
    #13
  14. Murugadoss B

    Murugadoss B New Member

    Joined:
    Jan 27, 2017
    Messages:
    1
    Likes Received:
    0
    The project files are .zip files, and I am not able to open the CSV file inside them in the RStudio lab. Has anyone managed to open it? Please share the steps.
     
    #14
  15. Priyanka_Mehta

    Priyanka_Mehta Well-Known Member
    Simplilearn Support

    Joined:
    May 25, 2017
    Messages:
    830
    Likes Received:
    57
    Hi Murugadoss,

    You have to right-click on the zip file and extract its contents. You will then find the CSV file inside, which you can use in RStudio.

    I am sure this will help you.
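
If right-click extraction is not available (for example in a hosted lab), the same steps can probably be done from the R console. This is only a sketch: read_csv_from_zip is a hypothetical helper written for this post, and "Insurance.zip" is a placeholder for wherever your archive actually is.

```r
# Hypothetical helper: extract a zip archive and read the first CSV found in it.
read_csv_from_zip <- function(zip_path, exdir = tempfile("unzipped_")) {
  # unzip() extracts into exdir (created if needed) and returns the extracted paths
  extracted <- unzip(zip_path, exdir = exdir)
  csv_files <- extracted[grepl("\\.csv$", extracted, ignore.case = TRUE)]
  if (length(csv_files) == 0) {
    stop("No CSV file found in ", zip_path)
  }
  read.csv(csv_files[1], header = TRUE)
}

# Placeholder name; point this at your downloaded project archive.
zip_path <- "Insurance.zip"
if (file.exists(zip_path)) {
  project_data <- read_csv_from_zip(zip_path)
  str(project_data)
}
```

If the archive holds several CSV files, only the first one is read here; adjust the index or loop over csv_files as needed.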
     
    #15
  16. Priyanka_Mehta

    Priyanka_Mehta Well-Known Member
    Simplilearn Support

    Joined:
    May 25, 2017
    Messages:
    830
    Likes Received:
    57
    Hi Jeny,

    The dates mentioned in the LMS are absolutely relevant. We have extended the session by 2 days, since all the learners will be doing their project submission in the last session. Kindly follow the instructions given in the community thread below, which will guide you on which projects you can work on. You have to submit your project on the last day.
    http://community.simplilearn.com/threads/project-guidance-for-data-science-with-r-course.32417/

    We have 3 sessions to go before that, so please start going through the projects and choose any 1 domain you would like to work on. You will be able to ask your general queries about it before the last session and then submit your project in the last session.

    I hope this will help you. All the very best!
     
    #16
  17. Nimisha Pandey

    Trainer

    Joined:
    Aug 4, 2017
    Messages:
    12
    Likes Received:
    5
    Hi All,

    Please find attached the code for the Z test, t test (Basic Stats Questions), ANOVA, and Chi-square test, along with the Titanic case study.

    Thanks and Best
     

    #17
    _24450 likes this.
  18. vjshanky

    vjshanky Member

    Joined:
    Mar 25, 2018
    Messages:
    2
    Likes Received:
    0
    Link to a description of Linear Regression, with an explanation of all the outputs of the lm() function:

    http://r-statistics.co/Linear-Regression.html

    The most common metrics to look at while selecting the model are:

    STATISTIC                               CRITERION
    R-Squared                               Higher the better (> 0.70)
    Adj R-Squared                           Higher the better
    F-Statistic                             Higher the better
    Std. Error                              Closer to zero the better
    t-statistic                             Should be greater than 1.96 (in absolute value) for the p-value to be less than 0.05
    AIC                                     Lower the better
    BIC                                     Lower the better
    Mallows cp                              Should be close to the number of predictors in the model
    MAPE (Mean absolute percentage error)   Lower the better
    MSE (Mean squared error)                Lower the better
    Min_Max Accuracy                        Higher the better; computed as mean(min(actual, predicted)/max(actual, predicted))
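
As a rough illustration, most of the metrics in the table can be pulled from a fitted lm() object like this. The model (mpg ~ wt on the built-in mtcars data) is chosen only for the example, and the error metrics are computed in-sample for brevity; on a real project you would compute them on held-out test data. Mallows cp is not covered here.

```r
# Fit a simple linear model on built-in data (illustrative choice of variables).
model <- lm(mpg ~ wt, data = mtcars)
s <- summary(model)

r_squared     <- s$r.squared
adj_r_squared <- s$adj.r.squared
f_statistic   <- unname(s$fstatistic["value"])

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(s)

aic <- AIC(model)   # lower is better
bic <- BIC(model)   # lower is better

actual    <- mtcars$mpg
predicted <- fitted(model)

mse  <- mean((actual - predicted)^2)
mape <- mean(abs((actual - predicted) / actual))

# pmin/pmax compare element by element, which is what the Min_Max formula needs;
# plain min/max would collapse each vector to a single number.
min_max_accuracy <- mean(pmin(actual, predicted) / pmax(actual, predicted))

print(round(c(R2 = r_squared, AdjR2 = adj_r_squared, MSE = mse,
              MAPE = mape, MinMax = min_max_accuracy), 3))
```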
     
    #18
  19. Shomik Bhattacharyya_1

    Joined:
    Aug 22, 2017
    Messages:
    4
    Likes Received:
    1
    Nimisha, as mentioned in the class on April 1st, the flight data logistic regression demo project throws an error when I try to run confusion.matrix on the test set. The error I get is something like "number of observations do not match the number of predictions".
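
One possible cause of a length mismatch like this (an assumption; the demo code is not shown in this thread) is calling predict() without newdata, which returns one prediction per training row rather than per test row. A sketch on synthetic data, using base R table() in place of the confusion.matrix function from the demo:

```r
# Synthetic stand-in for the flight data (the real dataset is not in this thread).
set.seed(42)
n <- 200
d <- data.frame(x = rnorm(n))
d$y <- rbinom(n, 1, plogis(d$x))

train <- d[1:150, ]
test  <- d[151:200, ]

model <- glm(y ~ x, data = train, family = binomial)

# Without newdata, predict() returns one value per TRAINING row.
pred_train <- predict(model, type = "response")
# With newdata = test, it returns one value per TEST row.
pred_test <- predict(model, newdata = test, type = "response")

length(pred_train) == nrow(test)   # FALSE: 150 predictions vs 50 test rows
length(pred_test) == nrow(test)    # TRUE

# Confusion matrix at a 0.5 threshold, built with base R
predicted_class <- factor(as.integer(pred_test > 0.5), levels = c(0, 1))
actual_class    <- factor(test$y, levels = c(0, 1))
conf_mat <- table(actual = actual_class, predicted = predicted_class)
print(conf_mat)
```

Rows with missing values in the test set can also shift the counts, so comparing length(pred_test) with nrow(test) before building the matrix is a cheap sanity check.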
     
    #19
    _24450 likes this.
  20. Nimisha Pandey

    Trainer

    Joined:
    Aug 4, 2017
    Messages:
    12
    Likes Received:
    5
    Hi All,

    Please find attached the code for linear regression, logistic regression, and decision trees.

    Thanks and Best
     

    #20
    _24450 likes this.
  21. _24450

    _24450 Member

    Joined:
    Feb 27, 2018
    Messages:
    2
    Likes Received:
    0
    thanks
     
    #21
  22. _30058

    _30058 Member

    Joined:
    Apr 23, 2018
    Messages:
    2
    Likes Received:
    0
    Good morning Nimisha, I belong to the R training program (Jul 16 - Aug 1). The following are my doubts:

    1- Usage of abs()
    2- Usage of print along with paste in a for loop
    3- Prime number program: limit of numbers in the for loop
    4- Why is break needed in a while loop when it doesn't auto-increment?
    5- Summary stats?

    Titanic train dataset
    1- Why did you change the survived column to factor type?
    2- Why is the ticket column in factor format?
    3- What is tapply used for?
    4- Why is the data type of 'name' factor?
    5- In sum(is.na(x)), what is x?
     
    #22
  23. Nimisha Pandey

    Trainer

    Joined:
    Aug 4, 2017
    Messages:
    12
    Likes Received:
    5
    1. abs(): used to calculate the absolute value of a number.

    2. print is used for explicit printing; paste lets us combine multiple values and text into one string, which print then displays.

    3. For the prime number program, the limit follows from the definition of a prime: it is divisible only by 1 and itself. Since we are looking only for whole-number factors, and no factor of num other than num itself can be larger than num/2, we need not go beyond num/2.

    4. break is not "needed" in a while loop. It depends on the logical flow of the program, i.e. when you need to exit the loop.

    5. Summary stats are descriptive stats: mean, median, min, max, and the two quartiles (1st and 3rd).
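
The five points above can be sketched as follows. The function name is_prime and the sample numbers are made up for illustration; this is not the code used in class.

```r
# 1. abs(): absolute value
abs(-7.5)

# 2. print + paste inside a for loop
for (i in 1:3) {
  print(paste("Iteration", i, "squared is", i^2))
}

# 3. Prime check that only tests divisors up to num/2
is_prime <- function(num) {
  if (num < 2) return(FALSE)
  if (num < 4) return(TRUE)        # 2 and 3; also avoids the descending range 2:1
  for (d in 2:(num %/% 2)) {
    if (num %% d == 0) return(FALSE)
  }
  TRUE
}
primes_to_20 <- Filter(is_prime, 1:20)
print(primes_to_20)

# 4. A while loop does not advance its counter on its own
count <- 0
while (TRUE) {
  count <- count + 1               # manual increment
  if (count >= 5) break            # leave the loop when the logic calls for it
}

# 5. Summary statistics: min, quartiles, median, mean, max
scores <- c(12, 15, 9, 20, 18)
summary(scores)
```

Stopping at num/2 is enough for correctness; stopping at sqrt(num) would also work and checks fewer divisors.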


    Titanic dataset:

    1. As discussed in class, all variables of categorical type need to be converted to the factor data type.

    2. By default, all character columns are read in as factors (this is the read.csv default in R versions before 4.0). Ticket is also a nominal variable, hence it needs to be in factor format.

    3. tapply is used to apply a function to a column of a data frame, grouped by a factor. You can go through the slides for a detailed explanation.

    4. factor refers to the categorical data type in R.

    5. sum(is.na(x)): here x is the variable for which you are trying to calculate the number of NA values.
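
A small sketch of the Titanic points, using a toy data frame instead of the real training file. The rows and values below are invented for illustration; only the column names Survived, Sex, and Age are meant to resemble the real data.

```r
# Toy stand-in for the Titanic training data (values invented for illustration).
titanic <- data.frame(
  Survived = c(0, 1, 1, 0, 1, 0),
  Sex      = c("male", "female", "female", "male", "female", "male"),
  Age      = c(22, 38, NA, 35, NA, 54),
  stringsAsFactors = FALSE
)

# Categorical variables are converted to factors explicitly.
titanic$Survived <- factor(titanic$Survived, levels = c(0, 1), labels = c("No", "Yes"))
titanic$Sex <- factor(titanic$Sex)

# is.na() gives TRUE/FALSE per element; sum() counts the TRUEs.
missing_age <- sum(is.na(titanic$Age))

# tapply: apply mean() to Age separately for each level of Sex.
mean_age_by_sex <- tapply(titanic$Age, titanic$Sex, mean, na.rm = TRUE)

print(missing_age)
print(mean_age_by_sex)
```

Here x in sum(is.na(x)) is titanic$Age, and the extra na.rm = TRUE argument is passed through tapply to mean() so the missing ages are skipped.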
     
    #23
    Priyanka_Mehta likes this.
