DATA SCIENCE WITH R | Saurabh Bansal

Discussion in 'Big Data and Analytics' started by Kunal Guwalani, Jan 30, 2020.

  1. Kunal Guwalani

    Kunal Guwalani Well-Known Member
    Staff Member Simplilearn Support

    Joined:
    Jul 17, 2018
    Messages:
    204
    Likes Received:
    24
    #1
  2. Shashwat Samrat Paul

    Shashwat Samrat Paul Active Member

    Joined:
    Jan 31, 2020
    Messages:
    26
    Likes Received:
    3
    Hi Saurabh, I have no prior knowledge of any programming language. While you were explaining loops to us today, I was confused about why we need to repeat an action. Could you explain it more clearly?
     
    #2
  3. Payal Saxena

    Payal Saxena Member
    Alumni

    Joined:
    Dec 23, 2019
    Messages:
    5
    Likes Received:
    0
    Hi,
    I have the dataset below:
    colnames(RSC)
    [1] "branch" "ncust" "customer Id" "age" "education" "employ"
    [7] "address" "income" "creddebt" "othdebt" "default"
    I am trying to run this and am getting an error.
    tapply(RSC[,8:10], RSC$branch, mean)
    Error in tapply(RSC[, 8:10], RSC$branch, mean) :
    arguments must have same length
    How can we get the branch-wise mean for multiple columns using tapply?
    Thanks & Regards
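
A minimal sketch of two ways this might be done. tapply() expects a single vector as its first argument; a three-column data frame has length 3 (its column count), which does not match the length of RSC$branch, hence the error. The small RSC_demo data frame below is made up for illustration and only borrows the column names from the post.

```r
# Made-up stand-in for RSC (illustrative values only)
RSC_demo <- data.frame(
  branch   = c("A", "A", "B", "B"),
  income   = c(10, 20, 30, 50),
  creddebt = c(1, 3, 2, 4),
  othdebt  = c(2, 2, 6, 8)
)

# Option 1: aggregate() handles several columns at once
by_branch <- aggregate(cbind(income, creddebt, othdebt) ~ branch,
                       data = RSC_demo, FUN = mean)
print(by_branch)

# Option 2: keep tapply(), but apply it to one column at a time
cols <- c("income", "creddebt", "othdebt")
means_matrix <- sapply(RSC_demo[, cols],
                       function(col) tapply(col, RSC_demo$branch, mean))
print(means_matrix)
```

On the real data, the same calls should work with RSC in place of RSC_demo, assuming columns 8 to 10 are numeric.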
     
    #3
  4. CHIGIRIKOTA MEGHANA

    CHIGIRIKOTA MEGHANA New Member

    Joined:
    Jan 29, 2020
    Messages:
    1
    Likes Received:
    0
    Can someone please tell me the reason for the error in my code below? When I try to read the value, it does not take the input, and the check inside the loop has no effect.
    i<-readline(prompt="enter a value")
    for (i in 1:20){
    if(i<=20)
    print(i)
    }
     
    #4
  5. Aman kumar verma

    Alumni

    Joined:
    Oct 27, 2019
    Messages:
    11
    Likes Received:
    1
    It would be better if you could explain what you are trying to do with your code. If you want to find the entered value inside the loop, you can try this code:
    input <- readline(prompt = 'enter any value')
    for(i in 1:20){
    if(i == input){
    print(i)
    }
    }
    Please compare this syntax with your code and you will see where you made the mistake.
     
    #5
  6. Saurabh Bansal_2

    Saurabh Bansal_2 Customer
    Customer

    Joined:
    Jan 28, 2020
    Messages:
    5
    Likes Received:
    2
    Hi, think of a scenario where you need to read the names of all employees from a file -> read their net salary -> add taxes to the net salary -> store the result in a new variable.

    In this case you have to read the salary of each employee one by one -> add the taxes to it and save it in a variable.
    This has to be repeated for every employee, so a loop is required here (though it can be done by other methods as well).
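
The scenario above could be sketched as follows. The employee data and the flat 10% tax_rate are made up for illustration; a real file would be read with something like read.csv().

```r
# Made-up data standing in for the employee file
employees <- data.frame(
  name       = c("A", "B", "C"),
  net_salary = c(1000, 1500, 2000)
)
tax_rate <- 0.10  # assumed flat rate, for illustration only

# Loop version: repeat the same action once per employee
with_tax <- numeric(nrow(employees))
for (row in seq_len(nrow(employees))) {
  salary <- employees$net_salary[row]
  with_tax[row] <- salary + salary * tax_rate
}
print(with_tax)

# One of the "other methods": R arithmetic is vectorised
with_tax_vectorised <- employees$net_salary * (1 + tax_rate)
print(with_tax_vectorised)
```

Both versions give the same numbers; the loop just makes the repetition explicit.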
     
    #6
  7. Saurabh Bansal_2

    Saurabh Bansal_2 Customer
    Customer

    Joined:
    Jan 28, 2020
    Messages:
    5
    Likes Received:
    2
    Hi,
    In the line i<-readline(prompt="enter a value"), the i that you get back is of character type. Convert it to an integer and then use it in the loop.
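
A minimal sketch of that conversion, assuming the goal was to print the numbers from 1 up to the entered value (the original post does not say). count_up_to is a hypothetical helper that takes the text readline() would return, so it can also be tried without typing anything. Note that the loop variable is named k: reusing i in for (i in 1:20) would overwrite the value that was read.

```r
# Hypothetical helper: raw_input is the text that readline() would return
count_up_to <- function(raw_input, upper = 20L) {
  n <- as.integer(raw_input)      # readline() always returns a character string
  if (is.na(n)) stop("please enter a whole number")
  shown <- integer(0)
  for (k in seq_len(upper)) {     # k, not n, so the input is not overwritten
    if (k <= n) {
      print(k)
      shown <- c(shown, k)
    }
  }
  invisible(shown)
}

# In an interactive session:
# count_up_to(readline(prompt = "enter a value: "))
count_up_to("5")
```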
     
    #7
    CHIGIRIKOTA MEGHANA likes this.
  8. Aman kumar verma

    Alumni

    Joined:
    Oct 27, 2019
    Messages:
    11
    Likes Received:
    1
    Hi sir,
    I have a question regarding statistics.
    According to the central limit theorem, the means of sufficiently large samples are approximately normally distributed, irrespective of the original distribution. The theorem applies to the samples' "means", right? My question is this: we generally collect only one sample of a specific size. We don't collect many samples and calculate their means, as that is not feasible at all. Then how can we treat any random sample of threshold size 30 as normally distributed?

    I hope my query will be sorted out.
    Thanks,
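
A small simulation may help with this question. It is only a sketch: the exponential population, the seed and the 5000 repetitions are arbitrary choices. The point it illustrates is that the theorem describes the distribution of the sample mean, not of the individual values in one sample, and that a single sample is enough to estimate how much its mean would vary.

```r
set.seed(42)
n <- 30

# Pretend we could repeat the study many times (only possible in a simulation)
sample_means <- replicate(5000, mean(rexp(n, rate = 1)))
mean(sample_means)   # close to the population mean, 1
sd(sample_means)     # close to the theoretical value 1 / sqrt(n)
1 / sqrt(n)

# In practice we have one sample; its standard deviation estimates
# the spread of the sample mean (the standard error)
one_sample   <- rexp(n, rate = 1)
estimated_se <- sd(one_sample) / sqrt(n)
estimated_se
```

hist(sample_means) should look roughly bell-shaped, while hist(one_sample) stays skewed like the population it came from.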
     
    #8
  9. Saurabh Bansal_2

    Saurabh Bansal_2 Customer
    Customer

    Joined:
    Jan 28, 2020
    Messages:
    5
    Likes Received:
    2
    Hi All,
    Materials for the sessions held on the 22nd and 23rd are available on Google Drive.

    Thanks
    Saurabh
     
    #9
  10. Nadia Zafaf

    Nadia Zafaf Member

    Joined:
    Jan 29, 2020
    Messages:
    6
    Likes Received:
    1
    Hi, in the Swedish Motor Insurance project, the field called 'Insured (Number of insured in policy-years)' is not clear to me. Can anyone please explain it?
     
    #10
  11. Akanksha Chaudhary

    Akanksha Chaudhary Active Member
    Alumni

    Joined:
    Feb 20, 2019
    Messages:
    20
    Likes Received:
    0
    p<-1:10
    for (a in p) {print(a)
    a=a*2
    }

    It doesn't show the expected result, i.e. 1, 2, 4, 8.
    It shows 1 to 10 instead.
    Please help me correct the concept and the error.


    Regards
     
    #11
  12. Akanksha Chaudhary

    Akanksha Chaudhary Active Member
    Alumni

    Joined:
    Feb 20, 2019
    Messages:
    20
    Likes Received:
    0
    Sir, why do the following pieces of code behave differently?

    m<-1:3
    print(m)
    ifelse(m==1,print("one"),(ifelse(m==2,print("two"),print("three"))))

    VS

    m<-1:3
    m
    ifelse(m==1,"one",(ifelse(m==2,"two","three")))

    vs

    m<-1:3
    ifelse(m==1,print("one"),(ifelse(m==2,print("two"),print("three"))))
     
    #12
  13. Nadia Zafaf

    Nadia Zafaf Member

    Joined:
    Jan 29, 2020
    Messages:
    6
    Likes Received:
    1
    In your 1st and 3rd code snippets you are using the print() function, so the output is printed twice: once by print() itself and once when the value returned by ifelse() is displayed. You do not need print() inside ifelse(); at the console the result is shown by default, based on the condition.
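
A short sketch that compares the two forms. The variable names labels and with_print are introduced here only for illustration.

```r
m <- 1:3

# Without print(): ifelse() simply returns a character vector
labels <- ifelse(m == 1, "one", ifelse(m == 2, "two", "three"))
labels   # typing the name at the console displays the vector

# With print(): each print() call runs as a side effect while ifelse()
# evaluates its arguments, and ifelse() still returns the same vector
with_print <- ifelse(m == 1, print("one"),
                     ifelse(m == 2, print("two"), print("three")))

identical(labels, with_print)
```

Because print() returns its argument invisibly, both calls produce the same vector; the only difference is the extra lines printed along the way.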
     
    #13
    Akanksha Chaudhary likes this.
  14. Saurabh Bansal_2

    Saurabh Bansal_2 Customer
    Customer

    Joined:
    Jan 28, 2020
    Messages:
    5
    Likes Received:
    2
    Hi All,
    Datasets, project code, and presentations for the discussions on 29th Feb and 1st Mar are available on Google Drive.

    Thanks
    Saurabh
     
    #14
  15. Saurabh Bansal_2

    Saurabh Bansal_2 Customer
    Customer

    Joined:
    Jan 28, 2020
    Messages:
    5
    Likes Received:
    2
    Please use:

    p<-1:10
    for (a in p)
    {
    a = a*2
    print(a)
    }
     
    #15
    Akanksha Chaudhary likes this.
  16. Akanksha Chaudhary

    Akanksha Chaudhary Active Member
    Alumni

    Joined:
    Feb 20, 2019
    Messages:
    20
    Likes Received:
    0
    But this prints values beyond 10. Is there any way to keep the output within p and multiply like 1, 2, 4 and 8?
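
One possible way to get 1, 2, 4, 8, sketched with a while loop. In R, for (a in p) takes the next element of p at the start of every iteration, so changing a inside the body never affects which value comes next. The vector doubled is introduced here only to collect the values.

```r
p <- 1:10

a <- 1
doubled <- numeric(0)
while (a <= max(p)) {   # stop once a would go beyond the largest value in p
  print(a)
  doubled <- c(doubled, a)
  a <- a * 2            # here the change does carry over to the next pass
}
```

A loop-free alternative for the same values would be 2^(0:3).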
     
    #16
  17. MANAS CHANDRA SAHOO

    Joined:
    Feb 14, 2020
    Messages:
    2
    Likes Received:
    0
    Has anyone tried installing R on Ubuntu?
     
    #17
  18. Vinay Sheva Ramani

    Joined:
    Jan 28, 2020
    Messages:
    2
    Likes Received:
    0
    In the ANOVA topic, you took a dataset and applied the function to compare the average sales of each item across different restaurants, and you got p > 0.05, i.e. all items are equally popular. But when I do the calculation manually, I get a different result: it seems that the items are not equally popular. How?


    DataSet
    A B
    1 Item1 23
    2 Item2 56
    3 Item3 20
    4 Item1 52
    5 Item2 40
    6 Item3 22
    7 Item1 44
    8 Item2 8
    9 Item3 19
    10 Item1 52
    11 Item2 47
    12 Item3 18
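
A sketch of how the two views might be reconciled, using the twelve rows listed above. The sample means do differ, but ANOVA asks whether they differ by more than the spread within each item would explain. The object names (sales, fit and so on) are chosen here for illustration.

```r
# The twelve rows from the post
sales <- data.frame(
  A = rep(c("Item1", "Item2", "Item3"), times = 4),
  B = c(23, 56, 20, 52, 40, 22, 44, 8, 19, 52, 47, 18)
)

# The "manual" view: average sale per item
group_means <- tapply(sales$B, sales$A, mean)
print(group_means)

# One-way ANOVA: between-item variation relative to within-item variation
fit <- aov(B ~ A, data = sales)
anova_table <- summary(fit)[[1]]
print(anova_table)

f_value <- anova_table[["F value"]][1]
p_value <- anova_table[["Pr(>F)"]][1]
```

With only four observations per item and a wide spread within Item1 and Item2, the F statistic is about 2.8 and the p-value comes out above 0.05, so the test does not find the difference in means convincing at that level even though the averages look different.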
     
    #18
  19. Sneha I

    Sneha I Member

    Joined:
    Jan 24, 2020
    Messages:
    4
    Likes Received:
    0
    Why does my code below give the wrong output?

    Code:

    df1 = read.csv("fastfood-1.txt", header = TRUE, sep = "")
    df1

    Response:

      Item X1 Item.1 X2 Item.2 X3
    1   22 52     16 NA     NA NA
    2   42 33     24 NA     NA NA
    3   44  8     19 NA     NA NA
    4   52 47     18 NA     NA NA
    5   45 43     34 NA     NA NA
    6   37 32     39 NA     NA NA
     
    #19
  20. Liju Varghese

    Liju Varghese Member

    Joined:
    Oct 5, 2019
    Messages:
    5
    Likes Received:
    1
    This question is regarding hierarchical forecasting in R. I am trying to do forecasting in R, but I am having trouble getting started, especially with defining the hierarchy of my data using the 'hts' library. I don't really understand how to tell R about the hierarchy that I have in my data.
    I am sure this might be a silly question for some of you. It seems there are very few written notes on setting this up in R. Could you kindly help me with this problem? An example of the data is given below.


    Date: Jan-2016 to Feb-2020; 1 country, 4 regions, different districts; target: Sales


    Date(months) Country Regions Districts Sales
    Jan-16 CountryX North District-1 20120
    Feb-16 CountryX North District-2 20508
    Mar-16 CountryX North District-3 20896
    Apr-16 CountryX North District-4 21284
    May-16 CountryX North District-5 21672
    Jun-16 CountryX North District-6 22060
    Jul-16 CountryX North District-7 22448
    Aug-16 CountryX North District-8 22836
    Sep-16 CountryX North District-9 23224
    Oct-16 CountryX North District-10 23612
    Nov-16 CountryX North District-11 24000
    Dec-16 CountryX North District-12 24388
    Jan-17 CountryX South District-2 22060
    Feb-17 CountryX South District-3 22448
    Mar-17 CountryX South District-4 22836
    Apr-17 CountryX South District-5 23224
    May-17 CountryX South District-6 23612
    Jun-17 CountryX South District-7 24000
    Jul-17 CountryX South District-8 24388
    Aug-17 CountryX South District-9 24776
    Sep-17 CountryX South District-10 25164
    Oct-17 CountryX South District-11 25552
    Nov-17 CountryX South District-12 25940
    Dec-17 CountryX South District-13 26328
     
    #20
  21. Akanksha Chaudhary

    Akanksha Chaudhary Active Member
    Alumni

    Joined:
    Feb 20, 2019
    Messages:
    20
    Likes Received:
    0
    Hi Saurabh sir, please tell me when your next set of R classes will start.
     
    #21
  22. Akanksha Chaudhary

    Akanksha Chaudhary Active Member
    Alumni

    Joined:
    Feb 20, 2019
    Messages:
    20
    Likes Received:
    0
    Dear Darshana, I hope you used the library() function to load the dplyr package so that select() is available. Is it working now?
     
    #22
  23. Akanksha Chaudhary

    Akanksha Chaudhary Active Member
    Alumni

    Joined:
    Feb 20, 2019
    Messages:
    20
    Likes Received:
    0
    Dear community members, could you please confirm whether the confidence for the following picture is 1 and not 0.75?
     

    Attached Files:

    #23
  24. Akanksha Chaudhary

    Akanksha Chaudhary Active Member
    Alumni

    Joined:
    Feb 20, 2019
    Messages:
    20
    Likes Received:
    0
    Hi, please explain the following to me:
     

    Attached Files:

    #24
  25. Akanksha Chaudhary

    Akanksha Chaudhary Active Member
    Alumni

    Joined:
    Feb 20, 2019
    Messages:
    20
    Likes Received:
    0
    Hi, in the following example my answers differ from the given ones.
    Question: If {2,3,4} is frequent with sup = 50% and its proper nonempty subsets {2,3}, {2,4}, {3,4}, {2}, {3}, {4} have sup = 50%, 50%, 75%, 75%, 75%, 75% respectively, find the association rules.

    2,3 → 4: confidence = 50/75 (my answer) and not 100%
    2,4 → 3: confidence = 50/75 (my answer) and not 100%
    3,4 → 2: confidence = 50/75 = 67%
    2 → 3,4: confidence = 50/75 = 67%
    3 → 2,4: confidence = 50/50 = 100% (my answer) and not 67%
    4 → 2,3: confidence = 50/50 = 100% (my answer) and not 67%
    Support of all rules = 50% (not sure about this)
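
A small sketch for checking these by computation, using the standard definition confidence(X → Y) = sup(X ∪ Y) / sup(X). It assumes the supports pair up with the subsets in the order the question lists them; the names support, confidence and rules are introduced here for illustration.

```r
# Supports as listed in the question, keyed by itemset
support <- c("2,3,4" = 0.50,
             "2,3" = 0.50, "2,4" = 0.50, "3,4" = 0.75,
             "2" = 0.75, "3" = 0.75, "4" = 0.75)

# confidence(X -> Y) = sup(X u Y) / sup(X); here X u Y is always {2,3,4}
confidence <- function(antecedent, itemset = "2,3,4") {
  unname(support[itemset] / support[antecedent])
}

# Each rule is identified by its left-hand side (antecedent)
rules <- c("2,3 -> 4" = "2,3", "2,4 -> 3" = "2,4", "3,4 -> 2" = "3,4",
           "2 -> 3,4" = "2",   "3 -> 2,4" = "3",   "4 -> 2,3" = "4")

rule_confidence <- sapply(rules, confidence)
print(round(rule_confidence, 2))
```

Under that pairing, the divisor is the support of the left-hand side, so 2,3 → 4 and 2,4 → 3 come out at 100% and the other four rules at about 67%. The support of every rule is the support of {2,3,4}, i.e. 50%.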
     

    Attached Files:

    #25
