DATA SCIENCE WITH R | JULY 04 | SONAL

Discussion in 'Big Data and Analytics' started by Sriraksha G, Jul 3, 2020.

  1. Mahaboob_2

    Mahaboob_2 Member
    Alumni

    Joined:
    Jun 22, 2020
    Messages:
    3
    Likes Received:
    0
    Hi Srikanth,

    I am able to view the queries/threads from the previous batches. Do I need to raise a support request ?

    I would like to view the queries from my batch

    Regards
    Mahaboob
     
    #51
  2. Saumya Singh_4

    Joined:
    Jul 3, 2020
    Messages:
    4
    Likes Received:
    0
    v10 is a character vector because of the 's' among its elements.
    as.logical() returns NA for every string except the recognised spellings of true and false (such as T, F, TRUE and FALSE).
    Since all elements are now treated as strings, the result will be NA for all of them except F.
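
    A minimal sketch of the coercion described above, reusing the v10 vector from the class exercise. The name as_lgl is only illustrative.

```r
# Mixing a string into c() coerces every element to character.
v10 <- c(0, 1, 's', -20, 10+0i, F, 10.6)
class(v10)   # "character"
v10          # the logical F is now stored as the string "FALSE"

# On character input, as.logical() only recognises strings such as
# "T", "TRUE", "true", "True", "F", "FALSE", "false" and "False".
as_lgl <- as.logical(v10)
as_lgl       # NA NA NA NA NA FALSE NA
```

    Note that "0" and "1" become NA here: they are strings, not numbers, by the time as.logical() sees them.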
     
    #52
  3. Nagrale Nisarg Navnath

    Joined:
    Jun 30, 2020
    Messages:
    7
    Likes Received:
    0
    Hello ma'am,
    You shared only half of the file. The piping function is not included in it, and the missing-variable file does not look correct. Could you please upload it again?
     
    #53
  4. NIKHITA GERA

    NIKHITA GERA New Member

    Joined:
    Jun 28, 2020
    Messages:
    1
    Likes Received:
    0
    Hi Sonal ma'am,
    I have one doubt. What is the output when we give two variables as input to the arrange function, as in arrange(mtcars, cyl, disp)?
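
    For what it is worth, arrange(mtcars, cyl, disp) sorts the rows by cyl in ascending order and uses disp to break ties within each cyl value. A small sketch of the same ordering in base R (so it runs without dplyr installed); the name sorted is only illustrative:

```r
# Base R equivalent of arrange(mtcars, cyl, disp):
# order() sorts by its first argument and uses the second to break ties.
sorted <- mtcars[order(mtcars$cyl, mtcars$disp), ]

# cyl is ascending overall; disp is ascending within each cyl group.
head(sorted[, c("cyl", "disp")])
```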
     
    #54
  5. Sukumar Voggu

    Sukumar Voggu New Member

    Joined:
    Jul 2, 2020
    Messages:
    1
    Likes Received:
    0
    Hi Sonal,
    Could you please explain how to load a CSV or Excel file from the local system into RStudio? I am unable to access the data sheets once I have downloaded them from G-drive. I would appreciate it if you could show it in the class.

    Thanks
    Sukumar.
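
    A minimal sketch of reading a local CSV, assuming the file has already been downloaded to a known folder. The file name and columns below are made up so that the example runs anywhere. Excel files need an add-on package such as readxl (its read_excel() function), which is not shown here.

```r
# Stand-in for a downloaded file: write a small CSV to a temporary folder.
csv_path <- file.path(tempdir(), "example_data.csv")
write.csv(data.frame(id = 1:3, score = c(10.5, 8, 9.25)),
          csv_path, row.names = FALSE)

# Read it back by giving read.csv() the full path to the file.
# On Windows, write paths with forward slashes, e.g. "C:/Users/<you>/Downloads/file.csv".
dat <- read.csv(csv_path)
str(dat)

# Alternatively, pick the file in a dialog box:
# dat <- read.csv(file.choose())
```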
     
    #55
  6. Saurav Das_2

    Saurav Das_2 Member

    Joined:
    Jun 14, 2020
    Messages:
    2
    Likes Received:
    0
    Hi,
    I'm getting the error on running the code.
    ------Code------------
    RSC=read.table("RetailScoreData.txt")
    View(RSC)

    Error---
    Error in View : object 'RSC' not found
     

    #56
  7. BIKRAM BHATTACHARJEE

    Joined:
    Jun 26, 2020
    Messages:
    2
    Likes Received:
    0
    Hi Sonal,

    How can we get the text and CSV files uploaded?
    Can anyone help, please?
     
    #57
  8. Saurav Das_2

    Saurav Das_2 Member

    Joined:
    Jun 14, 2020
    Messages:
    2
    Likes Received:
    0
    I get the output only if I first go to the source and then run the code.
     
    #58
  9. Mohit Kumar Singh

    Joined:
    Jul 3, 2020
    Messages:
    2
    Likes Received:
    1
    How can I convert characters into dates when the dates are present in two different formats, using / and - ? Example: 23/05/2020 and 25-5-2004
     
    #59
    Last edited: Jul 21, 2020
  10. Ajaykiran

    Ajaykiran Member

    Joined:
    Jul 8, 2020
    Messages:
    8
    Likes Received:
    0
    Hello,

    If I unlocked the certificate by completing 1 project and then complete a 2nd project, shouldn't the certificate be updated to "2 Projects Completed"?
     
    #60
  11. Bhagyalaxmi

    Bhagyalaxmi Member

    Joined:
    Jun 30, 2020
    Messages:
    9
    Likes Received:
    0
    Hello everyone, I have some issues with the hospital datasets of the 7th project. If anybody has downloaded them, could you please send them to me? A CSV file would be better if you have one.
     
    #61
  12. Mohit Kumar Singh

    Joined:
    Jul 3, 2020
    Messages:
    2
    Likes Received:
    1

    Try giving the full path where the file is located (use '/' in the path). Also cross-check the file format once, whether it is text or not.
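
    A sketch of that check, assuming the file sits in the current working directory (replace the path with wherever it was actually saved). If read.table() fails, RSC is never created, which is why View(RSC) then reports that the object is not found.

```r
# Hypothetical location: the current working directory.
path <- file.path(getwd(), "RetailScoreData.txt")

if (file.exists(path)) {
  # header = TRUE is an assumption about the layout of the file.
  RSC <- read.table(path, header = TRUE)
  head(RSC)
} else {
  message("File not found: ", path,
          "\nCurrent working directory: ", getwd())
}
```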
     
    #62
    Sonal Ghanshani_1 likes this.
  13. Krishna_258

    Krishna_258 Member

    Joined:
    Jun 19, 2020
    Messages:
    4
    Likes Received:
    0
    Hi Sonal, I am trying project 7 but it is throwing an error like this. Can you help me out, please?
    > library(dplyr)
    Attaching package: ‘dplyr’

    The following objects are masked from ‘package:stats’:

    filter, lag

    The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

    > df
    function (x, df1, df2, ncp, log = FALSE)
    {
    if (missing(ncp))
    .Call(C_df, x, df1, df2, log)
    else .Call(C_dnf, x, df1, df2, ncp, log)
    }
    <bytecode: 0x55927fe85ee0>
    <environment: namespace:stats>
    > arrange(df, desc(TotChgAge))
    Error in UseMethod("arrange_") :
    no applicable method for 'arrange_' applied to an object of class "function"
    > arrange(df, desc(TotChgAge))[1,]
    Error in UseMethod("arrange_") :
    no applicable method for 'arrange_' applied to an object of class "function"
    >
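
    The output above suggests that no data frame called df exists in the workspace, so the name falls through to stats::df(), the density function of the F distribution, and arrange() receives a function. A small sketch of the situation, using a made-up data frame (hosp and its values are hypothetical; the column name comes from the post) and base R sorting so it runs without dplyr:

```r
# Without a data frame named df, the name refers to this function.
is.function(stats::df)   # TRUE

# Hypothetical stand-in for the hospital data.
hosp <- data.frame(TotChgAge = c(120, 450, 300))

# Base R equivalent of arrange(hosp, desc(TotChgAge))
hosp_sorted <- hosp[order(hosp$TotChgAge, decreasing = TRUE), , drop = FALSE]
hosp_sorted[1, , drop = FALSE]   # row with the largest TotChgAge
```

    Loading the project data into a variable first (for example with read.csv) and passing that variable to arrange() should avoid the error.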
     
    #63
  14. Krishna_258

    Krishna_258 Member

    Joined:
    Jun 19, 2020
    Messages:
    4
    Likes Received:
    0
    Hi Sonal, when I practice in RStudio, every piece of code gives me an error... can you please help me out?
    Regards,
    krishna
     
    #64
  15. JIthin Narayanan

    Joined:
    Jun 18, 2020
    Messages:
    6
    Likes Received:
    0
    Hi Sonal,

    I am having a tough time understanding how to crack the third question of the project. It asks whether there is any relation between race and hospitalization cost.

    So do I have to use linear regression here and represent that in a chart?
     
    #65
  16. Bhagyalaxmi

    Bhagyalaxmi Member

    Joined:
    Jun 30, 2020
    Messages:
    9
    Likes Received:
    0
    Hello Sonal mam
    Can you tell me how to arrange a date column in which some of the values are in one format and some are in another?
     
    #66
  17. Bhagyalaxmi

    Bhagyalaxmi Member

    Joined:
    Jun 30, 2020
    Messages:
    9
    Likes Received:
    0
    Hello Sonal mam
    Can you help me with the code? There is a factor with 4 levels, "Open", "Pending", "Closed" and "Solved". I want Open and Pending to be categorized as Open, and Closed and Solved to be categorized as Closed, and then create a new categorical variable with the values Open and Closed.

    Open = project2$Status %in% c("Pending","Open")
    Closed = project2$Status %in% c("Closed", "Solved")

    project2$value = c("Open","Closed")

    This code runs, but the result does not match the dataset.
     
    #67
  18. Rahul Chaurasia_1

    Rahul Chaurasia_1 Active Member

    Joined:
    Jun 19, 2020
    Messages:
    20
    Likes Received:
    2
    hello sonal mam,
    I have one query about changing a column name in a data frame. Please let me know how to change one particular column name.
    RSC is my data frame and branch is a column. Now I want to rename branch to n_branch. How do I do it?

    colnames(RSC$branch) = 'n_branch'
    # error :attempt to set 'colnames' on an object with less than two dimensions

    Please reply if anyone can help.
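
    One possible way to do this, sketched on a made-up RSC (the real data frame has other columns). The error arises because RSC$branch is a plain vector, and colnames() only applies to objects with at least two dimensions.

```r
# Hypothetical stand-in for the RSC data frame.
RSC <- data.frame(branch = c("A", "B"), score = c(1, 2))

# Apply colnames() to the whole data frame and pick the column by name.
colnames(RSC)[colnames(RSC) == "branch"] <- "n_branch"
colnames(RSC)
```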
     
    #68
  19. Ajaykiran

    Ajaykiran Member

    Joined:
    Jul 8, 2020
    Messages:
    8
    Likes Received:
    0
    Are you working on Project 2? The dates column?

    If so, what I did is:
    1. Parsed the dates using the format '%d/%m/%Y' and saved the result in a variable, say 'date1'. The dates written in the '%d-%m-%Y' form came out as NA in date1.
    2. Parsed the same dates using the format '%d-%m-%Y' and saved the result in another variable, say 'date2'. Here all dates written in the '%d/%m/%Y' form came out as NA.
    3. Replaced all the NAs in date1 with the non-NA values from date2.
    4. Then put the result back into the dates column of my data frame.

    Try it, and if you are stuck I can help with the code.
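
    The steps above can be sketched as follows, using the two example dates from the question. The names dates_raw and gaps are only illustrative.

```r
dates_raw <- c("23/05/2020", "25-5-2004")

# Step 1: parse with the slash format; dash-style entries become NA.
date1 <- as.Date(dates_raw, format = "%d/%m/%Y")

# Step 2: parse with the dash format; slash-style entries become NA.
date2 <- as.Date(dates_raw, format = "%d-%m-%Y")

# Step 3: fill the gaps in date1 with the values parsed in date2.
gaps <- is.na(date1)
date1[gaps] <- date2[gaps]

# Step 4: date1 can now be assigned back to the column of the data frame.
date1   # "2020-05-23" "2004-05-25"
```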
     
    #69
  20. Ajaykiran

    Ajaykiran Member

    Joined:
    Jul 8, 2020
    Messages:
    8
    Likes Received:
    0


    open=(df$Status=="Open"| df$Status=="Pending")
    closed=(df$Status=="Closed"| df$Status=="Solved")
     
    #70
  21. Ajaykiran

    Ajaykiran Member

    Joined:
    Jul 8, 2020
    Messages:
    8
    Likes Received:
    0
    Hello,

    When doing multiple regression, say I have a categorical variable with 4 levels. If 2 of the levels are significant (***) and the other 2 are not significant (no stars), do I eliminate the categorical variable as a whole?
     
    #71
  22. Sachin Gaikwad_1

    Sachin Gaikwad_1 New Member

    Joined:
    Jun 22, 2020
    Messages:
    1
    Likes Received:
    0
    Execute (run) the first line.
     
    #72
  23. JIthin Narayanan

    Joined:
    Jun 18, 2020
    Messages:
    6
    Likes Received:
    0
    Hey Guys,

    Anyone there
     
    #73
  24. JIthin Narayanan

    Joined:
    Jun 18, 2020
    Messages:
    6
    Likes Received:
    0
    Can anyone help me out with answering question number 6 of project 7?
     
    #74
  25. JIthin Narayanan

    Joined:
    Jun 18, 2020
    Messages:
    6
    Likes Received:
    0
    Hi Guys,

    Shall we discuss Project 7
     
    #75
  26. JIthin Narayanan

    Joined:
    Jun 18, 2020
    Messages:
    6
    Likes Received:
    0
    I have a doubt about the interpretation part of the 6th question.
    If anyone knows how to explain it, it would be great if you could help.
     
    #76
  27. Bhagyalaxmi

    Bhagyalaxmi Member

    Joined:
    Jun 30, 2020
    Messages:
    9
    Likes Received:
    0
    I have done it
     
    #77
  28. Bhagyalaxmi

    Bhagyalaxmi Member

    Joined:
    Jun 30, 2020
    Messages:
    9
    Likes Received:
    0
    They want to know which variable mainly affects the hospital cost. Here we have to check the relationship of each variable with respect to hospital cost.
     
    #78
  29. Manasa Lakshmi

    Manasa Lakshmi New Member

    Joined:
    Jun 14, 2020
    Messages:
    1
    Likes Received:
    0
    Hi Sonal,

    Please explain the 6th question in the Healthcare cost analysis project and please provide some hints on how to approach it. Waiting for your kind reply.

    Thank you
     
    #79
  30. Nagrale Nisarg Navnath

    Joined:
    Jun 30, 2020
    Messages:
    7
    Likes Received:
    0
    Hello Mam,
    I have a doubt in project no. 7, question no. 5. Here I have to compare LOS (length of stay, which is numerical) with three other categorical variables. We can use the ANOVA function for this, so why are we going with linear regression?
     
    #80
  31. Ajaykiran

    Ajaykiran Member

    Joined:
    Jul 8, 2020
    Messages:
    8
    Likes Received:
    0
    Hello, I have a doubt in Project 01. When I import the Excel file into R, the dates get converted to some numbers. How do I change them back?
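
    If those numbers are Excel serial dates (an assumption; this is the usual cause), they count days from 1899-12-30 in Excel's default 1900 date system and can be converted with as.Date(). The serial values below are made-up examples.

```r
# Hypothetical serial numbers as they might appear after import.
serials <- c(43831, 44013)

# Excel's default (1900) date system counts days from 1899-12-30.
converted <- as.Date(serials, origin = "1899-12-30")
converted   # "2020-01-01" "2020-07-01"
```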
     
    #81
  32. Nagrale Nisarg Navnath

    Joined:
    Jun 30, 2020
    Messages:
    7
    Likes Received:
    0
    Hello mam,
    I have used 50/50 hours of Rlab. Now it is not opening, and I am not able to access whatever data I have saved. Please give me a solution, ma'am.
     
    #82
  33. Bhagyalaxmi

    Bhagyalaxmi Member

    Joined:
    Jun 30, 2020
    Messages:
    9
    Likes Received:
    0
    Ma'am, I know this code and it is working, but my question is how to add these two values, "Open" and "Closed", in one column with respect to the "Status" column.
    I tried
    df$Value = c("Open","Closed")
    It is added, but it is not connected with the "Status" column.
     
    #83
  34. Saumya Singh_4

    Joined:
    Jul 3, 2020
    Messages:
    4
    Likes Received:
    0
    To connect the new column to the Status column, use ifelse:

    df$Value = ifelse(df$Status=="Open"| df$Status=="Pending","Open","Closed")
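
    A runnable sketch of the same idea on a made-up df with one row per status, written with %in% as in the earlier attempt. Assigning c("Open","Closed") directly just recycles the two labels down the column regardless of Status, which is why the result did not match the data.

```r
# Hypothetical stand-in for the project data.
df <- data.frame(Status = c("Open", "Pending", "Closed", "Solved"))

# Each row gets its label from its own Status value.
df$Value <- ifelse(df$Status %in% c("Open", "Pending"), "Open", "Closed")

# Store the result as a categorical variable with two levels.
df$Value <- factor(df$Value, levels = c("Open", "Closed"))

table(df$Status, df$Value)
```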
     
    #84
  35. Saurabh_245

    Saurabh_245 New Member

    Joined:
    Jul 3, 2020
    Messages:
    1
    Likes Received:
    0
    hello sonal ma'am,
    This is Saurabh. I am working on the project (healthcare data) and I have a doubt in Q6.
    When I run this code
    Model4 <- lm(TOTCHG ~ ., data = hops) # The dot (.) uses all the other variables in the data, apart from TOTCHG, as predictors.
    summary(Model4)
    I get the p-value 2.2e-16 (this is the overall p-value), and comparing it with alpha gives TRUE. But the output also has a different p-value for each variable.
    So do I have to compare every p-value with alpha to find out which variables are related to the dependent variable, or do I stop at the overall p-value?
     
    #85
  36. Sonal Ghanshani_1

    Sonal Ghanshani_1 Well-Known Member
    Alumni Customer

    Joined:
    Oct 31, 2018
    Messages:
    186
    Likes Received:
    19
    Hey,
    Please try.
    matrix(c(1,2,3,4, rep(0,5)), 3)

    regards,
    Sg
     
    #86
  37. Sonal Ghanshani_1

    Sonal Ghanshani_1 Well-Known Member
    Alumni Customer

    Joined:
    Oct 31, 2018
    Messages:
    186
    Likes Received:
    19
    v10 = c(0,1,'s',-20,10+0i,F,10.6)
    class(v10)

    Class is character. It cannot be converted to logical; only T and F will be read as logical.

    regards,
    Sg
     
    #87
  38. Sonal Ghanshani_1

    Sonal Ghanshani_1 Well-Known Member
    Alumni Customer

    Joined:
    Oct 31, 2018
    Messages:
    186
    Likes Received:
    19
    > v8 = 1:7
    > as.logical(v8)
    [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE
    > v10 = c(0,1,'s',-20,10+0i,F,10.6)
    > class(v10)
    [1] "character"
    > t = as.integer(v10)
    Warning message:
    NAs introduced by coercion
    > t
    [1] 0 1 NA -20 NA NA 10
    > as.logical(t)
    [1] FALSE TRUE NA TRUE NA NA TRUE
    > v8&t
    [1] FALSE TRUE NA TRUE NA NA TRUE
     
    #88
  39. Sonal Ghanshani_1

    Sonal Ghanshani_1 Well-Known Member
    Alumni Customer

    Joined:
    Oct 31, 2018
    Messages:
    186
    Likes Received:
    19
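    mat4 <- matrix(c(1:18), nrow = 6, ncol = 3)
    b <- mat4>5          # logical matrix: TRUE where the element is greater than 5
    b
    # Instead work with
    b <- mat4[mat4>5]    # vector of the elements that are greater than 5
    b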
     
    #89
  40. Sonal Ghanshani_1

    Sonal Ghanshani_1 Well-Known Member
    Alumni Customer

    Joined:
    Oct 31, 2018
    Messages:
    186
    Likes Received:
    19
    Hey,

    Only those levels that have a p-value greater than alpha (0.05) are insignificant, not the entire variable.


    regards,
    Sg
     
    #90
  41. Sonal Ghanshani_1

    Sonal Ghanshani_1 Well-Known Member
    Alumni Customer

    Joined:
    Oct 31, 2018
    Messages:
    186
    Likes Received:
    19
    Hey,

    Here we have to check relationship with respect to hospital cost.

    # model5 <- lm(TOTCHG ~ ., df)
    # summary(model5)

    regards,
    Sg
     
    #91
  42. Sonal Ghanshani_1

    Sonal Ghanshani_1 Well-Known Member
    Alumni Customer

    Joined:
    Oct 31, 2018
    Messages:
    186
    Likes Received:
    19
    Model3 <- lm(dependent ~ independent1 + independent2 + independent3, df)

    Model3 <- aov(dependent ~ independent1 + independent2 + independent3, df)

    aov - typically used when the independent variables are categorical

    lm - independent variables can be continuous or categorical

    regards,
    Sg
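
    A small sketch comparing the two calls on PlantGrowth, a dataset that ships with R and is used here only as a stand-in for the project data (weight is numeric, group is a factor with 3 levels). aov() fits its model through lm(), so the overall F test agrees; the difference is mainly in how the result is summarised.

```r
fit_aov <- aov(weight ~ group, data = PlantGrowth)
fit_lm  <- lm(weight ~ group, data = PlantGrowth)

summary(fit_aov)   # one F test for the factor as a whole
summary(fit_lm)    # one coefficient and p-value per non-reference level

# The F statistic for group is the same in both views.
f_aov <- summary(fit_aov)[[1]][["F value"]][1]
f_lm  <- anova(fit_lm)[["F value"]][1]
all.equal(f_aov, f_lm)
```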
     
    #92
  43. Sonal Ghanshani_1

    Sonal Ghanshani_1 Well-Known Member
    Alumni Customer

    Joined:
    Oct 31, 2018
    Messages:
    186
    Likes Received:
    19
    Please raise a ticket for this issue. Technical team will be able to help you.
     
    #93
  44. Sonal Ghanshani_1

    Sonal Ghanshani_1 Well-Known Member
    Alumni Customer

    Joined:
    Oct 31, 2018
    Messages:
    186
    Likes Received:
    19
    hey,
    You can try:
    summary(model5)$coefficients[,4]
    summary(model5)$coefficients[,4]["AGE"]
    summary(model5)$coefficients[,4]["AGE"] < alpha

    regards,
    Sg
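
    A runnable sketch of the same idea on mtcars, used here as a stand-in because the hospital data is not available in this thread. The model formula and the value of alpha are assumptions for illustration. The overall p-value only says whether the model as a whole explains anything; the per-coefficient p-values indicate which variables contribute.

```r
alpha <- 0.05   # assumed significance level

# Stand-in model; in the project this would be lm(TOTCHG ~ ., data = ...).
model <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# The fourth column of the coefficient table holds the p-values.
p_values <- summary(model)$coefficients[, "Pr(>|t|)"]
p_values

# TRUE marks the coefficients that are significant at the chosen alpha.
p_values < alpha
names(p_values)[p_values < alpha]
```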
     
    #94
  45. Vishal Suppal

    Vishal Suppal Member

    Joined:
    Jun 22, 2020
    Messages:
    2
    Likes Received:
    0
    Hello Sonal Mam,

    My doubt is about the gl() function.
    While studying one-way ANOVA in the self-learning material on the LMS (Lesson 6 Statistics for data science II < Slide 6.3 of parametric test < at 10 minutes 39 seconds), I saw this step:

    Create a vector of treatment factors, corresponding to each element of R in step 3, using the gl function

    tm = gl(k,1,n*k,factor(f)) #matching elements
    tm
    av = aov(r~tm)
    summary(av)

    Can you please explain this mam ??
     
    #95
  46. Gaurav_326

    Gaurav_326 Member

    Joined:
    Nov 16, 2019
    Messages:
    5
    Likes Received:
    0
    colnames() works with data frames; you are passing a vector.
     
    #96
  47. Gaurav_326

    Gaurav_326 Member

    Joined:
    Nov 16, 2019
    Messages:
    5
    Likes Received:
    0
    It just returns a factor with k levels.
    1 is the number of times each level is repeated in a row,
    n*k is the length of the factor generated,
    and the labels are taken from factor(f).

    For more details you may use the link below,
    https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/gl
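
    A small sketch of gl() feeding a one-way ANOVA. The values of k and n, the labels and the simulated response r are all made up for illustration; they are not the numbers from the LMS slide.

```r
k <- 3   # number of treatments (assumed)
n <- 4   # observations per treatment (assumed)

# k levels, each repeated once in turn, recycled to length n * k.
tm <- gl(k, 1, n * k, labels = c("A", "B", "C"))
tm   # A B C A B C A B C A B C

# Simulated response whose mean depends on the treatment.
set.seed(1)
r <- rnorm(n * k, mean = rep(c(10, 12, 15), times = n))

# One-way ANOVA of the response on the treatment factor.
av <- aov(r ~ tm)
summary(av)
```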
     
    #97
  48. Vishal Suppal

    Vishal Suppal Member

    Joined:
    Jun 22, 2020
    Messages:
    2
    Likes Received:
    0
    #98
  49. Ajaykiran

    Ajaykiran Member

    Joined:
    Jul 8, 2020
    Messages:
    8
    Likes Received:
    0
    I don't think we got a class on Hierarchical Clustering. Did we ?
     
    #99
  50. Bhagyalaxmi

    Bhagyalaxmi Member

    Joined:
    Jun 30, 2020
    Messages:
    9
    Likes Received:
    0
    Hello Sonal Mam

    "The team wants to analyze each variable of the data collected through data summarization to get a basic understanding of the dataset and to prepare for further analysis."
    Ma'am, what exactly is being asked in this question? Is it the data cleaning part or something else? Most projects have this question.

    Regards
    Bhagya Patel
     
    #100
