# Data Science Certification Training - R Programming|Pratul|Apr 3-May 2

Hi Everyone,

Regards,
Simplilearn

#### Yogesh Thanvi

##### Member
I am getting a "non-conformable arrays" error when I run mat%%as.matrix(vec). My code is:
mat = matrix(1:6,nrow = 2, ncol = 3)
vec = c(10,20,30)
mat%%as.matrix(vec)

Thanks,

#### Yogesh Thanvi

##### Member
@Maria Garcia You can use the Webex recording player or VLC player to watch the recorded sessions.

#### Yogesh Thanvi

##### Member
I am getting a "non-conformable arrays" error when I run mat%%as.matrix(vec). My code is:
mat = matrix(1:6,nrow = 2, ncol = 3)
vec = c(10,20,30)
mat%%as.matrix(vec)
My mistake, it should be the matrix multiplication operator %*% rather than the modulo operator %%:
mat%*%as.matrix(vec)
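
For anyone hitting the same error, here is a minimal sketch of the difference, reusing the variable names from the post above. `%%` is R's element-wise modulo operator, so both operands need compatible shapes, while `%*%` is matrix multiplication. The `tryCatch` wrapper is only there so the failing line does not stop the script.

```r
mat <- matrix(1:6, nrow = 2, ncol = 3)  # 2 x 3 matrix
vec <- c(10, 20, 30)

# as.matrix() turns the vector into a 3 x 1 column matrix
col_vec <- as.matrix(vec)

# Element-wise modulo between a 2 x 3 and a 3 x 1 matrix raises an error;
# capture the message instead of stopping
modulo_result <- tryCatch(mat %% col_vec,
                          error = function(e) conditionMessage(e))
print(modulo_result)

# Matrix product: (2 x 3) %*% (3 x 1) gives a 2 x 1 matrix
product <- mat %*% col_vec
print(dim(product))
print(product)
```

The product has one row per row of `mat`: 1\*10 + 3\*20 + 5\*30 = 220 and 2\*10 + 4\*20 + 6\*30 = 280.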

#### Yogesh Thanvi

##### Member
If the operator (character) entered by the user matches none of the cases above, you can supply a default statement as the last, unnamed argument of switch():

switch(operator,
"+" = print(paste("Addition of two numbers is: ", number1 + number2)),
"-" = print(paste("Subtraction of two numbers is: ", number1 - number2)),
"*" = print(paste("Multiplication of two numbers is: ", number1 * number2)),
"^" = print(paste("Exponent of two numbers is: ", number1 ^ number2)),
"/" = print(paste("Division of two numbers is: ", number1 / number2)),
"%/%" = print(paste("Integer Division of two numbers is: ", number1 %/% number2)),
"%%" = print(paste("Remainder (modulus) of two numbers is: ", number1 %% number2)),
print("default") # Default Statement
)
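
A minimal, self-contained sketch of the same idea, wrapped in a function so it can be run directly. The function name `calculate` and the choice to return `NA` in the default branch are assumptions made for illustration; the post above prints a message instead.

```r
# Hypothetical helper: returns the result instead of printing it
calculate <- function(operator, number1, number2) {
  switch(operator,
    "+"   = number1 + number2,
    "-"   = number1 - number2,
    "*"   = number1 * number2,
    "^"   = number1 ^ number2,
    "/"   = number1 / number2,
    "%/%" = number1 %/% number2,  # integer division
    "%%"  = number1 %% number2,   # remainder
    NA                            # default: unnamed last argument
  )
}

print(calculate("%/%", 7, 2))
print(calculate("%%", 7, 2))
print(calculate("?", 7, 2))   # falls through to the default
```

Because the last argument has no name, `switch()` uses it whenever `operator` matches none of the named cases.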

#### Tathagata_8

##### Member
In the code below:

#Taking care of the missing Values
dataset$Age = ifelse(is.na(dataset$Age),
ave(dataset$Age,FUN = function(x) mean(x,na.rm = TRUE)),
dataset$Age)

Can you please tell me, in the part FUN = function(x), how the entire Age column goes into the variable x by default? We have not assigned a value to x.

#### Maria Garcia

##### Member
Hello Pratul,

Who can help me? I do not have the labs available. When I click Launch Lab I get a blank R page. I had been following your exercises there, but today the exercises I had saved are showing an error, and the top bar shows that I have completed 0/7 projects.

Today I got lost because I kept getting an R error and couldn't do anything.

Let me know if you need screenshots. I need to do my labs and I can't, and those labs will be disabled as soon as the course finishes.

#### Vaishnavi Chauhan

##### Active Member
Simplilearn Support
Customer

#### pratul.goyal111

##### Well-Known Member
In the code below:

#Taking care of the missing Values
dataset$Age = ifelse(is.na(dataset$Age),
ave(dataset$Age,FUN = function(x) mean(x,na.rm = TRUE)),
dataset$Age)

Can you please tell me, in the part FUN = function(x), how the entire Age column goes into the variable x by default? We have not assigned a value to x.
The Age column is the first argument to ave(), and ave() passes that argument to FUN as x, so x is the entire Age column. Since no grouping variable is given, the mean is taken over the whole column, and every missing (NA) Age is filled with the average of the non-missing Age values.
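
A minimal sketch of that behaviour on a toy data frame. The three Age values are made up for illustration; only `ave()`, `ifelse()` and `is.na()` from base R are used.

```r
# Hypothetical data: one missing Age
dataset <- data.frame(Age = c(20, NA, 40))

# ave() calls FUN with its first argument (the whole Age column) as x.
# With no grouping variable, the result is the overall mean repeated
# for every row.
column_mean <- ave(dataset$Age, FUN = function(x) mean(x, na.rm = TRUE))
print(column_mean)

# Keep the original value where it exists, use the mean where it is NA
dataset$Age <- ifelse(is.na(dataset$Age), column_mean, dataset$Age)
print(dataset$Age)
```

Here the mean of the non-missing values is (20 + 40) / 2 = 30, so the missing entry becomes 30 and the other two values are left unchanged.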

#### pratul.goyal111

##### Well-Known Member
Hello Pratul,

Who can help me? I do not have the labs available. When I click Launch Lab I get a blank R page. I had been following your exercises there, but today the exercises I had saved are showing an error, and the top bar shows that I have completed 0/7 projects.

Today I got lost because I kept getting an R error and couldn't do anything.

Let me know if you need screenshots. I need to do my labs and I can't, and those labs will be disabled as soon as the course finishes.
There was an issue with the lab, which I believe has been rectified. Please reach out to me if it still does not work.

#### pratul.goyal111

##### Well-Known Member
The code has been updated; please have a look on the drive.

#### Ejaz Mehmood

##### New Member
Hi everyone, Ejaz here.

#### Amal M_2

##### New Member
# 1 Sample T Test
# H0:mu>=30000

# If p-value > alpha (1 - confidence level), then fail to reject H0
View(cars_data)
sedan_data=cars_data[cars_data$Type=='Sedan','MSRP']
t.test(sedan_data,mu=30000,alternative = 'less')
# One Sample t-test
#
# data: sedan_data
# t = -0.23512, df = 261, p-value = 0.4071
# alternative hypothesis: true mean is less than 30000
# 95 percent confidence interval:
# -Inf 31362.96
# sample estimates:
# mean of x
# 29773.62

# T Test
# H0:mu<=30000

t.test(sedan_data,mu=30000,alternative = 'greater')
# One Sample t-test
#
# data: sedan_data
# t = -0.23512, df = 261, p-value = 0.5929
# alternative hypothesis: true mean is greater than 30000
# 95 percent confidence interval:
# 28184.28 Inf
# sample estimates:
# mean of x
# 29773.62

In both cases we fail to reject H0 because the p-value > alpha. How is this logically possible?
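
One way to see that the two results are consistent: both one-sided tests use the same t statistic, so their p-values add up to 1 (0.4071 + 0.5929 = 1 in the output above). A sample mean very close to 30000 gives no strong evidence in either direction, and failing to reject H0 is not the same as showing H0 is true. The sketch below uses simulated prices as a stand-in for `sedan_data`; the seed, mean and standard deviation are assumptions chosen only for illustration.

```r
set.seed(42)
# Hypothetical stand-in for sedan_data: 262 simulated prices
prices <- rnorm(262, mean = 29800, sd = 15000)

less    <- t.test(prices, mu = 30000, alternative = "less")
greater <- t.test(prices, mu = 30000, alternative = "greater")

# Same t statistic in both tests, only the tail differs
print(c(t_less = unname(less$statistic), t_greater = unname(greater$statistic)))

# The two one-sided p-values are complementary
print(c(p_less = less$p.value, p_greater = greater$p.value))
print(less$p.value + greater$p.value)
```

Because the p-values sum to 1, at most one of them can fall below a small alpha such as 0.05, and both stay above it whenever the sample mean is close to the hypothesized value.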

#### Katherine M Linton

##### Member
In regards to the Project, are we expected to clean the data in Excel before loading the data into R Studio OR should we include data cleaning in our code OR it doesn't matter which data-cleaning method we choose?

#### ShravanKumar Rama

##### Member
Hi Pratul,

I am going through the 1st project and I am not able to understand the 2 datasets in it. I see several price columns. I am assuming that price is the dependent variable and the rest of the attributes are independent variables. Is my assumption correct?

Can you explain these datasets? Do I need to merge them into a single dataset for analysis?

Thanks & Regards,
Shravan Kumar Rama

#### Katherine M Linton

##### Member
Hi Pratul,

I am going through the 1st project and I am not able to understand the 2 datasets in it. I see several price columns. I am assuming that price is the dependent variable and the rest of the attributes are independent variables. Is my assumption correct?

Can you explain these datasets? Do I need to merge them into a single dataset for analysis?

Thanks & Regards,
Shravan Kumar Rama
I ran into that problem. The files are *.xlsx, which base R cannot read directly. You can open them in Excel and export them as *.csv on your machine, or use an online converter. Then read.csv should work well.
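
A minimal sketch of the CSV route using only base R. The column names and values are hypothetical, and a temporary file stands in for the *.csv exported from Excel. If installing a package is an option, the readxl package can also read *.xlsx files directly.

```r
# Stand-in for the file exported from Excel (hypothetical columns)
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(Dress_ID = c(101, 102), Price = c("Low", "High")),
  csv_path,
  row.names = FALSE
)

# Read the exported CSV; keep text columns as character, not factor
dresses <- read.csv(csv_path, stringsAsFactors = FALSE)
str(dresses)
```

With a real export, `csv_path` would simply be the path to the saved *.csv file.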

#### ShravanKumar Rama

##### Member
I ran into that problem. The files are *.xlsx, which base R cannot read directly. You can open them in Excel and export them as *.csv on your machine, or use an online converter. Then read.csv should work well.
Thank you Katherine, have you started this project? I am trying to understand the requirement from 1st project.

#### Katherine M Linton

##### Member
Thank you Katherine, have you started this project? I am trying to understand the requirement from 1st project.
Hi! I looked at it, but chose not to do it. It needs a lot of data cleaning and it will take some regression to suggest dresses. It seemed too complicated for me right now.

I chose to work on Project 2 (comcast complaints) and am about halfway through it. It is mostly data visualization and sorting (as far as I can tell), which feels more manageable to me. Most of my time is spent on troubleshooting syntax right now.

#### Katherine M Linton

##### Member
Just to make sure I'm understanding the instructions...In Project 2, One of the directives says to...
"Provide state wise status of complaints in a stacked bar chart. Use the categorized variable from Q3."

The term "Q3" is ambiguous to me. Does it mean "Quarter 3," as in the months of the year in the third quarter, July, August and September?

OR Does it mean "Question 3," which I find more confusing because the directives are not numbered and the bulleted list is formatted strangely.

#### Shashank Pandey_3

##### Member
Hi Pratul, I am working on the Comcast Telecom Consumer Complaints project. I have done a few parts, but I am facing problems in a few cases, so I am listing my questions here and sharing my code.

Q1)
• Which complaint types are maximum, i.e., around internet, network issues, or across any other domains?
- Create a new categorical variable with value as Open and Closed. Open & Pending is to be categorized as Open and Closed & Solved is to be categorized as Closed.

- Provide state wise status of complaints in a stacked bar chart. Use the categorized variable from Q3.

What does Q3 mean here?

• Which state has the highest percentage of unresolved complaints? (How do I get the percentage?)
I have filtered out the pending complaints, but I am not able to get the percentage.
See code:
comcast %>% filter(Status=='Pending') %>% select(State,Status) %>% count(State,Status)

• Provide the percentage of complaints resolved till date, which were received through the Internet and customer care calls.

Now I am sharing my code:
#Which state has the maximum complaints (is this code right for this question?)

statewise_complaint <- summarise(group_by(comcast,state=tolower(State)),Count=n())
View(statewise_complaint)

statewise <- arrange(statewise_complaint,desc(Count))
View(statewise)

ggplot(statewise,mapping = aes(x=state,y=Count,fill=state))+geom_bar(stat = 'identity')+
scale_x_discrete(breaks=statewise$state)+geom_label(aes(label=Count))
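
One possible way to get the percentage of unresolved complaints per state with dplyr, sketched on a small made-up data frame. The `State` and `Status` column names follow the code above; the sample rows and the new column names `Complaint_Status`, `Total`, `Open` and `Pct_Open` are assumptions for illustration.

```r
library(dplyr)

# Hypothetical sample with the same column names as the post above
comcast <- data.frame(
  State  = c("Georgia", "Georgia", "Georgia", "Texas", "Texas", "Ohio"),
  Status = c("Open", "Closed", "Pending", "Solved", "Closed", "Pending"),
  stringsAsFactors = FALSE
)

unresolved_by_state <- comcast %>%
  # Open and Pending count as "Open"; Closed and Solved count as "Closed"
  mutate(Complaint_Status = ifelse(Status %in% c("Open", "Pending"),
                                   "Open", "Closed")) %>%
  group_by(State) %>%
  summarise(
    Total    = n(),                            # all complaints in the state
    Open     = sum(Complaint_Status == "Open"),
    Pct_Open = 100 * Open / Total
  ) %>%
  arrange(desc(Pct_Open))

print(unresolved_by_state)
```

Counting inside `summarise()` instead of filtering first keeps states with zero unresolved complaints in the result, so the denominator is always the state's total.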

#### Katherine M Linton

##### Member
Just to make sure I'm understanding the instructions...In Project 2, One of the directives says to...
"Provide state wise status of complaints in a stacked bar chart. Use the categorized variable from Q3."

The term "Q3" is ambiguous to me. Does it mean "Quarter 3," as in the months of the year in the third quarter, July, August and September?

OR Does it mean "Question 3," which I find more confusing because the directives are not numbered and the bulleted list is formatted strangely.

This is all I got back from Support (see attached). I am still confused. What does "Q3" mean???

#### Attachments

• response.png
52.1 KB · Views: 22

#### Katherine M Linton

##### Member
I am working on Project 2 (Comcast Complaints).
I am currently working on the part that asks which state has the highest percentage of unresolved complaints.
To find this, I need to divide the number of Open complaints by the Total Number of Complaints per State (*100%).

I have a data frame that is Complaints_By_State that includes 42 obs. of 2 variables (State and its corresponding Total Number of Complaints).

I also have a data frame that is Open_By_State. <-- I am having trouble with this one. Right now, my Open_By_State collects all the rows with an "Open" complaint status, but doesn't aggregate them by State. I want to get 42 obs. of 2 variables (State and its corresponding number of Open Complaints); however, my code returns 517 obs. of 11 variables (not what I want).

Here is my code for Open_By_State:
Open_By_State <- complaints[which(complaints$Complaint_Status == "Open"),]

How can I get it to return just State and Open status?
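
A base R sketch of one way to do this: subsetting with `which()` keeps every matching row and every column, whereas a cross-tabulation counts rows per State. The `complaints` rows below are made up; only the `State` and `Complaint_Status` column names come from the post above, and `Pct_Open` is a hypothetical column name.

```r
# Hypothetical sample using the column names from the post above
complaints <- data.frame(
  State = c("Georgia", "Georgia", "Texas", "Texas", "Texas", "Ohio"),
  Complaint_Status = c("Open", "Closed", "Open", "Open", "Closed", "Closed"),
  stringsAsFactors = FALSE
)

# Rows are states, columns are statuses, cells are complaint counts
status_table <- table(complaints$State, complaints$Complaint_Status)

# One row per state: number of Open complaints and total complaints
Open_By_State <- data.frame(
  State = rownames(status_table),
  Open  = as.vector(status_table[, "Open"]),
  Total = as.vector(rowSums(status_table))
)
Open_By_State$Pct_Open <- 100 * Open_By_State$Open / Open_By_State$Total

# State with the highest percentage of unresolved complaints
print(Open_By_State[which.max(Open_By_State$Pct_Open), ])
```

States with no Open complaints still appear, with a count of 0, so the result has one row per state just like Complaints_By_State.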

#### Attachments

• Open_By_State df.png
28.6 KB · Views: 17
• Complaints_By_State df.png
16.8 KB · Views: 17