Data Science with R | Pulkit Taneja | Apr 5

pulkitaneja

Active Member
CONCERNS REGARDING PROJECTS & SIMULATION TESTS:

Please go through the FAQ below:

Q1. When can I submit the assignment/project? How many projects do I need to attempt?
Ans: You need to complete at least 1 assignment to successfully complete the course and generate the certificate. The submission deadline is 2 days after the course is completed. If you cannot complete and submit before the deadline, please submit a ticket requesting extra time, stating the reason.

Q2. When can I attempt the simulation test? How much do I need to score to pass the exam? What is the test syllabus? Is there a mock test for practice?
Ans: The simulation test assesses the knowledge gained from the R course. It is composed of MCQ-style questions, and a score of at least 60% is required to clear the exam and generate the certificate. Going through the self-learning section, the recorded videos and hands-on R practice should suffice to clear the exam. There is no mock test available for practice.
The simulation test needs to be attempted within a week of course completion. Please raise a ticket if you have any other concerns.

Q3. Who will evaluate the projects?
Ans: The projects will be evaluated by a team of course teaching assistants. Please write well-commented code and document your steps.
 

pulkitaneja

Active Member
I raised a support ticket for an extra class recording 6 hours ago, but no one has replied or sent the link. Can anyone who has received it send me the link? Thanks in advance!
Hi Prajesh. You will have to allow 1-2 days for them to provide the recording. As the TA mentioned in today's class, you will also receive the recordings of the extra classes in your Google Drive on the last day.

Regards,
Pulkit
 

Jyotsana Patil

New Member
NOTICE: Extra Classes Scheduled for 22, 23 and 26 April

Hi all,
Please note that extra classes have been scheduled for Thursday, Friday and Monday at our usual time of 6 AM IST. These classes may not be announced by email. About 30 minutes before each class you will receive a Webex link by email, through which you can join the session.

Please comment on this thread regarding any concerns.

Thanks,
Pulkit
I am not able to join the classes. How can I join?
 

Prajesh Sortee

Active Member
Regarding Project 2: I am having trouble with the month and date, i.e. the 2nd question. How can we solve the 2nd question and the ones after it? How do we manipulate the complaints column?
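
A minimal sketch for the month and date question, using base R only. The data frame `complaints`, its `Date` column and the day-month-year text format are assumptions made for illustration; check the format in the actual project file and adjust the `format` string to match.

```r
# Hypothetical sample standing in for the project data (assumed dd-mm-yyyy text)
complaints <- data.frame(
  Date = c("22-04-2015", "04-08-2015", "18-04-2015", "05-07-2015", "26-05-2015"),
  stringsAsFactors = FALSE
)

# Convert the text column to a proper Date so it can be grouped and plotted
complaints$Date <- as.Date(complaints$Date, format = "%d-%m-%Y")

# Month number as a two-digit string ("04" for April); independent of locale
complaints$Month <- format(complaints$Date, "%m")

# Complaint counts per month and per day
monthly <- table(complaints$Month)
daily <- table(complaints$Date)

print(monthly)
print(daily)
```

Once the column is a Date, the same counts can be passed to a plotting function for the day-wise and month-wise charts.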
 
Great question Nikita!

The reason you see different plots is that the AirPassengers dataset is not a simple data.frame but an object of class ts (time series). Time series plots are implemented as line plots.
Understanding time series objects and plots is beyond the scope of our current class.
Run class(AirPassengers) and see the output. It will all start making sense.
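
A short sketch of that check. The built-in dataset is spelled `AirPassengers` (R is case-sensitive). The data frame `ap_df` is a name introduced here only to contrast the two kinds of plot.

```r
# AirPassengers ships with base R (datasets package)
cls <- class(AirPassengers)
print(cls)

# It is a time series, not a data frame
is_df <- is.data.frame(AirPassengers)
print(is_df)

# plot() dispatches on the class, so a ts object is drawn as a line plot
plot(AirPassengers)

# The same numbers stored in a data frame give the default scatter plot
ap_df <- data.frame(
  time = as.numeric(time(AirPassengers)),
  passengers = as.numeric(AirPassengers)
)
plot(ap_df$time, ap_df$passengers)
```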

If you are interested in exploring time-series data please go through the following link:
https://a-little-book-of-r-for-time-series.readthedocs.io/en/latest/src/timeseries.html

Regards,
Pulkit
Okay.
Thank you Sir.
 
This is exactly what I needed to know about [Projects & Simulation Tests]. THANK YOU.
 
Hi,
In Project 2 I am getting this error. Can anybody help?

internet_calls<-comcast_complaints_data %>% filter(ReceivedVia=='Internet',ComplaintStatus=='Closed') %>% group_by(ReceivedVia,ComplaintStatus) %>% summarize(NumOfComplaints=n())
Error: Problem with `filter()` input `..1`.
x object 'ReceivedVia' not found
ℹ Input `..1` is `ReceivedVia == "Internet"`.
ℹ The error occurred in group 1: State = "Alabama", ComplaintStatus = "Closed".
Run `rlang::last_error()` to see where the error occurred.
 

Sachin Sethi_1

Active Member
I guess the column name is Received.Via.
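
A quick way to confirm the column name, sketched in base R. The two-row CSV text and the column names `Received Via` and `Status` are assumptions made for illustration; run `names()` on the real data to see what it actually contains.

```r
# Hypothetical two-row sample standing in for the project CSV
csv_text <- "Received Via,Status\nInternet,Closed\nCustomer Care Call,Open"

# read.csv() turns header text into syntactic R names by default,
# so a header with a space becomes a name with a dot
comcast <- read.csv(text = csv_text)
col_names <- names(comcast)
print(col_names)

# Filter using the names exactly as names() reports them
internet_closed <- subset(comcast, Received.Via == "Internet" & Status == "Closed")
print(internet_closed)
```

An "object not found" error from filter() usually means the name typed in the call is not one of the names printed by `names()`.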
 

Prajesh Sortee

Active Member
Project---Health cost analysis :
1. The agency wants to find the age category of people who frequently visit the hospital and have the maximum expenditure to record the patient statistics.

2. In order of severity of the diagnosis and treatments and to find out the expensive treatments, the agency wants to find the diagnosis-related group that has maximum hospitalization and expenditure.

3. To make sure that there is no malpractice, the agency needs to analyze if the race of the patient is related to the hospitalization costs.
Can you explain what exactly we have to do in the above questions? What functions would be helpful here?
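
One possible reading of these questions, sketched in base R. Only TOTCHG appears elsewhere in this thread; the data frame `hosp`, the other column names (AGE, APRDRG, RACE) and all the values are assumptions made for illustration. The sketch treats "frequently visit" as the number of rows per group and "expenditure" as the sum of TOTCHG, and computes the two separately so they can be compared.

```r
# Hypothetical sample; replace with the real hospital costs data
hosp <- data.frame(
  AGE    = c(0, 0, 0, 17, 17, 5),
  APRDRG = c(640, 640, 53, 640, 53, 53),
  RACE   = c(1, 1, 2, 1, 2, 1),
  TOTCHG = c(1000, 1500, 3000, 2000, 500, 700)
)

# Q1: visits per age, most frequent first
visits_by_age <- sort(table(hosp$AGE), decreasing = TRUE)

# Q1: total expenditure per age, highest first
cost_by_age <- aggregate(TOTCHG ~ AGE, data = hosp, FUN = sum)
cost_by_age <- cost_by_age[order(-cost_by_age$TOTCHG), ]

# Q2: the same two summaries, grouped by diagnosis-related group
visits_by_drg <- sort(table(hosp$APRDRG), decreasing = TRUE)
cost_by_drg <- aggregate(TOTCHG ~ APRDRG, data = hosp, FUN = sum)

# Q3: one-way ANOVA of cost against race, treated as a category
race_model <- aov(TOTCHG ~ factor(RACE), data = hosp)
print(summary(race_model))
```

If the same group tops both the visit count and the expenditure table, that single group answers the question; otherwise report the two results side by side.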
 
Please find below the recordings of the extra sessions conducted on April 22nd and 23rd.

April 23rd: https://simplilearnsolutions.webex....ldr.php?RCID=93b6a5269dc548ed8489fe938b9da0f4

April 22nd: https://simplilearnsolutions.webex....ldr.php?RCID=c6afeae112b8c01ef89448f0be6dca04

These were shared by the Simplilearn team.
 

Sachin Sethi_1

Active Member
Doubts in projects:

Project 2: Comcast
Task: Provide a table with the frequency of complaint types. For this question, is
freq_comp_types <- table(dataset$Customer.Complaint)
OK, or do I need to do some other processing of the complaint values using GrpString?

Project 5: College Admission
Task: Use variable reduction techniques to identify significant variables.
How do I perform variable reduction? I googled it and found something based on covariance. Should I go with that, or with some other technique?

Project 7: Healthcare cost analysis
In questions like the ones below:

1. To record the patient statistics, the agency wants to find the age category of people who frequently visit the hospital and have the maximum expenditure.
2. In order of severity of the diagnosis and treatments and to find out the expensive treatments, the agency wants to find the diagnosis-related group that has maximum hospitalization and expenditure.

"and" is used between two conditions. Is the question asking us to combine both conditions in a single expression, or to find the results individually for each variable?
In Project 5 there is a question about identifying and treating outliers. Do we need to treat outliers in the datasets for all the projects? The dataset in Project 7 has a lot of outliers, which are removed completely only after running the process below 3 times.

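library(ggplot2)  # ggplot(), geom_boxplot()
library(Hmisc)    # impute()

# Box plot of TOTCHG before treating outliers
ggplot(dataset) +
  geom_boxplot(aes(x = TOTCHG))

# Values beyond the box plot whiskers, and the rows that hold them
out_vals <- boxplot.stats(dataset$TOTCHG)$out
out_indx <- which(dataset$TOTCHG %in% out_vals)

# Mark the outliers as missing, then fill them with the mean of the rest
dataset$TOTCHG[out_indx] <- NA
dataset$TOTCHG <- impute(dataset$TOTCHG, mean)

# Box plot of TOTCHG after imputation
ggplot(dataset) +
  geom_boxplot(aes(x = TOTCHG))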

Please find attached the box plot of the same: BoxPlot_TOTCH_with_outliers.jpeg
Thanks.
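
On the Project 5 variable reduction question, one common option (not necessarily the one the project expects) is backward stepwise selection by AIC with `step()` from base R. The sketch below uses the built-in `mtcars` data and a linear model purely as stand-ins, since the project data is not shown in this thread; `step()` accepts `glm` fits in the same way.

```r
# Full model: every other column of mtcars as a predictor of mpg
full_model <- lm(mpg ~ ., data = mtcars)

# Backward elimination: repeatedly drop the term whose removal lowers AIC most
reduced_model <- step(full_model, direction = "backward", trace = 0)

# Predictors that survive the reduction
kept <- attr(terms(reduced_model), "term.labels")
print(kept)

# Coefficient table for the reduced model
print(summary(reduced_model)$coefficients)
```

Principal component analysis (`prcomp()`) is the covariance-based alternative; it produces combined components rather than a subset of the original variables.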
 
Prajesh Sortee

Active Member
Friends, please share the link to the recording of the last live class of our batch if you have it. I have the recordings for the 22nd and 23rd, but not the 26th.
 

Sachin Sethi_1

Active Member
Has anyone got an answer for the tasks below from Project 2?
Provide a table with the frequency of complaint types.
Which complaint types are maximum i.e., around internet, network issues, or across any other domains.
 
Hi Pulkit,

I attempted Project 1 and my work is below, but I am not sure whether my approach is correct. For one of the questions I don't understand what to do, and I am also struggling to document my analysis (I have built the model, but I am not able to answer based on my outputs).
Please guide me.

1619510158049.png
 
In Project 2 ,

Which complaint types are maximum i.e., around internet, network issues, or across any other domains.

- Create a new categorical variable with value as Open and Closed. Open & Pending is to be categorized as Open and Closed & Solved is to be categorized as Closed.
- Provide state wise status of complaints in a stacked bar chart. Use the categorized variable from Q3. Provide insights on:


Can anyone help with the answers to these?
 

rubeen

New Member
Hi Pulkit,

The assessment datasets are in XLSX format. I have used the code below instead of converting the file to CSV. That won't be a problem, right?
library(readxl)
Health_Care <- read_xlsx("1555054100_hospitalcosts.xlsx")
 
Please send me a sample project. I am unable to understand the guidelines, especially the "Write-Up" one. What do I have to add to it?
 
Nikita, the "Write-Up", I believe, is your insights about the project: which concept, in your view, is being tested in that particular project (e.g. regression, classification), a brief description of the dataset (number of rows and columns and their data types), and how you went about solving it. No code, just a brief explanation. Think of the "Write-Up" section as a short explanation of your project to someone else. Hope this helps.
 
Okay. Thank you Ratish.
 
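For the complaint-type question in Project 2, try something like this:

# Row positions of complaints that mention "network" (case-insensitive)
network_tickets <- grep("network", Cons_data$Customer.Complaint, ignore.case = TRUE)

# Start every complaint as "Others" (an assumed default label), then mark the network ones
Cons_data$ComplaintType <- "Others"
Cons_data$ComplaintType[network_tickets] <- "Network"

# Frequency of each complaint type
table(Cons_data$ComplaintType)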
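
For the other two parts of that Project 2 question (the Open/Closed variable and the state-wise stacked bar chart), a minimal base R sketch follows. The data frame `complaints`, its `State` and `Status` columns and the sample values are assumptions made for illustration.

```r
# Hypothetical sample standing in for the project data
complaints <- data.frame(
  State  = c("Georgia", "Georgia", "Florida", "Florida", "Florida"),
  Status = c("Open", "Solved", "Pending", "Closed", "Solved"),
  stringsAsFactors = FALSE
)

# Open and Pending become "Open"; Closed and Solved become "Closed"
complaints$StatusGroup <- ifelse(
  complaints$Status %in% c("Open", "Pending"), "Open", "Closed"
)

# Rows are the status groups, columns are the states
status_by_state <- table(complaints$StatusGroup, complaints$State)
print(status_by_state)

# barplot() stacks the rows of a matrix within each column by default
barplot(
  status_by_state,
  legend.text = TRUE,
  xlab = "State",
  ylab = "Number of complaints"
)
```

The same table can be used to read off which state has the most complaints or the highest share of open ones.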
 
Hello,
I have registered for the Tableau class for now because the Python class is not available in the morning batch this month.
Can anyone tell me whether there is any issue with doing the Tableau class first and attending the Python class after that?
 

Prajesh Sortee

Active Member
Project7:- Healthcare cost analysis
----In the questions like below:-

1. To record the patient statistics, the agency wants to find the age category of people who frequently visit the hospital and have the maximum expenditure.
2. In order of severity of the diagnosis and treatments and to find out the expensive treatments, the agency wants to find the diagnosis-related group that has maximum hospitalization and expenditure.

"and" is used between two conditions. Is the question asking us to combine both conditions in a single expression, or to find the results individually for each variable?
 
I have raised a ticket for an extension of the Practice Lab. I called the help desk on 28th April at 1:35 and raised a request, but I didn't receive any ticket number by e-mail. That evening, at 7:25 pm, I called the help desk again and raised the request; the ticket no. is 00887481. I was told that the request would be resolved within 24 hours, but the Practice Lab extension has still not been done.
Have others in the community been able to get a Practice Lab extension? Please help. Thanks.
 

Attachments

  • Screenshot 2021-04-29 at 7.15.26 PM.png (201.5 KB)
 
A Practice Lab extension usually takes 24-48 hrs.
 
No Priyanka, there is absolutely no problem in doing Tableau first and Python later, as the two are independent of each other in terms of usage. In fact, you are making good use of your time by finishing the Tableau course.
 
Hello Team, does anybody have the dataset for the 6th project (E-Commerce Company)? The link given at the bottom of the project description ( https://github.com/Simplilearn-Edu/Data-Science-with-R/blob/master/Ecommerce.rar ) points to a corrupt file. I tried to unrar it on both Windows and Linux, but to no avail. I have raised a ticket with support, but wanted to know whether anyone has already obtained the dataset and is willing to share it. Thanks.
 
Shashi, use select (contains =.........) for internet, network, email and others.
 
Thank you, Ratish.
 

Sahith_9

Member
Good evening sir
(Project 2 Related)
I have one doubt: after plotting the day-wise and month-wise counts of complaints,
how do we sort the different complaint types? What function should we use?
Please help.
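
One way to do this in base R is to count the types with `table()` and order the counts with `sort()`. The vector `complaint_types` below is a hypothetical stand-in for a complaint-type column in the project data.

```r
# Hypothetical complaint types; in the project this would be a column of the data
complaint_types <- c("Internet", "Billing", "Internet", "Network", "Internet", "Billing")

# Count each type, then sort the counts from most to least frequent
type_counts <- table(complaint_types)
sorted_counts <- sort(type_counts, decreasing = TRUE)
print(sorted_counts)

# The most frequent complaint type
top_type <- names(sorted_counts)[1]
print(top_type)
```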
 