Data Science with R | Pulkit Taneja | Apr 5

Q1 Is it possible to use both if and switch together? Let's say, if some condition is met, then switch to?
Q2 If I want to loop over the numbers 1 to 10 and get the output as
No 1
No 2 etc.
how do I add a string to a number while printing?
 
switch is used when the expression is matched against a finite set of known values, whereas if...else is used when a condition has to cover an open-ended range of values.
 

pulkitaneja

Active Member
Ans1 : switch and if-else can often be used interchangeably, but in a lot of situations using switch is not feasible. The reason becomes clear when you look at the way switch is written.

switch(expression, case1={....}, case2={...}, .....). Here expression can take the values case1, case2, case3, etc. When a conditional statement has to deal with, let's say, 100 different values, you will have to write cases for all 100 values.

If-else, on the other hand, can tackle this situation much more easily by using relational conditions like x >= 50, which handle a large number of values in one go.
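
A minimal sketch of the difference, and of combining the two as asked in Q1. The function names (grade_label, score_label, describe) and the grade/score values are made up for illustration and are not from the class material.

```r
# Hypothetical helper: map a grade letter to a label with switch().
# Every value needs its own case; the unnamed last argument is the default.
grade_label <- function(grade) {
  switch(grade,
         "A" = "excellent",
         "B" = "good",
         "C" = "average",
         "unknown")
}

# Hypothetical helper: classify a score with if-else.
# A single relational condition covers a whole range of values.
score_label <- function(x) {
  if (x >= 50) {
    "pass"
  } else {
    "fail"
  }
}

# Combining both: an if decides which branch applies,
# and one of the branches uses switch().
describe <- function(x) {
  if (is.numeric(x)) {
    score_label(x)
  } else {
    grade_label(x)
  }
}

print(describe("B"))  # "good"
print(describe("Z"))  # "unknown"
print(describe(72))   # "pass"
```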

Ans2 : You can use the paste/paste0 functions to accomplish this. You can check their documentation from RStudio using ?paste.
In case you are not able to accomplish this, write to me again.
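
One possible way to get the output asked for in Q2, assuming the goal is simply the labels "No 1" to "No 10". The variable names are illustrative.

```r
# paste() converts the number to text and joins it to the string,
# putting a single space between the pieces by default.
for (i in 1:10) {
  label <- paste("No", i)
  print(label)
}

# paste() is vectorised, so the same labels can be built without a loop.
labels <- paste("No", 1:10)
```

print() shows the quotes around each string; cat(label, "\n") would print the text without them.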
 
for (x in 5:10) {
  v1 <- x * 2
  v2 <- paste("No", v1)
  print(v2)
}

Got it... Thank you :) If any changes are needed to improve this, please let me know.
 
Q3 How do I create a function that prints multiplication tables using loops? I need to pass the j and i values as input, and the function should print the table.
I have written dynamic code for the table calculation, which is not hard-coded. I want to turn this code into a function.

#### Print any table using a for loop; just change the value of j ####

j <- 2
for (i in 1:10) {
  k <- i * j
  b1 <- paste(i, "x", j, "=", k, sep = " ")
  print(b1)
}


#### Print any table using a while loop; just change the value of j ####
i <- 1
j <- 2
while (i <= 10) {
  k <- i * j
  b1 <- paste(i, "x", j, "=", k, sep = " ")
  print(b1)
  i <- i + 1
}
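
A sketch of how the loop above could be wrapped in a function. The name print_table and the argument n (how many rows to print) are assumptions made for illustration; j is the table to print, as in the code above.

```r
# Hypothetical function: print the multiplication table of j up to n rows.
print_table <- function(j, n = 10) {
  rows <- character(n)
  for (i in seq_len(n)) {
    rows[i] <- paste(i, "x", j, "=", i * j)
    print(rows[i])
  }
  # Return the rows invisibly so they can be reused without printing twice.
  invisible(rows)
}

tab <- print_table(7)        # prints the table of 7 up to 7 x 10
tab5 <- print_table(3, 5)    # prints the table of 3 up to 3 x 5
```

Taking j and n as arguments is what removes the need to edit the code each time a different table is wanted.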
 

pulkitaneja

Active Member
Codes for 9 April Class:

There is an issue with the drive, so I'm unable to upload the code for loops and functions. Posting it here so you have material to go over during the weekend.

Thanks,
Pulkit
 
While running my for loop I noticed an icon that seemed to collapse the entire loop, and on clicking it you could see the entire code under it. How is this done?
I got the answer to this. We select the particular section and press Alt+L.

# Can anyone let me know how we can save the R file on the practice lab to our desktop?
 

pulkitaneja

Active Member
Google Drive is now up and running fine. Please find all the codes in the drive.
 

pulkitaneja

Active Member
Kudos for figuring out the answer on your own :) Great going!!

R File download to local desktop:
In the pane where files are displayed and function documentation pops up (the bottom-right pane of RStudio), select the Files tab. This displays all the files in your cloud workspace, including all your .R scripts. Select the file you want to download and click the 'More' option, then click 'Export' to download it.
 
In the data frame example, how can we find the mean onscreen_time for males and females and print the result as a data frame like the one below?

is_male   mean(onscreen_time)
T         (50+10)/2 = 30
F         (40+30)/2 = 35
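
One way this could be done in base R is with aggregate(). The data frame below is a made-up stand-in: only the four onscreen_time values and the TRUE/FALSE split are taken from the question, and the name kkhh is an assumption.

```r
# Stand-in data frame (assumed layout, values taken from the question above).
kkhh <- data.frame(
  is_male = c(TRUE, FALSE, FALSE, TRUE),
  onscreen_time = c(50, 40, 30, 10)
)

# Mean of onscreen_time for each value of is_male.
means <- aggregate(onscreen_time ~ is_male, data = kkhh, FUN = mean)
print(means)
```

With dplyr, the same grouping is usually written with group_by() followed by summarise().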
 
size <- rep(c("small", "medium", "large"), 10)
size
size_factor1<- factor(size, levels = ("small,medium,large"))
size_factor1

Output: every value comes out as <NA>, and Levels shows small,medium,large. Please tell me what's wrong.
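
The likely cause is that ("small,medium,large") is a single string, so factor() gets only one level and none of the values in size match it, which turns them all into <NA>. A sketch of the corrected call, passing the levels as a character vector with c():

```r
size <- rep(c("small", "medium", "large"), 10)

# Each level is its own string inside c(), in the desired order.
size_factor1 <- factor(size, levels = c("small", "medium", "large"))

print(levels(size_factor1))
print(table(size_factor1))
```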
 
Sir, what do the paste/paste0 functions do?
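
In short, both functions join their arguments into one string: paste() puts a separator (a space by default) between the pieces, while paste0() joins them with nothing in between. A small sketch, with variable names chosen only for illustration:

```r
with_space <- paste("No", 1)               # separator defaults to a space
no_space   <- paste0("No", 1)              # no separator at all
with_dash  <- paste("No", 1, sep = "-")    # custom separator
one_string <- paste(c("a", "b", "c"), collapse = ", ")  # collapse a vector into one string

print(with_space)
print(no_space)
print(with_dash)
print(one_string)
```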
 
After exporting a .R file to the desktop, it appears blank when opened in local RStudio. Any idea why?
There is no issue when we open it in the cloud environment.
 
Q4 Do we have any functions in R to filter text or character fields using "contains", "begins with", "equal to", etc., like the filters in Excel?
Is there any wildcard search operation?

Example: if I want to filter car names in mtcars.


Q5 I have a Comments column in a data set. I need to filter the comments based on specific strings in that column and group them, so as to "categorize" the data set for further analysis.

Please suggest a way to do it.
Example:
Comments <- c("Active but not in use", "Frozen account", "Billing issue", "Not active Frozen because of Billing issues")

Now I want to group the data set into 4 categories:
1. Active,
2. Frozen,
3. Billing,
4. Other (if a single comment contains strings matching "Active", "Billing" and "Frozen" all together).
 

pulkitaneja

Active Member
15 April : DPLYR and Apply Reading Material
Hi all, sharing an awesome tutorial with detailed examples covering usage of the dplyr package.
https://www.listendata.com/2016/08/dplyr-tutorial.html

Please check out the link below to go through the apply functions other than the ones covered in today's class:
https://www.datacamp.com/community/tutorials/r-tutorial-apply-family

For tomorrow's class, please go through the self-learning section of Data Visualization.

Happy learning everyone.

Regards,
Pulkit
 

pulkitaneja

Active Member
Q4: Use rownames(mtcars) to create a vector of row names and add it as a column (here called car_names) to your mtcars dataset. After that, simply use the filter function to get the data for the required car. Note that == is an exact match, so the full name is needed:
mtcars %>% filter(car_names == "Mazda RX4")
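
For the "contains" and "begins with" style filters asked about in Q4, a base R sketch could look like the following. The data frame name cars and the search strings are chosen for illustration; car_names is the column described above.

```r
# Copy mtcars and add the row names as a regular column.
cars <- mtcars
cars$car_names <- rownames(mtcars)

# "contains": names that have "Merc" anywhere in them.
contains_merc <- cars[grepl("Merc", cars$car_names, fixed = TRUE), ]

# "begins with": names that start with "Mazda".
begins_mazda <- cars[startsWith(cars$car_names, "Mazda"), ]

# "equal to": exact match on the full name.
exact_rx4 <- cars[cars$car_names == "Mazda RX4", ]

print(contains_merc$car_names)
print(begins_mazda$car_names)
```

The same logical conditions can be placed inside dplyr's filter(), for example filter(grepl("Merc", car_names)).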

Q5: What you are trying to accomplish is called 'pattern matching' or 'string matching'.
This can be achieved using the grep/grepl functions. Please see the usage in the link shared below:
https://statisticsglobe.com/grep-grepl-r-function-example

You need to use the ifelse function on the comments column and create appropriate categories based on mutually exclusive conditions:
ex: df$comment_category <- ifelse(grepl("active", df$comment, ignore.case = TRUE), "active", "other") # here grepl returns TRUE whenever "active" is found in the comment; ignore.case = TRUE makes it match "Active" as well
Create a similar ifelse for each of the other categories.

Regards,
Pulkit
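
A sketch of how the ifelse/grepl approach could be extended to the four categories from Q5. It assumes that "Other" means a comment matching more than one keyword (or none); the data frame df and its column names are illustrative.

```r
Comments <- c("Active but not in use", "Frozen account", "Billing issue",
              "Not active Frozen because of Billing issues")
df <- data.frame(comment = Comments, stringsAsFactors = FALSE)

# One logical vector per keyword; ignore.case = TRUE matches "Active" and "active".
has_active  <- grepl("active",  df$comment, ignore.case = TRUE)
has_frozen  <- grepl("frozen",  df$comment, ignore.case = TRUE)
has_billing <- grepl("billing", df$comment, ignore.case = TRUE)

# Number of keywords found in each comment.
n_matches <- has_active + has_frozen + has_billing

# Assumption: exactly one keyword gives that category, anything else is "Other".
df$comment_category <- ifelse(n_matches != 1, "Other",
                       ifelse(has_active, "Active",
                       ifelse(has_frozen, "Frozen", "Billing")))

print(df)
```

Counting the matches first keeps the categories mutually exclusive, so the order of the nested ifelse calls does not change the result.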
 
Thank you so much Pulkit. "grepl" Function seems to be useful in my analysis. :)
 
airpassengers <-datasets::AirPassengers
plot(airpassengers)
'but the plot for this is a line chart'
time<-c(1991L,1992L,1993L,1994L)
pop<-c(23,34,38,40)
plot(time, pop, main = 'Population vs year', col = 'darkgreen')
'the plot for this is clustered'
Neither call has a type argument, so I expected both plots to be either line or clustered. Please tell me why one is clustered and the other is a line.
 
Trying to use Bar plots on kkhh data frame. Resulting plot does not show x axis labels.

actor_name <- c("SRK","Kajol","Rani","Salman")
onscreen_time <- c(50,45,30,10)
kkhh2 = data.frame(actor_name, onscreen_time)
barplot(kkhh2$onscreen_time, xlab="Actors", ylab = "On-screen Time" , col = c("Red", "Yellow", "Green", "Blue") )

How can I get SRK , Kajol, Rani and Salman on the x-axis ?
 
seq (1,10,length=5)
my result is
[1] 1.00 3.25 5.50 7.75 10.00
please explain that sequence?

The sequence contains 5 numbers with a common difference of 2.25. seq automatically creates equally spaced numbers. The first number is 1.00, the last is 10.00, and there are 4 gaps between 5 numbers, so the difference is (10-1)/4 = 2.25.
 
Add names.arg = kkhh2$actor_name as an argument in the barplot() call.
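
Put together, the call could look like the sketch below. names.arg is the documented argument of barplot() for the labels under the bars; as.character() is added here only as a precaution in case actor_name is stored as a factor.

```r
actor_name <- c("SRK", "Kajol", "Rani", "Salman")
onscreen_time <- c(50, 45, 30, 10)
kkhh2 <- data.frame(actor_name, onscreen_time)

# names.arg supplies one label per bar on the x-axis.
bar_mids <- barplot(kkhh2$onscreen_time,
                    names.arg = as.character(kkhh2$actor_name),
                    xlab = "Actors", ylab = "On-screen Time",
                    col = c("red", "yellow", "green", "blue"))
```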
 

Prajesh Sortee

Active Member
ggplot(airquality) +
geom_histogram(aes(x=Ozone),fill = "blue",colour = "black")
scale_y_discrete(breaks = seq(0,16,2))
My result is a histogram with ozone on the x-axis and counts on the y-axis, shown in the image file I have included here.
My question:
Which function should I use to set the frequency values on the y-axis? They are neither a continuous nor a discrete variable, so is there a function like scale_...? for frequency values?
 

Attachment: Rplot.jpeg

pulkitaneja

Active Member
I want to know: is there any deadline for project submission?
Under usual circumstances, the deadline for project/assignment submission is 2 days after the last class of the course. However, given that most of you have to manage your office work in parallel, you can raise a ticket to have the deadline extended.

Personal Note: After the Thursday and Friday classes (22 Apr and 23 Apr), you will be able to attempt multiple projects, and you can get the heavy lifting done over the weekend.
 

pulkitaneja

Active Member
Interesting question!
Even though counts or frequencies don't conform to the usual definition of a continuous variable, a continuous scale is the best way we have to handle them. In ggplot2, the discrete scales are meant for categorical values, whereas counts are numeric and can take an unlimited number of possible values, even if those values are never decimal.

Remember: Age is a variable similar to counts, and we store it as numeric/integer, not as categorical/character/factor. Of the available data types, numeric and integer make the most sense.

General Tip on Searching for Functions:
You already know that the scaling functions follow a general pattern, 'scale_x_(continuous, discrete, etc)'. So just type scale_x (or scale_y) and look at the autocomplete suggestions in RStudio. If there is a function that suits your use case, it will pop up and you can experiment with it.
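
A sketch of the histogram with a continuous y scale, assuming the ggplot2 package is installed. Note that in the snippet as posted there is no + before the scale line, so that scale was never added to the plot. The bins = 30 and na.rm = TRUE settings are additions made here to avoid messages; the breaks are the ones from the question.

```r
library(ggplot2)

p <- ggplot(airquality) +
  geom_histogram(aes(x = Ozone), bins = 30,
                 fill = "blue", colour = "black", na.rm = TRUE) +
  # Counts are numeric, so the continuous scale controls the y-axis breaks.
  scale_y_continuous(breaks = seq(0, 16, 2))

print(p)
```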
 

pulkitaneja

Active Member
It's pretty simple. length = 5 means that there will be exactly 5 elements in your vector. They will be equally spaced, starting at 1 and ending at 10.

I'm guessing you wanted to space your numbers by 2, to get 2, 4, 6, 8, 10, and got an unexpected answer. Write seq(2, 10, length = 5) to get the required answer.

Regards,
Pulkit
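
The spacing can be checked directly, as in this small sketch. length.out is the full name of the argument abbreviated as length above, and the by form is an equivalent way to ask for steps of 2; the variable names are illustrative.

```r
# Five equally spaced values from 1 to 10.
a <- seq(1, 10, length.out = 5)
print(a)

# The step is (last - first) / (number of values - 1).
step <- (10 - 1) / (5 - 1)
print(step)

# Two ways to get 2, 4, 6, 8, 10.
b <- seq(2, 10, length.out = 5)
b_by <- seq(2, 10, by = 2)
print(b)
```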
 

pulkitaneja

Active Member
Great question Nikita!

The reason you see different plots is that the airpassengers dataset is not a simple data.frame but an object of class ts (time series). Time series are plotted as line charts by default.
Understanding time-series objects and plots is beyond the scope of our current class.
Run class(airpassengers) and look at the output. It will all start making sense.

If you are interested in exploring time-series data please go through the following link:
https://a-little-book-of-r-for-time-series.readthedocs.io/en/latest/src/timeseries.html

Regards,
Pulkit
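
A short sketch of the check suggested above, plus one way to get a line for the plain vectors by passing type = "l" explicitly. The variable names are the ones from the question.

```r
airpassengers <- datasets::AirPassengers
print(class(airpassengers))  # a time-series object

time <- c(1991L, 1992L, 1993L, 1994L)
pop <- c(23, 34, 38, 40)
print(class(pop))            # a plain numeric vector

# plot() picks its behaviour from the class of its first argument:
# the ts object is drawn as a line, plain vectors as points.
plot(airpassengers)
plot(time, pop, main = "Population vs year", col = "darkgreen")

# Asking for a line explicitly on the plain vectors.
plot(time, pop, type = "l", main = "Population vs year", col = "darkgreen")
```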
 

pulkitaneja

Active Member
HYPOTHESIS TESTING in R:

Please go through the following link to explore hypothesis testing further.
http://www.r-tutor.com/elementary-statistics/hypothesis-testing

Relevance to Projects: Hypothesis testing is an important concept if you conduct data science experiments in the future. However, it is not tested in any of your projects. (It might be in your exam, though, so do get clarity on the theory and try not to skip it.)

For Thursday Class:
Please go through the self-learning section of 'regression' chapter.
 
Hi Pulkit, yesterday was our last class as per the batch timeline, and we were supposed to get a class extension, as discussed with the TA. However, I don't see anything under the available classes. How do we plan to complete the remaining syllabus?
I raised a ticket about this this morning. My ticket no. is 00880644.
 
I don't see the class under the "My class" option in Live Classes. How do I access the live class if it is happening? Please help.
 
To join the training session
-------------------------------------------------------
1. Go to https://simplilearnsolutions.webex..../j.php?MTID=teda7c33e81191a2ad9db4e0b808c66bc
2. Enter your name and email address (or registration ID).
3. Enter the session password: p9ygxh37.
4. Click "Join Now".
5. Follow the instructions that appear on your screen.

Check your mail for the intimation link.
 
Hi, thanks for sharing.

On following these steps, it shows that the session is over. I am still not able to join.
 

pulkitaneja

Active Member
NOTICE: Extra Classes Scheduled for 22, 23 and 26 April

Hi all,
Please note that extra classes have been scheduled for Thursday, Friday and Monday at our usual time of 6 AM IST. There might not be any advance email notification about these classes. About 30 minutes before each class you will receive a Webex link by email, through which you will be able to join the session.

Please comment on this thread regarding any concerns.

Thanks,
Pulkit
 