doubts related to machine learning

Discussion in 'Big Data and Analytics' started by Narayana Surya, Oct 1, 2018.

1. Narayana Surya Well-Known Member Alumni

Joined:
Feb 27, 2018
Messages:
59
0
Can you please explain, with an example, how we can predict a categorical variable (good, bad, worst) with a classification model (logistic regression, etc.)?
Explain:

You said that we can also predict a categorical variable using classification models.
In my previous R course, the tutor told us that we use classification only when there are exactly 2 outcomes (TRUE or FALSE).

When does a discrete numeric outcome come under classification?
Explain:

As far as I know, if an outcome is numeric (continuous or discrete) we use linear regression. Please explain when we have to treat it as classification instead.

Unsupervised learning: please explain with a detailed example, and when to use it.
Explain:

In my previous R class we were told that we use unsupervised learning when we want to categorize customers depending on their purchasing power (here, labels are present in the data that we are using to categorize customers).
E.g.: suppose a bank is going to introduce 3 new credit cards, so it categorizes customers by their purchasing power and targets them accordingly.
Is my understanding correct?

How do ML models decide whether a variable is significant or not?

Note: when you run an ML model, it shows whether each variable is significant or not. How does the model determine that?
E.g.: because we possess intelligence, we know the impact of one variable on another (house price vs. crime), but how does the machine differentiate between variables?

How does the linear regression equation change when I have factors in it?

If I have 2 independent variables with a high negative correlation, do I still have to keep only one of them in the model?

Can anyone please clarify my doubts?

#1
2. K Manoj Moderator Staff MemberSimplilearn Support

Joined:
Aug 4, 2017
Messages:
239
23
Can you please explain, with an example, how we can predict a categorical variable (good, bad, worst) with a classification model (logistic regression, etc.)?

Let's say you have a dataset of cars with features like horsepower, price, trunk size, etc.
You can use logistic regression (in its multinomial or one-vs-rest form) to classify the cars, based on the above features, into categories like Sedan, SUV and Hatchback.
This is an example of multi-class classification, so classification is not limited to 2 outcomes.
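To make the car example concrete, here is a minimal pure-Python sketch of multi-class (softmax) logistic regression. Everything in it is an assumption for illustration: the nine cars, their already-standardized feature values, and the learning rate are made up, and only two of the features are used. In practice you would normally reach for a library (scikit-learn's LogisticRegression, for instance) rather than writing the training loop by hand.

```python
import math

# Hypothetical, already-standardized features: (horsepower, trunk size)
cars = [
    ((-1.0, -1.0), "Hatchback"), ((-1.2, -0.8), "Hatchback"), ((-0.8, -1.1), "Hatchback"),
    ((0.0, 1.2), "Sedan"), ((0.1, 1.0), "Sedan"), ((-0.2, 1.3), "Sedan"),
    ((1.2, -0.1), "SUV"), ((1.0, 0.1), "SUV"), ((1.3, -0.2), "SUV"),
]
classes = ["Hatchback", "Sedan", "SUV"]

def softmax(scores):
    """Turn one score per class into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(weights, x):
    """weights[k] = [bias, w_horsepower, w_trunk] for class k."""
    return [w[0] + w[1] * x[0] + w[2] * x[1] for w in weights]

def train(data, n_iter=500, lr=0.1):
    """Batch gradient descent on the softmax cross-entropy loss."""
    weights = [[0.0, 0.0, 0.0] for _ in classes]
    n = len(data)
    for _ in range(n_iter):
        grads = [[0.0, 0.0, 0.0] for _ in classes]
        for x, label in data:
            probs = softmax(class_scores(weights, x))
            for k, name in enumerate(classes):
                err = probs[k] - (1.0 if name == label else 0.0)
                grads[k][0] += err
                grads[k][1] += err * x[0]
                grads[k][2] += err * x[1]
        for k in range(len(classes)):
            for j in range(3):
                weights[k][j] -= lr * grads[k][j] / n
    return weights

def predict(weights, x):
    """Return the class with the highest predicted probability."""
    probs = softmax(class_scores(weights, x))
    return classes[probs.index(max(probs))]

weights = train(cars)
print(predict(weights, (1.1, 0.0)))  # a powerful car with a mid-sized trunk
```

The model outputs one probability per class and picks the largest, which is how the same idea extends from TRUE/FALSE to good/bad/worst.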

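When does a discrete numeric outcome come under classification?
Explain:
When the numbers stand for categories rather than quantities. In the above example, you can bin trunk size: less than 1 as Hatchback, 1-2 as SUV and 4-7 as Sedan. Once the values are binned like this, you are predicting a class, not a number.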
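The binning idea can be sketched in a few lines. The cut points 1 and 2 come from the reply above; the reply leaves the range between 2 and 4 unassigned, so treating everything from 2 upward as Sedan is purely an assumption of this sketch, as is the function name.

```python
def trunk_size_to_body_type(trunk_size):
    """Map a numeric trunk size onto a class label using fixed cut points."""
    if trunk_size < 1:
        return "Hatchback"
    if trunk_size < 2:
        return "SUV"
    # Assumption: anything from 2 upward is a Sedan (the original bins skip 2-4)
    return "Sedan"

for size in (0.5, 1.5, 5):
    print(size, "->", trunk_size_to_body_type(size))
```

After this step the target column holds labels, so a classifier is the natural fit even though the raw measurement was numeric.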

Unsupervised learning: please explain with a detailed example, and when to use it.

Unsupervised learning is where your data has no labels.
Say you have images of cars, 10 million of them, with no labels.

Rather than labeling everything by hand, you use unsupervised learning to learn a representation of your images, for example by grouping similar ones together. That representation becomes much more useful once you supply 10-100 labeled images, which can then be used to label your whole dataset.

It's the machine's job to group the various cars, and during its analysis it may even find features which were not previously given to it, i.e. it creates its own set of features based on the data.
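As a small illustration of grouping without labels, here is a sketch of k-means clustering on a single feature, borrowing the bank example from the question. The spending figures are invented, and the function name and the simple "spread across the sorted data" initialization are choices made for this sketch, not a standard recipe.

```python
def kmeans_1d(values, k=3, n_iter=20):
    """Cluster one-dimensional values into k groups (k must be at least 2)."""
    ordered = sorted(values)
    # Deterministic start: pick k initial centroids spread across the sorted data
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each value joins its nearest centroid
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    return centroids, labels

# Hypothetical monthly card spend per customer; note there is no label column
monthly_spend = [100, 120, 110, 500, 520, 480, 1500, 1600, 1550]
centroids, labels = kmeans_1d(monthly_spend, k=3)
print(centroids)  # one centre per discovered customer segment
print(labels)     # segment index assigned to each customer
```

The algorithm never sees a "low/medium/high spender" column. The three segments come out of the data, and the bank can then decide which card to offer each one.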

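How do ML models decide whether a variable is significant or not?

Correlation is the term. Google it. In regression output, the significance flag itself comes from a hypothesis test on each coefficient (its p-value), which is related to, but not the same as, that variable's correlation with the outcome.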
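For regression models, the "significant or not" flag usually comes from a t-test on each coefficient. Here is a rough sketch for simple linear regression with one predictor at a time, using the house price vs. crime example from the question. All numbers are made up, the function name is hypothetical, and the cut-off 2.447 is the two-sided 5% critical value of Student's t for the 6 degrees of freedom this tiny dataset has.

```python
import math

def slope_t_statistic(xs, ys):
    """Fit y = intercept + slope * x and return (slope, t statistic of the slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, then the standard error of the slope
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope, slope / std_err

# Hypothetical data for 8 houses
house_price = [300, 285, 270, 262, 240, 228, 215, 199]
crime_rate = [1, 2, 3, 4, 5, 6, 7, 8]
door_colour_code = [1, 2, 1, 2, 2, 1, 2, 1]  # deliberately unrelated to price

T_CRITICAL = 2.447  # 5% two-sided, n - 2 = 6 degrees of freedom

for name, xs in [("crime_rate", crime_rate), ("door_colour_code", door_colour_code)]:
    slope, t = slope_t_statistic(xs, house_price)
    verdict = "significant" if abs(t) > T_CRITICAL else "not significant"
    print(f"{name}: slope={slope:.2f}, t={t:.2f} -> {verdict}")
```

So the machine does not "know" that crime affects price. It only measures how large the estimated effect is relative to the noise in the data.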

How does the linear regression equation change when I have factors in it?
The complexity of the model increases with the number of features, and the equation gains a term for each independent variable. A factor with k levels is typically entered as k-1 dummy (0/1) variables, each with its own coefficient.
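A sketch of how a factor is usually turned into regression terms: one level becomes the baseline and every other level gets its own 0/1 column. The function name and the sample values are made up for illustration. R's lm does an equivalent encoding automatically for factor columns.

```python
def dummy_encode(values):
    """Encode a factor as 0/1 columns, dropping the first level as the baseline."""
    levels = sorted(set(values))
    baseline, others = levels[0], levels[1:]
    rows = [[1 if v == level else 0 for level in others] for v in values]
    return baseline, others, rows

body_type = ["Sedan", "SUV", "Hatchback", "SUV"]
baseline, columns, rows = dummy_encode(body_type)
print("baseline level:", baseline)
print("dummy columns:", columns)
for value, row in zip(body_type, rows):
    print(value, row)

# With these columns the equation becomes, for example:
#   price = b0 + b1 * horsepower + b2 * is_SUV + b3 * is_Sedan
# b0 then describes the baseline level, and b2 / b3 are shifts relative to it.
```

One level is dropped because its column would be fully determined by the others plus the intercept.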

If I have 2 independent variables with a high negative correlation, do I still have to keep only one of them in the model?

In ML, it depends. Are those variables strongly correlated? A high negative correlation counts just as much as a high positive one. If yes, drop one and keep the other.
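A quick way to check is to compute the Pearson correlation between the two variables and look at its absolute value. This is a minimal sketch: the data are invented and the 0.9 threshold is only an illustrative cut-off, not a fixed rule.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical predictors: x2 falls exactly as x1 rises
x1 = [1, 2, 3, 4, 5]
x2 = [10, 8, 6, 4, 2]

r = pearson_r(x1, x2)
THRESHOLD = 0.9  # illustrative cut-off (an assumption of this sketch)
print(f"r = {r:.2f}")
if abs(r) > THRESHOLD:
    print("The two variables carry nearly the same information; keep one.")
```

The sign only tells you the direction of the relationship. The redundancy depends on the magnitude.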

#2
3. Narayana Surya Well-Known Member Alumni

Joined:
Feb 27, 2018
Messages:
59
0
How do ML models decide whether a variable is significant or not?
Correlation is the term. Google it.

According to my understanding, we consider the correlation factor only for linear and logistic regression; for other models (decision tree, random forest) we do not consider it. So can you please let me know how we find significant variables for other models?

Please let me know if my understanding is wrong.

#3
4. Vikas Kumar_18 Well-Known Member Simplilearn SupportAlumni

Joined:
Dec 17, 2018
Messages:
177
31
Hi learner,

Selecting the significant features (variables) and dropping the insignificant ones is called feature selection, and feature selection is part of data pre-processing.

Some other feature selection techniques for dropping insignificant variables are:
1. VIF (Variance Inflation Factor)
2. Forward selection and backward elimination (stepwise techniques)

Just google these two terms and you'll be able to explore them further.
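To give a feel for the first term: the VIF of a predictor is 1 / (1 - R^2), where R^2 comes from regressing that predictor on all the other predictors. With only two predictors, that R^2 is simply their squared correlation, which keeps the sketch below short. The data and function names are made up, and the common "VIF above 5 or 10 is a warning sign" guideline is only a rule of thumb.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical predictors with a fairly strong positive correlation
horsepower_rank = [1, 2, 3, 4, 5]
price_rank = [2, 1, 4, 3, 5]

vif = vif_two_predictors(horsepower_rank, price_rank)
print(f"VIF = {vif:.2f}")
```

A VIF near 1 means the predictor is almost independent of the others. The larger it gets, the more that predictor duplicates information already in the model.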

I would add that a correlation check can be applied not only before logistic and linear regression but before other algorithms too, so that decision trees and related ML models don't give poor predictions in the presence of insignificant variables.

#4