Discussion in 'Big Data and Analytics' started by K Manoj, Nov 10, 2018.
Please post your questions below.
What is voting in KNN with respect to the weights attribute? "Uniform" means giving an equal vote to all values, but what does that mean in practice?
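A minimal sketch of the idea, assuming the question refers to the weighting option of a KNN classifier (for example the `weights` parameter of scikit-learn's KNeighborsClassifier, which accepts 'uniform' and 'distance'). The helper `knn_vote` and the toy data are hypothetical and written in plain Python only to illustrate the voting step: with uniform weights each of the k nearest neighbours casts one equal vote, while with distance weights a closer neighbour's vote counts more.

```python
import math
from collections import defaultdict

def knn_vote(train, query, k=3, weights="uniform"):
    """Return (predicted_label, votes) for a query point.

    train   : list of (point, label) pairs, where point is a tuple of floats
    weights : "uniform"  -> every neighbour adds 1 vote
              "distance" -> every neighbour adds 1/distance votes
    """
    # Find the k training points closest to the query (Euclidean distance).
    neighbours = sorted((math.dist(p, query), label) for p, label in train)[:k]

    votes = defaultdict(float)
    for distance, label in neighbours:
        if weights == "uniform":
            votes[label] += 1.0
        else:
            # Guard against division by zero when a neighbour coincides
            # with the query (an assumption made for this sketch).
            votes[label] += 1.0 / max(distance, 1e-12)

    return max(votes, key=votes.get), dict(votes)

# Toy data: one very close "A" point and two farther "B" points.
train = [((0.1, 0.0), "A"), ((1.0, 0.0), "B"), ((0.0, 1.0), "B")]
query = (0.0, 0.0)

uniform_label, uniform_votes = knn_vote(train, query, k=3, weights="uniform")
distance_label, distance_votes = knn_vote(train, query, k=3, weights="distance")
print(uniform_label, uniform_votes)
print(distance_label, distance_votes)
```

With uniform voting the two "B" neighbours outvote the single "A" neighbour; with distance weighting the much closer "A" point dominates, so the two settings can give different predictions on the same data.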
Can you upload the slides for SVM, random forest, hyperparameter estimation and regularization under techdocs?
We have passed this request on to the trainer concerned, and it will be taken care of. We really appreciate your cooperation in this regard.
Hello everybody,
Here is my problem: I understand that one has to convert a categorical variable to numeric. But converting, for example, an M/F variable into 2 columns looks redundant to me. Evidently, the 2 resulting columns are 100% correlated with coefficient -1, so they carry the same information. I would expect that one of those 2 variables has to be dropped. Otherwise, if I do, for example, a straightforward fit with something like a sigmoid function, it will take this information with doubled weight compared to any other variable. That will be true for any method unless the tool is somehow specifically informed about that redundancy, or is sophisticated enough to take this type of correlation into account.
More generally, from a group of N variables with M exact, explicit functional relations defined on that group (of course N > M), I would drop M variables to keep my variable set balanced.
Thank you for reaching out to us. We apologize for the delay in responding. As this is a rather technical query, we have asked the trainer to answer it as soon as possible.
Kindly allow us 24 - 48 hours to get this query answered.
Certain ML algorithms require us to convert categorical features to numeric ones. Hence a binary feature needs to be converted to 0/1. The options are:
1. Map values 0 or 1 to the feature
2. Convert the field to dummy variables
With pandas.get_dummies, the drop_first parameter controls whether to produce k-1 dummies out of k categorical levels by removing the first level. Please note that the default is drop_first=False, meaning that k dummies are created out of k categorical levels.
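The two options and the effect of drop_first can be sketched as follows. The DataFrame, its column names and its values are made up for illustration; only pandas.get_dummies and its columns, drop_first and dtype parameters are taken from pandas itself.

```python
import pandas as pd

# Hypothetical toy data with one binary categorical feature.
df = pd.DataFrame({"age": [34, 29, 41, 25], "sex": ["M", "F", "F", "M"]})

# Option 2 with the default drop_first=False: k = 2 dummy columns.
full = pd.get_dummies(df, columns=["sex"], dtype=int)

# Option 2 with drop_first=True: k - 1 = 1 dummy column (the first level, "F", is removed).
reduced = pd.get_dummies(df, columns=["sex"], drop_first=True, dtype=int)

# The two full dummy columns are perfectly anti-correlated,
# which is the redundancy raised in the question above.
corr = full["sex_F"].corr(full["sex_M"])

# Option 1: map the two values to 0/1 directly.
df["sex_binary"] = df["sex"].map({"F": 0, "M": 1})

print(list(full.columns))     # age, sex_F, sex_M
print(list(reduced.columns))  # age, sex_M
print(corr)                   # approximately -1
```

For a binary feature, the single column kept by drop_first=True carries the same values as the direct 0/1 mapping, so either option removes the redundant column.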
Hope this helps. Let us know
Hi Bhupen, could you please address the questions below? They may seem basic, but I would like to know the answers.
1: I have enrolled in the AI master's course, which started with Data Science with Python, followed by Machine Learning (Deep Learning, Apache Spark). Looking at market demand, companies these days split the roles into, say, Machine Learning, Data Science and AI Architect (even though these all come under AI). Which of these roles should we be in a position to apply for at the end of this course?
2: After going through Data Science with Python and Machine Learning, I don't think it is possible for anyone to learn everything. So could you point out precisely which content of this master's course we should focus on, given the market requirements?
3: Ideally, how much experience can we put on our profile after this course? And to make our profile more visible, is there any way to highlight a specific project contribution?
1: I have enrolled in the AI master's course, which started with Data Science with Python, followed by Machine Learning (Deep Learning, Apache Spark). Looking at market demand, companies these days split the roles into, say, Machine Learning, Data Science and AI Architect (even though these all come under AI). Which of these roles should we be in a position to apply for at the end of this course?
AI includes "Classical Machine learning + Deep learning"
Data Science is about preparing the right data/tables for ML/DL. It includes a lot of statistical tools and methods for data inference, plus exploratory data analysis, with a heavy focus on clustering techniques and feature engineering topics.
Big Data platforms are infrastructure enablers, with a focus on managing data across 1000s of computing resources, parallelism and synchronization.
Specialization areas that one can choose (each of the following areas can lead to career success):
Data Analytics using classical machine learning
Data Analytics using DL techniques
Text based Analytics (lot of classical ML is used + DL is used too here)
Advanced plots for traditional analytics
Basic stats + intermediate to advanced stats
Calculus (derivatives, integrals)
Programming specialization (SAS, R or Python)
Cloud platform certifications (AWS, Google, Azure ... they all provide AI/ML enablers)
Big Data platforms
NoSQL databases (Cassandra, MongoDB)
Handling unstructured data (text, audio, GPS, video, pdfs... images, spatial)
Integration of data and unstructured text etc
3: Ideally, how much experience can we put on our profile after this course? And to make our profile more visible, is there any way to highlight a specific project contribution?
A learner who completes the full master's program can state 6 months of training and 'POC/project' experience.
Continue to develop ML/DS projects such as those listed on Kaggle competition sites.
Continue to read blogs and articles on DS/ML/DL/Big Data ... at least 5 blogs a week. That is 250+ blogs in a year!