Discussion in 'Big Data and Analytics' started by K Manoj, Jun 9, 2018.
*thread locked for this batch learners.
Please post your queries below..
Anand, will a client allow us to download Python and use it as the language for data science work?
Anand, in today's call you mentioned that a data scientist should do ML. But ML is a capability of AI systems to understand the data, derive patterns from it and predict outcomes. So would a data scientist be involved in designing the AI systems as well? Please advise.
Anand, in the analytics maturity continuum you mentioned descriptive, diagnostic, predictive and prescriptive analytics. Is Cognitive Analytics the next level of analytics maturity?
Thanks for the thread
Manoj, this thread seems to be different from the rest, so where do we post our queries? I want help from Anand on launching Jupyter from the Anaconda prompt. He mentioned some conda command which I couldn't catch. Thx,
Hi Manoj, please ignore my previous post. I have been able to install Python on my system. It's easy if you follow the steps from Google.
Hi Anand, Can you please share some data and steps to do the Hypothesis testing? And how to build an effective model based on the samples collected from the data.
If you can also share some case studies that may also help.
The class 2 session has not been uploaded. I am able to download only the class 1 session.
Purushothaman, Cognitive Analytics, as you may know, tries to mimic the behavior of the human brain. The human brain functions based on neural activity (neurons). The Neural Network aspect of data science builds algorithms based on the functioning of neurons (our brain cells).
The ANN (Artificial Neural Network) and CNN (Convolutional Neural Network) techniques build algorithms that learn in the same way as our brain does. So Cognitive Analytics has its roots in Neural Networks. Both Cognitive Analytics and Neural Networks help with deep learning.
Deep learning techniques help in optimization.
So under optimization you can group deep learning (neural networks, cognitive computing). Optimization can also be thought of as AI. The word optimization is generic and hence can be confusing.
Please check the google link which has instructions to set up anaconda. Do let me know if you have issues.
By "Client" i hope you are referring to your employer or customer.
Python is an open source language that is now being adopted by many organizations. That said, some organizations have "enterprise software compliance teams" that monitor the installation of software. Please check with them; they should be able to suggest a way to install Python at your workplace. Hope this response helps.
I am not seeing the Class 2 downloadable link. Could you please upload it.
Hi Anand, I used the above. was able to download Anaconda , set up Jupyter and run programmes on Py. thx.
In the last class, I felt that we jumped into tools like Anaconda and Jupyter without a complete introduction to package managers. I am not sure if it's only me or if there are others like me who do not have any background in Python. So I did some basic study of these tools and I am summarizing it here.
Anand (or anybody), please correct/amend this information if I have got something wrong.
The way I understand it, there are 3 package\environment managers:
pip - This is Python's standard package manager, maintained by the Python Packaging Authority (PyPA) and released under the MIT license; it runs in a Python environment. "pip" is a recursive acronym that can stand for either "Pip Installs Packages" or "Pip Installs Python". pip installs Python packages into whichever Python environment it is run from.
pip has some known limitations:
Does not perform all the dependency checks. It installs the Python dependencies a package declares, but it does not handle non-Python dependencies, so one may still need to read the package instructions (for example the requirements.txt file) to understand the dependencies and install the prerequisites. Without this a developer could face runtime errors in the program.
This is not an environment manager - This matters most to developers who maintain different environments for data science, web development etc.
It affects the system Python installation - This applies to Linux, which comes with Python installed as part of the system. Packages installed directly go into the system Python, so any programs or packages that depend on specific versions can be affected.
Conda - This was developed by Continuum Analytics (now Anaconda, Inc.) and is a cross-platform package and environment manager. Conda installs packages into a conda environment.
The advantages of Conda over PIP:
Takes care of and installs all the dependent packages, including non-python dependencies.
Allows installation, switching and management of different versions of packages
Anaconda Navigator (GUI tool) facilitates creating and managing different environments without having to worry with the nitty-gritties of package management.
It supports packages written in python, R etc. This is a general package manager.
Does not affect the system python
Very effective for data science projects; it brings in all the packages needed for data science and machine learning.
Anaconda - This is a full distribution of the central software in the PyData ecosystem, and includes Python itself along with binaries for several hundred third-party open-source projects. Alternatively there is something called 'Miniconda', which contains only Python, the package manager conda and the few packages conda itself depends on. Conda then needs to be used to install the other packages from scratch.
VirtualEnv - This is an environment manager that creates virtual environments, inside which pip is used to manage packages. It helps manage the different packages and versions across virtual environments.
For hardcore developers, there is something called pyenv, a Python version manager that can install Anaconda distributions as well as standard Python builds and, through a plugin, work with VirtualEnv environments. This allows developers to manage projects on different versions of Python.
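Conda is normally installed through the Anaconda or Miniconda installers rather than through pip.
pip can be used to install Jupyter.
Conda is not built upon pip - it is an independent tool with its own package format and repositories, although pip can be used inside a conda environment.
Anaconda Navigator uses conda (not virtualenv) under the hood to manage environments.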
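Since the notes above distinguish the system Python, virtual environments and conda environments, here is a minimal sketch for checking which one a script or notebook is actually running in. It uses only the standard library. The function name describe_environment is made up for illustration, and the checks are heuristics: the prefix comparison assumes venv or a recent virtualenv, and the conda check assumes that conda environments contain a "conda-meta" directory.

```python
import os
import sys


def describe_environment():
    """Report which Python installation and environment this code runs in."""
    # Inside a venv / recent virtualenv, sys.prefix points at the environment
    # while sys.base_prefix still points at the base installation.
    in_virtualenv = sys.prefix != sys.base_prefix
    # Heuristic (assumption): conda environments contain a "conda-meta" folder.
    in_conda_env = os.path.isdir(os.path.join(sys.prefix, "conda-meta"))
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "in_virtualenv": in_virtualenv,
        "in_conda_env": in_conda_env,
        # Set by "conda activate"; None when no conda environment is active.
        "conda_env_name": os.environ.get("CONDA_DEFAULT_ENV"),
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

Running this in a Jupyter cell and again in a plain terminal is a quick way to see whether both are using the same Python installation.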
Pip vs Conda : Differences and Comparisons.
Which Python Package Manager Should You Use?
Jupyter Notebook - is a web-based interactive computational environment that lets you run live code and embed visualizations, explanatory text and even videos in one place. The embedded visualizations reflect changes in the data in real time. Combined with the power of word processing, this makes it a good notebook that keeps all the textual information, your code and its immediate output in one place. It supports over 40 programming languages, integrates with big data tools, and can be shared using email, Dropbox etc.
What is Jupyter Notebook?
Hi, please write to the support team. they will provide you the link. i will inform them as well
Thank you Sashi Kiran for taking the time to share your notes and insights on package managers. My apologies that I did not cover these in detail in the class. I will spend some time on this during this week.
The reason I did not cover them in detail was because
1. Jupyter installation via Anaconda is very easy and a no-brainer.
2. For those starting out with Python, it's better to go with one IDE than to explore everything. Hence I narrowed it down to Jupyter, which is an IDE used both for learning and by enterprises.
3. Jupyter is a notebook that automatically helps you learn best practices in Python (indentation, comments, function docstrings etc.)
No problem, let's use this forum as effectively as possible. I understand that you have a plan.
In Jupyter, while typing Python code, it doesn't show any suggestions.
E.g. if we type "pr" it should show some suggestions starting with "pr", like print.
Any insights on this?
Jupyter has extensive keyboard shortcuts that can be customized to help with "code completion."
One of the "code completion" features that is automatic with Jupyter is:
1. When you press the Tab key after a command, keyword, method or function, Jupyter will suggest options or complete the command. Please find attached screenshots for the examples below.
Below is an example: when I type pr and hit the Tab key, it shows me all the commands and keywords that start with "pr".
Here is another example: when I hit the Tab key after a ".", it gives me all the methods that can be used.
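Since the screenshots are not visible here, the following is a rough sketch of the kind of names Tab completion offers. It is not how Jupyter implements completion; it only uses Python's own introspection (dir, the builtins module and the keyword list) to approximate the two examples above. The helper names complete and methods_of are made up for illustration.

```python
import builtins
import keyword


def complete(prefix):
    """Return built-in names and keywords that start with `prefix`."""
    candidates = set(dir(builtins)) | set(keyword.kwlist)
    return sorted(name for name in candidates if name.startswith(prefix))


def methods_of(obj):
    """Return the public attribute names of `obj`, roughly what Tab lists after a '.'."""
    return [name for name in dir(obj) if not name.startswith("_")]


print(complete("pr"))            # names beginning with "pr", including print
print(methods_of("text")[:5])    # the first few string methods
```

In the notebook itself, no code is needed: type the prefix or the "." and press Tab.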
Hi Anand, We have completed Numpy and Pandas is in progress. Can you please share some case studies how these tools are helping Business in analytics and in real time problems?
I am getting the error when I am trying to run any command after importing numpy. Please let me know how to resolve this.
This error occurs when you have NOT run the "import numpy as np" statement but are trying to run the statements that follow it.
Please run the import numpy as np statement and then rerun the line with arr=np.array(my_list).
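The situation can be reproduced in a few lines. The original error message is not shown in this thread, so the sketch below assumes it was a NameError, which is what Python raises when np is used before the import has run. The contents of my_list are made up for illustration.

```python
my_list = [1, 2, 3]  # hypothetical data

try:
    # Using np before "import numpy as np" has been executed
    arr = np.array(my_list)
except NameError as err:
    message = str(err)
    print("Error:", message)

import numpy as np  # run this cell first

arr = np.array(my_list)  # now works
print(arr)
```

In a notebook this typically happens after a kernel restart: the import cell has to be run again before any cell that uses np.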
Numpy and Pandas are very useful in analysing data and thereby the business. In the life cycle of a data analytics project, Numpy and Pandas are essential for data acquisition, data wrangling and data exploration.
The use cases are everywhere. Any business with data can use Numpy and Pandas for analysing that data.
One real-world example that I worked on was a churn analysis for a leading chain of businesses in the beauty industry. They wanted me to look at their data and solve the following problems:
1. identify gaps in their business process and articulate the problem for them
2. customer churn
3. provide recommendations to them on how to retain clients
4. create a digital marketing strategy for them to do targeted marketing and increasing their revenue
As you can clearly see,
bullet 1 - is a descriptive analytics problem, where the client did not clearly know what the problem was and asked me to come out with it.
bullet 2 - is a typical problem that every client has, and they threw that into the fray.
2.1 To answer how clients are leaving them, I needed to do RFM (Recency, Frequency, Monetary) analysis and a customer lifetime value calculation. This helped me come up with various segments of customers and their characteristics.
2.3 Why clients are leaving. This is a diagnostic analytics problem. We explored the data using statistical tools and extracted business insights, for example that 20% of the clients who leave do so because of a lack of contact from the company.
2.2 Prediction of churn - predictive analytics. When clients are leaving was a difficult proposition for us, because in businesses like the beauty industry there is no tracking of customer churn. We used statistical techniques like survival analysis to approximate the churn rate.
bullets 3 & 4 - based off of 1 and 2, I had to predict when the clients are churning, which is a predictive analytics problem. Bullets 3 and 4 themselves are a prescriptive analytics problem, wherein I provided recommendations for 1. customer retention 2. data quality upkeep 3. marketing strategy and 4. optimizing current process efficiencies.
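To make the RFM step in 2.1 concrete, here is a minimal sketch of how recency, frequency and monetary value can be computed per customer. It is not the analysis from the project described above: the transactions, the reference date and the function name rfm_table are all made up for illustration, and it uses only the standard library (in practice a Pandas groupby would do the same job).

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, visit_date, amount_spent)
transactions = [
    ("C1", date(2018, 1, 5), 40.0),
    ("C1", date(2018, 5, 20), 60.0),
    ("C2", date(2017, 11, 2), 25.0),
    ("C3", date(2018, 6, 1), 80.0),
    ("C3", date(2018, 6, 8), 20.0),
    ("C3", date(2018, 3, 15), 50.0),
]
as_of = date(2018, 6, 9)  # assumed reference date for measuring recency


def rfm_table(rows, reference_date):
    """Return {customer: (recency_days, frequency, monetary)}."""
    visits = defaultdict(list)
    for customer, visit_date, amount in rows:
        visits[customer].append((visit_date, amount))

    table = {}
    for customer, items in visits.items():
        last_visit = max(visit_date for visit_date, _ in items)
        recency = (reference_date - last_visit).days   # days since last visit
        frequency = len(items)                         # number of visits
        monetary = sum(amount for _, amount in items)  # total spend
        table[customer] = (recency, frequency, monetary)
    return table


rfm = rfm_table(transactions, as_of)
for customer, (recency, frequency, monetary) in sorted(rfm.items()):
    print(customer, recency, frequency, monetary)
```

Customers with high recency (long since the last visit) and low frequency, such as C2 here, are the ones a churn analysis would look at first.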
There are several other case studies on kaggle.com, where companies post real business use cases and seek data scientists to solve their problems.
One such is https://www.kaggle.com/kaggle/sf-salaries.
Happy to Help
It will be great if we can also cover some solutions and examples for our further classes.
It works now. Thanks for the help.
Sure Girish. As part of this course, all of you are supposed to do a capstone project at the end. I ask of you to focus all your learning in such a way that your project gets done as per the design.
Once you complete this project offered by simplilearn, you will have good confidence on how to approach, plan and execute a data science project.
Then you can slowly hone your skills by taking up projects from kaggle.
The Exception Handling and Python operator notebooks are missing from the Google Drive link that you have shared with us. Can you upload these files?
Also one more doubt. For logical operations in Python, can we use & | in place of and or?
Please check again. It's there under a specific folder called Exception Handling.
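Yes Ekanth. They can be used on boolean values, but note that & and | are bitwise operators: they do not short-circuit, and they bind more tightly than comparisons, so wrap each comparison in parentheses.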
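One caveat worth keeping in mind when swapping and/or for &/|: the two pairs differ in operator precedence and in short-circuiting. The short sketch below shows both differences on plain Python values; the variable names are made up for illustration.

```python
x = 1

# `or` has lower precedence than ==, so this reads as (x == 1) or (x == 2)
with_or = x == 1 or x == 2

# `|` needs explicit parentheses around each comparison
with_pipe_parens = (x == 1) | (x == 2)

# Without parentheses this is parsed as x == (1 | x) == 2, a chained comparison
with_pipe_bare = x == 1 | x == 2

print(with_or, with_pipe_parens, with_pipe_bare)

# `and` short-circuits: the right-hand side is skipped when the left is False
value = None
safe = value is not None and value > 0

# `&` evaluates both sides, so the comparison with None is attempted
try:
    unsafe = (value is not None) & (value > 0)
except TypeError:
    unsafe = "TypeError"

print(safe, unsafe)
```

For NumPy arrays and Pandas columns, & and | with parentheses are the usual way to combine element-wise conditions, since and/or do not work element by element.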
where to post the doubts? we can do that here only?
What is the best way to read large Excel / CSV files? When I try to read them through a Jupyter notebook, the browser hangs.
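One common approach, offered here only as a sketch and not as an answer given in the class, is to avoid loading or displaying the whole file at once and process it in chunks instead. The example below uses only the standard library; the file contents, the column names and the helper read_in_chunks are made up for illustration. Pandas offers the same idea through the chunksize and nrows parameters of read_csv.

```python
import csv
import os
import tempfile
from itertools import islice


def read_in_chunks(path, chunk_size=1000):
    """Yield lists of up to `chunk_size` rows (as dicts) from a CSV file."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        while True:
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                break
            yield chunk


# Build a small hypothetical CSV file so the example is self-contained
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    writer = csv.writer(tmp)
    writer.writerow(["id", "amount"])
    for i in range(2500):
        writer.writerow([i, i * 2])
    sample_path = tmp.name

total_rows = 0
total_amount = 0
chunk_sizes = []
for chunk in read_in_chunks(sample_path, chunk_size=1000):
    chunk_sizes.append(len(chunk))
    total_rows += len(chunk)
    total_amount += sum(int(row["amount"]) for row in chunk)

os.remove(sample_path)  # clean up the temporary file
print(total_rows, total_amount, chunk_sizes)
```

Only one chunk is held in memory at a time, and only the small summary is printed, so the notebook never has to render the full file in the browser.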