Python for Data Science | April 04, 2020 - May 09, 2020 | Rajneesh

Hello Rajneesh sir,
I have a doubt about matrix factorisation and I am stuck there. Can you please help me?
Can I convert the DataFrame to a matrix?
I do not understand how to do it.
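
On the DataFrame-to-matrix question, a minimal sketch, assuming the DataFrame holds only numeric columns. The `ratings` table and its labels are made up for illustration; `DataFrame.to_numpy()` returns the values as a NumPy array, which is what most factorisation routines expect.

```python
import pandas as pd

# Hypothetical user-by-item ratings table (names and values are assumptions)
ratings = pd.DataFrame(
    {"item_a": [5.0, 3.0], "item_b": [1.0, 4.0]},
    index=["user_1", "user_2"],
)

# Convert the DataFrame to a 2-D NumPy array (row and column labels are dropped)
matrix = ratings.to_numpy()
print(matrix.shape)
```

The row and column labels are lost in the conversion, so keep `ratings.index` and `ratings.columns` around if the factors need to be mapped back to users and items.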

Aditya Uddandam

New Member
The means of the average response time across complaint types are 82% similar. Can I accept the null hypothesis, or do they have to be 100% similar?
Sir, this is regarding the NYC project:
8. Statistical test for whether the average response time across complaint types is similar or not

# taking any three random Complaint_Type values
agency_issue = data[data.Complaint_Type == 'Agency Issues']
animal_abuse = data[data.Complaint_Type == 'Animal Abuse']
illegal_firework = data[data.Complaint_Type == 'Illegal Fireworks']


I am getting this output:
F_onewayResult(statistic=6.770120839427416, pvalue=0.0011541927966116955)

On the other hand, if I try this:
# using the head() method, I take values manually from the dataset after filtering by Complaint_Type

my output is:
F_onewayResult(statistic=1.4747769182277108, pvalue=0.25267221557695146)

Why am I getting different p-values?
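
One likely reason for the two different p-values, assuming the second run used only the first few rows returned by `head()`: the F-test depends on which observations are passed in and how many. The sketch below uses synthetic data, and the `response_hours` column is a hypothetical name, not one from the project dataset.

```python
import numpy as np
import pandas as pd
from scipy.stats import f_oneway

rng = np.random.default_rng(0)

# Synthetic stand-in for the project data; column names and values are assumptions
data = pd.DataFrame({
    "Complaint_Type": np.repeat(
        ["Agency Issues", "Animal Abuse", "Illegal Fireworks"], 200
    ),
    "response_hours": np.concatenate([
        rng.normal(4.0, 1.0, 200),
        rng.normal(4.3, 1.0, 200),
        rng.normal(4.6, 1.0, 200),
    ]),
})

# One sample of response times per complaint type
groups = [g["response_hours"] for _, g in data.groupby("Complaint_Type")]

# ANOVA on every row of each group
full_result = f_oneway(*groups)

# ANOVA on only the first five rows of each group, as head() would give
head_result = f_oneway(*[g.head(5) for g in groups])

print(full_result.pvalue, head_result.pvalue)
```

The two calls test the same hypothesis on different samples, so their p-values need not agree. The usual decision rule compares the p-value with a chosen significance level such as 0.05, rather than asking for the means to be a certain percentage similar.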
## 9. Declaring Null Hypothesis or Alternate Hypothesis for the type of complaint or service requested and location related


Is this the right way?

Amarshree V

Active Member
Yes, your solution serves the use case; there is no need to go to topic modeling.
Thank you sir,
I have another doubt, which I have attached below.
I calculated the number of closed and open complaints. How do I convert these counts to percentages? Is there a predefined function for this?


  • Doubt Comcast.png
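
On converting open and closed counts to percentages, a minimal sketch, assuming the status lives in a single column. The `Status` column name and the rows are made up for illustration; `value_counts(normalize=True)` returns fractions, which are then scaled to percentages.

```python
import pandas as pd

# Hypothetical complaint status column (name and values are assumptions)
complaints = pd.DataFrame({"Status": ["Closed", "Open", "Closed", "Closed"]})

# Raw counts per status
counts = complaints["Status"].value_counts()

# Fractions per status, scaled to percentages
percentages = complaints["Status"].value_counts(normalize=True) * 100
print(percentages)
```

If the counts have already been computed, dividing them by their total gives the same result: `counts / counts.sum() * 100`.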
## 9. Declaring Null Hypothesis or Alternate Hypothesis for the type of complaint or service requested and location related
**Hint:** This task is similar to the gender and spending correlation example we looked at in the stats class.

Which lecture are you referring to? Please advise.

Gaanashree S Patil_1

Please help me with how to open a .data file in Python for the MovieLens project.
Hi Abhishek!

  1. Python allows you to read, write and delete files.
  2. Use the function open("filename","w+") to create a file. ...
  3. To append data to an existing file use the command open("Filename", "a")
  4. Use the read function to read the ENTIRE contents of a file.
  5. Use the readlines function to read the content of the file as a list of lines.
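
For a delimited text file such as a `.data` file, loading it straight into a DataFrame is often more convenient than `open()`. This is a minimal sketch, assuming the file is tab-separated with no header row; the file name, delimiter, and column names are assumptions, so check them against the dataset's README.

```python
import os
import tempfile

import pandas as pd

# Write a tiny stand-in file so the example runs on its own.
# The real file name, delimiter, and columns are assumptions.
sample_rows = "196\t242\t3\t881250949\n186\t302\t3\t891717742\n"
path = os.path.join(tempfile.mkdtemp(), "sample.data")
with open(path, "w") as handle:
    handle.write(sample_rows)

# Read the delimited file; header=None because the file has no header row
ratings = pd.read_csv(
    path,
    sep="\t",
    header=None,
    names=["user_id", "item_id", "rating", "timestamp"],
)
print(ratings.head())
```

If the file uses a multi-character delimiter such as `::`, pass `sep="::"` together with `engine="python"`.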

genres = master_data[['Title', 'Genres']]  # select the Title and Genres columns
genres['Genres'] = genres['Genres'].apply(lambda x: x.split('|'))  # split the genres
genres_list = genres.Genres.tolist()  # convert to a list of lists
unique_genres1 = set(functools.reduce(operator.concat, genres_list[:50000]))  # flatten and convert to a set

But the last command only runs for 50K entries at a time, and there are about 1 million entries in total, so I would have to run it 20 times. Please help me find the right approach.
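
One possible approach, sketched under the assumption that `Genres` holds `|`-separated strings: `functools.reduce(operator.concat, ...)` builds a new list at every step, which gets very slow on long inputs, whereas `itertools.chain.from_iterable` walks all the sublists in a single pass. The three rows below are a made-up stand-in for `master_data`.

```python
from itertools import chain

import pandas as pd

# Tiny stand-in for master_data (titles and genres are illustrative)
master_data = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C"],
    "Genres": [
        "Animation|Children's|Comedy",
        "Adventure|Children's|Fantasy",
        "Comedy|Romance",
    ],
})

# Split each genre string into a list, one list per row
genres_list = master_data["Genres"].str.split("|").tolist()

# Flatten all the lists in one pass and keep the unique values
unique_genres = set(chain.from_iterable(genres_list))
print(sorted(unique_genres))
```

This runs over the whole column at once, so no slicing into 50K chunks should be needed.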


Well-Known Member
Staff member
Simplilearn Support
I am getting an error for the crosstab, even though I have used the same column name that appears in the imported data:
[attachment 9501]

Hi Shalini,

Whenever a column name contains a space, we cannot use the dot (.) operator to access it.

# Incorrect way :
data.Complaint Type

# Correct way :
data["Complaint Type"]

Please make the change suggested above and your code should work smoothly.

Nishant Singh
Senior Global Teaching Assistant
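
A minimal sketch of the bracket notation inside `pd.crosstab`, assuming a second column to cross against. The `City` column and all the values are made up for illustration.

```python
import pandas as pd

# Hypothetical data with a space in one column name
data = pd.DataFrame({
    "Complaint Type": ["Noise", "Noise", "Blocked Driveway"],
    "City": ["BROOKLYN", "QUEENS", "BROOKLYN"],
})

# Bracket notation works for any column name, including names with spaces
table = pd.crosstab(data["Complaint Type"], data["City"])
print(table)
```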

elon jigar

New Member
return self._wrap_aggregated_output(output, names)

Did you initialize an empty DataFrame first and then fill it? If so, that is probably why the behaviour changed with the new version: before 0.9, empty DataFrames were initialized to float type, but now they are of object type. In that case you can change the initialization to DataFrame(dtype=float).

You can also call frame.astype(float).
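
Both suggestions can be sketched as follows, assuming the frame holds numeric values stored with object dtype. The `value` column is a hypothetical name.

```python
import pandas as pd

# A frame whose numeric values are stored as object dtype
frame = pd.DataFrame({"value": [1, 2, 3]}, dtype=object)
print(frame["value"].dtype)

# Option 1: convert an existing frame to float
numeric = frame.astype(float)
print(numeric["value"].mean())

# Option 2: declare the dtype when creating the empty frame
typed_empty = pd.DataFrame(columns=["value"], dtype=float)
print(typed_empty["value"].dtype)
```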