Discussion in 'Big Data and Analytics' started by K Manoj, Jul 28, 2018.
*Thread locked for this batch of learners.
Please post your questions below
I'm not able to execute the code below, which creates an empty DataFrame and appends to it. It gives me the error "ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series":
result_train = pd.DataFrame()
result_train['Actual Profit'] = result_train.append(pd.DataFrame(dv_train[['Profit']]), ignore_index=True)
result_train['Linear Predictions'] = lin_regressor.predict(iv_train)
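A possible cause, assuming dv_train is a DataFrame with a Profit column: the right-hand side of the first assignment is a whole DataFrame, and pandas cannot place that into a single column of a frame that has no index yet. Below is a minimal sketch of one alternative that builds the result frame from the column values directly. The data is made up and only stands in for iv_train and dv_train; the "R&D Spend" column name is hypothetical. It also avoids DataFrame.append, which was removed in pandas 2.0.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up stand-ins for the poster's iv_train / dv_train (assumption).
iv_train = pd.DataFrame({"R&D Spend": [10.0, 20.0, 30.0, 40.0]})
dv_train = pd.DataFrame({"Profit": [15.0, 25.0, 35.0, 45.0]})

lin_regressor = LinearRegression().fit(iv_train, dv_train["Profit"])

# Build the frame in one step from 1-D values instead of assigning
# a DataFrame into a column of an empty frame.
result_train = pd.DataFrame({
    "Actual Profit": dv_train["Profit"].to_numpy(),
    "Linear Predictions": lin_regressor.predict(iv_train),
})
print(result_train)
```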
I tried working on the California Housing project and ended up with a few questions.
I am not able to proceed further with the statsmodels formula (smf.ols) part.
I am getting a very long error and have no idea how to resolve it, though I tried my best to address it.
Please find the attached notebook for your reference.
Requesting your help in this regard.
This part of the code is giving the error. I also tried uploading the .ipynb file, but I don't see it either.
Hence I am pasting the code. Please help.
lm = smf.ols(formula = 'median_house_value ~ longitude + latitude + housing_median_age + total_rooms + total_bedrooms + population + households + median_income + ocn_prox_INLAND + ocn_prox_ISLAND + ocn_prox_NEARBAY + ocn_prox_NEAROCEAN', data = x_train_std).fit()
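The error text was not posted, so the cause cannot be pinned down. One thing worth checking, going only by the name x_train_std, is whether that frame contains median_house_value at all: the formula interface looks up every name on both sides of the ~ in the frame passed as data. The sketch below uses made-up data and a shortened formula, with a hypothetical frame called train, to show a quick check for missing columns before fitting.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Made-up data; column names follow the thread, values are illustrative.
rng = np.random.default_rng(0)
n = 100
train = pd.DataFrame({
    "median_income": rng.uniform(0.5, 15.0, n),
    "housing_median_age": rng.uniform(1.0, 52.0, n),
    "ocn_prox_INLAND": rng.integers(0, 2, n),
})
# The target must live in the same frame as the predictors.
train["median_house_value"] = (
    40000 * train["median_income"]
    + 1000 * train["housing_median_age"]
    - 30000 * train["ocn_prox_INLAND"]
    + rng.normal(0, 10000, n)
)

target = "median_house_value"
predictors = ["median_income", "housing_median_age", "ocn_prox_INLAND"]

# List any formula name that is not a column of the frame.
missing = [c for c in [target] + predictors if c not in train.columns]
print("missing columns:", missing)

formula = target + " ~ " + " + ".join(predictors)
lm = smf.ols(formula=formula, data=train).fit()
print(lm.params)
```

If the check reports median_house_value as missing, joining the target back onto the standardized features before calling smf.ols would be one way forward.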
I have a couple of questions.
1) In the Bonus Exercise there is this statement:
Extract just the median_income column from the independent variables (from X_train and X_test).
Can you please explain this? Does it mean that we have to fetch the column values from the X_train and X_test that we have already used in our prediction, or that we have to perform linear regression on the one column (median_income) from our dataset?
2) If we have to create a DataFrame on median_income, what could be the second parameter? Can it be between the max and min?
3) Should we do a simple exercise with x = median_income and y = median_house_value, and plot the graph on the basis of our linear regression prediction?
Please clarify. My project is complete, and I will submit it once this bonus exercise is done.
The Bonus Exercise means using only that particular column, median_income, so that the linear regression is between median_income and median_house_value.
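As a rough illustration of that reply, here is a minimal sketch that pulls median_income out of an existing X_train and X_test and fits a one-variable linear regression against median_house_value. The data is randomly generated, not the real California housing data, and the variable names X_train_income and X_test_income are made up for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Made-up data standing in for the housing dataset (assumption).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "median_income": rng.uniform(0.5, 15.0, n),
    "housing_median_age": rng.uniform(1.0, 52.0, n),
})
y = 40000 * X["median_income"] + 50000 + rng.normal(0, 20000, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Double brackets keep the result 2-D, which scikit-learn expects.
X_train_income = X_train[["median_income"]]
X_test_income = X_test[["median_income"]]

model = LinearRegression().fit(X_train_income, y_train)
y_pred = model.predict(X_test_income)
print(model.coef_[0], model.intercept_)
```

Plotting X_test_income against y_test as points and against y_pred as a line would then show the fitted relationship.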
result_train[';Actual Profit';] looks wrong; use the correct column name, result_train['Actual Profit'].
It is better to first look at the columns by issuing this
I'm facing the error shown below for the California pricing prediction. Kindly let me know where I might be going wrong.
TypeError Traceback (most recent call last)
<ipython-input-28-ed546acdc78a> in <module>()
----> 1 X= housing('longitude','latitude','housing_median_age','total_rooms','total_bedrooms','population','households','median_income','ocean_proximity').values
      2 y= housing('median_house_value').values
TypeError: 'DataFrame' object is not callable
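The traceback points at housing(...): round brackets call an object, and a DataFrame cannot be called. Columns are selected with square brackets, using a list of names when several columns are wanted. A minimal sketch with a tiny made-up housing frame (the values and the shortened column list are illustrative only):

```python
import pandas as pd

# Tiny made-up frame standing in for the real housing data (assumption).
housing = pd.DataFrame({
    "longitude": [-122.23, -122.22, -122.24],
    "latitude": [37.88, 37.86, 37.85],
    "median_income": [8.3, 8.3, 7.3],
    "median_house_value": [452600.0, 358500.0, 352100.0],
})

feature_cols = ["longitude", "latitude", "median_income"]

# Square brackets with a list select several columns as a DataFrame.
X = housing[feature_cols].values
# Square brackets with a single name select one column as a Series.
y = housing["median_house_value"].values

print(X.shape, y.shape)
```

If a text column such as ocean_proximity is included in the list, the resulting array has object dtype and the column needs encoding before a model can be fitted on it.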
When I try to find the RMSE using the following code with a decision tree and a random forest, I get this error:
ValueError: Number of features of the model must match the input. Model n_features is 9 and input n_features is 5
predictions = clf.predict(test_x)
mse = mean_squared_error(test_y, predictions)
rmse = np.sqrt(mse)
Could you please help me rectify this?
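The message says the model was fitted on 9 columns while test_x has only 5. One likely cause (an assumption, since the code that builds test_x was not posted) is that the test features were selected with a different column list than the training features. Here is a sketch with made-up data that reuses one column list for both; the f0 to f8 column names and the train and test frames are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

# Made-up data with nine feature columns (assumption).
rng = np.random.default_rng(0)
feature_cols = [f"f{i}" for i in range(9)]
train = pd.DataFrame(rng.normal(size=(50, 9)), columns=feature_cols)
test = pd.DataFrame(rng.normal(size=(20, 9)), columns=feature_cols)
train_y = 2 * train["f0"] + train["f1"]
test_y = 2 * test["f0"] + test["f1"]

clf = DecisionTreeRegressor(random_state=0).fit(train[feature_cols], train_y)

# Select the test features with the same list used for training,
# so the column count and order match what the model saw.
test_x = test[feature_cols]

predictions = clf.predict(test_x)
mse = mean_squared_error(test_y, predictions)
rmse = np.sqrt(mse)
print(rmse)
```

Comparing train_x.shape[1] with test_x.shape[1] just before predict is a quick way to confirm whether this is the problem.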
This is Sakthi here.
Could you please help me out with the link for the Titanic dataset in HTM format, the one you used for the Titanic Decision Tree?
I tried to find a link to use for my learning, but was unable to find one.
datafile = "\\titanicdata.htm" is the one which you used.
Please help me out.