Machine Learning | Jul 28 - Aug 25 | Sayan

Discussion in 'Big Data and Analytics' started by K Manoj, Jul 28, 2018.

  1. K Manoj

    K Manoj Moderator
    Staff Member Simplilearn Support

    Joined:
    Aug 4, 2017
    Messages:
    228
    Likes Received:
    19
    *Thread locked for this batch's learners.
    Please post your questions below
     
    #1
  2. _31793

    _31793 Member

    Joined:
    Jun 8, 2018
    Messages:
    5
    Likes Received:
    0
    I'm not able to execute the code below, which creates an empty DataFrame and appends to it. It gives the error "ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series":
    result_train = pd.DataFrame()
    result_train['Actual Profit'] = result_train.append(pd.DataFrame(dv_train[['Profit']]), ignore_index=True)
    result_train['Linear Predictions'] = lin_regressor.predict(iv_train)
     
    #2
  3. _24461

    _24461 Member
    Alumni

    Joined:
    Feb 27, 2018
    Messages:
    6
    Likes Received:
    0
    Hi Sayan,
    Good morning.

    I tried working on the California Housing project and ended up with a few questions.
    I am not able to proceed past the statsmodels formula part (smf.ols).
    I am getting a very long error and have no idea how to resolve it, though I tried my best to address it.
    Please find the attached notebook for your reference.
    I request your help in this regard.

    This part of the code is giving the error. I also tried uploading the .ipynb file, but I don't see it attached either,
    so I am pasting the code here. Please help.

    lm = smf.ols(formula = 'median_house_value ~ longitude + latitude + housing_median_age + total_rooms + total_bedrooms + population + households + median_income + ocn_prox_INLAND + ocn_prox_ISLAND + ocn_prox_NEARBAY + ocn_prox_NEAROCEAN', data = x_train_std).fit()

    Thanks,
    Sakthivel.
     
    #3
  4. _34254

    _34254 New Member

    Joined:
    Jul 9, 2018
    Messages:
    1
    Likes Received:
    0
    Hi Sayan,
    I have a couple of questions.
    1) The Bonus Exercise contains this statement:
    "Extract just the median_income column from the independent variables (from X_train and X_test)."
    Can you please explain this? Does it mean that we have to fetch the column values from the X_train and X_test that we already used in our prediction, or that we have to perform linear regression on the one column (median_income) taken from our dataset?

    2) If we have to create a DataFrame on median_income, what could the second parameter be? Can it be between the max and min?

    3) Should we do a simple exercise with x = median_income and y = median_house_value, and plot the graph on the basis of our linear regression predictions?

    Please clarify. My project is complete, and I will submit it once this bonus exercise is done.


    Thanks
    Arun wadhwa
     
    #4
  5. Dinesh Jolania

    Joined:
    Apr 14, 2016
    Messages:
    4
    Likes Received:
    0
    The Bonus Exercise means using only that one column, median_income.
    The linear regression will then be between median_income and median_house_value.
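A minimal sketch of that single-column regression, assuming scikit-learn and a pandas DataFrame. The data here is synthetic and the names `housing`, `X_train_income`, `X_test_income` and `income_model` are illustrative, not taken from the course notebook.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the California housing data (all values are made up).
rng = np.random.default_rng(0)
housing = pd.DataFrame({
    "median_income": rng.uniform(0.5, 15.0, size=200),
    "housing_median_age": rng.integers(1, 52, size=200),
})
housing["median_house_value"] = (
    40000 * housing["median_income"] + rng.normal(0, 20000, size=200)
)

X = housing.drop(columns="median_house_value")
y = housing["median_house_value"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Keep only median_income. Double brackets return a 2-D DataFrame,
# which is the feature shape scikit-learn expects.
X_train_income = X_train[["median_income"]]
X_test_income = X_test[["median_income"]]

# Fit on the single column, then predict median_house_value for the test rows.
income_model = LinearRegression().fit(X_train_income, y_train)
y_pred = income_model.predict(X_test_income)
```

The predictions in `y_pred` can then be plotted against `X_test_income["median_income"]` alongside the actual `y_test` values.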
     
    #5
  6. Dinesh Jolania

    Joined:
    Apr 14, 2016
    Messages:
    4
    Likes Received:
    0
     
    #6
  7. Dinesh Jolania

    Joined:
    Apr 14, 2016
    Messages:
    4
    Likes Received:
    0
    result_train[';Actual Profit';] seems wrong; give the correct name, result_train['Actual Profit'].
    It is better to first look at the columns by running this:
    result_train.columns
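Separately from the column name, the ValueError most likely comes from assigning a whole DataFrame (the result of `append`) to a single column of a frame that has no index yet. Note also that `DataFrame.append` was removed in pandas 2.0. A hedged sketch of one way around both, building the result frame in one step. The training data and the "Marketing Spend" column are made up for illustration; only `dv_train`, `iv_train`, `lin_regressor` and `result_train` come from the question.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up training data standing in for iv_train / dv_train.
rng = np.random.default_rng(0)
iv_train = pd.DataFrame({"Marketing Spend": rng.uniform(0, 100, size=20)})
dv_train = pd.DataFrame(
    {"Profit": 3.0 * iv_train["Marketing Spend"] + rng.normal(0, 1, size=20)}
)

# Fitting on a 1-D target makes predict() return a 1-D array.
lin_regressor = LinearRegression().fit(iv_train, dv_train["Profit"])

# Build the frame directly from equal-length 1-D arrays
# instead of appending to an empty DataFrame.
result_train = pd.DataFrame({
    "Actual Profit": dv_train["Profit"].to_numpy(),
    "Linear Predictions": lin_regressor.predict(iv_train),
})
```

If the model was fitted on a 2-D target such as `dv_train[['Profit']]`, `predict` returns a 2-D array, and calling `.ravel()` on it before building the frame keeps each column one-dimensional.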
     
    #7
  8. Prajwala

    Prajwala Member

    Joined:
    May 16, 2018
    Messages:
    4
    Likes Received:
    0
    Hey Sayan,
    I'm facing the error shown below for the California pricing prediction. Kindly let me know where I might be going wrong.

    code:
    X= housing('longitude','latitude','housing_median_age','total_rooms','total_bedrooms','population','households','median_income','ocean_proximity').values
    y= housing('median_house_value').values

    error message:
    TypeError                                 Traceback (most recent call last)
    <ipython-input-28-ed546acdc78a> in <module>()
    ----> 1 X= housing('longitude','latitude','housing_median_age','total_rooms','total_bedrooms','population','households','median_income','ocean_proximity').values
          2 y= housing('median_house_value').values

    TypeError: 'DataFrame' object is not callable


    regards,
    Prajwala
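The TypeError arises because parentheses call the DataFrame as if it were a function. Columns are selected with square brackets instead: a list of names for several columns, a single name for one. A small sketch on a toy frame (the values and the shortened column list are illustrative only):

```python
import pandas as pd

# Toy frame with a few of the housing columns; values are for illustration.
housing = pd.DataFrame({
    "longitude": [-122.23, -122.22],
    "latitude": [37.88, 37.86],
    "median_income": [8.3252, 8.3014],
    "median_house_value": [452600.0, 358500.0],
})

# housing(...) would raise "TypeError: 'DataFrame' object is not callable".
# Square brackets with a list of names select several columns at once.
feature_columns = ["longitude", "latitude", "median_income"]
X = housing[feature_columns].values

# Square brackets with a single name select one column.
y = housing["median_house_value"].values
```

The same pattern applies to the full column list in the question. Note that `ocean_proximity` holds text categories, so it would still need encoding before being passed to a regression model.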
     
    #8
  9. _26956

    _26956 Member

    Joined:
    Mar 23, 2018
    Messages:
    2
    Likes Received:
    0
    Hi Sayan,

    Please help.

    When I try to compute the RMSE using the following code for the decision tree and random forest models, I get this error:
    ValueError: Number of features of the model must match the input. Model n_features is 9 and input n_features is 5

    predictions = clf.predict(test_x)
    mse = mean_squared_error(test_y, predictions)
    rmse = np.sqrt(mse)
    rmse

    Could you please help me rectify this?
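The message says the model was fitted on 9 columns but `test_x` has only 5. The post does not show how `test_x` was built, so the following is only a sketch under the assumption that the test frame was created with a different column selection than the training frame. The data and the column names `f1`, `f2`, `f3` are made up.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

# Made-up data with three features.
rng = np.random.default_rng(0)
columns = ["f1", "f2", "f3"]
train_x = pd.DataFrame(rng.normal(size=(50, 3)), columns=columns)
train_y = train_x["f1"] * 2.0 + rng.normal(0, 0.1, size=50)
test_full = pd.DataFrame(rng.normal(size=(10, 3)), columns=columns)
test_y = test_full["f1"] * 2.0

clf = DecisionTreeRegressor(random_state=0).fit(train_x, train_y)

# Select exactly the columns the model was trained on, in the same order,
# so the feature counts match at predict time.
test_x = test_full[train_x.columns]

predictions = clf.predict(test_x)
mse = mean_squared_error(test_y, predictions)
rmse = np.sqrt(mse)
```

Comparing `train_x.shape[1]` with `test_x.shape[1]` before calling `predict` is a quick way to spot this kind of mismatch.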
     
    #9
  10. _24461

    _24461 Member
    Alumni

    Joined:
    Feb 27, 2018
    Messages:
    6
    Likes Received:
    0
    Hi Sayan,
    This is Sakthi here.
    Could you please share the link for the Titanic dataset in HTM format, the one you used for the Titanic decision tree?
    I tried to find a link to use for my learning, but was unable to find one.
    datafile = "\\titanicdata.htm" is the one you used.

    Please help me out.
     
    #10
