To deal with Categorical data I am getting the fol...

Discussion in 'General Discussions' started by Munmun Mahato, Jun 30, 2018.

  1. Munmun Mahato

    Munmun Mahato Member

    Joined:
    May 25, 2018
    Messages:
    2
    Likes Received:
    0
    To deal with Categorical data I am getting the following error :

    # Dealing with Categorical Data
    # Encode the data --- LabelEncoder
    # Remove the mathematical weightage -- OneHotEncoder

    from sklearn.preprocessing import LabelEncoder
    X_labelendcoder=LabelEncoder()
    X[:,0]=X_labelendcoder.fit_transform(X[:,0])
    X_labelendcoder.classes_
    y_labelEncoder=LabelEncoder()
    y = y_labelEncoder.fit_transform(y)
    y
    from sklearn.preprocessing import OneHotEncoder
    X_ohe = OneHotEncoder(categorical_features=[9])
    X = X_ohe.fit_transform(X).toarray()

    IndexError Traceback (most recent call last)
    <ipython-input-138-8bd09576f703> in <module>()
    12 from sklearn.preprocessing import OneHotEncoder
    13 X_ohe = OneHotEncoder(categorical_features=[9])
    ---> 14 X = X_ohe.fit_transform(X).toarray()

    C:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\data.py in fit_transform(self, X, y)
    2017 """
    2018 return _transform_selected(X, self._fit_transform,
    -> 2019 self.categorical_features, copy=True)
    2020
    2021 def _transform(self, X):

    C:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\data.py in _transform_selected(X, transform, selected, copy)
    1818 ind = np.arange(n_features)
    1819 sel = np.zeros(n_features, dtype=bool)
    -> 1820 sel[np.asarray(selected)] = True
    1821 not_sel = np.logical_not(sel)
    1822 n_selected = np.sum(sel)

    IndexError: index 9 is out of bounds for axis 0 with size 2
     
    #1
  2. Vikas Kumar_18

    Vikas Kumar_18 Active Member
    Simplilearn Support Alumni

    Joined:
    Dec 17, 2018
    Messages:
    25
    Likes Received:
    0
    Hi Learner,

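    The error you are seeing is: "IndexError: index 9 is out of bounds for axis 0 with size 2".

    This means categorical_features=[9] points to a column that does not exist in your dataset. The "size 2" in the message says that X has only 2 columns, so the only valid column indices are 0 and 1.

    Please remember that Python indexing starts at 0, so the 9th column of an array would be index 8, not 9. In your case, though, X does not have that many columns at all. Since you label-encoded X[:,0], that appears to be your categorical column, so the index to pass is most likely 0: OneHotEncoder(categorical_features=[0]).

    Please make the change and try to execute again.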
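
    As a rough illustration, here is a minimal sketch of the check and one possible fix. The array below is hypothetical stand-in data (two columns, the first categorical), not your actual dataset. Note that categorical_features was deprecated in scikit-learn 0.20 and removed in 0.22, so this sketch assumes a recent release and one-hot encodes the selected column directly instead:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for the poster's X: 2 columns, the first one categorical.
X = np.array([["France", 44.0],
              ["Spain", 27.0],
              ["Germany", 30.0],
              ["Spain", 38.0]], dtype=object)

print(X.shape)  # (4, 2): the only valid column indices are 0 and 1

# Guard against the IndexError: the column index must be smaller than the column count.
categorical_col = 0
assert categorical_col < X.shape[1], "column index is out of bounds for X"

# One-hot encode only the categorical column (kept 2-D by indexing with a list).
encoder = OneHotEncoder()
country_onehot = encoder.fit_transform(X[:, [categorical_col]]).toarray()

# Put the one-hot columns in front of the remaining numeric column(s).
rest = X[:, 1:].astype(float)
X_encoded = np.hstack([country_onehot, rest])

print(encoder.categories_)  # categories found in the encoded column
print(X_encoded.shape)      # one column per category plus the numeric column
```

    Recent versions of OneHotEncoder accept string categories directly, so in this sketch the LabelEncoder step on X is not needed. Checking X.shape before choosing the index is the quickest way to confirm which columns exist.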

    Regards,
    VIKAS KUMAR
    Lead-Global Teaching Assistant

    Simplilearn
     
    #2