big data project error

Discussion in 'Big Data and Analytics' started by Narayana Surya, Dec 9, 2018.

  1. Narayana Surya

    Narayana Surya Active Member
    Alumni

    Hi,

    I am getting the following errors when I try to load data.

    error:
    'DataFrameReader' object has no attribute 'describe'
    'DataFrameReader' object has no attribute 'show'

    regards,
    Surya.
     

    #1
  2. Neha_Pandey

    Neha_Pandey Well-Known Member
    Simplilearn Support Alumni

    Hi Learner,
    Please use the command below, which converts the DataFrame to an RDD with .rdd and then applies map. It requires that data is already a DataFrame:

    >>> data.select(...).rdd.map(...)

    Regards,
    Neha Pandey
     
    #2
  3. Narayana Surya

    Narayana Surya Active Member
    Alumni

    Can I use RDD commands on a DataFrame? If so, what is the difference between the two?
     
    #3
  4. Narayana Surya

    Narayana Surya Active Member
    Alumni

    Hi,

    I am getting the same error again. Can you please let me know how to load data into a data frame?

    I use the following command to load data from a CSV file:

    df1=sl.read.format("/user/zzzzxxx/pbank.csv").options(header='true')

    The command runs without an error.

    When I type type(df1), it shows the following:
    <class 'pyspark.sql.readwriter.DataFrameReader'>

    But when I type
    df1.head(5)
    df1.select('name')

    it shows

    'DataFrameReader' object has no attribute 'select'
    'DataFrameReader' object has no attribute 'head'

    All the commands I used are data frame commands, so I don't know why I get this error.

    Also, please send me a list of all the Spark commands that can be applied to a DataFrame and an RDD.
     

    #4
  5. Neha_Pandey

    Neha_Pandey Well-Known Member
    Simplilearn Support Alumni

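    Hi Narayana,

    In your command the file path is passed to format() and load() is never called, so df1 is still a DataFrameReader and has no DataFrame methods such as head or select. format() takes the data source name and load() takes the file path. Please try the code below.
    val df = sqlContext.read.format("com.databricks.spark.csv").option("header","true").option("inferSchema","true").option("delimiter",",").load("PATH_Location here")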
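Since post #4 uses the PySpark shell, the same reader chain may be easier to follow in Python. The following is only a minimal sketch: the helper name `load_csv` and its parameters are made up for illustration, and the default source name "csv" assumes Spark 2.0 or later (older setups with the spark-csv package would pass "com.databricks.spark.csv" instead).

```python
def load_csv(session, path, source="csv", delimiter=","):
    """Build the reader chain and finish it with load(), which returns the DataFrame.

    session -- a SparkSession or SQLContext (anything exposing .read)
    source  -- data source name; "csv" is an assumption for Spark 2.0+
    """
    reader = (
        session.read
        .format(source)                  # data source name, never the file path
        .option("header", "true")        # first line holds the column names
        .option("inferSchema", "true")   # let Spark guess the column types
        .option("delimiter", delimiter)  # field separator
    )
    # Up to this point `reader` is still a DataFrameReader.
    return reader.load(path)             # load() is the call that produces a DataFrame
```

In the shell from post #4 this would be called as df1 = load_csv(sl, "/user/zzzzxxx/pbank.csv"), assuming sl is the session or SQL context object. After that, type(df1) should report a DataFrame class rather than DataFrameReader.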

    Or,

    Case class method
    Que – 1. Load data and create a Spark data frame.
    Solution – 1:
    scala> val bankdata = sc.textFile("/user/cloudera/HadoopProject/Project1_dataset_bank-full.csv")
    bankdata: org.apache.spark.rdd.RDD[String] = /user/cloudera/HadoopProject/Project1_dataset_bank-full.csv MapPartitionsRDD[9] at textFile at <console>:27
    scala> val cleaned_Bank_Data = bankdata.map(_.replaceAll("\"",""))
    cleaned_Bank_Data: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[10] at map at <console>:29
    scala> val header = cleaned_Bank_Data.first()
    header: String = age;job;marital;education;default;balance;housing;loan;contact;day;month;duration;campaign;pdays;previous;poutcome;y
    scala> val data_rows = cleaned_Bank_Data.filter(x => x != header)
    data_rows: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[11] at filter at <console>:33
    scala> case class mktData(
      age:Int, job:String, marital:String, education:String, default:String,
      balance:Double, housing:String, loan:String, contact:String,
      day:Int, month:String,
      duration:Int, campaign:Int, pdays:Int, previous:Int, poutcome:String, y:String)
    defined class mktData
    scala> val bank_mkt_df = data_rows.map(_.split(";")).map(x => mktData(
      x(0).trim.toInt, x(1), x(2), x(3), x(4),
      x(5).trim.toDouble, x(6), x(7), x(8),
      x(9).trim.toInt, x(10),
      x(11).trim.toInt,
      x(12).trim.toInt,
      x(13).trim.toInt,
      x(14).trim.toInt, x(15), x(16)
      )).toDF()
    bank_mkt_df: org.apache.spark.sql.DataFrame = [age: int, job: string, marital: string, education: string, default: string, balance: double, housing: string, loan: string, contact: string, day: int, month: string, duration: int, campaign: int, pdays: int, previous: int, poutcome: string, y: string]

    scala> bank_mkt_df.show()

    SQL Context creation (if sqlContext is not already defined in the shell, run this before calling toDF()):
    scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    scala> import sqlContext.implicits._   // brings toDF() into scope for a manually created context
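The parsing logic of the case class method can also be checked locally without Spark. The following is a rough plain-Python sketch of the same steps (strip the double quotes, drop the header line, split on ";", convert the numeric columns). The names `MktData` and `parse_bank_lines` are illustrative and not part of any Spark API; the column positions are taken from the Scala code above.

```python
from dataclasses import dataclass


@dataclass
class MktData:
    # Same fields and order as the Scala case class mktData.
    age: int
    job: str
    marital: str
    education: str
    default: str
    balance: float
    housing: str
    loan: str
    contact: str
    day: int
    month: str
    duration: int
    campaign: int
    pdays: int
    previous: int
    poutcome: str
    y: str


INT_COLUMNS = {0, 9, 11, 12, 13, 14}   # positions converted with toInt in the Scala code
FLOAT_COLUMNS = {5}                    # balance, converted with toDouble


def parse_bank_lines(lines):
    """Turn raw semicolon-delimited lines into a list of MktData records."""
    # Remove line endings and double quotes, like replaceAll("\"", "").
    cleaned = [line.rstrip("\n").replace('"', "") for line in lines]
    if not cleaned:
        return []
    header = cleaned[0]                # same role as cleaned_Bank_Data.first()
    records = []
    for line in cleaned:
        if line == header:             # same role as filter(x => x != header)
            continue
        values = []
        for i, field in enumerate(line.split(";")):
            if i in INT_COLUMNS:
                values.append(int(field.strip()))
            elif i in FLOAT_COLUMNS:
                values.append(float(field.strip()))
            else:
                values.append(field)
        records.append(MktData(*values))
    return records
```

Running this on the first few lines of the CSV file is a quick way to find a row that fails type conversion before the same logic is run on the cluster.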

    Please visit the link below for all the updated commands required to work on the new Cloud Lab platform:
    http://community.simplilearn.com/threads/big-data-cloud-lab-is-live-now.38277/

    Regards,
    Neha Pandey
     
    #5
  6. Narayana Surya

    Narayana Surya Active Member
    Alumni

    Thanks.
    When I load data from the project CSV file, the data frame is created, but when I try to run a SQL query it throws an error. The problem goes away when I use the data set given by the mentor for the project. Can you please let me know why I get the error with the project CSV file?

    Screenshots are attached; please refer to error1.png and error2.png.

    Please also provide a list of all the transformation and action commands.
     
    #6
    Last edited: Dec 16, 2018
  7. Narayana Surya

    Narayana Surya Active Member
    Alumni

    Can anyone reply regarding the error above?
     
    #7
  8. Support Simplilearn(4685)

    Support Simplilearn(4685) Well-Known Member
    Alumni

    #8
  9. Narayana Surya

    Narayana Surya Active Member
    Alumni

    I have already completed my project; I just want to know why I get this error.
     
    #9
  10. Neha_Pandey

    Neha_Pandey Well-Known Member
    Simplilearn Support Alumni

    Hi Learner,

    There is no error as such. You have created the data frame and now you can execute other use cases.
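For the SQL query use case mentioned in post #6, a DataFrame has to be registered under a name before it can be queried. The following is a small sketch of that step in PySpark; the helper name `query_dataframe` is hypothetical, and createOrReplaceTempView assumes Spark 2.0 or later (on Spark 1.x the equivalent call is registerTempTable). It is not a diagnosis of the error in the screenshots.

```python
def query_dataframe(session, df, view_name, sql_text):
    """Expose `df` to SQL under `view_name`, then run `sql_text` against it.

    session -- a SparkSession or SQLContext (anything exposing .sql)
    """
    # Assumption: Spark 2.0+; on Spark 1.x use df.registerTempTable(view_name).
    df.createOrReplaceTempView(view_name)
    # sql() returns a new DataFrame; call show() or collect() on it to see rows.
    return session.sql(sql_text)
```

A call might look like query_dataframe(sl, df1, "bank", "SELECT marital, COUNT(*) FROM bank GROUP BY marital").show(), where the view name "bank" is made up and the marital column is assumed to exist in the loaded data.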

    Regards,
    Neha Pandey
     
    #10
