Big Data Hadoop and Spark Developer | Rakesh

Discussion in 'Big Data and Analytics' started by Neha_Pandey, Feb 16, 2019.

  1. Neha_Pandey

    Neha_Pandey Well-Known Member
    Simplilearn Support Alumni

    Joined:
    Jun 7, 2018
    Messages:
    95
    Likes Received:
    0
    Hi Learners,

    Kindly post your questions here.

    Regards,
    Neha Pandey
     
    #1
  2. Nirdesh Saxena

    Joined:
    Feb 7, 2019
    Messages:
    3
    Likes Received:
    0
    Thank you, Neha, for the page.
     
    #2
  3. Nirdesh Saxena

    Hi Neha, we would like to know how to connect PuTTY and WinSCP to our lab machine.
     
    #3
  4. Neha_Pandey

    Hi Learner,
    You need to do the following:
    1. Go to http://www.putty.org/ and click the "You can download PuTTY here" link.
    2. Run the PuTTY program. On your computer, go to All Programs > PuTTY > PuTTY.
    3. Select or enter the following information:
    Regards,
    Neha Pandey
     
    #4
  5. amit_478

    amit_478 Member

    Joined:
    Jul 29, 2017
    Messages:
    4
    Likes Received:
    0
    How do we complete the lab and project on Simplilearn?
     
    #5
  6. ruhi.jain

    ruhi.jain Well-Known Member
    Simplilearn Support

    Joined:
    Jun 7, 2018
    Messages:
    226
    Likes Received:
    5
    The trainer will guide you through the process.
     
    #6
  7. Nirdesh Saxena

    Hi Neha,

    I tried the above but am still not able to connect to the lab through PuTTY. Can you please assign someone to guide us through the process?

    Thanks,
    Nirdesh Saxena
     
    #7
  8. pankaj_211

    pankaj_211 Member

    Joined:
    Feb 15, 2019
    Messages:
    9
    Likes Received:
    1
    Hi Nirdesh,
    I tried the procedure below and it worked for me:
    Find the external IP address of the machine (through the shell box), then pass that address to PuTTY with port 22. It will connect to the machine.

    Let me know if you are not able to connect.
    Thanks and regards,
    Pankaj
     
    #8
  9. Rakesh_236

    Rakesh_236 Active Member

    Joined:
    Dec 27, 2018
    Messages:
    22
    Likes Received:
    2
    Hello All,

    I have uploaded a folder called mapreduce practice to OneDrive. The data file for practice is also available at my Unix file path: /home/rakeshsrivastva75_gmail/practice_dataset/retail_input

    If you have any questions about the practice, please post your query.

    Regards
    Rakesh
     
    #9
  10. ruhi.jain

    Hi Nirdesh,

    Thank you for reaching out to us.

    As requested, to connect to the Web Console via PuTTY, enter the details below:

    Host Name: sl.cloudloka.com
    Port: 22
    Connection Type: SSH

    Then use your login credentials, which are available on the Practice Lab tab, to access the console. I have also attached a screenshot to assist you.
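    A quick way to check that the host and port above are reachable before opening PuTTY is a plain TCP connection test. The following is a minimal sketch, assuming Python 3 is available on your machine; the helper name can_reach is hypothetical, and the check only confirms that something accepts connections on the port, not that your credentials are valid.

```python
import socket


def can_reach(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures.
        return False
```

    Calling can_reach("sl.cloudloka.com", 22) should return True if the SSH port is open to you. A False result points to a firewall, network or DNS problem rather than a PuTTY setting.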

    I hope this helps :)
     
    #10
  11. Rakesh_236

    Hi All,

    I have added a file, using eclipse.rtf, to the map reduce folder on OneDrive. Please refer to this file to see which JARs to use for running a MapReduce app.

    Regards
    Rakesh Srivastva
     
    #11
  12. Anand Kumar Tayi

    Joined:
    Feb 10, 2019
    Messages:
    2
    Likes Received:
    0
    Hi Rakesh,
    We are still facing FTP issues. Would you mind granting access permissions to the datasets in your Linux home directory?

    Regards,
    Anand
     
    #12
  13. Rakesh_236

    Hi,

    Please note that I have uploaded the demo commands from the weekend, a sentiment analysis use case for MR, and a use case for data analysis using Hive to the OneDrive folder. Folders to refer to: 1) File Operations and data Analysis 2) Map Reduce practice.

    Regards
    Rakesh Srivastva
     
    #13
  14. Rakesh_236

    Hi Anand/All,

    Titanic data set available at: /home/rakeshsrivastva75_gmail/practice_dataset/titanic/
    Sentiment data set available at: /home/rakeshsrivastva75_gmail/practice_dataset/sentiment/
    All datasets available at: /home/rakeshsrivastva75_gmail/practice_dataset/

    Everyone should be able to access the data from this location in case FTP is still not working.

    Regards
    Rakesh
     
    #14
  15. Rakesh_236

    Hi All,

    I have added project 4 from the LMS, to be done using Hive. You will find the problem statement and dataset on OneDrive under File operations and analysis --> movie analysis with Hive.

    You will also find the movie dataset at this location: /home/rakeshsrivastva75_gmail/practice_dataset/Movie/

    Happy practicing! I will share learnings in our next session.

    Regards
    Rakesh
     
    #15
  16. Anand Kumar Tayi

    Hi Rakesh,
    Could you also upload to OneDrive the import/export commands for practice with the Sqoop/Hive/Pig tools that were part of our session?

    Regards,
    Anand
     
    #16
  17. Rakesh_236

    Hi Anand,

    I have already uploaded the commands for Sqoop and Hive (what we have covered so far). Please look for them in the respective folders I have created.
    We have not started with Pig yet.

    Regards
    Rakesh
     
    #17
  18. Rakesh_236

    Hello All,

    Please note I have rearranged our OneDrive folder:
    1) All demo commands and practice for lessons 1 to 3 are now available in the folder lesson_1_to_3
    2) All demo commands and practice for lessons 4 to 9 are now available in the folder lesson_4_to_9
    -- You will find 3 new projects that you are required to do in Hive/Pig
    -- I have uploaded all the demoed Flume configuration files and the agent run command in a subfolder called flume
    -- All other supporting files that we used for the Hive demo (Transform, UDF, JSON) are copied too
    3) Please complete your 3 projects and start submitting as soon as you are done. You can either upload here or create a subfolder (your name) under one of the 2 available parent folders (lesson_1_to_3 or lesson_4_to_9)

    Happy Learning!

    Regards
    Rakesh
     
    #18
  19. pankaj_211

    Hi Rakesh,
    I am not able to upload the file to Google Drive; I am getting a permission error. Please go through the attached file containing the solution for the Titanic data. I found the mistake because of which I was not able to solve this last week.

    Thanks,
    Pankaj
     

    Attached Files:

    #19
  20. pankaj_211

    Hi Rakesh,
    For project 4 (IMDb Movie), what is the last column of the dataset?
    1,The Nightmare Before Christmas,1993,3.9,4568
    I want to know what the column containing 4568 is. Is it the movie duration in seconds?
     
    #20
  21. pankaj_211

    Hi Rakesh,
    Please go through the attached file containing the solution for Project 4 (IMDb Movie).
    Thanks,
    Pankaj
     

    Attached Files:

    #21
  22. pankaj_211

    Hi Rakesh,
    For Project 3, I am not able to map one column. Please help me. As per my understanding, _ is the delimiter.

    1_563355_62701_0_1235000081_php,error,gd,image-processing_220_2_563372_67183_2_1235000501

    1 -- qid
    563355 -- User id of questioner
    62701 -- Score of the question
    0 -- Time of the question (in epoch time)
    1235000081 -- ?? Not able to map
    php,error,gd,image-processing -- tags
    220 -- qvc
    2 -- qac
    563372 -- aid
    67183 -- j
    2 -- as
    1235000501 -- at
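
    Splitting the sample record above on _ yields 12 values, while the list names only 11 fields. A minimal Python sketch of that check follows. It assumes, as a later reply in this thread suggests, that the first value is an extra leading column (possibly a row number) that can be dropped. The short field names come from the post, with i, qs and qt used as hypothetical labels for the questioner's user id, the question score and the question time.

```python
# Sample record quoted in the post.
RECORD = ("1_563355_62701_0_1235000081_php,error,gd,image-processing"
          "_220_2_563372_67183_2_1235000501")

# Eleven field names; i, qs and qt are hypothetical short labels.
NAMES = ["qid", "i", "qs", "qt", "tags", "qvc", "qac", "aid", "j", "as", "at"]


def parse(record, names, skip_leading=1):
    """Split on '_' and map values to names, dropping leading extra columns."""
    fields = record.split("_")
    values = fields[skip_leading:]
    if len(values) != len(names):
        raise ValueError(f"expected {len(names)} values, got {len(values)}")
    row = dict(zip(names, values))
    # Tags are comma separated inside their own field.
    row["tags"] = row["tags"].split(",")
    return row


print(len(RECORD.split("_")), "values for", len(NAMES), "names")
row = parse(RECORD, NAMES)
for name in NAMES:
    print(name, "->", row[name])
```

    Under that assumption 1235000081 lines up with the question time, and the answer time 1235000501 falls shortly after it, which is at least consistent with both being epoch timestamps.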

    Thanks,
    Pankaj
     
    #22
  23. _41000

    _41000 Member

    Joined:
    Sep 20, 2018
    Messages:
    6
    Likes Received:
    0
    Hi Rakesh,

    I keep having trouble with the MapReduce homework.

    Even when I work on the player statistics file and use the code you provided for the team statistics data, I am still having no success when I run it in Eclipse. I have set the input and output paths, but still no success.

    Maybe my problem is not knowing how to properly provide a .csv file as input. I have spent hours on YouTube and on research, still to no avail.

    Please help.
     
    #23
  24. _41000


    I am having the same problem...
    Nothing is working for me today.
     
    #24
  25. Ashish_281

    Ashish_281 Member

    Joined:
    Dec 26, 2016
    Messages:
    3
    Likes Received:
    0
    Hi Rakesh,
    Please find attached a file with the commands for project 4. Thanks!
     

    Attached Files:

    #25
  26. pankaj_211

    I think if you remove the first column from the data, things will work.
     
    #26
  27. pankaj_211

    Hi Rakesh,
    Below is the solution for project 3. I have removed the first column from the file using Pig. Please let me know if something is wrong in the solution.

    Thanks,
    Pankaj
     

    Attached Files:

    #27
  28. Ashish_281

    Hi Rakesh,
    How do I typecast a string field to int in Hive?
     
    #28
  29. pankaj_211

    Hi Ashish,
    I hope the line below will be helpful for you:
    SELECT CAST((123.89+20) AS STRING);
    The same CAST syntax works in the other direction, for example SELECT CAST('123' AS INT);

    Thanks,
    Pankaj
     
    #29
  30. Rakesh_236

    Thanks, Pankaj, for providing the solution.

    Ashish,

    Define your schema so that it matches the actual values present in the CSV. If that was the reason you wanted to typecast later, there is no need. In case you want to use it in a SELECT, do as Pankaj suggested.

    Regards
    Rakesh
     
    #30
  31. Rakesh_236

    Thank you Pankaj and Ashish for submitting assignments. I will review them. Keep it up! :)

    Regards
    Rakesh
     
    #31
  32. Rakesh_236

    Hi,

    On the project day, I will give you presenter rights to show me your steps in Eclipse so we can solve it together.

    Regards
    Rakesh
     
    #32
  33. Rakesh_236

    Hello Everyone,

    I have updated the drive with:
    1) Twitter config file and steps to configure the account and run the agent
    2) HBase & Phoenix sample commands
    3) Spark sample commands
    4) Weekly assignment using the Spark core API: it has 2 assignments, and your mission, should you choose to accept it ;), is to complete both or either of them.
    5) Text file with a link to a YouTube video on how to share a folder between your host machine and the guest VM in VirtualBox

    CCA sample questions to follow soon...
    Happy learning!

    Regards
    Rakesh Srivastva
     
    #33
  34. Ashish_281

    Hello Rakesh,

    I tried one of the Spark assignments. Please have a look and let me know your views. Thanks.
     

    Attached Files:

    #34
  35. pankaj_211

    Hi Rakesh,
    Please find attached the solution for the Sentiment Analysis assignment. Please let me know your comments on it. I tried to solve the problem in two different ways and have attached both solutions. I used PySpark for the solution.

    Thanks,
    Pankaj
     

    Attached Files:

    #35
  36. Rakesh_236

    Hi Pankaj,

    Sure will do that next week.

    Regards
    Rakesh
     
    #36
  37. Rakesh_236

    Hi All,

    I have uploaded a Word document to Google Drive named Project_question_answer.doc that has the output of each question, without the command itself. Please compare your output against it to ensure you have got the right answer.

    Happy learning!

    Regards
    Rakesh
     
    #37
  38. _41000

    Rakesh, this is Kaluba from the other group. I managed to complete the assignment but am still having trouble with question 2. Please help me.
     
    #38
  39. _41000


    Rakesh, this is Kaluba. I managed to complete the assignment but am still having trouble with question 2. I don't know why, but I can't seem to wrap my head around it. Please help me.
     
    #39
  40. Rakesh_236

    Hi Kaluba,

    Question 2 is the marketing success rate. Count all records and store the result in a variable. Then count all records where the customer has subscribed. Check the column Y: it has either yes or no. Yes means the customer has subscribed and no means the customer has not subscribed.
    The yes count divided by the total count is the success rate. The no count divided by the total count is the failure rate.
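
    The calculation described above can be sketched in plain Python. This is an illustration only, assuming each record is a dict with a column named y holding yes or no; the sample rows are made up and the function name subscription_rates is hypothetical. The same two counts can be produced with whichever tool the assignment uses.

```python
def subscription_rates(rows, column="y"):
    """Return (success_rate, failure_rate) from yes/no values in `column`."""
    total = len(rows)
    if total == 0:
        raise ValueError("no records to count")
    # Count subscribed and not-subscribed customers separately.
    yes_count = sum(1 for r in rows if r[column].strip().lower() == "yes")
    no_count = sum(1 for r in rows if r[column].strip().lower() == "no")
    return yes_count / total, no_count / total


# Made-up sample records, only to exercise the function.
sample = [{"y": "yes"}, {"y": "no"}, {"y": "no"}, {"y": "yes"}, {"y": "no"}]

success, failure = subscription_rates(sample)
print(f"success rate: {success:.2%}")
print(f"failure rate: {failure:.2%}")
```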

    Let me know if this helps
    Regards
    Rakesh
     
    #40
  41. Rakesh_236

    Hello everyone,

    I hope all of you have submitted the projects. Let me know if anyone still needs help. I had a very good time, and a lot of intelligent questions helped all of us :). Keep up the learning curve! You can find me at rakeshsrivastva75_gmail.com // www.linkedin.com/in/rakesh-srivastva

    Regards
    Rakesh Srivastva
     
    #41
  42. _62919

    _62919 Member

    Joined:
    Jun 1, 2019
    Messages:
    12
    Likes Received:
    0
    Hello!
    I am trying to import a table through Sqoop using the following command in the terminal in VirtualBox:
    sqoop import --connect jdbc:mysql://localhost/training --username training --password training --table countries;

    I am getting this error:
    ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'sqoop import --connect jdbc:mysql://localhost/training --username training --pas' at line 1

    please help me out.
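
    ERROR 1064 (42000) is a MySQL server error, and the quoted text begins with sqoop import, which suggests the command was entered at the mysql> prompt. sqoop is a separate command-line program, so it would normally be run from the Linux shell, without the trailing semicolon. A minimal sketch follows, assuming sqoop is on the PATH of the VM; the helper name build_sqoop_import is hypothetical and the connection values are the ones from the post.

```python
import shutil
import subprocess


def build_sqoop_import(jdbc_url, username, password, table):
    """Build the argument list for the sqoop import command quoted in the post."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password", password,
        "--table", table,
    ]


cmd = build_sqoop_import(
    "jdbc:mysql://localhost/training", "training", "training", "countries"
)

if shutil.which("sqoop"):
    # Run sqoop as an operating-system process, not as SQL inside the mysql client.
    subprocess.run(cmd, check=False)
else:
    # sqoop is not installed here, so only show the command that would be run.
    print("sqoop not found on PATH; command would be:", " ".join(cmd))
```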
     
    #42
