Discussion in 'Big Data and Analytics' started by Neha_Pandey, Feb 16, 2019.
Kindly post your question over here.
Thank you, Neha, for the page.
Hi Neha, we would like to know how to connect PuTTY and WinSCP to our lab machine.
You need to:
1) Go to http://www.putty.org/ and click the "You can download PuTTY here" link.
2) Run the PuTTY program. On your computer, go to All Programs > PuTTY > PuTTY.
3) Select or enter the following information:
How do we complete the lab and project on Simplilearn?
The trainer will guide you through the process.
I tried the above but am still not able to connect to the lab through PuTTY. Can you please assign someone to guide us through the process?
I tried the procedure below and it worked for me:
Find the external IP address of the machine (through the shell box). Enter the same address in PuTTY with port 22 and it will connect to the machine.
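If PuTTY still refuses to connect, it can help to check first whether port 22 is reachable at all from your computer. The following is a minimal sketch of such a check, not part of the lab instructions; the function name `can_reach_ssh` and the address in the usage comment are made up for illustration.

```python
import socket


def can_reach_ssh(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


# Usage (replace the placeholder with the external IP address of your lab machine):
# print(can_reach_ssh("203.0.113.10"))
```

A result of False points to a network or firewall problem, or a wrong address, rather than a PuTTY setting.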
Let me know if you are not able to connect.
Thanks and Regards,
I have uploaded a folder called mapreduce practice to OneDrive. The data file for practice is also available at my Unix file path: /home/rakeshsrivastva75_gmail/practice_dataset/retail_input
If you have any questions about the practice, please post your query.
Thank you for reaching out to us.
As requested, to connect to the Web Console via PuTTY you need to enter the details below:
Then use your login credentials to access the console, which is available on the Practice Lab tab. I have also attached a screenshot to assist you with this.
I hope this helps.
I have added a file, using eclipse.rtf, to the map reduce folder on OneDrive. Please refer to this file to know which JARs to use for running a MapReduce app.
We are still facing FTP issues. Do you mind giving access permissions to the datasets in your home directory on Linux?
Please note that I have uploaded the demo commands from the weekend, a sentiment analysis use case for MR, and a use case for data analysis using Hive to the OneDrive folder. Folders to refer to: 1) File Operations and data Analysis 2) Map Reduce practice.
Titanic data set available at: /home/rakeshsrivastva75_gmail/practice_dataset/titanic/
Sentiment data set available at: /home/rakeshsrivastva75_gmail/practice_dataset/sentiment/
All datasets available at: /home/rakeshsrivastva75_gmail/practice_dataset/
Everyone should be able to access the data from this location in case FTP is still not working.
I have added project4 from the LMS, to be done using Hive. You will find the problem statement and dataset on OneDrive under File operations and analysis --> movie analysis with Hive.
You will also find the movie dataset at this location: /home/rakeshsrivastva75_gmail/practice_dataset/Movie/
Happy practice! I will share learnings in our next session.
Could you also upload to OneDrive the import/export commands for practice with the Sqoop/Hive/Pig tools that were part of our session?
I have already uploaded them for Sqoop and Hive (what we have covered so far). Please look for them in the respective folders I have created.
We have not started with Pig yet.
Please note that I have rearranged our OneDrive folder:
1) All demo commands and practice for lessons 1 to 3 are now available in the folder: lesson_1_to_3
2) All demo commands and practice for lessons 4 to 9 are now available in the folder: lesson_4_to_9
-- You will find 3 new projects that you are required to do in Hive/Pig
-- I have uploaded all the demoed Flume configuration files and the agent run command in a subfolder called flume
-- All other supporting files that we used for the Hive demo (Transform, UDF, JSON) are copied too
3) Please complete your 3 projects and start submitting as soon as you are done. You can either upload here or create a subfolder (your name) under either of the 2 available parent folders, (lesson_1_to_3) or (lesson_4_to_9)
I am not able to upload the file to Google Drive; I am getting a permission error. Please go through the attached file containing the solution for the Titanic data. I found the mistake because of which I was not able to solve this last week.
For project 4 (IMDb Movie), what is the last column of the data set?
1,The Nightmare Before Christmas,1993,3.9,4568
I want to know what the column containing 4568 is. Is it the movie duration in seconds?
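One quick way to sanity-check the "duration in seconds" guess is to parse the sample record and convert the last value to minutes. The sketch below does only that; the column names are assumptions for illustration, since no header for the data set is given here.

```python
import csv
import io

# Sample record quoted in the question above.
record = "1,The Nightmare Before Christmas,1993,3.9,4568"

# Assumed column names (hypothetical): the thread does not give a header.
columns = ["id", "title", "year", "rating", "last_column"]

row = next(csv.reader(io.StringIO(record)))
movie = dict(zip(columns, row))

# If the last column were a duration in seconds, this is its length in minutes.
minutes = int(movie["last_column"]) / 60
print(f"{movie['title']}: {minutes:.1f} minutes")
```

A value of about 76 minutes is a plausible running time for a feature film, which is consistent with the guess but does not prove it; the project's problem statement remains the authority on the column's meaning.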
Please go through the attached file containing the solution for Project 4 (IMBD_Movie).
For Project 3, I am not able to map one column; please help me. As per my understanding, _ is the delimiter.
1 -- qid
563355 -- User id of questioner
62701 -- Score of the question
0 -- Time of the question (in epoch time)
1235000081 -- ?? Not able to map
php,error,gd,image-processing -- tags
220 -- qvc
2 -- qac
563372 -- aid
67183 -- j
2 -- as
1235000501 -- at
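One possible reading, consistent with the suggestion elsewhere in this thread to drop the first column, is that every label above is shifted by one position because the record starts with an extra row id. The sketch below applies that assumption to the values exactly as listed; the label `row_id` and the shifted mapping are hypothetical, not taken from the project statement.

```python
from datetime import datetime, timezone

# Values exactly as listed in the post above, in order.
values = [
    "1", "563355", "62701", "0", "1235000081",
    "php,error,gd,image-processing", "220", "2",
    "563372", "67183", "2", "1235000501",
]

# Assumed mapping: the post's labels shifted by one, with a leading row id.
labels = [
    "row_id", "qid", "user_id", "score", "question_time",
    "tags", "qvc", "qac", "aid", "j", "as", "at",
]

record = dict(zip(labels, values))

# Under this mapping both time fields should decode to sensible epoch timestamps.
asked = datetime.fromtimestamp(int(record["question_time"]), tz=timezone.utc)
answered = datetime.fromtimestamp(int(record["at"]), tz=timezone.utc)

print("question time:", asked.isoformat())
print("answer time:  ", answered.isoformat())
print("seconds between:", (answered - asked).total_seconds())
```

With the shift, the unmapped value 1235000081 becomes the question time and decodes to a date in February 2009, a few minutes before the answer time, while the score becomes 0. That is at least internally consistent, but it should be checked against the project's schema.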
I keep having trouble with the MapReduce homework.
Even when I work on the player statistics file and use the code you provided for the team statistics data, I am still having no success in Eclipse when I run it. I have put in the input and output part but still no success.
Maybe my problem is not knowing how to properly supply a .csv file as input. I have spent hours on YouTube and on research, still to no avail.
I am having the same problem.
Nothing is working for me today.
Please find below the file that includes the commands for project 4. Thanks!
I think if you remove the first column from the data, things will work.
Below is the solution for project 3. I have removed the first column from the file using Pig. Please let me know if something is wrong in the solution.
How do I typecast a string field to int in Hive?
I hope the line below is helpful for you:
SELECT CAST((123.89+20) AS STRING);
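The line above casts a number to STRING; the question asks for the opposite direction, which uses the same CAST form with a numeric target type. As a runnable illustration, the sketch below uses Python's built-in SQLite rather than Hive, so it is an analogy only: SQLite spells the type INTEGER, whereas in Hive the equivalent is written CAST(col AS INT). The literal '4568' is just a sample value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Cast a string literal to an integer. In Hive the same idea is
# written as: SELECT CAST('4568' AS INT);
(value,) = conn.execute("SELECT CAST('4568' AS INTEGER)").fetchone()
print(value, type(value).__name__)

conn.close()
```

One difference to keep in mind: for a string that is not numeric, Hive returns NULL from the cast, while SQLite returns 0.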
Thanks Pankaj for providing the solution.
Define your schema so that it matches the actual values present in the CSV. If that was the reason you were looking to typecast later, then there is no need. In case you want to use it in a SELECT, then do as Pankaj suggested.
Thank you Pankaj and Ashish for submitting assignments. I will review them. Keep it up!
On the project day, I will give you presenter rights to show me your steps in Eclipse so we can solve it together.
I have updated drive with:
1) Twitter config file and steps to configure account and run the agent
2) Hbase & Phoenix sample commands
3) Spark sample commands
4) Week assignment using the Spark core API: it has 2 assignments, and your mission, should you choose to accept it, is to complete both or either of them.
5) Text file with the link to a YouTube video on how to share a folder between your host machine and the VirtualBox guest
CCA sample questions to follow soon...
I tried one of the Spark assignments. Please have a look and let me know your views. Thanks.
Please find the solution for the Sentiment Analysis assignment. Please let me know your comments on it. I tried to solve the problem in two different ways and have attached both solutions. I used PySpark for the solution.
Sure will do that next week.
I have uploaded a Word document to Google Drive named Project_question_answer.doc that has the output of each question, minus the command itself. Please match it to ensure you have got the right answer.
Rakesh, this is kaluba from the other group. I managed to complete the assignment but am still having trouble with question 2. I don't know why, but I can't seem to wrap my head around it. Please help me.
Question 2 is the marketing success rate. Do a count of all records and store it in a variable. Then do a count of all records where the customer has subscribed. Check the column Y: it has either yes or no. Yes means the customer has subscribed and no means not subscribed.
The yes count divided by the total count is the success rate. The no count divided by the total count is the failure rate.
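The arithmetic described above can be sketched in plain Python as follows. This is only an illustration of the calculation, not the assignment's Spark solution; the function name `success_rate` and the five sample values are made up.

```python
def success_rate(y_values):
    """Return (success_rate, failure_rate) for a list of 'yes'/'no' values."""
    total = len(y_values)
    if total == 0:
        raise ValueError("no records to count")
    # Count the records where the customer has subscribed (Y == yes).
    subscribed = sum(1 for y in y_values if y.strip().lower() == "yes")
    return subscribed / total, (total - subscribed) / total


# Made-up sample of the Y column: 2 subscribed out of 5 records.
sample_y = ["yes", "no", "no", "yes", "no"]
success, failure = success_rate(sample_y)
print(f"success rate: {success:.2f}, failure rate: {failure:.2f}")
```

The same two counts carry over directly to Spark or Hive: one count over all records, one count filtered on the Y column, then a division.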
Let me know if this helps
I hope all of you have submitted the projects. Let me know if anyone still needs help. I had a very good time, and a lot of intelligent questions helped all of us. Keep up the learning curve! You can find me at rakeshsrivastva75_gmail.com // www.linkedin.com/in/rakesh-srivastva
I am trying to import a table through Sqoop using the following command in the terminal in VirtualBox.
sqoop import --connect jdbc:mysql://localhost/training --username training --password training --table countries;
I am getting this error:
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'sqoop import --connect jdbc:mysql://localhost/training --username training --pas' at line 1
Please help me out.
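A note on the error text: ERROR 1064 (42000) is a message from the MySQL server, and it quotes 'sqoop import ...' as the SQL it failed to parse. That suggests the command was typed at the mysql> prompt instead of the operating system shell. Sqoop is a separate program, so it needs to be run after leaving the MySQL client (for example with exit). The sketch below only assembles the same command as an argument list, to show that it is an OS-level command; the function name `sqoop_import_argv` is made up, and the values are the ones from the post.

```python
import shlex


def sqoop_import_argv(connect, username, password, table):
    """Build the argument list for the sqoop import command quoted above."""
    return [
        "sqoop", "import",
        "--connect", connect,
        "--username", username,
        "--password", password,
        "--table", table,
    ]


argv = sqoop_import_argv(
    "jdbc:mysql://localhost/training", "training", "training", "countries"
)

# Print the command as it would be typed at the Linux shell prompt.
print(shlex.join(argv))

# To launch it from Python instead (assumes sqoop is installed and on PATH):
# import subprocess; subprocess.run(argv, check=True)
```

The trailing semicolon in the original command is only needed for SQL statements; at the shell prompt it can be left off.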