CloudLab - Kickstarter

Discussion in 'Big Data and Analytics' started by AnupriyaT, Sep 30, 2017.

  1. AnupriyaT

    AnupriyaT Well-Known Member
    Alumni

    Joined:
    May 29, 2017
    Messages:
    151
    Likes Received:
    28
    Our Big Data Hadoop course has a massive learner base, with participants who differ in background and years of work experience. Installation is a vital part of Hadoop, but installing its components separately is tedious and demanding for learners with limited experience in installation and administration. If you're interested in trying a Hadoop installation on your local machine, you can refer to this thread: http://community.simplilearn.com/threads/hadoop-setup-and-vt-x-disabled-error.26383/


    However, to help all our participants get past this installation roadblock and get straight to work, we have created a fully functional multi-node Hadoop cluster: CloudLab. Simplilearn’s CloudLab caters to the developer needs of Big Data Hadoop and helps you work with different Hadoop components like Pig, Hive, Impala, Sqoop, and Apache Spark.

    Here’s an introduction to CloudLab. Feel free to use this as a ready reckoner for all the Hadoop practice on the Lab.

    Accessing CloudLab:
    In your LMS, go to the Projects > Lab Access tab to find the CloudLab services and the credentials required for logging in.


    CloudLab is based on a Linux virtual machine on which Hadoop and its various services are installed. Two file systems are available: the Linux file system and HDFS, the Hadoop Distributed File System, which stores files in a distributed manner across the cluster. You can access the Linux virtual machine by logging in to Webconsole. Create a file using the ‘vi’ command and you can see it using the ‘ls’ command, just like on any other Linux machine. To move files to and from the Linux system, use our FTP service. To move files to and from your Hadoop file system, use our HUE service. For more detail, refer to the Community link below:
    http://community.simplilearn.com/threads/hue-and-ftp-services-explained.27291/
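To make the two file systems concrete, here is a minimal sketch that handles the same file on each side. The file name notes.txt and the scratch directory are placeholders made up for illustration and are not part of CloudLab. The hdfs dfs subcommands are standard Hadoop CLI calls; the block skips them when no hdfs client is on the PATH (for example, outside the lab).

```shell
#!/bin/sh
# Work in a scratch directory so nothing in your home folder is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Linux file system: create a file (instead of opening vi) and list it.
printf 'hello cloudlab\n' > notes.txt
ls -l notes.txt

# HDFS: only attempted when the Hadoop client is installed.
if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -put -f notes.txt notes.txt   # copy into your HDFS home directory
    hdfs dfs -ls notes.txt                 # list it on the HDFS side
    hdfs dfs -cat notes.txt                # print its contents from HDFS
else
    echo "hdfs client not found; skipping the HDFS half"
fi
```

A relative HDFS path such as notes.txt resolves against your HDFS home directory, so the local copy and the HDFS copy are two separate files that merely share a name.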


    Services in CloudLab
    • Webconsole - The Linux terminal of the system on which Hadoop is installed. Use the ls command to list local files, and hdfs commands (for example, hdfs dfs -ls) to access files in HDFS.
    • FTP - Move files between your computer and the Linux machine.

    • Hue - Use the File Browser in Hue for a user interface to the files in HDFS, and the Query editors for writing Pig, Hive and Impala queries.

    • Cloudera Manager - Log in to Cloudera Manager to view the running Hadoop services and the other cluster characteristics.

    • Spark 2.1 - Log in to Spark 2.1 to view the event log directory of the Spark commands you have executed.

    • Flink - Use the Apache Flink service for stream processing and other data streaming applications.

    Here are the introductory commands for different services on CloudLab.

    1. MySQL
    Server name : sqoopdb.cloudlab.com
    Username : labuser
    Password : simplilearn
    Sample command : mysql -h sqoopdb.cloudlab.com -u labuser -p
    (OR) mysql -h 10.0.3.12 -u labuser -p
    Once you run the above command, it will prompt for a password; type : simplilearn

    2. Pig and Hive
    Simply type ‘pig’ or ‘hive’ on the Webconsole to invoke the respective interactive shell. For a graphical interface to Pig and Hive, use the Hue Query editors.

    3. Sqoop
    Here’s the very first Sqoop command you can try:
    sqoop import --connect jdbc:mysql://10.0.3.12:3306/test --username labuser --password simplilearn --table CUSTOMERS --driver com.mysql.jdbc.Driver --m 1 --target-dir output
    Replace the database name (test), the table name (CUSTOMERS) and the target directory (output) with your own values and you’re all good to go!
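One way to see which parts of the import command change from exercise to exercise is to pull them into shell variables. This is only a sketch: the variable names (DB_HOST, DB_NAME, TABLE_NAME, TARGET_DIR, RUN_SQOOP) are hypothetical, the values are the ones from the post, and -m is used as the short form of Sqoop's --num-mappers option. The command is printed for review and runs only if sqoop is installed and you opt in.

```shell
#!/bin/sh
# Hypothetical variable names; the values are the ones used in the post.
DB_HOST="10.0.3.12"
DB_NAME="test"
TABLE_NAME="CUSTOMERS"
TARGET_DIR="output"

# Assemble the import command from its parts (-m 1 means a single mapper).
SQOOP_CMD="sqoop import --connect jdbc:mysql://${DB_HOST}:3306/${DB_NAME} --username labuser --password simplilearn --table ${TABLE_NAME} --driver com.mysql.jdbc.Driver -m 1 --target-dir ${TARGET_DIR}"

# Print it so you can check it before running anything.
echo "$SQOOP_CMD"

# Run only where sqoop exists and the caller sets RUN_SQOOP=1.
# The unquoted expansion is intentional: it splits the string into arguments,
# which is safe here because none of the values contain spaces.
if command -v sqoop >/dev/null 2>&1 && [ "${RUN_SQOOP:-0}" = "1" ]; then
    $SQOOP_CMD
fi
```

To import a different table, change only the four variables at the top. Sqoop refuses to write into a target directory that already exists in HDFS, so pick a new TARGET_DIR for each run.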

    4. Beeline Hive
    Type beeline to invoke the Beeline shell. Once inside, run the following connect command to connect to HiveServer2:
    !connect jdbc:hive2://cloudera-masternode1.cloudlab.com:10000
    Use the credentials below to log in:
    Username: beeline
    Password: simplilearn
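If you prefer not to type the connect command and credentials interactively, the same connection can be sketched as a single non-interactive call. The variable names below are hypothetical, and the query (show databases) is just a harmless example. The flags -u, -n, -p and -e are standard Beeline options for the JDBC URL, username, password and a query to execute. The block only prints a message when beeline is not installed.

```shell
#!/bin/sh
# Connection details as given in the post; variable names are illustrative.
HIVE_URL="jdbc:hive2://cloudera-masternode1.cloudlab.com:10000"
HIVE_USER="beeline"
HIVE_PASS="simplilearn"

if command -v beeline >/dev/null 2>&1; then
    # Connect, run one query, then exit back to the Linux prompt.
    beeline -u "$HIVE_URL" -n "$HIVE_USER" -p "$HIVE_PASS" -e "show databases;"
else
    echo "beeline not found; would connect to $HIVE_URL as $HIVE_USER"
fi
```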

    5. Impala
    impala-shell -i impala.cloudlab.com (OR)
    impala-shell -i cloudera-slavenode3.cloudlab.com


    6. Apache Spark
    Use the ‘pyspark’ command to work with the Python shell of Apache Spark, and the ‘spark-shell’ command to work with its Scala shell.

    Pitfall!

    Permission denied error when trying to create a directory at the root level.


    On Webconsole you are in a Linux environment, where a leading ‘/’ refers to the root directory. A command such as mkdir /simplilearn will fail, because we don't provide sudo/root access and your account therefore cannot create folders or write at the root level.
    Kindly drop the leading ‘/’ and use the following command structure, which creates the directory inside your current working directory:
    mkdir <directory_name>
    Sample : mkdir simplilearn
    Note : "/" indicates root
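The difference between the two forms can be tried safely with the sketch below. It assumes a throwaway scratch directory (created with mktemp) standing in for your lab home folder, and it only checks whether the root directory is writable instead of actually writing there.

```shell
#!/bin/sh
# Start from a scratch directory that stands in for your lab home folder.
cd "$(mktemp -d)" || exit 1

# Relative path (no leading '/'): created under the current directory.
mkdir simplilearn
ls -d "$PWD/simplilearn"

# Absolute path (leading '/'): targets the root directory, which is where
# "Permission denied" appears on CloudLab. Check writability, do not write.
if [ -w / ]; then
    echo "root directory is writable for this user"
else
    echo "root directory is not writable; mkdir /simplilearn would be denied"
fi
```

Running pwd before mkdir shows where a relative directory will end up.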

    Now that you’re all equipped, we wish you the very best for your hands-on practice and projects.

    If you hit a roadblock, search for it in Simplitalk or the community first, so you can find your answer right away!

    If you face further hurdles, do reach us through the Help & Support section, attaching a screenshot, traceback or log where relevant.

    All the very best!
     
  2. _6230

    _6230 Well-Known Member
    Alumni

    Joined:
    Apr 4, 2017
    Messages:
    176
    Likes Received:
    8
    This is the best handbook anyone could maintain. Thank you for sharing it.
     
  3. Sandeep_250

    Sandeep_250 Active Member
    Alumni

    Joined:
    Jun 21, 2016
    Messages:
    20
    Likes Received:
    1
    Good briefing!!
     
  4. AnupriyaT

    This is to keep you all posted that when you navigate to the Projects > Lab Access tab in your LMS for the first time, it's quite common to see the error message below:
    ERROR : Your lab account is being setup. Please refresh the page after 2 minutes.
    Though the message says two minutes, you will typically need to wait 20-30 minutes for the services to become visible. Then clear your browser cache and refresh the page, and you should find your CloudLab services: Cloudera Manager, HUE, FTP, Webconsole, Spark 2.1 and Flink. If you still face difficulties, try a different browser.
    Kindly note: the above message appears only the first time, because the services take a while to be initialized for your account.

    Now you are all set to seamlessly access your lab.

    It's time for some hands-on :)

    Good luck!
     
  5. Aishwarya_24

    Aishwarya_24 Member

    Joined:
    Jun 22, 2017
    Messages:
    4
    Likes Received:
    0

    Hi Anupriya,

    I am not able to log in to the Scala shell. I used the following command:

    spark-shell --packages com.databricks:spark-csv_2.10:1.4.0

    But it is taking too long to respond. I have been waiting for more than half an hour and still cannot get into the Scala shell. A screenshot is attached below. I am working on my BDH project.

    Please guide me on this!!
     

  6. AnupriyaT

    Hi Aishwarya,

    The thread below will help you resolve this issue:
    http://community.simplilearn.com/threads/spark-shell-getting-freezed.27303/
     