Big Data Hadoop and Spark Developers | Gautam

Discussion in 'Big Data and Analytics' started by Koyel Sinha Chowdhury, Jul 23, 2019.

  1. Koyel Sinha Chowdhury

    Koyel Sinha Chowdhury Well-Known Member

    Joined:
    Feb 14, 2019
    Messages:
    54
    Likes Received:
    5
    Hi All,

This thread is for you to discuss queries and concepts related to the Big Data Hadoop and Spark Developers course.

    Happy Learning !!

    Regards,
    Team Simplilearn
     
    #1
    saurav singla_1 likes this.
  2. Sujata Mandal

    Sujata Mandal Member

    Joined:
    Jul 17, 2019
    Messages:
    2
    Likes Received:
    0
Gautam sir, what went wrong in creating my instance in Google Cloud? The wget command is not available for me; I am getting the message "-bash: wget: command not found".
    As advised, while creating the instance I only changed the OS to the CentOS 7 image. I also tried the firewall option allowing HTTPS traffic, but I am getting the same error. Please help.

[sujatamandal@instance-1 ~]$ wget -c --header "Cookie: oraclelicense=accept-securebackup-cookie" https://download.oracle.com/otn/jav...b4ff2b6607d096fa80163/jdk-8u131-linux-x64.rpm
    -bash: wget: command not found
    [sujatamandal@instance-1 ~]$ wget
    -bash: wget: command not found
    [sujatamandal@instance-1 ~]$ whereis wget
    wget:
    [sujatamandal@instance-1 ~]$
    [sujatamandal@instance-1 ~]$ whereis ls
    ls: /usr/bin/ls /usr/share/man/man1/ls.1.gz
    [sujatamandal@instance-1 ~]$
     
    #2
  3. Sujata Mandal

    Sujata Mandal Member

    Joined:
    Jul 17, 2019
    Messages:
    2
    Likes Received:
    0
The wget issue is resolved. I installed wget using the following command:
    $> sudo yum install wget
     
    #3
  4. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
Install wget first: $ sudo yum install wget -y
     
    #4
  5. Ks4823

    Ks4823 Member

    Joined:
    Jul 25, 2019
    Messages:
    2
    Likes Received:
    0
    Hello Gautam,

Today I got a notice that there is an update available for our class in Simplilearn. Should we accept the update? It looks like there is quite a bit of new content that would be useful. Here are the details of the change notice:

    "Our recently updated Big Data Hadoop and Spark Developer Course now offers more industry-relevant concepts and significant hands-on exposure. The course has been updated with the following changes:
    > Added 4 Real industry projects from Amazon, New York Stock Exchange, Glassdoor, and an insurance firm that covers Retail, Stock Exchange, Human Resource, and BFSI domains
    > Number of demos have been increased to 46 which will give more exposure to the most relevant tools, technologies, and use cases
    > Course structure is revamped by incorporating lesson-end projects"
     
    #5
  6. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
    Please accept updates to get new course content.
     
    #6
  7. Binish Mushtaq

    Joined:
    Jun 26, 2019
    Messages:
    2
    Likes Received:
    0
    Hi Gautam,

I have been trying to execute my bigdata.jar on the Hadoop cluster.

    It would be very helpful if there were a guide for this, similar to the cluster setup guide.

    Could you please oblige?

    Regards
    BMH
     
    #7
  8. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
Sure, I will upload a Hadoop command guide. Thanks.
     
    #8
  9. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
    Hi Gautam,
While running "mvn clean install" to build the BigData project (mr-examples\src\main\java\org\myorg), I am getting the errors below. Please help me resolve this issue:

    C:\MyHadoopProjects>mvn clean install
    [INFO] Scanning for projects...
    [INFO]
    [INFO] ----------------------------< GFT:bigdata >-----------------------------
    [INFO] Building BigData 1.0-SNAPSHOT
    [INFO] --------------------------------[ jar ]---------------------------------
    [INFO]
    [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ bigdata ---
    [INFO] Deleting C:\MyHadoopProjects\target
    [INFO]
    [INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ bigdata ---
    [WARNING] Using platform encoding (Cp1252 actually) to copy filtered resources, i.e. build is platform dependent!
    [INFO] skip non existing resourceDirectory C:\MyHadoopProjects\src\main\resources
    [INFO]
    [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ bigdata ---
    [INFO] Changes detected - recompiling the module!
    [WARNING] File encoding has not been set, using platform encoding Cp1252, i.e. build is platform dependent!
    [INFO] Compiling 6 source files to C:\MyHadoopProjects\target\classes
    [INFO] -------------------------------------------------------------
    [ERROR] COMPILATION ERROR :
    [INFO] -------------------------------------------------------------
    [ERROR] Source option 5 is no longer supported. Use 7 or later.
    [ERROR] Target option 5 is no longer supported. Use 7 or later.
    [INFO] 2 errors
    [INFO] -------------------------------------------------------------
    [INFO] ------------------------------------------------------------------------
    [INFO] BUILD FAILURE
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time: 1.442 s
    [INFO] Finished at: 2019-08-04T21:34:39-04:00
    [INFO] ------------------------------------------------------------------------
    [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project bigdata: Compilation failure: Compilation failure:
    [ERROR] Source option 5 is no longer supported. Use 7 or later.
    [ERROR] Target option 5 is no longer supported. Use 7 or later.
    [ERROR] -> [Help 1]
    [ERROR]
    [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
    [ERROR] Re-run Maven using the -X switch to enable full debug logging.
    [ERROR]
    [ERROR] For more information about the errors and possible solutions, please read the following articles:
    [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
    C:\MyHadoopProjects>
     
    #9
  10. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
Add the following in pom.xml (before the <dependencies> tag):
    <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    Or else build with JDK 8.
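    As a quick sanity check (nothing project-specific assumed here), you can confirm which JDK Maven is actually using before rebuilding:

    $ java -version     # JDK found on the PATH
    $ mvn -v            # Maven also prints the Java version it runs with
    $ mvn clean install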
     
    #10
    Shailendra Parauha likes this.
  11. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
Thanks Gautam for the solution :) I used jdk1.8.0_221 and the issue is resolved :)
     
    #11
  12. _34000

    _34000 Member

    Joined:
    Jul 5, 2018
    Messages:
    2
    Likes Received:
    0
    Hi Gautam,
I am not able to start the Google Cloud cluster; there is an account issue. Instead of solving this, can I create the cluster on the Simplilearn lab? I am able to run basic commands like hdfs dfs -mkdir on the lab, but how do I create a master and slave cluster there? Are the steps the same? Please suggest.
     
    #12
  13. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
    Hi Gautam,

I am getting the error "Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient" when running the Hive command "show tables".
    I thought it might be because of a .lck file, but I can't see the folder metastore_db inside "/var/lib/hive/metastore".
    If I run the same command "show tables" again, I get another error: "java.sql.SQLException: Directory /var/lib/hive/metastore/metastore_db cannot be created".
    Please see steps 1, 2, and 3 below.
    (1)
    hive> show tables;
FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

    (2)
    bash-4.2$ whoami
    hdfs
    bash-4.2$ pwd
    /var/lib/hive/metastore
    bash-4.2$ ls -lrt
    total 0

    (3)
    hive> show tables;
    ============= begin nested exception, level (1) ===========
    java.sql.SQLException: Directory /var/lib/hive/metastore/metastore_db cannot be created.
    at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection.createDatabase(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
    at org.apache.derby.jdbc.InternalDriver.getNewEmbedConnection(Unknown Source)
    at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
    at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
    at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:208)
    at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:361)
    at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:416)
    at com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:120)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:501)
    at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:298)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
    at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
    at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1187)
    at org.datanucleus.NucleusContext.initialise(NucleusContext.java:356)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:775)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     
    #13
  14. Geetha Kannanchath

    Geetha Kannanchath New Member

    Joined:
    Jul 9, 2019
    Messages:
    1
    Likes Received:
    0
    Hi,

In the .bashrc file, add the environment variables below at the end of the file (sudo gedit ~/.bashrc):

    #Java Home directory configuration
    export JAVA_HOME="/usr/lib/jvm/java-9-oracle"
    export PATH="$PATH:$JAVA_HOME/bin"

    # Hadoop home directory configuration
    export HADOOP_HOME=/usr/local/hadoop
    export PATH=$PATH:$HADOOP_HOME/bin
    export PATH=$PATH:$HADOOP_HOME/sbin

    export HIVE_HOME=/usr/lib/hive
    export PATH=$PATH:$HIVE_HOME/bin

Also, the error message shows that a connection pool is not being created because the current user is a read-only user and doesn't have write permissions on the metastore_db files.
    This happens when we try to log in to Hive as a user other than the permitted one.
    The easiest way to get rid of it is to give all users full permissions on the required files. (The location of the required files appears in the second exception you saw.)
    Below are the steps:
    1. Check which user you are logged in as.
    Command: whoami
    2. Do an ls on the location to see whether that user has enough write and execute permissions.
    Command: ls -l [location] (default is /var/lib/hive/metastore/metastore_db)
    3. From the listing in step 2, check what permissions are set on the files under metastore_db.
    It should be 'rwx' for the user you got from step 1.
    4. If not, give rwx permissions to all users using sudo, and remove any stale lock files.
    Command: cd [location]
    Command: sudo chmod a+rwx . --recursive
    Command: rm *.lck

    Hope this solves the problem.
     
    #14
  15. Avinash_153

    Avinash_153 New Member

    Joined:
    Jul 21, 2019
    Messages:
    1
    Likes Received:
    0
    Hi Gautam,

I'm getting the following error when I try to import a table from MySQL to HDFS.

    Command I ran:

    sqoop import --connect jdbc:mysql://localhost/avinash --username labuser --password simplilearn --table avi --target-dir avinash/my_table

    Error:

    Task Id : attempt_1565602742504_0005_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
    The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
    at org.apache.sqoop.mapreduce.db.DBInputFormat.setDbConf(DBInputFormat.java:170)
    at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:161)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:755)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
    Caused by: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

    Thanks.
     
    #15
  16. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
No need to create master and slave nodes in the Simplilearn environment. It's a preinstalled single-node cluster and everything is preconfigured. Just open the Hue browser or a Linux terminal to submit jobs.
     
    #16
  17. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
     
    #17
  18. _34000

    _34000 Member

    Joined:
    Jul 5, 2018
    Messages:
    2
    Likes Received:
    0
I am always getting an SSH error while starting the VM:

    "You cannot connect to the VM instance because of an unexpected error. Wait a few moments and then try again. (#79)". Can you please suggest a fix?
     
    #18
  19. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
    #chmod -R 777 /var/lib/hive/metastore/metastore_db
    #rm -rf /var/lib/hive/metastore/metastore_db/*.lck
    Reopen Hive again
     
    #19
  20. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
If you were able to connect earlier, try rebooting the instance. It may be out of disk space. You can increase the memory and disk after stopping the instance.
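    If you can still get a shell on the instance, a quick way to confirm (just a sketch, no specific mount points assumed):

    $ df -h    # shows free disk space on each filesystem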
     
    #20
  21. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
Check if you are able to create a folder in HDFS: hadoop fs -mkdir /tmp/mydir. If you are not able to create it, then the problem is with the Hadoop services. Also, the MySQL connector should be version 8.
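    As a sketch of those checks (the connector path below is an assumption; the jar may sit elsewhere on your node):

    $ hadoop fs -mkdir /tmp/mydir
    $ hadoop fs -ls /tmp
    $ ls /usr/lib/sqoop/lib/ | grep -i mysql-connector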
     
    #21
  22. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
Try to create the /var/lib/hive/metastore/metastore_db folder manually. Give it permissions and retry.
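    For example (a sketch; the wide-open 777 mode is only to unblock the lab environment, not a production setting):

    $ sudo mkdir -p /var/lib/hive/metastore/metastore_db
    $ sudo chmod -R 777 /var/lib/hive/metastore

    Then reopen Hive and run show tables again.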
     
    #22
  23. Shanthakumar_1

    Shanthakumar_1 New Member

    Joined:
    Mar 30, 2019
    Messages:
    1
    Likes Received:
    0
    Hi Gautam,

I am trying to do some exercises on Hive using ngrams, following this link:

    https://gist.github.com/umbertogriffo/a512baaf63ce0797e175

I have followed the instructions and created the table tweets_raw. After creating it, I checked the records and it shows 0 rows, which means it is not picking up the already uploaded Twitter data. Below is the output of select * from tweets_raw; could you please check and guide me on how to proceed further?

    0: jdbc:hive2://instance1:10001/default> select * from tweets_raw;
    INFO : Compiling command(queryId=hive_20190818193232_a6d4bd50-9632-4529-a495-2a9efc376a89): select * from tweets_raw
    INFO : Semantic Analysis Completed
    INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tweets_raw.id, type:bigint, comment:null), FieldSchema(name:tweets_raw.created_at, type:string, comment:null), FieldSchema(name:tweets_raw.source, type:string, comment:null), FieldSchema(name:tweets_raw.favorited, type:boolean, comment:null), FieldSchema(name:tweets_raw.retweet_count, type:int, comment:null), FieldSchema(name:tweets_raw.retweeted_status, type:struct<text:string,user:struct<screen_name:string,name:string>>, comment:null), FieldSchema(name:tweets_raw.entities, type:struct<urls:array<struct<expanded_url:string>>,user_mentions:array<struct<screen_name:string,name:string>>,hashtags:array<struct<text:string>>>, comment:null), FieldSchema(name:tweets_raw.text, type:string, comment:null), FieldSchema(name:tweets_raw.user, type:struct<screen_name:string,name:string,friends_count:int,followers_count:int,statuses_count:int,verified:boolean,utc_offset:string,time_zone:string>, comment:null), FieldSchema(name:tweets_raw.in_reply_to_screen_name, type:string, comment:null), FieldSchema(name:tweets_raw.year, type:int, comment:null), FieldSchema(name:tweets_raw.month, type:int, comment:null), FieldSchema(name:tweets_raw.day, type:int, comment:null), FieldSchema(name:tweets_raw.hour, type:int, comment:null)], properties:null)
    INFO : Completed compiling command(queryId=hive_20190818193232_a6d4bd50-9632-4529-a495-2a9efc376a89); Time taken: 0.286 seconds
    INFO : Concurrency mode is disabled, not creating a lock manager
    INFO : Executing command(queryId=hive_20190818193232_a6d4bd50-9632-4529-a495-2a9efc376a89): select * from tweets_raw
    INFO : Completed executing command(queryId=hive_20190818193232_a6d4bd50-9632-4529-a495-2a9efc376a89); Time taken: 0.001 seconds
    INFO : OK
    +----------------+------------------------+--------------------+-----------------------+---------------------------+------------------------------+----------------------+------------------+------------------+-------------------------------------+------------------+-------------------+-----------------+------------------+--+
    | tweets_raw.id | tweets_raw.created_at | tweets_raw.source | tweets_raw.favorited | tweets_raw.retweet_count | tweets_raw.retweeted_status | tweets_raw.entities | tweets_raw.text | tweets_raw.user | tweets_raw.in_reply_to_screen_name | tweets_raw.year | tweets_raw.month | tweets_raw.day | tweets_raw.hour |
    +----------------+------------------------+--------------------+-----------------------+---------------------------+------------------------------+----------------------+------------------+------------------+-------------------------------------+------------------+-------------------+-----------------+------------------+--+
    +----------------+------------------------+--------------------+-----------------------+---------------------------+------------------------------+----------------------+------------------+------------------+-------------------------------------+------------------+-------------------+-----------------+------------------+--+
    No rows selected (0.523 seconds)
    0: jdbc:hive2://instance1:10001/default>
     
    #23
  24. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
    Hi Gautam,
I was able to connect to hive-server2, but while running the sqoop import command I am getting the error "ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly". Please help me with this; details are below:

bash-4.2$ sqoop import --connect jdbc:mysql://localhost/testDb \
    > --username root -P \
    > --table studentinfo \
    > --hive-import \
    > --hive-table test.studentinfo -m 1;


    19/08/22 04:20:55 INFO hive.HiveImport: Loading uploaded data into Hive
19/08/22 04:20:55 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
    19/08/22 04:20:55 ERROR tool.ImportTool: Import failed: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
    at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:50)
    at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
    at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
    at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
    at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:530)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
    Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
    ... 12 more
    bash-4.2$ hadoop fs -ls /user/hive/warehouse
    Found 2 items
    drwxrwxrwx - hdfs supergroup 0 2019-08-17 00:32 /user/hive/warehouse/demo.db
    drwxrwxrwx - anonymous supergroup 0 2019-08-22 03:49 /user/hive/warehouse/test.db

Also, please let me know why test.db was created by the anonymous user in hive-server2.
     
    #24
  25. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
    Hi Gautam,

    jar file "hive-hcatalog-core.jar" is existing inside folders "/usr/lib/sentry/lib/" and "/usr/lib/impala/lib/" but while running "add jar" command,
    I am getting error message jar file doesn't exit.
    Why "add jar" command is unable to find the jar file.

    One thing I can see that this jar file "hive-hcatalog-core.jar" is shared

    [root@instance-m sparauha]# cd /
    [root@instance-m /]# find . -name hive-hcatalog-core.jar
    ./usr/lib/sentry/lib/hive-hcatalog-core.jar
    ./usr/lib/impala/lib/hive-hcatalog-core.jar

    0: jdbc:hive2://localhost:10001/default> add jar /usr/lib/impala/lib/hive-hcatalog-core.jar;
    Error: Error while processing statement: /usr/lib/impala/lib/hive-hcatalog-core.jar does not exist (state=,code=1)
    0: jdbc:hive2://localhost:10001/default> add jar /usr/lib/sentry/lib/hive-hcatalog-core.jar;
    Error: Error while processing statement: /usr/lib/sentry/lib/hive-hcatalog-core.jar does not exist (state=,code=1)
    0: jdbc:hive2://localhost:10001/default>

    [root@instance-m lib]# cd /usr/lib/impala/lib
    [root@instance-m lib]# ls -lrt hive-hcatalog-core.jar
    lrwxrwxrwx. 1 root root 60 Aug 23 00:55 hive-hcatalog-core.jar -> /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar
    [root@instance-m /]# ls -lrt /usr/lib/sentry/lib/hive-hcatalog-core.jar
    lrwxrwxrwx. 1 root root 60 Aug 16 23:06 /usr/lib/sentry/lib/hive-hcatalog-core.jar -> /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar
    [root@instance-m /]#


    Thanks,
    Shailendra
     
    #25
  26. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
    Hi Gautam,

    While running "beeline -u jdbc:hive2:/localhost:10001/default" command I am getting below error, please help me on this.
    "ERROR beeline.ClassNameCompleter: Fail to parse the class name from the Jar file due to the exception:java.io.FileNotFoundException: minlog-1.2.jar (No such file or directory)"


    bash-4.2$ beeline -u jdbc:hive2:/localhost:10001/default
    which: no hbase in (/sbin:/bin:/usr/sbin:/usr/bin)
    scan complete in 2ms
19/08/22 22:48:36 [main]: ERROR beeline.ClassNameCompleter: Fail to parse the class name from the Jar file due to the exception:java.io.FileNotFoundException: minlog-1.2.jar (No such file or directory)
    19/08/22 22:48:36 [main]: ERROR beeline.ClassNameCompleter: Fail to parse the class name from the Jar file due to the exception:java.io.FileNotFoundException: objenesis-1.2.jar (No such file or directory)
    19/08/22 22:48:36 [main]: ERROR beeline.ClassNameCompleter: Fail to parse the class name from the Jar file due to the exception:java.io.FileNotFoundException: reflectasm-1.07-shaded.jar (No such file or directory)
    Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
    scan complete in 2215ms
    No known driver to handle "jdbc:hive2:/localhost:10001/default"
    Beeline version 1.1.0-cdh5.16.2 by Apache Hive


    Thanks,
    Shailendra
     
    #26
  27. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
The data for the table needs to be copied to the HDFS location /user/YOURUSER/upload/upload/data/tweets_raw. Check that this location in HDFS contains data.
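    A quick way to check, and to upload the files if they are missing (a sketch; the local tweets/ folder is only an example name):

    $ hadoop fs -ls /user/YOURUSER/upload/upload/data/tweets_raw
    $ hadoop fs -mkdir -p /user/YOURUSER/upload/upload/data/tweets_raw
    $ hadoop fs -put tweets/* /user/YOURUSER/upload/upload/data/tweets_raw/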
     
    #27
  28. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
Looks like you are missing the hive-common-x.x.jar file. Check for the file (hive-common-x.x.jar) under /usr/lib/hive, copy it to the /usr/lib/sqoop folder, and retry.
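    For example (a sketch; on this image the jars sit under the lib subdirectories, so adjust the paths if yours differ):

    $ ls /usr/lib/hive/lib/hive-common-*.jar
    $ sudo cp /usr/lib/hive/lib/hive-common-*.jar /usr/lib/sqoop/lib/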
     
    #28
  29. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
You are missing required libraries from the installation. Delete Hive and reinstall it. First stop hive-server2 and the metastore. Uninstall with yum remove hive -y. Delete /usr/lib/hive, /var/lib/hive, /etc/hive and /usr/bin/hive. Then reinstall Hive with yum install hive -y.
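    Roughly, the sequence looks like this (a sketch; the hive-server2 and hive-metastore service names assume CDH-style packaging):

    $ sudo service hive-server2 stop
    $ sudo service hive-metastore stop
    $ sudo yum remove hive -y
    $ sudo rm -rf /usr/lib/hive /var/lib/hive /etc/hive /usr/bin/hive
    $ sudo yum install hive -y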
     
    #29
  30. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
From the output of the ls -l command, hive-hcatalog-core.jar is a soft link (hive-hcatalog-core.jar -> /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar). So the jar is actually located at /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar and only a soft link exists in /usr/lib/impala/lib/. Use the real path with add jar: /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar
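    In your Beeline session that is just the symlink target substituted into the earlier command:

    0: jdbc:hive2://localhost:10001/default> add jar /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar;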
     
    #30
  31. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
Hi Gautam,
    I have copied the file hive-common-1.1.0-cdh5.16.2.jar from the directory /usr/lib/hive/lib to /usr/lib/sqoop/lib.
    Now, when running the sqoop import command, I am getting the error "Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/shims/ShimLoader".
    Please help me...

    [root@instance-m lib]# pwd
    /usr/lib/hive/lib
    [root@instance-m lib]# ls -l hive-common*
    -rw-r--r--. 1 root root 334031 Jun 3 10:39 hive-common-1.1.0-cdh5.16.2.jar
    lrwxrwxrwx. 1 root root 31 Aug 16 23:06 hive-common.jar -> hive-common-1.1.0-cdh5.16.2.jar
    [root@instance-m lib]# pwd
    /usr/lib/sqoop/lib
    [root@instance-m lib]# cp /usr/lib/hive/lib/hive-common-1.1.0-cdh5.16.2.jar .
    [root@instance-m lib]# ls -l hive-common*
    -rw-r--r--. 1 root root 334031 Aug 24 00:44 hive-common-1.1.0-cdh5.16.2.jar
    [root@instance-m lib]#


    sqoop import --connect jdbc:mysql://localhost/testDb \
    --username root -P \
    --table studentinfo \
    --hive-import \
    --hive-table test.studentinfo -m 1;


    19/08/24 00:50:08 INFO mapreduce.ImportJobBase: Transferred 99 bytes in 32.5144 seconds (3.0448 bytes/sec)
    19/08/24 00:50:08 INFO mapreduce.ImportJobBase: Retrieved 9 records.
    19/08/24 00:50:08 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `studentinfo` AS t LIMIT 1
    19/08/24 00:50:08 INFO hive.HiveImport: Loading uploaded data into Hive
    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/shims/ShimLoader
    at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:370)
    at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:108)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
    at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
    at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
    at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
    at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:530)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
    Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.shims.ShimLoader
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 17 more
     
    #31
    Last edited: Aug 23, 2019
  32. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
    Hi Gautam,
I don't see the folder hive-hcatalog inside /usr/lib.
    Please let me know how to handle this issue.

    Thanks,
    Shailendra
     
    #32
  33. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
    Hi Gautam,

I followed all the steps as you mentioned above.
    After installing Hive again, I am getting "Missing Hive Execution Jar: /usr/lib/hive/lib/hive-exec-*.jar".
    I copied the files "hive-exec-1.1.0-cdh5.16.2-core.jar" and "hive-exec-core.jar" from /usr/lib/hive/auxlib to /usr/lib/hive/lib/.
    Now I am getting the error "/usr/lib/hive/lib/hive-exec-1.1.0-cdh5.16.2-core.jar: binary operator expected
    Missing Hive MetaStore Jar".

    Please let me know what to do now?


    [root@instance-m lib]# sudo hive --service metastore
    Missing Hive Execution Jar: /usr/lib/hive/lib/hive-exec-*.jar
    [root@instance-m lib]# ls -l /var/lib/hive/metastore
    total 0
    [root@instance-m lib]#
    [root@instance-m lib]# su hdfs
    bash-4.2$ hive
    Missing Hive Execution Jar: /usr/lib/hive/lib/hive-exec-*.jar
    bash-4.2$

    [root@instance-m lib]# cd /
    [root@instance-m /]# find . -name hive-exec-*
    ./etc/hive/conf.dist/hive-exec-log4j.properties
    ./usr/lib/hive/auxlib/hive-exec-1.1.0-cdh5.16.2-core.jar
    ./usr/lib/hive/auxlib/hive-exec-core.jar
    [root@instance-m /]# cd /usr/lib/hive/auxlib/
    [root@instance-m auxlib]# cp hive-exec* /usr/lib/hive/lib/
    [root@instance-m auxlib]# cd /usr/lib/hive/lib/
    [root@instance-m lib]# ls -l hive-exec*
    -rw-r--r--. 1 root root 8130246 Aug 24 01:45 hive-exec-1.1.0-cdh5.16.2-core.jar
    -rw-r--r--. 1 root root 8130246 Aug 24 01:45 hive-exec-core.jar

    [root@instance-m lib]# sudo hive --service metastore
    /usr/lib/hive/bin/hive: line 91: [: /usr/lib/hive/lib/hive-exec-1.1.0-cdh5.16.2-core.jar: binary operator expected
    Missing Hive MetaStore Jar
    [root@instance-m lib]# su hdfs
    bash-4.2$ hive
    /usr/lib/hive/bin/hive: line 91: [: /usr/lib/hive/lib/hive-exec-1.1.0-cdh5.16.2-core.jar: binary operator expected
    Missing Hive MetaStore Jar


    Thanks,
    Shailendra
     
    #33
  34. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
    Hi Gautam,

    I am getting error "2019-08-25 02:38:11,494 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManag
    er org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.UnknownHostException: Invalid host name: loc
    al host is: (unknown); destination host is: "instance-m1":8031; java.net.UnknownHostException;"

    I have created new master instance: instance-m1 in place of instance-m
    so, I have updated, all 4 conf file in slave instance: instance-s1 as below
    hdfs-site.xml=> hostname as instance-s1
    core-site.xml => hostname as instance-m1
    mapred-site.xml => hostname as instance-m1
    yarn-site.xml=> hostname as instance-m1

    Error:

2019-08-25 02:38:11,494 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
    org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "instance-m1":8031; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:215)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:329)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:563)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:609)
    Caused by: java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "instance-m1":8031; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost

    Thanks,
    Shailendra
     
    #34
  35. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
Sorry for the delay in response. It looks like the master node instance-m1 is not running the ResourceManager process successfully, so the slave node cannot find the master. Please check whether jps on the master shows the ResourceManager process. If possible, restart the ResourceManager process and then try starting the NodeManager on the slave node again.
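    Roughly (a sketch; these commands assume a manual Hadoop install with HADOOP_HOME set, as in the earlier .bashrc post):

    On the master (instance-m1):
    $ jps | grep ResourceManager
    $ $HADOOP_HOME/sbin/yarn-daemon.sh stop resourcemanager
    $ $HADOOP_HOME/sbin/yarn-daemon.sh start resourcemanager

    Then on the slave (instance-s1):
    $ $HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager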
     
    #35
  36. Ks4823

    Ks4823 Member

    Joined:
    Jul 25, 2019
    Messages:
    2
    Likes Received:
    0
    Hi Gautam,

Where can I find the Simplilearn PDFs for the updated course presentations? I found the original ones from before the course update but can't locate the updated ones with the new/changed slides. Thanks!
     
    #36
  37. Ashwini Kotwal

    Alumni

    Joined:
    Sep 6, 2016
    Messages:
    5
    Likes Received:
    1
    Hi Gautam,

I am trying to access the optimized version of Hue to open the Flume config file in Hue, but Safari is giving me the error 'Failed to open the page' for both links: http://ip-10-0-1-10.ec2.internal:8889 and http://ip-10-0-1-11.ec2.internal:8889.
    Is there any other way to access the optimized Hue? The link from the practice lab tab doesn't take me to these instances directly.
     
    #37
  38. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
    Hi Gautam
From the Windows command prompt, I am running the command "mvn exec:java -Dexec.mainClass=com.cloudera.example.ClouderaHiveJdbcExample".
    I am getting the error "java.sql.SQLException: Could not open connection to jdbc:hive2://104.154.84.219:10001/test: java.net.ConnectException: Connection timed out: connect".
    Please help me.

    Details are below:
    C:\myhdfsprj\hive-impala-java-client>mvn exec:java -Dexec.mainClass=com.cloudera.example.ClouderaHiveJdbcExample
    [INFO] Scanning for projects...
    [INFO]
    [INFO] ---------< com.cloudera.example:cloudera-impala-jdbc-example >----------
    [INFO] Building cloudera-impala-jdbc-example 1.0
    [INFO] --------------------------------[ jar ]---------------------------------
    [INFO]
    [INFO] --- exec-maven-plugin:1.6.0:java (default-cli) @ cloudera-impala-jdbc-example ---
    =============================================
    Cloudera Impala JDBC Example
    Using Connection URL: jdbc:hive2://104.154.84.219:10001/test
    Running Query: select * from studentinfo limit 50
    java.sql.SQLException: Could not open connection to jdbc:hive2://104.154.84.219:10001/test: java.net.ConnectException: Connection timed out: connect
    at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:187)
    at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:164)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:247)
    at com.cloudera.example.ClouderaHiveJdbcExample.main(ClouderaHiveJdbcExample.java:42)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282)
    at java.lang.Thread.run(Thread.java:748)
    Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out: connect
    at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:248)
    at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
    at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:185)
    ... 11 more
    Caused by: java.net.ConnectException: Connection timed out: connect
    at java.net.DualStackPlainSocketImpl.connect0(Native Method)
    at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:79)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
    ... 14 more
    [INFO] ------------------------------------------------------------------------
    [INFO] BUILD SUCCESS
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time: 22.155 s
    [INFO] Finished at: 2019-08-28T18:08:11-04:00
    [INFO] ------------------------------------------------------------------------
    C:\myhdfsprj\hive-impala-java-client>
     
    #38
  39. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
    #39
  40. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
Hue is used here to submit Hadoop/Spark jobs: http://sl.cloudloka.com:8888/hue/accounts/login/?next=/
    To access the Flume configuration file, use the Web Console option in the practice lab and go to the /etc/flume-ng/conf folder from the terminal.
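    From the Web Console terminal, something like this (a sketch; the exact file name under that folder may differ on your lab image):

    $ cd /etc/flume-ng/conf
    $ ls
    $ cat flume.conf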
     
    #40
  41. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
    Connection exception because hive can not be connected through beeline. Check if
     
    #41
  42. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
The connection exception means Hive cannot be reached through Beeline. Check whether you are able to connect to Hive with beeline on port 10001 from the terminal.
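    A quick check from the cluster terminal, as a sketch:

    $ beeline -u jdbc:hive2://localhost:10001/default -e "show databases;"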
     
    #42
  43. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
From the terminal it's working.

    Thanks.
     
    #43
  44. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
    Hi Gautam,

(1) The command below is hanging and not executing. Do we need to do any settings for the NGRAMS command?

    0: jdbc:hive2://localhost:10001/default> SELECT EXPLODE(NGRAMS(SENTENCES(LOWER(review)), 2, 5)) AS bigrams FROM customer;

(2) Similarly, the insert command below is also hanging:

    hive> insert into table mytable123 values ("mumbai");
Query ID = hdfs_20190830014848_52cedbc2-571d-430c-b7c7-97ae6b0474a7
    Total jobs = 3
    Launching Job 1 out of 3
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_1566711604661_1020, Tracking URL = http://master:8088/proxy/application_1566711604661_1020/
    Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1566711604661_1020



(3) Similarly, the Pig dump command is also hanging:
    grunt>a = load '/tmp/1.txt' as (line);
    grunt>b = foreach a generate FLATTEN(TOKENIZE(line)) as word;
    grunt>c = group b by word;
    grunt>d = foreach c generate group, COUNT(b);
    grunt>dump d;
2019-08-29 22:14:05,647 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.jar is deprecated. Instead, use mapreduce.job.jar
    2019-08-29 22:14:05,682 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
    2019-08-29 22:14:05,702 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
    2019-08-29 22:14:05,702 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cache
    2019-08-29 22:14:05,702 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
    2019-08-29 22:14:05,819 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
    2019-08-29 22:14:05,820 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.http.address is deprecated. Instead, use mapreduce.jobtracker.http.address
    2019-08-29 22:14:05,849 [JobControl] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at master/10.128.0.6:8032
    2019-08-29 22:14:05,881 [JobControl] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
    2019-08-29 22:14:06,682 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
    2019-08-29 22:14:06,682 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
    2019-08-29 22:14:06,696 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
    2019-08-29 22:14:06,763 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
    2019-08-29 22:14:07,000 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1566711604661_0774
    2019-08-29 22:14:07,480 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1566711604661_0774
    2019-08-29 22:14:07,580 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://master:8088/proxy/application_1566711604661_0774/
    2019-08-29 22:14:07,581 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1566711604661_0774
    2019-08-29 22:14:07,582 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases a,b,c,d
    2019-08-29 22:14:07,582 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: a[1,4],b[2,4],d[4,4],c[3,4] C: d[4,4],c[3,4] R: d[4,4]
    2019-08-29 22:14:07,680 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete

    Thanks,
    Shailendra
     
    #44
    Last edited: Aug 29, 2019
  45. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
If jdbc:hive2://104.154.84.219:10001/test works from the terminal, then it should also work from the Java code.
     
    #45
  46. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
From Windows it didn't work.
     
    #46
  47. Gautam Pal

    Gautam Pal Customer
    Customer

    Joined:
    Jul 23, 2019
    Messages:
    27
    Likes Received:
    1
It looks like the cluster is out of memory. Increase the instance type to 7.5 GB RAM, 3 vCPU. Or reboot both nodes once.
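    To confirm before resizing, a quick check on each node (just a sketch):

    $ free -m             # free and used memory in MB
    $ top -b -n 1 | head  # one-shot snapshot of the busiest processes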
     
    #47
  48. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
    Hi Gautam,

I installed Cassandra on Linux and am getting an error while launching it.

    ./cassandra gives the error "no such file or directory".

    If I run the cassandra command I am getting the error
    "ERROR [main] 2019-08-31 02:33:31,678 CassandraDaemon.java:749 - Port already in use: 7199; nested exception is: java.net.BindException: Address already in use (Bind failed)"

    [root@master cassandra]# su hdfs
    bash-4.2$ cassandra

WARN [main] 2019-08-31 02:33:31,386 DatabaseDescriptor.java:480 - Small commitlog volume detected at /var/lib/cassandra/commitlog; setting commitlog_total_space_in_mb to 2557. You can override this in cassandra.yaml
    WARN [main] 2019-08-31 02:33:31,389 DatabaseDescriptor.java:507 - Small cdc volume detected at /cdc_raw; setting cdc_total_space_in_mb to 1278. You can override this in cassandra.yaml
    WARN [main] 2019-08-31 02:33:31,521 DatabaseDescriptor.java:556 - Only 4.469GiB free across all data volumes. Consider adding more capacity to your cluster or removing obsolete snapshots
    INFO [main] 2019-08-31 02:33:31,549 RateBasedBackPressure.java:123 - Initialized back-pressure with high ratio: 0.9, factor: 5, flow: FAST, window size: 2000.
    INFO [main] 2019-08-31 02:33:31,550 DatabaseDescriptor.java:735 - Back-pressure is disabled with strategy org.apache.cassandra.net.RateBasedBackPressure{high_ratio=0.9, factor=5, flow=FAST}.
    ERROR [main] 2019-08-31 02:33:31,678 CassandraDaemon.java:749 - Port already in use: 7199; nested exception is: java.net.BindException: Address already in use (Bind failed)
    java.net.BindException: Address already in use (Bind failed)
    at java.net.PlainSocketImpl.socketBind(Native Method) ~[na:1.8.0_131]
    at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387) ~[na:1.8.0_131]
    at java.net.ServerSocket.bind(ServerSocket.java:375) ~[na:1.8.0_131]
    at java.net.ServerSocket.<init>(ServerSocket.java:237) ~[na:1.8.0_131]
    at javax.net.DefaultServerSocketFactory.createServerSocket(ServerSocketFactory.java:231) ~[na:1.8.0_131]
    at org.apache.cassandra.utils.RMIServerSocketFactoryImpl.createServerSocket(RMIServerSocketFactoryImpl.java:42) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at sun.rmi.transport.tcp.TCPEndpoint.newServerSocket(TCPEndpoint.java:666) ~[na:1.8.0_131]
    at sun.rmi.transport.tcp.TCPTransport.listen(TCPTransport.java:330) ~[na:1.8.0_131]
    at sun.rmi.transport.tcp.TCPTransport.exportObject(TCPTransport.java:249) ~[na:1.8.0_131]
    at sun.rmi.transport.tcp.TCPEndpoint.exportObject(TCPEndpoint.java:411) ~[na:1.8.0_131]
    at sun.rmi.transport.LiveRef.exportObject(LiveRef.java:147) ~[na:1.8.0_131]
    at sun.rmi.server.UnicastServerRef.exportObject(UnicastServerRef.java:234) ~[na:1.8.0_131]
    at sun.rmi.registry.RegistryImpl.setup(RegistryImpl.java:195) ~[na:1.8.0_131]
    at sun.rmi.registry.RegistryImpl.<init>(RegistryImpl.java:155) ~[na:1.8.0_131]
    at java.rmi.registry.LocateRegistry.createRegistry(LocateRegistry.java:239) ~[na:1.8.0_131]
    at org.apache.cassandra.utils.JMXServerUtils.createJMXServer(JMXServerUtils.java:75) ~[apache-cassandra-3.11.4.jar:3.11.4]
     
    #48
    Last edited: Aug 30, 2019
  49. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
    Hi Gautam,

Loading data into the HBase table person from the HDFS file /tmp/sample.txt failed.
    I am getting the error "2019-09-01 00:46:33,601 INFO [main] mapreduce.Job: Task Id : attempt_1567293714941_0066_m_000000_1000, Status : FAILED No space available in any of the local directories".

    Please let me know the solution. I believe enough space is available on my instances, as I deleted all files from my home directory.
    Please let me know how to make space available in this case, and also how I can check how much space is available.


    hbase> create 'person', 'address'
    su hdfs
    $ vi /tmp/sample.txt
    1,mumbai
    2,delhi
$ hadoop fs -put /tmp/sample.txt /tmp/sample.txt
    $ hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator="," -Dimporttsv.columns=HBASE_ROW_KEY,address person /tmp/sample.txt
    Error Details:
    -------------
    2019-09-01 00:46:09,928 INFO [main] mapreduce.Job: Running job: job_1567293714941_0066
    2019-09-01 00:46:30,477 INFO [main] mapreduce.Job: Job job_1567293714941_0066 running in uber mode : false
    2019-09-01 00:46:30,479 INFO [main] mapreduce.Job: map 0% reduce 0%
2019-09-01 00:46:33,601 INFO [main] mapreduce.Job: Task Id : attempt_1567293714941_0066_m_000000_1000, Status : FAILED
    No space available in any of the local directories.
    2019-09-01 00:46:36,644 INFO [main] mapreduce.Job: Task Id : attempt_1567293714941_0066_m_000000_1001, Status : FAILED
    No space available in any of the local directories.
    2019-09-01 00:46:40,684 INFO [main] mapreduce.Job: Task Id : attempt_1567293714941_0066_m_000000_1002, Status : FAILED
    No space available in any of the local directories.
    2019-09-01 00:46:45,721 INFO [main] mapreduce.Job: map 100% reduce 0%
    2019-09-01 00:46:45,730 INFO [main] mapreduce.Job: Job job_1567293714941_0066 failed with state FAILED due to: Task failed task_1567293714941_0066_m_000000
    Job failed as tasks failed. failedMaps:1 failedReduces:0
    2019-09-01 00:46:45,857 INFO [main] mapreduce.Job: Counters: 12
    Job Counters
    Failed map tasks=4
    Launched map tasks=4
    Other local map tasks=3
    Data-local map tasks=1
    Total time spent by all maps in occupied slots (ms)=3167
    Total time spent by all reduces in occupied slots (ms)=0
    Total time spent by all map tasks (ms)=3167
    Total vcore-milliseconds taken by all map tasks=3167
    Total megabyte-milliseconds taken by all map tasks=3243008
    Map-Reduce Framework
    CPU time spent (ms)=0
    Physical memory (bytes) snapshot=0
    Virtual memory (bytes) snapshot=0
    bash-4.2$
     
    #49
  50. Shailendra Parauha

    Shailendra Parauha Active Member

    Joined:
    Jul 26, 2019
    Messages:
    21
    Likes Received:
    0
    Hi Gautam,

What is the solution for the below?

    [root@master sparauha]# sudo service zookeeper-server init
    No myid provided, be sure to specify it in /var/lib/zookeeper/myid if using non-standalone

    Thanks,
    Shailendra
     
    #50
