
WIPRO - BIGDATA ACADEMY

Shashidhar Jambanour

Member
Customer
Assignment: Computing how many patients have been treated by each doctor in a hospital.
 

Attachments

  • DoctorPatientsResult.jpg
    117.4 KB · Views: 12
  • DoctorToPatients.txt
    2.1 KB · Views: 14

Gaurav Khandelwal_2

Member
Customer
Hi Team,
Can you please help me understand why the warehouse folder is not in my /user/hive folder?
I created the DB and table, and I can see it in the Hive browser, but I am not able to find it in the file structure.
Please find the attached screenshot.

Also, I am getting an error while importing a table using Sqoop.
mysql> describe studentdetails;
+------------+-------------+------+-----+---------+----------------+
| Field      | Type        | Null | Key | Default | Extra          |
+------------+-------------+------+-----+---------+----------------+
| id         | int(11)     | NO   | PRI | NULL    | auto_increment |
| name       | varchar(20) | YES  |     | NULL    |                |
| department | varchar(30) | YES  |     | NULL    |                |
| grade      | char(1)     | YES  |     | NULL    |                |
| doj        | date        | YES  |     | NULL    |                |
+------------+-------------+------+-----+---------+----------------+
5 rows in set (0.00 sec)

[gaurav.khandelwal1_wipro@ec2-52-86-42-143 ~]$ sqoop import --connect jdbc:mysql://172.31.54.174/student --username labuser --password simplilearn --table studentdetails;
16/12/23 04:51:34 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.0.0-169
16/12/23 04:51:34 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/12/23 04:51:34 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/12/23 04:51:34 INFO tool.CodeGenTool: Beginning code generation
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/accumulo/lib/slf4j-log4j12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/12/23 04:51:35 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `studentdetails` AS t LIMIT 1
16/12/23 04:51:35 ERROR manager.SqlManager: Error reading from database: java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@25641d39 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@25641d39 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:934)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:931)
at com.mysql.jdbc.MysqlIO.checkForOutstandingStreamingData(MysqlIO.java:2735)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1899)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2151)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2619)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2569)
at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1524)
at com.mysql.jdbc.ConnectionImpl.getMaxBytesPerChar(ConnectionImpl.java:3003)
at com.mysql.jdbc.Field.getMaxBytesPerCharacter(Field.java:602)
at com.mysql.jdbc.ResultSetMetaData.getPrecision(ResultSetMetaData.java:445)
at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:286)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:241)
at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:227)
at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:295)
at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1845)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1645)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
16/12/23 04:51:35 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: No columns to generate for ClassWriter
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1651)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
at org.apache.sqoop.Sqoop.main(Sqoop.java:244)

Thanks,
Gaurav
 

Attachments

  • HiveIssue.png
    64 KB · Views: 1

praveen.rachapally

Member
Customer
Assignment: doctor-to-patient mapping
 

Attachments

  • screens.pdf
    110 KB · Views: 4
  • doctorPatient1.txt
    182 bytes · Views: 5
  • doctorPatient.txt
    112 bytes · Views: 3
  • logs.txt
    4.2 KB · Views: 4
  • DoctorPatientCount.zip
    1.3 KB · Views: 3

Manish Pundir

Member
Customer
Solution.
input
D1 P1
D1 P2
D1 P3
D2 P4
D2 P5
D3 P6
D3 P7
D3 P8
D3 P9
D3 P10
D4 P11
D5 P12
D5 P13

output
D1 3
D2 2
D3 5
D4 1
D5 2
 

Attachments

  • DoctorPatient.txt
    2.1 KB · Views: 10

Elavarasan_1

Member
Customer
Assignment 1

Code:
package hadooptraining;
import java.io.IOException;


import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class DoctorAssignment {

    // Mapper: for each "doctor patient" line, emit (doctorId, 1)
    public static class MyMap extends Mapper<LongWritable, Text, Text, IntWritable> {
        public void map(LongWritable offset, Text line, Context context) throws IOException, InterruptedException {
            String[] stringLine = line.toString().split("\\s+");
            context.write(new Text(stringLine[0]), new IntWritable(1));
        }
    }

    // Reducer: sum the 1s per doctor to get the patient count
    public static class MyReduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int totalPatient = 0;
            for (IntWritable value : values) {
                totalPatient += value.get();
            }
            context.write(key, new IntWritable(totalPatient));
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();

        @SuppressWarnings("deprecation")
        Job job = new Job(conf, "doctorjob");
        job.setJarByClass(hadooptraining.DoctorAssignment.class);
        job.setMapperClass(MyMap.class);
        job.setReducerClass(MyReduce.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
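A sketch of how the class above can be packaged and run; the jar name and the HDFS paths are placeholders here, not necessarily the ones used for the attached output:

Code:
hadoop jar DoctorAssignment.jar hadooptraining.DoctorAssignment /user/<your_user>/doctor.txt /user/<your_user>/doctor_out
hadoop fs -cat /user/<your_user>/doctor_out/part-r-00000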
 

Attachments

  • doctor.txt
    36 bytes · Views: 3
  • part-r-00000.txt
    15 bytes · Views: 1

DeshDeep Singh

Well-Known Member
Simplilearn Support
Alumni
Hi Team,
Can you please help me understand why the warehouse folder is not in my /user/hive folder? I created the DB and table, and I can see it in the Hive browser, but I am not able to find it in the file structure. Also, I am getting an error while importing a table using Sqoop.

Thanks,
Gaurav

Hi Gaurav,

By default, Hive tables are created in the location below:

/apps/hive/warehouse

Once you import data into a table, the table will be stored under your database directory there.

Below is a sample command to import; please modify it according to your needs.

sqoop import --connect jdbc:mysql://172.31.54.174/student --driver com.mysql.jdbc.Driver --username labuser --password simplilearn --table struc_data --target-dir struc_data1 /user/deshdeep_gmail/test -m1
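To verify where the tables actually land, you can list the warehouse directory from the lab shell; a quick sketch (replace <your_db> with your database name, since each database is normally stored as a <dbname>.db folder under the warehouse directory):

hdfs dfs -ls /apps/hive/warehouse
hdfs dfs -ls /apps/hive/warehouse/<your_db>.db

Also note that the --driver com.mysql.jdbc.Driver flag in the sample above makes Sqoop fall back to the generic JDBC connection manager (as the warning in your own log mentions), which usually avoids the "Streaming result set ... is still active" error from the earlier post.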
 

DeshDeep Singh

Well-Known Member
Simplilearn Support
Alumni

Hi Praveen, Elavarasan, Manish,

Good work, let's discuss this in the class today.

Good job!!!
 

Akhilesh_19

Member
Customer
Hi Team,
I am getting the below error when I am trying to open cloudera in virtual box - Can you please help ?

Failed to open a session for the virtual machine cloudera-quickstart-vm-5.8.0-0-virtualbox.

VT-x is disabled in the BIOS for all CPU modes (VERR_VMX_MSR_ALL_VMX_DISABLED).

Result Code: E_FAIL (0x80004005)
Component: ConsoleWrap
Interface: IConsole {872da645-4a9b-1727-bee2-5585105b9eed}
 

DeshDeep Singh

Well-Known Member
Simplilearn Support
Alumni
Hi Team,
I am getting the below error when I am trying to open cloudera in virtual box - Can you please help ?

Failed to open a session for the virtual machine cloudera-quickstart-vm-5.8.0-0-virtualbox.

VT-x is disabled in the BIOS for all CPU modes (VERR_VMX_MSR_ALL_VMX_DISABLED).

Result Code: E_FAIL (0x80004005)
Component: ConsoleWrap
Interface: IConsole {872da645-4a9b-1727-bee2-5585105b9eed}

Hi Akhilesh,

Please follow the link below to enable virtualization:

 

DeshDeep Singh

Well-Known Member
Simplilearn Support
Alumni
Hi All,

This is to inform you that we have given all registered participants access to the Introduction to Programming in Scala course.

Please click on the activation link sent to your registered email ID to activate this course at the earliest (within 24 hrs). In case of any issues, please mail us at wipro@simplilearn.com
 

Shashidhar Jambanour

Member
Customer
Sqoop export assignment:
Exporting data from HDFS to MySQL.

The below query is executed, but I am unable to see any data loaded into the output file, i.e. student.txt.
Could anyone please help in resolving this?


Command:
==========
sqoop export --connect jdbc:mysql://172.31.54.174/student --driver com.mysql.jdbc.Driver --username labuser --password simplilearn --table studentdetails --export-dir /user/shashidhar.jambanour_wipro/input/student.txt

Terminal Logs:
===========
16/12/23 11:11:06 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.0.0-169
16/12/23 11:11:06 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/12/23 11:11:06 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection
-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
16/12/23 11:11:06 INFO manager.SqlManager: Using default fetchSize of 1000
16/12/23 11:11:06 INFO tool.CodeGenTool: Beginning code generation
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/accumulo/lib/slf4j-log4j12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/12/23 11:11:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM studentdetails AS t WHERE 1=0
16/12/23 11:11:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM studentdetails AS t WHERE 1=0
16/12/23 11:11:06 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.4.0.0-169/hadoop-mapreduce
Note: /tmp/sqoop-shashidhar.jambanour_wipro/compile/22c030eaeda2ea496347e077b20e7ccb/studentdetails.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/12/23 11:11:07 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-shashidhar.jambanour_wipro/compile/22c030eaeda2ea496347e077b20e7ccb/studentdetails.jar
16/12/23 11:11:07 INFO mapreduce.ExportJobBase: Beginning export of studentdetails
16/12/23 11:11:08 WARN mapreduce.ExportJobBase: IOException checking input file header: java.io.EOFException
16/12/23 11:11:08 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM studentdetails AS t WHERE 1=0
16/12/23 11:11:09 INFO impl.TimelineClientImpl: Timeline service address: http://ec2-52-86-42-143.compute-1.amazonaws.com:8188/ws/v1/timeline/
16/12/23 11:11:09 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
16/12/23 11:11:18 INFO input.FileInputFormat: Total input paths to process : 1
16/12/23 11:11:18 INFO input.FileInputFormat: Total input paths to process : 1
16/12/23 11:11:18 INFO mapreduce.JobSubmitter: number of splits:1
16/12/23 11:11:18 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1478959862039_10026
16/12/23 11:11:19 INFO impl.YarnClientImpl: Submitted application application_1478959862039_10026
16/12/23 11:11:19 INFO mapreduce.Job: The url to track the job: http://ip-172-31-51-30.ec2.internal:8088/proxy/application_1478959862039_10026/
16/12/23 11:11:19 INFO mapreduce.Job: Running job: job_1478959862039_10026
16/12/23 11:12:06 INFO mapreduce.Job: Job job_1478959862039_10026 running in uber mode : false
16/12/23 11:12:06 INFO mapreduce.Job: map 0% reduce 0%
16/12/23 11:12:17 INFO mapreduce.Job: map 100% reduce 0%
16/12/23 11:12:30 INFO mapreduce.Job: Job job_1478959862039_10026 completed successfully
16/12/23 11:12:30 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=155309
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=148
HDFS: Number of bytes written=0
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
Job Counters
Launched map tasks=1
Other local map tasks=1
Total time spent by all maps in occupied slots (ms)=22710
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=7570
Total vcore-seconds taken by all map tasks=7570
Total megabyte-seconds taken by all map tasks=11627520
Map-Reduce Framework
Map input records=0
Map output records=0
Input split bytes=148
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=188
CPU time spent (ms)=920
Physical memory (bytes) snapshot=210333696
Virtual memory (bytes) snapshot=3215007744
Total committed heap usage (bytes)=168820736
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
16/12/23 11:12:30 INFO mapreduce.ExportJobBase: Transferred 148 bytes in 82.0177 seconds (1.8045 bytes/sec)
16/12/23 11:12:30 INFO mapreduce.ExportJobBase: Exported 0 records.
[shashidhar.jambanour_wipro@ec2-52-86-42-143 ~]$
 

DeshDeep Singh

Well-Known Member
Simplilearn Support
Alumni
Hi Manish,

Can you share it again.

Find the number of cases per location and categorize the count with respect to the reason for taking the loan, using a Hive query.
+---------+--------------------+-----------+
| id      | reason             | location  |
+---------+--------------------+-----------+
| 1077501 | credit_card        | AZ        |
| 1077430 | car                | GA        |
| 1077175 | small_business     | IL        |
| 1076863 | other              | CA        |
| 1075358 | other              | OR        |
| 1075269 | wedding            | AZ        |
| 1069639 | debt_consolidation | NC        |
| 1072053 | car                | CA        |
| 1071795 | small_busines      | CA        |
+---------+--------------------+-----------+
output should be
+----------+-------------+-----+----------------+----------+---------+-------+
| location | credit_card | car | small_business | other... | wedding | total |
+----------+-------------+-----+----------------+----------+---------+-------+
| AZ       | 1           | 0   | 0              | 0        | 1       | 2     |
| CA       | 0           | 1   | 1              | 0        | 0       | 2     |
+----------+-------------+-----+----------------+----------+---------+-------+

Getting a null age every time while loading data.
Values are present in the input, but the age column still comes out null for every row.
Please find the input/data in the attached file.

Hi Manish,

Can you share it again.
yes posted again
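For the location/reason summary above, a conditional-aggregation sketch in HiveQL; the table name loan_data is an assumption, the column names are taken from the sample rows, and the other bucket here simply collects every reason not listed explicitly:

Code:
SELECT location,
       SUM(CASE WHEN reason = 'credit_card'    THEN 1 ELSE 0 END) AS credit_card,
       SUM(CASE WHEN reason = 'car'            THEN 1 ELSE 0 END) AS car,
       SUM(CASE WHEN reason = 'small_business' THEN 1 ELSE 0 END) AS small_business,
       SUM(CASE WHEN reason = 'wedding'        THEN 1 ELSE 0 END) AS wedding,
       SUM(CASE WHEN reason NOT IN ('credit_card', 'car', 'small_business', 'wedding') THEN 1 ELSE 0 END) AS other,
       COUNT(*) AS total
FROM loan_data
GROUP BY location;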
 

Attachments

  • query.txt
    1.7 KB · Views: 4
Last edited by a moderator:

DeshDeep Singh

Well-Known Member
Simplilearn Support
Alumni
Sqoop export assignment:
Exporting data from HDFS to MySQL.

The below query is executed, but I am unable to see any data loaded into the output file, i.e. student.txt.

16/12/23 11:12:30 INFO mapreduce.ExportJobBase: Exported 0 records.

Hi Shashidhar,

Can you go there manually using Hue and check? I think it should be saved, as the query executed successfully.
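One thing worth checking, sketched below with the paths taken from the command in the post: the job counters show Map input records=0 and an EOFException while reading the input file header, which usually means the export directory (or file) is empty.

hdfs dfs -ls /user/shashidhar.jambanour_wipro/input
hdfs dfs -cat /user/shashidhar.jambanour_wipro/input/student.txt | head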
 

DeshDeep Singh

Well-Known Member
Simplilearn Support
Alumni
Hi Desh, could you please help me get the database script for the loudacre database which Shivang used during training? It would really help for practice.

Thanks,
Siddu

Hi Siddalingesh,

I am not sure which database you are talking about here; can you please confirm you mean the loudacre database?
 

DeshDeep Singh

Well-Known Member
Simplilearn Support
Alumni
Hi All,

I have shared 3 practice projects. The data set for one of them is shared in the mail; for the rest, you can go to the link given in the PDF and download the data sets.

Once the practice projects are done, I request you to share them here so that I can evaluate them and share my inputs.
 

MEGHA AHLUWALIA

Member
Customer
Hi Team,
I am getting the below error when I am trying to open cloudera in virtual box - Can you please help ?

Failed to open a session for the virtual machine cloudera-quickstart-vm-5.8.0-0-virtualbox.

VT-x is disabled in the BIOS for all CPU modes (VERR_VMX_MSR_ALL_VMX_DISABLED).

Result Code: E_FAIL (0x80004005)
Component: ConsoleWrap
Interface: IConsole {872da645-4a9b-1727-bee2-5585105b9eed}

Hi Akhilesh,

Is the above issue resolved? I am facing a similar issue while trying to open Cloudera in VirtualBox. Are you trying this on a Wipro laptop? How did you change the BIOS setting?

Hope to get a reply soon.

Thanks & Regards
 

DeshDeep Singh

Well-Known Member
Simplilearn Support
Alumni

Hi Megha,

Please check the link below to rectify this error:

 

Gaurav Khandelwal_2

Member
Customer
Hi Deshdeep,

Can you please check and let me know why I am getting the below error? I am just trying to get the 24 MB answers.csv file to local.

[gaurav.khandelwal1_wipro@ec2-52-86-42-143 ~]$ hadoop dfs -get answers.csv;
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Exception in thread "main" org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:248)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:87)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:119)
at org.apache.hadoop.fs.shell.CommandWithDestination$TargetFileSystem.writeStreamToFile(CommandWithDestination.java:466)
at org.apache.hadoop.fs.shell.CommandWithDestination.copyStreamToTarget(CommandWithDestination.java:391)
at org.apache.hadoop.fs.shell.CommandWithDestination.copyFileToTarget(CommandWithDestination.java:328)
at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:263)
at org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:248)
at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
at org.apache.hadoop.fs.shell.CommandWithDestination.processPathArgument(CommandWithDestination.java:243)
at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
at org.apache.hadoop.fs.shell.CommandWithDestination.processArguments(CommandWithDestination.java:220)
at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)

I checked the space and it shows 100% use. Could you please suggest what to delete?

[gaurav.khandelwal1_wipro@ec2-52-86-42-143 ~]$ df /home/gaurav.khandelwal1_wipro
Filesystem   1K-blocks      Used  Available  Use%  Mounted on
/dev/xvdg1   103079868  97820268      16792  100%  /home

Not sure what to delete as I didn't copy any data.
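A quick way to see what is filling the home partition, sketched with the path from the df output above:

du -sh /home/gaurav.khandelwal1_wipro/* 2>/dev/null | sort -h

Whatever shows up largest there is the first candidate to remove or move; alternatively, -get the file to a mount that still has free space.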
 
Last edited:

Gaurav Khandelwal_2

Member
Customer
Hi Deshdeep,

Could you please validate the result for the SocialMedia project, as attached?

Thanks,
Gaurav
 

Attachments

  • SocialMediaProjectResult.txt
    1.3 KB · Views: 16

MEGHA AHLUWALIA

Member
Customer
Hi Deshdeep,

I am getting an error while running the WordCount example in Hadoop. Can you please help me urgently?

Thanks & Regards,
Megha
9873126243
 

Attachments

  • file1.txt
    2.4 KB · Views: 2
  • logs.txt
    108 KB · Views: 1

MEGHA AHLUWALIA

Member
Customer
Hi Deshdeep,

I created a different class in the same package; now I am getting the following error on execution:

hadoop jar wco1.jar mapreduce.wordcount / user/ megha.ahluwalia_wipro/ wipro1.txt / user/ megha.ahluwalia_wipro/ wordout.txt
WARNING: Use "yarn jar" to launch YARN applications.
16/12/29 09:19:52 INFO impl.TimelineClientImpl: Timeline service address: http://ec2-52-86-42-143.compute-1.amazonaws.com:8188/ws/v1/timeline/
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://cloudlabns/user/megha.ahluwalia_wipro/user already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:266)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
at mapreduce.wordcount.main(wordcount.java:71)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
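A likely cause, judging from the command above: the stray spaces after each "/" make Hadoop treat "user/" as a separate, relative argument, so the output path resolves to /user/megha.ahluwalia_wipro/user, which already exists. A sketch of a corrected invocation (same jar and class, paths written without spaces, and an output directory that does not already exist):

hadoop jar wco1.jar mapreduce.wordcount /user/megha.ahluwalia_wipro/wipro1.txt /user/megha.ahluwalia_wipro/wordout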
 

Gaurav Khandelwal_2

Member
Customer
Hi All,
I am having an issue using the zip command.
Can you help me and let me know what mistake I am making?
scala> rdd1.collect
res22: Array[String] = Array(UP, MP, MH)
scala> rdd2.collect
res23: Array[String] = Array(Agra, Gwalior, Pune)

scala> rdd1.zip(rdd2).collect
java.lang.IllegalArgumentException: Can't zip RDDs with unequal numbers of partitions
I have checked the files and there is no extra space or extra newline present.

Thanks,
Gaurav
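If the two files keep coming back with different partition counts (sc.textFile can split two small files differently), a workaround sketch that pairs elements by index instead of relying on zip; it assumes both RDDs have the same total number of elements, as in the arrays above:

scala> val paired = rdd1.zipWithIndex.map(_.swap).join(rdd2.zipWithIndex.map(_.swap)).sortByKey().values
scala> paired.collect   // expected here: Array((UP,Agra), (MP,Gwalior), (MH,Pune))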
 

Khuzema Challawala

Member
Customer
Finding the max number from a file with numbers separated by spaces.

scala> val numberRDD= sc.textFile("sample.txt")
numberRDD: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[8] at textFile at <console>:27
scala> val num1=numberRDD.flatMap(_.split(" "));
num1: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[9] at flatMap at <console>:29
scala> val inVal=num1.map(_.toInt)
inVal: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[10] at map at <console>:31
scala> inVal.top(1)
res4: Array[Int] = Array(89)
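For comparison, the same result can be read off with RDD.max(), which uses the implicit ordering on Int; a one-liner against the same inVal RDD that should also return 89 for this data:

scala> inVal.max()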
 

Soumit Jana

Member
Customer
Hi Team, I have created two files, soumit1.txt and soumit2.txt. When I try to run the zip command, it gives an error. I then created another two files, 1.txt and 2.txt, and when I run the zip command on those it works successfully. So something is wrong with my input files soumit1.txt and soumit2.txt. I am unable to find any extra line or space there, and the count for both shows 4.

upload_2017-1-2_14-27-46.png

upload_2017-1-2_14-28-16.png
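A quick diagnostic sketch (the val names are hypothetical; the file names are from the post). zip only succeeds when both RDDs have the same number of partitions and the same number of elements in each partition, so comparing both is usually the fastest check:

scala> val a = sc.textFile("soumit1.txt")
scala> val b = sc.textFile("soumit2.txt")
scala> (a.partitions.length, b.partitions.length)
scala> (a.count, b.count)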
 

Attachments

  • soumit1.txt
    29 bytes · Views: 1
  • 1.txt
    8 bytes · Views: 1
  • soumit2.txt
    19 bytes · Views: 1
  • 2.txt
    6 bytes · Views: 1
  • upload_2017-1-2_14-25-1.png
    172.8 KB · Views: 2
  • upload_2017-1-2_14-26-20.png
    201.8 KB · Views: 2