Loading CSV file data into HBase using Pig












I am trying to load a CSV file into an HBase table. I can successfully dump the data from the CSV, but storing it into the table fails with an error.
When I load other data the same way, it works without any issues.



raw_data = LOAD
'hdfs://sandbox.hortonworks.com:8020/user/aravind/hbase/emp_data/emp_data.txt' USING PigStorage(',') AS (empno:int, ename:chararray, job:chararray, mgr:chararray, hiredate:chararray, sal:chararray, comm:chararray, deptno:int);
dump raw_data;

STORE raw_data INTO 'hbase://emp_tab' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
'emp_info:ename
emp_info:job
emp_info:mgr
emp_info:hiredate
emp_info:sal
emp_info:comm
emp_info:deptno'
);


Output:



(7369,SMITH,CLERK,7902,13-06-93,800,0,20)
(7499,ALLEN,SALESMAN,7698,15-08-98,1600,300,30)
(7521,WARD,SALESMAN,7698,26-03-96,1250,500,30)
(7566,JONES,MANAGER,7839,31-10-95,2975,0,20)
(7698,BLAKE,MANAGER,7839,11-06-92,2850,0,30)
(7782,CLARK,MANAGER,7839,14-05-93,2450,0,10)
(7788,SCOTT,ANALYST,7566,05-03-96,3000,0,20)
(7839,KING,PRESIDENT,0,09-06-90,5000,0,10)
(7844,TURNER,SALESMAN,7698,04-06-95,1500,0,30)
(7876,ADAMS,CLERK,7788,04-06-99,1100,0,20)
(7900,JAMES,CLERK,7698,23-06-00,950,0,30)
(7934,MILLER,CLERK,7782,21-01-00,1300,0,10)
(7902,FORD,ANALYST,7566,05-12-97,3000,0,20)
(7654,MARTIN,SALESMAN,7698,05-12-98,1250,1400,30)


Then the STORE statement fails with the following output:



STORE raw_data INTO 'hbase://emp_tab' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>> 'emp_info:ename
>> emp_info:job
>> emp_info:mgr
>> emp_info:hiredate
>> emp_info:sal
>> emp_info:comm
>> emp_info:deptno'
>> );
2018-12-31 13:45:55,581 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2018-12-31 13:45:55,594 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2018-12-31 13:45:55,594 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2018-12-31 13:45:55,628 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher - Tez staging directory is /tmp/temp-725993693
2018-12-31 13:45:55,628 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.plan.TezCompiler - File concatenation threshold: 100 optimistic? false
2018-12-31 13:45:55,645 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2018-12-31 13:45:55,645 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2018-12-31 13:45:55,649 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2018-12-31 13:45:55,671 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: htrace-core-3.1.0-incubating.jar
2018-12-31 13:45:55,671 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: metrics-core-2.2.0.jar
2018-12-31 13:45:55,671 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: protobuf-java-2.5.0.jar
2018-12-31 13:45:55,671 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: joda-time-2.8.2.jar
2018-12-31 13:45:55,671 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: zookeeper-3.4.6.2.3.2.0-2950.jar
2018-12-31 13:45:55,671 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: antlr-runtime-3.4.jar
2018-12-31 13:45:55,671 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: hbase-hadoop-compat-1.1.2.2.3.2.0-2950.jar
2018-12-31 13:45:55,671 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: hbase-protocol-1.1.2.2.3.2.0-2950.jar
2018-12-31 13:45:55,671 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: hbase-common-1.1.2.2.3.2.0-2950.jar
2018-12-31 13:45:55,671 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: netty-all-4.0.23.Final.jar
2018-12-31 13:45:55,671 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: guava-11.0.2.jar
2018-12-31 13:45:55,671 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: hbase-server-1.1.2.2.3.2.0-2950.jar
2018-12-31 13:45:55,671 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: pig-0.15.0.2.3.2.0-2950-core-h2.jar
2018-12-31 13:45:55,671 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: hbase-client-1.1.2.2.3.2.0-2950.jar
2018-12-31 13:45:55,671 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: automaton-1.11-8.jar
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.shuffle.merge.percent to 0.66 from MR setting mapreduce.reduce.shuffle.merge.percent
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.shuffle.fetch.buffer.percent to 0.7 from MR setting mapreduce.reduce.shuffle.input.buffer.percent
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.io.sort.mb to 64 from MR setting mapreduce.task.io.sort.mb
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.shuffle.memory.limit.percent to 0.25 from MR setting mapreduce.reduce.shuffle.memory.limit.percent
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.io.sort.factor to 100 from MR setting mapreduce.task.io.sort.factor
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.shuffle.connect.timeout to 180000 from MR setting mapreduce.reduce.shuffle.connect.timeout
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.internal.sorter.class to org.apache.hadoop.util.QuickSort from MR setting map.sort.class
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.merge.progress.records to 10000 from MR setting mapreduce.task.merge.progress.records
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.compress to false from MR setting mapreduce.map.output.compress
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.sort.spill.percent to 0.7 from MR setting mapreduce.map.sort.spill.percent
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.shuffle.ssl.enable to false from MR setting mapreduce.shuffle.ssl.enabled
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.ifile.readahead to true from MR setting mapreduce.ifile.readahead
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.shuffle.parallel.copies to 30 from MR setting mapreduce.reduce.shuffle.parallelcopies
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.ifile.readahead.bytes to 4194304 from MR setting mapreduce.ifile.readahead.bytes
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.task.input.post-merge.buffer.percent to 0.0 from MR setting mapreduce.reduce.input.buffer.percent
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.shuffle.read.timeout to 180000 from MR setting mapreduce.reduce.shuffle.read.timeout
2018-12-31 13:45:55,700 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.compress.codec to org.apache.hadoop.io.compress.DefaultCodec from MR setting mapreduce.map.output.compress.codec
2018-12-31 13:45:55,706 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - For vertex - scope-673: parallelism=1, memory=250, java opts=-Xmx256m
2018-12-31 13:45:55,777 [PigTezLauncher-0] INFO org.apache.pig.tools.pigstats.tez.TezScriptState - Pig script settings are added to the job
2018-12-31 13:45:55,778 [PigTezLauncher-0] INFO org.apache.tez.client.TezClient - Tez Client Version: [ component=tez-api, version=0.7.0.2.3.2.0-2950, revision=4900a9cea70487666ace4c9e490d4d8fc1fee96f, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=20150930-1859 ]
2018-12-31 13:45:55,836 [PigTezLauncher-0] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
2018-12-31 13:45:55,837 [PigTezLauncher-0] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at sandbox.hortonworks.com/192.168.61.158:8050
2018-12-31 13:45:55,837 [PigTezLauncher-0] INFO org.apache.tez.client.TezClient - Using org.apache.tez.dag.history.ats.acls.ATSHistoryACLPolicyManager to manage Timeline ACLs
2018-12-31 13:45:55,878 [PigTezLauncher-0] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
2018-12-31 13:45:55,878 [PigTezLauncher-0] INFO org.apache.tez.client.TezClient - Session mode. Starting session.
2018-12-31 13:45:55,878 [PigTezLauncher-0] INFO org.apache.tez.client.TezClientUtils - Using tez.lib.uris value from configuration: /hdp/apps/2.3.2.0-2950/tez/tez.tar.gz
2018-12-31 13:45:55,900 [PigTezLauncher-0] INFO org.apache.tez.client.TezClient - Tez system stage directory hdfs://sandbox.hortonworks.com:8020/tmp/temp-725993693/.tez/application_1546253526030_0012 doesn't exist and is created
2018-12-31 13:45:55,919 [PigTezLauncher-0] INFO org.apache.tez.dag.history.ats.acls.ATSHistoryACLPolicyManager - Created Timeline Domain for History ACLs, domainId=Tez_ATS_application_1546253526030_0012
2018-12-31 13:45:56,163 [PigTezLauncher-0] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1546253526030_0012
2018-12-31 13:45:56,165 [PigTezLauncher-0] INFO org.apache.tez.client.TezClient - The url to track the Tez Session: http://sandbox.hortonworks.com:8088/proxy/application_1546253526030_0012/
2018-12-31 13:46:00,233 [PigTezLauncher-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - Submitting DAG PigLatin:/home/hdfs/aravind/hbase/emp_data.pig-0_scope-66
2018-12-31 13:46:00,233 [PigTezLauncher-0] INFO org.apache.tez.client.TezClient - Submitting dag to TezSession, sessionName=PigLatin:/home/hdfs/aravind/hbase/emp_data.pig, applicationId=application_1546253526030_0012, dagName=PigLatin:/home/hdfs/aravind/hbase/emp_data.pig-0_scope-66
2018-12-31 13:46:00,487 [PigTezLauncher-0] INFO org.apache.tez.client.TezClient - Submitted dag to TezSession, sessionName=PigLatin:/home/hdfs/aravind/hbase/emp_data.pig, applicationId=application_1546253526030_0012, dagName=PigLatin:/home/hdfs/aravind/hbase/emp_data.pig-0_scope-66
2018-12-31 13:46:00,586 [PigTezLauncher-0] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
2018-12-31 13:46:00,586 [PigTezLauncher-0] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at sandbox.hortonworks.com/192.168.61.158:8050
2018-12-31 13:46:00,586 [PigTezLauncher-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - Submitted DAG PigLatin:/home/hdfs/aravind/hbase/emp_data.pig-0_scope-66. Application id: application_1546253526030_0012
2018-12-31 13:46:00,751 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher - HadoopJobId: job_1546253526030_0012
2018-12-31 13:46:01,587 [Timer-33] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 1 Succeeded: 0 Running: 0 Failed: 0 Killed: 0, diagnostics=, counters=null
2018-12-31 13:46:15,741 [PigTezLauncher-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=FAILED, progress=TotalTasks: 1 Succeeded: 0 Running: 0 Failed: 1 Killed: 0 FailedTaskAttempts: 4, diagnostics=Vertex failed, vertexName=scope-673, vertexId=vertex_1546253526030_0012_1_00, diagnostics=[Task failed, taskId=task_1546253526030_0012_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:1005)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:136)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:95)
at org.apache.tez.mapreduce.output.MROutput$1.write(MROutput.java:503)
at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:125)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:319)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:196)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:1005)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:136)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:95)
at org.apache.tez.mapreduce.output.MROutput$1.write(MROutput.java:503)
at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:125)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:319)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:196)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:1005)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:136)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:95)
at org.apache.tez.mapreduce.output.MROutput$1.write(MROutput.java:503)
at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:125)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:319)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:196)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:1005)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:136)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:95)
at org.apache.tez.mapreduce.output.MROutput$1.write(MROutput.java:503)
at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:125)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:319)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:196)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1546253526030_0012_1_00 [scope-673] killed/failed due to:OWN_TASK_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0, counters=Counters: 4
org.apache.tez.common.counters.DAGCounter
NUM_FAILED_TASKS=4
TOTAL_LAUNCHED_TASKS=4
AM_CPU_MILLISECONDS=1310
AM_GC_TIME_MILLIS=17
2018-12-31 13:46:15,757 [PigTezLauncher-0] WARN org.apache.pig.tools.pigstats.JobStats - unable to find an output size reader
2018-12-31 13:46:15,761 [main] INFO org.apache.pig.tools.pigstats.tez.TezPigScriptStats - Script Statistics:

HadoopVersion: 2.7.1.2.3.2.0-2950
PigVersion: 0.15.0.2.3.2.0-2950
TezVersion: 0.7.0.2.3.2.0-2950
UserId: root
FileName:
StartedAt: 2018-12-31 13:45:55
FinishedAt: 2018-12-31 13:46:15
Features: UNKNOWN

Failed!

DAG PigLatin:/home/hdfs/aravind/hbase/emp_data.pig-0_scope-66:
ApplicationId: job_1546253526030_0012
TotalLaunchedTasks: 4
FileBytesRead: 0
FileBytesWritten: 0
HdfsBytesRead: 0
HdfsBytesWritten: 0

Input(s):
Failed to read data from "hdfs://sandbox.hortonworks.com:8020/user/aravind/hbase/emp_data/emp_data.txt"

Output(s):
Failed to produce result in "hbase://emp_tab"

grunt>









  • Pretty sure you need commas between your column names in your store command.

    – Andrew
    Dec 31 '18 at 15:16
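
Following up on the comment above: the `IndexOutOfBoundsException: Index: 1, Size: 1` thrown from `HBaseStorage.putNext` suggests that the column list was parsed as a single column, most likely because the names in the question are separated by newlines. Below is a minimal sketch of a possible fix. It assumes that the table `emp_tab` with column family `emp_info` already exists and that `empno` is meant to be the row key (HBaseStorage uses the first field of each tuple as the row key). It has not been run against this cluster.

```pig
-- Load the CSV; the first field (empno) becomes the HBase row key on STORE.
raw_data = LOAD 'hdfs://sandbox.hortonworks.com:8020/user/aravind/hbase/emp_data/emp_data.txt'
    USING PigStorage(',')
    AS (empno:int, ename:chararray, job:chararray, mgr:chararray,
        hiredate:chararray, sal:chararray, comm:chararray, deptno:int);

-- Keep the whole column list on one line, separated by single spaces.
-- The seven columns map, in order, to the seven fields after empno.
STORE raw_data INTO 'hbase://emp_tab'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'emp_info:ename emp_info:job emp_info:mgr emp_info:hiredate emp_info:sal emp_info:comm emp_info:deptno');
```

Separating the names with commas instead of spaces, as the comment suggests, should work as well. Either way, the number of columns listed has to be one less than the number of fields in the relation, because the first field is consumed as the row key.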
















2018-12-31 13:46:00,487 [PigTezLauncher-0] INFO org.apache.tez.client.TezClient - Submitted dag to TezSession, sessionName=PigLatin:/home/hdfs/aravind/hbase/emp_data.pig, applicationId=application_1546253526030_0012, dagName=PigLatin:/home/hdfs/aravind/hbase/emp_data.pig-0_scope-66
2018-12-31 13:46:00,586 [PigTezLauncher-0] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
2018-12-31 13:46:00,586 [PigTezLauncher-0] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at sandbox.hortonworks.com/192.168.61.158:8050
2018-12-31 13:46:00,586 [PigTezLauncher-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - Submitted DAG PigLatin:/home/hdfs/aravind/hbase/emp_data.pig-0_scope-66. Application id: application_1546253526030_0012
2018-12-31 13:46:00,751 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher - HadoopJobId: job_1546253526030_0012
2018-12-31 13:46:01,587 [Timer-33] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 1 Succeeded: 0 Running: 0 Failed: 0 Killed: 0, diagnostics=, counters=null
2018-12-31 13:46:15,741 [PigTezLauncher-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=FAILED, progress=TotalTasks: 1 Succeeded: 0 Running: 0 Failed: 1 Killed: 0 FailedTaskAttempts: 4, diagnostics=Vertex failed, vertexName=scope-673, vertexId=vertex_1546253526030_0012_1_00, diagnostics=[Task failed, taskId=task_1546253526030_0012_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:1005)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:136)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:95)
at org.apache.tez.mapreduce.output.MROutput$1.write(MROutput.java:503)
at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:125)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:319)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:196)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:1005)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:136)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:95)
at org.apache.tez.mapreduce.output.MROutput$1.write(MROutput.java:503)
at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:125)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:319)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:196)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:1005)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:136)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:95)
at org.apache.tez.mapreduce.output.MROutput$1.write(MROutput.java:503)
at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:125)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:319)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:196)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:1005)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:136)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:95)
at org.apache.tez.mapreduce.output.MROutput$1.write(MROutput.java:503)
at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:125)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:319)
at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:196)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1546253526030_0012_1_00 [scope-673] killed/failed due to:OWN_TASK_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0, counters=Counters: 4
org.apache.tez.common.counters.DAGCounter
NUM_FAILED_TASKS=4
TOTAL_LAUNCHED_TASKS=4
AM_CPU_MILLISECONDS=1310
AM_GC_TIME_MILLIS=17
2018-12-31 13:46:15,757 [PigTezLauncher-0] WARN org.apache.pig.tools.pigstats.JobStats - unable to find an output size reader
2018-12-31 13:46:15,761 [main] INFO org.apache.pig.tools.pigstats.tez.TezPigScriptStats - Script Statistics:

HadoopVersion: 2.7.1.2.3.2.0-2950
PigVersion: 0.15.0.2.3.2.0-2950
TezVersion: 0.7.0.2.3.2.0-2950
UserId: root
FileName:
StartedAt: 2018-12-31 13:45:55
FinishedAt: 2018-12-31 13:46:15
Features: UNKNOWN

Failed!

DAG PigLatin:/home/hdfs/aravind/hbase/emp_data.pig-0_scope-66:
ApplicationId: job_1546253526030_0012
TotalLaunchedTasks: 4
FileBytesRead: 0
FileBytesWritten: 0
HdfsBytesRead: 0
HdfsBytesWritten: 0

Input(s):
Failed to read data from "hdfs://sandbox.hortonworks.com:8020/user/aravind/hbase/emp_data/emp_data.txt"

Output(s):
Failed to produce result in "hbase://emp_tab"

grunt>

  • Pretty sure you need commas between your column names in your store command.

    – Andrew
    Dec 31 '18 at 15:16
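
One way to try that suggestion is sketched below. It assumes the multi-line column string is being parsed as a single column, which would be consistent with the `Index: 1, Size: 1` error raised in `HBaseStorage.putNext`, although the log alone does not prove it. The column descriptors are written on one line, separated by commas. The first field of `raw_data` (`empno`) is used as the HBase row key, so only the remaining seven fields are listed.

```pig
-- Same relation, table and column family as in the question.
-- empno (the first field) becomes the row key, so it has no column descriptor.
STORE raw_data INTO 'hbase://emp_tab' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
    'emp_info:ename,emp_info:job,emp_info:mgr,emp_info:hiredate,emp_info:sal,emp_info:comm,emp_info:deptno'
);
```

HBaseStorage is documented as accepting space-separated descriptors as well, so commas are not the only option. The point of the sketch is to keep the whole list on a single line.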














at java.lang.Thread.run(Thread.java:745)
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1546253526030_0012_1_00 [scope-673] killed/failed due to:OWN_TASK_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0, counters=Counters: 4
org.apache.tez.common.counters.DAGCounter
NUM_FAILED_TASKS=4
TOTAL_LAUNCHED_TASKS=4
AM_CPU_MILLISECONDS=1310
AM_GC_TIME_MILLIS=17
2018-12-31 13:46:15,757 [PigTezLauncher-0] WARN org.apache.pig.tools.pigstats.JobStats - unable to find an output size reader
2018-12-31 13:46:15,761 [main] INFO org.apache.pig.tools.pigstats.tez.TezPigScriptStats - Script Statistics:

HadoopVersion: 2.7.1.2.3.2.0-2950
PigVersion: 0.15.0.2.3.2.0-2950
TezVersion: 0.7.0.2.3.2.0-2950
UserId: root
FileName:
StartedAt: 2018-12-31 13:45:55
FinishedAt: 2018-12-31 13:46:15
Features: UNKNOWN

Failed!

DAG PigLatin:/home/hdfs/aravind/hbase/emp_data.pig-0_scope-66:
ApplicationId: job_1546253526030_0012
TotalLaunchedTasks: 4
FileBytesRead: 0
FileBytesWritten: 0
HdfsBytesRead: 0
HdfsBytesWritten: 0

Input(s):
Failed to read data from "hdfs://sandbox.hortonworks.com:8020/user/aravind/hbase/emp_data/emp_data.txt"

Output(s):
Failed to produce result in "hbase://emp_tab"

grunt>









hive apache-pig loading






asked Dec 31 '18 at 14:00 by Aravind Peddola

edited Dec 31 '18 at 17:05 by Robert













  • Pretty sure you need commas between your column names in your store command.

    – Andrew
    Dec 31 '18 at 15:16
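
What the comment suggests can be sketched as follows. This is a minimal, untested sketch, assuming that HBaseStorage expects the column list as a single line delimited by commas (or spaces) and does not split on the newlines used in the question. Under that assumption the whole multi-line string would be parsed as one column, which would be consistent with the `IndexOutOfBoundsException: Index: 1, Size: 1` thrown from `HBaseStorage.putNext`. The paths, table name, and column family are copied from the question; nothing here has been run against the sandbox.

```pig
-- Sketch only: same LOAD statement as in the question.
raw_data = LOAD 'hdfs://sandbox.hortonworks.com:8020/user/aravind/hbase/emp_data/emp_data.txt'
    USING PigStorage(',')
    AS (empno:int, ename:chararray, job:chararray, mgr:chararray,
        hiredate:chararray, sal:chararray, comm:chararray, deptno:int);

-- The first field (empno) is used as the HBase row key, so only the
-- remaining seven fields are listed, on ONE line, separated by commas.
STORE raw_data INTO 'hbase://emp_tab'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'emp_info:ename,emp_info:job,emp_info:mgr,emp_info:hiredate,emp_info:sal,emp_info:comm,emp_info:deptno');
```

If the error persists, it may be worth checking that the table `emp_tab` and the column family `emp_info` already exist in HBase, since the question does not show how the table was created.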






























