java.io.IOException: Failed to transform the record start=0 length=38

talendforge.org | 5 months ago
  1. Check for bad records in the input data (like '(null)'); see the first sketch after this list.
  2. Download winutils.exe for your Hadoop version from https://github.com/steveloughran/winutils and save it to HADOOP_HOME/bin; see the second sketch after this list.
  3. Related report: Hadoop job failing while merging files in reducer
    Stack Overflow | 4 years ago | user2507538
    java.io.IOException: Intermediate merge failed
  4. Bad input data (fields not properly separated); see the third sketch after this list.
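
First sketch, for tip 1: a minimal Spark Java job that drops empty records and records carrying the literal '(null)' marker before they reach the transform stage. The input and output paths are placeholders, not taken from the original report.

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class FilterBadRecords {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("filter-bad-records");
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                JavaRDD<String> lines = sc.textFile("hdfs:///input/records.txt"); // placeholder path

                // Drop records that are null, empty, or contain the literal
                // '(null)' marker before they reach the transformation that
                // throws "Failed to transform the record".
                JavaRDD<String> clean = lines.filter(line ->
                        line != null
                        && !line.trim().isEmpty()
                        && !line.contains("(null)"));

                clean.saveAsTextFile("hdfs:///output/clean"); // placeholder path
            }
        }
    }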
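
Second sketch, for tip 2: on Windows, Hadoop's shell utilities look for winutils.exe under %HADOOP_HOME%\bin. If HADOOP_HOME cannot be set as an environment variable, the hadoop.home.dir system property can be set before any Hadoop or Spark class is initialized. The C:\hadoop path is a placeholder; point it at the directory whose bin folder holds winutils.exe.

    public class WindowsHadoopHome {
        public static void main(String[] args) {
            // Must run before any Hadoop/Spark class loads: Hadoop's Shell
            // class reads hadoop.home.dir first and falls back to the
            // HADOOP_HOME environment variable.
            System.setProperty("hadoop.home.dir", "C:\\hadoop"); // placeholder path
            // ... build the SparkConf / JavaSparkContext as usual from here ...
        }
    }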
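
Third sketch, for tip 4: improperly separated input usually means a line splits into the wrong number of fields. Filtering on field count keeps malformed records out of the transform. The delimiter and field count below are assumptions; adjust them to the actual schema.

    import org.apache.spark.api.java.JavaRDD;

    public class ValidateDelimitedRecords {
        private static final String DELIMITER = ";";  // assumed delimiter
        private static final int EXPECTED_FIELDS = 5; // assumed schema width

        // Keep only lines that split into the expected number of fields so
        // improperly separated records never reach the transform stage.
        public static JavaRDD<String> wellFormed(JavaRDD<String> lines) {
            return lines.filter(line ->
                    line.split(DELIMITER, -1).length == EXPECTED_FIELDS);
        }
    }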

    Root Cause Analysis

    1. java.io.IOException

      Failed to transform the record start=0 length=38

      at org.talend.transform.dataflow.spark.AbstractDataTransformation.transformRecord()
    2. org.talend.transform
      AbstractDataTransformation.call
      1. org.talend.transform.dataflow.spark.AbstractDataTransformation.transformRecord(AbstractDataTransformation.java:62)
      2. org.talend.transform.dataflow.spark.AbstractDataTransformation.call(AbstractDataTransformation.java:72)
      3. org.talend.transform.dataflow.spark.AbstractDataTransformation.call(AbstractDataTransformation.java:31)
      3 frames
    3. Spark
      JavaRDDLike$$anonfun$fn$3$1.apply
      1. org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$3$1.apply(JavaRDDLike.scala:149)
      2. org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$3$1.apply(JavaRDDLike.scala:149)
      2 frames
    4. Scala
      Iterator$$anon$13.hasNext
      1. scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
      1 frame
    5. Spark
      Executor$TaskRunner.run
      1. org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply$mcV$sp(PairRDDFunctions.scala:1197)
      2. org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply(PairRDDFunctions.scala:1197)
      3. org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply(PairRDDFunctions.scala:1197)
      4. org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1250)
      5. org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1205)
      6. org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1185)
      7. org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
      8. org.apache.spark.scheduler.Task.run(Task.scala:89)
      9. org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
      9 frames
    6. Java RT
      Thread.run
      1. java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      2. java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      3. java.lang.Thread.run(Thread.java:745)
      3 frames
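
Reading the trace bottom-up: the executor task fails during saveAsHadoopDataset while pulling records through AbstractDataTransformation.transformRecord, so a single unparseable record kills the whole task. Below is a sketch of how to surface the offending record before the exception propagates; the transform body is a hypothetical stand-in, not Talend's actual code.

    import java.io.IOException;

    import org.apache.spark.api.java.JavaRDD;

    public class LoggingTransform {

        // Hypothetical per-record transform standing in for the Talend stage.
        static String transform(String record) throws IOException {
            if (record.contains("(null)")) {
                throw new IOException("Failed to transform the record");
            }
            return record.toUpperCase();
        }

        // Wrap the transform so the raw record that triggered the failure is
        // logged to stderr (visible in the executor log) before rethrowing.
        public static JavaRDD<String> apply(JavaRDD<String> input) {
            return input.map(record -> {
                try {
                    return transform(record);
                } catch (IOException e) {
                    System.err.println("Offending record: " + record);
                    throw e;
                }
            });
        }
    }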